1 Introduction

Many computer vision systems rely heavily on the quality of the input image to provide useful outputs. As a result, much research has been devoted to enhancing input images in order to remove undesired features such as noise and visual artifacts and to reduce redundant information. Segmentation is one of the techniques most commonly used to address such issues; it separates the pixels of the image into different classes based on the intensity level of each pixel. Segmentation has been used to extract features of digital images (Kong et al. 2015), for object identification (Cao et al. 2014) and for surveillance (Bhandari et al. 2014). Thresholding is the easiest and simplest method for image segmentation; it takes the histogram of the analyzed image and determines a threshold (th) value that separates the pixels into different regions. Two approaches are used to segment an image by thresholding: bi-level thresholding (BT) and multilevel thresholding (MT). BT requires a single value th to generate two classes (e.g., foreground and background), whereas MT uses a finite number of threshold values to divide the image into more than two homogeneous classes (Gonzalez and Woods 1992; Akay 2013).

Thresholding-based techniques are divided into parametric and nonparametric approaches (Otsu 1979; Kapur et al. 1985; Horng 2010; Olugbara et al. 2015). A parametric approach assumes that each class, or group of pixels in an image, can be modeled using a probability density function and that the mixture of those classes represents all the pixels in the image. In other words, parametric methods approximate the histogram of an image using different mechanisms, typically assuming a Gaussian distribution (Horng 2010). Nonparametric techniques, on the other hand, use discriminative criteria to separate the pixels into homogeneous regions (Horng 2010). Such criteria are metrics that verify the quality of a th value; they are also used as objective functions, since their robustness and accuracy make them an attractive option. Thresholding segmentation has also become popular because of its conceptual simplicity. Contrary to clustering approaches, thresholding only requires the number of thresholds to be set a priori, while most classical cluster-based techniques for MT also require the centroid of each class and extra information to be correctly initialized (Olugbara et al. 2015).

In the literature, there are two classical nonparametric methods for bi-level thresholding: the first one, proposed by Otsu (1979), maximizes the variance between classes; the second one, proposed by Kapur et al. (1985), maximizes the entropy as a measure of the homogeneity among classes. Both methods are extensively used in image processing and have proved to be efficient and accurate alternatives for segmenting pixels into two classes (Sathya and Kayalvizhi 2011). Both can be extended to multilevel thresholding; however, their computational complexity increases, while their accuracy decreases, with each new threshold added to the search process (Sathya and Kayalvizhi 2011; Agrawal et al. 2013). The importance of thresholding techniques is evidenced by the large amount of research conducted in recent years (Bhandari et al. 2014; Liu et al. 2015; Khairuzzaman and Chaudhury 2017).

The minimum cross entropy (MCET) was originally proposed by Kullback (1968) and is also known as directed divergence. The MCET is a distance metric between two probability distributions, proposed as an extension of the maximum entropy principle. It has been used to correctly select the best threshold by minimizing the cross entropy between the original image and the segmented result (Li and Lee 1993). An improved version of the MCET was presented by Yin (2007), which uses recursive programming to reduce the computational effort of computing the MCET for multilevel thresholding. Unlike the Otsu and Kapur methods, the MCET produces a functional formulation whose accuracy does not depend on the number of threshold points (Horng and Liou 2011). Although the MCET is an efficient method and gives excellent results for bi-level thresholding, its performance degrades for multilevel thresholding: the complexity increases with each added threshold, introducing both restrictions and multimodality in the solutions, even for the recursive programming approach. The problem in MCET is to find the appropriate threshold values without affecting the computational cost. To overcome this problem, different evolutionary techniques have been applied as search mechanisms with the MCET as objective function, generating interesting segmentation approaches reported in the related literature. For example, the use of the popular particle swarm optimization (Kennedy and Eberhart 1995) to minimize the MCET was proposed by Yin (2007). Sarkar presented an algorithm based on differential evolution (Sarkar et al. 2011). Horng presented two interesting segmentation mechanisms, one based on the honey bee mating algorithm (Horng 2010) and the other on the firefly algorithm (FF) (Horng and Liou 2011). All these approaches optimize the MCET objective function despite its highly multimodal characteristics. However, to obtain the optimal values, they usually require a large number of iterations and evaluations of the objective function, even when they use recursive programming techniques. Moreover, their results are suboptimal because they lack accuracy in the optimization process; for instance, the accuracy of the FF depends directly on its control parameters (Yang 2014). Recently, MCET has proven to be an interesting tool in important fields such as medical image processing, where it has been used for brain image and tumor segmentation (Kaur et al. 2016; Oliva et al. 2017). However, a high number of iterations and evaluations is still required to produce satisfying results.

The aim of this paper is to present the use of the electromagnetism-like algorithm (EMO) for multilevel thresholding in order to reduce the number of iterations and function evaluations. EMO is an evolutionary computation technique that uses a population of charged particles to find the optimal solution. It was introduced by Birbil and Fang (2003) to solve unconstrained optimization problems. EMO draws its analogy from electromagnetism, specifically the attraction–repulsion of charged particles within an electromagnetic field. Under the EMO framework, the particles are candidate solutions, and they carry an amount of charge that depends on the objective function. Such particles have a position in the search space, and in each iteration they are moved to a new position according to the force exerted among them. Unlike other evolutionary methods, EMO exhibits interesting search capabilities such as fast convergence while keeping its ability to avoid local minima in highly multimodal environments (De Jong 1988; Dorigo et al. 1996; De Castro and Von Zuben 2002). Recent works in this field (Birbil et al. 2004; Wu et al. 2004; Rocha and Fernandes 2009a, b) demonstrate that the EMO algorithm presents the best balance between optimization results and demand for function evaluations. Considering such features, EMO has been effectively employed to solve problems in different fields of engineering, including flow-shop scheduling (Naderi et al. 2010), communications (Hung and Huang 2011), vehicle routing (Yurtkuran and Emel 2010), array pattern optimization in circuits (Jhang and Lee 2009), neural network training (Lee and Chang 2010), image processing and control systems (Ghamisi et al. 2012). For these reasons, this technique is used here to find the best threshold values by minimizing the cross entropy. The approach takes only the histogram and the number of thresholds as input; consequently, the size of the image does not affect the quality and accuracy of the results, and no additional information or initialization about the problem is required. As a result, the proposed algorithm substantially reduces the number of function evaluations while preserving the excellent search capabilities of an evolutionary method. The segmentation algorithm encodes the set of candidate threshold points as a particle. The MCET objective function evaluates the quality of each candidate particle. Guided by the values of this objective function, the set of encoded candidate solutions is modified using the EMO operators so that their segmentation quality improves as the optimization process evolves. In comparison with similar approaches, the proposed method delivers better segmentation results while consuming fewer MCET function evaluations, which is reflected in a lower computational effort.

The remainder of the paper is organized as follows. Section 2 presents the standard EMO algorithm. Section 3 gives a brief description of the minimum cross entropy method. Section 4 explains the implementation of the proposed algorithm. Section 5 discusses experimental results and comparisons after testing the proposal over a set of benchmark images. Finally, Sect. 6 presents the conclusions.

2 Electromagnetism-Like Optimization Algorithm (EMO)

Birbil and Fang proposed the EMO algorithm (Birbil and Fang 2003), a population-based evolutionary method created to solve unconstrained optimization problems. EMO converges quickly to optimal values while avoiding the local optima commonly found in multimodal search spaces (De Jong 1988; Dorigo et al. 1996; De Castro and Von Zuben 2002). Another important feature of EMO is its good balance between optimization results and function evaluations (Birbil et al. 2004; Wu et al. 2004; Rocha and Fernandes 2009a, b). EMO maintains a population \(\mathbf{Sp}_t =\left\{ {x_{1,t} ,x_{2,t} ,\ldots ,x_{N,t} } \right\} \) of N \(n\)-dimensional particles \(x_{i,t} \), \(i=1,2,\ldots ,N\). The population is used to search the feasible set \(\mathbf{X}=\left\{ {x\in \mathfrak {R}^{n}\left| {l_i \le x\le u_i } \right. } \right\} \), where t represents the iteration (or generation) of the algorithm. Prior to the optimization process, the population \(\mathbf{Sp}_t \) is initialized at \(t=1\) by taking uniformly distributed samples from the search region X. After the initialization of \(\mathbf{Sp}_t \), the EMO iterative process continues until a stopping condition (e.g., the maximum number of iterations) is met. An iteration of EMO consists of two main steps: in the first step, each particle in \(\mathbf{Sp}_t \) moves to a different location by using the attraction–repulsion mechanism of the electromagnetism theory (Birbil et al. 2004); in the second step, the moved particles are further perturbed locally by a local search and then become members of \(\mathbf{Sp}_{t+1} \) in the next iteration. Both the attraction–repulsion mechanism and the local search are responsible for driving the particles of \(\mathbf{Sp}_t \) toward the global optimum.

Similar to the electromagnetism theory for charged particles, every particle \(x_{i,t} \in \mathbf{Sp}_t \) in the search space X is considered a charged particle whose charge is computed from its objective function value. Points with better objective function values have higher charges than the remaining particles. The attraction–repulsion mechanism is the process in EMO by which particles with more charge attract other particles of \(\mathbf{Sp}_t \), and particles with less charge repel other elements. Finally, a total force vector \(F_i^t \), exerted on a point (e.g., the i-th point \(x_{i,t} )\), is calculated as the sum of these attraction–repulsion forces, and each \(x_{i,t} \in \mathbf{Sp}_t \) is moved in the direction of its total force to the location \(y_{i,t} \). A local search is then used to explore the vicinity of each particle according to its fitness. The members \(x_{i,t+1} \in \mathbf{Sp}_{t+1} \) of the new population are determined by the following update equation:

$$\begin{aligned} x_{i,t+1} =\left\{ {{\begin{array}{ll} {y_{i,t} }&{}\quad {\hbox {if }}\quad {f(y_{i,t} )\le f(z_{i,t} )} \\ {z_{i,t} }&{}\quad {\hbox {otherwise}} \\ \end{array} }} \right. \end{aligned}$$
(1)

In Eq. (1), \(z_{i,t} \) is the particle perturbed by the local search procedure. Both \(y_{i,t} \) and \(z_{i,t} \) are evaluated using the objective function \(f\left( \cdot \right) \). This equation replaces an element of the population only if the perturbed point attains a lower objective function value. Notice that the sign \(\le \) corresponds to minimization; it can be reversed for maximization problems.

Algorithm 1 shows the general scheme of EMO; each step is described below.

Algorithm 1 General scheme of EMO

1: Input the parameters \(\mathrm{Iter}_{\max } \), \(\mathrm{Iter}_{\mathrm{local}} \), \(\delta \) and N
2: Initialize the population \(\mathbf{Sp}_1 \) and identify the best point \(x_1^B \)
3: while \(t<\mathrm{Iter}_{\max } \) do
4: Calculate the charges and the total force vector \(F_i^t \) of each particle
5: Move each particle \(x_{i,t} \) along \(F_i^t \) to obtain \(y_{i,t} \)
6: Apply the local search to generate \(z_{i,t} \)
7: Select \(x_{i,t+1} \) between \(y_{i,t} \) and \(z_{i,t} \), and update the best point \(x_t^B \)
8: end while

Input parameters (Line 1) The EMO algorithm runs for \(\mathrm{Iter}_{\max } \) iterations. In the local search phase, \(n\times \mathrm{Iter}_{\mathrm{local}} \) is the maximum number of locations \(z_{i,t} \), within a \(\delta \) distance of \(y_{i,t} \), over all dimensions i.

Initialize (Line 2) The points \(x_{i,t} \), \(t=1\), are selected uniformly in X, i.e., \(x_{i,1} \sim \mathrm{Unif}(\mathbf{X})\), \(i=1,2,\ldots ,N\), where Unif represents the uniform distribution over X. The objective function values \(f(x_{i,t} )\) are computed, and the best point is identified by Eq. (2) for minimization and by Eq. (3) for maximization.

$$\begin{aligned} x_t^B= & {} \hbox {arg }\mathop {\min }\limits _{x_{i,t} \in \mathbf{Sp}_t } \left\{ {f(x_{i,t} )} \right\} \end{aligned}$$
(2)
$$\begin{aligned} x_t^B= & {} \hbox {arg }\mathop {\max }\limits _{x_{i,t} \in \mathbf{Sp}_t } \left\{ {f(x_{i,t} )} \right\} \end{aligned}$$
(3)

From Eqs. (2) and (3), the best value \(x_t^B \) is selected from the current population at iteration t.
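To make the procedure concrete, the following Python sketch illustrates the initialization of Line 2 and the best-point selection of Eqs. (2)–(3). It is an illustrative reading of the algorithm, not the authors' MATLAB implementation; all function names are ours.

```python
import numpy as np

def initialize_population(N, lower, upper, rng=None):
    """Line 2: draw N particles uniformly from the box X = [lower, upper]^n."""
    rng = np.random.default_rng() if rng is None else rng
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return rng.uniform(lower, upper, size=(N, lower.size))

def best_particle(population, f, minimize=True):
    """Eq. (2) for minimization, Eq. (3) for maximization."""
    values = np.array([f(x) for x in population])
    idx = values.argmin() if minimize else values.argmax()
    return population[idx], values[idx]
```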

Calculate force (Line 4) In this step, a charge-like value \((q_{i,t} )\) is assigned to each point \((x_{i,t} )\). The charge \(q_{i,t} \) of \(x_{i,t} \) depends on \(f(x_{i,t} )\): points with better objective function values carry more charge than the others. The charges are computed according to Eq. (4).

$$\begin{aligned} q_{i,t} =\exp \left( {-n\frac{f(x_{i,t} )-f(x_t^B )}{\sum \nolimits _{j=1}^N {f(x_{j,t} )-f(x_t^B )} }} \right) \end{aligned}$$
(4)

From Eq. (4), n is the number of dimensions of the search space, exp is the exponential function, N is the number of candidate solutions and \(x_t^B \) is the best point of the population at iteration t.
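A direct transcription of Eq. (4) could look as follows (a sketch for a minimization problem; the guard for the degenerate case where all particles share the same fitness is our addition):

```python
import numpy as np

def compute_charges(fitness, n_dims):
    """Eq. (4): particles with better (lower) fitness receive larger charges."""
    fitness = np.asarray(fitness, dtype=float)
    best = fitness.min()
    denom = np.sum(fitness - best)
    if denom == 0.0:                      # all particles equally good
        return np.ones_like(fitness)
    return np.exp(-n_dims * (fitness - best) / denom)
```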

The force that exists between two points \(x_{i,t} \) and \(x_{j,t} \) of the population is obtained using Eq. (5).

$$\begin{aligned} F_{i,j}^t =\left\{ {{\begin{array}{lll} {\left( {x_{j,t} -x_{i,t} } \right) \frac{q_{i,t} \cdot q_{j,t} }{\left\| {x_{j,t} -x_{i,t} } \right\| ^{2}}}&{}\quad {\hbox {if}}&{} {f(x_{i,t} )>f(x_{j,t} )} \\ {\left( {x_{i,t} -x_{j,t} } \right) \frac{q_{i,t} \cdot q_{j,t} }{\left\| {x_{j,t} -x_{i,t} } \right\| ^{2}}}&{}\quad {\hbox {if}}&{} {f(x_{i,t} )\le f(x_{j,t} )} \\ \end{array} }} \right. \end{aligned}$$
(5)

The total force \(F_i^t \) acting on a specific particle \(x_{i,t} \) is then calculated as follows:

$$\begin{aligned} F_i^t =\sum _{j=1,j\ne i}^N {F_{i,j}^t } \end{aligned}$$
(6)
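The two cases of Eq. (5) and the summation of Eq. (6) can be sketched as below; the O(N²) double loop mirrors the definition rather than aiming for efficiency:

```python
import numpy as np

def total_forces(population, fitness, charges):
    """Eqs. (5)-(6): attraction toward better points, repulsion from worse ones."""
    N = len(population)
    forces = np.zeros_like(population)
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            diff = population[j] - population[i]
            dist2 = float(np.dot(diff, diff))
            if dist2 == 0.0:              # coincident particles exert no force
                continue
            magnitude = charges[i] * charges[j] / dist2
            if fitness[j] < fitness[i]:   # x_j is better: attraction
                forces[i] += diff * magnitude
            else:                         # x_j is worse: repulsion
                forces[i] -= diff * magnitude
    return forces
```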

Once the forces are computed, the next step (Line 5) is to move the selected point \(x_{i,t} \) along the direction of the vector \(F_i^t \) using the following equation:

$$\begin{aligned} y_{i,t} =x_{i,t} +\lambda \frac{F_i^t }{\left\| {F_i^t } \right\| }(\mathrm{RNG}),\quad i=1,2,\ldots ,N;\,i\ne B \end{aligned}$$
(7)

where \(\lambda \sim \mathrm{Unif}(0,1)\) is a uniformly distributed random number in the interval [0, 1], drawn for each coordinate of \(x_{i,t} \), and RNG denotes the allowed range of movement toward the lower or upper bound of the corresponding dimension. B refers to the index of the best element of the population, which is not moved.
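The movement of Eq. (7) admits the following sketch; computing RNG per dimension as the remaining feasible range toward the bound that the force points to is one plausible interpretation of the description above:

```python
import numpy as np

def move_particles(population, forces, fitness, lower, upper, rng=None):
    """Eq. (7): move every particle except the best along its normalized force."""
    rng = np.random.default_rng() if rng is None else rng
    moved = population.copy()
    best = int(np.argmin(fitness))        # index B: the best particle stays put
    for i in range(len(population)):
        if i == best:
            continue
        norm = np.linalg.norm(forces[i])
        if norm == 0.0:
            continue
        direction = forces[i] / norm
        lam = rng.uniform(0.0, 1.0, size=population.shape[1])
        feasible = np.where(direction > 0,
                            upper - population[i],
                            population[i] - lower)
        moved[i] = population[i] + lam * direction * feasible
    return np.clip(moved, lower, upper)
```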

Local search (Line 6) For each \(y_{i,t} \), a maximum of \(\mathrm{Iter}_{\mathrm{local}} \) points are generated in each coordinate direction within the \(\delta \) neighborhood of \(y_{i,t} \). The process of generating local points continues for each \(y_{i,t} \) until either a better point \(z_{i,t} \) is found or the limit of \(n\times \mathrm{Iter}_{\mathrm{local}} \) trials is reached.

Selection for the next iteration (Line 7) In this step, each \(x_{i,t+1} \in \mathbf{Sp}_{t+1} \) is selected between \(y_{i,t} \) and \(z_{i,t} \) using Eq. (1), and the best point is identified using Eq. (2) for minimization or Eq. (3) for maximization.
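The local search of Line 6 and the selection rule of Eq. (1) can be combined in a single routine, since accepting a trial point only when it improves the objective value makes the selection implicit. The sketch below assumes the neighborhood length is \(\delta \) times the widest side of the search box, which is one common reading:

```python
import numpy as np

def local_search(y, f, delta, iter_local, lower, upper, rng=None):
    """Line 6 plus Eq. (1): coordinate-wise random search around y."""
    rng = np.random.default_rng() if rng is None else rng
    z, fz = y.copy(), f(y)
    length = delta * float(np.max(np.asarray(upper) - np.asarray(lower)))
    for _ in range(iter_local):           # at most n * Iter_local trial points
        for d in range(len(y)):
            trial = z.copy()
            trial[d] += rng.uniform(-1.0, 1.0) * length
            trial = np.clip(trial, lower, upper)
            f_trial = f(trial)
            if f_trial < fz:              # Eq. (1): keep only improving points
                z, fz = trial, f_trial
    return z, fz
```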

The optimization process performed by EMO involves different operators that are able to manage local and global information of the search space. This process is more elaborate than that of other evolutionary approaches, which employ only one or two mathematical operations to modify the population.

3 Minimum cross entropy (MCET)

The cross entropy, also known as divergence, is an information-theoretic metric used to quantify the amount of information in a random process (Tang et al. 2011). In other words, it measures the distance between two probability distributions (Pal 1996). The cross entropy introduced by Kullback (1968) is generally defined as:

$$\begin{aligned} D\left( {\mathbf{B},\mathbf{C}} \right) =\sum _{k=1}^N {b_k \log \left( {\frac{b_k }{c_k }} \right) } \end{aligned}$$
(8)

where \(\mathbf{B}=\left\{ {b_1 ,b_2 ,\ldots ,b_N } \right\} \) and \(\mathbf{C}=\left\{ {c_1 ,c_2 ,\ldots ,c_N } \right\} \) are two probability distributions over the same set, and D is their cross entropy, proposed as an extension of the maximum entropy principle. The MCET can be interpreted as the expected change in information content when B is used instead of C (Li and Lee 1993). It is important to mention that a higher value of the cross entropy represents more uncertainty in the random process.
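As a toy illustration of Eq. (8), the next snippet evaluates D(B, C) for two small distributions (the epsilon guard against log(0) is our addition):

```python
import numpy as np

def cross_entropy(b, c, eps=1e-12):
    """Kullback's cross entropy D(B, C) of Eq. (8)."""
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    return float(np.sum(b * np.log((b + eps) / (c + eps))))

# Two similar distributions yield a small divergence, identical ones yield 0
print(cross_entropy([0.2, 0.3, 0.5], [0.25, 0.25, 0.5]))   # ~0.0101
print(cross_entropy([0.2, 0.3, 0.5], [0.2, 0.3, 0.5]))     # 0.0
```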

In image processing, and especially for image segmentation, a set of different thresholds (\(\mathbf{th}=\left[ {th_1 ,th_2 ,\ldots th_{nt} } \right] )\) is selected from the image histogram (h). The values contained in th must minimize the MCET that exists between the original image \(\mathbf{I}_{\mathrm{or}} \) and the thresholded image \(\mathbf{I}_{th} \). The simplest example of image segmentation uses one threshold (\(\mathbf{th}=\left[ {th_1 } \right] )\); once it is selected, \(\mathbf{I}_{th} \) is generated according to the following rule:

$$\begin{aligned} \mathbf{I}_{th} \left( {i,j} \right) =\left\{ {{\begin{array}{lll} {\mu \left( {1,th_1 } \right) }&{}\quad {\hbox {if}}&{} {\mathbf{I}_{\mathrm{or}} \left( {i,j} \right) <th_1 ,} \\ {\mu \left( {th_1 ,L+1} \right) }&{}\quad {\hbox {if}}&{} {\mathbf{I}_{\mathrm{or}} \left( {i,j} \right) \ge th_1 } \\ \end{array} }} \right. \end{aligned}$$
(9)

where L is the highest intensity value of the histogram (\(L = 255\) for an 8-bit gray scale image). The rule presented in Eq. (9) can easily be extended to more than one threshold. Considering the histogram (h) of the original image, the normalized value \(\mu \) for a specific range bounded by a and b is computed as:

$$\begin{aligned} {\mu \left( {a,b} \right) =\frac{\sum \nolimits _{i=a}^{b-1} {ih\left( i \right) } }{\sum \nolimits _{i=a}^{b-1} {h\left( i \right) } },}\quad {i=1,2,\ldots ,L} \end{aligned}$$
(10)
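In code, Eqs. (9) and (10) could be sketched as follows (a hypothetical illustration; the histogram is indexed from 1 to L, mirroring the paper's convention, so bin 0 is ignored):

```python
import numpy as np

def mu(hist, a, b, eps=1e-12):
    """Eq. (10): normalized mean intensity of histogram bins a .. b-1."""
    i = np.arange(a, b)
    return np.sum(i * hist[i]) / (np.sum(hist[i]) + eps)

def bilevel_threshold(image, hist, th1, L=255):
    """Eq. (9): each pixel is replaced by the mean intensity of its class."""
    return np.where(image < th1, mu(hist, 1, th1), mu(hist, th1, L + 1))
```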

To verify whether the original image was correctly segmented, it is necessary to apply the minimum cross entropy to digital images. The MCET is computed using the method proposed by Li and Lee (1993), which is defined for one threshold (\(\mathbf{th}=\left[ {th_1 } \right] )\) as follows:

$$\begin{aligned} D\left( {\mathbf{th}} \right)= & {} \sum _{i=1}^{th_1 -1} {ih\left( i \right) \log \left( {\frac{i}{\mu \left( {1,th_1 } \right) }} \right) }\nonumber \\&+\sum _{i=th_1 }^L {ih\left( i \right) \log \left( {\frac{i}{\mu \left( {th_1 ,L+1} \right) }} \right) } \end{aligned}$$
(11)

The aim is to find the best set of thresholds that minimizes the cross entropy. To achieve this, Eq. (12) is defined as the objective function.

$$\begin{aligned} \mathbf{th}_{\mathrm{opt}} =\arg \mathop {\min }\limits _{\mathbf{th}} \left( {D\left( {\mathbf{th}} \right) } \right) \end{aligned}$$
(12)

The computational complexity of obtaining an optimal single threshold (\(\mathbf{th}=\left[ {th_1 } \right] )\) is \(O\left( {nt\cdot L^{2}} \right) \), but the search becomes computationally expensive for multilevel thresholding approaches: for nt threshold values, the complexity is \(O\left( {nt\cdot L^{nt+1}} \right) \) (Tang et al. 2011).
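A direct evaluation of Eq. (11), together with the exhaustive bi-level search whose cost the complexity analysis above refers to, might look as follows (an illustrative sketch; the epsilon guards are ours):

```python
import numpy as np

def mcet_single(hist, th1, L=255, eps=1e-12):
    """Eq. (11): cross entropy between the image and its bi-level version."""
    i = np.arange(1, L + 1, dtype=float)
    h = np.asarray(hist[1:L + 1], dtype=float)
    lo, hi = slice(0, th1 - 1), slice(th1 - 1, L)   # i = 1..th1-1 and th1..L
    mu1 = np.sum(i[lo] * h[lo]) / (np.sum(h[lo]) + eps)
    mu2 = np.sum(i[hi] * h[hi]) / (np.sum(h[hi]) + eps)
    return (np.sum(i[lo] * h[lo] * np.log(i[lo] / (mu1 + eps)))
            + np.sum(i[hi] * h[hi] * np.log(i[hi] / (mu2 + eps))))

def exhaustive_bilevel(hist):
    """Feasible for nt = 1 only; the cost explodes as O(nt * L^(nt+1))."""
    return min(range(2, 255), key=lambda t: mcet_single(hist, t))
```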

3.1 Recursive MCET

As mentioned above, the MCET is computationally expensive for multilevel thresholding. In order to reduce the computational effort, a faster recursive programming technique has been proposed to obtain the optimal thresholds of a digital image (Yin 2007).

Under recursive programming, the MCET of Eq. (11) can be rewritten as (Yin 2007; Hammouche et al. 2008):

$$\begin{aligned} D\left( {\mathbf{th}} \right)= & {} \underbrace{\sum _{i=1}^L {ih\left( i \right) \log \left( i \right) } }_{\hbox {Image Entropy}}-\underbrace{\sum _{i=1}^{th_1 -1} {ih\left( i \right) \log \left( {\mu \left( {1,th_1 } \right) } \right) } }_{\hbox {Entropy from 1 to }th_1 -1}\nonumber \\&-\underbrace{\sum _{i=th_1 }^L {ih\left( i \right) \log \left( {\mu \left( {th_1 ,L+1} \right) } \right) } }_{\hbox {Entropy from }th_1 \hbox { to }L+1} \end{aligned}$$
(13)

Eq. (13) is defined for a single threshold th and two classes. The first term is constant for a given digital image, and the remaining terms depend directly on the selected thresholds. Considering these facts, the objective function can be rewritten as:

$$\begin{aligned} \varphi \left( {\mathbf{th}} \right)= & {} -\sum _{i=1}^{th_1 -1} {ih\left( i \right) \log \left( {\mu \left( {1,th_1 } \right) } \right) }\nonumber \\&-\sum _{i=th_1 }^L {ih\left( i \right) \log \left( {\mu \left( {th_1 ,L+1} \right) } \right) } \nonumber \\= & {} -\left( {\sum _{i=1}^{th_1 -1} {ih\left( i \right) } } \right) \log \left( {\frac{\sum \nolimits _{i=1}^{th_1 -1} {ih\left( i \right) } }{\sum \nolimits _{i=1}^{th_1 -1} {h\left( i \right) } }} \right) \nonumber \\&-\,\left( {\sum _{i=th_1 }^L {ih\left( i \right) } } \right) \log \left( {\frac{\sum \nolimits _{i=th_1 }^L {ih\left( i \right) } }{\sum \nolimits _{i=th_1 }^L {h\left( i \right) } }} \right) \nonumber \\= & {} -m^{1}\left( {1,th_1 } \right) \log \left( {\frac{m^{1}\left( {1,th_1 } \right) }{m^{0}\left( {1,th_1 } \right) }} \right) \nonumber \\&-\,m^{1}\left( {th_1 ,L+1} \right) \log \left( {\frac{m^{1}\left( {th_1 ,L+1} \right) }{m^{0}\left( {th_1 ,L+1} \right) }} \right) \end{aligned}$$
(14)

where, for a partial range of the image histogram, the zero-order moment \(m^{0}\left( {a,b} \right) =\sum _{i=a}^{b-1} {h(i)} \) and the first-order moment \(m^{1}\left( {a,b} \right) =\sum _{i=a}^{b-1} {ih(i)} \) are computed. Eq. (14) can be extended to multilevel thresholding by considering a set of thresholds denoted by \(\mathbf{th}=\left[ {th_1 ,th_2 ,\ldots th_{nt} } \right] \), where nt is the number of thresholds to be found. For convenience, two additional dummy thresholds \(th_0 \equiv 0\) and \(th_{nt+1} \equiv L+1\) are added; the ordering of these values, \(th_0<th_1<\cdots<th_{nt} <th_{nt+1} \), must be preserved. Recursive programming is then applied to formulate the multilevel objective function based on the minimum cross entropy, which is defined as:

$$\begin{aligned} \varphi \left( {\mathbf{th}} \right)= & {} -\sum _{i=1}^{nt+1} {m^{1}\left( {th_{i-1} ,th_i } \right) \log \left( {\frac{m^{1}\left( {th_{i-1} ,th_i } \right) }{m^{0}\left( {th_{i-1} ,th_i } \right) }} \right) } ,\nonumber \\ \mathbf{th}= & {} \left[ {th_1 ,th_2 ,\ldots ,th_{nt} } \right] \end{aligned}$$
(15)

With the use of Eq. (15) the complexity is reduced from \(O\left( {nt\cdot L^{nt+1}} \right) \) to \(O\left( {nt\cdot L^{nt}} \right) \) (Tang et al. 2011). However, it is still computationally expensive (Sarkar et al. 2015).
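A possible implementation of the moment-based objective of Eqs. (14)–(15) is sketched below. Starting the first class at bin 1 matches Eq. (14) and only drops the i = 0 term, which is zero anyway; empty classes produced by duplicated thresholds are skipped:

```python
import numpy as np

def mcet_objective(th, hist, L=255):
    """Eq. (15): moment form of the MCET objective to be minimized."""
    bounds = np.concatenate(([1], np.sort(th).astype(int), [L + 1]))
    i = np.arange(L + 1, dtype=float)
    phi = 0.0
    for a, b in zip(bounds[:-1], bounds[1:]):       # the nt + 1 classes
        m0 = np.sum(hist[a:b])                      # zero-order moment
        m1 = np.sum(i[a:b] * hist[a:b])             # first-order moment
        if m0 > 0 and m1 > 0:
            phi -= m1 * np.log(m1 / m0)
    return phi
```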

4 Multilevel thresholding using EMO and minimum cross entropy (MCET)

The computation of the MCET for image thresholding is a complex task even when it is redefined using recursive programming. The main problem is to obtain the best thresholds that minimize the MCET, and an exhaustive search requires a high computational effort. This paper proposes the use of the electromagnetism-like optimization (EMO) algorithm for multilevel segmentation based on the MCET. The approach uses EMO to find the optimal th vectors by minimizing the complex objective function defined by the MCET. Compared with other evolutionary methods, EMO exhibits interesting search capabilities such as fast convergence while keeping its ability to avoid local minima in highly multimodal environments (De Jong 1988; Dorigo et al. 1996; De Castro and Von Zuben 2002). In the related literature, there are studies (Birbil et al. 2004; Wu et al. 2004; Rocha and Fernandes 2009a, b) demonstrating that EMO provides one of the best trade-offs between optimization performance and demand for function evaluations. The results show that this algorithm substantially reduces the number of function evaluations while preserving its good search capabilities. Although EMO is a good alternative for global optimization, computing the elements of the new population involves the several operations described in Eqs. (1)–(7). This section describes the implementation of the proposed approach based on EMO and MCET.

4.1 Particle representation

Each particle encodes nt decision variables for the optimization algorithm, each representing a different threshold value used for the segmentation. Therefore, the entire population is represented as:

$$\begin{aligned} \mathbf{Sp}_t =[\mathbf{th}_1 ,\mathbf{th}_2 ,\ldots ,\mathbf{th}_N ],\hbox { }\mathbf{th}_i =\left[ {th_1 ,th_2 ,\ldots ,th_{nt} } \right] ^{T} \end{aligned}$$
(16)

where t represents the iteration number, T refers to the transpose operator, N is the size of the population and nt is the number of thresholds encoded by each member \(\mathbf{th}_i \) (\(i = 1,2,{\ldots },N\)) of the population.
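For instance, a population matching Eq. (16) can be created as an N × nt array of sorted integer thresholds (the values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
N, nt, L = 25, 4, 255
# Each row is one particle th_i; sorting keeps th_1 < th_2 < ... < th_nt
Sp = np.sort(rng.integers(1, L, size=(N, nt)), axis=1)
print(Sp.shape)   # (25, 4)
```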

4.2 EMO implementation

The proposed segmentation algorithm has been implemented considering the minimum cross entropy as the objective function (Eq. 15). The implementation of the EMO algorithm can be summarized in the following steps, with a sketch of the corresponding driver given after the list:

Step 1: Read the image I and store it as the gray scale image \(I_{\mathrm{Gr}}\).

Step 2: Obtain the histogram \(h^{\mathrm{Gr}}\) of \(I_{\mathrm{Gr}}\).

Step 3: Initialize the EMO parameters: \(\mathrm{Iter}_{\max }\), \(\mathrm{Iter}_\mathrm{{local}}\), \(\delta \), k and N.

Step 4: Initialize a population \(\mathbf{Sp}_t\) of N random particles with nt dimensions.

Step 5: Evaluate \(\mathbf{Sp}_t\) in the recursive objective function \(\varphi (\mathbf{Sp}_t )\), Eq. (15).

Step 6: Compute the charge of each particle using Eq. (4), and compute the total force vector with Eqs. (5) and (6).

Step 7: Move the entire population \(\mathbf{Sp}_t\) along the total force vector using Eq. (7).

Step 8: Apply the local search to the moved population and select the best elements of this search based on their objective function values.

Step 9: Increase the index t by 1. If \(t\ge \mathrm{Iter}_{\max } \) or another stop criterion is satisfied, finish the iteration process and jump to Step 10; otherwise, jump to Step 5.

Step 10: Select the particle with the best objective function value, \(x_t^{B^{c}} \) (Eqs. (2) and (15)).

Step 11: Apply the threshold values contained in \(x_t^{B^{c}} \) to the image \(I_{\mathrm{Gr}}\).
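Putting the pieces together, the following driver sketches Steps 3–10, reusing the helper functions sketched in Sects. 2 and 3 (initialize_population, compute_charges, total_forces, move_particles, local_search and mcet_objective, all illustrative names of ours). The parameter values are placeholders, not the settings of Table 1:

```python
import numpy as np

def emo_mcet(hist, nt, N=25, iter_max=200, iter_local=10,
             delta=0.025, L=255, seed=0):
    """Steps 3-10 of the proposed EMO-MCET segmentation scheme (sketch)."""
    rng = np.random.default_rng(seed)
    lower, upper = np.full(nt, 1.0), np.full(nt, float(L))
    f = lambda x: mcet_objective(x, hist, L)              # Eq. (15), Step 5
    Sp = initialize_population(N, lower, upper, rng)      # Step 4
    for _ in range(iter_max):                             # Step 9 loop
        fit = np.array([f(x) for x in Sp])
        charges = compute_charges(fit, nt)                # Step 6, Eq. (4)
        forces = total_forces(Sp, fit, charges)           # Eqs. (5)-(6)
        Y = move_particles(Sp, forces, fit, lower, upper, rng)   # Step 7
        Sp = np.array([local_search(y, f, delta, iter_local,
                                    lower, upper, rng)[0]
                       for y in Y])                       # Step 8, Eq. (1)
    fit = np.array([f(x) for x in Sp])
    return np.sort(Sp[fit.argmin()]).astype(int)          # Step 10
```

The returned vector can then be applied to \(I_{\mathrm{Gr}}\) as described in Sect. 4.3 (Step 11).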

4.3 Multilevel thresholding

Once the EMO algorithm finds the best threshold values, i.e., those that minimize the objective function, the pixels of the image are segmented using such values. In this paper, we use the following rule for two thresholds and three classes:

$$\begin{aligned} I_s \left( {r,c} \right) =\left\{ {{\begin{array}{lll} {I_{\mathrm{Gr}} \left( {r,c} \right) }&{}\quad {\hbox {if}}&{} {I_{\mathrm{Gr}} \left( {r,c} \right) \le th_1 } \\ {th_1 }&{}\quad {\hbox {if}}&{} {th_1 <I_{\mathrm{Gr}} \left( {r,c} \right) \le th_2 } \\ {I_{\mathrm{Gr}} \left( {r,c} \right) }&{}\quad {\hbox {if}}&{} {I_{\mathrm{Gr}} \left( {r,c} \right) >th_2 } \\ \end{array} }} \right. \end{aligned}$$
(17)

where \(I_s \left( {r,c} \right) \) is the gray value of the segmented image and \(I_{\mathrm{Gr}} \left( {r,c} \right) \) is the gray value of the original image, both at pixel position \((r,c)\). \(th_1 \) and \(th_2 \) are the threshold values obtained by the EMO approach. Equation (17) can easily be extended to more than two thresholds using Eq. (18).

$$\begin{aligned} I_s \left( {r,c} \right) =\left\{ {{\begin{array}{lll} {I_{\mathrm{Gr}} \left( {r,c} \right) }&{}\quad {\hbox {if}}&{} {I_{\mathrm{Gr}} \left( {r,c} \right) \le th_1 } \\ {th_{i-1} }&{}\quad {\hbox {if}}&{} {th_{i-1} <I_{\mathrm{Gr}} \left( {r,c} \right) \le th_i } \\ {I_{\mathrm{Gr}} \left( {r,c} \right) }&{}\quad {\hbox {if}}&{} {I_{\mathrm{Gr}} \left( {r,c} \right) >th_{nt} } \\ \end{array} }} \right. ,\hbox { }i=2,3,\ldots ,nt \end{aligned}$$
(18)
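A compact implementation of the rule in Eq. (18) is sketched below; pixels in the outer classes keep their original gray level, while the inner classes are flattened to the lower threshold of their interval:

```python
import numpy as np

def apply_thresholds(image, thresholds):
    """Eq. (18): map every pixel between th_{i-1} and th_i to th_{i-1}."""
    th = np.sort(np.asarray(thresholds))
    out = image.copy()
    for i in range(1, len(th)):
        mask = (image > th[i - 1]) & (image <= th[i])
        out[mask] = th[i - 1]
    return out

# Example with nt = 2 (Eq. (17)): two thresholds produce three classes
img = np.array([[10, 80], [150, 240]])
print(apply_thresholds(img, [60, 170]))   # [[10, 60], [60, 240]]
```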

5 Experimental results

In this paper, a set of benchmark images is used to test the proposed approach. This set contains 11 images with different complexity levels; all the images have the same size (\(512\times 512\) pixels), and they are stored in JPEG format. Some of these images are widely used in the image processing literature to test different methods (Lena, Cameraman, Hunter, Baboon, etc.) (Agrawal et al. 2013).

The proposed multilevel thresholding algorithm based on EMO is compared with standard versions of differential evolution (DE) (Storn and Price 1997), particle swarm optimization (PSO) (Kennedy and Eberhart 1995), harmony search (HS) (Loganathan et al. 2001), the social spider algorithm (SSA) (Yu and Li 2015) and the artificial bee colony (ABC) (Karaboga and Basturk 2007). All the methods were programmed and tested in MATLAB 8.3 on a Xeon E5-2620 CPU @ 2.4 GHz with 16 GB of RAM. Since the six algorithms are stochastic, statistical metrics are employed to verify their efficiency. For experimental purposes, all the algorithms are executed 35 times for each image. In accordance with the related literature, the number of thresholds is set to \(nt= 2,3,4,5\) (Horng 2010; Horng and Liou 2011). The stop criterion for each experiment is set to 50 iterations; once each test is performed, the standard deviation (STD) is computed using Eq. (19), where \(\sigma _i \) is the best fitness value of run i, \(\mu \) is the mean of those values and Ru is the total number of runs. The STD represents the stability of the tested method: as the STD increases, the algorithm becomes more unstable (Ghamisi et al. 2012).

$$\begin{aligned} \mathrm{STD}=\sqrt{\sum _{i=1}^{Ru} {\frac{(\sigma _i -\mu )^{2}}{Ru}} } \end{aligned}$$
(19)

Once the best thresholds are obtained, the segmented images are generated using Eq. (18). However, it is necessary to verify the quality of the results. The peak signal-to-noise ratio (PSNR) is used to compare the similarity of the segmented image against the reference (original) image based on the root mean square error (RMSE) of each pixel (Il-Seok Oh et al. 2004; Horng 2011; Akay 2013; Agrawal et al. 2013). PSNR and RMSE are defined as:

$$\begin{aligned} \begin{array}{l} \mathrm{PSNR}=20\log _{10} \left( {\frac{255}{\mathrm{RMSE}}} \right) ,\hbox { }(\hbox {dB}) \\ \mathrm{RMSE}=\sqrt{\frac{\sum \nolimits _{i=1}^{ro} {\sum \nolimits _{j=1}^{co} {\left( {I_{\mathrm{Gr}} \left( {i,j} \right) -I_{th} \left( {i,j} \right) } \right) } } }{ro\times co}} \\ \end{array} \end{aligned}$$
(20)

where \(I_{\mathrm{Gr}} \) is the original image, \(I_{th} \) is the segmented image obtained after applying the best th, and ro and co are the total numbers of rows and columns of the image, respectively. Another interesting metric is the structural similarity index (SSIM), which is used to compare the structures of the original image against the thresholded image (Wang et al. 2004). The SSIM is defined in Eq. (21), and a higher value indicates a better segmentation performance.

$$\begin{aligned}&\mathrm{SSIM}\left( {I_{\mathrm{Gr}} ,I_{th} } \right) =\frac{\left( {2\mu _{I_{\mathrm{Gr}} } \mu _{I_{th} } +C1} \right) \left( {2\sigma _{I_{\mathrm{Gr}} I_{th} } +C2} \right) }{\left( {\mu _{I_{\mathrm{Gr}} } ^{2}+\mu _{I_{th} } ^{2}+C1} \right) \left( {\sigma _{I_{\mathrm{Gr}} } ^{2}+\sigma _{I_{th} } ^{2}+C2} \right) } \nonumber \\&\sigma _{I_{\mathrm{Gr}} I_{th} } =\frac{1}{N-1}\sum _{i=1}^N {\left( {I_{Gr_i } -\mu _{I_{\mathrm{Gr}} } } \right) \left( {I_{th_i } -\mu _{I_{th} } } \right) }\nonumber \\ \end{aligned}$$
(21)

From Eq. (21), \(\mu _{I_{\mathrm{Gr}} } \) and \(\mu _{I_{th} } \) are the mean values of the original and the segmented image, respectively, and \(\sigma _{I_{\mathrm{Gr}} } \) and \(\sigma _{I_{th} } \) are the corresponding standard deviations. C1 and C2 are constants used to avoid instability when \(\mu _{I_{\mathrm{Gr}} } ^{2}+\mu _{I_{th} } ^{2}\approx 0\); following Agrawal et al. (2013), both values are set experimentally to \(C1=C2=0.065\). Another method used to measure the quality of the segmented image is the feature similarity index (FSIM) (Zhang et al. 2011). FSIM calculates the similarity between two images, in this case the original gray scale image and the segmented one. As with PSNR and SSIM, a higher value is interpreted as a better performance of the thresholding method. The FSIM is defined as:

$$\begin{aligned} \mathrm{FSIM}=\frac{\sum \nolimits _{w\in \Omega } {S_L \left( w \right) \mathrm{PC}_m \left( w \right) } }{\sum \nolimits _{w\in \Omega } {\mathrm{PC}_m \left( w \right) } } \end{aligned}$$
(22)

where \(\Omega \) represents the entire domain of the image. The values of \(S_L \) are computed as:

$$\begin{aligned} S_L \left( w \right)= & {} S_{\mathrm{PC}} \left( w \right) S_G \left( w \right) \nonumber \\ S_{\mathrm{PC}} \left( w \right)= & {} \frac{2\mathrm{PC}_1 \left( w \right) \mathrm{PC}_2 \left( w \right) +T_1 }{\mathrm{PC}_1 ^{2}\left( w \right) +\mathrm{PC}_2 ^{2}\left( w \right) +T_1 } \nonumber \\ S_G \left( w \right)= & {} \frac{2G_1 \left( w \right) G_2 \left( w \right) +T_2 }{G_1 ^{2}\left( w \right) +G_2 ^{2}\left( w \right) +T_2 } \end{aligned}$$
(23)

From Eq. (23), G is the gradient magnitude (GM) of a digital image and is defined as:

$$\begin{aligned} G=\sqrt{G_x ^{2}+G_y ^{2}} \end{aligned}$$
(24)

In Eq. (22), PC is the phase congruency, defined as follows:

$$\begin{aligned} \mathrm{PC}\left( w \right) =\frac{E\left( w \right) }{\left( {\varepsilon +\sum \limits _n {A_n \left( w \right) } } \right) } \end{aligned}$$
(25)

where \(A_n \left( w \right) \) is the local amplitude on scale n and \(E\left( w \right) \) is the magnitude of the response vector at w on scale n. \(\varepsilon \) is a small positive number, and \(\mathrm{PC}_m \left( w \right) =\max \left( {\mathrm{PC}_1 \left( w \right) ,\mathrm{PC}_2 \left( w \right) } \right) \).
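For reference, PSNR (Eq. 20) and the single-window SSIM of Eq. (21) can be sketched as below. This global SSIM follows the formula as printed, not the windowed version of Wang et al. (2004); FSIM is omitted here because it requires a phase congruency implementation:

```python
import numpy as np

def psnr(original, segmented):
    """Eq. (20): peak signal-to-noise ratio in dB."""
    diff = original.astype(float) - segmented.astype(float)
    rmse = np.sqrt(np.mean(diff ** 2))
    return 20.0 * np.log10(255.0 / max(rmse, 1e-12))

def ssim(original, segmented, C1=0.065, C2=0.065):
    """Eq. (21): global structural similarity over the whole image."""
    x = original.astype(float).ravel()
    y = segmented.astype(float).ravel()
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(ddof=1), y.std(ddof=1)
    sxy = np.sum((x - mx) * (y - my)) / (x.size - 1)
    return (((2 * mx * my + C1) * (2 * sxy + C2))
            / ((mx ** 2 + my ** 2 + C1) * (sx ** 2 + sy ** 2 + C2)))
```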

The parameters of the EMO algorithm are set according to the criteria presented in Birbil and Fang (2003) and reported in Table 1; they are used for all the tests, and only \(\mathrm{Iter}_{\max } \) is modified for the statistical analysis. The values in Table 1 were selected specifically for the MCET optimization problem; they can be tuned depending on the problem to be solved.

Table 1 EMO parameters for the MCET
Table 2 Results after applying the EMO with MCET to the set of benchmark images

5.1 Minimum cross entropy results

This section analyzes the thresholding results obtained by the EMO algorithm using the MCET (Eq. 15) as objective function (Li and Lee 1993). The proposed approach is applied to the complete set of benchmark images, and the segmentation results are presented in Table 2. Such values correspond to the best thresholds found by the EMO algorithm considering four different segmentation levels, \(nt= 2,3,4,5\). Table 2 also shows the values of the statistical metrics (PSNR, STD, SSIM and FSIM) obtained for each image at each level. From Table 2, it can be seen that the values obtained by EMO are stable (STD column) even when the number of thresholds increases. On the other hand, the quality of the segmented output image obtained by EMO is measured with the PSNR, SSIM and FSIM. The corresponding columns in Table 2 provide evidence of the segmentation capabilities of the proposed approach according to the definition of each metric.

From the set of eleven benchmark images, five are selected on account of the complexity of their histograms, and these are used to show the segmentation results graphically. Figure 1 presents the selected images and their respective histograms, which possess irregular distributions (particularly Fig. 1j). Under such circumstances, classical methods face great difficulties in finding the best threshold values.

Fig. 1

a Cameraman, c Lena, e Baboon, g Hunter and i Butterfly: the selected benchmark images. b, d, f, h, j histograms of the corresponding images

The five selected images are processed using the proposed algorithm based on EMO and MCET. The results presented in Fig. 2 consider four different numbers of thresholds, \(nt=2,3,4,5\). Figure 2 also shows the evolution of the objective function during one execution. From the results, it is possible to appreciate that EMO–MCET converges (stabilizes) within approximately the first 100 iterations. The segmented images provide evidence that the outcome is better with \(nt=4\) and \(nt=5\); however, if the segmentation task does not need to be extremely accurate, \(nt=3\) can be selected.

Fig. 2

Results after applying the EMO using minimum cross entropy over the selected benchmark images

5.2 Comparisons

In order to demonstrate that the combination of EMO and MCET is an interesting alternative for MT, the proposed algorithm is compared with five state-of-the-art implementations: differential evolution (DE) (Storn and Price 1997; Sarkar et al. 2015), particle swarm optimization (PSO) (Kennedy and Eberhart 1995; Yin 2007), harmony search (HS) (Loganathan et al. 2001), the social spider algorithm (SSA) (Yu and Li 2015) and the artificial bee colony (ABC) (Karaboga and Basturk 2007), all of which use the minimum cross entropy as objective function. The six algorithms were run 35 times over each selected image. The images used for this test are the same as those selected in Sect. 5.1 (Cameraman, Lena, Baboon, Hunter and Butterfly). For each image, the PSNR, STD, SSIM and FSIM values and the mean of the objective function are computed.

The comparison results of the six methods are divided into two groups: in the first group, Table 3 shows the STD and mean values of the MCET fitness function; in the second group, Tables 4 and 5 present the values of the quality metrics obtained after applying the thresholds to the test images. Such values provide evidence that the segmented images obtained using the thresholds computed with the MCET have better quality, specifically those computed by EMO.

The fitness (objective function) values of the five methods are statistically compared with those of EMO using a nonparametric significance test known as Wilcoxon's rank test (García et al. 2009), conducted with 35 independent samples. This test assesses the differences between the results of two related methods. The analysis is performed considering a 5% significance level over the best fitness (MCET) values corresponding to the five threshold points. Table 6 reports the p values produced by Wilcoxon's test for the pair-wise comparison of the fitness function between the groups EMO versus DE, EMO versus PSO, EMO versus HS, EMO versus SSA and EMO versus ABC. The null hypothesis assumes that there is no difference between the values of the two algorithms tested; the alternative hypothesis considers that a difference exists. All p values reported in Table 6 are less than 0.05 (5% significance level), which is strong evidence against the null hypothesis, indicating that the EMO fitness values are statistically better and that this result has not occurred by chance.
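Such a pair-wise comparison can be reproduced with any scientific computing environment; a minimal Python sketch using scipy is shown below, with placeholder data standing in for the 35 best fitness values per algorithm (these numbers are illustrative, not the paper's results):

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
emo_fitness = rng.normal(10.00, 0.01, 35)   # placeholder samples
de_fitness = rng.normal(10.10, 0.05, 35)    # placeholder samples

stat, p_value = wilcoxon(emo_fitness, de_fitness)
print(f"p = {p_value:.3g}")   # p < 0.05 -> reject the null hypothesis
```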

Table 3 Comparison of the STD and mean values of the EMO, DE, PSO, HS, SSA and ABC applied over the selected test images using minimum cross entropy
Table 4 Comparison of the PSNR, SSIM and FSIM values of the EMO, DE and PSO applied over the selected test images using the MCET method
Table 5 Comparison of the PSNR, SSIM and FSIM values of the HS, SSA and ABC applied over the selected test images using the MCET method
Table 6 Wilcoxon p values of the compared algorithms EMO versus DE, EMO versus PSO, EMO versus HS, EMO versus SSA and EMO versus ABC
Table 7 Mean of the computational time of EMO, DE, PSO, HS, SSA and ABC for the problem of MTH using the minimum cross entropy
Fig. 3

Fitness comparison of DE (red line), PSO (cyan line), HS (green line), SSA (magenta line), ABC (black line) and EMO (blue line) applied for multilevel thresholding using MCET (color figure online)

On the other hand, Table 7 presents a comparison between EMO and the five selected algorithms regarding the computational time, which is used to evaluate the computational effort required by each algorithm. For this comparison, 35 independent experiments are performed with each algorithm; the number of iterations is set to 1000, and each algorithm runs over all the images. The time is stored after each experiment for a single image, and at the end the mean is computed. This test is performed to provide evidence of the speed of EMO. Since the six algorithms are complex stochastic processes, their complexity is difficult to analyze analytically, and the computational time is the best available tool to support the results.

From Table 7, it is possible to see that the EMO-based approach needs, in most cases, less time to find the best solution. When the number of thresholds increases (for example, with \(k=5)\), EMO requires more time than DE, PSO or SSA; however, the time is consistent across experiments. On the other hand, the HS approach requires more time since it uses a single candidate solution throughout the iterative process. Meanwhile, the operators of ABC are completely iterative, and it also uses two populations to find the best solutions. The SSA algorithm has some initial configurations that affect its efficacy; the values of such configurations are difficult to set, and they also affect the computational effort. Based on these facts, the computational time of ABC is higher in comparison with EMO, DE or PSO, but close to the values of HS and SSA.

In Fig. 3, the fitness values obtained for the five selected images are presented. For this experiment, each algorithm runs for 1000 iterations, and the best values are stored at the end of each iteration. For a better appreciation of the convergence, Fig. 3 includes a zoomed view of the objective function values that considers only the first 25 iterations of the six algorithms. This number is selected because we noticed experimentally that the six algorithms decrease the MCET value mainly within the first 25 iterations. From Fig. 3, it is possible to deduce that the proposed MT algorithm based on EMO requires fewer iterations than similar approaches to obtain the best thresholds; in this context, the stop criterion could also be relaxed. Moreover, the results presented in Fig. 3 for HS confirm its low convergence speed, which occurs because it works with only one candidate solution at a time. Finally, it is possible to establish that the approach based on EMO and MCET reaches the minimum cross entropy values in fewer iterations and obtains more accurate solutions than DE, PSO, HS, SSA and ABC.

6 Conclusions

This paper presents a new algorithm for multilevel segmentation based on the electromagnetism-like algorithm (EMO) that reduces the number of iterations and function evaluations. The proposed approach considers the segmentation process as an optimization problem in which EMO is employed to find the optimal threshold points that minimize the cross entropy (MCET). The combination of MCET and EMO focuses the search directly on the best set of threshold values. The method considers only the histogram and the number of thresholds as input. In contrast, similar approaches such as clustering techniques need additional initialization conditions, leading to more complex schemes.

The EMO-based algorithm substantially reduces the number of function evaluations while preserving the good search capabilities of an evolutionary method. The presented technique is able to find the best values even for large and complex images. In our approach, the algorithm uses particles to encode sets of candidate threshold points, and the MCET is the objective function with which the quality of all the candidate solutions is evaluated. The particles are evolved using the force, charge, movement and local search operators of EMO. Once the optimal thresholds are obtained, they are used to segment the image. In order to evaluate the quality of the segmented images, the use of the PSNR, STD, SSIM and FSIM metrics is proposed; such metrics quantify the agreement between the original and the segmented image.

The experimental study compares the proposed approach with five related approaches: differential evolution (DE), the particle swarm optimization algorithm (PSO), harmony search (HS), the social spider algorithm (SSA) and the widely used artificial bee colony (ABC). The efficiency of the algorithms is evaluated in terms of the PSNR, STD, SSIM, FSIM and fitness values. The comparisons provide evidence of the accuracy, convergence and robustness of the proposed approach. The high convergence rate of EMO is evident in the reported comparisons; likewise, EMO outperforms the other methods in most of the experiments, providing high scores on the evaluated metrics. It is possible to establish that the approach based on EMO and MCET reaches the minimum cross entropy values in fewer iterations and obtains more accurate solutions than DE, PSO, HS, SSA and ABC. Although the results offer evidence that the EMO method can yield good results on complicated images, the aim of our paper is not to devise a multilevel thresholding algorithm that beats all currently available methods, but to show that electromagnetism systems can be considered an attractive alternative for this purpose.

In the spirit of contributing to future work, the development of thresholding techniques might take three equally important paths. The first is the current trend in which thresholding approaches benefit from new optimization strategies, especially when such methods improve the overall state of the art. A second path opens if the thresholding problem is solved with a multi-objective methodology; the incorporation of multi-objective optimizers can lead to simultaneous improvements of several thresholding criteria, a topic that should be addressed to further improve the quality of segmented images. Third, the effectiveness of thresholding can be applied directly to critical image processing areas such as medical or satellite image processing. Any of the aforementioned paths will contribute significantly to the growth of the image processing community.