1 Introduction

A wireless sensor network (WSN) consists of spatially distributed autonomous devices known as sensors. Each senor is capable of performing some basic processing, gathering sensory information and communicating with other connected sensor nodes in the network [1, 2].

Cluster topology is a WSN topology architecture where sensor nodes are grouped into various sets called clusters. Each cluster has a cluster head (CH) with one or more cluster member (CM) sensor nodes. This topology uses multihop routing to transmit data from the source sensors to the base station (BS). Clustering approach reduces communication overheads and due to effective allocations of resource as a result, decrease the overall energy consumption and reduce the interference among sensor nodes [3]. Clustering approach reduces the size of collected data by keeping only significant information by applying data aggregation techniques at CHs [4]. The aggregated data from CHs will be transmitted to the BS. Figure 1 shows the functionality of a cluster based WSN.

Fig. 1
figure 1

A heterogeneous WSN model with three types of sensor nodes

Information entropy theory such as administration entropy, environment entropy, and economy entropy [5] have been employed in many discrete areas. Entropy in information theory uses the discrete probability distribution to represent the amount of uncertainty. Let X be a discrete random variable with alphabet X and probability mass function \(p(x) = Pr\{X = x\}, x \in X\). The entropy H(X) of a discrete random variable X is defined by [6] as follows:

$$\begin{aligned} H(X)= -\sum _{x \in X} p(x)log_2 p(x) \end{aligned}$$
(1)

The minimum entropy is 0 and it occurs when one of the probabilities is 1 and the rest are 0s, while the entropy achieves the maximum value \(( H_{max}=log_2 (n))\) when all the probabilities have equal values of \(\frac{1}{n}\), where n is the number of outcomes.

Entropy indicates the valuable information produced by the data, as a result it can be used to determine the weights. The entropy value becomes smaller when the analyzed objects have fairly big difference between each others on a given specific criterion. However, the weight of the criterion should be smaller when the difference between objects is smaller (larger entropy value). Consequently, the entropy coefficient method is a target empowering method to determine the weight by calculating the entropy weights of each criterion based on determining variation degree with respect to every criterion value [5, 7,8,9].

Using the local information of the sensor nodes such as residual energy, a clustering algorithm can be adopted in terms of entropy as the election criterion. Such a new algorithm is energy efficient, moreover, we consider selection of CH node as a decision problem based on multiple criteria such as residual energy, sensor node density, and distance from the BS. Hence, we have multiple alternatives (set of sensor nodes) where each alternative consists of multiple criteria (i.e., sensor node’s information) and we need to make selection decision. For that, one of multi-criteria decision analysis (MCDA) methods is used to solve the decision problem with multiple criteria. In our proposed algorithm, the weighted product model (WPM) [10] is utilized for addressing the decision problem and the entropy weight coefficient [5, 7, 9, 11] is used to assess the weight of different criteria. Our simulation results reveal that the proposed strategy can manage power consumption better than existing algorithms and achieves the desired results for WSNs. The main contribution of this paper can be summarized as follows:

  1. 1.

    A new dynamic, distributive, self-organizing, and energy-efficient clustering algorithm that uses sensor nodes’ local information (e.g., residual energy) in terms of entropy as the election criterion is proposed. The new algorithm does the following:

    • Organizes the WSN into two-levels of the hierarchical network and consider three-levels of energy heterogeneity of sensor nodes.

    • Considers the selection of CH as a decision problem with multiple criteria such as residual energy, node density, and distance from the BS.

  2. 2.

    We provide a solution to the decision problem through Weighted Product Model (WPM) where sensor nodes represent the multiple alternatives and each alternative consists of multiple criteria (sensor nodes information).

  3. 3.

    We introduce the concept of entropy weight coefficient to assess the weight of different criteria which reflects the current status of node information in the network.

  4. 4.

    Provide extensive simulation results to prove that the proposed strategy can manage power consumption and substantially outperforms the baseline algorithms in terms of energy consumption, stability period, and the network lifetime.

The remainder of this paper is structured as follows: the discussion of the related research is given in Sect. 2. The considered models description is presented in Sect. 3. Section 4 describes the proposed algorithm. In Sect. 5, the performance analysis and results are presented. Evaluations indicate that the proposed approach efficiently solves the problem and exceeds other existing algorithms. The conclusion of the proposed work is given in Sect. 6. Used notations through the paper are given in Table 1.

Table 1 Table of notations

2 Related research

Several recent works address the problem of clustering in a WSN. LEACH [12], was the first most routing protocol that used clustering concept in a WSN. LEACH collects data in rounds. Each round is split into two main phases: setup phase where a specific ratio of sensor nodes are selected as CHs according to their probability in particular round. Each non-CH joins the nearest CH as a cluster member (CM). Then in the second phase, each CH determines a schedule for its CM to transmit its data without any collision. In the steady state phase, each CM transmits its data in assigned time slot.

LEACH has many variant versions such as [3, 13,14,15,16,17,18,19,20,21,22,23,24,25,26,27]. In [13], a centralized cluster formation algorithm (LEACH-C) is designed such that there are a certain number of clusters, k, during each round. In [14], to enhance LEACH-C and LEACH, the authors proposed a technique to optimize LEACH (O-LEACH) by dynamically selecting CH according to the residual energy of sensor nodes.

Deterministic energy efficient clustering protocol (DEC) is presented in [16]. DEC is three-level energy sensor nodes and autonomously select CHs according to their energy levels. Only the set-up phase in LEACH is modified and the BS takes part in the election of the CHs in the first round. As in LEACH, when the steady phase begins, each sensor node transmits its data to the nearest CH that performs data aggregation. However, in DEC, before the end of steady phase, each CH checks the piggybacked of received cluster member data to decide whether it will remain CH or will relinquish its role by choosing another sensor node among its members with the highest energy as the new CH. In the next round, the new selected CHs broadcast their role then the non CHs will select their CH and the steady phase begins again. This process continues in each round until the last sensor node dies. In [19], the authors proposed intra-balanced LEACH protocol to extend LEACH by balancing the energy consumption in WSN. In [27], the authors proposed a protocol that improves the CH election, the special node processing and inter cluster routing problem, and then an improved protocol called LEACH-Impt was proposed.

In [21], the authors proposed LEACH with two level CH (LEACH-TLCH) protocol that deploys a secondary CH (2CH) to relieve the CH burden in these circumstances. However, in LEACH-TLCH the optimal distance of CH to BS, and the choicest CH energy level for the 2CH to be deployed for achieving an optimal network lifetime was not considered. In [22], the authors improved LEACH-TLCH by investigating the conditions set to deploy the 2CH for an optimal network lifetime. In [23], a hybrid unequal energy efficient clustering is proposed to increase lifetime of the network. In the proposed protocol, a new mechanism based on arrangement of nodes in a network to determine whether nodes should use information of their neighbors or should not use this information.

In [24], a protocol in form of cluster-head restricted energy efficient protocol (CREEP) has been proposed to overcome high system complexity due to computation and selection of large number of CHs and improve the network lifetime by modifying the CH selection thresholds in a two level heterogeneous WSN. The work in [25] focuses on reducing the energy dissipation thereby improving the network lifetime of the randomly deployed senor nodes in a WSN and also improving the rate of data packets forwarded to the BS at the same time. This is achieved by dividing the deployment region into various sectors and then selecting the CHs from each sector based on highest remaining energy of the sensors. The work in [26] reviews the routing concepts for diverse heterogeneous WSNs scenarios and covers the state of the art in the area. It also discusses the effects and inter-dependencies of different heterogeneities in routing decisions and unveils new research directions in the area.

The authors in [17] presented stability period protocol (SEP) which is based on weighted election probabilities using the remaining energy of each sensor node to be selected as CH. The energy heterogeneity problem in the network is characterized using two-node power classification (nodes are labeled with ”normal” and “advanced”). A variant of LEACH called extended heterogeneous LEACH (EHE-LEACH) is proposed in [18]. EHE-LEACH uses same model as in SEP. M-SEP [28] is a modified version of SEP. M-SEP enhances the network lifetime by considering the energy of nodes in a current ongoing round and an average energy of the whole network.

In [29], the authors proposed distance incorporated modified stable election protocol (D-MSEP) which is an improved version of modified stable election protocol (M-SEP) as it incorporates the distance factor in probabilistic formula of CH selection in each cluster. The inclusion of distance factor helps in avoidance of CH selection farther from the BS. This leads to saving of huge amount of energy leading to the escalated stability period.

A heterogeneity-aware energy efficient clustering (HEC) algorithm is suggested in [20]. HEC selects CHs based on the three network lifetime phases: only advanced sensor nodes are allowed to become CHs in the initial phase; in the second active phase, all the sensor nodes are allowed to participate in CH selection process with equal probability, and in the last dying out phase, clustering is relaxed by allowing direct transmission. A distributed energy-efficient clustering scheme for heterogeneous WSNs (DEEC) is presented in [30]. DEEC follows the same rules of energy-heterogeneity of SEP. An Enhanced Distributed Energy Efficient Clustering scheme (E-DEEC) is proposed in [31] by expanding the DEEC to three-level heterogeneity. However, CHs selection probabilities are not adapted as per the sensor nodes energy levels [32].

A three level heterogeneous network model for WSNs is proposed in [33]. The proposed heterogeneous WSN model selects CHs and their respective CMs by using weighted election probability and a threshold function. The proposed model is implemented based on DEEC. An enhanced developed distributed energy-efficient clustering (EDDEEC) with heterogeneous network model is proposed in [32, 34] that is established on three energy levels of sensor nodes. EDDEEC uses clustering mechanism which dynamically modifies the CH election probability. Modified enhanced developed distributed energy-efficient clustering (MED-DEEC) algorithm has been presented in [35]. MED-DEEC employs an energy-aware protocol similarity in [32, 34]. However, the decision of electing CHs in MED-DEEC is based on the remaining and initial energy of the sensor node, average distance and average energy of the network. A distributed unequal cluster-based routing (DUCR) proposed in [36] is a distributed multi-objective-based clustering algorithm. DUCR uses multi-objective optimization technique to assign CMs to an appropriate CH so that the load is balanced. The work in [37] proposed a distributed energy-efficient clustering protocol (DCE) for heterogeneous WSNs. DCE is based on a Double-phase CH Election scheme where, the CH election process is divided into two phases. In the first phase, tentative CHs are selected based on initial and residual energy of sensor nodes. In the second phase, tentative CHs are replaced by their CMs to form the final set of CHs if any CM in their cluster has more residual energy.

All previous works consider special attributes of sensor nodes, e.g., remaining energy, distance to base station, … etc. as criteria for selecting CHs, ignoring the cluster load, i.e., the number of cluster members that can be served by CH or the number of sensor nodes that can be supported by the CH. In our proposed work, we consider cluster load as a criterion for selecting CHs combined with the remaining energy, distance to base station and degree of neighboring sensor nodes. Moreover, during the operation of the network the CH could not have enough energy to complete its task for that we introduce a new method to predict the remaining energy of each sensor node at the end of the next round before the round starts and selects the most appropriate sensor node that can continue its work till the end of the round to be the CH. Also, in the proposed algorithm, three-level energy heterogeneity of sensor nodes are used as in [16, 31, 32, 38] and entropy weight coefficient as in [5, 7, 9, 11] is used in order to come up with optimal CHs.

Several works [5, 7,8,9, 11] use entropy method to take a decision. Entropy weighted multi criteria routing (EMCR) [11] is a routing scheme in which the next hop decision is based on MCDA method along with entropy method of information theory. The work in [7] uses fuzzy comprehensive evaluation based entropy weight coefficient to analyze the intrusion detection systems (IDSs). Also, entropy weight coefficient method is used in [9] for soil quality assessment. In [8], optimization for objective weights and subjective weights is proposed based on the fuzzy multiple attributes decision making routing (MADMR) algorithm that selects next hop in a WSN. The entropy coefficient model is used in [8] to avoid excessive deviation from the objective weights. In [5], an index system for capacity assessment is developed by employing the entropy weight coefficient method.

3 Models description

In this section, the different used models in our proposed algorithm such as energy model, radio energy dissipation model, network model are described.

3.1 Energy model

Energy heterogeneity is the most significant characteristic of a WSN which includes: computational heterogeneity, link heterogeneity, and energy heterogeneity. In this paper, we address three heterogeneity energy levels as in, DEC and EDDEEC where, we have three types of sensor nodes: normal, intermediate, and advanced. The initial energy of intermediate is between the initial energy of normal and advanced. If \(E_0\) is the initial energy of normal sensor nodes, then the initial energy of intermediate sensor nodes will be \(E_{int}=(1+\mu )E_0\) and of advanced sensor nodes will be \(E_{adv}=(1+ \alpha ) E_0\), here \(\alpha\) and \(\mu\) are the energy weight factor of advanced and intermediate sensor nodes respectively. Therefore, the total initial energy will be:

$$\begin{aligned} E_{total}=nE_0(1+m\alpha + b\mu ). \end{aligned}$$
(2)

Here, n is the total number of sensors in the network, m is the ratio of advanced sensor nodes, and b represents the ratio of intermediate sensor nodes.

3.2 Radio energy dissipation model

Receiving and sending messages are the two main operations that consume energy in a WSN. The consumed energy of sending a message is greater than the consumed energy of receiving a message due to the additional energy that is required to expand the signal between the distance and the destination. Here, we use the same models (radio, data aggregation, and energy parameters ) as in many previous works [12, 16, 17, 30, 32, 38]. As per previous works, the consumed radio power to transmit l-bits message to a distance d is given by:

$$\begin{aligned} E_{Tx}(l,d)=\left\{ ^{lE_{elec}+l\epsilon _{fs} d^2 \quad if \; \; d< d_0,}_{lE_{elec}+l\epsilon _{mp} d^4 \quad otherwise.}\right. \end{aligned}$$
(3)

and the consumed radio power to receive this message is:

$$\begin{aligned} E_{Rx}(l)=l E_{elec} \end{aligned}$$
(4)

Here, the electronics energy consumed is \(E_{elec}\) and consumed energy by the transmitter amplifier for distance greater than d is \(\epsilon _{mp}\) and for distance less than d is \(\epsilon _{fs}\), the consumed energy for aggregating data is \(E_{DA}\), and the threshold distance is given as \(d_0=\sqrt{\frac{\epsilon _{fs}}{\epsilon _{mp}}}\).

3.3 Network model

We assume that n sensor nodes are randomly deployed in a field R of size \(M \times M\), the sensor nodes are static and each sensor node has a unique ID. It is also assumed that each sensor node knows its communicating neighbors, including their identifications and coordinates, which can be gathered statically via hello message, or periodically if frequent changes occur in the topology.

4 Entropy based clustering algorithm (EBCS)

In this section, we discuss our proposed entropy based clustering algorithm (EBCS). The main goal of EBCS is in designing a dynamic distributed and self-organizing clustering algorithm that prolongs the lifetime of the network and minimizes the consumed energy in the network.

The EBCS includes three phases: cluster heads election, cluster formation, and data transmission. A flow-graph of the proposed algorithm is shown in Fig. 2, and the detailed flow-graph is shown in Fig. 3. In cluster heads election phase, weighted product model (WPM) along with entropy weighted coefficient method (EWC) are used for resolving the election decision problem based on different criteria. In cluster formation phase, each non-CH node selects the CH that uses minimum communication cost, based on the signal strength of received advertisement. In data transmission phase, the data will be collected by CH and will be sent to BS.

Fig. 2
figure 2

Diagram of EBCS algorithm

The Pseudocodes that will be executed by BS, CH, and non-CH nodes are presented respectively in Algorithms 2, 3, and 4.

Fig. 3
figure 3

Flow-graph of the proposed algorithm

4.1 Cluster heads election phase

WPM along with EWC are used for resolving the election decision problem based on different criteria: residual energy \((E(s_j))\), number of supported sensor nodes \((N_{support}(s_j))\), distance to BS \((D(s_j ))\), and sum of distance to all neighbor sensor nodes \(s_j\)s \((DN(s_j ))\). These criteria can be described as follows:

  • Residual energy (E) Residual energy is the most important feature for every sensor node and the network lifetime mainly depends on the residual energy among the sensor nodes.

  • Distance to BS (D) Distance to BS is important to be considered as more distance to BS implies more energy will be consumed in transmitting a packet.

  • Sum of distances to neighbor sensor nodes (DN) As the CH supports its members, therefore, the number of packets transmitted over long distance between members and CHs may deplete the battery power relatively faster.

  • Number of sensor nodes that can be supported (\(N_{support}\)) Since there are three-level energy heterogeneity of sensor nodes, we may have three types of CHs based on the initial energy used, \(CH_0\), \(CH_{int}\), and \(CH_{adv}\) with initial energy \(E_0\), \(E_{int}\), and \(E_{adv}\), respectively. The energy consumption percentage of each CH type will be \(c_0=e_{CH}/E_0\), \(c_{int}=e_{CH}/E_{int}\), and \(c_{adv}=e_{CH}/E_{adv}\) for \(CH_0\), \(CH_{int}\), and \(CH_{adv}\), respectively, where \(e_{CH}\) is given by Equation 11. It is clear that \(c_0> c_{int}> c_{adv}~as~ E_{adv}> E_{int} > E_0\), which will result in uneven energy consumption among these CHs. Therefore, it is required to determine the number of sensor nodes that can be assisted by each type of CHs such that \(c_0\cong c_{int} \cong c_{adv}\). The number of sensor nodes that can be assisted by each sensor node \(N_{support}\) can be estimated as follows:

    $$\begin{aligned} N_{support}=\left\lceil \frac{E_{remaining}*(n-k)}{\frac{k}{3}(E_0+E_{int}+E_{adv})}\right\rceil . \end{aligned}$$
    (5)

Based on sensor nodes local information a decision has to be taken to select the CHs. This type of information is considered as the criteria for the election decision making (e.g., multi-criteria decision-making problem). In our proposed approach, we adapt the weighted product model (WPM) which is one of the Multi-criteria decision analysis method (MCDA) [10] and then we use entropy weight coefficient method to elect the CHs.

4.1.1 Weighted product model (WPM)

In WPM, given is a finite set of decision alternatives \((A_1, A_2, \ldots , A_n)\) described in terms of a number of decision criteria. Each decision alternative is compared with the others by multiplying a number of ratios, one for each decision criterion. Each ratio is raised to the power equivalent of the relative weight (w) of the corresponding criterion. In order to compare two alternatives \(A_K\) and \(A_{L}\) using WPM; the following product (P) has to be calculated for m number of criteria and n number of alternatives:

$$\begin{aligned} P\left( \frac{A_K}{A_L}\right) =\prod ^m_{j=1}\left( \frac{a_{Kj}}{a_{Lj}}\right) ^{w_j}. \end{aligned}$$
(6)

Here, \(K \ne L, K, L= 1,2,\ldots ,n\) and \(a_{ij}\) is the performance value of the alternative \(A_i\). If the ratio \(P(\frac{A_K}{A_L})\ge 1\), then it implies that the alternative \(A_K\) is more useful than the alternative \(A_L\). The best alternative is the one that is better than or at least equal to all others.

The alternative approach of WPM is the decision maker where we use only products without ratios as follows [10]:

$$\begin{aligned} P(A_K)=\prod ^m_{j=1} (a_{Kj})^{w_j}. \end{aligned}$$
(7)

In this formula, the term \(P(A_K )\) indicates the performance value of alternative \(A_K\) when all the criteria are evaluated under the WPM model. The weights \(w_j\) of different criteria are calculated using entropy coefficient method that will be described in the next subsection.

4.1.2 Entropy weight coefficient method (EWC)

The entropy coefficient method is applied to determine the weights for the criteria. The main steps of calculating the weights of m criteria and n alternatives are as follows:

  • Calculate the entropy of each criterion \(i=1,\ldots , m\) that can be derived from Eq. 1.

    $$\begin{aligned} H_i= -\frac{1}{log_2 n} \sum ^n_j p_{ij} log_2 p_{ij} \end{aligned}$$
    (8)

    Here, \(p_{ij}=\frac{c_i(s_j)}{\sum _j^m c_i(s_j) }\), \(j=1, \ldots , n\), \(c_i(s_j)\) is the performance value of alternative \(s_j\) and if \(p_{ij}=0\) then \(p_{ij}log_2 p_{ij}=0\).

  • Calculate the entropy coefficient weight \((w_i)\) of each criterion i :

    $$\begin{aligned} w_i=\frac{(1-H_i)}{m-\sum _i^m H_i} \end{aligned}$$
    (9)

    Here, \(0\le w_i \le 1\) , \(\sum _{i=1}^m w_i =1\) .

figure d

4.1.3 Cluster head election decision

The election decision of CH is made by executing the following steps (The pseudocode of this procedure is presented in Algorithm 1):

  • Using EWC, compute the weight of each criterion.

  • Using WPM, compute the product value P of each sensor node \((s_j )\) considering m criteria as:

    $$\begin{aligned} s_j.P=(s_j.c_1)^{w_1} \times (s_j.c_2)^{w_2}\times \cdots \times (s_j.c_i)^{w_i}. \end{aligned}$$
    (10)

    Here, \(w_1\), \(w_2\), \(\ldots\), \(w_i\) are the weights for the criteria \(c_1\), \(c_2\), ..., \(c_i\) respectively.

  • Sensor nodes with highest P values will be elected as CHs.

4.1.4 Cluster heads election procedure

  1. 1.

    Initial step In EBCS, the BS finds the optimal number k of CHs in the first round \((r=1)\) for the WSN. BS participates in selection of CHs only in the first round. The election process will be performed by executing the selection procedure (Algorithm 1) considering the criteria, E, \(N_{support}\), D, and DN, i.e., BS call EP(Sk) procedure where S is the set of sensor nodes in WSN.

    BS sends \(BE_{CH}\) message to k sensor nodes that have highest P values. After the sensor node receiving the \(BE_{CH}\) message, the selected CHs announce their role using CSMA [12, 16, 38] and then each CH forms its cluster by the cluster formation step below.

  2. 2.

    CH election step The consumed energy by the CH involves the consumed energy of receiving data from all the CMs, aggregating data, and forwarding data to the BS. As a result, we employ energy prediction technique to predict the CH failure due to energy depletion. If we assume that n sensor nodes are uniformly dispersed and there sensor exist k clusters in the topology. Thus, on average, there are n / k sensor nodes per cluster (one CH and (n / k)-1 CMs). Consequently, the total energy consumption by the CH (\(e_{CH}\)) for single round can be calculated as follows [12]:

    $$\begin{aligned} e_{CH}=\left( \frac{n}{k}-1\right) .E_{Rx}(l)+ \frac{n}{k}.l.E_{DA}+E_{Tx}(l,d_{toBS}). \end{aligned}$$
    (11)

    The CM nodes only need to transmit data to its corresponding CH. Thus, the total energy consumption by the CM node (\(e_{CM}\)) during one single frame can be calculated as follows:

    $$\begin{aligned} e_{CM}= E_{Tx} (l, d_{toCH}). \end{aligned}$$
    (12)

    Here, \(d_{toBS}\) is the mean distance between the CH and the BS, and \(d_{toCH}\) is the average distance between CMs and the CH. In an area with size \(M\times M\), \(d_{toBS}\) and \(d_{toCH}\) can be estimated as:

    $$\begin{aligned} d_{toBS}= \frac{M}{\sqrt{2\pi k}}, \quad d_{toCH}=0.765 \frac{M}{2}. \end{aligned}$$
    (13)

    If each sensor node s takes \(t_{cm}\) times to work as CM node and \(t_{CH}\) times to work as CH node, then, the total energy consumption for sensor node s at round r can be calculated as follows:

    $$\begin{aligned} E_{consumed} (s,r)=t_{ch} e_{CH}+t_{cm} e_{CM} \end{aligned}$$
    (14)

    According to the current energy of sensor node s and Eq. 14, the remaining energy of sensor node s at the start of round \(r+1\) can be used to predict when r round ends as follows:

    $$\begin{aligned} E_{prediction} (s,r+1)=E_{residual} (s,r)-E_{consumed}(s,r). \end{aligned}$$
    (15)

    Sensor node s will decide if its current residual energy is close to the predicted energy value at the end of the round by using the following equation:

    $$\begin{aligned} \upsilon =|1-\frac{E_{prediction} (s,r)}{ E_{residual} (s,r)} |. \end{aligned}$$
    (16)

    The error of energy prediction can be tolerated if \(\upsilon\) is less than a constant \(\epsilon\).

    At the end of each round r and based on the received members’ information (id, \(E_{prediction}\), \(E_{residual}\), \(N_{support}\)), CH executes Algorithm 1 in order to take a decision for the next round. CH has three cases (Algorithm 3, steps 26-38):

    1. (a)

      The list of sensor nodes to be nominated as CHs (\(\psi\)) is empty that means none of CMs or current CHs can be CH for the next round because the energy needed to act as CH \(e_{CH}\) is larger than the predicted energy \(E_{prediction}\). In this case, the current CHs inform their CMs to send their data directly to the BS.

      In such a case, we avoid unreliable and unpredicted behavior of the network by avoiding forming clusters. However, the positive aspect is that with the remaining energy of a sensor node, it may succeed if data is sent directly to the BS.

    2. (b)

      If \(\psi\) contains only one CH, then the CH will remain working as CH for the next round.

    3. (c)

      If \(\psi\) contains more than one CH (\(|\psi |>1\)), as a result CH has to execute the election procedure EP with the set \(\psi\) (\(EP(\psi ,1)\)) to make a decision.

    If the cardinality of \(\psi\) is larger than one, then the CH sensor node has a set of alternatives. The first alternative is to stay working as CH sensor node, and the second alternative is to select one of its CMs to work as CH.

    CH executes the election procedure EP considering \(\psi\) members and the four criteria \(N_{support}\), E, D, and DN, then the product value of each member node is computed and compared, then the sensor node with the highest product value P will be selected as the next CH (Algorithm 3, Step 34).

figure e

4.2 Cluster formation phase

Each CH sends \(BE_{CH}\) message to the elected CHs. The elected CHs announce their role using CSMA. The advertising short message \(CH_{Announce}\) contains the ID of CH. Then, cluster formation step is executed (Algorithm 4).

To ensure that the concise information will be sent to BS, each non-CH node selects the CH that uses minimum communication cost, based on the signal strength of received advertisement and then transmits \(JOIN_{Request}\) message to the selected CH.

The CHs build a TDMA schedule or transmission plan that decides the time for each CM to send its collected data.

This guarantees collisions avoidance between data messages and permits radio components of CMs to be turned on only during their transmission time [12, 38]. Creating and transmitting TDMA schedule will be the last step in setup phase.

4.3 Data transmission phase

Normally, each CM sends only its collected data to the corresponding CH in its allocated time slot (Algorithm 4, Step 8–11). However, in our proposed EBCS, the data message includes the ID of CM, residual energy, \(N_{support}\) and \(E_{prediction}\) data, i.e., the local information of the CMs can be employed to localize the CH rotation in the subsequent rounds.

Once all data is received by the CH, it executes some signal processing function such as data aggregation. Then, the aggregated result is sent to the BS. At the end of this phase, using piggybacked CM information, each present CH decides whether it would continue as CH or give up its role (Algorithm 3, Steps 17–38).

figure f
figure g

4.4 Entropy based clustering algorithm analysis

Each CH gathers and aggregates the data from its CMs, and then forward the aggregated data to BS. As a result, the consumed energy by CH is more than the consumed energy by CM. In EBCS, the sensor nodes \(CH_{Announce}\) with greater product value of the sensor node with greater remaining energy, greater number of neighbors, and the nearest to BS will more plausible to be elected as a CH. Moreover, every non-CM that receives \(CH_{Announce}\) adopts the CH that needs minimum communication cost based on the distance between it and the CH candidates. This will balance the network load during the CH rotation in successive rounds.

Lemma 1

The proposed EBCS hasO(n) exchanged messages as an overhead, wherenis the number of sensor nodes.

Proof

If k is the average number of CHs per round. In each round we will have the following overhead messages:

  • k messages for broadcasting \(CH_{Announce}\) by all CHs.

  • \(n-k\) messages for \(JOIN_{Request}\) by non-CMs.

  • k messages for broadcasting TDMA schedule by all CHs.

  • \(n-k\) messages for sending data to CHs by CMs.

  • k messages for sending data to BS by CHs.

  • Finally, in the worst case, k messages for the rotation of CHs (it will take place where each CH determines to stay as CH or select one of its members to be the successive CH).

Therefore, the EBCS overhead will be O(n) as \(k \ll n\). \(\square\)

5 Simulation results

Our algorithm is simulated using MATLAB R2015a. 100 sensors are randomly deployed in our simulations in the following regions \(100 \times 100, 150 \times 150, 200 \times 200,\)\(250 \times 250,\)\(300 \times 300,\)\(350 \times 350 \,\hbox {m}^{2}\) on a two-dimensional plane with BS placed at the center. The radio model and energy parameters are used as previously described in Sect. 3. The main parameters and their values of the simulation are provided in Table 2. The performance metrics we pursue here are as follows:

  1. 1.

    Stability period It is the duration from the start of a network simulation till the time first sensor node dies (FND).

  2. 2.

    Half-life Time from the beginning of WSN deployment to a time when a sensor node consumes up to half of its energy.

  3. 3.

    Consumed energy Shows how much energy has been consumed by the network with increasing number of rounds.

  4. 4.

    Number of alive sensor nodes per round The number of alive sensor nodes in the WSN after each round.

The algorithm has been tested using different random topologies and it is compared with DEC, SEP-E, EDDEEC [32], MEDDEEC[35], CREEP [24] and DMSEP [29] algorithms. The optimal parameters of EBCS with DEC, SEP-E, EDDEEC, MEDDEEC, CREEP, DMSEP are utilized to achieve the best performance. We assume that the total energy of the network is 102.5 J for all algorithms. Table 3 shows the ratio of sensor nodes and their corresponding energies.

Table 2 Simulated parameters
Table 3 Energy setting of sensor nodes

In the next sections, we explore the performance metrics and compare the results of the proposed algorithm with many existing baseline algorithms.

5.1 Analysis of optimum cluster heads

In EBCS, the clustering algorithm was created to guarantee that the expected number of CHs per round is fixed, which can be set a prior. we use a probabilistic-based model such as EDDEEC, MEDDEEC, CREEP, DMSEP and SEP-E and non-probabilistic-based such as DEC to analyze the number of elected CHs per round. A number of experiments have been conducted by varying the number of CHs per round for each Algorithm. Figure 4 shows the number of CHs for the first 2500 rounds in EBCS, DEC, EDDEEC, MEDDEEC, SEP-E, CREEP, and DMSEP algorithms. We notice that the number of CHs per round for EDDEEC, MEDDEEC, CREEP, DMSEP and SEP-E fluctuate which indicates that the optimal number of elected CHs cannot be guaranteed in each round as they depend on probability function for CH selection). Due to this, in some rounds very few CHs are elected, and the cluster-members will need to transmit at a much longer distances to reach their CHs. Likewise, if the number of elected CHs is higher, not much data aggregation will be done, since the cluster size is smaller. Thus, more energy will be used for transmitting. This is one of the major drawback of the probabilistic-based model. Another disadvantage of this model is that, the energy consumption across nodes become increasingly uneven as the network progresses. While this is not the case for DEC and EBCS. This ensures that for any network size, once the optimal number of CHs is defined in the beginning of the network operation, the number of elected CHs will remain fixed that evenly distributes energy dissemination as the network evolves.

Fig. 4
figure 4

Number of clusters per round in DEC, SEP-E, EDDEEC, EBCS, MEDDEEC, DMSEP, and CREEP algorithms

5.2 Stability period

The stability period is defined as the duration from the start of the network simulation till the time first sensor node dies. Figures 5 and 6 show the following points:

  1. 1.

    The stability period of the proposed algorithm exceeds other algorithms as the remaining energy is predicated and cluster load is considered during CH selection process. Moreover, using entropy enhances the stability period of the network by selecting the nodes with more weight according to their energy level, etc.

  2. 2.

    The stability period of all algorithms degrades as the network size increases, where the distance between sensor nodes increases and so the consumed energy for sending and aggregating data increases.

  3. 3.

    The stability period of DEC exceeds the stability period of SEP-E by 35% for the network size of \(100 \times 100\). But, as the network size increases from \(200 \times 200\) to \(350 \times 350\), stability period of SEP-E exceeds the stability period of DEC by 25%, 31%, 37%, 29% respectively, because the DEC depends on the sensor node energy as selection parameter for CHs as network size expands the distance between sensor nodes increases and so the consumed energy increases while the selection processes in SEP-E is probabilistic.

  4. 4.

    The stability period of MEDDEEC, DMSEP and EDDEEC significantly degrades as the network size increases to \(350 \times 350\) because MEDDEEC and EDDEEC depend on the absolute threshold value (T absolute) that is calculated in each round, and the energy of all sensor nodes are compared with the threshold value. Only if the energy of a sensor node is less than the threshold, it will be treated independently. The value of the absolute threshold does not depend on the network size which affects the network lifetime as the network size increases. While, DMSEP considers distance threshold value that is calculated as the average distance between nodes and BS. However, the threshold value is calculated at the network start up and fixed in all rounds which affects the network lifetime too. As the network size increases to \(350 \times 350\), the proposed algorithm enhances the stability period compared with DEC, EDDEEC and MEDDEEC up to of 32%, 74% , 73%, respectively.

  5. 5.

    EDDEEC calculates the probability for CH selection based on the initial energy and remaining energy level of sensor nodes and average energy of the network while MEDDEEC considers average distance and the distance from BS as parameters to compute the probability function that allows MEDDEEC to select a sensor node to be CH only if it is closer to the base station and hence minimizing the energy consumption is reduced. As a result the performance of MEDDEEC is better than EDDEEC in terms of energy consumption.

  6. 6.

    As the network size increases to \(250 \times 250\) and \(300 \times 300\), the stability period of SEP-E increases slightly by 1% over EBCS for network size. However, as the network size increases to \(350 \times 350\) the stability period of EBCS is increased by 5% over SEP-E.

Fig. 5
figure 5

Stability period for DEC, SEP-E, EDDEEC, EBCS, MEDDEEC, DMSEP, and CREEP algorithms for different network sizes

Fig. 6
figure 6

Number of alive sensor nodes per round of DEC, SEP-E, EDDEEC, EBCS, MEDDEEC, DMSEP, and CREEP algorithms for different network sizes

In SEP-E, there are three reasons that lead to a reduction in network lifetime

  • In SEP-E, CHs selection depends on the probability of each type of sensor node without considering the remaining energy of each sensor node so that the sensor nodes with comparatively small remaining energy can be CHs.

  • As we discussed in previous Sect. 5.1, SEP-E does not have stable number of CHs, i.e., the optimal number of CHs is not guaranteed in many rounds.

  • In case of increasing network size, at the edge of the network or in low density regions, the number of CHs increases and so many cluster members inefficiently utilize energy while communicating with the CHs.

For the above reasons, stability period of SEP-E becomes very close to EBCS as the network size increases to specific size (300 \(\times\) 300), then after this specific size, the EBCS will start to exceed SEP-E.

5.3 Half-life time

Our algorithm enhances the Half-life of WSN better than DEC, SEP-E, EDDEEC, MEDDEEC, CREEP and DMSEP for different network sizes as the weights are updated in each round for the criteria and are calculated based on the current status of sensor nodes. Moreover, we try not to overload the sensor node that becomes CH by considering cluster load as the selection criteria and assuming if it can continue its task until the end of the round. This leads to energy saving for sensor nodes that have low energy levels as they work as normal sensor nodes and do not handle the CH tasks. Figure 7 shows that the proposed algorithm enhances the WSN Half-life compared with DEC, SEP-E, EDDEEC, MEDDEEC, CREEP, and DMSEP up to a magnitude of 47%, 45%, 48% , 88%, 128% and 47% respectively for network size of \(100 \times 100\).

Fig. 7
figure 7

Half-life ofDEC, SEP-E, EDDEEC, EBCS, MEDDEEC, DMSEP, and CREEP algorithms for different network sizes

The results show that the network lifetime of SEP-E increases slightly by 1% over EBCS for network size \(250 \times 250\) and \(300 \times 300\). However, as the network size increases to \(350 \times 350\), the stability period of EBCS increases by 5% over SEP-E. This is because the number of CHs in each round is not fixed due to its randomized behavior in SEP-E; so the chance of selecting fewer number of CHs than the required and near to the BS increases. This leads to some indirect energy saving. Overall, our proposed EBCS increases the WSN lifetime as compared with DEC, SEP-E, EDDEEC, MEDDEEC, CREEP, DMSEP up to a magnitude of 7%, 43%, 60%, 47%, 61%, 50%, respectively for network size of \(100 \times 100\).

5.4 Average remaining energy

We compute the average standard deviation of the residual energy of sensor nodes per round with different network sizes. Figure 8 shows the average of remaining energy per round for DEC, SEP-E, EDDEEC, EBCS, MEDDEEC, CREEP, DMSEP algorithms. Notice that as the network size expands the consumed energy for SEP-E and EBCS slightly raises, while there is a remarkable increase in the case of DEC, EDDEEC, and MEDDEEC.

Fig. 8
figure 8

Average residual energy per round versus network size for DEC, SEP-E, EDDEEC, EBCS, MEDDEEC, DMSEP, and CREEP algorithms

Fig. 9
figure 9

Standard deviation per round versus network size forDEC, SEP-E, EDDEEC, EBCS, MEDDEEC, DMSEP, and CREEP algorithms

Figures 8 and 9 show the average and standard deviation of residual energy of sensor nodes as the number of rounds increase. The standard deviation of residual energy of sensor nodes refers to the distribution rate of residual energy of sensor nodes. A small standard deviation represents that the residual energy of each sensor node is almost uniform and is similar to the average value. The standard deviations for the residual energy among the sensor nodes of each protocol is computed as shown in Fig. 9. The rapid decrease in both of DEC and EBCS curves indicates that DEC and EBCS balance the energy consumption in the network better than the other protocols. This is due to even distribution of consumed energy among the sensor nodes.

6 Conclusion

In this paper, we have proposed a new entropy based clustering algorithm (EBCS) which divides a WSN into two-levels of the hierarchical network and deals with three-levels of energy heterogeneity of sensor nodes. The dynamic, distributive, and self-organizing EBCS benefits from the advantage of the local information of sensor nodes that can be measured in terms of entropy as criteria for cluster head election and cluster formation. Our simulation results show that the proposed EBCS exceeds the baseline algorithms DEC, EDDEEC, MEDDEEC, SEP-E, CREEP and DMSEP in terms of energy consumption, lifetime, and stability period.