1 Introduction

WSNs consist of a set of small devices called sensors randomly distributed in a geographical zone. Each sensor can supervise a particular phenomenon (temperature, pressure,…) and send information to the collection point (base station or BS) via a wireless connectivity. WSNs generally include a large number of nodes (many hundreds). Sensors are characterized by their limitation in energy resources, which poses problems particularly if the application must function for a long time [15].

In order to optimize energy consumption and to maintain the connectivity of the network structure, nodes can be organized into clusters. In the WSNs based clustering, the network architecture breaks up into three under-layers: a sensor network which consists of information receivers (sensors), the CHs that carry out the complex tasks of signal treatment, and a BS that receives information on behalf of the CHs [68]. Within this organisation, the intra-cluster and inter-cluster communications can be achieved either in one-hop or multi-hops. The multi-hops transmission from CHs to the BS results in an unbalanced energy consumption in equal clusters (hot spots problem). In fact, CHs closer to the BS transmit more relay data than CHs far from the BS [9]. Unequal clustering size (UCS) clustering algorithms control the size of each cluster in the network in order to balance and minimize energy dissipation between nodes and increase the network lifetime [1013].

In this paper, a new unequal clustering algorithm called EDDUCA was suggested. EDDUCA forms fixed clusters in order to minimize energy consumption during the clusters formation phase. It uses the ‘Sierpinski triangle’ method to divide the network into unequal clusters. Clusters having the same distance to the BS will have equal sizes. This strategy aims to extend the lifetime of the CHs closer to the BS, to maintain the connectivity of the structure and to balance energy consumption between clusters. In order to balance energy consumption between nodes in each cluster, the CHs choice procedure is based on three parameters: residual energy, the degree and the distance.

The remaining of this paper was organized as follows: Sect. 2 reviewed some unequal clustering algorithms. Section 3 described EDDUCA clustering algorithm in details. The simulation results were given in Sect. 4. Finally, our conclusions were drawn in Sect. 5.

2 Unequal Clustering Algorithms

In this section, we presented the advantages and the disadvantages of some unequal clustering algorithms.

2.1 Unequal Clustering Methods

To solve the hot spots problem, unequal clustering algorithms are defined. These algorithms tend to minimize the size of clusters closer to the BS, this later will have more energy to achieve the inter-cluster communication [14].

In recent years, several unequal clustering algorithms have been designed such as [9]. In this approach, the network is divided into unequal clusters in order to increase the lifetime of nodes closer to the BS that can reduce, according to Xiang et al., the loss packet ratio and improve the network performances. To this end, authors assumed that the nodes are deployed with uniform density \(\rho\) (\(nodes/m^2\)) in the wedge area defined by an angle \(\theta\) namely clustering angle. Thereafter, the wedge area is partitionned into m rings \(V1,V2,\ldots ,Vm\) that represent clusters. The distances between clusters (\(d1,d2,\ldots , dm\)) denote the one-hop inter-cluster communication. Then, the authors determined the optimal one-hop distance and the optimal clustering angle that would minimize energy consumption for both of intra-cluster and inter-cluster communications. Simulation results of the unequal clustering algorithm described in [9] showed that the network lifetime was extended. However, the proposed clustering model requires that nodes have to be deployed in the circular zone, which is not always possible in practice.

In [15], authors defined a new unequal clustering protocol named energy-driven unequal clustering (EDUC). In this algorithm, they used unequal competition ranges in order to define unequal clusters. For each cluster, an energy-driven CHs rotation method is used to rotate the CH selection and to balance energy consumption among the network.

In [16], authors proposed a new unequal clustering scheme namely energy efficient clustering scheme (EECS). For EECS, a distance-based clustering method is designed to produce unequal size clusters. In order to solve the hot spot problem and maximize the clusters size, a weighted function was proposed.

Another example of an unequal clustering algorithm was described in [17] where the authors introduced a new algorithm called energy efficient and balanced cluster-based data aggregation (EEBCDA). In this approach, the network is cut out into unequal rectangular zones (called swim lanes) and each zone is in turn split into smaller zones (called grids). For each grid, the node with the highest residual energy becomes a CH. The inter-cluster transmission is achieved by one hop to the BS. EEBCDA has shown its effectiveness compared to EECS algorithm in aspects of balanced energy and network lifetime. Nevertheless, the one hop inter-cluster communication may increase energy consumption especially for large size networks.

In [18], energy-efficient unequal clustering (EEUC) algorithm was proposed. A localized competition is defined to select CHs. For each node, a competition range is calculated according to its distance to the BS. The inter-cluster communication is achieved in k hops and depends on the energy consumption of the CH and its distance to the BS.

In [14], authors introduced a new unequal clustering algorithm called energy-aware unequal clustering fuzzy (EAUCF). The solution suggested by EAUCF is to minimize the number of nodes in the clusters closer to the BS in order to minimize the number of transmitting packets. The CHs are randomly selected, then, the competition radius of each CH is determined according to its residual energy and its distance to the BS. In addition, EAUCF uses the fuzzy logic to calculate the competition radius. EAUCF has proven its good performance compared to other unequal and equal clustering algorithms. However, the random CHs selection phase cannot garantee the energy consumption balance if the selected CHs have a weak energy reserve.

In [19], another algorithm using the fuzzy logic was defined. The proposed algorithm called improved fuzzy unequal clustering (IFUC) is based on the fuzzy logic model. The employed fuzzy logic system uses residual energy, the distance to the BS and the local density as three parameters to select CHs and their raduii.

Table 1 makes a comparison between the clustering algorithms described previously. The comparison parameters are:

  • The type of clustering: which can be static or dynamic. For static clustering, all clusters are firstly designed once and for all. Then, the update concerns the CHs selection phase. Conversely, for the dynamic clustering, the update concerns the overall structure and each time new clusters are designed.

  • CH selection parameters: which can be the energy, the distance,…

  • Energy efficiency: low, medium or high.

Table 1 Table of comparison

2.2 Weakness of Existing Clustering Methods

Referring to the Table 1, we can clearly depict that most algorithms used the dynamic clustering method. This strategy is used normally to balance energy consumption between nodes in the network, but, it can have a negative influence on the network lifetime. In fact, the deployement and the update of the overall structure need the important control traffic caused by the k-neighbourhood discovery. The use of the static method has a positive influence on the network lifetime since static clusters are firstly deployed without any energy consumption and the structure update method is realized locally in each cluster. Consequently, this method reserves energy and helps to minimize energy consumption.

The criterion selection of CHs can or cannot depend on the energy of the nodes. If the energy parameter is not considered, nodes having weak energy reserve could be selected as CHs, which involves the performances degradation of the network and the batteries exhaustion of considered nodes. Other parameters can also be used in this phase such as the distance between nodes which can help to minimize energy consumption during the communication. The degree of nodes can also minimize the number of the built clusters and facilitate the communication between clusters by increasing the number of gateway.

In addition, the designed algorithms tend to balance the energy consumption between nodes in each cluster and ignore the balance of energy consumption between clusters. In fact, there is no control of the size of the clusters having the same distance to the base station.

In this paper, we define a new unequal clustering algorithm aiming to minimize energy consumption, balance energy consumption between clusters and between nodes in the same cluster. The phase of clusters creation does not consume energy. The formed clusters are static and having unequal size. The clusters having the same distance to the base station will have equal sizes. To balance energy consumption between nodes in the same cluster, the CH selection rotate among members and the node having the higher residual energy, higher degree and nearest to the cluster center is more likely selected as a CH.

In the following section, we will describe the EDDUCA clustering algorithm.

3 EDDUCA Clustering Algorithm

In this section, we described a new clustering algorithm called energy degree distance unequal clustering algorithm (EDDUCA).

3.1 Network Model

We modeled the network by a graph \(G=(V,E)\) where V is the set of sensors and \(E=\left\{ (u,v)\subset V/D(u,v) \le R\right\}\) represents the wireless connections between nodes. R is the transmission range, and D(uv) represents the distance between the node u and the node v. \(c=\left\{ u\subset V\right\}\) is the cluster which contains a set of nodes u and \(r_c\) is its center. \(C=\left\{ c\right\}\) is the set of clusters. We assume also that nodes are deployed with uniform density \(\rho\) (\(nodes/m^2\)) in the square zone Z, the area of Z is \(A m^2\). The distance between the clsuter c and the BS is defined by the distance betwen \(r_c\) and the BS.

Table 2 displays the parameters and their definitions.

Table 2 Parameters definition

3.2 Energy Consumption Model

The consumed energy by each sensor consists of three components: a communication (transceiver) unit, a sensing unit and a processing unit [20]. Therfore, we only consider the consumed energy by the communication unit (the consumed energy by the sensing and the processing units is neglected):

$$\begin{aligned} E_c(u) = E_{c/comm}(u) \end{aligned}$$
(1)

The consumed energy by the communication unit is made up of transmission energy \(E_{Tx}\) and reception energy \(E_{Rx}\) [21], so:

$$\begin{aligned} E_c(u) = E_{Tx}(k,d)+E_{Rx}(k) \end{aligned}$$
(2)

For \(E_{Tx}\) and \(E_{Rx}\), we apply the same model proposed in [21] which is described as follows:

$$E_{Tx}(k,d)= E_{elec}\times k+ \epsilon _{amp} \times k\times d^\lambda$$
(3)
$$E_{Rx}(k)= E_{elec}\times k$$
(4)

where \(E_{elec}{:}\) electronics energy. k: message length (bits). \(\epsilon _{amp}{:}\) transmit amplifier. d: distance between the transmitting node and the receiving node (m). \(\lambda{:}\) path loss exponent.

The energy consumption is noticed to be directly proportional to the message length k and to the distance between nodes d. Thus, it can be reduced by minimizing these two parameters.

3.3 Clusters Formation Phase

In this section, the clusters formation phase adopted in EDDUCA clustering algorithm was introduced. Our objective in this phase is to minimize energy consumption and to balance the energy consumption between clusters:

  • Minimize energy consumption: usually, clustering algorithms use dynamic clusters. In this case, the number of clusters and cluster members change dynamically. For these types of clustering, the energy consumed in the network will be equal to the sum of the energy consumed in the clusters construction phase, the energy consumed during the clusters update phase and the energy consumed during the communication phase [16].

    $$\begin{aligned} E_c(network) = E_c(clusters-construction)+E_c(clusters-update)+E_c(communication) \end{aligned}$$
    (5)

    In our case, we will define static clusters which are built in advance. The energy consumed during the cluster construction phase will be nill and the energy consumed in the network will be:

    $$\begin{aligned} E_c(network) = E_c(clusters-update)+E_c(communication) \end{aligned}$$
    (6)
  • Balance energy consumption between clusters: we assumed that nodes are uniformely deployed in the square zone. In the clusters formation phase, we will define unequal clustering method. The nearest clusters to the BS have a lower size than the other clusters. In this case, they will have more energy to achieve the inter-cluster communication. In addition, when two clusters are located at the same distance to the base station, they have the same size and the same number of nodes since nodes are deployed uniformally. These two conditions allow to balance energy consumption among clusters in the network.

To achieve our goals previously mentioned, we will use the method of ‘Sierpinski gasket’. Initially, this technique was described and, then, the clusters formation phase was explained in details.

The ‘Sierpinski triangle’ [22] namely also ‘Sierpinski gasket’ or ‘Sierpinski Sieve’ is the famous fractal triangle created by the matimatician ‘Waclaw Sierpilski’. It is constructed using an initial equilateral triangle according to these steps:

Step 1 :

Divide the initial triangle into four small triangles by connecting the mid-points of its sides.

Step 2 :

Remove the central triangle.

Step 3 :

Repeat the steps 1 and 2 to the remaining triangles.

Figure 1 describes the differents steps of ‘Sierpenski triangle’ process.

Fig. 1
figure 1

Sierpenski triangle

In our clusters formation phase, we used the same principle of ‘Sierpenski triangle’. First, we assume that N nodes are deployed uniformly in the square zone of 200 m × 200 m and the BS coordinates are (0, 0) as shown in the following figure (Fig. 2):

Fig. 2
figure 2

Initial WSN

Then, we draw the two diagonals of the rectangular zone and we form four triangles (see Fig. 3).

Fig. 3
figure 3

First step

Then, for each triangle, we connect the mid-points of the sides to form four small triangles (Fig. 4): the central triangle ‘c1’, the triangle placed in the top ‘h1’, the triangle located in the bottom ‘l1’ and the triangle situated on the left.

Fig. 4
figure 4

Second step

Next, the same procedure is repeated (n times) for the nearest triangle to the BS. The obtained scheme (Fig. 5) appears as a ‘Sierpenski triangle’ (first iteration).

Fig. 5
figure 5

Final step

Each triangle represents one cluster. According to Fig. 5, we can notice that the obtained clusters are unequal and their sizes decrease if the distance to the BS increases. We can clearly depict that the four red triangle have the same area. They are, in fact, four clusters having the same distance to the BS and their area is equal to \(A/16\,\hbox {m}^2\). Consequently, they have the same number of nodes: \(\rho \,.A/16\) (nodes).

The process of the clusters formation phase is explained in Algorithm 1.

figure d

3.4 CHs Selection Phase

In this section, the CH selection phase is described. Our objective is to balance e\nergy consumption between nodes. For each cluster, the CH selection procedure is achieved relying on three parameters: energy, degree and distance:

  • Energy: is a very important parameter for WSN. In fact, a node with low energy reserves, is inable to transmit information especially if the distance between nodes is high and requires significant power. For this reason, many researchers such as [2325] and [26] have used this parameter during the CHs selection phase to balance the load (energy consumption) between nodes in order to provide a stable structure and to increase the network lifetime.

  • Degree: the node degree is the number of its neighbors (1-neighborhood). Researchers in the field such as [23, 27, 28] and [29] chose degree as CHs selection criterion. The node with the highest degree is selected as a CH. This approach provides a stability to the structure.

  • The distance between nodes is also a very important criterion in the CHs selection phase. This parameter influences the transmission power and the number of hops to reach the destination and, consequently, the energy used to make the transmission. Many other researchers choose this setting to minimize the energy consumed during the inter cluster communication [30].

Since these parameters are very interesting to minimize energy consumption and to produce a stable structure. These will be used in the CH selection phase. The CH must have the highest residual energy to be able to manage the intra-cluster and inter-cluster communications. We choose the CH with the highest degree in order to be a neighbor of several CHs and facilitate the inter-cluster communication. To minimize the consumed energy in inter-cluster communication, the nearest node from the cluster center is selected as CH.

For each cluster, nodes calculate their combined weights, then, the node with the smallest weight becomes a CH. For each round, this procedure is repeated in order to balance the energy consumption between nodes.

The CHs selection phase is described in Algorithm 2.

figure e

3.5 Transmission Data to the BS Phase

Each node detects the information and transmits it directly to its CH. The CH receives data and forwards it to the upper layer cluster until reaching the BS. For each level i, the CHs of the clusters hi and li send data to the CH of the central cluster ci. The latter transmits the recieved data to the CH of the central cluster \(c(i+1)\) until reaching the BS. This process is further described in the following figure (Fig. 6).

Fig. 6
figure 6

Transmission data to the BS phase

4 Simulation Results

The simulation results were implemented using Matlab. The parameters of simulation are described in Table 3.

Table 3 Simulation parameters

Figure 7 shows the number of contructed clusters vs the number of iterations n. According to this figure, we can obviously notice that the number of clusters is proportional to the number of iterations. The slope of the regression line is 12. So, for each added iteration, 12 clusters would be added to the structure.

Fig. 7
figure 7

Number of clusters

Figure 8a–c represents the consumed energy values for each node after 250, 750 and 1500 rounds, respectively. We can clearly depict that values are closer to each other for each figure. After 250 rounds, the values are in the range of 0 to 0.2 J and the consumed energy values do not exceed 20 % of the initial energy. During the 750th rounds, the value range may be between 0.35 and 0.55 J which represents 55 % of the initial energy. After reaching 1500 rounds, some nodes batteries will be exhausted and the values vary between 0.75 and 1 J. These results show that EDDUCA tends to balance the nodes energy consumption of over the network which leads to increase the network’s lifetime as much as possible.

Fig. 8
figure 8

Energy consumption: rounds. a Energy consumption: 250 rounds. b Energy consumption: 750 rounds. c Energy consumption: 1500 rounds

Figure 9a–d represents the different characteristics of the energy consumption for each cluster during the 600th round: the maximum and the minimum values, the difference between them as well as the average value. The four figures clearly show that an energy balance has been achieved between all clusters.

Fig. 9
figure 9

Energy consumption: clusters. a Maximum energy consumption. b Minimum energy consumption. c Max–Min energy consumption. d Average energy consumption

The max–min values vary between 0 and 0.1051 J. We can also observe that 7 values are nil, this means that, for 7 clusters, all the values are equal and nodes have taken turns to be selected as CH. Theses clusters represent 25 % of the total number of clusters.

For the same figure, we can also notice that 10 values are lower than 0.02 J and they represent 35.71 % of the total number of clusters. All these results illustrate that EDDUCA can efficiently manage energy between nodes for each cluster.

Figure 10 reflects the comparison between EDDUCA, enhanced centralized low energy adaptive clustering hierarchy (ECLEACH) [31] algorithm and the algorithm proposed in [9] in terms of the number of nodes alive. ECLEACH is an equal clustering algorithm which selected CHs based on their residual energy, their distance to other nodes and the energy reserve of other nodes. [9] is a unequal clustering algorithm which select CHs randomly.

Fig. 10
figure 10

The number of nodes alive

Figure 10 clearly shows that unequal clustering algorithms (EDDUCA and [9]) outperform equal algorithm (ECLEACH). This figure proves the effectiveness of unequal clustering algorithms comparing to equal clustering algorithms. In fact, unequal clustering methods were suggested to solve the hot spot problem caused by equal clustreing strategies. The same figure also depicts that EDDUCA is more energy efficient than [9].

In [32], authors introduced three metrics to evaluate the lifetime of WSNs: the first node dies (FND) which represents the round value at which the first sensor dies, half of the nodes alive (HNA) which denotes the round value at which half of the nodes die and last node dies (LND) which indicates the round value at which all nodes die.

Table 4 represents a comparison between EDDUCA and [9] in terms of FND, HNA and LND. When, the FND metric is considered, the two algorithms have the same network lifetime. When considering HNA, results show that EDDUCA is 76.47 % more efficient than the optimal parameters. This result reflects the impact of the energy consumption balance achieved by the proposed algorithm which tends to divide the energy consumption among the remaining nodes for each round. By considering FND, Table 4 clearly depicts that EDDUCA is 15.62 % more efficient than [9].

Table 4 The number of nodes alive

EDDUCA, [9] and ECLEACH are also tested on different WSNs with 10, 30, 50, 70, 100, 150 and 200 nodes. Figure 11 represents the average energy consumption for each algorithm using different networks. This figure shows that EDDUCA reduces energy consumption for each case and it gives always lower values than other algorithms.

Fig. 11
figure 11

Average energy consumption

5 Conclusion

In this paper, a new unequal clustering algorithm using ‘Sierpinski triangle’ was designed in order to minimize energy consumption, balance energy consumption between clusters and balance energy consumption between nodes. The designed clusters are static and having unequal sizes. The clusters having the same distance to the BS have equal sizes. The CHs are selected according to their energy reserve, their degree and their distance to the clusters centers. Simulation results show that the proposed algorithm is effective in balancing enregy consumption and extending the network lifetime.

Upcoming work will compute the impact of the deployement heterogenous nodes having high energy capacity in the energy consumption and will consider more factors to select CHs.