1 Introduction

The recent rapid development of WSNs has increased the range of WSN applications and their scale [1]. Popular examples are smart grids and renewable energy systems in which large quantities of data are required to be collected. WSNs have multiple applications, for example to help manage peak load and optimize electricity generating resources. With the growing size of smart grids, there are increasing challenges in maintaining a networks’ performance, reliability, stability, economy of scale [1, 2]. WSNs can give many benefits over traditional communications used in existing electrical power systems. WSNs have been increasingly adopted as a useful technology to improve different areas of electric power systems, finding applications in the various stages of electricity generation, delivery, and utilization [3]. Thus WSNs are an integral component of our complicated electrical power systems. This entails dangers and responsibilities and requires high reliability. The challenges are in designing reliable WSN systems in terms of security, energy efficiency and adaptability. Another application for WSNs in the same context is their integration into everyday consumer electronics and appliances with the aim of smoothing peak demand spikes in at the consumer level. Customer appliances and their electric power meters can be equipped with wireless sensors to form a network capable of providing real time data to the customers about their electricity consumption. The objective is to manage the use of electricity in a cost-effective and efficient way [4].

In most WSN applications, the sensor nodes are often deployed in an ad hoc manner where well-defined placement may not be practical or too costly. Once deployed, the wireless sensor nodes must have the ability to self-organize and integrate into an efficient and reliable wireless communication network [5]. WSNs should be able to provide reliable and secure communication and control capabilities at low cost. A common use of WSNs is for collecting data in different IoT applications because of their lower cost and deployment flexibility. WSNs are diverse with many different proprietary and non-proprietary solutions [1, 6]. To apply a WSN effectively, the characteristics and constraints of WSNs need to be fully understood. The most fundamental constraint being energy usage [7]. Other common metrics for WSNs are efficiency in memory storage, processing power and the resulting data throughput [8]. However since energy consumption and the lifetime of the whole WSN are the most fundamental constraints, they are commonly used to evaluate the merit of WSN network protocols and algorithms. A number of studies have been conducted to investigate the issue of energy constraints of WSNs [9].

In this paper we have compared various protocols in hierarchical routing. Therefore, due to the relevance of routing in WSNs and their importance in the literature, we attempt to present a comprehensive study of various routing algorithms and protocols and their effects on the WSN performance, those protocols are classified in groups under chain, tree, grid or area based network. As discussed later on Sect. 2, surveys have been conducted on WSNs regarding clustering and routing algorithms [10,11,12,13,14,15,16,17,18,19,20,21]. However, we identified gaps on the comparison points. As example, authors in [10] has done a performance comparison of different atypical hierarchical routing protocols, but it has not discussed and compared the parameters used in formation of clusters in a detailed and clear manner. The main objective of this article is to be used as tutorial for the comparison of the most relevant base-line hierarchical routing protocols. Therefore, in this article our contributions are as follow, we further discuss and analyze the cluster formation method, hierarchical structure, and leader selection criteria in a more comprehensive way. Furthermore, instead of just analyzing the energy efficiency of different routing algorithms generally, we will consider and review the energy consumption by (1) sending nodes and (2) leader nodes in a separately. Then, conclusions of the energy burden and limitations of different routing algorithms are also deeply discussed. Besides, data loss caused by node failure is also used as a performance indicator for analyzing different routing algorithms. Thus, this survey is also looking at different traditional WSN hierarchical routing protocols. In this study, various hierarchical routing algorithms for WSNs are compared and their performance is discussed for further analysis. Based on our analysis, we provide justification for ranking the best state-of-the-art routing techniques according to the optimization metrics. Additionally, since the well know algorithm LEACH [22] is also classified as a hierarchical-based wireless sensor network, but it does not belong to one of the groups under chain, tree, grid or area based network. This is because the cluster head in LEACH is selected randomly from the whole region, and it is not limited and bounded by any grid or area. Besides, the cluster head will send data to the base station directly with single hop transmission, it is different with the chain and tree based transmissions. Therefore, this survey also discuss and compare different parameters used in cluster formation for different enhanced LEACH versions in order to provide the factors to be considered in forming a cluster and routing. Finally, we close this article pointing out the most relevant challenges and future research trends of WSN.

The rest of this article is organized as follows. In Sect. 2 this paper summarizes the most relevant routing protocols surveys and points out our contributions. Section 3 concentrates on introducing the background to WSNs, while in Sect. 4 a classification system of WSN routing protocols is presented. Sections 5, 6, 7 and 8 concentrate on analysing and discussing chain-based, tree-based, grid-based and area-based hierarchical routing algorithms, respectively. Finally, due to the relevance and importance of LEACH routing protocols, Sect. 9 focuses on analysis and compares the latest improvements of LEACH routing protocols, while Sect. 10 the challenges and future research trends are discussed. Finally, Sect. 11 concludes this survey with relevant discussions.

2 Related work: routing protocols surveys

Recent surveys have been conducted on WSNs regarding clustering and routing algorithms. These surveys are mainly focusing on energy saving, scalability, reliability, auto-configuration and they discuss the different techniques used for improving the performance of WSNs as summarized in Table 1 and explained bellow.

Table 1 Review of routing protocols surveys

There are different limitations and challenges that a routing protocol should concern itself with. In [16, 19], the routing protocols are classified into four main categories: data centric, hierarchical, location based and multipath based routing protocols. Each protocol is then further analyzed under each category. The comparison indicators are based on node mobility, power consumption, data aggregation, scalability, and multipath ability. It is concluded that hierarchical routing protocols are still a good approach regarding scalability and transmission efficiency, and further research can be done to improve their energy efficiency, especially for high density sensor networks. There are many factors to be balanced in designing WSN protocols, such as fault tolerance, energy efficiency, scalability, latency, power consumption and network topology. The authors in [20] mainly concentrate on two main factors: shortening the latency and minimizing the energy consumption, in designing WSN routing protocols. The authors, also discusses hierarchical routing algorithms, TEEN and APTEEN, aiming at selecting a suitable cluster head and controlling the frequency of data transmissions to save energy. To shorten the latency in data transmissions, some protocols like SPEED can maintain a desirable transmission speed in data communications. RAP can provides a real time request and query transmission with scheduling for large area network. Moreover, LAP is a location based and connectionless communication protocol, and it can shorten the communication delay and save energy by maintaining a table containing neighbour nodes to select the best transmission path for forwarding data packets. RPAR is a real time protocol to save energy consumption and shorten the time delay by meeting a packet transmission deadline with desired transmission velocity.

Different energy efficient clustering approaches have been categorized and compared in the survey [11]. The challenges and limitations of WSN are discussed as well as the different pros and cons of each algorithm. The analysis of hierarchical routing and its use in WSNs is based on different parameters such as the clustering approach and the selection of cluster heads. The indicators of performance in terms of a network’s lifetime, battery life, data transmission and sensing techniques are summarized. The authors conclude that there is no single routing algorithm which is suitable for all situations. Similarly, different recent heterogeneous clustering methods of sensors with different level of initial energy are discussed in [12]. The analysis is based on some predefined performance metrics such as network lifetime, number of heterogeneity levels, cluster head selection, energy efficiency and stability. A new routing protocol named m-BEENISH is proposed. The cluster head selection is based on the initial and residual energy level of nodes. Five different energy level are defined for nodes, and the nodes with higher energy level will have a greater chance to become a cluster head. The authors concludes that m-BEENISH is more energy efficient than other similar protocols under the heterogeneous clustering environment with higher stability, longer lifetime, and higher successful transmission ratio.

LEACH is one of the hierarchical routing algorithms to save nodes’ energy for data communications. However, there are some security concerns on LEACH when it comes to IoT, because many sensors nodes are connected together to communicate sensitive and private data. In [13], the improvements and categorization of different LEACH versions in two main groups (1) single hop, and multi-hop transmissions. Then, it further analyses the algorithms based on different parameters, such as clustering methods, energy efficiency, overhead, scalability complexity, load balancing, location of nodes, and delay. It discusses the strengths and weaknesses of the different LEACH versions under different domains, such as energy efficient, security, optimization, data aggregation, mobility, scalability, cluster size, etc. It also mentions that knowing the location of sensors nodes is costly because it consumes a lot of energy in data extraction and communication. More overheads and delays may occur in multi-hop transmissions due to paths construction. Energy consumption is also a major factor in considering the cluster head selection, thus conclude the authors. The survey in [14] describes different LEACH versions and security methodologies. It describes the formation of clusters and selection of cluster heads in different aspects, and mentions different attacks on LEACH, such as Sybil Attack, Selective Forwarding, and Flooding attack. It further suggests ways to secure LEACH by various methods such as hop by hop, end to end data aggregation, and other security mechanisms.

Standardization, flooding and gradient approaches have been reviewed in [21]. Flooding technique developed by IETF MANET working group aims at finding a route on-demand to optimize the number of relaying nodes. Furthermore, clustering techniques are proposed in order to divide the large network into groups so that data can be transmitted in an acceptable range. Various clustering techniques proposed by different researchers are also analysed. The location of nodes is useful for environmental monitoring and mobile applications. Sensor nodes can be equipped with a GPS function to determine their location. Hop count, Euclidian distance, and power distance can be used to measure the distance between nodes. Gradient approach, which measures the distance according to the height of nodes, is also discussed. These are different limitations and challenges that a routing protocol should be concerned with. In [16, 19], It then further analyses each protocol under each category. The comparison indicators are based on the nodes’ mobility, power consumption, data aggregation, scalability, and multiple paths. It concludes that hierarchical routing protocols are still a good approach for scalability and transmission efficiency, and more research work can be carried out to improve the energy efficiency especially for high density sensor networks. With the emergence of IoT, there will be many of applications with mobile sensor nodes, and there is a need to modify some of the existing routing algorithms to suit the new applications.

The implementation of routing algorithms for duty-cycled WSNs, consisting of sleep and wakeup schedules, several techniques are discussed in [17, 18]. Authors state that integration of tree based network with duty cycles can shorten the delay time and k-neighbourhood algorithm method reduces the number of neighbourhood connections to a desired value. For this the neighbourhood nodes adjust the sleep schedule and routing path among themselves to save energy consumption. It also explains the concerns on broadcast and multicast routing for duty-cycled WSNs. It is because a simple broadcast method is not a suitable solution because each node may have a different sleep schedule, and also collisions may be occur when more than one node sends data simultaneously. Besides, redundant data will be sent if similar or identical data is sent from neighbouring nodes in high density networks. Some analysts proposed considering the probability that a node rebroadcasts a packet in the current active time, and the probability that a node remains on after the active time when it normally would sleep in order to determine the multicast approach. Others may add a flag to the packet to store the quality of links to all its neighbours and its broadcast packet reception status to reduce redundant transmissions.

3 Wireless sensor networks: background

A lot of attention has been drawn to research on WSNs in the past decades, which has underpinned and driven the more recent evolution of WSNs towards IoT [23]. WSNs are widely used for their sensing, wireless communications and computation capabilities. WSN incorporated smart sensors that allows to provide smart services and application, the most relevant examples are depicted in Fig. 1. WSNs consist of large numbers of low cost wireless sensor nodes using low power in places where traditional networks cannot compete. Power constraint especially dominate the performance of a WSN. WSNs are differ from other traditional data communication networks in that sensors are densely deployed, and nodes can be easily damaged often because of harsh environmental conditions. In some deployments, the topology may change from time to time, requiring the links between nodes to be reconfigured which may cause some instability and require more energy. For these and other reasons, WSNs may be unstable in the field. Therefore, maintaining stable WSNs is a challenging task which requires mature monitoring and control strategies appropriate to the specific deployment.

Fig. 1
figure 1

Most relevant applications of WSNs where smart WSN equipped with powerful sensors trigger the evolution of smart services

The energy source of sensor nodes is usually battery power and nodes are required to run for long periods of time without physical maintenance. Often node sensors are difficult to access and it would be difficult to change or recharge the energy source. Thus the most challenging constraint remains energy consumption. With their usually huge number of sensor nodes, WSNs require well-defined energy efficient and adaptable routing algorithms. Sensor nodes are battery powered. The power limitation of the WSN is mainly caused by the small physical size of sensors, their batteries and the absence of a rechargeable energy supply [24,25,26]. In most applications, a WSN may consist of hundreds or even thousands of sensor nodes. These nodes have limited energy power, low storage size, and narrow bandwidth for communication. Moreover, it is usually difficult to replace or recharge batteries when they are sparsely deployed in remote environments. Every node uses power for its sensor transducer, communication among other sensor nodes and microprocessor computation. The energy required is much more than data sensing and computation. In fact, most of the energy is consumed in data communications between nodes. It is expected that the design of routing algorithm and protocols will entail crucial decisions in managing the complicated WSN environment by balancing key parameters so as to improve the robustness of networks. As an example a large number of sensor nodes are required to establish a reliable data communication network between a Utility Company and its customers. Such systems require efficient algorithms to maintain reliability of the WSNs, which implies using the limited battery power of the nodes in the most efficient way.

3.1 Criteria of a robust wireless sensor network

There are several criteria to determine a robust WSNs, and the degree of importance of each criteria will vary with different applications [27]. Here, we focus on the criteria regarding routing algorithms in WSNs.

  • Efficient power usage Energy source of sensors are mainly from a battery, we assume it is hard to charge or recharge their batteries because of the great amount of sensor nodes in often a hostile and hazardous environment. Besides, they are often difficult to be accessed. Therefore to reduce the energy used by sensor nodes it is crucial to apply energy-efficient routing algorithms to extend the lifetime of the whole WSN [24,25,26].

  • Scalability The number of sensor nodes deployed in a WSN may range from tens, hundreds, or even tens of thousands of nodes. Thus, when designing the routing algorithm, it should be scalable for different network sizes [28, 29].

  • Reliability This is also a critical factor for evaluating WSNs performance. Basically, reliability is also related to routing and power consumption because a dead sensor node cannot transmit any data. In addition, if the dead node is a cluster head, the cluster’s performance will be affected and the successful delivery ratio will be reduced. Reliability is also affected by congestion therefore, depending on the application, congestion control mechanisms in the routing algorithm are almost mandatory [30].

  • Self-organization After sensor nodes are deployed in the network environment, sensors should be able to re-organize themselves if and when nodes fail or the network topology changes. Adaptive routing protocols able to follow the real time topology change need to be deployed in such dynamic scenarios [31, 32].

  • Adaptability In sensor networks, sensor nodes can join or leave a cluster in different iterations, which will change the node density and network topology of the newly formed cluster. Thus, routing algorithms used for sensor networks should be flexible enough to cater for the frequent changes of members of a cluster [33].

  • Security A sensor network can be used to deliver personal and private data. So, a secured data communication network is essential for data transfer in order to protect the data from being copied, destroyed or altered in the path. Routing protocols should not exclude security in their operations [34, 35].

3.2 Constraints of wireless sensor networks

It is always hard to balance the ideal criteria for an ideal robust WSN, and especially when the requirements of specific WSN applications add additional constraints [36]. The more relevant constrain are depicted in Fig. 2 and discussed below:

  • Limited and unstable energy supply The number of dead nodes is an indication of the energy management across the entire WSN. This is because the energy source of sensor nodes are usually batteries which have a limited life. Networks may be used in environments, where it is difficult to change node batteries. Thus the main challenge consists in designing energy efficient routing algorithms for WSNs which balance the available energy of the entire network in the most efficient way [24,25,26].

  • Massive, random, and varying node deployment Deployment of sensor nodes can be static or random which calls for different performance requirements and routing algorithms. In many applications, sensor nodes can be scattered randomly or sparsely distributed over an area. If the distribution of the sensor nodes are changing from time to time, optimal routing algorithms need to be able to adapt to this changing network topology to manage the whole sensor network in an energy efficient way [37].

  • Unreliable network environment Some sensor nodes may be unreliable because of physical damage, malfunction, or lack of energy. This affects the performance of the WSN. Ideally the routing algorithms should be able to reconfigure themselves around dead or unreliable nodes [38].

  • Scalability Routing algorithms should adjust to different scales of the network. Sensor nodes may also be equipped with residual energy sensing ability, or special processing, and communication functions. Also, physical communication paths between different sensor nodes may vary. The ideal routing algorithm should be flexible enough to consider these different parameters in a changing environment [28, 29].

Fig. 2
figure 2

Most relevant constrains of WSNs

4 WSN routing protocols classification

As stated earlier, in order to create a robust WSN, a well-developed routing protocol is essential. The WSN routing protocols can be categorized into three different structures as follows [16, 19]:

  1. 1.

    Flat routing algorithms Sensor nodes have similar functionality in data gathering, functionalities, transmission and power consumption.

  2. 2.

    Hierarchical routing algorithms Sensor nodes are divided into several clusters. In each cluster, the node with the higher energy level is basically commonly chosen as the cluster head based on different well-know metrics.

  3. 3.

    Location-based routing algorithms Sensor nodes use geographical information to send data to specified regions. So sensor nodes need to be able to localize themselves, or their location be calculable.

Since hierarchical routing protocols are the most popular and likely the choice of IoT sensor networks, this survey will focus on further classifying and analysing several hierarchical routing protocols based on different criteria (see Fig. 3).

Fig. 3
figure 3

Routing protocols classification of WSN

4.1 Hierarchical routing algorithms background

Hierarchical routing algorithms in WSNs have been studied from a variety of angles [13]. A common method is clustering by dividing sensor nodes into groups [11]. This is a commonly used data communication technique to reduce the energy consumption by sending data from sensors to cluster head and to the base station. In hierarchical clustering, the whole sensor network is divided into different clusters or multiple layers. Transmission within a cluster is coordinated by each cluster head which is also responsible for routing between clusters or base stations. Data travels from one level to another enabling it to travel longer distances. This can make the data communication faster and more energy efficient. Thus, clustering provides data aggregation advantages among cluster heads at different levels in order to improve the performance of the whole WSN. The following categories are commonly used:

  • Single hop transmission A cluster head sends data to the base station directly without passing through other cluster heads. This is the simplest transmission method without the need to consider other information. However, it may not be suitable for a large scale network because there is a transmission distance limitation with sensors, and they are not allowed to transmit data outside a certain range. Even if the data can be transmitted, it may lead to a heavy burden on the cluster head because the energy consumption is directly proportional to the distance, and is higher for longer distances [13].

  • Multiple hop transmission Cluster heads send data to the next cluster head(s) until the base station is reached. This method can divide a single long distance into multiple shorter distances for transmissions. This can share the loading among cluster heads, and it is more suitable for large scale networks. However, a suitable routing method is needed because energy will be wasted for unnecessary transmissions. Cluster heads which are closer to the sink are always overloaded with heavy traffic which causes them to be exhausted rapidly. Clustering with unequal size has been the main solution under investigation to handle such problems [13].

Fig. 4
figure 4

Representation of hierarchical routing protocol strategies in WSN, where sensor nodes have different roles (S sink, L leader)

Hierarchical routing can be also classified into the following main categories: (1) chain-based, (2) tree-based, (3) grid-based, and (4) area-based routing. The graphical representation of hierarchical routing algorithms are depicted in Fig. 4. In this survey, the most relevant state-of-the-art algorithms have been selected for each category as show in Fig. 3. These techniques and algorithms are explained in the following sections.

Table 2 shows a comparison among hierarchical routing protocols based on energy, load balancing, delay, scalability and data loss probability and how each of those protocols perform in those fields.

Table 2 Comparison of hierarchical routing algorithms based on energy consumed by nodes and leaders

For each category of the hierarchical routing techniques we will discussed in detail the main features and characteristics in the following section. Table 3 provides a comparison of the performance of hierarchical routing algorithms regarding their energy consumption. We will later further elaborate all this metrics.

Table 3 Comparison of energy consumption of hierarchical routing protocols

5 Hierarchical chain-based routing algorithms

In chain-based hierarchical routing, the whole WSN is divided into multiple chains; a leader node is selected in each chain. Each sensor node will deliver the packets to the next nearest node until reaching the leader node (see Fig. 4a). Aggregation of data is carried out through transmission. The leader will then send the aggregated data to the base station. Chain based routing is easy to set up and maintain, there is no frequent change of the formation of the chain, and the nodes always send the data to the nearest node. Therefore the energy consumed for chain formation is low. However, the problem of chain based routing is that there may be many nodes in a chain, and if the source is far away from the leader node, then the data needs to travel a long distance to the leader. The time used for the delivery is long, and this may cause time delays, and may not be suitable for time critical applications. Besides, nodes which are very closer to the leader node will always be involved in data transmissions. This creates a heavy burden on these nodes, and the energy consumption of these nodes higher. Another consideration is that if there is a node malfunction in a chain, all the data travelling through the failed node and chain will be lost. This affects the reliability of the applications and the performance of the chain. The most relevant chain-based routing algorithms are listed below:

  1. 1.

    Power-efficient gathering in sensor information systems (PEGASIS) [39] A chain can be formed by the sink and the nodes themselves. All nodes know global network information and the location of every node. The node which is farthest away from the sink starts the formation. It will find the closest node as the next connection until it arrives at the leaders. The leaders will send the aggregated data to the sink and leaders can be rotated. This helps with load balancing to some extent because every node in the network may become involved in the data transmission, and the leaders can be rotated to share the burden. However, very long time delays may occur because of the long distances travelled from the source to the sink as the data passes through many intermediate nodes. Therefore, it is not suitable for time critical application and not appropriate for a large scale network. Several improvements of PEGASIS has been proposed in the literature. In [55] PEGASIS with Improved Network Lifetime (PEGASIS-INL) is proposed. The PEGASIS-INL selects a node as a leader only if it is within a strong communication range of base station. Similarly, in Modified PEGASIS [56], each node communicates only with a close neighbor node with minimum distance and transmits data to the base station, thus reducing the amount of energy spent per round. Thus PEGASIS-INL and Modified PEGASIS algorithms outperform PEGASIS in terms of the energy consumption and the network lifetime.

  2. 2.

    Concentric clustering scheme (CCS) [40] It is built with multiple chains in contrast with PEGASIS which has one chain only. The multiple chains of CCS are divided into different layers. One cluster head will be selected in every chain. Within each chain, all nodes will send data to the nearest node until data reaches the cluster head. Next, the cluster head on the farthest chain will send data to another cluster head on the next higher layer until it reaches the base station. Compared with PEGASIS, the distance travelled can be shortened greatly because the data is transmitted among cluster heads up to the base station. Therefore, the time delay is shorter and it is more suitable for large scale networks. However, the cluster head on the chain which is closest to the base station will have a heavier traffic loading and will likely be exhausted earlier than others. Note that residual energy is not considered in selecting the cluster head. Thus a low residual energy node may be selected and also become quickly exhausted.

  3. 3.

    Energy-balanced chain-cluster routing protocol (EBCRP) [41] The whole network is divided into many rectangular sections, and one chain is established in each rectangular area by using a ladder algorithm. Cluster heads are selected based on the residual energy and they are rotated to share the loading. The cluster head of each chain collects data from all nodes in its chain and send data directly to the base station. Compared with the greedy algorithm, the ladder algorithm that EBCRP is using, is more energy efficient because the total transmission distance may be shorter. The traffic loading and burden is shared. This avoids exhausting any one cluster head node. However, the whole network is divided into rectangles. The neighbouring nodes within a rectangle may not be in the shortest distance path, and therefore may consume more energy. Furthermore, because of single hop communication, EBCRP is not suitable for large scale networks.

  4. 4.

    Chain-based hierarchical routing protocol (CHIRON) [42] Different from the EBCRP which is formed with fixed rectangular shapes, CHIRON divides the whole network into fan-shaped areas, which is more flexible. The node farthest away from the base station starts the chain formation. The node then connects the closest node to form a chain by using a greedy algorithm. In each chain, a leader is selected based on the residual energy, the node with the highest residual energy becomes the chain leader. The chain leader will then collect the data from its chain members, transmit the data to the next chain leader, and then finally to the base station. Due to its multiple hop transmissions, this routing protocol is suitable for large scale networks. It can also shorten the transmission distance and reduce the time delay for data transmission. The fan-shaped division may be more flexible than rectangles, and the chain can be formed according to the requirements of the situation at that moment.

5.1 Chain-based routing algorithms comparison

As explained above these algorithms have advantages and disadvantages based in the main objective and scope of the network as shown in Table 4. Below the most relevant aspects of these algorithms are compared.

Table 4 Comparison of hierarchical chain-based routing algorithms

5.1.1 Energy efficiency

One of the most important issues in WSNs is optimal use of the resources in the network in order to optimize the energy consumption of each sensor node. For PEGASIS, there is one chain only, the data must be transmitted to all the nodes in the chain, and therefore the energy consumptions of nodes is very high. Also the sending of data packets to the base station will be highly concentrated on the one leader, and therefore that leader will become exhausted sooner because of the huge data volume that it needs to handle. Therefore, PEGASIS is the worst in terms of energy efficiency [39]. For CCS, the energy consumption is less than PEGASIS, because the network is divided into multiple chains in concentric circular tracks. Therefore, the length of each chain can be shortened, and data transmission does not involve as many nodes. On the other hand, the data is transmitted between chain leaders to the base station. Therefore, the burden of data transmissions can be shared among chain leaders [40]. For EBCRP, the energy efficiency is similar to CCS. The network is divided into rectangles, and therefore the length of the chain in each rectangle may be shorter than that in CCS. Therefore, fewer unnecessary nodes may be involved in data transmissions. However, due to its single hop transmission to the base station, the distance between the chain leader and the base station may be long, and therefore the chain leader will consume more energy. A drawback is that the nodes next to each other inside a rectangle may not form the shortest path. CHIRON, is the best in terms of energy efficiency among the chain-based routing algorithms. When compared to EBCRP, the network is divided into fan-shaped areas. It is more flexible than a rectangle divided in EBCRP, and therefore the probability of sending data to the next node in the shortest distance will increase and thus reduce the energy consumption. Moreover, the transmission from the chain leader to the base station is multiple hop, therefore, the individual transmission distance of each chain leader is shortened, thus sharing the burden and saving energy. Therefore, among all the chain-based routing algorithms, CHIRON seems the most energy efficient.

5.1.2 Stability and node failure

For PEGASIS, there is only one chain. If there is an exhausted or malfunctioning node, the data transmitted through that chain will be affected. The leader is selected randomly, and the residual energy of nodes is not considered. If a node with not enough energy is selected, it does not have the required amount of energy to send to the base station. Therefore, the data packets will be lost. Moreover, there is one leader for the whole network, and that leader will be exhausted quite soon [39]. For CCS, EBCRP, and CHIRON, the network is divided into multiple chains instead of one single chain. Therefore, the length of the chains is shortened, and the data transmissions can be shared by many leaders. Therefore, the data transmission is not reliant on a single leader, and the data packets lost in case of leader node failure will be greatly reduced. Among the three algorithms, CCS is the worst because the leader is selected based on the location of the chain, and residual energy is not considered. Thus if a node with little energy is selected as leader, this will affect the stability of data transmissions [40,41,42]. When comparing EBCRP and CHIRON, both algorithms consider residual energy in selecting the chain leaders. However, the single hop feature of EBCRP creates a heavy burden on the chain leader due to long transmissions distance. This quickly exhausts the chain leader and affects the stability of the entire network [41, 42]. Among all the algorithms in chain-based routing, CHIRON comes out the best.

5.1.3 Suitability for large area network

PEGASIS, is not suitable for large area networks because there is only one chain for the whole network and if one node breaks down, then all data packets will be lost. PEGASIS also has innate time delays because a data packet is required travel through all nodes in the chain to the leader. Therefore it is not suitable for large networks [39]. EBCRP is also not appropriate for large area networks because it is a single hop transmission, and the chain leader may not be located within transmission range of the base station. Also a lot of energy would be required for long distance transmissions within the transmission range in a large area network [41]. Both CCS and CHIRON are more suitable for large area networks because of multiple hop transmissions, thus transmissions to the base station from a chain leader can be shortened though multiple hop transmissions. However, CHIRON is still better because the transmission between the leaders must be in the next consecutive level in CCS. In CCS the flexibility of transmission is less than that of CHIRON, which does not have the limitations of level organization [40, 42]. Again, among all the chain-based routing algorithms, CHIRON protocol is the most suitable for large area networks.

6 Hierarchical tree-based routing algorithms

For tree-based routing, the nodes are divided into multiple branches, leaf nodes and parent nodes. The data is transmitted from the leaf node to its parent node, and further to the next parent node until it comes to the base station (see Fig. 4b). The data is aggregated and therefore some data replications can be removed. The tree topology is easy to form, every node just needs to send data to the next higher level node which is closer to the base station, and cluster formation is not required. This may reduce energy consumption for tree routing. The drawback of tree formations is that if a parent node of a tree is not functioning, then all the data transmission under its branch will be lost. Also, the parent nodes which are very close to the base station and with many branches connected will consume a lot of energy because of the greater data volume. Additionally, if the branch consists of many nodes, this may cause a time delay in data transmission and it may increase energy consumption. The most relevant tree-based routing algorithms are listed as follows:

  1. 1.

    Energy-aware data aggregation tree (EADAT) [43] The sink starts the tree formation. Every node will set a timer for itself to start the transmission, and the waiting time is associated with its residual energy. The higher the residual energy is, the shorter the waiting time will be. Then, the node will select the higher residual energy and the closest node as its parent. When the residual energy of a parent node is lower than a specific value, it will broadcast a message to let its child know. The child may then select another parent for transmission. In EADAT, the residual energy is considered in selecting the connection node and path. Therefore, the chance of selecting an exhausted node will be reduced and failure transmission can be prevented. Moreover, this can achieve some load balancing by using the node with higher energy first, and making the whole network live longer. However, although residual energy is one of the factors to select the path, the final path may not be the shortest distance, and overall more energy may be consumed, and many more nodes may be involved in data transmissions. The result is a possible increase in the total energy consumption and longer time delays.

  2. 2.

    Balanced aggregation tree routing (BATR) [44] The base station collects the global location information of all nodes and forms the routing paths. BATR will construct the minimum spanning tree based on the energy consumption, and it will calculate the number of child nodes under the tree to balance the loading. BATR can extend the lifetime of the network because it considers the energy consumption to build the routing path. Besides, the loading can be balanced by evenly distributing the child nodes among different trees. However, the residual energy of every node is not considered when creating a tree. So, some low residual energy nodes will be exhausted sooner, and induce transmission failure.

  3. 3.

    Power-efficient data gathering and aggregation protocol (PEDAP) [45] This protocol also uses the minimum spanning tree to calculate energy consumption. It uses data volume and transmission distance to build the tree. Besides that, the residual energy of nodes will be considered in data transmissions. PEDAP can balance the energy consumptions by building a minimum spanning tree with the energy and distance cost for transmission. It can also shorten the delay time after considering the distance for transmissions. However, the formation of the tree may be complex, and the energy used for calculating the path may be huge. Therefore, the setup energy may be very high, especially for large scale networks.

  4. 4.

    Enhanced tree routing (ETR) [46] Each node maintains a routing table containing the next hop information. The node will select the path with the lowest hop count. Since the routing table just stores the next hop information, the storage and computation cost is low, and this can save some energy. However, the path is selected based on the minimum hop count, and the residual energy is not considered. Therefore, a low residual energy path may be selected, and will be exhausted soon, in which nodes may fail and data will be lost.

Notice that additional techniques can be combined with tree-based routing protocol. In [57] sleep scheduled and tree-based clustering approach routing algorithm (SSTBC) for energy-efficient in WSN is been proposed. SSTBC preserves energy by turning off radio or switching to sleep mode of unnecessary nodes, which observe almost the same information, base on their location information to remove redundant data in the network.

6.1 Tree-based routing algorithms comparison

As explained above these algorithms have advantages and disadvantages based on the main objective and scope of the network as shown in Table 5. In the following the most relevant aspects of the proposed algorithms are discussed.

Table 5 Comparison of hierarchical tree-based routing algorithms

6.1.1 Centralized or decentralized organization

For EADAT, BATR, and PEDAP, the tree structure and routing path is constructed in a centralized manner by the sink. The sink will coordinate the required information of nodes to build and maintain a tree structure. Therefore, a large amount of data is required for communication between nodes and sink. Thus the setup cost may be high and it takes time to set up a tree. However, for ETR, the nodes will communicate with each other only, setting up their routing table to create a routing path. So, the routing path is formed in a decentralized manner which is more flexible, faster and adaptive to a changing environment [43,44,45].

6.1.2 Residual energy

For BATR and ETR, residual energy of the nodes is not considered in constructing a tree structure and routing path. Therefore, a node with less or not enough energy may be selected as parent, and data packets may be lost. For EADAT and PEDAP, residual energy will be a factor in forming a tree and routing path. This can prevent energy holes and balances the loading of the nodes to extend the lifetime of the whole network [44, 46].

6.1.3 Shortest path

EADAT, PEDAP and ETR will choose the shorter path for data communications, EADAT will find the parent with the shorter distance; PEDAP calculates the link cost based on the transmission distance; and ETR will decide the routing path with minimum hop counts. However, BATR will not consider the distance in routing. It just balances the tree structure and loading of nodes based on location and density of nodes [43, 45, 46].

7 Hierarchical grid-based routing algorithms

The entire network area is divided into many grids, and a leader is selected for each grid. All nodes within a grid will send data to their leader, and the leader will then send the data to the next grid’s leader until it reaches the base station (see Fig. 4c). The organization of grid-based routing is simple, the formation is based on the geographical location of the nodes. It is argued that the data can be transmitted in a more efficient way because the grid size is fixed, and only the location information of the leader of the grid is required. However if there is a relatively high number of nodes in a particular grid, this may create heavy traffic and excessively drain the leader node’s energy. The most relevant grid-based routing algorithms are listed below:

  1. 1.

    Position-based aggregator node election protocol (PANEL) [47] This uses the location information to select the aggregator. The whole network is divided geographically, and the node closest to the reference point will be the aggregator. The aggregator will collect data from the members in its cluster, and finally send them to the base station. Since every node has the chance to become an aggregator, this helps to achieve load balancing by sharing the burden of communicating with the base station. Note that some energy will be used for collecting the location of sensor nodes, and will increase the setup cost.

  2. 2.

    Two-tier data dissemination (TTDD) [48] The grid is divided into multiple cells, and some nodes are used to relay the data requested from the mobile sinks to the source. The mobile sink will send a data request to the intermediate nodes in a flooding way. The source chooses the next relay nodes by using a greedy algorithm until the data reaches the boundary of the network. The mobile sinks will move around the grids and extract the information from the closest node of the source. TTDD is suitable for on demand applications. However, the flooding method may consume a high amount of energy, and therefore it is not suitable for large scale and high traffic networks. Moreover, the movement of mobile sinks may not be in the same pace as the route formed. There may be a time delay, and retransmission may be required, which will also consume more energy.

  3. 3.

    Hierarchical geographic multicast routing (HGMR) [49] Here the whole network is divided into multiple cells depending on their geographical location. Within each cell, there is an access point to manage the location information. The network is built with a hierarchical structure. The source will deliver data from the highest level to the lowest level access points. For HGMR, different nodes will take corresponding roles in data transmissions, since the accessing points can be rotated. This helps to balance the energy consumption. As the network is divided into multiple cells and layers it is suitable for large scale networks. However, the transmission from higher level access points to lower levels does not consider the location issue. So, the path may not be the shortest which may cause some time delay, and may also consume additional energy.

  4. 4.

    Grid-based multipath with congestion avoidance routing (GMCAR) [50] The network is divided into grids. There is one leader node in each grid. The leader node will collect data from the members in its grid, and send data to the leaders in other grids. Each leader node will store a routing table, which consists of the grid density and hop count information. GMCAR will also separate high and low traffic. For low traffic there is just one path to the sink in the boundary grid. On the contrary, for high traffic there are multiple paths to the sink with non-boundary grids. Moreover, a secondary leader node will be selected to share the heavy traffic if necessary. The separation of high and low traffic, and a secondary leader if required can help to share and balance the loading of the whole network to enhance the energy efficiency, and shorten the time delay for heavy traffic with multiple paths. The leader will do the coordination work until its energy descends to a certain level.

The above list concentrates in the most popular grid-based WSNs, however several additional solutions can be found in the literature. In [58] grid-based routing protocols are enhanced via balancing a load of data traffic among sensor nodes as evenly as possible. Instead in [59], the sensor network is divided into logical grids of k-cells and a Cell Header is elected among each cell that acts as leader to cluster. For density grid-based clustering, Authors in [60] propose two new approaches (1) artificial bee colony is an swarm based optimization technique, and (2) the compressive sensing. All those techniques shown an improve in energy efficient of the overall WSN.

7.1 Grid-based routing algorithms comparison

As explained above these algorithms have advantages and disadvantages based on the main objective and scope of the network as shown in Table 6. Below the most relevant aspects of the proposed algorithms are discussed.

Table 6 Comparison of hierarchical grid-based routing algorithms

7.1.1 Time delay

GMCAR can deliver data with the shortest time delay because it can divide the network into heavy and low traffic networks with non-boundary grids and boundary grids respectively. Therefore, data can be communicated to the leader through different channels in non-boundary grids with heavy traffic. Besides, a secondary master node may be assigned to the grid for heavy traffic. This may reduce traffic jams and reduce time delay problems [50]. However, TTDD is not suitable for applications where periodic data is needed because it may introduce long time delays. This is because the sink first sends a request for data to the dissemination nodes, and the dissemination nodes will then send it to the nodes in a flooding manner. After the source node receives the request, it will then send the data to the agent nodes. Finally, the mobile sink will move to different grids and collect data from the agent node. The whole process takes a lot of time and may cause greater time delays. Thus this protocol is also not suitable for many sources or when periodic data is needed [48].

7.1.2 Energy efficiency

GMCAR is the highest energy efficient routing algorithm in its class. This is because the network is divided up based on density and there is a secondary master node to share the loading. Moreover, the master node is selected based on its residual energy, and can be rotated out when its energy drops. Therefore, this protocol can balance the loading amongst the nodes in the network [50]. Again, TTDD will consume the highest energy in data communications, because the data requests and replies involves many nodes. Also a flooding technique is applied in the data requests, and some energy will be wasted in some redundant data communications with nodes which are not the source. The cost of constructing the grid centered at the source is also very high [48].

7.1.3 Shortest transmission distance

PANEL and GMCAR can transmit data a shorter distance. is because in PANEL, a node which is closest to the reference point in a grid will be selected as the leader of the grid [47, 50]. Therefore, the leader sends data the minimum distance. GMCAR can select the routing path with the minimum hop count. It can also help to reduce the transmission distance. However, TTDD and HGMR will not consider location when selecting a leader and deciding a routing path [48].

8 Hierarchical area-based routing algorithms

The whole sensor network is divided into multiple areas, and the size of each area can be varied. The base station or sink will send a data request to the closest nodes in the area to collect the data (see Fig. 4d). Flooding of the data request will be executed until the source of the data is located. The source node will then send the data to the sink. This is suitable for mobile applications in which the mobile sink is always moving within the specific area. The most relevant area-based routing algorithms are listed as follow:

  1. 1.

    Line-based data dissemination (LBDD) [51] This is a typical area-based routing protocol. The whole network is divided into two equal areas by a vertical line of nodes. The nodes on the vertical line will store the data for serving the requests from the sinks. All nodes will know each other’s location information. The source node will send data to the closest node on the line. A sink will send a data request to the line in a perpendicular way. The node receiving the request will process it and continue to relay the request to other nodes on the lines in both directions. Finally, the node storing the data receives the request and then sends the data directly to the sink. The setup and the communication of the structure are simple. However, if the number of nodes on the line is small, there may be a high burden on them. These nodes will be exhausted quickly. Besides, if more nodes are assigned to the line, the energy consumption of all nodes involved on the line will be greater because a flooding technique is used for data requests. Therefore, this protocol is not suitable for large scale networks.

  2. 2.

    Ring routing [52] Proposes a ring topology. The operation is similar to LBDD. But, a ring is formed instead of a line. The relay nodes of the ring can be swapped with the normal nodes. The ring structure is simple to form. It improves load balancing among nodes because the relay nodes on the ring are rotated in order to protect any one node from overload. Using the ring structure, the source node can find the closest relay node in a shorter distance, and it reduces the time delay and consumes less energy in data transmissions. However, if the network is large, the ring structure setup costs may be huge because data requests are sent to all the involved nodes on the ring, and present an overhead.

  3. 3.

    Railroad [53] A data dissemination architecture named Railroad was presented for large-scale WSNs. The network is divided by one rail which coordinates the data request. The rail is located in the central part of the network for easy access of all nodes. The data request will be sent to the rail until to the source node, and the source node will send data directly to the sink. Sending the data request from the sink is by unicasts rather than flooding. A rail structure is more flexible than a line or ring structure, and the relay nodes on the rail can be easily accessed by normal nodes which can shorten the distance and time for the source node to send data to the relay nodes on the railway. However, the rail is usually quite long, and data requests transmitted along the rail may require a longer time and cause delays. These delays increase in large scale networks.

  4. 4.

    Virtual Line-based data dissemination (VLDD) [54] It is proposed to achieve energy-efficient and reliable data transmission. VLDD designs a Virtual Line Structure (VLS) for data storage. The virtual line is used for collecting data from a source. A source node knows the location information of the mobile sinks, and will calculate a suitable entry point onto the rail. The relay node on the rail will then send data to the neighbour nodes on the rail, after which mobile sink will send a data request to the VLS to obtain the data. The virtual line will be reconstructed based on the location of the mobile sinks. Different from flooding based transmissions, the nodes will calculate the suitable entry point to send, thus avoiding unnecessary delivery, reducing energy consumption and shortening the delivery time. However, similar to the problems that other area-based routing algorithm face, VLS also suffers from increased overhead and transmission delays for large scale networks.

8.1 Area-based routing algorithms comparison

As covered above, area based routing algorithms have advantages and disadvantages based on the main objectives and scope of the network as shown in Table 7. Below the most relevant aspects of the proposed algorithms are discussed.

Table 7 Comparison of hierarchical area-based routing algorithms

8.1.1 Scalability

VLDD is most suitable for large area networks because the virtual line structure can be reorganized according to the situation of a particular environment. It will also take into account the location of nodes [54]. Therefore, the formation of the virtual line is more flexible for adaptation to different network sizes. As for the Railroad method, this is also suitable for large area networks because the rail can be constructed in a flexible way and can be formed closer to the source nodes [53]. Concerning Ring Routing, this is less suitable for large area networks if the ring is too small. However, if the ring is too large, the distance of the inner source nodes to the ring is also large. Therefore, it is less suitable for large area networks when compared with VLDD and Railroad. LBDD is the worst case in terms of scalability because there is only one vertical straight line created in the middle of the network to store the data from the source nodes. Without considering the location of nodes, the distance between the sources nodes and the vertical line may be very long, and it is not suitable for large scale networks [51].

8.1.2 Energy efficiency

VLDD is the most energy efficient because the virtual line created is based on the location of the nodes, and can thus shorten the distance between the source nodes and the virtual line. The nodes on the virtual line can be swapped. This helps to balance the energy consumption of nodes [54]. The Railway method is also good for energy efficient transmission because the Railway can be formed closer to the source nodes and the data can be transmitted for a shorter distance. Besides, a unicast method is used instead of flooding, which can reduce unnecessary transmissions and save energy [53]. As for Ring Routing, the source nodes inside the ring may need to send data to the nodes on the ring for a longer distance, and therefore it will consume more energy [52]. LBDD is the worst in energy efficiency because the vertical line is formed in the middle of the network, and the sources nodes need to send data to the line from a longer distance [51]. Besides, the leader nodes on the line send requests to each other on the line in a flooding manner and this would waste a lot of energy.

9 Low-energy adaptive clustering hierarchy (LEACH) routing algorithm

LEACH is the best-known protocol in clustering in WSNs [22]. In this protocol, the cluster heads are selected based on their energy threshold value, they send advertisement messages with CSMA protocol to the whole WSN as shown in Fig. 5. For each sensor node, it will join the cluster from which it receives the strongest invitation message. Next, cluster heads prepare a TDMA scheduling program to manage data transfer from cluster members. This will prevent data collision and reduce energy consumption. Finally, the TDMA schedule in cluster nodes will be received. After that it goes to the Steady State Phase. Sensors send their specific data to respective cluster heads and cluster heads receive, aggregate and finally send them to the base station. LEACH is the commonly applicable hierarchical clustering algorithm for designing energy efficient WSNs which aim to reduce the power consumption over the whole WSN. For LEACH, the selection of cluster heads are rotated among the nodes in a cluster based on a specific period of time. Each cluster head will gather the data and transmit it to the base station. The clustering method can extend the lifetime of the WSN [13, 14]. Moreover, LEACH also uses aggregation techniques to combine the original data into a smaller packet size for transmission so that only the required information will be aggregated and forwarded to the base stations in order to save energy and bandwidth.

Fig. 5
figure 5

Representation of LEACH routing protocol strategy (S sink, L leader)

The role of cluster head is rotated. Thus every node will have an equal chance to act as cluster head in order to avoid depleting individual nodes and losing sensors. No global information of the network is required. This protocol can greatly reduce the energy used for data communications between sensors within its clusters and other cluster heads. It can also switch the non-active sensor nodes into sleep mode to save energy. LEACH uses single-hop routing methods for data transmission such that each node can send data to the cluster head which then gathers the data and sends it directly to the base station. The cluster heads will consume a lot of energy when they are located far away from the base station and it is therefore not feasible to implement this routing algorithm into large scale WSN applications. Furthermore, the idea of rotation of cluster heads constitutes an extra power overhead for the whole sensor network, e.g. cluster heads rotation, advertisements broadcasts etc, which may also lower the power available to sensors. In addition, LEACH may not guarantee a fair and uniform cluster head distribution based on remaining energy because cluster heads are selected randomly.

9.1 Latest improvements on LEACH

Since LEACH is the basic WSN routing algorithm [22], many researchers have proposed some improvements on LEACH by also considering energy requirements. The energy levels may refer to the initial, current, average or total energy. Other variations may use the distances, number of clusters, area of coverage, cluster size, and moving window size. Below the latest improved LEACH versions are summarised (see Table 8):

  • Ali et al. [61] proposed A-LEACH, which uses the initial energy and current energy of nodes for calculating the energy factor. It also considers the most suitable number of clusters among the total number of clusters in selecting a cluster head.

  • Tong and Tong [62] proposed Leach-B. It maintains a desired percentage of cluster heads. For the first round, cluster heads will be selected randomly. After that, if the number of clusters is less than the desired percentage, the node with shorter time interval will be in higher priority to be a cluster head, and the time interval is inversely proportional to the node’s residual energy. The value of time interval is set as \(t=k/E\), where k is a selected factor and E is the residual energy of each node. That means the nodes with highest residual energy will be the cluster heads. On the contrary, if the total number of clusters is higher than the desired number, then the cluster heads with lowest residual energy will become normal nodes.

  • Mehta et al. [63] introduced C-LEACH, the author considered the balancing of the clusters’ size which is uniformly distributed across the whole network, with considering the minimum and maximum number of members in each cluster and sets a threshold value for them. Moreover, C-LEACH also considers the current and initial energy of nodes.

  • Tripathi et al. [64] proposed LEACH-CE. According to the LEACH-C algorithm nodes with higher than the average nodes’ energy are selected as cluster heads. However, since nodes with residual energy higher than average do not imply the highest energy, cluster heads may also die quickly if their energy was only a little above average. Therefore, LEACH-CE suggests to select the node with maximum residual energy in the cluster as the final cluster head in order to balance the energy usage of nodes.

  • Handy et al. [65] proposed Deterministic Cluster-Head Selection (DCHS), it selects the cluster head not only based on the current and maximum energy, but also considering the number of rounds that a node has not been a cluster head.

  • Xu et al. [66] proposed E-LEACH, which also considers the current and initial energy of nodes. However, when calculating the probability of a cluster head it consider the distance to the base station and area covered.

  • Azim and Islam [67] proposed Hybrid LEACH (H-LEACH), it calculates the differences between the current and initial energy and also uses the number of clusters and total number of nodes in determining the selection of cluster heads.

  • Beiranvand et al. [68] proposed I-LEACH, which considers the residual energy, distance to the base station, and number of neighbor nodes in selecting a cluster head. I-LEACH will compare each item above with the average of all nodes in the network. Therefore, the nodes with higher residual energy, the shorter distance to the base station and more number of neighbor nodes will have a higher opportunity to become a cluster head.

  • Udompongsuk et al. [69] proposed a hybrid approach to enhance the cluster head selection probability and with moving window instead of using either initial or maximum energy and the protocol is therefore called Moving window Average and selection Probability or MAP.

  • Li et al. [70] proposed N-LEACH which uses the current and initial energy to do the calculation. However, it uses its own determined probability in selecting the cluster heads rather than considering the number of cluster heads and the total number of nodes.

  • Jin et al. [71] proposed PLEACH which considers the average energy of all nodes. It will compare the current energy of a node with the average energy of all nodes. If a node has a higher positive remaining energy compared to the average energy, then it will be have a higher priority for selection as a cluster head.

  • Manzoor et al. [72] proposed Quadrature LEACH. Each node will send its location information to the base station. Quadrature LEACH divides the whole network area into four equal regions according to the nodes’ location. Within each quadrant, some cluster heads are selected to coordinate the data transmissions of its members. This can balance the loading of cluster heads and with better coverage.

  • Wang et al. [73] proposed LEACH-SWDN which uses nodes’ residual energy to select cluster heads. To keep a stable number of cluster heads, LEACH-SWDN uses a sliding window control for the cluster heads selection criteria. The algorithm will dynamically selecting cluster head according to the number of alive nodes. It will consider the initial energy of the nodes, and the average energy of alive nodes which have not been a cluster head for that particular cycle.

  • Hou et al. [75] proposed T-LEACH. To calculate the probability of being a cluster head, the author uses the total energy of all nodes, rather than the initial energy of each node. Moreover, it also takes the distance between the member nodes within its cluster into consideration, in addition to the number of nodes and cluster heads.

  • Ren et al. [76] proposed U-LEACH because generally cluster heads further away from the base station will consume more energy in data transmission because of the longer transmission distance. Therefore, U-LEACH proposes dividing the network into concentric circles, and the clusters which are most far away from the base station should have a smaller cluster size. Conversely, as clusters come closer to the base station their size increases. Residual energy, distance and weight factors are also considered in selecting a cluster head. This is done to reduce hotspots, i.e. higher use for cluster heads which are far away from the base station.

  • So-In et al. proposed W-LEACH [77] based on K-LEACH [74]. Instead of using the current energy, W-LEACH uses the moving average energy consumption based on the window size of the energy. It can consider that a node may consume different amount of energy due to various conditions of sensors.

Table 8 Comparison of latest improved LEACH algorithms for the cluster formation

As explained above these algorithms provide several improvements over LEACH by taking into account additional metrics for implementing the routing techniques as summarized in Table 8, which compares the latest improved LEACH algorithms for cluster formation. Notice that several works have analyzed more in deep the performance between LEACH and specific routing protocols [78,79,80]. The comparisons are done on the basis of different factors that have to be considered while choosing methodology for particular project of WSN.

10 Challenges and future research trends

WSN has become one of the technologies for the future by driving the consolidation of IoT as one of the most innovative technology. Nowadays, smart sensor nodes are involved in almost every field of life, supporting various monitoring and tracking applications. Resource constraints pertaining to WSNs impose several challenges and therefore new frontiers of research [81, 82] in the field. Some of the most relevant challenges that are driving the research trends in WSN are:

  • Security The future of WSN is evolving in a subset of IoT technology. Predicting millions of nodes being added to WSNs and the internet will provide malicious actors with innumerable number of attacks, especially because the sensor nodes suffer from security holes [83]. There are many reasons behind the state of insecurity in WSN, limited computational resources, limited energy budget and limited memory are the most relevant. In the future, security concerns will no longer be limited to the protection of sensitive information and assets, it will extend to the protection of sensitive sensor as example healthcare. Therefore WSN cyber-security architectures, protocols and algorithms will be the focus of research [84]

  • Privacy The IoT creates unique challenges to privacy, many that go beyond the data privacy issues that currently exist in WSN, because WSNs were originally designed to operate in private networks. Integrating sensors devices into private environments without properly protection and connecting to Internet is becoming a common use. However, the collection of private information using WSN exposes legal and regulatory challenges facing data protection and privacy law [85].

  • Regulatory standards Regulatory standards for managing data from collection, storing and usage are missing. There is a need for clear guidelines on the retention, use, and security of the WSN data. This will drive the new generation of sensor on hardware and software [86].

  • Compatibility and longevity IoT is growing in many different directions. This will cause difficulties and require the deployment of extra hardware and software when connecting sensor devices. Other compatibility issues stem from non-unified cloud services, lack of standardized M2M protocols and diversities in firmware and operating systems among sensor devices [83, 87].

  • Real-time data processing and analysis Analysis of data on a real-time or a near real-time basis, driving timely decision making and action are becoming mandatory specially on industrial WSN [88, 89].

However the most challenging issue in WSN remain the limited amount of energy in the sensors. While renewable energy technologies similar to solar and wind panels are not new, the systems are far too large or invasive (in several applications) for WSNs. Thus, the design of networking protocols for such WSNs powered by ambient energy harvesting remain the most relevant research trend in WSN [90].

11 Final discussion and conclusions

We have classified and discussed hierarchical routing techniques as (1) chain-based, (2) tree-based, (3) grid-based, and (4) area-based routing above. For each category of the hierarchical routing techniques we have discussed in detail the main features and characteristics. Table 9 shows a comparison among hierarchical routing specify protocols based on load balancing, data aggregation, delay, suitability for large scale, mobility and implementation cost while highlighting the relevant key features of each protocol.

Concluding our survey Routing techniques in WSNs is a well know area of research, with a wide set of research results. In this survey, we presented a comprehensive survey of hierarchical routing techniques in WSNs which have been covered in the literature. Those techniques have the common objective of extending the lifetime of the WSN, while not compromising the performance of data delivery. Furthermore, we classified hierarchical routing techniques based on the routing techniques. We also highlighted some of the most relevant metrics of the routing paradigm, as well as the advantages and disadvantages of each routing technique. Finally, we presented a deeper analysis of the latest improvements provided for variations of the LEACH protocol.

Table 9 Comparison of hierarchical routing protocols relevant features