
1 Introduction

A wireless sensor network (WSN), which is composed of many sensors, is an important technology of the Internet of Things. WSNs play a vital role in fields such as object tracking, military reconnaissance, and biological conservation. Because sensor nodes are small, a single node carries very little energy, and because the nodes are usually deployed in complex environments, this energy cannot be replenished in time. Scholars have proposed solutions from different research perspectives. For example, in the field of data aggregation, Mohanty et al. propose an energy-efficient, unstructured data aggregation and delivery protocol named ESDAD [1]. ESDAD uses a cost function to determine the next-hop node without relying on a fixed structure. It can accurately calculate the waiting time of packets at every intermediate node, so that data are aggregated efficiently along the routing paths. As another example, in the field of sleep scheduling, Mostafaei et al. propose a sleep scheduling algorithm named PCLA, which is realized with learning automata [2]. The main idea of PCLA is to minimize the number of activated sensors needed to cover the regions of interest while preserving the connectivity of the sensors. In addition, many other techniques have been used to address this problem, such as secondary nodes [3], multi-sink coordination [4], and mobile agents [5].

Such solutions are of limited use for improving network lifetime and balancing node load. For example, when PCLA reduces network traffic to extend the network lifetime, it also reduces the sensing range of the network. In recent years, to improve network lifetime and balance node load, many researchers have worked on improving the network protocol, with clustering algorithms as the research focus. Early on, Heinzelman et al. proposed the clustering routing algorithm LEACH [6]. LEACH optimizes the network communication architecture to some extent and eliminates hot spots in the network by introducing a cluster-head rotation mechanism. However, it selects cluster heads by probability and threshold, so cluster heads far from the sink node consume more power and the node load is unbalanced. Many improved protocols are based on LEACH, such as the energy-efficient routing protocol TEEN [7], the self-adaptive periodic energy-efficient routing protocol APTEEN, and the energy-efficient data protocol PEGASIS. Besides these clustering algorithms, many other algorithms have been proposed to solve application problems [8, 9].

The above briefly introduces classical and more recent algorithms that extend the network lifetime by improving the network protocol. Existing routing algorithms mainly use the spatial distance [10,11,12] to describe the relationship between two nodes. This relation model depends on GPS positioning, which increases the production cost of the nodes. Moreover, because electromagnetic waves are reflected and attenuated differently in different transmission media, the spatial distance is of limited use in practical scenarios. To eliminate the adverse effects of the deployment environment on communication, and to dispense with complex distance-measurement software and hardware, this paper uses the Energy Consumption Distance (\( ECD \)), namely the communication energy consumption per unit of data, in place of the spatial distance to describe the relationships between nodes.

2 Model Construction

2.1 Network Model Hypothesis

H1: In the target area, \( N \) wireless sensors are randomly distributed within a circle whose center is the Sink node and whose radius is \( R \).

H2: Each node, including the Sink node, has a unique identifier, namely \( ID \).

H3: All source nodes (ordinary nodes) have the same data processing and communication abilities, and sense the same amount of data per unit time.

H4: The data transmission power can be adjusted within the range \( 0 - p_{\hbox{max} } \).

H5: The Sink node has infinite energy and strong data processing ability. Source nodes have limited energy, and all of them have the same initial energy.
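As an illustration of hypotheses H1, H2, and H5, the following Python sketch generates a random deployment. The function name `deploy_nodes`, the choice of a uniform density over the disc, and the use of ID 0 for the Sink are assumptions made for this example only; the paper states only that the nodes are distributed randomly.

```python
import math
import random


def deploy_nodes(n, radius, initial_energy, seed=None):
    """Place n source nodes at random inside a disc centred on the Sink.

    Assumption: positions are uniform over the disc area. The Sink sits at
    the origin and is given ID 0; source nodes get IDs 1..n (H2).
    """
    rng = random.Random(seed)
    nodes = []
    for node_id in range(1, n + 1):
        # Taking the square root of a uniform sample gives a uniform
        # density over the area rather than clustering near the centre.
        r = radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        nodes.append({
            "id": node_id,
            "x": r * math.cos(theta),
            "y": r * math.sin(theta),
            "energy": initial_energy,  # identical initial energy (H5)
        })
    return nodes


nodes = deploy_nodes(n=100, radius=50.0, initial_energy=2.0, seed=1)
```

Fixing the seed reproduces the same deployment, which is convenient when several algorithms are to be compared on identical node positions.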

2.2 Node Relation Model

In this paper, an energy consumption distance (\( ECD \)) node relation model is proposed. The core idea of this model is that the data transmission power of nodes in the system is divided equally into \( n \) segments, sorted in ascending order as \( M_{1} \), \( M_{2} ,\, \ldots \,,M_{n} \). If the transmission power with which \( a_{i} \) sends data to \( a_{j} \) falls within \( M_{k} \), the \( ECD \) from \( a_{i} \) to \( a_{j} \) is denoted \( ECD\left( {a_{i} ,a_{j} } \right) \) and is calculated by formula (1).

$$ ECD\left( {a_{i} ,a_{j} } \right) = P\left( {M_{k} } \right) = \frac{{p_{\hbox{max} } }}{n}k $$
(1)
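Formula (1) can be sketched in Python as follows. The function name `ecd_from_power` and the boundary convention (segment \( M_{k} \) is taken to cover powers in the half-open interval from \( (k-1)p_{\max}/n \) to \( k\,p_{\max}/n \), upper end included) are assumptions for illustration; the paper does not state how boundary values are assigned.

```python
import math


def ecd_from_power(p, p_max, n):
    """Return the ECD of a link whose required transmission power is p.

    Assumption: segment M_k covers ((k - 1) * p_max / n, k * p_max / n].
    """
    if not 0 < p <= p_max:
        raise ValueError("p must lie in (0, p_max]")
    step = p_max / n
    k = math.ceil(p / step)   # index of the segment that contains p
    k = min(max(k, 1), n)     # guard against floating-point drift
    return step * k           # formula (1): ECD = (p_max / n) * k


# With p_max = 8 and n = 4 the segments are 2 units wide, so a power of 3
# falls into M_2 and the ECD is 4.
print(ecd_from_power(3.0, p_max=8.0, n=4))
```

Note that every power inside one segment maps to the same \( ECD \), so the model deliberately trades resolution for a small, discrete set of link costs.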

According to the symmetry of radio transmission, if node \( a_{j} \) is in the \( M_{k} \) field of node \( a_{i} \), then node \( a_{i} \) must also be in the \( M_{k} \) field of node \( a_{j} \). This gives the conclusion shown in formula (2).

$$ ECD\left( {a_{i} ,a_{j} } \right) = ECD\left( {a_{j} ,a_{i} } \right) $$
(2)

The \( ECD \) list is the basis of link selection. Formulas (1) and (2) give the expression of \( ECD \) and its symmetry property, and provide the Sink node with the parameters for building the \( ECD \) list. The element structure of the \( ECD \) list is shown in Fig. 1; the list reflects the connection relationship of every pair of nodes that can be connected. Each element contains the \( ID \), the energy (\( E \)), and the Energy Consumption Distance (\( ECD \)) of the two connected nodes, which is the information necessary and sufficient for link selection.

Fig. 1. The structure of the \( ECD \) list

According to the symmetry of \( ECD \) (formula (2)), for a network with \( N \) source nodes the final \( ECD \) list has \( C_{N}^{2} + N \) elements, including the \( N \) relationships between each source node and the Sink node. Once the sensor nodes are deployed and do not move, the \( ECD \) list only needs to update the residual energy of the nodes each time, and does not need to update their \( ECD \) information at every moment. This structure effectively saves node energy and improves the response speed of the network.
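The element count can be checked with a small Python sketch of the \( ECD \) list. The class `EcdEntry`, the function `build_ecd_list`, and the callback `ecd_of` are hypothetical names, and the sketch assumes that every pair of nodes can be connected, so the count it produces is the maximum stated in the text.

```python
from dataclasses import dataclass
from itertools import combinations
from math import comb


@dataclass
class EcdEntry:
    id_a: int        # ID of the first node
    id_b: int        # ID of the second node (may be the Sink)
    energy_a: float  # residual energy E of the first node
    energy_b: float  # residual energy E of the second node
    ecd: float       # symmetric ECD between the two nodes


def build_ecd_list(energies, ecd_of, sink_id=0):
    """energies maps source-node ID to residual energy; ecd_of(i, j) gives the ECD."""
    entries = []
    # One entry per unordered pair of source nodes (symmetry, formula (2)).
    for i, j in combinations(sorted(energies), 2):
        entries.append(EcdEntry(i, j, energies[i], energies[j], ecd_of(i, j)))
    # One entry per source node for its relationship with the Sink,
    # whose energy is treated as infinite (H5).
    for i in sorted(energies):
        entries.append(
            EcdEntry(i, sink_id, energies[i], float("inf"), ecd_of(i, sink_id))
        )
    return entries


energies = {1: 2.0, 2: 2.0, 3: 2.0, 4: 2.0}
ecd_list = build_ecd_list(energies, ecd_of=lambda i, j: 1.0)  # dummy ECD values
assert len(ecd_list) == comb(4, 2) + 4  # 6 + 4 = 10 entries
```

Because the \( ECD \) fields are fixed after deployment, only the two energy fields of each entry would need refreshing during operation.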

2.3 Energy Consumption Model

Related energy consumption algorithms mainly adopt the wireless communication energy model proposed by Heinzelman et al. On the basis of Heinzelman’s model, this paper proposes an improved model that incorporates formula (1). The expressions of the improved energy consumption model are as follows.

(i) Energy consumption of sending node \( a_{i} \) is shown in formula (3).

$$ E_{T} \left( {l,ECD\left( {a_{i} ,a_{j} } \right)} \right) = l \cdot ETX + l\frac{{ECD\left( {a_{i} ,a_{j} } \right)}}{L} $$
(3)

(ii) Energy consumption of receiving node is shown in formula (4).

$$ E_{R} \left( l \right) = l \cdot ERX $$
(4)

where \( L \) represents the amount of data transferred per unit time, \( l \) represents the amount of data that \( a_{i} \) sends to \( a_{j} \), \( ETX \) represents the circuit energy consumption per unit of data at the sending node, and \( ERX \) represents the circuit energy consumption per unit of data at the receiving node.
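Formulas (3) and (4) translate directly into code. In the Python sketch below the function names and the numeric values are illustrative assumptions; only the two expressions themselves come from the model above.

```python
def tx_energy(l, ecd, etx, L):
    """Formula (3): energy spent sending l units of data over a link with the given ECD."""
    return l * etx + l * ecd / L


def rx_energy(l, erx):
    """Formula (4): energy spent receiving l units of data."""
    return l * erx


# Illustrative numbers only: 100 units of data, ECD = 4, ETX = ERX = 0.5,
# and L = 10 units of data per unit time.
sent = tx_energy(100, ecd=4.0, etx=0.5, L=10)  # 100*0.5 + 100*4/10 = 90.0
received = rx_energy(100, erx=0.5)             # 100*0.5 = 50.0
```

Dividing \( ECD \) by \( L \) converts the per-unit-time transmission cost into a per-unit-data cost, which is why the second term of formula (3) scales with \( l \).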

3 The Multi-hop Routing Algorithm Based on the Path Goodness Degree

A multi-hop routing sensor network is composed of links, and two links interact with each other through the nodes they share. During network operation, every source node in the network has one link that carries its sensed data and delivers it to the Sink node. This paper proposes a routing algorithm, MRPGD, whose core idea is that all source nodes in the network use a path contrast strategy to select the best-quality link. To eliminate hot spots caused by crossing paths, the link selection process is executed continually. The path contrast strategy is implemented by quantifying paths and comparing them. Here, we define \( Q \) as the path goodness degree to quantify path quality. We first analyze the two key parameters of \( Q \): the link survival time \( T \) and the node average data redundancy \( R \). We then summarize the routing algorithm based on \( Q \).

3.1 The Analysis of Link Model

The link model constructed from the \( ECD \) relations among nodes is shown in Fig. 2, where the vertical direction represents the residual energy of the nodes and the horizontal direction represents the \( ECD \) relations between nodes. It should be pointed out that the \( ECD \) between nodes in this model is not additive. For example, given three nodes \( a_{i} \), \( a_{j} \), and \( a_{k} \), formula (1) implies that in general \( ECD\left( {a_{i} ,a_{k} } \right) \ne ECD\left( {a_{i} ,a_{j} } \right) + ECD\left( {a_{j} ,a_{k} } \right) \). This model provides an intuitive basis for analyzing the link survival time \( T \) and the node average data redundancy \( R \).

Fig. 2. The link model without data fusion

3.2 The Link Survival Time

The quality of a link is determined by the lifetime of the nodes in it. If every node in the link survives relatively long, the link can undertake more data relay transmission. Assuming that there are \( m \) source nodes in the link, the link survival time can be obtained by analyzing the energy consumption model and the link model (Fig. 2).

Definition 1: The link survival time \( T \) is the running time during which the link has no dead node; \( l_{R} \) is the amount of data received by node \( a_{n} \) while the link runs stably for time \( t \); and \( l_{n} \) is the amount of data sensed by node \( a_{n} \) in time \( t \).

According to the energy consumption model, the energy consumption \( E_{n} \) of node \( a_{n} \) equals the sum of the data receiving energy consumption \( E_{R} \) and the data sending energy consumption \( E_{T} \). Substituting \( l_{R} \) and \( l_{n} \) gives the expression of \( E_{n} \) shown in formula (5).

$$ E_{n} = E_{R} \left( {l_{R} } \right) + E_{T} \left[ {\left( {l_{R} + l_{n} } \right),ECD\left( {a_{n} ,a_{n + 1} } \right)} \right] $$
(5)

Substituting formulas (3) and (4) into formula (5), the energy consumption \( E_{n} \) of node \( a_{n} \) in the link can be expanded as shown in formula (6).

$$ E_{n} = l_{R} \cdot ERX + \left( {l_{R} + l_{n} } \right) \cdot ETX + \left( {l_{R} + l_{n} } \right)\frac{{ECD\left( {a_{n} ,a_{n + 1} } \right)}}{L} $$
(6)

According to the hypotheses of the network model, all nodes in the network sense the same amount of information. Considering also that no data fusion takes place during retransmission, \( l_{R} \) is the sum of the amounts of information sensed by the \( n - 1 \) nodes that precede node \( a_{n} \) in the link, and \( l_{n} \) is the amount of information sensed by node \( a_{n} \) itself. The relation between \( l_{R} \) and \( l_{n} \) is therefore given by formula (7).

$$ l_{R} = \left( {n - 1} \right)l_{n} $$
(7)

Substituting formula (7) into formula (6), so that \( l_{R} + l_{n} = n \cdot l_{n} \), the expression of \( E_{n} \) becomes formula (8).

$$ E_{n} = l_{n} \left[ {\left( {n - 1} \right) \cdot ERX + n \cdot \left( {ETX + \frac{{ECD\left( {a_{n} ,a_{n + 1} } \right)}}{L}} \right)} \right] $$
(8)

According to the link model (Fig. 2), the running time \( t \) of the link is the working time of the node. Because the amount of data \( l_{n} \) sensed by the node equals the product of the working time \( t \) and \( L \), the relation between \( t \) and \( l_{n} \) is shown in formula (9).

$$ l_{n} = t \cdot L $$
(9)

Substituting formula (9) into formula (8), the relation between \( E_{n} \) and \( t \) is obtained as shown in formula (10).

$$ E_{n} = t\left[ {\left( {n - 1} \right)ERX \cdot L + n \cdot \left( {ETX \cdot L + ECD\left( {a_{n} ,a_{n + 1} } \right)} \right)} \right] $$
(10)

Based on the above, the relationship between the link lifetime and the node energy consumption is obtained under the stable running condition of the link (Definition 1). If the energy \( E\left( n \right) \) of node \( a_{n} \) is exhausted after the network has run for time \( t_{n} \), we call \( t_{n} \) the maximum survival time of node \( a_{n} \) in the link. Thus \( t_{n} \) and \( E\left( n \right) \) satisfy formula (10); substituting them into formula (10) and solving for \( t_{n} \) gives formula (11).

$$ t_{n} = \frac{E\left( n \right)}{{\left( {n - 1} \right)ERX \cdot L + n \cdot \left[ {ETX \cdot L + ECD\left( {a_{n} ,a_{n + 1} } \right)} \right]}} $$
(11)

From Fig. 2, the link is composed of a finite number of nodes. If any node between the source node and the Sink node fails, the link is considered dead. Therefore, the link survival time \( T \) is the minimum of the maximum survival times of all nodes in the link, as shown in formula (12).

$$ T = \hbox{min} \left\{ {t_{1} ,t_{2} , \cdots ,t_{n} } \right\} $$
(12)
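Formulas (11) and (12) can be illustrated with a short Python sketch. The function names, the list-based representation of a link, and the numeric values are assumptions for this example; positions are counted from the source end of the link, starting at 1.

```python
def node_survival_time(position, energy, ecd_next, etx, erx, L):
    """Formula (11): maximum survival time t_n of the node at 1-based position n."""
    n = position
    drain_per_unit_time = (n - 1) * erx * L + n * (etx * L + ecd_next)
    return energy / drain_per_unit_time


def link_survival_time(energies, ecds, etx, erx, L):
    """Formula (12): T is the minimum of the per-node survival times.

    energies[i] and ecds[i] describe the node at position i + 1, where
    ecds[i] is its ECD to the next hop (the last hop ends at the Sink).
    """
    times = [
        node_survival_time(pos, e, d, etx, erx, L)
        for pos, (e, d) in enumerate(zip(energies, ecds), start=1)
    ]
    return min(times), times


# Hypothetical three-node link with unit circuit costs (illustrative only).
T, times = link_survival_time(
    energies=[12.0, 12.0, 12.0], ecds=[2.0, 1.0, 3.0], etx=1.0, erx=1.0, L=1.0
)
```

In this example all three nodes start with the same energy, yet the node closest to the Sink limits the link, because it relays the data of every node before it.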

3.3 The Link Average Data Redundancy

The link average data redundancy is the average amount of communication data that the non-failed nodes could still transmit at the moment the link fails. The expression of the average data redundancy \( R \) is shown in formula (13).

$$ R = \frac{1}{n}\sum\nolimits_{i = 0}^{n} {L\left( {t_{i} - t_{k} } \right)} $$
(13)

Suppose \( a_{k} \) is the shortest-lived node in the link. Then \( L\left( {t_{i} - t_{k} } \right) \) represents the amount of data that node \( a_{i} \) could still send to the next node when node \( a_{k} \) fails.
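A minimal Python sketch of the average data redundancy follows. The function name `average_data_redundancy` is hypothetical, and the sketch assumes that the average is taken over all nodes of the link, with the shortest-lived node contributing zero; the index range of formula (13) may be intended differently.

```python
def average_data_redundancy(times, L):
    """Average remaining data over the nodes of a link.

    times holds the maximum survival time t_i of every node in the link;
    the shortest-lived node a_k contributes L * (t_k - t_k) = 0.
    """
    t_k = min(times)  # survival time of the shortest-lived node a_k
    return sum(L * (t_i - t_k) for t_i in times) / len(times)


# Illustrative values: three nodes, L = 10 units of data per unit time.
R = average_data_redundancy([4.0, 2.0, 3.0], L=10)  # (20 + 0 + 10) / 3 = 10.0
```

A value of zero means that all nodes of the link would be exhausted at the same moment, so no node is left with capacity it could still have used.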

The path goodness degree (PGD) is denoted \( Q\left( {T,R} \right) \). For two paths of the same source node, with PGDs \( Q_{1} \left( {T_{1} ,R_{1} } \right) \) and \( Q_{2} \left( {T_{2} ,R_{2} } \right) \) respectively, the comparison rule is shown in formula (14).

$$ \begin{aligned} \left. \begin{aligned} \;\;\;\;\;\;\;\;T_{1} < T_{2} \hfill \\ T_{1} = T_{2} \& \& R_{1} < R_{2} \hfill \\ \end{aligned} \right\}Q_{1} < Q_{2} \hfill \\ \left. \begin{aligned} \;\;\;\;\;\;\;\;T_{1} > T_{2} \hfill \\ T_{1} = T_{2} \& \& R_{1} > R_{2} \hfill \\ \end{aligned} \right\}Q_{1} > Q_{2} \hfill \\ \end{aligned} $$
(14)

If the survival time and the average data redundancy of two or more links are equal, namely \( T_{1} = T_{2} \& \& R_{1} = R_{2} \), the link quality is the same. To guarantee the uniqueness of the link selection and the rigor of the algorithm, and to reduce its complexity, the link whose \( Q \) value is obtained first is used as the data transmission path of the source node.
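The comparison rule of formula (14), together with the tie rule above, amounts to a lexicographic comparison on \( \left( {T,R} \right) \) in which an equal candidate never replaces the current best. The Python sketch below illustrates this; the function name `select_best_path` and the tuple layout of the candidates are assumptions for the example.

```python
def select_best_path(candidates):
    """Pick the path with the largest Q(T, R).

    candidates is an iterable of (path, T, R) tuples in the order in which
    their Q values are obtained. Python compares tuples lexicographically,
    so (T, R) > (T_best, R_best) holds when T is larger, or when T is equal
    and R is larger, as in formula (14). The strict comparison keeps the
    first-obtained path when Q values are equal.
    """
    best = None
    for path, T, R in candidates:
        if best is None or (T, R) > (best[1], best[2]):
            best = (path, T, R)
    return best


# Illustrative candidates: B beats A on R, C ties with B and is ignored,
# D has a larger R but a smaller T and therefore loses.
paths = [("A", 5.0, 1.0), ("B", 5.0, 2.0), ("C", 5.0, 2.0), ("D", 4.0, 9.0)]
print(select_best_path(paths))  # ('B', 5.0, 2.0)
```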

4 Simulation Experiment

To evaluate the performance of the proposed MRPGD algorithm, this paper uses MATLAB R2016a as the experimental platform and simulates the three algorithms LEACH, DEEC, and UCDP [13] under the same experimental conditions. For a better discussion, we limit the maximum path length of the MRPGD algorithm to 2 and 5, which yields the MRPGD-2 and MRPGD-5 algorithms.

The experiments are divided into two groups. The first group tests the survival of nodes under the four algorithms within the effective time of the network (this paper regards the network as failed once 50% of the nodes are dead). The second group compares the standard deviation of the amount of data transmitted by the nodes.

4.1 Unified Energy Consumption Model and Parameter Settings

Since the routing algorithm proposed in this paper is based on the node \( ECD \) relationship model, any two nodes only need to obtain the energy consumption distance value through a tentative connection. The implementations of LEACH, DEEC, and UCDP, in contrast, rely on node coordinates and use the free-space and multipath attenuation energy consumption models. The expressions of the two models are shown in formula (15).

$$ E_{T} \left( {l,d} \right) = \left\{ {\begin{array}{*{20}l} {l \cdot ETX + l \cdot Efs \cdot d^{2} ,d < d_{0} } \hfill \\ {l \cdot ETX + l \cdot Emp \cdot d^{4} ,d \ge d_{0} } \hfill \\ \end{array} } \right. $$
$$ E_{R} \left( l \right) = l \cdot ERX $$
(15)

To ensure that the two kinds of algorithms are evaluated in a consistent environment, a unified spatial distance relationship model is used in the experiment. Therefore, a reference relationship between the \( ECD \) and the spatial distance \( d \) needs to be established, as shown in formula (16).

$$ ECD = \left\{ {\begin{array}{*{20}l} {L \cdot Efs \cdot d^{2} ,d < d_{0} } \hfill \\ {L \cdot Emp \cdot d^{4} ,d \ge d_{0} } \hfill \\ \end{array} } \right. $$
(16)
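The mapping of formula (16) can be sketched in Python as follows. The function name `ecd_from_distance` and all numeric values are assumptions for illustration and are not the simulation parameters of Table 1; the threshold \( d_{0} \) is passed in explicitly rather than derived.

```python
import math


def ecd_from_distance(d, L, efs, emp, d0):
    """Formula (16): reference ECD for two nodes at spatial distance d."""
    if d < d0:
        return L * efs * d ** 2  # free-space attenuation branch
    return L * emp * d ** 4      # multipath attenuation branch


# Illustrative values chosen so that both branches agree at d0 = 10.
near = ecd_from_distance(5.0, L=1.0, efs=1.0, emp=0.01, d0=10.0)   # 1 * 1 * 25
edge = ecd_from_distance(10.0, L=1.0, efs=1.0, emp=0.01, d0=10.0)  # about 100
```

Substituting this \( ECD \) into formula (3) reproduces the distance-dependent term of formula (15), since \( l \cdot ECD/L = l \cdot Efs \cdot d^{2} \) in the free-space case, so both energy models charge the same cost for the same link.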

Following the simulation parameters of reference [1], the experimental parameters are defined as shown in Table 1.

Table 1. WSN simulation parameters

Before the experiment, \( N \) nodes are generated and randomly deployed in the network area (Table 1), so that a unified set of node distribution parameters is obtained. Then, under the same simulation parameters and node distribution parameters, the LEACH, DEEC, UCDP, and proposed algorithms are simulated respectively. The unified node distribution parameters ensure the fairness of the analysis of the experimental results.

4.2 Node Lifetime Within the Network Validity

Figure 3 shows the relationship between the number of failed nodes and the network running time within the effective time of the network. The first dead node of the LEACH and DEEC algorithms appears within 100 rounds, which is significantly earlier than for the UCDP algorithm and the MRPGD algorithm proposed in this paper. Analysis of LEACH shows that although it adopts an energy equalization strategy (network clustering and cluster head rotation), it is limited by the random selection of cluster heads, which easily generates hotspots. DEEC optimizes the election of cluster heads on the basis of LEACH, which reduces the probability of hotspots: the figure shows that its first dead node appears 41 rounds later than in LEACH, and the network survival time is extended by 68 rounds. UCDP delays the first node death considerably more than LEACH and DEEC, mainly owing to its distributed clustering strategy with dynamic partition load balancing. In its clustering process, the residual energy and the “integrated distance factor” are fully considered, and the nodes in the network are divided into common nodes, routing nodes, and head nodes, which effectively exploits the advantages of multi-hop routing.

Fig. 3. The relationship between the failed nodes and the network running time in the effective time of the network

The first node death of the proposed algorithm occurs significantly later than in the other three algorithms, and its network lifetime is longer. This is because the algorithm accurately calculates the link quality. Precise link selection keeps nodes with lower energy from forwarding data for other nodes, so that each node benefits from the services of other nodes in the network to the greatest extent.

4.3 Energy Balance

The second set of experiments compares the standard deviation of the amount of data transmitted by the nodes within the effective time of the network under the various algorithms. The standard deviation is defined in formula (17).

$$ a = \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {K_{i} - u} \right)^{2} } }}{n}} $$
(17)

Here, \( n \) represents the number of data-sensing nodes within the effective time of the network, and \( K_{i} \) represents the amount of data sensed by node \( a_{i} \). The average amount of data sensed by the nodes in the network is \( u = \left( {\sum\nolimits_{i = 1}^{n} {K_{i} } } \right)/n \). By this definition, the standard deviation directly reflects the differences between the amounts of data sensed by the nodes within the effective time of the network, and can serve as an important basis for evaluating node load balancing. It can be seen from Fig. 4 that the standard deviation of the MRPGD-5 algorithm proposed in this paper is one order of magnitude lower than that of the other three algorithms, and that of MRPGD-2 is even lower. This shows that the MRPGD algorithm performs better in controlling load balancing.
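The load-balance metric can be computed as in the Python sketch below, which uses the usual population standard deviation (squared deviations from the mean, divided by \( n \)). The function name `load_std` and the sample values are illustrative assumptions and are not data from the experiments.

```python
import math


def load_std(amounts):
    """Population standard deviation of the per-node data amounts K_i."""
    n = len(amounts)
    u = sum(amounts) / n  # average amount of data per node
    return math.sqrt(sum((k - u) ** 2 for k in amounts) / n)


# Illustrative values: mean 5, squared deviations sum to 32, variance 4.
print(load_std([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

A smaller value means that the nodes carry more similar loads; identical loads give exactly zero.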

Fig. 4. Standard deviation of node data transmission in different algorithms

5 Conclusion

This paper proposes a novel \( ECD \) relationship model, which effectively addresses two problems of the spatial distance relationship between nodes: non-standard signal attenuation and the inability of nodes to locate themselves under complex conditions. It discusses the classic algorithms for solving load imbalance in WSNs as well as recent research results, and proposes a multi-hop routing algorithm based on the path goodness degree. The core idea of the algorithm is to use the path lifetime and the path data redundancy as the criteria for selecting the best data transmission path for each source node. Although the proposed algorithm shows better performance in simulation experiments, the diversity and complexity of network structures increase as networks are applied more widely. To better adapt to new forms of sensor networks, such as heterogeneous networks, multi-Sink networks, and event-driven networks, the next step is to adapt the algorithm to the needs of different network environments, making it more suitable for practical situations.