Keywords

1 Introduction

With the rapid development of social network and mobile Internet technology, the collection, storage, analysis and release of all kinds of data become convenient and fast. Organizations such as health care, insurance companies, e-commerce companies, social networking sites, and telecom operators publish industry data for research, and a variety of big data analytics companies, big data analytics competitions, and more. In the era of big data, businesses are hard at work mining data to analyze usage patterns and improve product features or user experience [3].

With the value mining of big data, the information related to individuals and enterprises will increase significantly. Data transmission and data privacy protection in big data and data mining have also aroused widespread concern. In big data environments, data is collected by sensors, which collect data to make more informed decisions that help organizations develop in an internet environment. Wireless technology enables sensors to be connected to each other to form a “Large network” [9]. Sensors with sensing data, communication and processing capabilities are connected to each other, which is a typical wireless sensor network. Wireless sensor networks are used in many fields, such as agriculture to monitor environmental temperature and humidity, traffic network, traffic flow and so on. They can also be deployed in many places where people can not reach, such as deserts, so there’s a military component [6].

However, wireless sensor networks also have some limitations [10], mainly facing the following two challenges:

  1. (1)

    Sensor equipment has the characteristics of micro-size and therefore also has the nature of limited resources. Energy consumption is a very outstanding problem in WSN communication.

  2. (2)

    In some cases in a real-time data environment, external adversaries can directly access real-time data from sensor nodes.

In recent years, many people have made some research contributions on how to save energy consumption of sensor nodes. In 2019, Arpita Mallick et al. proposed a multi-hop routing protocol that transfers data from body area sensor networks (BAN) to cellular devices or display coordinators. In the hospital scenario, there can be multiple display coordinators near one of Ban’s coordinators. In addition, in a hospital, multiple patients may have wearable or implantable sensors and form multiple BAN configurations nearby [4]. In 2020, Muhammad Ilyas et al. proposed a three-tier cluster-based routing protocol for wireless sensor networks. This protocol utilizes three-layer clustering mechanism, which is a center-based clustering protocol [1]. Researchers have also done some technical research on the issue of data privacy breach. In 2020, Naswan S et al. proposed an anonymous access authentication scheme for wireless sensor networks in large data environments, using a set of lightweight symmetric encryption and hash function to prevent existing known attacks, perfect forward secrecy provides two way authentication and complete anonymity [7]. 2021, Inam ul haqa et al. have proposed an efficient multi server architecture based on Hasche’s authenticated key protocol scheme, which uses a simple hash operation to achieve key sharing and identity authentication [2].

All of the above studies give rise to the enlightenment of our scheme. In order to solve the problem of unbalanced energy consumption of sensor nodes, some nodes have to bear the problem of high energy consumption resources, and the problem of privacy leakage in data transmission between sensor nodes, we propose a secure converged routing protocol with authentication and energy-saving functions. The main contributions of this article are as follows.

  1. (1)

    A new calculation method of cost function is proposed. The cost function proposed in this paper can be used to effectively classify the cost consumption level of a sink node when transmitting data by selecting a node from all sensor nodes as a cluster “Leader” node, a reasonable path is planned for the realization of energy-saving converging routing protocol based on cost function.

  2. (2)

    A new key agreement scheme with authentication function is proposed. The proposed key agreement protocol is used for efficient key distribution in communication between two parties, which realizes data privacy of communication between nodes and ensures data security.

  3. (3)

    The performance of the proposed protocol and other existing schemes is analyzed. We compare the performance of the proposed scheme with the existing schemes, and clearly verify the efficiency of our proposed protocol in terms of total energy consumption.

This article is organized as follows. In Sect. 2, we introduce the basics used in this article. In Sect. 3, a new multi-layer energy-saving convergent routing protocol with authentication key agreement is proposed. In Sect. 4 provides a performance comparison of the proposed scheme with existing related schemes. In Sect. 5 summarizes the full text.

2 Preliminary

2.1 WSN

The network environment itself has the openness, if can not carry on the effective supervision to the network environment, the entire Internet system will certainly be disorderly [5]. Based on this, it is necessary to implement constraint management for the load network environment. Wireless Sensor Networks (WSN) is a kind of distributed Sensor Networks, the end of which is the Sensor that can sense and examine the outside world. Sensors in WSN communicate wirelessly, so network settings are flexible, device locations can be changed at any time, and wired or wireless connections can be made with the Internet. A multi-hop self-organizing network formed by wireless communication. The wireless sensor network (WSN) not only connects the wireless network with the wired network, but also senses and checks the state of the external sensors through flexible network settings, and then makes device changes, it’s also a great way to avoid the dangers of the internet. Wireless network sensors can also collect and transfer data in the internet environment, not only reduce the cost of network transmission, but also simplify the form of network deployment. In the process of data transmission, the dynamic collection and detection of network resource information can be realized through the operation of internal sensor nodes, and the online monitoring of information transmission can be realized [11].

2.2 Cluster Structure in WSN

In wireless sensor networks, cluster-based routing protocol is called hierarchical routing protocol. The idea of hierarchy-based or cluster-based routing protocol is to divide the network into many different clusters based on certain attributes, including cluster head (CLH) and cluster member node (CM), cluster heads can communicate directly with Sink nodes or form a higher-level network, in which each cluster head node is regarded as a normal node and then clustered until the last node is left in the network. The cluster head node is responsible for the management of the cluster members and acts as the leader or coordinator. The collection and processing of the data of the cluster members are also carried out at the cluster head node. Meanwhile, the cluster head is responsible for the data transmission between clusters, data is taken from a cluster member node (CM), aggregated and forwarded to a base station (BS) or Sink node. The choice of CLH is based on important parameters such as residual energy and the distance to the sink node (SN). The CLH role rotates according to the level of the node. The level of a node is determined by the increase or decrease of important parameters. Now the most popular algorithm is to improve the energy balance of the network, the energy consumption of the network is evenly distributed to all nodes, the network is divided into periodic rounds. The advantage of hierarchical routing protocol is that it is scalable, can adapt to the dynamic changes of the network, and is suitable for large-scale networks [8]. Cluster-based WSN is divided into two-tier and three-tier hierarchies as shown in Fig. 1.

Fig. 1.
figure 1

Cluster-based WSN two-level structure and three-level structure.

3 Proposed Protocol

In order to solve the problems of high power consumption, network lifetime, low network throughput and privacy leakage in secret data transmission, we propose an energy-saving routing protocol with authentication and agreement key. This paper is organized as follows, in Sect. 3.1 gives the network model of the scheme, our scheme is implemented on the ordinary Internet of Things Network, in Sect. 3.2 we carry out the network deployment of the sensor nodes, and collected deployment information for the sink node. In Sect. 3.3, we first design a cost function to calculate a cost for each sensor node. The sensor node is chosen for its high energy and good link quality. The node near the sensor node becomes CLH (Cluster Head), CLG (Cluster Gateway) node. In Sect. 3.4, after selecting CLH node and CLG node, we use K-means clustering algorithm to divide sensor nodes near CLH node into the same cluster, and these cluster member nodes send information to CLH node, CLH then sends the aggregated information to the CLG node, and finally the CLG node sends the information to the Sink node to construct a three-layer cluster structure in Sect. 3.5. In Sect. 3.6, we design a communication scheme based on authenticated key agreement for two-node secret data transmission. Finally, a secure aggregation routing protocol with authentication and energy-saving functions is implemented.

3.1 Network Model

The sensor nodes monitor the physical and chemical reaction from various intelligent network environments, and then transmit these changes to the sink node (SN) for decision making and processing. The network model scenario studied in this paper is an ordinary Internet of things model, as shown in Fig. 2. The function of the realization is the process of the sensor node passing the collected information to the sink node.

Fig. 2.
figure 2

General structure model of wireless sensor networks.

3.2 Network Deployment and Initializing

At this stage, in order to monitor the various data, we begin with sensor deployment, where we deploy micro-sensor equipment over a two-dimensional target area, as shown in Fig. 2. These sensors have different properties and have different capacities, sizes and capabilities. All sensor nodes with heterogeneous properties are deployed randomly. They are stationary and do not move when deployed.

After deployment, each node determines its position according to its latitude and longitude, its energy status and its link status.

After the deployment and preparation work is completed, the pre-processing stage is entered, and the sensor node performs message broadcasting and message reply.

After the sensor nodes are deployed, the Sink node (SN) communicates with all the deployed nodes via a broadcast initiated greeting packet (\(Init\_Hell{o_{Pkt}}\)). \(Init\_Hell{o_{Pkt}}\) contains information about the SN node, such as \(SN_{id}\) and its own location.

After receiving \(Init\_Hell{o_{Pkt}}\) from SN, each node calculates the distance to SN node dtance and packages its remaining energy R.E, link quality LQ into the \(Reply_{pkt}\) for response. \(Reply_{pkt}\) contains important information about all deployed nodes, such as \(N_{id}\), residual energy R.E, link mass LQ, distance to SN node \(d_{tance}\).

After receiving the \(Reply_{pkt}\) from the node, SN responds with the confirmed packet (\(Ack_{Pkt}\)), which ensures that the message has been successfully received. \(Ack_{Pkt}\) contains the information of SN node, such as \(SN_{id}\) and its own location.

Therefore, at this stage, SN obtains important information of the deployed node (\(N_{id}\), residual energy R.E, link quality LQ, distance to SN node \(d_{tance}\)).

3.3 Cost Function

After SN obtains the information, the cost function (C.F) for each deployed node is calculated based on the information obtained from the deployed node. Based on the results of C.F, the decisions of CLH, CLG and CN are made. Sink nodes identify CLH, CLG nodes, broadcast to all nodes, send a notification packet, notify all nodes which nodes are selected, and public at the Sink node. In addition, SN deploys a CN (Check-up Node) to monitor CLH and CLG. If the number of forwards for CLH and CLG does not exceed a fixed number of packets, CN broadcasts messages in the cluster to stop responding to them and blacklists the relevant CLH or CLG. CN node should also monitor the energy consumption of CLH and CLG nodes in real time. If these two nodes run out of energy, SN node will be notified to update CLH and CLG nodes dynamically to ensure the effective implementation of the protocol. CN is selected from CLH by Sink nodes based on the cost function C.F chooses CN according to the game theory, compares two games in the selected CLH node, chooses the low cost as the CN node.

In addition, due to CLH, CLG nodes need to consume a lot of energy, in order to extend the service time of the nodes, we provide an energy collection module for CLH, CLG nodes, that allows them to harvest energy from their surroundings or from radio signals around them.

The cost function (C.F) is calculated in three parts, including the distance from each node to Sink node, the residual or total energy of each node, and the quality of each node in the overall network node (link quality).

3.4 Cluster Construction Based on Cost Function

According to the above four sections, Sink node calculates the cost function of all nodes based on the results of the preprocessing section. According to the cost function, CLH and CLG nodes are selected. Next we build the cluster. The Clustering Method is used to cluster all the data elements, and then the three-layer cluster routing protocol is formed, which can reduce the data transmission cost of large-scale WSN.

In order to classify the network into hierarchical clusters, we adopt the idea of K-means clustering algorithm, and take the distance as the standard of similarity measurement between data objects, that is, the smaller the distance between data objects, the higher the similarity between them, the more likely they are in the same cluster.

Fig. 3.
figure 3

Three-tier cluster structures.

3.5 Energy-Saving Aggregation Routing Protocol Based on Cost Function

Finally, our system model is constructed as shown in Fig. 3.

In this paper, we construct a three-layer cluster routing protocol. The sensor node wants to transmit data to the Sink node. The sensing data is forwarded by CM (Cluster Member) to CLH, which forwards it to CLG, and then the aggregated data is further forwarded to SN.

3.6 Anonymous Communication Scheme Based on Energy-Saving Routing Protocol

After clustering, the idea of on-demand routing discovery can be used when CM nodes want to send secret information to CLH nodes. When the CM node needs to send packets to the CLH node, it sends a routing request message RREQ to all its neighbors. When an intermediate node receives it, the intermediate node adds its address to the packet, sends it to its neighbor, and so on. When a request reaches the destination CLH node, it generates the route reply message RREP. The reply is unicast along the reverse path already built by the intermediate node to the CM source node. The CM node thus gets a multi-hop link to the CLH node, see Fig. 4.

Fig. 4.
figure 4

The communication process between the CM node and the CLH node.

But when the two communicate, because the Internet is a public channel, malicious users can intercept ongoing communications and wreak havoc. In order to reduce such activities and ensure the security of the online communication, we design an identity authentication key protocol scheme to protect the privacy and security of the node communication. With this mechanism, two communication entities validate each other and then build a session key to protect future communication between them.

Registration of CLH. During this process, \(CLH_{j}\) node is registered in RN node. \(CLH_{j}\) sends a registration request and RN feeds back to \(CLH_{j}\) two keys, h(Q) and \(h(HI{D_j}||Q)\).

Figure 5 shows the detailed process of the CLH node registration phase.

Fig. 5.
figure 5

Registration of CLH.

Registration of CM. This process is the \(C{M_i}\) node registration phase, in the Secure Channel, \(C{M_i}\) provides RN with its own identity \(I{D_i}\) and password \(P{W_i}\). After receiving the command, RN will merge the two parts into a smart chip, and hide the \(h(HI{D_j}||Q)\) key generated by \(CLH_{j}\) node with \(C{M_i}\) identity information to generate \({A_{ij}}\), which will be stored in the chip and fed back to the \(C{M_i}\) node.

Figure 6 shows the detailed process of the CM node registration phase.

Fig. 6.
figure 6

Registration of CM.

Validation of CM. The process is to verify the validity of \(C{M_i}\) node with password for smart chip. Before the \(C{M_i}\) request begins communication, the smart chip first validates the \(C{M_i}\) node by providing its own \( < ID_i^*,PW_i^* > \). The smart chip computes \(X_i^* = {Y_i} \oplus ID_i^* \oplus PW_i^*\) and verifies \(X_i^*\mathop = \limits ^? {X_i}\). If the condition is met, the \(C{M_i}\) node password and identity information is consistent with the registered node in RN, the node is valid and verified. If the condition is not met, the smart card rejects the \(C{M_i}\) node.

Key Agreement with Authentication. If the smart chip is verified, the smart chip provides the \(C{M_i}\) node with the \({A_{ij}}\), which contains the \(h(HI{D_j}||Q)\) key. The \(C{M_i}\) finds the \(CLH_{j}\) it wants to communicate with and encrypts a random number \({\alpha _i}\). This ensures that only \(CLH_{j}\) can decipher the random number. At the same time, the identity information is sent to \(CLH_{j}\) together with the verification message \(V_{1}\) generated by the random number. After receiving the message, \(CLH_{j}\) decrypts the random number \({\alpha _i}\) with its own key and verifies the correctness of the message \(V_{1}\). If the verification is successful, a random number \({\alpha _j}\) is generated for the later session key SK calculation. Similarly, a validation message, \(V_{2}\), will be passed to \(C{M_i}\). \(C{M_i}\) carries out the identification of \(CLH_{j}\) and the calculation of session key SK, and finally the confirmation of SK.

Figure 7 shows the detailed process of authentication and key agreement between CM and CLH nodes.

Fig. 7.
figure 7

Key agreement with authentication function.

4 Security Analysis

We assume that we have compromised a pair of cluster head node \(CLH_j^C\) and cluster member node \(CM_i^C\). Using these leaked confidential information nodes, the enemy can not impersonate any other node, nor can it compute the session key of any other participant, as follows: Adversary A can get \({\{ \mathrm{{ }}h\left( Q \right) ,h(HI{D_j}||Q)\} ^c}\) from cluster head, and adversary A can get \(HID_j\) from cluster member.

Can not impersonate any legitimate CLH node: In all nodes, h(Q) is known, but \(h(HID_j||Q)\) is different. To know \(h(HID_j||Q)\), A either knows Q from RS (which is impossible), or it can reverse-deduce it from \(A_{ij}^c = h(h(HI{D_j}||Q)||I{D_i}^c)\) (which is also impossible).

Can not impersonate any legitimate CM node: To impersonate a member of the legal cluster, adversary A needs to calculate the \(A_{ij}\). Suppose the adversary obtains \( < \mathrm{{ }}CI{D_i},{W_i},{V_1} > \) from the landing phase, relevant to the calculation of \(A_{ij}\), of these four parameters only \(W_i\), \({W_i} = {A_{ij}} \oplus {\alpha _i}\). Because of the unavailability of \({\alpha _i}\), \(A_{ij}\) is not computable.

Can not calculate any session key: The session key is calculated as \(SK = h(I{D_i}||HI{D_j}||{\alpha _i}||{\alpha _j})\). \(A_{ij}\) is not obtainable, so \({\alpha _i}\) and \({\alpha _j}\) are not obtainable.

5 Performance Analysis

Fig. 8.
figure 8

Total energy consumed.

In this section, we analyze and compare this scheme with the other two schemes. We set all the member nodes at the bottom to send packets to the SN node, the completion of the transmission is considered as the end of a round of data transmission. We calculate the total energy consumption of each node from the total energy consumption, according to the established routing protocol, calculate the total energy consumption of all member nodes to SN node.

Figure 8 shows the energy consumption of a round trip from all the member nodes we calculated to the sink node. As can be seen from Fig. 8, the difference in total energy consumption between protocols increases as the number of packets to be transmitted increases after the information is transmitted. When the packet reaches 1000 bits, as can be seen from the graph, the total energy of RPL is 200 times that of our proposed protocol, and the total energy consumption of RPL is 66 times that of our proposed protocol. To sum up, in terms of total energy consumption, our proposed protocol is lower than other protocols.

6 Conclusion

With the advent of the big data era, it is conceivable that the collection and analysis of data is very important. As a data collection device, wireless sensor network has greatly promoted the development of big data and data mining technology. This paper focuses on a secure convergent routing protocol with authentication and energy-saving functions. This paper proposes a solution to the problems of excessive energy consumption, unbalanced resource consumption and data privacy leakage in wireless sensor networks. The designed cost function selects a suitable “Leader” node of the data transmission cluster by calculating the communication loss, helps to plan the data transmission path, and solves the problem that some nodes consume too much energy due to the uneven resource consumption. The key agreement protocol is designed by using one-way Hasche function and XOR operation to hide and restore the parameters, using these parameters to calculate the shared key. The privacy protection of the secret data is realized, and the problem that the secret data is easy to be divulged when two nodes communicate is solved. The key agreement protocol with authentication function is designed. We authenticate first and then negotiate to prevent the third party from maliciously impersonating. Through the implementation of the above three parts, we construct a secure converging routing protocol with authentication and energy-saving functions. At last, through the performance analysis, the total energy consumption of the proposed protocol is compared with that of the two related protocols. Our work extends the network lifetime to some extent, increases the number of active nodes, reduces energy consumption, and increases the privacy of secret data transmission to some extent.