1 Introduction

A wireless sensor network (WSN) is an information acquisition and processing technology with broad applications [1, 2], and it is a major technology driving the development of precision agriculture. WSNs increase the efficiency of sustainable development. Gains in agricultural efficiency will stem from networked sensors that reveal important spatiotemporal patterns and integrate their data streams, not only to display or record information but also to trigger human and autonomous responses. This involves monitoring soil, crop and climate conditions in a field, generalizing the results and providing a decision support system (DSS) for actions such as real-time variation of fertilizer or pesticide application.

A WSN must rely not only on intrusion prevention technology but also on an intrusion detection system (IDS). Although there are many research results on intrusion detection, many technical problems remain unsolved because of the particular characteristics of WSNs.

Karlof and Wagner pointed out that many current WSN routing protocols were not designed with security in mind [3] and that security goals need to be stated for each routing protocol. They demonstrated how attacks against ad hoc and peer-to-peer networks can be adapted to WSNs, and put forward countermeasures against routing attacks such as Sybil [4, 5], wormhole, selective forwarding and spoofed routing information. They also introduced two new attacks against sensor networks, namely the Sinkhole attack and the HELLO flood attack, and provided a detailed analysis of the threat that Sinkhole attacks [6] pose to sensor networks.

Ngai et al. proposed a method for detecting Sinkhole attacks [7] in which the base station judges the consistency of the data in a region by detecting abnormal data. To locate the malicious node, the base station sends request messages to the nodes and then uses the replies to generate a network topology. However, regional data fluctuate with the environment, so the base station often cannot reliably tell data tampering or selective forwarding apart from normal variation, and this method produces false positives.

Krontiris et al. first described the conditions for an IDS in WSNs [8], including the requirement that normal nodes outnumber malicious nodes. They also proposed the basic architecture of a distributed IDS and methods for detecting black hole attacks and selective forwarding attacks. Building on that research, a technique for detecting Sinkhole attacks was proposed.

Shafiei et al. proposed two methods to detect Sinkhole attacks [9]. The first uses a geographical statistical sampling method and an energy consumption model to estimate the possibility of a Sinkhole attack in each area of the network, after which distributed monitoring is set up. The second uses mitigation strategies to keep traffic from flowing through regions hijacked by Sinkhole attacks. Both methods were verified with Castalia.

Rajasegarar proposed an anomaly detection technique based on a distributed, hyper-sphere-based clustering algorithm and k-nearest neighbors (KNN) [10]. Each node clusters its collected information locally and identifies similar clusters, and the one-hop parent node performs the clustering task; in this way abnormal data sets are found and the energy consumed by node communication is reduced. The technique can also be used to detect Sinkhole attacks. However, according to the analysis in [11], when the node launching a Sinkhole attack hijacks the data flow, the probability of finding the attack in the network is very small.

In this paper, an adaptive network anomaly detection model, CUSUM_MV, is constructed. It is composed of two parts: an anomaly detection engine based on neighbor node monitoring and a centralized Sinkhole recognition engine. The performance of the intrusion detection algorithm is then analyzed. A further model, CUSUM_HDST, is also proposed; it reduces the extra communication cost that intrusion detection imposes on the network.

2 CUSUM_MV model based on cumulative summation

Assume that the network contains sensor nodes and a base station. Sensor nodes and the base station can communicate through a variety of appropriate protocols, and each sensor node passes its messages to the base station over multiple hops. By monitoring the behavior of its neighbor nodes, each node contributes to monitoring the entire network in real time and to detecting network anomalies. In each time window, each sensor node constructs a feature vector that records the neighbor node behaviors and the related network conditions observed in that window. This feature vector is composed of a fixed number of attributes.

Due to the instability of the wireless channel in sensor networks, the statistics contain noise caused by factors such as signal collisions and the environment. The statistics therefore need to be smoothed, as in formula (2.1):

$$ X_{t}^{i} = (1 - \mu_{i}) X_{t - 1}^{i} + \mu_{i} X_{t^{\prime}}^{i} ,\quad 0 \le \mu_{i} \le 1 $$
(2.1)

where \( X_{t^{\prime}}^{i} \) is the value observed by node i in the period t′ and \( X_{t}^{i} \) is the smoothed value. The value of the memory factor μi varies with the network conditions.
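The smoothing step can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the choice to seed the filter with the first observation and the sample RSSI values are hypothetical and are not taken from the paper.

```python
def smooth(previous, observed, mu):
    """Formula (2.1): blend the previous smoothed value with the new observation."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("memory factor mu must lie in [0, 1]")
    return (1.0 - mu) * previous + mu * observed


def smooth_series(observations, mu):
    """Apply formula (2.1) to a sequence; the first observation seeds the filter (assumption)."""
    current = observations[0]
    smoothed = []
    for x in observations:
        current = smooth(current, x, mu)
        smoothed.append(current)
    return smoothed


# Hypothetical RSSI readings (dBm) with a single noisy spike in the third window.
rssi = [-70.0, -70.0, -50.0, -70.0]
print(smooth_series(rssi, mu=0.25))  # [-70.0, -70.0, -65.0, -66.25]
```

A small memory factor damps the spike strongly, while a value close to 1 makes the smoothed statistic follow the raw observations.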

2.1 Anomaly detection of parent node based on CUSUM GLR

The cumulative sum (CUSUM) algorithm has been widely used for network anomaly detection [12]. Sensor network data form a stream, and network behavior is dynamic and stochastic, so the generalized likelihood ratio (GLR) and CUSUM are introduced to meet the needs of real-time monitoring of sensor networks. Under normal circumstances, the network behavior of adjacent nodes is basically similar. This paper selects the signal intensity RSSIij, the convergence degree Cij and the link quality LQij to the base station to monitor the network behavior of the parent node. After the network enters the stable period, each node saves the RSSIij, Cij and LQij of its parent node in the intrusion detection module. Their expected values are:

$$ E(\mathop {RSSI}\nolimits_{N(i)} ) = \varpi ,E(\mathop C\nolimits_{{\mathop {ii}\nolimits_{p} }} ) = \rho ,E(\mathop \Delta \nolimits_{lq} ) = \vartheta ,\mathop E\nolimits_{X} = \mathop {(\varpi ,\rho ,\vartheta )}\nolimits^{\text{T}} $$

The mean is estimated over a sliding time window: the statistics of the long time window serve as the baseline against which anomalies in the short time window are detected. If there is no anomaly in the short time window, the two parameters of CUSUM GLR (i.e., mean and variance) are re-estimated by moving the window forward, and the forward step is far shorter than the short time window. The parameters of the CUSUM GLR anomaly detection model can therefore reflect changes in the current network characteristics, as shown in Fig. 1.

Fig. 1 Statistical time window forwarding

The standard deviation is computed with the Bessel-corrected formula as follows.

$$ \delta = \sqrt {\frac{1}{l - 1}\sum\limits_{i = 1}^{l} {\mathop {(\mathop X\nolimits_{i} - E(X))}\nolimits^{2} } } $$

In order to reduce the computational cost on wireless sensor nodes, an unbiased range-based estimate of δ is introduced. First, for the long time window l, the maximum value Xmax and minimum value Xmin of the statistic are selected, which gives the range \( R = X_{\max} - X_{\min} \). The data in time window l are divided into three groups of n values each, and the average of the group ranges is \( \bar{R} = (R_{1} + R_{2} + R_{3})/3 \). The overall standard deviation is then estimated using the principle of probability statistics, as shown in formula (2.2).

$$ S = \frac{{\overline{R} }}{\sqrt n } $$
(2.2)
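A minimal Python sketch of the range-based estimate in formula (2.2) is given below. The function name, the handling of leftover samples and the example window are assumptions made for illustration only.

```python
import math


def range_based_std(window, groups=3):
    """Estimate the standard deviation as in formula (2.2): S = mean group range / sqrt(n)."""
    n = len(window) // groups  # samples per group; leftover samples are ignored (assumption)
    if n < 2:
        raise ValueError("each group needs at least two samples")
    ranges = []
    for g in range(groups):
        chunk = window[g * n:(g + 1) * n]
        ranges.append(max(chunk) - min(chunk))  # range R_g of one group
    mean_range = sum(ranges) / groups
    return mean_range / math.sqrt(n)


# Hypothetical long window of 12 samples, i.e. three groups of n = 4.
window = [1, 2, 3, 5, 2, 2, 4, 6, 0, 1, 3, 4]
print(range_based_std(window))  # 2.0
```

Compared with the Bessel formula, this estimate needs only a maximum, a minimum and one division per group, which is the saving the text refers to.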

In the process of anomaly detection, the statistics can be expressed as a vector \( X = (RSSI,C,LQ) \), which is assumed to follow a normal distribution.

In order to detect anomalies dynamically, the mean is estimated over a long time window, as shown in formula (2.3).

$$ \mathop E\nolimits_{X}^{'} = \frac{1}{l}\sum\limits_{i = k - l + 1}^{k} {\mathop X\nolimits_{i} } $$
(2.3)

If no anomaly is detected, the current statistical value is used to update the estimate; this is the adaptive mechanism of the anomaly detection. The update is shown in formula (2.4).

$$ E(X_{l}^{i}) = \frac{1}{l}\sum\limits_{j = 1}^{l} X_{j}^{i} = \frac{1}{l}\left[ (l - 1)E(X_{l - 1}^{i}) + X_{l}^{i} \right] $$
(2.4)
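The recursive form on the right-hand side of formula (2.4) lets a node update the mean without storing the whole window. The following Python sketch shows this under the assumption that the samples are indexed from 1; the function name and sample values are hypothetical.

```python
def update_mean(prev_mean, new_value, count):
    """Formula (2.4): mean of `count` samples from the mean of the first count - 1 samples."""
    return ((count - 1) * prev_mean + new_value) / count


# Hypothetical statistic values observed in successive normal windows.
samples = [4.0, 6.0, 8.0, 2.0]
mean = 0.0
for k, x in enumerate(samples, start=1):
    mean = update_mean(mean, x, k)
print(mean)  # 5.0
```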

The mean vector EX describes the random statistic before the anomaly occurs. After the anomaly occurs, the mean vector of the random statistic is \( \mathop E\nolimits_{X}^{'} \), and the log-likelihood ratio is:

$$ \mathop S\nolimits_{t} = \ln \frac{{\mathop p\nolimits_{{\mathop E\nolimits_{X}^{'} }} (\mathop X\nolimits_{t} )}}{{\mathop p\nolimits_{{\mathop E\nolimits_{X} }} (\mathop X\nolimits_{t} )}} $$
(2.5)

Before the anomaly occurs, the value of St tends to be negative; after the anomaly occurs, it tends to be positive, so the cumulative sum Sn keeps growing. When Sn exceeds a given threshold γ, an alarm is raised. The decision rule is given in formula (2.6).

$$ d = \left\{ {\begin{array}{*{20}l} {H_{0} ,\quad {\text{if}}\;S_{n} < \gamma } \hfill \\ {H_{1} ,\quad {\text{if}}\;S_{n} \ge \gamma } \hfill \\ \end{array} } \right. $$
(2.6)

Evaluating formula (2.5) directly is relatively inefficient. The standard deviation of the statistic X is taken to be basically the same before and after the change and is denoted δ. To reduce the amount of calculation in the detection process, we can check in advance whether the standard deviation has become too large. Furthermore, assuming X obeys the normal distribution \( N(E,\delta^{2}) \), its density \( p_{\theta }(X) \) is shown in formula (2.7).

$$ p_{\theta}(X) = \frac{1}{\delta \sqrt{2\pi}} e^{ - \frac{(X - E)^{2}}{2\delta^{2}}} $$
(2.7)

In the formula, when θ = θ0, E is EX; when θ = θ1, E is \( E_{X}^{\prime} \). Substituting formula (2.7) into formula (2.5) and writing E′ for \( E_{X}^{\prime} \) gives formula (2.8).

$$ S_{t} = \frac{(E^{\prime} - E) \cdot (2X_{t} - E^{\prime} - E)}{2\delta^{2}} $$
(2.8)

Summing over the samples gives formula (2.9).

$$ \begin{aligned} S_{n} & = \sum\limits_{t = 0}^{n} S_{t} \\ & = \frac{1}{2\delta^{2}}\sum\limits_{t = 0}^{n} (E^{\prime} - E) \cdot (2X_{t} - E^{\prime} - E) \\ \end{aligned} $$
(2.9)
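The detection loop built from formulas (2.5), (2.6) and (2.9) can be sketched in Python as follows. For two normal densities with the same δ, the log ratio of formula (2.5) works out to (E′ − E)(2X − E′ − E)/(2δ²), which is what `llr` computes. The function names, the sample data and, in particular, the clamping of the sum at zero (Page's form of CUSUM) are assumptions of this sketch and are not stated in the text.

```python
def llr(x, mean_before, mean_after, sigma):
    """Gaussian log-likelihood ratio of one sample, same standard deviation under both hypotheses."""
    return (mean_after - mean_before) * (2.0 * x - mean_after - mean_before) / (2.0 * sigma ** 2)


def cusum_detect(samples, mean_before, mean_after, sigma, threshold):
    """Return the index of the first sample at which the cumulative sum reaches the threshold."""
    total = 0.0
    for t, x in enumerate(samples):
        # Assumption: the sum is clamped at zero so that a long normal period
        # does not build up a large negative value that would delay the alarm.
        total = max(0.0, total + llr(x, mean_before, mean_after, sigma))
        if total >= threshold:  # decision H1 in formula (2.6)
            return t
    return None  # decision H0: no anomaly found in this series


# Hypothetical series: the mean shifts from 0 to 2 at index 3.
series = [0.0, 0.0, 0.0, 2.0, 2.0, 2.0]
print(cusum_detect(series, mean_before=0.0, mean_after=2.0, sigma=1.0, threshold=4.0))  # 4
```

With these numbers each normal sample contributes −2 and each shifted sample contributes +2, so the alarm is raised on the second shifted sample.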

Let the vector \( \gamma = \{ \gamma_{d} ,\gamma_{lq} ,\gamma_{c} \} \) represent the anomaly detection alarm thresholds of the individual statistics. The larger a threshold is, the higher the false-negative rate of the anomaly detection system, and the longer it takes to find an anomaly and raise the alarm. The smaller a threshold is, the higher the false alarm rate and the greater the burden on the sensor nodes. The memory factor, the long time window l, the short time window S and the alarm thresholds γ can be determined by analysis in a laboratory environment, and the thresholds can be selected through extensive training before the nodes are deployed for monitoring.

2.2 Anomaly information transfer

When a node finds an anomaly, it piggybacks the anomaly information on the packets it sends to the base station. If it has no packets to send for some time, it generates a dedicated anomaly intrusion frame, called an IF packet, as shown in Fig. 2. In the anomaly region, the transmission path uses the historical parent node as the next hop, which avoids the flooding mentioned by Ngai et al. [7].

Fig. 2 Modified CTP routing packet

In this way, the anomaly information can still break out of the anomaly region after the anomaly occurs, and the modified data packets can then follow the normal direction over multiple hops to the base station. The modified CTP data frame is shown in Fig. 3. Bits 2–7 of the header, 6 bits in total, carry the D, LQ, C and RD fields, where a flag value of 1 represents an anomaly and 0 indicates no anomaly. The RD field occupies 2 bits and labels the reverse degree of the current node, so the maximum value it can express is 3.

Fig. 3 Modified CTP frame

At the end of the CTP data frame, a 16-bit field called SuspectedID identifies the node suspected by intrusion detection. The 8-bit LinkEtx field stores the single-hop link quality estimate. If the node detects a reverse link, it uses the RD field to send the data frame along the reverse path, piggybacked toward the parent node. The transfer path is shown in Fig. 4.

Fig. 4 IF routing diagram (dotted lines indicate the anomaly information transfer route, the arc-shaped area indicates the anomaly region, and the black circle represents an anomalous node)

2.3 Sinkhole attacks detection algorithm

When the base station has collected enough CTP data frames, the intrusion detection system extracts the D, LQ, C, RD and SuspectedID fields from them. These are used to construct the network behavior graph and the routing pattern of a given area. If the nodes' anomaly detection works well, the constructed graph can be regarded as a tree rooted at the Sinkhole, and a Sinkhole attack is thereby identified. In Fig. 5, node M has launched a Sinkhole attack, and the base station assembles the collected anomaly information into a tree, with node E and node D reporting on the malicious nodes. Nodes A, B and C point to node M, and nodes D, E and M point to node B. Building on an analysis of the method of Ngai et al. [7], the intrusion detection process is simplified by using the link quality to the sink node instead of the hop count. The resulting Sinkhole detection algorithm is based on majority voting, abbreviated as MV.

Fig. 5 Topology map obtained from base station

The principle for detecting Sinkhole attacks is as follows. Because a Sinkhole attack node aggregates traffic, a malicious node can be found inside the suspicious region. Because anomaly detection can generate false alerts, a verification mechanism based on information from the network is necessary. If node a is the parent of node b, the link quality from b to the base station, multi_ETXb, equals the sum of the link quality from a to the base station, multi_ETXa, and the link quality from b to a, link_ETX [13], as shown in formula (2.10).

$$ multi\_\mathop {ETX}\nolimits_{a} +\, link\_\mathop {ETX}\nolimits_{(a,b)} = multi\_\mathop {ETX}\nolimits_{b} $$
(2.10)

Similarly, identifying an attack node is transformed into searching for the root node based on link quality. If the testimony against a root node in the anomaly region exceeds the detection index (the configured detection intensity), that node is considered to be the source of the Sinkhole attack. When malicious nodes testify against normal nodes, they have to forge single-hop link quality information. If a report is found not to conform to formula (2.10) during detection, the testimony is considered illegitimate and is ignored automatically. Taking Fig. 5 as an example, the detection algorithm iterates through the suspicious nodes (M and B in Fig. 5). If the proportion of nodes testifying against a suspicious node exceeds ρ, an attack has occurred.

Definition 2.1

The testimony against node a is legitimate if the single-hop link_ETX and multi-hop multi_ETX values carried in the anomaly information frames sent by all of its child nodes satisfy formula (2.10).

Definition 2.2

Suppose mal1 is a suspicious node. Then \( mal_{1}^{n} \) is the testimony count of mal1, namely the total number of nodes that legitimately testify against mal1, counted over all of its child nodes, where mal1 is one of possibly multiple root nodes in the suspicious region.

The instability of the wireless channel [14] can distort the anomaly link map of the network, so links with the lowest RSSI may appear while searching for an attack node. To extract a useful link graph from the disrupted anomaly map, links with abnormal RSSI are removed from the graph, except in the dense region. Let m be the total number of suspicious nodes and N the number of all nodes in the suspicious region that sent reports. If the following condition holds:

$$ \mathop{\max}\limits_{j \in m} (mal_{j}^{n} /N) > \rho $$
(2.11)

then an attack is considered to have occurred. To limit the false alarm rate, ρ is set to a value greater than 0.5.
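The voting step can be sketched in Python as follows: each report is first checked against formula (2.10), and the suspect with the most legitimate reports is accepted as the attacker only if its share of the region exceeds ρ, as in formula (2.11). The report fields, the tolerance parameter, the node names and the simplification of counting only direct reports are assumptions of this sketch.

```python
from collections import Counter


def is_legitimate(parent_multi_etx, link_etx, child_multi_etx, tolerance=0):
    """Formula (2.10): parent multi-hop ETX plus link ETX must equal the child's multi-hop ETX."""
    return abs(parent_multi_etx + link_etx - child_multi_etx) <= tolerance


def majority_vote(reports, region_size, rho=0.6):
    """Formula (2.11): return the suspect whose share of legitimate reports exceeds rho, else None."""
    votes = Counter()
    for r in reports:
        if is_legitimate(r["parent_multi_etx"], r["link_etx"], r["child_multi_etx"]):
            votes[r["suspect"]] += 1  # forged reports are silently ignored
    if not votes:
        return None
    suspect, count = votes.most_common(1)[0]
    return suspect if count / region_size > rho else None


# Hypothetical region of N = 5 reporting nodes; the report against B carries forged ETX values.
reports = [
    {"suspect": "M", "parent_multi_etx": 10, "link_etx": 12, "child_multi_etx": 22},
    {"suspect": "M", "parent_multi_etx": 10, "link_etx": 15, "child_multi_etx": 25},
    {"suspect": "M", "parent_multi_etx": 10, "link_etx": 11, "child_multi_etx": 21},
    {"suspect": "M", "parent_multi_etx": 10, "link_etx": 14, "child_multi_etx": 24},
    {"suspect": "B", "parent_multi_etx": 30, "link_etx": 12, "child_multi_etx": 20},
]
print(majority_vote(reports, region_size=5))  # M
```

Here node M collects 4 of 5 possible votes (0.8 > ρ), while the single report against B fails the ETX check and is discarded.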

3 CUSUM_HDST model based on D–S evidence theory

CUSUM_MV does not address other types of attack, and it has a relatively large communication overhead in the network. To further improve the performance of the intrusion detection algorithm, Dempster–Shafer evidence theory, also called D–S evidence theory, is introduced. Although Bayesian networks have been widely used for classifying anomalies, they require a probability set and an anomaly distribution to be specified in advance. In contrast, evidence theory supports a reliability measure that is implicitly embedded in system knowledge. It does not need explicitly calculated probabilities, and it can express uncertainty and ignorance in the cognitive domain. Moreover, in the presence of uncertainty it does not require knowledge of the nature of the decision, so evidence theory is more suitable for classifying and detecting anomalies.

The efficiency of combining statistical analysis with evidential reasoning for network anomaly diagnosis is studied in [15]. That work first uses a dual-loop auto-regression to model the increments of the monitored network variables in order to detect network anomalies accurately. To find the cause of an anomaly, evidence theory is used to combine all kinds of evidence. The results are verified with real data and show that the method has higher classification efficiency. The accumulated evidence indicates the correct category, so it is not necessary to consult the per-class estimates. A trust evaluation model suitable for wireless networks is proposed in [16], and a trusted routing protocol is constructed based on AODV. Evidence theory is attractive because it can deal with uncertainty or incomplete knowledge (that is, the lack of a comprehensive probability model). In a network environment, many causes can lead to a variety of anomalies, so evidence theory, which works with an incomplete probability model, is more suitable for intrusion detection and network anomaly detection [17,18,19,21,22,23,24].

The CUSUM_HDST algorithm is a combined distributed and centralized intrusion detection system, shown in Fig. 6. Anomaly detection, misuse detection and the hybrid CUSUM_MV method are used to reduce the volume of anomaly information before it is transmitted to the sink node, which lowers the burden on the network. The sink node is used to identify the attacks.

Fig. 6 Intrusion detection framework based on evidence theory

3.1 Feature selection

In order to detect DoS attacks, we add a statistic Strij that summarizes the traffic of neighbor nodes, that is, the balance between the packets they send and the packets they receive.

Definition 3.1

Sij is the number of packets that node i observes node j sending.

Definition 3.2

Rij is the number of data packets that node i observes node j receiving. Node i counts these by overhearing the ACK packets sent by node j, because a node sends an ACK for each data packet it receives to confirm receipt to the sender. Strij describes the traffic of node j as observed by node i, as defined in formula (3.1).

$$ \mathop {Str}\nolimits_{ij} = \left| {(\mathop R\nolimits_{ij} - \mathop S\nolimits_{ij} )} \right|/\mathop R\nolimits_{ij} $$
(3.1)

Under normal circumstances, the numbers of packets sent and received should be in balance; that is, the ratio should fluctuate around 0. From the formula, if Rij is far greater than Sij, the ratio is close to 1. When a node initiates a DoS attack, the ratio exceeds 1. Because sensor networks are data-centric, network congestion caused by bursts of traffic is hard to avoid, which makes the packet loss probability unstable. The statistic is therefore smoothed as shown in formula (3.2).

$$ Str_{ij,t}^{i} = (1 - \mu_{i}) Str_{ij,t - 1}^{i} + \mu_{i} Str_{ij,t^{\prime}}^{i} ,\quad 0 \le \mu_{i} \le 1 $$
(3.2)

\( Str_{ij,t^{\prime}}^{i} \) is the value observed by node i in the period t′ and \( Str_{ij,t}^{i} \) is the smoothed value. The value of the memory factor μi depends on the specific network environment.
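The behavior of the traffic statistic in formula (3.1) can be checked with a few lines of Python. The function name and the packet counts below are hypothetical and serve only to illustrate the three regimes described in the text.

```python
def traffic_ratio(received, sent):
    """Formula (3.1): Str_ij = |R_ij - S_ij| / R_ij."""
    if received <= 0:
        raise ValueError("R_ij must be positive for the ratio to be defined")
    return abs(received - sent) / received


print(traffic_ratio(100, 100))  # 0.0, balanced traffic
print(traffic_ratio(100, 10))   # 0.9, node receives far more than it sends
print(traffic_ratio(100, 400))  # 3.0, node floods the network (DoS-like)
```

Because the denominator is Rij, the ratio can exceed 1 only when the node sends more than twice as many packets as it receives, which is why a value above 1 is treated as a DoS indicator.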

3.2 Relay node anomaly handling

During anomaly information transfer, a malicious node may slander other nodes by broadcasting false information packets, disrupting the anomaly detection process. This section introduces fuzzy set theory, taking the anomaly information about a given node as the domain. The anomaly indications sent by the various neighbors then form a fuzzy set, and the testimonies of legitimate nodes are treated as belonging to the same sample. The approach examines the relationship between the individual testimonies, which is used to calculate the degree of conflict between them.

The conflict degrees are arranged in an n × n matrix. Evidence with excessive conflict is considered forged malicious information and is not transmitted, which reduces the communication of malicious and redundant information. In Fig. 7, node m is a malicious node. Node 1 has neighbor nodes 2, 3 and 4, which are responsible for observing the behavior of node 1. The solid node m initiates an intrusion that results in anomalous network behavior. While the network is anomalous, nodes 1, 2 and 3 will consider node 1 to be slightly anomalous, because the anomaly report broadcast by node m misleads node 1.

Fig. 7 Libel case

When the relay node receives information from its neighbor nodes, it needs to process the anomaly information and eliminate conflicting reports. In the CUSUM_MV algorithm, the anomaly information of neighbor nodes is obtained from CUSUM GLR and the thresholds. This section processes that anomaly information further. The CUSUM GLR model is used to detect anomalies, and anomalies found in the short time window are sent to the next hop node. To avoid being hijacked by a Sinkhole, the IF transfer uses the CUSUM_MV method. The work flow is shown in Fig. 8.

Fig. 8 Relay node work flow

The information that relay node i receives from node j consists of \( a_{rssi} \), \( a_{cn} \), \( a_{lq} \) and \( a_{str} \), which express the anomaly degrees of the node's signal intensity, convergence, link quality and traffic, respectively. Detection is based on a short time window, so each anomaly degree is calculated as shown in formula (3.3).

$$ (\left| {\mathop E\nolimits_{s} - \mathop E\nolimits_{l} } \right|)/\mathop E\nolimits_{l} $$
(3.3)

where El is the expectation over the long period (before the anomaly alarm) and Es is the expectation over the short time window in which the anomaly is detected. This ratio represents the extent of the anomaly and is an important basis for judging whether an attack has occurred in the network. Assume that node f receives n vectors about the same node a from different neighbor nodes, which can be expressed as the matrix in formula (3.4). Node f is then responsible for filtering these n anomaly reports so that malicious evidence can be removed. Using fuzzy mathematics, node f can calculate the degree of conflict between the various anomaly reports.

$$ \mathop f\nolimits_{n} = \left[ {\begin{array}{*{20}c} {\mathop f\nolimits_{1} } \\ {\mathop f\nolimits_{2} } \\ \vdots \\ {\mathop f\nolimits_{n} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {\mathop m\nolimits_{11}^{a} } & {\mathop m\nolimits_{12}^{a} } & {\mathop m\nolimits_{13}^{a} } & {\mathop m\nolimits_{14}^{a} } \\ {\mathop m\nolimits_{21}^{a} } & {\mathop m\nolimits_{22}^{a} } & {\mathop m\nolimits_{23}^{a} } & {\mathop m\nolimits_{24}^{a} } \\ \vdots & \vdots & \vdots & \vdots \\ {\mathop m\nolimits_{n1}^{a} } & {\mathop m\nolimits_{n2}^{a} } & {\mathop m\nolimits_{n3}^{a} } & {\mathop m\nolimits_{n4}^{a} } \\ \end{array} } \right] $$
(3.4)

The approach degree between the fuzzy sets of each neighbor node can be calculated using fuzzy mathematics. First of all, formula (3.4) is normalized as shown in formula (3.5).

$$ \mathop f\nolimits_{i} = \left( {\mathop m\nolimits_{i1}^{a} /\sum\limits_{j = 1}^{n} {\mathop m\nolimits_{j1}^{a} } ,\mathop m\nolimits_{i2}^{a} /\sum\limits_{j = 1}^{n} {\mathop m\nolimits_{j2}^{a} } ,\mathop m\nolimits_{i3}^{a} /\sum\limits_{j = 1}^{n} {\mathop m\nolimits_{j3}^{a} } ,\mathop m\nolimits_{i4}^{a} /\sum\limits_{j = 1}^{n} {\mathop m\nolimits_{j4}^{a} } } \right) $$
(3.5)

Obviously, normalizing formula (3.4) does not affect the degree of divergence between the vectors. Then, the weighted Euclidean distance in formula (3.6) is used to calculate the approach degree:

$$ d(m_{b}^{a}, m_{c}^{a}) = \sqrt{\sum\limits_{i = 1}^{4} \zeta_{i} (m_{bi}^{a} - m_{ci}^{a})^{2}} $$
(3.6)

In the formula, ζ1, ζ2, ζ3 and ζ4 are the weights of signal intensity, degree of convergence, link quality and traffic, respectively. For example, when detecting a DoS attack, ζ4 should be assigned a larger value than the other three. After the distances are calculated, the matrix shown in formula (3.7) is obtained.

$$ \mathop {\left| {\begin{array}{*{20}c} 0 & {\mathop d\nolimits_{12} } & \ldots & {\mathop d\nolimits_{1n} } \\ {\mathop d\nolimits_{21} } & 0 & \ldots & {\mathop d\nolimits_{2n} } \\ \vdots & \vdots & \vdots & \vdots \\ {\mathop d\nolimits_{n1} } & {\mathop d\nolimits_{n2} } & \ldots & 0 \\ \end{array} } \right|}\nolimits_{n \times n} $$
(3.7)

In formula (3.7), dij expresses the approach degree. A value of 0 indicates no conflict, that is, the two anomaly reports are completely consistent. Conversely, the larger dij is, the greater the conflict between the two reports.

To filter out evidence that would distort the final result, a threshold ρi is defined on the conflict degree of node i, and \( \rho_{i_{0}} \) denotes the conflict proportion of node i0. If it exceeds the threshold, the report is treated as malicious slander that obstructs anomaly detection and is removed; the relay node does not forward it. Conversely, a report whose conflict proportion is below the threshold is classified into the fuzzy set of legitimate testimony. The calculation is shown in formula (3.8).

$$ \rho_{i_{0}} = \frac{\sum\nolimits_{j = 1}^{n} d_{i_{0} j}}{\sum\nolimits_{i = 1}^{n} \sum\nolimits_{j = 1}^{n} d_{ij}} $$
(3.8)
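The relay-node filtering in formulas (3.5) to (3.8) can be sketched in Python as follows. The sketch assumes that the weighted Euclidean distance uses squared differences, that all weights are 1 and that the threshold is 0.4; these values, the function names and the sample reports are hypothetical.

```python
import math


def normalize(reports):
    """Formula (3.5): divide every entry by the sum of its column."""
    sums = [sum(col) for col in zip(*reports)]
    return [[v / s if s else 0.0 for v, s in zip(row, sums)] for row in reports]


def weighted_distance(a, b, weights):
    """Formula (3.6): weighted Euclidean distance between two normalized reports."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def conflict_proportions(reports, weights):
    """Formulas (3.7) and (3.8): each report's share of the total pairwise distance."""
    norm = normalize(reports)
    n = len(norm)
    dist = [[weighted_distance(norm[i], norm[j], weights) for j in range(n)] for i in range(n)]
    total = sum(map(sum, dist))
    if total == 0:
        return [0.0] * n  # all reports agree, so there is no conflict
    return [sum(row) / total for row in dist]


def filter_reports(reports, weights, threshold):
    """Keep only the reports whose conflict proportion does not exceed the threshold."""
    rho = conflict_proportions(reports, weights)
    return [r for r, p in zip(reports, rho) if p <= threshold]


# Three consistent reports about node a and one hypothetical slanderous report.
reports = [[0.1, 0.1, 0.1, 0.1] for _ in range(3)] + [[0.9, 0.9, 0.9, 0.9]]
weights = [1.0, 1.0, 1.0, 1.0]
print(conflict_proportions(reports, weights))
print(len(filter_reports(reports, weights, threshold=0.4)))  # 3
```

In this example the outlier accounts for about half of the total conflict and each consistent report for about one sixth, so only the outlier is dropped before forwarding.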

3.3 Information fusion and attack judgment on the base station

Compared with Bayesian theory, fusion based on evidence theory does not require prior probabilities, which makes it suitable for intrusion detection under dynamic change. To apply DST (D–S evidence theory), we first need to establish sound prior knowledge about network failures and anomalies and list the hypotheses and the corresponding evidence. From these hypotheses and evidence, we can determine the most likely cause of the current network anomaly. Evidence is accumulated and combined to determine the most likely category of the current anomaly information and thus to identify whether a network attack is under way. The work flow on the base station is shown in Fig. 9.

Fig. 9 Base station detection process

On the basis of the CUSUM_MV algorithm, the improved model is used to detect both Sinkhole attacks and DoS attacks. To detect various attacks, we first need to define a recognition framework F = {F1, F2, …, FM} and its relation to the knowledge base, where each element Fi represents a category of intrusion attack to be detected. In particular, F itself denotes non-anomaly, that is, the normal, non-attack case.

To identify the attack in the current network, we first construct a knowledge set that serves as the knowledge base for attack detection. The knowledge set is represented as Dn (φn, fn), where φn is a four-dimensional vector of the anomaly degrees of the individual statistics and fn denotes the type of attack. As detection needs change, the attack or anomaly types in the knowledge base can be extended continually to improve the detection rate and widen the detection range.

The base station collects the anomaly information sent by the nodes. For each child node, the information in time period t can be expressed as a vector \( \varphi \left( t \right) = \left[ {\varphi^{1} \left( t \right),\varphi^{2} \left( t \right), \ldots ,\varphi^{m} \left( t \right)} \right] \), which includes all the anomaly information in the network and the status of the current sensor network; m denotes the number of statistics. To construct the basic probability assignment (BPA), the Euclidean distance is introduced as shown in formula (3.9).

$$ d(\hat{\psi}, \psi_{i}) = \sqrt{\sum\limits_{j = 1}^{m} (\hat{\psi}^{j} - \psi_{i}^{j})^{2}} $$
(3.9)

Formula (3.9) calculates the distance between two vectors and thus represents the degree of similarity between a knowledge-set vector ψ and the current statistics \( \hat{\psi } \). It is used to calculate the degree of trust between each piece of anomaly information and the knowledge set. As the distance between the two vectors increases, the probability that the vector ψ and the network anomaly vector \( \hat{\psi } \) belong to the same class decreases.

An effective BPA should rely on \( d(\hat{\psi },\psi ) \) and reflect the relationship between the type of attack and the anomaly information vector. The BPA function of [19] is therefore introduced to generate the belief values, as shown in formula (3.10).

$$ m_{\varsigma_{i_{0}}}(A) = \left\{ {\begin{array}{*{20}l} {p_{i_{0}}^{(l_{i_{0}})}} \hfill & {\quad A = \{ F_{l_{i_{0}}} \},\; l_{i_{0}} = 1, \ldots ,M} \hfill \\ {1 - p_{i_{0}}^{(l_{i_{0}})}} \hfill & {\quad A = F} \hfill \\ 0 \hfill & {\quad {\text{otherwise}}} \hfill \\ \end{array} } \right. $$
(3.10)

Here \( p_{i}^{(l_{i})} = \alpha e^{ - \gamma d(\tilde{\phi },\phi_{i} )^{2} } \), where 0 < α < 1 and γ > 0, converts the distance between the two vectors into the degree of trust that the anomaly information vector belongs to the category of the knowledge-set vector.

To avoid errors in evidence fusion, the remaining degree of trust is allocated to the recognition framework F, which covers the rest of the classes. In many cases, the same type of attack shows up in several statistics, so a single statistic cannot be matched one-to-one with a class of attack, and one exception vector may assign nonzero values to several attack types. The term \( 1 - p_{i_{0}}^{(l_{i_{0}})} \) expresses the degree of uncertainty, and the default is normal.
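The construction of one BPA from formulas (3.9) and (3.10) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the dictionary key `"F"` for the whole recognition framework, the values of `alpha` and `gamma`, and the sample vectors are all assumptions made for the example.

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length statistic vectors (formula 3.9)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def build_bpa(observed, knowledge_vector, attack_type, alpha=0.9, gamma=0.5):
    """BPA induced by one knowledge-set entry (formula 3.10).

    The mass p = alpha * exp(-gamma * d^2) is given to the single attack type;
    the remaining 1 - p is given to the whole framework, keyed here as "F".
    alpha and gamma are illustrative values, not taken from the paper.
    """
    if not 0 < alpha < 1 or gamma <= 0:
        raise ValueError("require 0 < alpha < 1 and gamma > 0")
    d = euclidean_distance(observed, knowledge_vector)
    p = alpha * math.exp(-gamma * d ** 2)
    return {attack_type: p, "F": 1.0 - p}

# Hypothetical four-dimensional vectors, for illustration only.
observed = [0.8, 0.1, 0.7, 0.2]
sinkhole_entry = [0.9, 0.1, 0.8, 0.2]
bpa = build_bpa(observed, sinkhole_entry, "sinkhole")
print(bpa)
```

A knowledge-set vector identical to the observation receives the maximum mass α, and the mass decays toward 0 as the distance grows, leaving almost everything on the framework F.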

To detect intrusion attacks in the network, we need to consider all the vectors in the knowledge set for each type of attack. There is generally more than one piece of evidence ξi, and each one assigns belief over several subsets of the recognition framework, as in the process shown in Fig. 9. Thus, we can use the results obtained by fusing the evidence to identify the abnormal types. To obtain the total BPA of each focal element, the evidence combination rule \( m^{(N)} (A) = \oplus_{i = 1}^{N} m_{\xi_{i} } (A) \) is used, where A ∈ F. With a threshold set for each type of attack, the detection model can reflect the changes of the current network characteristics, as shown in Fig. 1.
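The fusion step can be illustrated with Dempster's rule of combination, the usual meaning of ⊕ in evidence theory. The sketch below assumes that each BPA is a dictionary mapping a `frozenset` of attack types to its mass; the masses, the type names and the thresholds are hypothetical values chosen for the example.

```python
from functools import reduce
from itertools import product

def dempster_combine(m1, m2):
    """Combine two BPAs with Dempster's rule (keys are frozensets of types)."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("evidence is totally conflicting")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

def fuse(bpas):
    """Fuse N pieces of evidence: m^(N) = m_1 (+) m_2 (+) ... (+) m_N."""
    return reduce(dempster_combine, bpas)

def detect(fused, thresholds):
    """Attack types whose fused singleton mass reaches its threshold."""
    return [t for t, limit in thresholds.items()
            if fused.get(frozenset({t}), 0.0) >= limit]

# Hypothetical evidence over the framework F = {sinkhole, dos, normal}.
F = frozenset({"sinkhole", "dos", "normal"})
m1 = {frozenset({"sinkhole"}): 0.6, F: 0.4}
m2 = {frozenset({"sinkhole"}): 0.5, F: 0.5}
m3 = {frozenset({"dos"}): 0.3, F: 0.7}

fused = fuse([m1, m2, m3])
print(detect(fused, {"sinkhole": 0.6, "dos": 0.6}))  # ['sinkhole']
```

Two pieces of evidence that agree on the same attack type reinforce each other, while a conflicting piece lowers the fused mass only slightly after normalization.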

4 Simulation and analysis

The simulation uses the Castalia simulator. The nodes must be configured, and simulation parameters such as the simulation scene, the number of nodes and the data sending rate are shown in Table 1. Attack nodes are deployed randomly in the network environment, and the Sinkhole and DoS attacks are launched after a period of time. The simulation program was run 10 times in various settings. Following the analysis above, for the detection of Sinkhole attacks, the base station periodically broadcast information frames to the network. When a node received an information frame, it sent its local detection results to the base station. The relay nodes detected the anomaly information, and the base station generated and fused evidence from it to determine the attack type.

Table 1 Simulation environment parameter configuration

The model was analyzed in terms of the detection rate, the false alarm rate and the additional burden on the network. Owing to space limitations, this paper analyzes the LQ of the nodes near the Sinkhole attack. In Fig. 10, the horizontal axis represents the number of hops to the attack node, and the vertical axis represents the average degree of link quality change of the nodes at each hop count. The degree of change is calculated by formula (4.1).

$$ \left[ \frac{1}{n} \sum\limits_{j = 0}^{n - 1} \frac{\left| Etx_{S}^{i,j} - Etx_{N}^{i,j} \right|}{Etx_{N}^{i,j}} \right] \times 100, \quad i = hop $$
(4.1)

In formula (4.1), n is the number of nodes at the given hop count from the attack node. \( Etx_{S}^{j} \) and \( Etx_{N}^{j} \) are the link quality of node j to the base station before and after the change. It can be clearly seen that the link quality of the nodes close to the attack node changes dramatically, and that the change becomes less obvious as the distance from the point of attack increases. After a period of time, the Sinkhole node launched the attack, which changed the LQ to the base station. The attack node launched the attack by broadcasting route beacon frames. In these beacon frames, it pretended that its link quality to the base station was high, and it increased the frequency of transmitting beacon frames. The surrounding neighbor nodes responded by selecting the attack node as their parent node, which ultimately created the illusion that the link quality to the base station had risen in the nearby region. In addition, the change of convergence degree is consistent with the change of LQ; that is, when a node has a high-quality link to the base station, its attraction to the surrounding nodes is bound to increase.

Figure 11 shows the RSSI of a node’s parent node under normal circumstances. The horizontal axis is the time sequence, and the vertical axis is the RSSI value of the node collected at the corresponding time point. It is clear that the RSSI of the node consistently follows a normal distribution.
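The degree of link quality change in formula (4.1) can be computed per hop count as in the sketch below. The function name and the sample ETX values are hypothetical; the two lists hold the \( Etx_{S} \) and \( Etx_{N} \) values of the nodes at one hop count.

```python
def lq_change_degree(etx_s, etx_n):
    """Average relative ETX change, in percent, over the nodes at one hop count.

    etx_s[j] and etx_n[j] are the two link quality values of node j
    that appear in formula (4.1); etx_n is the normalizing value.
    """
    if len(etx_s) != len(etx_n) or not etx_s:
        raise ValueError("need two non-empty lists of equal length")
    n = len(etx_s)
    total = sum(abs(s - base) / base for s, base in zip(etx_s, etx_n))
    return total / n * 100

# Two hypothetical nodes at the same hop count.
print(lq_change_degree([1.0, 2.0], [2.0, 4.0]))  # 50.0
```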

Fig. 10 Impact of attack on link quality

Fig. 11 RSSI distributions under normal circumstances

As shown in Fig. 12, when a Sinkhole attack occurs, the attack node may increase its emission power, and the 138th point in the figure shows a sudden change in power. To respond to a range of attacks, the CUSUM GLR algorithm can readily be used to detect the time points at which anomalies occur and thus quickly detect Sinkhole attacks in the network.
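To illustrate how a sudden rise in RSSI can be located in time, the sketch below uses a plain one-sided CUSUM statistic. It is a simplified stand-in, not the CUSUM GLR variant used in this paper, and the baseline mean, drift, threshold and RSSI samples are hypothetical values.

```python
def cusum_first_alarm(samples, baseline_mean, drift, threshold):
    """Index of the first sample at which the upward CUSUM statistic exceeds
    the threshold, or None if no alarm is raised.

    S_t = max(0, S_{t-1} + (x_t - baseline_mean - drift))
    """
    s = 0.0
    for t, x in enumerate(samples):
        s = max(0.0, s + (x - baseline_mean - drift))
        if s > threshold:
            return t
    return None

# Hypothetical RSSI trace in dBm: steady at -70, then a jump to -60 at index 10.
rssi = [-70.0] * 10 + [-60.0] * 5
print(cusum_first_alarm(rssi, baseline_mean=-70.0, drift=1.0, threshold=15.0))  # 11
```

The drift term keeps the statistic at zero under normal fluctuation, so the alarm is raised only after the accumulated deviation exceeds the threshold, here one sample after the jump.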

Fig. 12 Anomaly under attacking

Intrusion detection algorithms are usually evaluated by the detection rate and the false-positive rate over all simulation results. We therefore need to calculate the false alarm rate and the false-negative rate. When a normal behavior is labeled as an anomaly, it is a false alarm (false positive). When an anomalous behavior is labeled as normal, it is a false negative. The false-positive rate is the ratio of the number of false positives to the number of actual measurements. The false-negative rate is the ratio of the number of false negatives to the actual number of anomalies.
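These rates can be computed from labeled results as in the sketch below. The false-positive and false-negative rates follow the definitions in the text; the detection rate is taken here as detected anomalies divided by actual anomalies, which is an assumption, since the text does not spell it out. The labels are hypothetical.

```python
def detection_metrics(actual, predicted):
    """Rates from two equal-length lists of booleans (True = anomaly)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    anomalies = sum(actual)
    if anomalies == 0:
        raise ValueError("no actual anomalies: rates are undefined")
    false_pos = sum(1 for a, p in zip(actual, predicted) if not a and p)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return {
        # false positives over the number of actual measurements
        "false_positive_rate": false_pos / len(actual),
        # false negatives over the actual number of anomalies
        "false_negative_rate": false_neg / anomalies,
        # assumed definition: detected anomalies over actual anomalies
        "detection_rate": (anomalies - false_neg) / anomalies,
    }

# Ten hypothetical measurements: four real anomalies, one missed, one false alarm.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
print(detection_metrics(actual, predicted))
```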

The CUSUM_MV anomaly detection model introduces an adaptive mechanism; Fig. 13 shows its false-positive rate relative to that of the ordinary CUSUM GLR. Here, false positives are abnormal alarms raised in the area of the attack. The relative false-positive rate refers to the number of false alarms as a proportion of the total number of nodes. The false-positive rate is a key factor alongside the detection rate, because if it is too high, the additional communication overhead increases.

Fig. 13 False alarm rate changing with attack strength

In the experimental environment, the convergence degree CN, the link quality LQ and the RSSI of the parent node were monitored. The adaptive and predictive methods were used to detect anomalies in these three variables. The long time window was set to 60 s, the short time window was set to 10 s, and the windows moved every 1 s. If any one of the variables was anomalous, the current node was judged to possibly have an exception. Simulation results are shown in Fig. 13. From the graph, it can be seen that, under the same conditions, the CUSUM GLR anomaly detection model with the adaptive mechanism had a significantly lower false-positive rate. Intuitively, the CUSUM_MV attack detection model achieves its best detection performance only with the most suitable ρ.
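The window arrangement described above can be sketched as follows, assuming one sample per second. Only the window lengths, the 1 s step and the rule that any one anomalous variable marks the node come from the text. The test itself (short-window mean compared with the mean of the preceding long window, scaled by `k` standard deviations) is a hypothetical stand-in for the adaptive and predictive methods, and all names are illustrative.

```python
from statistics import mean, pstdev

LONG_WINDOW = 60   # seconds (from the text)
SHORT_WINDOW = 10  # seconds (from the text)
STEP = 1           # seconds (from the text)

def window_flags(series, k=3.0):
    """Slide both windows over a 1 Hz series; return (end_index, flag) pairs.

    Hypothetical test: the short window is anomalous when its mean differs
    from the mean of the preceding long window by more than k standard
    deviations of that long window.
    """
    flags = []
    for end in range(LONG_WINDOW + SHORT_WINDOW, len(series) + 1, STEP):
        long_part = series[end - SHORT_WINDOW - LONG_WINDOW:end - SHORT_WINDOW]
        short_part = series[end - SHORT_WINDOW:end]
        mu, sigma = mean(long_part), pstdev(long_part)
        # A small floor avoids a zero threshold on a perfectly flat window.
        flags.append((end, abs(mean(short_part) - mu) > k * max(sigma, 1e-9)))
    return flags

def node_is_suspect(cn, lq, rssi, k=3.0):
    """The node may have an exception if any one of the three variables is anomalous."""
    return any(flag for series in (cn, lq, rssi)
               for _, flag in window_flags(series, k))

# Hypothetical traces: CN and LQ stay flat, RSSI jumps in the last 10 s.
flat = [5.0] * 80
rssi = [-70.0] * 70 + [-60.0] * 10
print(node_is_suspect(flat, flat, rssi))  # True
```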

In the experimental scene, 10 Sinkhole attack nodes were deployed randomly; the trend of the false alarm rate and the detection rate as the value ρ changes is shown in Fig. 14. From the figure, it can be seen that the detection rate gradually increased. Once the detection rate reached a certain level, it remained unchanged, while the false-positive rate also increased. Therefore, when the intrusion detection module is deployed in a real scenario, the ρ value needs to be trained on past data to give an appropriate value. From the figure, it can be seen that this value is 0.6, at which the false-positive rate and the detection rate achieve a good balance.

Fig. 14 Detection performance with the change of detection threshold

The algorithm proposed by E. C. H. Ngai, the hyper-sphere distributed clustering algorithm proposed by Sutharshan Rajasegarar and CUSUM_MV were compared in the same network scenario. The relationship between the detection rate of each model and the number of malicious nodes is shown in Fig. 15.

Fig. 15 Comparison of detection rate of intrusion detection model

When a Sinkhole attack occurs, the attack node hijacks the information passing through it, so the base station cannot receive effective information. The cluster-based method does not use such an information transfer mechanism, so its detection rate is very low. The proposed method mitigates the defects of the method of E. C. H. Ngai et al. and uses link quality as a criterion of evidence, which improves the detection rate relative to that method.

As the number of attack nodes increases, the detection rate of each detection method falls. Because there are too many anomalies, messages conflict, and the corresponding information cannot be transmitted to the base station, so the base station's judgment is no longer accurate. As the number of malicious nodes in the network increases, it becomes more difficult to detect the attacker, especially once the malicious nodes reach a certain ratio, at which point the network is no longer in a working state. Likewise, it becomes more difficult to distinguish malicious nodes from normal nodes, and the false-positive rate also rises.

In terms of energy consumption, because the signal intensity set for the nodes is fixed, the energy consumption of an individual node is fixed. If the node needs to maintain the network traffic information in the hybrid mode, the energy consumption is constant [20]. In this paper, we only consider the additional communication burden caused by the intrusion detection module, expressed as a proportion of the communication packets in the network. E. C. H. Ngai’s approach is based on the base station, whereas the proposed method performs detection on the node and transmits the data to the base station. Thus, both the communication load and the communication cost are relatively low. Figure 16 shows a comparison of the communication burden of the intrusion detection models.

Fig. 16 Comparison of the communication burden of intrusion detection model

From Fig. 17, it can be seen that the detection rate of CUSUM_HDST is equivalent to that of the CUSUM_MV algorithm, and the stability of the CUSUM_HDST detection model is better as the number of attack nodes increases. In the experimental scene, the detection rates of CUSUM_MV and CUSUM_HDST are the same when the proportion of malicious nodes is less than 5. At 5, the detection rate of CUSUM_MV remains 1, while that of CUSUM_HDST declines to 0.9. Both decline synchronously between 5 and 10 and then do not change until 15.

Fig. 17 Detection rate changes with the proportion of malicious nodes

From Fig. 18, it can be seen that the false-positive rate of the CUSUM_HDST detection model is stable and shows no dramatic change.

Fig. 18 Communication overhead with the change of the proportion of malicious nodes

5 Conclusions

In this paper, we study the anomaly detection behavior of the nodes and the base station.

  1. When the base station cannot capture the network anomalies, the CUSUM GLR is introduced, and the anomaly detection model is given;

  2. Because Sinkhole nodes hijack traffic, a mechanism for transmitting the anomaly information to the base station is given for Sinkhole attack nodes;

  3. Based on the “link quality” and “majority rule,” a new Sinkhole attack detection scheme is proposed, and a CUSUM_MV intrusion detection model based on node and base station communication is presented;

  4. The Castalia simulation experiments show that the CUSUM_MV intrusion detection model performs better than traditional methods in detecting Sinkhole attacks. The detection rate is improved, and the false-positive rate is reduced;

  5. Based on weighted Euclidean distance, redundant information removal mechanisms are established on the relay nodes. To reduce the communication overhead caused by intrusion detection, evidence theory is applied to detection in wireless sensor networks. Based on node and base station, the CUSUM_HDST intrusion detection model is given;

  6. Simulation experiments based on Castalia show that the CUSUM_HDST intrusion detection model can not only detect Sinkhole and DoS attacks, but also reduce the communication overhead caused by intrusion detection.