1 Introduction

With the continuous development of communication technology, the growing demand for communication between things has become a new growth point in the communication industry [1]. It is widely believed that the Internet of Things (IoT), which interconnects all things, is about to become the next blue ocean with great commercial potential. Narrow-band Internet of Things (NB-IoT) technology has become a key technology in IoT communication by virtue of its wide coverage, low power consumption, low cost, and large number of connections [2]. However, with a large number of terminals deployed, a key problem is how to handle the large volume of changing environmental data and how to make the data from multiple sensors and data sources reflect these changes in real time, so data fusion is an essential process. The IoT builds a ubiquitous network interconnecting all things, in which node energy, storage space, and network bandwidth are far more limited than in traditional networks, and links are unstable. In addition, since the IoT extends network access to things, information security at the perception layer becomes a prominent issue once all things are connected to the network. In this paper, a multi-source heterogeneous data fusion method based on perceptual semantics in NB-IoT is proposed to address these problems [3, 4].

With the development of artificial intelligence technology, data fusion has penetrated into every corner of the modern information society, and its importance is becoming more and more prominent. Data fusion technology is developing towards greater intelligence and integration. Data fusion is not simple data processing; rather, data processing can be regarded as the first stage of data fusion. Data fusion technology is widely used in industrial processes, monitoring systems, intelligent detection systems, military applications, and other fields. The key technologies involved include data conversion, data association, situation-awareness databases, and the fusion calculation of the data fusion system. Data fusion is divided into a low level (data level or pixel level [5]), a middle level (feature level), and a high level (decision level). After years of development, data fusion has made some progress, and existing methods can generally be divided into statistical methods and estimation methods [6]. However, existing data fusion technology still has problems: (1) no basic theoretical framework independent of specific practical problems has been formed, and no effective general model or universal data fusion algorithm has been established; (2) the ambiguity of data association is the main obstacle in the data fusion process; (3) the meaning of data and the semantic connections between data cannot be described accurately; (4) data access services with semantic support cannot be provided; (5) the fault tolerance, error correction ability, and robustness of data fusion systems are not yet handled well; (6) research on the concrete realization of data fusion is still at a preliminary stage; (7) many practical problems remain in implementing concrete fusion systems [7].

Compared with traditional data fusion, semantic data fusion applies semantic association technology to data fusion in order to improve the intelligence and efficiency of the fusion and achieve a better fusion effect. The semantic network is a highly intelligent network. Its data contain semantic information that machines can understand; those data can be stored in a semantically relevant database, and machines can easily read, write, and manipulate them. To improve operational efficiency, the semantic network introduces the concept of metadata [8]. Computers in the semantic network will be more autonomous: they will require less human intervention to obtain useful information and will understand and use the data more effectively. Computers and software agents in the semantic network will therefore be more “intelligent,” using the information they hold to derive implied facts automatically through logical inference. Before semantic fusion, data should be processed with the related semantic network technologies [9].

The core idea of this paper is to address “semantic fusion and intelligent perception applications for IoT nodes in narrow-band Internet of Things.” To this end, an abstract common data representation and an adaptation layer for the underlying equipment are established across the different information access modes, and semantic concepts at different levels of abstraction are then attached to the original perceptual data. A prototype inference machine is established based on the rule set of IoT background knowledge, and various applications of the IoT data space are completed based on situational perception in a semantic environment. The relevant technologies are hot research topics in artificial intelligence and information processing. Furthermore, although the knowledge expression and inference involved in semantic network technology have made some progress, they are far from universal. Therefore, this paper proposes a multi-source heterogeneous data fusion framework based on perceptual semantics in NB-IoT to solve the key problem of universal calculation for NB-IoT.

2 The framework of NB-IoT

Narrow-band Internet of Things (NB-IoT) is a new Internet of Things technology with four key features: wide coverage, low power consumption, low cost, and a large number of connections. Based on the 4G LTE Evolved Packet Core network framework, the network framework of NB-IoT is optimized according to its system features [10]. The NB-IoT system is divided into five parts: the NB-IoT terminal, the NB-IoT base station, the NB-IoT core network, the NB-IoT cloud platform, and the vertical industry center. The NB-IoT terminal part supports the access of all kinds of NB-IoT terminals; in future networks, any terminal equipped with the corresponding chip can access the NB-IoT network. The NB-IoT base station mainly reuses the LTE base stations set up by major operators, and there are three main frequency band deployment modes: stand-alone deployment, guard-band deployment, and in-band deployment. The NB-IoT core network mainly connects the NB-IoT base station with the cloud platform. The NB-IoT cloud platform processes the different services and forwards the processing results to the vertical industry center or the NB-IoT terminal. The vertical industry center obtains the NB-IoT service data and controls the NB-IoT terminals [11, 12].

The NB-IoT system consists mainly of a perception layer, a network layer, and an application layer, as shown in Fig. 1:

Fig. 1 Architecture for NB-IoT network

(1) Perception layer: it mainly realizes the perception function, including object recognition and the collection of relevant information. Its key technologies include RFID [13], sensors, and short-distance wireless communication [14]. Data collection and perception are mainly used to capture events and data in the environment, and the range of perception can be a single object, the objects in a specific area, or objects in different positions.

(2) Network layer: it realizes extensive interconnection and communication, integrating the sensor network with mobile communication technology and internet technology, and transfers and processes the perceived information reliably and safely [15].

(3) Application layer: it is combined with industry demand to realize extensive information sharing and interconnection. Typical applications include intelligent transportation, industrial monitoring, telemedicine, smart home, environmental detection, public safety, urban management, and intelligent logistics [16].

This paper mainly studies the problem of heterogeneous data fusion in narrow-band Internet of Things. Sensor network technology is the key technology of the perception layer in the Internet of Things, and data fusion is a research focus within sensor networks. How to integrate these data, reduce energy consumption, and improve accuracy are the core issues of IoT perception technology.

3 Heterogeneous data fusion

In narrow-band Internet of Things, the monitoring range of a single sensor node is limited, and most deployed nodes are redundant, so their monitoring ranges overlap and the data they collect are redundant. Repeatedly transmitting redundant data increases the burden on the network, so nodes need to fuse the data and reduce the amount transmitted in order to lower the energy consumption of the network. Different types of data also need to be fused effectively in order to perceive state information accurately [17]. Data fusion can be combined with the protocol layers of the sensor network, most often the network layer; that is, a data fusion mechanism is built into the routing protocol to reduce the amount of data transmitted. Data fusion is thus an important technology in narrow-band Internet of Things: it eliminates data redundancy, reduces the amount of data transmitted and the energy consumed, and extends the network lifetime. It also processes data from multiple sensors and extracts the effective information, improving information accuracy.

In narrow-band Internet of Things, a large amount of perception data is transmitted from multiple source nodes to aggregation nodes. From the processing level of nodes in the network, the data fusion model can be divided into two types: centralized mode and distributed mode.

3.1 Centralized fusion model

In the centralized fusion model, multiple sensor source nodes transmit the collected data directly to the aggregation nodes [18], which then fuse the data; the structure is shown in Fig. 2. The advantage of this structure is that little information is lost. The disadvantage is that, because nodes in most wireless sensor network applications are densely distributed, the data collected by multiple sensor source nodes are highly redundant, and transmitting this redundant information directly increases the energy consumption of the network. Moreover, the system has poor survivability and is not suitable for wireless sensor network applications with strict energy-saving requirements.

Fig. 2 Centralized fusion model

3.2 Distributed fusion model

In the distributed fusion model, the IoT data collected by different sensor nodes are combined and fused at intermediate nodes, which removes redundant data and reduces the amount of data transmitted over the network; the structure is shown in Fig. 3. When the data collected by the sensor source nodes are forwarded through an intermediate node, that node performs the corresponding data fusion and then transmits the fused information to the aggregation nodes for processing. Distributed data fusion plays a very important role in wireless sensor network applications: it can effectively reduce energy consumption in NB-IoT, extend network lifetime, increase the accuracy of data fusion, and improve the efficiency of data transmission.

Fig. 3 Distributed fusion model
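The difference between the two models can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: the fused quantity is a plain mean, each intermediate node forwards one hypothetical `(mean, count)` summary per cluster, and only the packets arriving at the aggregation node are counted. The function names and sample readings are illustrative.

```python
from statistics import mean


def centralized(clusters):
    """Every source node sends its raw reading straight to the aggregation node."""
    readings = [r for cluster in clusters for r in cluster]
    return mean(readings), len(readings)  # fused value, packets reaching the sink


def distributed(clusters):
    """Each intermediate node fuses its cluster and forwards one (mean, count) summary."""
    summaries = [(mean(c), len(c)) for c in clusters]
    total = sum(n for _, n in summaries)
    fused = sum(m * n for m, n in summaries) / total  # count-weighted mean of the summaries
    return fused, len(summaries)


# Three clusters of hypothetical temperature readings.
clusters = [[20.0, 21.0, 19.0], [22.0, 24.0], [18.0, 18.0, 18.0, 18.0]]
c_value, c_packets = centralized(clusters)
d_value, d_packets = distributed(clusters)
print(c_packets, d_packets)
```

For a mean, the count-weighted summaries reproduce the centralized result while fewer packets reach the aggregation node; for other statistics, fusing at intermediate nodes may lose information, which is the trade-off described for the two models.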

4 Multi-source heterogeneous data fusion based on semantic perception

Because the various Internet of Things spaces lack uniform format rules, machine-understandable data description methods, and user operation rules for different levels of abstraction, combining perception technology with data fusion technology is an important research topic. Choosing a suitable semantic protocol and data fusion algorithm can effectively reduce the amount of data transmitted, mitigate network congestion, and extend network lifetime.

4.1 Overall idea

If semantic concepts at different levels of abstraction are attached to the original data, and a logic inference engine prototype is established based on the rule set of background knowledge, the problem of applying data fusion in the data space can be solved. The proposed data fusion model, based on the standard rules of the semantic network, allows authorized access to the physical network and the internet. Several problems should be considered in semantic data fusion: what the input and output of the fusion system are; how to preprocess the data provided by each sensor so that the results meet the requirements of the fusion model with minimum information loss; what structure the fusion system adopts; and which fusion algorithm allows the fusion system to achieve optimal performance. To solve these problems, we propose the hierarchical design and implementation of a semantic fusion architecture model and the architecture of an intelligent semantic network, in which the Jena framework [19, 20] is introduced to create the semantic network, and we give the implementation principle and interface model.

Intelligent semantic functions in an Internet of Things system usually require the fusion of multiple sensors, and the fusion algorithm is the key to realizing this intelligence. Semantic intelligent reasoning is based on the ontology and the relationships defined in it, and it needs to be incorporated into the fusion algorithm. The fusion algorithm is the logical expression of the intelligent behavior; its concrete implementation takes place in ontology reasoning, where it is embodied in the ontology inference rules. We propose to establish the intelligent ontology inference rules by adopting the Jena inference machine with user-defined inference rules, which realizes ontology inference in the Internet of Things. Jena parses the user-defined rules and instantiates the corresponding inference rule objects.
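To make the idea of user-defined inference rules concrete, the following Python sketch shows forward chaining over subject-predicate-object triples. It is not Jena's API or rule syntax and is not the authors' rule set; the `match` and `infer` helpers, the `?variable` convention, and the sensor facts are all hypothetical and only illustrate how a rule can derive an implied fact from annotated perceptual data.

```python
def match(pattern, triple, bindings):
    """Unify one (s, p, o) pattern with a triple; terms starting with '?' are variables."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None  # constant does not match
    return new


def infer(facts, rules):
    """Apply the rules repeatedly until no new triple can be derived (forward chaining)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            candidates = [{}]
            for pattern in premises:
                candidates = [b2 for b in candidates for t in facts
                              if (b2 := match(pattern, t, b)) is not None]
            for b in candidates:
                derived = tuple(b.get(term, term) for term in conclusion)
                if derived not in facts:
                    facts.add(derived)
                    changed = True
    return facts


# Hypothetical annotated sensor data and one hypothetical rule.
facts = {
    ("sensor1", "type", "TemperatureSensor"),
    ("sensor1", "locatedIn", "lab1"),
    ("sensor1", "reading", "high"),
}
rules = [(
    [("?s", "type", "TemperatureSensor"),
     ("?s", "reading", "high"),
     ("?s", "locatedIn", "?room")],
    ("?room", "state", "Overheated"),
)]
result = infer(facts, rules)
print(("lab1", "state", "Overheated") in result)
```

In a real deployment the rules would be written in the reasoner's own rule language and evaluated by the inference engine; the sketch only shows the matching-and-deriving loop that such rules rely on.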

4.2 The semantic data fusion model

To address the problem of controlling service heterogeneity in the semantic data of the Internet of Things data space, and drawing on the ideas and expression methods of the semantic network service model, we propose an overall scheme for semantic fusion. The proposed algorithm takes the perceptive devices in narrow-band Internet of Things, which are usually heterogeneous data sources, as the basic data objects. It uses them to establish a domain knowledge ontology through a virtual database and the mapping relations of the perceptive devices, and then expresses the semantics of the data objects and realizes data source fusion. The specifics include semantic marks, metadata construction, ontology mapping, QoS assurance, and rule definition for applications. The first four items define the static semantics of a perceptive device object, while the last item defines its dynamic semantics. The architecture of our proposed model is shown in Fig. 3. It is logically divided into four levels: the Home Data Space Adapter, the Ontology Layer, the Semantic Processing Layer, and the Application Service Layer. Among them, the Semantic Processing Layer is responsible for managing semantic data, the operation model, and the personalized requirements of users. To free users from complex operations, we apply an intelligent perceptive application framework. The upper layer manages and operates on the spatial data of narrow-band Internet of Things by means of the query engine, the inference engine, and the established rule set in the embedded layer. The most important part is the inference machine, which realizes knowledge-based inference in the Internet of Things system and its intelligent applications and is an indispensable part of the knowledge system.

In realizing the intelligent Internet of Things fusion algorithm, Dempster-Shafer (D-S) evidence theory is added to the inference rules so as to improve the probability of object perception. The data to be fused come from multiple sensors and may be random, fuzzy, or uncertain because of environmental effects and the information quality of the data sources. Because D-S evidence theory can deal with uncertain and fuzzy information, it offers a solution to these problems. Evidence theory is an inference model established on the basis of a recognition framework. The main steps are: (1) establish the recognition framework; (2) establish the initial reliability allocations, which are represented by the basic probability assignment (BPA) function; (3) calculate the reliability of all propositions according to their causal relationships, where the reliability of a proposition equals the sum of the initial reliabilities of the evidence for all of its premises; (4) synthesize the evidence, using the synthesis formula of evidence theory to fuse the reliabilities of all propositions and obtain the fused reliabilities; (5) make a decision according to the reliability of the fused propositions, usually taking the proposition with the maximum reliability as the final decision. The synthesis formula is the basis of evidence inference and makes it possible to fuse the data supplied by multiple evidence sources. The basic synthesis formula of evidence theory is:

$$ m(A)=\frac{1}{1-k}\sum \limits_{A_{i1}\cap {A}_{i2}\cap \cdots \cap {A}_{in}=A}{m}_1\left({A}_{i1}\right)\cdot {m}_2\left({A}_{i2}\right)\cdots {m}_n\left({A}_{in}\right) $$
(1)

where \( k={\sum}_{A_{i1}\cap {A}_{i2}\cap \cdots \cap {A}_{in}=\varPhi }{m}_1\left({A}_{i1}\right)\cdot {m}_2\left({A}_{i2}\right)\cdots {m}_n\left({A}_{in}\right) \), m represents the basic probability assignment function, and \( {m}_n\left({A}_{in}\right) \) represents the basic probability assignment function of the different evidence sources. k is the conflict probability between the evidence, which reflects the degree of conflict between the evidence. When k = 1, the evidence is in total conflict and the synthesis formula cannot be applied; when k → 1, the evidence is highly conflicting and the synthesis can lead to counterintuitive conclusions. The normalization factor 1/(1 − k) prevents non-zero probability from being assigned to the empty set Φ during synthesis. To obtain the basic probability functions, an identification framework Θ is given in D-S evidence theory, and the basic probability assignment function is defined on its power set \( {2}^{\varTheta } \) as \( m:{2}^{\varTheta}\to \left[0,1\right] \), which satisfies the following equations:

$$ m\left(\varPhi \right)=0 $$
(2)
$$ {\sum}_{\mathrm{A}\subseteq \varTheta }m(A)=1 $$
(3)
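As a worked illustration of the synthesis rule, the Python sketch below combines two evidence sources using the standard form of Dempster's rule, in which k collects the products whose focal elements have an empty intersection and the remaining masses are divided by 1 − k. The frame {normal, alarm}, the mass values, and the function name `combine` are assumptions chosen for the example, not data from the experiments.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two BPAs whose focal elements are frozensets."""
    fused, k = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + wa * wb
        else:
            k += wa * wb  # mass that would otherwise land on the empty set
    if 1.0 - k < 1e-12:
        raise ValueError("total conflict (k = 1): the rule is undefined")
    return {s: w / (1.0 - k) for s, w in fused.items()}, k


NORMAL, ALARM = frozenset({"normal"}), frozenset({"alarm"})
THETA = NORMAL | ALARM  # the whole frame, i.e. "uncertain"

m1 = {NORMAL: 0.6, ALARM: 0.1, THETA: 0.3}  # hypothetical sensor 1
m2 = {NORMAL: 0.7, ALARM: 0.2, THETA: 0.1}  # hypothetical sensor 2

fused, k = combine(m1, m2)
decision = max(fused, key=fused.get)  # step (5): pick the most reliable proposition
print(round(k, 2), set(decision))
```

Because Dempster's rule is associative, more than two sources can be fused by applying `combine` repeatedly to the running result and the next source.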

In this perceptual fusion system, when D-S evidence theory is used for decision confirmation, the possible states are taken as the identification framework, and the data supplied by each sensor are taken as the basic probability assignment of the evidence. Given that the output value of the evidence source of sensor i for state j is \( {y}_i(j) \), we can get

$$ {m}_i(j)=\frac{y_i(j)}{\sum_{j=1}^N{y}_i(j)} $$
(4)
$$ {m}_i(q)=1-a,\kern0.36em i=1,2,\cdots, \mathrm{N} $$
(5)

Here \( {m}_i(j) \) is the mass that the i-th evidence assigns to state j; \( {m}_i(q) \) is the probability assignment of the uncertainty q; a represents the reliability coefficient of a single evidence, which can also be regarded as the degree of reliability that specialists assign to a decision based on a single evidence. Fusing more evidence expands the observation range in time and space and thus yields more accurate and more reliable data.
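A minimal Python sketch of how one sensor's outputs could be turned into a basic probability assignment following Eqs. (4) and (5) is given below. The equations as printed do not state how the state masses and the uncertainty mass are reconciled so that they sum to 1; the sketch assumes the state masses are scaled by a, as in classical evidence discounting. That scaling, the function name `build_bpa`, the key `"uncertain"`, and the sample outputs are assumptions, not the authors' stated method.

```python
def build_bpa(outputs, a):
    """Map one sensor's non-negative outputs y_i(j) to masses over states plus uncertainty."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("reliability coefficient a must lie in [0, 1]")
    total = sum(outputs.values())
    if total <= 0:
        raise ValueError("sensor outputs must have a positive sum")
    # Eq. (4), scaled by a (assumption) so that all masses sum to 1.
    bpa = {state: a * y / total for state, y in outputs.items()}
    bpa["uncertain"] = 1.0 - a  # Eq. (5): mass left on the uncertainty q
    return bpa


# Hypothetical outputs of one sensor for three candidate states.
bpa = build_bpa({"normal": 6.0, "warning": 3.0, "alarm": 1.0}, a=0.9)
print(bpa)
```

A lower reliability coefficient a moves more mass to the uncertainty term, so an unreliable sensor has less influence when its evidence is later combined with that of other sensors.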

5 Experiment results and analysis

The proposed framework for multi-source heterogeneous information fusion in the NB-IoT is divided into the following phases. In raw data annotation and abstraction, IoT data are annotated and explained by metadata, which can be linked and used to facilitate integration and interoperability. Data integration combines relevant data residing in a number of heterogeneous data sources, which may conflict in structure, context, or value; the combined data provide a unified view of the data. Data fusion focuses on computation over the structured and comparable IoT data to improve data quality or reach appropriate decisions. Feature abstraction and inference enrich the fused results with meaning through semantics, abstraction, and reasoning.

5.1 Configuration setup

To evaluate the proposed multi-source heterogeneous data fusion based on perceptual semantics in NB-IoT both qualitatively and quantitatively, we design several experiments. All simulations are run under Ubuntu 14.04.1 x64 on a personal computer with 4 GB of memory and a 2.94-GHz Intel Core i5-7500 processor, as shown in Table 1. The LTE BC95-B5 module released by QUECTEL in February 2017 is adopted as the communication module of the NB-IoT device. Based on the semantic perception ontology developed by the W3C, we model the schema level of the observation data from the instance level. The purpose of the Semantic Web is to build a web that describes things in a way that computers can understand, so that information can be accessed and exchanged automatically across the semantic-aware IoT. Various NB sensors deployed in labs collect real-time data; 40 sensor nodes are placed in each monitoring area, and data fusion is performed at the sink nodes. It is assumed that the perception layer of the IoT sensor network has already been clustered by some clustering algorithm, so that each cluster has a cluster head node and several member nodes. The perceptual data of each node follow a normal distribution.

Table 1 Experiment configuration

5.2 Experimental data and metrics

We deploy a narrow-band Internet of Things network to monitor a lab environment and collect the sensed data into a relational database. The data are sent to the sensor network middleware through an NB-IoT gateway, and the middleware stores them in the IoT resource access and intelligent processing platform. The collected NB sensor data are abstracted and modeled using the semantic perception ontology. According to the lab environments, a domain knowledge base is constructed with Protégé. Then, the Jena reasoner is used to perform intelligent recognition and assessment of the lab environment in NB-IoT.

This paper proposes a general data fusion system under the narrow-band IoT architecture. The goal is to provide customizable data fusion services for different business scenarios, and to integrate and deal with the real-time data and historical data generated by multi-source heterogeneous devices in the Internet of Things. Therefore, we will analyze the real-time nature of data stream processing, mainly data throughput and response time, as well as the accuracy of historical data stream reproduction.

5.3 Resource utilization

Resource utilization is an important indicator of system performance. Ten tests were run in this experiment, each lasting about 10 min. The results of the ten tests are roughly the same, so one of them is used as an example for demonstration and analysis; its results are shown in Fig. 4. Each test includes four phases: system startup, creation of the fusion model, startup of the fusion model, and subsequent data fusion. The system state is stable after the system is started and while the fusion model is created: the number of active threads is below 30, the number of loaded classes is stable at around 3300, and the resource usage fluctuates below 5%. When the fusion model is started, the resource usage reaches a peak of about 15%; at the same time, the number of active threads increases to around 30, and the number of loaded classes jumps to around 4600. It can be concluded that starting the fusion model draws on resources from all parts of the program, which places a load on the machine. After the model is started, the system returns to a stable state, but overall resource use is higher than before. The resource usage falls back from the peak and fluctuates between 5 and 15%, a wider range than before startup. The number of active threads is stable at 31, and the number of loaded classes is stable at 4668.

Fig. 4 Resource utilization in different time

5.4 Analysis for data fusion

Data fusion technology processes the data collected by multiple sensors in an associated or combined manner in order to output a more accurate measurement of the target or to make a well-founded judgment on the trend of certain indicators of the target. So far there is no unified algorithm for data fusion, but common non-linear data processing methods with good fault tolerance and anti-interference capability can be adopted. In general, a data fusion method must offer robustness, parallel processing capability, high processing speed, and computational accuracy, and it must connect well with the data acquisition and pre-processing system and with the fusion data processing system. Choosing an appropriate fusion algorithm is therefore the core problem of a data fusion system. Although data fusion does not yet have a complete theory or a generally effective method, many mature and effective data fusion algorithms have been proposed for specific application fields.

In this paper, five fusion algorithms, including the weighted average method (WAM), the Kalman filter method (KFM), the multi-Bayesian estimation method (MBE), and rough set theory (REM), are compared in order to assess the performance of semantic perception. The experimental results are shown in Fig. 5 and Table 2. They demonstrate that our proposed architecture obtains high-precision perceptual results, especially as the number of heterogeneous nodes increases. Traditional multi-sensor data fusion handles data of the same kind effectively. However, as new characteristics emerge in the IoT, interoperable service-oriented technologies are required to share real-world data among heterogeneous devices and to integrate and fuse such multi-source heterogeneous IoT data. The IoT offers little practical benefit if it cannot integrate, fuse, and extract useful information from the data generated by a world of interconnected devices. Considerations for future IoT networks include the integration of heterogeneous data networks. Therefore, designing network layer routing protocols that take fine-grained semantic-level fusion into account is a worthwhile undertaking.

Fig. 5 Fusion rate for different number of semantic items

Table 2 Average perceptual precision for different fusion algorithms
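Of the baselines compared in this section, the weighted average method is the simplest to state. The paper does not specify how its weights are chosen, so the Python sketch below assumes inverse-variance weights, a common choice when readings are independent and normally distributed as assumed in Section 5.1. The function name, readings, and variances are hypothetical.

```python
def weighted_average(readings, variances):
    """Fuse readings of one quantity using weights proportional to 1 / variance (assumption)."""
    if len(readings) != len(variances) or not readings:
        raise ValueError("need one positive variance per reading")
    weights = [1.0 / v for v in variances]
    fused = sum(w * x for w, x in zip(weights, readings)) / sum(weights)
    fused_variance = 1.0 / sum(weights)  # never larger than the smallest input variance
    return fused, fused_variance


# Two hypothetical temperature sensors; the second is noisier.
fused, fused_variance = weighted_average([20.0, 22.0], [1.0, 4.0])
print(fused, fused_variance)
```

The fused value lies closer to the more precise sensor, and the fused variance is smaller than either input variance. The method assumes all readings measure the same kind of quantity, which is why heterogeneous data call for the semantic-level fusion discussed above.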

6 Conclusion

With its wide coverage, low power consumption, low cost, and large number of connections, narrow-band Internet of Things (NB-IoT) technology has become a key technology in Internet of Things communication. However, with a large number of terminals, the rational allocation of limited resources and heterogeneous data fusion in the system become important research topics in NB-IoT. Therefore, a multi-source heterogeneous data fusion method based on perceptual semantics in NB-IoT is proposed in this paper. First, we introduce the advantages and key technologies of NB-IoT, including the key techniques of the physical layer and the media access control layer. Then, in order to eliminate data redundancy and extend the network lifetime, we analyze the centralized and distributed modes in the NB-IoT network and propose a multi-source heterogeneous data fusion based on semantic perception to form a uniform format. Finally, an improved D-S evidence theory is adopted to perform data fusion and obtain the final fusion result. The experiments show that our proposed algorithm has a faster convergence rate and higher stability, and that its judgments on the fusion results better match actual conditions in NB-IoT.