1 Introduction

The Internet of Things (IoT) has emerged as a powerful technology that is changing modern society. IoT enlarges the horizon of possible services and applications by connecting the cyber-physical infrastructure with complex wireless systems. This connection enables remote data processing, adding more intelligence to data collection. Moreover, the growth of IoT research brings more confidence to developers, contributing to the success of IoT in many different areas, such as Industry 4.0, Smart Cities, Smart Homes, and Smart Agriculture [1,2,3,4].

IoT devices have hardware constraints and well-known battery limitations, making energy consumption a considerable concern [5]. Communication, however, dominates energy consumption, even when considering the various processing tasks that IoT devices must execute. With multi-hop communications, this becomes even more critical, as the traffic generated by any node requires multiple wireless medium accesses until it finally reaches the destination. Thus, saving energy is key to extending the expected lifetime of participating devices.

Following the typical IoT architecture, data collection and transmission between devices or between devices and a central entity represent a fundamental task. Hence, communication is intrinsic to IoT, and energy-saving alternatives must take multi-hop transmission into account. In this direction, the mobile agent (MA) paradigm emerges as an efficient data collection and transmission approach in IoT [6,7,8,9,10,11,12].

A mobile agent can be defined as a computer code dispatched by a central entity, which could be the network gateway, to the network. Network devices use this code to perform specific tasks, e.g., data fusion and aggregation. Using MAs, we can change the behavior of devices by temporarily installing different codes on them.

Thus, rather than performing data aggregation only at the gateway, the MA can save network energy by compressing raw data at each device before forwarding it [13,14,15]. This even allows the MA to discard irrelevant data, preventing its transmission through the wireless network to the gateway for remote processing. To complete the assigned tasks, the MA must visit different network devices following a predetermined itinerary computed using an optimized algorithm (itinerary planning) [16].

The MA itinerary is composed of source and intermediate nodes. We call source nodes those holding the desired information that needs to be collected and processed by the MA. Intermediate nodes, in contrast, are those used to forward the MA through the multi-hop network. The MA itinerary can be computed using either static or dynamic approaches. On the one hand, in the static approach, the central entity is in charge of the itinerary computation.

Once established, the itinerary, composed of source and intermediate nodes, is registered by the gateway in the MA's code before the agent is dispatched to the network, and the MA then uses source routing at the application level [17,18,19].

On the other hand, in the dynamic approach, the MA determines the itinerary of IoT devices hop-by-hop based on the current network status. Hence, the computation of the MA itinerary reveals a tradeoff between maintaining a global network view at the central entity and requiring additional processing power at the devices. Our proposal focuses on static itineraries to keep the network devices simple, in line with their typically constrained design. Even though a MA can provide significant energy savings, the solution has some restrictions related to delays in completing the proposed itineraries. This can be a point of attention for delay-sensitive applications, which we circumvent by assuming cache deployment at the network edges.

This paper proposes Agent-Knap, a network architecture that combines MAs with opportunistic data collection and cache deployment at the network edge to reduce the energy consumption of network devices. Our proposal relies on static itineraries composed of source and intermediate nodes, and on a selection mechanism that prioritizes data acquisition based on a pre-computed data utility. The Traditional MA (TMA) migrates from the current source node to the next, ignoring the data available on intermediate nodes. These intermediate devices, however, may contain relevant data that network clients may request soon. Thus, unlike previous proposals [20,21,22,23], in which a static itinerary is computed for data collection at source nodes only, our proposal also considers opportunistic data collection on intermediate nodes within the MA itinerary. This proactive approach aims to save energy at IoT devices by reducing the number of MA dispatching rounds. In addition, Agent-Knap models the prioritization of opportunistic data as a 0-1 knapsack problem. This model computes the knapsack reward from weights inversely proportional to the freshness of the data stored in the cache located at the network gateway. Agent-Knap considers the MA payload size as the maximum knapsack capacity.

We compare Agent-Knap with the TMA, i.e., a MA without opportunistic data collection at intermediate nodes. Also, we improve a previous version of Agent-Knap by introducing the possibility of data aggregation [24]. We observe that the energy consumed for MA transmission also becomes significant beyond a specific payload size. Simulation results reveal a substantial reduction in energy consumption and network traffic. We summarize our main contributions as follows:

  • We propose Agent-Knap, an approach to collect data from IoT networks based on mobile agents (MAs).

  • We propose the use of opportunistic data gathering to reduce energy consumption by proactively collecting data at nodes located on the MA itinerary.

  • We propose a data aggregation approach to further improve energy savings by reducing the MA size and, consequently, the energy consumed for each MA transmission.

This paper is organized as follows. Section 2 overviews the related work. Section 3 presents the proposed data collection mechanism, Agent-Knap, and its data aggregation improvement. In the following, Sect. 4 describes the simulation environment and the results achieved. Finally, Sect. 5 concludes this work and presents future directions.

2 Related work

Efficient data collection in sensor and IoT networks has been the subject of many recent works [7, 8, 11, 21, 25], in which authors seek new approaches to reduce energy consumption and network traffic. Gavalas et al. [26] propose a static itinerary mechanism for multiple mobile agents. They primarily consider energy efficiency influenced by the increase in MA size as it moves forward and aggregates data from network nodes. Unlike Agent-Knap, Gavalas et al. do not evaluate data collection on intermediate devices while the MA moves along the itinerary. As far as we know, our work advances the state of the art by using opportunistic data collection with MAs.

2.1 Network edge caching

The use of caches in IoT systems is a possible strategy to reduce the energy consumption of network devices [27,28,29,30,31,32]. Caches introduce more memory resources at the network edges and can reduce the response time, as they enable temporary storage of collected data. For example, an IoT gateway can store the data collected by local sensors. In addition to preventing new requests from being directly sent to sensor nodes, this strategy allows quicker response times for Internet clients. It is worth noting that the data gathered by devices may not vary drastically in the short term. Real IoT deployments may work well with data governed by different expiration times, which can be minutes, hours, or even days, depending on the application [33].

Zhou et al. [34] propose to store the collected data at the gateway cache. The main goal is to use the cached data to reply to upcoming requests for the same data, avoiding energy consumption at the sensor network and, at the same time, reducing the response time. The work also considers that some sub-regions of the area of interest (AoI) are more likely to have requested information than others. Thus, the authors develop a mechanism to proactively collect data from popular sub-regions. The authors identify that frequent requests can impose high communication costs that may exceed the network capacity. Thus, the authors argue that the most popular content, among all recently requested, is more likely to be requested again soon. The proposal collects the most popular data by periodically sending requests from the gateway to the sensors. From simulations, the authors show that the proposed mechanism performs well, mainly considering the number of cache hits and the energy consumed in the network.

The fundamental difference between our proposal and that of Zhou et al. [34] is that we consider opportunistic data collection. In our proposal, the MA collects unsolicited data from the network that may shortly become relevant. Zhou et al. [34] deploy proactive data collection that requires the transmission of additional messages throughout the network, which leads to increased network traffic and energy consumption. Our proposal, instead, conducts opportunistic data updates from the sensor network using MAs, achieving proactive data collection without additional request messages.

Other proposals rely on attributes like content popularity to estimate the request probability in the future. The goal is to anticipate which content should be stored in a centralized cache. Wei et al. [31] consider a caching scenario at the IoT network edge. The network comprises mobile devices and content servers positioned at the network edge. These servers aim to make content available to IoT devices with lower latency and minimize the traffic sent from devices to the cloud infrastructure. The work proposes a caching policy called SAPoC that considers, in addition to popularity, the concept of content similarity with other content previously requested by users. The main goal is to reduce the slow-start phenomenon incurred by existing history-based caching strategies. The proposal focuses on dynamic systems where devices arrive and leave the network over time. When a device requests new content, its popularity tends to be high if it holds high similarity with other cached popular content. This strategy helps predict future content popularity, contributing to proactive caching. The authors obtained high performance when comparing the proposal with other cache policies.

Our proposal follows a different strategy that does not rely on popularity estimation, since it requires high CPU processing at the edge nodes. Instead, Agent-Knap adopts a simple priority computation based on content utility for its cache strategy.

3 Data gathering using Agent-Knap

This paper proposes Agent-Knap, which can operate in two modes: with and without data aggregation. We consider a network composed of one gateway and multiple IoT devices randomly deployed in an area of interest (AoI). Each device has a set of sensors, and each sensor collects a specific data type, or content, from the environment (e.g., temperature, pressure, or vibration). Also, each device stores only the last sample collected by each of its sensors. IoT devices can be source nodes, providing the requested data, or intermediate nodes, interconnecting consecutive source nodes on the MA itinerary. The proposed architecture is centralized at the gateway, our central entity, which runs itinerary planning and cache management functions besides processing all requests from and responses to network clients.

The network operation relies on data requests from external clients sent to the gateway. Each request has information about the desired content and the desired AoI. Depending on the data availability and freshness in the cache, the gateway decides whether a new data-gathering round is needed, i.e., if it needs to dispatch a new MA.

At system initialization, each device must register with the gateway and inform the list of contents it offers, i.e., the services it provides and the size in bytes of each collected data type. After the system initialization, we assume that the gateway has information about the network topology, including the geographical position of all devices, which is essential to compute the MA itinerary. We also assume that nodes exchange vicinity discovery messages at the system bootstrap.

If, however, we add dynamics during network operation, the needed information for MA itinerary computation and node vicinity could be maintained with typical updates provided by wireless routing protocols. Assuming that the devices do not move and their corresponding sensors do not change after the system initialization, topology-control information from proactive routing protocols, e.g., Optimized Link State Routing (OLSR) and Routing Protocol for Low Power and Lossy Networks (RPL), would be enough. Hence, during network operation, we would not add any extra overhead.

In summary, upon receiving a data request from an external client, the gateway decides whether a new data collection round is needed. If this is the case, the gateway dispatches a MA that follows a predetermined itinerary containing all sensors of interest, named source nodes. These nodes provide the requested data and may not be neighbors in the network topology. The MA can either concatenate the data collected from the multiple source nodes or execute a data aggregation mechanism using the data collected hop-by-hop as input. Hence, for example, assuming that a client requests the temperature of an AoI and this data is not fully available in the cache, the gateway dispatches a MA to collect the missing data. The MA can either concatenate all temperature readings collected at source nodes along the itinerary or merge all readings into one representative temperature of the entire AoI. Figure 1 depicts a traditional data collection using MAs without data aggregation. The gateway dispatches a MA in red that concatenates all the data collected at the predetermined source nodes, also in red, along the computed itinerary. Note that the MA size grows as more data is put together while it moves. At the end, in Fig. 1, the MA has a payload of size 4, which is the number of source nodes on the itinerary. The following sections detail the main features of Agent-Knap, including the proposed opportunistic data collection.

Fig. 1 Traditional data collection using MA without aggregation

3.1 Source node selection

When the gateway receives a client request, it first conducts a cache lookup for updated data. The idea is to provide a fast and complete response to the client. We assume that a complete response comprises unexpired content of a specific type from multiple devices covering the entire AoI. If the cache does not provide a complete response, the gateway starts a data collection by dispatching a MA. The gateway must collect fresh data from IoT devices to complete the non-expired information in the cache. These particular devices are referred to as source nodes, colored in red in Fig. 1.

In each round, the selection of source nodes determines the group of devices the MA must visit to complete the data in the cache. The group of source nodes is selected, taking into account the best possible AoI coverage, which guarantees a complete response regarding the data requested by the client. Hence, considering Fig. 1, the selection of source nodes determines that the devices in red are enough to provide a complete view of the entire AoI.

We model the source node selection mechanism as a classic coverage problem, the Weighted Set Cover. In our model, each sensor \(n_i\in \mathcal {N}\) is associated with a coverage subarea. A weight \(w_{n_i}\) is also associated with each sensor to assign priority in the selection process. The freshness related to each content stored at the gateway cache determines \(w_{n_i}\). The older the data of a given sensor stored in the cache, the higher the priority in the Weighted Set Cover problem.

In terms of complexity, the Weighted Set Cover problem is NP-hard. The greedy heuristic used in our proposal runs in polynomial time and achieves an \(O(\log n)\) approximation ratio.
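To make this concrete, the sketch below shows one possible greedy implementation of the weighted selection, assuming each sensor's coverage subarea is modeled as a set of AoI grid cells and the age of each sensor's cached data acts as its priority (older data, higher priority); names and data structures are illustrative, not the exact gateway implementation.

def select_source_nodes(coverage, cache_age, aoi_cells):
    """Greedy Weighted Set Cover: pick sensors until the AoI is covered.

    coverage[n]  -- set of AoI grid cells covered by sensor n (illustrative model)
    cache_age[n] -- age of n's data in the gateway cache; older data => higher priority
    aoi_cells    -- set of all grid cells forming the area of interest
    """
    uncovered = set(aoi_cells)
    selected = []
    while uncovered:
        best, best_score = None, 0.0
        for n, cells in coverage.items():
            if n in selected:
                continue
            gain = len(cells & uncovered)          # newly covered cells
            if gain == 0:
                continue
            # Weight favors sensors whose cached data is older (stale data first).
            score = gain * cache_age[n]
            if score > best_score:
                best, best_score = n, score
        if best is None:                           # remaining cells cannot be covered
            break
        selected.append(best)
        uncovered -= coverage[best]
    return selected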

After selecting the source nodes, the gateway can determine the correct itinerary \(\mathcal {I}\) for the MA.

3.2 Itinerary planning

The gateway must compute the itinerary, the sequence of source nodes to be visited, and those connecting them (intermediate nodes), before sending the MA to the network. The itinerary forms a closed loop starting at the gateway. The loop must visit all source nodes the gateway selects, as described in Sect. 3.1.

We assume that each device has a unique identifier on the local WSN and that the identifiers of all devices on the MA itinerary are inserted into the MA's code structure. Each device discovers its directly connected neighbors at system bootstrap and saves their identifiers. In summary, the MA forwarding process is carried out by performing the following steps (a sketch follows the list):

  • Upon receiving a MA, each device verifies the next node in the MA itinerary before forwarding it. The next node can be either a source or an intermediate node.

  • Before forwarding the MA, the device processes its code to perform the programmed tasks (data concatenation or aggregation in our case).

  • When transmitting the MA over lossy links, the transport layer manages the transmission reliability. We assume this can be done using TCP, for instance.
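The sketch below illustrates these steps from the point of view of a single device; it assumes the MA is a plain in-memory object whose process() method encapsulates the embedded code (concatenation or aggregation) and that a send_reliable() primitive provides the hop-by-hop transport reliability mentioned above (e.g., over TCP). All names are illustrative assumptions.

def handle_mobile_agent(device, ma):
    """Executed by a device upon receiving the mobile agent (MA)."""
    # 1. Look up this device in the itinerary and identify the next hop,
    #    which can be either a source or an intermediate node.
    pos = ma.itinerary.index(device.node_id)
    if pos == len(ma.itinerary) - 1:
        return                       # closed loop finished: the MA is back at the gateway
    next_id = ma.itinerary[pos + 1]

    # 2. Before forwarding, run the MA's embedded code (concatenation or aggregation).
    ma.process(device)

    # 3. Forward the MA; reliability is handled hop-by-hop at the transport layer.
    device.send_reliable(next_id, ma)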

Let \(\mathcal {N}\) be the set of IoT devices. The gateway starts the itinerary computation using the set of source nodes needed for the next gathering process, i.e., the itinerary computation uses \(\mathcal {N}_{tsp}\subseteq \mathcal {N}\) as input. In this step, we use both the Christofides and Dijkstra algorithms. The first, the Christofides algorithm, is a heuristic for the Traveling Salesman Problem (TSP), which is the problem tackled for MA itinerary computation. Its output is the sequence of source nodes that must be visited. Because these nodes may not be directly connected, we use the Dijkstra algorithm to determine the complete itinerary by computing the shortest path between consecutive source nodes.

The Christofides algorithm is a TSP heuristic that, in the worst case, guarantees a solution at most 3/2 times the cost of the optimal one [35]. The heuristic used in our proposal has time complexity \(O(n^3)\). Thus, the complete itinerary is composed of two sets: \(\mathcal {N}_{tsp}\), with all needed source nodes, and \(\mathcal {N}_{sp}\subseteq \mathcal {N}\), containing all intermediate nodes included by the Dijkstra algorithm. Note that \(\mathcal {N}_{tsp}\cap \mathcal {N}_{sp}=\emptyset \).
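A minimal sketch of this two-stage computation, written with the NetworkX package used in our simulator (Sect. 4) and assuming a recent version that exposes the christofides routine, is shown below; it is illustrative rather than the exact gateway code.

import networkx as nx
from networkx.algorithms.approximation import christofides


def plan_itinerary(G, gateway, source_nodes):
    """Compute the MA itinerary: Christofides over the source nodes (plus the
    gateway), then Dijkstra shortest paths between consecutive tour nodes."""
    stops = [gateway] + list(source_nodes)

    # Build the complete "metric" graph over the stops, weighted by
    # shortest-path distances in the real topology.
    K = nx.Graph()
    for i, u in enumerate(stops):
        for v in stops[i + 1:]:
            K.add_edge(u, v, weight=nx.shortest_path_length(G, u, v, weight="weight"))

    tour = christofides(K, weight="weight")       # 3/2-approximate TSP tour
    if tour[0] != tour[-1]:
        tour.append(tour[0])                      # make sure the loop is closed

    # Rotate the tour so that it starts (and ends) at the gateway.
    start = tour.index(gateway)
    tour = tour[start:-1] + tour[:start] + [gateway]

    # Expand each leg with Dijkstra, inserting the intermediate nodes.
    itinerary = [gateway]
    for u, v in zip(tour, tour[1:]):
        itinerary += nx.shortest_path(G, u, v, weight="weight")[1:]
    return itinerary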

3.3 Opportunistic data gathering

After computing the itinerary, the gateway must decide on the payload, filling it with data obtained only from source nodes or with data from source and intermediate nodes. Hence, the main challenge is to manage the MA payload.

The MA packet format is divided into four main parts: the MA ID, for unique identification of the mobile agent; the itinerary; the processing code, for data manipulation; and the payload, reserved for the data collected and possibly processed by the MA. The payload has a maximum size (P) in bytes and is further divided into two parts: a fixed size reserved for the data collected from source nodes (G) and a size (C) used to carry the data collected from intermediate nodes. The size of C depends on the number of intermediate nodes visited, the size of each collected data, and whether data aggregation is enabled. Even though C is variable, it is upper bounded by a maximum value determined by the gateway.

The fixed size G carries the guaranteed data from the source nodes, whereas the size C carries all sorts of data from intermediate nodes visited in the itinerary. Hence, we call this data collection opportunistic because it may be used to collect important data that was not requested at the current MA round but can be requested soon.

Each IoT device in the set \(\mathcal {N}\) contains distinct contents its sensors collect. Hence, each IoT device, \(n_i\in \mathcal {N}\), has a subset of all contents provided in the network. Let \(\mathcal {K}\) be the set of contents available in the network, then each content type \(k_j\in \mathcal {K}\) has an associated size \(s_j\) in bytes. We denote the content \(k_j\) collected at node \(n_i\) as \(k^i_j\), where \(i,j\in \mathbb {N}\) are the indexes of the devices and the available content types, respectively. Figure 2 details the format proposed for the payload. Considering the example, G is determined by the content \(k_1\) collected at four different devices, with IDs 2, 3, 8, and 12. Hence, G contains \(k^2_1\), \(k^3_1\), \(k^8_1\), and \(k^{12}_1\). C, on the other hand, is determined by four different data types, \(k_3\), \(k_5\), \(k_2\), and \(k_4\) opportunistically collected at four different intermediate devices, with IDs 1, 7, 8, and 12, i.e., \(k^1_3\), \(k^7_5\), \(k^8_2\), and \(k^{12}_4\).

Note that the guaranteed data must be of the same content, \(k_1\) in Fig. 2, requested by the client node. This is a consequence of our assumption that clients can only request data of one desirable content at each round. The opportunistic data, however, can be of any type. Also, we assume that the same data type has the same size and format.
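The sketch below captures this packet structure as a small Python data type; the field names and the keying of payload entries by (device ID, content type) are illustrative assumptions rather than the actual wire format.

from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class MobileAgent:
    """Illustrative in-memory view of the MA packet format."""
    ma_id: int                                   # unique MA identification
    itinerary: List[int]                         # source and intermediate node IDs
    code: Callable                               # processing code (concatenate/aggregate)
    max_payload: int                             # P, in bytes
    guaranteed_budget: int                       # G, reserved for source-node data
    # Payload entries keyed by (device ID, content type), e.g. (8, "k1") -> value.
    guaranteed: Dict[Tuple[int, str], float] = field(default_factory=dict)
    opportunistic: Dict[Tuple[int, str], float] = field(default_factory=dict)

    @property
    def opportunistic_budget(self) -> int:
        """C is upper bounded by P minus the room reserved for guaranteed data."""
        return self.max_payload - self.guaranteed_budget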

Fig. 2 MA payload P with guaranteed (G) and opportunistic data (C)

To fill C with data from intermediate nodes in \(\mathcal {N}_{sp}\), the gateway models the problem as a knapsack problem. The knapsack problem solved by the gateway must consider the size C as the “backpack” capacity used to select the intermediate nodes that have the data opportunistically collected. The MA adds all the data collected to the payload without violating the maximum C size. Nevertheless, filling the payload with opportunistic data depends on whether the MA aggregates data.

The Knapsack problem is NP-complete [36]. Hence, the algorithm used in our proposal is an approximate solution with pseudo-polynomial time complexity \(O(|\mathcal {L}| \cdot C)\), where \(|\mathcal {L}|\) is the number of candidate data items and \(C\) is the opportunistic payload capacity.

The following section (Sect. 3.4) explains data aggregation in more detail.

Fig. 3 Data aggregation using the proposed Agent-Knap

3.4 Data aggregation

  In Agent-Knap, when data aggregation is enabled, all data samples of the same type are aggregated. Data samples of different types can also be collected, but instead of being aggregated, they are concatenated at the payload. For example, if a temperature request is received and a MA is dispatched to collect samples, pressure and humidity data samples can be opportunistically collected at the same round. The MA payload concatenates the aggregated temperature with the aggregated pressure and the aggregated humidity data. For the sake of simplicity, our proposal considers data aggregation with a factor of 1. Thus, same-type data aggregation does not increase the payload size used by the MA.

Figure 3a depicts an example of MA collection with data aggregation. Data samples of the same type are always aggregated in the payload, whereas aggregated data of different types are concatenated. The knapsack problem solved at the gateway must consider the data aggregation process. Hence, if the MA uses data aggregation, its initial size does not change if the samples collected are of the same type. Figure 3b illustrates the proposed data-gathering process without aggregation. Four same-type data samples collected at different source nodes in G and four different-type data samples opportunistically collected at intermediate nodes in C fill the MA payload. In this case, the payload size increases with every data sample collected at network devices.

In our implementation, the gateway must indicate whether data aggregation is active using a flag. Hence, devices on the itinerary can proceed with data aggregation or not. If data aggregation is enabled, the gateway considers the different data types available at the intermediate nodes as items for the knapsack problem. With data aggregation disabled, all data samples available at intermediate nodes in the itinerary are considered items for the knapsack problem.
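The sketch below shows how a visited device could add one sample to the payload under these rules, assuming an aggregation factor of 1 implemented as a running mean per content type; the function and entry layout are hypothetical.

def add_sample(payload, content_type, value, size, aggregate):
    """Add one collected sample to the MA payload (a list of entries).

    Each entry is [content_type, value, size_bytes, n_samples_merged].
    """
    if aggregate:
        for entry in payload:
            if entry[0] == content_type:
                # Same-type data is merged (running mean): payload size unchanged.
                n = entry[3]
                entry[1] = (entry[1] * n + value) / (n + 1)
                entry[3] = n + 1
                return
    # Aggregation disabled, or first sample of this type: concatenate a new entry.
    payload.append([content_type, value, size, 1])


def payload_size(payload):
    """Current number of bytes occupied by the collected data."""
    return sum(entry[2] for entry in payload)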

Regarding the system model, the main goal of our proposal is to save energy of IoT devices when a multi-hop MA is used to collect data. The data collected at each device is stored in the gateway cache and remains usable while its expiration timer is valid. Each stored data item has a timer and, while valid, avoids a new data collection round. Thus, we save network resources by reducing the number of requests sent to the network. Opportunistic data gathering plays a vital role in this process, as it brings more data back to the gateway, increasing data freshness.

3.5 Payload computation

Let \(A_{cache}^{k_j}\) be the corresponding area covered with non-expired data stored in the cache for content \(k_j\). The proposed Agent-Knap solves a 0-1 knapsack problem in which the solution provides the data contents the MA must collect on each intermediate device along the itinerary. The parameters to solve the knapsack problem are the following:

  • The list \(\mathcal {L}\) of all data contents that are available at the intermediate nodes of the computed itinerary

  • The priority \(p_{k_j}\) computed for the content \(k_j\)

  • The size \(s_j\) in bytes of each content type

  • The MA opportunistic content C size in bytes

For the knapsack problem, the priority computed for a specific content is numerically proportional to the contribution it provides to complete \(A_{cache}^{k_j}\) in the gateway. Thus, the lower the intersection between \(A_{cache}^{k_j}\) and the total AoI, the higher the priority for \(k_j\) in the current gathering round. Equation 1 presents the priority \(p_{k_j}\) computation for a given content \(k_j\).

$$\begin{aligned} p_{k_j} = AoI - \left( AoI \cap A_{cache}^{k_j}\right) \end{aligned}$$
(1)

The payload computation procedure determines the opportunistic content to be collected and the corresponding devices along the itinerary, i.e., the devices providing that content. This ensures that the MA payload is populated with the most valuable content provided by the devices on the itinerary \(\mathcal {I}\) without violating C.
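A minimal sketch of this payload computation is given below: the priority follows Eq. 1, expressed over illustrative area values, and a standard 0-1 knapsack dynamic program selects the opportunistic items that fit within C; the item representation is an assumption made for the example.

def priority(aoi_area, cached_area_overlap):
    """Eq. 1: p_kj = AoI - (AoI intersected with A_cache^kj), expressed as areas."""
    return aoi_area - cached_area_overlap


def select_opportunistic(items, capacity_c):
    """0-1 knapsack over the candidate items available at intermediate nodes.

    items      -- list of (device_id, content_type, size_bytes, priority)
    capacity_c -- C, the payload room reserved for opportunistic data (bytes)
    Returns the chosen items, maximizing total priority without exceeding C.
    """
    n = len(items)
    # best[i][c]: maximum priority using the first i items with capacity c.
    best = [[0.0] * (capacity_c + 1) for _ in range(n + 1)]
    for i, (_, _, size, prio) in enumerate(items, start=1):
        for c in range(capacity_c + 1):
            best[i][c] = best[i - 1][c]
            if size <= c:
                best[i][c] = max(best[i][c], best[i - 1][c - size] + prio)

    # Backtrack to recover the selected items.
    chosen, c = [], capacity_c
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(items[i - 1])
            c -= items[i - 1][2]
    return chosen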

3.6 Cache update

In Agent-Knap, the gateway cache is always updated with data collected by the MA. The cache has enough memory to store the data produced by all sensors from all devices within the AoI. Nevertheless, when data aggregation is enabled, the MA merges same-type data from different sensors to calculate a single value, which can be the mean, maximum, or smallest value, depending on the aggregation approach used by the MA. In this case, the gateway loses individual measurements as the MA does not transfer them along the itinerary.

Hence, when aggregation is enabled, the data stored in the gateway's cache for each sensor does not match the real value measured by the device. Instead, it is the aggregated value of all nodes in the itinerary. Thus, there is an error between the real value and the value stored in the cache when data aggregation is on. Nevertheless, when data aggregation is disabled, the MA maintains the exact value measured by each sensor, keeping it in the gateway's cache. There is then a clear tradeoff between MA size and measurement precision when using data aggregation.

All timing information is provided at the gateway to avoid synchronization issues between the gateway and the IoT devices. Thus, Agent-Knap does not require synchronization, as caching updates are exclusively handled by the gateway. Therefore, the gateway updates the cache and the corresponding timing information upon the MA arrival for each collected sample. Timing information could also be produced at the devices, providing even more precision (or accuracy) to the system. This, however, would require a more complex system design for time synchronization across the entire network, which is not our goal in this paper.
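The sketch below illustrates this gateway-side cache handling, assuming per-entry timestamps taken when the MA returns and a single expiration time shared by all contents; class and field names are illustrative.

import time


class GatewayCache:
    """Gateway cache: timing is handled only at the gateway, so no network-wide
    time synchronization is required."""

    def __init__(self, expiration_time):
        self.expiration_time = expiration_time
        self.entries = {}                      # (device_id, content_type) -> (value, t)

    def update(self, device_id, content_type, value, now=None):
        """Called for each collected sample when the MA arrives back at the gateway."""
        self.entries[(device_id, content_type)] = (value, now or time.time())

    def lookup(self, device_id, content_type, now=None):
        """Return the cached value if it has not expired, otherwise None."""
        entry = self.entries.get((device_id, content_type))
        if entry is None:
            return None
        value, stamp = entry
        if (now or time.time()) - stamp > self.expiration_time:
            return None                        # expired: a new gathering round is needed
        return value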

4 Simulation results and discussions

In this section, we evaluate the performance of the proposed Agent-Knap using simulations. We have implemented a simulator using Python and the NetworkX package. The algorithms used in our proposal to solve Knapsack, Weighted Set Cover, and Traveling Salesman problems are polynomial-time approximate algorithms.

4.1 Simulation setup

The topology consists of fixed devices randomly positioned in a geographical area. The number of devices is always enough to guarantee complete coverage of our AoI. Each device has a fixed number of sensors and, consequently, can collect different contents from the AoI.

We simplify the analysis by considering a few assumptions. First, we consider that all devices have the same initial battery level. We also assume that the different content types have the same size, that all network devices have four different sensors (one sensor for each type of content), and that the number of different contents available in the network is four. Finally, the sensing range of each sensor is the same for the entire network.

In this simulation, we represent a remote monitoring application that can benefit from analyzing historical data collected from sensing devices. Considering wide and flat areas, this strategy would fit scenarios such as agriculture, where data sensing is useful for irrigation, soil, and nutrient management. This scenario requires devices spread across greenhouses or plantations to collect data such as humidity, temperature, light, and soil characteristics [4].

Our application considers Internet clients sending request messages to the gateway. Such requests arrive according to a Poisson process with rate \(\lambda =5\), i.e., with exponentially distributed inter-arrival times of mean \(1/\lambda \). In our plots, the expiration time is a multiple of \(1/\lambda \): we start with an expiration time 30 times larger than \(1/\lambda \) and increase it until it becomes 180 times greater than \(1/\lambda \).
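The snippet below sketches how this workload can be reproduced, assuming Python's random.expovariate to draw exponential inter-arrival times (a Poisson arrival process with rate lambda = 5) and expiration times expressed as multiples of 1/lambda; the step of 30 between multiples is an illustrative choice, not necessarily the exact set of values simulated.

import random

LAMBDA = 5.0                                   # request arrival rate
EXPIRATION_MULTIPLES = range(30, 181, 30)      # illustrative steps from 30/lambda to 180/lambda


def generate_requests(sim_duration, lam=LAMBDA):
    """Poisson arrivals: exponential inter-arrival times with mean 1/lambda."""
    t, arrivals = 0.0, []
    while True:
        t += random.expovariate(lam)
        if t > sim_duration:
            return arrivals
        arrivals.append(t)


# Example usage: one simulation run of total duration 400/lambda (Sect. 4.1).
requests = generate_requests(400 / LAMBDA)
expiration_times = [m / LAMBDA for m in EXPIRATION_MULTIPLES]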

In addition, the link cost between devices is proportional to the Euclidean distance between them. Each device can communicate with neighbors inside its communication range, and the network gateway has a fixed position at the center of the AoI. Table 1 shows the parameters used in all simulations. The energy consumption values are the same as those adopted in [37] to evaluate Tmote Sky sensors.

Fig. 4 Average remaining energy for each device. Experiments consider a 180-device network, enabled and disabled data aggregation, and 60- and 200-byte payload (P) sizes

In addition, the required MA size calls for networking technologies that support larger packet sizes; in this work, we assume packets of up to 1,024 bytes. Note that the selected MA size is based on the related literature [15, 26].

Table 1 Simulation parameters

In all simulations, we compare our proposal with the Traditional MA (TMA) and the Client–Server multi-hop approach. TMA collects data only from the subset of source nodes needed to cover the AoI completely, i.e., the source nodes whose data is no longer valid at the gateway cache. Conversely, Agent-Knap also opportunistically collects the available data of different content types at intermediate devices. In the Client–Server approach, the MA collects data from sensor nodes in a traditional multi-hop routing fashion: in every round, the gateway dispatches a MA to a single source node and waits for the requested data, repeating the procedure until all source nodes for that round are covered. The simulations compute three different metrics: energy consumption, cache hits, and the accuracy of the collected data. This last metric is critical to evaluate the tradeoff of aggregating the collected data: if we aggregate the content, we save energy at the cost of data accuracy.

Fig. 5 Average remaining energy for each device. Experiments consider a 300-device network, enabled and disabled data aggregation, and 60- and 200-byte payload (P) sizes

Fig. 6 Proportion of cache hits. Experiments consider a 180-device network, enabled and disabled data aggregation, and 60- and 200-byte payload (P) sizes

All simulations are performed for a total period of \(400/\lambda \) and consider two different aggregation factors for the MA, zero and one. Thus, the MA either executes complete data aggregation of the same content type along the itinerary (factor one) or simple content concatenation (factor zero). When the MA conducts complete aggregation, there is no increase in the payload size occupied by collected data of the same type. We consider that same-type data can always be aggregated.

For each simulation round, we perform ten runs with a confidence level of 95%. We additionally analyze the impact of different MA payload sizes, i.e., two different P sizes of 60 and 200 bytes.

4.2 Energy consumption

In this simulation, we compute the average remaining energy of all nodes after each round. Both data-collecting approaches, with and without aggregation, are considered for TMA and Agent-Knap. Figure 4 shows the relationship between the average device remaining energy and the expiration time of data stored in the cache (expressed as a multiple of \(1/\lambda \)), when the network has 180 devices. Similarly, Fig. 5 shows the same result when the network has 300 devices. In all cases, the remaining energy increases with data persistence at the cache. Moreover, it is clear that data aggregation is an essential feature of Agent-Knap. The average remaining energy is lower for 180 nodes than for 300 nodes because each device receives more MA visits. Even though P has a subtle impact, it is possible to note that the remaining energy is higher for 200 bytes, as fewer MA rounds are needed when the data of interest is more persistent in the cache. Our results show that this is enough for Agent-Knap to consume less energy for larger values of P compared with the Client–Server approach and TMA, even without data aggregation.

In all cases, we also observe that data gathering with the multi-hop Client–Server approach had the lowest remaining energy compared with a MA dispatched through a pre-defined itinerary. We note that the Client–Server approach incurs, on average, more hops for a dispatched MA to reach all source nodes and return with the updated data to the gateway.

Data aggregation brings more gains because the MA size grows at a lower rate as the MA moves through the itinerary. When the aggregation is enabled, the MA size grows only when data of different content types are collected.

4.3 Cache hits

The cache-hit evaluation aims to verify the performance of the cache infrastructure fixed in the gateway, including scenarios where data aggregation is enabled or disabled. A cache hit happens when a data request is replied to by the gateway with the information available in its cache, i.e., without dispatching a MA. For example, considering a request for the average temperature on a specific AoI, one cache-hit is computed if the non-expired data in the cache is enough to cover all the AoI. In this case, the gateway can send a complete response to the client. Similarly to the previous section, we conduct analysis using 60- and 200-byte payload sizes.

Fig. 7 Proportion of cache hits. Experiments consider a 300-device network, enabled and disabled data aggregation, and 60- and 200-byte payload (P) sizes

Figure 6 presents the results for 180 nodes. With Agent-Knap, a larger payload results in more cache hits, especially when data aggregation is enabled. Thus, Agent-Knap performs better with a larger payload (P).

When data aggregation is enabled, the MA can retrieve more data in each round than when the MA does not use data aggregation. This happens because the knapsack algorithm has a maximum payload size to determine which nodes belonging to the itinerary will be selected for the data-gathering process. Using data aggregation, more nodes can have their data collected by the MA because aggregated data does not increase the payload size.

Agent-Knap with enabled data aggregation and 200-byte payload size has the best result. This combination achieves 227% more cache hits than TMA with data aggregation. Figure 7 shows that the proportion of cache hits increases with more network devices. This occurs because with more nodes, more redundant data is available.

4.4 Data accuracy

We compute the root mean square deviation (RMSD) of the data gathered by the MA to evaluate the impact of data freshness at the cache and the tradeoff between data aggregation and data accuracy. We compare both data-gathering methods, TMA and Agent-Knap.

The RMSD is calculated per round after the MA returns to the gateway with fresh data, according to Eq. 2.

$$\begin{aligned} RMSD = \sqrt{\frac{1}{M}\sum _{i=1}^{M}\delta _{i}^{2}}. \end{aligned}$$
(2)

We compare valid cached data, i.e., non-expired data at the gateway's cache, with the real values registered by all sensors in the AoI, \(X_{i}\). In our proposal, the gateway always replies to an Internet client request with an aggregation of valid data. For example, if the client requests the average temperature in an AoI, the gateway first verifies whether the valid cached data covers the entire AoI. If so, the gateway immediately sends the response to the client with the calculated average of the valid cached data, \(\overline{x_{i}}\). If not, the gateway completes the information with fresh data collected by the MA and then sends the response.

The RMSD computed in every round compares the real values registered by the network sensors for a specific content in the field, \(X_{i}\), with the aggregated value computed and sent by the gateway to the client as a response. Thus, in Eq. 2, \(M\) is the number of valid cached data items in the gateway for content \(k_j\), and \(\delta _{i} = X_{i} - \overline{x_{i}}\).
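For reference, the per-round RMSD of Eq. 2 can be computed as in the short function below, where real_values holds the \(X_{i}\) of the M valid cached items and cached_response is the aggregated value the gateway returns to the client; this is a sketch consistent with the definitions above.

import math


def rmsd(real_values, cached_response):
    """Eq. 2: root mean square deviation between the real sensor readings X_i
    and the aggregated value the gateway sends to the client."""
    m = len(real_values)
    return math.sqrt(sum((x - cached_response) ** 2 for x in real_values) / m)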

Data accuracy results for simulations with 180 devices are displayed in Fig. 8 for payload sizes of 60 and 200 bytes, with and without data aggregation. Figure 9 shows the same results for 300 devices. In the simulation setup, each of the four sensors in each device measures a fixed value. These simulated values were randomly generated between 1 and 30 at system start-up.

Fig. 8 Root mean square deviation (RMSD). Experiments consider a 180-device network, enabled and disabled data aggregation, and 60- and 200-byte payload (P) sizes

Fig. 9 Root mean square deviation (RMSD). Experiments consider a 300-device network, enabled and disabled data aggregation, and 60- and 200-byte payload (P) sizes

In all scenarios, Agent-Knap maintains a higher data accuracy. This is because opportunistic data gathering improves data freshness at the gateway cache. The higher number of valid samples in the cache results in a lower error in the answer sent to the client, compared with the TMA method.

Fig. 10 Average remaining energy for each device. Experiments consider a 300-device network in a lossy scenario and 200-byte payload (P) sizes

Also, an error reduction is expected when data aggregation is disabled since data aggregation has an intrinsic error.

We observed that Agent-Knap reduces the RMSD when data aggregation is disabled. For both network sizes (180 and 300 nodes) and payload sizes (60 and 200 bytes), the error reduction is evident when we use Agent-Knap with aggregation. This reduction indicates that the data freshness provided by Agent-Knap also contributes to data accuracy at the gateway cache. The results using Agent-Knap reveal a tradeoff between data accuracy and energy consumption. Hence, enabling data aggregation depends on the IoT application, i.e., on whether it tolerates lower accuracy in exchange for lower energy consumption.

4.5 Lossy network scenario

In this simulation, we compute the average remaining energy of all nodes after each round in a lossy network. We introduced a uniform packet error rate of \(10\%\) on all network links.

Hence, if a MA loss occurs, the recovery is conducted on a hop-by-hop basis at the transport level. Note that the application inserts the MA itinerary into the MA code, as described in Sect. 3.2.

We compare the average remaining energy of TMA and Agent-Knap on both lossy and lossless scenarios. All simulations were performed on the 300 devices network topology with MA carrying 200-byte payload (P) sizes. Figure 10 shows the robustness of the Agent-Knap solution even in the presence of transmission failures. The best performance of Agent-Knap appears for the lowest expiration time, which corresponds to the scenario with the largest number of MA transmissions on the network.

5 Conclusion

We proposed Agent-Knap, a new mechanism for data collection using mobile agents with static itineraries in IoT networks. Agent-Knap improves network communication efficiency by reducing the number of requests sent to the network. To accomplish that, we proposed using opportunistic data gathering for proactive caching updates. Our simulation results have shown the impact of the proposed data collection mechanism in reducing the energy consumption of network devices. In addition, Agent-Knap improves data accuracy, especially when data aggregation is disabled.

The association of the knapsack algorithm with the device selection mechanism allowed the implementation of an intelligent data-gathering process, prioritizing the items of interest. Data aggregation improves energy savings but shows a tradeoff concerning data accuracy for Agent-Knap that must be evaluated depending on the application.

In future work, we plan to dynamically change the payload size reserved for opportunistic data and include a selection process for source nodes that considers upper-layer QoS requirements. We also plan to implement the Agent-Knap with multiple agents with dynamic itinerary planning and evaluate its performance in a low-power and lossy scenario. Finally, we would like to introduce security to our current design by assuming encrypted payloads or secure link-layer strategies.