1 Introduction

With the development of Wireless Video Sensor Networks (WVSN), surveillance capabilities for monitoring and detecting criticalities and anomalies have improved considerably in many fields (e.g., street, forest, traffic, personal, healthcare and industrial monitoring [1]). After each anomaly or emergency is detected, decisions must be made at the coordinator level. The coordinator may be a normal node or a specific node with greater resources. It manages a zone of interest, analyzes the data received from several camera sensor nodes and sends the necessary information to the sink, which controls the whole network as shown in Fig. 1. Different types of anomalies exist depending on the monitored environment and on predefined criteria and parameters such as quick motion, sound or scene change; decisions are made in order to avoid any action that can affect the monitored environment.

Fig. 1: Architecture of WVSN

The detection of irregularities in any monitored scene is one of the main targets in WVSN. Every scene is permanently filmed using multiple video-sensor nodes. The sensor nodes collect frames and send them to the coordinator, which is responsible for the data aggregation process. The aggregation is limited to selecting, fusing or deleting the received frames. As a consequence, a significant amount of energy is consumed due to the huge number of captured frames, which reduces the lifetime of the network. Moreover, the continuous transmission between all the components of the network (sensor nodes, coordinators and sink) strongly affects the bandwidth capacity of the network and may cause a bottleneck [2].

Video-sensor nodes operate periodically in WVSN. We define some keywords:

A fixed frame rate is defined on every sensor node to film the video accordingly; this frame rate is the number of frames captured per second (fps).

A period is a fixed time length during which frames are captured with a given frame rate.

A video shot is considered as a video sequence taken within a period.

Energy consumption and bandwidth limitation are two important challenges in WVSN. The first is related to the sensing and transmission modules of the sensor node: the higher the frame rate and the number of frames sent, the more energy is consumed. The second is related to the transmission modules of the sensor node and the coordinator: the more frames are sent over the network, the more bandwidth is used. On the coordinator's side, energy consumption and bandwidth usage can be addressed by reducing the amount of data sent from the coordinator to the sink node. In our approach, the data analysis starts at the sensor node level and continues at the coordinator level to obtain the greatest possible reduction in energy and bandwidth consumption at both levels. Each video-sensor node compares all the frames in a shot to the last frame sent and computes the similarity between them. Based on the similarity function, only the frames in which an event occurs are sent. These selected frames are called critical frames and are sent to the coordinator. The similarity function at the sensor node level combines color and edge similarities to compare frames. This comparison selects the smallest required number of captured frames to be sent to the coordinator. By applying the similarity function, we reduce the energy consumed by the communication process through the reduction of the amount of transmitted data.

Alongside the similarity function, the frame rate of each video-sensor node is adapted. A method based on signal frequencies presented in [3] is adopted and applied to WVSN in our approach. It consists in reducing the number of captured frames by adapting the frame rate of each video-sensor node according to the number of critical frames detected in several consecutive past periods. Consequently, adapting the frame rate reduces the sensing process and thus decreases the energy consumption. At the coordinator level, an updated version of the similarity function is implemented in which motion similarity is added to the color and edge similarities. To avoid comparing all received shots at the coordinator level, a geometric study and a filtering condition are presented; they reduce the number of possible comparisons.

The remainder of this paper is organized as follows. In Section 2, we present the work related to our approach. In Section 3, we describe the proposed method at the sensor node level in its two aspects, the local detection system and the adaptive sampling system, as well as their corresponding algorithms. In Section 4, the data aggregation scheme is described and the proposed geometric method at the coordinator level is introduced. The experimental results and the comparison with another method are given in Section 5. Finally, we conclude in Section 6 with perspectives and future work.

2 Related work

Several research works dealing with data redundancy and energy reduction have been conducted so far [5, 6, 15, 16, 21]. In [16], Akkaya et al. introduced a GPS module into scalar sensors in order to control the cameras; the system thus detects which camera should be actuated based on the sensor's position. In [21], Priyadarshini et al. proposed an approach that eliminates redundancies caused by the overlapping of the FOVs (Fields Of View) of the video sensors. To do so, it turns off some cameras and activates only the optimal number of cameras at the same time. In [6], Bahi et al. proposed an in-network data aggregation technique at the coordinator level which identifies the nearly duplicate nodes that generate similar data.

In [25], Akkaya et al. discussed the background subtraction (BS) and compression techniques as common data reduction schemes, which have been used for camera sensors to reduce energy consumption.

The studies in [9] and [28] mostly deal with the physical and network layers. In [9], the authors use a CMOS image sensor where the image is recreated from two outputs, with the details in stationary objects and the suppressed motion in moving objects. It should be noted that a high frame rate is only applied in the region of interest, where it matters the most to detect and track any event.

In [18], the authors proposed two new approaches based on the cover set concept to help a node in finding its redundancy level. They proposed an algorithm to schedule the activity of sensor nodes according to the overlapping degree between sensors, and to know for certain if a sensor belongs to the cover set of another sensor.

In [19], the authors proposed a scheduling network solution to minimize power consumption using the multipath theory in wireless video sensor networks. They proposed an algorithm that transmits packets over multipath according to their importance.

Different strategies have been used to reduce energy consumption and bandwidth usage, such as adaptive video streaming, which can minimize the utilization of network bandwidth, considering that bandwidth is the most important resource in a network [13, 26, 30, 32]. All these works help to increase the lifetime of the network. Increasing the lifetime of the network is also studied in [31], specifically for smart camera networks.

Several methods proposed in the literature discuss the similarity of images [20, 24, 29]. In [29], the authors used the \(L_{1}\), \(L_{2}\) and \(L_{\infty }\) distances between two cumulative color histograms to measure the similarity between two color images. In [20], the authors are interested in segmentation techniques to compute the similarity; all of these techniques are mainly edge-based. In [24], the comparison is achieved by determining the lack of spatial correlation between two images.

Many methods have been proposed in the literature concerning visual information and motion estimation in wireless video sensor networks [10, 11, 17, 27]. In [10], the authors studied the correlation in visual information between different cameras with overlapped fields of view (FOVs) and proposed a new spatial correlation model function for visual information. The joint effect of multiple correlated cameras is taken into consideration in this study. An entropy-based analytical framework is developed to measure the amount of visual information provided by multiple cameras. The authors designed a correlation-based camera selection algorithm which reduces the energy dissipation of the communication and the computation; this algorithm requires fewer cameras to report to the sink than a random algorithm.

In [11], Jbeily et al. proposed a new symmetric-object oriented approach for motion estimation in WVSN called SYMO-ME, which reduces the high complexity of motion estimation; the authors' main objective is to reduce the redundancy between successive frames. They adapt a new motion estimation energy consumption model for block matching algorithms (BMAs) in WVSN. This model depends on the energy consumption of the different executed instructions.

Many previous works focused on scheduling methods [4, 5, 7, 12, 22, 33]. In [5], the authors used a clustering methodology and scheduled all overlapping cameras of the same cluster to avoid redundant data. Jiang et al. [12] proposed a probabilistic scheduling approach based on kinematics functions and the normal law to estimate the expected positions of an intrusion and track its trajectory.

Previous works on the similarity process do not use a pixel-by-pixel technique. They rely on color histograms for color images [29], which can mislead the comparison if the same color happens to appear elsewhere in the area with the same intensity. None of the mentioned works propose a data aggregation method at the coordinator level that also takes into consideration the data reduction performed at the sensor node level for energy consumption. In this paper, both levels are taken into consideration: the sensor node level and the coordinator level. The reduction of energy and bandwidth consumption is the main purpose of this paper. At the sensor node level, a combination of color and edge techniques is used to compare images and send only the appropriate frames to the coordinator. The coordinator is responsible for sending to the sink the non-similar frames received from different sensor nodes. A geometric condition is implemented at the coordinator to select the sensor nodes between which the comparison must take place.

3 Local detection system: sensor node level

The proposed method is divided into two parts. The first is a local detection function that detects any change in the frames so that only those frames are sent to the coordinator. This function is run in every period of our proposed “Multimedia Adaptive Sampling Rate Algorithm” (MASRA). The second part presents the MASRA algorithm itself, which adapts the sampling frequency of each sensor node according to the monitored area.

3.1 Local detection system

In this section, the frame analysis at the video-sensor node level is introduced. This analysis helps to send only the frames that differ to the coordinator, instead of sending all the frames, which is costly in terms of energy and bandwidth. In some multimedia applications [34], only the middle frame of a shot is used to represent the shot content. However, this solution can only represent static shots, without taking into consideration the color, edge or motion similarity between the frames of the shot.

Comparing the new approach to the Structural Similarity (SSIM) index, a quality assessment metric based on the multiplicative combination of luminance, contrast and structural terms, shows that the new approach preserves the information while being less complex than SSIM. SSIM is therefore not suitable for tiny sensor nodes, because it drains far more energy than two simple low-level similarity metrics (color and edges). To compare SSIM with the color-edge function of the Multimedia Adaptive Sampling Rate Algorithm (MASRA), we implemented both algorithms on a Raspberry Pi 3 in C++ with OpenCV. For the same input images, the resulting execution times are shown in Table 1. The large execution time needed to run the SSIM function explains why SSIM is not used in tiny sensor node applications.

Table 1 Execution time comparison for SSIM and color-edge function

The proposed approach uses color and edge properties to find similarities between frames and to decide which frame to send. A brief explanation is given below to justify the choice of these two properties together and to show their complementarity. They have been chosen for simple reasons: the edge property detects any change in the shape of the objects in the area of interest, or detects a new object entering the scene. If a new object enters the scene, this property yields new edges in the gray-scale format, as explained later in the paper.

As for the color property, it detects any change in the colors of the scene; an example of such a case is the change of luminosity of the monitored scene when a burglar turns the lights off before acting. To conclude, the edge property cannot detect a change in the luminosity of the scene, and the color similarity cannot detect a new overlapping object in the scene if it has the same existing color. Thus, these two properties are complementary and are considered of equal importance in the rest of the paper. They are also equally weighted in the similarity function of the approach.

3.1.1 Color similarity

Each frame is compared to the last frame sent to the coordinator. This comparison includes the color similarity between frames. An image is generally a 2D matrix M(n,m). Each pixel is decomposed into its three RGB components. To do so, the original 2D matrix of the image is flattened into a 1D vector of pixels, and each pixel is then represented by its three RGB values (three columns are needed). In brief, the RGB concentrations of every pixel in the image are represented by a 2D matrix where the rows represent pixels and the columns represent the RGB color concentrations, as shown in the matrix below:

$$ M=\begin{array}{c|ccc} & Red & Green & Blue\\ \hline pixel_{0} & 2 & 3 & 4\\ pixel_{1} & 20 & 60 & 40\\ pixel_{2} & 5 & 10 & 20\\ \vdots & \vdots & \vdots & \vdots\\ pixel_{n\times{m}} & \cdots & \cdots & \cdots \end{array} $$

This color similarity consists in comparing the two frames pixel by pixel. First, it computes the total distance for each color between the two frames, as shown in equation (1). Then, it normalizes each distance by dividing it by \(n\times {m}\times {255}\), where \(n\times {m}\) is the number of pixels in the image and 255 is the maximum concentration of a color. Three distances are computed, \(distance_{red}\), \(distance_{green}\) and \(distance_{blue}\), each normalized and belonging to [0;1]. For example, for an image of \(540\times{360} = 194400\) pixels, each of the three distances is divided by \(194400\times {255}\). The distance for each color (column) is computed as follows:

$$ distance_{c} = \frac{1}{n\times{m}\times{255}}\times{\sum\limits_{i = 0}^{n\times{m}}\sqrt{[M_{1}(i,c) - M_{2}(i,c)]^{2}}} $$
(1)

Where c is the color (R, G or B) and i is the pixel under comparison. To compute the total distance between the two compared frames, the sum of these three distances is normalized by dividing it by 3, so that the total distance lies in [0;1]. The color similarity function \(Col\_sim\) is one minus the total distance and is computed as follows:

$$ Col\_sim = 1-\frac{\sum{distance_{c}}}{3} $$
(2)

The distance is computed separately for every column and then aggregated to compute \(Col\_sim\). In the above equations, c refers to any color vector in the RGB color space, while \(M_{1}\) and \(M_{2}\) are two matrices composed of the three vectors R, G and B.
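For illustration, a minimal C++ sketch of Eqs. (1) and (2) is given below. It assumes each frame is stored as an interleaved RGB buffer of \(n\times m\) pixels (3 bytes per pixel); this layout and the function name are ours and not part of the original implementation.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Col_sim between two frames of n_pixels pixels, each stored as an
// interleaved RGB buffer (3 bytes per pixel), following Eqs. (1)-(2).
double color_similarity(const std::vector<std::uint8_t>& f1,
                        const std::vector<std::uint8_t>& f2,
                        std::size_t n_pixels) {
    double dist[3] = {0.0, 0.0, 0.0};               // accumulated distance per channel
    for (std::size_t i = 0; i < n_pixels; ++i) {
        for (int c = 0; c < 3; ++c) {               // c = R, G, B column
            double d = static_cast<double>(f1[3 * i + c])
                     - static_cast<double>(f2[3 * i + c]);
            dist[c] += std::fabs(d);                // sqrt((a-b)^2) = |a-b|
        }
    }
    double total = 0.0;
    for (int c = 0; c < 3; ++c)
        total += dist[c] / (static_cast<double>(n_pixels) * 255.0);  // Eq. (1), normalized
    return 1.0 - total / 3.0;                       // Eq. (2): Col_sim in [0,1]
}
```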

3.1.2 Edge similarity

This similarity function is much less expensive in terms of energy consumption than the color similarity function. In this function, the compared frames are converted to their gray-level format; comparing the edges via their gray-scale pixel values is not affected by the absence of color.

The function used takes the grayscale image as input and returns a binary image BW of the same size, computed with the Canny detector: the output image contains 1’s where edges are found in the input image and 0’s elsewhere. As presented in [8], edges are found by looking for local maxima of the gradient of the input image; the gradient is calculated using the derivative of a Gaussian filter. The method uses two thresholds to detect strong and weak edges, and includes the weak edges in the output only if they are connected to strong edges. It is therefore less likely than other methods to be fooled by noise, and more likely to detect true weak edges. We compute all the edges in each frame using this function. When an edge is detected, the number of edge points is incremented; the edge points represent the total number of edges in the frame. If both frames present an edge in the same area, the number of matched edge points between the two frames is incremented. Then the percentage of matched data, which represents the edge similarity between the two frames, is calculated:

$$ Total\_points = \sum{edge\_points}\\ $$
(3)
$$ Matched\_points = \sum{Matched\_edge\_points}\\ $$
(4)
$$ Edge\_sim = \frac{Matched\_points}{Total\_points}\\ $$
(5)

Where \(Total\_points\) is the number of edge points in a frame and \(Matched\_points\) is the number of edges in common between the two frames under comparison. The edge similarity \(Edge\_sim\) is the ratio of \(Matched\_points\) over the \(Total\_points\) of the first frame.
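A corresponding sketch of Eqs. (3)–(5) is shown below, using OpenCV's Canny detector as mentioned for the Raspberry Pi implementation; the Canny thresholds are illustrative, and "same area" is approximated here by a pixel-wise match of the two edge maps.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// Edge_sim between a reference (previously sent) frame and a new frame,
// following Eqs. (3)-(5); both inputs are single-channel grayscale images.
double edge_similarity(const cv::Mat& gray_ref, const cv::Mat& gray_new) {
    cv::Mat e1, e2, matched;
    cv::Canny(gray_ref, e1, 50, 150);              // illustrative weak/strong thresholds
    cv::Canny(gray_new, e2, 50, 150);
    cv::bitwise_and(e1, e2, matched);              // edge points present in both frames

    int total_points   = cv::countNonZero(e1);     // Eq. (3)
    int matched_points = cv::countNonZero(matched);// Eq. (4)
    if (total_points == 0) return 1.0;             // no edges at all: treat as identical
    return static_cast<double>(matched_points) / total_points;  // Eq. (5)
}
```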

Definition 1 (Similarity Function)

The similarity function is the combination of the two independent similarities (color and edge). It is the sum of the products of each similarity with its weighting factor (\(Col\_fact\) and \(Edge\_fact\)), and is represented as follows:

$$ Sim = {Col\_sim}\times{Col\_fact}+{Edge\_sim}\times{Edge\_fact} $$
(6)

Where \(Col\_fact + Edge\_fact = 1\).

As mentioned before, the edge property cannot detect a change in the luminosity of the scene, and the color similarity cannot detect a new overlapping object in the scene if it has the same existing color. Thus, color and edge similarities are complementary and each one targets different aspects of the image. For this reason, they are weighted equally in the remainder of this paper.
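As an illustrative numerical example (the values are ours), with equal weights \(Col\_fact = Edge\_fact = 0.5\), a frame pair with \(Col\_sim = 0.9\) and \(Edge\_sim = 0.6\) gives

$$ Sim = 0.9\times{0.5}+ 0.6\times{0.5}= 0.75, $$

a value that is then compared with the similarity threshold used to decide whether the frame is critical (Section 3.2).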

3.2 Multimedia Adaptive Sampling Rate Algorithm (MASRA)

In this section, we focus on reducing the number of sensed frames on every video-sensor node. This reduction is based on adapting the frame rate of every sensor node and is inspired from [3]. The term “frame rate FR” is used in the remainder of this paper to denote the frame rate per period. A period consists of several seconds, depending on the needs of the application.

In addition, a condition must be satisfied in order to send a frame to the coordinator. This condition helps reduce the energy and bandwidth consumption by decreasing the number of frames sent from the video-sensor node to the coordinator: only critical frames are sent to the coordinator.

The first frame of each period is always sent to the coordinator as described in Algorithm 1.

Definition 2 (Critical frame)

A critical frame is defined as a frame whose degree of similarity “sim” is smaller than a predefined threshold \(th_{sim}\), as presented in the LDS function of Algorithm 1. For example, if the predefined threshold (minimum similarity required) is set to \(75\%\) and \(frame_{n-1}\) has been sent to the coordinator, then \(frame_{n}\) is also sent if its similarity to \(frame_{n-1}\) is lower than \(75\%\).

Our objective in this method is to detect changes in the number of critical frames \(Nb\_Cr_{0}\) per period, where \(Nb\_Cr_{0}\) is directly related to the minimum sampling frame rate FR as follows:

$$ FR \geq 2\times{Nb\_Cr_{0}} $$
(7)

In the proposed MASRA algorithm (Algorithm 1) \(Nb\_Cr_{0}\) is defined as the number of critical frames per period. We define FR as follows:

$$ FR=c\times{Nb\_Cr_{0}} $$
(8)

Where c is a confidence parameter between 2 and 5, as presented in [3]. In order to detect the variation in the number of critical frames, a user-defined parameter d is introduced that represents the minimum detectable change (e.g., if \(d = 0.2\), then changes that affect \(Nb\_Cr_{0}\) by more than \(0.2\times {Nb\_Cr_{0}}\) must be detected). A change is detected when, for h consecutive periods, the current number of critical frames, denoted \(Nb\_Cr_{i}\), crosses one of the following thresholds:

$$ th_{up}=Nb\_Cr_{0}\times{(1+d)} $$
(9)
$$ th_{down}=Nb\_Cr_{0}\times{(1-d)} $$
(10)

In this case, the frame rate FR is modified according to the last value of \(Nb\_Cr_{i}\), as shown in the MASRA algorithm (Algorithm 1).

Algorithm 1

To sum up, the sensor node starts by sending the first frame to the coordinator and then compares the second sensed frame to the previously sent frame. The comparison is based on the LDS similarity function presented in Algorithm 1, and the second frame is sent to the coordinator according to the output of Algorithm 1. Based on the number of frames sent in each period, Algorithm 1 detects whether this number crosses one of the two predefined thresholds \(th_{up}\) or \(th_{down}\). If this condition is satisfied for h consecutive periods, the frame rate FR changes as follows:

$$ FR= 2\times{Nb\_Cr_{i}} $$
(11)

Where \(Nb\_Cr_{i}\) is the number of frames sent (critical frames) in the last period.
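A minimal C++ sketch of this adaptation step is given below, assuming the number of critical frames of the current period is already known; the state structure and names are illustrative and do not reproduce the authors' exact Algorithm 1.

```cpp
// Per-period frame-rate adaptation in the spirit of MASRA (Algorithm 1).
struct MasraState {
    double fr;         // current frame rate (frames per period)
    double nb_cr0;     // reference number of critical frames per period
    int    run_length; // consecutive periods outside [th_down, th_up]
};

// nb_cr_i: critical frames counted in the period that just ended;
// d: minimum detectable change; h: required consecutive periods.
void masra_update(MasraState& s, double nb_cr_i, double d, int h) {
    double th_up   = s.nb_cr0 * (1.0 + d);    // Eq. (9)
    double th_down = s.nb_cr0 * (1.0 - d);    // Eq. (10)

    if (nb_cr_i > th_up || nb_cr_i < th_down)
        ++s.run_length;                        // candidate change
    else
        s.run_length = 0;                      // back within bounds

    if (s.run_length >= h) {                   // change confirmed for h periods
        s.fr         = 2.0 * nb_cr_i;          // Eq. (11): FR = 2 * Nb_Cr_i
        s.nb_cr0     = nb_cr_i;                // new reference level
        s.run_length = 0;
    }
}
```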

4 Data aggregation scheme: the overlapping method

4.1 Video sensing model

A video sensor node S is represented by the FoV of its camera. In our approach, we consider a 2-D model of a video sensor node where the FoV is defined as a sector denoted by a 4-tuple \(S(P,R_{s},\overrightarrow {V},\alpha )\). Here P is the position of S, \(R_{s}\) is its sensing range, \(\overrightarrow {V}\) is the vector representing the line of sight of the camera’s FoV, which determines the sensing direction, and \(\alpha \) is the offset angle of the FoV on both sides of \(\overrightarrow {V}\). Figure 2 illustrates the FoV of a video sensor node in our model. In [18], the authors represented the FoV with the points a, b, c and the center of gravity g, as shown in Fig. 3, in order to detect the overlapping areas according to those points.

Fig. 2: FOV

Fig. 3: Video sensing and overlapping model

A point \(P_{1}\) is said to be in the FoV of a video sensor node S if and only if the two following conditions are satisfied:

  1. \(d(P, P_{1})\leq R_{s}\), where \(d(P, P_{1})\) is the Euclidean distance between P and \(P_{1}\).

  2. The angle between \(\overrightarrow {PP_{1}}\) and \(\overrightarrow {V}\) must be within \([- \alpha ,+ \alpha ]\).

In other words, these two conditions are met if:

$$ \|\overrightarrow{PP_{1}}\|\leq R_{s} $$
(12)
$$ \overrightarrow{PP_{1}}.\overrightarrow{V}\geq \|\overrightarrow{PP_{1}}\|\times{\|\overrightarrow{V}\|}\times{cos\alpha}. $$
(13)

In the remainder of this paper, we consider that all video nodes have the same characteristics: same sensing range \(R_{s}\) and same offset angle \(\alpha \).
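A small C++ sketch of this membership test, written directly from Eqs. (12) and (13), is given below (types and names are ours):

```cpp
#include <cmath>

struct Vec2 { double x, y; };

// Point-in-FoV test from Eqs. (12)-(13): P is the camera position, V its
// line of sight, Rs the sensing range and alpha the offset half-angle.
bool in_fov(Vec2 P, Vec2 V, double Rs, double alpha, Vec2 P1) {
    Vec2 pp1{P1.x - P.x, P1.y - P.y};
    double dist = std::sqrt(pp1.x * pp1.x + pp1.y * pp1.y);
    if (dist > Rs) return false;                       // Eq. (12)

    double dot   = pp1.x * V.x + pp1.y * V.y;
    double normV = std::sqrt(V.x * V.x + V.y * V.y);
    // Eq. (13): PP1 . V >= |PP1| * |V| * cos(alpha)
    return dot >= dist * normV * std::cos(alpha);
}
```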

In this part, the frame analysis at the coordinator level is introduced. This analysis applies when two or more video-sensor nodes are sensing the same area of interest; the implemented algorithm helps to send only the shots that differ to the sink node, instead of sending all the shots, which is costly in terms of energy and bandwidth.

4.2 Camera’s overlapping filtering

We introduced, in the above sections, the functionalities of our similarity function. This function, when applied at the coordinator level, selects some video shots to be sent to the sink. To select one video shot instead of another, the similarity between the two must exceed a given threshold. A naive solution for finding all similar shots is to compare every pair of shots. This is obviously prohibitively expensive for video sensor networks, as the total number of comparisons is extremely high. We therefore apply a geometric condition on the sensor nodes to select the appropriate comparisons and to reduce data latency. This geometric condition combines the angle condition between the FOVs of the nodes and the ratio of the overlapped area between them.

4.2.1 The angle condition

The angle between two neighbouring sensor nodes is defined as the angle between the vectors of their FOVs. Our idea is that if a wide angle separates the FOVs of two sensor nodes, these two nodes cannot take part in the similarity comparison at the coordinator level, since they are not sensing the same area of interest: a shot taken from two very different perspectives can be widely different. For two sensor nodes to be candidates for the similarity function, the angle between their FOVs must not exceed a given angle threshold. In order to determine the angle between the two vectors \(\mathbf {V}\) and \(\mathbf {V^{\prime }}\) of the sensor nodes S and \(S^{\prime }\), shown in Fig. 4, we use the scalar product. Since both sensor nodes have the same characteristics (angle, FOV, energy resources, etc.), both vectors \(\mathbf {V}\) and \(\mathbf {V^{\prime }}\) in Fig. 4 have the same length l. The scalar product can be written in two forms. The first uses the coordinates of the vectors, with \(\mathbf {V} = (X_{V},Y_{V})\) and \(\mathbf {V^{\prime }} = (X_{V^{\prime }},Y_{V^{\prime }})\):

$$ \mathbf{V}.\mathbf{V^{\prime}}=X_{V}\times X_{V^{\prime}}+Y_{V}\times Y_{V^{\prime}} $$
(14)

The second format is given according to the length of each vector and to the angle between both, as follows:

$$ \mathbf{V}.\mathbf{V^{\prime}}=l^{2}\times \cos(\mathbf{V},\mathbf{V^{\prime}}) $$
(15)

Where \(l= \|\overrightarrow {V^{\prime }}\|=\|\overrightarrow {V}\|\).

Fig. 4: Two overlapping sensor nodes S and \(S^{\prime }\)

The angle \(\theta \) between the two vectors can then be calculated by combining both forms of the scalar product:

$$ \theta=\arccos((X_{V}\times X_{V^{\prime}}+Y_{V}\times Y_{V^{\prime}})/ l^{2}) $$
(16)

For example, if an angle threshold \(th_{angle}\) is set to 30 degrees, the angle between \(\mathbf {V}\) and \(\mathbf {V^{\prime }}\) must remain below 30 degrees for the two sensor nodes S and \(S^{\prime }\) to proceed to the next step (the two points condition) and, ultimately, to take part in the similarity function process at the coordinator level.
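The following C++ sketch implements this angle test from Eq. (16); the clamping of the cosine is only a numerical guard added by us:

```cpp
#include <algorithm>
#include <cmath>

// Angle condition (Eq. (16)): two nodes with line-of-sight vectors
// V = (xv, yv) and V' = (xvp, yvp) of equal length l are candidates only
// if the angle between V and V' stays below th_angle (radians).
bool angle_condition(double xv, double yv, double xvp, double yvp,
                     double l, double th_angle) {
    double cos_theta = (xv * xvp + yv * yvp) / (l * l);   // Eqs. (14)-(15)
    cos_theta = std::max(-1.0, std::min(1.0, cos_theta)); // guard rounding errors
    double theta = std::acos(cos_theta);                  // Eq. (16)
    return theta <= th_angle;
}
```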

4.2.2 The two points condition

Inspired from [18], we present below the two points condition for overlapping filtering. A node \(S^{\prime }\) satisfies the two points condition with another node S if g (the center of gravity of the triangle abc) and at least one of the points a, b and c of \(S^{\prime }\)’s FOV belong together to the FOV of S, as shown in Fig. 3. \(S_{1}\), \(S_{2}\) and \(S_{3}\) each satisfy this condition separately with S. In this scenario, each such sensor node can be a candidate alongside S for the similarity function. Our method is used to choose the candidates that take part in the comparison process at the coordinator level: two camera-sensor nodes \(S_{1}\) and \(S_{2}\) are chosen as candidates if they satisfy together the angle and the two points conditions, as shown in Algorithm 2 (a sketch of this test is given after Algorithm 2). After choosing the candidate cameras, two cases are considered: the low similarity process and the high similarity process.

Algorithm 2
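A possible C++ sketch of this test is shown below; the membership predicate `inside_S` stands for any FoV test such as the `in_fov()` sketch of Section 4.1, and the names are ours:

```cpp
#include <array>
#include <functional>

struct Point { double x, y; };

// Two points condition (Section 4.2.2): S' is a candidate with respect to S
// if the center of gravity g of its FoV triangle (a, b, c) and at least one
// of a, b, c lie inside the FoV of S.  inside_S is any membership test for
// the FoV of S (e.g. the in_fov() sketch of Section 4.1).
bool two_points_condition(const std::function<bool(Point)>& inside_S,
                          Point a, Point b, Point c) {
    Point g{(a.x + b.x + c.x) / 3.0, (a.y + b.y + c.y) / 3.0};
    if (!inside_S(g)) return false;          // g must belong to S's FoV
    std::array<Point, 3> corners{{a, b, c}};
    for (const Point& p : corners)
        if (inside_S(p)) return true;        // g plus one corner inside S's FoV
    return false;
}
```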

Definition 3 (Low Similarity)

When the similarity between the two compared video shots does not exceed the predefined similarity threshold \(\beta _{sim}\), the coordinator works normally and sends both shots to the sink without any modification after each period. The similarity is computed between the two shots over a complete period (all the frames sent by both sensor nodes take part in this computation), and each period corresponds to one shot composed of several frames.

Definition 4 (High Similarity)

If the similarity between the two shots exceeds the threshold, the coordinator must choose one of them to be sent to the sink node. The coordinator selects the video shot with the most variation, in other words the one with the greater number of critical frames, as shown in Algorithm 3.

Algorithm 3

Definition 5 (similarity threshold percentage \(\beta _{sim}\) between shots)

This similarity between shots from overlapping sensor nodes is the aggregation of all the similarities between the frames of the two shots; the threshold can vary according to the application. For example, in military applications, \(\beta _{sim}\) can reach 100% to make sure that the system does not miss any information.

4.3 Shot selection algorithm

In this section, we discuss the SSA (Shot Selection Algorithm). After choosing the two candidates that meet the overlapping condition, this algorithm is implemented at the coordinator level to compare frames received from different sensor nodes sensing the same area of interest. This comparison is based on a similarity function composed of edge, color and motion similarities, described below.

4.3.1 Motion similarity

To evaluate the motion content of a shot, we use a function derived from the color similarity function: the mean, over a complete shot (period), of one minus the color similarity between consecutive frames. Inspired from [23] and based on the color similarity function of the MASRA algorithm, the motion content \(mot_{u}\) of a shot u is computed and normalized as follows:

$$ mot_{u} = \frac{1}{b-a} \sum\limits_{f=a}^{b-1} (1-Col\_sim(f, f + 1)) $$
(17)

Where \(mot_{u} \in [0,1]\), a and b are respectively the first and last frames sent from the sensor node to the coordinator during the period, and f and \(f + 1\) are two consecutive frames of \(Shot_{u}\) sent by the sensor node.

The motion similarity between two shots \(mot\_sim\) associated to two shots \(Shot_{u}\) and \(Shot_{v}\) from two different sensor nodes is defined as follows:

$$ mot\_sim = 1 - |mot_{u} - mot_{v}| $$
(18)

In the last equation, \(mot\_sim \in [0,1]\): a value close to 1 indicates that the two shots are similar in terms of motion, while a value close to 0 indicates that they differ in motion. In our approach we consider that the cameras of the sensor nodes are fixed and not rotatable. Hence, the motion content of a shot is much higher when an event is detected. It is therefore important to use this motion content in the shot similarity estimation.
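A short C++ sketch of Eqs. (17) and (18) follows; it assumes the per-pair color similarities \(Col\_sim(f, f+1)\) of a shot have already been computed as in Section 3.1.1:

```cpp
#include <cmath>
#include <vector>

// Motion content of a shot (Eq. (17)).  col_sim[k] holds Col_sim(f, f+1)
// for the consecutive critical frames a..b of the shot, so the vector
// has b-a entries.
double motion_content(const std::vector<double>& col_sim) {
    if (col_sim.empty()) return 0.0;             // static shot: no motion
    double sum = 0.0;
    for (double s : col_sim) sum += 1.0 - s;     // one minus the color similarity
    return sum / col_sim.size();                 // mean over the period, in [0,1]
}

// Motion similarity between two shots (Eq. (18)).
double motion_similarity(double mot_u, double mot_v) {
    return 1.0 - std::fabs(mot_u - mot_v);
}
```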

4.4 Shots similarity estimation

As explained previously, shots sent from neighboring nodes to the same coordinator often have similar visual (color and edge) and/or action (motion) contents. Usually, in WVSNs, the motion content of shots depends on the event detection in the zone of interest; therefore, when no event is detected, the visual correlation between shots from candidate video-sensor nodes becomes higher. In our paper, we compute the similarities between shots as a function of their visual and motion content features. The color and edge similarities between two shots at the coordinator level are taken as their means over the period, so that they can be combined with the motion similarity at the end of each period. A solution for the synchronization problem is given later in this paper. The similarity between shots from different sensor nodes after each period is represented as follows, where we write:

cf = \(Col\_fact_{c}\), cs = \(Col\_sim_{c}\), ef = \(Edge\_fact_{c}\), es = \(Edge\_sim_{c}\), mf = \(Mot\_fact_{c}\), ms = \(Mot\_sim_{c}\):

$$ SIM = (cf\times cs) + (ef\times es) + (mf\times ms) $$
(19)

Where \(Col\_fact_{c}\), \(Edge\_fact_{c}\) and \(Mot\_fact_{c}\) are the weights of the color, edge and motion similarities at the coordinator level respectively, such that \(Col\_fact_{c} + Edge\_fact_{c} + Mot\_fact_{c} = 1\).

In this approach, if two shots have similar motion contents, their \(Mot\_sim_{c}\) function has a higher value. Note that \(Col\_sim_{c}\), \(Edge\_sim_{c}\) and \(Mot\_sim_{c}\) are all in the range [0,1].
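The sketch below combines Eq. (19) with the selection rule of Definitions 3 and 4; the default weights correspond to the values used in the experiments of Section 5 (\(cf = ef = 0.25\), \(mf = 0.5\)) and the structure is ours:

```cpp
// Shot-level features aggregated over a period at the coordinator.
struct ShotFeatures {
    double col_sim;   // mean color similarity between the two shots
    double edge_sim;  // mean edge similarity between the two shots
    double mot_sim;   // motion similarity, Eq. (18)
};

// Eq. (19): weighted combination of the three similarities.
double shot_similarity(const ShotFeatures& s,
                       double cf, double ef, double mf) {
    return cf * s.col_sim + ef * s.edge_sim + mf * s.mot_sim;
}

// Definitions 3 and 4: returns true when only the more critical shot
// (the one with more critical frames) should be forwarded to the sink,
// false when both shots are forwarded.
bool high_similarity(const ShotFeatures& s, double beta_sim,
                     double cf = 0.25, double ef = 0.25, double mf = 0.5) {
    return shot_similarity(s, cf, ef, mf) > beta_sim;
}
```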

4.4.1 Different frame rates solution

In this scenario, a synchronization problem arises when two candidate sensor nodes \(S1\) and \(S2\) have two different frame rates \(FR_{1}\) and \(FR_{2}\) respectively, or when different critical scenes are sensed by each sensor node. In that case, the similarity process at the coordinator level can break down; e.g., at time \(t = 1\), \(S2\) sends a frame to the coordinator but \(S1\) does not, because of a difference in criticality or frame rate between the sensor nodes. To solve this problem, the comparison must take place between the frame received from \(S1\) and the last frame received from \(S2\) (if \(S2\) did not send a frame at the same time) and vice versa. For example, at time \(t = 1\), \(S1\) and \(S2\) send two frames \(f_{11}\) and \(f_{21}\) to the coordinator respectively. At time \(t = 2\), \(S2\) sends a frame \(f_{22}\) to the coordinator but \(S1\) does not. The comparison process continues by comparing frame \(f_{22}\) with the last frame sent by \(S1\), which is \(f_{11}\). This is a valid solution because a sensor node does not send a new frame to the coordinator when there is no new event in the scene; in this case, we consider the last frame sent by a sensor node as its current frame.
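A minimal sketch of this bookkeeping at the coordinator is given below, assuming only two candidate nodes and an opaque frame handle; names and types are illustrative:

```cpp
#include <utility>

// Synchronization rule of Section 4.4.1: when one candidate node sends a
// frame and the other does not, the coordinator compares the new frame
// against the last frame previously received from the silent node.
using FrameId = int;   // placeholder for whatever frame handle is stored

struct PairState {
    FrameId last_s1 = -1;   // last frame received from S1 (-1: none yet)
    FrameId last_s2 = -1;   // last frame received from S2
};

// Called whenever a frame arrives from one of the two candidate nodes.
// Returns the pair of frames to compare, or {-1, -1} if one side has not
// sent anything yet.
std::pair<FrameId, FrameId> on_frame(PairState& st, bool from_s1, FrameId f) {
    if (from_s1) st.last_s1 = f; else st.last_s2 = f;
    if (st.last_s1 < 0 || st.last_s2 < 0) return {-1, -1};
    return {st.last_s1, st.last_s2};
}
```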

5 Experiments

In this section, several experiments are conducted to validate our approach at the sensor node and coordinator levels, aiming to minimize the energy consumption and bandwidth usage by reducing the amount of data (sensed and transmitted) over the whole network. We compare our approach with Jiang et al. [12]. We have used a Matlab-based simulator in our experiments. First of all, we introduce a scenario, shown in Fig. 5, where 6 video-sensor nodes S1, S2, S3, S4, S5 and S6 are deployed to monitor the same area of interest from different perspectives. The main purpose of our work is to send to the coordinator the frames that represent the critical situations, so that the coordinator can react accordingly. We have used 6 Microsoft LifeCam VX-800 cameras to film a short video of 600 seconds; each camera is connected to a laptop that does the processing via the Matlab simulator. In our study, an intrusion has been detected by the sensor nodes during the following time intervals:

Fig. 5: The setup of the video sensor nodes

S1: 40 seconds, from 75 to 115.

S2: 40 seconds, from 80 to 120.

S3: 200 seconds, from 120 to 320.

S4: 160 seconds, from 300 to 460.

S5: 40 seconds, from 450 to 490.

S6: 0 seconds.

We have run our MASRA and SSA algorithms for 600 periods of 1 second each, with a frame rate equal to 30 frames per second. The frame rate of each sensor node changes independently according to the number of critical frames detected by that node. In each period, every sensor node senses a certain number of frames according to the assigned frame rate. The minimum frame rate is set to \(FR = 1\) frame per period. We consider the initial and maximum frame rate to be FR = 15 frames per period; in this case the sensor node senses 15 of the 30 frames of the period.

As for the parameters at the sensor node level, we used a color factor and an edge factor equal to \(50\%\) each. At the coordinator level, we used a color factor and an edge factor equal to \(25\%\) each, and a motion factor equal to \(50\%\). As shown in Table 2, the motion factor \(Mot\_fact_{c}\) has a higher weight at the coordinator level: a frame received from a sensor node is already known to be a critical frame, so information about motion is more important at the coordinator level when deciding what to send to the sink.

Table 2 Weights of small similarities at both levels

Then, we implemented the PPSS approach of [12] and ran the same video sequence. This algorithm adopts the normal law of probability and kinematics rules. Its role is to schedule the monitoring time of the sensor node depending on the trajectory of the intrusion and on the time needed to reach its FOV; the sensor node sends all the sensed frames to the coordinator while the intrusion is in its FOV, and then goes back to sleep mode. However, in our experiments this approach loses up to 15% of the information due to probability errors. This loss of data in PPSS is shown in Figs. 6 and 7 for sensor S1 in our scenario when the intrusion passes through its FOV.

Fig. 6: Difference between MASRA and PPSS on the sensing phase

Fig. 7: Difference between MASRA and PPSS on the transmission phase

5.1 The sensor node level

5.1.1 Number of frames

The biggest challenge in WSN is the energy consumption, due to the limited resources of the sensor nodes and to the large number of frames on the network. When no adapted frame rate is implemented, the number of sensed frames remains 30 per period. In terms of energy consumption and bandwidth usage, sending all the frames is costly while many frames are identical and do not represent any criticality. In video surveillance, sending frames separated by less than 0.03 seconds does not bring any additional information. For this reason, we set the initial and maximal frame rate to \(FR= 15\) frames sensed per period. The MASRA algorithm is implemented on every video-sensor node to reduce the number of frames sensed and sent to the coordinator. For every sensor node, the frame rate is adapted after two periods, where \(P = 1\) second. Every sensor node sends the first frame of each period. For sensor node \(S1\), as seen in Fig. 7, the MASRA algorithm only sends the critical frames to the coordinator according to a predefined similarity threshold, as explained in the previous sections; this threshold is varied among \(50\%\), \(70\%\) and \(80\%\). In the later stages we chose a threshold equal to \(70\%\) as a middle value. The number of frames sent in each period is the parameter that influences the frame rate. The frame rate variation seen in Fig. 6 validates our frame rate adaptation method in the active mode of sensor S1, when an intrusion is detected.

In Fig. 7, we can see the number of critical frames sent to the coordinator by \(S1\); this variation in the number of critical frames per period is proportional to the adaptation of the frame rate. Figures 6 and 7 show only a slight difference when the threshold changes from 50 to 70 to 80, which validates the choice of \(70\%\). As seen in Tables 3 and 4 for S1 and in Tables 5 and 6 for S3, adapting the frame rate reduces the sent data by more than \(90\%\). Then, applying our similarity function reduces the number of sent frames by \(94\%\), from 14700 frames to 818 frames. Reducing the number of sensed frames by adapting the frame rate, and reducing the number of frames sent to the coordinator by using our similarity function at the sensor node level, show that our algorithm reduces the number of frames both sensed and transmitted, as detailed in Tables 7, 8 and 9 for the whole network.

Table 3 The difference in terms of number of frames for S1 over 40s
Table 4 The difference in terms of number of frames for S1 over 490s
Table 5 The difference in terms of number of frames for S3 over 200s
Table 6 The difference in terms of number of frames for S3 over 490s
Table 7 Sensor by sensor evaluation in terms of number of frames in active and passive modes for MASRA algorithm
Table 8 Sensor by Sensor evaluation in terms of number of frames in active and passive modes for PPSS Method
Table 9 Comparison between MASRA and PPSS in terms of number of frames on the overall Network

By comparing these numbers to the number of frames obtained with the PPSS algorithm in Tables 10, 11, 12 and 13, we can conclude that our algorithm outperforms PPSS for the sensing and transmission processes. This gain grows further when the time interval of the sensor's active mode grows, as shown for sensor node S3. For probability reasons, the first sequence of frames of every sensor is lost in PPSS once the intrusion enters the FOV of the sensor node. Tables 7, 8 and 9 show the efficiency of our approach sensor by sensor and over the whole network regarding the number of sensed and transmitted frames.

Table 10 The difference in terms of number of frames for S1 over 40s PPSS
Table 11 The difference in terms of number of frames for S1 over 490s PPSS
Table 12 The difference in terms of number of frames for S3 over 200s PPSS
Table 13 The difference in terms of number of frames for S3 over 490s PPSS

5.1.2 Bandwidth consumption

The bottleneck issue is caused by the limited resources in terms of bandwidth capacity and by the huge number of frames sent over the network. As we can see in Table 14 for the whole network, with the MASRA algorithm the size of the sent frames is greatly reduced. At the sensor node level, the frame rate adaptation and the applied similarity function are responsible for this reduction, by only sending the critical frames to the coordinator, which reduces the total size of the frames sent within a period, as shown in Table 14. The total size of the filmed video is \(300 MB\); this number is cut by \(90\%\) to reach \(19 MB\) when we send all the frames while adapting the frame rate, and from \(300 MB\) to \(15 MB\) when we implement our algorithm with all its functionalities, as mentioned in Table 14.

Table 14 The ultimate bandwidth total reduction MASRA

Sending \(15 MB\) in 490 seconds corresponds to a bit rate of about \(31 KB/s\), a very small bit rate that avoids the bottleneck problem even with a large number of video-sensor nodes in the network. In this case, a capacity of \(100 MB\)/s could serve more than 2,000 sensor nodes at the same time.

The PPSS approach [12] also reduces the bandwidth usage, but thanks to the similarity function presented in our paper, the bandwidth reduction is better by 5 percentage points, from \(90\%\) for PPSS to \(95\%\) for the MASRA algorithm, as mentioned in Tables 14 and 15.

Table 15 The ultimate bandwidth total reduction PPSS

5.2 The coordinator level

The SSA algorithm is implemented on the coordinator. As seen in Fig. 5, and based on the angle and position conditions, only video-sensor nodes \(S1\) and \(S2\) satisfy the geometric conditions of the overlapping method, so their frames can be compared at the coordinator level by the SSA algorithm. The coordinator sends the frame of the more critical video-sensor node to the sink, with respect to a predefined similarity threshold \(\beta _{sim}\). In our experiments, \(S1\) and \(S2\) send 938 and 968 frames to the coordinator respectively, which makes \(S2\) the more critical node. In this case, when a comparison takes place between two frames, if the similarity exceeds the predefined \(\beta _{sim}\), the frame from \(S2\) is sent to the sink and the one from \(S1\) is rejected; otherwise both frames are sent to the sink.

In our experiments, the coordinator receives a total of 1906 frames from \(S1\) and \(S2\) combined. By varying the threshold \(\beta _{sim}\) from \(50\%\) to \(80\%\), the number of frames sent to the sink changes: the number of sent frames is proportional to \(\beta _{sim}\). Table 16 summarizes the coordinator behavior by showing the reduction percentages, which decrease from 48% for \(\beta _{sim} = 50\%\) to zero for \(\beta _{sim} = 80\%\). For \(\beta _{sim} = 50\%\), the 48% reduction in the number of frames sent from the coordinator to the sink, added to the \(90\%\) reduction at the sensor node level, increases the lifetime of the network by reducing the number of frames and the bandwidth usage on both levels.

Table 16 The coordinator behavior

As for PPSS, every frame received by the coordinator is sent to the sink node, disregarding the correlation between sensor nodes and the similarity of their frames.

As shown in Tables 16 and 17, our algorithm at the coordinator level further reduces the number of frames sent to the sink by more than \(32\%\) when \(\beta _{sim}<60\%\) for the correlated sensor nodes.

Table 17 The Coordinator Behavior PPSS

6 Energy consumption study

In this section, our energy consumption comparison is based on the energy model proposed in [14]. The consumed energy, as in [14], is divided into two parts: the radio energy for transmitting the data over the radio, and the computational energy for in-node processing, as shown in the equation below:

$$ E=E_{radio}+E_{comp} $$
(20)

Table 18 shows the different parameters used to compute the energy consumption, where \(I_{TX}\) and \(I_{RX}\) are the electric currents drawn by the radio for sending and receiving respectively, \(T_{TX}\) and \(T_{RX}\) are the corresponding operating times for one byte, and V is the constant supply voltage throughout the transmission.

$$ E_{radio}(k)=k.I_{TX}.V.T_{TX} + k.I_{RX}.V.T_{RX} $$
(21)

Here, k is the number of bytes sent from a specific sender to a specific receiver. For the computational energy consumption, \(\epsilon_{add}\), \(\epsilon_{mul}\), \(\epsilon_{cmp}\) and \(\epsilon_{sht}\) are the energy costs of the basic operations (addition, multiplication, comparison, shift, etc.); Table 18 gives the required energy for each operation. To compute this energy consumption, we only need to count the number of each basic operation in the algorithm:

$$ E_{comp}=N_{add}\times{\epsilon_{add}}+N_{sht}\times{\epsilon_{sht}}+N_{mul}\times{\epsilon_{mul}}+N_{cmp}\times{\epsilon_{cmp}} $$
(22)
Table 18 Parameters of the energy model

In order to compare both approaches, we calculate the energy consumption of both the processing and the transmission tasks of a wireless sensor node equipped with a CC2420 radio transceiver and an ARM7TDMI microprocessor. Table 18 displays the parameters that are used in the calculations and which are found in the data sheets of the node’s components [14].
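The energy model of Eqs. (20)–(22) can be written directly as a small C++ sketch; the structure and names are ours, and the numerical values of the parameters are those of Table 18 rather than being hard-coded:

```cpp
// Energy model of Eqs. (20)-(22); parameter values come from Table 18.
struct RadioParams {
    double i_tx, i_rx;   // radio supply current when sending / receiving (A)
    double t_tx, t_rx;   // operating time over one byte when sending / receiving (s)
    double v;            // constant supply voltage (V)
};

struct OpEnergies {
    double add, sht, mul, cmp;   // energy per addition, shift, multiplication, comparison (J)
};

// Eq. (21): radio energy for k bytes.
double e_radio(double k_bytes, const RadioParams& p) {
    return k_bytes * p.i_tx * p.v * p.t_tx + k_bytes * p.i_rx * p.v * p.t_rx;
}

// Eq. (22): computational energy from the operation counts of the algorithm.
double e_comp(double n_add, double n_sht, double n_mul, double n_cmp,
              const OpEnergies& e) {
    return n_add * e.add + n_sht * e.sht + n_mul * e.mul + n_cmp * e.cmp;
}

// Eq. (20): total energy.
double e_total(double e_radio_val, double e_comp_val) {
    return e_radio_val + e_comp_val;
}
```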

6.1 Sensor node level

In our experiments, when running the MASRA algorithm, 9262 frames were sensed and compared using the similarity function. For a \(640\times 480\) frame size, each frame contains 307200 pixels, and each similarity computation takes all the pixels of the frame into account. The MASRA similarity requires 2 additions and 2 multiplications per pixel, plus 1 comparison per frame. We can therefore compute the computational energy \(E_{comp}\) for 9262 similarities as follows:

$$ E_{comp}= 9262\times640\times480\times(2\times\epsilon_{add}+ 2\times\epsilon_{mul})+ 9262\times\epsilon_{cmp} $$
(23)

In this case, \(E_{comp,masra}\)= 49 J.

If we apply PPSS, \(E_{comp,ppss}= 0.1\) J.

Moving on to the transmission phase, with the MASRA algorithm our network transmits 7377 frames = 15 MB, compared to 12501 frames = 29 MB for PPSS. In the MASRA algorithm, the sensors only send the frames to the coordinator; in PPSS, when a sensor node detects a frame, it also sends a message to its neighbors containing information such as the id of the sensor, the position of the intrusion, etc. In addition to the 29 MB of frames sent over the network, the PPSS sensor nodes in our experiments exchange 25600 messages of 4 KB each, which adds 100 MB of received data.

$$ E_{radio,masra}= 15\times1024\times1024\times17.4\times10^{-3}\times3.3\times3.2\times{10^{-5}}= 28.9 J $$
(24)
$$\begin{array}{@{}rcl@{}} E_{radio,ppss} &=& 29\times1024\times1024\times17.4\times10^{-3}\times3.3\times3.2\times{10^{-5}}\\ &&+ 100\times1024\times1024\times19.7\times10^{-3}\times3.3\times3.2\times{10^{-5}}\\ &=& 276.13 J \end{array} $$
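As a sanity check of Eq. (24), the short usage example below (building on the sketch above) plugs in the 15 MB transmitted by MASRA and the transmission-side parameters used in that equation, and reproduces the reported value; as in Eq. (24), the receive term is omitted.

```cpp
#include <cstdio>

int main() {
    // Values as used in Eq. (24): 15 MB sent, I_TX = 17.4 mA,
    // V = 3.3 V, T_TX = 3.2e-5 s per byte (receive side omitted).
    double k_bytes = 15.0 * 1024 * 1024;
    double e_radio_masra = k_bytes * 17.4e-3 * 3.3 * 3.2e-5;
    std::printf("E_radio,masra = %.1f J\n", e_radio_masra);  // prints 28.9 J
    return 0;
}
```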

To compute the total energy consumed by the network at the sensor node level, the computation and transmission energies are added:

$$ E_{masra} = E_{comp,masra}+E_{radio,masra} = 49 + 28.9 = 77.9 J $$
(25)
$$ E_{ppss}=E_{comp,ppss}+E_{radio,ppss}= 0.1 + 276.13 = 276.23 J $$
(26)

As shown in Table 19, compared with PPSS, our algorithm consumes more energy at the computational level but saves much more energy at the transmission level. Figure 8 compares both approaches in terms of energy consumption over time on the overall network, considering an initial energy of 500 J for the network. The gain of our approach is positive: as shown in Fig. 8, PPSS consumes more energy than our approach in our experiments.

Table 19 Energy consumption comparison for MASRA and PPSS
Fig. 8: Energy consumption comparison for MASRA and PPSS

7 Conclusion

In this paper, we introduced an adaptive frame rate algorithm with a similarity detection function for wireless video sensor nodes, together with a Shot Selection algorithm implemented at the coordinator level. The proposed work allows a dynamic frame rate control of each video-sensor node. The conducted experiments show that the proposed algorithms did not miss any event in the recorded video sequence, while sending the minimum required frames to the sink node thanks to the similarity detection function applied at the sensor node and coordinator levels. The selected frames are transmitted by the sensor nodes to the coordinator and by the coordinator to the sink without missing any required information. The results show that the size of the transmitted data in each period is reduced and the energy consumption is decreased, thus preventing any bottleneck problem related to the bandwidth limitation. Comparing our approach with the PPSS algorithm in terms of data reduction and energy consumption shows that our algorithm outperforms PPSS and reduces the amount of data by more than \(40\%\) compared to PPSS; moreover, PPSS consumes about 4 times more energy than our approach at the sensor node level. As future work, we first plan to carry out experiments on real sensor nodes. Then, after examining the amount of energy needed for the processing, we aim to extend this work with a study that further reduces the computational energy consumption at the sensor node level.