
1 Introduction

Based on various sources [14], an “event” is defined as an important phenomenon that occurs or may have occurred. Consequently, event identification is the procedure through which the respective phenomenon is accurately and reliably identified and recorded. An event is also a specific case that stands out from an otherwise normal situation, exhibiting data patterns that differ from what is expected for a particular scenario. This chapter focuses on such procedures and algorithms specifically designed for or applied to wireless sensor networks (WSNs). Event identification in WSNs is a rapidly evolving research area attracting active interest from both the research and the industrial domains [2, 59]. The former is mainly due to its challenges and restrictions, while the latter can be attributed to the fact that respective implementations are expected to drastically enhance the practical applicability, usefulness, and widespread adoption of WSNs while at the same time mitigating notorious shortcomings of such networks. This is also indicated by the increasing number of prestigious publications and projects focusing on this objective. In this context the chapter’s main objective is to offer a comprehensive survey and classification of the different approaches, techniques, and methodologies currently comprising the state of the art in event detection targeting WSNs.

Respective functionality is of cornerstone importance in a wide range of applications, from medical, environmental, and mechanical ones to virtually any practical scenario [12, 15]. In such scenarios the objective of event detection algorithms is not to acquire complete knowledge of the particular application but to identify the occurrence, or the possible occurrence, of a type or set of types of events. The initial, though simplistic, idea of event detection assumes the existence of a specific threshold value; an event is defined by the deviation of a particular measurement from that threshold. Temperature is a characteristic example, since exceeding a specific level (e.g., 40 °C) can be categorized as an event. However, even in simple application scenarios such approaches prove inadequate for capturing complex events that depend on multiple inputs. To address this deficiency, a new trend in event detection algorithms has developed that uses pattern recognition techniques, since all events can be represented as specific patterns.

The presentation, analysis, and classification of the respective techniques are based on a particular set of characteristics that effectively distinguish each approach, revealing the relative advantages and disadvantages that advocate the use of each one in specific application scenarios.

2 Wireless Sensor Network Characteristics

Design and implementation of event detection algorithms for WSNs have to tackle significant challenges due to the limitations and restrictions posed by such networks. Such limitations mainly stem from scarce resource availability in all aspects of a typical WSN node. Without a doubt the most important resource shortage is energy availability. Aimed to be small, low cost, low complexity, wearable, and effectively expendable, a typical node is powered by very small (usually rechargeable) batteries with typical capacity ranging from 450 mAh up to ~3000 mAh (corresponding to two AA batteries). Thus power conservation is probably the most important objective of related developments.

Consequently, adequate WSN event detection algorithms must be energy efficient and fault tolerant, yet accurate and reliable. Furthermore, an important prerequisite concerns the conservation of computational and communication resources combined with high configurability and adjustability to a wide range of events. In conventional approaches, the role of WSN nodes was limited to conveying and aggregating raw data to a resource-rich central entity (typically referred to as Base Station or Gateway), which was solely responsible for further data analysis and event identification. However, the WSN paradigm brings forward specific characteristics such as multi-hop data transfer, dynamic topologies, and low bandwidth availability. Such characteristics in combination with centralized approaches can lead to performance shortcomings such as increased event identification delay, unpredictable link breakage, and data packet congestion, drastically degrading event detection capabilities. By contrast, contemporary approaches lean towards event identification inside the WSN through cooperative distributed data processing by an adequate subset of WSN nodes. In this way data are analyzed much faster, event identification delay can be significantly reduced, and actuators residing in the network can react to the identified events faster, more accurately, and more reliably.

Another critical WSN characteristic influencing event detection algorithm design relates to the data acquisition approach utilized. In that respect three main approaches are typically encountered.

  • Continuous Data Streaming: In this case data acquired by the sensors are conveyed to the central station periodically without any processing or filtering. Although easy to implement, it is considered an ineffective method when large data volumes are aggregated or when data are to be transferred over complex multi-hop paths. Additionally, such a data acquisition approach typically entails a centralized data processing architecture, often leading to underperformance due to excessive time delay, low data transfer reliability, and high network congestion. Last but not least, such an approach frequently leads to increased energy consumption, a critical disadvantage for WSNs.

  • Query driven: In this case network users effectively insert a query into the network, and the nodes that can actually provide a response transfer the required data through the network. On one hand, this approach is much more efficient than continuously streaming data with respect to resource conservation, particularly regarding energy and bandwidth consumption. On the other hand, it poses the critical requirement of supporting only a priori known requests, which in many cases contradicts the dynamic nature of WSNs.

  • Event driven: In this case, data are transferred when specific conditions are met (e.g., when predefined thresholds are surpassed), which effectively correspond to an “event.” On one hand, such an approach is even more efficient in terms of resource consumption while, on the other hand, it allows the creation of complex application scenarios based on fuzzy or complicated data pattern formation.
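The difference in traffic between continuous streaming and event-driven reporting can be illustrated with a minimal Python sketch. The function names and the sample readings are assumptions made for illustration, and the 40 °C threshold simply echoes the temperature example from the Introduction; a real node would also have to deal with radio duty cycling and retransmissions.

```python
from typing import Iterable, List, Tuple

def continuous_streaming(samples: Iterable[float]) -> List[Tuple[int, float]]:
    """Forward every sample to the base station, unfiltered."""
    return [(t, value) for t, value in enumerate(samples)]

def event_driven(samples: Iterable[float], threshold: float) -> List[Tuple[int, float]]:
    """Forward a sample only when it exceeds the (assumed) threshold."""
    return [(t, value) for t, value in enumerate(samples) if value > threshold]

readings = [21.5, 22.0, 41.2, 43.0, 23.1, 22.8]  # hypothetical temperatures in °C
streamed = continuous_streaming(readings)
reported = event_driven(readings, threshold=40.0)
print(len(streamed), "messages when streaming,", len(reported), "when event driven")
```

With these hypothetical readings only the two samples above the threshold are transmitted in the event-driven case, which is where the resource savings come from.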

Besides the data acquisition technique, an important driving force of the event detection approach in WSNs relates to the network entities where processing takes place.

  • Base Station: It can be considered a typical solution but a rather inefficient one for today’s demanding application scenarios. A Base Station is usually a resource-rich WSN node effectively operating as the interface between the WSN and the rest of the world (e.g., an IP network or the internet). Respective hardware offers abundant resources but is also characterized by significant disadvantages, such as representing a single point of failure. It can also fail to address strict application requirements in complex network topologies including unreliable links, multi-hop communication paths, and highly mobile WSN nodes.

  • In network: In this case raw data are being processed by the nodes comprising the WSN network. Therefore, performance characteristics are significantly enhanced since there is no single point in the network where data must flow in order for a decision to be made. Designs following this approach can be further divided into two categories.

    • Local: Where a single node can perform the data processing of data acquired.

    • Distributed: Where a group of nodes (typically schematically correlated) are involved in the data processing. Therefore, data exchange amongst nodes is required in this case.

In conclusion, Base Station data processing is usually combined with the continuous data streaming acquisition approach. Query-driven and event-driven techniques, although also applicable in Base Station-based algorithms, are increasingly utilized in the design and implementation of algorithms following the in-network data processing paradigm.

3 Event Detection Challenges in WSNs

Event detection is of paramount importance in WSNs since it allows the efficient management of emergencies and critical situations arising from the occurrence of specific events. In most cases the identification of an event is a time-constrained functionality. Respective constraints are posed by the specific application and the criticality of the event.

  • Time critical scenarios: In such cases a specific deadline is defined before which the event must be accurately identified. Missing the deadline may lead to degraded performance (usually associated with soft real-time applications) or complete application failure (usually associated with hard real-time applications). The latter case is much more important since respective failure can endanger property or even human lives.

  • Best effort scenarios: In such cases there is no strict deadline to meet, and the event detection algorithm performs to the best of its ability depending on various parameters.

Furthermore, WSN characteristics also pose critical challenges regarding in-network event detection algorithm design and development. The most important such challenges are as follows:

  • Data unreliability: It is widely known that WSN data are unreliable and prone to errors. Reasons for such unreliability include (a) the effect of environmental conditions, since WSN nodes can be exposed to a wide range of conditions for extended periods of time, (b) the effect of unreliable and/or fluctuating power levels, (c) the inherently error-prone wireless communication medium, and (d) the low-cost, low-complexity hardware components usually utilized in WSN platforms. Environmental influence on acquired data is unavoidable since, by definition, WSN sensors are meant to be in direct contact with the monitored modality, while one of the most important advantages of a WSN is that it supports deployment in harsh and hazardous environments. Small batteries are the main source of energy in WSNs. Therefore, nodes are subject to phenomena affecting data accuracy, such as leakage and/or variation in the power supplied to specific components as the battery is depleted. The poor quality of wireless communication is also well known, due to signal propagation phenomena as well as the ad-hoc communication paradigm typically utilized in WSNs. Finally, in order for WSNs to be deployed in vast numbers it is imperative to minimize cost to the level that a single node can be considered expendable. Therefore, respective implementations tend to rely on low-cost and thus low-quality hardware. A common approach able to tackle all the aforementioned shortcomings is node redundancy, so that specific failures can be compensated for by the rest of the nodes.

  • Data Volume: When event detection takes place in specific central nodes (e.g., the base station), excessive data volume is a critical challenge since all nodes send their data to the central processing node. On the other hand, when event detection is done in network, the data flow is significantly reduced; however, addressing critical situations demands inter-node collaboration, which makes event detection deployment another challenge to tackle.

  • Complex event patterns: By definition “events” are patterns of acquired modalities that differ from what is considered a normal pattern. However, such patterns may vary from very simple and well defined to very complex or even unknown. In such cases data modeling, pattern recognition, and categorization techniques are required to extract the patterns, and these techniques are quite challenging to develop and to execute efficiently in a computationally scarce environment such as the one typically encountered in WSN deployments.

  • Network Heterogeneity and Dynamicity: WSNs are characterized by a high degree of heterogeneity since they are composed of nodes with diverse capabilities, resources, communication techniques, and data acquisition characteristics. Furthermore, decentralized operation, unpredictable node mobility, energy depletion, and varying communication conditions result in an extremely dynamic topology and volatile network characteristics. Consequently, event detection techniques must be able to efficiently handle and adjust to such conditions.

  • Adjustability: Due to the versatile functionality of WSNs, respective event detection algorithms must also be characterized by a significant degree of adjustability. In that respect it is expected that a specific event detection technique can be utilized to identify more than one event. Adjustability implies that the technique must take into consideration the setup and the deployment of the network. A prominent approach is to adopt an algorithm that can be easily adjusted by, e.g., appropriate weight selection.

4 Data Fusion Categorization

Probably the most basic and fundamental differentiation concerns single modality and multi-modality events [16]. The former concerns events whose identification is based on a single characteristic or type of measurement, which is referred to as a modality. In the latter category the event identification is based on concurrently combining multiple inputs (e.g., sensors for WSNs). To effectively utilize a plethora of different acquired signal types (i.e., modalities), new sophisticated algorithms are required. Although the concept of multimodal data processing was initially exploited for military and robotic applications, it is now utilized in all types of WSN applications. It is worth noting that in this research domain terms like data fusion, sensor fusion, multimodal data fusion, multimodal sensor fusion, information fusion, etc. tend to be used interchangeably with little actual difference. The term multimodal data fusion refers to the exploitation of multiple, different, and diverse wireless sensors through a sophisticated process combining elementary features like association, correlation, and combination of data so as to derive a refined event detection decision.

The fundamental goal of multimodal data fusion is the detailed description of the phenomenon through the acquired data and the efficient utilization of this representation so as to increase event detection accuracy. Here lies the main point of superiority over single-modality approaches. Single modality techniques offer only partial and incomplete system representations. Therefore, respective developments lead to uncertain conclusions which omit critical insight into the specific system, leading to increased probability of erroneous indications. Furthermore, relying only on a single modality, respective solutions inherently suffer from a single point of failure, which can be caused by the sensor itself, the communication channel, or any random and unpredictable event.

4.1 Data Fusion Algorithmic Approaches

Regarding the core algorithms’ characteristics state-of-the-art approaches can be classified as follows [17]:

  • Competitive approaches [18, 19]: According to this approach the algorithm combines the same modality measurements so as to minimize respective deviations due to hardware, communication, and other sources of failure. Through such approaches it is possible to identify potential problems and estimate which measurement is more accurate in each particular case.

  • Complementary approaches [20, 21]: In this case, data fusion is exploited to combine partial data in order to indirectly formulate a complete and accurate model of a system or phenomenon under investigation. The sensors do not directly depend on each other, but can be combined in order to give a more complete representation of the phenomenon under observation. This resolves the incompleteness of sensor data. An example of a complementary configuration is the employment of multiple cameras, each observing disjoint parts of a room. Generally, fusing complementary data is easy, since the data from independent sensors can be appended to each other.

  • Cooperative approaches [16, 22]: In this case data fusion combines data from multiple sensors amongst which a direct correlation exists so as to derive a higher level conclusion, decision, or indication. An example of a cooperative sensor configuration is stereoscopic vision: by combining two-dimensional images from two cameras at slightly different viewpoints, a three-dimensional image of the observed scene is derived.

In summary, competitive data fusion increases data reliability and data confidence, whereas complementary and cooperative data fusion techniques tend to lead to measurements at a higher level of abstraction and thus to more reliable conclusions. In any case, a combination of more than one approach can also be considered.
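As an illustration of the competitive case, the following minimal sketch fuses redundant readings of the same modality with a median and reports which sensors deviate from the fused value. The median, the tolerance parameter, and the node names are assumptions chosen for the example; the cited works may use entirely different estimators.

```python
from statistics import median
from typing import Dict, List, Tuple

def competitive_fusion(readings: Dict[str, float], tolerance: float) -> Tuple[float, List[str]]:
    """Fuse same-modality readings; return (fused value, suspect sensor ids)."""
    fused = median(readings.values())
    # A sensor is suspect when it disagrees with the fused value by more than `tolerance`.
    suspects = sorted(s for s, v in readings.items() if abs(v - fused) > tolerance)
    return fused, suspects

fused_value, suspect_nodes = competitive_fusion(
    {"node_a": 24.9, "node_b": 25.1, "node_c": 25.0, "node_d": 61.3},  # hypothetical °C
    tolerance=2.0,
)
print(fused_value, suspect_nodes)
```

The median is used here because a single faulty node (such as the hypothetical `node_d`) cannot drag the fused value far from the readings of the healthy majority.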

4.2 Levels of Data Fusion

Depending on the level of data that is actually fused, three different categories can be identified [23]. It is noted that these categories effectively represent abstraction layers typically utilized at application level. Therefore, more layers can be identified with respect to specific application scenarios, while combinations of different categories can also be envisioned and exploited.

  • Direct level fusion: This category includes algorithms and techniques that fuse raw data together so as to derive a decision.

  • Feature level fusion: In this case features extracted from acquired data (usually in the form of vectors) comprise the entities that are actually fused and combined so as to offer higher abstraction knowledge and relative decisions.

  • Decision level fusion: Finally, different and diverse decisions extracted from initial algorithms can be fed into a data fusion algorithm leading to even more abstract indications and conclusions.
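The three levels can be contrasted with a small sketch. Everything in it is an illustrative assumption: pairwise averaging stands in for direct fusion, a (mean, peak) vector stands in for extracted features, and a strict majority vote stands in for decision fusion.

```python
from statistics import mean
from typing import List, Sequence

def direct_level(raw_a: Sequence[float], raw_b: Sequence[float]) -> List[float]:
    """Direct level: fuse raw samples of the same modality by pairwise averaging."""
    return [(a + b) / 2 for a, b in zip(raw_a, raw_b)]

def features(raw: Sequence[float]) -> List[float]:
    """Extract a small feature vector (mean and peak) from raw samples."""
    return [mean(raw), max(raw)]

def feature_level(raw_a: Sequence[float], raw_b: Sequence[float]) -> List[float]:
    """Feature level: fuse by concatenating the per-sensor feature vectors."""
    return features(raw_a) + features(raw_b)

def decision_level(decisions: Sequence[bool]) -> bool:
    """Decision level: fuse local yes/no decisions by strict majority vote."""
    return sum(decisions) > len(decisions) / 2

temp_a = [30.0, 32.0, 46.0]  # hypothetical readings from two sensors
temp_b = [31.0, 33.0, 44.0]
print(direct_level(temp_a, temp_b))
print(feature_level(temp_a, temp_b))
print(decision_level([True, True, False]))
```

Moving from the first to the last function, less data has to be exchanged between nodes, at the cost of discarding detail before fusion takes place.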

5 Classification of Event Detection Approaches

We choose to classify state-of-the-art approaches with respect to their underlying core techniques. Following an extensive analysis of the related literature, the authors followed an elicitation process to deduce the most suitable approaches for WSNs, mainly based on [2, 5, 24, 25]. As indicated in Fig. 10.1, current approaches can be initially categorized as follows:

Fig. 10.1 Taxonomy of event detection techniques

  • Pattern Matching

  • Model Based

  • Artificial Intelligence and Machine Learning Based

Pattern matching approaches can be further classified as signature matching and prototype matching.

Model-based approaches, on the other hand, are divided into Arithmetic Models, Statistical and Probabilistic Methods, and Map-Based approaches. AI-based approaches, probably the most active category, can be further divided into two subcategories, i.e., supervised and unsupervised learning. Each of these subcategories leads to specific approaches, as indicated in Fig. 10.1, comprising the most sophisticated and prominent of the state-of-the-art approaches.

5.1 Model-Based Approaches

In the context of this category respective techniques aim to model an event in a specific form such as mathematical formulas or maps. Respective implementations reveal the following properties:

  1. They are able to handle complex systems since they represent a, mainly, non-linear event model.

  2. They are typically dependent upon the specific application and thus lack the flexibility to be applied in other domains.

  3. An expert is required to accurately configure the model parameters.

  4. They tend to be computationally intensive due to the complexity of the respective models.

5.1.1 Arithmetic Model-Based Approaches

Arithmetic model-based techniques utilize discrete or continuous mathematical models in order to model events and identify them according to the degree to which acquired data converge to the predefined models. Bager et al. [26] propose a voting graph neuron (VGN) algorithm for event detection in distributed wide-scale WSNs. The VGN algorithm is based on a distributed cooperative idea of problem solving according to which the problem of interest is effectively segmented into smaller parts. In this context the event patterns are stored in a distributed graph in the network. Therefore, events are detected by matching the data of each WSN node with a subset of the graph. The particular proposal has been evaluated in a Matlab-based simulation. Zhang et al. [27], on the other hand, propose two new arithmetic model-based techniques aiming to locally identify events in a WSN node. These techniques are based on if-then-else arithmetic rules which correlate real data against models so as to quantify the respective convergence. Validation is done in the widely used WSN simulator Castalia, detecting overheating events and comparing performance against analogous threshold-based implementations.

5.1.2 Map-Based Approaches

The main idea upon which respective techniques are based is that maps comprise a way to accurately represent the physical world and physical phenomena in space and time. Thus a map uniquely corresponds to an event and adopting an adequate process it can actually assist in identifying events that occur or have occurred in a specific environment.

Specifically, Khelil et al. [28] present a distributed event detection algorithm based on a Map-based world model (MWM), in which each sensor node forms a map of its immediate surroundings and sends the resulting model to the sink node, targeting environmental event detection. WSNs, on the one hand, are inherently embedded in the real world, the goal being to detect specific spatial and temporal ambient characteristics of the physical world, such as temperature, air pressure, gas presence, and oxygen density. On the other hand, maps present a powerful and efficient tool to model the spatial and temporal behavior of the physical world, being an intuitive aggregated view of it. As stated before, the main objective behind the deployment of WSNs is to create an appropriate model of the physical world. Therefore, without loss of generality, the authors model the world as a stack of maps presenting the spatial and temporal distributions of the sensed attributes of interest in the physical world. The authors argue that the specific techniques are indeed able to identify events rapidly and accurately. Li et al. [29] propose a distributed event detection approach based on the creation of a 3D map and an aggregation method. The authors argue that nodes reside in a three-dimensional environment and are therefore able to model this environment as a cubic map cell. Such cubic cell maps can be aggregated in a cluster head or a sink node so as to form an extended cubic cell map containing all environments monitored. In the final stage of the event detection process, the map of the entire environment represents the event as well as the event location.
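The aggregation of per-node cell maps into one map of the environment can be sketched as follows. This is not the algorithm of [29]: the cell indexing, the averaging of overlapping cells, and the threshold used to locate the event are all assumptions made to keep the example small.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int, int]  # (x, y, z) index of a cubic map cell (assumed layout)

def aggregate_maps(local_maps: List[Dict[Cell, float]]) -> Dict[Cell, float]:
    """Merge per-node cell maps; overlapping cells keep the mean of the reported values."""
    collected: Dict[Cell, List[float]] = {}
    for local in local_maps:
        for cell, value in local.items():
            collected.setdefault(cell, []).append(value)
    return {cell: sum(vals) / len(vals) for cell, vals in collected.items()}

def event_cells(global_map: Dict[Cell, float], threshold: float) -> List[Cell]:
    """Return the cells whose aggregated value exceeds the threshold."""
    return sorted(cell for cell, value in global_map.items() if value > threshold)

node_1 = {(0, 0, 0): 22.0, (1, 0, 0): 48.0}  # hypothetical temperatures per cell
node_2 = {(1, 0, 0): 52.0, (1, 1, 0): 23.0}
world = aggregate_maps([node_1, node_2])
hot = event_cells(world, threshold=40.0)
print(world, hot)
```

In this toy setting the aggregated map yields both the event (a hot cell) and its location (the cell index), mirroring the final stage described above.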

5.1.3 Probabilistic/Statistical Model-Based Approaches

These approaches analyze the distribution of data or other statistical metrics so as to form the models and then identify the events. Statistics is the traditional field that deals with the quantification, collection, analysis, interpretation, and drawing conclusions from data. It is concerned with probabilistic models, and specifically inference on these models using data. Statistical techniques are driven by the data and are used to discover patterns and build predictive models. Statistical approaches are generally characterized by having an explicit underlying probability model, which provides a probability of being in each class rather than simply a classification. In addition, it is usually assumed that the techniques will be used by statisticians, and hence some human intervention is assumed with regard to variable selection and transformation, and overall structuring of the problem [35–37].

Static threshold event detection is by far the simplest and most computationally straightforward method of statistical event detection. Event detections are reported when the monitored parameter exceeds a predetermined threshold value, and the detection condition persists as long as the parameter value exceeds the threshold set point. Once the parameter falls back within acceptable bounds, the detection condition is cleared. Threshold values may be determined from historical parameter values (for example, from similar sensors and systems), engineering estimates, or parametric analysis. The static threshold method exhibits a “memoryless” property from one observation to the next, as the current detection condition is independent of all prior observations. However, observed values are (usually) dependent upon prior observed values, and one would not reasonably expect the observed values to change radically in the short period of time between successive observations. Many references describe the benefits and utility of static threshold event detection methods, and Kerman et al. [30] baseline the results of the static threshold method against a composite event detection method.
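A static threshold detector fits in a few lines, which makes its memoryless nature easy to see. The function name, the sample values, and the 40.0 threshold below are assumptions for illustration only.

```python
from typing import Iterable, List

def static_threshold(samples: Iterable[float], threshold: float) -> List[bool]:
    """Memoryless detector: each flag depends only on the current sample."""
    return [value > threshold for value in samples]

flags = static_threshold([38.5, 39.9, 40.3, 41.0, 39.2], threshold=40.0)
# The detection condition holds for samples 2 and 3 and no longer holds at sample 4.
print(flags)
```

Because no state is carried between samples, a single noisy reading just above the threshold raises a detection on its own, which is one motivation for the smoothing and composite schemes discussed in this section.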

Techniques falling into this category are probably the most straightforward, being based on simple if-then-else rules that check whether, and to what degree, measurements acquired in real time deviate from predefined threshold levels. Respective developments tend to exhibit the following characteristics: (1) they are very easy to implement since they effectively consist of a set of if-then-else rules, (2) they usually require specialized and in-depth knowledge of the system or phenomenon to adequately configure the threshold values, and (3) they tend to be inaccurate in complex scenarios since they model events as linear functions. Vu et al. [31] propose a threshold-based technique for complex event detection, which decomposes complex events into a set of sub-events of lower complexity. An event is then identified if all sub-events are identified (having occurred either concurrently or sequentially). For example, an explosion event is decomposed into two sub-events: (1) the occurrence of a loud noise and (2) the increase of heat. Consequently the event of an explosion is identified if both aforementioned sub-events occur concurrently. The particular study offers the possibility of distributed processing when applied in wide-scale networks and is evaluated in a simulation environment. A distributed threshold-based event detection scheme for heterogeneous WSNs is presented in [32]. In this case the authors propose a two-layer clustering approach. Specifically one layer acts as parent nodes and the final node acts as the sink node. The study proposes a complete framework for both data collection and event identification in the context of the COMis middleware. Another related event detection approach, characterized as a consensus-based technique, is presented in [33], aiming to assist in volcano monitoring and respective event identification. The authors suggest a complete framework enabling accurate volcano activity monitoring, applied on the well-known WSN platform described at http://www.willow.co.uk/html/telosb_mote_platform.php.
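The sub-event decomposition described for [31] can be sketched with two thresholded modalities and a time window. The window semantics (all sub-events seen within `window` steps up to the current time, with `window=0` meaning strictly concurrent), the thresholds, and the sample values are assumptions for illustration and are not taken from the cited study.

```python
from typing import Dict, List

def sub_event_times(samples: List[float], threshold: float) -> List[int]:
    """Time indices at which a single-modality sub-event is detected."""
    return [t for t, value in enumerate(samples) if value > threshold]

def composite_event(sub_events: Dict[str, List[int]], window: int) -> List[int]:
    """Times t at which every sub-event occurred within the interval [t - window, t]."""
    horizon = max((max(times) for times in sub_events.values() if times), default=-1)
    hits = []
    for t in range(horizon + 1):
        if all(any(t - window <= s <= t for s in times) for times in sub_events.values()):
            hits.append(t)
    return hits

noise_db = [55.0, 60.0, 118.0, 70.0, 58.0]  # hypothetical sound level samples
heat_c = [25.0, 25.0, 26.0, 80.0, 82.0]     # hypothetical temperature samples
subs = {
    "loud_noise": sub_event_times(noise_db, threshold=100.0),
    "heat_rise": sub_event_times(heat_c, threshold=60.0),
}
print(composite_event(subs, window=1), composite_event(subs, window=0))
```

With a window of one step the explosion is reported at the moment the heat rise follows the noise; with a window of zero the two sub-events never coincide in this data and nothing is reported.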

In [34], a biomedical application of wireless sensor networks is presented under the assumption that a small wireless node with an accelerometer is attached to a human wrist, like a wristwatch. The accelerometer provides instantaneous measurement of acceleration (caused by a person’s movements) that is currently acting on the device. In this configuration, the accelerometer with accompanying algorithms can be used to classify the subject’s movement into one of a few categories. This paper focuses on two threshold-based algorithms, which attempt to identify movements that are potentially harmful or indicative of immediate danger to a patient. The first algorithm seeks to identify rapid shaking movements that usually accompany myoclonic, clonic, and tonic-clonic seizures. Automated, quick seizure detection has the capability of alerting medical personnel when the patient is not physically capable of requesting assistance. The second algorithm is designed to generate an alarm when a patient has sustained an extended period of inactivity, potentially indicative of coma onset or loss of consciousness triggered by an acute brain injury. Like shaking movements, detecting inactive periods also has the potential to alert medical personnel to a problem more expediently than other means. Upon detecting an abnormal event, both algorithms trigger an auditory alarm from the wrist-device and transmit an alarm message (with necessary patient identification) through a ZigBee multi-hop wireless network to a patient monitoring station controlled by medical personnel.
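The two alarms can be approximated by simple counting rules over a window of acceleration magnitudes. The rules, parameter names, and values below are hypothetical and only convey the flavor of a threshold-based design; the actual algorithms and settings of [34] are not reproduced here.

```python
from typing import List

def shaking_alarm(magnitudes: List[float], high: float, min_peaks: int) -> bool:
    """Alarm if at least `min_peaks` samples in the window exceed `high`."""
    return sum(1 for m in magnitudes if m > high) >= min_peaks

def inactivity_alarm(magnitudes: List[float], low: float, min_samples: int) -> bool:
    """Alarm if the most recent `min_samples` samples (min_samples >= 1) all stay below `low`."""
    recent = magnitudes[-min_samples:]
    return len(recent) >= min_samples and all(m < low for m in recent)

# Hypothetical gravity-compensated acceleration magnitudes, in g.
shaking = [0.1, 2.4, 0.2, 2.6, 2.2, 0.3]
still = [0.4, 0.02, 0.01, 0.03, 0.02]
print(shaking_alarm(shaking, high=2.0, min_peaks=3))
print(inactivity_alarm(still, low=0.05, min_samples=4))
```

In a deployment the inactivity window would span a long period rather than four samples, and either alarm would trigger the audible alert and the alarm message described above.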

NED (Noise-Tolerant Event and Event Boundary Detection) [38] represents a scheme able to identify events as well as the thresholds of events focusing on WSN networks. It is based on a moving average algorithm so as to minimize noise and a statistical method for event and event levels identification. A very interesting in-network event detection algorithm based on statistical metrics is presented in [39]. According to this approach the results of the algorithm are conveyed to the sink node (or cluster head node) so that the final evaluation is performed. Authors argue that their proposal is accurate, fault tolerant, and energy efficient through simulation-based evaluation.
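The noise-suppression role of a moving average ahead of a threshold test can be seen in a short sketch. It is a generic illustration rather than the NED scheme itself; the window length, the threshold, and the sample data are assumptions.

```python
from collections import deque
from typing import Iterable, List

def moving_average(samples: Iterable[float], window: int) -> List[float]:
    """Trailing moving average; early outputs average the samples seen so far."""
    buffer: deque = deque(maxlen=window)
    smoothed = []
    for value in samples:
        buffer.append(value)
        smoothed.append(sum(buffer) / len(buffer))
    return smoothed

noisy = [20.0, 20.0, 50.0, 20.0, 20.0, 45.0, 47.0, 46.0]  # one spike, then a real rise
raw_flags = [v > 40.0 for v in noisy]
smooth_flags = [v > 40.0 for v in moving_average(noisy, window=3)]
print(raw_flags)
print(smooth_flags)
```

The isolated spike at index 2 triggers the raw detector but not the smoothed one, while the sustained rise is still detected, at the cost of a delay of about one window length.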

A probabilistic model uses probability theory to model the uncertainty in acquired data. A probabilistic model describes a set of possible probability distributions for a set of observed data, and the goal is to use the observed data to infer the distribution (usually identified by its parameters) in the probabilistic model that best describes the current data.

Probabilistic event detection methods consist of those methods in which the probability of event occurrence and other related probabilities and parameters are computed and assessed based on specific preexisting assumptions, rather than based on computing and testing statistics derived from sample data sets. Ihler et al. [11], for instance, develop a probabilistic framework for unsupervised event detection and learning based on a time-varying Poisson process model that can also account for anomalous events. Their experimental results indicate that the proposed time-varying Poisson model provides a robust and accurate framework to adaptively separate unusual event plumes from normal activity. This model also performs significantly better than a non-probabilistic, threshold-based event detection technique.
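The core idea of judging a count against a time-varying Poisson rate can be sketched as a tail-probability test. This is a deliberate simplification: the rates are assumed to be known in advance, whereas the framework of [11] learns them and models the event process explicitly. The function names and the significance level `alpha` are assumptions.

```python
import math
from typing import List

def poisson_tail(count: int, rate: float) -> float:
    """P(N >= count) for N ~ Poisson(rate)."""
    below = sum(math.exp(-rate) * rate**k / math.factorial(k) for k in range(count))
    return max(0.0, 1.0 - below)

def unusual_counts(counts: List[int], rates: List[float], alpha: float) -> List[int]:
    """Indices of intervals whose count is improbably high under the normal rate."""
    return [i for i, (n, lam) in enumerate(zip(counts, rates))
            if poisson_tail(n, lam) < alpha]

rates = [2.0, 2.0, 10.0, 10.0]  # assumed normal rate for each interval (time-varying)
counts = [3, 9, 12, 25]         # hypothetical observed counts
flagged = unusual_counts(counts, rates, alpha=0.01)
print(flagged)
```

Note that a count of 12 is unremarkable when the normal rate is 10 but a count of 9 is flagged when the normal rate is 2, which a single fixed threshold on the counts could not express.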

In the context of this approach, probability theory is utilized to model and describe events able to be identified and thus accurately indicate the occurrence of the event. In that context spatio-temporal event detection (STED) [40] comprises a real time in-network event detection scheme able to detect events using a belief propagation technique. Resulting implementation has been evaluated both in the context of real WSN network (utilizing TmoteSky in a small scale network) and in the context of a simulation environment (able to configure a large scale network).

5.2 Pattern Matching-Based Approaches

The main idea of this class is that events form a specific data pattern. Therefore, an event can be identified if the pattern of the data acquired in real time matches the event pattern. To achieve their objective, respective techniques depend upon specific equations which are able to evaluate patterns or tendencies of data and thus decide on the occurrence or not of a specific event. These techniques share the following attributes: (1) they are able to handle complex events while searching for data pattern formation using non-linear equations, (2) they are easily configurable and adaptable to a wide range of applications, (3) they demand specialized knowledge for the correct configuration of the techniques’ parameters, and (4) they usually lead to computationally intensive implementations due to their complexity.

5.2.1 Prototype Matching Techniques

Prototype matching is a method of pattern recognition that describes the process by which a sensory unit registers a new stimulus and compares it to the prototype, or standard model, of said stimulus. Unlike template matching and feature analysis, prototype matching does not expect an exact match, which allows for a more flexible model. An object is recognized by the sensory unit when a similar prototype is found. A prototype is usually a vector of numbers derived from the sensors' characteristics. Since data acquired during an actual event differ from data acquired during a non-event period, comparing such prototypes with continuously acquired data can indicate the occurrence of a specific event or events. In [1] the authors propose a distributed technique for event detection based on the construction and comparison of such predefined prototypes. In this study, prototypes are constructed during a training phase. Then, after the isolated decisions of the individual nodes are integrated, events are identified (or not) at a wider scale. The implementation has been evaluated on the ScatterWeb MSB-430 WSN platform in a fence monitoring application scenario.
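
The comparison step can be sketched in a few lines of Python. This is a generic nearest-prototype illustration and not the scheme of [1]: the use of the mean training vector as prototype, the Euclidean distance, the acceptance radius max_distance, and the event labels "climb" and "shake" are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_prototype(training_vectors):
    """Training phase: average the feature vectors of one event class."""
    n = len(training_vectors)
    return [sum(column) / n for column in zip(*training_vectors)]


def match_event(sample, prototypes, max_distance):
    """Return the label of the nearest prototype, or None if none is close enough.

    An exact match is not required; any sample within max_distance counts.
    """
    label, distance = min(
        ((name, euclidean(sample, proto)) for name, proto in prototypes.items()),
        key=lambda pair: pair[1],
    )
    return label if distance <= max_distance else None


# Hypothetical two-dimensional feature vectors recorded during training.
prototypes = {
    "climb": build_prototype([[0.9, 0.2], [1.1, 0.4]]),
    "shake": build_prototype([[0.2, 0.9], [0.4, 1.1]]),
}
detected = match_event([0.95, 0.35], prototypes, max_distance=0.5)
background = match_event([5.0, 5.0], prototypes, max_distance=0.5)
```

Here the first sample is reported as a "climb" event, while the second lies far from every prototype and is treated as a non-event.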

5.2.2 Signature Matching Techniques

Signatures are created by converting data into a feature domain. Once signatures have been created for specific events, acquired data corresponding to event occurrences can be reliably distinguished from the rest of the aggregated data. The difference between prototype and signature matching approaches mainly concerns the methodology used to transform data from one domain to another, such as the frequency or symbolic domain.

In [10] the authors suggest a local event detection scheme that uses principal component analysis (PCA) to extract the event signature. They also use a threshold to differentiate event data from non-event data. The implementation is evaluated on MicaZ nodes. From another perspective, Zoumboulakis et al. [41] define complex events as sets of data points that effectively define a pattern, and identify events using symbolic aggregate approximation (SAX). The idea is to transform data into a symbolic domain and then, based on a minimum distance estimate between previously acquired data patterns and data acquired in real time, to estimate whether an event has occurred. The implementation concerns local detection and is evaluated in a Matlab simulation environment. Another interesting approach is presented by Martincic et al. [42], who segment the whole network into cells. Each cell then attempts on its own to identify event occurrences by comparing the signatures it acquires to prerecorded event signatures. Signatures in this case are two-dimensional matrices containing specific values that characterize the event. Consequently, event identification is essentially the process of comparing these matrices to matrices created from acquired data. In the context of WSNs this approach is evaluated in the TinyOS simulation environment, assuming 2 × 2 and 5 × 5 sensor networks.
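
As an illustration of the symbolic transformation, the sketch below implements the standard SAX steps (z-normalization, piecewise aggregate approximation, and discretization) together with the usual lower-bounding distance between two SAX words. It is a minimal sketch and not the implementation evaluated in [41]: the alphabet size of four, the requirement that the series length be a multiple of the word length, and the sample series are assumptions.

```python
import math
import statistics

# Breakpoints that split the standard normal distribution into four
# equiprobable regions (alphabet size 4).
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"


def sax_word(series, n_segments):
    """Convert a numeric series into a SAX word of length n_segments.

    Assumes len(series) is a multiple of n_segments.
    """
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    # Z-normalize; a constant series maps to all zeros.
    z = [(x - mean) / std if std > 0 else 0.0 for x in series]
    seg_len = len(z) // n_segments
    # Piecewise aggregate approximation: the mean of each segment.
    paa = [statistics.fmean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]
    # Map each segment mean to a symbol by counting the breakpoints below it.
    return "".join(ALPHABET[sum(v > b for b in BREAKPOINTS)] for v in paa)


def symbol_distance(s, t):
    """Distance between two symbols; identical or adjacent symbols count as 0."""
    i, j = sorted((ALPHABET.index(s), ALPHABET.index(t)))
    return 0.0 if j - i <= 1 else BREAKPOINTS[j - 1] - BREAKPOINTS[i]


def min_dist(word_a, word_b, series_length):
    """Lower-bounding distance between two SAX words of equal length."""
    scale = math.sqrt(series_length / len(word_a))
    return scale * math.sqrt(sum(symbol_distance(s, t) ** 2 for s, t in zip(word_a, word_b)))


# Hypothetical stored event pattern (a step up) and a noisy real-time window.
event_pattern = sax_word([0, 0, 0, 0, 10, 10, 10, 10], 2)
window = sax_word([1, 0, 1, 0, 9, 10, 9, 10], 2)
is_event = min_dist(event_pattern, window, 8) == 0.0
```

Both series map to the same word, so the noisy window is matched to the stored event pattern. In practice a small non-zero distance threshold would normally replace the exact comparison with zero.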

5.3 Artificial Intelligence and Machine Learning-Based Approaches

As the title indicates, these techniques use AI methodologies, and machine learning in particular, as the core foundation of event detection. The respective proposals can be further classified into supervised and unsupervised categories. The former category requires annotated data during a training phase, while the latter does not require or assume any a priori knowledge. It is also worth noting that AI-based event detection algorithms resemble pattern matching-based approaches in their objective of identifying data patterns or tendencies using non-linear equations. An advantage of paramount importance is that AI-based techniques do not require specialized knowledge in order to configure the approach's parameters. Instead, such a scheme is able to configure and calibrate its own parameters through the knowledge extracted from acquired data. Additionally, with respect to WSNs, it must be highlighted that the respective implementations tend to be less computationally intensive than other approaches, such as the statistical model-based ones. Since events tend to follow a specific pattern, learning-based techniques are a very promising category of WSN event detection algorithms because they can continuously increase their knowledge without human intervention. New trends in decision support through accurate event detection therefore increasingly exploit artificial intelligence-based algorithms.

5.3.1 Supervised Learning

As previously mentioned, schemes belonging to this category require annotated data and a well-defined training phase.

5.3.1.1 Fuzzy Logic Techniques

Event-oriented data aggregation (EDA) is a distributed approach to event detection based on fuzzy logic engines, applied on the TelosB WSN platform in an ocean surveillance application scenario. Another distributed scheme based on fuzzy logic is proposed by Xuan et al. in [43]. This approach segments the network into clusters, and each cluster then defines a confidence value that represents the probability of an event occurring in that cluster. A local event detection scheme also based on fuzzy logic is presented in [13, 14]. As an application scenario, the authors evaluate the applicability of fuzzy logic to fire event detection aimed at home residents. A main objective of this approach is to run independently on each WSN node without any requirement for inter-node communication.
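
A local fuzzy decision of this kind can be sketched as follows. The sketch is not taken from any of the cited schemes: the linear membership functions, the single rule, and every numeric limit (temperatures in degrees Celsius, a smoke reading in arbitrary units) are hypothetical values chosen only to show the mechanics.

```python
def ramp_up(x, low, high):
    """Membership that rises linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def fire_confidence(temperature_c, smoke_level):
    """Evaluate one fuzzy rule on a single node, without any communication.

    Rule: IF temperature is high AND smoke is high THEN fire.
    The minimum is used as the fuzzy AND operator.
    """
    temperature_high = ramp_up(temperature_c, 40.0, 70.0)
    smoke_high = ramp_up(smoke_level, 50.0, 150.0)
    return min(temperature_high, smoke_high)


def is_fire(temperature_c, smoke_level, decision_level=0.6):
    """Turn the fuzzy confidence into a crisp local event decision."""
    return fire_confidence(temperature_c, smoke_level) >= decision_level
```

For example, a reading of 55 °C with a saturated smoke sensor yields a confidence of 0.5, which stays below the assumed decision level of 0.6, whereas 80 °C with the same smoke reading yields 1.0 and is reported as a fire event.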

5.3.1.2 Neural Network-Based Techniques

Neural networks consist of layers of interconnected nodes, each node producing a non-linear function of its input. The input to a node may come from other nodes or directly from the input data. Also, some nodes are identified with the output of the network. The complete network therefore represents a very complex set of interdependencies which may incorporate any degree of nonlinearity, allowing very general functions to be modeled. In the simplest networks, the output from one node is fed into another node in such a way as to propagate “messages” through layers of interconnecting nodes. More complex behavior may be modeled by networks in which the final output nodes are connected with earlier nodes, and then the system has the characteristics of a highly non-linear system with feedback. It has been argued that neural networks mirror to a certain extent the behavior of networks of neurons in the brain.
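
The layered propagation described above can be illustrated with a minimal feed-forward pass. This is a generic sketch rather than any network from the cited works: the sigmoid activation, the layer sizes, and the weights are assumptions, and no training is shown.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Each node outputs a non-linear function of a weighted sum of its inputs."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, network):
    """Propagate the input through the layers in order."""
    activations = inputs
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations


# Hypothetical 2-2-1 network: two inputs, two hidden nodes, one output node.
network = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),
    ([[1.2, -0.7]], [0.05]),
]
output = forward([0.6, 0.9], network)
```

The single output value lies strictly between 0 and 1 and could, for instance, be compared against a decision level to report an event.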

Neural network approaches combine the complexity of some of the statistical techniques with the machine learning objective of imitating human intelligence; however, this is done at a more “unconscious” level, and hence there is no accompanying ability to make learned concepts transparent to the user. A distributed event detection approach based on a neural network is presented in [44]. This research effort targets forest fire detection through WSNs, with fire detection taking place both on the sensor and on the cluster head. Specifically, each sensor executes a threshold-based event detection approximation, while the cluster head collects all indications and produces a final decision for the whole forest based on neural networks. The evaluation is simulation based and considers both small- and large-scale networks.

5.3.1.3 Bayesian Techniques

In a Bayesian network, the graph represents the conditional dependencies of different variables in the model. Each node represents a variable, and each directed edge represents a conditional relationship. Essentially, the graphical model is a visualization of the chain rule.
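
As a small worked illustration of this factorization, consider a hypothetical three-variable network in which a Fire variable is the parent of Smoke and Heat, so that P(Fire, Smoke, Heat) = P(Fire) P(Smoke | Fire) P(Heat | Fire). The structure and all probabilities below are invented for the example and do not come from any cited work.

```python
# Hypothetical prior and conditional probability tables.
P_FIRE = 0.01
P_SMOKE_GIVEN_FIRE = {True: 0.9, False: 0.05}
P_HEAT_GIVEN_FIRE = {True: 0.8, False: 0.1}


def joint(fire, smoke, heat):
    """Joint probability from the chain-rule factorization of the graph."""
    p_fire = P_FIRE if fire else 1.0 - P_FIRE
    p_smoke = P_SMOKE_GIVEN_FIRE[fire] if smoke else 1.0 - P_SMOKE_GIVEN_FIRE[fire]
    p_heat = P_HEAT_GIVEN_FIRE[fire] if heat else 1.0 - P_HEAT_GIVEN_FIRE[fire]
    return p_fire * p_smoke * p_heat


def posterior_fire(smoke, heat):
    """P(Fire = true | Smoke, Heat), obtained by enumerating both values of Fire."""
    with_fire = joint(True, smoke, heat)
    without_fire = joint(False, smoke, heat)
    return with_fire / (with_fire + without_fire)
```

With these numbers, observing both smoke and heat raises the probability of fire from the prior of 0.01 to roughly 0.59, while observing neither lowers it well below the prior.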

Bayesian networks are used for tasks ranging from inference to prediction to modeling, whereas neural networks are used primarily for prediction. By comparison, in a neural network each node is a simulated “neuron.” The neuron is essentially “on” or “off,” and its activation is determined by a linear combination of the values of each output in the preceding “layer” of the network. With respect to WSNs, a critical advantage of Bayesian networks over artificial neural networks (ANNs) is their surprisingly good performance with very small amounts of training data. On the other hand, ANNs are characterized by significantly higher complexity (in many cases prohibitive for WSNs), which allows them to benefit from huge amounts of data, whereas the performance of Bayesian-based methods tends to plateau beyond a certain amount of data.

Such a technique, based on a distributed version of the Bayes algorithm, is presented in [45]. The authors identify scenarios in which faulty WSN nodes may exhibit specific patterns, and the identification of such a pattern may be correlated with a specific event. The technique is mainly designed for large-scale networks and is evaluated through simulation-based scenarios. A different approach is presented in [46], where event detection is based on merging data using the Bayesian approach, and a Kalman Evaluator is used to evaluate missing data. Data are then identified in a distributed manner following a Bayesian scheme. This approach is also evaluated in a simulation environment.

Finally, in [56] the authors propose an outlier detection scheme based on Bayesian belief networks, which captures the conditional dependencies among the observations of the attributes to detect the outliers in the sensor streamed data.

5.3.1.4 Support Vector Machines

The SVM classification method consists of two main components: a kernel function and a set of support vectors. The support vectors are obtained through training on specific training data. New data are then classified using a simple computation involving only the kernel function and the support vectors. In [47], the authors solve the localization problem under the following modest requirements: (1) beacon nodes exist, (2) a sensor need not communicate directly with a beacon node, and (3) only connectivity information may be used for location estimation (pairwise distance measurements are not required). Requirement (1) improves localization accuracy. Requirement (2) relaxes the strong requirement on the communication range of beacon nodes. Requirement (3) avoids the expensive ranging process, which depends on specialized sensing equipment. All these requirements are reasonable for large networks in which sensor nodes have limited resources. The authors propose LSVM, a novel solution that satisfies the requirements above, offers fast localization, and significantly alleviates the border problem. LSVM is also effective in networks with coverage holes or obstacles. LSVM maps the network using the learning concept of support vector machines (SVMs). With respect to the localization problem, a set of geographical regions is defined in the sensor field and each sensor node is classified into these regions. Its location can then be estimated inside the intersection of the containing regions. The training data are the set of beacons, and the kernel function is defined based on hop counts only.
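
The "simple computation" used to classify new data is the sign of a kernel-weighted sum over the support vectors. The sketch below shows only this classification step for an already trained binary classifier. It is not LSVM: the RBF kernel is a stand-in (LSVM defines its kernel on hop counts), and the support vectors, coefficients, and bias are hypothetical values rather than the result of training.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Radial basis function kernel (an assumed choice for this example)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def decision_value(x, support_vectors, dual_coefs, bias, kernel=rbf_kernel):
    """Weighted kernel sum; dual_coefs[i] is alpha_i times the label y_i."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias


def classify(x, support_vectors, dual_coefs, bias):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if decision_value(x, support_vectors, dual_coefs, bias) >= 0 else -1


# Hypothetical trained model with one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
dual_coefs = [1.0, -1.0]
bias = 0.0

inside_region = classify([0.1, 0.1], support_vectors, dual_coefs, bias)
outside_region = classify([1.9, 2.1], support_vectors, dual_coefs, bias)
```

In a region-based localization setting such as the one described above, one binary classifier of this kind would be evaluated per region, and the node would be placed in the intersection of the regions that classify it as a member.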

5.3.1.5 Decision Trees

According to this classification method, input data traverse the branches of a learning tree [7, 48]. The goal of this process is to compare features of the input data against different conditions until a specific category is reached. DT-based approaches are particularly important for WSNs since they can effectively address the respective challenges. Characteristic features used in WSNs include loss rate, corruption rate, mean time to failure (MTTF), and mean time to restore (MTTR). Finally, a critical requirement of such solutions is that the data be linearly separable, while the process of building optimal learning trees is NP-complete [49].
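
The traversal can be sketched with a small hand-written tree over the features named above. The tree structure, the thresholds, the units, and the category names are hypothetical and serve only to show how a sample is compared against one condition per level until a leaf is reached. No tree-learning step is shown.

```python
# Each internal node tests one feature against a threshold.
# "low" is followed when the value is <= threshold, "high" otherwise.
TREE = {
    "feature": "loss_rate", "threshold": 0.10,
    "low": {
        "feature": "corruption_rate", "threshold": 0.05,
        "low": {
            "feature": "mttf_h", "threshold": 100.0,
            "low": "degraded", "high": "healthy",
        },
        "high": "degraded",
    },
    "high": {
        "feature": "mttr_s", "threshold": 60.0,
        "low": "degraded", "high": "faulty",
    },
}


def classify(sample, node=TREE):
    """Follow one branch per level until a leaf (a category string) is reached."""
    while isinstance(node, dict):
        branch = "low" if sample[node["feature"]] <= node["threshold"] else "high"
        node = node[branch]
    return node


sample = {"loss_rate": 0.02, "corruption_rate": 0.01, "mttf_h": 500.0, "mttr_s": 10.0}
category = classify(sample)
```

Classification costs only one comparison per tree level, which is part of what makes the approach attractive on resource-constrained nodes.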

5.3.2 Unsupervised Learning

Contrary to supervised learning approaches, unsupervised algorithms offer the critical advantage of not requiring annotated data or a training phase based on labeled examples.

5.3.2.1 Fuzzy Adaptive Resonance Theory

Fuzzy adaptive resonance theory (fuzzy ART) is a neural network category that performs clustering. Specifically, it integrates fuzzy logic into an adaptive resonance theory algorithm [50], thus widening the applicability of that algorithm. In [51] such an algorithm is used to detect abnormal events, that is, events that deviate from the common case. In other words, abnormal events can be considered outliers. Hence, the main idea is to cluster the data and to treat the clusters with the smallest population as events.

5.3.2.2 Fixed Width Clustering

In [52] the authors propose a fixed-width clustering technique able to detect intrusion events. Fixed-width clustering groups data into a dynamic number of clusters during a training phase. Then, based on the population of each cluster, a decision is made about which cluster corresponds to an intrusion event. After the training phase, data lying closest to the intrusion cluster are identified as intrusion events. Another technique of this class is presented in [53]. In this case the authors propose that individual nodes group their data and send these groups to a sink (or cluster) node, where the groups are merged together in order to detect anomalies. The implementation is written in C++ and evaluated on data from the Great Duck Island deployment (sensors acquiring temperature, humidity, and atmospheric pressure) for detecting any kind of anomaly.

5.3.2.3 K-Means

K-Means is one of the simplest and most widely used unsupervised learning algorithms for solving the well-known clustering problem. It is a numerical, non-deterministic, and iterative method. K-Means clustering requires the number of clusters K to be specified beforehand. It partitions the data set into K separate groups, and every group (cluster) is determined by the distances between vectors.

The main advantage of K-Means clustering is its speed, as in practice it requires only a few iterations to converge. It is therefore a prominent candidate for running on the sensor node in real time, and it is robust and relatively computationally efficient as long as the value of K is adequately low. Its disadvantages include the need to fix the number of clusters in advance, which can make an appropriate K value difficult to choose, and sensitivity to noisy data and outliers, which influence the calculation of the cluster centers. Additionally, the method cannot handle highly overlapping data because the clusters it produces do not overlap. However, it is possible to overcome most of these limitations in the preprocessing stage. K-Means is also highly sensitive to the initial placement of the cluster centers. Because the initial cluster centers are chosen randomly, different runs on the same input data can yield different final clusters, so the algorithm cannot guarantee a unique clustering result. Moreover, an ill-fitting initial placement of the cluster centers can lead to slow convergence, empty clusters, and a higher chance of getting stuck in bad local minima [54].
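
The iteration described above (assign every vector to its nearest center, then recompute each center as the mean of its cluster) can be sketched as follows. This is a minimal version of the standard Lloyd iteration, not the implementation of any cited work: the random initialization from the data points, the fixed seed, the iteration limit, and the sample points are assumptions.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iterations=20, seed=0):
    """Return (centers, clusters), with clusters taken from the last assignment step."""
    rng = random.Random(seed)
    # Random initialization: pick k distinct data points as the initial centers.
    centers = [list(p) for p in rng.sample(points, k)]
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            [sum(column) / len(cluster) for column in zip(*cluster)] if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, clusters


# Hypothetical two-dimensional readings forming two well-separated groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, clusters = k_means(points, k=2)
```

On well-separated data such as this, the result does not depend on which points are drawn as initial centers. On overlapping or noisy data, changing the seed can change the outcome, which is the sensitivity to initialization noted above.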

In [55] the authors use K-Means to detect leak and burst events, relying on offline techniques that collect loggers' data from multiple locations. We believe that detecting such events in real time with smart sensor nodes could improve monitoring operations and reduce operational costs.

6 Performance and Behavioral Characteristics of Event Detection Techniques

In order to select an appropriate event detection technique for a WSN, various performance and behavioral characteristics must be taken into consideration. In this context, aspects such as the location where the algorithm is executed (i.e., distributed or centralized), time-constrained requirements, scalability, and sensor characteristics (e.g., homogeneous or heterogeneous data) are of paramount importance in the respective selection effort. Existing approaches vary greatly in how they address these issues; the issues therefore serve as objective metrics enabling useful and practical comparison.

6.1 Processing Model of Event Detection

The processing model describes how data are handled and the location(s) where event detection actually takes place. Historically, the approach followed in WSNs assumed that all sensor data are collected at a central network entity (sink node, base station, or cluster head), where they are processed offline. The increasing demand for time-constrained (or even real-time) performance in contemporary WSNs has effectively rendered such centralized approaches inadequate. Nowadays, therefore, two different models attract most of the research interest: (1) the local processing model and (2) the distributed processing model.

A local processing model assumes that all processing occurs in each individual node, based on that node's own capabilities. In such an implementation, therefore, no (or only limited) communication is actually required. In the distributed processing model, by contrast, events are detected cooperatively, which entails specific communication and collaboration among nodes. The fundamental idea of the distributed processing model is thus the collaboration of multiple nodes towards accurate and reliable event detection.
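
The difference between the two models can be illustrated with a deliberately simple sketch. Both the per-node threshold test and the majority vote used to fuse the node decisions are assumptions chosen for brevity; actual distributed schemes use a variety of fusion rules.

```python
def local_decision(reading, threshold):
    """Local model: each node decides alone, with no communication."""
    return reading > threshold


def distributed_decision(readings, threshold, quorum=0.5):
    """Distributed model: node decisions are exchanged and fused by majority vote."""
    votes = [local_decision(r, threshold) for r in readings]
    return sum(votes) / len(votes) > quorum


# Hypothetical temperature readings from neighboring nodes; the first node is faulty.
readings = [41.0, 20.0, 22.0, 21.0, 23.0]
alone = local_decision(readings[0], threshold=40.0)
together = distributed_decision(readings, threshold=40.0)
```

The faulty node reports an event on its own, whereas the cooperative decision rejects it because only one of five nodes agrees. The price of this robustness is the communication needed to exchange the votes.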

6.2 Technique’s Scalability

The scale of an actual WSN can vary drastically, from a few tens to many hundreds or even thousands of nodes, depending on the specific application scenario. Consequently, the selected approach must take this parameter into account or adapt to changes in it. When a very small-scale network is considered, local data processing could be preferred over a distributed approach, which imposes some overhead (communication and computation) without actually enhancing performance. However, as the scale and complexity of the network increase, distributed approaches are clearly preferable, offering better adaptability and enhanced accuracy and reliability.

6.3 Sensor Data Types

The degree of data homogeneity or heterogeneity can play a critical role in the selection of an event detection technique. Designing an event detection technique able to handle heterogeneous data efficiently is quite challenging, since a greater variety of available sensors correspondingly increases the dimensionality of the data to be handled, thus requiring sophisticated data processing approaches.

6.4 Time Constrained Performance Demands

In demanding application scenarios the delay between the occurrence of an event and its detection can be of cornerstone importance for performance as well as safety (of equipment, material, or even personnel). Consequently, time-constrained requirements (where a deadline miss leads to performance degradation) and real-time requirements (where a deadline miss could lead to system failure) are increasingly imposed in domains such as industry, health, and hazardous environment monitoring. A critical metric of a real implementation is therefore the capability of an approach to actually meet such demands.

6.5 Density

A parameter that can drastically affect event detection performance is the expected network node density. It is worth noting that, despite its importance and degree of influence, it is usually omitted when evaluating respective solutions. Instead, authors usually focus solely on scalability, which, however, only partially covers this aspect.

6.6 Evaluation Approach

A critical aspect when characterizing a technique is the framework, infrastructure, and methodology used to evaluate it, since these pertain to the technique's usefulness, objectivity, and, most importantly, applicability with respect to the particular characteristics of a WSN. Therefore, in many cases studies emphasize the development of the proposed algorithm on real WSN development platforms in order to validate the proposal's feasibility. It is also important to offer details on the computation cost, the communication cost, and the space complexity of the implementation. Such aspects are critical since they pertain to the resources the solution must spend to execute the required algorithms efficiently. The majority of existing proposals and studies, however, focus on simulation-based evaluation. Such efforts fall short with respect to realistic measurements, feasibility validation, and consideration of the dynamic and stochastic behavior of a real network environment, especially in wireless sensor network deployments.

7 Conclusions

Having discussed the most prominent approaches and highlighted the main characteristics of each event detection class, this section summarizes our survey and draws conclusions. We also hope that this effort can serve as a roadmap for future work in this research domain. Event detection is undoubtedly a prominent research domain in WSNs since, in most realistic commercial WSN application scenarios, the main objective is to recognize specific situations rather than, for example, to stream raw data continuously. However, for such solutions to be effective, careful consideration must be given to WSN-specific characteristics and, even more so, to the scarce resources that drastically differentiate WSNs from any other, resource-rich wireless technology. Despite the criticality of this requirement, many proposals fail to take it into consideration and therefore do not offer sufficient performance and behavioral characteristics. Probably the most compelling consequence of these requirements concerns optimal task allocation among nodes, which clearly advocates distributed strategies, especially in wide-scale WSN deployments. Furthermore, the survey carried out in this chapter indicates that artificial intelligence is one of the most prominent directions that future event detection algorithms should pursue, offering enhanced flexibility and adaptability to a wide range of real-life application scenarios. Additionally, existing AI-based implementations exhibit advanced accuracy and reliability compared to the other classification categories. Finally, an aspect that is often overlooked but drastically influences the added value of any proposal is the validation and evaluation environment selected. In that respect the majority of efforts rely on simulation-based environments which, although they offer specific advantages in the early stages of design and development or provide preliminary performance indications for wide-scale networks, suffer from low objectivity and accuracy as well as a myopic consideration of the dynamic nature of real WSNs. Therefore, given the wide availability of powerful WSN platforms nowadays, we believe that real-life experimentation with proposed solutions should be an indispensable part of any related research and development effort.