Introduction

With the rising precision of modern medical procedures and the turn of many individuals towards healthier lifestyles, average life expectancy is ever increasing [1]. Doctors are able to diagnose and treat patients more effectively, while the ability of individuals to cope with and recover from illnesses is remarkable. Technological advances, combined with vast and accurate knowledge of human anatomy, allow healthcare professionals to handle almost any scenario they encounter in hospitals and emergency treatment facilities [1, 2]. As average life expectancy has increased, so has the planet's population, and with it a shortage of qualified healthcare professionals to treat the sick has become an issue.

Scientists and researchers have developed numerous solutions to this problem, one of which allows patients to be remotely monitored utilizing networks of wireless sensors which relay, in real time, patient information to doctors and healthcare providers. Advances in sensor technologies and high throughput networks continue to refine the accuracy and increase the integrity and public trust of these systems. As a direct result, more individuals elect to utilize these systems as they allow greater freedom and mobility while maintaining the quality of care equivalent to direct medical interaction and attention found previously only in hospitals, clinics, and other specialized care facilities.

In medical applications, implementations of specialized Wireless Sensor Networks (WSN), known as Personal Area Networks (PAN) and Wireless Body Area Networks (WBAN), are comprised of numerous small devices attached to or implanted in the body of a patient. At present, many existing medical wireless devices are used to collect various patient metrics and vital signs, such as Heart Rate (HR), pulse, oxygen saturation (SpO2), Respiration Rate (RR), Body Temperature (BT), Electrocardiogram (ECG), Electromyogram (EMG), Blood Pressure (BP), Blood Glucose Level (BGL) and Galvanic Skin Response (GSR).

These networked medical sensors accumulate and transmit collected data to a central device (e.g., base station, PDA, smart phone) for processing and storage. This data may then be re-evaluated and used to trigger medical alarms for caregivers or healthcare professionals upon detection of anomalies in the physiological data or clinical deterioration of monitored patients, so that they can react quickly [24] by taking the appropriate actions.

The use of PANs and WBANs has been extended to monitoring individuals with chronic illnesses (e.g., cardiovascular disease, Alzheimer's, Parkinson's, diabetes, epilepsy, asthma), where these networks have enhanced quality of life by: (i) reducing healthcare costs (overcapacity, waiting and sojourn time, number of nurses, etc.), and (ii) providing mobility, while continuously collecting and relaying critical physiological data to the associated healthcare providers, e.g., long-term monitoring of patient recovery from a surgical procedure after leaving the hospital, or kinematic and rehabilitation assessment.

These types of Personal Area Networks (PAN), while extremely useful, are not without problems, such as faulty measurements, hardware failures, and security issues. The small, lightweight wireless sensing devices that form them also have drawbacks such as reduced computational power and limited storage and energy resources. Their measurements are prone to a variety of other anomalies, including environmental noise, constant faults resulting from bad sensor connections, energy depletion, badly placed sensors, and malicious attacks through data injection, modification or replay. Such anomalies may cascade to the collection point, leading to unexpected results, faulty diagnoses, and a reduction in public trust of these systems.

Medical sensors with wireless capabilities are available on the market (MICAz, TelosB, Imote2, Shimmer [5], etc.). For example, an ECG wireless sensor is connected to three electrodes attached to the chest for real-time monitoring of heart problems. The pulse oximeter measures the pulse and blood oxygenation ratio (SpO2) using infrared light and a photosensor. This information can be exploited to detect asphyxia, insufficient oxygen (hypoxia) or pneumonia. A normal SpO2 ratio typically exceeds 95 %. When this ratio drops below 90 %, an emergency alarm must be triggered due to possible lung problems or respiratory failure.
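
As a simple illustration of this alarm rule, the following sketch (in Python, with hypothetical readings) classifies an SpO2 value against the thresholds just described; it is not part of the system proposed in this chapter:

```python
def spo2_status(spo2: float) -> str:
    """Classify an SpO2 reading against the thresholds described above.

    > 95 % is considered normal, < 90 % triggers an emergency alarm;
    values in between warrant closer monitoring.
    """
    if spo2 < 90:
        return "alarm"      # possible lung problem or respiratory failure
    elif spo2 <= 95:
        return "watch"      # below the typical normal range
    return "normal"

# Hypothetical readings from a pulse oximeter
for reading in (98.0, 93.5, 87.0):
    print(reading, spo2_status(reading))
```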

Sensor readings are unreliable and inaccurate [6, 7], due to constrained sensor resources and wireless communication interference, which make them susceptible to various sources of errors. An improperly attached pulse oximeter clip or an external fluorescent light may cause inaccurate readings. In [3], the authors found that the sensing components, not networking issues, were the primary source of unreliability in medical WSNs. Faulty measurements negatively influence the measured results and lead to diagnosis errors; they may even threaten the life of a patient, for instance when emergency personnel are alerted for a false code blue.

There may be many reasons for abnormal readings in WSNs [8], such as hardware faults, corrupted sensors, energy depletion, miscalibration, electromagnetic interference, disrupted connectivity, compromised sensors, data injection, patient sweating, detached sensors, or genuine heart attacks and other health degradation. Therefore, an important task is to detect abnormal measurements that deviate from other observations, and to distinguish between sensor faults and emergency situations in order to reduce the false alarm rate.

Over time, these networks accumulate vast amounts of historical data about an individual. Due to the enormity of information, it often becomes difficult to observe and extract sensor metric correlations and to distinguish between a patient entering a critical health state and a faulty sensor component. Therefore, an anomaly detection mechanism is required to identify abnormal patterns and to detect faulty data.

In contrast to signature-based intrusion detection systems, which require attack signatures, anomaly-based systems [9] look for unexpected patterns in the measurements received from sensors. An abnormal pattern is a deviation from a dynamically updated model of normal sensed data, an approach more suitable for WSNs given the lack of attack signatures. It is also important to note that anomaly-based systems face challenges in the training phase, as it is difficult to obtain purely normal data with which to establish an appropriate normal profile.

Various anomaly-based detection techniques for sensor fault identification and isolation have been proposed and applied [9–12]. Distributed detection techniques identify anomalous values at individual sensors to prevent the transmission of erroneous values and reduce energy consumption. However, these techniques require resources that are not available on the sensors, and their accuracy is lower than that of centralized approaches, which have a global view for spatio-temporal analysis.

Physiological parameters are correlated in time and space, and this correlation must be exploited to identify and isolate faulty measurements, in order to ensure reliable operation and accurate diagnosis. Faulty measurements, by contrast, usually exhibit no spatial or temporal correlation with the other monitored attributes.

In this chapter, we focus on anomaly detection in medical wireless sensor readings, and we propose a new approach based on machine learning algorithms to detect abnormal values. First we use the J48 [13] decision tree algorithm to detect abnormal records; when one is detected, we apply linear regression [14] to pinpoint the abnormal sensor measurements within it. Physiological attributes are heavily correlated, and changes typically occur in at least two parameters at once, e.g., in Atrial Fibrillation (AF) and asthma, the heart rate and respiration rate increase simultaneously.

Our proposed solution is intended to provide reliability in medical WSNs used for continuous patient monitoring: we detect anomalies in a patient's health and differentiate between the individual entering a critical health state and faulty readings (or sensor hardware). We seek to reduce the false alarm rate triggered by inconsistent sensor readings.

The rest of this chapter is organized as follows. In section “Related Work”, we review related work on anomaly detection and machine learning algorithms used in medical WSNs. Section “Background” briefly describes the linear regression and decision tree (J48) algorithms used in our detection system. The proposed approach is explained in section “Proposed Approach”. In section “Experimental Results”, we present the results of our experimental evaluation, where we conduct a performance analysis of the proposed solution over a medical dataset. Finally, section “Conclusion and Perspectives” concludes the chapter with a discussion of the results and future work.

Related Work

WSNs are becoming a major center of interest as they provide viable solutions for avoiding unnecessary casualties in many fields such as the military, civil protection, and medicine. Various vital sign monitoring systems have been proposed, developed and deployed, such as MEDiSN [4] and CodeBlue [15, 16] for monitoring HR, ECG, SpO2 and pulse; LifeGuard [17] for ECG, respiration, pulse oximetry and BP; AlarmNet [18] and Medical MoteCare [19] for physiological (pulse and SpO2) and environmental parameters (temperature and light); and Vital Jacket [20] for ECG and HR. A survey of medical applications using WSNs is available in [21, 22].

However, data collected by WSNs have low quality and poor reliability. Many approaches for anomaly detection in WSNs have been proposed to detect abnormal deviations in collected data and to remove faulty sensor measurements. The authors in [23] propose an algorithm for the identification of faulty sensors using the minimum and maximum values of the monitored parameters; any received measurement outside the [min, max] interval is considered an outlier or inconsistent data. In medical applications, however, we cannot assume that all patients have the same attribute ranges, as the min-max values depend on sex, age, weight, height, health condition, etc.

The authors in [24] propose a hierarchical (cluster-based) algorithm to detect outliers from compromised or malicious sensors. The method is based on transmission frequency and the KNN distance between values received from different sensors. However, it is impractical in medical applications to deploy redundant sensors monitoring the same parameters. A simple prediction and fault detection method for WSNs was proposed in [25]. The algorithm detects deviations between a reference and the measured time series, uses a predefined threshold, and was evaluated on three types of faults: short-term, long-term and constant faults.

The authors in [11] propose a distance-based method to identify malicious insider sensors, assuming that neighboring nodes monitor the same attributes. Each sensor monitors its one-hop neighbors and uses the Mahalanobis distance between measured and received multivariate instances from neighboring sensors to detect anomalies in a distributed manner. The authors in [26] propose a voting-based system to detect such events. The authors in [10] propose a failure detection approach for WSNs which exploits metric correlations to detect abnormal sensors and uncover failed nodes.

The authors in [27] explore four classes of fault detection methods: rule-based, estimation-based, time-series-analysis and learning-based methods. They investigate fixed and dynamic thresholds, linear least squares estimation, Auto Regressive Integrated Moving Average (ARIMA), Hidden Markov Models (HMM), etc. They found no single class of detection methods suitable for every type of anomaly.

Data mining techniques and machine learning algorithms have also been used in WSNs to detect anomalies in multidimensional data, for example Naïve Bayes [28], Bayesian networks [29], and Support Vector Machines (SVM) [30]. The authors in [31] propose an approach based on SVM and k-nearest neighbors (KNN) for anomaly detection in WSNs. The authors in [32] use an unsupervised approach based on the Discrete Wavelet Transform (DWT) and Self-Organizing Maps (SOM), where the DWT is used to reduce the size of the input data for SOM clustering.

Fig. 8.1 WSN for collecting vital signs and raising alarms

The authors in [33] propose the use of logistic regression modeling with a static threshold to evaluate the reliability of a WSN in an industrial setting with a large number of sensors, without updating the training model to be able to identify the cause of a potential loss of reliability. At the same scale of large sensor networks, the authors in [13] propose a diagnosis method based on an enhanced \(C4.5\) (\(J48\), a decision tree algorithm) which merges the local classifiers into a large spanning tree to account for the accuracy of the whole network. Another type of WSN deployment is presented in [28], which shows how to monitor the physical activity of a person using Sun SpOT sensors attached to the thighs; the authors use a naïve Bayes machine learning algorithm to determine whether the person is sitting, standing, lying or walking. However, they do not consider that the values may be corrupted by faulty hardware. Similarly, the authors in [34] present a system capable of discerning mental stress states from relaxation states using logistic regression based on heart rate variability.

In this chapter, we use the decision tree (J48) and linear regression algorithms to detect abnormal records and to pinpoint abnormal sensor readings. J48 is used to classify records and reduce temporal complexity, and linear regression is used to predict current values. As physiological parameters are correlated, if only one monitored attribute deviates from its estimated value, we classify the reading as faulty and perform data cleaning; otherwise, we trigger an alarm for a patient entering a critical state.

Background

In this chapter, we consider \(N\) medical wireless motes (\({S_1}, \ldots ,{S_N}\)) attached to a patient in order to monitor specific physiological parameters, as depicted in Fig. 8.1. These sensors transmit the collected data to a base station (smart phone) for real-time analysis, alerting healthcare professionals when required. The base station may also transmit the collected data to a remote or local database for storage. The base station has higher computational power, more memory and a greater transmission range than the sensors. Collected data is analyzed at the base station before transmission in order to detect anomalies and raise alarms when a patient enters a critical state.

The collected measurements of the physiological parameters are represented by the data matrix \(X = \left( {{x_{ij}}} \right) \), where \(i\) is the time instant and \(j\) the monitored parameter. We denote by \({X_k} = \left( {{x_{1k}},{x_{2k}}, \ldots ,{x_{tk}}} \right) \) the time series associated with each parameter; \({X_k}\) is a column of the data matrix \(X\) given in Eq. 8.1.

$$\begin{aligned} X = \begin{matrix} {t_{1}}\\ {t_{2}}\\ \vdots \\ {t_{m}} \end{matrix} \; \overset{\begin{matrix} X_{1} & X_{2} & X_{3} & \cdots & X_{n} \end{matrix}}{\begin{bmatrix} x_{11} & x_{12} & x_{13} & \cdots & x_{1n}\\ x_{21} & x_{22} & x_{23} & \cdots & x_{2n}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & x_{m3} & \cdots & x_{mn} \end{bmatrix}} \end{aligned}$$
(8.1)
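
For illustration only, such a data matrix can be assembled from the per-parameter time series with NumPy; the series below are hypothetical and are not taken from the dataset used later in this chapter:

```python
import numpy as np

# Hypothetical time series (one per monitored parameter),
# aligned on the same time instants t_1 ... t_m.
hr   = np.array([82, 85, 90, 88])    # heart rate (bpm)
spo2 = np.array([97, 96, 95, 96])    # oxygen saturation (%)
resp = np.array([14, 15, 16, 15])    # respiration rate

# Data matrix X: row i = time instant t_i, column k = time series X_k
X = np.column_stack([hr, spo2, resp])
print(X.shape)   # (4, 3): m time instants, n parameters
print(X[:, 0])   # the time series X_1 (heart rate)
```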

The data collected on the smart phone must be processed in real time for online anomaly detection. These measurements may be of low quality and reliability, due to the constrained resources of the sensors and the deployment context (sweat, detached or damaged sensors, interrupted communications, etc.). The accuracy of the monitoring system relies on the received data, and faulty measurements trigger false alarms for caregivers. Therefore, to increase the accuracy of diagnosis, faulty observations must be detected and isolated in order to reduce false alarms and prevent faulty diagnoses.

To detect abnormal values, we use the decision tree algorithm (J48) to classify records (lines of the data matrix) as normal or abnormal. When an abnormal record is detected, the linear regression algorithm is used to predict the current measurement of each parameter; when the difference between the predicted and current value exceeds a predefined threshold, a correlation analysis is conducted to differentiate between a faulty sensor and patient health degradation.

In the rest of this section, we briefly review decision tree (J48) and linear regression algorithms used in our approach. For detailed information about these algorithms, please refer to [14].

Decision Tree J48

J48 [13] is an implementation of the C4.5 decision tree algorithm (proposed by Ross Quinlan), and belongs to the family of supervised machine learning approaches. Like other decision tree algorithms used for classification, J48 uses a training set to generate an optimal tree structure, which is then used to classify the arriving data flow (the test set).

Decision tree classification is a process that starts at the root of the tree, where each node represents an independent decision leading to another node, and continues until a leaf node is reached. The leaf nodes represent the outcome of the classification. In our model, the tree nodes are the monitored physiological attributes and the leaf nodes are the classes (normal and abnormal).

The decision process in J48 is based on the information carried by each attribute. This information is used by the algorithm to establish a hierarchical classification from the root to the leaves of the decision tree, and is represented by the Information Gain \(IG(X,X_{k})\) in Eq. 8.2:

$$\begin{aligned} IG(X,{X_{k}}) = H(X) - \sum \limits _{{x_{ik}} \in X} \frac{\left| {x_{ik}} \right| }{\left| X \right| }\,H({x_{ik}}) \end{aligned}$$
(8.2)

Where \(H(X)\) is the entropy of the association between a training record (\(r_{i}\)) and the nominal class (normal or abnormal) given in Eq. 8.3, and \({x_{ik}}\) are the values taken by the attribute \({X_{k}}\).

$$\begin{aligned} H(X) = \sum \limits _{{r_i} \in X} p({r_i})\,{\log _2}\!\left( \frac{1}{p({r_i})}\right) \end{aligned}$$
(8.3)

The attributes with higher Information Gain are placed at the top of the tree, so that the most relevant decisions are taken early, for faster classification and to optimize computation time. The Information Gain does not take into account the distribution of attribute values between the classes. The Gain Ratio (\(GR\)) is therefore used to take into account the class splitting factor of each attribute:

$$\begin{aligned} \textit{GR}(X,{X_k}) = \frac{{\textit{IG}(X,{X_k})}}{{\textit{SI}(X,{X_k})}} \end{aligned}$$
(8.4)

Where the Splitting Information is given by:

$$\begin{aligned} \textit{SI}(X,{x_{ik}}) = - \sum \limits _{c = 1}^n {\frac{{|{x_{ik}}|}}{{|X|}}} {\log _2}\frac{{|{x_{ik}}|}}{{|X|}} \end{aligned}$$
(8.5)

Where \(n\) is the number of classes, and \(SI(X,{x_{ik}})\) is the entropy of the occurrence of \({x_{ik}}\) within each class. Therefore, by calculating the gain ratio for each attribute, we can hierarchically distribute the attributes over the tree nodes.
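
As an illustration of Eqs. 8.2–8.5, the following minimal Python sketch computes entropy, information gain and gain ratio for a single discretized attribute. The arrays and attribute names are hypothetical, and this is not the J48 implementation itself:

```python
import numpy as np

def entropy(labels):
    """Entropy H over a set of class labels (Eq. 8.3)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(attr_values, labels):
    """Information gain obtained by splitting on one attribute (Eq. 8.2)."""
    ig = entropy(labels)
    for v in np.unique(attr_values):
        mask = attr_values == v
        ig -= mask.mean() * entropy(labels[mask])
    return ig

def gain_ratio(attr_values, labels):
    """Gain ratio = information gain / split information (Eqs. 8.4 and 8.5)."""
    si = entropy(attr_values)   # split information over the attribute values
    ig = information_gain(attr_values, labels)
    return ig / si if si > 0 else 0.0

# Hypothetical discretized attribute (e.g., HR inside/outside its normal range)
hr_level = np.array(["normal", "normal", "high", "high", "normal"])
label    = np.array(["normal", "normal", "abnormal", "abnormal", "normal"])
print(gain_ratio(hr_level, label))   # 1.0: the attribute separates the classes perfectly
```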

Linear Regression

Linear regression is a statistical method which models a dependent variable \(y_{ik}\) using a vector of independent variables \(x_{ik}\) called regressors. The goal is to predict the value of \(y_{ik}\) at time instant \(t_{i}\) given the values of the other attributes. The model is represented by the following relationship:

$$\begin{aligned} {y_{ik}} = {C_0} + {C_1}{x_{i1}} + {C_2}{x_{i2}} + \cdots + {C_n}{x_{in}} \end{aligned}$$
(8.6)

Where \(y_{ik}\) is the dependent variable, \(x_{ik}\) are the regressors and \(C_{n}\) are the regressor coefficients (weights). These coefficients are computed in the training phase as the covariance of \(X_k\) and \(Y_k\) divided by the variance of \(X_k\):

$$\begin{aligned} {C_k} = \frac{{Cov ({X_k},{Y_k})}}{{Var({X_k})}} = \frac{{\sum \left( {{x_{ik}} - \bar{X}_{k}} \right) \left( {{y_{ik}} - \bar{Y}_{k}} \right) }}{{ \sum {{{\left( {x_{ik}} - \bar{X}_{k}\right) }^2}}}} \end{aligned}$$
(8.7)

Linear regression is used to predict the value of \(y_{ik}\) from the other attributes of the same instance, \(x_{ij|j \ne k}\), and to compare the prediction (\(y_{ik}\)) with the actual value \({x_{ik}}\) to check whether it fits within a small margin of error.
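
For illustration, the following sketch fits such a model with scikit-learn's ordinary-least-squares LinearRegression to predict one attribute from the others. The training matrix and the choice of attribute are hypothetical, and this is only an approximation of the procedure described above:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training matrix: columns = [HR, PULSE, RESP, SpO2, T]
X_train = np.array([
    [85,  86, 15, 97, 37.0],
    [90,  91, 16, 96, 37.1],
    [95,  94, 18, 96, 37.0],
    [100, 99, 20, 95, 37.2],
    [110, 108, 24, 94, 37.3],
])

k = 0                                    # predict HR (column k) from the others
y = X_train[:, k]
regressors = np.delete(X_train, k, axis=1)

model = LinearRegression().fit(regressors, y)   # learns C_0 ... C_n (Eq. 8.6)
print(model.intercept_, model.coef_)

# Predict HR for a new instance from its other attributes
x_new = np.array([[92, 17, 96, 37.1]])
print(model.predict(x_new))
```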

Proposed Approach

We consider a general scenario for remote patient monitoring, as shown in Fig. 8.1, where many wireless motes with restricted resources are used to collect data, and a portable collection device (e.g., a smart phone) with more resources and greater transmission capability than the WSN motes is used to analyze the collected data and to raise alarms for the emergency team when abnormal patterns are detected. We seek to detect abnormal values in order to reduce the false alarms caused by faulty measurements, while differentiating faults from a patient's health degradation.

The proposed approach is based on decision tree and linear regression. It builds a decision tree and looks for linear coefficients from normal vital signs that fall inside restricted interval range of monitored attributes. In the rest of this chapter, we focus only on the following vital signs: HR \( \in [80-120]\), pulse \( \in [80-120]\), respiration rate \( \in [12-30]\), SpO2 \( \in [90-100]\), T\(^\circ \in [36.5-37.5]\). Attributes values that fall outside these (restricted) normal intervals are considered abnormal. HR and pulse reflect the same attribute from different sensors, where pulse is obtained from the pulse oximeter and HR is measured as the number of interbeat intervals (R-R) in ECG signal.


Equation 8.8 gives the residual threshold used to detect an abnormal measurement:

$$\begin{aligned} {e_i} = \left| {{x_{ik}} - {{\hat{x}}_{ik}}} \right| \ge 0.1*{\hat{x}_{ik}} \end{aligned}$$
(8.8)

The proposed approach has two phases: training and detection. In the training phase, the machine learning methods generate a model to classify the data; in the detection phase, inputs are classified as abnormal if they deviate from the established model. The J48 decision tree model, built using training data within the restricted intervals, is used to classify each received record as normal or abnormal. In our experiments, the decision tree was the most efficient classification algorithm. The tree model is a set of if-then rules which is inexpensive to build, robust, and fast in processing, as classification is based on numerical comparisons. Furthermore, only abnormal instances detected by J48 trigger the forecasting with linear regression, which is why we use restricted, small intervals for the monitored attributes in the training phase.

If a record is classified as abnormal by J48, we assume in turn that each attribute (\(x_{ik}\)) is missing, and the linear regression coefficients are used to estimate its current value (\(\hat{x}_{ik}\)) from the other attributes (\(x_{ij|j \ne k}\)), as given in Eq. 8.9 for heart rate estimation:

$$\begin{aligned} \hat{HR_{i}} = {C_0} + {C_1}{Pulse_i} + {C_2}{RESP_i} + \cdots + {C_5}{T_i} \end{aligned}$$
(8.9)

If the difference between the current (\(HR_{i}\)) and estimated (\(\hat{HR_{i}}\)) values is larger than the predefined threshold (10 % of the estimated value) for only one attribute, the measurement is considered faulty and is replaced by the value estimated with linear regression. However, if at least two readings exceed the threshold, we trigger an alarm so that the caregiver emergency team can react, e.g., heavy changes in the HR together with a reduced SpO2 ratio are symptoms of patient health degradation and require immediate medical intervention. Majority voting is the optimal decision rule to detect events and correct faults, as the probability of many attributes (two or more in our experiments) being simultaneously faulty is very low.
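
The overall decision logic can be sketched as follows (Python with scikit-learn). This is not the authors' implementation: scikit-learn's CART-based DecisionTreeClassifier stands in for J48, the helper names are illustrative, and the training data is assumed to be labeled normal/abnormal as described above:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LinearRegression

ATTRS = ["HR", "PULSE", "RESP", "SpO2", "T"]
THRESHOLD = 0.10   # 10 % of the estimated value, as in Eq. 8.8

def train(X_train, y_train):
    """Train the record classifier and one per-attribute regressor."""
    tree = DecisionTreeClassifier().fit(X_train, y_train)   # stands in for J48
    normal = X_train[np.asarray(y_train) == "normal"]       # regressors learn from normal data
    regs = []
    for k in range(X_train.shape[1]):
        others = np.delete(normal, k, axis=1)
        regs.append(LinearRegression().fit(others, normal[:, k]))
    return tree, regs

def handle_record(record, tree, regs):
    """Classify one record; on anomaly, decide between sensor fault and alarm."""
    record = np.array(record, dtype=float)              # work on a copy
    if tree.predict(record.reshape(1, -1))[0] == "normal":
        return "normal", record
    deviating, estimates = [], record.copy()
    for k, reg in enumerate(regs):
        est = reg.predict(np.delete(record, k).reshape(1, -1))[0]
        estimates[k] = est
        if abs(record[k] - est) >= THRESHOLD * abs(est):  # Eq. 8.8
            deviating.append(k)
    if len(deviating) == 1:                               # single deviation: faulty sensor
        k = deviating[0]
        record[k] = estimates[k]                          # replace by the estimated value
        return "sensor fault ({})".format(ATTRS[k]), record
    return "alarm", record                                # two or more deviations: health degradation
```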

J48 is used to reduce the computational complexity and to avoid estimating every attribute of every instance on the base station: it needs only a few comparisons for classification, and the combination of both methods is used for fault detection and classification. Sliding windows are not used in our experiments, in order to reduce complexity. When the model is well specified by the training data, updating or rebuilding it would add temporal and spatial complexity without a large impact on performance.

Experimental Results

In this section, we present the performance analysis of the proposed approach for anomaly detection in medical WSNs. Afterwards, we study the impact of the decision threshold on the true positive and false alarm ratios. We used real medical data from the PhysioNet database [35], which contains 30392 records, each with 12 attributes (ABPmean, ABPsys, ABPdias, C.O., HR, PAPmean, PAPsys, PAPdias, PULSE, RESP, SpO2, T\(^\circ \)). We only focus on 5 attributes: HR, PULSE, RESP, SpO2 and T\(^\circ \). The variations of heart rate (in beats per minute, bpm), pulse and respiration rate are presented in Fig. 8.2a, b and c respectively. Figure 8.2d shows the variations of SpO2 (oxygenation ratio) and body temperature (constant at 37\(^\circ \)C).

Fig. 8.2 (a) Heart rate (HR), (b) Pulse, (c) Respiration rate, (d) Oxygenation ratio and body temperature

Fig. 8.3 (a) Predicted HR using linear regression, (b) Errors of prediction, (c) Predicted HR using additive regression, (d) Errors of prediction

Figure 8.3a, b shows the predicted values and the prediction errors (difference between actual and predicted values) for HR with linear regression. The measured (actual) values of HR are presented in Fig. 8.2a. To test the efficiency of the algorithms used, we compare the results (predictions and errors) with different classifiers from the WEKA [36] toolkit: Decision Table, Additive Regression and KNN with \(K=3\).

Figure 8.3c, d shows the same results (predictions and errors, respectively) for additive regression, whose error is higher than that of linear regression. Figure 8.4a, b shows the results for KNN, which is more computationally expensive (slower) and has a higher error than additive regression. Figure 8.4c, d shows the results of the decision table classifier, which had the worst results of all the classifiers used. Figure 8.5c shows the mean absolute error of each of these classifiers: the decision table has the highest mean error rate, followed in descending order by KNN, additive regression and linear regression. Linear regression had the lowest error and the best overall performance of the four predictors, which is why we use it in the rest of this chapter.
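
To give a flavor of such a comparison outside WEKA, the sketch below compares three scikit-learn regressors by mean absolute error on synthetic vital-sign data. KNeighborsRegressor and DecisionTreeRegressor are only rough stand-ins for WEKA's KNN and Decision Table, Additive Regression has no direct equivalent here, and the numbers produced do not reproduce the figures of this chapter:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data: predict HR from the other vital signs (HR closely tracks the pulse)
rng = np.random.default_rng(0)
n = 500
pulse = rng.normal(95, 8, n)
resp  = rng.normal(18, 3, n)
spo2  = rng.normal(96, 1.5, n)
temp  = rng.normal(37.0, 0.2, n)
hr    = pulse + rng.normal(0, 2, n)

X = np.column_stack([pulse, resp, spo2, temp])
X_tr, X_te, y_tr, y_te = train_test_split(X, hr, random_state=0)

models = {
    "linear regression": LinearRegression(),
    "KNN (k=3)": KNeighborsRegressor(n_neighbors=3),
    "decision tree": DecisionTreeRegressor(random_state=0),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    print(name, mean_absolute_error(y_te, pred))
```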

Fig. 8.4 (a) Predicted HR using KNN (k=3), (b) Errors of prediction, (c) Predicted HR using decision table, (d) Errors of prediction

Figure 8.5a shows the variations of the pulse and the respiration rate. Figure 8.5b shows the alarms raised by our proposed approach. The first alarm is raised when the reported values for pulse and SpO2 (Fig. 8.5a) are abnormal at the same instant (both attributes are measured by the same sensor). The second alarm is triggered by abnormal values of the HR attribute. These abnormal values are visible in Fig. 8.2a, b, c and d, where the corresponding attributes suddenly fluctuate or drop to zero.

To evaluate the performance of the proposed approach, we use the ROC (Receiver Operating Characteristic) curve, which shows the relationship between the true positive rate (Eq. 8.10) and the false positive rate (Eq. 8.11).

$$\begin{aligned} TPR = \frac{{TP}}{{TP + FN}} \end{aligned}$$
(8.10)

Where \(TP\) is the number of true positives, \(FN\) the number of false negatives, \(FP\) the number of false positives, and \(TN\) the number of true negatives. The false positive rate (FPR) is defined as:

$$\begin{aligned} FPR = \frac{{FP}}{{FP + TN}} \end{aligned}$$
(8.11)
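
A direct computation of these two rates from confusion counts is straightforward; the counts in the example below are hypothetical and merely chosen to yield rates of the same order as those reported later:

```python
def tpr_fpr(tp: int, fn: int, fp: int, tn: int):
    """True and false positive rates from confusion counts (Eqs. 8.10 and 8.11)."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Hypothetical confusion counts for one decision threshold
print(tpr_fpr(tp=120, fn=0, fp=37, tn=463))   # TPR = 1.0, FPR = 0.074
```
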
Fig. 8.5 (a) Pulse and respiration rate, (b) Raised alarms, (c) Mean error rate for each classifier, (d) ROC

The ROC curve is used for accuracy analysis: it is a graphical representation of the true positive rate versus the false positive rate as the decision threshold varies. In general, a good detection algorithm must achieve a high detection ratio with the lowest possible false alarm rate. Figure 8.5d shows the ROC of the proposed approach when the nominal classifier is J48, logistic regression, Naïve Bayes and Decision Table, respectively. The J48 classifier achieves the best performance, with \(\mathrm{TPR}=100\) % and \(\mathrm{FPR}=7.4\) %. These results demonstrate that our proposed approach can achieve very good accuracy in detecting mote anomalies.

Conclusion and Perspectives

In this chapter, we proposed a new framework which integrates decision trees and linear regression for anomaly detection in medical WSNs. The proposed approach performs both a spatial and a temporal analysis for anomaly detection. We evaluated our approach on real medical data containing many (both real and synthetic) anomalies. Our experimental results demonstrate the capability of the proposed approach to achieve a low false alarm rate with high detection accuracy.

We are currently investigating the performance of the proposed approach on real medical wireless sensor traffic using the Shimmer platinum development kit [5]. In the future, knowing that most collected sensor measurements are normal, we plan to experiment with data aggregation performed locally on the sensor motes, to reduce the amount of data exchanged between the wireless sensors and the sink node without sacrificing accuracy.