1 Introduction

Aging is often associated with a functional decline in physical and cognitive abilities, especially in individuals who suffer from conditions such as Alzheimer’s disease and dementia. Compounding the problem is the reality that many elderly people live alone [33]. A smart assistive living environment is one approach to promoting active living for the elderly while reducing the burden on family and caregivers.

In the past two decades, with advances in sensor technologies and intelligent data analysis algorithms for health care [22–25], many systems for monitoring activities of daily living (ADLs) and detecting abnormal behaviors [36, 49] have been proposed. These systems typically use sensors deployed in the living space and on the body of a user, such as visual sensors, ambient sensors, and wearable sensors, to collect a user’s daily personal behavior data, and employ machine learning algorithms to identify daily activities or unusual behaviors. These include daily routine [10], specific activities such as eating and exercising [17, 35, 42], as well as urgent events such as falls [15, 31, 41]. Interestingly, a recent study [2] proposes a model to analyze the daily stress of a user using mobile phone data, weather conditions monitored by environmental sensors, and personal traits obtained by questionnaire. These studies provide a strong technical basis for providing daily care to the elderly and promoting active living.

However, a number of issues remain to be resolved. First, unlike traditional public surveillance and multimedia event detection [4, 37, 43], protection of the privacy of the elderly is of paramount importance. In particular, methods based on visual sensors or microphones [7, 20, 30, 38] are not suitable for home monitoring. It is also difficult to ask the elderly to fill out questionnaires [2] or to perform activities in specific ways [47]. Second, existing methods are typically based on fixed rules or training classifiers to identify ADLs from sensor data and treat outliers as abnormal behaviors. However, the performance of classifiers depends highly on the quality and volume of training samples. As existing studies can only obtain data from a limited number of volunteers and people may perform the same activity in different ways, it is a challenge to build robust classifiers suitable for detecting the activities of all users, especially for the task of abnormal behavior detection. Therefore, building a robust intelligent system which is able to learn from a user’s behavior and provide personalized assistance is still challenging. Third, to the best of our knowledge, almost all of the proposed methods cannot perform online analysis of the sensor data. They entail delays in detecting and reporting urgent situations, such as falls. As such, online activity recognition models are necessary for real-time analysis of sensor data. Fourth, it is important for a home care system to summarize the health status and daily behavior of the elderly, in order to guide the elderly towards healthy and active living. However, few studies exist in this area.

Towards building a smart assistive living environment for the solitary elderly, we propose an online daily habit modeling and anomaly detection (ODHMAD) model, which can perform real-time personalized daily activity recognition, habit modeling, and anomaly detection. Compared with existing approaches, ODHMAD offers several advantages. First, ODHMAD is an online healthcare framework which simultaneously performs dynamic online activity recognition, habit modeling, and anomaly detection for the elderly. Second, ODHMAD employs an online activity recognition (OAR) algorithm that performs online analysis of the sensor data triggered by the activities of the elderly. OAR is able to recognize activities by adaptively learning the activation status of sensors with light parameter tuning and no training data, and can capture activity details, such as time, duration, and breaks during an activity. Owing to its real-time nature, OAR can respond quickly to the occurrence of events. This enables ODHMAD to take timely action for urgent events, such as falls. Moreover, OAR allows one-to-many relationships between activities and sensors, which enables ODHMAD to recognize complex daily activities using multiple sensors, such as sleeping and leaving home. Third, ODHMAD incorporates a dynamic daily habit modeling (DDHM) algorithm for the dynamic modeling of the elderly’s daily habits based on the activities detected by the OAR algorithm. DDHM generates a two-layer tree structure in which nodes in the first layer specify different activities, while the nodes in the second layer that share a parent node model the likelihood of the different periods during which the corresponding activity may happen. Fourth, the modeled daily habits of the elderly provide a summary of the elderly’s daily life, which is an important indicator of the elderly’s wellness and is helpful to family members and caregivers. They can also serve as a knowledge base for the personalized detection of anomalies based on the elderly’s daily behavior. Assuming the two-layer hierarchy is stable, once an activity is detected by OAR, DDHM will search the hierarchy to match the activity against the elderly’s habits. A low likelihood will result if the activity is dissimilar to the elderly’s habits, which indicates a potential anomaly. In contrast to the detection of anomalies in daily activities, urgent events such as falls can directly trigger an alarm when detected by OAR, and can subsequently be modeled by DDHM for summarization purposes.

Due to a lack of data on the whole-day monitoring of a user’s activities, we are unable to evaluate the proposed DDHM algorithm for daily habit modeling and anomaly detection. Therefore, we have conducted experiments on two published data sets, namely, the fall detection data set [27] and the Opportunity activity recognition dataset [34], to evaluate the performance of the proposed OAR model for online activity recognition. Experimental results show that OAR can effectively model the normal status of sensors and can provide better performance than state-of-the-art algorithms, especially for the miss detection rate.

The remainder of this paper is organized as follows. Section 2 reviews the literature on daily activity detection and abnormal behavior/anomaly detection using sensors. Section 3 describes technical details of the proposed ODHMAD model. Experimental results are reported in Section 4 and the last section summarizes our main findings and highlights possible future work.

2 Related work

2.1 Daily activity recognition

Generally speaking, existing studies on daily activity recognition typically follow one of two approaches, i.e. the rule-based approach and the pattern recognition approach. The rule-based approach relies on manually created rules for decision making [3, 9, 47]. It requires either domain knowledge or specific users’ habits to detect activities. Therefore, although it can provide personalized healthcare, it requires a lot of effort and is not easy to scale up. The pattern recognition approach extracts different information/features from the sensor data and uses machine learning algorithms, typically classification methods, to identify activity patterns. Existing studies follow one of three directions: 1) using ambient sensors for detecting the daily behavior and routine of a user [10, 11, 19, 28, 40]; 2) using wearable sensors and accelerometers to detect the occurrence of specific activities, such as drinking, eating, taking vitamins, using the bathroom, and exercising [12, 13, 17, 35, 42, 48]; and 3) studying complex scenarios, such as the detection of activities during which other activities are involved [32] and the detection of multiple individuals in a room [21]. There are also active studies in the areas of computer vision and multimedia which detect daily activities of users from images [30, 38, 39] and videos [1, 8, 16, 44, 45]. Although such studies are not under the umbrella of unobtrusive sensing and image/video capture is not used in our study, the machine learning algorithms employed could be investigated for sensor data analysis.

2.2 Abnormal behavior detection

In contrast to daily activity recognition, classification methods cannot generally be used for the detection of abnormal behaviors, because the anomalies are usually rare and unexpected, resulting in insufficient training data. However, there are several studies which examine the feasibility of identifying abnormal behavior by finding behavior patterns that are dissimilar to the learnt normal patterns [14, 28]. Many studies have demonstrated the feasibility of training a classifier to detect a specific event, especially falls [6, 7, 18, 20, 26, 31, 41, 46]. Moreover, clustering algorithms have also been used to identify abnormal behavior patterns [15, 19]. There are also studies on the detection of abnormal user behavior through the analysis of the activation sequences of sensors [29].

3 Online daily habit modeling and anomaly detection (ODHMAD) model

The online daily habit modeling and anomaly detection (ODHMAD) model is designed as a general-purpose and integrated homecare framework, providing online analysis of the elderly’s sensory behavior data for their personalized daily activity recognition, daily habit modeling, and anomaly detection. Figure 1 provides an overview of the ODHMAD model, which consists mainly of three modules, i.e. the sensor data gathering and processing module, the online activity recognition module, and the dynamic daily habit modeling and anomaly detection module. The first module dynamically collects raw sensor data from sensors and performs information processing to extract and organize useful information into the required format, such as vector form in our case. Subsequently, in the online activity recognition module, the formatted sensor data are processed one at a time by the developed online activity recognition (OAR) model to identify daily activities with detailed activity information, such as start/end time, sensor conditions, and the number of breaks. Once an activity is recognized, the activity details are sent to the last module, i.e. the dynamic daily habit modeling and anomaly detection module. The dynamic daily habit modeling (DDHM) model plays a key role in this module by dynamically modeling the daily habits of the elderly as a two-layer hierarchy using probabilistic models. Anomaly detection is performed by detecting activities that run against the modeled daily habits of the elderly.

Fig. 1 Overview of the online daily habit modeling and anomaly detection (ODHMAD) model

Compared with existing studies, the online processing manner of the OAR model enables an immediate response of the system to potential urgent personal and environmental events, such as falls and fire. Besides, OAR recognizes activities solely based on the activation status of the sensors, so it requires light parameter settings and no training data. The dynamic daily habit modeling (DDHM) model provides an informative summarization of the elderly’s health status, making it possible for the elderly to have a direct look at their daily lifestyle. Besides, it is feasible to use such knowledge to infer the most likely activity to be performed by the elderly at a specific time. Regarding anomaly detection, ODHMAD does not rely on training data of normal behaviors as most algorithms do. Instead, the learnt daily habits of the elderly provide a personalized knowledge base for detecting abnormal behavior of the elderly. In the following sections, technical details of the three modules are introduced.

3.1 Sensor data gathering and processing

The gathering and processing of sensor data serve as the basis for any sensing-based healthcare system. To build a robust system for daily activity analysis, the first step is always the deployment of sensors and the decision on the target activities. In this section, we summarize important activities and popular sensors for detecting them from recent studies. As shown in Table 1, we observe that with the advances of sensors, the daily trajectory of a user can be sensed using simple ambient sensors, such as switch, pressure, and motion sensors. Besides, some simple behaviors, such as device usage and exercise, and urgent events such as falls, can be inferred using wearable sensors. However, for complex behaviors such as sleeping, drinking, and taking medicine, existing studies typically use the detection of lying in bed, holding a water bottle/cup, and holding a medicine bottle instead. Therefore, precise detection of the complex behaviors of a user is still an open problem. Besides, given that much of the recent effort has been on the detection of specific behaviors, gathering sufficient information from multiple sensors for detecting complex behaviors will be an important direction.

Table 1 Summary of the activities to be detected and the utilized sensors

With a well-defined mapping between activities and sensors, the collected raw sensor data should be pre-processed before being fed to the system. As the example of sensor data in Fig. 1 shows, the data from a sensor at a given time typically contain information in multiple entries. Therefore, effective processing and selection of meaningful information from the raw sensor data are necessary to put them in a proper format for later input to the intelligent system and for backup purposes. Existing studies typically use traditional text processing tools to achieve this task. However, there exist critical practical issues, such as missing data, data storage, and the interface developed for transmitting data from the sensor side to the server side.

In our study, we have established a simulation environment to gather sensor data and test the developed model. Up to now, we have installed 18 sensors, including pressure, switch, noise, light, temperature, and humidity sensors, and identified 11 target activities, including sleeping, cooking, eating, leaving home, watching TV, using the toilet, dressing, having visitors, using the laundry, doing exercise, and taking medicine. Note that further important activities will be investigated, and more sensors will be integrated into our system to explore more effective solutions for detecting the target activities. Besides the sensors for sensing the elderly’s activities, environmental sensors are utilized to provide an evaluation of the living environment of the elderly and also to detect urgent events such as fire and explosion.

3.2 Online activity recognition (OAR) model

The online activity recognition (OAR) model (Fig. 2) performs online analysis of sensor data in an incremental manner to recognize activities. Unlike algorithms based on classification, OAR does not require training data. However, several issues should be addressed, including 1) how to be aware of the activation status of a sensor, i.e. whether some behavior information of the elderly is captured by the sensor; 2) how to decide the start and end time of an activity; and 3) how to deal with discontinuous activities, which may be interrupted several times before they end. OAR copes with such challenges based on six assumptions:

  1. A sensor will return stable values when no activity happens;

  2. A sensor should return much higher or lower values when the corresponding activity is happening;

  3. The total time of a sensor in activation should be no longer than that in normal status, i.e. the status when no activity happens;

  4. An activity should last for a certain period of time;

  5. Short breaks during an activity should not divide the whole activity into several periods;

  6. The same set of sensor(s) should not be the sole indicator for more than one activity.

Fig. 2 Flowchart of the online activity recognition (OAR) model

With the above assumptions, OAR is able to recognize the elderly’s activities based on the activation status of sensors. Figure 3 shows an example of the activities detected by OAR. However, these assumptions also limit the ability of OAR to recognize activities that should be distinguished by specific signal curve patterns, such as level walking and ascending stairs. Fortunately, these assumptions are applicable to wearable sensors for detecting drastic activities/events such as falls and most ambient sensors.

Fig. 3 Example of the information of activities detected by the online activity recognition (OAR) model

As observed from the flowchart of the OAR model in Fig. 2, OAR utilizes five types of information to detect an activity in an online manner, including sensor activation period status, sensor normal status, sensor break status, sensor pending status, and activity-sensor mapping status. We illustrate the details of each type of information and how OAR handles such information as follows, and summarize the entire algorithm in Algorithm 1.

  • Sensor activation period status indicates whether a sensor is in activation and the associated information. Specifically, the corresponding file for this status of a sensor records the flag indicating whether the sensor is in activation, the start and end date/time, the mean sensor value during the activation, and the number of data items received during the activation period. This enables OAR to capture the current status of all sensors and to dynamically receive and record information from new sensors.

  • Sensor normal status evaluates whether the received sensor signal indicates an activation of the corresponding sensor. OAR achieves this goal by modeling the normal status of sensor values so that the activation of a sensor can be determined by sensor values that differ greatly from the learned normal ones. OAR adopts different equations for modeling the normal status of sensors producing different types of output signals. For state-change sensors producing binary values, a fixed value is sufficient; for real-valued sensors producing fluctuating curves, a Gaussian-like probability density function \(f(x)\sim N(\mu ,\sigma ^{2})\) is used to model the range of normal sensor values, where \(\mu \) and \(\sigma ^{2}\) are the mean value and variance respectively. Given the sensor values \(\{a_{i}\}_{i=1}^{n}\) and the learned function \(f(x)=e^{-\frac {(x-\mu )^{2}}{2\sigma ^{2}}}\), when a new sensor value \(a_{n+1}\) arrives, the update functions for the new parameter values \(\mu ^{\prime }\) and \({\sigma ^{2}}^{\prime }\) are defined by

    $$ \mu^{\prime}=\frac{n}{n+1}\mu+\frac{a_{n+1}}{n+1}, $$
    (1)
    $$ {\sigma^{2}}^{\prime}=\frac{n}{n+1}(\mu^{2}+\sigma^{2})+\frac{a_{n+1}^{2}}{n+1}-{\mu^{\prime}}^{2}. $$
    (2)

    The Gaussian distribution f(x) provides a quantitative evaluation for the normal status of sensors, and sensor values far from the normal ones indicate the activation of sensors. In our study, we typically use x∈[μ−2σ,μ+2σ] as the range for normal status evaluation, which has a relatively strong immunity to unstable signals.

  • Sensor break status and pending status work in conjunction to record breaks during an activity. They, on one hand, help OAR to precisely detect the end time of an activation period; on the other hand, they enable OAR to detect activities with short interruptions. The information on breaks may also be important indicators for the elderly’s healthcare, such as the quality of sleeping. Note that domain knowledge for specific activities here is required to select a proper time interval as a short break.

  • Activity-sensor mapping status includes an indexing list of the mapping between activities and sensors, similar to those listed in Table 1. It not only enables OAR to quickly check for the occurrence of an activity immediately after the completion of the activation period of a sensor, but also makes it possible for OAR to recognize complex activities that should be detected using multiple sensors, such as sleeping (pressure sensors on the bed and wearable sensors for detecting heart rate).
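The incremental updates in (1) and (2) and the two-sigma range used for the normal status of a real-valued sensor can be illustrated with a minimal sketch. The class and method names (`RunningNormal`, `update`, `is_activated`) are hypothetical and not part of the OAR implementation. Whether values flagged as activations should also update the normal-status model is not specified in the text, so the sketch leaves that choice to the caller.

```python
import math


class RunningNormal:
    """Hypothetical tracker for the normal status of one real-valued sensor."""

    def __init__(self, first_value: float) -> None:
        self.n = 1                # number of values seen so far
        self.mean = first_value   # mu
        self.var = 0.0            # sigma^2

    def update(self, value: float) -> None:
        """Fold one new sensor value into mu and sigma^2 using Eqs. (1) and (2)."""
        n = self.n
        new_mean = n / (n + 1) * self.mean + value / (n + 1)
        new_var = (n / (n + 1) * (self.mean ** 2 + self.var)
                   + value ** 2 / (n + 1) - new_mean ** 2)
        # Clamp tiny negative values caused by floating-point rounding.
        self.mean, self.var, self.n = new_mean, max(new_var, 0.0), n + 1

    def is_activated(self, value: float, k: float = 2.0) -> bool:
        """Return True if value lies outside [mu - k*sigma, mu + k*sigma].

        With a single observation sigma is 0, so any different value is
        reported as an activation; a warm-up period may be needed in practice.
        """
        return abs(value - self.mean) > k * math.sqrt(self.var)


if __name__ == "__main__":
    model = RunningNormal(1.0)
    for v in (2.0, 3.0, 4.0):
        model.update(v)
    print(model.mean, model.var, model.is_activated(10.0))
```

Because only the count, mean, and variance are stored, each new value is processed in constant time and memory, which fits the online setting described above.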

Algorithm 1

3.3 Dynamic daily habit modeling (DDHM) model

The dynamic daily habit modeling (DDHM) model aims to learn the daily habits of the elderly from their daily activities. In the current ODHMAD system, DDHM dynamically generates a two-layer tree structure from the daily activities recognized by the OAR model for modeling the elderly’s daily habits. As shown in Fig. 4, each node in the first layer specifies a predefined activity, while the probabilities of the elderly performing an activity in different time periods are modeled in the second layer. Specifically, for each period of an activity, DDHM models important indicators such as start time (\(T_{start}\)), end time (\(T_{end}\)), and the number of breaks (\(B\)). Similar to modeling the normal status of sensors, the Gaussian-like probability density function and the incremental update equations, i.e. (1) and (2), are utilized to model these indicators for discovering the elderly’s daily habits.

Fig. 4 Structure of the dynamic daily habit modeling (DDHM) model

To effectively build the two-layer hierarchy, an important task is to precisely identify and distinguish different modeled periods of activities in an online manner. To achieve this goal, we utilize the information on start time, end time, and duration to quantitatively evaluate the similarity between the detected activity and the modeled daily habit. The higher the similarity, the higher the probability that the detected activity occurs in the modeled period. Given a detected activity period with start time \(t_{1}\) and end time \(t_{2}\) and a selected period with probability density functions \(f_{1}(x)\sim N(\mu _{1},\sigma _{1})\) and \(f_{2}(x)\sim N(\mu _{2},\sigma _{2})\) for start time and end time respectively, the similarity between them is defined by

$$ Sim \,=\,\frac{1}{2}(f_{1}(t_{1})+f_{2}(t_{2}))+\frac{\max(0,\min(t_{2},\mu_{2})-\max(t_{1},\mu_{1}))}{2}\left( \frac{1}{t_{2}-t_{1}}+\frac{1}{\mu_{2}-\mu_{1}}\right). $$
(3)

Equation 3 essentially evaluates three aspects: how close the start times are, how close the end times are, and how much the two periods overlap. If the similarity is lower than a threshold, say 80 %, for all periods of the activity in the hierarchy, a new node will be created to model this new period of the activity. In practice, rare nodes can be pruned to prevent node proliferation and save computational resources.
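As an illustration, the similarity in (3) can be computed as in the sketch below. The function names are hypothetical, times are assumed to be expressed as floating-point hours, and the unnormalised Gaussian-like form \(e^{-(x-\mu )^{2}/(2\sigma ^{2})}\) from Section 3.2 is assumed for \(f_{1}\) and \(f_{2}\). As written, each of the two terms in (3) is at most 1, so the raw value lies in [0, 2]. How the percentage thresholds map onto this value is not specified in the text, so the sketch returns the raw value.

```python
import math


def gaussian_like(x: float, mu: float, sigma: float) -> float:
    """Unnormalised Gaussian-like score exp(-(x - mu)^2 / (2 sigma^2)), in (0, 1]."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def similarity(t1: float, t2: float,
               mu1: float, sigma1: float,
               mu2: float, sigma2: float) -> float:
    """Similarity of Eq. (3) between a detected period and a modeled period.

    (t1, t2)       start and end time of the detected activity, with t2 > t1
    (mu1, sigma1)  modeled start time, with sigma1 > 0
    (mu2, sigma2)  modeled end time, with mu2 > mu1 and sigma2 > 0
    """
    # First term: closeness of the start times and of the end times.
    closeness = 0.5 * (gaussian_like(t1, mu1, sigma1)
                       + gaussian_like(t2, mu2, sigma2))
    # Second term: overlap, averaged relative to both durations.
    overlap = max(0.0, min(t2, mu2) - max(t1, mu1))
    return closeness + 0.5 * overlap * (1.0 / (t2 - t1) + 1.0 / (mu2 - mu1))


if __name__ == "__main__":
    # Sleep modeled from 22:00 to 06:00, written as hours 22 to 30.
    print(similarity(22.5, 30.0, 22.0, 0.5, 30.0, 0.5))
```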

3.4 Personalized anomaly detection method

The two-layer hierarchy generated by the DDHM model not only produces a summary of the elderly’s daily habits and indicates their health status, but also serves as a knowledge base assisting personalized anomaly detection from the elderly’s daily behavior. Anomaly detection works in conjunction with the daily habit modeling process in DDHM. Once the hierarchy of the elderly’s daily habits has become stable, if the similarity between the detected activity and the most similar period in the hierarchy does not reach a threshold, say 30 %, the detected activity will be deemed a potential anomaly and an alarm can be sent instead of creating a new node for habit modeling. Besides, risky events such as falls should trigger an alarm immediately without modeling.
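The decision logic described in Sections 3.3 and 3.4 can be summarized in a short sketch. The names `Decision` and `decide` are hypothetical. The default thresholds mirror the illustrative 80 % and 30 % values in the text, under the assumption that the similarity is expressed on a 0 to 1 scale. How stability of the hierarchy is determined is not specified, so it is passed in as a flag.

```python
from enum import Enum


class Decision(Enum):
    ALARM = "alarm"              # potential anomaly or urgent event
    NEW_NODE = "new_node"        # create a new period node for the activity
    UPDATE_NODE = "update_node"  # fold the activity into the best-matching node


def decide(best_similarity: float,
           hierarchy_stable: bool,
           is_urgent_event: bool = False,
           new_node_threshold: float = 0.8,
           anomaly_threshold: float = 0.3) -> Decision:
    """Choose what to do with an activity detected by OAR.

    best_similarity is the similarity to the most similar modeled period
    of the same activity (assumed scale: 0 to 1).
    """
    # Urgent events such as falls raise an alarm immediately.
    if is_urgent_event:
        return Decision.ALARM
    # With a stable hierarchy, a very dissimilar activity is a potential anomaly.
    if hierarchy_stable and best_similarity < anomaly_threshold:
        return Decision.ALARM
    # Otherwise a dissimilar activity starts a new modeled period.
    if best_similarity < new_node_threshold:
        return Decision.NEW_NODE
    return Decision.UPDATE_NODE


if __name__ == "__main__":
    print(decide(0.1, hierarchy_stable=True))
    print(decide(0.1, hierarchy_stable=False))
```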

4 Experiments

As an online system for personalized daily activity recognition, daily habit modeling, and anomaly detection, the proposed ODHMAD model should be evaluated in terms of three aspects: the performance of the OAR model for activity recognition from sensor data, the quality of the two-layer hierarchy generated by DDHM for daily habit modeling, and the performance of anomaly detection. Unfortunately, as this is an early-stage study, we are still preparing for the collection of such real-world data. Besides, we did not find a publicly accessible data set for whole-day monitoring of a user’s behaviors. Therefore, we are unable to evaluate the performance of the proposed DDHM model for daily habit modeling and anomaly detection at the current stage.

In the following sections, we report our experiments on two publicly accessible data sets, i.e. the fall detection dataset [27] and the Opportunity activity recognition dataset [34], to evaluate the performance of the proposed OAR model for online activity detection.

4.1 Dataset and experiment setup

4.1.1 Fall detection dataset

The fall detection data set [27] is originally collected for simulated falls, near-falls, and activities of daily living. The data are collected from 42 volunteers, each of whom wears two sets of sensors, including a 3D accelerometer and a 3D gyroscope, on chest and thigh respectively. The volunteers are divided into two groups, in which 32 of them in group 1 perform a series of activities including falls, near-falls, and a set of daily activities, such as standing, sitting, walking and lying; while the rest perform ascending and descending of stairs. During the activities, data are collected at 100 Hz.

In our experiments, we utilized the sensor data of the 32 volunteers/subjects in group 1 to evaluate the performance of our proposed OAR model on detecting falls. Specifically, each subject has the number of data items ranging from 130k to 160k, and each item has 12 dimensions recording the data from the two sets of sensors. To evaluate the fall event, we used the data from the six dimensions of 3D accelerometer and 3D gyroscope deployed on the chest of subjects for experiments.

4.1.2 Opportunity activity recognition dataset

The Opportunity activity recognition dataset [34] is for human activity recognition from wearable, object, and ambient sensors. There are in total four subjects, each of whom performs six runs of activities, including activities of daily living and scripted activities. The sensor data are collected at 30 Hz. It is notable that the annotation of this data set is rather rich and diverse, which includes four types of locomotion, thirteen actions to twenty-three objects, seventeen gestures, and five types of activities.

In contrast to the experiments on the fall detection data set, in which we evaluated the performance of the proposed OAR model on urgent event detection, here we aimed to demonstrate the performance of the OAR model on daily activity recognition. Therefore, we selected six types of activities for performance evaluation, including taking the cup, taking the bottle, open/close door1, open/close dishwasher, open/close upper-drawer, and open/close fridge. For the six runs of activities of each subject, the sensor data from the 3D accelerometers attached to the corresponding objects and to the left/right hands are utilized for the experiments. Please note that we processed the ground-truth labels so that activities performed by either the left or the right hand are treated as the same.

4.2 Evaluation measures

We adopted three performance evaluation measures for activity recognition, namely precision, false alarm rate (\(FA\_Rate\)), and miss detection rate (\(M\_Rate\)), which are defined by

$$ Precision = \frac{n_{true}}{n_{detected}}, $$
(4)
$$ FA\_Rate = \frac{n_{false}}{n_{detected}}= 1-\frac{n_{true}}{n_{detected}}, $$
(5)
$$ M\_Rate = 1-\frac{n_{true}}{n_{activity}}, $$
(6)

where \(n_{detected}\) is the number of detected activities, \(n_{true}\) is the number of correct detections, \(n_{false}\) is the number of false detections, and \(n_{activity}\) is the total number of activities that actually occurred.

We counted a detection as correct if there is an overlap between the detected period and a ground-truth period. Note that although the false alarm rate is the complement of precision, it is one of the most important indicators of the performance of a detection system. We therefore report the performance in terms of both precision and false alarm rate.
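The three measures in (4)–(6), together with the overlap-based counting of correct detections, can be sketched as follows. The function names are hypothetical. Periods are assumed to be `(start, end)` pairs with `start < end`, and each ground-truth period is assumed to be matched by at most one detected period, so that a single \(n_{true}\) can serve both precision and miss detection rate as in the equations.

```python
from typing import Dict, List, Tuple

Period = Tuple[float, float]  # (start, end) with start < end


def overlaps(a: Period, b: Period) -> bool:
    """Return True if the two periods share a time span of positive length."""
    return a[0] < b[1] and b[0] < a[1]


def evaluate(detected: List[Period], ground_truth: List[Period]) -> Dict[str, float]:
    """Compute precision, false alarm rate, and miss detection rate, Eqs. (4)-(6)."""
    if not detected or not ground_truth:
        raise ValueError("both period lists must be non-empty")
    n_detected = len(detected)
    n_activity = len(ground_truth)
    # A detection is correct if it overlaps any ground-truth period.
    n_true = sum(1 for d in detected
                 if any(overlaps(d, g) for g in ground_truth))
    precision = n_true / n_detected
    return {
        "precision": precision,
        "fa_rate": 1.0 - precision,
        "m_rate": 1.0 - n_true / n_activity,
    }


if __name__ == "__main__":
    detected = [(0, 5), (10, 12), (20, 25)]
    truth = [(1, 4), (21, 30), (40, 45), (50, 55)]
    print(evaluate(detected, truth))
```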

4.3 Case study of OAR model

We first evaluated the performance of the OAR model by conducting a case study on the fall detection data set to visually observe how OAR works. Specifically, we selected data from certain dimensions of the sensors and incrementally fed them to OAR to obtain the learned normal status of sensors and the detected fall periods. Because data from all subjects typically produce similar curve patterns, we take the data of subject 1 as an example, where a visualization of the sensor data is shown in Fig. 5. We observed that the two types of sensors produce sensor data in quite different ranges of values, and even the sensor data from the same sensor but different axes also have different curves. Our objective is to correctly identify all fall periods from such sequential sensor data.

Fig. 5 Visualization of data from the respective x-, y-, and z-axis of the 3D accelerometer (a)–(c) and the 3D gyroscope (d)–(f) deployed on the chest of subject 1. The x-axis of the graphs is the number of data items while the y-axis is the sensor value

The ground-truth and experimental results of our OAR model are shown in Fig. 6. Note that we used the graph of data from the x-axis of the 3D accelerometer to show the results, because they are the most similar to the ground-truth. From Fig. 6a, we observed that the x-axis of the 3D accelerometer typically produced a peak value during a fall period. This enabled the OAR model to effectively detect the fall events. However, there were also peak values that did not indicate fall periods, which may degrade the performance of OAR for fall detection. As explained by the authors who created this data set, those peak values are produced by near-falls or transitions of postures. The fall periods detected by OAR using solely data from the x-axis of the 3D accelerometer are presented in Fig. 6b. We observed that OAR correctly modeled the regions of normal sensor status outside the fall periods. This demonstrated the effectiveness of Eqs. (1) and (2) in modeling the normal sensor status using the Gaussian-like probability density function \(f(x)\sim N(\mu ,\sigma ^{2})\) and the suggested boundaries x∈[μ−2σ,μ+2σ]. Also, OAR correctly detected all fall events. However, as expected, a number of false alarms were produced. This demonstrated that using only one type of data is insufficient to detect complex activities like falls. Therefore, we further evaluated whether the performance of OAR can be improved by using multiple types of sensor data. Figure 6c illustrates the fall periods detected by OAR using data from all six dimensions of the 3D accelerometer and 3D gyroscope. We observed that, although the correlation between the curves of different dimensions, as visualized in Fig. 5, is not obvious to human judgement, OAR was able to utilize such information to detect 13 out of 14 fall events while producing no false alarms. This demonstrates the performance of OAR in sensor data fusion for fall event detection.

Fig. 6 Results of the periods of falls detected by OAR and the ground-truth on the sensor data of subject 1. a The sensor data from the x-axis of the 3D accelerometer with ground-truth fall periods marked in red; b the fall periods marked in red detected by OAR using solely the data from the x-axis of the 3D accelerometer. The black line in the middle and the two blue lines are the mean value and bounds learned by (1) and (2); c the fall periods detected by OAR using all information from the 3D accelerometer and 3D gyroscope

4.4 Performance comparison

4.4.1 Performance comparison on fall detection dataset

We evaluated the performance of the OAR model for fall detection on the fall detection dataset and compared it with related algorithms for daily activity and fall detection, including \(Model_{Chen}\) [6], \(Model_{Li}\) [18], C4.5 Decision Tree (DT) [26], and HMM [41]. Note that all algorithms in the comparison except \(Model_{Li}\) are unable to perform online analysis of the sensor data. Instead, those algorithms require sensor data to be presented in batches. Also, \(Model_{Chen}\) and HMM apply to a single accelerometer only and cannot perform fusion of multiple sensor data sources. Moreover, the algorithms DT and HMM require training data. For a fair comparison, in the experiments, we extracted features, selected sliding windows and moving speed, and tuned parameters for the baseline algorithms according to the methods described in the respective papers in order to ensure that all algorithms can obtain reasonable performance. For DT and HMM, which require training, we performed 4-fold cross-validation.

The performance of all algorithms on fall detection, in terms of both the mean value and the standard deviation, is reported in Table 2. We observed that, even without any training data, OAR outperformed the other algorithms on all evaluation measures. In contrast to the other algorithms, which require a specific time period/window and data-dependent parameters to be set for analyzing the data, OAR requires just a subjective value to determine a break. This demonstrated that OAR could effectively learn the information required for activity recognition from sensor data streams. The higher precision indicated that OAR could better distinguish fall events from other daily activities, such as walking, sitting, and lying, while the lower miss detection rate demonstrated that OAR could learn to correctly recognize different types of falls, such as forward and lateral falls. Since OAR incrementally models the necessary knowledge of sensors from sensor data streams, it was likely to misrecognize fluctuations as the activation status of a sensor during the early learning process. Therefore, we believe the performance of OAR could be improved by making use of past data or domain knowledge to initialize the algorithm.
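For reference, the three evaluation measures can be computed from event counts as in the sketch below. The formulas are the common confusion-matrix definitions and are an assumption here, since the exact definitions used for Table 2 are not restated in this section; the function name `detection_metrics` is hypothetical.

```python
def detection_metrics(tp, fp, fn, tn):
    """Compute precision, false alarm rate, and miss detection rate.

    Assumed definitions (standard, not confirmed by the paper):
      precision = TP / (TP + FP)
      FA_Rate   = FP / (FP + TN)
      M_Rate    = FN / (FN + TP)
    A ratio with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    fa_rate = fp / (fp + tn) if fp + tn else 0.0
    m_rate = fn / (fn + tp) if fn + tp else 0.0
    return {"precision": precision, "fa_rate": fa_rate, "m_rate": m_rate}
```

Under these definitions, detecting 13 of 14 fall events with no false alarms, as observed for subject 1, corresponds to a precision of 1 and a miss detection rate of 1/14 (about 0.07).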

Table 2 Performance comparison between OAR and baselines for fall detection on the fall detection dataset in terms of precision, false alarm rate (FA_Rate), and miss detection rate (M_Rate)

4.4.2 Performance comparison on opportunity activity recognition dataset

Similar to the experiments in Section 4.4.1, we evaluated the performance of the OAR model and several baseline algorithms on the Opportunity activity recognition dataset for the detection of activities of daily living. Among the algorithms compared on the fall detection dataset, DT and HMM were retained, while Model_Chen and Model_Li were excluded because they were designed specifically for fall detection. In addition, we compared our OAR model with two algorithms that achieved promising performance on the Opportunity dataset. One is the k-nearest neighbor (kNN) algorithm with k=3 [34]; the other is the information theoretic score approach (ITS) [5], which is an ensemble method for activity recognition via sensor data fusion. For kNN, we concatenated the feature vectors of sensor data from different axes of all selected sensors for sensor data fusion; for ITS, we performed 4-fold cross-validation for training classifiers.
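The feature-level fusion used for the kNN baseline can be sketched as follows: per-axis feature vectors are concatenated into one vector, and a query is labelled by majority vote among its k=3 nearest stored examples. The Euclidean distance, the function names, and the toy labels are assumptions for illustration; the features and distance used in [34] may differ.

```python
import math
from collections import Counter


def concatenate_features(per_axis_features):
    """Flatten a list of per-sensor (or per-axis) feature vectors into one vector."""
    return [value for features in per_axis_features for value in features]


def knn_predict(stored_vectors, stored_labels, query, k=3):
    """Label `query` by majority vote among its k nearest stored vectors.

    Distance is Euclidean (an assumption); ties in the vote are broken
    in favour of the label encountered first among the nearest neighbours.
    """
    neighbours = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(stored_vectors, stored_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]
```

Because all axes share one distance computation after concatenation, features on larger numeric scales dominate the vote unless they are normalised beforehand.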

As reported in Table 3, among the algorithms that do not require training data, the OAR model significantly outperformed kNN in terms of precision, false alarm rate, and miss detection rate. This demonstrated the effectiveness of OAR in the adaptive unsupervised modeling of sensor status and the fusion of multiple sensor data for activity recognition. Compared with the supervised models ITS, DT, and HMM, OAR still obtained performance comparable to that of the best algorithm, i.e. the ITS algorithm, and achieved a much better miss detection rate. It is notable that OAR achieved superior performance in terms of miss detection rate while obtaining reasonable precision in the experiments on both datasets. This indicated that OAR could correctly identify more activities of daily living than other algorithms while maintaining a lower false alarm rate.

Table 3 Performance comparison between OAR and baselines for activity detection on the Opportunity activity recognition dataset in terms of precision, false alarm rate (FA_Rate), and miss detection rate (M_Rate)

4.5 Computational efficiency analysis

In this section, we evaluated the computational efficiency of the OAR model and the compared algorithms on the fall detection dataset. Specifically, we simulated the case in which sensor data items were received sequentially and employed two measures to evaluate the efficiency of the algorithms: 1) the time cost of each algorithm for processing the same amount of data; and 2) the average time delay of each algorithm for each detected fall event, computed as the time interval between the detected time and the real start time of the event. Here, the first measure evaluates the total computational resources required by each algorithm, and the second evaluates how promptly each algorithm is able to react to an emerging activity.
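The two efficiency measures can be sketched as below. This is an illustrative Python sketch, not the Matlab code used in the experiments; the function names are hypothetical, and detected events are assumed to be already matched one-to-one with ground-truth events.

```python
import time


def average_detection_delay(detected_times, true_start_times):
    """Mean of (detected time - real start time) over matched events, in seconds."""
    if len(detected_times) != len(true_start_times):
        raise ValueError("each detected event needs a matching ground-truth start time")
    if not detected_times:
        raise ValueError("at least one detected event is required")
    delays = [d - s for d, s in zip(detected_times, true_start_times)]
    return sum(delays) / len(delays)


def time_cost(process_item, items):
    """Wall-clock seconds spent feeding `items` one by one to `process_item`.

    Feeding items sequentially simulates sensor data arriving as a stream.
    """
    start = time.perf_counter()
    for item in items:
        process_item(item)
    return time.perf_counter() - start
```

For example, two events detected at 10.5 s and 21.0 s whose real start times are 10.0 s and 20.0 s give an average delay of 0.75 s.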

We used the sensor data obtained from all axes of the 3D accelerometer and 3D gyroscope of subject 1. The parameters of all algorithms were set to those used in Section 4.4.1. All algorithms were implemented in Matlab and run on a 3.40GHz Intel(R) Core(TM) i7-4770 CPU with 16GB RAM. The time cost of all algorithms is presented in Fig. 7. We observed that the OAR model required more time than the other algorithms to process the same amount of data. This was because OAR performed sensor status modeling and activity information storage at the same time while processing the data. Therefore, besides the update of the Gaussian models for the sensors, the I/O stream communication with files incurred a heavy time cost. In return, however, OAR is able to produce more information about the detected activities than the other algorithms, such as the start and end times and the number of breaks. From the time delay presented in Table 4, we observed the superior performance of the proposed OAR model in terms of reaction to emerging activities. This was achieved by its simple but effective logic for activity recognition. Unlike the other algorithms, which require batch-mode processing of the sequential sensor data or higher-level feature extraction, OAR incrementally models the normal status of sensors, so that activities can be discriminated by abnormal sensor data. This also demonstrated the importance of online learning models for efficient healthcare systems for daily activity sensing.

Fig. 7 Time cost of OAR and the compared algorithms on processing the sensor data of subject 1

Table 4 Average time delay (in seconds) of OAR and the compared algorithms for each detected fall event

5 Conclusion

This paper proposed a novel real-time unobtrusive sensing homecare framework, termed the online daily habit modeling and anomaly detection (ODHMAD) model, which can perform daily activity recognition, habit modeling, and anomaly detection for the solitary elderly in their living space. ODHMAD consists of an online activity recognition (OAR) model and a dynamic daily habit modeling (DDHM) component. OAR performs online processing of the sensor data to identify daily activities and urgent events of the elderly. In contrast to most activity detection algorithms, OAR requires only light parameter tuning and no training data, and is able to capture activity details, such as start/end time, duration, sensor conditions, and the number of breaks. DDHM generates a two-layer hierarchy for modeling the elderly’s daily habits based on the activity information identified by OAR. This hierarchy can serve as a personalized knowledge base for recognizing abnormal behaviors, and can also be an important indicator of the elderly’s wellness for their family and caregivers.

As this is an early-stage study, there is plenty of room for improvement. First, although we have demonstrated the effectiveness of the OAR model for online activity recognition, OAR recognizes activities based on the activation status of sensors rather than the curve patterns that record how the sensor values change during a period. Thus, OAR may not be able to distinguish between activities that trigger the activation of the same set of sensors but result in different curve patterns, such as falls and quick posture transitions. Therefore, incorporating classification methods as a middle layer to further analyze the activity periods recognized by OAR is a promising way to improve the recognition ability of the system. Second, the OAR model determines the occurrence of activities via the modeled activation status of sensors, which is a binary decision that does not consider the relative importance of sensors in activity recognition. A promising way to improve the OAR model is to introduce importance scores for sensors in recognizing specific activities. Third, besides online processing, offline data analysis methods will be included in our future work to mine important relations between sensors and activities and thereby improve the online system. Fourth, in this study, we only evaluated the effectiveness of the OAR model. In the next stage, we will collect real-world data to evaluate the whole system and further improve it by incorporating real-world requirements. Lastly, the currently proposed system is applicable to one person only. Investigating methods to recognize activities performed by multiple residents will be an interesting direction.