1 Introduction

Recent advances in sensor technology, together with remarkable progress in Machine Learning methods, have opened up a wide variety of real-world applications that combine different types of sensors for data acquisition with Machine Learning techniques for data processing. Various sensors have been utilized in research projects. Numerous studies use visual sensors such as cameras, which provide sequences of images or videos as the sensor data. Despite the successful performance of vision-based solutions, non-visual sensors are still required to overcome their limitations. Non-visual sensors can be installed on the human body as wearable sensors or in the environment as ambient sensors. Exploiting networks of heterogeneous sensors has also been of interest in many recent studies, with the aim of further enhancing performance.

Human activity recognition (HAR) is an active and important research area in Machine Learning. In many real-world applications, the system’s decision is based on the current human activity, which its HAR module infers from sensor data.

The Smart Environment is a notable real-world application that aims to monitor the behavior of individuals in an environment. Such an environment is enriched with sensors, actuators, and processing units, which can also be embedded in objects. Monitoring systems, as one use case of Smart Environments, are deployed to predict, detect, and neutralize abnormal behaviors of individuals in environments such as metro stations.

Ambient Assisted Living (AAL) approaches use Activity Recognition methods to monitor residents’ behavior and support them in order to improve their quality of life. HAR has also been exploited in game consoles and in fitness and health monitoring applications on smartphones. Many research laboratories have paid considerable attention to Smart Environment topics and, consequently, several research projects such as CASAS (Cook et al. 2009), MAVhome (Cook et al. 2003), PlaceLab (Intille et al. 2005), CARE (Kröse et al. 2008), and AwareHome (Kidd et al. 1999) have been deployed. Activity recognition in this context denotes recognizing the daily activities of a person living in the environment. Activities of Daily Living (ADLs) is a term defined in the healthcare domain that refers to people’s daily self-care activities (Katz et al. 1970).

1.1 Activity modeling

In general, there are two main approaches to modeling activities: data-driven and knowledge-driven. Data-driven models analyze large sets of data for each activity by means of Machine Learning and data mining techniques. Knowledge-driven models instead acquire sufficient prior knowledge of personal preferences and tendencies by applying knowledge engineering and management techniques. The activity recognition task poses several challenges, such as labeling activities, handling interrupted activities, dealing with sensors that produce non-valuable data, and extracting activities. However, online recognition is noted as one of the most dominant challenges, since the main requirement in many modern real-world applications is to recognize the user’s behavior from an online stream of data.

In the present study, the data stream is generated from the output of a set of binary environmental sensors. The majority of online activity recognition approaches are fed with data that has been preprocessed using the sliding window technique. The main drawback of this technique is that the window size is itself a parameter that must be optimized.

To address this issue, we propose a method that detects the activity pane in the data stream by discovering the sequential patterns of sensor data at the beginning and termination of each activity. By detecting the activity boundaries, we can recognize the activity using a proper segment of the streaming sensor data. Once the beginning of an activity is detected, the model keeps reporting the identified activity until its end is recognized by its sequential pattern. While the model is reporting the current activity, it is capable of recognizing other activities that interrupt the ongoing one. Once the activity’s segment is identified, statistical features of the segment, including the activity duration, are taken as input to correct the predicted label of the activity. Consequently, we get rid of non-valuable sensor data by ignoring all intervals other than activity panes, and we avoid tuning the window size parameter.

Our model consists of a set of Hidden Markov Models arranged hierarchically in order to recognize the beginning of an activity, estimate the ongoing activity, and detect the end and the class of the activity. Our evaluation results demonstrate the superiority of the proposed model over some existing models and competitive performance compared to state-of-the-art models.

The remainder of the paper is organized as follows. Section 2 outlines previous research in this domain. Section 3 describes the proposed method and implementation details. The evaluation setup and experimental results are presented in Sects. 4 and 5, and Sect. 6 summarizes the results of this work and draws conclusions.

2 Related work

Smart environments obtain knowledge from the physical platform and its residents in order to adapt their settings to the residents’ patterns of interaction with the environment. These systems pursue numerous goals, such as monitoring the environment, assisting residents, and controlling resources. The first idea of using sensor technology to create a smart environment dates back to 1990. Mozer’s neural network house adjusts environmental conditions based on user behavior patterns (Mozer 1998). The following decades witnessed considerable progress in this field, which was achieved by exploiting HAR techniques.

Several publications document research on human activity recognition. Generally, sensor-based and vision-based approaches are documented in the literature as the two main categories of HAR. This paper concentrates on sensor-based HAR. The common basis for the majority of approaches of this type is to process the sequence of sensor events and detect the corresponding activity based on the discovered pattern. Furthermore, activity recognition has been widely used in various applications and environments. Hence, it is important to have a robust Activity Recognition model that is invariant to environmental settings such as the structure and placement of sensors in the smart environment.

To address this requirement, the authors of Cook (2012) propose a robust model of the sensors which extracts different features, such as the duration of the activity. In this generalized model, the problem is independent of the sensor environment and the model can be applied to different users. Besides, evaluating such models requires a sufficient amount of correctly labeled data. Nevertheless, manual labeling of sensor data is prohibitive because it is highly time-consuming and often inaccurate.

Moreover, the way annotation is carried out is often ignored, although it introduces bias into the data. In the most common annotation approach, residents of the smart environment are asked to perform an activity, and annotation is then done based on the activated sensors. Nonetheless, this approach may not be practical in all situations. The labeling mechanism presented in Szewcyzk et al. (2009) is an example of an existing solution for annotating sensor data automatically.

Much research on activity recognition has been done using pre-segmented data, meaning that the beginning and end of the activities are pre-determined in the dataset (Sánchez et al. 2008; Fleury et al. 2009). Such approaches are far from realistic and are not applicable in online applications, since the beginning and end of activities are not known in a stream of data. The researchers in Hong and Nugent (2013) and Okeyo et al. (2014) develop methods for sensor stream segmentation, which bring activity recognition systems closer to real-world conditions.

Despite all the research on HAR, open challenges such as overlapping or concurrent activities have yet to be solved. Overlap is the phenomenon in which different activity classes trigger the same set of sensor events, which makes overlapping activities hard to discriminate based only on the types of sensor events they have triggered (Wen et al. 2015; Helaoui et al. 2011).

A trajectory propagation algorithm is proposed in Zhang et al. (2014) that uses concurrent activation events to recognize the most probable trajectories and locations of occupants. More abstract details of an occupant’s behavior can be extracted by using the concurrent activation events of a binary motion sensor network.

Riboni et al. propose an unsupervised ontological method for interleaved activities (Riboni et al. 2016). Another unsupervised technique is introduced by Zhang et al., which is able to learn the occupants’ movement patterns in residents’ activity routines, which may be subject to change over time (Zhang et al. 2017). This decomposition-based learning approach overcomes a limitation faced by many supervised learning approaches.

AALO is a novel Activity Recognition system for single-resident smart homes that utilises Active Learning in the presence of Overlapped activities (Hoque and Stankovic 2012). It can recognize overlapped activities by means of a preprocessing step and an item-set mining phase. A key limitation of this work is that it cannot recognize overlapping activities that happen in the same location. In addition, it does not operate on an online data stream.

Moreover, the researchers in Malazi and Davari (2018) suggest a two-phase method based on emerging patterns which can recognize complex activities. In the first phase, this method extracts emerging patterns for distinct activities. In the second phase, it segments the streaming sensor data and then uses the time dependency between segments to concatenate the relevant ones. Segment concatenation lets the method recognize complex activities. Quero et al. (2018) propose a real-time method for recognizing interleaved activities based on Fuzzy Logic and Recurrent Neural Networks.

The authors of Cook et al. (2013) and Gjoreski and Roggen (2017) study the problem of handling the large proportion of available data that does not fall into the predefined classes, and address it by discovering patterns in this data and segmenting it into learnable classes. Such data usually belongs to sensors that are not exclusively involved in the predefined classes of activities. Gayathri et al. (2015) propose a hierarchical hybrid approach based on Markov logic networks (MLN) to detect abnormal activities in dementia care systems. Several Machine Learning approaches have been examined in the domain of Activity Recognition. Ensemble methods (Jurek et al. 2014), non-parametric models (Sun et al. 2014), temporal frequent pattern mining (Nazerfard 2018), support vector machine-based models (Sánchez et al. 2008), recurrent neural networks (Singh et al. 2017), and probabilistic models such as the hidden Markov model and the Markov random field (Kabir et al. 2016; Kasteren et al. 2010; Yan et al. 2016; Jalal and Kamal 2018) have been exploited in the literature. Nonetheless, less attention has been paid to Online Activity Recognition, which deals with processing a stream of sensor data, in contrast to the conventional approaches that utilize pre-segmented data.

Most of the presented solutions for streaming data processing are based on the sliding window technique (Okeyo et al. 2014; Krishnan and Cook 2014; Yala et al. 2015; Al Machot et al. 2016, 2018). The sliding window approach, or windowing for short, mainly considers the temporal relation or the number of sensors for framing the data. One of the key bottlenecks of this approach is fine-tuning the window size. One basic solution is to employ a constant, pre-determined window size (Kasteren et al. 2010; Krishnan and Cook 2014). However, as the number of activated sensors varies across activities, many researchers have turned to dynamic window sizes (Wan et al. 2015; Espinilla et al. 2018; Al Machot et al. 2016; Okeyo et al. 2014).

The authors of Al Machot et al. (2016) and Al Machot et al. (2018) present a novel probabilistic method to determine the window size. A different window size is initialized for each class of activity based on a prior estimate, and it is updated as new sensor events arrive.

Krishnan and Cook (2014) consider the temporal dependency among sensors as a criterion for sliding windowing, so that sensors which are activated within a certain time interval are examined as one activity. Their other proposed solution is a sensor-based method which, using the Mutual Information measure, treats a group of sensors that are consistently activated together as a window. The authors of Yala et al. (2015) extend the model of Krishnan and Cook (2014) and improve its performance by altering the computation of Mutual Information.

The researchers in Kabir et al. (2016) present a multi-stage classification method. In the first stage, a Hidden Markov Model clusters the activities based on location data; in the next stage, another HMM classifies the exact activity using a sequence of sensor data. The major weakness of this method is that it makes no attempt to specify the boundaries of activities, which negatively affects its performance.

Another method for tackling online recognition is introduced by Li et al. (2018), who propose cumulative fixed sliding windows for real-time activity recognition. Their segmentation method consists of several fixed-length time windows that overlap with each other. These overlapping windows are considered together as a single window, whose information is used to detect the ongoing activities.

A novel incremental activity inference engine is introduced in Hao et al. (2018) to predict and recognize activities in real time. It provides cognitive assistance to elderly people who suffer from cognitive impairment and show abnormal behavioral patterns. An ontological clustering method is also proposed to improve accuracy.

3 Proposed approach

The windowing technique is among the most widely used solutions for processing a stream of sensor data. However, one practical question that arises with this approach is how to determine the proper window size. Window size tuning remains an open issue, even though it has a great impact on performance, because the decision-making process halts until the model receives a complete window of data.

Fig. 1 (1) Distinct samples with different sizes; (2) different classes have different sizes

Fig. 2 Abstract structure of the proposed method

Moreover, the selected window passed to the model often contains irrelevant data which belongs to none of the predefined classes. A dynamic sliding window size may seem an appropriate solution. However, as shown in Fig. 1, not only do different classes of activity need different window sizes, but distinct samples of one class also require distinct window sizes.

This paper seeks to address these problems by presenting a new model based on dynamic segmentation of the data stream. Our proposed model breaks down the problem of Online Activity Recognition into two sub-problems, as depicted in Fig. 2. In the initial stage, the model detects the activity pane by recognizing the patterns of sensor events that occur when activities start and finish, and makes predictions while receiving data. Once the beginning and end of an activity are determined, in the next stage the model refines the prediction by classifying the activity into one of the predefined classes or the Other class. The predefined classes, proposed by medical specialists (Katz et al. 1970), include Activities of Daily Living such as eating, bathing, and sleeping. The Other class label is reserved for activities that do not fit into the predefined classes. The following sections explain each stage in detail.

The main contributions and innovations of this paper are as follows:

  • A novel method based on the occurrence pattern of activities that addresses the window size problem of previous sliding window methods.

  • An estimate of the ongoing activity after each sensor observation, which delivers real-time activity recognition.

  • Recognition of interrupted activities, even those which occur in the same location.

3.1 Detection phase

The first step in recognizing an activity on the stream of sensor data is to detect its occurrence. The idea is to investigate the sensor events at the beginning and end of the activities to extract the corresponding start and end patterns. As can be inferred from Fig. 3, the set of activated sensors for each activity is almost unchanged across most occurrences. This assumption can be justified by considering a locally limited functional zone for each activity. Meal preparation, for example, happens in the kitchen; thus it is the kitchen’s sensors that are activated during this activity.

The number of observations taken into account for detecting the beginning affects the model performance. Table 1 compares different values of this parameter and the corresponding results. Based on this comparison, we opted for three sensor events in our model. Indeed, the number of considered sensor events should be small enough that the next activity can still be recognized. For example, when the resident opens or closes the door, only a few sensors are activated; hence waiting for more observations is not an option.

Fig. 3 Normalized frequency of sensor activation in the beginning, end and duration of Meal Preparation, Personal Hygiene and Take Medicine activities

Table 1 Accuracy (%) of the proposed model obtained using \(\beta\) observations

On the other hand, some classes of activities, such as Personal Hygiene and Bathing, share a common set of activated sensors at the beginning and the end. Therefore, recognizing these activities requires more information, such as sensor activation sequences. Once the beginning of an activity is recognized, the model uses the upcoming sequence of sensor events to recognize the activity class itself, and as it receives more events its prediction gradually becomes more confident. This trend can be seen in Fig. 4.

Fig. 4 Log likelihood trend of candidate classes during the observation procedure

As Fig. 4 shows, at first the most probable activity is Bed to Toilet transition, whereas after more observations are received the estimated activity changes to Bathing. These activities overlap because they involve similar sensors. Therefore, more information is needed to produce a reliable prediction.

Numerous Machine Learning methods, such as Support Vector Machines, Neural Networks, Decision Trees, and Probabilistic Graphical Models, have been applied in the activity recognition field. The Hidden Markov Model is one of the most popular models in the literature for sequential data processing, as it does not require computation as intensive as that of Neural Networks. Besides, prior knowledge can be easily leveraged in probabilistic models, which are also more robust to noise.

Taking these reasons into account, we chose the HMM to recognize the activities. Suppose \(X_{t}\) represents the hidden state and \(Y_{t}\) the observation at time t, with lowercase \(x_{t}\) and \(y_{t}\) denoting their values. Assuming K possible hidden states, we have \(X_{t}\in \{1,\ldots ,K\}\). In our case, the observations are the sequence of sensor events. Equation (1) gives the calculation used to obtain the most probable class based on the observations:

$$\begin{aligned} P(X_{t}|y_{1:t}) \propto P(y_{t}|X_{t}) \left[ \sum _{x_{t-1}} P(X_{t}|x_{t-1}) P(x_{t-1}|y_{1:t-1}) \right] \end{aligned}$$
(1)
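
As a minimal sketch of the recursion in Eq. (1), the following Python function updates the filtered belief over the hidden states one observation at a time and normalizes it at every step. The names `forward_filter`, `prior`, `trans`, and `emit`, as well as the list-based representation of the model, are illustrative assumptions and not part of the original implementation.

```python
def forward_filter(prior, trans, emit, observations):
    """Return the filtered distributions P(X_t | y_1..t), one per time step.

    prior[k]      -- assumed initial probability of hidden state k
    trans[i][j]   -- P(X_t = j | X_{t-1} = i)
    emit[k][y]    -- P(y_t = y | X_t = k), observations are integer symbol ids
    """
    k = len(prior)
    belief = list(prior)
    history = []
    for t, y in enumerate(observations):
        if t == 0:
            predicted = belief
        else:
            # Bracketed term of Eq. (1): sum over the previous hidden state.
            predicted = [sum(belief[i] * trans[i][j] for i in range(k))
                         for j in range(k)]
        # Weight the prediction by the likelihood of the new observation.
        unnormalized = [emit[j][y] * predicted[j] for j in range(k)]
        total = sum(unnormalized)  # assumed non-zero for observed symbols
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


# Toy model with two hidden states and two observation symbols.
beliefs = forward_filter(
    prior=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    emit=[[0.8, 0.2], [0.3, 0.7]],
    observations=[0, 0, 1],
)
most_probable = [max(range(2), key=b.__getitem__) for b in beliefs]
```

The most probable class at time t is the index with the largest filtered probability, which is what `most_probable` collects for the toy sequence.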

A plain Hidden Markov Model is not well suited to processing long sequences, which led us to employ an extension of the HMM called the Hierarchical Hidden Markov Model (HHMM) in our study. The Hierarchical-HMM is suitable for problems that contain multilevel temporal dependencies and follow a hierarchical structure.

A Hierarchical-HMM is represented by a triple \((\zeta ,\gamma , \theta )\), where \(\zeta\), \(\gamma\), and \(\theta\) stand for the model structure, the set of observations, and the model’s parameters, respectively. The model structure defines the number of levels d and the parent-child relations on each level. States in the lowest level are the only generative ones, i.e., the only ones that generate observations. The parameter set \(\theta\) is defined as follows (Nguyen et al. 2005):

$$\begin{aligned} \theta = \{B(y|p), \pi ^{d,p^*}, A_{i, j}^{d,p^*}, A_{i, end}^{d,p^*} | \quad \forall (y,p,d,p^*,i,j) \}, \end{aligned}$$
(2)

where \(p^*\) is an abstract state in level d with children set \(ch(p^*)\), B(y|p) is the probability of generating observation y from generative state p, \(\pi ^{d,p^*}\) is the initial distribution over the children \(ch(p^*)\), \(A_{i, j}^{d,p^*}\) is the probability of a transition from child i to child j, and \(A_{i, end}^{d,p^*}\) is the probability that \(p^*\) terminates in state i. The value of \(\theta\) is obtained by the Maximum Likelihood technique.

Figure 5 shows the transition diagram of our implemented Hierarchical-HMM. This diagram contains two types of transitions, vertical and horizontal. Vertical transitions are shown by dotted arcs and occur between different levels of the HHMM. Horizontal transitions happen between nodes of the same level when the task of one node finishes.

Additionally, there are two types of nodes in an HHMM: abstract nodes, which produce a sequence of observations, and production nodes, which produce a single observation. In Fig. 5, nodes \(X_{1}\), \(X_{2}\), and \(X_{3}\) are abstract nodes, each of which is considered a distinct HMM. In this model, \(X_{1}\) is the sub-HMM responsible for detecting the beginning of an activity. \(X_{1}\) is an autoregressive HMM that consists of three observations. After receiving three sensor observations, if the beginning of an activity is detected, control passes to \(X_{2}\) in order to recognize the class of the ongoing activity.

In the \(X_{2}\) sub-HMM, the corresponding activity is recognized as each observation arrives. This process continues until the termination of the activity, which is detected by \(X_{3}\). At the end of the cycle, control is given back to the root node. The output of this process, S, is the segment of data which holds the related information.
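
The control flow between the three sub-models can be sketched as a small state machine. In the sketch below, the sub-HMMs are replaced by three hypothetical callables (`is_start`, `classify`, and `is_end`) that stand in for \(X_{1}\), \(X_{2}\), and \(X_{3}\); the function name `segment_stream` and the three-event buffer are likewise illustrative assumptions. The sketch covers only a single uninterrupted activity at a time and does not model interruptions.

```python
from collections import deque


def segment_stream(events, is_start, classify, is_end, n_obs=3):
    """Yield (status, label, segment) tuples while consuming a sensor stream.

    is_start, classify and is_end are hypothetical stand-ins for the
    sub-models X1, X2 and X3; they are not the paper's trained HMMs.
    """
    window = deque(maxlen=n_obs)  # last n_obs events seen so far
    segment = None                # None: waiting for a beginning (role of X1)
    for event in events:
        window.append(event)
        if segment is None:
            if len(window) == n_obs and is_start(tuple(window)):
                segment = list(window)
                yield "ongoing", classify(segment), tuple(segment)
            continue
        segment.append(event)
        label = classify(segment)   # role of X2: re-estimate on every event
        if is_end(tuple(window)):   # role of X3: detect the termination
            yield "end", label, tuple(segment)
            segment = None          # control returns to the root node
            window.clear()
        else:
            yield "ongoing", label, tuple(segment)


# Toy usage with hand-written rules in place of trained models.
stream = ["M1", "M2", "K1", "K2", "K3", "K4", "D1", "M3"]
reports = list(segment_stream(
    stream,
    is_start=lambda w: w == ("K1", "K2", "K3"),
    classify=lambda seg: "Meal Preparation",
    is_end=lambda w: w[-1] == "D1",
))
```

Every "ongoing" tuple corresponds to the estimate reported while the activity is in progress, and the final "end" tuple carries the complete segment S.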

Sometimes the subject interrupts the current activity and commences a new one. Some techniques can recognize the latter activity independently of the first one (Nef et al. 2015). However, their discrimination is limited to activities which happen at different locations. Since our hierarchical detection technique is location invariant, the proposed model is capable of recognizing interrupted activities that take place in a common location. Figure 6 shows an example of interrupted activities.

Fig. 5 The transition diagram of the implemented Hierarchical-HMM

Fig. 6 An illustration of interrupted activities: the subject has been preparing a meal since time m1. At time t1 he decided to interrupt it so as to take his medicine. At time t2 he resumed and kept preparing the meal until m2

3.2 Correction phase

With the completion of the Detection phase, our model moves to the next stage. In this stage, for each class, the joint probability density function (joint PDF) of its duration and time of day is extracted. Figure 7 illustrates how the time of day, as a statistical feature, can ease the discrimination. The extracted PDFs can then be used to measure how strongly a data segment belongs to each of the classes. As Eq. (3) shows, if this density reaches the threshold \(\alpha\) for a class \(C_{k}\), the segment is labeled as \(C_{k}\). Otherwise, it is labeled as the Other class. The value of \(\alpha\) was chosen through a fine-tuning process, which is detailed in Table 2.

$$\begin{aligned} P( S \in C_k ) = \left\{ \begin{array}{ll} 1 &{} \quad f(T_{C_{k}}=t_{S}) \ge \alpha \\ 0 &{} \quad \text {Otherwise} \end{array} \right. \end{aligned}$$
(3)
Table 2 Accuracy of the proposed model employing different threshold parameter \(\alpha\)
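
The thresholding rule of Eq. (3) can be illustrated with a short Python sketch. The source does not state the form of the joint PDF, so the sketch assumes, purely for illustration, that duration and hour of day are independent Gaussians per class; the names `gaussian_pdf`, `correct_label`, `class_stats`, and all numeric values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def correct_label(predicted, duration_min, hour, class_stats, alpha):
    """Keep the predicted class if the segment's density reaches alpha.

    class_stats maps a class name to (mean, std) pairs for duration and
    hour of day. Assumption: the two features are independent Gaussians,
    which is an illustrative simplification of the joint PDF.
    """
    stats = class_stats[predicted]
    density = (gaussian_pdf(duration_min, *stats["duration"])
               * gaussian_pdf(hour, *stats["hour"]))
    return predicted if density >= alpha else "Other"


# Hypothetical statistics for one class, not taken from the datasets.
class_stats = {"Bathing": {"duration": (20.0, 5.0), "hour": (8.0, 1.5)}}
typical = correct_label("Bathing", 20.0, 8.0, class_stats, alpha=1e-3)
atypical = correct_label("Bathing", 200.0, 23.0, class_stats, alpha=1e-3)
```

A segment whose duration and time of day are typical for the predicted class keeps its label, while an atypical segment falls back to the Other class.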

In summary, the model first detects the occurrence of an activity using the beginning pattern of activities. Next, it recognizes the class of the ongoing activity based on the activated sensors until the end of the activity is detected. To improve the discrimination between similar activities, our model also exploits the time of day at which the activity occurs as a statistical feature.

Fig. 7 Choosing time of day as a feature facilitates the classification task. Bed to Toilet transition and Personal Hygiene activities can be discriminated by this feature since the occurrence time of Bed to Toilet transition has low overlap with that of the Personal Hygiene activity

4 Methods

The performance of our proposed approach is assessed by conducting two experiments on two different datasets to recognize the current activity on the stream of data. We compared our approach with a set of common Machine Learning models and report the results.

4.1 Dataset description

We evaluated our proposed model on the HomeA and HomeB datasets, where each home is a one-bedroom apartment hosting a single older adult performing unscripted daily activities. Figure 8 shows a sample slice of these datasets. The two datasets were collected using 32 and 30 sensors deployed in the two homes depicted in Fig. 9 over a period of 5 months. The datasets were labeled afterwards by human experts. Table 3 summarizes the characteristics of data acquisition for the HomeA and HomeB datasets. These datasets contain 11 classes of interest. Details on the sample distribution of the classes are given in Table 6. Of each dataset, 70% (3.5 months), 10% (2 weeks), and 20% (4 weeks) were dedicated to training, validation, and testing, respectively.

Table 3 Dataset Statistics

Due to the limited access to real-time data in this domain, we simulated data streaming using offline datasets. Each row in the datasets is one sensor event, which has a timestamp, a sensor ID, and a sensor status. In every time slot, one sample is given as input to the system, and the system makes predictions based on the sensor events that have been acquired up to the current time slot. To illustrate, let us denote the Eating activity as E. It consists of a number of sensor events, each of which is one distinct sample in the dataset:

$$\begin{aligned} E=\{S_{t_{1}},{S}_{t_{2}},{S}_{t_{3}}, \ldots ,{S}_{t_{end}}\}. \end{aligned}$$
(4)

In the first time slot \(t_{1}\), sample \(S_{t_{1}}\) is passed to the system as input to make a prediction. In the next time slot \(t_{2}\), the system receives \(S_{t_{2}}\) as input, and the process continues in the same way. Consequently, the data is treated in an online manner.
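
A minimal sketch of this replay procedure is given below. It assumes each dataset row is a (timestamp, sensor ID, status) tuple; the names `SensorEvent`, `run_online`, and the `predict` callable are hypothetical and only illustrate that the model sees the events acquired up to the current time slot and nothing later.

```python
from typing import Callable, Iterable, List, NamedTuple


class SensorEvent(NamedTuple):
    timestamp: str
    sensor_id: str
    status: str


def run_online(rows: Iterable[tuple],
               predict: Callable[[List[SensorEvent]], str]) -> List[str]:
    """Feed an offline dataset to `predict` one event per time slot."""
    history: List[SensorEvent] = []
    predictions: List[str] = []
    for row in rows:
        history.append(SensorEvent(*row))     # the sample S_t arrives
        predictions.append(predict(history))  # only past and current events
    return predictions


# Toy rows and a placeholder predictor that echoes the latest sensor.
rows = [("t1", "M01", "ON"), ("t2", "M01", "OFF"), ("t3", "D02", "OPEN")]
outputs = run_online(rows, lambda h: f"{h[-1].sensor_id}:{len(h)}")
```

Because `predict` is called once per row, one prediction is produced for every time slot, which mirrors the online evaluation described above.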

Fig. 8 Sample of sensor data during an activity in the HomeA and HomeB datasets

Fig. 9 The architecture of (a) HomeA and (b) HomeB. Red circles represent motion sensors and blue triangles stand for Door/Cabinet sensors (color figure online)

Table 4 Confusion matrix of the proposed model on HomeA dataset
Table 5 F1-measure of the proposed model on HomeB dataset
Table 6 Overall instances of each class in dataset HomeA and HomeB
Table 7 Performance comparison of models in terms of accuracy and F1-measure

4.2 Baseline models

We compared the performance of our proposed model with several existing approaches:

  • SW: this model uses a constant window size, with all sensors contributing equally to the window size calculation (Krishnan and Cook 2014).

  • TW: it employs a constant window size that is defined by a time interval (Krishnan and Cook 2014).

  • SWMI: it uses a constant window size. The Mutual Information of sensors and activity classes is taken into account for the window size calculation (Yala et al. 2015).

  • SWTW: the window size remains constant in this model and is calculated by time-based weighting of the sensor events (Yala et al. 2017).

  • DW: in this model the window size varies for each activity, and its value is calculated using probabilistic information about sensor events (Al Machot et al. 2016).

  • PWPA: it uses a 2-level window with a fixed size. The first level contains the probabilistic information of the previous windows and activities. The second level includes the data of the current step, with the possible activity as an augmented feature (Krishnan and Cook 2014).
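
To make the difference between count-based and time-based windows concrete, the following sketch builds both kinds of windows over a toy event list. It is a simplified reading of the SW and TW ideas, not a reimplementation of the cited baselines; the function names, the (seconds, sensor ID) event format, and the parameter values are assumptions.

```python
def sensor_windows(events, size):
    """Count-based windows: each window ends at event i and holds at most
    the last `size` events."""
    return [events[max(0, i - size + 1): i + 1] for i in range(len(events))]


def time_windows(events, span):
    """Time-based windows: each window ends at event i and holds the events
    whose timestamps lie within `span` seconds before it.

    Events are (seconds, sensor_id) tuples sorted by time.
    """
    windows = []
    start = 0
    for i, (t, _) in enumerate(events):
        while events[start][0] < t - span:
            start += 1  # drop events that are older than the time span
        windows.append(events[start: i + 1])
    return windows


events = [(0, "M1"), (5, "M2"), (30, "D1"), (32, "M3")]
by_count = sensor_windows(events, size=2)
by_time = time_windows(events, span=10)
```

With these toy events, the count-based window ending at the door event still contains the unrelated motion event from 25 seconds earlier, whereas the time-based window contains the door event alone. This illustrates why a single fixed size rarely suits every activity.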

4.3 Evaluation metrics

The performance of the proposed model was evaluated by classification accuracy and F1-measure. Classification accuracy is computed over the individual test samples by comparing each sample’s predicted label with its ground truth label, as given in Eq. (5).

$$\begin{aligned} Accuracy = \frac{{tp + tn}}{{tp + tn + fp + fn}} \end{aligned}$$
(5)

To calculate the F1-measure, activities are considered regardless of the length of their windows. For example, consider activities A1 and A2 containing 500 and 50 sensor event samples, respectively. The accuracy metric evaluates all 550 samples, while the F1-measure is calculated for the two activities as wholes. Equation (6) shows the F1-measure formula.

$$\begin{aligned} F1\text {-}measure =2\times \frac{{Precision \times Recall}}{{Precision+Recall}} \end{aligned}$$
(6)
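
Equations (5) and (6) can be checked on a small example with the sketch below. The function names and the toy labels are illustrative assumptions; for multi-class labels, the accuracy of Eq. (5) reduces to the fraction of samples whose predicted label matches the ground truth, and the F1-measure is computed here for one class at a time.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the ground truth."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_measure(y_true, y_pred, label):
    """F1-measure of a single class, following Eq. (6)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy labels: three samples of A1 and one of A2.
y_true = ["A1", "A1", "A1", "A2"]
y_pred = ["A1", "A1", "A2", "A2"]
acc = accuracy(y_true, y_pred)            # 3 of 4 labels match
f1_a1 = f1_measure(y_true, y_pred, "A1")  # precision 1, recall 2/3
```

In this toy case the accuracy is 0.75 while the F1-measure of class A1 is 0.8, which shows that the two metrics need not move together.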

The performance of the models is compared in terms of F1-measure and accuracy, as reported in Tables 4, 5, and 7.

5 Experimental results and discussion

As can be inferred from Table 4 for the HomeA dataset, the Personal Hygiene, Bathing, Sleeping, Take Medicine, Meal Preparation, and Eating Meal classes can be highly discriminated. However, unsatisfactory results were obtained for the Enter Home, Leave Home, and Sleeping classes. This performance degradation is understandable given that Enter Home and Leave Home both activate the same set of sensors and that these two classes of activity often occur in sequence. Besides, the resident sometimes walks out of the house for just a few moments, which is not considered as Leave Home. Like windowing-based methods such as SWTW, SWMI, and PWPA, our proposed method has difficulty recognizing activities such as Enter Home and Leave Home, which share the same sensors and involve only a few of them. This makes their beginning and end patterns essentially identical. PWPA can acquire knowledge from previous activities, which enables it to detect Enter Home and Leave Home more precisely, as Enter Home always happens after Leave Home and vice versa. Moreover, distinguishing these two classes from the rest is quite challenging because of their short length, which does not give the model enough time to improve its estimate and makes their distribution more sensitive to noise than that of other activities. According to Table 6, the Housekeeping activity has only 13 instances in the HomeA dataset, which is dramatically low compared to the rest of the classes. This imbalanced distribution may have caused the inadequate performance of the model for this class.

In the HomeB dataset, Bathing and Sleeping are well discriminated, and the detection of Meal Preparation, Personal Hygiene, Eating, and Bed to Toilet transition is satisfactory. The performance drops for Taking Medicine, Napping, Enter Home, and Leave Home. In HomeB, the resident sometimes eats meals on the couch, which is also the place for napping, and this causes problems in recognizing both the Napping and Eating activities. Another difference compared to HomeA is that the recognition accuracy of the Housekeeping activity increases dramatically, owing to the adequate number of samples in the HomeB dataset.

The proposed model is capable of detecting overlapped activities such as Meal Preparation and Taking Medicine even when they happen in a common location, whereas previous methods have difficulty recognizing overlapped activities in an online setting. The datasets’ description implies that activities may overlap in two levels.

Another point worth mentioning is our attempt to exploit knowledge of the previous activities in order to enhance the performance of the proposed method. However, the results show a 12% accuracy drop for HomeA and a 15% drop for HomeB. The primary cause of these accuracy drops is likely the propagation of misclassification errors. The proposed model does not have access to the ground truth of the previous activity. Therefore, if the previous activity is predicted erroneously, this error propagates through the system.

As the results in Table 7 suggest, our proposed method outperforms the others in terms of accuracy. The F1-measure does not follow the same trend as accuracy; hence, methods with better accuracy do not necessarily obtain a greater F1-measure.

Generally, classes of activity that share the same local domain, such as Napping and Eating Meal, Housekeeping and Meal Preparation, or Enter Home and Leave Home, are hard to discriminate because of their locally similar patterns. Furthermore, the residents were given no instructions about where or how to perform activities. Therefore, in some cases, activities were performed in different ways or places. For instance, the resident eats his meal on the couch, which causes a conflict between the Eating and Napping activities, whereas he would be expected to eat at the dining table. It is worth noting that taking the time of day or other statistical features into consideration can compensate for local overlap, as it does for Personal Hygiene and Bed to Toilet transition.

6 Conclusions and future work

Robust online activity recognition is considered one of the most dominant challenges in the domain of smart environments. While most previous approaches suffer from the need to tune the window size parameter, this paper has presented a new activity recognition approach for sensor data streams which determines the beginning and end of activities. In addition, our proposed model is capable of recognizing interrupted activities. The experimental results of this study indicate the efficiency of the proposed model compared to existing ones.

In general, the accuracy of activity recognition methods cannot exceed a certain level because of the uncertainty of human behavior. Nevertheless, uncertain predictions can still be offered as recommendations in smart environments.

Further work is needed to improve the recognition of the Other class, which is the most problematic activity class. Current recognition systems require a fair amount of labeled data to reach satisfactory accuracy, while data acquisition and labeling in this domain are impractical. Future studies are therefore recommended in the domain of Knowledge Transfer to reduce the need for labeled data.