1 Introduction

Recent advancements in machine learning and data mining as well as pervasive sensing technologies have opened doors to a wide variety of pervasive and context-aware applications. One research area that has benefited from the aforementioned advancements is smart environments. A smart environment is any physical environment (e.g. home, office, conference room) that senses the state of its resident and the physical surroundings and acts in order to ensure the well-being of the resident and the environment. Data is obtained from the sensors and is analyzed using machine learning and data mining techniques. The main goal of such technologies is to achieve greater comfort, productivity, and energy efficiency.

Over the past decade, researchers have recognized the importance of smart environment technologies to provide in-home assisted living (Lotfi et al. 2012) and companies have started taking advantage of such technologies in the marketplace (BrainAid 2013). In many smart home projects, the ultimate goal is to automate residents’ interactions with the environment, in particular interactions that are repetitive or cumbersome to perform for older adults or patients with cognitive impairments. An example of assisted living technologies is a remote health monitoring system that monitors and tracks activities of daily living (ADLs) of older adults with memory impairment. ADLs consist of self-care activities such as eating, cooking, taking medication, bathing and sleeping. The ability to perform ADLs independently and completely on a regular basis provides measurements for the functional well-being of inhabitants. The need for developing such in-home assisted living is highlighted by the increasing aging population, the cost of health care facilities, and individuals’ preference to stay at their homes, rather than health care facilities.

In addition to the physical infrastructure, there have also been a number of machine learning methods for activity-related tasks including, but not limited to: activity recognition, activity discovery, activity prediction and activity reminder systems. Since many of these methods rely upon activity recognition, this subfield has received the utmost attention in the literature. Activity recognition methods range from simple approaches such as decision trees (Maurer et al. 2006) and Naïve Bayes classifiers (Brdiczka et al. 2005) to more complex methods such as Hidden Markov models (Singla et al. 2009), dynamic Bayesian networks (Hall et al. 2009), and conditional random fields (Vail et al. 2007).

As intelligent systems become more prevalent, they must also be able to predict the occurrence of various events, such as the activities that residents perform, in order to have necessary foresight to make decisions in various situations. An especially common problem is sequential prediction, where a sequence of events is used to predict future events. The applications of such a prediction model are endless. In a smart home setting, for instance, predicting residents’ activities provides a basis to automate their interactions with the environment to provide context-aware services. As an example, we will briefly discuss how an activity prediction component together with an activity recognition module can be utilized to generate automated context-aware prompts for smart home residents.

Reminder systems have been long in existence and range from simple alarm clocks to complex systems that are based on rules, planning, or machine learning. Rule-based reminder systems allow a user to specify rules based on time, context, and preferences (Lim et al. 2008). More adaptive reminder systems integrate reinforcement learning (Rudary et al. 2004), which requires a pre-specified complete schedule of activities but can make adjustments without direct user feedback. Other approaches use dynamic Bayesian networks (Pollack et al. 2003), Markov decision processes (Pineau et al. 2003), and Markov-based planning (Boger et al. 2005) to deliver timely prompts for these pre-scheduled activities. Active learning has also been employed (Weber and Pollack 2007) to interactively manage calendar synchronization.

When older adults with cognitive impairment fail to initiate or complete everyday ADLs, typically caregivers are responsible to monitor ADLs and deliver a prompt. The prompt is defined as any form of verbal or non-verbal intervention delivered based on time or context to assist an individual to complete an activity successfully. The discussed interventions are time consuming and burdensome and often have negative impact on the caregiver’s own health. Smart home technologies that can detect when assistance is needed and automatically deliver prompts can potentially reduce caregivers’ burden and allow aging adults to retain their functional independence longer.

While reminder systems have been explored deeply in the literature, few systems take an individual’s behavioural patterns into account to provide context aware prompts. However, studies suggest that activity aware prompts offer significant advantages over traditional time based prompts (Kaushik et al. 2008). By taking advantage of such a prediction module, a reminder system can customize its behaviour to fit the lifestyles of the residents with no input on their part.

As an example application of activity prediction, the technology can feasibly utilize data collected from a smart home to learn context-aware rules for prompting the resident of the home to initiate important daily activities. Such an application assumes that sensor data is collected in a home while the resident performs her routine daily activities. We also assume that the training data is available from when the resident was performing the activities correctly or was prompted by a caregiver to initiate daily activities. Finally, we assume that we are given a list of critical activities for which the resident needs to be prompted. The following scenario highlights the role of an activity prediction component together with the activity recognition module to provide an automated context-aware prompt, assuming that “Taking Medication” is a critical activity (Nazerfard and Cook 2015):

In the morning, the activity recognition module recognizes that the breakfast activity has taken place. Then the activity prediction component, which has already been trained with the correctly labeled activities, makes the following prediction with a high enough confidence: The “Taking Medication” activity should follow within 30 minutes”. Then the activity prompting system would have a relative time offset for when “Taking Medication” usually happens. When the typical timespan passes and the medication is not taken, a prompt is delivered.

The above scenario shows how temporal features can be useful in an assisted living setting. The discovered temporal information can be used to construct a schedule of activities for an upcoming period. Such a schedule is constructed based on the predicted start time, as well as the relative order of the activities.

In this paper, we propose a framework for discovering and representing temporal aspects of activity patterns, including temporal ordering of activities and their usual start time and duration. The discovered temporal information can be beneficial in many applications, such as for home automation, constructing the schedule of activities for a context-aware activity reminder systems, and abnormal behavior detection in smart homes. The rest of the paper is organized as follows. We first review some related work in the field of smart environment together with activity prediction approaches and (temporal) association rules. Next we overview the frequent patten mining terminologies. We then present the details of our proposed temporal feature and relation discovery model. Next, we provide the validation results of the proposed model. Finally, we present some concluding remarks and future directions.

2 Related work

In this section, we first review a number of smart environments deployed during the past decade. Next, we discuss recent proposed activity prediction approaches in the field of smart environments. Finally, we briefly review the concept of association rules.

2.1 Smart environments

Over the past decade, a number of smart environment testbeds have been deployed, including but not limited to: MavHome project (Cook et al. 2003), Georgia Tech Aware Home (Abowd and Mynatt 2004), Gator Tech Smart House (Helal et al. 2005), iDorm (Doctor et al. 2005), PlaceLab (Intille et al. 2006), and CASAS project (Crandall 2011). In addition to creating physical testbeds, researchers have designed approaches to track locations and activities of inhabitants, discover abnormal behaviour, deliver timely prompts, and predict inhabitants’ future activities. The differences among existing approaches can be categorized as follows:

  1. 1.

    Differences in sensor modalities used to monitor activities. These include but are not limited to: supervision cameras (Mocanu et al. 2011), RFID tags (Gu et al. 2011), accelerometers (Yin et al. 2008), and wearable sensors (O’Donovan et al. 2009).

  2. 2.

    Differences in methods designed to learn activity patterns. These include but are not limited to: Bayesian networks (Kasteren and Krose 2007), fuzzy logic (Medjahed et al. 2009), artificial neural networks (Mahmoud et al. 2013), and support vector machines (Hamm et al. 2013).

  3. 3.

    Differences in experimental conditions. These include but are not limited to: smart environment inhabitants performing scripted activities (Maurer et al. 2006) and smart environment inhabitants performing regular unscripted daily living activities (Cook et al. 2013).

Based on the above-mentioned advancements, researchers have realized the importance of employing smart home technologies for health monitoring and assistance (Lotfi et al. 2012; Catarinucci et al. 2015; Hossain and Muhammad 2016; Chen et al. 2017) and companies have started looking into the potential of such technologies in the market (BrainAid 2013).

2.2 Activity prediction approaches

In spite of the significant work that has been done to recognize and track activities in the smart home research (Heung-II et al. 2010; Cook et al. 2013), less attention has been paid to predict the occurrence of various events, such as the activities that residents perform. As smart environments become more widespread, an important functionality that they require to posses is the ability to predict the activities that residents perform, in order to have the visibility to make decisions in various situations. In machine learning literature, “Prediction” often refers to sequential prediction, where the goal is to predict the next event based on a known limited history of past events.

One of the seminal works in this area, the Active LeZi (ALZ) algorithm (Gopalratnam and Cook 2007) approaches the problem from an information theoretic perspective using compression methods. ALZ is an online sequential prediction algorithm that can reason about the future in stochastic domains with no domain-specific information. The authors experimentally analyzed their sequential prediction algorithm using real sequential data obtained from the MavHome smart home environment (Cook et al. 2003). Its precursor, LeZi-update (Dufkova et al. 2009), provides a pattern matching method for location management in cellular communication networks that can exploit the position information of an inhabitant for message routing. The LeZi-update framework uses a symbolic space to represent the sensing zone of the smart environment as alphabetic symbols. As a result, it captures the entire inhabitant’s movement history as a string of symbols.

Furthermore, the researchers in (Mocanu and Florea 2012) propose a multi-agent architecture for a supervising system which includes an activity prediction layer. Their activity prediction component takes advantage of the active LeZi algorithm (Gopalratnam and Cook 2004) in order to detect emergencies in smart environments. Also, the authors in (Tapia et al. 2010) employ the activity prediction component to intervene and interact with the user as a means of prompting the user and preventing accidents. Moreover, the researchers in (Mahmoud et al. 2013) discuss the application of soft computing techniques in prediction of an older adult’s behaviour in a smart environment. In order to build the prediction model, they examine different types of artificial neural networks. Their results suggest that recurrent neural networks such as NARX achieve a great ability to finding the temporal relationships of the input patterns.

While many activity prediction and recognition approaches have been proposed, they are typically designed for constrained situations with pre-segmented data and a single user environment. The researchers in (Krishnan and Cook 2014) have extended these approaches to consider generalization of activity models over multiple users with real-time labeling. In order to avoid off-line data segmentation, the activity recognition component extracts features from a sliding window that moves over the data in real time as it is collected. The window size dynamically adjusts to sensor readings based on likely current activities and their associated likely durations.

Just recently, Minor and Cook (2017) describe an algorithm for automated activity prediction in smart home environments. The authors refer to their proposed algorithm as AF, short for activity forecasting. AF predicts the time that will elapse until a target activity occurs. This method generates an activity forecast using a regression tree classifier and can be applied in forecasting situations where a numeric time prediction is valuable.

In summary, Table 1 highlights the previous prediction approaches discussed in this section. The main characteristics and contributions of the proposed prediction method are as follows:

  • Discovering temporal relations such as the order of activities as well as the their start time and duration features.

  • Handling the outlying observations through the probabilistic clustering techniques.

  • Employing frequent pattern mining to discover the temporal relations of activities.

Table 1 The previous activity prediction studies

2.3 Association rule mining

The frequent pattern mining techniques focus on finding “interesting patterns” from a large set of data items. One of the well-known frequent pattern mining approaches is the association rules (Agrawal et al. 1993), which is usually used to find all the co-occurrence relationships from a set of items called associations. A simple example that is often used to explain the concept of association rule is discovering items that are purchased together within a transactional dataset.

Mining association rules in a market basket database is a well researched area. Even though the main problem of finding association rules is well defined and a number of algorithms exist in the literature to solve it, some particularities are not handled by these common algorithms. One of these particularities are the items that are being sold together during some specific time intervals, e.g. items that are better sold together in morning times. Therefore, a temporal association rule is defined as an association rule that holds during specific time intervals. The authors in (Li et al. 2001) study temporal association rules during time intervals specified by user-given calendar schemas. The discovered association rules along with their temporal patterns, in terms of calendar schemas, are easier to understand. This extension suggests that we might discover different rules for different timeframes. As a result, a rule might be valid during certain timeframe, but not during some other timeframes.

Activity pattern dataset in smart homes also include a timestamp. The timestamp implies when a particular activity was performed, or more specifically when a specific sensor was triggered. Similar to association rule mining, considering the concept of temporal features in activity patterns can be quite useful. For instance, in a home automation setting, we can determine when a certain activity is expected to occur, and which activities are most likely to occur next (Nazerfard et al. 2010). Despite the potential use of temporal features in activity patterns, this key aspect is usually neglected and has not been exploited to its full potential.

3 Frequent pattern mining

The proposed model in this paper is based on frequent pattern mining techniques. The main difference between the frequent pattern mining methods and the other mining approaches is that the former techniques are focused on finding “interesting patterns” from a large set of data items.

Frequent patterns are patterns (e.g., itemsets) that appear frequently in a dataset. For instance, a set of items, such as butter and bread, that appear frequently together in a transaction dataset is a frequent itemset. As another example, a subsequence such as buying first a laptop and then a backpack, if it occurs frequently in a shopping transactional database, is a frequent associated pattern.

3.1 Preliminaries

Let \(I = \{i_1, i_2,\ldots ,i_n\}\) be a set of items. An itemset C is a subset of I, i.e. \(C \subseteq I\). An itemset that contains k items is a k-itemset. We denote the size of C, i.e. the number of items in C, by |C|. A transaction is defined as \(T = (tid, C)\), where tid is a transaction identifier. A transaction database is a set of transactions, which is denoted by D.

The support_count of an itemset X is the number of transactions that contain the itemset and is formally defined as follows:

$$\begin{aligned} Support\_count(X) = |\{tid \mid X \subseteq C, \ (tid,C) \in D \}|. \end{aligned}$$
(1)

Therefore, the support of an itemset X is the proportion of transactions that contain the itemset and is defined as follows:

$$\begin{aligned} Support(X) = \frac{support\_count(X)}{|D|}, \end{aligned}$$
(2)

where |D| denotes the total number of transactions. If the support of an itemset X satisfies a given threshold, then X is a frequent itemset.

Moreover, an association rule is an implication of the form \(A\implies B\), where \(A\subset I\), \(B\subset I\), \(A\ne \emptyset\), \(B\ne \emptyset\), and \(A\cap B =\emptyset\). The rule \(A\implies B\) holds in the transaction set D with support s, where s is the percentage of transactions in D that contain \(A\cup B\), i.e. \(support(A\implies B)=support(A\cup B)\). The rule \(A\implies B\) has confidence c in the transaction set D, where c is the percentage of transactions in D containing A that also contain B:

$$\begin{aligned} \begin{array}{ll} Confidence(A\implies B) &{} = \frac{support(A \cup B)}{support(A)} \\ &{} = \frac{support\_count(A \cup B)}{support\_count(A)}. \end{array} \end{aligned}$$
(3)

A rule is called strong if it satisfies both a minimum support threshold (minsup) and a minimum confidence threshold (minconf). Equation 3 suggests that once the support counts of A and \(A \cup B\) are determined, it is straightforward to generate the corresponding association rule \(A\implies B\) and check whether it is strong or not. As a result, the problem of mining association rules can be reduced to mining frequent itemset. In summary, the problem of discovering strong association rules can be viewed as a two-step process:

  • Frequent itemset mining.

  • Generating strong association rules from the found frequent itemsets.

3.2 Frequent itemset mining

In this section, two well-known frequent itemset mining algorithms are reviewed: Apriori and FP-growth.

3.2.1 Apriori algorithm: finding frequent itemsets using candidate generation

The Apriori algorithm, by Agrawal and Srikant (1994), begins by discovering frequent items by counting them, making a pass over D. Then it combines these frequent items to generate candidate 2-itemsets, and determines their supports by making another pass over D, removing infrequent candidates. The algorithm iteratively continues to extend k-itemset candidates by one item and determines their supports by making another pass over D to check if they are frequent or not. One should note that if an itemset is frequent, all of its non-empty subsets must be frequent. Also known as the Apriori property, this property indicates that if an itemset is not frequent, none of its subsets are frequent. By taking advantage of the the Apriori property, the algorithm prunes those candidates for which a subset is infrequent.

The Apriori algorithm process candidates in a prefix-tree structure. The common k-prefix of two itemsets are the first k items in those sets that are common. The prefix tree is a tree structure where each path represents an itemset, which is the path from the root to a node, and sibling nodes share the same prefix. The Apriori-based algorithms process candidates in a prefix-tree in a level-wise manner, i.e. k-itemsets must be processed before \((k+1)\)-itemsets.

3.2.2 FP-growth algorithm: finding frequent itemsets without candidate generation

The Apriori algorithm can suffer from two major problems. First, it may need to generate a large number of candidate sets. For instance, if there are 2000 frequent 1-itemsets, the algorithm will require to generate around \(10^6\) 2-itemset candidates. Also, the Apriori algorithm may need to repeatedly scan the whole database and check a huge set of transactions by determining the support of the candidate itemsets.

The FP-growth (short for frequent pattern growth) algorithm, by Han et al. (2000), is an attempt to mine the complete set of frequent itemsets, without such a costly candidate generation. The FP-growth algorithm compresses the database representing frequent items into a FP-tree (short for frequent pattern tree), which is an extended prefix-tree structure for storing compressed and association information about frequent patterns. The algorithm then divides the compressed database into a set of conditional databases, each associated with one frequent or pattern fragment, and mines each database separately. For each pattern fragment, only its associated datasets need to be considered. As a result, this method may dramatically reduce the search space.

3.3 Generating association rules

As already discussed in the Preliminaries section, the problem of discovering association rules can be viewed as a two-step process: finding frequent itemsets and then generating the strong association rules. The first step can be done using either of the Apriori or FP-growth approaches discussed in the previous section.

Once the frequent itemsets from a transaction database D have been found, it is straightforward to generate strong association rules from them, using the Apriori algorithm, where strong association rules are those that satisfy both minsup and minconf. This step can be done using Eq. 4 (also appears in Sect. 3.1):

$$\begin{aligned} \begin{array}{ll} Confidence(A\implies B) &{} = P(B|A) \\ &{} = \frac{support\_count(A \cup B)}{support\_count(A)}. \end{array} \end{aligned}$$
(4)

According to Eq. 4, association rules can be generated as follows:

  • For each frequent itemset l, generate all nonempty subsets of l.

  • For every nonempty subset s of l, generate the following rule:

    $$\begin{aligned} s \implies (l-s), \end{aligned}$$

    if \(\frac{support\_count(l)}{support\_count(s)} \ge minconf\), where minconf is the minimum confidence threshold.

Note that since the rules are generated from frequent itemsets, each one already satisfies the minimum support condition. As an example, given a transaction database D placed in a supermarket, we may find an association rule of the following form:

$$\begin{aligned} \begin{array}{ll} Peanut Butter &{} \implies Bread \\ &{}(support: 3\%, confidence: 75\%), \end{array} \end{aligned}$$
(5)

which indicates that \(3\%\) of all transactions contain the items peanutbutter and bread, and \(75\%\) of the transactions that have the item peanutbutter, contain the item bread as well.

One interesting extension to association rules is to include a temporal dimension. For instance, peanutbutter and bread may be purchased together mainly between \(6:30\) and 11 a.m.. Thus, we may find the above association rule has a support as high as \(40\%\) among the transactions that occur between \(6:30\) and 11 a.m. and has a support as low as \(4\%\) in other transactions. We refer to the above-mentioned extension as temporal association rules.

4 Proposed model

The proposed TEREDA model, abbreviation for TEmporal features and RElations Discovery of Activities, discovers temporal features and relations of activity patterns from sensor data. The model is able to discover features and relations, such as the order of the activities, their usual start times and durations through the use of rule mining and clustering techniques.

The architecture of TEREDA is illustrated in Fig. 1, which consists of two main components: the temporal feature discovery component and the temporal relation discovery. Each component will be described in more depth in the following sections. The input consists of a set of sensor events collected from various sensors deployed in the space. Each sensor event includes an identifier (ID), a timestamp and an optional activity label. In order to better understand how TEREDA works, we consider an example, the Taking Medication activity, throughout following discussions.

Fig. 1
figure 1

The TEREDA architecture

4.1 Temporal activity features discovery

Initially, start time and duration of activities are obtained for each activity. After extracting start times for all instances of a specific activity, start times are clustered in order to obtain a canonical representation. For this purpose we take advantage of the expectation maximization (EM) clustering algorithm to construct a normal mixture model for each activity start time.

Let \(t_i\) denote start time of the ith instance of activity A, depicted as \(A_i\). The probability of \(t_i\) belonging to a certain cluster, the kth cluster, can be expressed as a normal probability density function parameterized by \(\Theta _k = (\overline{x}, s)\), as represented in Eq. 6, where \(\overline{x}\) and s are sample mean and sample variance, respectively.

$$\begin{aligned} Prob(t_i|\Theta _k)=\frac{1}{\sqrt{2\pi {s}^2}}e^{-\frac{(t_i-\overline{x})^2}{2{s}^2}}. \end{aligned}$$
(6)

The parameters of the mixture normal model are determined automatically from the available data. Figure 2 illustrates the results of finding canonical start times for the “Taking Medication” activity in the form of a mixture of four normal distributions. According to normal distribution characteristics, the distance of “two standard deviations” from the mean account for approximately 95% of the values (see Fig. 3). Therefore, if only observations falling within two standard deviations are considered, observations that are deviating from the mean will be automatically discarded. Such observations, distant from the rest of the data are regarded as “outliers”.

Fig. 2
figure 2

A mixture model for the start time of the taking medication activity

Fig. 3
figure 3

The normal distribution characteristics

Duration of an activity is also considered in addition to start time, where average duration of all instances within the cluster are calculated for each resulting cluster from the start time discovery step.

4.2 Temporal activity relations discovery

Discovering temporal relations of activities is the main component of TEREDA, wherein features obtained from the previous step, i.e. the canonical start times and durations are used to produce a set of temporal relations between activities. The temporal relations will determine the order of activities with respect to their start times, i.e. for a specific time what are the most probable activities following a specific activity. Such results can be useful in a variety of activity prediction scenarios. To discover the temporal relations of activities, we employ the FP-growth algorithm.

In order to provide a more precise understanding of the temporal relation discovery component we consider the following notations. let \(B_j\) denote the successor activity of \(A_i\), where \(A_i\) and \(B_j\) denote the ith and jth instances of activities A and B, respectively. As previously mentioned, each activity instance belongs to a specific cluster \(\Theta _k\) defined by the start time of the activity instance. Moreover, the \(A^k_i\) notation is used to show that activity \(A_i\) belongs to a specific cluster \(\Theta _k\). The temporal relation “B follows A” formulated as \(A \implies B\) is ultimately obtained and the set containing instances of all activities is represented as D.

Denoting the estimated mean and standard deviation of cluster k by \(\overline{x}_k\) and \(s_k\), we refer to the number of instances of all activities and activity A falling within \([\overline{x}_k-2{s}_k, \overline{x}_k+2{s}_k]\) interval as \(|D^k|\) and \(|A^k|\), respectively. Then we can define the support of the “follows” relation as in Eq. 7 and its confidence as in Eq. 8.

$$\begin{aligned} Support(A^k \implies B)=\frac{\sum _{i,j}{(A^k_i \implies B_j})}{|D^k|}, \end{aligned}$$
(7)
$$\begin{aligned} Confidence(A^k \implies B)=\frac{\sum _{i,j}{(A^k_i \implies B_j})}{|A^k|}. \end{aligned}$$
(8)

The outcome of this step is a set of temporal relation rules corresponding to each cluster. Figure 4 illustrates the discovered temporal relation rules, with support and confidence values greater than 0.1, for the first cluster of the taking medication activity. According to Fig. 4, if the Taking Medication activity occurs in time interval \([6:43, 7:47]\), it would usually lasts less than 2 minutes and the next activities would typically be “Cooking” with a confidence of 0.39, “Relaxing” with a confidence of 0.18 and “Personal Hygiene” with a confidence of 0.16.

Fig. 4
figure 4

Temporal relations of the taking medication activity (the results are shown for the first cluster)

figure a

5 Experimental results

This section presents experimental results of the proposed TEREDA model. Before getting into the details of our results, we explain the settings of our experiment.

5.1 Experimental setup

Two one bedroom single resident smart home apartments, referred to as Apt1 and Apt2, hosting older adults performing normal unscripted daily activities were selected as study environment. Sensors were installed on ceilings, walls, doors and cabinets in order to track resident movements. The sensor layouts for our testbeds are illustrated in Fig. 5, where red circles and blue triangles represent infrared motion/area sensors and magnetic door/cabinet sensors, respectively. A sensor network was applied to capture all events generated by the sensors and middleware was used to store events in an SQL database.

A sensor network captures all of the events generated by the sensors, and our middleware stores them in an SQL database. As a means for providing real activity data for our experiments, data were collected while residents were living in smart homes performing normal daily routines. Table 2 provides the characteristics of our smart home testbeds.Footnote 1

Fig. 5
figure 5

Layouts for Apt1 (left) and Apt2 (right), where the red circles represent motion/area sensors, and the blue triangles represent door/cabinet sensors. (Color figure online)

Table 2 Characteristics of the smart homes for the present study

For our experiments, we consider 11 ADLs performed by residents of the smart homes. The list of the activities taken into consideration for the present study are as follows: bathing, bed-toilet transition, cooking, eating, enter home, housekeeping, leave home, personal hygiene, relaxing, sleeping, and taking medication.

The datasets used consist of a set of discrete individual sensor events collected from various sensors deployed in the space. Each corresponding sensor within a smart home generates a message if resident movement is detected in its field of view. Events in the dataset were manually annotated with corresponding activity labels by trained researchers who employed visualization tools and interviews with residents to generate accurate ground truth labels. Table 3 provides a sample data related to the “Taking Medication” activity, where D27 and M03 represent a cabinet sensor and a motion sensor, respectively.

Table 3 A sample for sensor events used in our study

Table 4 provides the parameter values used for the EM clustering and FP-growth association rules for our experiments.

Table 4 The EM clustering and FP-growth parameters

5.2 Evaluation of TEREDA

In this section, we provide the results of running TEREDA on the smart home dataset and compare its accuracy against a number of other prediction techniques. As already mentioned in Sect. 1, employing the EM clustering algorithm enables TEREDA to discover different numbers of clusters for different activities. Taking into account the EM algorithm parameters provided in Tables 4 and 5 presents the number of discovered clusters, corresponding to the start times of each activity, after running TEREDA on our dataset.

Table 5 The number of discovered start time clusters for each activity

For instance, results in Table 5 for the “Enter Home” activity suggest that TEREDA discovers 4 start time clusters for this activity. The discovered temporal relations for the Enter Home activity are illustrated in Figs. 6, 7, 8 and 9. As mentioned in Sect. 4.1, in order to handle outliers, TEREDA only retains start time values within the [\(\overline{x}-2s\), \(\overline{x}+2s\)] interval for each activity. It is worth noting that “Enter Home” is the one activity in our dataset for which considering the “duration” feature is not applicable. Consequently, there are no normal distributions corresponding to the durations of the “Enter Home” activity in Figs. 6, 7, 8 and 9.

According to Fig. 6, if the “Enter Home” activity occurs in the [\(7:52\), \(10:08\)] timeframe, it is typically followed by the “Relaxing” and “Eating” activities with confidence values of 0.41 and 0.25, respectively. Furthermore, Fig. 7 indicates that when “Enter Home” occurs between \(13:27\) and \(13:35\), the next followup activities are “Cooking” with a confidence of 0.36, “Relaxing” with a confidence of 0.17, and “Eating” with a confidence of 0.15. It is worth mentioning that only maximum of the three most probable succeeding activities with confidence values greater than 0.1 are represented. Figure 8 indicates that when the “Enter Home” activity takes place between \(16:38\) and \(17:10\), the most probable followup activity is “Cooking” with a confidence of 0.82. Finally, Fig. 9 suggests that if the “Enter Home” occurs during the [\(18:41\), \(21:57\)] interval, it is most likely followed by three activities: taking medication, eating, and relaxing. The confidence for these followup activities are 0.28, 0.27, 0.20, respectively.

Fig. 6
figure 6

Results for the 1st cluster

Fig. 7
figure 7

Results for the 2nd cluster

Fig. 8
figure 8

Results for the 3rd cluster

Fig. 9
figure 9

Results for the 4th cluster

Table 6 provides results of running TEREDA on our smart home data for the first five activities. The results indicate that the bathing activity takes place in two time clusters. For the first cluster, the activity normally happens between \(8:31\) and \(15:15\), where it lasts between 3 and 9 min and is followed by personal hygiene, eating, or leave home activities with confidence values of 0.45, 0.12, and 0.12, respectively. As previously mentioned, we represent maximum of three followup activities whose confidence values are greater than 0.1. The results corresponding to the second cluster of the Bathing activity suggests that the Bathing activity is usually performed during the \(20:24\) to \(22:06\) time interval, where it lasts approximately 19–21 min and is followed by three activities: personal hygiene, sleeping and relaxing. The corresponding confidence values for these three activities are 0.58, 0.13 and 0.13, respectively.

According to results of TEREDA regarding the bed-toilet activity, it seems that this activity occurs mostly in two main time clusters. The first cluster usually happens between \(2:36\) and \(5:06\) early morning, where it takes 4–6 min and is followed by the sleeping activity almost all the times. The results corresponding to the second cluster indicate that the bed-toilet activity takes place in the evening from \(21:46\) to \(23:22\) and is followed by sleeping (78%), bathing (11%) and taking medication (11%).

Moreover, results provided in Table 6 suggest the cooking activity is typically performed in five clusters within a day. As previously discussed in TEREDA’s description, discovering different numbers of clusters for different activities is due to employing the EM clustering algorithm. Similar to the Cooking activity, results from Table 6 suggest that the Eating activity also occurs in five main clusters of a day. The observations regarding the Enter Home activity have already been discussed in the main body of the paper, Sect. 5.2.

Table 6 TEREDA’s discovered rules, where minsup and minconf are both set to 0.1.(The results are shown for the first 5 ADLs)

Table 7 demonstrates results of running TEREDA on our smart home data for the second five activities. The results suggest that TEREDA discovers only one cluster of a day for the occurrence of the Housekeeping activity, when it takes less than 22 min and is typically followed by the leaving home, personal hygiene, and relaxing activities with the specified confidence values. Furthermore, the discovered results regarding the Personal Hygiene activity indicate that this activity happens mainly in four time clusters per day and takes no more than 10 min.

Table 7 also demonstrates findings regarding the last three activities (i.e. Relaxing, Sleeping, and Taking Medication). These findings indicate that the Relaxing activity occurs in seven clusters of a day. As a side note, the Relaxing activity is performed on a couch in the smart home and is typically accompanied with other activities, including Watching TV, Snacking, Eating, etc. In contrast to the Relaxing activity, the Sleeping activity takes place in Bed, generally during nighttime. Regarding the Taking Medication activity, the results suggest that the resident takes her Medication mainly in four time clusters. Interestingly, the results convey that the most probable activity following the Taking Medication activity is Cooking, regardless of the time of day at which the Taking Medication activity occurs.

Table 7 The discovered temporal relation rules, where minsup and minconf are both set to 0.1.(The results are presented for the second 5 ADLs)

Finally, we compare TEREDA’s prediction accuracy against a number of other algorithms for the task of activity label prediction. The other prediction approaches used to evaluate TEREDA are naïve Bayes (NB), Decision Tree (C4.5), Multi-Layer Perceptron (MLP) and Support Vector Machines (SVMs). For the mentioned algorithms, three features were considered to predict the next activity label:

  • Activity label The label of the current activity.

  • Activity time of day A discretized value of the time when the current activity occurs. Time values are binned into the following ranges: 0–3, 4–7, 8–11, 12–15, 16–19, and 20–23.

  • Activity day of week An integer value ranging from 1 to 7 representing the day of the week in which the current activity happens.

The NB algorithm is a probabilistic classifier based on applying Bayes’ theorem that assumes all of the above mentioned features are independent (Nazerfard and Cook 2015). The specific parameters for the other compared algorithms are provided in Table 8.

Table 8 Parameters used for the other compared algorithms

All prediction approaches were tested using 10-fold cross validation. Each prediction method was trained on nine out of ten groups and tested on the remaining one. The results from all ten permutations were used to obtain significance values and averaged together for acquiring an overall accuracy.Footnote 2

As already discussed, Tables 6 and 7 demonstrate the details of running TEREDA on our smart home data. Table 9 and Fig. 10 compare the activity label prediction outputs between TEREDA and other discussed algorithms. Results suggest that TEREDA achieves an accuracy of \(73.69\%\) with a standard deviation of 4.03 for Apt1 and an accuracy of \(64.8\%\) with a standard deviation of 4.64 for Apt2. As compared to TEREDA, NB shows a \(11.2\%\) decline for Apt1 and a \(12.18\%\) decline in accuracy for Apt2. The prediction results for the C4.5 algorithm suggest a \(6.77\%\) accuracy drop for Apt1 and a \(9.57\%\) drop for Apt2. Prediction results of the MLP algorithm also indicate a \(9.76\%\) drop in accuracy for Apt1 and a \(10\%\) accuracy drop for Apt2. Prediction accuracies of SVMs with a quadratic polynomial kernel on our smart apartment testbeds also imply a \(11.85\%\) decline for Apt1 and a \(11.37\%\) decline for Apt2, compared to TEREDA.

Table 9 Activity label prediction accuracy for TEREDA, NB, C4.5, MLP and SVMs.
Fig. 10
figure 10

The overall activity label prediction comparison among discussed approaches

Finally, Table 10 provides the confusion matrix of TEREDA predictions for Apt1. The results indicate that our dataset is highly imbalanced. For example, the “Personal Hygiene” activity evidently is over-represented in our dataset, as compared to other activities. Hence, the imbalanced nature of our dataset is one of the major reasons for particular confusions made by TEREDA. However, there are other possible underlying reasons behind major confusions observed in Table 10, for instance, results suggest that some confusions occur between the “Taking Medication” and “Eating” activities. This confusion is mainly a result of similar location of occurrence. Also, the confusion between the “Cooking” and the “Personal Hygiene” activities is most likely due to the two activities overlapping too often. Most of the remaining confusions occur between two activities that typically happen consecutively, including “Personal Hygiene” and “Leave Home”, “Eating” and “Taking Medication”, and “Eating” and “Leave Home”.

Table 10 Confusion matrix for the TEREDA predictions

6 Discussion on scalable TEREDA

A number of work have set up to address the problem of frequent itemset mining in parallel and distributed settings, including work by Moens et al. (2013), Anastasiu et al. (2014), Zhang et al. (2015), and Huynh et al. (2017).

In this regard, studies on parallel programming have mainly followed upon two categories of shared memory and distributed architectures. Shared memory systems are parallel mechanisms in which processes share a single memory address space. Even though implementing parallelism on shared memory systems seems easier, the achieved scalability is not satisfactory (Moens et al. 2013). In the message passing interface (MPI) (Li et al. 2014) processes interact only through direct message passing. Messages are often sent over a network connection. Despite certain advantages in iterative computation, the disadvantages of MPI include its high communication load due to data exchanges between different workstations and lack of fault tolerance mechanism.

MapReduce by Dean and Ghemawat (2008) is a parallel programming framework that provides a relatively simple programming interface. Computation in MapReduce consists of two main phases: map and reduce. The problem input is specified as a set of key-value pairs. In the map phase, each mapper processes a distinct split of data and generates key-value pairs. During reduce phase, the key-value pairs produced in the map phase are grouped by key and fed to reducers as pairs of key-value lists. The reducers further process these intermediate parts of information to produce the final output. The MapReduce framework proves to be an efficient platform for parallel and distributed data mining of large scale datasets. However, it is not appropriate for iterative computation, since repeated read/write operations to its distributed file system would lead to high I/O load and time cost.

To overcome the above-mentioned problems, we propose the Apache Spark platform by Zaharia et al. (2010), a memory-based distributed framework, as a solution architecture for the parallel and scalable version of TEREDA for future directions.

7 Conclusions and future work

In this paper, we introduced TEREDA to discover the temporal relations of the activities of daily living. The proposed approach is based on association rule mining and the EM clustering techniques. TEREDA also discovers usual start time and duration of activities as a mixture normal model.

One of the technologies that has received increasing attention over the past few years is the Internet of Things (IoT). IoT is considered the technology of seamlessly integrating classical networks and networked objects (Miorandi et al. 2012). One of the most important questions that arises in this new emerging technology is how to convert the massive data captured by IoT into knowledge to provide more convenient environments for people (Tsai et al. 2014). In future, we plan to develop distributed and scalable TEREDA, using Spark, and employ it to provide possible solutions to discover hidden information in the IoT big data.