1 Introduction

The convergence of sensing environments and pervasive computing has created massive interest in the development of smart homes [13]. A smart home is an intelligent agent that perceives the state of its residents and the physical environment using sensors. Recent advances in machine learning and data mining have enabled activity recognition research based on smart home sensing data to play a direct role in improving the quality of health care. Activity recognition techniques depend on various types of methods, parameters, and diverse characteristics of sensing data, so the availability of real-world data is essential for testing and evaluating them. However, collecting new datasets is often infeasible because of the budget, human resources, and annotation cost involved. For these reasons, most researchers prefer to use existing datasets to test and validate their techniques.

A large number of smart home datasets are publicly available, and the activities performed by individuals can be characterized by duration, frequency, sequential order, temporal order, and other factors such as age and gender [4]. Important attributes to consider when analyzing smart home datasets include the time scale needed to estimate the durations of different activities; for example, eating takes more time than taking medication. Periodic variations may occur in daily, weekly, monthly, annual, and even seasonal activities [5]; for example, cleaning is more likely to occur on weekends than on weekdays. Parallel activities may occur, in which one user performs more than one activity in a single time unit [6]. Sequential activities are characterized by their preceding and following activities, which helps identify the influence of different activities on each other; for example, taking medicine is very likely followed by eating. The acquisition of large datasets therefore poses several challenges for their use and analysis. Existing machine learning algorithms utilize smart home datasets in terms of performed activities, deployed sensors, allocated time, and number of inhabitants. A standard framework for data analysis is therefore important for realistic performance evaluation of these methods. Unfortunately, no standardized methods are available that provide a detailed analysis of such datasets.

In this paper, we develop a Framework for Smart Home Datasets Analysis (FSHDA) to represent datasets in a predefined format. Our proposed method analyzes the time duration, activity count, activity type, sensor count, sensor type, sensor location, activity occurrence, activity sensor events, activity time, parallel activity count, parallel activity type, and sequences of parallel activities. These dimensions provide detailed views of the differences among environments, residents, and activities, and supply comparable information about every dataset so that researchers can choose datasets according to algorithm requirements and application domains. A particular dataset cannot be classified with the same accuracy by all classifiers. To select an appropriate classifier for a certain type of data, one needs to understand how classifiers behave on different data characteristics. Suppose there are four classifiers k1, k2, k3, and k4, and three classes c1, c2, and c3. Depending on the training set and the set of features used, classifier k1 may be the most efficient at classifying points in class c1; similarly, depending on the configuration, classifiers k2, k3, and k4 may be the most efficient at classifying points in classes c2, c3, and c1, respectively. During our analysis, we observed considerable variation in the annotation of activities and their sensor events. In all datasets, a few activities are incomplete, meaning that the start annotation is present but the end annotation tag is missing. In some cases, activities are completely annotated, yet no sensor event fires during that time; no algorithm can detect such activities because the sensory data contain no corresponding information. We identify the relationship between the distribution of the data, on the one hand, and classifier performance, on the other. We analyze the effects of the data dimensions on state-of-the-art classifiers, namely Artificial Neural Networks (ANN) [7], Hidden Markov Models (HMM) [8], Conditional Random Fields (CRF) [9], and Support Vector Machines (SVM) [10]. These four schemes are applied to four analyzed datasets selected from three of the most significant smart home projects: CASAS [11], ISL [1], and House_n [12]. The experimental results evaluate the existing activity classification schemes and help resolve the uncertainties associated with the choice of classifier for a given smart home dataset. The results show that no single classifier is best for all datasets; the classification accuracy of each classifier depends on the underlying data characteristics.

The rest of the paper is organized as follows. We briefly describe related work in Sect. 2. In Sect. 3, we introduce the proposed framework for dataset analysis. The analyzed results of the CASAS, ISL, and House_n smart home datasets are presented in Sect. 4. In Sect. 5, we discuss the effects of the analyzed data dimensions on state-of-the-art machine learning techniques. Finally, conclusions and future work are given in Sect. 6.

2 Related work

Smart homes can be divided into two types of environments according to the interactions and the types of deployed sensors: event-based and continuously sensing environments [13]. An event-based environment can be built using binary sensors that report ON/OFF states when inhabitants interact with appliances (e.g., cabinet, stove, chair) [1]. A continuously sensing environment can be built using environmental sensors (e.g., temperature, humidity, light) that send the sensed data periodically to a server [5]. Advances in wireless communications have enabled multiple sensor devices to send and receive data over long distances [13, 14]. In such environments, sensory data have different levels of abstraction, different representations, and are diverse in nature [15], which significantly affects the performance of activity recognition techniques.

The aim of smart home technology is to provide ambient assisted living for care delivery, remote monitoring, and the promotion of residential safety and quality of life [16–20]. Several studies have been conducted to better understand smart homes and their applications based on activity recognition. In [16], the authors studied the impact of a few dataset features on the classification accuracy of machine learning algorithms based on semi-Markov models. They considered the availability of labeled data, the required training time, and the inference speed on the ISL dataset for their experiments; their analysis showed that CRF outperforms the other semi-Markov models. The authors in [10] applied SVM to identify activities of daily living in their own smart home dataset. They selected a set of features from the dataset according to their domain of interest before using a multi-class SVM, which achieved more effective activity classification than other classifiers. The work in [17] applied ANN to cluster human activities of daily living within their own smart home environment, proposing a GSOM (Growing Self-Organizing Map) based approach that uses the appropriate data dimensions for effective cluster analysis of human activities.

The authors in [18] synthesized the sensor information collected from the CASAS smart home and extracted a wide range of useful features. They compared several machine learning algorithms on the selected features to assess activity recognition performance, and discussed the performance of the algorithms based on information gain and mRMR. The authors in [19] addressed the problem of sensor selection for activity recognition in smart homes along with classifier selection; they examined the issue of selecting and placing sensors in a CASAS smart home in order to maximize activity recognition accuracy. In [8], the authors used the ISL smart home dataset to show the potential of generative and discriminative models for recognizing activities, and demonstrated that CRFs are more sensitive to overfitting on a dominant class than HMMs. The work in [20] helped delineate the boundaries of context-aware computing and assisted application designers in deciding which features of the data are important to implement. A survey of the major techniques related to activity recognition can be found in [21]. Underlying all of these contexts are the deployed sensors and algorithms, which play an important role in achieving high performance and accuracy. Some work [22–25] has been done on the analysis of sensory data, but none of it has focused on the analysis of data from smart home sensors.

The methodology commonly observed in the literature is to propose machine learning algorithms for smart home datasets and select the one that gives relatively better results for a particular domain. To the best of our knowledge, no existing work has attempted to develop a system that analyzes the different data dimensions for a better understanding of smart home datasets, so that application developers know which features are important for a particular classification task. In this study, we measure the predefined data dimensions and analyze classifier performance to show the effects of data characteristics. Our study will help researchers choose an appropriate dataset according to the intended requirements of classifiers and their associated parameters.

3 Proposed framework for dataset analysis

In smart homes, datasets are collected in the form of raw data files and are intended to support diverse applications for ambient assisted living. A dataset is composed of “Log Files” (LF), which keep the record of active sensors at different time slots, and “Annotation Files” (AF), which maintain the list of performed activities in temporal order. Our proposed Framework for Smart Home Datasets Analysis (FSHDA) processes the raw data to compute the following data dimensions: (a) total time duration of the collected dataset, (b) activity type and count for the annotated activities with their labels, (c) sensor type and count for the deployed sensors with their labels, (d) smart home characteristics, i.e., dataset characteristics in terms of dataset time, annotated activities, and deployed sensors, (e) activity occurrences, i.e., the number of unique annotated activities, (f) activity time, i.e., the time duration of an individual activity, (g) activity sensor events, i.e., the number of sensor events generated for a particular activity, (h) idle activity time, i.e., the total time of unannotated activities, (i) idle activity sensor events, i.e., the sensor events generated during idle activity time, (j) parallel activity type and count, i.e., the total number of parallel activities with their labels, and (k) sequence of parallel activities, i.e., the start and end sequences of parallel activities. These dimensions assist researchers in better understanding the nature of the performed activities before finalizing a classification technique.

To define the proposed framework, we assume that \(A_{n}^{D} = \{a_{1}, a_{2}, a_{3}, \ldots, a_{n}\}\) are the n annotated activities in dataset D, and \(S_{m}^{D} = \{s_{1}, s_{2}, s_{3}, \ldots, s_{m}\}\) are the m sensors deployed in the smart home over a duration \(T_{d}^{D} = \{t_{1}, t_{2}, t_{3}, \ldots, t_{d}\}\). We further define the set of unique time stamps \(U_{T} = \mathit{unique}(T_{d}^{D})\), the set of unique activities \(U_{A} = \mathit{unique}(A_{n}^{D})\), the set of unique deployed sensors \(U_{S} = \mathit{unique}(S_{m}^{D})\), and the set of active sensor events \(S_{act} = \mathit{active\ sensor\ events}(S_{m}^{D})\). The proposed data dimensions for the uniform representation and compact analysis of smart home datasets are computed as shown in Table 1 to generalize the characteristics of all datasets. The pseudo code for computing data dimensions (a) to (c), (e) to (h), and (i) to (j) is given in Algorithms 1, 2, and 3, respectively.

Algorithm 1
figure 1

Computation of data dimensions (a), (b) and (c)

Algorithm 2
figure 2

Computation of data dimensions (e), (f), (g) and (h)

Algorithm 3
figure 3

Computation of data dimensions (i) and (j)

Table 1 The FSHDA data dimension calculations
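For illustration, the following Python sketch (not the authors' MATLAB implementation) shows how dimensions (a) to (c) could be derived from a generic annotation file and sensor log; the whitespace-separated column layout and timestamp format are assumptions, since the exact file formats differ between smart home projects.

```python
# Minimal sketch: data dimensions (a) total duration, (b) activity type/count,
# and (c) sensor type/count from a hypothetical whitespace-separated format:
#   annotation file:  <date> <time> <activity_label> <begin|end>
#   sensor log file:  <date> <time> <sensor_id> <value>
from collections import Counter
from datetime import datetime

def parse_ts(date_str, time_str):
    # Assumed timestamp layout; adjust to the dataset at hand.
    return datetime.strptime(date_str + " " + time_str, "%Y-%m-%d %H:%M:%S")

def dataset_dimensions(annotation_file, sensor_log_file):
    activity_counts, sensor_counts = Counter(), Counter()
    timestamps = []

    with open(annotation_file) as af:
        for line in af:
            date, time, activity, tag = line.split()[:4]
            timestamps.append(parse_ts(date, time))
            if tag.lower() == "begin":          # count each activity once, at its start tag
                activity_counts[activity] += 1

    with open(sensor_log_file) as lf:
        for line in lf:
            date, time, sensor_id, value = line.split()[:4]
            timestamps.append(parse_ts(date, time))
            sensor_counts[sensor_id] += 1       # events fired per sensor

    return {
        "duration": max(timestamps) - min(timestamps),   # dimension (a)
        "activity_types": sorted(activity_counts),       # dimension (b): labels
        "activity_count": sum(activity_counts.values()),
        "sensor_types": sorted(sensor_counts),            # dimension (c): labels
        "sensor_count": len(sensor_counts),
    }
```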

The proposed framework (Footnote 1) is publicly available and can be used to compute the data dimensions of other smart home datasets in addition to the datasets evaluated here. It has been implemented in MATLAB 7.6 on an Intel Pentium(R) Dual-Core 2.5 GHz machine with 3 GB of memory running Microsoft Windows 7. The analysis of the three most significant smart home datasets is presented in the following section.

4 Results of data dimensions analysis

In this section, we experimentally validate the proposed framework for dataset analysis. We calculate a list of data dimensions for a set of diverse datasets in order to demonstrate the consistency and scalability of the proposed framework. During the analysis, we keep the same activity labels as used in the collected datasets. The details of the datasets and the results of the analysis are shown in the following subsections. The scope of the proposed framework is not limited to these datasets, however, and researchers can analyze other datasets with the publicly available version of the framework.

4.1 CASAS smart home dataset analysis

The CASAS smart home project is a research project at Washington State University (WSU); we selected its four most recent datasets for analysis with our proposed framework. Different kinds of sensors, such as temperature sensors, motion sensors, and binary sensors, are deployed at various locations. These sensors are placed on multiple objects, such as doors, kitchen stove burners, and the TV lounge, and at other places in the home environment [26]. The datasets are collected over long durations, and thousands of sensor events are generated while single or multiple inhabitants perform daily life activities. Table 2 reports the computed data dimensions of activity count, sensor count, duration, and inhabitant information to outline the characteristics of the datasets. It is evident from Table 2 that Tulum2010 is the largest CASAS dataset in terms of activity count and duration of data collection.

Table 2 The FSHDA calculation of CASAS smart home dataset characteristics

The detailed characteristics of the four CASAS datasets are computed with our framework and the results are shown in Table 3. During the analysis, we identify some common activities that are annotated in almost all datasets: “toileting”, “meal preparation”, “sleeping”, “watching TV”, “personal hygiene”, and “work”. Among these, the most commonly annotated activity is “meal preparation”, which is annotated 1791 times in Tulum2010. Our computed dimensions show that in some datasets a few activities are annotated at the macro level, while in others the same activities are annotated at the micro level. For example, cooking is annotated as “meal preparation” in all datasets except Tulum2009, where it is annotated as “cook breakfast” and “cook lunch”. Other noticeable activities are those annotated in only one or two datasets, such as “yoga” (Tulum2010), “study” (Tulum2009), and “R1 snack” (Tulum2009), which occur 24, 9, and 491 times, respectively.

Table 3 The FSHDA computed characteristics of CASAS smart home datasets Tulum2010, Twosummer2009, Twor2009, and Tulum2009. The “Num” column shows activity count, the “Time” column shows activity time in minutes, and the “Sensor” column shows activity sensor events

During the dataset collection period, some activities are performed in parallel; our framework computes these activities as separate data dimensions, and the results are shown in Table 4 with their counts and dataset descriptions. Pairs of parallel activities do not always have the same start and end sequence. For example, “clean” and “R1 work” are parallel activities in Twor2009, but their start and end sequences are not the same across all occurrences. Similarly, “cleaning” and “grooming” are parallel in TwoSummer2009 with different start and end sequences.

Table 4 The FSHDA computed list of parallel activities for CASAS smart home datasets

4.2 ISL smart home dataset analysis

The Intelligent System Laboratory (ISL), a research group at the University of Amsterdam, recorded three single-subject datasets (Footnote 2) in three different smart homes: House A, House B, and House C [27, 28]. The data dimensions related to the number of deployed sensors, performed activities, and inhabitants of each house are presented in Table 5. From the computed results, we can infer that the data collection duration is longest in House A, while House B has larger activity and sensor counts. Houses A and B are one-room apartments, while the House C data were collected in a two-story home [27].

Table 5 The FSHDA calculation of ISL smart home dataset characteristics

The detailed data dimensions of the three smart homes, in terms of activity count, time duration, and sensor events, are computed and the results are presented in Table 6. From these statistics, we identify the most commonly annotated activities in the datasets: “leaving”, “toileting/toilet downstairs”, “showering”, “brush teeth”, “sleeping/go to bed”, and “prepare dinner”. The occurrences of “toileting/toilet downstairs” are more numerous in House A than in the other houses. Activities annotated only in House A are “load dishwasher”, “unload dishwasher”, “store groceries”, “unload washing machine”, and “receive guest”. Among these, “store groceries” occurs only once, while the occurrences of the remaining activities are similar to one another. More activities are annotated in House B than in House A and House C, but the number of occurrences of these activities is very low; for example, “shaving”, “unpacking”, and “on phone” are each annotated only once. The activities annotated only in House C are “relax”, “use toilet upstairs”, “take medication”, and “lunch”. Among these, “relax” and “use toilet upstairs” are significantly more frequent than the other two.

Table 6 The FSHDA computed characteristics of ISL smart home datasets House A, House B, and House C. The “Num” column shows activity count, the “Time” column shows activity time in minutes, and the “Sensor” column shows activity sensor events

The proposed framework found a set of parallel activities in all three datasets. In House A, “get drink” and “receive guest” are parallel activities and are annotated three times in the dataset. Similarly, in House B, “eat brunch” and “prepare brunch” are annotated twice as parallel activities. In House C, “eating” is parallel to the “prepare dinner” and “relax” activities. During the analysis, discrepancies were found between the reported and actual statistics of the activity and sensor counts of these datasets. The reported numbers of activities are 10, 14, and 16, while the actual activity counts are 16, 25, and 17 for Houses A, B, and C, respectively. In House A, “eating” is annotated once in the dataset, but no sensor event fires within that period, so it is not detectable during computation; this reduces the total count of actually detected activities from 16 to 15. The most variation is found in the analysis of the House B data dimensions: the “prepare breakfast” and “get drink” activities appear in the reported statistics, yet we did not find a single occurrence of them in the actual data. Furthermore, the “wash toaster” activity is annotated once for 2.45 minutes in House B, but no sensor event is generated at that time.

4.3 House_n smart home dataset analysis

The House_n smart home project at MIT collected data using a set of simple state-change sensors [29]. Two datasets (Footnote 3) were collected for two different subjects, both of whom live alone in one-bedroom apartments. The number of performed activities, the deployed sensors, and the duration, along with inhabitant information, are shown in Table 7.

Table 7 The FSHDA calculation of House_n smart home dataset characteristics

The detailed characteristics of the performed activities are computed using the predefined data dimensions, as depicted in Table 8. From the computed statistics, we identify the most common activities of both datasets: “toileting”, “washing dishes”, “preparing breakfast”, “preparing lunch”, “preparing dinner”, and “preparing a snack”. Among these, “toileting” is annotated almost twice as often for Subject 1 as for Subject 2 (82 vs. 33 times), and the data collection durations and sensor events vary accordingly. The number of occurrences of “washing dishes” for Subject 2 is almost three times that for Subject 1 (20 vs. 7), and the occurrences of “preparing dinner” are almost double for Subject 2 compared with Subject 1. Activities annotated only for Subject 1 are “preparing a beverage”, “dressing”, “bathing”, “grooming”, “cleaning”, “doing laundry”, and “going to work”; among these, “grooming” has the most occurrences, while “cleaning” has the fewest. Activities annotated only for Subject 2 are “taking medication”, “watching TV”, and “listening to music”; the numbers of occurrences of these activities are all similar.

Table 8 The FSHDA computed characteristics of House_n smart home datasets Subject 1, and Subject 2. The “Num” column shows activity count, the “Time” column shows activity time in minutes, and the “Sensor” column shows activity sensor events

During the analysis, parallel activities are found for Subject 2, where “listening to music” occurs in parallel with “preparing a snack” three times, with “preparing lunch” twice, and with “toileting” and “washing dishes” once each at different time slots. In the reported statistics for Subject 1, 77 sensors are deployed; however, the proposed framework finds that only 28 of those sensors generate events. Activity counts also vary: “toileting” is reported 85 times, but its actual count is 82 because no sensor event fires between the start and end of the remaining three occurrences. For Subject 2, the reported number of deployed sensors is 84, but the analysis shows that only 20 of those sensors are activated during the data collection period. Similarly, the reported counts for “listening to music”, “toileting”, and “washing dishes” are 18, 40, and 21, whereas their actual counts are 14, 33, and 20, respectively.

5 Effects of proposed data dimensions on activity recognition

In this section, we validate the effectiveness of the proposed data dimensions by demonstrating how the performance of activity recognition techniques is influenced by the data dimensions calculated by the proposed framework. The proposed validation methodology consists of three major modules: (1) data preprocessing, which represents the sensory data as observation vectors for machine learning algorithms; (2) classifiers for activity recognition, which introduces the classifiers applied in the activity recognition domain together with the settings preferred for our experiments; and (3) results of variations in classifier performance and discussion, which shows how the performance of activity recognition techniques is influenced by the dataset characteristics. The details of each module are described in the following subsections.

5.1 Data preprocessing

Data preprocessing is an important step toward accurate training of machine learning techniques [30]. Data collected from ubiquitous sensors during subject interactions are stored in sensor logs and annotation files with the attributes start time, end time, sensor id, sensor value, and activity label. In order to recognize the performed activities, the recorded dataset is preprocessed into the form \(\{(x_{1}, y_{1}), \ldots, (x_{n}, y_{m})\}\). Each \(x_{i}\) is a vector whose components are the values of the embedded sensors \(\{S_{1}, \ldots, S_{n}\}\), such as a stove sensor, refrigerator sensor, or door sensor. The values of y are drawn from a discrete set of classes \(\{c_{1}, \ldots, c_{m}\}\), such as “leave home”, “read”, and “sleep”. Furthermore, excess information such as multiple header lines is removed from the sensor logs and annotation files.
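A minimal sketch of this preprocessing step, under the assumption that the logs have already been parsed into in-memory lists, is given below; it builds one observation vector per annotated activity by marking which sensors fired inside the activity interval.

```python
# Sketch (assumed data layout, not the authors' exact code): turn parsed sensor
# events and activity annotations into (x_i, y_i) pairs, where x_i is a binary
# vector over the deployed sensors and y_i is the activity label.
import numpy as np

def build_observations(events, annotations, sensor_ids):
    """events: list of (timestamp, sensor_id); annotations: list of (start, end, label)."""
    index = {s: k for k, s in enumerate(sensor_ids)}
    X, y = [], []
    for start, end, label in annotations:
        x = np.zeros(len(sensor_ids))
        for ts, sensor in events:
            if start <= ts <= end and sensor in index:
                x[index[sensor]] = 1.0          # this sensor fired during the activity
        X.append(x)
        y.append(label)
    return np.array(X), np.array(y)
```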

5.2 Classifiers for activity recognition

In this section, we briefly introduce the classifiers (Footnote 4) applied in the activity recognition domain. We applied several state-of-the-art probabilistic and statistical techniques (i.e., ANN, HMM, CRF, and SVM) to the analyzed datasets. However, these methods cannot handle parallel activities because the datasets contain insufficient training instances of parallel activities. A brief description of each classifier, with the settings preferred for our experiments, is given below.

Artificial Neural Networks (ANN): An ANN is an information processing network of artificial neurons connected to each other through weighted links. For activity recognition, a multilayer neural network with the backpropagation learning algorithm is used to recognize human activities [8]. The structure of the network, the number of hidden layers, and the number of neurons in each layer affect the learning of different activities, and the activation of the neurons in the network depends on the activation function [8]. We train the multilayer neural network with backpropagation, and the weights are updated according to the following equation:

$$ \Delta w_{ki} = -c \biggl[ -2 \sum_{j} \bigl\{ ( y_{j(\mathrm{desired})} - y_{j(\mathrm{actual})} ) f'(\mathrm{act}_{j}) w_{ij} \bigr\} f'(\mathrm{act}_{i}) x_{k} \biggr] $$
(1)

where Δw_ki is the adjustment of the weight on a network link and c is the learning rate of the neural network, with 0<c≤1. We set c=0.1 so that the performed activities are learned more rapidly. Our network uses one hidden layer with twenty neurons and the tangent sigmoid function as the activation function, given below:

$$ \varphi ( {x} ) =\tanh \biggl( \frac{ {x}}{2} \biggr) = \frac{1-\exp (- {x})}{1+\exp (- {x})} $$
(2)

Learning of the network is limited to a maximum of 1000 epochs. The multilayer neural network can be seen as an intuitive representation of a multi-layer activity recognition system, and the number of correctly classified activities depends on the number of training instances available during the learning phase.
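The configuration described above (one hidden layer of twenty neurons, tangent sigmoid activation, learning rate c = 0.1, and at most 1000 epochs) can be approximated with an off-the-shelf multilayer perceptron, as in the scikit-learn sketch below; this is an illustration on toy data, not the authors' original backpropagation code.

```python
# Illustrative MLP setup mirroring the stated settings (assumption: scikit-learn
# is an acceptable stand-in for the backpropagation network described above).
from sklearn.neural_network import MLPClassifier
import numpy as np

# Toy stand-in data: 40 observation vectors over 10 sensors, 3 activity classes.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(40, 10)).astype(float)
y = rng.integers(0, 3, size=40)

ann = MLPClassifier(hidden_layer_sizes=(20,),   # one hidden layer, twenty neurons
                    activation="tanh",          # tangent sigmoid activation, Eq. (2)
                    solver="sgd",
                    learning_rate_init=0.1,     # learning rate c = 0.1 in Eq. (1)
                    max_iter=1000)              # learning limited to 1000 epochs
ann.fit(X, y)
print(ann.predict(X[:5]))                       # predicted activity labels
```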

Hidden Markov Model (HMM): An HMM is a generative probabilistic graphical model based on a Markov chain process [9]. For activity recognition, the HMM is defined by its number of states and their transition weight parameters. In our experimental setting, the parameters are learned from the observations, and the following parameters are required to train the model:

$$ \lambda = \{ A,B, \pi \} $$
(3)

where λ is the graphical model for activity recognition, A is the state transition probability matrix, B is the output symbol probability matrix, and π is the initial state probability distribution [9]. We use the Baum–Welch algorithm to estimate the state and transition probabilities during HMM training. The model for the ith activity class is given as

$$ \lambda _{i} = \{ A _{i}, B _{i}, \pi _{i} \},\quad i=1,\ldots,N $$
(4)
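The following sketch illustrates classification with per-activity models λ_i = {A_i, B_i, π_i} as in Eq. (4): a sensor-event sequence is assigned to the activity whose model yields the highest forward-algorithm likelihood. The toy matrices are assumptions for demonstration, and Baum–Welch training is omitted.

```python
# Sketch of HMM-based activity classification over discrete observation symbols.
import numpy as np

def forward_log_likelihood(obs, A, B, pi):
    """obs: sequence of observation symbol indices; returns log P(obs | A, B, pi)."""
    alpha = pi * B[:, obs[0]]                     # initialization
    log_lik = 0.0
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]             # forward induction step
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha /= scale                            # rescale to avoid underflow
    return log_lik + np.log(alpha.sum())

def classify(obs, models):
    """models: dict mapping activity label -> (A, B, pi); pick the most likely model."""
    scores = {label: forward_log_likelihood(obs, *params)
              for label, params in models.items()}
    return max(scores, key=scores.get)

# Toy usage: two activity models sharing 2 hidden states and 3 observation symbols.
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B1 = np.array([[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]])
B2 = np.array([[0.1, 0.3, 0.6], [0.5, 0.3, 0.2]])
pi = np.array([0.5, 0.5])
models = {"cook lunch": (A, B1, pi), "sleeping": (A, B2, pi)}
print(classify([0, 0, 1, 0], models))
```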

Conditional Random Fields (CRF): A CRF is a discriminative probabilistic graphical model for labeling sequences. The structure of the CRF for activity recognition is similar to that of the HMM, but the learning mechanism differs because there are no hidden states [10]. In our experimental setting, the conditional probability of the activity labels given the sensor observations is calculated as follows:

$$ p ( y_{1:T} \mid x_{1:T} ) = \frac{1}{Z ( x_{1:T}, w )} \exp \Biggl\{ \sum_{j=1}^{N_{f}} w_{j} F_{j} ( x_{1:T}, y_{1:T} ) \Biggr\} $$
(5)

In Eq. (5), \(Z(x_{1:T}, w)\) denotes the normalization factor and \(F_{j}(x_{1:T}, y_{1:T})\) is a feature function. To perform inference with the model, we compute the most likely activity sequence as follows:

$$ y_{1:T}^{*} = \mathop{\mathrm{argmax}}\limits_{y'_{1:T}} p \bigl( y'_{1:T} \mid x_{1:T}, w \bigr) $$
(6)
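As an illustration of Eqs. (5) and (6), the sketch below trains a linear-chain CRF with the sklearn-crfsuite package; the indicator features over active sensors and the toy sequences are assumptions, not the feature set used in the original experiments.

```python
# Sketch of linear-chain CRF training for activity-label sequences (assumption:
# sklearn-crfsuite as the CRF implementation; features are simple indicators
# of the sensors active at each time step).
import sklearn_crfsuite

def sensor_features(active_sensors):
    # One boolean feature per firing sensor at this time step.
    return {"sensor=" + s: True for s in active_sensors}

# Hypothetical toy data: each element of X is one day's sequence of feature
# dicts, and the matching element of y is that day's activity-label sequence.
X = [[sensor_features(["M01"]), sensor_features(["M01", "D02"]), sensor_features(["D02"])]]
y = [["sleeping", "toileting", "toileting"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=100)
crf.fit(X, y)                      # learns the weights w_j of Eq. (5)
print(crf.predict(X))              # most likely label sequence, as in Eq. (6)
```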

Support Vector Machine (SVM): The SVM is a statistical learning method that classifies data by determining a set of support vectors and minimizing the average error [11]. It can provide good generalization performance owing to its rich theoretical basis and the transformation of the problem into a high-dimensional feature space. In our experimental setting, for a given training set of sensor value and activity pairs, the binary linear classification problem requires solving the following maximization problem, using Lagrange multipliers and kernel functions:

$$\begin{aligned} &\mathrm{Maximize\ (w.r.t.}\ \alpha){:}\quad \sum_{i=1}^{n} \alpha_{i} - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_{i} y_{i} \alpha_{j} y_{j} K ( x_{i}, x_{j} ) \end{aligned}$$
(7)
$$\begin{aligned} &\mathrm{Subject\ to}{:}\quad \sum_{i=1}^{n} \alpha_{i} y_{i} = 0, \quad 0 \leq \alpha_{i} \leq C \end{aligned}$$
(8)

where K is a kernel function satisfying \(K(x_{i}, x_{j}) = \Phi^{T}(x_{i}) \Phi(x_{j})\). We use the radial basis function (RBF) kernel to recognize the activities:

$$ K ( x _{i}, x _{j} ) = \exp \biggl( \frac{- \Vert x _{i} - x _{j} \Vert ^{2}}{ ( 2 \sigma ^{2} )} \biggr) $$
(9)

Activity recognition is a multiclass problem, so we adopt the “one-versus-one” method to classify the different activities. The final activity class is determined by a voting mechanism: the class with the maximum number of votes determines the activity label.
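The sketch below mirrors this setup with a scikit-learn SVC, which applies the one-versus-one voting scheme for multiclass problems; the values of C and of the kernel width σ are placeholders, since they are not reported here.

```python
# Sketch of the SVM setup (assumption: scikit-learn SVC as the implementation;
# C and sigma are placeholder values, not reported settings).
from sklearn.svm import SVC
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(60, 10)).astype(float)   # toy observation vectors
y = rng.integers(0, 4, size=60)                        # toy activity labels

svm = SVC(kernel="rbf",                   # RBF kernel of Eq. (9)
          gamma=1.0 / (2 * 0.5 ** 2),     # gamma = 1 / (2 * sigma^2), with sigma assumed 0.5
          C=1.0,                          # upper bound C on alpha_i in Eq. (8)
          decision_function_shape="ovo")  # one-versus-one voting over class pairs
svm.fit(X, y)
print(svm.predict(X[:5]))                 # predicted activity labels by majority vote
```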

5.3 Results of variations in classifiers performances and discussion

In this section, results showing the effects of the smart home dataset dimensions on classifier performance are presented for the recognition of daily life activities. The approach has been implemented in MATLAB 7.6 on an Intel Pentium(R) Dual-Core 2.5 GHz machine with 3 GB of memory running Microsoft Windows 7. The standard metric of accuracy, computed from the values of the confusion matrix [31], is used as the performance measure:

$$ \mathit{Accuracy} = \frac{\sum_{i =1} ^{Q} \mathit{TP} _{i}}{T} $$
(10)

where Q is the number of performed activities, TP_i is the number of true positives for activity i, and T is the total number of activities. We evaluated the Tulum2009 and Twor2009 datasets from the CASAS smart home; from the ISL and House_n smart homes, experiments were performed on the House A and Subject 1 datasets, respectively. We split the datasets using the “leave one day out” approach, in which the sensor readings of one day are used for testing and the remaining days for training. Figures 1, 2, 3, and 4 show the experimental results for the Tulum2009, Twor2009, House A, and Subject 1 datasets, respectively; for each activity in each dataset, the accuracies of ANN, HMM, CRF, and SVM are illustrated. In the following paragraphs, we discuss how the performance of each technique varies across the different datasets in terms of the proposed data dimensions.
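A short sketch of how Eq. (10) and the “leave one day out” protocol can be realized is given below; the day indices, observation vectors, labels, and classifier object are hypothetical placeholders.

```python
# Sketch of the evaluation protocol: accuracy from the confusion-matrix trace
# (Eq. (10)) under a leave-one-day-out split. `days`, `X`, `y`, and `clf` are
# placeholders for the per-event day index, the observation vectors, the
# activity labels, and any of the classifiers from Sect. 5.2.
import numpy as np
from sklearn.metrics import confusion_matrix

def leave_one_day_out_accuracy(clf, X, y, days):
    accuracies = []
    for test_day in np.unique(days):
        train, test = days != test_day, days == test_day
        clf.fit(X[train], y[train])
        cm = confusion_matrix(y[test], clf.predict(X[test]))
        accuracies.append(np.trace(cm) / cm.sum())   # sum of TP_i over total, Eq. (10)
    return float(np.mean(accuracies))
```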

Fig. 1
figure 4

Tulum2009 activity based accuracy of classifiers

Fig. 2
figure 5

Twor2009 activity based accuracy of classifiers

Fig. 3
figure 6

House A activity based accuracy of classifiers

Fig. 4
figure 7

Subject 1 activity based accuracy of classifiers

In our experiments, ANN shows high diversity in its performance. In order to learn activities correctly, it requires a sufficient number of training instances so that the hidden neurons can learn precisely from the training data. For this reason, ANN performs better on activities with many training instances in the dataset, while its performance is insignificant for activities with few training examples. The overall accuracy of ANN on Tulum2009 is 81.09 %; it correctly classified the “R1 snack” activity but could not recognize the “group meeting” activity. The training instances for these activities are 491 and 11, respectively, as shown in Table 3, which affects the ANN classification process. In the case of Twor2009, there are 118 training instances for “meal preparation”, and ANN outperforms all other classifiers in identifying this activity; on the same dataset, however, it cannot classify the “study” activity owing to the small number of training instances. For House A, the overall performance of ANN is a low 41.11 % compared with the other datasets. Nevertheless, it is the only classifier that classifies the “toileting” activity with 100 % accuracy, since the number of training instances for this activity (114 samples) is very high compared with the other activities, as shown in Table 6. Similarly, for Subject 1 the training instances of the “toileting” activity are numerous (82 samples), as shown in Table 8, and ANN performs better than the other classifiers in recognizing this activity. Although the overall accuracy of ANN varies from dataset to dataset, its better performance consistently depends on a large number of training instances for a particular activity.

In the case of HMM, the number of deployed sensors affects the activity class distribution, since the model observes sensor variation during the performed activities. For instance, HMM achieves 57.83 % accuracy on House A owing to the small number of deployed sensors (14 sensors), as shown in Table 5; it correctly classified “take shower”, “unload dishwasher”, and “store groceries”. On Tulum2009, it outperforms the other classifiers for the “cook lunch” and “R2 eat breakfast” activities with an overall accuracy of 56.84 %; the number of deployed sensors in this case is 20, as shown in Table 2. For Subject 1 and Twor2009, the accuracy of the HMM is not significant; the numbers of deployed sensors in these smart homes are 28 and 71, as shown in Tables 7 and 2, respectively. The large number of misclassified activities results from the HMM modeling the distributions exactly as they are observed in the dataset.

The performance of CRF is affected by the set of data dimensions listed in Table 1. Its performance does not depend on a single data dimension such as sensor count or activity occurrences; rather, its internal processing is based on conditioning on a set of data characteristics. CRF outperforms all classifiers on Tulum2009 for “group meeting”, “R1 eat breakfast”, and “wash dishes”, whereas other classifiers are better for “cook lunch”, “enter home”, “leave home”, “R1 snack”, and “R2 eat breakfast”. In the case of House A, CRF is superior for “brush teeth”, “load washing machine”, and “receive guest”. For Subject 1, its performance is high only for “washing dishes”, and other classifiers perform better on the remaining activities. In the case of Twor2009, CRF shows low performance for all activities except “wash bathtub”.

Support Vector Machine (SVM) efficiently identified the activities of Subject 1 and Twor2009; it outperforms all other classifiers on these datasets except for “washing dishes” and “R2 bed to toilet”, respectively. The performance of SVM is high if the performed activities in the dataset are highly discriminative; however, it is difficult for SVM to differentiate between activities that are closely correlated across several data dimensions. For this reason, in House A and Tulum2009 other classifiers are better than SVM for some activities, as discussed in the preceding paragraphs. For example, in Tulum2009 SVM confused “R1 eat breakfast” with “R1 snack”, since the two activities are very similar to each other; the second most confused activity is “cook lunch”. Overall, SVM performance degrades if the performed activities are closely interrelated with respect to the data dimensions.

The overall comparison of the different classifiers is presented in Table 9, which reports the overall accuracy of the four learning techniques on each dataset. ANN performs better on Tulum2009 because of the large number of training examples for each activity in the dataset, which results in its overall high performance. The performance of HMM is better in the case of House A, where the number of deployed sensors is small compared with the other datasets, which helps it maintain clear transitions between adjacent activities. CRF outperforms the other classifiers on House A with 90.30 % accuracy, while the performance of SVM is superior on the remaining datasets. The highest accuracy of 95.63 % is achieved for Subject 1, as the performed activities are not correlated with respect to the data dimensions. These results and statistics clearly show that the proposed dataset dimensions strongly affect the classifiers' individual class-level assignments, and thus their overall performance.

Table 9 Overall classifiers accuracy

6 Conclusion

Technological advances in pervasive sensing play an important role in leveraging smart home datasets for different applications of ambient assisted living. The underlying logic of the various classification methods depends on diverse data characteristics, so the dataset is significantly important for their evaluation. One of the main challenges is to compute and analyze the data dimensions and their variations, so accurate data analysis is necessary to understand and reuse the datasets. Nevertheless, the information provided with smart home datasets is not always sufficient to explore the possible dimensions of analysis. We developed a framework to analyze smart home datasets using predefined data characteristics. It enables researchers to compute data dimensions that cover variations in time, activities, sensors, and inhabitants. To evaluate its effectiveness, we showed the influence of dataset characteristics on the performance of four classifiers (i.e., ANN, HMM, CRF, and SVM), applying each classifier to four different datasets from three smart home projects. The results show that the classifiers usually perform complementarily to one another depending on dataset characteristics. It is therefore imperative to choose an appropriate dataset for a particular algorithm. Hence, the impact of the proposed framework is to provide a valuable and better understanding of the data for the domain of activity recognition.

In the future, we plan to extend this framework into a recommender system, in which a user provides information about dataset requirements and the framework recommends a list of candidate datasets, along with the most suitable classifiers, from its data repository. The proposed framework will check the availability of datasets by analyzing their compatibility with the provided requirements.