
1 Introduction

Sensor-based activity recognition [1] is at the core of smart environments. It aims to recognize the actions of one or more persons within the environment based on a series of observations of sub-actions and environmental conditions over a finite period of time. It can be regarded as a complex process that involves the following steps: (i) selecting and deploying the appropriate sensors to be attached to objects within the smart environment; (ii) collecting, storing and pre-processing the sensor data; and, finally, (iii) classifying activities from the sensor data through the use of activity models.

Advances in technology have mainly focused on the provision of a wide range of low-cost devices with low power requirements and a reduced form factor, which implies resource-constrained environments. Examples are inexpensive micro boards such as Raspberry Pi [2] or Arduino [3]. These devices allow information to be read from sensors and the sensor data to be processed in order to extract relevant information from the smart environment. These small computer boards, however, offer limited processing and storage capacity; hence, a key factor in their usage is to reduce the computational complexity of any tasks they must undertake [4, 5].

Approaches used for sensor-based activity recognition have been divided into two main categories: Data-Driven (DDA) and Knowledge-Driven (KDA) Approaches. The former, DDA, are based on machine learning techniques that require a pre-existing dataset of user behaviors. A training process is usually carried out to build an activity model, followed by a testing process to evaluate how well the model generalizes when classifying unseen activities [6–8]. With KDA, an activity model is built through the incorporation of rich prior knowledge gleaned from the application domain, using knowledge engineering and knowledge management techniques [9, 10].

This contribution focuses on the most popular algorithm among the DDA solutions, namely the nearest neighbour (NN) [11], which offers simplicity and overall good levels of accuracy [12]. The approach is based on the concept of similarity: patterns that are similar can be allocated to the same class label (activity) [13–15].

The NN approach does, however, suffer from several shortcomings. These mainly relate to high storage requirements and high levels of computational complexity [16], both of which are closely tied to the size of the dataset. The computational complexity of the linear search method of NN is \(O(n \cdot d)\), where n is the size of the dataset and d is the dimensionality, i.e., the number of sensors. This is even more relevant in the application domain of activity recognition, where smart environments generate vast amounts of sensor data.

For this reason, prototype generation (PG) algorithms [17] are a suitable choice given their focus on identifying an optimal subset of representative samples from the original training data. This is achieved by removing noisy and redundant examples and by generating new artificial data that replace the original data [18].

In order to maximise the advantages provided by the NN approach while avoiding the drawbacks associated with the size of the datasets in resource-constrained environments, the current work proposes to use PG algorithms to reduce the size of the data in order to (i) decrease the storage requirements and computational complexity of the NN approach and (ii) maintain classification accuracy.

An evaluation is undertaken with four datasets to assess how reducing the computational complexity affects the overall accuracy of activity recognition based on sensor data gleaned from smart environments.

The remainder of the paper is structured as follows: Sect. 2 reviews the NN approach in addition to PG algorithms. Section 3 presents an empirical study that analyzes PG algorithms in terms of their accuracy and reduction for the purpose of activity recognition based on four datasets using binary sensors within smart environments. Finally, in Sect. 4, Conclusions and Future Work are presented.

2 Prototype Generation Algorithms Designed for the NN Approach

In this Section, we present an overview of the NN approach and introduce the notion of PG algorithms.

2.1 Nearest Neighbor Approach

The NN approach [11] is one of the most successfully used techniques for classification and pattern recognition tasks. It is based on the concept of similarity [19] and the fact that patterns that are similar usually have the same class label. The method is categorized as lazy learning [20] given that it predicts the class label directly from the raw training samples.

In order to recognize an unseen sample, representative training samples are stored within the activity model. Each training sample, which is annotated with a class label, is essentially a vector in a multidimensional feature space. In the case of activity recognition, each feature corresponds to one sensor of the network. During the testing process, a non-annotated vector, i.e., a new sample, is classified. To do so, a parameter k is fixed, which means that the k training samples nearest to the new sample are used to classify it: the new sample is assigned the activity label that is most frequent among those k nearest training samples. When \(k=1\), the activity label of the non-annotated vector is simply the activity label of its single closest neighbour.
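The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the Hamming distance, the function names `hamming` and `knn_classify`, and the toy data are assumptions made for the example, since the text does not fix a particular distance measure.

```python
from collections import Counter

def hamming(a, b):
    """Number of sensors whose binary state differs between two samples."""
    return sum(x != y for x, y in zip(a, b))

def knn_classify(train, query, k=1):
    """Return the most frequent label among the k training samples nearest to query.

    train: list of (features, label) pairs; features is a tuple of 0/1 values.
    The distance measure (Hamming) is an assumption of this sketch.
    """
    # Linear search: one distance per stored sample, so cost grows as O(n * d).
    neighbours = sorted(train, key=lambda sample: hamming(sample[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy data, invented for illustration only.
train = [
    ((1, 1, 0, 0), "cook"),
    ((1, 0, 0, 0), "cook"),
    ((0, 0, 1, 1), "sleep"),
    ((0, 0, 0, 1), "sleep"),
]
print(knn_classify(train, (1, 1, 0, 1), k=1))  # prints: cook
```

Because every stored sample is visited for each query, the size of `train` directly drives both memory use and classification time.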

The NN approach is based on the similarity of its k closest neighbours and can attain good levels of performance [12]; however, it suffers from the following three weaknesses [16]:

  1. High storage requirements in order to retain the set of training samples.

  2. High computational complexity in order to search through the training samples and classify a new sample.

  3. Low tolerance to noise given that it considers all data relevant, even when the training set may contain incorrect data.

PG algorithms are a successful technique that has been shown to address these challenges. The following Section provides further details on PG algorithms and their use with the NN approach.

2.2 Prototype Generation Algorithms

PG algorithms are a form of data reduction technique [21] that aim to identify an optimal subset from the original training set, by discarding noisy and redundant examples and by modifying the value of some features of the samples to build new artificial samples that are known as prototypes [18].

Fig. 1. Reduction through usage of PG algorithms in the number of stored instances with the ability to reduce the computational complexity of the NN

PG algorithms are therefore designed to obtain a set of generated prototypes, TG, which is smaller than the original training set TR. The cardinality of TG is sufficiently small to reduce both the storage requirements and the computational complexity of the NN approach.

A wide range of PG algorithms have been designed for the NN approach to reduce the size of the dataset. Figure 1 illustrates the objective of these algorithms which have been categorized into a taxonomy based on the following four mechanisms of prototyping [17]:

  • Positioning adjustment [22–24]: This technique corrects the position of a subset of prototypes from the initial set by using an optimization procedure. New positions of a prototype are obtained by moving it in the multidimensional feature space, i.e., by adding or subtracting some quantities to or from its feature values.

  • Class relabeling [25, 26]: This generation mechanism consists of changing the class labels of samples from TR that are considered to contain errors and/or to belong to classes other than those with which they have been labeled.

  • Centroid based [27, 28]: These techniques are based on generating artificial prototypes by merging a set of similar examples. The merging process is usually made from the computation of averaged attribute values over a selected set, yielding the so-called centroids.

  • Space splitting [29, 30]: These techniques are based on different heuristics to partition the feature space, along with several mechanisms to define new prototypes. The idea consists of dividing TR into regions, which will be replaced with representative examples establishing the decision boundaries associated with the original TR.
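To make the centroid-based mechanism concrete, the following sketch merges all samples of each class into a single averaged prototype. It is a deliberately simplified illustration and not one of the algorithms evaluated in this paper: the function name `class_centroids`, the choice of one centroid per class, and the toy data are assumptions made for the example.

```python
from collections import defaultdict

def class_centroids(tr):
    """Merge all samples of each class into one averaged prototype.

    tr: list of (features, label) pairs (the original training set TR).
    Returns TG as a list of (centroid, label) pairs.
    One centroid per class is a simplifying assumption of this sketch.
    """
    groups = defaultdict(list)
    for features, label in tr:
        groups[label].append(features)
    tg = []
    for label, members in groups.items():
        # Average each attribute over the members of the class.
        centroid = tuple(sum(column) / len(members) for column in zip(*members))
        tg.append((centroid, label))
    return tg

# Toy data, invented for illustration only.
tr = [
    ((1, 1, 0, 0), "cook"),
    ((1, 0, 0, 0), "cook"),
    ((0, 0, 1, 1), "sleep"),
    ((0, 0, 0, 1), "sleep"),
]
tg = class_centroids(tr)
print(tg)
print(f"stored samples: {len(tr)} -> {len(tg)}")
```

Note that the resulting prototypes are artificial samples: their attribute values (for instance 0.5) need not occur in any original binary instance.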

The PG algorithms can be associated with four types of reduction [17].

  • Incremental: An incremental reduction starts with an empty reduced set TG or with only some representative prototypes from each class.

  • Decremental: The decremental reduction begins with \(TG = TR\), and then the algorithm starts reducing TG or modifying the prototypes in TG.

  • Fixed: The fixed reduction establishes the final number of prototypes for TG using a user-defined parameter related to the percentage of retention of TR.

  • Mixed: A mixed reduction starts with a pre-selected subset TG, and following this, additions, modifications and removals of prototypes are performed in TG.

3 Case Study

This Section details the evaluations undertaken to investigate how well the PG algorithms decrease the size of the dataset used by NN as a means of classification in the process of activity recognition.

3.1 Activity Recognition Datasets

The case study presented in this contribution uses four datasets collected from multiple smart environments all of which used binary sensors.

Each instance of the dataset is a vector with \(d+1\) components: the first d components correspond to the values of the d sensors involved in the smart environment, and the last component, \(d+1\), corresponds to the activity performed (class label). The value of a sensor is represented as a binary variable that takes the value 1 if the sensor had a change of state and 0 otherwise.
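As a brief illustration of this representation, the sketch below builds one instance from the set of sensors that changed state. The sensor names, the activity label and the helper `encode_instance` are hypothetical and serve only to show the \(d+1\) layout.

```python
def encode_instance(sensor_ids, fired, activity):
    """Build a (d+1)-component instance: d binary sensor values plus the label."""
    values = [1 if sensor in fired else 0 for sensor in sensor_ids]
    return values + [activity]

sensor_ids = ["kettle", "fridge", "bed", "tap"]  # hypothetical names, d = 4
instance = encode_instance(sensor_ids, {"kettle", "tap"}, "make_tea")
print(instance)  # [1, 0, 0, 1, 'make_tea']
```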

The four datasets are described below:

  • Casas [31]. This dataset was collected from the smart apartment test-bed of Washington State University. It contains 121 instances that were generated using 39 binary sensors with five types of activities.

  • ODI1 and ODI2 [32]. The ODI datasets were generated within the IE Sim intelligent environment simulation tool. The first ODI dataset contains 308 observations generated using 21 binary sensors with 11 types of activities. The second ODI dataset contains 616 observations that were also generated using 21 binary sensors, with the same 11 types of activities as ODI1.

  • VanKasteren [34]. This dataset was compiled in a house environment and contains 245 observations that were generated using 14 binary sensors with seven types of activities.

Depending on the person within the smart environment, the same activity may be performed in a number of different ways and over a range of durations. Thus, the sensor interactions can differ depending on how the activity is performed.

3.2 Evaluated PG Algorithms

Twelve PG algorithms, which are presented in Table 1, have been considered in order to identify the PG algorithms that are most suitable for reducing the computational complexity of binary sensor-based activity recognition.

Table 1. Taxonomy of PG algorithms evaluated

Each PG algorithm requires a set of parameters. In this contribution, each algorithm uses the parameter configuration proposed in [17], given the successful results previously achieved with it.

Given that the activities of the four datasets are annotated, we are able to evaluate the accuracy of the resulting subset of prototypes from each PG algorithm. In this way, classification performance is reported as the accuracy percentage obtained by the NN classifier with a given set of prototypes in the activity recognition process. Specifically, the accuracy percentage is defined as the proportion of true results among the total number of cases examined.

The evaluated PG algorithms, presented in Table 1, were run using Keel software [35], an open source Java software tool with evolutionary learning and soft computing based techniques for different kinds of data mining problems. To assess the performance of the PG algorithms, 10-fold cross-validation was used to evaluate the accuracy percentage of each PG algorithm. The main advantage of this validation is that all the samples in the dataset are eventually used for both training and testing [36]. With this approach, the particular way in which the data are divided matters less.
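The validation scheme can be sketched as follows. This is an illustrative Python outline, not the procedure implemented in Keel: the helper names `k_fold_indices` and `accuracy_percentage`, the random shuffling and the fixed seed are assumptions made for the example, and any stratification Keel may apply is not reproduced here.

```python
import random

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def accuracy_percentage(predicted, actual):
    """Percentage of test samples whose predicted label matches the annotation."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

splits = list(k_fold_indices(121))  # 121 = size of the Casas dataset
print(len(splits), [len(test) for _, test in splits])
```

In each fold, a PG algorithm would be applied to the training indices only, and the resulting prototypes would then classify the held-out test indices.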

3.3 Results

Table 2 presents the average results obtained by the set of PG algorithms evaluated over the four datasets with the NN approach with \(k=1\) (1NN). The value \(k=1\) is selected because the NN approach has a low tolerance to noise with this value.

Table 2. Results obtained of the PG algorithms with the four datasets

Table 2 indicates the accuracy percentage in addition to the percentage reduction in terms of computational complexity for each PG algorithm and dataset. The reduction represents the percentage of instances of the original training set that are no longer included in the activity model. For example, the ODI2 dataset contains 616 observations; if the training size is reduced by 95 %, the set of generated prototypes contains 31 samples, which will be included in the activity model of the NN approach. Furthermore, the average accuracy percentage and the average reduction percentage are indicated in Table 2.
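The arithmetic behind the ODI2 example can be checked with a short Python sketch. The helper names `prototypes_retained` and `reduction_percentage` are introduced here for illustration only, and the calculation uses the full dataset size of 616 as in the example above.

```python
def prototypes_retained(n_original, reduction_pct):
    """Number of prototypes left after removing reduction_pct % of the samples."""
    return round(n_original * (100 - reduction_pct) / 100)

def reduction_percentage(n_original, n_prototypes):
    """Percentage of the original samples that no longer need to be stored."""
    return 100.0 * (n_original - n_prototypes) / n_original

kept = prototypes_retained(616, 95)  # 616 * 0.05 = 30.8, rounded to 31
print(kept)  # 31
print(round(reduction_percentage(616, kept), 1))  # 95.0
```

With a linear search, each classification then requires 31 distance computations instead of 616.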

The PG algorithms reduced the size of the training data and, with it, the computational complexity of classifying a new activity in binary sensor-based activity recognition with a 1NN approach. The PG algorithms with a reduction of around 95 % significantly decreased the size of the initial training data. Because the cost of the linear search grows in proportion to the number of stored samples, the computational complexity of classifying a new activity is reduced in the same proportion, i.e., 95 %.

Nevertheless, for some PG algorithms, the reduction of the size of the training data implies a reduction in the accuracy percentage. The drop is pronounced for the following PG algorithms, whose average accuracy does not exceed 85 %: LVQ3, VQ, LVQTC, ENPC, GENN and MCA. In these cases, the PG algorithms generalize the inconsistency, incoherence or noise in the dataset, which leads to negative results when these PG algorithms are used to classify a new activity in binary sensor-based activity recognition with a 1NN approach.

Some PG algorithms perform excellently in terms of both accuracy percentage and reduction percentage of computational complexity in the four datasets. It is noteworthy that MixtGauss is the best PG algorithm, obtaining an average accuracy percentage of 95 % with an average reduction percentage of 95 %.

Analyzing the results, we can point out that the PG algorithms MixtGauss, PSCSA and LVQPRU obtain successful results in terms of accuracy percentage, with an average accuracy above 90 %. These PG algorithms are well suited to numerical datasets, especially binary ones such as those used in this case study, and offer an excellent reduction percentage without losing accuracy. Therefore, these PG algorithms can be deemed very useful for activity recognition, given that a reduction in the number of instances stored in the activity model corresponds to a reduction in the computational complexity.

4 Conclusions

This work has focused on identifying the most suitable prototype generation algorithms for binary sensor-based activity recognition with the NN approach, in order to reduce the computational complexity of the classification process so that it can be deployed on low-cost devices. Twelve PG algorithms have been evaluated with four activity datasets. Results from the evaluation demonstrated the ability of the MixtGauss, PSCSA and LVQPRU PG algorithms to provide good performance, with a percentage reduction of approximately 95 % and an average accuracy percentage higher than 90 %.