
1 Introduction

Non-invasive activity recognition (NAR) is an ambient intelligence technology that recognizes activities from non-invasive sensors without affecting the living conditions of residents. NAR has important applications in the smart home field, where it can help understand individual behavior, group behavior, and the interaction between people and the environment. Over the past decade, most NAR methods have been based on pattern recognition and have evolved from static to dynamic algorithms and from simple feature representations to multi-level deep knowledge mining [1]; examples include support vector machines [2], naive Bayes [3], hidden Markov models [4], latent-dynamic conditional random fields [5], and deep learning [6]. Past research has shown that dynamic activity recognition algorithms and deep knowledge mining algorithms help improve the accuracy of NAR [5, 7]. Based on these considerations, this paper enhances dynamic activity recognition algorithms for NAR by mining the latent knowledge that exists in sensors and activities.

On the one hand, with the development of low-cost sensors and wireless sensor networks, various passive sensors have recently been used to recognize activities [8, 9] in non-invasive smart homes. For example, motion sensors installed in the floor can capture human motion, and RFID tags attached to objects can capture human-environment interaction. However, the dimension of the observed features grows with the number of sensors: the more features are modeled, the higher the computational complexity, while modeling too few features is often insufficient to ensure recognition accuracy. Besides, the activity observation feature, which is composed of sensor observations, is often abundant and sometimes redundant. Thus, feature selection directly affects the performance of the recognition model. Principal component analysis (PCA) [10] has been tested for feature generation, but it requires choosing an appropriate number of principal components, and this choice affects the result. Fortunately, there is a lot of latent knowledge in sensor networks; for example, multiple sensors are often related to only one activity. Mining this latent knowledge between sensors and activities can help NAR.

On the other hand, activity recognition concerns not only the activities of a single person but also those of multiple residents. Multi-resident activity recognition (MRAR) is more difficult because the residents' activities interfere with each other. In a smart home with non-obtrusive sensors, MRAR often relies on data association [4, 11], which assigns each sensor reading to the person who triggered the sensor or changed its value. To improve MRAR accuracy, dynamic Bayesian networks such as CHMM and FCRF are often used to model the interaction process [12, 13]. However, data associations are often unknown and hard to obtain in ubiquitous sensor environments. Besides, for multiple residents in a smart home with non-obtrusive sensors, who triggered a sensor is often ambiguous, and there are no strong underlying data associations to exploit. If the data association is incorrect, the MRAR result will be correspondingly inaccurate. It would therefore be interesting to find an MRAR method that does not rely on data association. Fortunately, there is some latent knowledge that is often invariant in a multi-resident environment. For instance, there are global features and trends: two residents may play chess collaboratively, and only one person can use the computer at a time since there is only one computer. Such latent knowledge is often easy to represent in a multi-resident environment, and mining it well can improve multi-resident activity recognition.

The paper is organized as follows. Section 41.2 introduces an activity recognition method that extracts latent knowledge between sensors and activities. Section 41.3 proposes a new multi-resident activity recognition method based on latent knowledge in multi-resident activities. Section 41.4 presents the validation. Finally, Sect. 41.5 draws conclusions.

2 Latent Knowledge Between Sensors and Activities

Motivated by the relationship between sensors and activities, this section combines the multiple features that relate to only one activity into a single feature and uses CRF to recognize activities in smart homes. To describe our method, we start by analyzing the relationship between sensor data and activities.

The activity observation feature vector at time t is often denoted as \( \mathbf{x}_t = (x_t^1, x_t^2, \ldots, x_t^N) \), where N is the dimension of the observation feature. If each sensor is treated as one observation feature, the dimension of the observation feature vector equals the total number of sensors. However, it is common that some sensors are related to only one activity while others are related to more than one activity. Figure 41.1 illustrates the relationship between sensors and activities: sensors related to only one activity are denoted by “●,” sensors related to two activities by “■,” and sensors related to more than two activities by “▲.” When the state of a sensor that relates to only one activity (and to no other) changes, it is easy to deduce that the related activity, rather than any other, is being carried out. It is also common that several sensors relate to only one activity and to no other (the “●” symbols in the dashed circle); when one or several of those sensors change state, we can deduce that the related activity is being carried out.

Fig. 41.1 Relationship between sensors and activities

If we regard the relationship between sensors and activities as latent knowledge and treat the sensor observations that relate to only one activity (and to no other) as one combined observation feature, we can reduce the observation feature vector from \( (x_t^1, x_t^2, \ldots, x_t^N) \) to \( (x_t^1, x_t^2, \ldots, x_t^L) \), where N and L are the observation feature dimensions before and after feature combining. The feature combining method is shown in Algorithm 1; a code sketch follows the listing.

Algorithm 1. Feature combining method

Input: observation features x0: \( \{ \mathbf{x}_t = (x_t^1, x_t^2, \ldots, x_t^N),\; t = 1, \ldots, T0 \} \), and activity labels y0: \( \{ y_t,\; t = 1, \ldots, T0 \} \)

Output: combined features X0: \( \{ \mathbf{x}_t = (x_t^1, x_t^2, \ldots, x_t^L),\; t = 1, \ldots, T0 \} \)

1. Find the number of activities Na = max{y0} and the number of sensors Ns = N;

2. Find the sensors related to activity i and put them in the sensor set Si = {si}, i = 1, …, Na;

3. For every si in Si, put it into the set Ci if si does not appear in any other set Sj, j ≠ i, i, j = 1, …, Na;

4. Combine the sensors in each Ci into one combined feature;

5. For every observation feature xt in x0, replace the features whose corresponding sensors are in Ci with the combined feature, i = 1, …, Na;

6. Denote the updated x0 as X0;

7. Return X0.
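To make Algorithm 1 concrete, the following is a minimal Python sketch of the feature combining step. It assumes binary sensor observations and takes a sensor to be “related to” an activity if the sensor ever fires while that activity is labeled; this notion of relatedness, the OR-combination of exclusive sensors, and the function name are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def combine_features(x0, y0):
    """Minimal sketch of Algorithm 1 (names illustrative).

    x0 : (T0, N) array of binary sensor observations, one column per sensor
    y0 : (T0,) array of activity labels in {1, ..., Na}
    Returns the combined (T0, L) feature matrix X0 with L <= N.
    """
    x0, y0 = np.asarray(x0), np.asarray(y0)
    n_sensors = x0.shape[1]
    activities = np.unique(y0)

    # Step 2: a sensor is taken as "related to" activity i if it ever fires
    # while i is labeled -- an assumption, the paper does not fix this notion.
    related = {a: {s for s in range(n_sensors) if x0[y0 == a, s].any()}
               for a in activities}

    # Step 3: C_i keeps the sensors exclusive to activity i.
    exclusive = {a: {s for s in related[a]
                     if all(s not in related[b] for b in activities if b != a)}
                 for a in activities}

    # Steps 4-5: merge each exclusive group into one OR-ed feature and keep
    # the remaining (shared) sensors unchanged.
    merged, used = [], set()
    for a in activities:
        if exclusive[a]:
            merged.append(x0[:, sorted(exclusive[a])].any(axis=1).astype(x0.dtype))
            used |= exclusive[a]
    shared = [x0[:, s] for s in range(n_sensors) if s not in used]
    return np.column_stack(merged + shared)  # Steps 6-7: the updated X0
```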

3 Latent Knowledge in Multi-resident Activities

MRAR is the task of inferring multi-resident activities from observations. A multi-resident activity sequence is often denoted as {y1, y2, …, yT} and the observations as {x1, x2, …, xT}. For t = 1, 2, …, T, yt represents the multi-resident activities at time t, and xt represents the sensor observation vector at time t. Both yt and xt are multi-dimensional variables: the dimension of yt is the number of residents, and the dimension of xt is the number of observation features. MRAR with machine learning methods usually needs empirical data to train a recognition model, where the empirical data serve as training samples {(x1, y1), (x2, y2), …, (xT0, yT0)}. To better illustrate the problem, we give a multi-resident scenario below.

Scenario: Two residents (ID = 1 and ID = 2) randomly perform three daily activities in one smart home. The three activities are labeled 1, 2, and 3, and the label is 0 when a resident does not perform any activity. Assume that A = {(x1, y1), (x2, y2), …, (x7, y7)} is the empirical data collected while the two residents perform activities, where y1 = (0, 0), y2 = (1, 0), y3 = (1, 0), y4 = (3, 0), y5 = (1, 1), y6 = (1, 1), and y7 = (3, 3). In this case, \( y_t = (y_t^1, y_t^2) \) is two-dimensional, where \( y_t^i,\; i = 1, 2 \), represents the activity label of the ith resident at time t.

With four activity labels (0, 1, 2, and 3) for two residents, we can theoretically obtain 4 × 4 = 16 different label vectors {(0, 0), (0, 1), …, (3, 3)}. However, due to resident preferences and to mutually exclusive or independent activities, some of these vectors are never observed in practice. Here, only five distinct label vectors appear among the seven collected samples.

To represent prior knowledge, some terms will be introduced below.

A single label y often has multiple possible values; we denote each possible value as a state and the set of all possible values as the state set. MRAR can then be seen as a multi-label state labeling problem. For m residents, \( (y_t^1, y_t^2, \ldots, y_t^m) \) has many possible values, since different residents may perform different activities.

Here, we use the State Event \( (y_t^1, y_t^2, \ldots, y_t^m) \) to represent the activities of multiple users at the same time, the State Event Set A to represent the distinct values of State Events, and the State Event Matrix M to represent the values of the State Events at times 1 to T. The State Event Set is denoted as

$$ A = \{ (y_1^1, y_1^2, \ldots, y_1^m), (y_2^1, y_2^2, \ldots, y_2^m), \ldots, (y_K^1, y_K^2, \ldots, y_K^m) \} $$

where K is the number of State Events.

The State Event Matrix is given by

$$ \mathbf{M} = \begin{bmatrix} y_1^1 & y_1^2 & \ldots & y_1^m \\ y_2^1 & y_2^2 & \ldots & y_2^m \\ \vdots & \vdots & \ddots & \vdots \\ y_T^1 & y_T^2 & \cdots & y_T^m \end{bmatrix} $$

which includes T State Events.

For the above multi-resident scenario, the State Event Set is

$$ \mathbf{A1} = \{ (0, 0), (1, 0), (3, 0), (1, 1), (3, 3) \} $$

The State Event Matrix can be denoted as

$$ \mathbf{M1} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 1 & 0 \\ 3 & 0 \\ 1 & 1 \\ 1 & 1 \\ 3 & 3 \end{bmatrix} $$

Note that two State Events in M may be the same, whereas any two State Events in A are different, and every State Event in M can be found in A. From M1, we can see that State Events 2 and 3 are the same, State Events 5 and 6 are the same, and all State Events can be found in A1.

By representing each State Event \( (y_t^1, y_t^2, \ldots, y_t^m) \) with one unique combined label C, we obtain the combined label state set \( \mathbf{B} = \{0, 1, \ldots, K-1\} \). The map between a State Event \( (y_k^1, y_k^2, \ldots, y_k^m) \) and a combined label state \( C_k \in \mathbf{B} \) is defined as

$$ (y_k^1, y_k^2, \ldots, y_k^m) \xrightarrow{f} C_k $$

For the State Event Set \( \mathbf{A1} = \{ (0,0), (1,0), (3,0), (1,1), (3,3) \} \), there is a combined label state set \( \mathbf{B1} = \{0, 1, 2, 3, 4\} \). The mapping is defined as

$$ (0,0) \xrightarrow{f} 0, \quad (1,0) \xrightarrow{f} 1, \quad (3,0) \xrightarrow{f} 2, \quad (1,1) \xrightarrow{f} 3, \quad (3,3) \xrightarrow{f} 4 $$

Similarly, the mapping between M1 and the combined label sequence over B1 is as follows:

$$ \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 1 & 0 \\ 3 & 0 \\ 1 & 1 \\ 1 & 1 \\ 3 & 3 \end{bmatrix} \xrightarrow{f} \begin{bmatrix} 0 \\ 1 \\ 1 \\ 2 \\ 3 \\ 3 \\ 4 \end{bmatrix} $$

The states of the single activities \( y_t^i \) can be recovered from a combined label \( C \in \{0, 1, \ldots, K-1\} \) by the inverse mapping \( f^{-1} \). In the two-resident scenario, for C = 1 the inverse mapping gives \( y^1 = 1, y^2 = 0 \), and for C = 3 it gives \( y^1 = 1, y^2 = 1 \).
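The mapping f and its inverse can be built directly by enumerating the distinct rows of the State Event Matrix. The following sketch uses first-appearance order, which reproduces the A1/B1 mapping above; the dictionary-based encoding is an illustrative choice, not mandated by the paper.

```python
def build_mapping(state_event_matrix):
    """Build f (State Event -> combined label) and its inverse f_inv,
    enumerating distinct State Events in first-appearance order."""
    f, f_inv = {}, {}
    for row in state_event_matrix:
        event = tuple(row)
        if event not in f:
            c = len(f)                 # next unused combined label in B
            f[event], f_inv[c] = c, event
    return f, f_inv

# The scenario's State Event Matrix M1 reproduces the mapping shown above.
M1 = [(0, 0), (1, 0), (1, 0), (3, 0), (1, 1), (1, 1), (3, 3)]
f, f_inv = build_mapping(M1)
combined = [f[tuple(r)] for r in M1]   # [0, 1, 1, 2, 3, 3, 4]
assert f_inv[3] == (1, 1)              # inverse mapping recovers (y1, y2)
```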

The algorithm for extracting latent knowledge in multi-resident activities is given in Figs. 41.2 and 41.3. The former is the model building flowchart, from which we obtain the State Event Set A, the mapping f, the combined label state set B, and the combined label recognition model. The latter is the activity recognition flowchart, which shows that new multi-resident activities are recognized in two steps: first, recognize the state of the combined label C; then, inverse-map C to a State Event by \( f^{-1} \). Finally, the multi-resident activities are read off from the State Event. A code sketch of this flow follows the figures.

Fig. 41.2 Model building flowchart

Fig. 41.3 Activity recognizing flowchart
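The two flowcharts can be summarized in code as follows. This is a hedged sketch only: a generic per-sample classifier (scikit-learn's LogisticRegression) stands in for the sequential HMM/CRF models actually used in the paper, and it reuses build_mapping from the earlier sketch.

```python
from sklearn.linear_model import LogisticRegression  # stand-in classifier

def train_model(x_train, state_event_matrix):
    # Model building (Fig. 41.2): extract A, f, B, then fit the recognizer
    # on combined labels instead of per-resident labels.
    f, f_inv = build_mapping(state_event_matrix)
    c_train = [f[tuple(r)] for r in state_event_matrix]
    model = LogisticRegression(max_iter=1000).fit(x_train, c_train)
    return model, f_inv

def recognize(model, f_inv, x_test):
    # Activity recognition (Fig. 41.3), step 1: predict combined labels C.
    c_pred = model.predict(x_test)
    # Step 2: inverse-map each C to a State Event, one label per resident.
    return [f_inv[int(c)] for c in c_pred]
```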

Note that the latent knowledge extraction algorithm does not use data association when recognizing multi-resident activities. However, if there is a need (e.g., for tracking a resident), it can also recover the data association. For C = 1, the inverse mapping gives y1 = 1, y2 = 0, so the sensor data can be attributed to the first resident, since y2 = 0 means the second resident is not carrying out any activity and is assumed not to trigger any sensors.

The algorithm can also handle some uncertain multi-resident activity patterns. Consider the two-resident activity label (A1, A2), where A1 is the activity performed by the first resident and A2 the activity performed by the second. If A1 or A2 equals ‘0,’ the corresponding resident is performing an unknown activity, which can be any activity. For two residents with N total activities, we have

$$ \begin{aligned} (0, A2) &= (1, A2) \vee (2, A2) \vee \cdots \vee (N, A2) \\ (A1, 0) &= (A1, 1) \vee (A1, 2) \vee \cdots \vee (A1, N) \\ (0, 0) &= (0, A2) \vee (A1, 0) \end{aligned} $$

The unknown state is thus a union of all possible activities, so the algorithm can handle some uncertainty and can improve activity recognition. A small sketch of this expansion is given below.
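The following sketch expands an unknown ‘0’ label into the union of concrete activity pairs, assuming N = 3 activities as in the scenario above; the function name expand is hypothetical.

```python
N_ACT = 3  # total number of activities N (assumed, as in the scenario)

def expand(label_pair):
    """Return the set of concrete (A1, A2) pairs a label pair may denote,
    treating '0' as the union of all possible activities."""
    a1, a2 = label_pair
    opts1 = range(1, N_ACT + 1) if a1 == 0 else [a1]
    opts2 = range(1, N_ACT + 1) if a2 == 0 else [a2]
    return {(i, j) for i in opts1 for j in opts2}

print(expand((0, 2)))        # {(1, 2), (2, 2), (3, 2)}
print(len(expand((0, 0))))   # 9: the union over both residents
```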

4 Validation

4.1 Validation 1

To validate our feature combining method, two experiments are given. For each experiment, we introduce the dataset, its activities, and the sensor features, and then carry out the experiment. To measure the percentage of correctly classified testing samples, we define the recognition accuracy as

$$ \text{Accuracy} = \frac{\sum_{n=1}^{N} \left[ \text{inferred}(n) = \text{true}(n) \right]}{N} $$
(1)

where N is the total number of testing samples.

In addition, to verify the recognition of a single class, we also give the recognition accuracy of individual activities as

$$ \text{Accuracy}_c = \frac{\sum_{n=1}^{N_c} \left[ \text{inferred}_c(n) = \text{true}_c(n) \right]}{N_c} $$
(2)

where Nc is the total number of samples contained in class c.
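Both measures translate directly into code. A minimal NumPy sketch, with the Iverson bracket realized as an elementwise comparison (function names are illustrative):

```python
import numpy as np

def accuracy(inferred, true):
    """Eq. (1): fraction of correctly classified testing samples."""
    inferred, true = np.asarray(inferred), np.asarray(true)
    return np.mean(inferred == true)

def class_accuracy(inferred, true, c):
    """Eq. (2): accuracy restricted to the Nc samples of class c."""
    inferred, true = np.asarray(inferred), np.asarray(true)
    mask = true == c
    return np.mean(inferred[mask] == true[mask])
```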

Experiment 1

The first experiment is based on the “ADL adlnormal” dataset collected in the WSU apartment testbed [14]. The dataset contains five daily activities, and the apartment is equipped with various non-invasive sensors.

The raw sensor number, the cleaned sensor number, and the final sensor number after sensor combining are shown in Table 41.1. As shown, the sensor number decreases noticeably after sensor cleaning, whereas the final sensor number after sensor combining does not decrease much. The numbers of involved sensors before and after sensor combining for the five activities are given in Table 41.2.

Table 41.1 Sensor number changes after cleaning and combining
Table 41.2 Involved sensor number changes before and after sensor combining for five individual activities

After sensor cleaning and combining, we take every sensor as a feature and recognize the activities in this dataset using CRF with threefold cross-validation. To validate our sensor combining solution, we compare the results with those obtained before sensor cleaning and combining. Table 41.3 gives the recognition accuracy and the total time used for training and testing with the raw sensors and with the final sensors. From the table, we can see that the recognition accuracy increases after sensor cleaning and combining, whereas the time used is reduced.

Table 41.3 Recognition accuracy and the total time changes

The recognition accuracies for the five individual activities with the raw sensors and with the sensors after cleaning and combining are shown in Fig. 41.4. The recognition accuracies of all the activities increase after sensor cleaning and combining, because our method not only reduces the parameter count but also avoids the errors caused by redundant information.

Fig. 41.4 Recognition accuracies for five individual activities

Experiment 2

The second experiment focuses on routine morning activities collected in a kitchen outfitted with 60 RFID tags [8]. In the kitchen, 11 routine morning activities are performed by a user in different ways, and the RFID tags are used to collect sensor data. The tagged objects include a bowl, coffee container, cupboard, dishwasher, door, drawer, egg carton, hand soap, kettle, cooking spoon, stove control, telephone, and so on.

Before combining the sensors, we first remove the uninvolved sensors, since some sensors may not be involved in any activity. The raw sensor number, the cleaned sensor number, and the final sensor number after combining are shown in Table 41.4. From the table, we can see that all 60 sensors are involved in the activities and that the total sensor number decreases noticeably after combining.

Table 41.4 Sensor number changes after cleaning and combining

We also give the numbers of involved sensors before and after combining for the 11 individual activities in Table 41.5, where Ai, i = 1, …, 11, denotes the ith activity. The table shows that the sensors in activities 4, 5, 6, 8, and 10 are combined, and thus the numbers of involved sensors decrease.

Table 41.5 Involved sensor number before and after combining for 11 individual activities

After sensor combining, we take every sensor as a feature. To validate the algorithm for extracting latent knowledge between sensors and activities, we recognize activities using CRF with leave-one-out cross-validation. We also compare the results with a PCA method that extracts 35 principal components as features. Table 41.6 gives the recognition accuracy and the total time used for training and testing with the different methods. From the table, we can see that the recognition accuracy increases after sensor combining, whereas the time used is reduced. Although it is based on features of the same dimension, PCA obtains worse results than the algorithm extracting latent knowledge between sensors and activities, while using much more time for training and testing.

Table 41.6 Recognition accuracy and the total time changes

The recognition accuracies for the 11 individual activities are shown in Fig. 41.5. The recognition accuracies of activities 4, 8, and 10 increase after sensor combining. This is because our method not only reduces the parameter count but also avoids the errors caused by redundant information.

Fig. 41.5 Recognition accuracies for 11 individual activities

4.2 Validation 2

We validate our algorithm for exploiting latent knowledge of multi-resident activities on the multi-resident activity dataset [4] collected in the CASAS project. The dataset contains two residents and 15 activities.

The multi-resident activity State Events and their frequencies F are shown in Table 41.7.

Table 41.7 Occurrence counts of different State Events

As Table 41.7 shows, some State Events occur frequently, while others appear rarely or never occur at all. For two residents with 16 labels (activity 0 represents a resident performing an unknown activity), there are 16 × 16 = 256 State Events theoretically, but in this case only 27 State Events appear, since some State Events never actually occur; a small counting sketch is given below. To validate our algorithm, one experiment is given.
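Such occurrence counts can be obtained with a simple counter. The sketch below uses the toy labels of the earlier scenario as a stand-in for the CASAS label data (an assumption); on the real two-column label matrix it would yield the 27 distinct State Events of Table 41.7.

```python
from collections import Counter

# Toy stand-in for the CASAS State Event labels (an assumption).
labels = [(0, 0), (1, 0), (1, 0), (3, 0), (1, 1), (1, 1), (3, 3)]
freq = Counter(labels)
print(len(freq))           # number of State Events actually observed
print(freq.most_common())  # frequent vs. rare State Events, cf. Table 41.7
```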

Experiment 3

This experiment is carried out with threefold cross-validation. In the training stage, we first build the mapping f and the inverse mapping \( f^{-1} \) and map the State Event Matrix of the training data to a combined label state sequence. Then a dynamic activity recognition algorithm, such as HMM, CRF, or latent-dynamic conditional random fields (LDCRF) [4], is trained with the observation sequences and the combined label state sequence.

In the testing stage, we first estimate the combined label states with the trained model and the observation sequences of the test dataset. The average accuracies of the combined label state for HMM with latent knowledge (LK-HMM), CRF with latent knowledge (LK-CRF), and LDCRF with latent knowledge (LK-LDCRF) are 65.46, 67.61, and 63.87%, respectively.

It is important to note that the above is not the ultimate accuracy of MRAR. To obtain the multi-resident activities for the test dataset, we need to map the combined label states to the State Event Matrix with \( f^{-1} \). Figure 41.6 gives the average MRAR accuracies of the five models. LK-HMM obtains 75.77%, LK-CRF obtains 75.38%, and LK-LDCRF obtains 72.69%, all of which are higher than the average MRAR accuracies of a single HMM and of iterative CRF, neither of which is given the data association for MRAR. Thus, mining latent knowledge can help MRAR.

Fig. 41.6 Average recognition accuracies of five models in recognizing activities for multi-residents

There are several reasons why LK-HMM, LK-CRF, and LK-LDCRF outperform a single HMM and iterative CRF. First, when a single HMM is used for both residents, it cannot simultaneously represent the transitions between activities and between residents well. Second, although iterative CRF does not need to be given the data associations, it still needs to learn them for MRAR; when the data association is learned badly, the MRAR accuracy is low. In addition, LK-HMM, LK-CRF, and LK-LDCRF can mine knowledge in the multi-resident environment and can capture global features and trends of multi-resident activities.

LK-LDCRF obtains lower accuracy than LK-CRF, although LDCRF obtains higher accuracy than CRF in single-user activity recognition [4]. This is because there are many combined label states with different internal structures, and a fixed number of hidden states for LK-LDCRF cannot easily adapt to all of them. The number of hidden states is therefore difficult to determine, and an unsuitable choice lowers the accuracy of LK-LDCRF. In the future, we will study how to choose a suitable number of hidden states for LDCRF in the multi-resident environment and compare the result with our method.

5 Conclusions

This paper recognizes activities using latent knowledge that exists in the training samples. First, it gives a new pretreatment method for activity recognition by extracting latent knowledge between sensors and activities; then, it proposes a new multi-resident activity recognition algorithm by extracting latent knowledge in multi-resident activities. From the experiments, we conclude that extracting latent knowledge can greatly enhance activity recognition.