1 Introduction

In this era of Industry 4.0, where technologies are modernizing and advancing at a rapid pace, people are surrounded by technology at all times. The essence of these technology-laden pervasive and ubiquitous environments lies in effective communication and coordination between their components for sharing, tracking, monitoring and analyzing user data to facilitate collaborative learning, collaborative knowledge and collaborative discovery [1]. Semantic understanding of human behavior during complex activities [2] has immense potential for addressing this challenge. However, recent works [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25] in the field of activity recognition have several limitations. The primary limitation is their reliance on discrete approaches – for instance video-based recognition, machine learning or probability theory – for developing activity recognition models, which makes integration and communication of these systems with other ‘smart’ environment parameters a challenge. There is a need for an activity recognition approach that can be seamlessly understood by, and integrated with, other systems and technologies in an Internet of Things (IoT)-based environment, so that the immense potential of activity-centric computing can be utilized for real-time development of various applications, for instance behavior interventions, task recommendations, rehabilitation systems, training of experts for skilled professions and behavior monitoring. This serves as the main motivation for this work.

2 Literature Review

This section outlines the recent works in this field. A hierarchical framework for activity recognition using multiple sensors was developed by Azkune et al. [3]. A system consisting of multiple layers, each with a distinct functionality for studying and analyzing video data for activity recognition, was proposed by Cheng et al. [4]. An infrared sensor powered system using artificial neural networks for activity recognition was developed by Skocir et al. [5]; the system was able to detect different kinds of enter and exit events in an IoT-based smart home environment. An environment-specific task recommendation system was proposed by Doryab et al. [6] to improve the performance of medical practitioners in hospital and similar environments. Using wireless sensor technology, Abascal et al. [7] developed an indoor navigation system to assist users in performing different activities. A smart home monitoring system was developed by Chan et al. [8] to monitor the behaviors of elderly people as they carried out different activities. An architecture for an intelligent kitchen environment was proposed by Yared et al. [9] with the aim of reducing accidents in the kitchen area. Deen [10] developed a physiological signal tracking system to monitor users and their performance during different activities. The work done by Civitarese et al. [11] involved a system that tracked user interactions to evaluate whether the user had successfully completed the given activity. Iglesias et al. [12] developed a health information monitoring system to monitor the health status of users in the context of ADLs. In a similar work by Angelini et al. [13], the authors developed a smart bracelet to collect data about the health of its user as well as remind them of their routine medications and daily tasks. Khosla et al. [14] developed an assistive robot to help elderly people perform their daily routine tasks. A similar work was done by Sarkar [15], where an intelligent robot called ‘Nursebot’ included a scheduler for routine medications. The works of Thakur et al. [16,17,18,19,20,21,22,23,24,25] have also contributed extensively to this field.

3 Proposed Work

The work presented in this section extends one of the previous works [26] in this field – ‘A Complex Activity Recognition Algorithm’ (CARALGO). The steps to develop this framework comprise the following:

  1. Identify the macro and micro level tasks associated with the given complex activity.

  2. Identify the environment parameters or context attributes on which these tasks need to be performed for successful completion of the activity.

  3. Track the user’s behavior to study when the complex activity started and when it ended – this includes tracking (i) when the user successfully reached the goal and (ii) when it was a false start.

  4. Record the most important context attribute [25] and the action or actions performed by the user on it.

  5. Study the user’s behavior associated with performing each of the atomic activities while interacting with the context attributes:

     a. Develop a methodology to represent the skeletal tracking of the user using Microsoft Kinect sensors.

     b. Identify specific feature points on the skeletal tracking.

     c. Define these feature points according to their locations and functions.

     d. Analyze characteristics of these feature points – which include joint distance, joint movements, joint angle and joint rotation speed (a sketch of these computations follows this list).

     e. Record the sequence in which the characteristics of these feature points change or influence one another.

  6. Compile the sequence of the characteristics of human behavior from (3) in the CARALGO analysis of the complex activity.
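
To make step 5(d) concrete, the sketch below shows how the named characteristics could be computed from 3-D joint coordinates. The function names, units (metres, degrees) and NumPy-based implementation are our illustrative assumptions, not part of CARALGO.

```python
import numpy as np

def joint_distance(p, q):
    """Euclidean distance (in metres) between two 3-D joint positions."""
    return float(np.linalg.norm(np.subtract(p, q)))

def joint_angle(a, b, c):
    """Angle (in degrees) at joint b, formed by the segments b->a and b->c."""
    u = np.subtract(a, b)
    v = np.subtract(c, b)
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

def joint_rotation_speed(angle_prev, angle_curr, dt):
    """Angular change per second between two frames recorded dt seconds apart."""
    return (angle_curr - angle_prev) / dt
```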

Microsoft Kinect sensors provide real-time skeletal tracking of any user, with up to 20 joint points on the skeleton [27]. To define these joint points according to their associated movements and functions, we follow [28]. This definition is presented below (Fig. 1):

Fig. 1. Definition and representation of feature points in skeletal tracking [28]
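
As an illustrative (not prescriptive) data structure for step 5(a), each Kinect frame can be represented as a mapping from a joint point index (1–20, numbered as in Fig. 1) to its 3-D coordinates; the type names and the loader below are our assumptions.

```python
from typing import Dict, List, Tuple

# One skeletal frame: joint point index (1-20, as in Fig. 1) -> (x, y, z)
# coordinates reported by the Kinect skeletal tracker.
SkeletalFrame = Dict[int, Tuple[float, float, float]]

# A tracked activity is a time-ordered sequence of such frames.
SkeletalSequence = List[SkeletalFrame]

def build_frame(raw_joints) -> SkeletalFrame:
    """Build one frame from raw (index, x, y, z) tuples emitted by the sensor."""
    return {idx: (x, y, z) for idx, x, y, z in raw_joints}
```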

Before applying this framework to the analysis of a full complex activity, we present an example of studying a specific macro-level task. For instance, if the user has to answer the phone, then in this context the most important context attribute is the phone. The behavior associated with interacting with the phone involves bringing the phone close to the ear – that is, the distances between the joint point pairs (6,4), (7,4), (8,4), (6,3), (7,3), (8,3) or (10,4), (11,4), (12,4), (10,3), (11,3), (12,3) decrease.
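
A minimal sketch of detecting this decreasing-distance pattern over a sequence of skeletal frames is given below; the tolerance value and helper names are illustrative assumptions, and `frame` follows the mapping representation sketched above.

```python
import numpy as np

# Joint point pairs (from the example above) whose mutual distances should
# decrease as the phone is brought to the ear; one set per arm.
RIGHT_ARM_PAIRS = [(6, 4), (7, 4), (8, 4), (6, 3), (7, 3), (8, 3)]
LEFT_ARM_PAIRS = [(10, 4), (11, 4), (12, 4), (10, 3), (11, 3), (12, 3)]

def pair_distances(frame, pairs):
    """Distances for the given joint point pairs in one skeletal frame."""
    return [float(np.linalg.norm(np.subtract(frame[i], frame[j])))
            for i, j in pairs]

def distances_decreasing(frames, pairs, tol=0.01):
    """True if every pair distance is non-increasing (within tol metres)
    across the time-ordered skeletal frames."""
    dists = [pair_distances(f, pairs) for f in frames]
    return all(c <= p + tol
               for prev, curr in zip(dists, dists[1:])
               for p, c in zip(prev, curr))

# 'Answering the phone' holds if either arm shows the pattern:
# answered = (distances_decreasing(frames, RIGHT_ARM_PAIRS)
#             or distances_decreasing(frames, LEFT_ARM_PAIRS))
```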

4 Results and Implementation

To evaluate the efficacy of this framework, a dataset was developed based on the work done by Ordóñez et al. [29]. That work involved recording different ADLs performed by a user in the premises of a smart home: a smart IoT-based environment comprising multiple sensors used to sense ADLs, recorded 24 h a day over a period of 22 days. The ADLs in this dataset consisted of the complex activities of Sleeping, Showering, Eating Breakfast, Leaving for Work, Eating Lunch, Eating Snacks and Watching TV in Spare Time. Multiple complex activities from this dataset were analyzed according to this framework, and the analysis of one of them – ‘Eating Lunch’ – is presented here. In this representation, by ‘change’ we mean a change in distance between the joint point pairs:

As can be observed from Table 1, not all atomic activities are associated with joint point pairs experiencing a change. For certain atomic activities, for instance At6, several joint point pairs experience a change, and which pairs change also depends on user diversity – for instance, whether the user is left-handed or right-handed. It is important to track and analyze the joint points that undergo a change and also to study the sequence in which multiple joint points undergo this change, because any atomic activity can be broken down into sub-atomic activities that must follow a specific sequence, as outlined in [23]. This analysis of the changes experienced by joint points can be seamlessly communicated to other systems or technology-laden environments in the given IoT space to facilitate collaborative learning and discovery of human behavior.
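
As an illustration of this sequence analysis (with the distance-change threshold and helper names being our assumptions), one could record the order in which joint point pairs first change relative to the initial frame:

```python
import numpy as np

def change_sequence(frames, pairs, threshold=0.05):
    """Return (frame index, joint point pair) tuples in the order in which
    each pair first changes its distance by more than `threshold` metres
    relative to the first frame. `frames` is a time-ordered list of
    {joint point index: (x, y, z)} mappings."""
    dist = lambda f, p: float(np.linalg.norm(np.subtract(f[p[0]], f[p[1]])))
    baseline = {p: dist(frames[0], p) for p in pairs}
    sequence, seen = [], set()
    for t, frame in enumerate(frames[1:], start=1):
        for p in pairs:
            if p not in seen and abs(dist(frame, p) - baseline[p]) > threshold:
                sequence.append((t, p))  # pair p first changed at frame t
                seen.add(p)
    return sequence
```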

Table 1. Analysis of the complex activity ‘Eating Lunch’ using this framework

5 Conclusion and Future Work

The proposed framework for developing a language to define human behavior has several characteristics: (1) it helps to study the macro-level tasks associated with any complex activity, (2) it provides a definition of different feature points for skeletal tracking, (3) it presents a method to study, track and analyze the changes in the characteristic features of these feature points with changing human behavior and (4) it discusses the relevance of tracking the sequence of these changes for each atomic activity associated with the given complex activity. To the best of the authors’ knowledge, no similar approach has been proposed in this field yet. The results presented uphold the relevance and potential of this approach for defining human behavior in future IoT-based smart home environments, for improving the quality of life and user experience during ADLs. Future work would involve implementing this framework in real time in an IoT setting.