Introduction

Recently, the use of monitoring systems such as video surveillance has grown more prevalent worldwide [23]. While traditional methods allow the monitoring of different environments, e.g., stores, a street of a city, and a house, they generally need the presence of a person to verify if an unwanted event occurred. In residences, this complicates monitoring activities of daily living (ADL) and identifying possible health risks [3, 21, 35, 84]. Requiring a person makes the process expensive and extremely invasive [38, 66]. Thus, alternative methods for monitoring ADL are essential research areas.

The use of machine learning for monitoring is a valid alternative to traditional methods. In the last decade, many studies have presented different approaches to recognizing activities in a residence, including health risk detection. Some of these studies use machine learning, deep learning, or a statistical approach for activity recognition or for identifying falls [13, 28, 29, 46, 48, 54]. However, they generally depend on a dataset with labeled data. Due to the rarity of abnormal events, these datasets are usually incomplete, unbalanced, and mere simulations of real-world events [48, 87]. To avoid these dataset problems, a possible solution is to use unsupervised learning and an anomaly detection approach [31]. The main goal is to train the model to learn only the ordinary activities, allowing it to identify when an abnormal event occurs.

An anomaly can be defined as a random event that occurs rarely and is not expected [10]. In the context of machine learning, anomaly detection has synonyms such as outlier detection, novelty detection, and, more recently, deep anomaly detection. Chandola et al. [17] define anomaly detection as the search for a pattern in a dataset, and the identification of data that does not fit with the established metric. Anomaly detection is used in a variety of fields: for example, in networks to detect intrusions [12, 33, 41], in finance to detect possible fraud [2, 25, 43, 69], in medicine to identify health problems [65], and in security to detect uncommon events in surveillance systems [8, 39, 68, 71, 79]. Although this approach is common in the aforementioned areas, the concept is relatively new to smart houses [11]. Thus, only a few works are available; they identify assault, perform daily activity recognition [37], analyze time series to detect drastic variations in temperature [49], and detect falls [36, 56, 61].

Using anomaly detection solves the issue created by the use of labeled data to identify abnormal events, because the existing algorithms are capable of learning with an unbalanced dataset or a one-class dataset, for example. However, in some cases, it is necessary to extract features from the dataset to enhance the model, and manually selecting characteristics can be difficult and can result in a poor model. An alternative solution is the use of deep learning, which has the advantage of automatically learning features during the training process [64]. This avoids the use of handcrafted features, which is usually associated with domain-specific restrictions. Another relevant deep learning technique used in anomaly detection problems is the Autoencoder, which is capable of reducing data dimensionality and is commonly used in unsupervised models [19, 63].

In this paper, we systematically reviewed the field of anomaly detection related to health risks in the context of smart houses. We think that this is a relevant research area, in which the use of deep learning can contribute directly to improvement. Therefore, our primary purpose is to summarize the studies selected by the protocol’s inclusion criteria, conducting a brief analysis of each of them. Thus, it is possible to verify the number of papers that add value to the research area, using questionnaires to rank them. To convey the quality of this systematic review, all used methods are explained.

This paper is divided into five main sections. “Methods” presents the details of all quality gates used in the protocol to reach the results presented here. “Results” presents the results and a brief explanation of the anomaly detection techniques identified in the studies, focusing on their differences and highlights. “Discussions” discusses the results obtained in the previous sections. Finally, “Conclusions” presents this work’s conclusions, final discussions, and possible future work.

Related Works

There are several works geared toward health risks, anomaly detection, and deep learning, but none of them is specific to health risks in smart houses. Khan and Hoey [42] present a complete literature review on fall detection techniques and dataset availability. From those, the main approaches to this problem become clear, including machine learning, deep learning, and statistical methods. As for input data, there are works using wearable sensors, images, and video cameras in different environment types. Although the work of Khan and Hoey [42] is not specifically about anomaly detection in smart houses related to health risks, it provides important information on fall detection data that can be used outside its context. Additionally, because a fall can be considered an abnormal event and is also related to health risks, some of the selected works are specifically about fall problems.

Nweke et al. [60] also present a complete review of the literature on state-of-the-art deep learning for activity recognition using wearable sensors. This review is an in-depth study that includes an extensive introduction of the problem and discusses the different types of recognition used in each category. All of the possible types of techniques in this approach are described in detail, such as Restricted Boltzmann Machine, Autoencoder, Sparse Coding, CNN, and RNN. For each type of technique, the authors grouped the related papers and highlighted their advantages and disadvantages. However, the paper does not detail techniques of anomaly detection; the only information on anomaly detection is a mention of a related paper.

Another work, published by Mabrouk and Zagrouba [51], concerns the recognition of abnormal behavior in video surveillance systems. The study reviews techniques used to recognize activities in different environments (both internal and external) using a variety of machine learning models, including deep learning models. However, this review is not related to health risks, and although it explains anomaly detection problems in detail, the selected works do not match the protocol requirements established in this work.

Lastly, Chalapathy and Chawla [15] have written a relevant review on anomaly detection with deep learning, wherein a complete explanation regarding anomaly detection and its different types can be found, such as Novelty Detection, Binary Classification, Deep Anomaly, and others. Furthermore, there are sections on classifying the anomaly types, applications, and aspects related to anomaly detection. This survey is the most relevant to our work, but is not related to health risks in smart houses.

Based on the works found, the number of studies related to health risks in smart houses using anomaly detection is low. Most are treated as a fall detection problem and do not present the main concepts of an anomaly detection problem in a more general perspective.

Methods

This section describes this review’s methodology for identifying and selecting papers according to the inclusion (IC) and exclusion criteria (EC). Moreover, all issues analyzed in this systematic review are also outlined.

Study Metrics and Selection

This systematic review’s main objective is to “Identify the state-of-the-art of Anomaly Detection applied to health risks in smart houses”.

The research protocol is centered around the following main question: How can anomaly detection be applied in smart houses for health risk situations? The secondary questions are the following:

  • Why use deep learning to detect anomaly events?

  • How to identify regular events and anomaly events?

  • What are the current challenges in Anomaly Detection?

  • What data sources (RGB, accelerometer, thermostats) are used to detect anomalies in smart houses?

  • What types of anomalies can be detected through the use of deep learning techniques?

  • What are the main difficulties in implementing deep learning techniques for anomaly detection?

  • Which techniques were used to detect anomalies in smart houses?

  • Are there experiments using deep learning techniques to detect health risk anomalies in smart houses?

The adopted strategy was to select primary studies based on the sources of previous studies, keywords, and search period, and then on the defined inclusion and exclusion criteria. Thus, the systematic review was conducted using automatic searches in research databases, using keywords and time period constraints. The following review sources were selected: IEEE Xplore, MDPI, SpringerLink, and arXiv. They were selected for their scope and relevance for the research area covered by this review. Within them, the following keywords were used: [(Anomaly Detection) OR (Novelty Detection) OR (Outlier Detection)] AND [(Deep Learning) OR (Convolutional Neural Network) OR (CNN) OR (Long Short-Term Memory) OR (LSTM)] AND [(Smart Houses) OR (Health Risks) OR (ADL) OR (Activity of Daily Livings)]. The publication period investigated was 14 years, from 2009 to 2023 (June). The number of papers distributed per year and per database can be seen in Fig. 1. An exponential growth in the number of publications is observed in most of the databases considered.
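The search string above follows a simple pattern: terms are joined with OR inside each group, and the groups are joined with AND. A minimal sketch of how such a string could be assembled is shown below. The names `KEYWORD_GROUPS` and `build_query` are illustrative and do not come from the review protocol, and each database may require its own query syntax.

```python
# Keyword groups copied from the search string given in the text.
KEYWORD_GROUPS = [
    ["Anomaly Detection", "Novelty Detection", "Outlier Detection"],
    ["Deep Learning", "Convolutional Neural Network", "CNN",
     "Long Short-Term Memory", "LSTM"],
    ["Smart Houses", "Health Risks", "ADL", "Activity of Daily Livings"],
]


def build_query(groups):
    """Join terms with OR inside each group and join the groups with AND."""
    clauses = []
    for terms in groups:
        # Each term is wrapped in parentheses, each group in square brackets.
        clauses.append("[" + " OR ".join(f"({term})" for term in terms) + "]")
    return " AND ".join(clauses)


query = build_query(KEYWORD_GROUPS)
print(query)
```

Keeping the groups as data makes it easy to record exactly which terms were searched and to adapt the string to a database that uses different bracket or operator conventions.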

Fig. 1 Number of papers per year per database

Although this work covers the period between 2009 and 2023 (June), no papers were found before 2012. Furthermore, the papers published before 2017 represent only 9.8% of the total number of papers. Thus, it can be concluded that this is a relevant and very recent topic.

The process of selection occurred in three distinct phases. In the first phase, all papers returned by the query string and in the selected period were considered. Then, we applied the exclusion and inclusion criteria. In the last phase, the title and abstract were analyzed, and the unsuitable papers were excluded. This process is illustrated in Fig. 2.

Fig. 2 The flow of phases in the literature research process

As seen in Fig. 2, the number of papers was drastically reduced in relation to the initial number returned by the query string (1185). As explained previously, most of the papers in this area are recent and still have no citations, so they were excluded by the EC. Additionally, this is an in-depth review of a specific subject, so the final number of papers is smaller than is usual in other surveys.

Types of Studies and Criteria

This review considers papers that use deep learning and anomaly detection in the context of smart houses and that address health risk situations. Book chapters, review papers, and conference proceedings were excluded. To refine and select the results returned by the automatic search described in the previous section, the exclusion and inclusion criteria were used. First, we applied the EC (exclusion criteria), followed by the IC (inclusion criteria). The adopted criteria are listed below:

  • Inclusion Criteria

    • Works that include Anomaly Detection solutions using Deep Learning.

    • Works that are applied to smart houses.

    • Works that are applied to Anomaly Detection related to health risks.

  • Exclusion Criteria

    • Works that are not in the English language.

    • Poster, tutorial, editorial.

    • Research in areas not related to anomaly detection.

    • Research in areas not related to machine learning.

    • Research with methodological deficiencies.

    • Works that are not fully available.

    • Duplicate works.

    • Unfinished works.

    • Works that do not include computational techniques.

    • Works that do not have citations.

To validate the inclusion/exclusion criteria phases and guarantee the correct adoption of the IC and EC, two independent researchers performed this step. The obtained results were compared, and disagreements were discussed taking into consideration the aforementioned criteria. If the impasse was not solved, another researcher was consulted for the tie break.
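The agreement procedure described above can be written down as a small decision rule. The sketch below is only an illustration: the function name `resolve_decision` and its parameters are hypothetical, and decisions are simplified to booleans (True meaning the paper is kept).

```python
def resolve_decision(reviewer_a, reviewer_b, discussed=None, tie_breaker=None):
    """Return the final keep/discard decision for one paper.

    reviewer_a, reviewer_b: independent decisions (True = keep the paper).
    discussed: decision reached after discussion, or None if the impasse remains.
    tie_breaker: decision of the additional researcher, consulted only if needed.
    """
    if reviewer_a == reviewer_b:
        # Both researchers agree, so no discussion is needed.
        return reviewer_a
    if discussed is not None:
        # The disagreement was settled by discussing the criteria.
        return discussed
    if tie_breaker is None:
        raise ValueError("unresolved disagreement needs a tie-breaker decision")
    # The impasse remained, so the additional researcher decides.
    return tie_breaker


print(resolve_decision(True, True))
print(resolve_decision(True, False, discussed=False))
print(resolve_decision(True, False, tie_breaker=True))
```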

Quality Assessment

To assess the quality of the included studies, a questionnaire was developed based on the main and secondary questions posed in this paper. Thirteen questions were formulated; each had three possible answers with specific weights: “Yes”, “No”, and “Partly”, with values 1.0, 0.0, and 0.5, respectively.
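As an illustration of how the answer weights translate into a score, the sketch below sums the weights over the thirteen questions. Summation is an assumption here, as are the names `ANSWER_WEIGHTS`, `NUM_QUESTIONS`, and `quality_score`; the example answers are invented and do not correspond to any reviewed paper.

```python
# Answer weights as stated in the text.
ANSWER_WEIGHTS = {"Yes": 1.0, "No": 0.0, "Partly": 0.5}
NUM_QUESTIONS = 13


def quality_score(answers):
    """Sum the weights of the answers (assumed aggregation rule)."""
    if len(answers) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} answers, got {len(answers)}")
    return sum(ANSWER_WEIGHTS[answer] for answer in answers)


# Invented example: 8 "Yes", 3 "Partly", 2 "No".
example_answers = ["Yes"] * 8 + ["Partly"] * 3 + ["No"] * 2
example_score = quality_score(example_answers)
print(example_score)  # 8 * 1.0 + 3 * 0.5 + 2 * 0.0 = 9.5
```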

Results

In this section, we explain the adopted quality assessment. Furthermore, each of the selected papers is summarized in a brief overview. Finally, the results are presented based on the information gathered and the quality assessment carried out.

Quality Assessment

The metrics that define the score are instrumental in identifying the most significant studies. Thus, in this work, we elaborated thirteen questions that cover the general and secondary questions made in the systematic review.

Table 1 ID and related question

Some of the exclusion and inclusion criteria presented previously also support the quality assessment. The questionnaire is outlined in Table 1. Although these questions are not criteria of exclusion, they are instrumental to the systematic review. To guarantee the validity of the answers, three researchers answered all questions independently.

Selected Papers

This section presents a discussion on all selected papers after the execution of the protocol. We have created an overview of each of them, highlighting their advantages and disadvantages. The list of the selected papers by year and their reference ID can be seen in Tables 2, 3, 4, 5, 6, 7.

Table 2 List of the selected papers from 2017 with their IDs
Table 3 List of the selected papers from 2018 with their IDs
Table 4 List of the selected papers from 2019 with their IDs
Table 5 List of the selected papers from 2020 with their IDs
Table 6 List of the selected papers from 2021 with their IDs
Table 7 List of the selected papers from 2022 with their IDs

Final Results

After evaluating all of the selected papers, the three reviewers concurred on the answers to the questions posed in the quality questionnaire. The accepted papers are sorted according to their adherence to the search criteria of this literature review. The answers can be seen in Table 8, and the relation between IDs and questions in Table 1.

Table 8 Results of the review questions. Column ID refers to the paper ID in Tables 2, 3, 4, 5, 6, 7

As mentioned before, three researchers answered the questions, and when a divergence occurred, they discussed it until an agreement was reached. The results for the selected papers show that fourteen papers achieved gold quality (score greater than 8), three received silver quality (score between 7 and 8), and ten were of bronze quality (score less than 7).
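The quality tiers can be expressed as a small threshold function. In the sketch below, the function name `quality_tier` is illustrative, and treating scores of exactly 7 and exactly 8 as silver is an assumption, since the text only says "between 7 and 8". The counts are the ones reported above.

```python
def quality_tier(score):
    """Map a questionnaire score to a quality tier."""
    if score > 8:
        return "gold"
    if score >= 7:
        # Assumption: the silver interval includes both 7 and 8.
        return "silver"
    return "bronze"


# Tier counts as reported in the text.
tier_counts = {"gold": 14, "silver": 3, "bronze": 10}
total_papers = sum(tier_counts.values())

print(quality_tier(9.5), quality_tier(7.5), quality_tier(6.0))
print(total_papers)  # 14 + 3 + 10 = 27
```

The total of 27 agrees with the number of papers that matched the inclusion criteria.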

Datasets

In this Section, we have grouped all information on the used datasets for each selected paper. Table 9 shows the obtained results.

Table 9 Description of all datasets used by the selected papers

Discussions

There are many approaches to anomaly detection in machine learning. A list of all techniques described by Chandola et al. [17] and Chalapathy and Chawla [15] can be seen as follows:

  • Types of models found in the literature:

    1. Supervised Learning

    2. Unsupervised Learning

    3. Hybrid Models

    4. One-Class Neural Network

    5. Matrix Factorization

    6. Variational

    7. Generative Adversarial

    8. Autoencoder

    9. Reinforcement Learning

    10. Semi-supervised

    11. Statistical

Observing the selected papers, we can divide the types of technique used to detect anomalies into three different categories: Supervised, Unsupervised (including Autoencoder), and Hybrid (including One-Class). This division is detailed in Table 10. Furthermore, these techniques are described in the next sections.

Table 10 Characteristics present in the papers

Supervised Learning

The use of supervised learning related to anomaly detection is most common in identifying a specific type of anomaly in a limited context. As mentioned before in this paper, one of the difficulties in anomaly detection problems is related to labeled anomaly data. Therefore, this approach is limited to certain specific applications.

Although this technique is successfully applied in many scenarios, its use in anomaly detection problems poses some disadvantages, e.g., it needs a labeled dataset and only recognizes the anomalies it was trained on [15].

Concerning health risks in residences, the selected papers used this approach mainly to identify falls rather than other health risk problems. Thus, in more general scenarios, the use of supervised learning is a limiting factor. The advantages and disadvantages are as follows:

  • Advantages:

    • Generally, the results are more accurate in comparison with the semi-supervised or unsupervised methods

    • The test phase is faster

  • Disadvantages:

    • Fails to identify anomalous data in scenarios that are highly complex and non-linear.

    • Needs labeled data for both normal and anomalous instances

In the context of deep learning, supervised learning has two sub-networks, the first is necessary for feature extraction and the other for classification. Another important characteristic is that this approach needs a large training sample [15].
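The two-sub-network structure can be illustrated with a tiny forward pass in plain Python. This is a sketch under strong simplifications: each sub-network is a single layer, the weights are hypothetical fixed values rather than trained parameters, and in a real deep model both parts would be trained jointly on labeled data.

```python
import math


def feature_extractor(inputs, weights, biases):
    """First sub-network: one dense layer with ReLU, producing features."""
    features = []
    for row, bias in zip(weights, biases):
        activation = sum(w * x for w, x in zip(row, inputs)) + bias
        features.append(max(0.0, activation))  # ReLU
    return features


def classifier(features, weights, bias):
    """Second sub-network: logistic unit giving the anomaly probability."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical (untrained) parameters, chosen only to make the example concrete.
extractor_weights = [[1.0, 0.0], [0.0, -1.0]]
extractor_biases = [0.0, 0.0]
classifier_weights = [0.5, 0.5]
classifier_bias = -1.0

sample = [2.0, 3.0]
features = feature_extractor(sample, extractor_weights, extractor_biases)
probability = classifier(features, classifier_weights, classifier_bias)
print(features, probability)
```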

Unsupervised Learning

In the context of unsupervised learning for anomaly detection, we have observed that anomaly detection problems usually do not use a dataset with labeled anomaly data. This approach is essential in many models for different applications.

This approach has been successful in identifying network intrusions [33], detecting fraud [69], and identifying events in a time series [28]. However, for the selected papers in this review, we have identified the use of this approach only in datasets that represent sequential data, e.g., sensors combined with statistical methods.

Unsupervised learning can also be used in a hybrid scenario, allowing the model to identify the anomalies after another model extracted features from the input data.

Although this approach presents significant results in some house-related applications, its use in the identified papers was limited to sequential data, not complex scenarios or computer vision problems. The use of statistical models is also limited in complex unstructured data. We can see the advantages and disadvantages in the following:

  • Advantages

    • Is able to separate normal data from anomalous points

    • Does not require labeled data to identify outliers.

  • Disadvantages

    • Presents more challenges to work in a complex and high dimensional space

    • Generally, presents lower accuracy in comparison to supervised methods.
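To make the idea of unsupervised detection on sequential sensor data concrete, the following sketch flags readings that deviate strongly from the recent past, without using any labels. It is a generic rolling z-score detector, not a method taken from the selected papers. The function name, the window size, the threshold, the `min_sigma` floor, and the temperature readings are all illustrative assumptions.

```python
from statistics import mean, pstdev


def rolling_zscore_anomalies(readings, window=5, threshold=3.0, min_sigma=0.1):
    """Return indices of readings far from the mean of the preceding window."""
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        # Floor the deviation so a perfectly flat history does not divide by zero.
        sigma = max(pstdev(history), min_sigma)
        if abs(readings[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Invented room temperatures (degrees Celsius) with one abrupt jump at index 6.
temperatures = [21.0, 21.2, 20.9, 21.1, 21.0, 21.1, 27.5, 21.0, 20.9]
flagged = rolling_zscore_anomalies(temperatures)
print(flagged)  # [6]
```

Note that once an anomalous reading enters the window it inflates the estimated deviation for the following readings, which is one reason simple statistical detectors can struggle in more complex scenarios.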

Hybrid

Hybrid models allow us to combine the advantages of supervised learning with some of the advantages of the unsupervised models. It is possible, for example, to use an autoencoder or a model such as a CNN for feature extraction and then process the result with a One-Class model. Thus, in this scenario, we have a model capable of identifying anomalies in a high dimensional space with non-linear data and without labeled anomaly data.

Additionally, using this approach we are able to construct robust models to detect anomalies, including state-of-the-art in some cases [24, 78]. Specifically for the One-Class Network Models, an advantage we have is not needing labeled anomaly data to train the model. Thus, with only the normal class, the model is able to recognize it and signal when a different class was detected.
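The hybrid pipeline can be sketched in two stages: a feature extractor followed by a one-class model fitted only on normal data. The code below is a deliberately simple stand-in and not a One-Class neural network: `extract_features` uses hand-written statistics where a CNN or an autoencoder would be used in practice, and `CentroidOneClass` is a hypothetical distance-to-centroid model. The accelerometer-like values and the `margin` parameter are invented for illustration.

```python
import math


def extract_features(window):
    """Stand-in for a learned feature extractor (e.g., a CNN or autoencoder)."""
    average = sum(window) / len(window)
    spread = max(window) - min(window)
    return (average, spread)


class CentroidOneClass:
    """One-class model: normal data lies within a radius around its centroid."""

    def fit(self, feature_vectors, margin=1.5):
        count = len(feature_vectors)
        dims = len(feature_vectors[0])
        self.center = tuple(
            sum(vector[d] for vector in feature_vectors) / count
            for d in range(dims)
        )
        # The radius is the largest training distance, widened by a margin.
        self.radius = margin * max(
            math.dist(vector, self.center) for vector in feature_vectors
        )
        return self

    def is_anomaly(self, vector):
        return math.dist(vector, self.center) > self.radius


# Only normal windows are used for training; no anomaly labels are needed.
normal_windows = [
    [1.0, 1.1, 0.9, 1.0],
    [1.0, 1.2, 0.8, 1.0],
    [1.1, 1.0, 1.0, 0.9],
    [0.9, 1.0, 1.1, 1.2],
]
model = CentroidOneClass().fit([extract_features(w) for w in normal_windows])

calm_window = [1.0, 1.1, 1.0, 0.9]
abrupt_window = [1.0, 3.2, 0.2, 1.0]  # invented fall-like spike
print(model.is_anomaly(extract_features(calm_window)))
print(model.is_anomaly(extract_features(abrupt_window)))
```

Separating the two stages means the feature extractor can be swapped for a learned model without changing the one-class decision rule.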

These models present interesting results, and in the study proposed by Chalapathy et al. [16], the results are better than the state-of-the-art. We can see the advantages and disadvantages in the following:

  • Advantages

    • Is able to separate normal data from anomalous points

    • It is possible to combine feature extraction and unsupervised models

  • Disadvantages

    • More complex compared with traditional models

Discussion

The topic explored in this systematic review proved to be a recent subject (no papers were found before 2012, and none from 2023 were included after the EC and IC were applied) that is also underexplored. The criteria selected only 316 papers, and only 27 of them matched the IC (after the EC). Thus, intense research was performed on the selected papers.

There are two reasons for the low number of returned papers. Firstly, one of the exclusion criteria is that the paper must have citations, and many recent papers did not. Secondly, the adopted criteria are very specific, and thus only a few papers match the criteria.

The presented results suggest that some of the studies showed limitations and do not explore the field of anomaly detection appropriately. However, fourteen of the returned papers presented gold quality (score greater than 8), as they cover all the approaches proposed in this review. Another three received silver quality (score between 7 and 8), i.e., they satisfy most of the criteria.

Regarding the datasets, we can group them into two categories: images and sensors. We observed that when image datasets were used for anomaly detection, a supervised learning technique was most common. On the other hand, when a sensor dataset was used for anomaly detection, unsupervised learning was usually adopted.

Conclusions

This review aimed to identify studies related to monitoring people in a home environment using anomaly detection to detect health risks. Health risks can represent serious injuries for people who live alone, especially for the elderly population. Thus, an automatic system that does not need labeled data to identify these risks is relevant in this context.

This review identified relevant papers on monitoring smart houses related to health risks. The selected works show the problems and difficulties in adopting an anomaly detection system. Additionally, most of the works address the fall detection problem or apply a deep learning approach to the anomaly detection problem. However, many works presented limitations and did not cover all of the points expected in this review. Among the selected papers, seventeen are highly significant for researchers interested in anomaly detection within video for the detection of health risks. One interesting finding is the perceived exponential growth of interest in the topic in recent years.

This paper contributed to the discussion of good use of anomaly detection. The existing gaps in the research topic were covered by this study, supporting future works in the area.