1 Introduction

According to the United Nations, the elderly population is expected to increase by more than 30% by 2050 in 64 countries (de la Concepción et al. 2017). The World Health Organization reported that about 28% of people aged 65 and about 32% of people aged 70 fall each year (de la Concepción et al. 2017; Luque et al. 2014). Due to the shortage of nursing homes, more elderly people are required to stay at home (Garripoli et al. 2015). Elderly people who live alone may be unable to alert anyone for help after a fall if they are seriously injured or unconscious (de la Concepción et al. 2017). A fall is defined as “an event which results in a person coming to rest unintentionally on the ground or other lower level, not as a result of a major intrinsic event (such as a stroke) or overwhelming hazard” (Pannurat et al. 2017). A fall can occur within one second, and usually takes between 0.45 and 0.85 s; during a fall, the posture and shape of the person change (Yang et al. 2016b). These changes are of great importance when detecting a fall (Yang et al. 2016b). The risks of falling can be divided into two categories, namely extrinsic and intrinsic risks (De Backere et al. 2015; Özdemir and Barshan 2014). Extrinsic risks are related to environmental factors such as drug usage, slippery floors, poor lighting, loose carpets, unstable furniture, clutter, and obstructed paths, whereas intrinsic risks are related to the characteristics of the person, such as age, general clinical condition, mental impairment, sedentary behaviour, and impaired mobility and gait due to reduced muscle strength (De Backere et al. 2015; Yang et al. 2015; Özdemir and Barshan 2014). Extrinsic factors can be prevented by taking precautions, whereas intrinsic factors cannot be prevented (Özdemir and Barshan 2014). 
Factors that contribute to the increase in the rate of falls are increasing age, mortality, morbidity, disability, and frailty (Khan and Hoey 2017; Jian and Chen 2015). A fall can result in serious damage to a person's health, fear of falling (FOF), loss of independence, loss of social contact, reduced movement, and decreased productivity, which increases the risk of another fall (De Backere et al. 2015; Luque et al. 2014; Khan and Hoey 2017; Pannurat et al. 2017; Sabatini et al. 2016). This can result in a loss of self-confidence, which can lead to social isolation and a lower quality of life (Luque et al. 2014). FOF is linked to an increase in neuroticism and anxiety, which results in elderly people avoiding physical activities (Luque et al. 2014). The biggest danger of falling is the “long-lie” condition, where the fall victim is unable to stand up after a fall and remains on the ground for hours (Bagalà et al. 2012). A long lie can result in dehydration, internal bleeding, and physiological and psychological consequences, depending on the seriousness of the injury; half of the people who experience a long lie die within 6 months (De Backere et al. 2015; Igual et al. 2015; Khan et al. 2015; Principi et al. 2016; van de Ven et al. 2015; Zigel et al. 2009). With a fall detection system, the psychological stress and the severity of head trauma during epileptic seizures can be reduced, and the cost of treatment is also reduced drastically (Gibson et al. 2016; Yang et al. 2016b). Falls increase economic costs, which imposes a burden on the health-care system (Sabatini et al. 2016). A lot of research has been done on fall detection systems since the 1990s (Daher et al. 2016).

Monitoring at health care facilities is expensive, even though its main purpose is to detect whether a person has fallen. Fall monitoring systems therefore play an important role in society: they allow people to be monitored from the comfort of their homes or anywhere else, resulting in large savings and eliminating the need for 24/7 nursing to monitor the person (Khan et al. 2015; Gibson et al. 2016; Gupta et al. 2016; Wannenburg and Malekian 2015). The first fall detection systems were devices with a button, known as user-activated devices or personal alarm systems (PAS), which were usually worn as a wrist band or necklace and required the user to be conscious after a fall in order to press the button and alert emergency personnel (De Backere et al. 2015; Garripoli et al. 2015; Ozcan et al. 2017). The problems with push buttons were that they could not be pressed if the user had lost consciousness or was confused due to panic, the button could be pushed accidentally, and the device might not have been worn by the user during a fall (De Backere et al. 2015; Bosch-Jorge et al. 2014; Zigel et al. 2009). An automatic real-time activity recognition device that can successfully discriminate between activities of daily living (ADL) and fall activities is therefore required. ADLs comprise a wide set of actions characterizing the habits of people, especially in their living places, e.g. walking, sitting, and standing (Andò et al. 2015). The fall detection devices that are available on the market are not satisfactory: they produce many false alarms, have high maintenance costs, and are not ergonomic (Özdemir and Barshan 2014). A fall detection system needs to detect falls quickly to reduce impact and recovery time, and should inform others quickly to reduce the time people remain on the floor and to limit any resulting injuries (Ozcan et al. 2017; Ozcan and Velipasalar 2016; Pannurat et al. 2017). 
A precise, robust, and reliable fall detection system is required for elderly people living independently, thus reducing the risks of living alone (Daher et al. 2016; Ozcan et al. 2017; Ozcan and Velipasalar 2016). There is no standard method for fall detection in terms of which types of sensors to use, which features to extract, and which machine learning algorithm performs best (Khan and Hoey 2017). The following is expected from a fall detection system: no intrusion on the user's privacy, no restrictions on the user's independence, and no degradation of the user's quality of life (Özdemir and Barshan 2014).

Several published surveys cover some aspects of the fall detection model. In Luque et al. (2014), an overview of wearable sensors is provided; in particular, fall detection systems that incorporate smartphones are covered. The study also uses an experimental testbed to analyse the performance of different threshold-based fall detection algorithms that make use of accelerometer sensors (Luque et al. 2014). The results from the testbed indicate that accelerometer techniques for identifying falls are strongly influenced by the fall patterns; the tests also show that it is difficult to set an acceleration threshold that achieves high accuracy (Luque et al. 2014). In Delahoz and Labrador (2014), a detailed comparison of wearable fall detection devices and fall prevention systems is provided. This includes the different sensors for detecting falls, and the challenges and design issues faced are discussed. A short analysis of camera-based and ambient sensing is provided (Delahoz and Labrador 2014). The general learning models employed in wearable systems are explained and the most popular supervised machine learning algorithms are analysed (Delahoz and Labrador 2014). A three-level taxonomy that describes the risk factors associated with falls is proposed (Delahoz and Labrador 2014).

In Perry et al. (2009), an overview and a comparison are provided of the different systems that use acceleration methods, methods that combine acceleration with other methods, and methods that do not use acceleration. In Igual et al. (2013), a detailed review of context-aware systems and wearable accelerometer fall detection studies is provided, which includes comparisons between the different studies. The challenges in the design of fall detection systems, the issues that affect the systems' performance, and the present and future trends of fall detection systems are identified (Igual et al. 2013). Due to the lack of fall data, the problem cannot be solved by using supervised machine learning algorithms (Khan and Hoey 2017). In Khan and Hoey (2017), a taxonomy is proposed for sufficient, insufficient, and no training data on falls. A comprehensive overview is given of the different techniques that can be applied when fall data are sufficient and when fall data are lacking. A review of camera-based and wearable studies for anomaly detection is provided (Khan and Hoey 2017).

This study provides an updated overview of the different types of fall detection systems and the problems associated with each of them. It provides a more in-depth analysis of the different categories than previously published reviews. The need for personalized systems is also investigated, together with how such systems can solve the key problems faced in fall detection systems.

2 Model of a fall detection system

Fig. 1 shows the most common fall detection model, which is used in many studies when designing a fall detection system. The model comprises the following parts, which are discussed below: data collection, feature extraction, feature selection, classifier, and evaluation.

Fig. 1 Fall detection classifier model

2.1 Data collection

Fall detection starts with the collection of data from sensors. These can be wearable, ambient, or camera-based sensors, and are discussed in more detail in Sect. 3.

2.2 Feature extraction

Feature extraction is the process of finding significant attributes in the raw data, much of which carries little meaningful information; it plays a vital part in determining the accuracy of the fall detection system (Delahoz and Labrador 2014; Wannenburg and Malekian 2016; Yang et al. 2016b). Fall detection systems require distinctive features to represent the different activities and need to be able to distinguish falls from ADLs (Ma et al. 2014). There are different features, each having characteristics relevant to the specific ADL or fall activity being performed (Wannenburg and Malekian 2016). Features can be grouped into two categories, namely time-based and frequency-based features (Wannenburg and Malekian 2016). In wearable devices, the most popular features are the acceleration magnitude of the accelerometer and the angular magnitude of the gyroscope (Delahoz and Labrador 2014). In camera-based systems, the aspect ratio is the most common feature, whereas in Doppler and acoustic devices the Mel-frequency cepstral coefficient (MFCC) features are the most popular. Many features are calculated using statistical measures such as the median, maximum, minimum, and variance (Wannenburg and Malekian 2016). Special attention should be paid when selecting features so as to produce a small, descriptive dataset (Delahoz and Labrador 2014). The descriptive power of the dataset is affected by the number of features it comprises (Delahoz and Labrador 2014). Features are extracted from the data using a sliding window method (Wannenburg and Malekian 2016).
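As a minimal sketch of sliding-window extraction of statistical features, the following Python example computes the minimum, maximum, median, and variance for each window of a one-dimensional signal. The function name, window size, step, and sample values are hypothetical and chosen only for illustration; they are not taken from the cited studies.

```python
import statistics


def sliding_window_features(signal, window_size, step):
    """Return one dictionary of statistical features per window."""
    features = []
    # Slide a fixed-size window over the signal, advancing by `step` samples.
    for start in range(0, len(signal) - window_size + 1, step):
        window = signal[start:start + window_size]
        features.append({
            "min": min(window),
            "max": max(window),
            "median": statistics.median(window),
            "variance": statistics.pvariance(window),
        })
    return features


# Hypothetical acceleration magnitudes (in g), for illustration only.
samples = [1.0, 1.0, 1.1, 0.9, 3.2, 0.4, 1.0, 1.0]
feats = sliding_window_features(samples, window_size=4, step=2)
```

With a window of four samples and a step of two, consecutive windows overlap by half, so a short event such as the 3.2 g peak appears in more than one window.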

2.3 Feature selection

The more features a dataset has, the more descriptive it becomes, but it also becomes more difficult to find meaningful relationships among the classes as the feature space grows exponentially; the performance of the machine learning algorithm also depends on the feature space (Daher et al. 2016; Delahoz and Labrador 2014; Wannenburg and Malekian 2016; Yang et al. 2016b). By finding the features that describe the data better and discarding redundant features, computational speed and prediction accuracy can be improved (Daher et al. 2016; Delahoz and Labrador 2014). The method of selecting features from an N-dimensional feature space is known as feature selection (Zigel et al. 2009). Feature selection algorithms are used to detect and discard features that contribute little to the performance of the classifier (Wannenburg and Malekian 2016). Feature selection provides the following advantages: it reduces the cost of the pattern recognition process, reduces the size of the dataset, and provides better accuracy (Daher et al. 2016; Zigel et al. 2009). There are two categories of feature selection methods, namely filter methods and wrapper methods. Filter methods, or ranking methods, make use of search algorithms to score the different features and rank them from best to worst (Delahoz and Labrador 2014; Wannenburg and Malekian 2016). Filter methods make use of statistical tests such as the t-test, F-test, and chi-squared test. Wrapper methods take combinations of different features and compare these combinations based on the classifier results, so that the classifier is part of the selection process (Delahoz and Labrador 2014; Wannenburg and Malekian 2016). The combination of features that provides the most accurate classification model is chosen (Delahoz and Labrador 2014; Wannenburg and Malekian 2016). 
The disadvantage of wrapper methods is that they require a huge amount of processing power and are very time consuming (Wannenburg and Malekian 2016). As an alternative to selecting features, all the extracted features can be combined to create new features using principal component analysis (PCA). PCA is an unsupervised linear transformation method and a useful variable reduction procedure that is widely adopted in many fields; it is a common technique for identifying patterns in high-dimensional data and expressing the data in such a way as to highlight their similarities and differences (Andò et al. 2015; de la Concepción et al. 2017). A PCA algorithm provides an orthogonal transformation of a large feature space into a new set of linearly uncorrelated variables called principal components, which results in a significantly smaller feature space and reduced dimensionality (Andò et al. 2015; de la Concepción et al. 2017).
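The projection onto a principal component can be illustrated with a short Python sketch. It estimates only the first principal component, using power iteration on the sample covariance matrix, which is one of several ways to compute it and is not claimed to be the procedure used in the cited studies. The function name, the iteration count, and the two-feature data are assumptions made for illustration.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the first principal component by power iteration.

    Returns the unit-length component and the projection (score)
    of each centred row onto it.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centred data (d x d).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary non-zero starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centred]
    return v, scores


# Hypothetical two-feature dataset in which the second feature is
# roughly twice the first, so one component captures most variance.
rows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
component, scores = first_principal_component(rows)
```

Here the two correlated features are replaced by a single score per sample, which is the dimensionality reduction described above.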

2.4 Classifiers

Fall detection classifiers can be divided into two groups: threshold-based (or rule-based) methods and machine learning algorithms (Luque et al. 2014; Gibson et al. 2016; Pannurat et al. 2017).

2.4.1 Threshold or rule-based

The most popular classification method used in fall detection studies is the threshold analytical method (Khan and Hoey 2017; Zhang et al. 2017). The basic principle of the threshold analytical method is that a possible fall can be detected by comparing the value captured by the sensor with a reference value (Khan and Hoey 2017; Zhang et al. 2017). A threshold method can be viewed as a flowchart in which a condition is tested at each node and each outcome leads to a branch. Fall detection systems that make use of accelerometer sensors use a threshold parameter, such as the absolute acceleration magnitude or the wavelet acceleration sum-vector, and compare it with a predefined value to detect falls (de la Concepción et al. 2017). The predefined value is calculated and determined from a fall signal (de la Concepción et al. 2017; Gibson et al. 2016). Fall training data are required to compute the threshold value using domain knowledge or data analysis techniques (Khan and Hoey 2017). The advantages of threshold methods are that they are easy to implement and have low power and computational requirements (Andò et al. 2015; Luque et al. 2014; Zhang et al. 2017). The problems with threshold systems are that they have limited recognition ability, are not precise enough, and make it difficult to determine the predefined value; they also produce high false alarm rates for activities such as running or jumping, which results in low accuracy (de la Concepción et al. 2017; Yang et al. 2015; Zhang et al. 2017). The performance of fall detection methods is affected by the selection of the fall indicators and detection thresholds (Hu and Qu 2014). The low accuracy of threshold methods has led researchers to focus more on machine learning classifiers, which achieve higher accuracies.
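A threshold detector of this kind can be sketched in a few lines of Python. The rule used here to derive the reference value (the smallest peak observed in labelled fall recordings) is an assumption made for illustration, as are the function names and the sample magnitudes; the cited studies may derive their thresholds differently.

```python
def threshold_from_falls(fall_recordings):
    """Derive a reference value from labelled fall recordings.

    Assumption: use the smallest peak magnitude seen across the falls.
    """
    return min(max(recording) for recording in fall_recordings)


def detect_fall(magnitudes, threshold):
    """Flag a possible fall if any sample reaches the reference value."""
    return any(m >= threshold for m in magnitudes)


# Hypothetical acceleration magnitudes (in g), for illustration only.
fall_recordings = [[1.0, 2.9, 0.5], [1.1, 3.4, 0.3]]
threshold = threshold_from_falls(fall_recordings)

walking = [1.0, 1.2, 0.9, 1.1]
jumping = [1.0, 3.1, 0.8, 1.0]

walking_flagged = detect_fall(walking, threshold)
jumping_flagged = detect_fall(jumping, threshold)
```

In this example the walking recording is not flagged, whereas the jumping recording exceeds the reference value and is flagged as a fall, which illustrates the false alarms that a single threshold can produce for vigorous ADLs.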

2.4.2 Machine learning

Machine learning classifiers have obtained better performance than threshold classifiers when using an accelerometer sensor (Gibson et al. 2016). Machine learning algorithms are more complex to implement than threshold methods; they base their decisions on posture calculation, which results in higher fall detection rates (Zhang et al. 2017). The advantages of machine learning algorithms are that they can be customized to different falls, achieve high accuracy compared with threshold methods, manage anomalies (such as noise and incompleteness) well, and can detect patterns in signals (Pannurat et al. 2017; Zhang et al. 2017). The disadvantages of machine learning algorithms are that they require huge amounts of representative training data, are complex, and require heavy processing (Jin et al. 2016; Naranjo-Hernandez et al. 2012; Pannurat et al. 2017; Zhang et al. 2017). Machine learning algorithms can be divided into two groups: supervised and unsupervised learning algorithms.

2.4.2.1 Supervised

Supervised learning algorithms make use of labelled data for training the system, and the outputs of the system are controlled (Delahoz and Labrador 2014; Wannenburg and Malekian 2016). Certain classifiers can perform better on certain activities (Wannenburg and Malekian 2016). Classifiers can be combined, for example as voting machines or comparator machines (Gibson et al. 2016). A hybrid framework that makes use of both threshold-based and machine learning algorithms is implemented in Pannurat et al. (2017). Popular supervised machine learning algorithms include naive Bayes, k-nearest neighbour, support vector machine, hidden Markov model, and artificial neural network.

The k-nearest neighbour (k-NN) classifier, also known as a lazy learner, classifies a new feature vector based on the classes of the training feature vectors closest to it (Delahoz and Labrador 2014; Jian and Chen 2015; Özdemir and Barshan 2014). Each time a new feature vector is presented to the classifier, it is compared with all the training feature vectors in terms of Euclidean distance. The training vectors at the shortest distances are then identified, and the new feature vector is assigned to the class to which most of them belong (Delahoz and Labrador 2014; Gibson et al. 2016; Jian and Chen 2015). The value k determines the number of nearest neighbours that take part in this decision. Special attention should be paid when determining the value of k: if a small k is selected, the variance increases and the results are less stable, whereas a large k increases the bias, which reduces the sensitivity (Özdemir and Barshan 2014). The disadvantage of this classifier is that the time complexity increases as the amount of training data increases.
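As an illustrative sketch of a k-NN classifier with Euclidean distance and majority voting, consider the following Python example. The function name, the two-dimensional feature vectors, and the labels are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train, query, k):
    """Classify `query` by majority vote among its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Every training vector is compared with the query, which is why
    # the cost grows with the size of the training set.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors (e.g. peak magnitude, variance).
train = [
    ([1.0, 0.1], "ADL"),
    ([1.1, 0.2], "ADL"),
    ([0.9, 0.1], "ADL"),
    ([3.0, 1.5], "fall"),
    ([3.2, 1.4], "fall"),
]

label_a = knn_classify(train, [2.9, 1.3], k=3)
label_b = knn_classify(train, [1.0, 0.15], k=3)
```

An odd k is used with two classes so that the vote cannot end in a tie.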

The support vector machine (SVM) uses a kernel trick: it transforms the inputs, which are the extracted features, into a higher-dimensional space using a non-linear mapping, in which an optimum hyperplane separating the two classes of a given training dataset is found (Aslan et al. 2015; Igual et al. 2015; Kau and Chen 2015). The basic idea is to find a separating hyperplane that corresponds to the largest possible margin between the points of the different classes (Kwolek and Kepski 2014). The hyperplane separates the two classes by creating a decision boundary (the maximum margin hyperplane) (Delahoz and Labrador 2014). The separating hyperplane is optimized by maximizing the distance between the hyperplane and the nearest data points (Aslan et al. 2015; Kwolek and Kepski 2014). The maximum margin hyperplane is learnt from the support vectors, and the classifier uses it to classify new feature vectors (Delahoz and Labrador 2014; Igual et al. 2015).

A hidden Markov model (HMM) is a statistical Markov model made up of a number of states. A typical model for a fall detection system is a continuous HMM, where each state is connected to one or more states. An HMM consists of the following parts: a transition probability distribution matrix, which gives the probability of one state reaching another state in a single step; an observation symbol probability distribution matrix, which determines the output of a state based on the input feature; and an initial state distribution, which determines the initial state. The system is trained using the Baum-Welch training algorithm, and the class is determined using the Viterbi algorithm (Popescu et al. 2012). The disadvantage of an HMM is that it is computationally expensive and requires many model parameters (Pannurat et al. 2017).

2.4.2.2 Unsupervised

Unsupervised learning algorithms make use of unlabelled data for training the system (Delahoz and Labrador 2014). This type of learning algorithm can be trained on only fall data or only non-fall data (Khan and Hoey 2017). The classifier can be trained with new activities on the fly. Popular unsupervised classifiers include the one-class support vector machine and nearest-neighbour.

The one-class support vector machine (OCSVM) converts the data to a feature space that is surrounded by a hypersphere, and it searches for the appropriate hyperplane that splits a portion of the input data from the rest of the data by the sign of the distance to the hyperplane (f(C) > 0 or f(C) < 0) (Medrano et al. 2016; Yang et al. 2016a; Yu et al. 2013). The classifier uses the hyperplane as a decision boundary to classify the binary data (Yang et al. 2016a). The advantage of OCSVM is that it describes the data in a flexible way, since it does not require the data to follow a certain distribution (Khan et al. 2015; Yu et al. 2013).

Nearest-neighbour (NN) is a data-driven method that is simply a k-NN classifier with k equal to 1 (Igual et al. 2015; Li et al. 2012). The basic concept of NN is to allocate the incoming record to the class that has the record closest to the incoming record (Li et al. 2012). The Euclidean distance is computed between the incoming record and each of the stored records, and the minimum of these distances is used (Igual et al. 2015; Medrano et al. 2016). If the minimum distance is higher than a threshold value, the incoming record is considered an anomaly (Igual et al. 2015; Medrano et al. 2016). The performance of NN suffers if the data have regions of varying densities (Medrano et al. 2016).
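The NN anomaly rule can be sketched as follows in Python. The stored records are assumed to contain only ADL data, and the function name, feature vectors, and distance threshold are hypothetical values chosen for illustration.

```python
import math


def is_anomaly(stored_records, incoming, threshold):
    """Return True if the nearest stored record is farther than `threshold`."""
    nearest = min(math.dist(record, incoming) for record in stored_records)
    return nearest > threshold


# Hypothetical ADL feature vectors used as the stored (normal) records.
adl_records = [[1.0, 0.1], [1.1, 0.2], [0.9, 0.1]]

fall_like = is_anomaly(adl_records, [3.0, 1.5], threshold=0.5)
adl_like = is_anomaly(adl_records, [1.05, 0.15], threshold=0.5)
```

Because a single global threshold is applied to every distance, the rule is sensitive to regions of differing density in the stored data, as noted above.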

2.5 Testing and evaluation of the system

The system is typically tested using the leave-one-out method or the cross-validation method (Pannurat et al. 2017). The dataset can also be split into 70% for training the classifier and 30% for testing the classifier (Wannenburg and Malekian 2016). Statistical tests are done to determine the overall performance of the classifier (Delahoz and Labrador 2014). The classification model can produce the following four possible outcomes (Gibson et al. 2016): (1) true positive (TP), when the system detects a fall and a fall has occurred; (2) false positive (FP), when the system detects a fall and no fall has occurred; (3) true negative (TN), when the system detects no fall and no fall has occurred; (4) false negative (FN), when the system detects no fall and a fall has occurred. False negatives are falls that remained undetected, and false positives are ADLs that were classified as falls (Luque et al. 2014). The most popular measures of classifier performance are given below.

The recall or sensitivity (SE) measures the ability of a fall detection algorithm to correctly identify falls over the entire set of fall instances (Delahoz and Labrador 2014; Gibson et al. 2016; Sabatini et al. 2016).

$$\begin{aligned} SE = \frac{TP}{TP + FN} \end{aligned}$$
(1)

The precision (PR) measures the ability of a fall detection algorithm to correctly identify falls over the entire set of instances classified as falls, that is, the proportion of reported falls that were actual falls (Delahoz and Labrador 2014; Gibson et al. 2016).

$$\begin{aligned} PR = \frac{TP}{TP + FP} \end{aligned}$$
(2)

The specificity (SP) measures the ability of a fall detection algorithm to correctly identify ADLs over the entire set of ADL instances (Bagalà et al. 2012; Delahoz and Labrador 2014; Gibson et al. 2016; Sabatini et al. 2016).

$$\begin{aligned} SP = \frac{TN}{FP + TN} \end{aligned}$$
(3)

The accuracy (ACC) measures the proportion of results that were correctly classified amongst all outcomes (Delahoz and Labrador 2014; Gibson et al. 2016).

$$\begin{aligned} ACC = \frac{TP + TN}{TN + TP + FP + FN} \end{aligned}$$
(4)

The \(F_1\)-measure combines the precision and sensitivity indicators (Delahoz and Labrador 2014; Principi et al. 2016).

$$\begin{aligned} F_1\text{-measure} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \end{aligned}$$
(5)
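The measures in Eqs. 1 to 5 can be computed directly from the four outcome counts. The Python sketch below assumes that none of the denominators is zero; the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute SE, PR, SP, ACC and F1 from the four outcome counts.

    Assumes every denominator is non-zero.
    """
    se = tp / (tp + fn)                    # Eq. 1, sensitivity (recall)
    pr = tp / (tp + fp)                    # Eq. 2, precision
    sp = tn / (fp + tn)                    # Eq. 3, specificity
    acc = (tp + tn) / (tn + tp + fp + fn)  # Eq. 4, accuracy
    f1 = 2 * pr * se / (pr + se)           # Eq. 5, F1-measure
    return {"SE": se, "PR": pr, "SP": sp, "ACC": acc, "F1": f1}


# Hypothetical counts: 55 falls and 45 ADLs in the test set.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
```

For these counts the accuracy is 0.85 and the precision is 0.9, while the sensitivity is lower (45 of 55 falls detected) because ten falls were missed.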

Receiver operating characteristic (ROC) theory has been used to define threshold values based on constraints on the system sensitivity and specificity (Andò et al. 2015). The ROC curve is created by adjusting the threshold value (Igual et al. 2015). From the curve, the threshold point that maximizes the geometric mean of the sensitivity and specificity, given in Eq. 6, is selected (Igual et al. 2015).

$$\begin{aligned} \text{geometric mean} = \sqrt{\text{specificity} \times \text{sensitivity}} \end{aligned}$$
(6)
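Selecting the operating point with Eq. 6 can be sketched as a sweep over candidate thresholds. In the Python example below, the function name, the detector scores, the labels (1 for fall, 0 for ADL), and the candidate thresholds are hypothetical, and both classes are assumed to be present so that the sensitivity and specificity are defined.

```python
import math


def best_threshold(scores, labels, candidates):
    """Return (threshold, geometric mean) maximizing Eq. 6.

    A sample is classified as a fall when its score is >= threshold.
    """
    best = None
    for t in candidates:
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        sensitivity = tp / (tp + fn)
        specificity = tn / (fp + tn)
        g_mean = math.sqrt(specificity * sensitivity)
        if best is None or g_mean > best[1]:
            best = (t, g_mean)
    return best


# Hypothetical detector scores with overlapping classes.
scores = [1.0, 1.2, 2.5, 3.0, 2.2, 3.2, 3.5]
labels = [0, 0, 0, 0, 1, 1, 1]
best_t, best_g = best_threshold(scores, labels, [1.5, 2.0, 2.4, 2.9, 3.1])
```

Each candidate threshold corresponds to one point on the ROC curve; the sweep keeps the point with the largest geometric mean.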

The area under the curve (AUC) is the area under the receiver operating characteristic (ROC) curve and indicates the performance of the classification model (Debard et al. 2012; Liu et al. 2014). The closer the AUC is to 1, the better the performance of the classification model.

3 Fall detection sensors

Fall detection systems, also known as context-aware systems, should be able to recognize, interpret, and monitor the different activities the user performs and be able to detect fall events (Özdemir and Barshan 2014). There are different types of fall detection methods, which include camera-based, acoustic-based, and wearable sensors (Sabatini et al. 2016). Each method of fall detection uses numerous sensors; none of these sensors provides 100% accuracy, but each has its own advantages (Medrano et al. 2014). Table 1 shows the general characteristics of these sensor types.

Table 1 Characteristics of different fall detection methods

3.1 Wearable sensors

Due to the increase in wearable telemedicine technology, solving these problems becomes easier (Jian and Chen 2015). The growth of micro-electro-mechanical systems (MEMS) has resulted in miniaturized, more compact, and low-cost sensors (Kwolek and Kepski 2015; Özdemir and Barshan 2014). They can be easily integrated into other alarm systems available in the vicinity or into accessories that the person carries, e.g. smartphones or smart watches, which can achieve a kind of non-intrusive and non-invasive diagnosis and monitoring (Jian and Chen 2015; Kwolek and Kepski 2016; Özdemir and Barshan 2014; Wang et al. 2014; Wannenburg and Malekian 2015). The wearable sensors are attached to the subject of interest (SOI) (Yang et al. 2016b).

Wearable devices make use of embedded sensors to measure the motion of the monitored body in any unsupervised environment, periods of inactivity, and the posture of the person (Bosch-Jorge et al. 2014; Luque et al. 2014; Yang et al. 2016b). The first automatic fall detection system was a wearable device placed on the user that made use of acceleration or rotation information to detect falls (Rougier et al. 2011). Wearable sensors can detect a fall by analysing the impact of the body with the ground and the body orientation before and after the fall (Hakim et al. 2017). Wearable sensors are not affected by the environment or by privacy concerns (Sabatini et al. 2016). Collecting activity data from wearable sensors is not restricted to a laboratory environment, which allows the collection of real-world activities (Bagalà et al. 2012). Wearable devices can be implemented using micro-controllers or smartphones (Wannenburg and Malekian 2015).

3.1.1 Using smartphone for activity monitoring

Smartphones are now equipped with MEMS sensors, which can be used to perform unobtrusive fall detection monitoring, and they are already integrated into the daily life of users (Andò et al. 2015; Khan and Hoey 2017; Kwolek and Kepski 2015; Luque et al. 2014). The growth of technology has made smartphones more popular and more commonly used than any dedicated fall detection equipment; they are non-invasive, portable, cost-effective, and easy to carry, and they work both indoors and outdoors (Khan and Hoey 2017; Luque et al. 2014; Shen et al. 2015). Figure 2 shows a list of the different high-precision sensors that are nowadays available on smartphones. The biggest advantage of the smartphone is that it has most of these sensors integrated into it, so no extra device is required (Andò et al. 2015; de la Concepción et al. 2017). The biggest problems of smartphones used in fall detection are battery drain and limitations in memory and real-time processing capabilities (de la Concepción et al. 2017; Luque et al. 2014).

Fig. 2 Available sensors on smartphone devices

3.1.2 Different types of wearable categories

Wearable fall monitoring systems are grouped into three categories: fall alert, fall risk assessment, and impact prevention (Pannurat et al. 2017). A fall alert or personal emergency response system (PERS) alerts medical personnel or caregivers so that they can assist the user in the event of a fall (Pannurat et al. 2017; Sabatini et al. 2016). Fall risk assessment is the study of the causes of falls and the identification, based on their movements, of patients who should be monitored (Pannurat et al. 2017). An impact prevention or fall injury prevention system (FIPS) detects a fall event before it happens and triggers a protection or prevention device to protect the user (Pannurat et al. 2017; Sabatini et al. 2016). An example of FIPS is the detection of falls in the pre-impact phase, where an active protection device, such as an inflatable airbag, can be used to avoid injuries from the fall (Hu and Qu 2014, 2015). PERS prevents a long lie by notifying caregivers when a fall is detected, since after some falls the user is unable to get up or is unconscious (Sabatini et al. 2016). PERS is the most popular type of system and is the subject of the most research. PERS can be split into posture and motion devices (Hakim et al. 2017). Only PERS is analysed here; FIPS is not, as it relies on pre-fall data to detect a possible fall and is shown to achieve low accuracy in Pannurat et al. (2017).

3.1.3 Different wearable sensors

Wearable sensors include but are not limited to tilt switches, accelerometers, gyroscopes, pressure sensors, magnetometers, and microphones (Pannurat et al. 2017). Each sensor has different characteristics and can operate independently or in conjunction with each other.

3.1.3.1 Accelerometer

Accelerometers are the most popular and widely used sensors for detecting fall accidents and sensing body motions, as they offer high accuracy, even with noisy measurements, and measure acceleration well down to 0 Hz (Kwolek and Kepski 2015; Medrano et al. 2014; Sabatini et al. 2016; Yang et al. 2016a, b). Accelerometers are feasible, effective, fast, easy to set up and operate, simple, lightweight, low-power, and cost-effective solutions for fall detection systems (Gibson et al. 2016; Huang et al. 2011; Ozcan et al. 2017). In Perry et al. (2009), a study was conducted to determine which type of wearable sensor can accurately detect falls, comparing sensors that use acceleration, acceleration combined with other sensing methods, and sensors that do not use acceleration. The study concludes that sensors which can sense acceleration are good at detecting falls, whereas methods that do not use acceleration are less accurate and can lead to many false alarms (Perry et al. 2009). Falls can be detected by applying different signal evaluation techniques to accelerometer data (de la Concepción et al. 2017). The most popular feature extracted from accelerometers is the signal magnitude vector (SMV), which is given below,

$$\begin{aligned} SMV = \sqrt{x^{2} + y^{2} + z^{2}}, \end{aligned}$$
(7)

where x, y and z are the acceleration values along the X, Y, and Z axes of the accelerometer (Perry et al. 2009; Wang et al. 2014). A fall acceleration signal comprises peaks and valleys, and falls are usually associated with large SMV peaks (Hakim et al. 2017; Medrano et al. 2014). Fall decisions that use only the SMV and consider only abrupt peaks in the acceleration produce a high number of false positives (FP), because sudden movements such as sitting down fast or jumping also produce abrupt peaks (Hakim et al. 2017; Luque et al. 2014). Most acceleration-based studies use a threshold-based algorithm for detecting a fall, which results in many false alarms; machine learning algorithms can be implemented to reduce them (Kwolek and Kepski 2015).
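
As a minimal sketch of the SMV feature in Eq. (7) and of the threshold rule described above, the following computes the SMV per sample and flags samples above a threshold. The function names, the synthetic trace, and the 2.5 g threshold are assumptions made for illustration and are not taken from the cited studies.

```python
import math

def signal_magnitude_vector(x, y, z):
    """SMV of Eq. (7) for one tri-axial accelerometer sample."""
    return math.sqrt(x * x + y * y + z * z)

def fall_candidates(samples, threshold_g=2.5):
    """Return indices of samples whose SMV exceeds threshold_g.

    threshold_g is a hypothetical value in units of g, chosen for illustration.
    """
    return [i for i, (x, y, z) in enumerate(samples)
            if signal_magnitude_vector(x, y, z) > threshold_g]

# Synthetic trace in units of g: standing, an abrupt peak, then lying still.
trace = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (2.0, 1.5, 2.0), (1.0, 0.0, 0.0)]
print(fall_candidates(trace))
```

A jump or a fast sit-down would cross such a threshold in the same way as the peak in this trace, which illustrates why a rule based on the SMV alone tends to raise false alarms.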

The placement of sensors also plays a vital role, as it directly affects the accuracy of fall detection techniques (Perry et al. 2009). In Kangas et al. (2008), different positions on the human body are tested to identify the best position for the accelerometer. The positions tested were the head, waist, and wrist (Kangas et al. 2008). The measured acceleration was compared to a threshold to detect a fall (Kangas et al. 2008). The results show that placing the accelerometer on the person’s head or waist achieves a sensitivity of 97–98% and a specificity of 100% with a simple threshold algorithm (Kangas et al. 2008). Pannurat et al. (2017) investigate which phase of a fall and which placement of a tri-axial accelerometer on the body achieve the best accuracy. A hybrid framework that combines rule-based knowledge with a two-layer Gaussian classifier was implemented (Pannurat et al. 2017). The following accuracies were obtained for the different phases of a fall: 86.54% for pre-impact, 87.315% for impact, and 91.15% for post-impact (Pannurat et al. 2017). The paper found that the side of the waist is the best position for the sensor during post-impact, followed by head, wrist, and front of waist, thigh, chest, ankle, thigh, and upper arm (Pannurat et al. 2017). The reasons for not achieving 100% accuracy in the post-impact phase include signal loss and the misclassification of post-impact and high-impact ADLs (Pannurat et al. 2017). If falls are analysed during the post-impact phase, the chest is not a suitable placement, since the transmission path of an alert signal could be blocked by the user’s body (Pannurat et al. 2017). The following sensor placements result in false positives because falls cannot be differentiated from sitting and standing: head, upper arm, wrist, ankle, and chest (Pannurat et al. 2017).
Placing the sensor close to the person’s centre of gravity makes the sensor less sensitive to spurious movements.

A disadvantage of accelerometers is that they are prone to errors in elevators and in high-speed cars or trains (Ozcan et al. 2017). The accelerometer output contains gravity as well as the acceleration due to motion, which can introduce errors when angles are calculated and lead to many false positives (Jian and Chen 2015). Accelerometer systems lack adaptability and have insufficient capabilities for context understanding (Bagalà et al. 2012; Kwolek and Kepski 2014). Accelerometer methods require a high sampling rate, which can drain the battery quickly (Kwolek and Kepski 2014). Steidl et al. (2012) found that threshold-based algorithms implemented on smartphones are limited by the accelerometers. Smartphone fall detection systems assume that the hardware sensors measure acceleration with sufficient precision, which is not the case (Steidl et al. 2012). Sensors from different manufacturers record values in significantly different ranges for identical tests, which makes it impossible to set a reliable threshold value (Steidl et al. 2012). The accuracy of the system increases when the accelerometer is combined with other sensors such as gyroscopes, magnetometers, and barometers (de la Concepción et al. 2017).

3.1.3.2 Gyroscope

The most common feature extracted from the gyroscope sensor is the magnitude of the resultant angular velocity (w), which is given below,

$$\begin{aligned} w = \sqrt{w^{2}_{x} + w^{2}_{y} + w^{2}_{z}}, \end{aligned}$$
(8)

where \(w_x,\) \(w_y\) and \(w_z\) are the angular velocities along the X, Y, and Z axes of the gyroscope (Jian and Chen 2015). Few studies use only a gyroscope sensor to detect a fall.

In Wu et al. (2012), a study was conducted to understand the contribution of a gyroscope sensor to the classification of physical activities. Accelerometer and gyroscope data were collected and fed into different classifiers (Wu et al. 2012). The study concluded that adding the gyroscope to the system can improve accuracy, because gyroscope data capture the object’s orientation, which most activities involve, whereas the accelerometer only measures linear motion along specified directions (Wu et al. 2012). Many studies combine accelerometer and gyroscope data (Andò et al. 2016; Colon et al. 2014; Jian and Chen 2015; Zhang et al. 2017).

The disadvantage of low-cost gyroscopes is that they suffer from time-varying zero shifts. This introduces significant errors when the angular acceleration and angular position are calculated using differential and integral operations (Andò et al. 2016; Jian and Chen 2015; Zhang et al. 2017). If the noise is not removed, the error accumulates and can become very large (Zhang et al. 2017). A Kalman filter that uses dynamic information about the target is required to remove the noise in order to estimate the angle (Zhang et al. 2017). Gyroscopes are also only available in higher-grade smartphones (Kau and Chen 2015).
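
The accumulation of error described above can be illustrated with a small sketch that integrates the angular rate of a stationary gyroscope. The 0.5 deg/s zero shift, the 100 Hz sampling rate, and the function name are hypothetical values chosen for illustration; the zero shift is also held constant here, which is a simplification.

```python
def integrate_angle(rates_dps, dt_s, bias_dps=0.0):
    """Integrate angular rate (deg/s) into an angle (deg) by rectangular summation.

    bias_dps models a constant zero shift that is subtracted before integrating.
    """
    angle = 0.0
    for rate in rates_dps:
        angle += (rate - bias_dps) * dt_s
    return angle

# Hypothetical stationary gyroscope: the true rate is 0 deg/s, so every
# reading consists only of the 0.5 deg/s zero shift (100 Hz for 60 s).
dt = 0.01
measured = [0.5] * 6000

uncorrected = integrate_angle(measured, dt)
corrected = integrate_angle(measured, dt, bias_dps=0.5)
print(uncorrected, corrected)
```

Without correction the estimated angle drifts by roughly 30 degrees in one minute although the sensor never moved. Because real zero shifts vary over time, subtracting a fixed value as done here is generally not sufficient, which is consistent with the use of a Kalman filter in the cited work.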

3.1.3.3 Health sensors

In Ghasemzadeh et al. (2010), falls are detected using electromyogram (EMG) sensors, which measure muscle control signals. When a fall occurs, there is a change in heart rate, which can be used to detect the fall. In Wang et al. (2014), an accelerometer and a cardio-tachometer are used to analyse and detect falls. When a person falls, the resulting anxiety can raise the person’s heart rate (Wang et al. 2014). The heart rate can also be used to assess how serious the fall is (Nguyen et al. 2009). The disadvantage of health sensors is that they are difficult to put on and can interfere with the performance of ADLs.

3.1.3.4 Wearable camera

Compared to other wearable sensors, wearable cameras provide a much richer set of data, including contextual information about the environment, which enables the analysis of a variety of activities including falls (Ozcan et al. 2017; Ozcan and Velipasalar 2016). Monitoring with a wearable camera system is not confined to a fixed area; it extends to wherever the subject travels (Ozcan and Velipasalar 2016). Wearable cameras do not affect the privacy of the user, since they record only the user’s surroundings, and the system processes everything locally on the device without transmitting anything (Ozcan and Velipasalar 2016). In Ozcan et al. (2017), a camera system worn on the user’s waist provides continuous monitoring and, unlike static cameras, is not limited to certain areas. An advantage of this system is that the privacy concerns associated with static cameras are removed (Ozcan et al. 2017). The wearable camera system uses edge orientations and histograms to detect falls and works effectively both indoors and outdoors, but it is highly invasive for subjects (Yang et al. 2015). Because the wearable camera records the surrounding environment, it can make people around the user uncomfortable, as it appears to be recording them.

3.1.3.5 Ambient sensors as wearable sensors

Ambient sensors such as pressure sensors and microphones can be attached to the user’s footwear to detect falls (Doukas and Maglogiannis 2008; van de Ven et al. 2015). The advantage of attaching ambient sensors to wearable items is that they provide outdoor monitoring and are not limited in coverage area, since they travel with the user. The disadvantage is that the system is influenced by the environment. Table 2 summarises the different wearable fall detection studies.

Table 2 Summary of wearable sensors studies

3.1.4 Disadvantage of wearable sensors

3.1.4.1 Placement and intrusion

The major disadvantages of wearable devices include intrusion, undesirable placement of the device, neglect or unwillingness to wear it, and inconvenience to the user’s movement (Aslan et al. 2015; Debard et al. 2012; Hakim et al. 2017; Kau and Chen 2015; Özdemir and Barshan 2014; Stone and Skubic 2015; Wannenburg and Malekian 2015; Yang et al. 2016b; Yang and Lin 2014). Neglecting or forgetting to wear the device renders a wearable device ineffective (Bosch-Jorge et al. 2014; Kwolek and Kepski 2014; Yang and Lin 2014). Undesirable placement of the sensor on the user’s body can cause obtrusiveness, inconvenience, and discomfort when performing ADLs (Bosch-Jorge et al. 2014; Khan et al. 2015; Kwolek and Kepski 2014; Yang et al. 2016b). Wearable devices placed on a belt around the hip cannot be worn while changing clothes or sleeping, which makes it impossible to monitor a person getting up from bed (Kangas et al. 2008; Kwolek and Kepski 2014). Adding extra sensors makes the user uncomfortable and leads to a certain degree of inconvenience (Kau and Chen 2015). One solution is to let the user choose the placement of the device, with the device performing on-body sensor localization to detect where on the user it is located (Colon et al. 2014). This eliminates undesirable placements. For the user’s convenience, the device can be placed in a trouser pocket (Shen et al. 2015). Falls occur frequently in the bathroom, where it is difficult to wear a device, since these systems are affected by water and are uncomfortable when bathing (Litvak et al. 2008; Zigel et al. 2009).

3.1.4.2 Power

Wearable sensors are all battery powered, which means that the device cannot be used while it is recharging, or that its batteries have to be replaced (Özdemir and Barshan 2014; Principi et al. 2016; Stone and Skubic 2015). The battery problem can be mitigated by using a low sampling frequency together with a hierarchical scheme (de la Concepción et al. 2017). This also reduces the computational complexity of the system and thus saves processing time (de la Concepción et al. 2017). To keep the system usable, a smaller number of sensors on the user is preferable (Pannurat et al. 2017). Keeping the number of sensors to a minimum helps to cope with resource constraints such as battery power, storage, computational power, and network bandwidth (Pannurat et al. 2017).

3.1.4.3 Hardware and software

Wearable devices are limited by their hardware and software (Hakim et al. 2017). Each smartphone has a fixed number of built-in sensors; adding more sensors requires upgrading the smartphone. A basic sensor available in all smartphones is the accelerometer. Unlike microcontrollers, whose software is fixed, the software of a smartphone can be updated at any time. Smartphones can address the problems of low-power microcontrollers, where classification algorithms are constrained by limited memory and processing power. Most microcontroller systems implement only threshold classification, whereas smartphones can implement machine learning algorithms.

3.1.4.4 Generates a lot of false positives

Wearable sensors generate many false alarms during daily activities, which can frustrate users (Kwolek and Kepski 2014). The poor accuracy and high number of false positives of accelerometers are due to a lack of adaptability and a lack of context understanding (Kwolek and Kepski 2016). False positives can be limited by implementing communication between the user and the device. When a fall is detected, the device first asks the user to confirm whether a fall has occurred. If the user does not respond within a specified time period, the emergency service is contacted (Sposaro and Tyson 2009).
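
The confirmation step described above can be sketched as a small decision function. The response labels, the returned action names, and the 30 s timeout are hypothetical and are not taken from Sposaro and Tyson (2009).

```python
def resolve_alarm(user_response, response_time_s, timeout_s=30.0):
    """Decide what to do after a suspected fall.

    user_response: "ok", "help", or None if the user never answered.
    timeout_s: hypothetical waiting period before escalating.
    """
    if user_response is None or response_time_s > timeout_s:
        return "call_emergency"   # no timely answer: assume the user cannot respond
    if user_response == "ok":
        return "cancel"           # the user reports a false alarm
    return "call_emergency"       # the user confirms the fall

print(resolve_alarm("ok", 5.0))
print(resolve_alarm(None, 0.0))
```

Treating a missing or late answer as an emergency keeps the long-lie case covered, while an explicit "ok" from the user suppresses the false alarm.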

3.2 Ambient sensors

Ambient devices use event sensing: they collect and examine data from the environment to track the elderly person’s movement, using external sensors installed in the surrounding environment, such as the home, or close to the subject (Luque et al. 2014; Wang et al. 2014; Yang et al. 2016b, 2015). Other applications of ambient sensors are indoor localization and security (Yang et al. 2016b). The advantage of ambient devices is that the user does not need to wear a device or remember to put it on; they are passive and unobtrusive (Hakim et al. 2017; Zhang et al. 2017). Ambient devices are non-intrusive and invisible to the elderly, so they do not affect the user’s privacy (Daher et al. 2016). Ambient devices are cheaper and less intrusive than camera-based systems (Hakim et al. 2017).

3.2.1 Vibration detection

Ambient devices that use vibration data detect falls based on the characteristics of vibration patterns (Hakim et al. 2017; Ozcan et al. 2017). Vibrations can be used to detect falls based on the observation that normal activities cause measurable vibrations on the floor; when a user falls, the impact of the body on the ground generates vibrations that are transmitted through the floor (Alwan et al. 2006; Werner et al. 2011). It is also assumed that the vibration signals of falls and ADLs differ (Werner et al. 2011). The events and changes in vibration data make them useful for monitoring, tracking, and localization (Hakim et al. 2017). The vibration signal can be obtained using a piezoelectric sensor or an accelerometer. Floor vibration sensors are inexpensive and preserve the privacy of the user, but their performance is influenced by the floor type and they have a limited detection range (Li et al. 2012).

3.2.2 Acoustic detection

The basic idea of acoustic detection is to use a microphone to capture the sound of the user’s movements, from which mel-frequency cepstral coefficient (MFCC) features are extracted to detect falls. The MFCC features are extracted by first removing the high frequency component (Khan et al. 2015). The audio signal is then segmented into frames (Khan et al. 2015). A fast Fourier transform (FFT) is applied to each frame to obtain the frequency spectral features (Khan et al. 2015). After the FFT, mel-scale mapping is performed and finally a discrete cosine transform is applied to obtain 12 MFCCs (Khan et al. 2015). Applying a beam-forming technique to the sound signal can enhance the desired signal and reduce interference from a TV, radio, or ringing phone (Li et al. 2012). In Principi et al. (2016), a Rescue Randy doll is used to mimic human falls for testing the system. The source of the sound signal can be located from multiple microphones using the steered response power with phase transform technique, which can work in any conditions (Li et al. 2012). The sound signal is then enhanced using the beam-forming technique (Li et al. 2012). Designing a classifier for acoustic fall detection is difficult, since it is impossible to obtain realistic fall sound signatures for training and testing the system (Popescu and Mahnot 2009). Falls are difficult to simulate (Popescu and Mahnot 2009). When simulated falls are captured, the test subject tries to prevent a painful fall (Popescu and Mahnot 2009). Most acoustic studies use Rescue Randy dolls, which makes detecting low-impact falls difficult; to compensate, Rescue Randy dolls of different weights are needed to train the system (Litvak et al. 2008). Studies that use Rescue Randy dolls cannot replicate realistic fall sounds because of the hard skin and the lack of bones in the mannequins (Popescu and Mahnot 2009).
The floor material and the limited audio detection range affect the system (Principi et al. 2016).
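
The MFCC steps listed above (framing, spectrum, mel-scale mapping, discrete cosine transform) can be sketched in plain Python as follows. This is an illustrative sketch under several assumptions: the input is taken to be already filtered, a Hamming window is applied to each frame, a naive DFT stands in for the FFT, and the frame length, hop size, number of mel filters, and all function names are hypothetical choices rather than the settings of Khan et al. (2015).

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power of the first N/2 + 1 DFT bins (naive DFT; an FFT yields the same values)."""
    n = len(frame)
    out = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        out.append(abs(s) ** 2 / n)
    return out

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):          # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):         # falling edge
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank

def dct_ii(values, n_coeffs):
    """First n_coeffs coefficients of the (unnormalised) DCT-II."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values))
            for k in range(n_coeffs)]

def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=20, n_coeffs=12):
    """Return one list of n_coeffs MFCCs per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1)))
                    for i, s in enumerate(frame)]
        power = power_spectrum(windowed)
        # The floor of 1e-10 avoids log(0) for filters that collect no energy.
        log_energy = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                      for row in bank]
        features.append(dct_ii(log_energy, n_coeffs))
    return features

# Synthetic 440 Hz tone sampled at 8 kHz (256 samples), used only as test input.
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
features = mfcc(tone, 8000)
print(len(features), len(features[0]))
```

Each frame yields a vector of 12 coefficients, which is the kind of feature vector that would be passed to a classifier in the acoustic systems described above.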

3.2.3 Pressure sensor

Pressure sensors are the most common type of ambient sensor because of their low cost and non-obtrusiveness; a fall is detected from changes in the sensed pressure (Yang et al. 2016b). Pressure sensors detect the high pressure caused by an object’s weight, which is used for detection and tracking (Yang et al. 2016b). The pressure changes depending on how close the person is to the sensor (Chaccour et al. 2015). The closer the person is to the sensor, the higher the pressure (Chaccour et al. 2015). A disadvantage of pressure sensors is their low detection precision, which is below 90% (Yang et al. 2016b). A disadvantage of using only pressure sensors to detect a fall is that they sense the pressure of everything in and around the object, which leads to false positives and hence low accuracy (Yang et al. 2016b; Hakim et al. 2017). The distance between the point of impact and the pressure sensor can affect the accuracy of the system (Yang et al. 2016b). Another problem is that pressure sensors alone cannot differentiate between lying and falling postures (Daher et al. 2016). To solve this problem, Daher et al. (2016) use intelligent tiles that consist of pressure sensors and three-axis accelerometers. The accelerometer is used to detect hard human falls, but cannot detect soft falls (Daher et al. 2016). The accelerometer is used to reinforce the differentiation between the falling and lying-down postures (Daher et al. 2016). Each tile has a processing unit, a wireless connection, and electric power (Daher et al. 2016). The disadvantages of the system in Daher et al. (2016) are the cost of each tile and the need for a power supply for each tile. Pressure sensors can produce many false alarms because the person’s weight is not factored in when detecting a fall; the system is also usually implemented on a small scale, e.g. as a mat, which makes it costly to deploy throughout a home.
The factors that influence pressure sensors are their placement and their sensitivity to pressure.

3.2.4 Passive infrared sensor

A passive infrared (PIR) sensor detects falls using infrared signatures (Popescu et al. 2012). The strength of the signal received from a PIR sensor changes with the motion of a hot object within range of the sensor (Yazar et al. 2013). A PIR sensor alone cannot be used to identify a fall, since a walking person can produce a signal similar to a PIR fall signal (Yazar et al. 2013). In Yazar et al. (2013), a combination of PIR and floor vibration sensors is used to detect a fall. The PIR sensors reduce false alarms by detecting whether the vibration signal was caused by a human and by detecting the presence of the user (Yazar et al. 2013). A fall alarm is ignored when there is no motion in the room (Yazar et al. 2013). The biggest problems of PIR sensors are the line-of-sight requirement and the limited coverage area.

3.2.5 Doppler sensor

A Doppler sensor is a motion sensor that can sense, track, and recognize moving objects and monitor human activity (Liu et al. 2014). Doppler sensors are small and cheap, detect only moving targets by suppressing stationary background clutter, and are noise tolerant (Tomii and Ohtsuki 2012). A Doppler sensor has an irradiation direction: it is sensitive to movement along this direction and less sensitive to movement orthogonal to it (Tomii and Ohtsuki 2012). A Doppler sensor transmits a continuous electromagnetic wave at the carrier frequency and receives the reflected wave, whose frequency is shifted by the moving object (Tomii and Ohtsuki 2012). The velocity of a moving object within the detection range can be determined from the frequency shift (Tomii and Ohtsuki 2012). The disadvantage of Doppler sensors is that they are sensitive to any motion and their signal can penetrate apartment walls (Tomii and Ohtsuki 2012).
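
The relation between frequency shift and velocity mentioned above can be sketched with the standard continuous-wave Doppler relation, in which the shift equals twice the radial velocity times the carrier frequency divided by the speed of light. The 24 GHz carrier and the 320 Hz shift below are hypothetical example values and are not taken from Tomii and Ohtsuki (2012).

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def radial_velocity(doppler_shift_hz, carrier_hz):
    """Radial velocity (m/s) of a reflector from the Doppler shift of a continuous wave.

    Uses shift = 2 * v * carrier / c, valid for speeds much smaller than c.
    """
    return doppler_shift_hz * SPEED_OF_LIGHT / (2.0 * carrier_hz)

# Hypothetical 24 GHz sensor observing a 320 Hz shift.
velocity = radial_velocity(320.0, 24e9)
print(round(velocity, 3))
```

Only the velocity component along the irradiation direction produces a shift, which matches the reduced sensitivity to orthogonal movement noted above.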

3.2.6 Electric near field

A near-field imaging (NFI) system uses floor sensors to detect falls (Rimminen et al. 2010). The floor sensors detect the locations and patterns of the user by measuring the impedance with a matrix of thin electrodes under the floor (Rimminen et al. 2010). When a near-field signal is detected, the locations of the corresponding electrodes in the matrix are identified (Rimminen et al. 2010). More sensors are required as the monitored area increases, which increases the cost. False positives are generated if pets or occlusions are present. Table 3 summarises the different ambient fall detection studies.

Table 3 Summary of ambient sensors studies

3.2.7 Disadvantage of ambient sensors

3.2.7.1 Coverage

Ambient sensors work only indoors or within the area where the device is installed; they suffer from dead spaces and blind spots, have a limited recording area, can monitor only one person, and can be expensive to set up (de la Concepción et al. 2017; Ozcan et al. 2017; Principi et al. 2016; Zhang et al. 2017). The limited recording area does not affect electric near-field and pressure sensors, but covering a monitoring area with them is expensive. Most ambient systems assume that only one person is present in the monitored room.

3.2.7.2 Noise

Ambient sensors are affected by environmental interference and by background and ambient noise (Garripoli et al. 2015; Özdemir and Barshan 2014). Ambient devices can produce many false alarms due to falls of everyday objects (Luque et al. 2014). Acoustic and vibration sensors work only on certain floor types. Movement sensors are affected by obstructions or occlusions, which can degrade the signal.

3.3 Camera-based methods

Advances in computer vision and image processing can also be applied to fall detection, where a camera monitors the user’s behaviour and detects falls without interfering with the user’s routines (Luque et al. 2014; Yang et al. 2015). Camera sensors can record the user’s position and shape (Yang et al. 2016b). Using computer vision to detect a fall can be difficult, since the human body is composed of several parts that can move freely, which complicates identifying and locating people (Bosch-Jorge et al. 2014). To overcome this problem, current studies use body parts that can be detected, such as the head, waist, or feet (Bosch-Jorge et al. 2014). The advantage of camera-based methods is that they are not intrusive: the camera system is contactless, so nothing needs to be worn or remembered; it can monitor one or more people simultaneously; and it can detect falls in public areas (Debard et al. 2012; Hakim et al. 2017; Kwolek and Kepski 2015, 2016; Yang et al. 2015; Zhang et al. 2017). Multiple people can be tracked in a frame through a segmentation and marking module (Yang et al. 2016b, 2015). Camera-based methods can serve two purposes: fall detection and security monitoring. Compared to the other methods, camera-based methods are more robust, can accurately detect falls and different ADLs, and allow a fall to be verified remotely (Hakim et al. 2017; Khan and Hoey 2017; Kwolek and Kepski 2014; Nizam et al. 2017; Yang et al. 2015). Camera-based systems are best suited to settings where multiple people need to be monitored, e.g. hospital rooms or old age homes (Aslan et al. 2015). Cameras are included in home and care systems and have several advantages over sensor-based devices, such as detecting multiple events simultaneously with less intrusion.
Figure 3 shows how a camera system detects a fall.

Fig. 3 Operations of the camera system when performing fall detection

3.3.1 Camera sensors

Falls can be detected using a single RGB camera, a 3D-based method using multiple cameras, or a 3D-based method using depth cameras (Kwolek and Kepski 2014; Nizam et al. 2017; Yang et al. 2016b). The most popular vision-based sensor is the RGB camera, which is the cheapest and the easiest to set up (Aslan et al. 2015; Yang et al. 2016b). Covering a large area requires multiple cameras; alternatively, an omni-directional or wide-angle camera can be used (Bosch-Jorge et al. 2014; Kwolek and Kepski 2015). Wide-angle cameras have lenses with a wide field of view, which can be used to monitor large areas (Bosch-Jorge et al. 2014). The problem with this type of camera is that the images produced are highly distorted (Bosch-Jorge et al. 2014). The lens has high radial distortion, which needs to be corrected before the calibration process starts (Bosch-Jorge et al. 2014). An omni-camera can capture a 360\(^{\circ }\) view in a single shot, which compensates for blind spots (Miaou et al. 2006). The lack of depth information from RGB cameras can lead to many false alarms (Kwolek and Kepski 2014, 2015). 2D camera methods can cause misjudgements when there are more than two people in the frame (Yang and Lin 2014). A single camera cannot extract the features that characterize a 3D object’s movement, which are needed for a robust fall detection system, but these can be obtained from multiple RGB cameras (Stone and Skubic 2015; Yang et al. 2016b). Multi-camera systems construct a 3D object by back-projecting multiple silhouettes, from which features such as velocity are extracted for detecting falls (Stone and Skubic 2015). For multi-camera systems, installing, calibrating, and synchronising the cameras in the same reference frame is difficult and time-consuming, and the cost of the system increases (Stone and Skubic 2015; Yang et al. 2016b). The 3D techniques implemented with RGB cameras are not automatic and require manual initialization.
Appearance deformation can occur because 2D grey or colour images are projections of 3D targets (Yang et al. 2016b). Colour cameras achieve high accuracy in a controlled environment, where the lighting and the tracking of the user are fully controlled, but would not work in an uncontrolled environment (Kwolek and Kepski 2014, 2015).

Depth information alleviates the problem that users or objects do not have consistent colour and texture; they only need to occupy an integrated region in 3D space (Kwolek and Kepski 2016). A depth camera allows a person to be extracted from an image at low computational cost (Kwolek and Kepski 2016). Depth cameras can be used to calculate the distance from the top of the person to the floor (Yang et al. 2016b). Depth cameras can preserve the privacy of the user and are not affected by lighting conditions (Aslan et al. 2015; Kwolek and Kepski 2014; Yang et al. 2016b). Depth images can be acquired in dark rooms using infrared light (Kwolek and Kepski 2014). Depth cameras can also be used to solve occlusion problems and to track key joints of the human body (Yang et al. 2016b; Yang and Lin 2014). The different depth cameras include stereo vision, time-of-flight (TOF), and structured light cameras (Rougier et al. 2011). A stereo vision camera constructs a depth image from two views of a scene (Rougier et al. 2011). The problems with this camera are that the system needs to be calibrated, is computationally expensive, and fails when the picture does not contain enough texture (Rougier et al. 2011). The system cannot work in low light conditions; this can be solved by adding an infrared light, but the loss of colour information can cause segmentation and matching difficulties (Rougier et al. 2011). The earliest depth camera was the time-of-flight 3D camera, but it is expensive to set up and is restricted to a low image resolution (Rougier et al. 2011; Yang and Lin 2014; Yang et al. 2016b). A time-of-flight image can be used to obtain partial volume information and gives a more precise depth image than stereo vision cameras, which helps in tackling occlusion problems (Rougier et al. 2011). The most popular depth sensor is the structured light camera, which includes the Kinect sensor (Rougier et al. 2011).
The Kinect sensor is a low-cost device that comprises a laser-based infrared (IR) emitter, an infrared camera, and an RGB camera (Kwolek and Kepski 2014). The Kinect uses infrared light to illuminate the objects in front of it and an infrared camera to observe them in invisible light, so fall detection can be performed at any time (Kwolek and Kepski 2015, 2016). A Kinect sensor can track body movements in 3D rather than 2D (Kwolek and Kepski 2014). A Kinect sensor can be used for human behaviour recognition and can detect falls throughout the 24-hour day-night cycle (Kwolek and Kepski 2016). The Kinect sensor is not affected by external light conditions, because depth inference is performed using an active light source (Kwolek and Kepski 2014). The Kinect sensor does not require calibration, since the features are extracted automatically (Kwolek and Kepski 2014). The limitation of the Kinect sensor is that sunlight interferes with the pattern-projecting laser, which makes it unsuitable for outdoor use (Kwolek and Kepski 2014).

3.3.2 Background subtraction and user tracking

Background subtraction, also known as foreground segmentation, is performed to extract the moving object from the image. The simplest background subtraction technique requires a reference image with no moving objects. The difference between the current frame and the reference image gives the moving object. The disadvantage of this technique is that it does not take into account lighting changes, shadow changes, or changes in the background due to short-term movements (Kreković et al. 2012). This can be solved using a Gaussian mixture background model or an approximate median filter (Debard et al. 2012; Kreković et al. 2012; Thome and Miguet 2006). Morphological operations can be applied to reduce noise in the background. The extracted object is tracked continuously until it leaves the camera’s field of view.
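
A minimal sketch of the two ideas above, frame differencing against a reference image and an approximate median background update, is given below for small greyscale images stored as nested lists. The function names, the threshold of 30 grey levels, and the step size are hypothetical choices for illustration.

```python
def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def update_background(frame, background, step=1):
    """Approximate median update: move each background pixel one step towards the frame."""
    return [[b + step if p > b else b - step if p < b else b
             for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

background = [[100, 100, 100],
              [100, 100, 100]]
# Slight lighting change in some pixels plus a bright object in the right column.
frame = [[102, 100, 180],
         [100, 97, 175]]

mask = foreground_mask(frame, background)
background = update_background(frame, background)
print(mask)
print(background)
```

The small lighting changes stay below the threshold and are slowly absorbed into the background, whereas the bright object is marked as foreground; a fixed reference image would not adapt in this way.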

3.3.3 Camera-based detection methods for fall detection

Camera-based detection methods can be split into shape change, inactivity, posture, and 3D head motion (Hakim et al. 2017; Luque et al. 2014; Wang et al. 2014; Yang et al. 2016b). Table 4 summarises the different methods used to detect a fall.

A simple 2D method for detecting a fall is to locate the person in the video and draw a bounding box around the person, as stated in Stone and Skubic (2015). The most common 2D feature extracted is the aspect ratio (Debard et al. 2012; Rougier et al. 2011). The aspect ratio is computed as the ratio of the width of the bounding box around the extracted object to the height of the extracted object (Debard et al. 2012). A small aspect ratio means the user is upright, whereas a high aspect ratio means the user is lying down (Debard et al. 2012). An ellipse provides more information than the bounding box, such as the fall angle (Rougier et al. 2011; Yang et al. 2016b). The fall angle is the angle between the long axis of the bounding ellipse and the horizontal direction (Debard et al. 2012). A small angle indicates that the person has fallen (Debard et al. 2012). The problem with using a bounding box alone is that it does not provide enough information about the human motion, and the performance of this technique depends on the camera view angle (Yun and Gu 2016). Analysing the aspect ratio can be inaccurate because of the position of the person, the camera, and occluding objects. Analysing a fall with a bounding box is effective only when the camera is placed sideways, and the accuracy of the system depends on occluding objects. Table 5 summarises the different camera-based fall detection studies.
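
The two shape features described above can be sketched for a binary foreground mask as follows. The aspect ratio is taken from the bounding box; the fall angle is estimated from second-order image moments, which is one common way to obtain the orientation of a fitted ellipse and is an assumption here rather than the method of the cited studies. The function names and the toy masks are hypothetical.

```python
import math

def _foreground_points(mask):
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]

def bounding_box_aspect_ratio(mask):
    """Width of the bounding box divided by its height."""
    pts = _foreground_points(mask)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (max(xs) - min(xs) + 1) / (max(ys) - min(ys) + 1)

def fall_angle_deg(mask):
    """Angle (0 to 90 degrees) between the major axis of the shape and the horizontal."""
    pts = _foreground_points(mask)
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pts) / n
    mu02 = sum((y - cy) ** 2 for _, y in pts) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pts) / n
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return abs(math.degrees(theta))

standing = [[1], [1], [1], [1], [1]]   # one pixel wide, five pixels tall
lying = [[1, 1, 1, 1, 1]]              # five pixels wide, one pixel tall

print(bounding_box_aspect_ratio(standing), fall_angle_deg(standing))
print(bounding_box_aspect_ratio(lying), fall_angle_deg(lying))
```

The upright shape gives a small aspect ratio and an angle near 90 degrees, while the lying shape gives a large aspect ratio and an angle near 0 degrees, in line with the interpretation given in the text.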

Table 4 The different types of camera-based fall detection methods
Table 5 Summary of camera-based studies

3.3.4 Problem of camera-based sensors

The accuracy of camera-based methods depends on how efficient and accurate the shape modelling methods are (Hakim et al. 2017). The problems of camera-based systems are occlusions, light conditions, coverage, privacy, cost, and high processing requirements.

3.3.4.1 Occlusions

Occlusions occur when furniture or other objects in a room are placed between the person and the camera, which can create false positives. When elderly people move to a smaller residence, they tend to take all these items with them, so the room becomes filled with them and the user is partially occluded when moving around the room (Debard et al. 2012). Image processing difficulties arise when changes occur in the monitored area, e.g. furniture being shifted around the room; these changes can also affect the accuracy of the system (Debard et al. 2012; Khan et al. 2015). For the bounding box method the RGB camera must be placed sideways, which can fail due to occlusions (Rougier et al. 2007). To solve this, the camera must be placed higher in the room so that it does not suffer from occlusions and has a greater field of view (Rougier et al. 2007). In this case, depending on the relative position of the person and the field of view of the camera, a bounding box will not be sufficient to discriminate a fall from a person sitting down (Rougier et al. 2007). To avoid occlusions, some researchers placed the camera on the ceiling, where the 2D velocity of the person is used for classification. The problem with 2D velocity is that it becomes high when the person is near the camera, which makes the threshold for differentiating falls from sitting down quickly difficult to define; 2D methods also suffer from occlusion problems, which can be solved using 3D vision systems (Ma et al. 2014; Rougier et al. 2007, 2011). Monitoring the whole body can fail when elderly people who struggle to walk use a walking aid such as a rollator or walking frame, which occludes the lower part of the body, and when objects are being carried (Debard et al. 2012; Hazelhoff et al. 2008). Head tracking can also be used to solve occlusion problems where objects cover the user (Foroughi et al. 2008).

3.3.4.2 Light

A camera system should be able to monitor the user in any light conditions (Debard et al. 2012; Khan et al. 2015). The different light sources in homes, such as sunlight, fluorescent light, light bulbs, and TV screens, and the different light intensities that occur during the day can result in overexposure in some parts of the image and degrade image quality (Debard et al. 2012; Garripoli et al. 2015; Kwolek and Kepski 2014; Luque et al. 2014; Nizam et al. 2017). Overexposure can be partly compensated through careful placement of the camera in the room (Debard et al. 2012). The problem with foreground extraction using traditional cameras is that it relies on background modelling in colour image space, which in reality is affected by lighting conditions and shadows (Kwolek and Kepski 2014, 2015, 2016; Stone and Skubic 2015). Colour-based shadow detection algorithms can be used to improve the output of the background subtraction algorithm, but these algorithms rely on the assumption that if an area is covered by a shadow, only the brightness of the image is affected and the colour information does not change (Debard et al. 2012). There is a higher risk of falls occurring in low lighting conditions than in normally illuminated conditions (Kwolek and Kepski 2016). To solve the problem of lighting conditions for single cameras, an active source of infrared (IR) light can be installed along with the camera; however, no colour information will be available for background modelling due to the IR illumination (Stone and Skubic 2015). Colour information is not available in near-infrared night images, and the colour images that are available during the daytime are not reliable (Debard et al. 2012). Depth cameras can solve the problem of lighting conditions and can work during both day and night (Rougier et al. 2011).

3.3.4.3 Cost and high processing

The infrastructure and installation of sensor equipment are expensive. Image quality in reality is much lower than in a lab experiment setup; this can be addressed by installing a high-quality camera, which can result in high cost (Debard et al. 2012). Camera-based systems require considerable computational power to run real-time algorithms (Kwolek and Kepski 2016). One way of minimising the computational power is to integrate the camera-based system with an accelerometer (Kwolek and Kepski 2016). The camera-based system only starts processing when a possible fall is detected by the accelerometer (Kwolek and Kepski 2016). The accelerometer is used to identify whether a possible fall has occurred, and the camera system is used to authenticate the fall (Kwolek and Kepski 2015). The frames are not processed continuously; instead they are stored in a circular buffer and only processed when a possible fall has been detected (Kwolek and Kepski 2015).
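The accelerometer-triggered scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Kwolek and Kepski: the class name, the buffer capacity, and the 2.5 g trigger value are hypothetical, and frames are represented by placeholder strings.

```python
from collections import deque


class TriggeredFrameBuffer:
    """Keep only the most recent frames; release them when the accelerometer fires."""

    def __init__(self, capacity, accel_threshold_g):
        self.frames = deque(maxlen=capacity)  # the oldest frame is dropped automatically
        self.accel_threshold_g = accel_threshold_g

    def push(self, frame, accel_magnitude_g):
        """Store the frame; return the buffered frames only if a possible fall is flagged."""
        self.frames.append(frame)
        if accel_magnitude_g > self.accel_threshold_g:
            return list(self.frames)  # hand these to the (expensive) vision stage
        return None


buffer = TriggeredFrameBuffer(capacity=3, accel_threshold_g=2.5)
for name in ["f1", "f2", "f3", "f4"]:
    buffer.push(name, accel_magnitude_g=1.0)  # normal activity, nothing is processed
to_process = buffer.push("f5", accel_magnitude_g=3.0)
```

The vision algorithm runs only on the few frames surrounding the suspected fall, which is what keeps the average computational load low.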

3.3.4.4 Coverage

Camera-based systems can only work indoors or where the devices are confined to, which results in blind spots, occlusions that cannot be detected, a limited field of view, and dead spaces (de la Concepción et al. 2017; Garripoli et al. 2015; Hakim et al. 2017; Luque et al. 2014; Khan and Hoey 2017; Kwolek and Kepski 2014; Nizam et al. 2017). Multiple cameras must be installed to solve these problems and provide continuous monitoring, which increases the cost of the system (Debard et al. 2012). A wide-angle camera can be used to provide coverage of the room, but the spatial resolution of the camera system decreases due to the wide-angle lens (Debard et al. 2012).

3.3.4.5 Privacy

The ethical issues associated with camera-based methods include the confidentiality and privacy of the monitored person, which make it difficult to monitor a person in the bedroom and bathroom (Hakim et al. 2017; Huang et al. 2016; Kwolek and Kepski 2014; Luque et al. 2014; Nizam et al. 2017). The problem with colour camera-based systems is that the images contain facial characteristics of users, which raises privacy concerns; these can be addressed by capturing low-quality images, using depth images, or applying image processing techniques such as silhouette extraction (Alwan et al. 2006; Kwolek and Kepski 2016; Stone and Skubic 2015). Even when privacy techniques are applied, people still have a feeling of “being watched” based on their perception of a camera system (Alwan et al. 2006; Kwolek and Kepski 2015). Instead of capturing the user, the environment scene can be captured, as in Ozcan et al. (2017) and Ozcan and Velipasalar (2016).

5 Personalization

Personal information can make the system smarter by adapting its parameters to different people (Miaou et al. 2006). If different body postures are not learnt, a high rate of false detections could result (Zhang et al. 2017). Threshold-based methods are the most popular, are easy to implement, and are computationally inexpensive, but they do not work across different people and do not provide a good trade-off between false positives and false negatives (Hu and Qu 2014; Khan and Hoey 2017). People have different body types, so using the same threshold in a fall detection algorithm will not work for everyone or will not be optimal (Miaou et al. 2006). With thresholds, it is difficult to adapt to new types of falls and to make the system work on different people (Luque et al. 2014; Khan and Hoey 2017). Falls of elderly people might last longer than those of young people (Ma et al. 2014). The values in threshold methods are determined without any theoretical or experimental basis, and such fall detection models fail because they cannot address inter-individual differences (Hu and Qu 2014). The basic idea behind personalization is to train the system using the user's data, which will result in higher accuracy.

5.1 Design of a personalized model

A classifier can be trained using user data, non-user data, or a combination of both. Supervised machine learning algorithms are poorly suited to the problem because real falls are rare and the fall data that are used come from simulated falls (Khan and Hoey 2017). Supervised algorithms can only classify the known classes on which they are trained (Khan and Hoey 2017). They also require the data to be labelled, which costs time and effort (Khan and Hoey 2017). Supervised classifiers cannot provide a person-specific solution for individuals (Yu et al. 2013). Because so little fall data is available, supervised classification algorithms may not work as desired, and the following techniques are needed: over/under-sampling, semi-supervised learning, cost-sensitive learning, and outlier/anomaly detection (Khan and Hoey 2017).

A large dataset containing data for different activities needs to be created to train a supervised classifier; if a person does not fit the dataset, e.g. if the person is obese, good performance cannot be obtained for that individual (Yu et al. 2013). Supervised learning algorithms require a balanced dataset with equal misclassification costs for the different classes (Khan and Hoey 2017). When unbalanced data are used to train the algorithms, they fail to distinguish the characteristics of the data, which results in low accuracy, and their predictions tend to favour the majority class (Khan and Hoey 2017). Class imbalance can be handled by performing cost-sensitive classification, where the costs of the different classification errors are treated differently (Khan and Hoey 2017). This can be accomplished by adding a cost matrix to a cost-insensitive classifier or by integrating a cost function into the classification algorithm to generate a cost-sensitive classifier (Khan and Hoey 2017). A cost matrix for a fall detection problem is defined by obtaining the optimal decision threshold of the classifier (Huang et al. 2011). Cost-sensitive analysis can be performed for fall detection using the Bayesian minimum risk or the Neyman-Pearson method (Huang et al. 2011). The optimal region of operation is determined on the ROC curve by varying the ratio of the cost of a missed fall to the cost of a false fall alarm (Huang et al. 2011). Generally, the ratios are fixed and should not depend on the dataset used (Huang et al. 2011). In practice, the costs are unknown and difficult to compute (Khan and Hoey 2017). In Debard et al. (2012), a weighted SVM is used to compensate for the imbalance between fall data and normal activity data from the camera. The weights are determined using cross-validation and a grid search that maximises the area under the ROC curve (Debard et al. 2012).
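The link between the cost ratio and the decision threshold can be made concrete with the textbook Bayesian minimum-risk rule. The sketch below assumes a classifier that outputs a probability of a fall and assumes zero cost for correct decisions; it is a generic illustration, not the procedure of Huang et al. (2011), and the cost values are arbitrary placeholders.

```python
def min_risk_threshold(cost_missed_fall, cost_false_alarm):
    """Probability above which raising an alarm has the lower expected cost.

    Raise an alarm when p * cost_missed_fall > (1 - p) * cost_false_alarm,
    which rearranges to p > cost_false_alarm / (cost_false_alarm + cost_missed_fall).
    """
    return cost_false_alarm / (cost_false_alarm + cost_missed_fall)


def decide_fall(p_fall, cost_missed_fall, cost_false_alarm):
    """Return True if the minimum-risk decision is to raise a fall alarm."""
    return p_fall > min_risk_threshold(cost_missed_fall, cost_false_alarm)


# With a missed fall assumed nine times as costly as a false alarm, the
# threshold drops from 0.5 to 0.1, so weak evidence already raises an alarm.
threshold = min_risk_threshold(cost_missed_fall=9.0, cost_false_alarm=1.0)
```

Sweeping the cost ratio moves the operating point along the ROC curve, which is why the choice of ratio, rather than the classifier alone, determines the balance between missed falls and false alarms.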

The lack of fall data could also be compensated for using sampling techniques (Khan and Hoey 2017). The fall class can be oversampled, or the normal activity class can be under-sampled, to train a supervised classifier (Khan and Hoey 2017). The disadvantage of oversampling is that it can lead to over-fitting if many artificial data points are generated that do not represent a fall (Khan and Hoey 2017). The disadvantage of under-sampling is that it can lead to under-fitting if the normal activities class is reduced to match the number of fall activities (Khan and Hoey 2017).
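A minimal sketch of both sampling strategies is given below. It uses plain random duplication and random removal, which is the simplest variant and only stands in for the techniques that generate artificial fall samples; the function names are hypothetical, and the fall class is assumed to be the smaller one.

```python
import numpy as np


def random_oversample(X, y, minority_label, rng):
    """Duplicate randomly chosen minority samples until both classes are equal in size."""
    minority = np.flatnonzero(y == minority_label)
    majority = np.flatnonzero(y != minority_label)
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    idx = np.concatenate([majority, minority, extra])
    return X[idx], y[idx]


def random_undersample(X, y, minority_label, rng):
    """Discard randomly chosen majority samples until both classes are equal in size."""
    minority = np.flatnonzero(y == minority_label)
    majority = np.flatnonzero(y != minority_label)
    kept = rng.choice(majority, size=len(minority), replace=False)
    idx = np.concatenate([kept, minority])
    return X[idx], y[idx]


rng = np.random.default_rng(0)
X = np.arange(20, dtype=float).reshape(10, 2)   # 10 feature vectors
y = np.array([0] * 8 + [1] * 2)                 # 8 normal activities, 2 falls
X_over, y_over = random_oversample(X, y, minority_label=1, rng=rng)
X_under, y_under = random_undersample(X, y, minority_label=1, rng=rng)
```

The example makes the trade-off visible: oversampling trains on 16 samples of which 6 are copies of just 2 falls, while under-sampling discards 6 of the 8 normal activity samples.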

Another approach is to apply temporal patterns, which can be used to describe and provide more information on the events that the user performs (De Maio et al. 2017). The temporal patterns are used to recognize or predict future events that the user may perform (De Maio et al. 2017). In De Maio et al. (2017), the system combines the temporal extension of Fuzzy Formal Concept Analysis (data driven) with Fuzzy Cognitive Maps (goal driven) for better decision making. The system recognizes the following events: tiredness, sleeping, having breakfast, and having dinner (De Maio et al. 2017).

Classifiers that require only normal activities for training, which eliminates the data imbalance between fall and normal activities, are known as unsupervised machine learning algorithms (Khan and Hoey 2017). The problem is that if the normal behaviour is not properly learned, the system can produce a large number of false positives, as a slight variation from a normal activity can be detected as a fall (Khan and Hoey 2017). The classifier needs to adapt and learn new activities in order to reduce the false alarm rate when detecting falls (Popescu and Mahnot 2009). The advantage of the unsupervised approach is that the classifier can easily adapt to new data without worrying about data imbalances (Popescu and Mahnot 2009). Table 6 summarises the systems that make use of personalized models.

The most basic form of personalization is customizing the threshold based on personal characteristics such as height and weight (Miaou et al. 2006). In Miaou et al. (2006), an omni-camera is used to record the activities, and a bounding box is placed around the user. The system requires a background image with no user present (Miaou et al. 2006). To detect a fall, the foreground is extracted by performing background subtraction (Miaou et al. 2006). A fall is detected if the aspect ratio of the bounding box is greater than a predefined threshold value (Miaou et al. 2006). The predefined threshold value is customized based on the following personal information: height, weight, and electronic health history (Miaou et al. 2006). The personal information is used to adjust the detection sensitivity, which reduces false alarms and provides more attention to elderly people with specific needs (Miaou et al. 2006). The electronic health history is used to increase the detection sensitivity automatically if the person has cardiovascular disease or has fallen before (Miaou et al. 2006). In Cao et al. (2012), a smartphone system uses user information such as the ratio of height and weight, sex, and age to adjust the threshold value and the sampling of the acceleration data. The acceleration in the directions of the three axes is extracted from the tri-axial acceleration sensor (Cao et al. 2012). The system calculates the SMV (Cao et al. 2012). Based on the BMI, age, and sex of the user, the maximum and minimum acceleration thresholds and the sampling frequency are determined from the range into which the personal information falls (Cao et al. 2012). A fall is classified based on the thresholds, and the system achieves a sensitivity of 92.75% and a specificity of 86.75% (Cao et al. 2012).
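A two-threshold accelerometer rule of this kind can be sketched as follows. The sketch assumes that SMV denotes the magnitude of the three-axis acceleration vector and that a fall shows up as a dip below the minimum threshold followed by a peak above the maximum threshold; both the decision logic and the threshold values are illustrative assumptions, and the mapping from BMI, age, and sex to thresholds used by Cao et al. (2012) is not reproduced here.

```python
import math


def smv(ax, ay, az):
    """Magnitude of the acceleration vector, in the same unit as the inputs (here g)."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def detect_fall(samples, low_g, high_g):
    """Flag a fall when an SMV dip below low_g is later followed by a peak above high_g."""
    dipped = False
    for ax, ay, az in samples:
        magnitude = smv(ax, ay, az)
        if magnitude < low_g:
            dipped = True          # free-fall-like phase
        elif dipped and magnitude > high_g:
            return True            # impact after the dip
    return False


# Hypothetical per-user thresholds; a personalised system would look these up
# from the user's profile instead of hard-coding them.
user_thresholds = {"low_g": 0.5, "high_g": 2.5}
fall_like = [(0.0, 0.0, 1.0), (0.0, 0.0, 0.3), (0.0, 0.0, 2.8)]
walking = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.1), (0.0, 0.1, 0.9)]
```

Personalization then amounts to changing only the two threshold values per user while the detection logic stays the same.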

In Medrano et al. (2016), a study was conducted to compare personalised and non-personalised systems using a smartphone accelerometer. Three unsupervised methods were implemented, namely NN, OCSVM, and local outlier factor (LOF), along with one supervised method, SVM (Medrano et al. 2016). The study was divided into two stages: the first stage determined which unsupervised method was the best, and the second stage determined how personalization performs with both the best unsupervised method and the supervised method (Medrano et al. 2016). The raw data from the three axes of the accelerometer are fed into the classifiers (Medrano et al. 2016). In the first stage, it was found that NN outperformed the other unsupervised methods (Medrano et al. 2016). For the second stage, the personalized NN model is trained with the normal activities of the user, whereas the non-personalized model is trained with the normal activities of other people (Medrano et al. 2016). The personalized SVM model is trained with the normal activities of the user and the fall activities of other people, whereas the non-personalized model is trained with both the normal and fall activities of other people (Medrano et al. 2016). It was found that both personalized models, NN and SVM, outperformed their non-personalized counterparts (Medrano et al. 2016). The personalized SVM model achieved a slightly higher geometric mean of 0.9764 compared to 0.9688 for the personalized NN model (Medrano et al. 2016). Nevertheless, the NN model is preferable to the SVM model because it can adapt to new data and can recognize more fall types.
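For readers unfamiliar with the metric, the geometric mean used to compare the models can be computed as in the sketch below. It assumes the common definition, the square root of sensitivity times specificity; the confusion-matrix counts in the example are made up and are not results from Medrano et al. (2016).

```python
import math


def geometric_mean_score(tp, fn, tn, fp):
    """Square root of sensitivity times specificity, computed from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of falls that were detected
    specificity = tn / (tn + fp)   # share of ADLs that were correctly ignored
    return math.sqrt(sensitivity * specificity)


# Made-up counts: 90 of 100 falls detected, 80 of 100 ADLs correctly ignored.
score = geometric_mean_score(tp=90, fn=10, tn=80, fp=20)
```

Unlike plain accuracy, this score collapses to zero if either falls or ADLs are never recognised, which makes it suitable for imbalanced fall data.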

Another approach is to adapt the classifier to accept new ADL data and re-train it in order to learn the user's movements. In Medrano et al. (2014), a smartphone tri-axial accelerometer was used with an NN classifier, where the captured acceleration magnitude data are compared to the ADL data stored on the smartphone. A fall is detected when the difference between the stored pattern and the incoming pattern is high (Medrano et al. 2014). A new ADL record is added every time the system classifies the incoming data as ADL, replacing an old ADL record (Medrano et al. 2014). To reduce processing power and computational time, the system only classifies when the magnitude of the acceleration is greater than 1.5 g and if a long lie occurs (Medrano et al. 2014). The advantage of the NN classifier is that it is easy to add new data and it does not require simulated falls for training the system (Medrano et al. 2014). The simulated fall data were used only for testing the classifier (Medrano et al. 2014). The disadvantages of the system are that it cannot detect soft falls and that it relies on long lie. If a person attempts to get up from a fall but fails each time during the long-lie period, the system would not detect a fall event (Medrano et al. 2014).
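The adaptive scheme described above can be sketched as a novelty detector over stored ADL windows. The sketch assumes that NN denotes a nearest-neighbour classifier, uses Euclidean distance between fixed-length windows of acceleration magnitude, and omits the long-lie check; the class name, the distance threshold, and the store capacity are hypothetical. Only the 1.5 g trigger is taken from the text.

```python
from collections import deque

import numpy as np


class AdaptiveNoveltyDetector:
    """Flag windows that are far from every stored ADL window; learn the rest."""

    def __init__(self, adl_windows, distance_threshold, capacity, trigger_g=1.5):
        self.store = deque((np.asarray(w, dtype=float) for w in adl_windows), maxlen=capacity)
        self.distance_threshold = distance_threshold
        self.trigger_g = trigger_g

    def classify(self, window):
        w = np.asarray(window, dtype=float)
        if w.max() <= self.trigger_g:
            return "skipped"       # below the trigger, no classification is run
        nearest = min(np.linalg.norm(w - s) for s in self.store)
        if nearest > self.distance_threshold:
            return "fall"          # unlike any stored ADL pattern
        self.store.append(w)       # the oldest ADL record is replaced automatically
        return "adl"


detector = AdaptiveNoveltyDetector(
    adl_windows=[[1.0, 1.6, 1.0], [1.0, 1.8, 1.1]],
    distance_threshold=0.5,
    capacity=2,
)
results = [
    detector.classify([1.0, 1.0, 1.0]),   # quiet signal
    detector.classify([1.0, 1.7, 1.0]),   # close to a stored ADL
    detector.classify([0.3, 3.0, 0.9]),   # far from all stored ADLs
]
```

No fall samples are needed to set up the detector, which mirrors the advantage noted in the text; the distance threshold is the only parameter that has to be tuned.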

Table 6 Summary of personalized fall detection systems

6 Discussion

Fig. 4 shows the trend in the number of studies conducted on wearable, ambient, and camera-based sensors. It shows that wearable sensors have attracted the most research interest in the last five years.

Fig. 4 Trends in fall detection based on the number of studies published each year

High classification accuracy is reported in almost all fall detection studies, but these studies were conducted on a limited number of subjects, fall types, and activities (Bagalà et al. 2012; Pannurat et al. 2017). Simulated falls are used because it is extremely hard to collect real-world fall data from elderly people, since falls are infrequent events: about 30% of the elderly population over the age of 65 fall at least once per year (Bagalà et al. 2012). Current fall detection studies are only tested in controlled experiments, where they achieve high accuracy, but when the systems are placed in the real world their accuracy decreases (Ozcan et al. 2017). Studies test the specificity on ADL through laboratory experiments performed by the same subjects who generate the fall data (Bagalà et al. 2012). These data could be biased, since subjects are forced to perform activities that are typically spontaneous (Bagalà et al. 2012). The use of a mattress to reduce the impact of falls and protect the volunteers from injuries can reduce the accuracy of the system when it is applied in the real world (Bagalà et al. 2012).

It is difficult to compare the different fall detection studies fairly, since each study made use of its own dataset collected under different conditions (Igual et al. 2015). Each study validated its research with different data collection protocols, subject groups, and environment settings; hence the studies cannot be directly compared to previous ones (Pannurat et al. 2017). One factor that influences performance is the number of training samples used for training the system (Igual et al. 2015). The main problem of acceleration-based studies is that they are difficult to compare, since each study uses its own dataset composed of simulated falls and ADL (Igual et al. 2015). It is difficult to judge whether the results obtained from these studies are influenced by the dataset compiled, and it is impossible to make a comparison since the datasets used in each study are different (Igual et al. 2015). Since these devices are required to be worn for long periods or the whole day, a complete dataset is required, whereas the datasets in fall detection studies are limited (Ozcan et al. 2017).

In Bagalà et al. (2012), accelerometer-based fall detection algorithms were evaluated on 29 real-world falls to assess their effectiveness in detecting falls in real-life events. The results show reduced sensitivity and specificity compared to those obtained in an experimental environment (Bagalà et al. 2012). The algorithms achieved an average specificity of 83.0% and an average sensitivity of 57.0%, which are much lower than in the simulated environment (Bagalà et al. 2012). The algorithms generated a large number of false alarms, ranging from 3 to 85 in a one-day monitoring period (Bagalà et al. 2012). The results of the study encourage researchers to take real-life activities into consideration (Bagalà et al. 2012). The problem with these studies is that they cannot work in the real world since no training data for falls were used, and low accuracy will be achieved since a classifier cannot predict a fall that it has never observed before (Khan and Hoey 2017). Collecting fall data is impractical, as it requires a person to perform a real fall, which can result in serious injuries (Khan and Hoey 2017). About 94% of fall detection studies used simulated falls from laboratory experiments for training the classifiers (Schwickert et al. 2013). This shows the difficulty of obtaining real fall data (Schwickert et al. 2013). Instead of real falls, artificial falls are collected in a controlled laboratory environment, and these do not represent actual falls (Khan and Hoey 2017). The advantage of artificial falls is that they provide information on how falls occur, but they do not make falls easier to detect (Khan and Hoey 2017). Classifiers which use artificial falls as training data can over-fit, which can cause poor decisions on actual falls (Khan and Hoey 2017). Fall data are limited in quantity and are subject to ethical clearance (Khan and Hoey 2017). To get accurate fall data, long-term experiments need to be conducted in nursing homes using wearable sensors, ambient sensors, or camera-based methods (Khan and Hoey 2017).

The main problem of vision-based systems is the absence of flexibility, as these systems are case specific: they are designed and optimized for certain situations or scenarios (Luque et al. 2014). The algorithms in camera-based studies are evaluated on data collected in controlled environments under optimal conditions, such as perfect illumination and simple scenarios or scenes, with falls simulated by actors (Debard et al. 2012). The challenges of real-life data compared to simulated data are that the image quality is low and that falls are rare and vary a lot in terms of speed and nature (Debard et al. 2012). Most studies make use of simulated data, where the falls have been recorded in artificial environments and are performed by young people (Debard et al. 2012).

Each individual has different characteristics and motion patterns compared to the people used in the training data (Hu and Qu 2014). Another problem is that it is difficult to detect all ADLs, since the classifier must be trained with each type of ADL (Popescu and Mahnot 2009). The classifier needs to adapt and learn new activities in order to reduce the false alarm rate when detecting falls (Popescu and Mahnot 2009). It is difficult to detect the different types of falls for different people, since a fall has different acceleration characteristics and the magnitude of acceleration varies greatly among body types (Ozcan and Velipasalar 2016). The phone placement also differs from person to person (Ozcan and Velipasalar 2016). A limitation of current fall detection studies is that the shape or strength of the measured signals differs depending on whether healthy adults or elderly people wear the fall detector and whether the falls are simulated or real, with possibly relevant effects on the design and performance of the fall detection algorithm (Sabatini et al. 2016). For overweight users, ADLs such as lying down and sitting down can generate high impacts, which can be misclassified as falls (Pannurat et al. 2017). Falls with recovery and backward collapses, where users end up in a sitting position, result in misclassification (Pannurat et al. 2017). An actual free fall is not produced due to the cautiousness of subjects, which leads to improper fall detection (Ozcan et al. 2017). Even when safety precautions are in place, subjects are still too afraid to fall (Ozcan et al. 2017). The rate of fall occurrence is low, which results in insufficient or no data (Khan and Hoey 2017). Different types of falls can occur, which makes them very difficult to model (Khan and Hoey 2017).

The solution is to create a personalized system which adapts to and learns the user's movements. By learning the user's movements, the system will be able to recognize a wide range of ADLs without forcing the user to perform certain activities. One way to achieve a personalized system is to use an unsupervised machine learning algorithm, which can easily adapt to new data without worrying about data imbalances (Popescu and Mahnot 2009). An unsupervised algorithm would only need to be trained with ADLs, which are easier to capture than fall activities. The biggest advantage of a personalized system is that it will work for anybody, regardless of their weight and height.

7 Conclusion

In this paper, the different existing fall detection systems were discussed and analysed; each has its own advantages and disadvantages. The accuracy of a system depends on the sensors used and the type of classification. Wearable and camera-based sensors are more popular than ambient sensors. Ambient sensors are highly influenced by the environment. A wearable sensor can consist of a device with MEMS sensors or a smartphone, and the system can include a false alarm button. The main disadvantages of camera-based sensors are the limited coverage and the performance being affected by objects in the environment. The main disadvantages of wearable devices are that they are intrusive and that placing the device on the human body is uncomfortable. Wearable sensors are the preferred method as they are practical, allow for continuous monitoring, and are not influenced by the environment. Wearable sensors also provide outdoor monitoring and can be used to collect real data cost-effectively. A smartphone can be used as a wearable device since many people have one, and it is not intrusive. The device can be placed in the user's pocket, where it would not interfere when the user is performing ADLs. Experimental systems are limited to the laboratory setting and to certain ADLs, and would not work in reality. Personalization is key in fall detection systems, since it not only increases the accuracy of the system but also allows the system to adapt and learn new activities. Adapting to new activities can be done by implementing an unsupervised machine learning algorithm, since data imbalance would not be an issue.