1 Introduction

In recent years, technology has become increasingly prominent in the healthcare sector. Geriatric healthcare has gained significance owing to advances in medicine and shifting population demographics [1,2,3]. Machine learning applications in geriatric healthcare include vital-sign monitoring, sleep-pattern analysis, behavioural testing, and fall detection. In fall detection, machine learning helps detect falls efficiently from the subject's patterns of behaviour. The classifiers or models used for fall detection operate on datasets collected either by sensors [4,5,6,7,8] or by cameras [9,10,11,12].

Falls are among the most common health problems in the elderly, leading to lower quality of life, increased morbidity, and higher mortality. Falls also impose a significant financial burden on public health services. Risk factors for falls include cognitive decline, muscle weakness, unsteady gait, and the intake of psychoactive medications. Timely diagnosis and management of these risk factors can greatly decrease the chance of future falls [13, 14]. Among these factors, a history of falls and gait and balance disorders have been established as strong predictors.

According to the World Health Organization (WHO), falls are the second leading cause of unintentional or accidental death [13]. Following an accidental fall, elderly people can suffer muscle weakness, permanent dysfunction, painful treatment, loss of independence, or worse. Early detection of falls can reduce the risk of future incidents.

Recently, researchers in both the technological and medical fields have attempted to reduce the chance of falling and to cope with its consequences by providing better care when falls recur. Falls are without doubt among the most serious accidents that can occur in the elderly. They can lead to several unpredictable conditions and health issues, such as fractures and spinal cord surgery. One very important concern is the “long lie”, in which a person falls to the floor and remains there for a prolonged period because they are unable to get up; this can be fatal. Elderly people suffer greatly from fatal falls. Research shows that approximately 646,000 elderly people die from falls every year, 80% of them in low- and middle-income countries [14]. The death rate from unintentional falls is increasing; in the USA, approximately 30,000 people aged over 65 died from falls in 2016 [15]. It has also been observed that the fall-related death rate is higher for elderly men than for elderly women [16].

Falls have many causes. Some elderly people fall because of obesity, diabetes, loss of a limb, or vascular disease [17], and some falls result from physical injuries. In addition, some elderly people face psychological issues, including anxiety, depression, activity disorder, and fear of falling.

Anxiety about falling is the most common psychological problem among the elderly, and because of it they restrict themselves and their daily activities. According to one study, 60 percent of elderly people limit their activities because of a fear of falling [18]. These activity restrictions lead to muscle weakening, which reduces strength and independence. As a consequence, falls can recur. The human body also grows weaker with increasing age. Approximately 30-50% of elderly people living in care centers fall each year, and 40% of them fall repeatedly [19].

The incidence of fall-related injuries, such as fractures, spinal cord injuries, vertebral injuries, and pelvic injuries, has increased by up to 131% over the past two to three decades [19]. It is estimated that by 2050 at least one in every five people will be 65 or older [20], meaning that, if preventative steps are not taken, a large number of elderly people will experience falls.

The effects of a fall are not limited to serious physical injury but can include psychological issues. These psychological concerns consist of activity disorder, anxiety, depression, fear of falling, and factitious activity restriction [21].

Fall prediction is a method of estimating the probability of a possible fall. In a clinical setting, the risk of falling is assessed using a questionnaire and various functional measures, for example the Performance-Oriented Mobility Assessment (POMA) [20], the Timed Up and Go (TUG) test [19], and the Berg balance test [19]. These tests are very helpful and provide a good indication of performance.

Furthermore, the major and long-term effects of falling can be reduced by identifying the causes of falls earlier and offering prompt medical assistance. A fall detection system can therefore help to alleviate these issues by raising a warning when a fall occurs.

A great deal of research has addressed this problem. Some authors proposed fall detection systems that use individual sensors, such as a gyroscope or an accelerometer, which play a significant role in detecting activities. Because sensors do not raise privacy issues, researchers have been able to detect activities easily with mobile sensors. Nevertheless, some systems were proposed using only activities of daily living (ADL), many cover only a few activities, and some cannot distinguish a fall from an activity of daily living. In deep learning, convolutional neural networks have produced state-of-the-art results in most classification problems. Recurrent neural networks such as long short-term memory (LSTM) have proven to be among the most successful models for sequential or historical data, for example in human activity recognition.

The rest of the manuscript is organized as follows. Section 2 presents the factors affecting the risk of falls. Section 3 reviews the related work. Section 4 describes the research methodology. Section 5 discusses the experiments and results, and Section 6 concludes the paper.

2 Factors Affecting the Risk of Falls

Several basic factors affect falls in elderly people. In general, there are three main groups of factors, as shown in Fig. 1. Physiological factors include age and various chronic conditions, such as spinal cord and vertebral bone injuries. Environmental factors concern the person's surroundings, such as a slippery floor or poor lighting, which contribute strongly to falls. Finally, psychological factors include fear of falling, mental disability, and intoxication.

Fig. 1

Hierarchy of the factors affecting falls

3 Related Work

Fall monitoring systems use different preprocessing techniques, which depend on the variables obtained from the different sensors. Most fall detection systems follow one of two categories of data preprocessing methods: analytical methods and machine learning.

For fall prediction, analytical methods apply statistical techniques. Well-known analytical approaches include fuzzy logic [22], thresholding [22, 23], hidden Markov models [24], and Bayesian filtering techniques [24]. Among these, thresholding is the method most researchers have used for fall detection. In this method, a fall is predicted when the characteristic shape of a specific feature is detected in the signals. Moreover, the work in [25] demonstrates an architectural approach to an autonomic healthcare management system for fall detection using wearables, IoT, and cloud computing. Camera-based systems use image preprocessing techniques for fall detection [26], whereas ambient-based systems use sensing techniques to monitor the data [27].
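The thresholding idea can be illustrated with a minimal Python sketch. It flags a fall when the magnitude of a three-axis acceleration sample exceeds an impact threshold. The function names, the 2.5 g threshold, and the synthetic samples are assumptions made for illustration; they are not taken from [22, 23], and practical systems usually combine several thresholds and features.

```python
import math


def acceleration_magnitude(ax, ay, az):
    """Euclidean norm of one three-axis accelerometer sample (in g)."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def detect_fall_by_threshold(samples, impact_threshold_g=2.5):
    """Return the index of the first sample whose magnitude exceeds the
    threshold, or None if no sample does.

    impact_threshold_g is a hypothetical value chosen for illustration.
    """
    for index, (ax, ay, az) in enumerate(samples):
        if acceleration_magnitude(ax, ay, az) > impact_threshold_g:
            return index
    return None


# Synthetic data: standing still (about 1 g), followed by a sharp impact.
standing = [(0.0, 0.0, 1.0)] * 5
impact = [(1.8, 1.5, 2.0)]
fall_index = detect_fall_by_threshold(standing + impact)
```

A single fixed threshold is simple and cheap, but it cannot by itself separate a fall from a vigorous daily activity, which is one reason the data-driven methods discussed next are used.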

Machine learning methods are data-driven models that learn patterns from the data. Before the algorithms are trained, a feature set is extracted from the dataset; after training, the model is tested. According to the literature, machine learning algorithms used for fall prediction include artificial neural networks (ANNs), support vector machines (SVMs), Naïve Bayes, multilayer perceptrons, and K-nearest neighbors [28]. The work in [29] focuses on an unobtrusive fall detection system. The proposed implementation, WiFall, achieved 90 percent detection precision with an SVM classifier.

Additionally, WiFall achieved an average fall detection precision of 94 percent using the Random Forest algorithm. These algorithms can also be used to predict and detect future falls.

In past years, researchers have explored many methods for predicting and detecting falls. In machine learning, most researchers have used support vector machines (SVMs) and decision trees as classifiers for predicting falls [20, 30]. In [31], the author extracted acceleration and angular velocity features using gradient histograms and the Fourier transform, and then used SVM and K-nearest neighbor (KNN) classifiers to detect fall activities in two datasets. In [32], the researchers constructed a monitoring system using six inertial measurement units. After analysis, a statistical test was used to form a feature set, and a random forest classifier was then used to predict the activities, achieving an overall precision of 84%.

The researchers in [33] focused on energy efficiency in fall detection. Data from a 3D accelerometer embedded in a smartphone was processed using a combination of threshold-based methods and ML-based algorithms (K-Star, Naive Bayes, and J48). The authors in [34] applied a windowing method to split the sensor signals into time frames. The activity in each window was then classified to determine whether it corresponded to a fall.
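A windowing step of this kind can be sketched in a few lines of Python. The window length, the step size, and the function name below are illustrative assumptions; the source does not state the values used in [34].

```python
def sliding_windows(signal, window_size, step):
    """Split a sequence of samples into fixed-length windows.

    Windows overlap when step < window_size. Trailing samples that do not
    fill a complete window are dropped.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    return [
        signal[start:start + window_size]
        for start in range(0, len(signal) - window_size + 1, step)
    ]


# Ten samples, windows of four samples with 50% overlap (assumed values).
samples = list(range(10))
windows = sliding_windows(samples, window_size=4, step=2)
```

Each window would then be passed to a classifier that decides whether it contains a fall.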

Traditional machine learning algorithms rely on manual feature extraction techniques, which can vary for each dataset [35, 36]. This is one of the most challenging tasks in the machine learning domain: if appropriate and meaningful features are not extracted, the model cannot learn the underlying patterns in the data. To deal with this problem, researchers moved to deep learning algorithms, in which feature extraction is performed automatically while the model is trained, and this approach provides state-of-the-art results. Successful applications of deep learning models can be seen in image processing, natural language processing, dimensionality reduction, prediction, and forecasting. Deep learning algorithms are based on a variety of architectures, for instance Deep Belief Networks (DBNs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Stacked Auto-encoders (SAEs), Long Short-Term Memory (LSTM), and Generative Adversarial Networks (GANs).

In [37], the author used a CNN with three convolutional layers and one fully connected layer to identify human fall activities, converting three-axis accelerometer data into an image. In [38], the authors proposed a deep learning model for fall detection with an LSTM architecture on the publicly available WISDM dataset and achieved 95%, which is quite good, but they worked on only one dataset, which contains six activities. In general, deep learning models are quite complex, and these architectures are therefore difficult to train. Despite their remarkable performance, these models are highly computationally intensive. In view of these problems, we implemented a CNN-transfer learning model with few parameters for fall detection. Our model detects falls effectively with less computational power and extracts features automatically.

In [39], the author used a combination of simulated and real-world datasets to predict falls, applying several methods such as SMOTE, BalanceCascade, and ranking models, with transfer learning as the classifier. They observed that the model trained better with the simulated dataset, reaching an accuracy of 95%.

Additionally, recent research shows that human fall detection systems now widely use video data. For instance, in [40] deep learning is used to construct an automatic fall detection system from a dataset of falling and non-falling images. The findings demonstrate that the convolutional neural network (CNN) method is preferable for classifying falling and non-falling images. The researchers in [41] used video motion data to generate new temporal templates, which are then employed by a convolutional neural network to detect human movements. Fall detection is further assisted by the temporal template representation computed at the start and finish of the fall event. Technological innovation and emerging devices, such as Kinect for Xbox 360, open up new avenues for building advanced systems that might be used to monitor older people as they go about their daily lives. Kinect for Xbox 360 is an inexpensive device that keeps track of an individual's actions. It is used by elderly people who are undertaking rehabilitation exercises at home [42].

Table 1 highlights specific studies on fall prediction, particularly those based on sensors, and thereby summarizes the related work. Our main focus is to train ML models that identify falls more effectively from input gathered by sensors.

Table 1 Related work for falls prediction using Machine learning for data collected from sensors

4 Research Methodology

In this research, deep learning models are developed and trained as classification models to improve the prediction of falls. Among the numerous available deep neural network architectures, our research focuses on the implementation of LSTM and CNN-transfer learning, which are described in detail in the subsequent sections. We evaluate the implemented networks by comparing them with each other and with some other machine learning approaches. A notable advantage of deep learning architectures over these classifiers is automatic feature extraction and learning from data, which improves the accuracy of the models. Moreover, these architectures break hard steps into several easier ones, which helps in handling deeper networks. The research methodology followed in this work is shown in Fig. 2. The models are trained and tested on human fall activities using a publicly accessible dataset. First, the downloaded data is preprocessed. Next, feature extraction is performed. Finally, an appropriate model for fall detection is created, trained, and tested, and its performance is measured.

Fig. 2

Work flow diagram of research

4.1 Data Gathering

The first step in data-driven modeling is to collect an appropriate dataset. The dataset used in our research is publicly available on Kaggle (https://www.kaggle.com/pitasr/falldata). It consists of 16 types of ADL and 20 types of falls. It includes acceleration from the accelerometer, rotation from the gyroscope, and orientation and direction from the magnetometer/compass. The dataset includes both young and elderly participants. We also considered another dataset, the UFRD dataset, which contains 30 fall activities and 40 daily living activities.

4.2 Data Preprocessing

Once the data was gathered, we proceeded to preprocessing. Preprocessing is imperative in order to remove unnecessary signals, disturbances, and null values from the data, which supports better classification by the machine learning algorithm. Different methods and techniques exist for preprocessing and smoothing; in the proposed methodology we selected a Butterworth filter, an infinite impulse response (IIR) filter, with a cut-off frequency. This provides a smoother signal and is less computationally intensive.
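As a rough illustration of this smoothing step, the following Python sketch designs a second-order Butterworth low-pass filter with the bilinear transform and applies it as an IIR difference equation. The filter order, the 5 Hz cut-off, and the 50 Hz sampling rate are assumptions made for the example; the text does not report the values actually used.

```python
import math


def butterworth_lowpass_coefficients(cutoff_hz, sample_rate_hz):
    """Coefficients of a second-order Butterworth low-pass filter,
    derived with the bilinear transform and frequency pre-warping.

    Returns (b, a) with b = (b0, b1, b2) and a = (a1, a2).
    """
    if not 0 < cutoff_hz < sample_rate_hz / 2:
        raise ValueError("cutoff must lie between 0 and the Nyquist frequency")
    k = math.tan(math.pi * cutoff_hz / sample_rate_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (2.0 * (k * k - 1.0) * norm,
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)
    return b, a


def iir_filter(signal, b, a):
    """Apply y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2],
    starting from zero initial conditions."""
    x1 = x2 = y1 = y2 = 0.0
    output = []
    for x0 in signal:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        output.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return output


# Assumed values: 50 Hz sampling rate and 5 Hz cut-off frequency.
b, a = butterworth_lowpass_coefficients(cutoff_hz=5.0, sample_rate_hz=50.0)
smoothed_constant = iir_filter([1.0] * 200, b, a)
```

A constant input passes through unchanged once the filter has settled, while a component at the Nyquist frequency is suppressed, which is the low-pass behaviour expected of the smoothing step.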

4.3 Feature Extraction

After the data has been preprocessed, the next phase is to obtain meaningful representations of the data as features. We selected four basic attributes as significant features to represent the data for modelling and training: mean, variance, minimum amplitude, and maximum amplitude, as listed in Table 2. All of these features are extracted from the three sensors (accelerometer, gyroscope, and magnetometer) along the x, y, and z axes. We evaluated the feature extraction for 15,120 records (36 motions × 14 volunteers × 5 trials × 6 sensors). The extracted features are shown in the table along with their mathematical equations.
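The four features can be computed per axis as in the following Python sketch, which also checks the record count given above. The function name and the choice of population variance are assumptions; the exact definitions are those of Table 2, which is not reproduced in this section.

```python
import statistics


def extract_features(axis_signal):
    """Return the four summary features named in the text for one axis.

    Population variance is assumed here; a sample variance would use
    statistics.variance instead.
    """
    return {
        "mean": statistics.mean(axis_signal),
        "variance": statistics.pvariance(axis_signal),
        "min_amplitude": min(axis_signal),
        "max_amplitude": max(axis_signal),
    }


# Synthetic one-axis signal used only to exercise the function.
features = extract_features([1.0, 2.0, 3.0, 4.0])

# Record count stated in the text: motions x volunteers x trials x sensors.
n_records = 36 * 14 * 5 * 6
```

Applying the function to each of the x, y, and z axes of every sensor yields one feature vector per record.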

Table 2 Significant features along with the mathematical interpretation

4.4 Deep Learning Architectures

For classification, our main focus was to explore the performance of LSTM and CNN-transfer learning, with the aim of achieving the best possible accuracy and the lowest error rate. These models are explained in detail below.

A. Long Short Term Memory (LSTM)

LSTM is an extended version of the RNN with a multilayer cell structure, a state memory (often termed internal memory), and multiplicative gates. The algorithms used to update the weights in RNNs are usually gradient based, which leads to vanishing or exploding gradient problems; LSTM was developed to address these problems. These models are particularly useful for approximating dynamic structures that deal with time- and order-based data such as video, audio, and signals. We chose LSTM because, compared with other sequential models, it overcomes the vanishing gradient problem and is capable of learning long-term dependencies [43]. A single LSTM memory cell is illustrated in Fig. 3.

Fig. 3

A single unit of LSTM node

The conventional LSTM architecture [44] is often described mathematically by the composite equations (1), (2), (3), (4), and (5).

$$i_t = \sigma\left(W_i\left[h_{t-1}, x_t\right] + b_i\right)$$
(1)
$$f_t = \sigma\left(W_f\left[h_{t-1}, x_t\right] + b_f\right)$$
(2)
$$o_t = \sigma\left(W_o\left[h_{t-1}, x_t\right] + b_o\right)$$
(3)
$$\tilde{C}_t = \tanh\left(W_c\left[h_{t-1}, x_t\right] + b_c\right)$$
(4)
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
(5)

Here, i, f, o, and c are the input gate, forget gate, output gate, and cell activation vectors, respectively. The hidden states are denoted by h, W denotes the weight matrices from the cell to the gate vectors, and b denotes the bias vectors. σ is the logistic sigmoid function. The tanh function is responsible for creating the new candidate vector that is added to the cell state.
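To make the gate equations concrete, the following pure-Python sketch performs one LSTM time step following Eqs. (1)-(5). The parameter layout (a dictionary of weight and bias pairs) and the function names are illustrative assumptions. The final hidden-state update, h_t = o_t * tanh(C_t), is the standard LSTM formulation and is not among the five equations listed above.

```python
import math


def sigmoid(z):
    """Logistic sigmoid function."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vector, bias):
    """Compute W.v + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b_k
            for row, b_k in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    params maps each of "i", "f", "o", "c" to a (W, b) pair, where W has
    one row per hidden unit and len(h_prev) + len(x_t) columns.
    """
    concat = h_prev + x_t  # the concatenation [h_{t-1}, x_t]

    def gate(name, activation):
        weights, bias = params[name]
        return [activation(z) for z in affine(weights, concat, bias)]

    i_t = gate("i", sigmoid)        # input gate
    f_t = gate("f", sigmoid)        # forget gate
    o_t = gate("o", sigmoid)        # output gate
    c_tilde = gate("c", math.tanh)  # candidate cell state
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    # Standard hidden-state update (assumed; not listed in the text).
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy example: one hidden unit, one input feature, all parameters zero.
zero_params = {name: ([[0.0, 0.0]], [0.0]) for name in "ifoc"}
h_t, c_t = lstm_step([1.0], [0.0], [2.0], zero_params)
```

With all parameters set to zero, every gate equals 0.5 and the candidate is 0, so the cell state is simply halved; this makes the role of the forget gate easy to verify by hand.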

In the first experiment, the LSTM was developed and trained with 32 neurons (see Table 3 for the parameter settings). The model has three layers. The LSTM architecture has three main gates, namely the output gate, forget gate, and input gate (\({o}_{t}\), \({f}_{t}\), \({i}_{t}\)). These gates control the activation of the cell \({c}_{t}\) and its output \({h}_{t}\). Each block consists of one cell at time t, as shown in Fig. 3. In our work, we used 1000 epochs. Owing to its special network structure, LSTM extracts temporal information very efficiently. However, when the LSTM layers were trained, the output and computation at each step depended on those of the previous steps, which slowed model training. As a result, fewer epochs were considered. The remaining model parameters are also listed in Table 3.

Table 3 Parameter settings for LSTM model

B. CNN-Transfer Learning

The literature offers a variety of models for image recognition, classification, and detection problems. CNNs have reached cutting-edge performance in a variety of applications [45,46,47]. CNN-transfer learning has recently gained popularity and attracted much attention in the field of machine learning. Transfer learning boosts performance while allowing rapid development. It centers on transferring knowledge from a previously learnt task to a new task, and it also helps generalization on the new task. In this study, we successfully retrained the VGG-16 architecture for human fall detection.

The VGG16 model, a convolutional neural network, was proposed in [48]. VGG-16 was trained on the ImageNet dataset, which has approximately 14 million images and 1000 classes, using NVIDIA Titan Black GPUs for weeks. Automatically extracting general features for detection, such as textures and corners, is normal practice in deep learning architectures. In the second step, retraining the already trained model for fall detection, the network was able to learn further features that reflect human movements and motions. We changed the input layer of the CNN model pretrained on the ImageNet dataset, where 224 × 224 is the VGG image input size and 20 is the stack size.

The total number of convolutional layers in this experiment is 5, with each fully connected layer containing 16 neurons. Fig. 4 depicts the architecture of this model. The VGG16 transfer learning model has 16 convolutional and max-pooling layers, three dense layers forming the fully connected part, and an output layer. The CNN-transfer learning model was trained on 80% of the dataset, and the trained model was then examined to see whether the loss function exhibited overfitting. The remaining 20% of the dataset was used for validation, as shown in the graph in the results section. Table 4 presents the following parameters: epochs, minibatch size, layers, optimizer, and activation function.
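For orientation, the layer layout of the standard published VGG-16 (configuration D of the original VGG paper) can be tallied with a short Python sketch. This describes the reference architecture only; the retrained model used in this work may count or arrange its layers differently, and the function name is a hypothetical helper.

```python
# Standard VGG-16 convolutional stack: integers are the output channels of a
# 3x3 convolution, "M" marks a 2x2 max-pooling layer with stride 2.
VGG16_CONV_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                     512, 512, 512, "M", 512, 512, 512, "M"]
FULLY_CONNECTED_LAYERS = 3


def summarize(config, input_size=224):
    """Count convolutional and pooling layers and track the feature-map size.

    The 3x3 convolutions use padding that preserves height and width, so
    only the pooling layers change the spatial size.
    """
    conv_layers = sum(1 for item in config if item != "M")
    pool_layers = sum(1 for item in config if item == "M")
    size = input_size
    for item in config:
        if item == "M":
            size //= 2  # each pooling layer halves height and width
    return conv_layers, pool_layers, size


conv_layers, pool_layers, final_size = summarize(VGG16_CONV_CONFIG)
weight_layers = conv_layers + FULLY_CONNECTED_LAYERS
```

The "16" in the name refers to the weight layers of the reference network: the convolutional layers plus the three fully connected layers. For a 224 × 224 input, the five pooling stages reduce the feature maps to 7 × 7 before the fully connected part.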

Fig. 4

Working flow of CNN-Transfer learning

Table 4 Parameters setting for CNN-transfer learning

5 Experimental Results

The data is divided into 80% for training and 20% for testing. We used two datasets: first we evaluated the LSTM algorithm on the fall detection dataset, and then we used CNN-transfer learning with the UFRD dataset, with which we achieved the best results. Finally, we compare the performance of the two types of deep neural network (LSTM and CNN-transfer learning).
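An 80/20 split of this kind can be sketched as follows. The function name, the shuffling, and the fixed seed are assumptions made for reproducibility of the example; the text does not state whether the split was random or stratified.

```python
import random


def split_dataset(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 100 placeholder records stand in for the real feature vectors.
train_set, test_set = split_dataset(range(100))
```

Every record ends up in exactly one of the two parts, so the test set remains unseen during training.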

5.1 Experiment 1

In our first experiment, we used the publicly available dataset with a state-of-the-art sequential model, LSTM. Several attempts were made to find the best LSTM model for human fall detection. The parameters of the selected model are presented in Table 3. With this model, an accuracy of 88% was obtained on the training set and 85% on the validation set, as can be seen in Fig. 5.

Fig. 5

Training and validation accuracy curves per epoch of LSTM

5.2 Experiment 2

The performance of the LSTM model was satisfactory, although it consumed a significant amount of computational resources. We therefore tried another category of deep learning architecture, a CNN using transfer learning. For this purpose, the pretrained VGG-16 model was considered, and the results were more promising. It was retrained for fall detection for 5,000 epochs. The CNN-based transfer learning model achieved 98% accuracy. The training details can be seen in Fig. 6.

Fig. 6

Training and validation accuracy curves per epoch of CNN-transfer learning model

Table 5 shows that the transfer learning approach outperforms the other classifiers in terms of accuracy. Moreover, the literature indicates that transfer learning significantly reduces computational overhead and time in comparison with traditional DL approaches. The table further shows that KNN, with an accuracy of 96% and a low error rate, was the second most promising model. According to the study and Table 5, the proposed transfer learning model, when trained with the procedure and parameter settings described above, outscored the other models on the test set with the lowest error rate of 1.2%, thereby demonstrating improved performance over the study in [31].
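For reference, accuracy and error rate can be computed from predicted and true labels as in the sketch below. It assumes that the error rate is the complement of accuracy; the exact metric definitions behind Table 5 are not given in the text, and the labels used here are synthetic.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def error_rate(y_true, y_pred):
    """Misclassification rate, assumed here to be 1 - accuracy."""
    return 1.0 - accuracy(y_true, y_pred)


# Synthetic labels: 1 = fall, 0 = activity of daily living.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
acc = accuracy(y_true, y_pred)
err = error_rate(y_true, y_pred)
```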

Table 5 Comparison with previous research

The behavior of machine learning models depends on the behavior and quality of the data provided as input features. Recent advances have shown that data-driven machine learning is the most promising method and has drawn considerable attention. However, a change in the input data features causes a change in the model too, which may sometimes be desirable but not always. Consequently, the performance of ML algorithms is influenced by a variety of variables, including the type and location of the sensors, the fall pattern, any associated thresholds, the dataset's characteristics, and probably the preprocessing applied to it. Deep learning models are hierarchical and highly nonlinear, and these large architectures require a large amount of data to be trained accurately. A sufficient amount of data must therefore always be provided to train these models. Since LSTMs are state-of-the-art models for sequential information, we first implemented and trained an LSTM model. The performance of the trained model was sufficient but not optimal. The outcomes indicate that performance was further improved by using a pretrained architecture and transfer learning.

6 Conclusion

In this research, we implemented two machine learning approaches, LSTM and transfer learning, to detect falls effectively. The proposed methodology applied a simple feature extraction process to the available dataset. The computed features were used to train and validate the machine learning models. The results show a training accuracy of 88% and a validation accuracy of 83% with LSTM, whereas CNN-transfer learning achieved 98% accuracy. From these experiments, we observed that CNN-transfer learning shows more promising results than LSTM. The training of LSTM was also found to be computationally intensive, as it required more memory and more rigorous parameter tuning, and it therefore took longer to train than CNN-transfer learning. The comparative analysis also shows that transfer learning outperformed the other traditional models. The proposed research is based entirely on a non-intrusive method; it is well understood that heavy wearable sensors, which inconvenience the subjects, might not be a feasible solution. This work can be extended to IoT-based ambient assisted living systems (AALS). Additionally, the trained models make it easier to classify people or patients into a risk class according to their fall probability. This can in turn enable caregivers to assist elderly people, children, or patients immediately.