Gait recognition via random forests based on wearable inertial measurement unit

Shi, Ling-Feng; Qiu, Chao-Xi; Xin, Dong-Jin; Liu, Gong-Xu

doi:10.1007/s12652-020-01870-x

Original Research
Published: 16 March 2020

Volume 11, pages 5329–5340, (2020)
Journal of Ambient Intelligence and Humanized Computing

Abstract

In recent years, gait detection has been widely used in medical rehabilitation, smartphones, criminal investigation, navigation and positioning, and other fields. With the rapid development of micro-electro-mechanical systems, inertial measurement units (IMUs) have been widely used in gait recognition because of their low cost, small size, and light weight. Therefore, this paper proposes an IMU-based gait recognition algorithm named FPRF-GR. Firstly, a fusion feature engineering operator, mainly based on the Fast Fourier Transform and principal component analysis, is designed to eliminate redundant and defective features. Then, to meet the requirements of the gait recognition model for accuracy, generalization ability, speed, and noise resistance, this paper compares random forest (RF) with several commonly used classification algorithms and finds that the model constructed by RF meets the requirements. FPRF-GR builds the model based on RF and uses tenfold cross-validation to evaluate it. Finally, this paper proposes an optimization scheme for two RF parameters: the number of decision trees and the number of samples. The results show that FPRF-GR can identify five gaits (walk, stationary, run, up stairs, and down stairs) with an average accuracy of 98.2%.

1 Introduction

Research on gait recognition began in the 1960s with Murray et al. (1964). Gait recognition has since been applied in many fields, such as medical rehabilitation, the entertainment industry, the sports industry, and information science. For example, Zhi and Zhang (2012) used relative wavelet energy as features to discriminate walking patterns in their improved physical activity healthcare system. Vikas and Crane (2013) proposed a novel approach for non-contact, dynamic measurement of joint parameters using the planar vestibular dynamic inclinometer. Ahmadi et al. (2014) presented a system that uses the discrete wavelet transform during a sports training session. Anwary et al. (2018) proposed an automatic gait feature extraction method to analyze data recorded during walking.

To identify a manner of gait, three types of gait recognition methods have mainly been proposed: floor sensor based, computer vision based, and wearable sensor based (Loudon and Janice 2008). The main advantage of floor sensor based approaches is unobtrusive data collection; such systems are usually installed in buildings and can be deployed in access control applications (Gafurov 2007). In computer vision based approaches, image and video techniques are used to extract gait features, and most of these methods are based on the human silhouette (Khan et al. 2018; Liu et al. 2004). Martino et al. (2017) design a sequential Monte Carlo scheme for the dual purpose of Bayesian inference and model selection, considering the application context of urban mobility, where several modalities of transport and different measurement devices can be employed. Wu et al. (2004) present a switching Kalman filter model for the real-time inference of hand kinematics from a population of motor cortical neurons. Achutegui et al. (2009) address the problem of indoor tracking using received signal strength (RSS) as a position-dependent measurement. Although the recognition rate of computer vision based methods is relatively high, gait is recognized from images, so these methods rely heavily on the external environment and require sufficient light and a clear background. If the light is insufficient or the background is not clear, the gait recognition rate is low. The application areas of this type of method are usually forensics and surveillance. In wearable sensor based approaches, data collection is more unobtrusive and more convenient than in other methods. With advances in miniaturization techniques, it is feasible to integrate the sensor with personal devices. Thus, in this paper, we introduce a wearable sensor based approach to solve gait recognition problems.

In recent years, micro-electro-mechanical systems (MEMS) technology has attracted many researchers (Shuai et al. 2018; Wixted et al. 2007; Liu et al. 2018; Qureshi and Golnaraghi 2017), and many studies on gait recognition use MEMS sensors, especially the inertial measurement unit (IMU). The IMU is small, light, wearable, low-cost, and low-power, which makes it easy and convenient to deploy. With the rapid development of artificial intelligence technology, researchers have combined wearable devices with machine learning classification algorithms to extract and recognize features (Mashal et al. 2016), including support vector machines (SVM) (Sprager and Zazula 2009), decision trees (Watanabe 2014), neural networks (Yuan 2012), and Gaussian mixture models (Lu et al. 2014).

Gait recognition is a prerequisite for autonomous pedestrian positioning and navigation. In particular, the development of artificial intelligence technology now requires automatic gait recognition to support pedestrian navigation or positioning in arbitrary attitudes. To achieve higher accuracy, we present a more robust and more accurate method called FPRF-GR. Firstly, the acceleration and angular velocity are collected by the MEMS sensor, and the preprocessed data are windowed and transformed between coordinate systems. Secondly, fusion feature engineering that combines the Fast Fourier Transform (FFT) with principal component analysis (PCA) is used to reduce redundant or defective features. Thirdly, by comparing the advantages and disadvantages of SVM, K-Nearest Neighbor (KNN), Gradient Boosting Decision Tree (GBDT) and random forest (RF) (Breiman 2001), it is found that RF is the most suitable for the requirements of the model in this paper. Therefore, FPRF-GR uses RF to train on the data after feature construction and uses tenfold cross-validation to evaluate the model. Finally, an optimization scheme is proposed for two parameters: the number of decision trees and the number of samples. The results show that FPRF-GR performs better than the other methods.

The rest of this paper is organized as follows: Sect. 2 presents the details of the proposed method. The experiments are given in Sect. 3. Section 4 concludes the work in this paper.

2 Proposed method

The proposed method is divided into three parts. Section 2.1 introduces the data acquisition. Section 2.2 gives the details of feature engineering. Section 2.3 describes the implementation of FPRF-GR.

2.1 Data acquisition

2.1.1 Acquisition platform

In this paper, the X-axis, Y-axis and Z-axis acceleration and the X-axis, Y-axis and Z-axis angular velocity under five pedestrian gaits are collected by an IMU. A photograph of the IMU is shown in Fig. 1. The MPU9250 includes three mutually orthogonal accelerometers, gyroscopes, and magnetometers. The RS-232 interface and the bus allow the output data to be displayed and stored by communicating with a personal computer (PC). The STM32 is the CPU that controls the raw output data. The IMU can output both the raw sensor data and the orientation, which researchers can use for processing and analysis.

2.1.2 IMU wearing position

As shown in Fig. 2, the IMU is worn on the ankle during the experiment. Compared with most other parts of the body, data collected at the ankle are more accurate, reflect the characteristics of each gait, and are more convenient to acquire. When the IMU is worn on the hand, the hand may shake, which introduces redundant features into the collected data and affects the classification model's judgment. Similarly, if the IMU is worn on the chest, it cannot be firmly fixed and therefore jitters, which makes the sensor data very noisy and again produces redundant features. When the IMU is worn on the waist, the motion of the waist does not change significantly between gaits, so the differences between walk, stationary, and run are not obvious and the data do not reflect the characteristics of each gait well. If the IMU is worn on the instep, the gait data are more accurate than those collected at the ankle, but wearing the IMU on the instep hinders walking and is inconvenient in real life. Therefore, considering both the accuracy of the acceleration and angular velocity features and the convenience of wearing, the IMU is worn on the ankle in this paper.

2.1.3 Experimental subjects

To collect experimental data, we invited volunteers from our laboratory and school. To make the data comprehensive, volunteers of various genders, ages, and heights were included. In total, 20 volunteers participated in data collection. For each volunteer, three groups of experiments were conducted with five different gaits (walk, stationary, run, up stairs, and down stairs).

Table 1 shows the details of these volunteers. "# Subject x" (x = 1, 2, …, 20) identifies the volunteer. For "Sex", ♂ represents male and ♀ represents female. The units of "Height" and "Weight" are centimeters (cm) and kilograms (kg), respectively. We balance the number of males and females as much as possible, with 12 males and 8 females. The weights range from 49 to 85 kg. The volunteers are mainly 21–26 years old. For each volunteer, 100 s of data are collected for each gait.

Table 1 The parameters of the volunteers

2.2 Feature engineering

Gait recognition is difficult to perform directly on the raw data acquired from the IMU because the sensors' output contains serious noise, so several feature engineering techniques have been proposed. For example, Dehzangi et al. (2017) used convolutional neural networks to extract features. In the field of machine learning, various techniques have been used to obtain proper data sets, i.e. features. A good set of features helps to achieve good results. Thus, this paper designs a method to extract features.

2.2.1 Windowing

The data set is windowed. The MEMS sensor outputs 100 groups of data per second, and each group contains three-axis acceleration and three-axis angular velocity. Using these groups directly as training samples would not be practical, because each sample would cover only 0.01 s. Such a sample represents an instantaneous state of walking and cannot reflect the characteristics of the motion. Generally speaking, when an adult walks normally, the gait changes periodically. Research shows that the normal walking speed of an adult is 1.5 m/s, so it can be assumed that the subject completes a step within 1 s. Because each training sample must contain the state information of at least one step, this paper processes the data with a windowing method: a window with a length of 1 s is used, and the 1 s of data in the window is regarded as one training sample for the classifier. This ensures that the window contains at least one complete gait cycle, so that all gait information of the subject's step is retained.

Each input item of the training data is shown in Eq. (1); it is a 100 × 6 matrix. Class labels are added after windowing. The labels are defined as Y = {1, 2, …, N}, where N is the number of gait patterns. This paper defines five classes (walk, stationary, run, up stairs, and down stairs), so N is 5.

$$ {\boldsymbol{X}}_{t} = \left( {\begin{array}{*{20}c} {x_{1,1} } & {x_{1,2} } & \ldots & {x_{1,6} } \\ {x_{2,1} } & {x_{2,2} } & \ldots & {x_{2,6} } \\ \vdots & \vdots & \ddots & \vdots \\ {x_{100,1} } & {x_{100,2} } & \ldots & {x_{100,6} } \\ \end{array} } \right) $$

(1)

In a general classification model, each input item of the training data is a single row or a single column. Therefore, as shown in Fig. 3, this paper concatenates the 100 rows of data in the window end to end, which changes the 100 × 6 matrix into a 1 × 600 vector.
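The windowing and flattening steps can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `make_windows` is hypothetical, and the use of non-overlapping windows is an assumption, since the text does not state whether consecutive windows overlap.

```python
import numpy as np

FS_HZ = 100      # sensor output rate stated in the text (100 groups per second)
WINDOW_S = 1     # window length in seconds stated in the text

def make_windows(recording, fs=FS_HZ, window_s=WINDOW_S):
    """Split a (T, 6) recording into 1 s windows and flatten each to one row.

    Assumption: windows do not overlap; trailing samples that do not fill
    a complete window are dropped.
    """
    recording = np.asarray(recording, dtype=float)
    win = fs * window_s
    n_windows = recording.shape[0] // win
    trimmed = recording[: n_windows * win]
    # Shape (n_windows, 100, 6): one 100 x 6 matrix per window, as in Eq. (1).
    windows = trimmed.reshape(n_windows, win, recording.shape[1])
    # Row-major flattening joins the 100 rows end to end: shape (n_windows, 600).
    return windows.reshape(n_windows, -1)
```

With 100 s of data per gait, this sketch would yield 100 rows of 600 values for each gait of each subject.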

2.2.2 Coordinate system transformation

The data are then transformed between coordinate systems. The collected acceleration and angular velocity are expressed in the carrier coordinate system. As shown in Fig. 4, although the IMU is always worn on the ankle during the experiment, its mounting attitude differs between Fig. 4a and Fig. 4b, so the directions of the carrier axes differ and the collected data are not expressed in a common reference. Data expressed in different references cannot be compared meaningfully in gait recognition, so it is necessary to rotate the data from the carrier coordinate system (b) to the geographic coordinate system (n) in order to unify the reference. In Fig. 4, the carrier coordinate system is the Cartesian coordinate system attached to the IMU (the white box) that carries the MPU9250. Figure 4a shows an installation perpendicular to the ground, while Fig. 4b shows an installation that is not perpendicular to the ground. Figure 5 shows the axes of the accelerometer and gyroscope in the MPU9250, following the carrier coordinate description in the MPU9250 data sheet. Figure 6 shows the geographic coordinate system on the earth's surface, in which X, Y and Z point east, north and up, respectively.

The geographic coordinate system follows the east-north-up convention. Acceleration and angular velocity in the carrier coordinate system are rotated into the geographic coordinate system through a rotation sequence of Z axis, Y axis and X axis. The rotation matrix is shown in Eq. (2).

$$ C_{\text{b}}^{n} = \left( {\begin{array}{*{20}c} {\cos \theta } & 0 & {\sin \theta } \\ 0 & 1 & 0 \\ { - \sin \theta } & 0 & {\cos \theta } \\ \end{array} } \right) \cdot \left( {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & {\cos \varphi } & { - \sin \varphi } \\ 0 & {\sin \varphi } & {\cos \varphi } \\ \end{array} } \right) \cdot \left( {\begin{array}{*{20}c} {\cos \psi } & { - \sin \psi } & 0 \\ {\sin \psi } & {\cos \psi } & 0 \\ 0 & 0 & 1 \\ \end{array} } \right) $$

(2)

$ \psi $, $ \theta $ and $ \varphi $ represent the yaw angle, roll angle and pitch angle, respectively. The conversion formulas for acceleration and angular velocity from the carrier coordinate system are given in Eqs. (3) and (4), where $ a_{k}^{n} $ and $ \omega_{k}^{n} $ are the acceleration and angular velocity in geographic coordinates, respectively. The superscript n denotes geographic coordinates, the superscript b denotes carrier coordinates, and the subscript k denotes discrete time.

$$ a_{k}^{n} = C_{b}^{n} \cdot a_{k}^{b} $$

(3)

$$ \omega_{k}^{n} = C_{b}^{n} \cdot \omega_{k}^{b} $$

(4)
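Equations (2) to (4) can be sketched in code as below. The matrix product is written exactly in the order printed in Eq. (2). The function names are hypothetical, and the sketch assumes that one set of attitude angles (in radians) is already available for the samples being rotated; how the angles are obtained is not described in this section.

```python
import numpy as np

def rotation_b_to_n(yaw, roll, pitch):
    """Build the rotation matrix of Eq. (2); angles are in radians.

    yaw, roll and pitch correspond to psi, theta and varphi in the text.
    """
    c, s = np.cos, np.sin
    r_theta = np.array([[c(roll), 0.0, s(roll)],
                        [0.0, 1.0, 0.0],
                        [-s(roll), 0.0, c(roll)]])
    r_phi = np.array([[1.0, 0.0, 0.0],
                      [0.0, c(pitch), -s(pitch)],
                      [0.0, s(pitch), c(pitch)]])
    r_psi = np.array([[c(yaw), -s(yaw), 0.0],
                      [s(yaw), c(yaw), 0.0],
                      [0.0, 0.0, 1.0]])
    # Same left-to-right order as the three factors in Eq. (2).
    return r_theta @ r_phi @ r_psi

def to_geographic(vectors_b, yaw, roll, pitch):
    """Apply Eq. (3) or Eq. (4) to an (N, 3) array of carrier-frame vectors.

    Assumption: the same attitude applies to all N samples.
    """
    rot = rotation_b_to_n(yaw, roll, pitch)
    # Row-vector form of rot @ v for every sample.
    return np.asarray(vectors_b, dtype=float) @ rot.T
```

The same `to_geographic` call serves both the acceleration and the angular velocity, since Eqs. (3) and (4) use the same matrix.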

2.2.3 FFT

The fusion feature engineering in FPRF-GR combines FFT with PCA. Firstly, FFT is used to convert the time-varying signals into time-invariant signals in the frequency domain. The reason is that when a person walks at a steady speed, the walking state is periodic, but the gait data at a given sample point differ from the data at the sample point 1 s later. In a physical sense, the latter is only a translation of the former and should be the same gait. In the classifier, however, because of the phase difference, the two sample points appear at completely different positions in the feature space of the classifier. The classifier would therefore treat the current state and the state 1 s later as different gaits and make a wrong judgment. In this paper, an FFT operator is introduced to process each row of data separately, and only the amplitude–frequency features of each gait are extracted, which avoids misjudgments caused by the phase features.

To verify the validity of the FFT step, a random sample of the measured data is selected. The selected data are pre-processed, windowed and transformed to the geographic coordinate system. The amplitude–frequency plots of the Z-axis acceleration and the X-axis angular velocity of the subject in the walk, run, stationary, up stairs and down stairs gaits are then drawn with the Spyder software. Figure 7 depicts the amplitude–frequency variation of the Z-axis acceleration in the five gaits. The abscissa represents the frequency point and the ordinate represents the amplitude. Figure 8 gives the amplitude–frequency variation of the X-axis angular velocity after FFT in the five gaits. Figures 7 and 8 show that the phase characteristics are eliminated and that the amplitudes of acceleration and angular velocity differ between the five gaits, so they reflect the characteristics of each gait.

Therefore, after FFT the three-axis acceleration and three-axis angular velocity are free of phase interference, and the amplitude–frequency characteristics differ clearly between states, which can improve the accuracy of model training. In this experiment, the sampling frequency of the accelerometer and gyroscope is 1 kHz, and the data output frequency of the sensor is 100 Hz. The data of each subject are obtained within 100 s. In this paper, a 64-point FFT is selected in FPRF-GR, and the FFT amplitude is used as input data.
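The effect of keeping only the FFT amplitude can be illustrated with a short sketch. It is not the authors' code: the function name `fft_amplitude` is hypothetical, and the handling of the 100-sample window is an assumption, because the text does not say how 100 samples are mapped to a 64-point FFT. Here `numpy.fft.fft` with `n=64` simply uses the first 64 samples of each channel.

```python
import numpy as np

N_FFT = 64  # FFT length stated in the text

def fft_amplitude(window, n_fft=N_FFT):
    """Return the amplitude spectrum of each channel of a (samples, channels) window.

    Assumption: windows longer than n_fft are cropped to their first n_fft
    samples, which is what numpy.fft.fft does when n is smaller than the input.
    """
    spectrum = np.fft.fft(np.asarray(window, dtype=float), n=n_fft, axis=0)
    # Discarding the phase keeps only the amplitude-frequency features.
    return np.abs(spectrum)
```

For a signal that is periodic within the transformed segment, a circular time shift changes only the phase of each frequency bin, so two such segments taken at different moments of the same gait produce the same amplitude vector.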

2.2.4 PCA

At present, every training input is a 600-dimensional vector. If the dimension of the input samples is very high during feature construction, the amount of data the classifier has to learn from also increases. For large-scale classification problems, the time and space complexity of the classifier grow with the amount of data, which affects the performance of the algorithm. Compared with a sparse matrix, feature extraction from a high-dimensional matrix is cumbersome and may pick up abnormal features, which reduces the accuracy of the model. Therefore, when the data set has many features, dimension reduction can be used to improve the accuracy. The principle of dimension reduction is to reduce the dimension without losing a large number of useful features. After dimension reduction, the size of each learning sample is reduced, which improves the efficiency of the classifier.

Because PCA can reduce the complexity of the data and identify the most important features, this paper applies PCA for dimension reduction after the FFT feature engineering. PCA can be implemented in two ways: covariance matrix decomposition and singular value decomposition. This paper uses covariance matrix decomposition, for the following reason. The purpose of dimension reduction is noise reduction and de-redundancy. The purpose of "noise reduction" is to make the correlation between the remaining dimensions as small as possible, while the purpose of "de-redundancy" is to make the remaining dimensions contain as much "energy", i.e. variance, as possible. We therefore need to know the correlation between dimensions and the variance of each dimension, and the data structure that shows both is the covariance matrix. The covariance matrix measures the relationship between dimensions, not between samples. The elements on its main diagonal are the variances (that is, the energy) of each dimension, and the other elements are the covariances (that is, the correlations) between pairs of dimensions. The covariance matrix thus provides everything we need.

The dimension k after reduction could in principle be chosen arbitrarily. In this paper, k is taken as the smallest value that satisfies the constraint of Eq. (5).

In Eq. (5), $ \frac{1}{m}\sum\nolimits_{i = 1}^{m} {||x^{(i)} - x^{(i)}_{approx} ||^{2} } $ is the average of the square of projection error, $ x^{(i)}_{approx} $ is the mapping value, and $ \frac{1}{m}\sum\nolimits_{i = 1}^{m} {||x^{(i)} ||^{2} } $ is the total variance of data. In addition, $ x^{(i)} $ and $ x^{(i)}_{approx} $ represent the vector after FFT.

$$ \frac{{\frac{1}{m}\sum\nolimits_{i = 1}^{m} {\left\| {x^{(i)} - x^{(i)}_{approx} } \right\|^{2} } }}{{\frac{1}{m}\sum\nolimits_{i = 1}^{m} {\left\| {x^{(i)} } \right\|^{2} } }} \le t $$

(5)

The value of t is chosen by the user. In this paper, t is 0.05, which means that PCA retains 95% of the main information. The resulting k is 60, so the new matrix has 60 dimensions and each training sample is a 60-dimensional input.
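The selection of k from Eq. (5) can be sketched with a covariance-based PCA as follows. This is an illustrative sketch, not the authors' code: the function name `pca_covariance` is hypothetical, and the data are assumed to be mean-centered before projection so that the denominator of Eq. (5) equals the total variance. Under that assumption, the ratio in Eq. (5) equals the share of variance carried by the discarded eigenvalues.

```python
import numpy as np

def pca_covariance(X, t=0.05):
    """Reduce X (m samples x d features) to the smallest k that satisfies Eq. (5).

    Returns the projected data, the projection matrix, the feature mean and k.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                              # assumption: center the data first
    cov = np.cov(Xc, rowvar=False)             # d x d covariance between dimensions
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending variance
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    retained = np.cumsum(eigvals) / eigvals.sum()
    # Smallest k whose retained variance share is at least 1 - t.
    k = min(int(np.searchsorted(retained, 1.0 - t)) + 1, X.shape[1])
    W = eigvecs[:, :k]
    return Xc @ W, W, mean, k
```

New samples would be projected with the stored `mean` and `W`, i.e. `(x - mean) @ W`, so that training and test data share the same basis.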

2.3 Implementation of FPRF-GR

2.3.1 Comparisons of relevant classification algorithms

The classification model designed in this paper is expected to achieve high accuracy, noise resistance, strong generalization ability and fast running speed. SVM, KNN, GBDT and RF are compared against these requirements. As shown in Table 2, RF is superior to SVM, KNN and GBDT in terms of accuracy, generalization ability, speed and noise resistance, and it fully meets the design requirements of the classification model.

Table 2 Performance comparison table of four algorithms

Therefore, the paper concludes that the model designed based on RF will be more suitable for the classification model requirements of this paper. Moreover, after data feature construction, FPRF-GR builds and optimizes the model on the basis of RF.

2.3.2 Optimization of random forests

Figure 9 shows the experimental process of FPRF-GR. The original data are collected by the IMU. Firstly, filtering, denoising and data calibration are carried out. Secondly, the data are windowed and transformed to the geographic coordinate system. Thirdly, the fusion feature engineering combining FFT and PCA is used to extract the amplitude–frequency characteristics of the data and reduce the dimension. Finally, the data after feature construction are taken as the final learning samples, and the model is constructed on the basis of RF. Table 3 describes FPRF-GR, where n is the number of trees in the ensemble.

Table 3 FPRF-GR

FPRF-GR optimizes parameters based on RF. Appropriate parameters are very important for the running speed and evaluation results of machine learning models and inappropriate parameters may lead to over-fitting or under-fitting. Therefore, this paper proposes two optimization schemes for the number of samples and the number of decision trees.

1. Sample size optimization scheme. In a machine learning problem, if the number of samples is small at first, the accuracy of the model is low. As the sample size increases, the accuracy of the model improves. However, once the number of samples reaches a certain level, further increases make the data granularity finer and finer, and the accuracy of gait recognition decreases. Because some characteristics of the data exist only at a specific granularity, dividing the samples too finely loses some features and leads to over-fitting.

Therefore, a sample size optimization scheme is proposed for FPRF-GR. This paper argues that the accuracy of model recognition first rises and then declines as the sample size grows, so the optimal sample size is the number of samples reached just before the accuracy starts to decline.

2. Optimizing the number of decision trees. Theoretical study shows that the accuracy of the results is low when the number of decision trees is small. When the number of decision trees increases, the accuracy of the model improves, but it does not reach 100% because noise and error remain in the data. When the number of trees reaches a certain level, the model accuracy changes very little and stabilizes at a value, without a significant downward trend. This is because the randomness of samples and features in RF reduces the probability of over-fitting, so the accuracy of the model does not decrease much as the number of decision trees increases.

Therefore, a decision tree quantity optimization scheme is proposed for FPRF-GR. This paper argues that the accuracy of model recognition first rises and then levels off as the number of decision trees increases. Considering resource saving and computing cost, the optimal number of decision trees is the value at which the accuracy of the model has just become stable.
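The two selection rules can be expressed as small helper functions operating on measured accuracy curves. This is a sketch under assumptions: the function names are hypothetical, and the tolerance `tol` used to decide when the accuracy has "just become stable" is an illustrative choice that the text does not specify.

```python
def best_sample_size(sizes, accuracies):
    """Return the sample size with the highest accuracy.

    If accuracy rises and then declines, this is the size reached just
    before the decline. Ties are resolved in favor of the smaller size.
    """
    best = max(range(len(sizes)), key=lambda i: accuracies[i])
    return sizes[best]

def plateau_tree_count(tree_counts, accuracies, tol=0.002):
    """Return the smallest tree count whose accuracy is within tol of the best.

    tree_counts must be sorted in increasing order. tol is an assumed
    threshold for treating the accuracy as stable.
    """
    best = max(accuracies)
    for n_trees, acc in zip(tree_counts, accuracies):
        if acc >= best - tol:
            return n_trees
    return tree_counts[-1]
```

Both helpers expect accuracies obtained with the same evaluation protocol for every candidate value, for example the tenfold cross-validation used in this paper.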

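In this paper, tenfold cross-validation is used to evaluate the model. Because every sample is used for both training and testing across the ten folds, the reported accuracy is less likely to reflect an under-fitted or over-fitted model on one particular split, so the accuracy of the final model is more convincing.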
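A tenfold cross-validated random forest can be sketched with scikit-learn as below. The paper does not name the library it uses, so the choice of scikit-learn, the function name `evaluate_rf`, the stratified and shuffled folds, and the fixed random seed are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def evaluate_rf(X, y, n_trees=1000, seed=0):
    """Return the ten per-fold accuracies of a random forest on (X, y).

    X: (samples, features) array after feature engineering.
    y: integer gait labels.
    Assumption: folds are stratified by label and shuffled once.
    """
    clf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    return cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
```

The mean of the returned array is the cross-validated accuracy; calling the function for several values of `n_trees` gives the kind of curve needed for the tree-count scheme described above.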

3 Experiments

It is difficult to implement the FPRF-GR algorithm directly on the STM32 chip, so this paper uses the Spyder software to implement FPRF-GR and the related classification algorithms on a PC. Firstly, Sect. 3 evaluates FPRF-GR according to several evaluation indexes from machine learning. Secondly, several comparative experiments verify the correctness of the parameter optimization scheme proposed in Sect. 2 and the effectiveness of the FFT + PCA fusion feature engineering. The results show that FPRF-GR is superior to SVM, KNN and GBDT.

3.1 Evaluating indicator

To validate the performance of FPRF-GR, this paper uses the average of Precision, Recall and F₁-score over the classes to evaluate the results. For each class, the confusion matrix is used to calculate Precision, Recall and F₁-score. Precision, Recall and F₁-score are defined in Eqs. (6), (7) and (8), respectively, and the confusion matrix is defined in Table 4. The true positive rate (TPR) and false positive rate (FPR) are defined in Eqs. (7) and (9), respectively.

Table 4 Definition of TP, FP, FN, and TN

$$ Precision = \frac{{N_{TP} }}{{N_{TP} + N_{FP} }} $$

(6)

$$ Recall = TPR = \frac{{N_{TP} }}{{N_{TP} + N_{FN} }} $$

(7)

$$ F_{1}\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall} $$

(8)

$$ FPR = \frac{{N_{FP} }}{{N_{FP} + N_{TN} }} $$

(9)

TP, FP, FN and TN are defined in Table 4. N_TP is the number of true positives, N_FP is the number of false positives, N_FN is the number of false negatives, and N_TN is the number of true negatives. The number of classes in this paper is 5, and the reported results are the average over these 5 classes.
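Equations (6) to (9) can be computed per class from a multi-class confusion matrix in a one-vs-rest manner, as in the sketch below. The function name `per_class_metrics` and the convention that rows are true classes and columns are predicted classes are assumptions for illustration.

```python
import numpy as np

def per_class_metrics(conf):
    """Compute per-class Precision, Recall, F1-score and FPR (Eqs. 6 to 9).

    conf[i, j] is the number of samples of true class i predicted as class j.
    Assumption: every class is predicted at least once, so no denominator is zero.
    """
    conf = np.asarray(conf, dtype=float)
    n_tp = np.diag(conf)
    n_fp = conf.sum(axis=0) - n_tp      # predicted as the class but belonging elsewhere
    n_fn = conf.sum(axis=1) - n_tp      # belonging to the class but predicted elsewhere
    n_tn = conf.sum() - n_tp - n_fp - n_fn
    precision = n_tp / (n_tp + n_fp)
    recall = n_tp / (n_tp + n_fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = n_fp / (n_fp + n_tn)
    return precision, recall, f1, fpr
```

Averaging each returned array over the classes, for example with `precision.mean()`, gives the class-averaged values used to summarize the results.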

Equations (6), (7) and (8) are used to calculate the Precision, Recall and F₁-score of the FPRF-GR algorithm for the five gaits, respectively. The results are shown in Table 5. For the stationary gait, all the results are 1.000, while those of the other gait classes are close to 1.000. All values are greater than 0.960, and the average value is greater than 0.980. That is to say, under these three evaluation indicators, the recognition results for walk, run, stationary, up stairs and down stairs are very good, so the method is suitable for the gait recognition problem in this paper.

Table 5 Precision, Recall and F₁-score of FPRF-GR

3.2 Effect of parameters

In Sect. 3.2, we study the effect of the two parameters on the proposed algorithm. Firstly, we discuss the proper number of samples for training. Secondly, we study the effect of the number of trees on FPRF-GR.

3.2.1 Effect of the number of samples

Figure 10 shows the effect of sample size on the model recognition results. According to Fig. 10, the model under-fits and the accuracy is low when the number of samples is 200. As the number of samples increases, the recognition accuracy of the model also increases. The recognition accuracy is the highest when the number of samples reaches 1400. However, if the number of samples continues to increase, the model over-fits and the accuracy slowly declines.

The effect of sample size on the accuracy of model results in Fig. 10 confirms the parameter optimization scheme proposed in Sect. 2 of this paper. Therefore, 1400 samples are selected as the input sample size of the experiment. That is to say, for 20 subjects, 70 experimental samples are needed for each of the five gaits.

3.2.2 Effect of the number of trees

In this paper, the experiments on the number of decision trees are run in several groups, and the trend of each group is found to be almost the same, so one group is taken as an example. According to Fig. 11, the number of trees in the forest does affect the accuracy of the test results. When the number of trees is between 0 and 1000, the accuracy is low at first and then increases with the number of trees, which confirms that a small number of decision trees may lead to under-fitting. When the number of trees reaches 1000, the accuracy is the highest. Beyond 1000 trees, the accuracy stabilizes at a fixed value and no longer rises. This result confirms that over-fitting cannot be completely eliminated.

The above experiments confirm the optimization schemes of the number of trees proposed in Sect. 2 of this paper. Therefore, this paper uses 1000 trees to construct the model, which not only guarantees the accuracy of the model, but also reduces the calculation cost.

3.3 Comparative study

3.3.1 Feature engineering contrast

To verify the necessity of the FFT + PCA fusion feature engineering, four kinds of features are compared: FFT features, PCA features, FFT + PCA features and the original features. The models for all four are constructed based on RF. As shown in Table 6, the experimental accuracy is the average recognition accuracy over stationary, walk, run, up stairs and down stairs. The recognition algorithm based on FFT + PCA features has a higher accuracy than those based on the other features, reaching 98.2%. This shows that the classifier based on FFT + PCA features gives the best result.

Table 6 Comparison table of results of different feature construction methods

3.3.2 Comparison of different algorithms

To show that the proposed FPRF-GR performs best, the optimal models of SVM_rbf, SVM_linear, KNN and GBDT are constructed on the same feature engineering as FPRF-GR and evaluated by tenfold cross-validation. SVM_rbf denotes an SVM with a Gaussian (RBF) kernel function, and SVM_linear denotes an SVM with a linear kernel function.

Figure 12 shows the average recognition accuracy of the five gait classification algorithms. As Fig. 12 shows, FPRF-GR reaches an average accuracy of 98.2% over the five gaits, which is the highest and exceeds that of SVM, GBDT and KNN. Table 7 gives the parameters of the five algorithms.

Table 7 Parameters of five algorithms


In addition, the designed algorithm will later be written into a chip to achieve online operation, so the running time of the algorithm is also very important. The running time of the algorithms mainly depends on the training time of the models, so this paper compares the training times of these algorithms. As shown in Fig. 13, FPRF-GR has the shortest training time, 323.51 s.
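Training time can be measured by timing the call that fits each model. The helper below is a minimal sketch using only the Python standard library; `time_fit` and `MajorityClassifier` are hypothetical names, and the stand-in model exists only to make the example self-contained. Any object with a `fit(X, y)` method, such as a scikit-learn estimator, could be passed instead.

```python
import time


def time_fit(model, X, y, repeats=3):
    """Return the best wall-clock time in seconds of model.fit(X, y)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        model.fit(X, y)
        best = min(best, time.perf_counter() - start)
    return best


class MajorityClassifier:
    """Hypothetical stand-in model: predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


# Tiny illustrative dataset; the values carry no physical meaning.
X = [[0.1], [0.4], [0.35], [0.8]]
y = ["walk", "run", "walk", "stationary"]

model = MajorityClassifier()
seconds = time_fit(model, X, y)
print(f"training time: {seconds:.6f} s")
```

Taking the best of several repeats reduces the influence of unrelated system load on the measurement.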

Table 8 compares the results of this study with those of related papers; the figures in Table 8 are taken from the corresponding references. All of these papers perform gait recognition with MEMS sensors. The papers that address multi-class recognition report lower accuracy than this paper. Some papers report accuracy similar to this paper, but they recognize fewer gait types, which limits their practical application.

Table 8 Comparisons with related research results


4 Conclusions

This paper shows that FPRF-GR has the highest model accuracy and the shortest training time among the compared algorithms, outperforming SVM, GBDT and KNN. The proposed FPRF-GR recognizes pedestrian gait accurately, with an average accuracy of 98.2%. It has good practicability and application prospects in fields such as medical rehabilitation, smart phones, criminal investigation, and navigation and positioning.

References

Achutegui K, Martino L, Rodas J, Escudero CJ, Miguez J (2009) A multi-model particle filtering algorithm for indoor tracking of mobile terminals using RSS data. In: IEEE international conference on control applications (CCA). Saint Petersburg (Russia), 8–10 July, pp 1702–1707
Ahmadi A, Mitchell E, Richter C, Destelle F, Gowing M, O’Connor NE, Moran K (2014) Automatic activity classification and movement assessment during a sports training session using wearable inertial sensors. In: International conference on wearable and implantable body sensor networks. Zurich, 16–19 June, pp 98–103
Anwary AR, Yu H, Vassallo M (2018) Optimal foot location for placing wearable IMU sensors and automatic feature extraction for gait analysis. IEEE Sens J 18(6):2555–2567
Bai GF, Sun YQ (2019) Application and research of MEMS sensor in gait recognition algorithm. Clust Comput 22(4):9059–9067
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Dehzangi O, Taherisadr M, Changalvala R (2017) IMU-based gait recognition using convolutional neural networks and multi-sensor fusion. Sensors 17(12):2735
Gafurov D (2007) A survey of biometric gait recognition: approaches, security and challenges. In: Proceedings of the annual Norwegian computer science conference. Oslo, Norway, 5 January, pp 19–21
Khan MA, Akram T, Sharif M, Javed MY, Muhammad N, Yasmin M (2018) An implementation of optimized framework for action classification using multilayers neural network on selected fused features. Pattern Anal Appl 22:1377–1397
Li Z, Zhang G (2011) A gait recognition system for rehabilitation based on wearable micro inertial measurement unit. In: IEEE international conference on robotics and biomimetics. Karon Beach, 7–11 December, pp 1678–1682
Liu Z, Malave L, Sarkar S (2004) Studies on silhouette quality and gait recognition. In: IEEE computer society conference on computer vision and pattern recognition. Washington, DC, 27 June–2 July, 2:II-II
Liu Y, Li YE, Hou J (2010) Gait recognition based on MEMS accelerometer. In: IEEE 10th international conference on signal processing proceedings. Beijing, 24–28 October, pp 1679–1681
Liu GX, Shi LF, Xun JX, Chen S, Zhao L, Shi YF (2018) An orientation estimation algorithm based on multi-source information fusion. Meas Sci Technol 29(11):115101
Loudon SJ, Janice K (2008) The clinical orthopedic assessment guide, 2nd edn. Human Kinetics, Lawrence, pp 395–408
Lu H, Huang J, Saha T, Nachman L (2014) Unobtrusive gait verification for mobile phones. In: Proceedings of the 2014 ACM international symposium on wearable computers. Seattle, Washington, 13–17 September, pp 91–98
Martino L, Read J, Elvira V, Louzada F (2017) Cooperative parallel particle filters for on-line model selection and applications to urban mobility. Digit Signal Process 60:172–185
Mashal I, Alsaryrah O, Chung TY (2016) Testing and evaluating recommendation algorithms in internet of things. J Ambient Intell Humaniz Comput 7(6):889–900
Murray MP, Drought AB, Kory RC (1964) Walking patterns of normal men. J Bone Jt Surg Am 46(2):335–360
Qureshi U, Golnaraghi F (2017) An algorithm for the in-field calibration of a MEMS IMU. IEEE Sens J 17(22):7479–7486
Shuai T, Zhang X, Cai H, Lv Z, Hu C, Xie H (2018) Gait based biometric personal authentication by using MEMS inertial sensors. J Ambient Intell Humaniz Comput 9(5):1705–1712
Sprager S, Zazula D (2009) A cumulant-based method for gait identification using accelerometer data with principal component analysis and support vector machine. World Sci Eng Acad Soc 5(11):369–378
Vikas V, Crane CD (2013) Measurement of robot link joint parameters using multiple accelerometers and gyroscope. In: ASME 2013 international design engineering technical conferences and computers and information in engineering conference. Portland, Oregon, 4 August, pp V06BT07A007–V06BT07A007
Wang SC, Liu Y, Hao WF, Liu KH, Lu WP (2014) Method of recognition of human movement based on inertial sensing. J Electron Meas Instrum 28(6):630–636
Watanabe Y (2014) Influence of holding smart phone for acceleration-based gait authentication. In: Fifth international conference on emerging security technologies. Alcala de Henares, 10–12 September, pp 30–33
Wixted AJ, Thiel DV, Hahn AG, Gore CJ, Pyne DB, James DA (2007) Measurement of energy expenditure in elite athletes using MEMS-based triaxial accelerometers. IEEE Sens J 7(4):481–488
Wu W, Black MJ, Mumford D, Gao Y, Donoghue JP (2004) Modeling and decoding motor cortical activity using a switching Kalman filter. IEEE Trans Biomed Eng 51(6):933–942
Yuan XP (2012) Accelerometer-based gait authentication via neural network. Chin J Electron 21(3):481–484
Zhi L, Zhang G (2012) A gait recognition system for rehabilitation based on wearable micro inertial measurement unit. In: IEEE international conference on robotics and biomimetics, Guangzhou, December 11–14, pp 1678–1682
Zou Q, Wang Y, Zhao Y, Wang Q, Shen C (2019) Deep learning based gait recognition using smartphones in the wild. Machine Learning 1–14. https://arxiv.org/abs/1811.00338?context=eess.SP. Accessed 14 Mar 2020


Author information

Authors and Affiliations

Institute of Electronic CAD, Xidian University, Xi’an, 710071, China
Ling-Feng Shi, Chao-Xi Qiu, Dong-Jin Xin & Gong-Xu Liu

Corresponding author

Correspondence to Ling-Feng Shi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Shi, LF., Qiu, CX., Xin, DJ. et al. Gait recognition via random forests based on wearable inertial measurement unit. J Ambient Intell Human Comput 11, 5329–5340 (2020). https://doi.org/10.1007/s12652-020-01870-x


Received: 25 April 2019
Accepted: 06 March 2020
Published: 16 March 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s12652-020-01870-x
