1 Introduction

Human gait is complex and requires harmony among the neural, muscular, and skeletal systems. Several factors cause variations in human gait. Walking itself involves natural variations in, for example, stride length and stride frequency [1]. Personal parameters such as age [2,3,4,5], weight [4], and sex [5] cause further variations in the walking pattern. In addition to variations related to personal parameters, a walking abnormality or a disease may disturb this harmony. Diseases such as osteoarthritis [6], multiple sclerosis [7], and Parkinson’s disease [8] can cause variations. Early detection of walking problems or abnormalities enables better treatment through early intervention and minimizes the serious consequences of these problems, as reported, for example, for osteoarthritis [9]. Because so many factors contribute to gait variation, determining the source of the differences requires an expert. To reduce subjective errors in detection and to guide clinicians, variations should be defined quantitatively. In this research, an artificial neural network-based motion characterization model is proposed that takes personal parameters as inputs, and the variations of the gait parameters with the personal parameters are investigated.

Gait is assessed in hospitals or medical centers by using standardized gait analyses. Scalar and vectorial gait parameters have been used to investigate gait variations and abnormalities. Step length, stride width, range of motion, and joint stiffness are examples of scalar gait parameters. The variability of step length and width was investigated to understand post-stroke effects [10]. An increased range of motion at the hip joint and a reduced range of motion at the ankle joint were shown for patients with multiple sclerosis [11]. The presence of knee osteoarthritis was found to cause kinematic asymmetries [12]. The variability of foot and trunk acceleration was investigated in control subjects and patients with multiple sclerosis [13]. Joint stiffness is another index, which combines joint angles and joint moments into a single variable. The knee joint stiffness of subjects with severe knee osteoarthritis was higher than that of a control group in the weight acceptance phase due to reduced movement of the joint [14]. In another study, runners with low back pain were found to have greater knee joint stiffness [15]. Ankle joint stiffness was found to be lower for subjects with Prader-Willi syndrome [16] and Down syndrome [16, 17]. In addition, subjects with Down syndrome exhibited greater hip joint stiffness [16, 17]. The leg stiffness of subjects with cerebral palsy is significantly lower than that of a control group during load response, mid-stance, and pre-swing [18]. Joint angles and ground reaction forces (GRFs) are vectorial gait parameters; they contain more information than a single scalar gait parameter, and most gait parameters are derived from them. In this research, the gait cycle duration, body orientation, limb lengths, joint angles, and GRFs were considered as gait parameters to understand the effects of the subjects’ age, sex, height, weight, and walking speed.

In this research, a neural network-based motion characterization method is presented to understand normal gait and its variations depending on personal parameters (age, height, weight, sex, and walking speed). The personal parameters were the inputs of the neural network models, and joint angles, GRFs, body posture, and other gait parameters were predicted by the models. The effects of the personal parameters on the outputs were then investigated. The difference between the model predictions and the experimental data was calculated, and walking trials with excessive differences were examined. The objectives of the present study are (i) to model the effects of personal parameters on gait variables in order to describe normal walking, (ii) to obtain the distribution of the variations, which identifies the range of variation for normal walking, and (iii) to determine the gait phases in which excessive variations are concentrated.

2 Methods

In this study, motion capture experiments were used to create a database of biomechanical gait parameters. Then, artificial neural networks were used to model the relation between personal parameters and gait parameters.

2.1 Experimental setup and data processing

The AIST gait database 2019 was used as experimental data [19]. The experimental protocol was approved by the local institutional review board, and written consent was obtained from the participants. A 3D motion capture system (VICON MX, sampled at 200 Hz) with force plates (AMTI, sampled at 1000 Hz) was used to collect the data. Butterworth filters with cutoff frequencies of 6 Hz and 10 Hz were used to filter the raw marker position data and the GRFs, respectively.
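
The filtering step can be illustrated with a minimal Python sketch based on SciPy. The filter order (fourth) and the zero-phase forward-backward implementation are assumptions, since only the filter type and the cutoff frequencies are reported here. The function name `lowpass` and the synthetic signals are hypothetical and do not come from the database.

```python
import numpy as np
from scipy.signal import butter, filtfilt


def lowpass(signal, cutoff_hz, fs_hz, order=4):
    """Zero-phase low-pass Butterworth filter along the time axis (axis 0).

    order=4 and forward-backward filtering are assumptions; the text only
    states the filter type and the cutoff frequencies.
    """
    b, a = butter(order, cutoff_hz, btype="low", fs=fs_hz)
    return filtfilt(b, a, signal, axis=0)


# Synthetic stand-ins for one second of recorded data (not real measurements).
rng = np.random.default_rng(0)
t_marker = np.arange(200) / 200.0     # marker data sampled at 200 Hz
t_force = np.arange(1000) / 1000.0    # force-plate data sampled at 1000 Hz
clean_marker = np.sin(2 * np.pi * 1.0 * t_marker)
marker_raw = clean_marker + 0.05 * rng.standard_normal(t_marker.size)
grf_raw = np.sin(2 * np.pi * 1.0 * t_force) + 0.05 * rng.standard_normal(t_force.size)

marker_filtered = lowpass(marker_raw, cutoff_hz=6, fs_hz=200)    # 6 Hz for marker positions
grf_filtered = lowpass(grf_raw, cutoff_hz=10, fs_hz=1000)        # 10 Hz for GRFs
```

Forward-backward filtering is chosen in this sketch because it introduces no phase lag, which keeps gait events aligned in time after filtering.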

Subjects younger than 18 years and subjects with a missing data point in the marker position or GRF data during one gait cycle were excluded. Data from 225 subjects who walked freely without any walking assistance were used. The distribution of the subjects’ age, height, weight, sex, and walking speed is summarized in Table 1 as mean values and standard deviations. Participants were asked to walk at their preferred walking speed. Subjects were instructed to enter a room, and all body motion was then recorded. After leaving the room, the subjects turned around and re-entered it. Motion capture data of 10 walking trials, each containing a complete gait cycle, were collected. To investigate the walking performance of the subjects, the filtered experimental data underwent several processing steps: data clipping, calculation of biomechanical parameters, and resampling.

Table 1 Mean values and standard deviations of the subjects’ age, height, weight, and speed

Data clipping is required to store the data in a consistent way. The motion capture data were clipped between the initial data points of two consecutive contacts of the same foot with the force plates, so that a single gait cycle was considered, and the duration of the gait cycle was calculated. The human body was modeled in the sagittal plane as eight rigid links (two feet, two shanks, two thighs, pelvis, and head-arms-trunk) connected by seven revolute joints. Joint angles, GRFs, and body orientations were obtained. Although every record covers one conventional gait cycle, the vectorial quantities may differ in length because the duration of the gait cycle differs between trials. Therefore, the temporal biomechanical gait parameters (joint angles and GRFs) had to be resampled. The number of resampling points was determined by considering the reconstruction error of the data. A previous study by Moissenet et al. showed that increasing the number of points reduces the reconstruction error and that the error becomes almost zero at around 60 data points [20]. For this reason, the temporal data were resampled to 60 data points by cubic spline interpolation. Moreover, the mean value of each joint angle was subtracted from the corresponding joint angle curve, so that all joint angle data had zero mean; this reduces errors related to marker placement and focuses the analysis on the behavior of the joints.
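
The resampling and mean-removal steps can be sketched as follows. This is a minimal illustration using `scipy.interpolate.CubicSpline` with its default boundary conditions, which is an assumption because the spline settings are not reported. The function names `resample_cycle` and `remove_mean` and the synthetic knee-like curve are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def resample_cycle(values, n_points=60):
    """Resample one clipped gait cycle to n_points samples spaced evenly over the cycle."""
    values = np.asarray(values, dtype=float)
    phase_in = np.linspace(0.0, 1.0, values.shape[0])   # normalized time of the raw samples
    phase_out = np.linspace(0.0, 1.0, n_points)         # normalized time of the resampled curve
    return CubicSpline(phase_in, values, axis=0)(phase_out)


def remove_mean(joint_angle):
    """Subtract the cycle mean so that the joint-angle curve has zero mean."""
    joint_angle = np.asarray(joint_angle, dtype=float)
    return joint_angle - joint_angle.mean(axis=0)


# Synthetic knee-like angle for a 1.1 s cycle sampled at 200 Hz (illustrative values only).
t = np.linspace(0.0, 1.1, 221)
knee_raw = 30.0 + 25.0 * np.sin(2 * np.pi * t / 1.1)
knee = remove_mean(resample_cycle(knee_raw))
```

After this step every trial has the same vector length regardless of its cycle duration, so the curves can serve as fixed-size network outputs.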

2.2 Artificial neural networks

In this research, a multilayer perceptron, a supervised neural network model, was used to capture the nonlinear relationship between personal parameters and gait behavior. An artificial neural network-based gait characterization model was created to understand the effect of personal parameters (age, sex, height, weight, and walking speed) on walking behavior and to detect deviations from normal gait exhibited during a gait cycle, as shown in Fig. 1. The input layer of the neural network model, \(\boldsymbol{x}\), contained the personal parameters. The output layers of the neural network models contained gait parameters, for example the orientation of the body limbs at initial contact, the length of each limb, joint angles, and GRFs.

Fig. 1

Overview of the system. Motion capture experiments (schematic), outputs of the experiments, and corresponding neural network models

First, the input vectors were created. The input layer has five nodes for sex, height, weight, age, and walking speed. Then, the vectors that describe the limb lengths, the orientation of all limbs at initial contact, and the period of the gait cycle were created. The limb length and orientation vectors each contain eight nodes, one per limb. The duration of the gait cycle, in other words the period, has a single node. A separate neural network model was used for each output vector. For the joint angles and GRFs, six output vectors were created: the joint angles of the ankle, knee, hip, and pelvis-body joints, and the GRFs along the horizontal and vertical axes. Each joint angle and each GRF component was the output of a separate neural network model.

A total of 2250 gait cycles, 10 gait cycles for each of the 225 subjects, were used. The data were divided into training, validation, and test sets with ratios of 60%, 20%, and 20%, respectively. For fivefold cross-validation, the data were divided randomly into five groups and one of the groups was assigned as test data. The remaining 80% were distributed randomly into training and validation data. The training data were used to calculate the gradient and update the parameters. Early stopping was used to prevent overfitting: if the error on the validation data increases for a number of iterations, training stops. The performance function was the mean squared error (MSE), which was calculated for the training, validation, and test data. To measure the quality of learning, the coefficient of determination (\(R^2\)) and the MSE were checked for each model.
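
The data partitioning and the stopping rule described above can be sketched as follows. This is an illustration under stated assumptions: the split is made at the level of individual gait cycles, the stopping rule is expressed as "no improvement of the validation error for `patience` checks", and `patience=6` is a hypothetical value because the number of iterations is not reported. The function names are hypothetical.

```python
import numpy as np


def five_fold_splits(n_samples, seed=0):
    """Yield (train, validation, test) index arrays for each of the five folds.

    Each fold holds out one fifth of the data for testing; the remaining 80%
    is split 3:1, which gives overall ratios of 60% / 20% / 20%.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), 5)
    for k in range(5):
        test = folds[k]
        rest = rng.permutation(np.concatenate([folds[j] for j in range(5) if j != k]))
        n_val = n_samples // 5
        yield rest[n_val:], rest[:n_val], test


def should_stop(val_errors, patience=6):
    """Return True once the validation error has not improved for `patience` checks.

    patience=6 is an assumed value; the text says only "a number of iterations".
    """
    best_epoch = int(np.argmin(val_errors))
    return len(val_errors) - 1 - best_epoch >= patience


splits = list(five_fold_splits(2250))       # 2250 gait cycles, as in the study
train_idx, val_idx, test_idx = splits[0]    # index sets of the first fold
```

With 2250 gait cycles, each fold in this sketch contains 1350 training, 450 validation, and 450 test cycles.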

The number of nodes in the hidden layer was determined by using fivefold cross-validation. The MSE, \(R^2\), and computational time for learning are given in Table 2. Increasing the number of nodes in the hidden layer reduced the MSE and increased \(R^2\) and the computational time. When the number of nodes in the hidden layer is increased further, the gap between the MSE of the training group and that of the validation/test groups grows. Therefore, the further investigations of the period, limb lengths, orientation, joint angles, and GRFs were carried out with a single hidden layer of 30 nodes, to avoid additional computational cost.
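
To make the selected architecture concrete, the sketch below shows the forward pass of a multilayer perceptron with five inputs, one hidden layer of 30 nodes, and a 60-point output such as one joint angle curve. The tanh activation, the linear output layer, the random initialization, and the standardized input values are all assumptions for illustration; the trained weights and the activation function used in the study are not reproduced here.

```python
import numpy as np


def init_mlp(n_in, n_hidden, n_out, seed=0):
    """Create randomly initialized weights for a single-hidden-layer perceptron."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.standard_normal((n_in, n_hidden)) * 0.1,
        "b1": np.zeros(n_hidden),
        "W2": rng.standard_normal((n_hidden, n_out)) * 0.1,
        "b2": np.zeros(n_out),
    }


def forward(params, x):
    """Map personal parameters (rows of x) to one output vector per row."""
    hidden = np.tanh(x @ params["W1"] + params["b1"])   # tanh is an assumed activation
    return hidden @ params["W2"] + params["b2"]         # linear output layer for regression


# Five inputs (sex, height, weight, age, speed), 30 hidden nodes, 60 output points.
params = init_mlp(n_in=5, n_hidden=30, n_out=60)
x = np.array([[1.0, 0.2, -0.5, 0.3, 1.1]])    # hypothetical standardized inputs
y = forward(params, x)
n_params = sum(p.size for p in params.values())
```

For this configuration the network has 5·30 + 30 + 30·60 + 60 = 2040 trainable parameters, which helps explain why the computational time grows with the number of hidden nodes.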

Table 2 Mean values of MSE, \(R^2\), and computational time for artificial neural network models with different configurations

3 Results

Effects of the input vector on different outputs have been presented by using neural network models. First, separate neural network models between input and output vectors have been created to reveal how personal parameters affect gait parameters. Based on the models, output vectors have been generated using the same input vectors. The difference between experimental data and generated data shows how experimental data deviate from the predictions of the model.

3.1 Regression model of gait parameters

First, neural network models for the different gait parameters were created. Fivefold cross-validation was performed to determine the number of neurons in the hidden layer, and a single hidden layer with 30 neurons was selected. The performance of the artificial neural network models was evaluated by the mean squared error (MSE) and the coefficient of determination (\(R^2\)).
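
The two performance measures can be computed as in the following sketch. It assumes that \(R^2\) is pooled over all elements of the output vectors, which may differ from the exact definition used by the training software; the function names and the numerical example are hypothetical.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error over all elements."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r_squared(y_true, y_pred):
    """Coefficient of determination, pooled over all elements (an assumption)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)            # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    return float(1.0 - ss_res / ss_tot)


# Small illustrative example (values are not taken from the study).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
example_mse = mse(y_true, y_pred)
example_r2 = r_squared(y_true, y_pred)
```

Because the MSE depends on the scale of the output while \(R^2\) is normalized by the variance of the data, an output with a small range of motion can have a low MSE and a low \(R^2\) at the same time.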

The neural network model of the period has an \(R^2\) of 0.9057 and an MSE of 0.0006. A high \(R^2\) and a low MSE were expected for the limb lengths because of the proportionality of the human body. As expected, \(R^2\) was more than 0.99 even with 5 nodes in a single hidden layer and increased further up to 30 nodes. The MSE values of the neural network models for the ankle, knee, and hip joint angles on the test data were 0.002, 0.003, and 0.0012, respectively, and their \(R^2\) values were greater than 0.92. The neural network model of the pelvis-body joint had an MSE of 0.00039 and an \(R^2\) of 0.7666. It should be noted that the MSE of the pelvis-body joint angle was the lowest because of the small range of motion of this joint. The \(R^2\) of the ground reaction forces was more than 0.97.

In addition to \(R^2\) and MSE, the accuracy of the models in physical units was obtained. For this purpose, difference vectors were obtained by subtracting the experimental data from the model predictions. The mean values and standard deviations of the difference vectors for each gait parameter are given in Table 3.

Table 3 Mean values and standard deviations of the difference vectors between model prediction and experiments

3.2 Effects of personal parameters on outputs

The period, the orientation and length of the limbs, the joint angles, and the GRFs vary depending on subject characteristics such as age, sex, and average walking speed. To understand how each personal parameter affects the output vectors, the neural network models were simulated by changing only the corresponding personal parameter while keeping the other personal parameters constant. The period decreased with walking speed, and the other input parameters had a less significant effect on the period. The orientations of the body limbs at initial contact were affected mostly by body height and walking speed.
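
The one-parameter-at-a-time simulation described above can be sketched as follows. The function `toy_period_model` is a hypothetical stand-in for a trained network, its coefficients are invented for illustration, and the input order (sex, height, weight, age, speed) and baseline values are assumptions.

```python
import numpy as np


def sweep_parameter(model, baseline, index, values):
    """Evaluate `model` while varying one input and holding the others at `baseline`."""
    inputs = np.tile(np.asarray(baseline, dtype=float), (len(values), 1))
    inputs[:, index] = values                 # overwrite only the swept parameter
    return np.array([model(x) for x in inputs])


def toy_period_model(x):
    """Hypothetical stand-in for a trained network: the period shortens with speed."""
    sex, height, weight, age, speed = x
    return 1.6 - 0.45 * speed + 0.1 * height


# Assumed input order: sex, height [m], weight [kg], age [years], speed [m/s].
baseline = [0.0, 1.65, 60.0, 40.0, 1.2]
speeds = np.linspace(0.8, 1.6, 5)
periods = sweep_parameter(toy_period_model, baseline, index=4, values=speeds)
```

Replacing `toy_period_model` with the prediction function of a trained model would give curves of the kind used to compare the influence of each personal parameter.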

The effects of personal parameters on joint motion and GRFs are shown in Fig. 2. Joint angles were affected mostly by sex, age, and walking speed. The models estimated that female subjects have a slightly larger range of motion at the ankle joint, whereas male subjects have a larger range of motion at the hip joint. Male subjects also have larger knee flexion/extension during the stance phase (data points 0–30), as shown in Fig. 2.

Fig. 2

Effects of the input parameters (sex, age, height, weight, and preferred walking speed) on joint angles and ground reaction forces. The adaptations are marked by arrows

The effects of age on joint motion were evident especially in the ankle and knee joints. Increasing age decreased the range of motion of the ankle joint and changed the slope of its motion during the second rocker phase (data points 5–30), as shown in Fig. 2. Knee joint motion during the stance phase (data points 0–30) also decreased with increasing age, according to Fig. 2.

Walking speed produced the most prominent adaptations in the joint angles. When walking speed increases, the range of motion of the ankle joint increases. In addition, the slope of the ankle joint angle during the second rocker phase changes with walking speed. According to Fig. 2, the flexion and extension of the knee joint during the stance phase increase with walking speed. Figure 2 also shows that the range of motion of the hip joint increases with walking speed. The hip joint moves faster during the stance phase when walking speed increases, whereas the slope of the curve during the swing phase does not change significantly.

Walking speed also strongly affects both the vertical and horizontal GRFs. When walking speed increases, the loading rate and the magnitude of the first peak of the vertical GRF increase. Similarly, the peak propulsive and braking GRFs in the horizontal direction increase, as shown in Fig. 2.

3.3 Level of deviations

After modeling the tendencies of the gait parameters with respect to the personal parameters, the differences between the model predictions and the experimental dataset were investigated for each output vector. Let \(\boldsymbol{x}_{\mathrm{experimental}}\) and \(\boldsymbol{x}_{\mathrm{NN}}\) be the experimental data and the neural network prediction of an output vector, such as the knee joint angle with \(n = 60\) data points. The deviation, \(v\), was defined as the sum of the absolute values of the elements of the difference vector of each output, as given in Eq. (1).

$$v = \sum_{i = 1}^{n} \left| \boldsymbol{x}_{\mathrm{NN}}(i) - \boldsymbol{x}_{\mathrm{experimental}}(i) \right| \tag{1}$$

Thus, the scalar \(v\) indicates how large the difference between the experiment and the estimate of the artificial neural network model is. The distribution of the deviations for each output is given in Fig. 3. All of the distributions were right-skewed, and only a few trials contained an excessive amount of deviation. The total deviation was then defined as the sum of the deviations of all output vectors. A trial ID was assigned to each gait cycle. The trial IDs and their total deviations are given in Fig. 4, in which the trials with maximum deviation are marked. In addition, each component of the total deviation is given in Table 4 for the trials with maximum deviations. The knee joint angle, hip joint angle, and vertical GRF contributed more to the total deviation than the other temporal output vectors for these trials.
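
The deviation of Eq. (1) and the total deviation of a trial can be computed as in the sketch below. The dictionary layout keyed by output name and the synthetic curves are hypothetical. The sketch simply adds the per-output deviations, which assumes that the outputs are already expressed on comparable scales.

```python
import numpy as np


def deviation(x_nn, x_experimental):
    """Eq. (1): sum of absolute differences over the n data points of one output."""
    x_nn = np.asarray(x_nn, dtype=float)
    x_experimental = np.asarray(x_experimental, dtype=float)
    return float(np.sum(np.abs(x_nn - x_experimental)))


def total_deviation(predicted, measured):
    """Sum of the per-output deviations of one trial (dicts keyed by output name)."""
    return sum(deviation(predicted[name], measured[name]) for name in predicted)


# Synthetic 60-point curves for one trial (illustrative values only).
phase = np.linspace(0.0, 1.0, 60)
measured = {"knee": np.sin(2 * np.pi * phase), "hip": np.cos(2 * np.pi * phase)}
predicted = {"knee": measured["knee"] + 0.1, "hip": measured["hip"] - 0.05}
v_total = total_deviation(predicted, measured)
```

In this example the knee curve is offset by 0.1 and the hip curve by 0.05 at each of the 60 points, so the two deviations are 6 and 3 and the total deviation is 9.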

Fig. 3

Distribution of the deviation in joint angles and GRF

Fig. 4

Distribution of total deviation among trials. X and Y are trial ID and total deviation of the trial, respectively

Table 4 Components of the total deviation in trials with maximum deviation

To illustrate what was measured in the experiments and what was predicted by the artificial neural network model, the joint angles and GRFs of the trial with maximum deviation are given together in Fig. 5. The trial with maximum deviation, trial ID = 1192 (female, age 23 years, height 1.58 m, weight 47 kg, and speed 1.23 m/s), has large deviations in the joint angles and GRFs according to Table 4 and Fig. 5. The model predicted a lower range of motion for the ankle joint than was measured. A difference in knee joint motion during the stance phase was also observed, and the model predicted a lower loading rate of the vertical GRF.

Fig. 5

Experimental and neural network model generated data for joint angles and GRFs for the trial ID = 1192

Another important question is where the excessive differences between model and experimental data are concentrated. For this purpose, the standard deviation of the difference vectors was calculated at each data point. Under the assumption of a normal distribution, the interval of ±3 standard deviations contains more than 99.5% of the data (about 99.7%). If the difference at a data point of an output vector was more than 3 standard deviations, the data point was considered an outlier. The number of outliers at each data point was counted for all outputs. If the differences were normally distributed, no significant difference in the number of outliers between data points would be expected. However, data points with considerably more outliers reveal where deviations commonly occur. Figure 6 shows where the differences between model and experiments are concentrated, plotted on the average joint angle and GRF curves. A black-to-red gradient expresses how frequently the difference exceeds three standard deviations. The number of outliers was normalized for each output vector, and red marks the data points where excessive differences between predicted and experimental data are most frequent.
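
The outlier counting can be illustrated with the following sketch. It assumes that the differences are centered on their mean at each data point before the three-standard-deviation threshold is applied and that the counts are normalized by their maximum; both details are assumptions, as are the function name and the synthetic data.

```python
import numpy as np


def outlier_frequency(differences, n_sigma=3.0):
    """Normalized count, per data point, of trials beyond n_sigma standard deviations.

    differences: array of shape (n_trials, n_points), model prediction minus experiment.
    Returns values in [0, 1], where 1 marks the data point with the most outliers.
    """
    differences = np.asarray(differences, dtype=float)
    centered = differences - differences.mean(axis=0)    # centering is an assumption
    sigma = differences.std(axis=0)                      # one standard deviation per data point
    counts = (np.abs(centered) > n_sigma * sigma).sum(axis=0)
    peak = counts.max()
    return counts / peak if peak > 0 else counts.astype(float)


# Synthetic differences: normal noise, with heavy tails injected at data point 10.
rng = np.random.default_rng(0)
differences = rng.standard_normal((2250, 60))
differences[:50, 10] *= 8.0
freq = outlier_frequency(differences)
```

In this synthetic example the heavy-tailed data point 10 receives the highest normalized frequency, which corresponds to the red end of the gradient described above.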

Fig. 6

The distribution of the high (more than 3 standard deviations) differences between model and experiments for joint angles and GRFs. Black to red gradient color shows the frequency of high difference

According to Fig. 6, the red locations contain more excessive differences than the other locations. For the ankle joint, high differences occur mostly in three regions: the second rocker phase, the termination of stance, and the termination of swing. In the second rocker phase, the foot is stationary on the ground and the shank rotates around the ankle joint. The behavior of the ankle in the second rocker phase is therefore useful for understanding ankle joint characteristics. The termination of swing is important for adjusting the foot orientation so that the foot touches the ground properly. Differences in the ratio of stance to swing time may cause the excessive differences around the termination of stance for the ankle joint and for the GRFs. The knee joint data have one location with high deviation: the weight acceptance phase. Knee joint motion during the weight acceptance phase is an indicator of several problems, such as osteoarthritis and low back pain. The range of motion of the pelvis-body joint can be considered another important parameter; however, it should be noted that the performance of the neural network for this joint was lower. Overall, the neural network model predictions capture the behavior of the joint angles and GRFs, and the red areas are the phases in which the output vectors exhibit excessive differences.

4 Discussions

In this research, various gait parameters of 225 subjects were obtained in 10 trials each, which gives 2250 samples for each gait parameter in total. First, the effects of personal parameters on gait parameters were modeled by using artificial neural networks to describe normal walking while accounting for personal parameters. All gait parameters except the pelvis-body joint motion showed good fitting performance, with \(R^2\) greater than 0.90.

The effects of personal parameters on joint angles and GRFs were modeled by using artificial neural networks. The artificial neural network models reproduced, both collectively and individually, the effects of personal parameters on gait parameters that were reported in other studies. The neural network model predictions agree with previous works [2, 5, 21]. The effects of walking speed on joint angles are similar to those in previous research [21]. Similarly, the effects of age obtained with the neural network models are consistent with previous works on the effect of aging on ankle and knee joint motion [2].

The artificial neural network models expressed the tendencies of the different gait parameters depending on the personal parameters. The neural network model of the pelvis-body joint motion is less accurate in terms of the coefficient of determination, \(R^2\). The pelvis-body joint has a smaller range of motion and is more sensitive to small errors than the other joints. In addition, small but sudden changes in spinal posture can cause a spiky joint angle curve, which reduces the accuracy of the model. When the accuracy of the models was considered in physical units, the standard deviation of the error in the joint angles was around 3 degrees. In a similar previous study that used multiple linear regression, the standard deviation of the error was between 4 and 6 degrees [20]. This suggests that the present model is more accurate than the multiple linear regression model. Moreover, errors within ±5 degrees have been considered acceptable for clinical applications [22].

Another purpose of the study was to detect deviations from normal gait. The deviation, \(v\), was used for this purpose. According to Table 4 and Fig. 3, the deviation of the knee joint angle of trial ID = 1192 is quite high relative to the rest of the population. The predicted motion and the experimental data are given in Fig. 5. The deviation of the knee joint angle is evident, especially in the loading phase of the knee joint. In future research, a threshold level of deviation should be set to determine abnormality by including a dataset of abnormal walking.

The present study shows that artificial neural network models improve the accuracy of normal gait prediction models. The effects of age, sex, weight, height, and walking speed on human gait were obtained. The application of neural network models to gait data can be extended further to understand adaptations to conditions such as slope walking, walking with additional mass, and stair ascent/descent. Predicted gait data could then be used to design personalized walking assistive devices, orthoses, and prostheses.

One drawback of the proposed method is the non-uniqueness of the outputs for a given set of personal parameters. Variability is expected in each gait parameter; therefore, upper and lower bounds should be determined in future studies to specify the range of normal gait.

5 Conclusion

The artificial neural network is a technique for classification, regression, and clustering problems. Recently, neural network applications to the complex behavior of human gait have gained attention. This research aimed to detect variations in a subject’s gait behavior by considering personal parameters such as age, sex, and walking speed. Neural network models were built to understand the relationship between personal parameters and gait parameters. The differences between the neural network model estimates and the experimental data were calculated. The effects of age, sex, height, weight, and walking speed on gait variables were discussed. The walking trials with excessive deviations were examined in detail by considering the distribution of the deviations in each gait variable. The neural network-based motion characterization model can indicate where excessive differences from normal gait occur. Finally, the stages of walking in which large deviations are concentrated were shown.