Abstract
Walking speed is used principally to assess the status of human health. Recent gait recognition systems often experience difficulties, including variations in viewing angle and large intra-class variations. For decades, computer-vision-based approaches have been in great demand and have proven effective in clinical gait analysis. The reference dataset used here was collected under well-controlled conditions. Gait movement rate is the predominant human biomechanical determinant. This paper proposes a scientific way of determining the gait movement pattern of healthy people at a particular speed. Based on deep learning technology, our proposed model comprises a fully convolutional neural network accompanied by batch normalization and max pooling. The proposed model was extensively tested on large-scale gait datasets and compared with benchmark approaches. The proposed model's overall average accuracy is about 93.0%, and the method remains robust where various clothing and carrying conditions are encountered.
6.1 Introduction
Gait is one of the most popular biometric traits of humans because it can be verified without subject cooperation, at a distance from the camera. In a gait recognition model, the speed difference between matching pairs plays an important role, and the gait mode of walking or running makes recognition more challenging. Distinctive human characteristics such as handwriting, voice, and gait have been studied for many years, yielding excellent progress in biometrics [1]. Every individual has a particular way of walking that results from the coordinated activity of the skeletal muscles and nervous system. This makes gait a powerful marker for detecting pathological behavior caused by physical injury, aging, or related issues. These internal and external factors directly influence the movement and activity of the body and result in gait impairment [1]. Biomechanical patterns of human movement are generally speed-dependent; the amplitude of a specific movement usually scales with the movement speed (stride velocity is a determining factor in the nature of the gait phases) [1, 2].
In a typical gait examination, patients perform gait trials at their comfortable speed, and their gait patterns are compared with a reference pattern from a normative database. However, the effect of gait speed is usually not accounted for when the gait patterns of pathological individuals are compared with those of healthy individuals who do not walk at the same speed [2].
The gait cycle is a repeated pattern consisting of steps and strides. A step begins with one foot's initial contact and finishes with the other foot's initial contact. A stride begins with one foot's initial contact and finishes with the same foot's next initial contact, and thus comprises two steps. There are two main phases in the gait cycle: the stance phase comprises 60% of the gait cycle and the swing phase comprises 40%. The stance phase is the time when the foot is in contact with the ground and the weight is borne by the limb. The swing phase is the time during which the reference foot is not in contact with the surface and swings in the air, meaning the body weight is borne by the contralateral limb [1, 3,4,5].
6.2 Related Work
6.2.1 Gait Recognition System
Human authentication based on gait has significant importance and a wide scope of research. The science of recognizing or identifying people by their physical characteristics is known as biometrics, and gait recognition is one form of biometric technology [5]. Researchers working on gait recognition biometrics analyze the movements of body parts such as the foot, the knee, and the shoulder. There are various gait acquisition technologies such as cameras, wearable sensors, and non-wearable sensors. Figure 6.1 demonstrates the gait recognition system.
Any recognition system is developed around a training phase and a testing phase, and there are two tasks: authentication and recognition. The first gait recognition approach was based on automatic video processing [3, 6, 7]; it generates a mathematical model of motion. This method is the most common and requires the study of video samples of walking subjects, joint trajectories, and angles. The second approach uses a radar system to monitor the gait period, which is then compared with other samples to perform detection [7].
A potential solution to this issue would be to collect several walking trials at different walking speeds to construct a reference database covering essentially any conceivable gait speed [8,9,10,11]. The time-consuming nature of such data collection, however, would be cost-prohibitive and impractical. To overcome this barrier, researchers have suggested regression techniques as a feasible alternative for predicting gait parameters from experimental data. The prediction data depend purely on the normal, slow, and fast walking speeds of healthy subjects, but cover only a 10% phase period when the complete gait cycle is considered [1, 2, 7, 12, 13].
The gait energy image (GEI) is an enhanced spatiotemporal feature used to analyze a subject's gait and posture characteristics. Normalized binary silhouette sequences are used instead of the raw silhouette sequence to provide robustness. The GEI can be stated as

G(u, v) = (1/Sf) Σ (s = 1 to Sf) Is(u, v)

where Sf is the total number of silhouette images in the gait cycle, s is the frame index at an instant, and Is(u, v) denotes the silhouette frame at that instant. Figure 6.2 shows the GEI extracted from a full gait cycle.
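The per-pixel averaging in the GEI definition above can be sketched with NumPy; the tiny 2 × 2 frames below are illustrative only, not drawn from CASIA-B.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Average a sequence of binary silhouette frames into one GEI.

    silhouettes: array of shape (Sf, H, W) with values in {0, 1}.
    Returns a float array of shape (H, W) with values in [0, 1].
    """
    frames = np.asarray(silhouettes, dtype=np.float64)
    # (1 / Sf) * sum over s of Is(u, v)
    return frames.mean(axis=0)

# Toy example: two 2x2 silhouette frames from one "gait cycle".
frames = np.array([[[1, 0], [1, 1]],
                   [[1, 0], [0, 1]]])
gei = gait_energy_image(frames)
# gei == [[1.0, 0.0], [0.5, 1.0]]
```

Pixels that are silhouette in every frame stay at 1.0 (static body parts), while pixels covered only part of the time take intermediate values (dynamic parts), which is why the GEI preserves both static and dynamic gait information.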
6.3 Proposed Work
6.3.1 Implementation Details of Proposed Model and Model Summary
The CASIA-B gait dataset contains a silhouette sequence for each view angle. Each silhouette sequence of images has been converted to a gait energy image, so there are 13,640 images in total: 11 view angles × 10 sequence categories × 124 subjects = 13,640 GEIs. To evaluate the proposed method, all sequences were converted to gait energy images (GEIs), and the set of GEIs was fed to the proposed gait recognition model. Figure 6.3 shows the various GEIs under the three conditions and 11 view angles.
In the proposed deep learning model for cross-view gait recognition, one stack consists of a 2D convolutional layer followed by batch normalization and max pooling. Three such stacks are used in total. Figure 6.4 shows the design of the proposed model, and Table 6.1 gives its description.
6.3.1.1 Convolutional 2D Layer
A convolutional layer is the part of a deep neural network that takes an input image, assigns importance to the different aspects/objects in the frame through learned weights and biases, and distinguishes one from another. The preprocessing required by a convolutional neural network is much lower than for other classification algorithms. The input image size is (240 × 240), and the convolutional kernel size is (7 × 7). The input image is downsampled, and the output of the 2D convolutional layer is (234 × 234). The output of the convolutional layer is the feature map.
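The feature-map sizes through the three stacks can be traced with the standard output-size formulas. This is a sketch: the paper states the 7 × 7 convolution and 2 × 2 pooling only once, so the assumption that all three stacks use the same kernel sizes (with 'valid' padding and stride 1) is ours.

```python
def conv_out(size, kernel, stride=1):
    # 'Valid' convolution: output = floor((size - kernel) / stride) + 1
    return (size - kernel) // stride + 1

def pool_out(size, kernel=2):
    # Non-overlapping 2x2 pooling halves each spatial dimension
    return size // kernel

size = 240            # side length of the input GEI
trace = [size]
for _ in range(3):    # three conv + batch-norm + max-pool stacks (assumed identical)
    size = conv_out(size, 7)   # 7x7 convolution
    trace.append(size)
    size = pool_out(size)      # 2x2 max pooling
    trace.append(size)

print(trace)  # [240, 234, 117, 111, 55, 49, 24]
```

The first step reproduces the (240 − 7 + 1) = 234 output stated above; the remaining sizes follow only under the assumed repeated kernel sizes.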
6.3.1.2 Batch Normalization
Batch normalization helps each network layer learn more independently of the other layers. Higher learning rates can be used because batch normalization ensures that no activation goes very high or very low. It also reduces overfitting through a slight regularization effect. To stabilize a neural network, it normalizes the output of the previous activation layer by subtracting the batch mean and dividing by the batch standard deviation.
6.3.1.3 Pooling 2D
An issue with feature maps is that they are sensitive to the position of features in the input. One approach to address this sensitivity is to downsample the feature maps. Pooling layers downsample feature maps by summarizing the features within each kernel window. We use max pooling, so the result is pooled feature maps that highlight the most prominent feature in each window. The kernel size is (2 × 2).
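A minimal NumPy sketch of the 2 × 2 max pooling used here, assuming stride 2 (non-overlapping windows), which the paper does not state explicitly:

```python
import numpy as np

def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2: keep the largest activation
    in each non-overlapping 2x2 window of the feature map."""
    h, w = fmap.shape
    h, w = h - h % 2, w - w % 2            # drop any odd trailing row/column
    blocks = fmap[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fmap = np.array([[1, 2, 0, 1],
                 [3, 4, 1, 0],
                 [0, 1, 5, 6],
                 [1, 0, 7, 8]])
pooled = max_pool_2x2(fmap)
# pooled == [[4, 1], [1, 8]]
```

Each output value is the maximum of one 2 × 2 window, so small shifts of a feature within a window leave the pooled map unchanged, which is the positional robustness described above.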
6.3.1.4 Dropout
To simulate a large number of distinct network architectures, a single model can be trained while randomly dropping out nodes. This is known as dropout, and it provides a very inexpensive and effective regularization that reduces overfitting and improves generalization in deep neural networks.
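The random node-dropping above can be sketched as inverted dropout; the rate of 0.5 in the example is illustrative, as the paper does not state the dropout rate.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Inverted dropout: randomly zero a fraction `rate` of the inputs
    during training and rescale the survivors by 1/(1 - rate), so the
    expected activation is unchanged and no rescaling is needed at test time."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones((4, 4))
y = dropout(x, rate=0.5, rng=rng)
# Surviving entries are scaled to 2.0; roughly half are zeroed.
```

Because a different random subset of nodes is active on every training step, the single model behaves like an ensemble of thinned networks, which is the regularization effect described above.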
There are two sets of GEIs, known as the gallery set and the probe set. The gallery set is used to train the model, and the probe set is used for evaluation. As there are 11 views, 11 models are trained and tested.
6.4 Experiments and Result Analysis
6.4.1 CASIA-B Gait Dataset
This is a large multiview gait database; the data was gathered from 11 views and consists of 124 subjects. It covers three variations, namely view angle, clothing, and carrying conditions, and also provides extracted silhouettes.
The data was gathered in an indoor environment from 124 subjects of both genders (93 males and 31 females) at 25 fps with a frame size of 320 × 240. Therefore, there are 13,640 sequences in total. Figure 6.5 shows how data was collected from different angles [1,2,3,4,5,6, 14]. The data is captured from 11 view angles, from 0° to 180° at 18° intervals. There are 10 sequence categories in total: normal walking with 6 subsets, walking while carrying a bag with 2 subsets, and walking while wearing a coat with 2 subsets.
6.4.2 Experimental Environment
The experiment was performed using Python 3.7, Jupyter Notebook, and the Anaconda environment, with TensorFlow 2.0. The dataset used is the CASIA-B dataset.
6.4.3 Result Analysis on CASIA-B Dataset
The proposed model is trained at each of the 11 view angles, and each trained model is evaluated against probe sets from different viewing angles. Table 6.2 shows the experimental results on the probe set.
It should be noted that the proposed method achieves the best recognition accuracy at almost all angles. This results from using the gait energy image rather than the raw silhouette sequence. For evaluation, we compared our model with GEI-template methods, since the input to our model is GEIs. These models are GaitNet [5], L-CRF [15], LB [16], RLTDA [17], CPM [12], and DV-GEIs [14]. Table 6.3 compares the benchmark approaches and state-of-the-art models with the proposed approach on some of the probe evaluations of the CASIA-B gait dataset.
As Table 6.2 shows, the average accuracy of the proposed model is around 93.0%, which outperforms GaitNet by 12.2% and DV-GEI by 9.6%. Unlike state-of-the-art methods such as L-CRF, our model takes GEIs rather than silhouette sequences as input; this is further evidence that our gait features strongly capture all major gait variations. Table 6.4 shows the comparison of cross-view recognition accuracy on the CASIA-B dataset.
6.5 Conclusion
This paper proposes a deep learning approach that can efficiently extract spatial and temporal parameters. In this model, we fed gait energy images (GEIs) into the model rather than gait silhouette sequences; this method is more efficient than conventional methods. The gait energy image keeps the dynamic and static information of a gait sequence, although time is not properly modeled, which is one limitation of the proposed method. The proposed model is robust to variations including obstructive objects, clothing, and viewing angles, so gait recognition becomes stronger. Our model was tested on the broad CASIA-B dataset. The max pooling layer is used to adapt the spatial information, improving the mapping efficiency. The recognition results obtained demonstrate our model's dominance over well-known approaches.
References
Sepas-Moghaddam, A., Etemad, A.: View-invariant gait recognition with attentive recurrent learning of partial representations. IEEE Trans. Biometrics Behavior Identity Sci.
Sepas-Moghaddam, A., Ghorbani, S., Troje, N.F., Etemad, A.: Gait recognition using multi-scale partial representation transformation with capsules (2020). arXiv preprint arXiv:2010.09084
Lin, B., Zhang, S., Yu, X., Chu, Z., Zhang, H.: Learning effective representations from global and local features for cross-view gait recognition (2020). arXiv preprint arXiv:2011.01461
Fan, C., Peng, Y., Cao, C., Liu, X., Hou, S., Chi, J., Huang, Y., Li, Q., He, Z.: GaitPart: temporal part-based model for gait recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14225–14233 (2020)
Zhang, Z., Tran, L., Yin, X., Atoum, Y., Liu, X., Wan, J., Wang, N.: Gait recognition via disentangled representation learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4710–4719 (2019)
Chao, H., He, Y., Zhang, J., Feng, J.: Gaitset: regarding gait as a set for cross-view gait recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 8126–8133 (2019)
Moissenet, F., Leboeuf, F., Armand, S.: Lower limb sagittal gait kinematics can be predicted based on walking speed, gender, age and BMI. Sci. Rep. 9(1), 1–12 (2019)
Nandy, A., Chakraborty, R., Chakraborty, P.: Cloth invariant gait recognition using pooled segmented statistical features. Neurocomputing 191, 117–140 (2016)
Anusha, R., Jaidhar, C.D.: Human gait recognition based on histogram of oriented gradients and Haralick texture descriptor. Multimedia Tools Appl. 1–22 (2020)
Hasan, M.M., Mustafa, H.F.: Multi-level feature fusion for robust pose-based gait recognition using RNN. Int. J. Comput. Sci. Inform. Secur. (IJCSIS) 18(1) (2020)
Janković, M., Savić, A., Novičić, M., Popović, M.: Deep learning approaches for human activity recognition using wearable technology. Medicinski podmladak 69(3), 14–24 (2018)
Chen, X., Weng, J., Lu, W., Xu, J.: Multi-gait recognition based on attribute discovery. IEEE Trans. Pattern Anal. Mach. Intell. 40(7), 1697–1710 (2017)
Kusakunniran, W., Wu, Q., Zhang, J., Li, H.: Support vector regression for multi-view gait recognition based on local motion feature selection. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 974–981. IEEE (2010)
Liao, R., An, W., Yu, S., Li, Z., Huang, Y.: Dense-view GEIs set: view space covering for gait recognition based on dense-view GAN (2020). arXiv preprint arXiv:2009.12516
Wu, Z., Huang, Y., Wang, L., Wang, X., Tan, T.: A comprehensive study on cross-view gait based human identification with deep CNNs. IEEE Trans. Pattern Anal. Mach. Intell. 39(2), 209–226 (2017)
Hu, H.: Enhanced Gabor feature based classification using a regularized locally tensor discriminant model for multiview gait recognition. IEEE Trans. Circ. Syst. Video Technol. 23(7), 1274–1286 (2013)
Kusakunniran, W.: Recognizing gaits on spatio-temporal feature domain. IEEE Trans. Inf. Forensics Secur. 9(9), 1416–1423 (2014)
Hu, M., Wang, Y., Zhang, Z., Little, J.J., Huang, D.: View-invariant discriminative projection for multi-view gait-based human identification. IEEE Trans. Inf. Forensics Secur. 8(12), 2034–2045 (2013)
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Bharti, J., Lohiya, L. (2022). Cross-View Gait Recognition Using Deep Learning Approach. In: Senjyu, T., Mahalle, P., Perumal, T., Joshi, A. (eds) IOT with Smart Systems. Smart Innovation, Systems and Technologies, vol 251. Springer, Singapore. https://doi.org/10.1007/978-981-16-3945-6_6
Print ISBN: 978-981-16-3944-9
Online ISBN: 978-981-16-3945-6