Human Action Recognition by Employing DWT and Texture

Thanikachalam, V.; Thyagharajan, K. K.

doi:10.1007/978-81-322-2135-7_34

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 325)

Abstract

Human action recognition is a very challenging task because different people may perform the same action with great variability. It is involved in the development of applications such as automatic monitoring, surveillance, and intelligent human–computer interfaces. We propose a template-based action recognition scheme that classifies human actions in a video based on the positive portion. We first define the accumulated motion image (AMI) using frame differences to represent the spatiotemporal features of the occurring actions. Then, the direction of motion is found by computing the motion history image (MHI). Texture and spatial information are extracted from the AMI and MHI using the local binary pattern (LBP) and the discrete wavelet transform (DWT), respectively. Object detection and extraction of moving objects are performed by feature extraction over the LBP and DWT outputs. The feature vectors are computed using the seven Hu moments. The system is trained using a nearest neighbor classifier, and the actions are classified and labeled accordingly. The experiments are conducted on the Weizmann dataset.

1 Introduction

Due to the popularization of surveillance cameras and personal video devices, video-based human motion analysis and recognition have become a highly active area in computer vision. Human action recognition is the process of identifying human actions that occur in video sequences. It is difficult for many reasons, such as the high dimensionality of video data, intra-class variability caused by scale, viewpoint, and illumination changes, low resolution, and poor video quality. Its applications include surveillance footage analysis, user interfaces, robotics, automatic video organization, patient monitoring systems, and athletic performance analysis. Even though this topic has been studied for several years, accurately classifying the actions performed by humans remains a problem.

2 Review of Related Work

Various surveys in this area cover recognition methodologies as well as approach-based and general overview-based taxonomies [1, 2]. Tracking of human motion and action recognition from image sequences are analyzed in [3]. Wang et al. [4] extended [3] by surveying later developments in motion analysis. Their survey is organized hierarchically and describes the issues that arise during detection, tracking, and behavior understanding within a human motion analysis system. Kim et al. [5] proposed a scheme based on an ordinal measure of accumulated motion in a single query action video, without considering the whole silhouette of the human body as in the earlier work [6].

A set of activities is described using a motion template system in [7]. Various methods that employ MHI are reviewed in [8], together with the variants and applications of this method. A combination of MHI and MEI is proposed in [9] for representing motion with optical flow-based directional history and energy image templates. To extract useful information from given video sequences, Shao et al. [10] proposed a system that uses wavelet transform-based descriptors for the representation and recognition of human actions. A multiscale representation is proposed in [11], which compares the efficiency of the Gabor filter and the discrete wavelet transform (DWT) when applied prior to dimensionality reduction. The results show that the Gabor filter achieves the highest recognition efficiency.

The local binary pattern (LBP) has been in use for the past 10 years and is widely used for texture classification [12–14].

3 Proposed Work

The proposed method consists of the following stages: accumulated motion image (AMI), motion history image (MHI), DWT, LBP, feature extraction, training, and classification. The overall procedure is shown in the flow diagram of Fig. 1.

3.1 Accumulated Motion Image (AMI)

To represent the spatiotemporal features of human actions, the proposed system defines the AMI, which is computed from frame differences as in Eq. 1.

$$ {\text{AMI}} = \frac{1}{T}\sum\limits_{t = 1}^{T} {\left| {{D}(x,y,t)} \right|} $$

(1)

where D(x, y, t) = I(x, y, t) − I(x, y, t − 1) and T denotes the total number of frames present in a single action video.
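Eq. 1 can be illustrated with a minimal sketch. It assumes grayscale frames stored as plain Python lists of rows (a practical implementation would more likely use an array library), and it divides by the number of frame differences, which is one possible reading of T. The function name `accumulated_motion_image` is hypothetical and not taken from the paper.

```python
def accumulated_motion_image(frames):
    """Average absolute frame difference per pixel, following Eq. 1.

    frames: list of grayscale frames, each a list of rows of numbers.
    Assumption: the sum is normalised by the number of frame differences.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    height, width = len(frames[0]), len(frames[0][0])
    total = [[0.0] * width for _ in range(height)]
    # D(x, y, t) = I(x, y, t) - I(x, y, t - 1), accumulated as |D|
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                total[y][x] += abs(curr[y][x] - prev[y][x])
    num_diffs = len(frames) - 1
    return [[value / num_diffs for value in row] for row in total]


# Three tiny 2x2 frames in which two pixels change at different times.
frames = [
    [[0, 0], [0, 0]],
    [[10, 0], [0, 0]],
    [[10, 20], [0, 0]],
]
ami = accumulated_motion_image(frames)
```

Pixels that never change stay at zero in the result, so the AMI highlights only the regions where motion occurred.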

Figure 2 shows the AMI computed for a jumping jack action and a one-hand wave action.

3.2 Motion History Image (MHI)

MHI is extensively used in action recognition research areas. It provides motion shape information of a video to recognize actions.

To represent how the motion occurs in a video, a motion history image (MHI) is computed using a simple replacement and decay operator as in Eq. 2

$$ H_{\tau}(x,y,t) = \begin{cases} \tau & \text{if } D(x,y,t) = 1 \\ \max\left(0,\, H_{\tau}(x,y,t-1) - 1\right) & \text{otherwise} \end{cases} $$

(2)

where τ is the current time stamp and D is the absolute value of the silhouette difference between frames t and t − 1. The result is a scalar-valued image in which brighter pixels indicate more recent motion.
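The replacement and decay operator of Eq. 2 can be sketched as follows. The sketch assumes binary silhouettes (values 0 or 1) stored as lists of rows, so that the absolute difference is itself 0 or 1, and it passes τ as a fixed value for simplicity. The function names `update_mhi` and `motion_history_image` are hypothetical.

```python
def update_mhi(prev_mhi, motion_mask, tau):
    """One step of Eq. 2: moving pixels are set to tau, others decay by 1."""
    return [
        [tau if moving == 1 else max(0, h - 1) for h, moving in zip(h_row, m_row)]
        for h_row, m_row in zip(prev_mhi, motion_mask)
    ]


def motion_history_image(silhouettes, tau):
    """Apply Eq. 2 over a sequence of binary silhouettes.

    Assumption: tau is held constant over the whole sequence.
    """
    height, width = len(silhouettes[0]), len(silhouettes[0][0])
    mhi = [[0] * width for _ in range(height)]
    for prev, curr in zip(silhouettes, silhouettes[1:]):
        # For binary silhouettes the absolute difference is already 0 or 1.
        mask = [
            [abs(c - p) for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(curr, prev)
        ]
        mhi = update_mhi(mhi, mask, tau)
    return mhi


# A single bright pixel moving left to right across a 1x3 image.
silhouettes = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]
mhi = motion_history_image(silhouettes, tau=5)
```

In the result, the pixel where motion occurred earliest holds the smallest non-zero value, which is how the MHI encodes the direction of motion.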

Figure 3 shows the MHI computed for a jumping jack action and a one-hand wave action.

3.3 Discrete Wavelet Transform (DWT)

DWT is used to extract the spatial features and for dimensionality reduction. In the proposed system, we use Haar wavelet filter for decomposition of the MHI image.
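As an illustration of Haar decomposition, the sketch below computes one level of the orthonormal 2D Haar transform on an image stored as a list of rows. The paper does not state the decomposition level or which sub-band is retained, so the single level, the sub-band names, and the function name `haar_dwt2` are assumptions made for this example.

```python
def haar_dwt2(image):
    """One-level orthonormal 2D Haar DWT of an image with even dimensions.

    Returns four sub-bands (ll, lh, hl, hh), each half the input size.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for y in range(0, height, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for x in range(0, width, 2):
            a, b = image[y][x], image[y][x + 1]
            c, d = image[y + 1][x], image[y + 1][x + 1]
            ll_row.append((a + b + c + d) / 2)  # low-pass in both directions
            lh_row.append((a + b - c - d) / 2)  # difference between the two rows
            hl_row.append((a - b + c - d) / 2)  # difference between the two columns
            hh_row.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


ll, lh, hl, hh = haar_dwt2([[1, 2], [3, 4]])
```

The `ll` sub-band is a half-resolution approximation of the input, which is what makes the transform useful for dimensionality reduction, while the other three sub-bands carry the detail information.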

Figure 4 shows the output of the DWT.

3.4 Local Binary Pattern (LBP)

LBP is applied over the AMI to extract edge information. In texture analysis, P equally spaced sampling points are taken on a circular neighborhood of radius R. The LBP code of a pixel is a P-bit binary number given by Eq. 3

$$ \text{LBP}_{P,R} = \sum_{p=0}^{P-1} s(g_{p} - g_{c})\,2^{p}, \quad s(x) = \begin{cases} 1, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases} $$

(3)

where

g_c: gray value of the center pixel (x_c, y_c, t_c)
g_p: gray value at the p-th sampling point
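The LBP code can be sketched for the common case P = 8, R = 1. This example assumes the eight immediate neighbors of a pixel stand in for the circular sampling points (no interpolation), and the ordering of the neighbors is an arbitrary choice for illustration. The names `NEIGHBOR_OFFSETS`, `lbp_code`, and `lbp_image` are hypothetical.

```python
# (dy, dx) offsets of the 8 neighbours; index p in this list is the bit position.
NEIGHBOR_OFFSETS = [
    (0, 1), (-1, 1), (-1, 0), (-1, -1),
    (0, -1), (1, -1), (1, 0), (1, 1),
]


def lbp_code(image, y, x):
    """LBP code of the interior pixel at row y, column x (P = 8, R = 1)."""
    center = image[y][x]
    code = 0
    for p, (dy, dx) in enumerate(NEIGHBOR_OFFSETS):
        # s(g_p - g_c) = 1 when the neighbour is at least as bright as the centre
        if image[y + dy][x + dx] >= center:
            code += 1 << p
    return code


def lbp_image(image):
    """LBP codes for all interior pixels; border pixels are skipped."""
    height, width = len(image), len(image[0])
    return [
        [lbp_code(image, y, x) for x in range(1, width - 1)]
        for y in range(1, height - 1)
    ]


codes = lbp_image([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

Each code lies between 0 and 255, and a histogram of these codes is a typical texture descriptor.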

Figure 5 shows the output of the LBP.

3.5 Feature Extraction

In the present work, the moment invariants are evaluated using central moments of the image function I(x, y) up to third order. Regular moments are defined as in Eq. 4

$$ M_{pq} = \sum_{x} \sum_{y} x^{p} y^{q} I(x,y) $$

(4)

where p, q = 0, 1, 2, … and M_pq is the (p + q)th-order moment of the image function I(x, y). The central moments of order up to 3, which are invariant under translation, are calculated from the regular moments of Eq. 4.
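The relationship between regular moments, central moments, and the Hu invariants can be sketched as below. The example assumes x is the column index and y the row index of an image stored as a list of rows, and it computes only the first Hu invariant, φ1 = η20 + η02, from the normalized central moments. The function names are hypothetical.

```python
def raw_moment(image, p, q):
    """Regular moment M_pq of Eq. 4 (x = column index, y = row index)."""
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def central_moment(image, p, q):
    """Central moment mu_pq, taken about the intensity centroid."""
    m00 = raw_moment(image, 0, 0)
    x_bar = raw_moment(image, 1, 0) / m00
    y_bar = raw_moment(image, 0, 1) / m00
    return sum(
        ((x - x_bar) ** p) * ((y - y_bar) ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def hu_phi1(image):
    """First Hu invariant: eta_20 + eta_02 (normalised central moments)."""
    mu00 = central_moment(image, 0, 0)
    # eta_pq = mu_pq / mu_00 ** (1 + (p + q) / 2); the exponent is 2 for p + q = 2
    eta20 = central_moment(image, 2, 0) / mu00 ** 2
    eta02 = central_moment(image, 0, 2) / mu00 ** 2
    return eta20 + eta02


# The same 2x2 blob placed at two different positions in a 4x4 image.
blob = [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]
shifted = [[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```

Comparing `central_moment(blob, p, q)` with `central_moment(shifted, p, q)` shows the translation invariance: the values agree up to floating-point rounding, whereas the regular moments differ.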

3.6 Training and Classification

The system has to be trained with a classification algorithm that finds the closest match for each test action. In the proposed system, the K-nearest neighbor (KNN) classification method is applied. KNN is a supervised learning algorithm that has been used in many applications in data mining, statistical pattern recognition, and other fields.

KNN is a method for classifying objects based on the closest training examples in the feature space. An object is classified by a majority vote of its K nearest neighbors, where K is always a positive integer. If K = 1, the method is called the nearest neighbor (NN) classifier. The neighbors are taken from a set of objects for which the correct classification is known.
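A minimal sketch of the majority-vote rule described above follows. The paper does not name the distance metric, so Euclidean distance is an assumption here, and the feature values, labels, and the function name `knn_classify` are invented for illustration only.

```python
import math
from collections import Counter


def knn_classify(train_features, train_labels, query, k=1):
    """Label a query vector by majority vote among its k nearest neighbours.

    Assumption: Euclidean distance between feature vectors.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2D feature vectors; real feature vectors would hold Hu moments.
train_features = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]]
train_labels = ["bend", "bend", "walk", "walk"]

nearest = knn_classify(train_features, train_labels, [0.2, 0.1], k=1)
voted = knn_classify(train_features, train_labels, [4.8, 5.1], k=3)
```

With k = 1 the function reduces to the nearest neighbor classifier. With k = 3 the query near the "walk" examples is still labeled "walk", because two of its three nearest neighbors carry that label.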

3.7 Weizmann Dataset

The proposed system uses the Weizmann dataset, which covers a large number of subjects and actions. It includes 81 low-resolution videos from nine different people, each performing 10 natural actions. A sample is shown in Fig. 6.

4 Results and Discussion

The proposed system is evaluated on the Weizmann dataset using 8 actions performed by different persons. The actions are bend, jumping jack, jumping, walk, run, skip, gallop side, and wave.

Number of actions taken for testing = N

The total classification rate of the proposed system is calculated as follows

$$ C = \frac{N - \text{Actions wrongly recognized}}{N} $$

The percentage of videos giving the correct output is 90 %, and the percentage giving the wrong output is 10 %. This accuracy is obtained by using both the AMI with LBP and the MHI with DWT. The result is better when the two techniques are concatenated than when they are applied separately. The proposed method is able to recognize 9 out of 10 actions.
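The classification rate formula can be checked against the reported figures with a short sketch. Taking N = 10 with one wrongly recognized action is an illustrative reading of "9 out of 10", not a test set size stated in the paper, and the function name `classification_rate` is hypothetical.

```python
def classification_rate(num_tested, num_wrong):
    """C = (N - actions wrongly recognized) / N."""
    if num_tested <= 0:
        raise ValueError("num_tested must be positive")
    if not 0 <= num_wrong <= num_tested:
        raise ValueError("num_wrong must lie between 0 and num_tested")
    return (num_tested - num_wrong) / num_tested


# Illustrative values: 10 test actions, 1 of them wrongly recognized.
rate = classification_rate(10, 1)
```

Under these assumed values the rate is 0.9, which matches the reported 90 %.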

5 Conclusions

The AMI helps in handling variations of appearance and clearly shows the moving pixels. The MHI gives the direction of motion. LBP provides the shape and texture information of the AMI and also helps to improve recognition accuracy. DWT aids in extracting the spatial features of the MHI. The feature vectors of the LBP and DWT outputs are then obtained using the seven Hu moments, the distances between them are evaluated by the KNN classifier, and the actions are labeled accordingly. The proposed system gives high performance when compared with previous proposals that use KNN. The computation is simpler and less time-consuming.

References

1. J.K. Aggarwal, M.S. Ryoo, Human activity analysis: a review. ACM Comput. Surv. 43(3), 16 (2011)
2. T. Moeslund, E. Granum, A survey of computer vision based human motion capture. Comput. Vision Image Understand. 81, 231–268 (2001)
3. J.K. Aggarwal, Q. Cai, Human motion analysis: a review. Comput. Vision Image Understand. 7(3), 428–440 (1999)
4. L. Wang, W. Hu, T. Tan, Recent developments in human motion analysis. Pattern Recognit. 36, 585–601 (2003)
5. W. Kim, J. Lee, M. Kim, D. Oh, C. Kim, Human action recognition using ordinal measure of accumulated motion. EURASIP J. Adv. Signal Process. 2010, 1–12 (2010)
6. V.H. Chandrashekhar, K.S. Venkatesh, Action energy images for reliable human action recognition, in Proceedings of the Asian Symposium on Information Display (ASID ’06), pp. 484–487 (2006)
7. J. Davis, A. Bobick, The representation and recognition of action using temporal templates, in IEEE Conference on Computer Vision and Pattern Recognition (1997)
8. M.A.R. Ahad, J.K. Tan, H. Kim, S. Ishikawa, Human activity analysis: concentrating on motion history image and its variants, in ICROS-SICE International Joint Conference (2009)
9. M.A.R. Ahad, J.K. Tan, H. Kim, S. Ishikawa, Action recognition by employing combined directional motion history and energy images. Image Vision Comput. 73–78 (2010)
10. L. Shao, R. Gao, Y. Liu, H. Zhang, Transform based spatiotemporal descriptors for human action recognition. Neurocomputing 74, 962–973 (2010)
11. R. Shrivastava, A. Pratap Singh, Analysis and performance of three methods of human action recognition. Int. J. Adv. Res. Electr. Commun. Eng. (IJARECE) 19(3), 72 (2012)
12. T. Ahonen, A. Hadid, M. Pietikainen, Face description with local binary patterns: application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(12), 2037–2041 (2006)
13. T. Ojala, M. Pietikainen, T. Maenpaa, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 971–984 (2002)
14. P. Guo, Z. Miao, Motion description with local binary pattern and motion history image: application to human motion recognition, in IEEE International Workshop on Haptic Audio Visual Environments and their Applications, pp. 18–19 (2008)


Author information

Authors and Affiliations

SSN College of Engineering, Chennai, India
V. Thanikachalam
RMD Engineering College, Chennai, India
K. K. Thyagharajan

Corresponding author

Correspondence to V. Thanikachalam.

Editor information

Editors and Affiliations

Electrical & Electronics Engineering, Noorul Islam Centre for Higher Education, Kumaracoil, Tamil Nadu, India
L Padma Suresh
Electrical and Electronics Engineering, SRM Engineering College, Kattankulathur, Tamil Nadu, India
Subhransu Sekhar Dash
Electrical Engineering, IIT Delhi, New Delhi, Delhi, India
Bijaya Ketan Panigrahi

Cite this paper

Thanikachalam, V., Thyagharajan, K.K. (2015). Human Action Recognition by Employing DWT and Texture. In: Suresh, L., Dash, S., Panigrahi, B. (eds) Artificial Intelligence and Evolutionary Algorithms in Engineering Systems. Advances in Intelligent Systems and Computing, vol 325. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2135-7_34

DOI: https://doi.org/10.1007/978-81-322-2135-7_34
Published: 26 November 2014
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2134-0
Online ISBN: 978-81-322-2135-7
eBook Packages: Engineering, Engineering (R0)