
1 Introduction

Computer vision has grown into a wide and active area over the past few years, and the analysis of images involving humans is one of its central problems. Human detection techniques are used in many areas, such as monitoring abnormal human behavior, robotics, automobile safety systems, and gait recognition. The main goal of a human detector is to decide whether humans are present in an image; if a human is identified, the image can be used for further analysis. Human detection remains an open and challenging problem in computer vision, owing to varied articulations and poses, different appearances of clothing, and accessories acting as occlusions. In this paper, humans are detected in static images. Identifying humans in a static image is more difficult than in a video sequence because no motion or background information is available to provide clues about the approximate human position. In our approach, the input to the human detector is an image and the output is a decision value indicating whether the image contains a human.

2 Related Works

Many human detection techniques have been proposed, following different approaches. The implicit shape model (ISM) [1], a part-based object detection algorithm proposed by Leibe et al., uses local features derived from a visual vocabulary or codebook as object parts; codebooks are generated using the SIFT [2] or shape context local feature descriptors. Lu et al. proposed a depth-based algorithm [3] that detects humans using the depth information of a given image. Jiaolong et al. proposed a part-based classifier technique [4] to detect humans in a given image window, in which a mixture-of-parts technique is used for part sharing among different aspects. Andriluka et al. proposed a generic approach for nonrigid object detection and articulated pose estimation based on the pictorial structures framework [5]. Gavrila et al. introduced a template matching approach for pedestrian detection [6], in which a template hierarchy of pedestrian silhouettes is built to capture the variety of pedestrian shapes; shapes are identified with the Canny edge detector [7].

Gradient orientation based feature descriptors, such as SIFT [2], HOG [8], and CoHOG [9], represent the recent trend in human detection. SIFT (scale-invariant feature transform) features, proposed by Lowe et al. [2], are used for human body part detection in [10]. Histogram-based features are popular in human recognition and object detection because of their robustness. Histograms of oriented gradients (HOG) [8] is a well-known and effective method for human detection; it uses histograms of oriented gradients as the feature descriptor, and HOG features are robust to illumination variation and object deformation. Co-occurrence histograms of oriented gradients (CoHOG) [9] extends HOG, achieving a higher detection rate and a lower miss rate. CoHOG has recently been used in many computer vision applications, such as object recognition [11], image classification [12], and character recognition [13]. In CoHOG, co-occurrence matrices are calculated over oriented gradients to strengthen the feature descriptor. However, CoHOG considers only the gradient direction and ignores the gradient magnitude. In the proposed method, the gradient magnitude component is also considered to improve the accuracy of the existing CoHOG.

The rest of the paper is organized as follows: Sect. 3 gives a brief overview of HOG and CoHOG. The proposed method, W-CoHOG, is discussed in detail in Sect. 4. Section 5 contains experimental results and a comparison with existing methods. Finally, the work is concluded in Sect. 6.

3 Background: HOG and CoHOG

3.1 HOG

In HOG, gradients are first computed at each pixel of the given image and quantized into nine orientations. Next, the image is divided into small nonoverlapping regions, typically of size 8 × 8 or 16 × 16 pixels. Histograms of oriented gradients are then calculated for each region. Finally, the histograms of all regions are concatenated into a single feature vector.
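To make this pipeline concrete, the following is a minimal NumPy sketch of the HOG computation just described (nine orientation bins, 8 × 8 cells). The function and variable names are ours, and the block normalization step of the full HOG descriptor is omitted to match the simplified description above.

```python
import numpy as np

def hog_features(img, n_bins=9, cell=8):
    """Minimal HOG sketch: per-pixel gradients, 9 orientation bins,
    per-cell histograms, concatenated into one vector."""
    img = img.astype(np.float64)
    # Central-difference gradients in x and y.
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180), quantized into n_bins bins.
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bins = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)

    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):      # nonoverlapping cells
        for j in range(0, w - cell + 1, cell):
            b = bins[i:i+cell, j:j+cell].ravel()
            m = mag[i:i+cell, j:j+cell].ravel()
            # Magnitude-weighted orientation histogram for this cell.
            feats.append(np.bincount(b, weights=m, minlength=n_bins))
    return np.concatenate(feats)
```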

3.2 CoHOG

Co-occurrence histograms of oriented gradients (CoHOG) is an extension of HOG and is more robust than HOG. In CoHOG, pairs of gradient orientations are used instead of single orientations: a co-occurrence matrix is calculated over the pairs of gradient orientations at each of several offsets.

$$C_{\Delta x,\Delta y} \left( {p,q} \right) = \mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{m} \left\{ {\begin{array}{*{20}l} 1 & {{\text{if}}\,O\left( {i,j} \right) = p\,{\text{and}}\,O\left( {i + \Delta x,\,j + \Delta y} \right) = q} \\ 0 & {\text{otherwise}} \\ \end{array} } \right.$$
(1)

Equation (1) shows the calculation of the co-occurrence matrix, where O denotes the gradient orientation at each pixel. Figure 1a shows a typical co-occurrence matrix of oriented gradients and Fig. 1b shows the possible offsets for CoHOG.

Fig. 1 a Typical co-occurrence matrix histogram. b Possible offsets
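A direct transcription of Eq. (1) into Python may help clarify the counting. In this sketch (names are ours), `orient` is assumed to be a 2-D array of orientation labels in {0, …, 7}, and pairs whose offset pixel falls outside the image are simply skipped.

```python
import numpy as np

def cooccurrence(orient, dx, dy, n_levels=8):
    """Co-occurrence matrix of Eq. (1): counts how often orientation p
    at (i, j) co-occurs with orientation q at (i + dx, j + dy)."""
    n, m = orient.shape
    C = np.zeros((n_levels, n_levels), dtype=np.int64)
    for i in range(n):
        for j in range(m):
            i2, j2 = i + dx, j + dy
            if 0 <= i2 < n and 0 <= j2 < m:   # skip out-of-image pairs
                C[orient[i, j], orient[i2, j2]] += 1
    return C
```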

In CoHOG, only the orientation values of the gradients are considered and the magnitude is ignored. In the proposed method the magnitude is also considered, as it carries discriminative information for human detection. Consider the following example: Fig. 2a is quite different from Fig. 2b because of different magnitude values, even though the gradient orientations are the same. Hence magnitude also describes image content, yet existing feature descriptors do not consider magnitude details.

Fig. 2 Magnitudes of two gradients having the same orientation. a and b have the same orientation but are not the same

4 Proposed Method (W-CoHOG)

4.1 Overview

In the CoHOG method, only the gradient directions are considered and the magnitude is ignored. In the proposed method the magnitude is also considered in order to extract a more robust feature: magnitude-weighted co-occurrence histograms of oriented gradients (W-CoHOG) is proposed as a better feature descriptor. Figure 3 outlines the classification process for human detection using the W-CoHOG extraction method.

Fig. 3 Our proposed classification process

Initially, the gradients of the image are computed in magnitude and direction form and converted into oriented gradients. Next, the image is divided into nonoverlapping cells of size 3 × 6 or 6 × 12. Then, a weighted co-occurrence matrix is computed for each cell. Finally, the co-occurrence matrices of all cells are combined.

4.2 Feature Extraction

For a given input image, gradients are computed at each pixel. In this method, the Sobel and Roberts filters are used to compute the gradients; Eqs. (2) and (3) show the gradient calculation using the Sobel and Roberts filters, respectively, for a given input image I.

Sobel gradient operator

$$({\text{a}})\,G_{x} = \left[ {\begin{array}{*{20}c} { - 1} & 0 & { + 1} \\ { - 2} & 0 & { + 2} \\ { - 1} & 0 & { + 1} \\ \end{array} } \right] \,^{*} \,I\quad \quad ( {\text{b)}}\,G_{y} = \left[ {\begin{array}{*{20}c} { + 1} & { + 2} & { + 1} \\ { 0} & { 0} & { 0} \\ { - 1} & { - 2} & { - 1} \\ \end{array} } \right]\,^{*} \,I$$
(2)

Roberts gradient operator

$$({\text{a}})\,G_{x} = \left[ {\begin{array}{*{20}c} { + 1} & { 0} \\ { 0} & { - 1} \\ \end{array} } \right] \,^{*} \,I\quad \quad ({\text{b}})\,G_{y} = \left[ {\begin{array}{*{20}c} { 0} & { + 1} \\ { - 1} & { 0} \\ \end{array} } \right]\,^{*} \,I$$
(3)

Then, the gradients are converted into magnitude and direction using Eq. (4). The gradient directions are quantized into eight equal bins at 45° intervals.

$$({\text{a}})\,\uptheta = \tan^{ - 1} \frac{{g_{y} }}{{g_{x} }}\quad \quad ({\text{b}})\,m = \sqrt {g_{x}^{2} + g_{y}^{2} }$$
(4)
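As an illustrative sketch (names are ours), the gradient computation of Eqs. (2) and (4) and the 45° quantization can be written as follows with SciPy convolution:

```python
import numpy as np
from scipy.ndimage import convolve

# Sobel kernels of Eq. (2).
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = np.array([[ 1, 2, 1], [ 0, 0, 0], [-1, -2, -1]], dtype=np.float64)

def gradients(img):
    """Per-pixel gradient magnitude and 8-bin orientation (45-degree bins)."""
    img = img.astype(np.float64)
    gx = convolve(img, SOBEL_X)
    gy = convolve(img, SOBEL_Y)
    mag = np.sqrt(gx**2 + gy**2)                     # Eq. (4b)
    theta = np.rad2deg(np.arctan2(gy, gx)) % 360.0   # Eq. (4a), in [0, 360)
    orient = (theta // 45.0).astype(int) % 8         # eight 45-degree bins
    return mag, orient
```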

After that, the magnitude matrix is convolved with a mean mask to eliminate noise that may cause an aliasing effect. Equation (5) shows the 7 × 7 mean mask used in the proposed method. Figure 4 shows an overview of the W-CoHOG feature calculation process.

Fig. 4 Overview of W-CoHOG calculation

$${\text{Conv}}_{7 \times 7} = \frac{1}{49}\left[ {\begin{array}{*{20}c} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \end{array} } \right]$$
(5)
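Since the mask of Eq. (5) is a uniform averaging filter, this smoothing step can be applied with a standard mean filter. The snippet below is a sketch, with random data standing in for the gradient magnitude array from the previous step.

```python
import numpy as np
from scipy.ndimage import uniform_filter

# Placeholder for the gradient magnitude array of a 48 x 96 window.
mag = np.random.default_rng(0).random((96, 48))
# Apply the 7x7 mean mask of Eq. (5); uniform_filter is equivalent
# up to boundary handling.
mag_smooth = uniform_filter(mag, size=7)
```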

Weight Function

In the proposed method, the magnitude component of the gradient is used as a weight function to calculate the weighted co-occurrence matrix. To do so, magnitude weights are calculated for each pixel pair and applied to the co-occurrence matrix, so that each entry is influenced by the gradient magnitudes of the contributing pixels. The weight functions used in this method are described below.

Let I be a given input image, let i, j be any two of the eight orientations, and let Δx, Δy be the co-occurrence offset. \(C_{{\Delta x,\Delta y}} \left( {i,j} \right)\) is the weighted co-occurrence matrix for the offset Δx, Δy and orientations i, j. Equations (6) and (7) describe the calculation of the weighted co-occurrence matrix.

$$C_{\Delta x,\Delta y} \left( {i,j} \right) = \mathop \sum \limits_{p = 1}^{n} \mathop \sum \limits_{q = 1}^{m} W_{\left( {p,q} \right),\left( {p + \Delta x,\,q + \Delta y} \right)} \,*\,\alpha$$
(6)

where

$$\alpha = \left\{ {\begin{array}{*{20}l} 1 & {{\text{if}}\,O\left( {p,q} \right) = i \,{\text{and}}\,O\left( {p +\Delta x,q +\Delta y} \right) = j} \\ 0 & {\text{Otherwise }} \\ \end{array} } \right.$$
(7)

Let \(m_{p,q}\) be the gradient magnitude at pixel (p, q) of the input image I, and let \(\bar{M}\) and \(M_{\max}\) be the mean and maximum gradient magnitudes in I. The weight calculation involves only simple operations such as means and divisions. Equations (8) and (9) show two possible weight functions for the pixel pair (p, q) and (p + Δx, q + Δy); either can be used to calculate the weights for the weighted co-occurrence matrix. In this paper, Eq. (8) is used for the experimental results.

$$W_{\left( {p,q} \right),\left( {p + \Delta x,\,q + \Delta y} \right)} = \left(\frac{{m_{p,q} }}{{\bar{M}}}*\frac{{m_{p + \Delta x,\,q + \Delta y} }}{{\bar{M}}}\right) + \mu$$
(8)
$$W_{\left( {p,q} \right),\left( {p + \Delta x,\,q + \Delta y} \right)} = \left(\frac{{m_{p,q} }}{{M_{\max } }}*\frac{{m_{p + \Delta x,\,q + \Delta y} }}{{M_{\max } }}\right) + \mu$$
(9)

where μ is a constant, set to μ = 1.

After the magnitude-weighted co-occurrence matrices are computed for all regions, they are vectorized by simple concatenation of all matrix rows into a single row. As shown in Fig. 1b, 31 offsets are possible when calculating the co-occurrence matrices, but they need not all be used: in the calculation of W-CoHOG, two offsets are sufficient for the pedestrian detection problem, as sketched below.
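Putting Eqs. (6)–(8) together, the weighted co-occurrence computation can be sketched as follows. The names are ours, and the cell size of 6 × 12 and the particular pair of offsets are illustrative assumptions; the text above only states that two offsets suffice.

```python
import numpy as np

def weighted_cooccurrence(mag, orient, dx, dy, mean_mag, n_levels=8, mu=1.0):
    """Weighted co-occurrence matrix of Eq. (6): each co-occurring
    orientation pair contributes the magnitude weight of Eq. (8)
    instead of a plain count."""
    n, m = orient.shape
    C = np.zeros((n_levels, n_levels), dtype=np.float64)
    for p in range(n):
        for q in range(m):
            p2, q2 = p + dx, q + dy
            if 0 <= p2 < n and 0 <= q2 < m:        # alpha = 1 case of Eq. (7)
                w = (mag[p, q] / mean_mag) * (mag[p2, q2] / mean_mag) + mu
                C[orient[p, q], orient[p2, q2]] += w
    return C

def wcohog_vector(mag, orient, offsets=((0, 1), (1, 0)), cell=(6, 12)):
    """Concatenate weighted co-occurrence matrices over nonoverlapping cells."""
    mean_mag = mag.mean()                           # M-bar of Eq. (8), over all of I
    ch, cw = cell
    feats = []
    for i in range(0, mag.shape[0] - ch + 1, ch):
        for j in range(0, mag.shape[1] - cw + 1, cw):
            for dx, dy in offsets:
                C = weighted_cooccurrence(mag[i:i+ch, j:j+cw],
                                          orient[i:i+ch, j:j+cw],
                                          dx, dy, mean_mag)
                feats.append(C.ravel())
    return np.concatenate(feats)
```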

The size of the feature vector is very large in histogram-based feature descriptors, so a linear SVM classifier is suitable for these features. In the proposed method the LIBLINEAR classifier [14] is used; LIBLINEAR is an SVM-based classifier that trains much faster than the standard SVM classifier [15], even on millions of data instances. HOG and CoHOG also used SVM classifiers for classification.
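LIBLINEAR is exposed, for example, through scikit-learn's `LinearSVC`, so training a classifier on the extracted vectors might look like the sketch below; the random placeholder data stands in for real W-CoHOG feature vectors.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Placeholder data standing in for W-CoHOG vectors (label 1 = human, 0 = not).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1024))
y = rng.integers(0, 2, size=200)

clf = LinearSVC(C=1.0)             # linear SVM backed by LIBLINEAR
clf.fit(X, y)
scores = clf.decision_function(X)  # signed scores; threshold for detection
```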

5 Experimental Results

Experiments are conducted on two datasets, the Daimler Chrysler dataset [11] and the INRIA dataset [8], which are familiar benchmark datasets for human detection. From the Daimler Chrysler dataset, 4,800 human images and 5,000 nonhuman images are taken for training, and another 4,800 human images and 5,000 images are taken for testing; each image is 48 × 96 pixels. From the INRIA dataset, 1,208 positive images and 12,180 patches randomly sampled from person-free images are used for training and testing.

Figures 5 and 6 show sample positive and negative examples from the INRIA and Daimler Chrysler datasets, respectively. Negative images are generated by taking 64 × 128 patches from person-free images in the INRIA dataset. Simple Sobel and Roberts filters were used to calculate the gradients of the input images.

Fig. 5 Sample images from the INRIA dataset

Fig. 6 Sample images from the Daimler Chrysler dataset

ROC curves are used for performance evaluation of binary classification problems such as object detection. A sliding-window technique is used to detect humans in the image; a typical scanning window for the INRIA dataset has the same size as the positive images, 64 × 128. In this paper, the true positive rate versus false positives per window (FPPW) is plotted to evaluate the performance of the proposed method and to compare it with state-of-the-art methods; an ROC curve towards the top left of the graph indicates better classification performance. Figures 7 and 8 show that the curves obtained by the proposed method achieve a better detection rate than the existing methods at all false positive rates, or are at least comparable. The results show that our method reduces the miss rate by around 20 % compared with CoHOG, and the accuracy of the classifier is also better than the other state-of-the-art methods shown in the figures. Notably, only two offsets are used in the proposed method instead of all 31 possible offsets, yet good results are obtained by adding the gradient magnitude component.
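For reference, a curve of this kind can be traced by sweeping a threshold over the per-window decision values; below is a sketch with scikit-learn's `roc_curve`, using small hypothetical score and label arrays.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical decision values for scanned windows and ground-truth labels.
scores = np.array([1.2, -0.3, 0.8, -1.1, 0.4, -0.6])
labels = np.array([1, 0, 1, 0, 1, 0])

fpr, tpr, thresholds = roc_curve(labels, scores)
miss_rate = 1.0 - tpr   # miss rate at each operating point of the curve
```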

Fig. 7 Comparison of the proposed method with other state-of-the-art methods (INRIA dataset)

Fig. 8 Comparison of the proposed method with other state-of-the-art methods (Daimler Chrysler dataset)

6 Conclusion

In this paper a new method called weighted CoHOG (W-CoHOG) is proposed as an extension of CoHOG. The magnitude component is added to the feature vector to improve classification. The proposed method achieves improved accuracy on two benchmark datasets, and the experimental results show that its performance is better than that of the other state-of-the-art methods. Although calculating the weights adds computational complexity, the overall feature vector generation time is decreased by reducing the number of offsets to two. Future work involves applying the proposed feature descriptor to other applications such as person tracking.