1 Introduction

Human–computer interaction (HCI) systems based on hand gestures find applications in sign language communication. Sign language is a mode of communication within the deaf community that uses gestures; Indian Sign Language (ISL) is the sign language used in India. The idea is to make the computer understand ISL alphabets from hand shapes and render them in textual/audio form on the screen, thus easing interaction with deaf people without the need for an interpreter. For this, moment-based shape features play an important part in distinguishing between different ISL alphabets. The proposed ISL recognition system is shown in Fig. 1.

Fig. 1

Block diagram of the proposed ISL recognition system

2 Literature Survey

In the pattern recognition domain, moments are known to capture both global and local shape information; they are therefore termed shape-based features.

After the introduction of the non-orthogonal Hu moments in 1962, continuous orthogonal moments such as Zernike and Legendre moments were introduced [1] and deployed in various applications of shape analysis [2]. Continuous orthogonal moments require transforming the image co-ordinate space into a different domain; for example, Zernike moments are defined in polar co-ordinates. They also involve approximating continuous integrals, which introduces discretization error and limits accuracy as the order increases [3]. Their computational complexity likewise grows with order. Due to these limitations, discrete orthogonal moments such as Tchebichef moments were introduced; they are used as global descriptors and are defined in the image co-ordinate space itself [4].

Krawtchouk moments also fall under the category of discrete orthogonal moments, being based on the classical Krawtchouk polynomials [5]. For Krawtchouk moments the discretization error is non-existent. They have been used in image reconstruction and object recognition because of their minimal information redundancy [6–9]. Moreover, they can extract local information by varying the region of interest (ROI) in the image. Krawtchouk moments have been used in various applications such as character recognition [6, 10–12], image classification [13], 3-D object retrieval [14], face recognition [15–20], gesture recognition [21, 22], speech signal processing [23], watermarking systems [24] and medical image analysis [25].

Like Krawtchouk and Tchebichef moments, Dual-Hahn moments are discrete orthogonal moments. They can, however, be used as both local and global descriptors, giving them an added advantage over other moments [7, 26].

Krawtchouk moments were first used for object recognition by Yap et al. [6]. A database of 7 English uppercase binary alphabets, rotated by various angles and scaled, was used; Krawtchouk moments outperformed Hu moments for both noisy and noiseless images. Sit et al. [11] proposed local Krawtchouk- and Hu-based invariants, tested on a database of 9 English uppercase binary alphabets and 9 gray-scale clip-art images. It was concluded that Krawtchouk moments outperformed Hu moments in terms of recognition accuracy.

Image analysis using Dual-Hahn moments was carried out by Zhu et al. [26], who concluded that Dual-Hahn moments outperform Hu, Legendre, Tchebichef and Krawtchouk moments in recognition capability on a database of 7 English binary alphabets under both noisy and noiseless conditions.

Significant research has been done in face recognition using discrete orthogonal moments. Krawtchouk moments were extracted on a database of 40 subjects differing in expression, position, rotation and scale, and outperformed Geometric, Zernike and Tchebichef moments in terms of recognition accuracy. Krawtchouk moments were found to give good classification accuracy for face recognition even with the addition of noise [15–17].

However, very few papers have used discrete orthogonal moments in the gesture recognition domain. Priyal et al. [21, 22] compared the recognition accuracy of Krawtchouk moments with Zernike, Geometric and Tchebichef moments on a database of 10 digit gesture signs collected from 23 users, with rotation, translation and scaling. Krawtchouk moments outperformed the other moments and were found to be viewpoint- and user-invariant. However, the role of Krawtchouk moments in the gesture recognition domain was not deeply investigated.

In this paper, an ISL database containing 26 ISL alphabets is collected on uniform and complex backgrounds with variations in position, scale and rotation. Based on the literature survey, Krawtchouk and Dual-Hahn moments are among the best shape descriptors. The paper focuses on the following objectives:

  1. To extract Krawtchouk and Dual-Hahn moment-based features till the 5th order for both the Jochen-Triesch and ISL databases by varying the ROI.

  2. To select a minimum possible feature set that gives good recognition accuracy across various classifiers using the correlation-based feature selection (CFS) algorithm.

  3. To prove that discrete orthogonal moment-based features have shape recognition capability and are user, position, rotation and scale invariant.

3 Krawtchouk Moments

Krawtchouk moments are derived from classical Krawtchouk polynomials, associated with binomial functions [5].

3.1 Krawtchouk Polynomials

The rth order classical Krawtchouk polynomial is:

$$K_{r}\left(i;p,X\right) = \sum_{k = 0}^{X} a_{k,r,p}\, i^{k} = {}_{2}F_{1}\left(-r, -i; -X; \frac{1}{p}\right)$$
(1)

where i, r = 0, 1, 2, …, X; X > 0; p ∈ (0, 1); and \(a_{k,r,p}\) are the Krawtchouk polynomial coefficients.

Here, \({}_{2}F_{1}\) is the hypergeometric function:

$${}_{2}F_{1}\left(m,n;o;t\right) = \sum_{k = 0}^{\infty} \frac{(m)_{k}(n)_{k}\,t^{k}}{(o)_{k}\,k!}$$
(2)

where \((m)_{k}\) is the Pochhammer symbol:

$$(m)_{k} = m(m + 1)\cdots(m + k - 1) = \frac{\Gamma(m + k)}{\Gamma(m)}$$

The normalized Krawtchouk polynomials are:

$$\tilde{K}_{r}\left(i;p,X\right) = K_{r}\left(i;p,X\right)\sqrt{\frac{1}{\rho(r;p,X)}}$$
(3)

The weighted Krawtchouk polynomials are [6]:

$$\bar{K}_{r}\left(i;p,X\right) = K_{r}\left(i;p,X\right)\sqrt{\frac{w(i;p,X)}{\rho(r;p,X)}}$$
(4)

The weight function is:

$$w(i;p,X) = \binom{X}{i} p^{i}(1 - p)^{X - i}$$
(5)
$$\rho(r;p,X) = (-1)^{r}\left(\frac{1 - p}{p}\right)^{r}\frac{r!}{(-X)_{r}}$$
(6)

The Krawtchouk polynomials till the second order are:

$$K_{0}(i;p,X) = 1$$
(7)
$$K_{1}(i;p,X) = 1 - \left[\frac{1}{Xp}\right]i$$
(8)
$$K_{2}(i;p,X) = 1 - \left[\frac{2}{Xp} + \frac{1}{X(X - 1)p^{2}}\right]i + \left[\frac{1}{X(X - 1)p^{2}}\right]i^{2}$$
(9)

As the order of the Krawtchouk polynomials increases, the range of their values also increases, resulting in numerical instability. Weighted Krawtchouk polynomials were therefore introduced to overcome this drawback [6].
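As a concrete check of Eqs. (1), (8) and (9), the terminating hypergeometric sum can be evaluated directly. The following is a minimal Python sketch (the function names are our own, not from the paper):

```python
from math import factorial

def poch(m, k):
    """Pochhammer symbol (m)_k = m(m+1)...(m+k-1), with (m)_0 = 1."""
    out = 1.0
    for j in range(k):
        out *= m + j
    return out

def krawtchouk(r, i, p, X):
    """K_r(i; p, X) via the terminating series 2F1(-r, -i; -X; 1/p) of Eq. (1).

    The (-r)_k factor makes the series stop at k = r."""
    return sum(poch(-r, k) * poch(-i, k)
               / (poch(-X, k) * factorial(k)) * (1.0 / p) ** k
               for k in range(r + 1))
```

For instance, `krawtchouk(1, i, p, X)` reproduces the closed form 1 − i/(Xp) of Eq. (8), and `krawtchouk(2, i, p, X)` reproduces Eq. (9).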

3.2 Krawtchouk Moment Invariants

The Krawtchouk moment invariant of order (r, q) of an image intensity function f(i, j) is:

$$Q_{rq} = \sum_{i = 0}^{X - 1}\sum_{j = 0}^{Y - 1} \bar{K}_{r}(i;p_{1},X - 1)\,\bar{K}_{q}(j;p_{2},Y - 1)\,f(i,j)$$
(10)

where f(i, j) is of size X × Y, so X − 1 and Y − 1 are substituted for the polynomial parameter. These Krawtchouk moment-based invariants are used as shape-based features for the ISL alphabets. The parameters p₁ and p₂ vary the ROI horizontally and vertically: for p₁ > 0.5 the ROI shifts horizontally in the positive x-direction and for p₁ < 0.5 in the negative x-direction; for p₂ > 0.5 the ROI shifts vertically in the negative y-direction and for p₂ < 0.5 in the positive y-direction. In this paper, r = q is chosen while varying the order, and the feature vector size is (r + 1)².
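Under the definitions above, the weighted polynomials of Eq. (4) and the moment matrix of Eq. (10) can be sketched as follows. This is a minimal NumPy illustration with our own helper names; it computes Q_rq for all orders r, q up to a chosen maximum:

```python
import numpy as np
from math import comb, factorial

def poch(m, k):
    """Pochhammer symbol (m)_k."""
    out = 1.0
    for j in range(k):
        out *= m + j
    return out

def weighted_krawtchouk(r, p, X):
    """Values of the weighted polynomial of Eq. (4) at i = 0..X."""
    K = np.array([sum(poch(-r, k) * poch(-i, k)
                      / (poch(-X, k) * factorial(k)) * (1 / p) ** k
                      for k in range(r + 1)) for i in range(X + 1)])
    w = np.array([comb(X, i) * p ** i * (1 - p) ** (X - i)
                  for i in range(X + 1)])                          # Eq. (5)
    rho = (-1) ** r * ((1 - p) / p) ** r * factorial(r) / poch(-X, r)  # Eq. (6)
    return K * np.sqrt(w / rho)

def krawtchouk_moments(f, order, p1=0.5, p2=0.5):
    """Moment matrix Q_rq of Eq. (10) for r, q = 0..order on image f."""
    X, Y = f.shape
    Kx = np.stack([weighted_krawtchouk(r, p1, X - 1) for r in range(order + 1)])
    Ky = np.stack([weighted_krawtchouk(q, p2, Y - 1) for q in range(order + 1)])
    return Kx @ f @ Ky.T
```

The weighted polynomials are orthonormal over i = 0, …, X, which is what makes Q_rq numerically stable; a 30 × 30 image with order 5 yields the 36 features per ROI used later in the paper.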

4 Dual-Hahn Moments

Dual-Hahn moments encompass the properties of both Tchebichef and Krawtchouk moments: they can serve as both global and local feature descriptors, whereas Tchebichef moments are global descriptors and Krawtchouk moments extract local information [7].

4.1 Dual-Hahn Polynomials

The rth order Dual-Hahn polynomial is [27]:

$$h_{r}^{v}(p,a,b) = \frac{(a - b + 1)_{r}(a + v + 1)_{r}}{r!}\,{}_{3}F_{2}\left(-r, a - p, a + p + 1; a - b + 1, a + v + 1; 1\right)$$
(11)

where r = 0, 1, 2, …, R − 1 and p = a, a + 1, …, b − 1.

Here, \({}_{3}F_{2}\) is the generalized hypergeometric function:

$${}_{3}F_{2}\left(m,n,o;p,q;r\right) = \sum_{k = 0}^{\infty} \frac{(m)_{k}(n)_{k}(o)_{k}\,r^{k}}{(p)_{k}(q)_{k}\,k!}$$
(12)

where \((m)_{k}\) is the Pochhammer symbol:

$$(m)_{k} = m(m + 1)\cdots(m + k - 1) = \frac{\Gamma(m + k)}{\Gamma(m)}$$
(13)

To avoid numerical instability with increasing order, the Dual-Hahn polynomials are scaled using a weighting function.

The weighted Dual-Hahn polynomials are given by [26]:

$$\bar{h}_{r}^{v}(p,a,b) = h_{r}^{v}(p,a,b)\sqrt{\frac{w(p)}{d_{r}^{2}}}$$
(14)

where

$$d_{r}^{2} = \frac{\Gamma(a + v + r + 1)}{r!\,(b - a - r - 1)!\,\Gamma(b - v - r)}$$
(15)

The weighting function is

$$w(p) = \frac{\Gamma(a + p + 1)\Gamma(v + p + 1)}{\Gamma(p - a + 1)\Gamma(b - p)\Gamma(b + p + 1)\Gamma(p - v + 1)}$$
(16)

4.2 Dual-Hahn Moments

The Dual-Hahn moment of order (r, q) of an image intensity function f(p, u) is:

$$h_{rq} = \sum_{p = a}^{b - 1}\sum_{u = a}^{b - 1} \bar{h}_{r}^{v}(p,a,b)\,\bar{h}_{q}^{v}(u,a,b)\,f(p,u)$$
(17)

where −0.5 < a < b, \(\left| v \right|\) < 1 + a, b = a + R, r, q = 0, 1, …, R − 1, and f(p, u) is of size R × R.

The parameters a and v are used to vary the ROI: as v increases, the ROI shifts from left to right, and as a increases, the ROI shifts from top to bottom. In this paper, r = q is chosen while varying the order, and the feature vector size is (r + 1)².
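The terminating \({}_{3}F_{2}\) of Eq. (11) and the weighting of Eqs. (14)–(16) can be sketched as below. This is a minimal illustration with our own helper names; note that, following the classical dual-Hahn orthogonality on the quadratic lattice x(p) = p(p + 1), the lattice factor (2p + 1) is folded into the weighting here (an assumption on our part) so that the discrete orthonormality check holds:

```python
import numpy as np
from math import factorial, gamma

def poch(m, k):
    """Pochhammer symbol (m)_k."""
    out = 1.0
    for j in range(k):
        out *= m + j
    return out

def dual_hahn(r, p, a, b, v):
    """h_r^v(p, a, b) via the terminating 3F2 series of Eq. (11)."""
    pref = poch(a - b + 1, r) * poch(a + v + 1, r) / factorial(r)
    s = sum(poch(-r, k) * poch(a - p, k) * poch(a + p + 1, k)
            / (poch(a - b + 1, k) * poch(a + v + 1, k) * factorial(k))
            for k in range(r + 1))
    return pref * s

def weighted_dual_hahn(r, a, b, v):
    """Weighted values over the support p = a..b-1 (Eqs. (14)-(16)),
    with the lattice factor (2p+1) included as an assumption."""
    ps = np.arange(a, b)
    d2 = gamma(a + v + r + 1) / (factorial(r) * factorial(b - a - r - 1)
                                 * gamma(b - v - r))                 # Eq. (15)
    w = np.array([gamma(a + p + 1) * gamma(v + p + 1)
                  / (gamma(p - a + 1) * gamma(b - p)
                     * gamma(b + p + 1) * gamma(p - v + 1))
                  for p in ps])                                      # Eq. (16)
    h = np.array([dual_hahn(r, p, a, b, v) for p in ps])
    return h * np.sqrt(w * (2 * ps + 1) / d2)
```

With a = v = 0 and b = R this reproduces the global-descriptor setting used later in Sect. 5.2.2.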

5 Tools and Techniques

5.1 Database

  1. The standard Jochen-Triesch database, consisting of 10 static hand postures collected from 24 subjects against uniform dark, uniform light and complex backgrounds [28].

  2. The ISL database, consisting of the 26 ISL alphabets 'A' to 'Z'. Some alphabets have near-identical shapes ('A', 'B', 'P', 'Q', 'U', 'W', 'M', 'N'), some show high occlusion ('M', 'N', 'W'), and one gesture may be a sub-gesture of another ('I', 'K'). Most signs use both hands, which adds complexity. A dataset of around 72 subjects is constructed for the 26 ISL alphabets on a uniform background, giving a total of 1865 images, shown in Fig. 2. Each image is pre-processed: converted from RGB to binary, passed through edge detection and resized to 30 × 30. These binary images are then scaled by factors of 0.7, 0.8 and 0.9 and rotated by 90°, 180° and 270°, yielding a total of 13,055 images in the first dataset. In the second dataset, the alphabet signs are superimposed on complex backgrounds: 4 background variations are taken, with 100 images per alphabet, giving a total of 2600 images. During the classification phase, 60% of the samples are used for training and the remaining 40% for testing. Samples of ISL alphabets on a complex background are shown in Fig. 3.

    Fig. 2

    Samples of ISL alphabets on a uniform background

    Fig. 3

    Samples of ISL database on a complex background
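The augmentation described above produces 7 variants per image (the original, three scales and three rotations), consistent with 1865 × 7 = 13,055. A minimal sketch, assuming nearest-neighbour downscaling re-padded onto the 30 × 30 canvas (not necessarily the authors' exact pipeline):

```python
import numpy as np

def augment(img):
    """Return the 7 variants: original, scales 0.7/0.8/0.9, and
    rotations by 90, 180 and 270 degrees."""
    variants = [img]
    h, w = img.shape
    for s in (0.7, 0.8, 0.9):
        nh, nw = int(round(h * s)), int(round(w * s))
        ys = (np.arange(nh) / s).astype(int)   # nearest-neighbour row map
        xs = (np.arange(nw) / s).astype(int)   # nearest-neighbour column map
        small = img[np.ix_(ys, xs)]
        canvas = np.zeros_like(img)            # pad back to the full canvas
        canvas[:nh, :nw] = small
        variants.append(canvas)
    for k in (1, 2, 3):                        # 90, 180, 270 degrees
        variants.append(np.rot90(img, k))
    return variants
```

Applying this to each of the 1865 uniform-background images yields the 13,055-image dataset.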

5.2 Feature Extraction

5.2.1 Krawtchouk Moment-Based Local Features

Figure 4 shows reconstructed images from the 1st to the 8th order. At lower orders, finer details of the image are captured, giving local information; increasing the order beyond the 5th does not add many details to the reconstructed images. The complexity of the problem increases when all 26 classes of the ISL database are taken. Therefore, the feature vector size is increased by extracting Krawtchouk features at different ROIs, capturing local features from different positions of the image and thus covering the entire image. The values taken for (p₁, p₂) are 0.1, 0.3, 0.5, 0.7 and 0.9, giving 25 ROIs as the ordered pairs (0.1, 0.1), (0.1, 0.3), (0.1, 0.5), (0.1, 0.7), (0.1, 0.9), (0.3, 0.1), … up to (0.9, 0.9), as shown in Fig. 5. For each ROI, 36 features are extracted till the 5th order, giving a feature vector of size 900 (36 × 25 = 900).

Fig. 4

Reconstruction of original image using Krawtchouk moments at different orders

Fig. 5

Representation of various ROIs using Krawtchouk moments
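The ROI enumeration above amounts to taking the Cartesian product of the five p-values; a small sketch of the bookkeeping (the actual moment computation is omitted):

```python
from itertools import product

# The 25 ROI settings: all ordered pairs (p1, p2) over {0.1, 0.3, 0.5, 0.7, 0.9}.
p_values = [0.1, 0.3, 0.5, 0.7, 0.9]
rois = list(product(p_values, repeat=2))

order = 5                                # features extracted till the 5th order
features_per_roi = (order + 1) ** 2      # 36 features per ROI for r = q
total_features = len(rois) * features_per_roi  # 25 x 36 = 900
```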

5.2.2 Dual-Hahn Moments as Global Features

The Dual-Hahn moments can be used as global descriptors by setting a = v = 0; Fig. 6 shows global features extracted at different orders. For a 30 × 30 image, perfect reconstruction takes place when moments are extracted till the 29th order, as shown in Fig. 6.

Fig. 6

Reconstruction of original image using Dual-Hahn moments

5.2.3 Dual-Hahn Moments as Local Features

For Dual-Hahn moments, the ROI is varied through the tuning parameters a and v; local features are extracted by setting a, v > 0. For small values of the tuning parameters the ROI lies in the upper-left corner of the image, and as the values grow the ROI stretches towards the bottom-right corner, as seen in Fig. 7 for a 30 × 30 image. Dual-Hahn local features are therefore extracted with even parameter values (a, v) = 2, 4, …, 48, 50, so that feature extraction is done at each of 25 different ROIs. At each ROI, 36 features till the 5th order are extracted, giving a total of 900 features (36 × 25 = 900). For global feature extraction, a large number of moments is needed to cover the entire image, resulting in a large feature vector, whereas local feature extraction targets a particular portion of the image and yields a smaller feature vector. The detailed methodology of the ISL recognition system is illustrated in Fig. 8.

Fig. 7

Representation of various ROIs using Dual-Hahn moments

Fig. 8

Proposed methodology of ISL recognition system

5.3 Parameter Selection of Moments

The parameters of discrete orthogonal moments are adjusted on the basis of experimental results given in Sect. 5.2 in order to extract local and global features for ISL database as shown in Table 1.

Table 1 Parameter selection of moments

5.4 Feature Selection

Feature selection removes irrelevant and redundant features, reducing the feature vector size and the computation time of the classifier. In this paper, the correlation-based feature selection (CFS) algorithm with a greedy stepwise search is used. Since moment-based features have minimal redundancy and features of the same class are highly correlated, CFS with greedy stepwise search is well suited. It is considered one of the most stable feature selection algorithms and mitigates problems of class imbalance, high dimensionality and information redundancy [29–31].

CFS selects a feature set whose features have high correlation with the class and low correlation with one another. CFS uses the Pearson correlation coefficient, calculated as follows [32]:

$$M_{s} = \frac{k\,\bar{r}_{cf}}{\sqrt{k + k(k - 1)\,\bar{r}_{ff}}}$$
(18)

where M<sub>s</sub> is the merit of the current subset of features, k is the number of features, \({\bar{r}}_{cf}\) is the mean correlation between each feature and the class, and \({\bar{r}}_{ff}\) is the mean pair-wise correlation between every two features. Feature subsets are formed using two search strategies: forward selection and backward elimination. In forward selection, one feature at a time is added to the subset, stopping when performance deteriorates; in backward elimination, the search starts from all features and removes one feature at a time until performance degrades. Subsets are ranked by their merit, and the subset with the largest M<sub>s</sub> is selected. The greedy stepwise search used in this paper can start with either forward selection or backward elimination.
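The merit of Eq. (18) and the greedy forward variant of the search can be sketched as follows. This is a simplified illustration with our own function names, assuming absolute Pearson correlations:

```python
import numpy as np

def merit(X, y, subset):
    """CFS merit M_s of Eq. (18): k * mean |feature-class corr| over
    sqrt(k + k(k-1) * mean |feature-feature corr|)."""
    k = len(subset)
    rcf = np.mean([abs(np.corrcoef(X[:, f], y)[0, 1]) for f in subset])
    if k == 1:
        rff = 0.0
    else:
        rff = np.mean([abs(np.corrcoef(X[:, f], X[:, g])[0, 1])
                       for i, f in enumerate(subset) for g in subset[i + 1:]])
    return k * rcf / np.sqrt(k + k * (k - 1) * rff)

def greedy_forward_cfs(X, y):
    """Add one feature at a time; stop when no addition improves the merit."""
    remaining, subset, best = list(range(X.shape[1])), [], -np.inf
    while remaining:
        scores = [(merit(X, y, subset + [f]), f) for f in remaining]
        m, f = max(scores)
        if m <= best:
            break
        best, subset = m, subset + [f]
        remaining.remove(f)
    return subset
```

The merit penalizes redundancy: a subset of two perfectly inter-correlated features scores no higher than either feature alone, so the search keeps only one of them.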

5.5 Classification

Features are classified by k-nearest neighbour (k-NN) using Manhattan and Euclidean distances with k = 1 in both cases, multi-layer perceptron (MLP), support vector machine (SVM) with radial basis function (RBF), PUK (Pearson VII universal kernel) and polynomial kernels, and extreme learning machine (ELM) with linear, RBF and polynomial kernels. 60% of the samples are used for training and the remaining 40% for testing. The recognition accuracy as a function of feature-set dimensionality and order is analysed for all the above classifiers.

5.5.1 k-Nearest Neighbour (k-NN)

This classifier belongs to the family of instance-based, lazy-learning algorithms, in which generalization takes place during the classification phase. The distance between the query sample and the training samples is measured; widely used distance metrics include Manhattan, Euclidean, Chebyshev and Minkowski. The Euclidean and Manhattan distance metrics are used here for the classification of ISL alphabets.

The Euclidean distance metric,

$$d\left( {x,y} \right) = \sqrt {\mathop \sum \limits_{{{\text{i}} = 1}}^{\text{m}} \left( {x_{i} - y_{i} } \right)^{2} }$$
(19)

The Manhattan distance metric,

$$d\left( {x,y} \right) = \mathop \sum \limits_{{{\text{i}} = 1}}^{\text{m}} \left| {x_{i} - y_{i} } \right|$$
(20)

Algorithm for k-NN classification:

  1. Choose K, the number of nearest neighbours.

  2. Choose a distance metric (Euclidean or Manhattan) to measure the distance between the test sample and each training sample.

  3. Compute these distances and sort them in increasing order.

  4. Take the K training samples at the smallest distances as the nearest neighbours.

  5. Assign to the test sample the class held by the majority of these K neighbours.
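The steps above, together with the distance metrics of Eqs. (19) and (20), can be sketched in a few lines (a minimal illustration; the function name is our own):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=1, metric="euclidean"):
    """k-NN following the steps above: compute distances, take the k
    smallest, and assign the majority class among those neighbours."""
    diff = X_train - x
    if metric == "manhattan":
        d = np.abs(diff).sum(axis=1)           # Eq. (20)
    else:
        d = np.sqrt((diff ** 2).sum(axis=1))   # Eq. (19)
    nearest = np.argsort(d)[:k]                # indices of the k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]           # majority vote
```

With k = 1, as used in this paper, the majority vote reduces to copying the label of the single closest training sample.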

5.5.2 Support Vector Machine (SVM)

In SVM, features of different classes are separated by a hyperplane, and the position of a test sample relative to it decides which class the sample belongs to. SVM is used for both linear and non-linear classification; in the non-linear case, the feature points are mapped into a high-dimensional feature space by means of kernel functions. SVM with the PUK kernel shows better generalization than other kernel functions such as polynomial and RBF, and than classifiers such as k-NN and MLP [29, 33, 34].

5.5.3 Multi-Layer Perceptron (MLP)

In MLP, the neural network is first trained on training samples presented to the input neurons, with learning taking place through weight updates. Test samples are then supplied to the input neurons, and the desired and predicted values are compared to compute the local error. To minimize the mean squared error, the error is propagated back from the output layer towards the hidden layer and the weights are updated until convergence is achieved.

5.5.4 Extreme Learning Machine (ELM)

ELM is a generalized feed-forward network in which the hidden layer does not need any prior tuning. It can be applied directly to regression and multi-class classification problems. ELM is used in various multi-class classification applications and achieves similar generalization at a faster learning speed than SVM, which has high computational complexity [35].

6 Results and Discussions

In this section, results and discussions are presented to validate the proposed feature extraction method for the ISL recognition system. The experiments were executed in MATLAB R2014b on an Intel(R) Pentium(R) laptop running Windows 7 (32-bit) at 2 GHz with 4 GB RAM.

6.1 Analysis of Dual-Hahn Moments as Local and Global Features

Figures 9 and 10 compare the effectiveness of Dual-Hahn moments as local and global features for the ISL database. Dual-Hahn moments are set to global feature extraction mode by setting a = v = 0, with orders varying from 9 to 29. The feature vector size is (r + 1)² for order (r, q) with r = q, so the feature vector varies from 100 to 900 as the order varies from 9 to 29, as shown in Fig. 9. For local feature extraction, a and v are varied over 2, 4, 6, …, 50, giving 25 different ROIs at orders 1 to 5, with feature vector sizes of 100, 225, 400, 625 and 900 for orders 1, 2, 3, 4 and 5, respectively, as seen in Fig. 10.

Fig. 9

Comparison of Dual-Hahn moments as global features for ISL database

Fig. 10

Comparison of Dual-Hahn moments as local features for ISL database

For a feature vector of size 900, Dual-Hahn moments as global features give 87.9 and 61.9% accuracy for ISL on uniform and complex backgrounds, respectively, while for the same feature vector size Dual-Hahn moments as local features give 98.2 and 75.9%. Thus, Dual-Hahn moments perform best when features are extracted per ROI rather than globally over the entire image.

6.2 Comparison of Discrete Orthogonal Moments for ISL Database

To validate the effectiveness of discrete orthogonal moments, the ISL database is used. The recognition accuracies of Tchebichef, Krawtchouk and Dual-Hahn moments are compared till the 10th order. Figures 11 and 12 show a significant increase in accuracy from the 1st to the 5th order, after which the accuracy stabilizes at higher orders. Dual-Hahn moments used as local features perform best compared to Tchebichef and Krawtchouk moments.

Fig. 11

Comparison of discrete orthogonal moments for ISL on a uniform background

Fig. 12

Comparison of discrete orthogonal moments for ISL on a complex background

To further demonstrate the recognition capability of discrete orthogonal moments, Krawtchouk and Dual-Hahn moments, which achieve the best recognition accuracy, are compared on the standard Jochen-Triesch database.

Table 2 shows the comparative analysis of recognition accuracies across classifiers on the Jochen-Triesch database, varying the ROIs for Krawtchouk and Dual-Hahn moments. Raw features are normalized to map the feature values into the range [−1, 1], improving recognition accuracy [36, 37].

Table 2 Recognition accuracies for Jochen-Triesch’s images

Dual-Hahn moments show the best results, with 96.5% accuracy using SVM PUK; Krawtchouk moments give comparable results, with 93.4% accuracy using SVM PUK. Table 3 compares the proposed method with other recently proposed methods on the Jochen-Triesch database. Dual-Hahn moments as local features outperform the other recently proposed methods, giving an accuracy of 96.5%.

Table 3 Comparison of results for Jochen-Triesch’s dataset

Table 4 shows the performance of Dual-Hahn moments for ISL database on a uniform as well as complex background. An accuracy of 98.2% is observed for the database on a uniform background using SVM PUK. However, for a complex background, the accuracy obtained is 75.9% using k-NN.

Table 4 Recognition accuracies for ISL images using Dual-Hahn moments

Table 5 compares the performance of Krawtchouk moments for ISL database on a uniform as well as on a complex background. An accuracy of 97.8% is obtained for the database on a uniform background using SVM PUK. However, for a complex background, the accuracy obtained is 72.9% using k-NN and SVM PUK.

Table 5 Recognition accuracies for ISL images using Krawtchouk moments

The CFS algorithm reduces the feature-set so as to minimize information redundancy. Feature selection results in slight improvement in accuracy results. For ISL database on a complex background, Dual-Hahn moments give 75.9% accuracy using SVM Polynomial kernel as shown in Table 6. For ISL database on a uniform background, Dual-Hahn moments give 98.3% accuracy using SVM PUK as shown in Table 7.

Table 6 Recognition accuracy using CFS of ISL database on a complex background
Table 7 Recognition accuracy using CFS of ISL database on a uniform background

The time taken to compute discrete orthogonal moment-based features till the 5th order for Tchebichef, Krawtchouk and Dual-Hahn moments is shown in Table 8: for a 30 × 30 image, they take 1.234, 0.182 and 0.102 s, respectively.

Table 8 CPU elapsed time (in s) to compute discrete orthogonal moments till the 5th order

Dual-Hahn moments till 5th order perform best for 26 ISL alphabets on a uniform background with an accuracy of 98.3% followed by Krawtchouk moments giving 97.9%. For complex background, accuracies obtained are 75.9 and 72.6% for Dual-Hahn and Krawtchouk moments, respectively. However, recent works have used limited ISL classes on a uniform background only as illustrated in Table 9.

Table 9 Comparison of proposed methodology with recent methods on ISL database

The confusion matrices for the ISL database are shown in Figs. 13 and 14. Figure 13 illustrates the confusion matrix for the ISL database on a uniform background using Dual-Hahn moments, which give 98.3% accuracy. Most samples of the alphabets 'M' and 'N' are misinterpreted because of their similar shapes and high occlusion, and the alphabets 'A', 'B', 'E' and 'C' have similar shapes, which results in incorrect classifications.

Fig. 13

Confusion matrix for ISL alphabets on a uniform background

Fig. 14

Confusion matrix for ISL alphabets on a complex background

Figure 14 illustrates the confusion matrix of ISL alphabets on a complex background with Dual-Hahn moments, which give 75.9% accuracy. Background variations make the alphabets harder to identify; nevertheless, discrete orthogonal moments retain their shape recognition capability and achieve good results on both uniform and complex backgrounds.

Recent works have used Tchebichef and Krawtchouk moments at high orders (up to the 80th) for gesture recognition [21, 22]. In this work, the recognition accuracy stabilizes at higher orders; moreover, raising the order enlarges the feature vector and increases computation time. Therefore, to increase recognition accuracy on complex backgrounds, features can instead be extracted over a larger number of ROIs at lower orders, capturing finer details from different regions of the image.

7 Conclusion and Future Scope

In this paper, the shape recognition capability of discrete orthogonal moment-based local features is studied on 26 ISL alphabets on uniform as well as complex backgrounds. The performance of the proposed feature vector is first analysed on the standard Jochen-Triesch database, where the proposed method shows competitive results: the recognition accuracies obtained with Krawtchouk and Dual-Hahn moments are better than those of other recently proposed methods. Accuracies of 98.3 and 97.9% are achieved on the ISL database on a uniform background by Dual-Hahn and Krawtchouk moments, respectively; on a complex background, accuracies of 75.9 and 72.6% are observed for Dual-Hahn and Krawtchouk moments, respectively. Thus, the orthogonal moment-based local features are rotation, scale, translation and user invariant, and they can distinguish between the similar shapes that occur among ISL alphabets. In future, the work on static gestures can be extended to dynamic gestures involving the movement of the hands.