1 Introduction

Human–computer interaction (HCI) systems based on hand gestures find applications in sign language communication. Sign language is a mode of communication within the deaf community that uses gestures; Indian Sign Language (ISL) is the sign language used in India. The idea is to make the computer understand ISL alphabets from hand shapes and render them in textual/audio form on the screen, thus easing interaction with deaf people without the need for an interpreter. For this, moment-based shape features play an important part in distinguishing between different ISL alphabets. The proposed ISL recognition system is shown in Fig. 1.

Fig. 1

Block diagram of the proposed ISL recognition system

2 Literature Survey

In the pattern recognition domain, moments are known to capture both global and local shape information; they are therefore termed shape-based features.

After the introduction of the non-orthogonal Hu moments in 1962, continuous orthogonal moments such as Zernike and Legendre moments were introduced [1] and deployed in various applications of shape analysis [2]. Continuous orthogonal moments require transforming the image co-ordinate space into a different domain; for example, Zernike moments are defined in polar co-ordinates. They also involve approximating continuous integrals, which introduces discretization error and limits accuracy as the order increases [3]. Their computational complexity likewise grows with order. Due to these limitations, discrete orthogonal moments such as Tchebichef moments were introduced; they are used as global descriptors and are defined in the image co-ordinate space itself [4].

Krawtchouk moments also fall under the category of discrete orthogonal moments, being based on the classical Krawtchouk polynomials [5]. For Krawtchouk moments the discretization error is non-existent. They have been used in image reconstruction and object recognition because of their minimal information redundancy [6–9]. Moreover, they can extract local information by varying the region of interest (ROI) in the image. Krawtchouk moments have been used in various applications such as character recognition [6, 10–12], image classification [13], 3-D object retrieval [14], face recognition [15–20], gesture recognition [21, 22], speech signal processing [23], watermarking systems [24] and medical image analysis [25].

Like Krawtchouk and Tchebichef moments, Dual-Hahn moments are discrete orthogonal moments. They can, however, be used as both local and global descriptors, giving them an added advantage over other moments [7, 26].

Krawtchouk moments were first used for object recognition by Yap et al. [6]. A database of 7 English uppercase binary alphabets, rotated by various angles and scaled, was used; Krawtchouk moments outperformed Hu moments for both noisy and noiseless images. Sit et al. [11] proposed local Krawtchouk- and Hu-based invariants, tested on a database of 9 English uppercase binary alphabets and 9 gray-scale clip-art images. It was concluded that Krawtchouk moments outperformed Hu moments in terms of recognition accuracy.

Image analysis using Dual-Hahn moments was carried out by Zhu et al. [26], who concluded that Dual-Hahn moments outperform Hu, Legendre, Tchebichef and Krawtchouk moments in recognition capability on a database of 7 English binary alphabets under both noisy and noiseless conditions.

Significant research has been done in face recognition using discrete orthogonal moments. Krawtchouk moments were extracted on a database of 40 subjects differing in expression, position, rotation and scale, and outperformed Geometric, Zernike and Tchebichef moments in terms of recognition accuracy. Krawtchouk moments were found to give good classification accuracy for face recognition even with the addition of noise [15–17].

However, very few papers have used discrete orthogonal moments in the gesture recognition domain. Priyal et al. [21, 22] compared the recognition accuracy of Krawtchouk moments with Zernike, Geometric and Tchebichef moments on a database of 10 digit gesture signs collected from 23 users, with rotation, translation and scaling. Krawtchouk moments outperformed the other moments and were found to be viewpoint- and user-invariant. However, the role of Krawtchouk moments in the gesture recognition domain was not deeply investigated.

In this paper, an ISL database containing 26 ISL alphabets is collected on uniform and complex backgrounds with variations in position, scale and rotation. Based on the literature survey, Krawtchouk and Dual-Hahn moments are among the best shape descriptors. The paper focuses on the following objectives:

  1. To extract Krawtchouk and Dual-Hahn moment-based features till the 5th order for both the Jochen-Triesch and ISL databases by varying the ROI.

  2. To select a minimum possible feature set that gives good recognition accuracy across various classifiers using the correlation-based feature selection (CFS) algorithm.

  3. To prove that discrete orthogonal moment-based features have shape recognition capability and are user, position, rotation and scale invariant.

3 Krawtchouk Moments

Krawtchouk moments are derived from classical Krawtchouk polynomials, associated with binomial functions [5].

3.1 Krawtchouk Polynomials

The rth order classical Krawtchouk polynomial is:

$$K_{r}\left(i;p,X\right) = \sum_{k = 0}^{X} a_{k,r,p}\, i^{k} = {}_{2}F_{1}\left(-r, -i; -X; \frac{1}{p}\right)$$
(1)

where i, r = 0, 1, 2, …, X; X > 0; p ∈ (0, 1); and \(a_{k,r,p}\) are the Krawtchouk polynomial coefficients.

Here, \({}_{2}F_{1}\) is the hypergeometric function:

$${}_{2}F_{1}\left(m,n;o;t\right) = \sum_{k = 0}^{\infty} \frac{(m)_{k}(n)_{k}\,t^{k}}{(o)_{k}\,k!}$$
(2)

where \((m)_{k}\) is the Pochhammer symbol:

$$(m)_{k} = m(m + 1)\cdots(m + k - 1) = \frac{\Gamma(m + k)}{\Gamma(m)}$$

The normalized Krawtchouk polynomials are:

$$\tilde{K}_{r}\left(i;p,X\right) = K_{r}\left(i;p,X\right)\sqrt{\frac{1}{\rho(r;p,X)}}$$
(3)

The weighted Krawtchouk polynomials are [6]:

$$\bar{K}_{r}\left(i;p,X\right) = K_{r}\left(i;p,X\right)\sqrt{\frac{w(i;p,X)}{\rho(r;p,X)}}$$
(4)

The weight function is:

$$w(i;p,X) = \binom{X}{i} p^{i}(1 - p)^{X - i}$$
(5)
$$\rho(r;p,X) = (-1)^{r}\left(\frac{1 - p}{p}\right)^{r}\frac{r!}{(-X)_{r}}$$
(6)

The Krawtchouk polynomials till the second order are:

$$K_{0}(i;p,X) = 1$$
(7)
$$K_{1}(i;p,X) = 1 - \left[\frac{1}{Xp}\right]i$$
(8)
$$K_{2}(i;p,X) = 1 - \left[\frac{2}{Xp} + \frac{1}{X(X - 1)p^{2}}\right]i + \left[\frac{1}{X(X - 1)p^{2}}\right]i^{2}$$
(9)

As the order of the Krawtchouk polynomials increases, the range of their values also increases, resulting in numerical instability. Weighted Krawtchouk polynomials were therefore introduced to overcome this drawback [6].
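As a concrete check of Eqs. (1), (8) and (9), the terminating hypergeometric sum can be evaluated directly. The following is a minimal Python sketch (the function names are our own, not from the paper):

```python
from math import factorial

def poch(m, k):
    """Pochhammer symbol (m)_k = m(m+1)...(m+k-1), with (m)_0 = 1."""
    out = 1.0
    for j in range(k):
        out *= m + j
    return out

def krawtchouk(r, i, p, X):
    """K_r(i; p, X) via the terminating series 2F1(-r, -i; -X; 1/p) of Eq. (1).

    The (-r)_k factor makes the series stop at k = r."""
    return sum(poch(-r, k) * poch(-i, k)
               / (poch(-X, k) * factorial(k)) * (1.0 / p) ** k
               for k in range(r + 1))
```

For instance, `krawtchouk(1, i, p, X)` reproduces the closed form 1 − i/(Xp) of Eq. (8), and `krawtchouk(2, i, p, X)` reproduces Eq. (9).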

3.2 Krawtchouk Moment Invariants

The Krawtchouk moment invariant of order (r, q) of an image intensity function f(i, j) is:

$$Q_{rq} = \sum_{i = 0}^{X - 1}\sum_{j = 0}^{Y - 1} \bar{K}_{r}(i;p_{1},X - 1)\,\bar{K}_{q}(j;p_{2},Y - 1)\,f(i,j)$$
(10)

where f(i, j) is of size X × Y, so X − 1 and Y − 1 are substituted for the polynomial parameter. These Krawtchouk moment-based invariants are used as shape-based features for the ISL alphabets. The parameters p₁ and p₂ vary the ROI horizontally and vertically: for p₁ > 0.5 the ROI shifts horizontally in the positive x-direction and for p₁ < 0.5 in the negative x-direction; for p₂ > 0.5 the ROI shifts vertically in the negative y-direction and for p₂ < 0.5 in the positive y-direction. In this paper, r = q is chosen while varying the order, and the feature vector size is (r + 1)².
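Under the definitions above, the weighted polynomials of Eq. (4) and the moment matrix of Eq. (10) can be sketched as follows. This is a minimal NumPy illustration with our own helper names; it computes Q_rq for all orders r, q up to a chosen maximum:

```python
import numpy as np
from math import comb, factorial

def poch(m, k):
    """Pochhammer symbol (m)_k."""
    out = 1.0
    for j in range(k):
        out *= m + j
    return out

def weighted_krawtchouk(r, p, X):
    """Values of the weighted polynomial of Eq. (4) at i = 0..X."""
    K = np.array([sum(poch(-r, k) * poch(-i, k)
                      / (poch(-X, k) * factorial(k)) * (1 / p) ** k
                      for k in range(r + 1)) for i in range(X + 1)])
    w = np.array([comb(X, i) * p ** i * (1 - p) ** (X - i)
                  for i in range(X + 1)])                          # Eq. (5)
    rho = (-1) ** r * ((1 - p) / p) ** r * factorial(r) / poch(-X, r)  # Eq. (6)
    return K * np.sqrt(w / rho)

def krawtchouk_moments(f, order, p1=0.5, p2=0.5):
    """Moment matrix Q_rq of Eq. (10) for r, q = 0..order on image f."""
    X, Y = f.shape
    Kx = np.stack([weighted_krawtchouk(r, p1, X - 1) for r in range(order + 1)])
    Ky = np.stack([weighted_krawtchouk(q, p2, Y - 1) for q in range(order + 1)])
    return Kx @ f @ Ky.T
```

The weighted polynomials are orthonormal over i = 0, …, X, which is what makes Q_rq numerically stable; a 30 × 30 image with order 5 yields the 36 features per ROI used later in the paper.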

4 Dual-Hahn Moments

Dual-Hahn moments encompass the properties of both Tchebichef and Krawtchouk moments: they can serve as both global and local feature descriptors, whereas Tchebichef moments are global descriptors and Krawtchouk moments extract local information [7].

4.1 Dual-Hahn Polynomials

The rth order Dual-Hahn polynomial is [27]:

$$h_{r}^{v}(p,a,b) = \frac{(a - b + 1)_{r}(a + v + 1)_{r}}{r!}\,{}_{3}F_{2}\left(-r, a - p, a + p + 1; a - b + 1, a + v + 1; 1\right)$$
(11)

where r = 0, 1, 2, …, R − 1 and p = a, a + 1, …, b − 1.

Here, \({}_{3}F_{2}\) is the generalized hypergeometric function:

$${}_{3}F_{2}\left(m,n,o;p,q;r\right) = \sum_{k = 0}^{\infty} \frac{(m)_{k}(n)_{k}(o)_{k}\,r^{k}}{(p)_{k}(q)_{k}\,k!}$$
(12)

where \((m)_{k}\) is the Pochhammer symbol:

$$(m)_{k} = m(m + 1)\cdots(m + k - 1) = \frac{\Gamma(m + k)}{\Gamma(m)}$$
(13)

To avoid numerical instability with increasing order, the Dual-Hahn polynomials are scaled using a weighting function.

The weighted Dual-Hahn polynomials are given by [26]:

$$\bar{h}_{r}^{v}(p,a,b) = h_{r}^{v}(p,a,b)\sqrt{\frac{w(p)}{d_{r}^{2}}}$$
(14)

where

$$d_{r}^{2} = \frac{\Gamma(a + v + r + 1)}{r!\,(b - a - r - 1)!\,\Gamma(b - v - r)}$$
(15)

The weighting function is

$$w(p) = \frac{\Gamma(a + p + 1)\Gamma(v + p + 1)}{\Gamma(p - a + 1)\Gamma(b - p)\Gamma(b + p + 1)\Gamma(p - v + 1)}$$
(16)

4.2 Dual-Hahn Moments

The Dual-Hahn moment of order (r, q) of an image intensity function f(p, u) is:

$$h_{rq} = \sum_{p = a}^{b - 1}\sum_{u = a}^{b - 1} \bar{h}_{r}^{v}(p,a,b)\,\bar{h}_{q}^{v}(u,a,b)\,f(p,u)$$
(17)

where −0.5 < a < b, \(\left| v \right|\) < 1 + a, b = a + R, r, q = 0, 1, …, R − 1, and f(p, u) is of size R × R.

The parameters a and v are used to vary the ROI: as v increases, the ROI shifts from left to right, and as a increases, the ROI shifts from top to bottom. In this paper, r = q is chosen while varying the order, and the feature vector size is (r + 1)².
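The terminating \({}_{3}F_{2}\) of Eq. (11) and the weighting of Eqs. (14)–(16) can be sketched as below. This is a minimal illustration with our own helper names; note that, following the classical dual-Hahn orthogonality on the quadratic lattice x(p) = p(p + 1), the lattice factor (2p + 1) is folded into the weighting here (an assumption on our part) so that the discrete orthonormality check holds:

```python
import numpy as np
from math import factorial, gamma

def poch(m, k):
    """Pochhammer symbol (m)_k."""
    out = 1.0
    for j in range(k):
        out *= m + j
    return out

def dual_hahn(r, p, a, b, v):
    """h_r^v(p, a, b) via the terminating 3F2 series of Eq. (11)."""
    pref = poch(a - b + 1, r) * poch(a + v + 1, r) / factorial(r)
    s = sum(poch(-r, k) * poch(a - p, k) * poch(a + p + 1, k)
            / (poch(a - b + 1, k) * poch(a + v + 1, k) * factorial(k))
            for k in range(r + 1))
    return pref * s

def weighted_dual_hahn(r, a, b, v):
    """Weighted values over the support p = a..b-1 (Eqs. (14)-(16)),
    with the lattice factor (2p+1) included as an assumption."""
    ps = np.arange(a, b)
    d2 = gamma(a + v + r + 1) / (factorial(r) * factorial(b - a - r - 1)
                                 * gamma(b - v - r))                 # Eq. (15)
    w = np.array([gamma(a + p + 1) * gamma(v + p + 1)
                  / (gamma(p - a + 1) * gamma(b - p)
                     * gamma(b + p + 1) * gamma(p - v + 1))
                  for p in ps])                                      # Eq. (16)
    h = np.array([dual_hahn(r, p, a, b, v) for p in ps])
    return h * np.sqrt(w * (2 * ps + 1) / d2)
```

With a = v = 0 and b = R this reproduces the global-descriptor setting used later in Sect. 5.2.2.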

5 Tools and Techniques

5.1 Database

  1. The standard Jochen-Triesch database, consisting of 10 static hand postures collected from 24 subjects against uniform dark, uniform light and complex backgrounds [28].

  2. The ISL database, consisting of the 26 ISL alphabets 'A' to 'Z'. Some alphabets have near-identical shapes ('A', 'B', 'P', 'Q', 'U', 'W', 'M', 'N'), some show high occlusion ('M', 'N', 'W'), and one gesture may be a sub-gesture of another ('I', 'K'). Most signs use both hands, which adds complexity. A dataset of around 72 subjects is constructed for the 26 ISL alphabets on a uniform background, giving a total of 1865 images, shown in Fig. 2. Each image is pre-processed: converted from RGB to binary, passed through edge detection and resized to 30 × 30. These binary images are then scaled by factors of 0.7, 0.8 and 0.9 and rotated by 90°, 180° and 270°, yielding a total of 13,055 images in the first dataset. In the second dataset, the alphabet signs are superimposed on complex backgrounds: 4 background variations are taken, with 100 images per alphabet, giving a total of 2600 images. During the classification phase, 60% of the samples are used for training and the remaining 40% for testing. Samples of ISL alphabets on a complex background are shown in Fig. 3.

    Fig. 2

    Samples of ISL alphabets on a uniform background

    Fig. 3

    Samples of ISL database on a complex background
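The augmentation described above produces 7 variants per image (the original, three scales and three rotations), consistent with 1865 × 7 = 13,055. A minimal sketch, assuming nearest-neighbour downscaling re-padded onto the 30 × 30 canvas (not necessarily the authors' exact pipeline):

```python
import numpy as np

def augment(img):
    """Return the 7 variants: original, scales 0.7/0.8/0.9, and
    rotations by 90, 180 and 270 degrees."""
    variants = [img]
    h, w = img.shape
    for s in (0.7, 0.8, 0.9):
        nh, nw = int(round(h * s)), int(round(w * s))
        ys = (np.arange(nh) / s).astype(int)   # nearest-neighbour row map
        xs = (np.arange(nw) / s).astype(int)   # nearest-neighbour column map
        small = img[np.ix_(ys, xs)]
        canvas = np.zeros_like(img)            # pad back to the full canvas
        canvas[:nh, :nw] = small
        variants.append(canvas)
    for k in (1, 2, 3):                        # 90, 180, 270 degrees
        variants.append(np.rot90(img, k))
    return variants
```

Applying this to each of the 1865 uniform-background images yields the 13,055-image dataset.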

5.2 Feature Extraction

5.2.1 Krawtchouk Moment-Based Local Features

Figure 4 shows reconstructed images from the 1st to the 8th order. At lower orders, finer details of the image are captured, giving local information; increasing the order beyond the 5th does not add many details to the reconstructed images. The complexity of the problem increases when all 26 classes of the ISL database are taken. Therefore, the feature vector size is increased by extracting Krawtchouk features at different ROIs, capturing local features from different positions of the image and thus covering the entire image. The values taken for (p₁, p₂) are 0.1, 0.3, 0.5, 0.7 and 0.9, giving 25 ROIs as the ordered pairs (0.1, 0.1), (0.1, 0.3), (0.1, 0.5), (0.1, 0.7), (0.1, 0.9), (0.3, 0.1), … up to (0.9, 0.9), as shown in Fig. 5. For each ROI, 36 features are extracted till the 5th order, giving a feature vector of size 900 (36 × 25 = 900).

Fig. 4

Reconstruction of original image using Krawtchouk moments at different orders

Fig. 5

Representation of various ROIs using Krawtchouk moments
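The ROI enumeration above amounts to taking the Cartesian product of the five p-values; a small sketch of the bookkeeping (the actual moment computation is omitted):

```python
from itertools import product

# The 25 ROI settings: all ordered pairs (p1, p2) over {0.1, 0.3, 0.5, 0.7, 0.9}.
p_values = [0.1, 0.3, 0.5, 0.7, 0.9]
rois = list(product(p_values, repeat=2))

order = 5                                # features extracted till the 5th order
features_per_roi = (order + 1) ** 2      # 36 features per ROI for r = q
total_features = len(rois) * features_per_roi  # 25 x 36 = 900
```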

5.2.2 Dual-Hahn Moments as Global Features

The Dual-Hahn moments can be used as global descriptors by setting a = v = 0; Fig. 6 shows global features extracted at different orders. For a 30 × 30 image, perfect reconstruction takes place when moments are extracted till the 29th order, as shown in Fig. 6.

Fig. 6

Reconstruction of original image using Dual-Hahn moments

5.2.3 Dual-Hahn Moments as Local Features

For Dual-Hahn moments, the ROI is varied through the tuning parameters a and v; local features are extracted by setting a, v > 0. For small values of the tuning parameters the ROI lies in the upper-left corner of the image, and as the values grow the ROI stretches towards the bottom-right corner, as seen in Fig. 7 for a 30 × 30 image. Dual-Hahn local features are therefore extracted with even parameter values (a, v) = 2, 4, …, 48, 50, so that feature extraction is done at each of 25 different ROIs. At each ROI, 36 features till the 5th order are extracted, giving a total of 900 features (36 × 25 = 900). For global feature extraction, a large number of moments is needed to cover the entire image, resulting in a large feature vector, whereas local feature extraction targets a particular portion of the image and yields a smaller feature vector. The detailed methodology of the ISL recognition system is illustrated in Fig. 8.

Fig. 7

Representation of various ROIs using Dual-Hahn moments

Fig. 8

Proposed methodology of ISL recognition system

5.3 Parameter Selection of Moments

The parameters of discrete orthogonal moments are adjusted on the basis of experimental results given in Sect. 5.2 in order to extract local and global features for ISL database as shown in Table 1.

Table 1 Parameter selection of moments

5.4 Feature Selection

Feature selection removes irrelevant and redundant features, reducing the feature vector size and the computation time of the classifier. In this paper, the correlation-based feature selection (CFS) algorithm with a greedy stepwise search is used. Since moment-based features have minimal redundancy and features of the same class are highly correlated, CFS with greedy stepwise search is well suited. It is considered one of the most stable feature selection algorithms and mitigates problems of class imbalance, high dimensionality and information redundancy [29–31].

CFS selects a feature set whose features have high correlation with the class and low correlation with one another. CFS uses the Pearson correlation coefficient, calculated as follows [32]:

$$M_{s} = \frac{k\,\bar{r}_{cf}}{\sqrt{k + k(k - 1)\,\bar{r}_{ff}}}$$
(18)

where M<sub>s</sub> is the merit of the current subset of features, k is the number of features, \({\bar{r}}_{cf}\) is the mean correlation between each feature and the class, and \({\bar{r}}_{ff}\) is the mean pair-wise correlation between every two features. Feature subsets are formed using two search strategies: forward selection and backward elimination. In forward selection, one feature at a time is added to the subset, stopping when performance deteriorates; in backward elimination, the search starts from all features and removes one feature at a time until performance degrades. Subsets are ranked by their merit, and the subset with the largest M<sub>s</sub> is selected. The greedy stepwise search used in this paper can start with either forward selection or backward elimination.
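The merit of Eq. (18) and the greedy forward variant of the search can be sketched as follows. This is a simplified illustration with our own function names, assuming absolute Pearson correlations:

```python
import numpy as np

def merit(X, y, subset):
    """CFS merit M_s of Eq. (18): k * mean |feature-class corr| over
    sqrt(k + k(k-1) * mean |feature-feature corr|)."""
    k = len(subset)
    rcf = np.mean([abs(np.corrcoef(X[:, f], y)[0, 1]) for f in subset])
    if k == 1:
        rff = 0.0
    else:
        rff = np.mean([abs(np.corrcoef(X[:, f], X[:, g])[0, 1])
                       for i, f in enumerate(subset) for g in subset[i + 1:]])
    return k * rcf / np.sqrt(k + k * (k - 1) * rff)

def greedy_forward_cfs(X, y):
    """Add one feature at a time; stop when no addition improves the merit."""
    remaining, subset, best = list(range(X.shape[1])), [], -np.inf
    while remaining:
        scores = [(merit(X, y, subset + [f]), f) for f in remaining]
        m, f = max(scores)
        if m <= best:
            break
        best, subset = m, subset + [f]
        remaining.remove(f)
    return subset
```

The merit penalizes redundancy: a subset of two perfectly inter-correlated features scores no higher than either feature alone, so the search keeps only one of them.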

5.5 Classification

Features are classified by k-nearest neighbour (k-NN) using Manhattan and Euclidean distances with k = 1 in both cases, multi-layer perceptron (MLP), support vector machine (SVM) with radial basis function (RBF), PUK (Pearson VII universal kernel) and polynomial kernels, and extreme learning machine (ELM) with linear, RBF and polynomial kernels. 60% of the samples are used for training and the remaining 40% for testing. The recognition accuracy as a function of feature-set dimensionality and order is analysed for all the above classifiers.

5.5.1 k-Nearest Neighbour (k-NN)

This classifier belongs to the family of instance-based, lazy-learning algorithms, in which generalization takes place during the classification phase. The distance between the query sample and the training samples is measured; widely used distance metrics include Manhattan, Euclidean, Chebyshev and Minkowski. The Euclidean and Manhattan distance metrics are used here for the classification of ISL alphabets.

The Euclidean distance metric,

$$d\left( {x,y} \right) = \sqrt {\mathop \sum \limits_{{{\text{i}} = 1}}^{\text{m}} \left( {x_{i} - y_{i} } \right)^{2} }$$
(19)

The Manhattan distance metric,

$$d\left( {x,y} \right) = \mathop \sum \limits_{{{\text{i}} = 1}}^{\text{m}} \left| {x_{i} - y_{i} } \right|$$
(20)

Algorithm for k-NN classification:

  1. Choose K, the number of nearest neighbours.

  2. Choose a distance metric (Euclidean or Manhattan) to measure the distance between the test sample and each training sample.

  3. Compute these distances and sort them in increasing order.

  4. Take the K training samples at the smallest distances as the nearest neighbours.

  5. Assign to the test sample the class held by the majority of these K neighbours.
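The steps above, together with the distance metrics of Eqs. (19) and (20), can be sketched in a few lines (a minimal illustration; the function name is our own):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=1, metric="euclidean"):
    """k-NN following the steps above: compute distances, take the k
    smallest, and assign the majority class among those neighbours."""
    diff = X_train - x
    if metric == "manhattan":
        d = np.abs(diff).sum(axis=1)           # Eq. (20)
    else:
        d = np.sqrt((diff ** 2).sum(axis=1))   # Eq. (19)
    nearest = np.argsort(d)[:k]                # indices of the k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]           # majority vote
```

With k = 1, as used in this paper, the majority vote reduces to copying the label of the single closest training sample.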

5.5.2 Support Vector Machine (SVM)

In SVM, features of different classes are separated by a hyperplane, and the position of a test sample relative to it decides which class the sample belongs to. SVM is used for both linear and non-linear classification; in the non-linear case, the feature points are mapped into a high-dimensional feature space by means of kernel functions. SVM with the PUK kernel shows better generalization than other kernel functions such as polynomial and RBF, and than classifiers such as k-NN and MLP [29, 33, 34].

5.5.3 Multi-Layer Perceptron (MLP)

In MLP, the neural network is first trained on training samples presented to the input neurons, with learning taking place through weight updates. Test samples are then supplied to the input neurons, and the desired and predicted values are compared to compute the local error. To minimize the mean squared error, the error is propagated back from the output layer towards the hidden layer and the weights are updated until convergence is achieved.

5.5.4 Extreme Learning Machine (ELM)

ELM is a generalized feed-forward network in which the hidden layer does not need any prior tuning. It can be applied directly to regression and multi-class classification problems. ELM is used in various multi-class classification applications and achieves similar generalization at a faster learning speed than SVM, which has high computational complexity [35].

6 Results and Discussions

In this section, results and discussions are presented to validate the proposed feature extraction method for the ISL recognition system. The experiments were executed in MATLAB R2014b on an Intel(R) Pentium(R) laptop running Windows 7 (32-bit) at 2 GHz with 4 GB RAM.

6.1 Analysis of Dual-Hahn Moments as Local and Global Features

Figures 9 and 10 compare the effectiveness of Dual-Hahn moments as local and global features for the ISL database. Dual-Hahn moments are set to global feature extraction mode by setting a = v = 0, with orders varying from 9 to 29. The feature vector size is (r + 1)² for order (r, q) with r = q, so the feature vector varies from 100 to 900 as the order varies from 9 to 29, as shown in Fig. 9. For local feature extraction, a and v are varied over 2, 4, 6, …, 50, giving 25 different ROIs at orders 1 to 5, with feature vector sizes of 100, 225, 400, 625 and 900 for orders 1, 2, 3, 4 and 5, respectively, as seen in Fig. 10.

Fig. 9

Comparison of Dual-Hahn moments as global features for ISL database

Fig. 10

Comparison of Dual-Hahn moments as local features for ISL database

For a feature vector of size 900, Dual-Hahn moments as global features give 87.9 and 61.9% accuracy for ISL on uniform and complex backgrounds, respectively, while for the same feature vector size Dual-Hahn moments as local features give 98.2 and 75.9%. Thus, Dual-Hahn moments perform best when features are extracted per ROI rather than globally over the entire image.

6.2 Comparison of Discrete Orthogonal Moments for ISL Database

To validate the effectiveness of discrete orthogonal moments, the ISL database is used. The recognition accuracies of Tchebichef, Krawtchouk and Dual-Hahn moments are compared till the 10th order. Figures 11 and 12 show a significant increase in accuracy from the 1st to the 5th order, after which the accuracy stabilizes at higher orders. Dual-Hahn moments used as local features perform best compared to Tchebichef and Krawtchouk moments.

Fig. 11

Comparison of discrete orthogonal moments for ISL on a uniform background

Fig. 12

Comparison of discrete orthogonal moments for ISL on a complex background

To further demonstrate the recognition capability of discrete orthogonal moments, Krawtchouk and Dual-Hahn moments, which achieve the best recognition accuracy, are compared on the standard Jochen-Triesch database.

Table 2 shows the comparative analysis of recognition accuracies across classifiers on the Jochen-Triesch database, varying the ROIs for Krawtchouk and Dual-Hahn moments. Raw features are normalized to map the feature values into the range [−1, 1], improving recognition accuracy [36, 37].

Table 2 Recognition accuracies for Jochen-Triesch’s images

Dual-Hahn moments show the best results, with 96.5% accuracy using SVM PUK; Krawtchouk moments give comparable results, with 93.4% accuracy using SVM PUK. Table 3 compares the proposed method with other recently proposed methods on the Jochen-Triesch database. Dual-Hahn moments as local features outperform the other recently proposed methods, giving an accuracy of 96.5%.

Table 3 Comparison of results for Jochen-Triesch’s dataset

Table 4 shows the performance of Dual-Hahn moments for ISL database on a uniform as well as complex background. An accuracy of 98.2% is observed for the database on a uniform background using SVM PUK. However, for a complex background, the accuracy obtained is 75.9% using k-NN.

Table 4 Recognition accuracies for ISL images using Dual-Hahn moments

Table 5 compares the performance of Krawtchouk moments for ISL database on a uniform as well as on a complex background. An accuracy of 97.8% is obtained for the database on a uniform background using SVM PUK. However, for a complex background, the accuracy obtained is 72.9% using k-NN and SVM PUK.

Table 5 Recognition accuracies for ISL images using Krawtchouk moments

The CFS algorithm reduces the feature-set so as to minimize information redundancy. Feature selection results in slight improvement in accuracy results. For ISL database on a complex background, Dual-Hahn moments give 75.9% accuracy using SVM Polynomial kernel as shown in Table 6. For ISL database on a uniform background, Dual-Hahn moments give 98.3% accuracy using SVM PUK as shown in Table 7.

Table 6 Recognition accuracy using CFS of ISL database on a complex background
Table 7 Recognition accuracy using CFS of ISL database on a uniform background

The time taken to compute discrete orthogonal moment-based features till the 5th order for Tchebichef, Krawtchouk and Dual-Hahn moments is shown in Table 8: for a 30 × 30 image, they take 1.234, 0.182 and 0.102 s, respectively.

Table 8 CPU elapsed time (in s) to compute discrete orthogonal moments till the 5th order

Dual-Hahn moments till 5th order perform best for 26 ISL alphabets on a uniform background with an accuracy of 98.3% followed by Krawtchouk moments giving 97.9%. For complex background, accuracies obtained are 75.9 and 72.6% for Dual-Hahn and Krawtchouk moments, respectively. However, recent works have used limited ISL classes on a uniform background only as illustrated in Table 9.

Table 9 Comparison of proposed methodology with recent methods on ISL database

The confusion matrices for the ISL database are shown in Figs. 13 and 14. Figure 13 illustrates the confusion matrix for the ISL database on a uniform background using Dual-Hahn moments, which give 98.3% accuracy. Most samples of the alphabets 'M' and 'N' are misinterpreted because of their similar shapes and high occlusion, and the alphabets 'A', 'B', 'E' and 'C' have similar shapes, which results in incorrect classifications.

Fig. 13

Confusion matrix for ISL alphabets on a uniform background

Fig. 14

Confusion matrix for ISL alphabets on a complex background

Figure 14 illustrates the confusion matrix of ISL alphabets on a complex background with Dual-Hahn moments, which give 75.9% accuracy. Background variations make the alphabets harder to identify; nevertheless, discrete orthogonal moments retain their shape recognition capability and achieve good results on both uniform and complex backgrounds.

Recent works have used Tchebichef and Krawtchouk moments at high orders (up to the 80th) for gesture recognition [21, 22]. In this work, the recognition accuracy stabilizes at higher orders; moreover, raising the order enlarges the feature vector and increases computation time. Therefore, to increase recognition accuracy on complex backgrounds, features can instead be extracted over a larger number of ROIs at lower orders, capturing finer details from different regions of the image.

7 Conclusion and Future Scope

In this paper, the shape recognition capability of discrete orthogonal moment-based local features is studied on 26 ISL alphabets on uniform as well as complex backgrounds. The performance of the proposed feature vector is first analysed on the standard Jochen-Triesch database, where the proposed method shows competitive results: the recognition accuracies obtained with Krawtchouk and Dual-Hahn moments are better than those of other recently proposed methods. Accuracies of 98.3 and 97.9% are achieved on the ISL database on a uniform background by Dual-Hahn and Krawtchouk moments, respectively; on a complex background, accuracies of 75.9 and 72.6% are observed for Dual-Hahn and Krawtchouk moments, respectively. Thus, the orthogonal moment-based local features are rotation, scale, translation and user invariant, and they can distinguish between the similar shapes that occur among ISL alphabets. In future, the work on static gestures can be extended to dynamic gestures involving the movement of the hands.