1 Introduction

In today's world, personal authentication using biometrics plays a vital role in everyday life. With advances in technology and the growth of biometric applications, biometrics has been a highly researched topic over the last decade owing to its applications in security and surveillance. Nowadays, biometric authentication systems are built into most mobile phones and laptops, and into services such as mobile banking and security systems. In general, biometric traits fall into two types depending upon their characteristics: physical, which includes palmprint, iris, dorsal hand veins, fingerprint, face, etc.; and behavioral, which includes signature, gait and keystroke (Jain et al. 2016).

For any biometric system, the first requirement is the acquisition of enough data for proper training and testing. For palmprint, a few early systems (Zhang and Shu 1999; Duta et al. 2002) used ink-marking methods for sample acquisition. These systems were not very user-friendly because of the use of ink during data acquisition. Later, scanner- and camera-based methods were proposed, using charge-coupled device (CCD) scanners and digital scanners (Kong et al. 2009), in both contact and contact-less configurations. A few systems used pins (Noh and Rhee 2005) or pegs (Zhang et al. 2003) for hand placement to avoid rotational variation. Several systems in the literature acquired images without any constraints (Pan and Ruan 2009; Badrinath and Gupta 2007).

After data acquisition, the second step is to extract the region of interest (ROI) from the data samples. Most general-purpose biometric systems are designed to tolerate intra-class variations, so that they remain practical in real time and absorb the inconsistencies users introduce at the verification point in offices or banks. Hence, orientation or rotational variations arise. Where data acquisition is constrained by pegs or pins, ROIs can be extracted easily. If the database has rotational variation, the hand samples are normalized before ROI extraction to make the system robust to such variations. All the steps that align database samples for feature extraction are referred to as preprocessing. In most preprocessing methods, the valley points between the fingers are extracted to crop the ROI. The basic steps are: binarizing the image to extract the hand boundary, which also helps in masking the hand on a background-free image; finding the key points; generating the coordinate system; and cropping the ROI. The primary steps in all preprocessing algorithms are similar. However, the key-point detection and coordinate-system steps have several different implementations, including tangent-based (Zhang et al. 2003), bisector-based (Li et al. 2002; Wu et al. 2004) and finger-based (Han 2004; Han et al. 2003) detection of the key points between the fingers.
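The generic pipeline above (binarize, locate anchor points, crop) can be sketched as follows. This is a minimal NumPy illustration using a global threshold and the hand centroid as the anchor, not the exact valley-point detectors of the cited methods; the function names are our own.

```python
import numpy as np

def binarize_hand(img, thresh=0.2):
    """Binarize a grayscale hand image (values in [0, 1]) to separate
    the hand silhouette from the background."""
    return (img > thresh).astype(np.uint8)

def hand_centroid(mask):
    """Centroid of the hand mask, used here to anchor the ROI
    coordinate system (real systems use finger valley points)."""
    ys, xs = np.nonzero(mask)
    return ys.mean(), xs.mean()

def crop_roi(img, center, size):
    """Crop a square ROI of side `size` around `center` (row, col)."""
    r, c = int(center[0]), int(center[1])
    h = size // 2
    return img[r - h:r + h, c - h:c + h]
```

On a real database, `hand_centroid` would be replaced by the tangent-, bisector- or finger-based key-point step described in the text.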

Hand-based biometrics provide easy, efficient and secure biometric systems and hence are favored by research groups working in this area, as well as by investigation and forensic science departments (Jain et al. 2006). Within hand-based biometrics, palmprint, palm-phalanges, dorsal hand vein, fingerprint, knuckles, palm vein and their fusion have all been researched in the literature. Like a fingerprint, a palmprint has ridge-like structures with a few distinct principal lines, minutiae points, singular points and a visible texture that carries a great deal of unique information, and hence it can be used as a biometric modality. Palmprint-based biometric systems are preferred over other systems due to inherent advantages such as low cost, stable print patterns, easy acquisition, robustness to aging and user acceptance (Hong et al. 2015; Nigam and Gupta 2015). To date, palmprint recognition has received increasing research attention, and a variety of methods have been proposed for palmprint feature extraction and recognition (Nigam and Gupta 2015; Chen et al. 2013; Ahmad et al. 2016; Zheng et al. 2016; Tiwari et al. 2013; Chakraborty et al. 2013; Wang et al. 2013; Zhao et al. 2015; Leng and Teoh 2015; Lin and Tai 2015; Yue et al. 2013; Xu et al. 2015; Hong et al. 2015; Zhang et al. 2010; Kumar and Shekhar 2011; Malik et al. 2015). Surveys of palmprint recognition methods are presented in Kong et al. (2009), Zhang et al. (2012) and Fei et al. (2018). It was also shown in Kong et al. (2006) that no two palmprints are identical, even in mono-zygotic twins; this is why a palmprint can serve as a reliable password and provides very high accuracy when proper hardware is available. Thus, palmprint-based biometric systems have ample scope in security applications such as access control and network and personal security.

Feature extraction in biometrics plays a pivotal role in extracting unique information from the data samples. In the literature, a palmprint is represented using structural features, which include principal lines, wrinkles, datum points, minutiae points, ridges and crease points (Zhang and Shu 1999; Duta et al. 2002; Han et al. 2003). These approaches are also known as line-based approaches. In these methods, the extracted structures are either matched directly or mapped into another format for matching. For example, Chen et al. (2013) and Huang et al. (2008) used the intrinsic features of the palmprint, e.g., principal lines and wrinkles, for palmprint recognition.

The second type of approach is the subspace-based approach, which includes principal component analysis (PCA) (Lu et al. 2003), linear discriminant analysis (LDA) (Wu et al. 2003), independent component analysis (ICA) (Shang et al. 2006) and the discrete cosine transform (DCT) (Jing and Zhang 2004). These methods are applied to images or their subsections to evaluate the subspace coefficients, which are then used as features.
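As an illustration of the subspace idea, PCA coefficients can be computed via the SVD of the centered, flattened ROI samples; this is a generic sketch, not the exact formulation of the cited works.

```python
import numpy as np

def pca_features(samples, k):
    """Project flattened ROI samples (n_samples x n_pixels) onto the
    top-k principal components; the projection coefficients serve as
    the subspace features."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Rows of vt are the principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]
    return centered @ basis.T, basis, mean
```

Keeping all components reconstructs the data exactly; truncating to a small k yields a compact feature vector per sample.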

The last type of method is the statistical method, which includes Fourier transforms (Li et al. 2002), mean (Kumar and Shen 2004), AAD (Lu et al. 2004), GMF-based features (Chaudhary et al. 2016; Srivastava et al. 2016), Gabor filters (Chu et al. 2007), the scale-invariant feature transform (SIFT) (Zhu and Zhang 2010), fusion code (Badrinath and Gupta 2009) and wavelets (Lu et al. 2004). In these approaches, images are first transformed into another domain using Fourier transforms, wavelets, the Gabor transform, the Stockwell transform (Badrinath and Gupta 2011), etc.; then the images are divided into nonoverlapping windows, and local statistics such as mean, variance, moments and centers of gravity (weighted means) are calculated. These statistics are then used as features.
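The window-statistics step can be sketched as follows, taking mean and variance as the local statistics (higher moments are handled analogously); the window size matches the 15 x 15 partitioning used later in this paper.

```python
import numpy as np

def window_statistics(roi, win=15):
    """Split an ROI into non-overlapping win x win windows and compute
    local statistics (mean and variance) per window, concatenated into
    one feature vector."""
    h, w = roi.shape
    feats = []
    for r in range(0, h - h % win, win):
        for c in range(0, w - w % win, win):
            block = roi[r:r + win, c:c + win]
            feats.extend([block.mean(), block.var()])
    return np.array(feats)
```

For a 150 x 150 ROI this yields 100 windows, i.e., a 200-dimensional feature vector.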

Due to the presence of lines and unique textures, a palmprint carries rich, distinctive orientation information. Gabor transform-based feature extraction methods extract the orientation features of palmprint images, e.g., palmcode (Zhang et al. 2003), competitive code (CompC) (Kong and Zhang 2004), ordinal code (Sun et al. 2005), double orientation code (DOC) (Fei et al. 2016) and discriminative and robust competitive code (DRCC) (Xu et al. 2016). The CompC method (Kong and Zhang 2004) uses six Gabor filters with different orientations to extract the dominant orientation feature from a palmprint: six Gabor filters with orientations \(j\pi /6\) \((j=0,1,\ldots ,5)\) are convolved with the palmprint image, and the orientation producing the strongest response is taken as the competitive code. In DOC (Fei et al. 2016), the two most dominant orientation responses of the Gabor filters are encoded and used as features. In DRCC (Xu et al. 2016), a side code is extracted along with the dominant orientation code to improve accuracy. Among the above-mentioned methods, subspace-based approaches and statistical methods are highly comparable.
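A minimal sketch of the competitive-coding idea follows, with an illustrative real Gabor kernel and FFT-based convolution; the filter parameters and the use of the most negative response (palm lines are darker than their surroundings) are simplifying assumptions, not the exact rule of Kong and Zhang (2004).

```python
import numpy as np

def gabor_kernel(size, theta, f0=0.1, sigma=3.0):
    """Real part of a Gabor filter at orientation theta (radians);
    f0 and sigma are illustrative placeholder values."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return env * np.cos(2 * np.pi * f0 * xr)

def competitive_code(roi, n_orient=6, size=11):
    """Per-pixel index j of the orientation j*pi/6 with the strongest
    (here: most negative) Gabor response."""
    responses = []
    for j in range(n_orient):
        k = gabor_kernel(size, j * np.pi / n_orient)
        # FFT-based circular convolution with the kernel centered at the origin.
        pad = np.zeros_like(roi)
        kh, kw = k.shape
        pad[:kh, :kw] = k
        pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
        responses.append(np.real(np.fft.ifft2(np.fft.fft2(roi) * np.fft.fft2(pad))))
    return np.argmin(np.stack(responses), axis=0)
```

The resulting integer code map (values 0–5) is what competitive-code matchers compare, typically via an angular distance.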

In addition to feature extraction-based approaches, fusion-based palmprint recognition methods have been proposed to enhance the performance of biometric systems. In Xu et al. (2015), the left and right palmprint images were fused for more accurate personal identification. In Hong et al. (2015) and Zhang et al. (2010), multispectral palmprint recognition methods were proposed, which fuse the features of palmprint images captured under different spectra. In addition, Kumar and Shekhar (2011) investigated rank-level fusion of multiple palmprint representations. To further improve performance, several multimodal palmprint recognition methods have been proposed in which the palmprint is fused with other biometric modalities such as the palm vein. In Srivastava et al. (2016), fusion of the palm-phalanges print with the palmprint and dorsal hand vein was proposed, and in Lin and Tai (2015), palmprint and palm vein fusion was proposed. In Yue et al. (2013), a hashing-based fast palmprint recognition method was proposed.

Until now, none of these methods has addressed the noise robustness of its approach, although a combinatorial algorithm was suggested in Liambas and Tsouros (2007) that extracts the ROI from a highly noisy image. What happens if noisy conditions occur during a security check or access control? In palmprint systems, for example, users' hands can be dirty, marked or dusty, and noise may arise from poor illumination or reflection of light. Hence, biometric systems must be robust enough to cope with these noisy conditions.

Taking the above facts into consideration, this work presents a biometric recognition system with a novel feature extraction technique: the two-dimensional (2D) Cochlear transform (CT), a powerful time-frequency transform for signal and texture analysis (Li and Huang 2011). This transform is similar to the wavelet and Gabor transforms in some respects and is able to extract the frequency content of the palmprint. The performance of 2D-CT has been validated using KNN with Euclidean distance. The proposed 2D-CT method is compared with CompC coding, ordinal coding, the standard Gabor transform, GMF-based features, AAD and mean features on the CASIA palmprint database (CASIA-Palmprint database) and the Indian Institute of Technology Delhi (IITD) palmprint database (IIT Delhi Palmprint Image Database version 1.0). Results show that the proposed feature extraction method outperforms the other transform methods. A graphical abstract of the proposed work is shown in Fig. 1.

Fig. 1 Graphical abstract of the proposed work

The 2D-CT is also validated in the presence of noise and found to be robust. The performance of the various modalities using different feature extraction techniques has been analyzed on the basis of receiver operating characteristic (ROC) curves, which show the clear superiority of the 2D-CT technique. None of the above-mentioned databases contains noise or disturbances. Hence, different types of noise are added for testing to evaluate the proposed method in a realistic environment: Gaussian noise (similar to poor illumination), salt-and-pepper noise (similar to noise due to dust) and speckle noise (similar to light reflection). The principal sources of Gaussian noise in digital images arise during acquisition (e.g., sensor noise caused by poor illumination and/or high temperature) and transmission (e.g., electronic circuit noise). In a palmprint recognition system, images are scanned under light illumination, so if the illumination is poor, the noise can be modeled as Gaussian (Philippe Cattin 2013).
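The three noise models used in the evaluation can be simulated as follows; the noise levels shown are illustrative defaults, not the settings of the experiments.

```python
import numpy as np

def add_gaussian(img, sigma=0.05, rng=None):
    """Additive Gaussian noise, modeling poor or uneven illumination."""
    rng = rng or np.random.default_rng()
    return img + rng.normal(0.0, sigma, img.shape)

def add_salt_pepper(img, density=0.05, rng=None):
    """Salt-and-pepper noise, modeling dust or dirt on the palm/scanner."""
    rng = rng or np.random.default_rng()
    out = img.copy()
    mask = rng.random(img.shape)
    out[mask < density / 2] = 0.0        # pepper
    out[mask > 1 - density / 2] = 1.0    # salt
    return out

def add_speckle(img, sigma=0.1, rng=None):
    """Multiplicative speckle noise, modeling light reflections."""
    rng = rng or np.random.default_rng()
    return img * (1 + rng.normal(0.0, sigma, img.shape))
```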

1.1 Contributions of the paper

  1. A biometric recognition system is described with a novel feature extraction technique: the two-dimensional (2D) Cochlear transform (2D-CT).

  2. The robustness of the proposed method is validated theoretically using orthogonality and experimentally in the presence of different noises.

  3. The proposed feature extraction technique is compared with DRCC, DOC, ordinal coding, the Gabor transform, GMF, AAD and mean features on the IITD, CASIA and PolyU palmprint databases. Results show that the proposed feature extraction method outperforms the existing techniques.

The organization of the paper is as follows. Section 2 presents the feature extraction method. Section 3 demonstrates the simulations and result analysis, with Sect. 3.1 describing the preprocessing method used for ROI extraction. Finally, Sect. 4 concludes the work.

2 Feature extraction

The Cochlear transform (CT) is a time-frequency transform comprising a forward transform and an inverse transform (Li and Huang 2011), a pairing that can be shown through the admissibility property.

All wavelet-based orthogonal transforms are integral transforms that can be expressed as an inner product of the signal x(t) with a transform kernel function \(\varphi _{a,b}(t)\). The generalized CT \(C(\tau , f)\) of a time-varying 1D signal x(t) is defined as

$$\begin{aligned} C(\tau , f)= & {} x(t) \otimes {\varphi _{a,b}}(t) \end{aligned}$$
(1)
$$\begin{aligned} \varphi _{a,b}(t)= & {} \frac{1}{\sqrt{a}} \varphi \left( \frac{t-b}{a} \right) \end{aligned}$$
(2)

where \(\otimes \) denotes the convolution operator and the kernel function \(\varphi _{a,b}(t)\) is a member of a family of complete basis functions that span the space in which the signal x(t) exists. The CT is thus also an integral transform based on a set of kernel functions, which may be referred to as daughter wavelets, all derived from a mother wavelet \(\varphi _{a,b}(t)\) that satisfies the following conditions:

  1. \(\varphi _{a,b}(t)\) should have compact support, i.e., \(\varphi _{a,b}(t)\ne 0\) only inside a bounded range \(a< t < b\): \(\{ (a,b) | \varphi _{a,b}(t)\ne 0 \}:D_{1} \)

  2. \(\varphi _{a,b}(t)\) has zero mean, i.e., no DC component:

    $$\begin{aligned} \int ^\infty _{-\infty } \varphi _{a,b}(t)\mathrm{d}t=0 \end{aligned}$$
    (3)

Compared to the FFT, the CT has flexible time-frequency resolution, and its frequency distribution can take any linear or nonlinear form. In the FFT, the time information is completely lost and the frequency axis is divided uniformly; frequency resolution can be made very precise only by integrating along the whole time axis. The scaling parameter a is inversely proportional to frequency: to focus on low frequencies a larger a is used, while higher frequencies use a small a. This flexibility improves the time-frequency analysis.

A typical Cochlear impulse response function or Cochlear filter can be defined as:

$$\begin{aligned} \varphi _{a,b}(t) =\frac{1}{\sqrt{a}}\left( \frac{t-b}{a}\right) ^\alpha e^{-2\pi {f_L}\beta (\frac{t-b}{a})}e^{-j2\pi {f_L}\beta (\frac{t-b}{a})} \end{aligned}$$
(4)

Further, CT \(C(\tau , f)\) of x(t) is given by

$$\begin{aligned}&C(\tau , f)\nonumber \\&\quad {=}\frac{ 1}{\sqrt{\left| a\right| }} \int ^{\infty }_{-\infty } x(t)\left( \frac{t-b}{a}\right) ^\alpha e^{-2\pi {f_L}\beta (\frac{t-b}{a})}e^{{-}j2\pi {f_L}\beta (\frac{t-b}{a})}\mathrm{d}t\nonumber \\ \end{aligned}$$
(5)
$$\begin{aligned}&=\frac{ a^{\alpha +1}}{\sqrt{\left| a\right| }} \int ^{\infty }_{-\infty } x(t) t^\alpha e^{-2\pi {f}\alpha \beta t}e^{-j2\pi {f}\alpha \beta t}\mathrm{d}t \end{aligned}$$
(6)

where \(a=f_L/f\) and \(f_L\) is the lowest central frequency of the filter. The CT \(C(\tau , f)\) of the signal x(t) can be represented in amplitude and phase form as in Stockwell (2007):

$$\begin{aligned} C(\tau , f)=A_x(\tau , f, \alpha , \beta )e^{i\phi _x(\tau , f, \alpha , \beta )} \end{aligned}$$
(7)

where \(A_x(\tau , f, \alpha , \beta )\) represents the amplitude spectrum of the transform, which is a time-frequency representation depending on the parameters \(\alpha , \beta \). The amplitude spectrum is affected by noise and illumination; hence, to make the system more robust to both, the phase spectrum, \(\phi _x(\tau , f, \alpha , \beta )\), is used for feature extraction. This phase depends on the parameters \(\alpha \) and \(\beta \), and hence the phase spectrum gives resonating peaks when these parameters are suitably chosen. In contrast, the Stockwell transform used for palmprint recognition in Badrinath and Gupta (2011) gives a constant phase, and its daughter wavelet does not satisfy the zero-mean condition for an admissible wavelet (Saedi and Charkari 2014), which is a necessary condition for orthogonality.

In a practical system, lighting conditions may differ, changing the pixel intensities of two images. Assume that the illumination of two palmprints differs by a constant k, with \(k>0\); this changes the signal x(t) to kx(t).

The CT \(C(\tau , f)\) of signal kx(t) is given by

$$\begin{aligned}= & {} \frac{ a^{\alpha +1}}{\sqrt{\left| a\right| }} \int ^{\infty }_{-\infty } kx(t) t^\alpha e^{-2\pi {f}\alpha \beta t}e^{-j2\pi {f}\alpha \beta t}\mathrm{d}t \end{aligned}$$
(8)
$$\begin{aligned} C(\tau , f)= & {} kA_x(\tau , f, \alpha , \beta )e^{i\phi _x(\tau , f, \alpha , \beta )} \end{aligned}$$
(9)

It is seen in Eq. (9) that the phase is independent of the constant k, i.e., of the illumination difference, and hence is unaffected by illumination. Thus, the Cochlear transform is theoretically robust to illumination.
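This invariance of the phase spectrum under a positive gain holds for any linear transform; a quick numerical check, using the FFT as a stand-in for the CT:

```python
import numpy as np

def illumination_invariant_phase(x, k):
    """Check that scaling a signal by k > 0 (a global illumination
    change) leaves the phase spectrum unchanged, while the amplitude
    spectrum scales by k."""
    X, Xk = np.fft.fft(x), np.fft.fft(k * x)
    amp_scales = np.allclose(np.abs(Xk), k * np.abs(X))
    phase_same = np.allclose(np.angle(Xk), np.angle(X))
    return amp_scales and phase_same
```

This is exactly the property Eq. (9) states for the CT: only the amplitude \(A_x\) absorbs the factor k.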

A signal or an image may be corrupted by various factors that enter as noise during acquisition or transmission. Usually, noise is modeled as a high-frequency signal (Pizurica et al. 2003). When filtering the noise from the original signal, the prominent part of the original signal must be conserved, and wavelet-based noise removal techniques provide this conservation (Benesty et al. 2012). The wavelet transform is generally used to decompose a signal into high- and low-frequency components; in practice, it is implemented with a perfect-reconstruction filter bank using an orthogonal wavelet family. The idea is to decompose the signal into sub-signals corresponding to different frequency contents: in the decomposition step, a signal is projected onto a set of orthonormal wavelet functions that constitutes a wavelet basis. A wavelet expansion of the CT in terms of orthogonal components demonstrates the robustness of the method, owing to their highly uncorrelated nature. If the chosen mother wavelet has orthogonal properties, the multi-resolution algorithm decomposes a signal into scales with different time and frequency resolutions. The noise in a signal is typically of high frequency, and it can be discriminated through multi-resolution decomposition into different levels.
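The decompose-and-threshold idea can be sketched with the orthogonal Haar wavelet, a simple stand-in for the general orthogonal family discussed here:

```python
import numpy as np

def haar_step(x):
    """One level of the orthogonal Haar decomposition: approximation
    (low-pass) and detail (high-pass) coefficients."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def haar_inverse(approx, detail):
    """Perfect reconstruction from one Haar level."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

def denoise(x, thresh):
    """Suppress small (noise-dominated) detail coefficients, keeping
    the prominent low-frequency content, as the text describes."""
    a, d = haar_step(x)
    d = np.where(np.abs(d) > thresh, d, 0.0)
    return haar_inverse(a, d)
```

Because the basis is orthogonal, a zero threshold reconstructs the signal exactly; a positive threshold removes only the high-frequency detail where noise concentrates.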

2.1 Orthogonality of Cochlear transform

We define a mother wavelet function \(\varphi (t)\in L^2(R)\) that is limited in the time domain; that is, \(\varphi (t)\) takes values in a certain range and is zero elsewhere. To establish orthogonality, the inner product of the basis functions is calculated:

$$\begin{aligned} \int \frac{1}{\sqrt{\left| a\right| }}\varphi \left( \frac{t-b}{a} \right) \frac{1}{\sqrt{\left| a\right| }}\varphi \left( \frac{t'-b}{a} \right) f(a) \mathrm{d}a\, \mathrm{d}b=\delta {(t'-t)} \end{aligned}$$
(10)

Taking the Fourier transform of both sides,

$$\begin{aligned} e^{-jwt'}= & {} \int \frac{f(a)}{{\left| a\right| }} \hat{\varphi } \left( aw \right) e^{-jwb} \varphi \left( \frac{t'-b}{a} \right) \mathrm{d}a \,\mathrm{d}b \end{aligned}$$
(11)
$$\begin{aligned}= & {} \int \frac{f(a)}{{\left| a\right| }} \hat{\varphi } \left( aw \right) \mathrm{d}a \int \varphi \left( \frac{t'-b}{a} \right) e^{-jwb} \mathrm{d}b \end{aligned}$$
(12)

By substituting \(\frac{t'-b}{a} =x\), so that \(\left| a\right| \mathrm{d}x=\mathrm{d}b\), we get

$$\begin{aligned} \int \varphi \left( \frac{t'-b}{a} \right) e^{-jwb} \mathrm{d}b= & {} \int \varphi \left( x\right) e^{-jw(t'-ax)} \left| a\right| \mathrm{d}x\nonumber \\ \end{aligned}$$
(13)
$$\begin{aligned} \int \varphi \left( x\right) e^{-jw(t'-ax)} \left| a\right| \mathrm{d}x= & {} \hat{\varphi } \left( aw \right) e^{-jwt'} \left| a\right| \end{aligned}$$
(14)

Substituting Eq. (14) into Eq. (12), we get

$$\begin{aligned} \int f(a) \left| \hat{\varphi \left( aw \right) }\right| ^2 \mathrm{d}a= & {} 1 \end{aligned}$$
(15)
$$\begin{aligned} \int f\left( \frac{a}{w}\right) \left| \hat{\varphi \left( a \right) }\right| ^2 \frac{\mathrm{d}a}{\left| w\right| }= & {} 1 \end{aligned}$$
(16)

Now, substituting the scaling function \(f(\zeta )=\frac{1}{\left| \zeta \right| }\),

$$\begin{aligned} \int \frac{\left| \hat{\varphi \left( a \right) }\right| ^2}{\left| a\right| } {\mathrm{d}a} =1 \end{aligned}$$
(17)

This equation is also called the admissibility condition, which is a sufficient condition for orthogonality.

$$\begin{aligned}&\int f(t)\frac{1}{\sqrt{\left| a\right| }}\varphi \left( \frac{t-b}{a} \right) \frac{1}{\sqrt{\left| a\right| }}\varphi \left( \frac{t'-b}{a} \right) \mathrm{d}a\,\mathrm{d}b\nonumber \\&\quad = \int f(t)\delta (t'-t)\mathrm{d}t=f(t) \end{aligned}$$
(18)

The transform is thus found to be orthogonal, which implies that a function f can be recovered easily from the inner products \(<\varphi , X>\). Hence, it is shown theoretically that the Cochlear transform is orthogonal, which indicates its robustness to noise.

2.2 Two-dimensional Cochlear transform

The same idea is utilized to transform an image into the frequency domain. Here, we propose a two-dimensional Cochlear transform for images, henceforth named 2D-CT. It has a bell-shaped response whose values depend upon the system parameters and the image sample. A notable feature of this proposed technique is that it is quite successful in feature extraction for various image-based modalities such as palmprint, fingerprint and dorsal hand vein. To extract features, the 2D-CT is applied over the ROIs of the testing and training samples. For any image sample I(x, y), the cropped ROI R(m, n) is extracted after preprocessing and is then used in feature extraction.

On the basis of CT, we define \(\varphi _{a,b}(x, y)\) having dilation and translation parameters \((a_x, a_y)\) and \((b_x, b_y),\) respectively, each varying over \(\mathfrak {R}^2\). On the basis of wavelet analysis, 2D-CT can be written as dilated and translated mother wavelet as shown in Eq. (19).

$$\begin{aligned} \varphi _{a,b}(x, y) =\frac{1}{\sqrt{a_x a_y}} \varphi \left( \frac{x-b_x}{a_x}, \frac{y-b_y}{a_y} \right) \end{aligned}$$
(19)

As in the wavelet transform, the factors \(a_x\) and \(a_y\) are scaling variables in the x and y directions, \((b_x, b_y)\) is the translation (shift), and \(\frac{1}{\sqrt{a_x a_y}}\) is an energy-normalizing factor.

As for a wavelet, the Fourier transform of this function becomes

$$\begin{aligned}&\widehat{\varphi }_{a_x, a_y, b_x, b_y} \left( u, v \right) \nonumber \\&\quad {=}\frac{2 \pi }{\sqrt{\left| a_x a_y\right| }} \int ^{\infty }_{-\infty } \int ^{\infty }_{-\infty } e^{-j \pi (ux+vy)} \varphi _{a_x, a_y, b_x, b_y} \left( x, y \right) \mathrm{d}x \, \mathrm{d}y\nonumber \\&\quad =\frac{1}{\sqrt{\left| a_x a_y\right| }} e^{-j \pi (ub_x+vb_y)} \widehat{\varphi } \left( ua_x, v a_y \right) \end{aligned}$$
(20)

where \(\varphi _{a_x, a_y, b_x, b_y} \left( x, y \right) \)=\(\varphi \left( \frac{x-b_x}{a_x}, \frac{y-b_y}{a_y} \right) \)

A transform of f(x, y) with respect to \(\varphi _{a,b}(x, y)\) is defined as

$$\begin{aligned} T(\tau , f)= & {} f(x, y) \otimes \varphi _{a,b}(x, y) \nonumber \\= & {} <f, \varphi _{a,b}> \nonumber \\= & {} \int ^{\infty }_{-\infty } \int ^{\infty }_{-\infty } \frac{1}{\sqrt{\left| a_x a_y\right| }} f(x, y) \overline{\varphi \left( \frac{x-b_x}{a_x}, \frac{y-b_y}{a_y} \right) } \mathrm{d}x \, \mathrm{d}y\nonumber \\ \end{aligned}$$
(21)

where \(\otimes \) denotes the convolution operator.

To prove the orthogonality of 2D-CT, we define

\(\varphi _{a_x,b_x}(x)\varphi _{a_y,b_y}(y)=(\varphi _{a_x,b_x} \otimes \varphi _{a_y,b_y})(x,y)=\varphi _{a,b}(x,y)\) where \(a= \left[ a_x \, a_y \right] \) and \( b= \left[ b_x \, b_y\right] \).

$$\begin{aligned} \begin{aligned} \int \varphi _{a,b} \left( x, y \right) \varphi _{a,b} \left( x', y' \right) \frac{\mathrm{d}a_x \mathrm{d}a_y \mathrm{d}b_x \mathrm{d}b_y}{{\left| a_x a_y\right| }}{=}\delta (x{-}x') \delta (y{-}y') \end{aligned} \end{aligned}$$
(22)

To ensure orthogonality, there exists a resolution of the identity for wavelets. One finds, for all \(f_1, f_2 \in L^2(\mathfrak {R}^2)\),

$$\begin{aligned} \begin{aligned} \int<\varphi , f_1> <f_2, \varphi > \frac{\mathrm{d}a_x \mathrm{d}a_y \mathrm{d}b_x \mathrm{d}b_y}{(a_x a_y)^2}=C_\varphi (f_1, f_2) \end{aligned} \end{aligned}$$
(23)

where \(C_\varphi \) is a constant defined as

$$\begin{aligned} \begin{aligned} C_\varphi = \int \int \frac{\mathrm{d}a_x \mathrm{d}a_y}{\left| a_x a_y\right| } \left| \hat{\varphi }\left( a_x, a_y\right) \right| ^2 \end{aligned} \end{aligned}$$
(24)

leading to the inversion formula

$$\begin{aligned} \begin{aligned} f(x, y)=C_\varphi ^{-1}\int \int \frac{\mathrm{d}a_x \mathrm{d}a_y \mathrm{d}b_x \mathrm{d}b_y}{\left| a_x a_y\right| ^2} [<\varphi , f >]\varphi _{a_x, a_y, b_x, b_y} \end{aligned} \end{aligned}$$
(25)

Now, the 2D Cochlear filter is defined as

$$\begin{aligned} \varphi _{a,b}(x, y) =\frac{1}{\sqrt{a_x a_y}} s_i ^ \alpha \exp (-2\pi \beta s_i) \cos ( 2 \pi f_l s_i ) \end{aligned}$$
(26)

where \(s_i\) is the energy-normalized radial coordinate defined in Eq. (27), and \(\beta \), with \(\beta >0\), defines the shape of the filter. The term \(\exp (-2\pi \beta s_i )\cos ( 2 \pi f_l s_i )\) acts as a band-pass filter, which is useful for image enhancement and noise filtering.

$$\begin{aligned} s_i = \left[ \left( \frac{m-b}{a_x}\right) ^2 + \left( \frac{n-b}{a_y}\right) ^2 \right] ^{1/2} \end{aligned}$$
(27)
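A direct implementation of the filter in Eqs. (26) and (27) might look as follows; the parameter values (\(\alpha \), \(\beta \), \(f_l\), scales) are illustrative placeholders, not the tuned values used in the experiments.

```python
import numpy as np

def cochlear_kernel(size, a=(1.0, 1.0), b=0.0, alpha=2.0, beta=0.5, f_l=0.1):
    """2D Cochlear filter of Eqs. (26)-(27): a radial polynomial-times-
    decaying-exponential envelope modulated by a cosine carrier.
    alpha, beta and f_l here are placeholder values."""
    ax, ay = a
    m, n = np.mgrid[0:size, 0:size].astype(float)
    # Eq. (27): normalized radial coordinate
    s = np.sqrt(((m - b) / ax) ** 2 + ((n - b) / ay) ** 2)
    # Eq. (26): envelope (s^alpha * exp) times band-pass carrier (cos)
    return (s ** alpha) * np.exp(-2 * np.pi * beta * s) * \
           np.cos(2 * np.pi * f_l * s) / np.sqrt(ax * ay)
```

Convolving an ROI with such kernels at several \((\alpha , \beta )\) settings yields the complex response whose phase is used for the features.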

The 2D-CT \(T(\tau , f)\) of signal f(xy) can be represented in amplitude and phase form as Stockwell (2007)

$$\begin{aligned} T(\tau , f) = R_{x,y}(\tau , f, \alpha , \beta ) e^{i\phi _{x,y}(\tau , f, \alpha , \beta )} \end{aligned}$$
(28)

Then, the final features are calculated as the negative derivative of the phase \(\phi _{x,y}(\tau , f, \alpha , \beta )\):

$$\begin{aligned} \log T(\tau , f)= & {} \log R_{x,y}(\tau , f, \alpha , \beta ) + i\phi _{x,y}(\tau , f, \alpha , \beta )\nonumber \\ \end{aligned}$$
(29)
$$\begin{aligned} C(\tau , f)= & {} \frac{-\mathrm{d}\phi _{x,y}(\tau , f, \alpha , \beta )}{\mathrm{d} f} \end{aligned}$$
(30)

The obtained features \(C(\tau , f)\) depend on the phase \(\phi _{x,y}(\tau , f, \alpha , \beta )\), which in turn depends on \(\alpha \) and \(\beta \); this yields a time-frequency domain representation that is more robust.

2.3 Robustness to noise

Assume the corrupted image to be \(X(x, y)=f(x, y)+n(x, y)\). Using the wavelet analysis of the 2D-CT, the transformation of the corrupted image is shown in Eq. (31).

$$\begin{aligned} W_{\varphi }X(a,b)\!=\!\int X(x, y) \varphi \left( \frac{x-b_x}{a_x}\right) \varphi \left( \frac{y-b_y}{a_y} \right) \frac{\mathrm{d}x \mathrm{d}y}{\sqrt{a_x a_y}} \end{aligned}$$
(31)

By the linearity property of the wavelet transform, this gives

$$\begin{aligned} W_{\varphi }X(a,b)=W_{\varphi }f(a,b)+W_{\varphi }n(a,b) \end{aligned}$$
(32)

Now, we define the transformed regions of the image and the noise as \(D_\in f\) and \(D_\in n\), respectively. In their respective regions, the image and the noise dominate above a threshold \(\in \) (in the case of noise, \(\in \) is defined by the average intensity, i.e., the variance \(\sigma ^2_n\)).

$$\begin{aligned} \left\{ (a,b) |W_{\varphi }f(a,b)> \in \right\} :D_\in f \left\{ (a,b) |W_{\varphi }n(a,b) > \in \right\} :D_\in n \end{aligned}$$

where \(D_\in f \bigcap D_\in n =\varPhi \) is an empty set, which indicates minimal or zero interaction between the noise and the image. We define

$$\begin{aligned} \begin{aligned} \varphi _{a, b} (x, y ) \frac{\mathrm{d}a_x \mathrm{d}a_y \mathrm{d}b_x \mathrm{d}b_y}{|a_x a_y|} = \varPsi \mathrm{d}a \mathrm{d}b \end{aligned} \end{aligned}$$
(33)

Now, the corrupted image is present in four regions: \(D_\in f\) and \(D_\in n\) and their complements, \(D^c_\in f\) and \(D^c_\in n\).

$$\begin{aligned} \begin{aligned}&X(x, y)\\&\quad =\int _{D_\in f }W_{\varphi }f(a, b)\varPsi \mathrm{d}a \mathrm{d}b+\int _{D^c_\in f }W_{\varphi }f(a, b)\varPsi \mathrm{d}a \mathrm{d}b\\&\qquad +\int _{D_\in f }W_{\varphi }n(a, b)\varPsi \mathrm{d}a \mathrm{d}b+\int _{D^c_\in f }W_{\varphi }n(a, b)\varPsi \mathrm{d}a \mathrm{d}b \end{aligned} \end{aligned}$$
(34)

Now, in the region \(D_\in f\), the difference between the corrupted image and the clean image is bounded by the noise present in the complementary region:

$$\begin{aligned} \begin{aligned}&\left| \int _{D_\in f }W_{\varphi }X(a, b)\varPsi \mathrm{d}a \mathrm{d}b-\int _{D_\in f }W_{\varphi }f(a, b)\varPsi \mathrm{d}a \mathrm{d}b\right| \\&\quad \le \int _{D^c_\in f }\left| W_{\varphi }n(a, b)\varPsi \mathrm{d}a \mathrm{d}b\right| \\&\quad \le K\in \\ \end{aligned} \end{aligned}$$
(35)

where \(D_\in f \bigcap D_\in n =\varPhi \) and \(D_\in f \subset D^c_\in n \). Here \(K>0\), which implies that in the region \(D_\in f\), the difference between the corrupted image and the clean image is lower than the defined threshold (i.e., lower than the average noise intensity). If we choose \(K=0.5\), then \(\sigma ^2_X-\sigma ^2_f\le \frac{\sigma ^2_n}{2}\). Hence, we can reject the noise in the region \(D_\in f\).

Now, the range where \(\left| W_{\varphi }f(a, b)\right| <\in \) lies only in \({D^c_\in f }\), where the noise exists. This clearly implies that

$$\begin{aligned} \begin{aligned}&\int _{\left| W_{\varphi }f(a, b)\right| <\in }W_{\varphi }f(a, b)\varPsi \mathrm{d}a \mathrm{d}b\\&\quad = \int _{D^c_\in f }\left| W_{\varphi }n(a, b)\varPsi \mathrm{d}a \mathrm{d}b\right| \\ \end{aligned} \end{aligned}$$
(36)

Integrating the clean image over the union \({(D_\in f \cup D^c_\in f )}\),

$$\begin{aligned} \begin{aligned}&\left| \int _{D_\in f }W_{\varphi }X(a, b)\varPsi \mathrm{d}a \mathrm{d}b-\int _{(D_\in f \cup D^c_\in f )}W_{\varphi }f(a, b)\varPsi \mathrm{d}a \mathrm{d}b\right| \\&\quad =\left| \int _{D_\in f }W_{\varphi }X(a, b)\varPsi \mathrm{d}a \mathrm{d}b-f(a, b)\right| \\&\quad \le K\in + \int _{\left| W_{\varphi }f(a, b)\right| <\in }W_{\varphi }f(a, b)\varPsi \mathrm{d}a \mathrm{d}b\\&\quad \le 2K\in \end{aligned} \end{aligned}$$
(37)

This shows that over the whole range of a and b, the noise can easily be removed if the threshold is set as \(\le 2K\in \). If we choose \(K=0.5\), then \(\sigma ^2_X-\sigma ^2_f\le \sigma ^2_n\): the difference between the corrupted image and the clean image is lower than the noise intensity over the range \({(D_\in f \cup D^c_\in f )}\). That means the 2D-CT has already removed most of the noise. Hence, the clean image can easily be reconstructed, which directly establishes the robustness of the 2D-CT in the presence of noise.

To summarize: first, it was shown that, compared to the FFT, the CT has flexible time-frequency resolution and its frequency distribution can take any linear or nonlinear form. Second, in \(C(\tau , f)\) the amplitude spectrum is affected by noise and illumination; hence, to make the system more robust, the phase spectrum \(\phi _x(\tau , f, \alpha , \beta )\), which depends on the parameters \(\alpha \) and \(\beta \), is used for feature extraction, and the proposed method was shown to be unaffected by illumination as well. Then, the orthogonality of the Cochlear transform was established. Finally, the 2D-CT was also shown to be orthogonal, which directly establishes its robustness in the presence of noise; this is also verified experimentally.

2.4 Stage-wise steps followed in proposed work

The steps followed in the proposed work are given as an algorithm below. The graphical representation of this work comprises three main stages, shown in Fig. 1; a general block diagram of the proposed work is shown in Fig. 2. The first stage covers data collection, the second stage covers preprocessing of the databases to make the system invariant to hand rotation, and the third stage covers feature extraction and matching.

Fig. 2 A general block diagram of the proposed work

Stage 1: Collection of database

  1. Different palmprint databases are procured from standard sources for comparison and analysis.

Stage 2: Preprocessing

  1. Hand samples of the databases differ in position. To make the hand samples rotation invariant, the coordinates of the fingertips, the finger valleys and the centroid are calculated.

  2. Hand samples are straightened using the fingertips and the centroid.

  3. ROIs are then extracted using the finger valleys of the straightened hand samples.

Stage 3: Feature extraction and matching

  1. Apply adaptive histogram equalization (AHE) (Srivastava et al. 2016), based on the Rayleigh distribution, on each ROI. The dimensions of each ROI are kept as \(150 \times 150\) in the experiments. After preprocessing, the ROIs are partitioned into nonoverlapping windows of size \(15 \times 15\) each, which divides each ROI into 100 windows.

  2. For feature extraction, the image f(x, y) is convolved with \(\varphi _{a,b}(x, y)\) for different values of \(\alpha , \beta \) to evaluate \(T(\tau , f)\), as shown in Eq. 26. The normalized energy of each window is then calculated using Eq. 27. The final features \(C(\tau , f)\) are obtained as the negative derivative of the phase \(\phi _{x,y}(\tau , f, \alpha , \beta )\), i.e., \(C(\tau , f) = -\frac{\mathrm{d}\phi _{x,y}(\tau , f, \alpha , \beta )}{\mathrm{d}f}\), as shown in Eqs. 29 and 30.

  3. For identification, the recognition rate obtained with the KNN classifier is adopted.

  4. For verification, the Euclidean distance is used to calculate the scores between training and test samples. From these scores, the Receiver Operating Characteristic (ROC), Equal Error Rate (EER) and area under the ROC curve (AUC) are obtained for each database.
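The window partitioning and energy normalization of Stage 3 can be sketched as follows. This is a minimal illustration assuming a \(150 \times 150\) ROI; `partition_windows` and `normalized_window_energy` are hypothetical helper names, and the normalization of Eq. 27 is only approximated here.

```python
import numpy as np

def partition_windows(roi, win=15):
    """Split a 150x150 ROI into non-overlapping win x win blocks (100 total)."""
    h, w = roi.shape
    return [roi[r:r + win, c:c + win]
            for r in range(0, h, win)
            for c in range(0, w, win)]

def normalized_window_energy(windows):
    """Per-window energy, normalized over the whole ROI (a stand-in for
    the normalization of Eq. 27, not the paper's exact formula)."""
    energies = np.array([np.sum(w.astype(float) ** 2) for w in windows])
    total = energies.sum()
    return energies / total if total > 0 else energies

roi = np.ones((150, 150))
windows = partition_windows(roi)
energy = normalized_window_energy(windows)
print(len(windows), round(float(energy.sum()), 6))  # 100 windows, energies sum to 1
```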

3 Experimental results

In the simulations, the proposed 2D-CT method has been validated in both verification and identification modes. In verification, a person is validated against his or her own previously enrolled samples, i.e., 1:1 matching, while in identification, the system validates a person against all N enrolled persons, i.e., 1:N matching. The dimensions of each ROI are kept as \(150 \times 150\) in the experiments. After preprocessing, the ROIs are partitioned into nonoverlapping windows of size \(15 \times 15\) each, which divides each ROI into 100 windows.

3.1 Preprocessing

For feature extraction, the orientation of the data samples must be the same so that the same set of information is extracted. There are variations in sample position in the databases, as shown in Fig. 3.

Fig. 3 Samples of the IITD palmprint database

  1. Binarizing the palm images. First, a binary mask of the gray-scale hand image is prepared using Otsu's thresholding (Xu 2011). Mathematically, this passes the image through a low-pass filter and applies a threshold \(\tau \): \(I_{\mathrm{bin}}(x, y) = 1\) if \(I(x, y)\otimes L_{\mathrm{filter}}(x, y) \ge \tau \), otherwise \(I_{\mathrm{bin}}(x, y) = 0\), where \(I_{\mathrm{bin}}\) is the binarized image, I(x, y) is the original image and \(L_{\mathrm{filter}}\) is the low-pass filter. This is shown in Fig. 4a.

  2. Finding the boundary of the binarized image. This is done by a boundary tracing algorithm, which searches the pixel neighbors in the binarized image.

  3. Computing the centroid. The centroid of the binary region is calculated as \(Xc = (\hat{x}, \hat{y})\), where \((\hat{x}, \hat{y})\) are the arithmetic means of x and y over the binary region \(\mathfrak {R}\).

  4. Then, a thinning operation is applied on this mask to obtain a skeleton-like structure of the hand. Next, the coordinates of the end points of the thinned skeleton image are computed using the crossing-number logic, from which the fingertips and finger valleys are calculated to capture the most accurate and consistent ROI. After finding the fingertips and centroid, the original bounded image is masked. The fingertips are ordered by circular traversal in the clockwise direction with the centroid as the center of the hand, so that the first coordinates are those of the small finger and the last coordinates are those of the thumb. In this way, the coordinates of all the fingertips are obtained. With the help of the centroid and fingertips, the image is rotated such that the line joining the tips of the index finger and ring finger becomes horizontal (Fig. 4d). Next, the finger valleys are found using the rate of change of the slope of the boundary, and their coordinates are used to crop the region of interest (ROI) of each palm. This method can easily be applied to both right- and left-hand palmprints simply by changing the indexing of the fingers. The complete procedure is shown in Fig. 4.
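The binarization and centroid steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `otsu_threshold` and `binarize_and_centroid` are hypothetical helpers, the low-pass filtering stage is omitted, and the "hand" is a toy bright square.

```python
import numpy as np

def otsu_threshold(img):
    """Otsu's method on an 8-bit grayscale image: pick the threshold that
    maximizes the between-class variance of the intensity histogram."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize_and_centroid(img):
    """Binary hand mask plus its centroid (mean of foreground coordinates)."""
    mask = (img >= otsu_threshold(img)).astype(np.uint8)
    ys, xs = np.nonzero(mask)
    return mask, (xs.mean(), ys.mean())

# toy "hand": bright square on a dark background
img = np.zeros((60, 60), dtype=np.uint8)
img[20:40, 10:30] = 200
mask, (cx, cy) = binarize_and_centroid(img)
print(mask.sum(), cx, cy)   # 400 foreground pixels, centroid at (19.5, 29.5)
```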

Fig. 4 Preprocessing: a binarized image, b fingertips and centroid in the binary image, c indexing of fingertips: small, ring, middle, index, thumb, d fingertips of index finger and ring finger connected with a line, e straightening of the hand by making the joining line parallel to the horizontal, f masked image with valley points, g line joining the valley points made parallel to the horizontal, h extracted ROI

Fig. 5 Performance of the proposed feature extraction with different values of \(\alpha \) and \(\beta \) on the CASIA database

3.2 Databases for palmprint

In this work, three palmprint databases were used to validate the proposed 2D-CT method: the Hong Kong Polytechnic University (PolyU) database, the Chinese Academy of Sciences Institute of Automation (CASIA) database and the Indian Institute of Technology Delhi (IITD) database. The IITD palmprint database contains both left- and right-hand samples; the proposed method was applied to both hands, treating each as a separate database. As the resolution, illumination and acquisition setup differ across the databases, different environmental conditions are thereby included.

3.2.1 IITD database

IIT Delhi palmprint database version 1.0 contains left- and right-hand anterior samples of approximately 230 persons in the age group 14–56 years, with 5 to 6 samples per hand. From each group, 200 people having 6 samples were selected for the experiments. The database was acquired using a contact-less scanning system with a digital CMOS camera. This type of system is effortless and highly useful in an office environment; no pegs are used for hand placement.

3.2.2 CASIA database

The CASIA Palmprint Image Database contains 312 subjects with approximately 8 left- and right-hand images each. The database was acquired by a CMOS camera without any constraint and saved in 8-bit gray-level JPEG format. In the experiments, six right-hand images from every person were used to validate the proposed method.

3.2.3 PolyU database

The PolyU palmprint database contains anterior samples of approximately 386 persons with 7752 gray-scale samples overall. Every person has 20 samples, collected in two sessions separated by approximately 60 days. In this system, pegs are used to position the hand and remove position variance. From the database, only ten samples of each individual were selected for training and testing.

3.3 Identification

For identification, the recognition rate is adopted to evaluate the performance of 2D-CT. The recognition rate is the fraction of test samples that are correctly recognized by the identification system. Finally, a K-nearest neighbor (KNN) classifier (Srivastava et al. 2016) is used to classify the features. After preprocessing, 2D-CT features are calculated on each window with different values of \(\alpha \) and \(\beta \). The performance of the proposed feature extraction with different values of \(\alpha \) and \(\beta \) on the CASIA database is shown in Fig. 5. The results show that the 2D-CT features depend on the parameters \(\alpha \) and \(\beta \); hence, the phase spectrum gives resonant peaks when these parameters are chosen suitably.

To compare the performance of the proposed method with other palmprint authentication techniques, 2D-CT features are calculated on each window with the suitably chosen parameters \(\alpha = 2.6\) and \(\beta =0.034\). The proposed method is compared with DOC (Fei et al. 2016), DRCC (Xu et al. 2016), the Gabor transform (Chu et al. 2007), CompC Code (Kong and Zhang 2004), Ordinal Code (Sun et al. 2005), GMF (Arora and Srivastava 2015; Srivastava et al. 2016), Mean (Srivastava et al. 2016) and AAD (Srivastava et al. 2016). For the Gabor transform, a 5-scale, 8-orientation Gabor filter bank (40 Gabor filters) was used (Mohammad and Mahoor 2014).
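The identification protocol, nearest-neighbor classification with Euclidean distance over per-sample feature vectors, can be illustrated with a short sketch. The feature vectors below are toy values, not 2D-CT features, and `recognition_rate` is a hypothetical helper.

```python
import numpy as np

def recognition_rate(train_x, train_y, test_x, test_y):
    """1-NN identification with Euclidean distance: each test feature
    vector is assigned the label of its closest training vector."""
    correct = 0
    for x, y in zip(test_x, test_y):
        d = np.linalg.norm(train_x - x, axis=1)
        if train_y[np.argmin(d)] == y:
            correct += 1
    return correct / len(test_y)

# toy features: two well-separated subjects, four training and two test samples
train_x = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
train_y = np.array([0, 0, 1, 1])
test_x = np.array([[0.05, 0.1], [5.05, 4.9]])
test_y = np.array([0, 1])
print(recognition_rate(train_x, train_y, test_x, test_y))  # 1.0
```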

Table 1 Identification results of different feature extraction methods on different databases
Fig. 6 a Salt and pepper noise in the PolyU database, b Gaussian noise in the CASIA database, c speckle noise in the IITD database

Table 2 Identification results of the proposed method using KNN, SVM and random forest
Table 3 Average recognition rate of 2D-CT on addition of different noises

For the CASIA database, the results show that the recognition rates (RR) for 2D-CT, DRCC, DOC, CompC Code, Ordinal Code, GMF, AAD, Gabor and Mean are 98.66%, 98.1%, 97.8%, 97.6%, 96.1%, 94.4%, 93.2%, 90.2% and 91.98%, respectively. Results for the other databases are shown in Table 1. The PolyU database provides 10 samples per individual, and the training-to-testing ratio is taken as 6:4. Since the PolyU database is acquired with pegs, intra-class variations are very small; hence, the results obtained are higher than for the other databases. Results on the IITD Left and Right databases vary with the selection of training and testing samples and the group of individuals, whose total count is constant, i.e., 200. From the IITD database, 6 samples are selected randomly, with a training-to-testing ratio of 4:2. For the CASIA database, six right-hand images from every person are used to validate the proposed method, also with a training-to-testing ratio of 4:2.

First, it is shown that K-NN is worth considering and achieved better overall performance than SVM and RF. Second, compared to SVM, both K-NN and RF are very simple and well understood. This paper concentrates more on noise robustness. The proposed method was also formulated using SVM and random forest. The SVM with a polynomial kernel gave much better results than with a radial basis function kernel; therefore, only the polynomial-kernel results are reported. Among degree-d polynomials, the quadratic polynomial (\(d = 2\)) is reported, as it gave better results than \(d = 1\) and \(d = 3\). Random forest (RF), a pattern recognition method based on the ensemble learning strategy, is also reported, with a learning rate of 0.1. Identification results of the proposed method using KNN, SVM and random forest are shown in Table 2.

Biometric systems usually have to deal with noisy conditions that can occur during security checks, access control and attendance logging. In palmprint systems, user hands can be dirty, marked or dusty; in such cases, the recognition system should not reject a genuine user as an impostor. Hence, biometric systems must be robust in nature. None of the above-mentioned databases contains noise or disturbances. Hence, different types of noise, namely Gaussian noise, salt and pepper noise and speckle noise, are added to evaluate the proposed method under realistic conditions.

Fig. 7 Variation of the average recognition rate of 2D-CT in the presence of a Gaussian noise, b salt and pepper noise, c speckle noise

Fig. 8 Variation of the average recognition rate of different feature extraction methods in the presence of speckle noise on the CASIA database

To check the robustness of the proposed technique, different types of noise were intentionally added during the simulation study. Three types of noise were selected: Gaussian noise (with mean \(\mu =0\) and variance \(\sigma ^2=0.1\), \(0.3\), \(0.5\)), salt and pepper noise (with noise density 0.1, 0.2, 0.3) and speckle noise (with mean \(\mu =0\) and variance \(\sigma ^2=0.1\), \(0.3\), \(0.5\)). Figure 6 shows some noisy palmprint images with Gaussian, salt and pepper and speckle noise from the different databases.
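The three noise models can be reproduced with a short sketch, assuming images scaled to [0, 1]. The helper names are illustrative (they are not the paper's code, which used MATLAB), and the salt-and-pepper density plays the role of the noise intensity above.

```python
import numpy as np

def add_gaussian(img, var=0.1, rng=None):
    """Additive zero-mean Gaussian noise with the given variance."""
    rng = rng or np.random.default_rng()
    return np.clip(img + rng.normal(0.0, np.sqrt(var), img.shape), 0.0, 1.0)

def add_salt_pepper(img, density=0.1, rng=None):
    """Flip a fraction `density` of pixels to 0 (pepper) or 1 (salt)."""
    rng = rng or np.random.default_rng()
    out = img.copy()
    flips = rng.random(img.shape) < density
    out[flips] = rng.integers(0, 2, img.shape)[flips].astype(float)
    return out

def add_speckle(img, var=0.1, rng=None):
    """Multiplicative speckle noise: img + img * n with n ~ N(0, var)."""
    rng = rng or np.random.default_rng()
    return np.clip(img * (1.0 + rng.normal(0.0, np.sqrt(var), img.shape)), 0.0, 1.0)

rng = np.random.default_rng(1)
roi = np.full((150, 150), 0.5)
noisy = add_gaussian(roi, var=0.1, rng=rng)
print(noisy.shape)   # (150, 150), values clipped to [0, 1]
```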

Fig. 9 Variation of the average recognition rate of different feature extraction methods in the presence of Gaussian noise on the CASIA database

Fig. 10 Receiver operating characteristic of different methods on the CASIA palmprint database

Recognition rates between the training samples and noisy test samples are calculated using the K-nearest neighbor (KNN) classifier. Recognition rates of 2D-CT-based features for the left- and right-hand IITD palmprint database version 1.0, the CASIA database and the PolyU database under added noise are tabulated in Table 3. In Fig. 7a–c, it can be seen that the performance of the proposed system is highly robust and does not change much in the presence of noise. On varying the noise intensity, the maximum rate of change is only 0.027% for the 200 subjects of the IITD database and 0.0168% for the CASIA palmprint database, while for the PolyU database the maximum rate of change is quite low, i.e., 0.0068% over 386 subjects.

Table 4 Verification results of different feature extraction methods on CASIA palmprint database
Table 5 Verification results of different feature extraction methods on IITD Right palmprint database
Table 6 Verification results of different feature extraction methods on IITD Left palmprint database
Table 7 Verification results of different feature extraction methods on PolyU palmprint database
Fig. 11 Receiver operating characteristic of different methods on the IITD Left palmprint database

Next, the robustness of the proposed technique is compared with other feature extraction methods in the presence of speckle noise (with mean \(\mu =0\) and variance \(\sigma ^2=0.1\), \(0.3\), \(0.5\)) and Gaussian noise (with mean \(\mu =0\) and variance \(\sigma ^2=0.1\), \(0.3\), \(0.5\)) on the CASIA database, as shown in Figs. 8 and 9. It is observed that the proposed method is highly robust, with only a slow variation in recognition rate as the noise intensity is varied. DRCC also shows good performance due to its robust properties and the fusion of side and top orientation indices. In the simulations, both GMF and Ordinal coding showed good performance: GMF is based on a Gaussian membership function and is therefore inherently more stable under noise, while Ordinal coding relies on the orthogonality property of wavelets; both are thus highly uncorrelated with noise and more robust. Mean and AAD, on the other hand, are statistical features and are highly affected by noise. The performance of the Gabor transform and CompC coding is found to be lower than the others because they use the magnitude spectrum for feature extraction, and the magnitude is more affected by noise.

3.4 Verification

ROC curves are used to visually analyze the performance of the proposed method in verification. Typically, the ROC is the curve of the Genuine Acceptance Rate (GAR) versus the False Acceptance Rate (FAR), where \(\mathrm{GAR}=100-\mathrm{FRR}\) and FRR is the False Rejection Rate. FAR is the rate at which impostors are wrongly accepted, while FRR is the rate at which genuine subjects are wrongly rejected. To calculate the scores between training and test samples, the Euclidean distance is computed on the features obtained from each biometric modality. For verification, the EER is chosen as the performance measure; lower values indicate higher system performance (Xu et al. 2016). The EER is the point where FAR equals FRR. The area under the ROC curve (AUC) is also used to validate the results. The AUC has an important statistical property: the AUC of a classifier is equivalent to the probability that the classifier ranks a randomly chosen positive instance higher than a randomly chosen negative instance. The maximum value of the AUC is 1; the higher the value, the better the performance of the system.
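The EER computation described here can be sketched as a threshold sweep over genuine and impostor Euclidean-distance scores. `eer_from_scores` is a hypothetical helper and the score values are toy data; lower distance means a better match.

```python
import numpy as np

def eer_from_scores(genuine, impostor):
    """Sweep a distance threshold and return the (approximate) equal
    error rate, where FAR (impostors accepted) meets FRR (genuine
    pairs rejected)."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best = (1.0, 0.0)   # (|FAR - FRR|, candidate EER)
    for t in thresholds:
        far = np.mean(impostor <= t)   # impostors falling under the threshold
        frr = np.mean(genuine > t)     # genuine pairs rejected
        if abs(far - frr) < best[0]:
            best = (abs(far - frr), (far + frr) / 2.0)
    return best[1]

# toy Euclidean-distance scores: genuine pairs are closer than impostor pairs
genuine = np.array([0.1, 0.2, 0.25, 0.3])
impostor = np.array([0.8, 0.9, 1.0, 1.1])
print(eer_from_scores(genuine, impostor))   # 0.0 for perfectly separable scores
```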

Fig. 12 Receiver operating characteristic of different methods on the IITD Right palmprint database

Fig. 13 Receiver operating characteristic of different methods on the PolyU palmprint database

Figure 10 shows the ROC curves of the proposed 2D-CT and the other features on the CASIA palmprint database. It is clear that the ROC of the proposed technique covers the maximum area under the curve (AUC) and reaches 100% GAR faster than the ROCs of the other techniques. The AUC values are also shown in Fig. 10. While the convergence rates of the Mean and Gabor features are slower, DRCC and DOC show good performance, Ordinal Code and CompC Code show comparable performance, and the GMF-based features also perform well. As shown in Table 4, at \(\mathrm{FAR}=0.1\), the GAR is 98.74 for 2D-CT, 98.74 for DOC, 99.4 for DRCC, 96.4 for CompC Code, 96.53 for Ordinal Code, 88.93 for GMF, 86.89 for AAD, 82.65 for Gabor and 87.83 for Mean features. At \(\mathrm{FAR}=1\), 2D-CT reaches 99.22, while the others converge more slowly. Verification results for the IITD Right, IITD Left and PolyU databases are shown in Tables 5, 6 and 7.

In Figs. 11, 12 and 13, ROC curves are plotted for the IITD palmprint database (both right and left hands) and the PolyU palmprint database. The ROC of the proposed technique reaches 100% GAR faster and covers the maximum area under the curve.

3.5 Statistical performance

For a quantitative analysis of the performance, results are presented as identification results (recognition rate) in Table 1 and as AUC and EER in Tables 4, 5, 6 and 7. The recognition rate is the fraction of test samples correctly recognized by the identification system. 2D-CT showed better performance than the other methods, e.g., an improvement of 8.6–9.8% over Gabor features and 2.63% over DRCC. In terms of AUC and EER, 2D-CT is also found to be superior. However, this paper concentrates more on noise robustness; the performance under added noise can be seen in Figs. 8 and 9. To establish that the performance of the proposed method is superior to the existing methods, the standard deviation is calculated over the recognition rate, AUC and EER from the different databases, as shown in Table 8.

Table 8 Statistical analysis using standard deviation of Recognition rate (RR), AUC and EER
Table 9 Percentage improvement in standard deviation \( (\%) \varDelta \sigma \) of 2D-CT over DRCC and DOC
Table 10 Speed comparison

To further show the significance of the difference, the percentage improvement in standard deviation \( (\%)\, \varDelta \sigma \) of 2D-CT is calculated with respect to DRCC and DOC. The formula for \( (\%)\, \varDelta \sigma \) is given in Eq. 38. Table 9 shows the significant difference of the proposed method with respect to DRCC and DOC.

$$\begin{aligned} (\%) \varDelta \sigma = \frac{\sigma (x_{\mathrm{previous}})-\sigma (x_{\mathrm{proposed}})}{\sigma (x_{\mathrm{proposed}})}\times 100 \end{aligned}$$
(38)
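Eq. 38 is a one-line computation; the following sketch applies it to illustrative standard deviations (the numeric inputs are assumptions, not values from Table 9).

```python
def pct_improvement(sigma_previous, sigma_proposed):
    """Percentage improvement in standard deviation, per Eq. 38:
    ((sigma_prev - sigma_prop) / sigma_prop) * 100."""
    return (sigma_previous - sigma_proposed) / sigma_proposed * 100.0

# e.g., halving the standard deviation is a 100% improvement
print(pct_improvement(1.0, 0.5))  # 100.0
```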

3.6 Speed

To evaluate speed, the computational time of the proposed method is compared with the other mentioned methods on the PolyU database. All methods are implemented in MATLAB on a PC with a dual-core Intel i3 (2.40 GHz), 4.00 GB RAM and the Windows 7.0 operating system. Table 10 summarizes the comparison. The total computational cost of the proposed method is about 367.666 ms, which is comparable to the other methods. The code size of CompC coding and Ordinal coding is the same; however, the Ordinal Code scheme can perform filter-level combination, so filtering can be performed on only three orientations, which halves the feature extraction time compared to the competitive coding scheme. Mean, AAD and GMF are statistical methods, where GMF combines both Mean and AAD, so GMF takes roughly double the time of Mean and AAD. The speed of DRCC and DOC is slightly lower than that of the competitive code method, mainly because they extract an additional side code. The proposed method differs from the coding algorithms: it is also a statistical method but slower than GMF, AAD and Mean, as it concentrates more on noise robustness. There is thus a trade-off between speed and noise robustness; still, the speed is comparable with DRCC and DOC.

4 Conclusion

In this paper, 2D-CT, a powerful time-frequency transform for texture analysis, has been proposed. The performance of 2D-CT is validated using KNN with the Euclidean distance. The method has been tested on the IITD, PolyU and CASIA palmprint databases and achieves high accuracy. The proposed 2D-CT method is compared with several state-of-the-art methods, and the ROC curves show its superiority over the existing methods. The PolyU database is acquired with pegs and pins, so the intra-class variations of individual users are very small; hence, there are only small differences in EER and AUC between the mentioned methods. The proposed method is validated in the presence of noise and found to be very robust: on varying the noise intensity, the maximum rate of change is only 0.027% for the 200 subjects of the IITD database and 0.0168% for the CASIA palmprint database, while for the PolyU database the maximum rate of change is quite low, i.e., 0.0068% over 386 subjects. The use of different databases shows that the method is independent of environmental conditions.

5 Future perspectives

In the future, the proposed 2D-CT feature extraction method can be applied to other biometric modalities. At present, the parameters \(\alpha \) and \(\beta \) of the 2D-CT are chosen manually; these values could be optimized using an optimization algorithm such as a genetic algorithm or cuckoo search. Furthermore, fusing different modalities with the proposed feature extraction method could further improve system performance.