Keywords

1 Introduction

Signature is one of the significant and most popular Biometric for person authentication and identification. It has been widely accepted in banking and commercial systems. Unlike other authentication schemes that uses password or PIN, signatures can not be forgotten. The signature verification systems can be categorised as online and offline verification systems. Offline signature verification can be either writer dependent and writer independent. “In writer independent approach the classifier is trained only once with genuine and forged signature samples of varying number of writers” [3]. whereas in writer dependent approach for every newly added signer class, the classifier has to be retrained. The offline signature verification becomes more challenging since it lacks dynamic information. Due to the wide acceptance of the signature biometrics, there is a high demand for the powerful feature extraction technique for the offline signature verification systems. The features extracted can be either local or global in nature. In our approach we capture both global and local features using spatial pyramid image representation with Taylor Series Expansion (TSE) as the feature extraction technique. Even though there are an ample number of algorithms in the literature, designing an accurate algorithms for offline signature verification is still a challenging task. This motivates the researchers to come up with better algorithms. The following section presents a brief review of related works in the state of art literature.

2 Literature Review

Over the past two decades research in signature verification has drawn the attention of a lot of researcher, contributing in the development of this area. The Local Binary Pattern (LBP) and HOG based technique [17] partitions the signature image into blocks using the cartesian and polar coordinate systems. The features are then stored as a histogram. The Blockwise Binary Pattern (BBP) [13] takes 3 \(\times \) 3 neighbourhood of every point of the signature image and uses histogram representation for representing the local BBP features. The deep multitask metric learning (DMML) [16] is based on multitask and transfer learning. This approach trains a distance metric that enables the model to learn user-dependent factors. The Writer-independent [8] approach mainly focuses on the shape of the signature and considers the spatial distribution of a background pixel around foreground pixel. For classification this approach uses the Multi-layer Perceptron and the Support Vector Machine classifiers. The inter point distances is computed by Distance Moments [7] in order to captures the structural and temporal information. Shikha et al. proposed a method “based on Multi-layer perceptron (MLP) and Self Organizing Map groups of neural networks (SOM)” [15]. Shekar et al., [12] proposed a morphological pattern spectra and Bhattacharyaa et al., [4] have proposed an pixel matching technique (PMT). Yasmine et al., [6] proposed a one class support vector machine. The nature of similarities and dissimilarities of both the genuine and forge signature by the Deep Multitask Metric Learning (DMML) [16] approach that uses both the writer dependent and writer independent approaches. The bag-of-visual words (BoVW) [10] uses forensic document examiners (FDEs) cognitive process for feature extraction by extracting KAZE features.

The organization of the paper as follows. The proposed approach is brought down in the Sect. 3. The experimentation and discussion of the results are presented in Sect. 4 followed by the conclusion in Sect. 5.

3 Proposed Approach

The proposed methodology uses the spatial pyramid image [5] technique on the two dimensional signature images. The partial sum of second order Taylor Series Expansion (TSE) is used as the feature extraction technique to obtain the local features of the input image. The spatial pyramid is an extension of the bag of features that consists of collection of the orderless features represented in the form of histograms, \(FF=\left\{ Hi \mid i = 1,2, \ldots , n\right\} \). Here, a given image is partitioned into sub regions recursively. The features in the form of histograms from the image and its sub regions are extracted. Then these orderless collection of histograms representing local features are concatenated to form a single histogram representing that image. Let F be the features extracted by the image I and \(H_{I}=\left\{ h_{i} \mid i= 1,2 \ldots , m \right\} \) be the histogram representation of the features, where m is the number of bins in the histogram. The image I is partitioned into N blocks, say \(B^{ij}\), for \(i=1, \ldots , N\) and \(j=1\), and j represents the level L. From each \(B^{ij}\), features are extracted and are represented in the form of histogram \(H_{B^{i}}\) and the partitioning and extraction of local features from sub blocks is performed recursively for \(\textit{L}\) levels. We have used \(N=4\) and \(\textit{L}\) =3 in our approach. The histogram \(H_{I}\) and the histograms, \(B^{ij}\) obtained by the subblocks at various levels together forms the collection of features representing that image. The features are obtained from the image and its subregions using the same feature extraction technique. The subregions are then recursively partitioned forming orderless statistics of low-level image features in these subblocks. The Fig. 1 illustrates the spatial pyramid applied on a signature image.

“The features extracted from the signature are the partial sum of second order Taylor Series Expansion (TSE). The Taylor series expansion of function f(x) at a point \(x=a\) computes the function in a small neighbourhood of a point a. The necessary condition is that the f(x) must be a regular function i.e f(x) is continuously infinitely differentiable in the neighbourhood. The equation for infinite Taylor series expansion of a regular function f(x) at \(x=a\) is given below” [14]

$$\begin{aligned} f(x) \equiv \sum _{n=0}^{\infty } \frac{f^{n}(a)}{n!} (x-a)^n \end{aligned}$$
(1)

The function can be calculated at all the points in a neighbourhood of a point say \(a = (x,y)\), provided the function value and its derivatives are known at point a. The above equation can be written as,

$$\begin{aligned} f(x) \equiv \sum _{n=0}^{N} \frac{f^{n}(a)}{n!} (x-a)^n +R_{n} \end{aligned}$$
(2)

where \(R_{n}\) is the Taylor’s remainder. If we assume \(R_{n}\) tends to zero as \(n\rightarrow \infty \), we can obtain the approximation for f(x) by the partial sum of TSE computed with N terms (where N is finite) within a small neighbourhood. This property of TSE is used in this approach to obtain the local signature features.

Fig. 1.
figure 1

Image courtesy [9]

Spatial pyramid with 3 levels. Level 0: Signature image, level 1: Image is partitioned into 4 blocks. Level 2: Image is partitioned into 16 subblocks,

In the papers [2] and [1] only the second and third coefficients of TSE is used. This is based on the first and second order derivatives calculated individually at eight different scales. This is encoded based on the zero crossings of the convolution outputs. The first and the second derivatives represents the minute local information and the information about the concavity and convexity of the edges respectively. The global information within a neighbourhood is captured by the derivatives of the higher order. It is verified that the fifth and higher order derivatives do not contribute much. In our experimentation we have taken only upto third order derivatives because there are only minor variations between fourth and fifth derivatives. “An example of the derivative outputs of a 1D signal of the signature sample from CEDAR database is presented in Fig. 2. The range of values on the Y axis specifies the range by which the derivatives differ and this becomes wider as the order becomes higher” [14].

Fig. 2.
figure 2

(a) 1D signal of the signature sample from CEDAR database; (b)–(f) First to fifth derivatives of this signal in order. The vertical axis specifies the range through which the derivatives vary. The range becomes wider as the order becomes higher.

“To compute the horizontal and vertical higher order derivatives we have extended Sobel’s kernel using the coefficients of \(n^{th}\) order binomial expansion. i.e. the elements of the \(n^{th}\) order kernel are the elements of the \(n^{th}\) row of Pascal triangle which are devised by the coefficients of the binomial expansion \( (a+b)^{n} \), a and b being unity” [14]. For example, the extended kernels to compute third order derivatives along the X and Y direction are given below.

\( \begin{bmatrix} -1&{}-3&{}-3&{}-1\\ 0&{}0&{}0&{}0\\ 0&{}0&{}0&{}0\\ 1&{}3&{}3&{}1 \end{bmatrix} \quad \begin{bmatrix} -1&{}0&{}0&{}1\\ -3&{}0&{}0&{}3\\ -3&{}0&{}0&{}3\\ -1&{}0&{}0&{}1 \end{bmatrix} \)

The summation of horizontal and vertical partial sum of TSE forms the features for that point \(x=a\). The features are stored as a histogram of 10 bins.

We have used weighted histograms. The weight associated with the histogram is directly proportional to the depth of the level. In other words, the weight associated with the level \(\textit{l}\) is set to \(\frac{1}{2^{L - \textit{l}}}\), where L is the total number of levels in the pyramid and \(l=\left\{ 0,1 \ldots L-1 \right\} \). The equation is given below.

$$\begin{aligned} HH = H_{I} + \sum _{j=0}^{L-1} B^{ij} \end{aligned}$$
(3)

The best results were obtained when the number of levels was \(L=3\) and the number of bins in the histogram was m = 10. We obtain a single histogram at level l = 0 from the signature image, 4 histograms from the level \(l=1\) and 16 histograms from the level \(l=2\). Hence we have an orderless collection of 21 histograms. We have carried out experimentations with the number of bins in the histogram as \(m=\left\{ 10, 50, 100, 200\right\} \). We have obtained the best results with \(m=10\). We concatenate the histograms to form a single combined histogram with 210 features. We have performed SVM based classification.

Support Vector Machines (SVM) learns a linear decision boundary for a given training set. It finds a decision plane that maximizes the margin. “Given a set of n training samples \(\lbrace (x_{i}; y)\rbrace _{i=1}^{n}\) where \(x_{i} \in R^{L}\) is drawn from a domain X and each of the label \(y_{i}\) is an integer from \(Y = \lbrace 0,1\rbrace \). For a given instance of label pairs \(\left( x_{i},y_{i}\right) , i = 1, \ldots , \textit{l}\) where \(x_{i} \in R^{n}\) and \( y \in \lbrace 0, 1\rbrace ^{l},\) the SVM needs the solution for the following optimization problem” [12]:

$$\begin{aligned} {min}{_{\left( W,b,\xi \right) }} \frac{1}{2}W^{T}W + C\sum _{i=1}^{l}\xi _{i} \end{aligned}$$
(4)

Subjected to

$$\begin{aligned} y_{i}\left( W^{T}\phi \left( x_{i} \right) + b\right) \ge 1-\xi _{i} \end{aligned}$$
(5)

and

$$\begin{aligned} \xi _{i} \ge 0. \end{aligned}$$
(6)

“The function \(\phi \) maps the training features \(x_{i}\) into a higher dimensional space. The SVM finds a linear separating hyperplane with the maximal margin in the higher dimensional space. C is the penalty parameter. The kernel trick is used to transform a lower dimensional data into a higher dimensional one. Since SVM is a binary classifier, for classification of N classes, N SVM classifiers are needed. Hence, in the proposed work number of SVM classifiers is equal to the number of writers [12]”.

4 Experimental Results and Discussions

In order to reveal the accuracy of the proposed approach, we have conducted various experimentations on offline signature datasets like CEDAR and MUKOS (Mangalore university Kannada Off-line Signature). The CEDAR dataset consists of 55 Signers with 24 genuine and 24 skilled Forgery images for each signer. MUKOS dataset consists of 30 signers with 30 genuine and 15 skilled forged Signature images.

Each signature dataset is divided into training and testing sets. We have conducted 4 sets of experiments.

  • Set-1: The first 10 genuine and 10 forged images are taken as training set and rest are taken as testing sample set.

  • Set-2: The first 15 genuine and 15 forged images are taken as training set and rest are taken as testing sample set.

  • Set-3: 10 genuine and 10 randomly chosen forged images are taken as training set and rest are taken as testing sample set.

  • Set-4: 15 genuine and 15 randomly chosen forged images are taken as training set and rest are taken as testing sample set.

Table 1. Experimental results obtained for CEDAR dataset

Experimental Results on CEDAR Dataset

We have carried out 4 sets of experiments on CEDAR dataset as explained in the previous section. The results are presented in Table 1.

In order to compare the accuracy of the proposed method with the other methods in the literature, we have tabulated the results in the Table 2.

Table 2. Experimental results for CEDAR dataset - a comparison:
Table 3. Experimental results for MUKOS dataset
Table 4. Experimental results for MUKOS dataset- a comparative analysis

Experimentation on MUKOS Dataset

We have carried out 4 sets of experiments on MUKOS dataset as explained in the previous section. The results are presented in Table 3. In order to compare the accuracy of the proposed method with the other methods in the literature, we have tabulated the results in the Table 4.

5 Conclusion

In this paper we have explored Spatial Pyramid image representation based technique with Taylor Series Expansion for offline signature verification. Here, the given signature image is partitioned into sub blocks recursively and the local features in each sub blocks are computed and are stored as histogram. The histograms of the signature image and subblocks at various levels are concatenated to form a single histogram. Thus, the obtained spatial pyramid feature vector is an extension of an orderless bag-of-features consisting of collection of histograms concatenated to form a single histogram. The partial sum of second order TSE calculated with N number of terms (where N is finite) within a small neighbourhood, gives approximation for the function and extracts the localised structural features of signature. The Experimental results reveals the performance of the proposed approach.