Keywords

1 Introduction

Fingerprints, consisting of the ridges and furrows on human fingers, have been one of the most describable biometric traits which are largely employed for authentication and identification tasks in the fields of civil and forensic systems. A number of features could be captured from a fingerprint to identify a person through a fingerprint recognition system [1]. Generally, the fingerprint recognition system operates in either verification mode or identification mode. For the latter one, identification of an unknown template operating in a 1:N matching process over a huge database usually consumes significant time in order to provide a reliable conclusion. Although state-of-the-art fingerprint matching algorithms are fast and accurate, the size of the database can be over one hundred million (e.g., the police database of a country) and it poses much more challenges for the recognition accuracy and efficiency.

To address the above challenges, fingerprint classification and fingerprint indexing are the most common solutions. In the terms of fingerprint classification, Henry distinguished fingerprints into 5 classes including left loop, right loop, whorl, arch, and tented arch [2]. However, the number of classes is small and fingerprints are unevenly distributed among them. On contrary, fingerprint indexing is a more efficient approach where fingerprints are specified with feature vectors. These feature vectors are generated through a similarity-preserving transformation and similar fingerprints are mapped into close points (vectors) in the multidimensional space. The indexing is performed by matching the query fingerprint against the template fingerprints in the database whose vectors are close to the query one. Subsequently, the top N most similar template fingerprints are returned. Since it is only necessary to match N candidates instead of every fingerprint of the database, the fingerprint indexing narrows the number of the large database effectively.

Fingerprint indexing is challenging and promising so that there have been lots of approaches to improve the related techniques, in which the involved features can be generally classified into minutiae and global features. In global feature methods, singular points or orientation fields represent the global information of ridges. In [3] and [4], the algorithms adopt singular points as features to index fingerprints. However, these techniques mostly require pre-alignment of fingerprints and it is difficult to extract reliable location of singular point from poor fingerprint images. In addition, feature vectors for fingerprints indexing are received by orientation fields [5, 6]. However, these features are still global characters and their discrimination is not as good as minutiae.

In minutia feature methods, the intrinsic idea is to establish a structure of minutiae that is of enough discrimination and robustness in presence of rotation and translation variations. Depending on these minutia features, there mainly exist two indexing techniques, namely hash-based indexing and Approximate Nearest Neighbor (ANN) indexing. The algorithm [7, 8] based on minutiae derives triangle or quadrangle geometric features and adopts simple hashing technique for searching. Unfortunately, these geometric features are more sensitive to noise and distortion. On the other, Minutiae Cylinder Codes (MCC) characterize a minutia neighborhood through encoding the neighborhood of each minutia into a fixed-length bit vector and the algorithm indexes by means of a typical kind of ANN indexing technique, Locality Sensitive Hashing (LSH) [9]. The representation of a minutia’s neighborhood in MCC seems quite complicated for the reason that it has a very high-dimensional feature vector. Furthermore, Locality Sensitive Hashing (LSH) is an approximate searching method which is not quite applicable for the cases requiring high accuracy rate.

In this paper, to our knowledge, the k-plet local pattern of minutiae [10] as feature is the first time to be imported in the field of fingerprint indexing. The k-plet representation is invariant under translation and rotation since it is based on its own local coordinate system. In addition, an efficient tree based indexing scheme designed for the k-plet representation is proposed to accelerate retrieval. In order to make this indexing scheme more robust, the multipath indexing strategy is employed which means that brother bins located in a certain range of the hit bin are all browsed at the stage of indexing. Furthermore, this indexing scheme is quite fast since it just need to look up index trees avoiding the complex computation. The k-plet local pattern is enrolled in k 3-level index trees, then a query only need to orderly index k trees and count the matched votes.

The rest of this paper is organized as follow: Sect. 2 introduces the k-plet local pattern representation while Sect. 3 describes the efficient indexing approach. In Sect. 4, experiments for testing the proposed algorithm are conducted on public datasets. Finally, Sect. 5 draws some conclusions.

2 K-Plet Local Pattern

In the field of fingerprint matching, an algorithm named as k-plet presents excellent performance, which consists a fixed-length vector by quantizing the distance and orientation difference between a central minutia and its neighboring minutiae, Fig. 1.

Fig. 1.
figure 1

K-plet local patterns of minutiae defined in a fingerprint [10]

Local pattern for indexing fingerprints is characterized based on the k-plet to get the local structural information in fingerprints. The main idea of k-plet is to represent a central minutia by using the nearest k neighbors of the central minutia. To solve the problem that minutiae are clustered and to maintain high connectivity in fingerprint image, k/4 nearest neighbors are sequentially selected in each of the four quadrant in local coordinate system of minutia \( m_{i} \). It is noteworthy that the resulting k neighbors need not necessarily be the k closest neighbors of minutia \( m_{i} \). The k-plet representation is invariant under translation and rotation since it is defined with its own local coordinate system. The advantage of these k neighbors is that it still performs well in the situation where the fingerprints exist missing or spurious minutiae. In this study, we considered 8 neighbors empirically (i.e., k = 8), Fig. 2(a).

Fig. 2.
figure 2

(a). An example of k-plet (b). IFV formed by a neighbor minutia and central minutia

The k-plet consists of a central minutia \( m_{i} \) and other k neighborhood minutiae defined as \( \{ m_{1} ,m_{2} ,m_{3} , \ldots ,m_{k} \} \). Each k-plet owns its local coordinate in which the central minutia \( m_{i} \) is the original point and the direction of the central minutia \( m_{i} \) is chosen as the positive direction of the X axis. Each neighboring minutia \( m_{j} (j = 1,2, \ldots ,k) \) is defined with its local radial coordinates \( (d_{ij} ,{\varphi }_{ij} ,\theta_{ij} ) \), where \( d_{ij} \) represents the Euclidean distance between minutia \( m_{i} \) and neighboring minutia \( m_{j} \); \( \varphi_{ij} \) is the relative orientation of minutia \( m_{j} \) with respect to the central minutia \( m_{i} \); \( \theta_{ij} \) represents the direction of the edge connecting the two minutiae and \( \theta_{ij} \) is also measured relative to orientation of minutia \( m_{i} \). The translation-invariant and rotation-invariant vector \( (d_{ij} ,{\varphi }_{ij} ,\theta_{ij} ) \) formed by minutia \( m_{i} \) and minutia \( m_{j} \) is represented as \( A_{j} = (d_{ij} ,\varphi_{ij} ,\theta_{ij} ) \), named as IFV (Invariant Feature Vector), Fig. 2(b). For each k-plet, the k neighboring minutiae \( \{ m_{1} ,m_{2} ,m_{3} , \ldots ,m_{k} \} \) and the central minutia \( m_{i} \) form k invariant feature vectors \( \{ A_{1} ,A_{2} ,A_{3} , \ldots ,A_{k} \} \). Thus, the k-plet local pattern is defined in terms of \( LP = \{ A_{1} ,A_{2} ,A_{3} , \ldots ,A_{k} \} \).

3 Indexing Approach

In the proposed approach, it consists of two stages, known as enrollment of template fingerprints T and indexing of query fingerprints I. Figure 3 illustrates a general flow chart of the proposed approach.

Fig. 3.
figure 3

The flow chart of the proposed indexing approach

At the stage of enrollment, k index trees are constructed for storing the k-plet local patterns orderly and per tree has 3 levels representing \( d \), \( \varphi \) and \( \theta \) respectively. Finally, the items \( \# \) of LP ID and image ID is enrolled in the corresponding leaf bins. At the stage of indexing, the corresponding branches are selected. In the leaf bins, the items which are hit will be cast one vote. If votes are more than a certain threshold, the two LPs are regarded as matched successfully. According to the number of matched LPs, the similarity scores \( S \) between input fingerprint I and template fingerprint T can be acquired by Eq. (1).

$$ S = \frac{{\sum\limits_{m} {N_{m} } }}{{N_{i} + N_{t} }} $$
(1)

Where \( N_{i} \) is the number of LP in input fingerprint I, \( N_{t} \) is the number of LP in template fingerprint T, \( N_{m} \) is the number of matched LPs between input fingerprint I and template fingerprint T. Finally, the output result can be achieved by sorting all template fingerprints with the scores in descending order through the whole database.

3.1 Enrollment of Template Fingerprints

An off-line enrollment is made with the purpose of filling all LPt of template fingerprints into k index trees before indexing query fingerprints. Firstly, k duplicate index trees are constructed respectively. Each tree consists of 3 levels, which are presented by: \( d \),\( \varphi \),\( \theta \) and there are numbers of fixed-length bins in every level. A \( LP_{t} = (A_{1}^{t} ,A_{2}^{t} ,A_{3}^{t} , \ldots ,A_{k}^{t} ) \) need to be filled into the \( j_{th} (j = 1,2, \ldots ,k) \) index tree depending on the \( j^{th} \) invariant feature vector \( A_{j}^{t} = (d_{j}^{t} ,\varphi_{j}^{t} ,\theta_{j}^{t} ) \). That is, a LPt need to be stored totally k times. In the \( j_{th} \) index tree, the hash function (2), (3) and (4) at the corresponding level is used to obtain the sequence number \( n_{j}^{d} \),\( n_{j}^{\varphi } \) and \( n_{j}^{\theta } \) of the hit bin respectively.

$$ n_{j}^{d} = H_{d} (d_{j}^{t} ) = d_{j}^{t} /L_{d} + 1 $$
(2)
$$ n_{j}^{\varphi } = H_{\varphi } (\varphi_{j}^{t} ) = \varphi_{j}^{t} /L_{\varphi } + 1 $$
(3)
$$ n_{j}^{\theta } = H_{\theta } (\theta_{j}^{t} ) = \theta_{j}^{t} /L_{\theta } + 1 $$
(4)

Here, \( L_{d} \),\( L_{\varphi } \) and \( L_{\theta } \) is the length of one bin at \( d \),\( \varphi \) and \( \theta \) level and this operator /represents the quotient of the integer division. After the three levels browsed, along with the route in all levels, the item \( \# \) of LP ID as well as fingerprint ID is registered into the last hit bin at the last level.

For example, suppose that we are going to fill the 3rd LPt in the 4th template fingerprint in database into the first index tree, which is instantiated as \( LP_{3,4} = (A_{1}^{t} ,A_{2}^{t} ,A_{3}^{t} , \ldots ,A_{8}^{t} ) \), where the 1st invariant feature vector \( A_{1}^{t} = (5,8,15) \). In addition, we assume that \( L_{d} = L_{\varphi } = L_{\theta } = 10 \). The enrollment process seems to be regarded as tracing the branches of the 3 levels in the 1st index tree. As \( d = 5 \), then \( n_{1}^{d} = 5/10 + 1 = 1 \), it need to be fell into the first bin of the first level, and all the rest branches should be traced along with \( A_{1}^{t} = (5,8,15) \). At the end, the item \( \# (3,4) \) of LPt ID and fingerprint ID is enrolled into the second bin of the last level, Fig. 4. Similarly, depending on the following \( A_{j}^{t} = (j = 2,3, \ldots ,8) \), the item \( \# (3,4) \) should be enrolled into the rest index trees by order (i.e., the 2nd, 3rd,…,8th index tree).

Fig. 4.
figure 4

An example of enrolling an LPt into the 1st hash table

Summary of the off-line enrollment process is given in Algorithm 1.

3.2 Indexing of a Query Fingerprint

Indexing of a query fingerprint aims to obtain the similar template fingerprints by looking up k established index trees. It is similar to the process of enrollment. A \( LP_{i} = (A_{1}^{i} ,A_{2}^{i} ,A_{3}^{i} , \ldots ,A_{k}^{i} ) \) is need to be browsed into \( j_{th} (j = 1,2, \ldots ,k) \) index tree depending on the \( j^{th} \) invariant feature vector \( A_{j}^{i} = (d_{j}^{i} ,\varphi_{j}^{i} ,\theta_{j}^{i} ) \). Within the \( j^{th} \) index tree, each sequence number \( n_{j}^{d} \),\( n_{j}^{\varphi } \),\( n_{j}^{\theta } \) will be got by the hash function (2), (3), (4) separately. One bin in the last level will be searched after three levels’ routing. All the template items filled in this bin will be recorded into a vote box, moreover the number of votes of these items will plus one. After searching all k index trees, the template items whose the number of votes is more than \( T \) will be treated as successfully matched with this LPi. According to the number of matched LPs, the similarity scores \( S \) between input fingerprint I and template fingerprint T can be calculated by Eq. (1). Finally, the output result can be achieved by sorting all template fingerprints with the scores in descending order through the whole database.

Summary of the on-line indexing process is shown in Algorithm 2.

3.3 Multipath Indexing Strategy

In practice, there are some inherent variations in the fingerprints, including translation, rotation, distortion and other noise. They get reflected in the indexing procedure and thus reduce the overall accuracy of the system. To account for errors that might be caused due to such variations, we implement the multipath techniques in our indexing scheme.

At the stage of indexing, we select the hit bin and also its brother bins according to (5), (6) and (7) at different levels.

$$ \left| {n_{d} - n_{d0} } \right| \le T_{nd} $$
(5)
$$ \left| {n_{\varphi } - n_{\varphi 0} } \right| \le T_{n\varphi } $$
(6)
$$ \left| {n_{\theta } - n_{\theta 0} } \right| \le T_{n\theta } $$
(7)

Where \( T_{nd} \), \( T_{n\varphi } \) and \( T_{n\theta } \) are some certain thresholds. That is to say, if bin \( n_{d0} \) is hit at level \( d \), all its brother bins between \( (n_{d0} - T_{nd} ,n_{d0} + T_{nd} ) \) are also selected. Finally, in this way, a LPi will search certain numbers of leaf bins so that it increasing the probability of indexing the correct item. Figure 5 gives a general example, where \( T_{nd} = T_{n\varphi } = T_{n\theta } = 1 \).

Fig. 5.
figure 5

An example of multipath indexing scheme. Circles are hit bins and triangles are their brother bins.

4 Experimental Results

This section describes experiments carried out to evaluate the proposed algorithm (LP + Idx) and to compare it with the algorithm (OrigLP) of original k-plet local pattern without indexing. As well, the time analysis of the proposed algorithm are given at the end.

4.1 Datasets

The experimental verification of LP + Idx and its comparison with OrigLP are performed based on three standard databases:

  • FVC2002 DB1 database [11]: it contains 800 fingerprints, eight impressions for each of 100 distinct fingers, with the images of 388*374 pixels.

  • NIST DB4 database [12]: it contains 4000 fingerprints, two different fingerprint instances (F and S) for each of 2000 distinct fingers, with the images of 512*512 pixels.

  • NIST DB14 database [13]: we choose the first 2000 fingerprints, two impressions for each of 1000 distinct fingers, with the images of 832*768 pixels.

4.2 Evaluation Indicators and Setup

Typically, Correct Index Power (CIP) and Penetration Rate are adopted to evaluate the accuracy and efficiency. Here \( CIP = N_{ci} /N_{d} \), where \( N_{ci} \) is the number of correctly retrieved input query fingerprints, \( N_{d} \) is the number of all input query fingerprints. Obtaining a higher CIP and a lower Penetration Rate is the common target of fingerprint indexing algorithm.

In the experiments, each database is equally divided into two parts: input fingerprints and template fingerprints. In FVC2002 DB1 database, the first 4 impressions of each individual finger are selected as the input fingerprints while the rest 4 impressions are regarded as the template fingerprints to build the index trees. Similarly, the first and second impressions of each individual finger in NIST DB4 and DB14 database are respectively chosen as input and template fingerprints.

For the feature extraction, an open investigation named as FM3 [14] is adopted to provide the reliable manual-marked minutiae in FVC2002 DB1 database. Meanwhile, a commercial automatic fingerprint identification system GAFIS [15] is used for the minutiae extraction in NIST DB4 and DB14, which is widely employed in Chinese criminal investigation departments.

4.3 Results and Discussions

Figures 6 and 7 show the trade-off between Penetration Rate and CIP in FVC 2002 DB1, NIST DB4 and NIST DB14 where k = 4 and 8, respectively. Table 1 compares the algorithm LP + Idx with OrigLP at some certain CIP in FVC 2002 DB1. In addition, Table 2 lists the time factors in FVC2002 DB1, NIST DB4 and NIST DB14.

Fig. 6.
figure 6

Indexing performance in FVC 2002 DB1, NIST DB4 and NIST DB14, where k = 4

Fig. 7.
figure 7

Indexing performance in FVC 2002 DB1, NIST DB4 and NIST DB14, where k = 8

Table 1. Indexing performance of LP + Idx and OrigLP at certain CIP in FVC 2002 DB1

From Figs. 6 and 7, it is noteworthy that the indexing performance in FVC2002 DB1, NIST DB4 and DB14 is quite outstanding. Moreover, it indicates the performance in FVC2002 DB1 looks much better than the one in NIST DB4 and DB14. It is mainly induced by two factors: (1) the image in database of FVC2002DB1 is more clear and smaller than the one in NIST DB4 and DB14. (2) Comparing with the only one template fingerprint for each individual input fingerprint in the database of NIST DB4 and DB14, there are four template fingerprints for each one in the database of FVC2002 DB1, which largely increases the probability of matching one input query fingerprint successfully.

In addition, with the increasing of the number of neighboring minutiae in LP, the discrimination of LP becomes stronger than before. Indeed, it achieves higher correct index power with a lower penetration rate. However, it will slow down the runtime and consume more memory resource. Consequently, we consider 8 neighbors empirically to achieve a balance between the two aspects (i.e., k = 8).

Furthermore, Table 1 compares the algorithm LP + Idx with OrigLP at some certain CIP in FVC 2002 DB1. The comparison demonstrates that the performance of LP + Idx is almost the same to OrigLP and verifies the robustness of LP + Idx.

For real-time fingerprint recognition system, time is another critical factor related to the efficiency. Consequently, the time factors in FVC2002 DB1, NIST DB4 and NIST DB14 are investigated and listed in Table 2. The time tests running on 2.40 GHz Intel Core CPU are implemented in C++ without particular optimizations. It is obvious to reveal that proposed algorithm LPIdx is very fast and efficient since these time factors are much faster than OrigLP.

Table 2. Indexing time in FVC2002 DB1, NIST DB4 and NIST DB14

5 Conclusion

In summary, we creatively adopt the k-plet local pattern for fingerprint indexing, which is translation-invariant and rotation-invariant. Then, according to these local patterns, k efficient index trees are constructed based on enrollment and indexing stages. The multipath indexing strategy ensures the accuracy and robustness while simple look-up operation makes it quite fast and effective. The performance testing proves that the proposed algorithm is beneficial for fingerprint indexing as it achieves high correct index power with a low penetration rate.

As our future focus, a completely hierarchical fingerprint indexing with other effective features, is expected to develop for large scale fingerprint database.