1 Introduction

Biometric identification is the task of searching a biometric collection to find the best match for a biometric trait and then checking whether the two traits come from the same individual. Despite the proliferation of biometric identification, there are increasing concerns about its associated privacy and legal issues, since biometric data are highly sensitive and cannot be revoked or replaced once leaked. Appropriate security and privacy protection mechanisms must be in place to defend against deliberate or inadvertent disclosure or misuse of biometric data. Privacy-preserving biometric identification is challenging given the requirements of practical systems for security, efficiency, and scalability. A hash function is an essential building block for the protection of information and plays a significant role in cryptography: it takes a message as input and produces an output referred to as a hash value (Norouzi et al. 2014). Un-keyed hash functions and keyed hash functions are the commonly used types. Traditional hash functions, for example MD5 and SHA, consist of logical operations or multi-round iterations of available ciphers (Al-Riyami et al. 2016).

Hashing comprises significant approaches for storing and accessing data in large information systems. Hash function techniques pinpoint an item of information in a table by transforming the search key directly into a table address (Wan et al. 2016). Multimedia hashes are used for content retrieval from databases and for image and video watermarking (Sun and Zeng 2014). A biometric system performs pattern identification by acquiring biometric data from an individual; a feature set is extracted from the captured data and compared against the templates established in the database (Kavsaoğlu et al. 2014).

Biometric systems that use a single trait for pattern identification are known as unimodal biometric systems; they suffer from unacceptable error rates, spoof attacks, and noisy sensor data (Tome et al. 2014). To overcome these problems, multi-biometric systems are used to considerably enhance identification performance, raise the degrees of freedom, prevent spoof attacks, improve population coverage, and reduce the failure-to-enroll rate (Telgad et al. 2014). Distance and similarity measures are significant in scientific fields such as machine learning, decision making, market prediction, and pattern identification (Jiao et al. 2015). The similarity measure is calculated directly over the complete sets of sample points.

The similarities among all pairs of sample points are studied based on their distances. Such a similarity measure is more robust than simple tracking methods such as template matching or the sum of squared differences (SSD) (Gupta and Gupta 2015). Clustering uses the similarity between two data items as a key element; the data may take different forms, including raw values of equal or unequal length, transition matrices, and vectors of feature-value pairs (Ozturk et al. 2005). Dynamic time warping (DTW) is used to verify signatures: the feature space is precisely connected with the set of similarity measures that can be used in the signature verification process (Doroz et al. 2016). A template matching technique based on the cross-correlation coefficient is employed to generate cloud motion vectors, but the performance of template-based matching is reduced by false maxima in the correlation surface (Petrou and Tian 2016).

The normalized fuzzy similarity measure (NFSM) algorithm is used to handle nonlinear distortions. The template and input images are aligned in an initial stage, after which the NFSM technique calculates the similarity between them; the resulting output was effective and consistent (Wang and Qin 2015). The biometric-hash framework clarifies the incorporation of external and biometric information through a random multi-space quantization (RMQ) technique (Topcu et al. 2016). The likelihood ratio test is used to fuse match scores in a multi-biometric system and to reduce the false reject rate (FRR) at a specified false accept rate (FAR) (Thai et al. 2016). Elastic distortions in fingerprints are reduced by a minutiae matching algorithm that uses thin-plate splines to handle nonlinear distortions (Si et al. 2017).

Fingerprints are identified using the Minutia Cylinder-Code (MCC), which relies on a robust discretization of the neighborhood of every minutia into a 3D cell-based structure known as a cylinder (Jain and Prasad 2016). Optical and capacitive sensors are fused in a multi-sensor fingerprint verification scheme; such a multi-sensor system can outperform the leading optical sensor alone (Li 2016). A digital watermarking approach with face and demographic text data as multiple watermarks is used to preserve the integrity of a fingerprint image: the discrete wavelet transform embeds the watermarks into selected regions of the fingerprint image, and the degree of similarity is calculated using human-visual-system metrics and pixel-based metrics (Nair and Aruna 2015). Orientation codes and polyline features are used in fingerprint matching; these features are defined over fingerprint minutiae, macro-adjustment patterns, and micro-ridge representations, and they serve in different phases of fingerprint matching, such as minutiae pairing, matching-rate fusion, alignment, and matching score computation (Zhang et al. 2016).

Cloud security can be provided by biometric techniques such as iris and fingerprint authentication. Although these biometric methods can provide personal identification, they entail high infrastructure cost, and access privileges cannot be transferred among trusted users.

Contribution The major contribution is to reduce the memory consumption of the entire system. To this end, we introduce Minkowski distance-based matching with Merkle hash tree authentication. This methodology provides stronger security authentication through the Merkle hash tree: the hash function authenticates the extracted data, and memory consumption is reduced by storing only the generated keys. The proposed work is well suited to biometric applications with large data volumes, since the system's local storage requirement is reduced by keeping the data in cloud storage. Finally, the stored key is compared with external data using the Minkowski distance. Thus, cloud data storage is enhanced with minimal local storage, and the proposed range matching provides more accurate performance than existing methods.

The remainder of this manuscript is organized as follows. In Sect. 2, we review the work related to biometric matching. In Sect. 3, the proposed methodology for improving matching accuracy is described. Section 4 presents the experimental setup of our evaluation, discusses the strategies for combining the relevant criteria, and reports the obtained results. The conclusions of this paper are given in Sect. 5.

2 Related works

In large-scale fingerprint databases, fingerprint indexing is critical for retrieving fingerprints efficiently. Yijing et al. (2016) developed a technique to register fingerprints into a unified, finger-like structure. A fingerprint pose estimation algorithm was used for registration, with ridge orientation data serving as features in place of singular points, which are frequently corrupted by noise. The pose estimation algorithm improved minutiae matching through a global spatial constraint and was combined with locality-sensitive hashing over the MCC descriptor. Scalability was tested on large databases.

The distance measure is an important component of many computer vision algorithms. Distance measures commonly assume that feature components are independent and identically distributed; however, the heterogeneous sources encountered in the real world are not identically distributed. Swaminathan et al. (2017) proposed a statistical study of the feature components of local image descriptors. The Poisson-binomial radius (PBR), a bin-to-bin distance, was used to test the influence of the unified distribution assumption. Twelve benchmark datasets were used in the performance evaluation, and the results showed improved performance on recognition and classification tasks.

Cancelable biometrics enhances the security of biometric-based authentication; however, the binary biometric representations used in cancelable biometrics had weaknesses in their protection. Wang et al. (2017b) developed a non-invertible transform, the partial Hadamard transform, to protect the binary biometric representation. Its prominent feature is that the stochastic distance between binary vectors is preserved after the transformation. The performance, revocability, and diversity of the cancelable fingerprint templates were improved using this technique.

The strength of a fingerprint authentication system depends on the features extracted from the fingerprint images. Sasirekha and Thangavel (2016) developed a method to obtain a set of invariant moment features from the approximation coefficients in the wavelet domain. The stationary wavelet transform (SWT) was used to denoise the fingerprint images, and the denoised images were enhanced by the short-time Fourier transform. The region of interest (ROI) was located by a complex filter and decomposed by SWT. The decomposed images were subdivided to reduce noise and nonlinear distortion, and the computational complexity was diminished by using a simpler distance measure.

Orientation fields (OFs) are used in fingerprint image enhancement, fingerprint indexing, fingerprint matching, and fingerprint registration. Gupta and Gupta (2016) developed a new OF regularization algorithm based on a variational technique, which preserves the unaffected OF in areas of uniform flow while reconstructing the OF in bad-quality areas. Because the estimated OF can be spurious near singular points, it was further refined using OF modeling based on a Fourier basis, which improved performance.

Alyasseri et al. (2017), Wang et al. (2017a), and Abualigah et al. (2018a, b, c, d) proposed denoising and clustering algorithms for improving performance on various databases. Li et al. (2017) proposed a method for accuracy enhancement. Abualigah et al. (2018a, b, c, d) introduced a machine learning model for performance analysis. Abualigah et al. (2016, 2017a, b, c, d), Abualigah and Hanandeh (2015), and Abualigah and Khader (2017) used different optimization algorithms with hybridization to obtain accurate results, employing various datasets to demonstrate the optimization performance.

3 Proposed biometric recognition in cloud computing

Cloud computing is one of the emerging technologies that carries network users to the next level. Security is one of the critical challenges it faces, and biometrics has proven efficient for achieving secure authentication. Biometric recognition is therefore proposed to control access to the cloud platform, and a new attribute is created to handle authentication information in the infrastructure. The proposed system model uses a new distance concept for a biometric security system based on fingerprint recognition. It automates the verification of a match between two human fingerprints; fingerprints are among the most commonly used biometric traits for identifying an individual and verifying identity. The proposed method consists of two phases: an enrollment phase and a recognition phase. A common approach in a biometric system is to store all captured biometric images during enrollment and then perform authentication using a matching process, as illustrated in Fig. 1.

Fig. 1 Schematic representation of the proposed method

The preprocessing stage of the enrollment phase consists of three steps: binarization, thinning, and minutiae detection. During sensing, the biometric trait may be affected by many challenges, so the proposed methodology requires several successive modules to handle these problems. In this process, the fingerprint template is formed through minutiae feature extraction. A Merkle hash tree is used to authenticate the data used for biometric identification, and key generation takes place to reduce memory consumption in the cloud system.

In this process, the extracted features are the input to the Merkle hash tree. The signature is generated and stored in the cloud server. For authentication, the user has to submit the adjacent and sibling shares of the fingerprint template. The signature is recreated by the cloud service and tested against the signature deposited in the cloud. In the recognition stage, the biometric trait is presented again to identify the authenticated user; this template matching uses a Minkowski distance-based authentication strategy.

3.1 Enrollment phase

3.1.1 Biometric minutiae extraction

A fingerprint is an individual identifier, and fingerprint technology may be combined with other authentication methods. Two structures are widely used: the ridge bifurcation and the ridge ending. The most advantageous technique employed in biometric recognition is minutiae feature extraction.

In this paper, each observed minutia of the fingerprint biometric \( BK_{i} \) is described as

$$ m_{i} = (x_{i} ,y_{i} ,t_{i} ) $$
(1)

where \( x_{i} \) and \( y_{i} \) are the coordinates of the minutiae point and \( t_{i} \) is the type of minutiae point (ridge ending or ridge bifurcation). The step-by-step procedure to extract the minutiae is as follows:

  • Step 1: Binarization

Binarization is the process of converting a gray-scale image into a binary image, in which ridges and valleys are represented by black and white pixels, respectively. This strategy is well suited to the sensed image.

  • Step 2: Thinning

Thinning reduces the ridges to a width of one pixel. The procedure performs successive erosions until a set of connected lines of unit width remains; the resulting lines are known as skeletons. An important property of thinning is the preservation of connectivity and topology; nevertheless, it can lead to the creation of small connection artifacts and, consequently, to false minutiae. Certain post-processing aimed at rejecting these artifacts must therefore be carried out after thinning, as sketched below.
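A minimal sketch of these two preprocessing steps is given below, assuming a gray-scale fingerprint image held in a NumPy array. The fixed threshold and the use of scikit-image's skeletonize routine are illustrative assumptions; the paper does not specify a particular implementation.

```python
# Illustrative binarization and thinning (Steps 1-2); the threshold value
# and scikit-image's skeletonize are assumptions, not the paper's code.
import numpy as np
from skimage.morphology import skeletonize

def binarize(gray, threshold=128):
    """Map ridges (dark pixels) to 1 and valleys to 0."""
    return (gray < threshold).astype(np.uint8)

def thin(binary):
    """Erode ridges to one-pixel-wide skeletons, preserving topology."""
    return skeletonize(binary.astype(bool)).astype(np.uint8)
```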

  • Step 3: Minutiae detection

Minutiae detection is the feature extraction step, in which (m*m) pattern masks are used to discover the minutiae features. The extracted features include the minutiae pattern, its type (termination or bifurcation), and its position in the horizontal and vertical directions (x and y). After identifying the points, the standard deviation \( {\text{SD}} \) of the minutiae points can be computed as

$$ {\text{SD}} = \sqrt {\frac{1}{M}\sum\limits_{i = 1}^{M} {\left( {m_{i} - \mu } \right)^{2} } } $$
(2)

where the mean value is \( \mu = \frac{{\sum\nolimits_{i = 1}^{M} {m_{i} } }}{M} \) for \( i = 1,\,2, \ldots ,\,M \) and \( m_{i} \) denotes the extracted minutiae points. The minutiae extracted from a query image are likewise stored as a template. During registration, the stored standard deviation values are used in the matching process as the reference (database) template; during verification or identification, the extracted minutiae are stored in the same pattern and used as the query template during matching.
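To make the extraction step concrete, the following sketch detects minutiae on a thinned skeleton using the classical crossing-number test, which is one common 3×3 realization of the (m*m) pattern masks mentioned above (the specific masks used in the paper are not stated), and then computes the mean and per-coordinate standard deviation in the spirit of Eq. (2).

```python
# Crossing-number minutiae detection (an assumed realization of Step 3)
# and template statistics following Eqs. (1)-(2).
import numpy as np

def detect_minutiae(skel):
    """Return (x, y, t) tuples per Eq. (1); t is 'ending' or 'bifurcation'."""
    minutiae = []
    rows, cols = skel.shape
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            if skel[y, x] != 1:
                continue
            # 8 neighbours in clockwise order (cast to int to avoid uint8 wrap)
            n = [int(v) for v in (skel[y-1, x], skel[y-1, x+1], skel[y, x+1],
                                  skel[y+1, x+1], skel[y+1, x], skel[y+1, x-1],
                                  skel[y, x-1], skel[y-1, x-1])]
            cn = sum(abs(n[i] - n[(i + 1) % 8]) for i in range(8)) // 2
            if cn == 1:
                minutiae.append((x, y, 'ending'))
            elif cn == 3:
                minutiae.append((x, y, 'bifurcation'))
    return minutiae

def template_stats(points):
    """Mean and standard deviation of minutiae coordinates (cf. Eq. (2))."""
    m = np.array([(x, y) for x, y, _ in points], dtype=float)
    mu = m.mean(axis=0)
    sd = np.sqrt(((m - mu) ** 2).sum(axis=0) / len(m))
    return mu, sd
```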

3.1.2 Merkle hash tree authentication

The feature-extracted data are the input to the Merkle hash tree, in which the minutiae-detection function supplies features related to the database. The data are then split into chunks for further processing; these chunks need not arrive in the same order, and they are processed in pairs, so an odd number of chunks is completed with a zero-valued chunk.

  • Step 1: Key generation

The following procedure is carried out for the key generation:

  (i) Select two large random prime integers \( p \) and \( q \).

  (ii) Compute \( n \) and \( \varphi (n) \): \( n = pq \), \( \varphi (n) = (p - 1)\,(q - 1) \).

  (iii) Select an integer \( e \), \( 1 < e < \varphi (n) \), such that \( \gcd (e,\varphi (n)) = 1 \) (where \( \gcd \) denotes the greatest common divisor).

  (iv) Calculate \( d \), \( 1 < d < \varphi (n) \), such that \( e \cdot d \equiv 1\,(\bmod \,\varphi (n)) \).

The encryption public key is \( K_{\text{E}} = (n,e) \), and the decryption private key is \( K_{\text{D}} = (n,d) \). The values of \( p \), \( q \), and \( \varphi (n) \) are kept private; \( e \) is the public (encryption) exponent, and \( d \) is the private (decryption) exponent. In our work, the data are encrypted with the masked password \( {\text{MP}}_{i} \).
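A toy sketch of steps (i)–(iv) follows, using deliberately small primes; a production system would use large random primes and a vetted cryptographic library, and the specific values here are illustrative assumptions only.

```python
# Toy RSA key generation following steps (i)-(iv); small primes for
# illustration only -- do not use such parameters in practice.
from math import gcd

def rsa_keygen(p=61, q=53, e=17):
    n = p * q                      # step (ii): n = p*q
    phi = (p - 1) * (q - 1)        # step (ii): phi(n) = (p-1)(q-1)
    assert gcd(e, phi) == 1        # step (iii): gcd(e, phi(n)) = 1
    d = pow(e, -1, phi)            # step (iv): e*d = 1 (mod phi(n)), Python 3.8+
    return (n, e), (n, d)          # public key K_E, private key K_D

K_E, K_D = rsa_keygen()
msg = 42
cipher = pow(msg, K_E[1], K_E[0])          # encrypt with (n, e)
assert pow(cipher, K_D[1], K_D[0]) == msg  # decrypt with (n, d) recovers msg
```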

  • Step 2: Generation of Hash digest

Generation of the hash digest applies the MD-5 hash function to each chunk of data.

$$ {\text{MD5}}\left( {m_{A} } \right) = {\text{MD5}}\left( {{\text{MD5}}\left( {Q^{b} \left\| {\,m} \right.} \right)} \right) $$
(3)

In Eq. (3), m is the data given for authentication and \( Q^{b} \) denotes the odd-numbered blocks; this process is repeated for the even-numbered sequence, as given in Eq. (4).

$$ {\text{MD5}}\left( {m_{B} } \right) = {\text{MD5}}\left( {{\text{MD5}}\left( {E^{b} \left\| {\,m} \right.} \right)} \right) $$
(4)

The chunk hashes are arranged in a consistent order; they can be placed into an array at the even indexes. Each pair of chunk hashes is then concatenated, and the resulting data are hashed again with MD-5.

$$ k_{g} = {\text{MD5}}\left( {m\,\left\| {\,{\text{key}}} \right.} \right) $$
(5)

In this strategy, each new hash digest is inserted into the Merkle hash tree structure, and digests are combined pairwise to form the hash of two chunks. This process is repeated on the newly generated hashes until a single root hash digest is obtained.
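The sketch below mirrors Eqs. (3)–(5): each chunk is double-hashed with MD5, an odd level is completed with a zero-valued chunk, and paired digests are concatenated and re-hashed until a single root remains. MD5 is retained only to match the text (it is not collision-resistant), and the sample chunk contents are assumptions.

```python
# Minimal Merkle-root construction over feature chunks (cf. Eqs. (3)-(5)).
import hashlib

def md5(data: bytes) -> bytes:
    return hashlib.md5(data).digest()

def merkle_root(chunks):
    level = [md5(md5(c)) for c in chunks]           # leaf digests, double MD5
    while len(level) > 1:
        if len(level) % 2 == 1:                     # complete an odd level
            level.append(md5(b'\x00'))              # with a zero-valued chunk
        level = [md5(level[i] + level[i + 1])       # concatenate each pair
                 for i in range(0, len(level), 2)]  # and hash again with MD5
    return level[0]                                 # single root digest

root = merkle_root([b'feat-1', b'feat-2', b'feat-3'])  # assumed sample chunks
```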

3.2 Recognition phase

The framework again acquires the biometric trait of a person and extracts a feature set from it. This feature set is compared against the templates in the database to determine a match or to verify a claimed identity: the features extracted from the query biometric sample are compared with the template stored in the database, using the Minkowski distance, to decide the identity associated with the query sample.

3.2.1 Key storing facility in the cloud

After the hash digest is created, the key for the data is stored securely in the cloud, so the local memory required for the data is decreased. In cloud storage, the data are stored remotely and accessed by clients over the network. Cloud storage providers incur operating costs that can make the technology more expensive than clients anticipate, and cloud security remains a concern among clients; providers have attempted to address these fears by building security capabilities, such as encryption and authentication, into their services.

3.2.2 Identification based on Minkowski distance

The matching between the query and the stored template is performed using the Minkowski distance. For \( \lambda = 1 \), the Minkowski distance reduces to the Manhattan distance (the \( L_{1} \)-norm); for \( \lambda = 2 \), it reduces to the Euclidean distance (the \( L_{2} \)-norm). Since each of these two metrics has specific limitations as a distance measure, we adopt the Minkowski distance, a generalized metric that contains the others as special cases. The Minkowski distance applies to variables on ratio scales with an absolute zero; even a pair of outliers with high values can bias the result and mask the similarity conveyed by variables with lower values. The distance is given in Eq. (6) as

$$ d^{\text{MKD}} \left( {i,j} \right) = \sqrt[\lambda ]{{\sum\limits_{k = 0}^{n - 1} {\left| {y_{ik} - y_{jk} } \right|^{\lambda } } }} $$
(6)

In Eq. (6), \( d^{\text{MKD}} \) is the Minkowski distance between two objects i and j (here, the query and the stored template), n is the total number of feature components, and λ is the order of the Minkowski metric. Although it is defined for any \( \lambda > 0 \), it is rarely used for values other than 1, 2, and ∞. For \( \lambda \to \infty \), the Minkowski metric becomes:

$$ d^{\text{MKD}} \left( {i,j} \right) = \mathop {\lim }\limits_{\lambda \to \infty } \left[ {\sqrt[\lambda ]{{\sum\limits_{k = 0}^{n - 1} {\left| {y_{ik} - y_{jk} } \right|^{\lambda } } }}} \right] = \mathop {\max }\limits_{k} \left| {y_{ik} - y_{jk} } \right| $$
(7)

The Minkowski metric of order ∞ returns the distance along the axis on which the two objects show the greatest absolute difference. An algorithmic flow of the proposed biometric recognition in cloud computing is given in Table 1.

Table 1 Pseudocode for the proposed methodology
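As a complement to Table 1, a minimal sketch of the matching step of Eqs. (6)–(7) is given below; the feature vectors, the order λ, and the acceptance threshold are illustrative assumptions.

```python
# Minkowski distance matching per Eqs. (6)-(7).
import numpy as np

def minkowski(yi, yj, lam=3.0):
    """Eq. (6); lam=1 gives Manhattan, lam=2 Euclidean, lam=inf Eq. (7)."""
    diff = np.abs(np.asarray(yi, float) - np.asarray(yj, float))
    if np.isinf(lam):
        return diff.max()                  # Eq. (7): limiting case
    return (diff ** lam).sum() ** (1.0 / lam)

# Assumed example vectors and threshold for the accept/reject decision.
query, template = [0.8, 0.3, 0.5], [0.7, 0.4, 0.45]
accepted = minkowski(query, template) < 0.2
```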

Given a fingerprint matcher, its accuracy and speed in a realistic setting are evaluated to measure system performance. Unlike passwords and cryptographic keys, biometric templates have high uncertainty: there is considerable variation between biometric samples of the same user taken at different instants in time, so matching is always performed probabilistically. During authentication, the biometric of the user is captured again and minutiae data are extracted to form the test template, which is matched against the template already stored in the database. In each case, if the matching score is less than the threshold, the person is rejected; otherwise, the person is accepted.

4 Experimental setup

In this section, the fingerprint database and the test methodology used for evaluation are explained. The performance is compared with related approaches, which are also discussed in this section.

4.1 Test database

The proposed method was implemented in MATLAB and validated by comparison with conventional techniques. Our experiments use NIST Special Database 4 (NIST-4) (Watson and Wilson 1992), a publicly accessible fingerprint database. The fingerprint images are 388 × 374 pixels with a resolution of 96 × 96 DPI. From the dataset, 60% of the data are taken as training images and the remaining 40% as testing images. The proposed methodology is then compared with several distance measures: Canberra, Hamming, weighted Hamming, Euclidean, weighted Euclidean, Manhattan, Bray–Curtis, Akritean, and Mahalanobis.

The false match rate (FMR) and false non-match rate (FNMR) quantify the errors of a verification system, while the false-positive identification rate (FPIR) and false-negative identification rate (FNIR) are the error metrics of an identification system. In addition, performance measures such as accuracy, recall, and F1 score are evaluated to show that our proposed methodology outperforms existing methods.

Some of the parameters for the performance evaluation are stated below:

False matching rate (FMR):

The false matching rate (FMR) quantifies the extent of undesirable match outcomes in the matching assessments.

$$ {\text{FMR}} = \frac{{T_{\text{n}} }}{{T_{\text{n}} + F_{\text{n}} }} $$
(8)

False-positive identification rate (FPIR):

The fraction of non-relevant samples that are retrieved, out of all non-relevant samples.

$$ {\text{FPIR}} = \frac{{F_{\text{p}} }}{{F_{\text{p}} + F_{\text{n}} }} $$
(9)

False Non-Matching Rate (FNMR):

The false non-matching rate (FNMR) conceptualizes the proportion of erroneous non-match decisions when conducting multiple assessments.

$$ {\text{FNMR}} = \frac{{F_{\text{p}} }}{{F_{\text{p}} + T_{\text{p}} }} $$
(10)

False-Negative Identification Rate (FNIR):

The false-negative identification rate (FNIR) is the proportion of enrolled samples for which the correct identity is not returned.

$$ {\text{FNIR}} = \frac{{F_{\text{n}} }}{{F_{\text{n}} + T_{\text{p}} }} $$
(11)

Accuracy:

Accuracy is the ratio of the amount of relevant data retrieved to the total amount of retrieved data.

$$ {\text{Accuracy}} = \frac{{T_{\text{p}} }}{{T_{\text{p}} + F_{\text{p}} + T_{\text{n}} + F_{\text{n}} }} $$
(12)

Recall:

Recall is the ratio of the amount of relevant data retrieved to the total amount of relevant data.

$$ {\text{Recall}} = \frac{{T_{\text{p}} }}{{T_{\text{P}} + F_{\text{n}} }} $$
(13)

F-score:

$$ F_{\text{Score}} = \frac{{\left( {1 + \beta^{2} } \right){\text{Recall}} * {\text{Accuracy}}}}{{\beta^{2} \left( {{\text{Recall}} + {\text{Accuracy}}} \right)}} $$
(14)

where \( T_{\text{p}} \) denotes a registered image identified as registered; \( T_{\text{n}} \) a registered image identified as not registered; \( F_{\text{p}} \) an actually unregistered image identified as not registered; and \( F_{\text{n}} \) an actually unregistered image identified as registered.
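For reproducibility, the sketch below evaluates Eqs. (8)–(14) exactly as written above from the four counts; with β = 1, Eq. (14) yields the reported F1 score.

```python
# Evaluation metrics computed exactly per Eqs. (8)-(14).
def metrics(Tp, Tn, Fp, Fn, beta=1.0):
    fmr = Tn / (Tn + Fn)                            # Eq. (8)
    fpir = Fp / (Fp + Fn)                           # Eq. (9)
    fnmr = Fp / (Fp + Tp)                           # Eq. (10)
    fnir = Fn / (Fn + Tp)                           # Eq. (11)
    acc = Tp / (Tp + Fp + Tn + Fn)                  # Eq. (12)
    rec = Tp / (Tp + Fn)                            # Eq. (13)
    f = (1 + beta**2) * rec * acc / (beta**2 * (rec + acc))  # Eq. (14)
    return dict(FMR=fmr, FPIR=fpir, FNMR=fnmr, FNIR=fnir,
                accuracy=acc, recall=rec, f_score=f)
```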

Table 2 shows the performance of the proposed and the existing methodologies. In our evaluation, the proposed Minkowski distance is compared with nine other distance measures: Canberra, Hamming, weighted Hamming, Euclidean, weighted Euclidean, Manhattan, Bray–Curtis, Akritean, and Mahalanobis.

Table 2 Performance comparison analysis

The overall results show that the proposed methodology performs best. Figures 2, 3, and 4 present the accuracy, recall, and F-score measures. Among these techniques, Minkowski distance-based matching is the best biometric matching strategy with respect to the security considerations, and the proposed strategy yields accurate matching performance. The Minkowski distance-based matching strategy attains 25% better F-score, 30% better recall, and 20% better accuracy than the existing methods.

Fig. 2 Performance comparison for accuracy

Fig. 3 Performance comparison for recall

Fig. 4 Performance comparison of F-measure

5 Conclusion

In this paper, an efficient Minkowski distance-based matching with Merkle hash tree authentication has been proposed for biometric recognition in cloud computing. Since the cloud database is insecure, more effective methods are required to accomplish mutual authentication, and hence a highly secure biometric-based mechanism has been used; the proposed methodology primarily provides authentication to ensure the biometric identity. In addition, instead of storing the encrypted data, the proposed methodology stores only an encrypted key, conserving memory capacity. The proposed work was implemented in MATLAB, and its performance was evaluated in terms of FMR, FNMR, FPIR, FNIR, accuracy, recall, and F1 score, and compared with existing methods regarding accuracy, recall, and F-score. The proposed method is therefore well suited to user authentication in cloud environments.