1 Introduction

Biometric systems are widely used for authentication of users at several establishments. A biometric system uses physical biometric traits (face, fingerprint, etc.) or behavioural biometric traits (signature, gait, etc.) to recognize a person’s identity for verification or identification purpose. Such biometric system uses a single biometric trait to establishes one’s identity. But, several challenges exist for the usage of a single biometric system due to inter-class similarities, intraclass variation, non-universality, noisy data and susceptibility to circumvention [51]. These can be circumvented, to an extent, with the fusion of multiple biometric traits. Therefore, multimodal biometrics are increasingly used in an authentication mechanism.

The term ‘multimodal biometrics’can be broadly used in all of the following contexts: multiple biometric traits (e.g., face and gait [50], face and palm-print [15]), multiple samples of similar biometrics (e.g., fingerprint biometrics from multiple fingers [25]), multiple acquisitions of same biometrics (e.g., face biometrics from multiple instances of the same face [48]) and various feature descriptors of same biometrics (e.g., palm-print by using Gabor phase, Gabor orientation and Radon orientation [30]).

In all of the above categories, fusion of multimodal biometrics can be carried out in one or more of the following levels: sensor level, [13, 42], feature level [20, 56], score level [6, 44, 55, 58, 64, 66, 67], rank level [37, 50] and decision level [32, 46]. Sensor level and feature level fusion schemes fall under the category of fusion schemes before matching module. Score level, rank level and decision level fusion schemes fall under the category of fusion schemes after matching module. This differentiation indicates the approach used in these two categories. The sensor level and the feature level fusion schemes operate on the raw captured biometric images/signals and extracted features, respectively. The variations in the captured biometric signals and in the lengths of extracted feature vectors pose a challenge for adoption of sensor level and feature level fusion schemes [51], unless all the modalities relate to same type of biometrics or feature vectors are normalized to common dimension.

Fusion schemes (score level, rank level and decision level), which fall under the category of fusion after matching, are widely accepted in the area of multimodal biometric fusion because these schemes do not depend on the nature of the captured images/signals or the extracted features. In all these schemes, fusion is performed only after obtaining scores for individual modalities. According to score level fusion, similarity or dissimilarity scores are computed by comparing the biometric trait of probe subject with the traits of all enrolled subjects for each biometric modality. The matching score contains more information to identify a person than the information available at rank level fusion and decision level fusion. Hence, matching score level fusion schemes are believed to be more acceptable. However, it suffers from the problem of having varying range of scores depending on the adopted modality, feature extraction, and similarity or dissimilarity metric [6]. Therefore, score normalization [24, 26, 27] is performed before fusing the scores from different modalities. In rank level fusion, ranks are derived from the matching scores. These ranks represent possible set of matching identities in decreasing order of confidence. Therefore, rank level fusion schemes do not require normalization. Decision level fusion only requires the decision form each individual biometric matcher and combines them to generate the final decision. Therefore, very less information is present during fusion. It may lead to false identification, if individual decision makers are not accurate [51]. In summary, score level fusion scheme has the most amount of information during fusion among the ’after matching’ fusion schemes, but it needs a good normalization technique. On the contrary, rank level fusion scheme does not need a normalization as well as it has more information than decision level fusion scheme. Therefore, rank level fusion is being considered in this paper.

In rank level fusion [3, 9, 11, 29, 37, 50, 54, 60] schemes, rank lists are derived by considering relative ordering of the similarity or dissimilarity scores between biometric traits of the probe and the enrolled subjects. Several such rank lists corresponding to various biometric modalities are combined to generate an aggregated or fused rank list. Several rank level fusion schemes can be found in literature, e.g., Borda count [50], the highest rank [15], non-linear weighted exponential [30]. These traditional rank level fusion schemes derive an aggregated rank list through simple mathematical operations involving the input rank lists. As an example, Borda count [15] considers summation of ranks of a subject in the input rank lists. Similarly, non-linear weighted exponential [30] considers weighted summation of exponential values of ranks of a subject in the input lists. On the contrary, rank level fusion of multimodal biometrics is considered as an optimization problem in [3]. The distances of an aggregated rank list from each input rank list are minimized here. A widely used distance measure in the domain of rank aggregation problems, Spearman footrule distance [47], is considered in this context. Here, cross-entropy Monte Carlo algorithm is used to solve the optimization problem. Superiority of this optimization-based approach is experimentally demonstrated in [3] as compared against traditional rank level fusion approaches. It acts as a motivation to carry out further research in this direction.

Therefore, similar to [3], this paper proposes a rank level fusion of multimodal biometrics as an optimization problem. Genetic algorithm is a widely accepted paradigm to solve this kind of optimization problem involving large search spaces [7, 36, 59, 68, 70,71,72]. In the current paper, a genetic algorithm based approach to solve the above optimization problem is proposed. The major contribution of this paper is formulation of rank level fusion of multimodal biometrics as an optimization problem and adoption of genetic algorithm in this context. Representation of chromosomes and custom-designed operators to suite this problem domain are the highlights of this paper. The genetic algorithm provides a rank list which has minimum distances from all the input rank lists. But unlike the approach in [3], a weighted Spearman footrule distance is proposed to prioritize the top-ranked subjects. The proposed approach is experimentally studied on three different multimodal biometric datasets: (i) NIST BSSR1 multimodal dataset [16] involving fingerprint and face modalities, (ii) OU-ISIR GAIT dataset [23] involving multiple feature extraction methods of gait biometrics and (iii) Bioscote AR Face dataset [53] involving multiple matching models face modality. Experimental results justify the suitability of the proposed genetic algorithm approach of rank level fusion in the context of person identification in multimodal biometrics. Contributions of this proposed work can be summarized as follows:

  1. 1.

    Rank level fusion of multimodal biometrics is conceptualized as an optimization problem.

  2. 2.

    A novel genetic algorithm based solution is proposed to obtain an aggregated rank list having minimum distances from each input list.

  3. 3.

    Superiority of the proposed approach over state-of-the-art rank level fusion methods is demonstrated through experimental results on different multimodal biometrics datasets.

  4. 4.

    The proposed approach is also compared with state-of-the-art score level fusion methods to establish its superiority.

The rest of this paper is organized as follows: A literature on existing rank level fusion methods is presented in Section 2. A detailed formulation of rank level fusion of multimodal biometrics as an optimization problem is presented in Section 3. Section 4 describes the proposed rank level fusion method using genetic algorithm. Section 5 describes the experimental setup. Section 6 reports the results of applying the proposed rank level fusion method as well as several state-of-the-art fusion methods on all three multi-biometrics datasets. Finally, Section 7 draws the concluding remarks.

2 Related works

In this section, several existing rank level fusion methods for multimodal biometrics are discussed.

2.1 Borda count

According to Borda count method [1, 4, 8, 15, 33, 38,39,40, 45, 50], summation of ranks of a subject for each modality provides an aggregated rank for the concerned subject. Mathematically, it can be expressed as:

$$ R(i)=\sum\limits^{N}_{j=1} r_{j}(i) $$
(1)

Here, N represents number of different biometric modalities and rj(i) represents the rank of ith subject according to the jth modality. At the end, the final aggregated rank of a subject is decided based on the ordering of R(i) values of all subjects.

2.2 Weighted borda count

This method [1, 31, 38,39,40, 50, 61], is extension of Borda count method. In this method, a weight is assigned to each biometric modality based on their individual performance. Then the final rank is calculated by weighted summation of ranks of a subject for each modality. Mathematical expression to find the rank of a subject using weighted Borda count method is defined as:

$$ R(i)=\sum\limits^{N}_{j=1} w_{j} \times r_{j}(i) $$
(2)

Here, N represents number of different biometric modalities, wj is the weight given to jth modality and rj(i) represents the rank of ith subject according to the jth modality. At the end, the final aggregated rank of a subject is decided based on the ordering of R(i) values of all subjects. Introduction of weight in this method is advantageous when different biometric modalities have significant differences in their performances.

2.3 Highest rank method

This ranking method [8, 15, 17, 62, 69] finds the highest rank of a subject from all the ranks in various modalities as the final rank for that subject. For example, consider a subject i having 1st rank in the rank list of biometric modality B1 and 3rd rank in rank list of biometric modality B2. The final rank for subject i is 1st because this is the highest rank for the subject i among both modalities. Expression to find the rank of a subject using the highest rank method is given below:

$$ R(i)=min(r_{j}(i)) $$
(3)

Here, N represents number of different biometric modalities, rj(i) is the rank of ith subject in the jth modality. It is to be noted that lower rank value is better. Therefore, min() function is used in (3) to represent the highest rank.

As this method considers the highest rank for each subject across all the modalities, chances of having same rank for more than one subject is very high. These ties are randomly broken to get the final rank for a subject [25, 57], which can lead to decrease in identification accuracy.

2.4 Non-linear weighted methods

Several non-linear weighted rank methods for rank level fusion are also available in literature. Exponential ranking method [30], is one of these methods. Mathematically, it can be expressed as:

$$ R(i)=\sum\limits^{N}_{j=1} exp (w_{j} \times r_{j}(i)) $$
(4)

A modified version of above exponential rank level fusion is also presented in [30], which is known as weighted exponential rank fusion method. It can be mathematically, defined as following:

$$ R(i)=\sum\limits^{N}_{j=1} w_{j} \times exp (r_{j}(i)) $$
(5)

A division exponential based non-linear rank level fusion method is proposed in [25]. It can be mathematically, defined as following:

$$ R(i)=\sum\limits^{N}_{j=1} \frac{1}{1 + exp (w_{j} \times r_{j}(i))} $$
(6)

Similarly, logarithm based non-linear rank level fusion is also presented in [25]. Metamerically, it can be expressed as following:

$$ R(i)=\sum\limits^{N}_{j=1} log (1 + w_{j} \times r_{j}(i)) $$
(7)

In (4)-(7), N represents number of different biometric modalities, wj is the weight given to jth modality and rj(i) represents the rank of ith subject according to the jth modality. At the end, the final aggregated rank of a subject is decided based on the ordering of R(i) values of all subjects. The main objective of using non-linearity here is that the non-linear functions exhibits the property of varying slopes. Therefore, it helps in magnification of top ranks [30].

2.5 Cross-entropy monte carlo algorithm based fusion

In a completely different approach, rank level fusion of multimodal biometrics is considered as an optimization problem in [3]. Here, aim is to minimize the distances of an aggregated rank list from each input rank list. A widely used distance measure in the domain of rank aggregation problems, Spearman footrule distance [47], is considered in this context. Here, the optimization problem is solved by using cross-entropy Monte Carlo algorithm. Though this method assumes the performance of individual matchers as equal, still it is able to achieve high identification accuracy. This is due to the fact that it tries to find an optimum rank list having minimum distances from the input lists of individual biometric modalities. Superiority of this optimization-based approach is experimentally demonstrated in [3] as compared against traditional rank level fusion approaches.

2.6 Other rank level fusion methods

Rank level fusion of multimodal biometrics involving face, iris and ear is performed in [37] by using Markov chain. In this method, a node in the Markov chain is associated with a subject. Transitions in this Markov chain represent an ordered relation among these enrolled subjects. Stationary distribution of this Markov chain gives the rank list. Similarly, Markov chain based rank level fusion methods for several face and ocular biometric modalities are presented in [10] and [41], respectively.

A fuzzy rank level fusion method has been proposed in [57]. Features from face has been extracted by three different feature extractors: generalized 2D Fisher’s linear discriminant (G-2DFLD), fuzzy generalized 2D linear discriminant analysis (FG-2DLDA) and modular version of local binary patterns (LBP). A fuzzy rank has been generated for each of the three feature vectors. Fusion of these fuzzy ranks is performed at rank level by a weighted summation.

2.7 Discussion and motivation

Traditional rank level fusion schemes (e.g., Borda count [50], weighted Borda count [50], the highest rank [15], and non-linear weighted methods [25, 30]) derive an aggregated rank list through simple mathematical calculations involving the input rank lists. As an example, Borda count [50] considers summation of ranks of a subject in the input rank lists. Similarly, non-linear weighted exponential [30] considers weighted summation of exponential values of ranks of a subject in the input lists. On the contrary, rank level fusion of multimodal biometrics is considered as an optimization problem in [3]. Cross-entropy Monte Carlo algorithm is used to solve the optimization problem. Strength of this method is that it attempts to find an optimal solution by minimizing the distances between aggregated rank list and each input rank list. Superiority of this optimization-based approach is experimentally demonstrated in [3] as compared against traditional rank level fusion approaches. The superiority of this method in [3] over state-of-the-art rank and score level fusion methods has motivated us to carry out further research in this direction. Therefore, similar to [3], the current paper proposes a rank level fusion of multimodal biometrics as an optimization problem. But in a noteworthy variation in problem formulation from the work in [3], the current paper considers weighted Spearman footrule distance. Justification of considering these weights for computing the distance between a pair of rank lists is provided in next section. Moreover, this method proposes a novel genetic algorithm based approach to obtain an aggregated rank list. The details of the problem formulation and the proposed novel genetic algorithm based approach for rank level fusion in multimodal biometrics are discussed in subsequent sections.

3 Rank level fusion as an optimization problem

Formulation of rank level fusion of multimodal biometrics as an optimization problem was introduced in [3]. The same formulation of the optimization problem is adopted in the current paper. This problem formulation is stated here for the completeness of this paper.

Let B1, B2, ... , BN be various biometric modalities to identify a person. Let the matching score \({S_{i}^{j}}\) be associated with each such biometric modality Bi for a jth person (subject) for an input probe. A rank-ordered list Li of those subjects can be generated from an ordering of these matching scores. Considering a high value of \({S_{i}^{j}}\) as good (for a similarity score), the following is true about the ordered list Li: \({S_{i}^{j}} > {S_{i}^{k}}\) implies \(r^{L_{i}}(j) < r^{L_{i}}(k)\). Here, \(r^{L_{i}}(j)\) indicates the rank (i.e., position) of the jth subject in the list Li.

Therefore, N ordered lists are created as L1, L2, ..., LN for biometric modalities B1, B2, ... , BN, respectively. A combination of these N ordered lists generates an aggregated list δ as it is shown in Fig. 1.

$$ \delta = aggregate(L_{1}, L_{2}, ... L_{N}) $$
(8)
Fig. 1
figure 1

Fusion of multimodal biometrics at rank level (taken from [3])

The objective, here, is to generate the aggregated list δ having minimum distances from the input lists L1, L2, ..., LN. Hence, the objective function to generate an aggregated list can be defined as:

$$ minimize {\varPhi}(\delta) = \sum\limits_{i=1}^{N} w_{i} \times d(\delta,L_{i}) $$
(9)

δ denotes an aggregated list.Li is the ith input list associated to biometric modality Bi.N denotes number of modalities.wi denotes the associated weight for list Li. For the initial experiments in this paper, each input list has been assigned the same weight. Although, non-uniform weights could be given to these input lists in specific cases, i.e., where the qualities of few of these input lists are doubted.d denotes a distance metric between two lists.

The goal is to obtain an aggregated list δ, which minimizes the objective function Φ(δ), among the set of all candidate lists.

Weighted Spearman footrule [47] distance is applied here to calculate the distance d(δ,Li) between two lists δ and Li. Weighted Spearman footrule distance computes a weighted summation of the absolute differences between ranks (i.e., positions) of each subject in two lists as:

$$ d(\delta, L_{i}) = \sum\limits_{t \epsilon L_{i} \cup \delta } I(t) * |r^{\delta}(t)- r^{L_{i}}(t)| $$
(10)

Here, rδ(t) represents the rank (i.e, position) of subject t in the list δ. \(r^{L_{i}}(t)\) represents the rank (i.e, position) of subject t in the input list Li. Contrary to the work in [3], an influence factor I(t) is associated with the rank difference for each subject t in (10). This consideration of influence factor I(t) as a weight in the weighted Spearman footrule distance distinguishes the problem formulation in this paper from the problem formulation in [3]. The motivation for considering these weights are mentioned as following: If rank (i.e., position) of the subject t is good (i.e., close to 1) in any one of the lists, then the subject will have more influence on the computed distance. Otherwise, the assigned influence factor I(t) is less to indicate lesser influence of the lower-ranked subject (in both lists) on the distance computation. Hence, the value of I(t) is decided by following (11).

$$ I(t)= 1- \frac{(min(r^{\delta}(t),r^{L_{i}}(t))-1)}{n} $$
(11)

Here, n represents the number of enrolled subjects. If subject t is not present in one of the two lists (either δ or Li), the rank of the subject (rδ(t) or \(r^{L_{i}}(t)\)) in the list is considered as one more then the size of the list.

Figure 2 presents an example of applying weighted Spearman foot rule distance for aggregating ranked lists of subjects. Let there be four input lists L1,L2,L3 and L4. Each list contains six subjects (A, B, C, D, E and F). There is one candidate list δ. Weighted distance of this candidate list δ from each input list is computed using (10). For example, subjects A and B are at same position in the list L1 and δ. Hence, rank difference is zero. Subject C is at third position in list L1 and is present at fourth position in δ list. Hence, absolute difference of the two ranks of subject C is one. Minimum of these two ranks is three. Hence, influence factor I(t) is selected as 0.67 as it is at third position as shown in Fig. 2. Similarly, rank difference is calculated for each subject and a weighted summation is taken as shown in Fig. 2. Thus, it can be estimated that weighted Spearman footrule distance between lists L1 and δ is 1.34. Similarly, estimated weighted Spearman footrule distance of δ from the input lists L2,L3 and L4 are 4.51,4.68 and 3.33, respectively. Finally, distances of a candidate list from each of the input lists are added to generate the final fitness value ϕ(δ) as 13.86.

Fig. 2
figure 2

Weighted Spearman foot rule distance computation

4 Proposed genetic algorithm based approach

In this section, a genetic algorithm based approach is proposed to solve the above optimization problem (9). This algorithm is based on the process of natural selection where the fit solutions (chromosomes) are selected from a population based on a fitness function. These selected chromosomes produce the offsprings having better chance of survival due to inheritance of the characteristics of these parent chromosomes. The proposed genetic algorithm based approach uses the elitism concept. In elitism based genetic algorithm, better solutions are memorised. If the newly produced offsprings do not have better fitness values than their parents, then the parents are retained in the new population. Otherwise, produced offsprings substitute their parents in the new population. This concept of elitism ensures that the best solution in an iteration does not deteriorate. The speed up of the performance of genetic algorithm due to this elitism is well documented in [52, 73]. Detail of the proposed genetic algorithm based approach is presented in this section.

4.1 Representation of a chromosome

The objective of this proposed genetic algorithm based rank level fusion scheme for multimodal biometrics is to find an ordered list of subjects based on their ranks. Therefore, chromosome in the proposed approach represents an ordered list of subjects. Two main characteristics of this representation of chromosome are stated as following:

  • Unique position of a subject in a list indicates its rank. Hence, a subject appears in a list only once.

  • Every subject is present in a list.

Based on the above facts about the design of a chromosome, it can be said that, length of a chromosome is equal to the number of enrolled subjects. For example, an ordered list δ in Fig. 2 is represented by a chromosome

(A,B,D,C,E,F). Here, subject A has rank one and subject B has rank two. Similar observations can be made for other subjects in the list too.

4.2 Fitness of a chromosome

In order to solve the formulated optimization problem in Section 3, fitness of a chromosome (i.e., a candidate list) is measured as a summation of distances of the candidate list from each input list (9). Weighted Spearman footrule distance (10) is used as the distance measure between a candidate list and an input list. Here, the objective is to minimize the fitness value. Therefore, the fittest solution is having the lowest fitness value.

4.3 Initialization of population

In traditional genetic algorithm, chromosomes in an initial population are generated randomly. On the contrary, domain knowledge is used to generate the chromosomes in initial population for achieving fast convergence [21, 49, 65]. Similarly, initialization of chromosomes in the proposed genetic algorithm is carried out using a mix of knowledge-based and random initialization. Let a population of fixed size M is considered. N out of these M chromosomes are initialized to represent N input lists (i.e., solutions from each individual modality). Justification of initializing N chromosomes using input lists is that an input list represents a ranked list of subjects based on matching scores for the concerned biometric modality. Ideally, all biometric modalities should generate the same ranked order of the subjects. But practically, some deviations will be observed for each modality. Hence, the problem of list aggregation arises. Unless the quality of the acquired biometric signal is poor, consideration of the input lists in the population improves the convergence rate. Remaining (MN) chromosomes are randomly generated (i.e., a random sequence of subjects are considered as chromosome). But two characteristics of a chromosome (Section 4.1) are maintained during this random initialization.

4.4 Selection

A roulette wheel based selection process is used in the proposed work. The chromosomes which are having better fitness values ϕ(δ) (better distances) share the larger areas in roulette wheel. Let M chromosomes be δ1, δ2, …, δM. Corresponding fitness values of these chromosomes are ϕ(δ1), ϕ(δ2),…, ϕ(δM), respectively. As the objective is minimization of distances, each fitness value ϕ(δi) is converted as:

$$ \phi^{\prime}(\delta_{i})= max(\phi(\delta_{1}),\phi(\delta_{2}),\ldots,\phi(\delta_{M}))/\phi(\delta_{i}) $$
(12)

Then, the proportion of area Ai in the roulette wheel for a chromosome

$$ A_{i} = \frac{\phi^{\prime}(\delta_{i})}{\phi^{\prime}(\delta_{1})+\phi^{\prime}(\delta_{2})+\ldots+\phi^{\prime}(\delta_{M})} $$
(13)

Thus, a fitter chromosome (having lower objective function value Φ(δi) will get a larger area on the roulette wheel. The roulette wheel is rotated M times to select M chromosomes for the new population. Every time, one chromosome is selected form M chromosomes in the current population. The chromosome, whose assigned area in the roulette wheel appears in front of a pivot, is selected each time. Bigger the assigned area in the roulette wheel is, the probability of getting selected into the new population for the chromosome is higher.

4.5 Crossover

The newly generated population based on fitness values are then randomly divided into \(\frac {M}{2}\) non-overlapping pairs. In the proposed work, population size M is considered as an even number. Crossover is performed between each pair of parent chromosomes. For each pair of chromosomes, a crossover point is decided randomly. In crossover operation, a pair of offspring chromosomes are generated by interchanging the parts of parent chromosomes around the crossover point. As shown in Fig. 3, two parents are presented and a crossover point is marked. Every element of the first parent up to the crossover point (i.e, A, B and C) and every element of the second parent after the crossover point (i.e, B, D and F) are combined together to produce the first offspring. It has elements of both parents (i.e, A, B, C, B, D and F). Similarly, the second offspring is produced by combining elements of the second parent up to crossover point (i.e., C, E and A) and elements of the first parent after crossover point (i.e., D, E and F). The second offspring contains elements as C, E, A, D, E and F. After interchanging parts of the chromosomes, repetition of subjects is possible in a chromosome. As the length of the chromosome is fixed, it causes missing subjects in the chromosome. It is illustrated using an example in Fig. 3. It violates the designed characteristics of a chromosome in this context. In order to deal with this situation, the second occurrence of every repeated subject is replaced with a subject which is not present in that chromosome. This process is performed for every newly generated child chromosome. As it is shown in Fig. 3, there are six subjects (A to F) in each of the parent chromosomes. After crossover, two offsprings are generated which may contain repeated subjects. As an example in Fig. 3, subject B and subject E are repeated in the first and the second offsprings, respectively. In order to remove those repetitions, the second occurrences of each of the repeated subjects are identified and are replaced with the missing subjects. As an example in Fig. 3, subject B is repeated and subject E is missing in the first offspring. Therefore, the second occurrence subject B is replaced by subject E. Similarly, subject E is repeated and subject B is missing in the second offspring.. Therefore, the second occurrence of subject E is replaced by subject B. Moreover, an elitist genetic algorithm is adopted to speed up convergence. Therefore, all four chromosomes (both offsprings and both parents) are evaluated using the fitness function. The two fittest chromosomes (having the two least fitness values) are selected from the set of four chromosomes. These two selected chromosomes are retained in the population. It ensures retaining of better solutions in the population.

Fig. 3
figure 3

Crossover

4.6 Mutation

The newly generated population after crossover further goes through the mutation process. The population of chromosomes can be represented using a matrix of size M × n. Let the number of chromosome in population be M and n be the total number of subjects, i.e., length of a chromosome. Here, each row refers to a chromosome in the population. Another matrix of size M × n is generated, where each element of it is a randomly generated value in the range [0,1]. The positions in this matrix, where the values are less than or equal to a small mutation probability pm, are considered for mutation operation. Thus, all such genes are identified, which will go through a mutation process for each chromosome. Let, for each chromosome i, ki number of such candidate positions are identified for mutation. Then, such candidate positions for each chromosome i are paired into \(\lfloor \frac {k_i}{2}\rfloor \) number of pairs. Here, ⌊⌋ refers to the largest integer which is smaller than or equal to its argument. The mutation operation, here, is designed as pair-wise swapping of gene values in each \(\lfloor \frac {k_i}{2}\rfloor \) position pairs for the ith chromosome. As an example in Fig. 4, there are ten chromosomes from C1 to C10. Each chromosome contains six subjects. A random matrix of size 10 × 6 is generated, whose elements are randomly generated values in the range [0,1]. Each position in the matrix having value less than or equal to mutation probability as 0.01 is marked using a different color. These elements are candidate positions for mutation. Now for each chromosome, pairs of such positions are selected and their genes are interchanged to generate a new chromosome. The process also helps in avoiding the repetition of subjects in a chromosome. For example in the Fig. 4, a chromosome in the second row has two subjects (genes) D and F, whose positions are interchanged. Interestingly, chromosome C1 has only one candidate position. As there is no other candidate position for mutation in C1, the mutation does not take place for this chromosome. Similarly, chromosome C5 has three candidate positions for mutation with gene values C, B and E. Here, only first two genes (C and B) are interchanged. There is no other position with which the gene E can be interchanged. After this step, each mutated chromosomes is tested on the fitness function. If the fitness value of a new chromosome is better than the fitness value of its parent chromosome, then the new chromosome replaces the parent chromosome. Otherwise, the parent chromosome is retained in the new population. This is how elitism of the proposed genetic algorithm is maintained. It is to be noted that swapping is always performed between genes of same chromosome. If genes of different chromosomes are swapped, then repetition of subjects in new chromosomes may occur. Therefore, genes of the same chromosomes are swapped to avoid this problem.

Fig. 4
figure 4

Mutation

4.7 Stopping criteria

Genetic algorithm is iterative in nature. Selection, crossover and mutation steps are repeated iteratively until a stopping criteria is satisfied. If there is no change in any of the chromosomes in the population due to either crossover or mutation over a number of iterations (which is experimentally decided), the algorithm stops. Then, the best chromosome in the final population is considered as the final ranked list of the subjects.

The flow of this genetic algorithm is given in Fig. 5. The discussed concept of elitism is shown in the Fig. 5, the offsprings due to crossover and mutation are compared with their parents to retain better chromosomes. Even, the iterations having no change in the population are tracked using a flag variable having a value 0.

Fig. 5
figure 5

Genetic Algorithm

5 Experimental setup

The performance of the proposed genetic algorithm depends on a few parameters such as influence factor for distance measure (10) and number of iterations size for stopping criteria in the genetic algorithm. These values have been set experientially to perform the experiments.

5.1 Influence factor

The desired output of the proposed method depends on the chromosomes selected in the initial population. Each chromosome in this problem is a ranked list of subjects. A ranked list contains the name of subjects which are ordered based on their ranks for the particular subject of a modality. Performance of the proposed GA will be high when a subject in the higher positions in the ordered ranked list has more influence on the final output as compared to the subject at the bottom of the ordered ranked list. In order to give higher preference to a subject present at the higher positions of the ordered ranked list, influence factor I(t) is introduced as a weight for the distance measure in (10). This influence factor is generated using two methods: (i) using linearly decreasing function (11) and (ii) using an exponential decreasing function I(t) = (1 − 0.01)t with varying tuning parameters. It can be observed that both these functions are decreasing functions in the range of [0,1]. The influence factor I(t) is selected based on the minimum position (i.e., higher rank) of both the subjects. This way the subjects having higher rank in the ranked list will have more influence on the computed distance. It was experimentally found that the influence factor being generated by linear decreasing function (11) have exhibits performance than the exponential decreasing function. Therefore, influence factor is generated using (11) for the rest of the experiments.

5.2 Stopping criteria

The second parameter is the window size for the stopping criteria. To achieve convergence of the algorithm, this window size defines maximum number of successive iterations for which changes are not observed for any of the chromosomes in the population. It is very important to set the window size properly. Otherwise, the algorithm can converge at local optima, which will not give the desired result. In order to set the window size, a set of experiments has been performed. For this experiment, the proposed algorithm is executed on all datasets with varying the window size from 200 to 3500 using an interval of 100. Result for one of these experiments is presented in Fig. 6 through a plot of various cumulative ranks and their accuracies with varying window size for NIST BSSR1 multimodal dataset [16]. It can be observed from Fig. 6 that the window size of 2000 gives better results as compare to the lesser window sizes. Hence, to carry out experiments in the proposed work, a window size of 2000 is selected for NIST BSSR1 multimodal dataset. Similarly, window sizes of 3500 and 600 is experimentally selected for OU-ISIR Gait dataset [23] and Bioscote ARFace dataset (developement set) [53] respectively.

Fig. 6
figure 6

Result of various window sizes for NIST BSSR1

6 Experimental results

Detailed performance analysis is carried out for the proposed genetic algorithm based rank level fusion method against several existing fusion methods (both at rank level and score level). It is experimentally established that the proposed genetic algorithm based rank level fusion method performs better than several existing state-of-the-art rank and score level fusion methods. These existing state-of-the-art methods are mentioned in Section 6.1. The proposed method is experimentally tested on three different multi biometric datasets. The datasets along with the corresponding performance measures of all these comparing methods are discussed in subsequent sections.

6.1 State-of-the-art methods for performance comparison

The existing state-of-the-art linear and non-linear rank level fusion methods are compared with the proposed genetic algorithm based rank level fusion method. Following rank-level fusion methods are used for experimental comparison: Borda count [50], weighted Borda count [50] and highest rank method [15] belong to linear rank level fusion method. Even in recent years, these three methods are widely used in the context of rank level fusion [12, 63]. Exponential [30], weighted exponential [30], division exponential [25] and logarithm [25] are non-linear rank level fusion methods. Similarly, the proposed genetic algorithm based method is also compared with cross-entropy Monte Carlo based optimization method [3]. The proposed method is also compared against few state-of-the-art score level fusion methods, sum-rule [58], normalized sum rules [27, 28], product-rule [51], max-rule [51], min-rule [51], WQAM [2], Frank t-norm [18] and Hammcher t-norm [18].

Some of these existing rank level fusion methods (weighted Borda count, exponential, weighted exponential, division exponential and logarithm) need weights for different biometric modalities. These weights have a significant influence in the performance of these methods. Hence, an optimal set of weights for various modalities are obtained to produce the best performance for these comparing methods. Therefore, an elitist genetic algorithm is used to obtain the optimum set of weights for each of the comparing methods. A chromosome contains weights in the range of [0,1], i.e, each gene in a chromosomes contains a weight value. Recognition accuracy (defined in (14)) is considered as a fitness function. Here, ra represents recognition accuracy, ip is number of probes correctly matched and p is the total number of probes.

Initial population is set to 10. Hence, for each chromosome, fitness value is computed and selection of these chromosomes for next generation is done using roulette wheel based method. For crossover, 5 random pairs are selected from the population. After generation of offsprings, all four chromosomes (two parents and two offspring) are tested for fitness value. Among these four chromosomes, the two chromosomes having higher fitness values are selected for next generation. Mutation is performed with the mutation probability of 0.1. In mutation, the selected gene values are subtracted from 1 to produce new gene values. Elitist approach is followed to retain better chromosomes. A new chromosome having high fitness value replaces the old chromosome. Otherwise, the old chromosome is retained. The algorithm converges when the optimal weights are found.

$$ ra= \frac{ip}{p}*100 $$
(14)

The set of selected weights depends on the dataset on which the weights are trained. Hence, k-fold cross-validation (using k = 5) is used to eliminate this dependency. The dataset is split into k parts. k − 1 parts are considered for training. These k − 1 parts learn the set of weights using the elitist genetic algorithm. The remaining one part is used as a test set to report the recognition accuracy. The whole training and testing steps are repeated k-times to get different sets of weights and corresponding accuracies on the test sets. Finally, average accuracy from k such test sets are presented in the result. This process of obtaining weights for each modality is followed for each of the three datasets being used for performance comparison. On contrary, proposed work assigns same weight to each individual modality (input list) to check the efficiency of the proposed work over state-of-the-art rank level fusion methods.

6.2 Fusion of multimodal biometrics involving face and fingerprint

In this paper, the dataset namely NIST BSSR1 [16] is considered for performance comparison. This dataset is widely used to study the fusion of multimodal biometrics [14, 30, 55, 66]. This dataset contains information from two biometric modalities- fingerprint and face. Fingerprints of right index finger and left index finger are considered as part of this dataset. Two different face matching modules are used for face biometrics- termed as G and C in the dataset. Biometric information for the above biometric modalities were acquired during the enrolment phase form each of the 517 persons (subjects). In this dataset, similarity scores of each of these subjects as a probe with all 517 subjects are provided as per two different face matchers (termed as G and C) and fingerprint matchers for right and left index fingers. In the context of score level fusion, these similarity scores from various biometric modalities are fused using existing score level fusion methods. In order to perform a rank level fusion, rank lists are generated from the given similarity scores. These rank list are fused using the proposed as well as the existing rank level fusion methods.

Finally, the proposed rank level fusion method (Section 4) is used to combine given four rank lists which are mentioned in the Section 6.1. The recognition accuracies (in %) as defined in (14) for the proposed method are presented in Table 1 for the probe being found within top 1, top 2 and top 3 ranks (cumulative) in the aggregated rank list. Out of 517 probes, the probe appears at the top-1 position in the aggregated list for 514 times (99.42%). Hence, the proposed method correctly identifies the subject for a given probe in 99.42% of given cases. The recognition accuracy does not improve further while considering even top-3 ranks in the aggregated list. The recognition accuracy of the comparing methods within top-1, top-2 and top-3 ranks are also presented in Table 1. It can be seen from the reported recognition accuracies that the proposed genetic algorithm based rank level fusion method perform better than the majority of existing state-of-the-art rank level and score level fusion methods. Among these existing rank level fusion methods, division exponential and CES methods exbibit equivalent performance to the proposed method while top-2 and top-3 ranks are considered. But performance of the the proposed method is better than the division exponential and CES methods if only top-1 rank is considered. Among the score level methods, only the WQAM method exhibits equal performance as with the proposed method. The same table has also been used to presents the recognition accuracies for each of the unimodal biometric modalities in this dataset. The reason for this superiority of the proposed method is that the method considers minimization of the distances between aggregated and input rank lists. It is also noticed that the proposed method is performs better than each unimodal matcher justifying the need for the multi-biometric system. The presented CMC plots presented in Figs. 7, and 8, also show that the proposed method performs better than majority of the state-of-the-art rank level and score level fusion methods.

Table 1 Cumulative recognition accuracies in % for NIST BSSR1 dataset [16] using various comparative methods
Fig. 7
figure 7

CMC Plot for Rank Level Fusion Methods compared with proposed GA for BSSR1 Dataset [16]

Fig. 8
figure 8

CMC Plot for Score Level Fusion Methods compared with proposed GA for BSSR1 Dataset [16]

6.3 Fusion of multimodal biometrics for multiple gait feature representations

Additionally, the second dataset (BSS4) is form the Institute of Scientific and Industrial Research (ISIR), Osaka University (OU) [23, 43]. This dataset has also been used for fusion of multi biometrics in [34, 35]. In this dataset, an input image sequence from gait has been processed using five different feature extraction methods: (i) Gait energy image (GEI), (ii) Frequency-domain feature (FDF), (iii) Gait entropy image (GEnI), (iv) Chrono-gait image (CGI), (v) Gait flow image (GFI). The dataset is composed of dissimilarity scores of each of these 3249 subjects as probe with all 3249 subjects for above mentioned features. These dissimilarity scores from above gait features are fused using existing score level fusion methods. Details of these gait feature extraction methods can be found [23, 34].

Moreover, rank lists are generated based on the given dissimilarity scores. This provides five rank lists for each probe. These rank lists are combined using the proposed rank and score level level fusion method (Section 4) and other existing rank level fusion methods as discussed in Section 6.1. Table 2 presents the recognition accuracies of various methods (in %) for the probe subjects within top 1, top 2 and top 3 ranks (cumulative). The table also presents the recognition accuracies for each of the unimodal biometrics in the dataset.

Table 2 Cumulative recognition accuracies in % for OU-ISIR BSS4 dataset [23] using various comparative methods

From Table 2, the results clearly show that the proposed genetic algorithm based method has superior performance over other methods except score level fusion method with sum rule [58] for top-1 position (rank 1). Performance of the proposed system increases with top-2 and top-3 positions. The justification made for the previous dataset BSSR1 (Section 6.3) is also applicable here. It is also noticed that the proposed method is performing better than each unimodal matcher justifying the need for multi-biometric system. From the CMC plots in Figs. 9, and 10 it is verified that the proposed method performs better than the state-of-the-art rank level and score level fusion methods.

Fig. 9
figure 9

CMC Plot for Rank Level Fusion Methods compared with proposed GA for BSS4 Dataset [23]

Fig. 10
figure 10

CMC Plot for Score Level Fusion Methods compared with proposed GA for BSS4 Dataset [23]

6.4 Multi-algorithm fusion for face biometrics

The third dataset, Bioscote AR Face is form the Idiap dataset distribution portal [22, 53]. The AR Facedataset has 136 enrolled users. This dataset is again divided into three sets, 50 training subjects and 43 subjects in each of the development and the evaluation set. For experiments, raw scores of development set from this Bioscote AR face dataset is used. Gaussian mixture model (GMM), inter-session variability modeling (ISV), joint factor analysis (JFA) and total variability cosine modeling (TV-cosine) have been used by [53] to compute raw scores for AR face dataset. Multi algorithm fusion of these scores for several comparing models is performed in this work.

Moreover, rank lists are generated based on the given similarity scores. These rank lists are combined using the proposed rank level fusion method (Section 4) and other existing rank level fusion methods as discussed in Section 6.1. Table 3 presents the recognition accuracies of various comparing methods (in %) for the probe subjects within top 1, top 2 and top 3 ranks (cumulative). The table also presents the recognition accuracies for each of the recognition modals.

Table 3 Cumulative recognition accuracies in % for Bioscote AR Face dataset [53] using various comparative methods

Out of 43 probes, the probe appears at the top-1 position in the aggregated list for 42 times (97.67%). Hence, the proposed method correctly identifies the subject for a given probe in 97.67% of given cases. The recognition accuracy does not improve further while considering even top-3 ranks in the aggregated list. The recognition accuracy of the comparing methods within top-1, top-2 and top-3 ranks are also presented in Table 3. It can be seen from the reported recognition accuracies that the performance of the proposed genetic algorithm based rank level fusion method is better than the performances of to the existing rank level fusion methods (except another optimization based technique (CES) [3]). Though some of the score level methods are exhibiting the equal performance as with the proposed method. From the CMC plots presented in Figs. 11, and 12, it is verified that the proposed method performs better or equally well, as compared to the state-of-the-art rank level and score level fusion methods.

Fig. 11
figure 11

CMC Plot for Rank Level Fusion Methods compared with proposed GA for Bioscote ARFace [53]

Fig. 12
figure 12

CMC Plot for Score Level Fusion Methods compared with proposed GA for Bioscote ARFace [53]

7 Conclusion

Rank level fusion is studied, in this paper, for multimodal biometrics. The manifold contributions of this paper are highlighted here: The rank level fusion in multimodal biometrics is formulated as an optimization problem. To solve this optimization problem, a novel genetic algorithm based method is proposed. The proposed method has been tested for BSSR1 multi-biometric dataset [16], OU-ISIR BSS4 multi-sample GAIT dataset [23] and Bioscote AR Face multi-algorithm dataset [53]. The proposed method exhibits better performance in identifying the subjects than majority of the existing methods of fusion at rank level and score-level fusion methods for multimodal biometric systems. Though few of the existing methods exhibit equal performance as with the proposed method. This justifies the introduction of a novel genetic algorithm based method in this paper. Experiments also justify the usefulness of multimodal biometric systems over the unimodal biometric systems. The reported results shows significant improvement in performance inspite of considering same weight to each input list. Though in future, the effects of using different weights for each modality can be studied.

It is to be admitted that the execution time of the proposed method is longer than traditional score level and rank level fusion methods as the proposed method is based on evolutionary computing approach. But better performance in terms of accuracies of identification is achieved here. Though execution speed can be improved by using parallel implementation of genetic algorithm [5, 19]. Moreover, initial success for the reported experiments is encouraging enough to try out parallel genetic algorithms and other meta-heuristic search and optimization strategies (like particle swarm optimization) in the context of rank level fusion of multimodal biometrics.