
1 Introduction

Researchers in the biometric field are increasingly interested in new topics, since this field is progressing rapidly [21]. These topics include new approaches and algorithms that can be used to improve existing modalities or to test newly proposed ones [20], whether permanent or transient [4] (e.g., knuckle and nail [4, 11]). However, a single modality cannot satisfy all the characteristics required of a biometric system: accuracy, speed and throughput rate, acceptability to users, uniqueness, resistance to counterfeiting, and reliability. That is why multibiometric systems have been proposed to combine the advantages of multiple systems.

Multibiometric systems combine multiple data sources or processing steps to enhance performance [17] and to meet the global system characteristics as closely as possible. They can combine:

  • multiple sensors of the same modality;

  • multiple representations for the same capture or multiple classifiers;

  • multiple biometric modalities (e.g., face and fingerprint recognition);

  • multiple instances of the same modality;

  • multiple captures;

  • a hybrid system combining several of the above.

Joining multiple systems leads to a complex system for which several parameters have to be considered. One of them is the level of fusion. In the multibiometric context, data can be combined at different levels to produce a single output:

  • Feature level fusion: the fusion consists of feature selection in order to determine the discriminant features that enhance performance.

  • Score fusion: the fusion consists of combining the outputs of different matchers or classifiers.

  • Decision fusion: the fusion consists of combining the decisions taken by each biometric authentication system to reach the final decision.

  • Rank fusion: the different systems output ranked individuals and the decision is made based on these ranks.

Score level fusion is arguably the most widely applied level in the literature for enhancing the performance of multibiometric systems, especially multimodal ones. This is due to several advantages:

  • It is reported to offer the best performance.

  • It combines scores, which carry more information than decisions. Moreover, fusing scores is simpler than fusing features.

  • Features can introduce redundancy and correlation that are difficult to control. In contrast, scores can be analyzed and interpreted to reduce the effect of correlation [19].

  • From an optimization standpoint, scores are more flexible and are readily accessible for tuning system performance.

As reported in [3], most approaches in the literature [2, 5, 9, 12, 16] pursue one of the following goals:

  1. Maximizing the separation between the genuine and impostor scores [13].

  2. Finding the best weighting factor for fusion.

Addressing either of these goals amounts to solving an optimization problem. Since a wide solution space must be explored to reach the best solution, an evolutionary method is well suited to reducing the search cost while satisfying the fixed criteria. A number of works investigate this area using different bio-inspired methods [1, 3, 8, 10, 19]. The most commonly used methods are genetic algorithms and particle swarm optimization, which researchers apply in several contexts. The proposals differ primarily in how the problem is formulated and how the variables are interpreted. Some works treat the whole problem as a search for optimal weights under fixed fusion rules [1, 3, 7, 19], while others address the search for optimal fusion rules [8, 10].

In our case, we are interested in studying the rules most commonly used in score fusion. Thus, we propose an optimization scheme based on a tree-structured function. We aim to retrieve the optimal fusion function that meets the required performance.

The rest of this paper is organized as follows. In Sect. 2, we present the method and material used: we discuss the proposed fusion approach and describe the protocol and database on which the experiments are built. Then, the optimal fusion using ancillary measures is detailed. After that, we describe the experimental results in Sect. 3. Finally, we conclude and list some perspectives of our work.

2 Method and Material

In this section, we discuss the proposed approach and the database used for experiments.

2.1 Proposed Approach

In this paper, we are interested in score level fusion in a hybrid multibiometric system that combines multiple classifiers applied to different modalities. The scores are fused using a constructed tree of primitive operations. Evaluating the tree yields the fused score, which is used for classification to compute system performance. At this level, we propose to optimize system performance using a Genetic Algorithm (GA). The proposed GA operates on the tree by applying modifications until the fixed criteria are reached. In [8], the authors investigated this idea, previously applied in classification [18], by building an evolving tree. They obtain a complex tree with an extensible number of leaves as terminals. These terminals contain different kinds of data: constants, scores, functions without effect, or ordinary variables. As the tree evolves, the number of terminals may increase. This implies integrating new values, which increases the number of parameters and affects both the tree evaluation and the resulting system performance. The complexity of that approach led us to consider the impact of simplifying it. To the best of our knowledge, the most popular fusion rule is a simple weighted sum, which offers considerable improvement. Therefore, we search for a simple tree based on simple, commonly used operations that achieves competitive results.

In what follows, we present the approach in detail:
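To make the weighted-sum baseline mentioned above concrete, the following minimal sketch fuses the scores of two matchers. The function name, the toy scores and the weights are illustrative assumptions and are not taken from the experiments.

```python
import numpy as np

def weighted_sum_fusion(scores, weights):
    """Fuse per-matcher scores (n_claims x n_matchers) with a weighted sum."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    return scores @ weights

# Toy scores from two hypothetical matchers for three access claims.
toy_scores = np.array([[0.9, 0.7],
                       [0.2, 0.4],
                       [0.6, 0.1]])
fused = weighted_sum_fusion(toy_scores, weights=[0.75, 0.25])
```

Each row yields a single fused score that can then be thresholded to accept or reject the claim.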

Tree Generation.

We use a binary tree to encode our function. The tree structure is flexible and well suited to function generation, since it makes operator priority and evaluation easy to control. The leaves can be one of two kinds: (a) pseudo-variables containing the inputs of the problem, i.e., the list of scores of each modality; (b) randomly generated constants, added if the number of leaves exceeds the number of input scores.

We build the proposed tree according to the configurations listed in Table 1:

Table 1. Parameters of the generated trees.

According to the chosen configuration, we obtain a tree composed of internal nodes containing operations and the required number of leaves holding database scores and random constants. An example of a randomly generated tree is illustrated in Fig. 1.

Fig. 1. An example of a generated tree.
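The tree generation and evaluation described above can be sketched as follows. This is only an illustration under stated assumptions: the set of primitive operations (`OPS`), the nested-tuple encoding and the function names are hypothetical choices, since the actual parameters of Table 1 are not reproduced here.

```python
import random
import numpy as np

# Assumed primitive binary operations (the contents of Table 1 are not shown here).
OPS = {"sum": np.add, "prod": np.multiply, "min": np.minimum, "max": np.maximum}

def make_leaves(n_leaves, n_scores, rng):
    """One pseudo-variable per score, plus random constants for any extra leaves."""
    leaves = [("score", k) for k in range(n_scores)]
    leaves += [("const", rng.random()) for _ in range(n_leaves - n_scores)]
    rng.shuffle(leaves)
    return leaves

def random_tree(n_leaves, n_scores, rng):
    """Iteratively merge two random nodes under a random operation until one root remains."""
    nodes = make_leaves(n_leaves, n_scores, rng)
    while len(nodes) > 1:
        i, j = sorted(rng.sample(range(len(nodes)), 2))
        right = nodes.pop(j)  # pop the larger index first so index i stays valid
        left = nodes.pop(i)
        nodes.append((rng.choice(sorted(OPS)), left, right))
    return nodes[0]

def evaluate(tree, scores):
    """Evaluate the tree on an (n_claims x n_scores) array: one fused score per claim."""
    if tree[0] == "score":
        return scores[:, tree[1]]
    if tree[0] == "const":
        return np.full(scores.shape[0], tree[1])
    op, left, right = tree
    return OPS[op](evaluate(left, scores), evaluate(right, scores))

rng = random.Random(0)
tree = random_tree(n_leaves=5, n_scores=3, rng=rng)
scores = np.array([[0.9, 0.7, 0.8], [0.2, 0.4, 0.1]])
fused = evaluate(tree, scores)
```

Here five leaves are requested for three scores, so two random constants are added, mirroring rule (b) above.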

Genetic Algorithm.

We use a genetic algorithm to optimize the construction of the function subject to some parameters. The aim is to obtain a simple function (a simple tree structure) that offers optimal performance. Table 2 presents the configuration of the genetic algorithm.

Table 2. Overview of the parameters of the genetic algorithm.

Population.

Our initial population is composed of a set of trees, each forming a function that we intend to test. The number of trees is set to 100. We generate the trees randomly, as explained in the previous section. We run the genetic algorithm with a constant number of individuals to observe the evolution of the set and avoid population extinction.

Fitness Function.

The fitness function evaluates the generated tree (the score fusion function). We use it to sort the population at each step and to select the individuals to which mutation and crossover are applied.

The fitness function indicates the system performance after fusion for each applied tree. We choose the Half Total Error Rate (HTER) as the performance indicator, used together with the Expected Performance Curve (EPC) [6]. This curve gives the operating performance obtained from a systematic evaluation of all possible a priori chosen thresholds: the thresholds are selected on the training set and then applied to the test set to obtain the errors. Thus, we propose an optimization method that tends to minimize the resulting HTER vector by taking into account two criteria: the mean of the HTER vector and its range (the difference between its maximum and minimum values). The fitness function is as follows:

$$ \text{fitness} = \max\bigl(\text{mean}(\text{hter}),\; \max(\text{hter}) - \min(\text{hter})\bigr) $$
(1)
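As an illustration of Eq. (1), the sketch below builds an HTER vector in the spirit of the EPC and then computes the fitness. It assumes the usual EPC formulation, in which each threshold minimises alpha * FAR + (1 - alpha) * FRR on the development set and the HTER is then measured on the evaluation set. The function names and the grid of alpha values are illustrative assumptions.

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """FAR and FRR when a claim is accepted if its score is >= threshold."""
    far = np.mean(np.asarray(impostor) >= threshold)
    frr = np.mean(np.asarray(genuine) < threshold)
    return far, frr

def epc_hter(dev_gen, dev_imp, eva_gen, eva_imp, alphas):
    """HTER on the evaluation set for thresholds chosen a priori on the development set."""
    candidates = np.unique(np.concatenate([dev_gen, dev_imp]))
    hter = []
    for alpha in alphas:
        costs = []
        for t in candidates:
            far, frr = far_frr(dev_gen, dev_imp, t)
            costs.append(alpha * far + (1 - alpha) * frr)
        best_t = candidates[int(np.argmin(costs))]
        far, frr = far_frr(eva_gen, eva_imp, best_t)
        hter.append((far + frr) / 2)
    return np.array(hter)

def fitness(hter):
    """Eq. (1): the larger of the mean HTER and the HTER range."""
    hter = np.asarray(hter, dtype=float)
    return max(hter.mean(), hter.max() - hter.min())
```

Taking the maximum of the two terms penalises a tree whose HTER is low on average but varies strongly across operating points.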

Genetic Operations.

A genetic algorithm is based on two main operations: mutation and crossover. They should allow as much of the configuration space as possible to be explored. Both operations depend entirely on the selection scheme employed. Note that the number of individuals to which the operations are applied is generated randomly according to the fitness and the selection procedure.

  • Selection: our selection scheme is applied to the population to filter the individuals that participate in mutation and/or crossover. We use roulette wheel selection on the sorted population to obtain the candidate individuals for crossover. These individuals are added to the best individuals, i.e., those with a fitness value below 0.05. The remaining individuals are used in mutation to evolve the population and ensure large diversity.

  • Mutation: we apply mutation by randomly selecting a number of trees and randomly changing a sub-list of their operations (according to the mutation ratio). We maintain the tree structure so as to explore trees of the same configuration, since the initial population is random, with different depths (within the limits) and different numbers of nodes. Furthermore, we apply permutation within a tree by randomly swapping the positions of a selected pair of nodes. At each step, we alternate between adding a new operation and internal permutation with a probability of 0.5. Mutation involves the whole population except the best selected individuals, which preserves the best population for crossover. The proportion of the best population discarded from mutation is drawn as a probability that depends on the best fitness, with an upper limit of 50 %; this probability is therefore variable.

  • Crossover: we apply crossover to pairs of trees selected randomly from the defined list of trees. We select one tree structure and copy the operations of the other tree into it according to the crossover ratio.
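The mutation and crossover operators described above can be sketched on a simplified encoding in which each tree is reduced to the ordered list of operations held by its internal nodes, the structure itself being left untouched. This encoding, the operation names and the function names are assumptions made for illustration. In particular, "adding a new operation" is interpreted here as overwriting an existing one, and permutation as swapping the operations of two nodes.

```python
import random

OPS = ["sum", "prod", "min", "max"]  # assumed primitive operations

def mutate_ops(ops, ratio, rng):
    """Overwrite a random sub-list of operations; the list length (structure) is kept."""
    child = list(ops)
    n_changes = min(len(child), max(1, round(ratio * len(child))))
    for idx in rng.sample(range(len(child)), n_changes):
        child[idx] = rng.choice(OPS)
    return child

def permute_ops(ops, rng):
    """Swap the operations held by two randomly chosen internal nodes."""
    child = list(ops)
    if len(child) >= 2:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child

def mutate(ops, ratio, rng, p_permute=0.5):
    """Alternate between permutation and operation change with probability 0.5."""
    if rng.random() < p_permute:
        return permute_ops(ops, rng)
    return mutate_ops(ops, ratio, rng)

def crossover(receiver, donor, ratio, rng):
    """Keep the receiver's structure and copy a share of the donor's operations into it."""
    child = list(receiver)
    n_shared = min(len(receiver), len(donor))
    n_copy = round(ratio * n_shared)
    for idx in rng.sample(range(n_shared), n_copy):
        child[idx] = donor[idx]
    return child
```

Because every operator returns a list of the same length as its first argument, the offspring always fits the structure of the tree it was derived from.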

The designed genetic algorithm is composed of the following steps:

  1. An initial population is randomly generated. The trees are built using an iterative procedure.

  2. The following steps are repeated until the fitness function reaches the defined best value or the maximum number of generations is reached:

    (a) Compute the fitness measure of each tree, i.e., the performance of the hybrid multibiometric system. We keep the population size constant by selecting only the 100 best trees, which take part in the following steps of each iteration.

    (b) Select trees, with a probability based on their fitness, to which the genetic operations are applied.

    (c) Create a new generation of trees by applying the following genetic operations to the previously selected ones:

      • Reproduction: the individual is copied to the new population.

      • Crossover: a new offspring tree is created by recombining randomly chosen parts from two selected trees.

      • Mutation: a new offspring tree is created by randomly mutating a number of nodes of the selected tree.

The single best tree of the whole population is the chosen one.
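The steps above can be sketched as a generic loop. This is a simplified illustration, not the exact implementation: the inverse-fitness weighting of the roulette wheel, the `stop_at` parameter, the number of roulette picks and the toy individuals (lists of numbers instead of trees) are all assumptions introduced so that the example is self-contained.

```python
import random

def roulette_select(population, fitnesses, k, rng):
    """Roulette wheel for a minimised fitness: a lower error gets a larger slice."""
    weights = [1.0 / (f + 1e-9) for f in fitnesses]
    return rng.choices(population, weights=weights, k=k)

def run_ga(population, fitness_fn, crossover_fn, mutate_fn, rng,
           pop_size=100, max_generations=50, best_threshold=0.05, stop_at=0.0):
    history = []
    for _ in range(max_generations):
        # (a) evaluate every individual and keep the pop_size best (lower is better)
        ranked = sorted(((fitness_fn(ind), ind) for ind in population),
                        key=lambda pair: pair[0])[:pop_size]
        fits = [f for f, _ in ranked]
        trees = [ind for _, ind in ranked]
        history.append(fits[0])
        if fits[0] <= stop_at:
            break
        # (b) the best list plus roulette-wheel picks become crossover candidates
        best = [t for f, t in ranked if f < best_threshold]
        rest = [t for f, t in ranked if f >= best_threshold]
        n_picks = rng.randint(2, max(2, pop_size // 2))
        parents = best + roulette_select(trees, fits, n_picks, rng)
        # (c) reproduction (trees), crossover (offspring) and mutation (mutants)
        offspring = [crossover_fn(*rng.sample(parents, 2), rng)
                     for _ in range(len(parents) // 2)]
        mutants = [mutate_fn(t, rng) for t in rest]
        population = trees + offspring + mutants
    best_fit, best_tree = min(((fitness_fn(ind), ind) for ind in population),
                              key=lambda pair: pair[0])
    return best_tree, best_fit, history

# Toy problem standing in for tree evaluation: drive a vector toward zero.
rng = random.Random(0)
toy_fitness = lambda ind: sum(abs(x) for x in ind) / len(ind)
toy_crossover = lambda a, b, rng: [rng.choice(pair) for pair in zip(a, b)]
toy_mutate = lambda ind, rng: [x + rng.gauss(0, 0.1) for x in ind]
start = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(100)]
best_tree, best_fit, history = run_ga(start, toy_fitness, toy_crossover, toy_mutate, rng)
```

Since the surviving individuals are copied unchanged into the next generation, the best fitness recorded in `history` never gets worse from one generation to the next.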

2.2 Biometric Score Level Database

We conduct our experiments on the XM2VTS score database [15]. This database has many advantages and is, to the best of our knowledge, a reference for reliable testing on score benchmarks, since it:

  • Provides a database of scores taken from experiments carried out on the XM2VTS face and speaker verification database.

  • Proposes several fusion protocols.

  • Provides tools to evaluate fusion performance so we can compare different approaches in a uniform way.

2.3 Lausanne Protocol

The Lausanne protocol is the evaluation protocol defined for the XM2VTS database. The score database is built following the Lausanne Protocols I and II (LP1 and LP2), a published evaluation proposal. LP1 has 8 baseline systems, while LP2 has 5 baseline systems, as shown in Table 3.

Table 3. Description of XM2VTS and Lausanne Protocol.

The database provides a plot built upon this protocol, called the Expected Performance Curve (EPC) [6]. This curve allows a realistic comparison between different models. Moreover, the use of this curve leads to a better performance analysis. The curve offers several single measures that reflect a realistic performance comparison for a reachable operating point of the system.

3 Experiments and Results

We conduct the experiments on the XM2VTS database following the Lausanne protocol. The database includes a development set, used to configure system parameters and compute reference fused scores; these are compared with the fusion results obtained on the evaluation set to compute system performance.

The proposed genetic algorithm aims to improve the fusion function over 50 generations. We start from a randomly generated population, as illustrated in Fig. 2. As we apply mutation and crossover to this population, we produce different trees, which we evaluate to obtain the multibiometric system performance, represented by the fitness value.

Fig. 2. Initial population generated randomly that includes 100 individuals.

Figure 3 shows statistics on the evolution of the fitness over the GA generations. We compute the maximum and minimum values to delimit the upper and lower bounds, while the mean and standard deviation describe the global fitness of each population. We can see that the genetic algorithm reduces the gap between the max and mean values. This means that the population converges toward the best performance several times, even before the last generation. However, the convergence is not systematic. The peaks in the max value indicate that bad solutions are explored on the way to the best ones. The increases in the std value suggest that the search space is being explored, since bad individuals are generated in order to reach better performance, and this happens several times (once per peak of the std curve).

Fig. 3. Fitness value (min, max, mean and std) over 50 generations.

Furthermore, the implemented genetic algorithm aims to reach an appropriate system performance (fitness below 0.05). This upper bound is fixed to match the performance of other systems. Figure 4 shows the evolution of our population over the generations in terms of the best list ratio. We can see that the population evolves toward the desired fitness, which means that, by applying crossover to these individuals, we explore solutions that respect standard performance levels. However, we keep a proportion of the population for mutation in order to extend the exploration and escape local minima.

Fig. 4. Evolution of the best list population ratio during generations.
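The statistics plotted in Figs. 3 and 4 can be computed per generation as in the following sketch. The function name and the input layout (one list of fitness values per generation) are assumptions; only the 0.05 bound comes from the text.

```python
import numpy as np

def generation_stats(fitness_per_generation, best_threshold=0.05):
    """Min, max, mean, std and best-list ratio of the fitness for each generation."""
    rows = []
    for fits in fitness_per_generation:
        fits = np.asarray(fits, dtype=float)
        rows.append({
            "min": float(fits.min()),
            "max": float(fits.max()),
            "mean": float(fits.mean()),
            "std": float(fits.std()),
            # share of the population whose fitness is below the fixed bound
            "best_ratio": float(np.mean(fits < best_threshold)),
        })
    return rows
```

A rising `best_ratio` together with a shrinking gap between `max` and `mean` corresponds to the convergence behaviour discussed above.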

Figure 5 shows the optimization of the mean fitness during the mutation and crossover operations. We start the experiments with a poor performance value, which we tend to bring below the fixed bound (0.05).

Fig. 5. Fitness mean evolution with cumulative mutation and crossover during generations.

After all generations, the best performance reached is given by the EPC curve in Fig. 6. We compare the new fusion function, applied to all available sources, with the primitive operations proposed with the XM2VTS database and implemented over different configurations of the baseline systems. We apply the HTER significance test to compare our function with the mean fusion. The test is significant, which indicates that our function outperforms the other fusion rules.

Fig. 6. Expected Performance Curve (EPC) and HTER significance test.
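For readers unfamiliar with the HTER significance test, the sketch below shows one commonly used form, a z-statistic on the difference between two HTER values under a normal approximation. It is an assumption that the test applied here follows exactly this form; the function names and arguments are illustrative.

```python
import math

def hter_z_statistic(far_a, frr_a, far_b, frr_b, n_impostor, n_genuine):
    """z-statistic for HTER_A - HTER_B (assumes at least one error rate is not 0 or 1)."""
    hter_a = (far_a + frr_a) / 2
    hter_b = (far_b + frr_b) / 2
    variance = ((far_a * (1 - far_a) + far_b * (1 - far_b)) / (4 * n_impostor)
                + (frr_a * (1 - frr_a) + frr_b * (1 - frr_b)) / (4 * n_genuine))
    return (hter_a - hter_b) / math.sqrt(variance)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))
```

With system A as the mean fusion and system B as the evolved function, `normal_cdf(z)` close to 1 would indicate that B has a significantly lower HTER than A.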

In this paper, we propose a fusion method based on tree generation and evaluation. Our proposal simplifies the tree proposed in [8] and applies elementary genetic operations. The main goal is to improve on primitive operations by building a simpler tree that supports the eight inputs of the chosen database. Table 4 compares the performance reached with existing contributions in the literature. Our method outperforms the proposal of [14], whose experiments are conducted on the same database. In addition, it performs similarly to [8], which was evaluated on a similar database built upon the same modalities.

Table 4. Comparison with another proposed system fusion on the same database, and the complex tree tested on a database of the same modalities.

4 Conclusion and Perspectives

In this paper, we propose a fusion function built with a simplified tree structure. The function includes primitive fusion rules. We implement a Genetic Algorithm for tree retrieval. The use of evolutionary algorithms in fusion rule selection is a promising field that may help improve fusion performance. The development and evaluation are carried out using the XM2VTS database and the Lausanne Protocol, which supports the reliability of the results. Moreover, a significance test confirms the performance enhancement over primitive fusion rules. However, the presented work must still be tested on other databases. Other primitive or complex fusion rules can also be added to test other tree configurations.