Keywords

1 Introduction

Genetic algorithm (GA) is population based search and optimization algorithm proposed by Holland [1]. Reproduction operators such as crossover and mutation play an important role in GA’s performance and maintain diversity in the population—helps in achieving the global optima. There exists various selection techniques proposed includes roulette wheel, rank based, tournament, steady state, Boltzmann and elitism. Each of these selection techniques has their own ways of selection of populations at the initial stage and after each iteration (selection of population for next generation). The selection of population is known as selection pressure—if not selected intelligently leads to slow convergence rate and premature convergence [2].

Researchers studied the performance of GA via different selection techniques. Typically, GA’s performance is evaluated using two factors: convergence rate and the number of generations required to reach to optimal solution. In [3] results of GA was compared for proportional roulette wheel and rank-based roulette wheel selection techniques conclude that rank-based method outperformed proportional roulette wheel considering total number of generations in reaching to optima with some observations made outlined as: rank-based selection is faster, more robust and certainty towards the optima. The comparison of proportional roulette wheel with tournament selection was presented in [4] explains the superiority of tournament selection over proportional roulette wheel. Two versions of rank-based selection probabilities, namely, linear ranking and exponential ranking examined against tournament selection considering the convergence time of GA—concluded that tournament selection is a better choice over rank-based selection, since repeated tournament selection is much faster than sorting the population and assigning rank-based probabilities [4]. Three different GA’s selection methods: deterministic, tournament and roulette wheel were applied to inspect the performance of PCB inspection system—discovered that deterministic method needs lowest number of generation in reaching to the highest fitness [5]. A new selection approach known as sexual selection was proposed in [6]—was compared with most commonly used selection strategies—results that sexual selection either outperformed or performed better than roulette wheel selection and tournament selection when no fitness scaling was applied on the more difficult test cases. Goldberg and Deb [7] proposed a comprehensive studies four (proportional, ranking, tournament steady state) selection methods considering the solutions of differential equations. The focus of [7] was towards expected fitness ratio and convergence time—concluded that ranking and tournament selection outperformed proportional selection of maintaining steady pressure towards convergence. The study of [7] was extended considering linear ranking selection and stochastic binary selection leads to a conclusion that both have identical expectations—but they recommendation was drawn for binary tournament selection due to the efficiency in time complexity. Although significant works are conducted showing the importance of selection methods and their comparison but none of the research shows the statistical comparisons to represent the performance significance of various selection techniques. This paper presents the comparison of three selection techniques: tournament selection, roulette wheel selection and ranking selection. The domain of inquiry is travelling salesman problem (TSP)—a combinatorial problem implemented using GA considering all three selection methods. Experiments are conducted and results are collected for the analysis purpose. Statistical tests such as f-test and posthoc tests are conducted to examine the performance significance of selection methods considered.

The rest of the paper is organized as: Sect. 2 represent the GA applied for TSP. The importance of selection techniques with the discussion of three selection methods are reported in Sect. 3. The discussion of experiments, parameter tuning, results, statistical tests and discussion are drawn in Sect. 4. Lastly, conclusion is of the study in given in Sect. 5.

2 Genetic Algorithm and Travelling Salesman Problem

This section provides the general insight of GA for solving TSP. GA is an optimization method based on Darwinian principle of “survival of the fittest” proposed by Holland [1]. It employs stochastic approach to randomly search for optimum solution for the specified problem. Each individual also referred as chromosome—a member of population represents a potential solution. GA supports a number of possible chromosome representations. Fitness is the measure’s the quality of solution in GA. GA supports two most important operators: crossover and mutation—helpful in maintaining the diversity of the population and guide the searching towards the global optima. TSP is finding a Hamiltonian cycle with minimum cost [8]. There exist a number of cities present in TSP, where each pair of cities has a corresponding distance. The goal is to visit all the cities with total distance travelled should be minimized. The procedure for solving TSP using GA is represented in Fig. 1.

Fig. 1
figure 1

Genetic algorithm for solving TSP

The GA process begins by initializing important parameters such as maximum number of generations (GEN_MAX), population size (PS), crossover probability (CP) and mutation probability (MP). In TSP location of cities plays an important role, therefore considered. An initial random population is created and evaluation of each chromosome is performed. The population is then checked for termination (maximum number of generation or optimum results) condition—if achieved then displays the results and stop otherwise transformed into a new population or next generation applying three GA operators: selection, crossover and mutation. Selection operation is performed to select couple of parents in order to procreate offspring by crossover and mutation. The new offspring contains a higher proportion of characteristics produced by the ‘good’ chromosome of previous generation—helps in spreading the good characteristics over the population—which then mixed with other good characteristics in iterative manner. This process continues until GA reach to best solution present in the solution space.

3 Selection Strategies

The primary objective of selection technique is to pick the chromosomes in the current generation—will participate to reproduce the offspring with hopes that next generation chromosome will have higher fitness. Hence, selection plays a significant role in GA performance—it is important to formulate selection operator that should ensures better member with higher fitness—have a greater probability to be selected for mating, but in some situation worse member still have a small probability to be selected—results in premature convergence or converge at local optima. There exists various selection techniques each has different way of calculating selection probability. The detailing of various selection methods can be seen in [711]. In this section, we bring to light an overview of three important selection methods: tournament selection, roulette wheel selection and ranking selection.

Tournament selection is the most popular method in GA due to its efficiency and simple nature [7]. A total \( n \)-individuals are picked randomly from the whole population—compete against each other in a tournament referred as tournament size commonly set to 2. The highest fitness individual wins and will be selected as one of the next generation population. It gives a fair chance to each individual to be selected, hence maintain diversity.

In roulette wheel selection method individual get selected based on the certain probability directly proportional to the fitness value. Obviously, the individual have higher fitness has more probability of being selected, where the fittest individual occupies the largest segment within the roulette wheel.

In rank based selection the probability of an individual being selected is based on the fitness rank relative to the whole population. It first sort the individual chromosome in the population based on their fitness and then computes selection probabilities as per the rank rather than individual fitness value. It utilizes a function that maps the indices of individuals in the sorted list of the selection probabilities.

4 Simulation Model

Experiments are conducted to test the performance significance of selection strategies. Net Beans IDE 7.0.1, Intel Core TM2 processor 2.8 GHz with 2 GB RAM is used. GA’s performance is critical to the parameters guides the overall search process, therefore extensive tuning is performed employing orthogonal array and Taguchi signal to noise ratio. For experiment design five factor at two levels were performed give as: crossover rate (CR) = [0.5, 0.8], mutation rate (MR) = [0.3, 0.6], population size (PS) = [120, 180], number of cities (CITIES) = [40, 60] and maximum generations (MGEN) = [100, 200]—the following setting produced the best results [CR: MR: PS: CITIES: MGEN] = [0.8: 0.6: 60: 100: 100].

Figure 2a represents the comparison chart for each selection method implemented. Having seen the Fig. 2a, it is very much clear that average distance for tournament selection is higher indicates the worst performance whereas ranking selection produces least average distance—but this much information is not sufficient to reach to some conclusion. Hence, statistical tests are conducted.

Fig. 2
figure 2

a Average distance versus generations chart for ranking, roulette wheel and tournament selection methods. b Profile plot for estimated marginal mean for each selection methods implemented

F-test is conducted considering the hypotheses “there is no significant difference in the mean of samples at the 5 % level of confidence”. Table 1 shows the descriptive analysis whereas Fig. 2b graphically represents the means of three selection methods—X-axis shows the selection techniques and Y-axis presents the estimated marginal means. The result of F-test is reported in form of ANOVA table (see Table 2) indicates that significance value (p = 0.000) comparing the group level is less than 0.05—leads to rejection of the hypothesis. The results received from F-test do not indicate which selection method is responsible for the difference.

Table 1 Descriptive analysis
Table 2 Results received after ANOVA test
Table 3 Multiple comparison test: Posthoc-LSD test Tukey HSD test

Therefore, posthoc tests (compare individual approach results with each other) are conducted. LSD and TukeyHSD tests are conducted leads to a conclusion that tournament selection is mainly responsible for the difference because the average distance for TSP received by employing tournament selection is higher than other two selection methods—is verified by applying homogeneity test and results of homogeneous subset test is reported in Table 4. It can be seen that ranking selection and roulette wheel selection belongs to the same set at 0.436 level of significance whereas tournament selection is in separate group at 1.000 level of confidence. From Table 3 one can see that the asterisk (*) mark given next the mean difference indicates that difference is significant. LSD test produces results accurately—is very sensitive to violation to the assumptions of ANOVA therefore most likely to lead to Type-I error i.e. rejecting null hypothesis when it is true. To alleviate this problem, TukeyHSD test is conducted.

Table 4 Homogenous subset test

5 Conclusions

This paper analyzes the performance of three selection strategies of GA. A case of TSP was considered and implemented using GA. Extensive parameters tuning were done before performing the final experiments. This paper not only report the comparison of results obtained experimentally but also report the results of various statistical tests such as F-test and posthoc tests. A comprehensive discussion is presented on the statistical test conclude that rank based selection outperformed other two selection methods in terms of quality of results and convergence time. The performance of tournament selection is found worst whereas roulette wheel selection shows average performance. We believe that results and discussion reported in this paper will be helpful in selecting the appropriate selection method in conducting the experiments.