1 Introduction

Metaheuristic algorithms are designed to obtain near-optimal solutions to difficult, computationally intractable problems, and a large number of them are available in the literature. A metaheuristic inspired by nature or a natural phenomenon is called a nature-inspired algorithm, and among these, swarm intelligence (SI) based algorithms have received particular attention from researchers [1, 22]. These algorithms are fast, easy to understand and, in many situations, produce better solutions than other classes of metaheuristics. SI techniques are based on the collective behaviour of swarms of bees, schools of fish, and colonies of insects as they search for food, communicate with each other, and socialise in their colonies. SI models rest on self-organisation, decentralisation, communication, and cooperation between the individuals of the group: the interactions between individuals are very simple, yet a complex global behaviour emerges from them, and this emergence is the core of SI techniques [11]. In this paper, only recently developed SI metaheuristics that have produced highly promising results on several optimisation problems are considered: the cuckoo search algorithm (CSA) and the firefly algorithm (FA). Although SI-based techniques have primarily been used, and found very efficient, for traditional continuous optimisation problems, their growth in the combinatorial optimisation field has been limited. Combinatorial problems are mostly NP-hard, and their problem structure reveals very little information, so adapting these metaheuristic strategies to a combinatorial solution space is a challenging task. This issue has motivated us to concentrate on this research problem. Here we present solution procedures for the single and multidimensional knapsack problems using the established properties of CSA and FA, thereby contributing to the existing literature. The knapsack problem is one of the most intensively studied integer programming problems because of its simple structure, which allows the exploitation of many combinatorial properties; moreover, more complex optimisation problems can be solved through a series of knapsack-type sub-problems. The most common form of the zero-one knapsack problem (KP) is shown below:

$$ \begin{array}{ll} Maximize\;\; & f(x_{1},x_{2},...,x_{n}) = \sum\limits_{j=1}^{n} c_{j} x_{j}\\ Subject\;\; to \;\;& \sum\limits_{j=1}^{n} a_{j} x_{j} \leq b,\\ & x_{j} \in \{0,1\} ,\;\;\; j=1,2,..,n\\ & c_{j} > 0, \;\; a_{j} \geq 0, \;\; b > 0. \end{array} $$
(1)

If there is more than one knapsack constraint, the problem is called the multidimensional (multiple-constraint) knapsack problem (MKP), and its structure is given below.

$$ \begin{array}{ll} Maximize\;\; & f(x_{1},x_{2},...,x_{n}) = \sum\limits_{j=1}^{n} c_{j} x_{j}\\ Subject\;\; to \;\;& \sum\limits_{j=1}^{n} a_{ij} x_{j} \leq b_{i} ,\;\;\; i=1,2,...,m\\ & x_{j} \in \{0,1\} ,\;\;\; j=1,2,..,n\\ & c_{j} > 0, \;\; a_{ij} \geq 0, \;\; b_{i} > 0. \end{array} $$
(2)

A set of n items with profits \(c_{j} > 0\) and m resources with capacities \(b_{i} > 0\) are given. Each item j consumes an amount \(a_{ij} \geq 0\) of each resource i. The 0-1 decision variables \(x_{j}\) indicate which items are selected. The objective function \(f(x_{1}, x_{2},\ldots, x_{n})\) is to be maximised subject to the constraints. Research has shown the wide applicability of knapsack problems, for example in capital budgeting, allocating processors and databases in a distributed computer system, project selection, cargo loading, and cutting stock problems. KP is an NP-complete problem, while MKP is a strongly NP-hard combinatorial optimisation problem.

In recent times, several metaheuristic algorithms have been applied to solve the 01 KP, such as the genetic algorithm (GA) [67], the schema-guiding evolutionary algorithm (SGEA) [39], the quantum swarm evolutionary algorithm [59], ant colony optimization (ACO) [51], binary particle swarm optimization based on a multi-mutation strategy (MMBPSO) [38], and the harmony search algorithm [69]. Similarly, many heuristics and metaheuristics have been employed for solving 01 MKP instances, for example tabu search (TS) [4], simulated annealing (SA) [20], GA [15], ACO [32], and particle swarm optimization (PSO) [54].

CSA combines the breeding behaviour of cuckoos with Lévy flights. It was first proposed by Yang and Deb in 2009 [64]. CSA has been used to solve problems in diverse areas, including mechanical engineering [66], automobile engineering [18], and image processing [2]. It has also been applied to combinatorial problems such as scheduling [13] and the traveling salesman problem (TSP) [44]. It has been shown that CSA performs better than the artificial bee colony algorithm on numerical optimisation problems [17], and CSA has been reported to converge faster than differential evolution [56]. Thus, CSA is a promising metaheuristic for obtaining better performance on combinatorial optimisation problems.

FA is another recently developed SI method, inspired by the flashing lights of fireflies in nature [61]. Like CSA, FA has been applied to many NP-hard problems, such as the TSP [30, 44], the job shop scheduling problem (JSP) [35], and the quadratic assignment problem (QAP) [19]. FA has also been used in several real-world applications, such as MRI brain segmentation [3], gray-scale image watermarking [43], optimal sensor placement in structural health monitoring [68], and short-term load forecasting in electrical systems [31]. A comprehensive survey of FA can be found in [24].

Like CSA and FA, particle swarm optimisation is an SI-based metaheuristic; it was developed by Kennedy and Eberhart [34] and is inspired by the cooperation and communication within a swarm of birds. PSO has found applications in almost all types of optimisation problems, and several of its variants have been applied to knapsack-type problems, such as binary particle swarm optimisation (BPSO) [33], genotype-phenotype modified binary particle swarm optimisation (GPMBPSO) [37], and modified binary particle swarm optimisation (MBPSO) [5]. These variants are reported to be very efficient at solving knapsack instances. We therefore choose PSO as the benchmark algorithm and also introduce a hybrid PSO algorithm for comparing the experimental results.

As mentioned earlier, most metaheuristic strategies were developed for traditional optimisation problems in the continuous domain. Defining neighbourhood solutions in combinatorial space while maintaining the key characteristics of the metaheuristic, which govern the performance of the algorithm, is the central theme of this study. For CSA, different transformation techniques are used to represent the solution structure, and a combination of a local random walk and a global explorative random walk defines the neighbourhood solutions in combinatorial space. In FA, solutions are represented directly by a combinatorial structure, and a variable-distance move together with an opposition-based learning mechanism describes the neighbourhood solutions. The proposed algorithms are then applied to knapsack problem instances to validate the solution procedure and analyse their performance. The work presented in this paper continues and expands our initial studies [9, 10]: more detailed descriptions of the algorithm structure and the accompanying local search techniques are added, the solution procedure for MKP instances is introduced, and a detailed study of the repair operator and the parameter settings is given together with statistical tests for validation.

The paper is organized as follows. First, a general framework of the above-mentioned algorithms is discussed in Section 2, and then we focus on the implementation of these algorithms for knapsack problems in Sections 3 and 4. In Section 5, the computational results and the performance of these algorithms on knapsack problems are thoroughly investigated and discussed. Section 6 provides concluding remarks and directions for further study.

2 Background

The theoretical background required to develop the binary versions of CSA and FA for solving the knapsack problem is presented here.

2.1 Basic principle of cuckoo search algorithm

CSA is inspired by the reproduction strategy of cuckoos, which lay their eggs in the nests of other host birds. If the host bird discovers that the eggs are not its own, it either throws the foreign eggs away or abandons the nest and builds a new one elsewhere. The general system equation of this algorithm is based on a generalised random walk pattern, given by

$$ x_{i}^{(t+1)} = x_{i}^{(t)} + \alpha \otimes L(s,\lambda), $$
(3)

where \(x_{i}^{(t+1)}\) denotes the (t + 1)-th generation of the i-th cuckoo, \(x_{i}^{(t)}\) denotes the previous generation, \(\alpha > 0\) is a step size related to the problem under consideration, and ⊗ denotes entry-wise multiplication. A Lévy flight is essentially a random walk whose random step lengths are drawn from a Lévy distribution, given by

$$ L(s,\lambda) = \frac{\lambda {\Gamma}(\lambda) \sin(\pi \lambda/2)}{\pi} \times \frac{1}{s^{1+\lambda}}, $$
(4)

where Γ denotes the gamma function; the distribution has an infinite variance with an infinite mean. The direction of the walk is drawn from the standard normal distribution. A particular egg may be discovered by the host bird with probability \(p_{a}\); in that case, the solution is regenerated as

$$ X_{i}(new) = \left\{\begin{array}{ll} X_{i} + \mathbf{r} \otimes (X_{r_{1}}-X_{r_{2}}), & \text{if } u \leq p_{a}\\ X_{i}, &\text{otherwise}. \end{array}\right. $$
(5)

Here \(r_{1}\) and \(r_{2}\) are indices of randomly selected nests; \(\mathbf{r}\) and u are random numbers drawn from the uniform distribution. The basic procedure of CSA is shown in Algorithm 1.

Algorithm 1 Basic cuckoo search procedure
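A minimal Python sketch of this loop for a continuous maximisation problem is given below. The function and parameter names are ours, the 0.01 step scale is illustrative, and the Cauchy draw is only a heavy-tailed stand-in for the Lévy step generator detailed in Section 3.2.

```python
import numpy as np

def cuckoo_search(fitness, dim, n=25, pa=0.25, max_iter=100):
    """Minimal sketch of the basic CSA loop (continuous maximisation)."""
    nests = np.random.rand(n, dim)                 # initial host nests
    fit = np.array([fitness(x) for x in nests])
    for _ in range(max_iter):
        for i in range(n):
            # Global random walk, Eq. (3); a Cauchy draw stands in for the
            # Levy step generator described in Section 3.2.
            cand = nests[i] + 0.01 * np.random.standard_cauchy(dim)
            j = np.random.randint(n)               # random nest for comparison
            if fitness(cand) > fit[j]:
                nests[j], fit[j] = cand, fitness(cand)
        for i in range(n):
            # With probability pa the egg is discovered and replaced, Eq. (5).
            if np.random.rand() <= pa:
                r1, r2 = np.random.choice(n, 2, replace=False)
                nests[i] = nests[i] + np.random.rand(dim) * (nests[r1] - nests[r2])
                fit[i] = fitness(nests[i])
    best = int(np.argmax(fit))
    return nests[best], fit[best]
```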

2.2 Basic principle of firefly algorithm

In a typical FA, the brightness of a firefly indicates how good the objective value of the corresponding solution is, and brighter fireflies (solutions) attract the others. As the distance between fireflies increases, the attractiveness decreases exponentially because of light absorption by the medium. All fireflies are unisex, so one firefly is attracted to another regardless of sex [62].

  i) Distance between two fireflies: The distance between any two fireflies (i and j) at \(x_{i}\) and \(x_{j}\), respectively, can simply be evaluated by the Euclidean distance given in (6).

    $$ r_{ij} = ||x_{i}-x_{j}|| = \sqrt{\sum\limits_{k=1}^{D} (x_{ik}-x_{jk})^{2}}, $$
    (6)

    where D is the dimension.

  ii) Attractiveness: The attractiveness of a firefly at a distance r is given by (7).

    $$ \beta(r) = \beta_{0} e^{-\gamma r^{m}}, \quad m \geq 1. $$
    (7)

    Here γ is the light absorption coefficient; \(\beta_{0}\) is the attractiveness at distance r = 0, generally taken as one; and m defines the sharpness of the exponential curve, with m = 2 the usual choice. For a given length scale Γ of an optimisation problem, \({\Gamma}^{-m}\) is used as the initial value for γ.

  iii) Movement of firefly: The movement of a firefly i attracted to another, more attractive (brighter) firefly j is defined by (8).

    $$ x_{i} = x_{i} + \beta(r_{ij}) (x_{j}-x_{i}) + \alpha (\epsilon - 0.5), $$
    (8)

    where 𝜖 is a random number drawn from the uniform distribution; and α ∈ [0, 1] is the scaling factor of the randomness.

Pseudocode for FA is given in Algorithm 2.

Algorithm 2 Basic firefly algorithm
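As with CSA, a compact Python sketch of the continuous FA loop may be useful; the function name and parameter defaults below are illustrative only.

```python
import numpy as np

def firefly_search(fitness, dim, n=25, beta0=1.0, gamma=1.0, alpha=0.2,
                   max_iter=100):
    """Minimal sketch of the basic FA loop (continuous maximisation)."""
    x = np.random.rand(n, dim)
    light = np.array([fitness(p) for p in x])      # brightness = objective value
    for _ in range(max_iter):
        for i in range(n):
            for j in range(n):
                if light[j] > light[i]:            # i moves towards brighter j
                    r = np.linalg.norm(x[i] - x[j])           # Eq. (6)
                    beta = beta0 * np.exp(-gamma * r ** 2)    # Eq. (7), m = 2
                    eps = np.random.rand(dim)
                    x[i] = x[i] + beta * (x[j] - x[i]) + alpha * (eps - 0.5)  # Eq. (8)
                    light[i] = fitness(x[i])
    best = int(np.argmax(light))
    return x[best], light[best]
```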

3 Binary cuckoo search algorithm

CSA was originally designed for continuous problems, so it cannot be applied directly to the knapsack problem. For this reason we modified the original CSA procedure; the resulting binary cuckoo search algorithm (BCSA) is discussed in detail in this section.

3.1 Construction of individual nest

The knapsack problem is an integer programming problem in which each decision variable \(x_{j}\) takes one of two possible values, zero or one. An individual nest (solution) is therefore represented by an n-bit binary string, where n is the number of variables in the knapsack problem. A bit string \(\mathbf{x} \in \{0,1\}^{n}\) might represent an infeasible solution.

For the 01 KP, the fitness function of an individual nest is defined as

$$ f(\mathbf{x}) = \mathbf{c} \times \mathbf{x} -\rho(\mathbf{a} \times \mathbf{x} - b)^{2}, $$
(9)

where \(\rho = \max_{j=1,2,\ldots,n} \frac{c_{j}}{a_{j}}\), assuming \(a_{j} > 0\) in (1).

Similarly, for 01 MKP, we represent the fitness function as given in (10).

$$ f(x) = \sum\limits_{j=1}^{n} c_{j} x_{j} - s \times [max(\textbf{c})]^{2}, $$
(10)

where s represents the number of violated constraints and c is the profit vector in (2).
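The two fitness functions translate directly into Python. The sketch below assumes NumPy arrays; for (9) it also assumes the penalty applies only when the capacity is exceeded, which is our reading of the penalty term. The function names are ours.

```python
import numpy as np

def kp_fitness(x, c, a, b):
    """Penalised fitness for the 01 KP, Eq. (9); x is a 0/1 vector."""
    rho = np.max(c / a)                      # largest profit-to-weight ratio
    excess = max(float(a @ x - b), 0.0)      # penalise only capacity violation
    return float(c @ x) - rho * excess ** 2

def mkp_fitness(x, c, A, b):
    """Penalised fitness for the 01 MKP, Eq. (10); A is the m-by-n matrix."""
    s = int(np.sum(A @ x > b))               # number of violated constraints
    return float(c @ x) - s * float(np.max(c)) ** 2
```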

As in other metaheuristics, the initial nests are scattered randomly over the solution space. The stopping rule is based on four criteria: reaching the optimal solution, a maximum run time, a maximum number of iterations, and a predefined number of non-improving iterations. The algorithm stops as soon as any of these criteria is satisfied.

3.2 Evaluating new nest

New solutions are generated using Lévy flights in the original CSA. Here, a balanced combination of a local random walk and a global explorative random walk, controlled by the switching parameter \(p_{a}\), is used instead, as it performs better than a simple Lévy flight [21, 65]. The local random walk is defined by

$$ x_{i}^{(t+1)} = x_{i}^{(t)} + \alpha s \otimes H(p_{a} -\epsilon) \otimes \left( x_{j}^{(t)}-x_{k}^{(t)}\right), $$
(11)

where \(x_{j}^{(t)}\) and \(x_{k}^{(t)}\) are two different solutions selected by random permutation at generation t, H(u) is the Heaviside function, \(\epsilon \sim U(0, 1)\), s is the scaling factor, and α>0 is the step size.

The global random walk is the same as in (3). Here, the Lévy-distributed random numbers are generated using Mantegna's algorithm [42] for a symmetric Lévy stable distribution [63]. The random variable is generated by

$$ w_{j} = \frac{u_{j}}{|v_{j}|^{\frac{1}{\beta}}} \qquad \forall j, $$
(12)

where β = 1.50, \(u_{j}\) and \(v_{j}\) are drawn from \(N(0, \sigma^{2})\) and N(0, 1), respectively, and σ is defined by

$$ \sigma = \left( \frac{{\Gamma}(1+\beta) \times \sin(\pi \beta/2)}{{\Gamma}\left(\frac{1+\beta}{2}\right) \times \beta \times 2^{\frac{\beta-1}{2}}}\right)^{\frac{1}{\beta}}. $$
(13)
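A compact Python version of this generator, using only the standard-library gamma function and NumPy normals, may be written as follows (the function name is ours):

```python
import numpy as np
from math import gamma, sin, pi

def levy_steps(size, beta=1.5):
    """Levy-stable step lengths via Mantegna's algorithm, Eqs. (12)-(13)."""
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = np.random.normal(0.0, sigma, size)   # u_j ~ N(0, sigma^2)
    v = np.random.normal(0.0, 1.0, size)     # v_j ~ N(0, 1)
    return u / np.abs(v) ** (1 / beta)       # w_j = u_j / |v_j|^(1/beta)
```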

3.3 Binary variable handling

Three different approaches, each based on a different transformation function, are used here for handling binary variables. The general equation for both the local and the global walk is given by

$$ x_{i}^{(t+1)} = x_{i}^{(t)} + \delta_{i}^{(t)}, $$
(14)

where \(\delta _{i}^{(t)}\) is the step size of the local or global walk, given by (11) and (12), respectively. For a symmetric Lévy distribution, \(\delta _{i}^{(t)}\) can be positive or negative. We first consider \(\delta _{i}^{(t)}\) to be bounded, i.e. \(-\delta _{0} \leq \delta _{i}^{(t)} \leq \delta _{0}\), where \(\delta_{0}\) is the maximum allowable change. As \(x_{i}^{(t)}\) is either zero or one, we define \(x_{i}^{(t+1)}(new)\) as

$$ x_{i}^{(t+1)}(new) = \left\{\begin{array}{ll} 0 & \text{if} \quad x_{i}^{(t+1)} \leq 0,\\ \left[x_{i}^{(t+1)}\right] & \text{if} \quad 0 < x_{i}^{(t+1)} < 1,\\ 1 & \text{otherwise}. \end{array}\right. $$
(15)

where [·] denotes rounding to the nearest integer. In the second case, we consider that

$$ x_{i}^{(t+1)}(new) = \left\{\begin{array}{ll} 1 & \text{if} \quad u < sigm\left( x_{i}^{(t+1)}\right),\\ 0 & \text{otherwise}. \end{array}\right. $$
(16)

In the last case, we consider

$$ x_{i}^{(t+1)}(new) = \left\{\begin{array}{ll} 1 & \text{if} \quad u < \frac{x_{i}^{(t+1)}+\delta_{0}}{1+2\delta_{0}},\\ 0 & \text{otherwise}. \end{array}\right. $$
(17)

The term u is a uniform random number (\(u \sim U(0,1)\)), and sigm(z) is a sigmoid limiting transformation with an "S"-shaped curve, defined as \(sigm(z) = \frac{1}{1+\exp(-\gamma z)}\), where γ controls the steepness of the sigmoid function.
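The three rules (15)-(17) can be collected in one small helper; the scheme names used below are our own labels.

```python
import numpy as np

def binarise(x_new, delta0=1.0, gamma=2.0, scheme="sigmoid"):
    """Map the real-valued walk result x_new to a bit, Eqs. (15)-(17)."""
    u = np.random.rand()
    if scheme == "round":                    # Eq. (15): clip to [0, 1], round
        return int(round(min(max(x_new, 0.0), 1.0)))
    if scheme == "sigmoid":                  # Eq. (16): stochastic sigmoid rule
        return int(u < 1.0 / (1.0 + np.exp(-gamma * x_new)))
    return int(u < (x_new + delta0) / (1.0 + 2.0 * delta0))   # Eq. (17)
```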

3.4 Local search

After evaluating new nests, a stochastic hill climbing procedure is performed to find the best solution in the neighbourhood. In the maximisation case, a move is accepted only if the new solution is better than the current one. This process is repeated until no further improvement is possible or a predefined number of steps is reached. The pseudocode of stochastic hill climbing is given in Algorithm 3. To generate a new solution from the current one, we pick a random position in the nest and change its value. Stochastic hill climbing is a simple local search method for finding local maxima, and it generates solutions of high quality without introducing large computational effort.

Algorithm 3 Stochastic hill climbing
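A minimal Python sketch of this local search, assuming bit-flip neighbourhoods and a fixed step budget:

```python
import random

def stochastic_hill_climb(x, fitness, max_steps=100):
    """Flip one random bit at a time; keep the flip only if fitness improves."""
    x = list(x)
    best = fitness(x)
    for _ in range(max_steps):
        j = random.randrange(len(x))         # random position in the nest
        x[j] ^= 1                            # flip the bit
        cand = fitness(x)
        if cand > best:
            best = cand                      # accept the improving move
        else:
            x[j] ^= 1                        # otherwise undo the flip
    return x, best
```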

3.5 Constraint handling

When a binary string violates a constraint of the given problem, a repair algorithm is employed to turn the infeasible solution into a feasible one. This procedure depends on the problem type and the constraints.

For the 01 KP, the variables are sorted and renumbered in decreasing order of their profit-to-weight ratios. A greedy procedure then repeatedly deletes the last item, i.e. the one with the smallest ratio, until the solution becomes feasible.

In the case of the 01 MKP, at the initialisation step the variables are sorted and renumbered in decreasing order of their pseudo-utility, calculated by the surrogate duality approach introduced by Pirkul [46]. The general idea of this approach is described briefly below.

The surrogate relaxation problem of the 01 MKP can be defined as:

$$ \begin{array}{ll} Maximize & \sum\limits_{j=1}^{n} c_{j} x_{j}\\ Subject\;\;to & \sum\limits_{j=1}^{n} \left( \sum\limits_{i=1}^{m} \omega_{i} a_{ij}\right) x_{j} \leq \sum\limits_{i=1}^{m} \omega_{i} b_{i}, \\ & x_{j} \in \{0,1\},\quad j=1,...,n \end{array} $$
(18)

where \(\omega = \{\omega_{1}, \omega_{2},\ldots, \omega_{m}\}\) is a set of surrogate multipliers (or weights), all positive real numbers. These weights are obtained by the simple method suggested in [46]: the LP relaxation of the original 01 MKP is solved, and the values of the dual variables are taken as the weights. The weight \(\omega_{i}\) can be interpreted as the shadow price of the i-th constraint in the LP relaxation of the 01 MKP. The pseudo-utility ratio of each variable, based on the surrogate constraint coefficients, is defined as

$$ u_{j} = \frac{c_{j}}{{\sum}_{i=1}^{m} \omega_{i} a_{ij}}. $$
(19)

The repair operator is inspired by the idea of Chu and Beasley [15] and consists of two phases. The first phase (DROP) examines each bit of the solution string in increasing order of \(u_{j}\) and changes a bit from one to zero whenever feasibility is violated. The second phase (ADD) reverses the process, examining each bit in decreasing order of \(u_{j}\) and changing a bit from zero to one as long as feasibility is not violated. The pseudo-code of the repair operator is given in Algorithm 4.

Algorithm 4 Repair operator (DROP and ADD phases)
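A Python sketch of this two-phase repair follows, assuming NumPy arrays for the profit vector c, the m-by-n constraint matrix A, and the surrogate weights ω; the function name is ours.

```python
import numpy as np

def repair(x, c, A, b, omega):
    """Two-phase DROP/ADD repair in the spirit of Chu and Beasley [15]."""
    x = np.array(x)
    u = c / (omega @ A)                      # pseudo-utility ratios, Eq. (19)
    order = np.argsort(u)                    # increasing pseudo-utility
    for j in order:                          # DROP: remove low-utility items
        if np.all(A @ x <= b):               # stop once every constraint holds
            break
        if x[j] == 1:
            x[j] = 0
    for j in order[::-1]:                    # ADD: re-insert high-utility items
        if x[j] == 0 and np.all(A @ x + A[:, j] <= b):
            x[j] = 1
    return x
```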

4 Binary firefly algorithm

Several studies on FA for continuous optimisation problems are available in the literature, but only a few address integer programming. In a recent work, Palit et al. [45] used a binary firefly algorithm for cryptanalysis of the knapsack cipher, so as to deduce the meaning of an encrypted message. A binary adaptive firefly algorithm was used for fault identification in parallel and distributed systems by Falcon et al. [23]. Chandrasekaran and Simon [12] used a binary real-coded firefly algorithm for solving the network- and reliability-constrained unit commitment problem. All of these works use the sigmoid function to transform continuous variables into binary variables, and the Euclidean distance for the light intensity measurement. Recently, Baykasoglu and Ozsoydan [6] solved the dynamic multidimensional knapsack problem using FA; they used a priority-based encoding technique for mapping continuous solutions to the combinatorial domain and replaced the pairwise comparison with a specialised comparison using a tuning parameter that depends on the iteration number. Here we use a different approach for solving knapsack problems, similar to the approach used in [47] for the facility location problem: it does not involve any encoding technique and uses the Hamming distance to measure light intensity.

4.1 Construction of individual firefly

An individual firefly (solution) is represented by an n-bit binary string, where n is the number of variables in the knapsack problem. A bit string \(\mathbf{x} \in \{0,1\}^{n}\) may represent an infeasible solution, so the fitness function of an individual firefly for the 01 KP is defined as before, by (9); similarly, for the 01 MKP the fitness function is given by (10). The initial population is generated randomly across the entire solution space to achieve sufficient diversification. The stopping rule is based on the same four criteria as before: reaching the optimal solution, a maximum run time, a maximum number of iterations, or a predefined number of non-improving iterations. The algorithm stops as soon as any of these criteria is satisfied.

4.2 Distance of two fireflies

The distance between any two fireflies i and j is defined by the Hamming distance, i.e., the number of positions in which their binary strings differ. The difference between the objective function values is taken to be directly proportional to the Hamming distance.

4.3 Attractiveness of fireflies

A firefly i is attracted to another firefly j if the objective function value of i is smaller than that of j. The attractiveness between fireflies i and j is given by

$$ \beta(r_{ij}) = \beta_{0} e^{-\gamma r_{ij}^{2}}, $$
(20)

where \(r_{ij}\) is the Hamming distance between fireflies i and j. The theoretical range of the light absorption coefficient is \(\gamma \in [0, \infty)\); we consider γ in the range [0.01, 0.20].

4.4 Movement of fireflies

In the continuous problem, the movement is defined by (8), which in the discrete case can be represented by (21a)-(21b), as in [19]:

$$\begin{array}{@{}rcl@{}} x_{i} &=& (1-\beta(r_{ij}))x_{i} + \beta(r_{ij}) x_{j}, \end{array} $$
(21a)
$$\begin{array}{@{}rcl@{}} x_{i} &=& x_{i} + \alpha(\epsilon -0.5). \end{array} $$
(21b)

A firefly moves in the solution space in two steps: first, \(\beta(r_{ij})\) determines the motion of firefly i towards firefly j, as given by (21a); then α determines the random movement of firefly i, as given by (21b). It is important to note that these two phases are not interchangeable.

The first step is known as the β-step; from (21a) it is clear that \(x_{i}\) becomes equal to \(x_{j}\) with probability \(\beta(r_{ij})\) and remains unchanged with probability \(1-\beta(r_{ij})\). The β-step procedure is shown in Algorithm 5.

Algorithm 5 β-step procedure
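A bitwise Python reading of the β-step is sketched below, under the assumption (ours) that each differing bit of \(x_{i}\) independently copies the corresponding bit of \(x_{j}\) with probability \(\beta(r_{ij})\):

```python
import numpy as np

def beta_step(x_i, x_j, beta0=1.0, gamma=0.1):
    """Each differing bit of x_i copies x_j with probability beta(r_ij)."""
    x_i, x_j = np.array(x_i), np.array(x_j)
    r = int(np.sum(x_i != x_j))              # Hamming distance between fireflies
    beta = beta0 * np.exp(-gamma * r ** 2)   # attractiveness, Eq. (20)
    mask = (x_i != x_j) & (np.random.rand(len(x_i)) < beta)
    x_i[mask] = x_j[mask]                    # move towards the brighter firefly
    return x_i
```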

For the next step, the α value is chosen in the range [0,1] and randomises the movement of the firefly in continuous optimisation problems. If α is high, \(x_{i}\) takes a large step and the new solution lies far from the current one; for a low α, the new solution stays within a small neighbourhood of the current solution. The α value thus governs the balance between global and local search, and obtaining the right level of α is not an easy task. For binary variables, this α-step simply represents a change in the bit values of a particular firefly. There are two ways to change bit values: the flip and the swap procedure, either of which may serve as the α-step for solving the knapsack problem. The term α represents the probability that a particular bit changes. The number of bits \(n_{B}\) whose values change depends on the Hamming distance \(r_{ij}\): since the aim is to minimise the distance between fireflies i and j (firefly \(x_{j}\) being brighter, \(x_{i}\) is attracted towards it), we set \(n_{B} = \alpha \times r_{ij}\), and α should be small; otherwise the Hamming distance between \(x_{i}\) and \(x_{j}\) would increase rather than decrease. This is similar to a variable distance move (k-distance or k-exchange), and here we choose k = 1. A sketch of this step follows.
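A small Python sketch of the α-step with both perturbation modes; the mode keyword and function name are our own naming.

```python
import numpy as np

def alpha_step(x, r_ij, alpha=0.1, mode="flip"):
    """Randomise n_B = alpha * r_ij bit positions by flipping or swapping."""
    x = np.array(x)
    n_b = max(1, min(len(x), int(alpha * r_ij)))   # number of bits to perturb
    if mode == "flip":
        pos = np.random.choice(len(x), n_b, replace=False)
        x[pos] ^= 1                          # flip n_B randomly chosen bits
    else:
        for _ in range(n_b):                 # swap two randomly chosen bits
            a, b = np.random.choice(len(x), 2, replace=False)
            x[a], x[b] = x[b], x[a]
    return x
```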

4.5 Generating new firefly

In the original FA, a solution is generated randomly when no solution is better than the current one. Here, opposition-based learning (OBL), introduced by Tizhoosh [55], is used to generate the new solution instead. It has been shown that an opposite solution has a higher chance of being close to the global optimum than a random solution [49]. The opposite point is defined as

$$ \acute{x_{j}} = a_{j}+b_{j} -x_{j}, $$
(22)

where \(X = (x_{1}, x_{2},\ldots, x_{n})\) is a point in n-dimensional space and \(x_{j} \in [a_{j}, b_{j}]\), \(j \in \{1, 2, \ldots, n\}\) [48]. The generalized OBL (GOBL) is defined as

$$ \acute{x} = r(a+b) - x, $$
(23)

where r is a real number [57]. Here we take r to be a random variable, specifically \(r \sim U(0, 1)\). This transforms the solution into a new search space and provides more chances of finding better solutions.
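For bit strings, Eq. (22) with \(a_{j} = 0\) and \(b_{j} = 1\) reduces to the bit complement; a sketch of both variants is shown below, where the re-binarisation of the GOBL point by rounding is an illustrative choice of ours, not spelled out in the text.

```python
import numpy as np

def opposite_solutions(x):
    """Opposite of a bit string: Eq. (22) with a_j = 0, b_j = 1 is the
    complement; for GOBL, Eq. (23), we re-binarise by rounding."""
    x = np.asarray(x)
    obl = 1 - x                              # Eq. (22): opposite point
    r = np.random.rand()                     # r ~ U(0, 1)
    gobl = np.clip(np.rint(r - x), 0, 1).astype(int)   # Eq. (23), a + b = 1
    return obl, gobl
```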

4.6 Repairing infeasible solutions

When a binary string violates a constraint of the given problem, a repair algorithm is employed to turn the infeasible solution into a feasible one.

For the 01 KP, the greedy repair procedure is applied as described earlier: all items in the knapsack are sorted in decreasing order of their profit-to-weight ratios, and the selection process always deletes the last item until feasibility is restored.

For the 01 MKP, we use the repair operator of Algorithm 4, suggested in [15].

5 Experimental results

The proposed BCSA and BFA procedures are coded in Python and run on a computer with an Intel Core2 Duo CPU E8400 @ 3.00 GHz and 2 GB of RAM in a Unix environment. Experiments are also conducted with a state-of-the-art particle swarm optimisation algorithm as a benchmark procedure, since knapsack problems have been solved by many researchers with PSO. Kong and Tian [36] applied PSO with the sigmoid function for binary transformation; they used a repair operator instead of a penalty function and set the number of particles equal to the number of objects of the problem (the problem size) when solving multidimensional knapsack instances. Tan [54] used the penalty function method and mutated particle positions with a specific probability, a hybrid technique for solving MKP-type problems. Lee et al. [37] proposed GPMBPSO to solve multi-modal benchmark functions, adopting genotype-phenotype representation and the mutation operator of genetic algorithms. Bansal and Deep [5] improved the basic BPSO by introducing a new probability function which maintains diversity in the swarm and makes it more explorative for solving knapsack and multidimensional knapsack instances. Chih et al. [14] introduced BPSO with time-varying acceleration coefficients, and its chaotic variant, to solve MKP instances.

Here we combine the advantages of the algorithms presented by Kong and Tian [36] and Tan [54]: the presented algorithm is a hybrid PSO with a repair operator and a GA-based mutation operator. The population size becomes huge for large problem classes in [36] and is relatively small (P = 10) in [54], so the population size for PSO is set here to 40 by design of experiments (DOE). Results of this hybrid PSO (HPSO) are shown for comparison along with BPSO [33], GPMBPSO [37], and MBPSO [5] for KP instances. MKP instances, on the other hand, are compared with MBPSO [5], CBPSO [16], BPSOTVAC [14], and CBPSOTVAC [14]. For large MKP instances, comparison is also provided with two other highly efficient algorithms: NGHS, introduced by Zou et al. [69], and ABHS, by Wang et al. [58].

5.1 Benchmark problem set

Two sets of standard benchmark problem instances for knapsack problems are considered in our study. The first set contains ten instances taken from the literature, with the number of items ranging from 4 to 23; detailed information is provided in [8]. The other set of benchmark instances is taken from [5] and contains 25 instances, with problem sizes ranging from 8 to 24 items; Bansal and Deep [5] proposed the modified binary PSO algorithm to solve these test instances.

As with the KP instances, the performance of the different algorithms is extensively investigated through a large number of experiments conducted on a set of problem instances collected from the literature. We select benchmark instances of the 01 MKP from the OR-Library [7]. Seven instances corresponding to the small class are investigated, with items ranging from 6 to 50 and 5 or 10 knapsacks. Another 49 instances of the medium problem class are also taken from the OR-Library. Among the medium-class instances, Senyu and Toyada [50] introduced the SENTO class, Weingartner and Ness [60] the WEING class, Shi [52] the WEISH class, and Freville and Plateau [25] the PB and HP classes.

5.2 Parameter settings of BCSA

CSA has only two main parameters: the population size (n) and the endanger (discovery) probability (\(p_{a}\)). For solving knapsack-type problems we transform the continuous variables to binary ones; the effects of different parameter settings are evaluated and discussed in this section.

5.2.1 Effect of binary variables

Three different types of transformation functions are used for handling binary variables, as mentioned in Section 3; their effect is shown in Table 1. The experimental setup uses population size n = 25 and \(p_{a} = 0.25\). For every instance we perform 50 independent runs; the best and worst solutions among these runs are reported, along with the average, median and standard deviation over all solutions, for the problems of SET 1. For each instance we also report the average total time required for a successful run; the maximum number of iterations is fixed at 100. In Table 1, for every instance the first row shows the performance of the first transformation function, the second row that of the sigmoid transformation, and so on.

Table 1 Comparison between different types of transformation functions used for binary variable handling of MKP

From Table 1 it is clear that all transformation functions are effective for solving 01 knapsack instances. However, the sigmoid function gives a considerably more efficient result: the average-case solution is better than with any other method, and the standard deviation is much lower.

5.2.2 Effect of endanger probability (p a )

As suggested in [64], we inspected the effect of \(p_{a}\) on solving knapsack problems. We fixed the population size n = 25 and the maximum number of iterations iMax = 100, and used \(p_{a} = \{0.05, 0.1, 0.15, 0.2, 0.25, 0.4, 0.5\}\). The number of iterations required to solve a particular case for different \(p_{a}\) values is shown in Fig. 1 for MKP instance \(f_{4}\). The Kruskal-Wallis test gives a p-value of 0.6667, which means the null hypothesis cannot be rejected: there is no significant difference between the sample medians. One can conclude that the \(p_{a}\) value has no significant effect on solving multidimensional knapsack problem instances.

Fig. 1 The effect of \(p_{a}\) for MKP instance \(f_{4}\)

5.2.3 Effect of population size (n)

Here, we consider different population sizes for solving MKP instance \(f_{4}\), with the maximum number of iterations iMax = 100 and \(p_{a} = 0.2\). Results for population sizes n = {10, 15, 20, 25, 30, 40, 50} are shown in Fig. 2, and Table 2 lists the results of the Kruskal-Wallis test. As the p-value is very small, we conclude that at least one sample median differs from the others. Figure 3 shows the multiple comparison of mean ranks: Group 1 (G1 = {n | n = 10, 15}) is significantly different from Group 2 (G2 = {n | n = 25, 30, 40, 50}). Since the execution time grows with very large population sizes, we choose population size n = 25.

Fig. 2 The effect of population size n for MKP instance \(f_{4}\)

Table 2 Kruskal-Wallis ANOVA table
Fig. 3 Multiple comparison of mean ranks of population size (n) for MKP instance \(f_{4}\)

5.3 Parameter settings of BFA

As discussed in Section 4, the main parameters of BFA are the population size (n) and the light absorption coefficient (γ). The effect of these two parameters on solving knapsack problems is discussed in this section.

5.3.1 Effect of population size (n)

For this experimental study we choose population sizes n = {30, 40, 50, 60, 70, 80, 90, 100} and light absorption coefficient γ = 0.1, with 50 independent runs for each case. The relation between the number of iterations needed to solve a particular instance and the population size is shown in Fig. 4. The ANOVA table of the Kruskal-Wallis test is given in Table 3; the corresponding p-value is 3.96E-07. The results of the multiple comparison of mean ranks are shown in Fig. 5: the two groups G1 = {n | n = 30, 40} and G2 = {n | n = 70, 80, 90, 100} are significantly different from each other. We therefore choose population size n = 70 for solving multidimensional knapsack problem instances.

Fig. 4 The effect of population size n for MKP instance \(f_{4}\)

Table 3 Kruskal-Wallis ANOVA table
Fig. 5 Multiple comparison of mean ranks of population size (n) for MKP instance \(f_{4}\)

5.3.2 Effect of light absorption coefficient (γ)

Here, we consider population size n = 60 and light absorption coefficients γ = {0.02, 0.04, 0.06, 0.08, 0.09, 0.10, 0.12, 0.14, 0.16, 0.18, 0.20}, again with 50 independent runs for each case. The relation between the number of iterations required to solve a particular case and the light absorption coefficient is shown in Fig. 6 for MKP instance \(f_{4}\). The results of the multiple comparison of mean ranks are given in Fig. 7, and the ANOVA table of the Kruskal-Wallis test in Table 4. From these results, one can conclude that a small light absorption coefficient (γ ≤ 0.02) is preferable for solving MKP instances.

Fig. 6 The effect of light absorption coefficient (γ) for MKP instance \(f_{4}\)

Fig. 7 Multiple comparison of mean ranks of light absorption coefficient (γ) for MKP instance \(f_{4}\)

Table 4 Kruskal-Wallis ANOVA table

5.4 Performance metrics

Three normalised performance measures are used for the performance analysis: the solution quality, the rate of success, and the speed of reaching a solution [53]. To compare the success rates of different algorithms on different problems, each problem/algorithm combination is run with a maximum number of function evaluations (MaxFES) as the termination condition. For all problems, MaxFES is set to 10000 × D, where D is the dimension of the problem.

The quality of the solution obtained by a particular algorithm is quantified by QMetric, given by

$$ QMetric = 2^{q^{100}}-1; $$
(24)

where q is the normalised measure of solution quality [41], defined as

$$ q = \frac{\hat{f}-f^{min}}{op - f^{min}}, $$
(25)

where \(\hat {f}\) is the maximum objective value found by the algorithm, op is the best known solution, and \(f^{min}\) is the minimum objective function value for the problem instance concerned (zero). We consider a gap of 5 % or more from the best known solution as not significant, in which case the QMetric value is zero (rounded to 2 decimal places). A particular run is regarded as successful if it reaches the global optimum before MaxFES is exceeded. The success rate (SRate) is the ratio of the number of successful runs to the total number of runs. The success speed of a run r is defined as

$$ SSpeed_{r} = \left\{\begin{array}{ll} 0, & \text{if the run is}\\ & \text{not successful}\\ \frac{MaxFES - (FES_{r} - 1)}{MaxFES}, & \text{otherwise.} \end{array}\right. $$
(26)

where \(FES_{r}\) represents the number of function evaluations of run r. The success speed over ns successful runs is defined as

$$ SSpeed = \left\{\begin{array}{ll} \frac{{\sum}_{r=1}^{ns} SSpeed_{r}}{ns}, & \text{if } ns >0\\ 0, & \text{if } ns = 0. \end{array}\right. $$
(27)
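These metrics are straightforward to compute. A small Python sketch follows, where per-run data are assumed (our choice of representation) to be a list of function-evaluation counts, with None recording a failed run:

```python
def success_metrics(fes_per_run, max_fes):
    """SRate and SSpeed, Eqs. (26)-(27); a failed run is recorded as None."""
    speeds = [(max_fes - (fes - 1)) / max_fes
              for fes in fes_per_run if fes is not None]
    s_rate = len(speeds) / len(fes_per_run)  # successful runs / total runs
    s_speed = sum(speeds) / len(speeds) if speeds else 0.0
    return s_rate, s_speed

def q_metric(f_hat, op, f_min=0.0):
    """Solution quality, Eqs. (24)-(25); vanishes beyond a ~5 % optimality gap."""
    q = (f_hat - f_min) / (op - f_min)
    return 2 ** (q ** 100) - 1
```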

To illustrate the use of these three performance metrics, Table 5 shows the results for the benchmark KP instances (SET 1) for BCSA, BFA and HPSO. All three algorithms perform well, and the QMetric values are one in all cases, though BFA has some difficulty with three instances (\(f_{2}\), \(f_{8}\) and \(f_{10}\)), for which its SRate and SSpeed are low.

Table 5 Performance metrics for 01 KP SET 1 instances

Table 6 shows the performance metrics of BCSA, BFA and HPSO for the 01 MKP instances of SET 1. BCSA and HPSO perform well compared to BFA: over the seven instances, the SRate and SSpeed of BFA are substantially lower than those of the other algorithms, except on instance \(f_{1}\). For instances \(f_{6}\) and \(f_{7}\), BFA is unable to find the best-known solution at all, with QMetric values of 0.69 and 0.79, respectively. BCSA and HPSO also find these two instances somewhat difficult: the SRate and SSpeed of HPSO are low for \(f_{6}\), and the SSpeed of BCSA is slightly low for \(f_{7}\).

Table 6 Performance metrics for 01 MKP SET 1 instances

5.5 Comparative study

Here we consider the second set of problem instances for both KP and MKP. Experiments are conducted to compare the three proposed algorithms with BPSO [33], GPMBPSO [37], and MBPSO [5]. The optimal parameter settings for the different algorithms are shown in Table 7.

Table 7 Optimal parameter settings

A total of 50 independent trials are conducted for each problem instance. The best and average results over these runs are reported in Table 8 for the knapsack problem instances. Non-parametric tests are used for comparing the algorithms [26-28, 40]. The average ranking of the Friedman test is used to compare the performance of the algorithms on the knapsack instances. The Friedman test statistic is 97.37 with 5 degrees of freedom, and the corresponding p-value is 5.62E-11, so the null hypothesis is rejected: the performance of the algorithms is not the same. Table 9 shows the average rankings of BCSA, BFA, HPSO, BPSO, GPMBPSO and MBPSO on the knapsack instances of SET 2.

Table 8 Results of problem SET 2 of 01 KP instances
Table 9 Average Rankings of the algorithms for knapsack problem instances

Nemenyi’s, Holm’s, Shaffer’s and Bergmann-Hommel’s tests are conducted to investigate the differences between pairs of algorithms. Adjusted p-values for the pairwise comparisons of all algorithms are reported in Table 10. For these nonparametric tests, the null hypothesis is that the two algorithms are equivalent; if it is rejected, there is a significant difference between their performances. We use a significance level of α = 0.05. From Table 10 it is clear that Nemenyi’s, Shaffer’s and Bergmann-Hommel’s procedures reject Hypotheses 1-9, while Holm’s test rejects Hypotheses 1-11.

Table 10 Adjusted p-values for SET 2 problem instances of KP

From the results, one can conclude that BCSA and BFA are significantly better than the others (MBPSO, HPSO, GPMBPSO and BPSO) for solving knapsack problem instances, with no significant difference between BCSA and BFA. MBPSO is significantly better than GPMBPSO and BPSO, and there is no significant difference between MBPSO and HPSO. Sorted by average ranking, the six algorithms are ordered as follows: BFA, BCSA, MBPSO, HPSO, BPSO and GPMBPSO; that is, BFA and GPMBPSO are the best and worst among them, respectively.

Next, we provide the computational study on the 01 MKP benchmark instances of SET 2 in Table 11. The best and average solutions over 50 independent runs of the BCSA, BFA and HPSO algorithms are reported for each problem instance, and the Friedman test is conducted to compare their performance. The mean average deviations of four further algorithms, Al-I (MBPSO [5]), Al-II (CBPSO [16]), Al-III and Al-IV (BPSOTVAC and CBPSOTVAC [14]), are also reported in the table. The test statistic is 141.22 with 6 degrees of freedom and a corresponding p-value of 8.63E-11, so the null hypothesis is rejected: the performance of at least one algorithm differs significantly from the others. Table 12 shows the average rankings of the BCSA, BFA, HPSO, MBPSO, CBPSO, BPSOTVAC and CBPSOTVAC algorithms on the MKP instances of SET 2.

Table 11 Results of problem SET 2 of 01 MKP instances
Table 12 Average Rankings of the algorithms for SET 2 MKP instances

Adjusted p-values of the different nonparametric tests (Nemenyi, Holm, Shaffer and Bergmann-Hommel) are given in Table 13. For each test the null hypothesis is that the two compared algorithms are equivalent, and it is rejected when the p-value is below the significance level α = 0.05. According to all four tests, Hypotheses 1-12 are rejected; considering only the Holm, Shaffer and Bergmann-Hommel tests, Hypotheses 1-14 are rejected.

Table 13 Adjusted p-values

From the results, one can infer that HPSO and BCSA are significantly better than CBPSOTVAC, MBPSO, CBPSO and BFA for solving MKP instances; BPSOTVAC is significantly better than CBPSOTVAC, CBPSO and MBPSO; and BFA is significantly better than CBPSOTVAC, MBPSO and CBPSO. There is no significant difference between the performance of HPSO, BCSA and BPSOTVAC. Sorted by average ranking, the algorithms are ordered as BCSA, HPSO, BPSOTVAC, BFA, CBPSO, MBPSO and CBPSOTVAC, with BCSA and CBPSOTVAC the best and worst among these seven, respectively.

5.6 Solution of large class problem instances

Experiments are also conducted on large problem instances. For KP we choose twelve instances, generated randomly with average knapsack capacity, of three types: in the first the profit and weight vectors are uncorrelated, in the second they are weakly correlated, and in the third they are strongly correlated. The instances are labelled by the number of objects followed by the correlation structure: uncorrelated ("u"), weakly correlated ("w") and strongly correlated ("s"). The MKP instances are taken from http://www.cs.nott.ac.uk/~jqd/mkp/index.html and were introduced by Glover and Kochenberger [29]. The solutions of BCSA, BFA and HPSO are shown in Fig. 8 for the KP instances and in Fig. 9 for the MKP instances.

Fig. 8 Solution of medium and large scale KP instances

Fig. 9 Solution of large scale MKP instances

As seen in Fig. 8, for the medium problem size (n = 50) there is only a small difference between the performance of the three algorithms, but as the problem size increases the differences grow considerably; BCSA outperforms the other two algorithms in all cases. The average ranking of the Friedman test is used to compare the results. The test statistic is 24.00 with 2 degrees of freedom, and the corresponding p-value is 6.14E-06, so the null hypothesis is rejected: the performance of the algorithms is not the same. The average rankings of BCSA, BFA, and HPSO are 3.0, 1.0, and 2.0, respectively. Adjusted p-values for Nemenyi’s, Holm’s, Shaffer’s and Bergmann-Hommel’s tests on the pairwise comparisons of the three algorithms are reported in Table 14; all are below the significance level. Therefore, BCSA is significantly better than BFA and HPSO, and HPSO is significantly better than BFA, for the large KP instances. The ranking order is BCSA, HPSO and BFA, with BCSA and BFA the best and worst among these algorithms, respectively.

Table 14 Adjusted p-values

For the large MKP instances, the performance of BCSA, BFA and HPSO is compared with two other algorithms, NGHS [69] and ABHS [58], in Table 15. The average ranking of the Friedman test is used to compare the results. The test statistic is 31.56 with 4 degrees of freedom, and the corresponding p-value is 2.36E-06, so the null hypothesis is rejected: the performance of the algorithms is not the same. The average rankings of BCSA, BFA, HPSO, NGHS, and ABHS are 4.11, 1.0, 2.0, 3.33, and 4.56, respectively. Adjusted p-values for Nemenyi’s, Holm’s, Shaffer’s and Bergmann-Hommel’s tests on the pairwise comparisons of the five algorithms are reported in Table 16; all tests reject Hypotheses 1-5. According to the results, ABHS is significantly better than HPSO and BFA; NGHS is significantly better than BFA; and BCSA is significantly better than HPSO and BFA. There is no significant difference between ABHS, NGHS and BCSA. The ranking order is ABHS, BCSA, NGHS, HPSO and BFA, with ABHS and BFA the best and worst among these algorithms, respectively.

Table 15 Comparison with other algorithms
Table 16 Adjusted p-values

Table 17 shows the results for the other small and medium size knapsack instances, along with those of the two standard algorithms (NGHS and ABHS). The average of 50 individual runs and the corresponding success rate are shown for each case. BCSA performs well in every case except the instance "Weing7", which the other algorithms also find difficult. BCSA outperforms the other algorithms on the "Knap39" instance, with a success rate far higher than theirs. HPSO performs well except on the instance "Sent01", where its success rate is very low. Among these five algorithms, BFA performs worst.

Table 17 Comparison with other algorithms for small and medium size instances

6 Conclusion

In this paper we have intensively investigated the solution process of one of the fundamental combinatorial optimisation problems, the knapsack problem. We propose the binary cuckoo search algorithm for solving knapsack problems; its implementation combines a local random walk with a global random walk, providing a balance between intensification and diversification and giving the algorithm good exposure to high-quality solutions. We also developed the binary firefly algorithm for the knapsack problem; in the modified FA, a variable-distance move performs the local search together with the repair algorithm, and new solutions are generated outside the current search domain through opposition-based learning, exploring unexploited regions and helping the modified FA avoid being trapped in local optima. In total, 47 benchmark instances of the 01 KP and 65 benchmark instances of the 01 MKP are considered in the experimental study.

We have also carried out various tests for proper parameter tuning, and the performance of the modified algorithms has been extensively investigated on these standard benchmark instances, with the particle swarm optimisation algorithm as the reference technique: the proposed algorithms were compared with several recently developed variants of PSO. For small and medium problem instances, BCSA outperformed the others in most cases in terms of solution quality, success rate and speed of reaching best-known solutions. BFA works efficiently on 01 knapsack instances but struggles with multidimensional knapsack instances. For large instances, BCSA clearly outperforms the others: as the dimension of the KP instances increases, the performance of BFA and HPSO degrades, and for MKP instances BCSA again outperforms the rest. So for large problem classes, BCSA should be the algorithm of choice for solving knapsack problems. Numerous statistical tests also confirm the validity of the proposed methods. In future, these algorithms can be extended to other combinatorial problems, though proper parameter tuning is required before carrying out any performance analysis. Further modification of FA is scheduled as future work to improve the solution quality on large instances of both the single and the multidimensional knapsack problem.