1 Introduction

The purpose of product design optimization is to maximize profits, market share, and sales, or to minimize costs (Green et al. 1981; Gaul and Baier 2009; Gaul et al. 1995). Most products can be characterized by many attributes and their corresponding levels, so the number of possible product offerings is large (Socha and Dorigo 2008), and complete enumeration quickly becomes computationally intractable (Kohli and Krishnamurti 1987). For this reason, swarm intelligence algorithms, which mimic the foraging behavior of social animals, offer a good alternative for the product design problem; in some cases and under certain conditions, they are the only practical way to find a (near-)optimal solution to the underlying optimization problem (Albritton and McMullen 2007). In this paper, a new approach to the share-of-choice problem, bee colony optimization (BCO), is introduced and compared to the existing ant colony optimization (ACO) (Albritton and McMullen 2007; Camm et al. 2006; Teodorovic and Dell’Orco 2005; Karaboga and Basturk 2007; Teodorovic et al. 2011). Both algorithms were implemented in the statistical software R. The problem formulation is introduced in Sect. 2. Sections 3 and 4 present the methodologies for ant colony and bee colony optimization. Section 5 contains the empirical comparison of both algorithms. Finally, Sect. 6 gives a short conclusion and an outlook.

2 Problem Formulation

To compare the two swarm intelligence algorithms, a common problem structure is needed. Here, we use the share-of-choice formulation for the underlying optimization problem (Camm et al. 2006). In the share-of-choice problem, the product design is chosen that satisfies the largest number of observed customers; that is, a binary linear integer program has to be solved (Camm et al. 2006). Consider a product with K attributes, \(L_{k}\) levels for each attribute k (k = 1, 2, …, K), and S respondents of the simulated conjoint study. The part-worth utility of level j (j = 1, 2, …, \(L_{k}\)) of attribute k for respondent s is denoted by \(u_{\mathit{kj}}^{s}\). A binary variable \(x_{\mathit{kj}}\) is introduced, with \(x_{\mathit{kj}} = 1\) if level j of attribute k is chosen and \(x_{\mathit{kj}} = 0\) otherwise. Analogously, \(y_{s} = 1\) if the utility of the designed product exceeds the status quo utility \(U_{s}\) of the s-th respondent, and \(y_{s} = 0\) otherwise. Further, a hurdle utility \(h_{s}\) is defined for respondent s with \(h_{s} = U_{s}+\epsilon\), where ε is a small positive real number. This leads to the following problem formulation:

$$\displaystyle{ \text{ Max }\quad \sum _{s=1}^{S}y_{ s} }$$
(1)
$$\displaystyle{ \mbox{ subject to}\quad \sum _{k=1}^{K}\sum _{ j=1}^{L_{k} }u_{\mathit{kj}}^{s}x_{\mathit{ kj}} \geq h_{s}y_{s}\qquad s = 1,2,\ldots,S, }$$
(2)
$$\displaystyle{ \sum _{j=1}^{L_{k} }x_{\mathit{kj}} = 1\qquad k = 1,2,\ldots,K. }$$
(3)

Expression (1) represents the objective function, which counts the number of respondents covered. The first constraint (2) ensures that only those respondents are counted in the objective function whose utility for the designed product meets or exceeds their hurdle utility. The second constraint (3) makes sure that exactly one level of each attribute is chosen (Camm et al. 2006).
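To make the formulation concrete, a minimal sketch in R (the software used for the implementations in this paper) is given below; the data layout and all names are illustrative assumptions, not part of the original model.

```r
# Counts the respondents covered by a candidate design (objective (1)).
# u: S x K x L array of part-worths u[s, k, j]; x: length-K vector with
# the chosen level j per attribute k, so constraint (3) holds by
# construction; h: length-S vector of hurdle utilities.
share_of_choice <- function(u, x, h) {
  S <- dim(u)[1]; K <- dim(u)[2]
  # Total utility of the designed product for each respondent s.
  U_prod <- sapply(1:S, function(s) sum(u[cbind(s, 1:K, x)]))
  sum(U_prod >= h)   # y_s = 1 iff the hurdle is met (constraint (2))
}
```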

3 Methodology for Ant Colony Optimization (ACO)

A single ant foraging for food follows a few local rules. First, ants preferentially follow trails on which a high amount of pheromone lies. Second, ants leave pheromone on the path they have just taken. Over time, the ants take increasingly direct routes because of the pheromone scent on the trails. In the initial state, the ants leave their nest randomly to forage for food; the higher the amount of pheromone on a trail, the more likely a single ant is to take that path (Albritton and McMullen 2007).

The methodology for the ACO is given below and is based on Albritton and McMullen (2007).

Initialize probabilities: The value of \(\mathit{Prob}_{\mathit{kj}}^{s}\) represents the relative desirability of selecting level j of attribute k. This probability stays constant throughout the search process, serves as the basis for roulette wheel selection, and is calculated as:

$$\displaystyle{ \mathit{Prob}_{\mathit{kj}}^{s} = \frac{\sum _{s=1}^{S}u_{\mathit{ kj}}^{s}} {\sum _{{j}^{{\ast}}=1}^{L_{k}}\sum _{s=1}^{S}u_{k{j}^{{\ast} }}^{s}}\quad \forall k = 1,2,\ldots,K. }$$
(4)
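As a sketch, this initialization could be computed in R as follows (assuming, for simplicity, equal numbers of levels per attribute and non-negative summed part-worths; note that the right-hand side of Eq. (4) does not depend on s):

```r
# Relative desirability of Eq. (4): part-worths summed over respondents,
# then normalised within each attribute. u is an S x K x L array.
init_prob <- function(u) {
  tot <- apply(u, c(2, 3), sum)   # K x L matrix of summed part-worths
  tot / rowSums(tot)              # normalise over the levels j of each k
}
```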

The best objective function value found so far, \({Z}^{{\ast}}\), is initialized to zero:

$$\displaystyle{ {Z}^{{\ast}} = 0. }$$
(5)

Initialize the history matrix: The history matrix G tracks all paths the ants have taken so far. All elements \(g_{\mathit{kj}}\) of matrix G are set to “1” at the beginning of the search.

Simulating the amount of pheromone: To simulate the amount of pheromone on a trail, the global and local probabilities \(\mathit{GlobProb}_{\mathit{kj}}^{s}\) and \(\mathit{LocProb}_{\mathit{kj}}^{s}\) of selecting level j of attribute k are initialized to the relative desirability \(\mathit{Prob}_{\mathit{kj}}^{s}\):

$$\displaystyle{ \mathit{GlobProb}_{\mathit{kj}}^{s} = \mathit{Prob}_{\mathit{ kj}}^{s}, }$$
(6)
$$\displaystyle{ \mathit{LocProb}_{\mathit{kj}}^{s} = \mathit{Prob}_{\mathit{ kj}}^{s}. }$$
(7)

Starting the search process: To find a solution to the optimization problem, an ant moves through the decision space, selecting one level per attribute until the last attribute is reached. For each decision, the selection probability is constructed as:

$$\displaystyle{ \mathit{DecProb}_{\mathit{kj}}^{s} = \frac{\gamma \cdot \mathit{Prob}_{\mathit{kj}}^{s} +\alpha \cdot \mathit{LocProb}_{\mathit{kj}}^{s} +\beta \cdot \mathit{GlobProb}_{\mathit{kj}}^{s}} {\sum _{j^{{\ast}}=1}^{L_{k}}\left (\gamma \cdot \mathit{Prob}_{kj^{{\ast}}}^{s} +\alpha \cdot \mathit{LocProb}_{kj^{{\ast}}}^{s} +\beta \cdot \mathit{GlobProb}_{kj^{{\ast}}}^{s}\right )}\quad \forall k = 1,2,\ldots,K. }$$
(8)

where

$$\displaystyle{\alpha +\beta +\gamma = 1.}$$

The three parameters α, β, γ are user-specified and represent the weights of the given probabilities.
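One ant’s walk through the decision space could then be sketched in R as follows (prob, locprob, and globprob are K × L matrices; all names are illustrative):

```r
# Constructs one solution by roulette wheel selection over Eq. (8).
ant_walk <- function(prob, locprob, globprob, alpha, beta, gamma) {
  K <- nrow(prob)
  x <- integer(K)   # chosen level per attribute
  for (k in 1:K) {
    w <- gamma * prob[k, ] + alpha * locprob[k, ] + beta * globprob[k, ]
    x[k] <- sample(seq_along(w), size = 1, prob = w)  # sample() normalises w
  }
  x
}
```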

Objective function: After a single ant has completed its journey through the decision space, the objective function value Z of the found solution is determined:

$$\displaystyle{ Z =\sum _{s=1}^{S}\sum _{(k,j)\in \text{solution}}u_{\mathit{kj}}^{s}. }$$
(9)

Probabilistic updating: The objective function value Z of the new solution is compared to the former best value \({Z}^{{\ast}}\). If the new solution is better than the old one, a global update is performed; otherwise, a local update is performed:

$$\displaystyle{ \mbox{ If }Z > {Z}^{{\ast}}\quad \mbox{ global update,} }$$
(10)
$$\displaystyle{ \mbox{ If }Z \leq {Z}^{{\ast}}\quad \mbox{ local update.} }$$
(11)

The local update means that the pheromone on this trail evaporates, so that new and better near-optimal solutions can be found. Mathematically, this is expressed with a user-chosen evaporation rate η:

$$\displaystyle{ \mathit{LocProb}_{\mathit{kj}}^{s}:= \mathit{LocProb}_{\mathit{ kj}}^{s} \cdot (1-\eta ). }$$
(12)

A global update, on the other hand, is performed to increase the probability of selecting the path components of the recently found best solution. First, the history matrix G is updated to reflect the recently found solution:

$$\displaystyle{ g_{\mathit{kj}}:= g_{\mathit{kj}} + 1,\quad \mbox{ where }k,j \in \text{ solution. } }$$
(13)

Second, the global probability \(\mathit{GlobProb}_{\mathit{kj}}^{s}\) is updated with the recently found solution:

$$\displaystyle{ \mathit{GlobProb}_{\mathit{kj}}^{s} = \frac{g_{\mathit{kj}}} {\sum _{j^{{\ast}}=1}^{L_{k}}g_{kj^{{\ast}}}},\quad \mbox{ where }k,j \in \text{ solution. } }$$
(14)
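Both updates could be sketched in R as follows (x again denotes the levels chosen by the ant; names are illustrative):

```r
# Local update, Eq. (12): evaporate pheromone on the just-used trail.
local_update <- function(locprob, x, eta) {
  for (k in seq_along(x)) locprob[k, x[k]] <- locprob[k, x[k]] * (1 - eta)
  locprob
}
# Global update, Eqs. (13)-(14): reward the components of the new best
# solution; renormalising whole rows keeps the probabilities consistent.
global_update <- function(g, globprob, x) {
  for (k in seq_along(x)) g[k, x[k]] <- g[k, x[k]] + 1
  list(g = g, globprob = g / rowSums(g))
}
```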

Iterations: These steps are repeated until the termination criterion is met, e.g., a given number of ants (= iterations) or a maximum runtime of the algorithm.

4 Methodology for Bee Colony Optimization (BCO)

The following bee colony algorithm has not yet been applied to the share-of-choice problem based on conjoint data; it constitutes a new approach to solving the underlying optimization problem. Before the algorithm is developed, the behavior of the artificial bees is described.

As for the ants, the food search of the bees is simulated. The bees collaboratively search for an optimal solution to difficult combinatorial optimization problems, with each bee creating one single solution to the problem. Foraging consists of two alternating phases of behavior: the forward pass and the backward pass. In the forward pass, the artificial bees visit a certain number of solution components (food sources). In the backward pass, the artificial bees save the information about the quality of their found solution (the amount of nectar of the food source), fly back to the hive, and communicate with the uncommitted bees. Back at the hive, each bee decides with a certain probability whether it stays loyal to its found solution. The better a solution is, the higher the chances that the bee can advertise it to the uncommitted bees (Teodorovic et al. 2011).

The methodology of BCO (Teodorovic and Dell’Orco 2005; Teodorovic et al. 2011) will be introduced below.

Forward pass: In their first forward pass, the bees select one level j of the first attribute k by roulette wheel selection according to Eq. (4).

Objective function: After each forward pass, the partial solution found by each bee is evaluated. For this purpose, a new variable i is introduced, which represents the number of forward passes performed so far; its maximum value is the number of attributes (i ≤ K). The objective function value Z is determined by:

$$\displaystyle{ Z =\sum _{s=1}^{S}\sum _{(k,j)\in \text{(partial) solution}}u_{\mathit{kj}}^{s}. }$$
(15)

Backward pass: When the bees are back at the hive after a forward pass, each bee decides randomly whether to continue its own exploration and become a recruiter, or to become a follower. For this, the bees have to make a loyalty decision. Let B denote the total number of bees. The probability that the b-th bee (b = 1, 2, …, B) is loyal to its previously generated (partial) solution is:

$$\displaystyle{ p_{b}^{i+1} =\exp \left (-\frac{O_{\mathit{max}} - O_{b}} {i} \right )\quad b = 1,2,\ldots,B, }$$
(16)

with

$$\displaystyle{ O_{b} = \frac{Z_{b} - Z_{\mathit{min}}} {Z_{\mathit{max}} - Z_{\mathit{min}}}\quad b = 1,2,\ldots,B, }$$
(17)

where \(O_{b}\) is the normalized objective function value of the partial solution created by the b-th bee, \(O_{\mathit{max}}\) is the maximum of all \(O_{b}\), \(Z_{\mathit{min}}\) and \(Z_{\mathit{max}}\) are the smallest and largest objective function values among the B bees, and i is again the number of forward passes performed so far.
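A minimal R sketch of this loyalty decision (a guard for the degenerate case \(Z_{\mathit{max}} = Z_{\mathit{min}}\) is omitted; names are illustrative):

```r
# Loyalty probabilities of Eqs. (16)-(17): Z holds the B (partial)
# objective values after the i-th forward pass.
loyalty_prob <- function(Z, i) {
  O <- (Z - min(Z)) / (max(Z) - min(Z))  # normalisation, Eq. (17)
  exp(-(max(O) - O) / i)                 # Eq. (16): the best bee gets p = 1
}
# Usage: loyal <- runif(length(Z)) < loyalty_prob(Z, i)
```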

If a bee is not loyal to its found (partial) solution, it becomes a follower. Hence, the total number of bees B is split into a number of recruiters R and a number of followers F (\(B = R + F\)). For each follower, a new solution is then chosen from the recruiters by roulette wheel selection (Teodorovic et al. 2011):

$$\displaystyle{ \mathit{pf }_{b} = \frac{O_{b}} {\sum _{r=1}^{R}O_{r}}\quad b = 1,2,\ldots,R. }$$
(18)
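The recruitment step could be sketched in R as follows (O_recruiters holding the normalized objective values of the R recruiters; illustrative names):

```r
# Each follower picks one recruiter by roulette wheel over Eq. (18).
recruit <- function(O_recruiters, n_followers) {
  sample(seq_along(O_recruiters), size = n_followers,
         replace = TRUE, prob = O_recruiters)  # sample() normalises weights
}
```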

These steps are repeated until each bee has found a complete solution to the optimization problem.

Iterations: The loop described above can be repeated for a certain number of iterations or aborted after a previously chosen runtime. For each iteration, the best found objective function value Z is saved.

5 Empirical Comparison

5.1 Initialization of the Algorithms

To initialize both algorithms, simulated conjoint data are used; this is a common procedure in marketing to compare optimization algorithms for product design (Vriens et al. 1996). A matrix U with the part-worths of S respondents is computed. One component \(u_{\mathit{kj}}^{s}\) of the matrix U represents the part-worth utility of level j (j = 1, 2, …, \(L_{k}\)) of attribute k (k = 1, 2, …, K) for the s-th respondent (s = 1, 2, …, S). To initialize these part-worth utilities, a lower bound (LB) and an upper bound (UB) have to be set by the researcher, as well as a uniformly distributed random variable (\(\Phi _{\mathit{kj}}^{s}\)) in the range [0, 1] (Albritton and McMullen 2007):

$$\displaystyle{ u_{\mathit{kj}}^{s} = \mathit{LB} + \Phi _{\mathit{ kj}}^{s} \cdot (\mathit{UB} -\mathit{LB}). }$$
(19)

In the next step, the part-worth utility matrix U is transformed into a matrix C with elements \(c_{\mathit{kj}}^{s}\) coded as “1” or “0”: if a level of an attribute satisfies a customer, i.e., its part-worth utility is positive, it is coded as “1”; if the part-worth utility is non-positive, it is coded as “0” (Steiner and Hruschka 2003):

$$\displaystyle{ c_{\mathit{kj}}^{s} = 0\quad \text{ if }\quad u_{\mathit{ kj}}^{s} \leq 0, }$$
(20)
$$\displaystyle{ c_{\mathit{kj}}^{s} = 1\quad \text{ if }\quad u_{\mathit{ kj}}^{s} > 0. }$$
(21)
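The data generation of Eq. (19) and the coding of Eqs. (20) and (21) can be sketched in R as follows (the dimensions and bounds below are placeholders, not the settings of the study):

```r
set.seed(1)                # for reproducibility of the sketch
S <- 100; K <- 5; L <- 3   # respondents, attributes, levels (illustrative)
LB <- -1; UB <- 1          # researcher-chosen bounds
# Eq. (19): uniformly distributed part-worths u[s, k, j].
U <- array(LB + runif(S * K * L) * (UB - LB), dim = c(S, K, L))
# Eqs. (20)-(21): code positive part-worths as 1, non-positive as 0.
C <- (U > 0) * 1
```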

5.2 Test Configurations

To compare ACO and BCO, a Monte Carlo simulation was applied to both algorithms. In a first step, each algorithm solved each problem ten times within a predefined period, so that the sample is large enough to make qualified statements about the algorithms’ behavior. Altogether, there were 45 problems (five numbers of attributes × three numbers of levels × three customer set-ups), i.e., 45 different simulated conjoint matrices with uniformly distributed part-worths. For each problem, there are \(\prod _{k=1}^{K}L_{k}\) possible product designs. To capture a respondent, a certain number of attribute levels must match his or her preferences; this number depends on the number of attributes. After that, the means for each problem and algorithm were computed, and based on these, the data shown in Table 1 were generated.

Table 1 The means and maximum values of the marginal distribution of the objective function values for the Monte Carlo simulation of both algorithms

As can be seen from Table 1, the marginal distributions of the algorithms’ settings are used for the comparison. Although there are no significant differences between the two algorithms (tested with Welch’s two-sample t-test at the significance levels α ∈ {0.1, 0.05, 0.01}), ACO performed better than BCO in almost every case (the only exception being the maximum value for four attributes). One possible explanation for these results is that the mathematical structure of both algorithms is quite similar. A possible reason for the better performance of ACO is that this algorithm is more tailored to the share-of-choice problem than BCO; another is that the probability updates of ACO are more refined than those of BCO. However, both algorithms provided very good solutions in a reasonable amount of time, even for large problems, so the results are very satisfying. Unfortunately, the results could not be compared to the true optimal values, because complete enumeration would have taken too long.
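For reference, such a test can be run in R with the built-in t.test function, which performs Welch’s test by default; aco and bco below are placeholder vectors standing for the per-problem results of the two algorithms, not the actual data of Table 1:

```r
# Illustrative placeholder results (the real values are in Table 1).
aco <- c(181, 184, 179, 186, 182)
bco <- c(178, 183, 177, 184, 180)
t.test(aco, bco)   # Welch's two-sample t-test (var.equal = FALSE by default)
```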

6 Conclusion and Outlook

In this first application to the share-of-choice problem in conjoint analysis, the BCO algorithm worked very well. The solutions obtained were almost as good as those from ACO. Nevertheless, there is considerable potential for enhancing BCO, e.g., through parameter modifications or other probability distributions for the bees’ behavior.

For further research, we will try to apply genetic algorithms, particle swarm optimization, simulated annealing and other optimization heuristics to the share-of-choice problem.

Furthermore, another important goal is to apply these optimization methods to real-world problems with real conjoint data, and not just to product optimization in the single-product case, but to whole product line problems.