
1 Introductory Remarks

Solving combinatorial optimization problems with approaches from the swarm intelligence field already has a considerably long tradition. Examples of such approaches include particle swarm optimization (PSO) [1] and artificial bee colony (ABC) optimization [2]. The oldest – and most widely used – algorithm from this field, however, is ant colony optimization (ACO) [3]. In general, the ACO metaheuristic attempts to solve a combinatorial optimization problem by iterating the following two steps: (1) solutions to the problem at hand are constructed using a pheromone model, that is, a parameterized probability distribution over the space of all valid solutions; and (2) (some of) these solutions are used to change the pheromone values in a way that aims at biasing subsequent sampling toward areas of the search space containing high-quality solutions. In particular, the reinforcement of solution components depending on the quality of the solutions in which they appear is an important aspect of ACO algorithms. It is implicitly assumed that good solutions consist of good solution components. Learning which components most often contribute to good solutions helps in assembling them into better solutions.

In this chapter, ACO is applied to the minimum-weight rooted arborescence (MWRA) problem, which has applications in computer vision such as, for example, the automated reconstruction of consistent tree structures from noisy images [4]. The structure of this chapter is as follows. Section 68.2 provides a detailed description of the problem to be tackled. Then, in Sect. 68.3, a new heuristic for the MWRA problem is presented, which is based on the deterministic construction of an arborescence of maximal size and the subsequent application of dynamic programming (DP) for finding the best solution within this constructed arborescence. The second contribution is the application of ACO [3] to the MWRA problem. This algorithm is described in Sect. 68.4. Finally, Sect. 68.5 presents an exhaustive experimental evaluation of both algorithms in comparison with an existing heuristic from the literature [5]. The chapter is concluded in Sect. 68.6.

2 The Minimum-Weight Rooted Arborescence Problem

As mentioned before, in this work we consider the MWRA problem, which is a generalization of the problem proposed by Venkata Rao and Sridharan in [5, 6]. The MWRA problem can technically be described as follows. Given is a directed acyclic graph $G = (V, A)$ with integer weights on the arcs, that is, for each $a \in A$ there exists a corresponding weight $w(a) \in \mathbb{Z}$. Moreover, a vertex $v_r \in V$ is designated as the root vertex. Let $\mathcal{A}$ be the set of all arborescences in $G$ that are rooted in $v_r$. In this context, note that an arborescence is a directed, rooted tree in which all arcs point away from the root vertex (see also [7]). Moreover, note that $\mathcal{A}$ contains all such arborescences, not only those of maximal size. The objective function value (that is, the weight) $f(T)$ of an arborescence $T \in \mathcal{A}$ is defined as follows:

$f(T) := \sum_{a \in T} w(a)$
(68.1)

The goal of the MWRA problem is to find an arborescence $T^* \in \mathcal{A}$ whose weight is smaller than or equal to that of all other arborescences in $\mathcal{A}$. In other words, the goal is to minimize the objective function $f(\cdot)$. An example of the MWRA problem is shown in Fig. 68.1.
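To make these definitions concrete, the following minimal C++ sketch shows one possible representation of a weighted DAG arc and the objective function (68.1). The type and function names (Arc, evaluate) are illustrative and not taken from the chapter's implementation.

```cpp
#include <numeric>
#include <vector>

// One arc of the input DAG: tail -> head with an integer weight w(a).
struct Arc {
    int tail;
    int head;
    int weight;
};

// An arborescence T is represented simply by the set of arcs it uses.
// Objective function (68.1): f(T) is the sum of the weights of T's arcs;
// the empty arborescence ({v_r}, {}) has weight 0.
int evaluate(const std::vector<Arc>& arborescence) {
    return std::accumulate(arborescence.begin(), arborescence.end(), 0,
                           [](int acc, const Arc& a) { return acc + a.weight; });
}
```

Note that, since the empty arborescence belongs to $\mathcal{A}$ and has weight 0, the optimal objective value is never positive.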

Fig. 68.1 a,b (a) An input DAG with eight vertices and 14 arcs. The uppermost vertex is the root vertex $v_r$. (b) The optimal solution, that is, the arborescence rooted in $v_r$ which has the minimum weight among all arborescences rooted in $v_r$ that can be found in the input graph

The differences to the problem proposed in [5] are as follows. The authors of [5] require the root vertex $v_r$ to have a single outgoing arc. Moreover, numbering the vertices from 1 to $|V|$, the given acyclic graph $G$ is restricted to contain only arcs $a_{i,j}$ such that $i < j$. These restrictions do not apply to the MWRA problem. Nevertheless, as a generalization of the problem proposed in [5], the MWRA problem is NP-hard. Concerning existing work, the literature only offers the heuristic proposed in [5], which can also be applied to the more general MWRA problem.

The definition of the MWRA problem as outlined above is inspired by a method recently proposed in [4] for the automated reconstruction of consistent tree structures from noisy images, which is an important problem, for example, in neuroscience. Tree-like structures, such as dendritic, vascular, or bronchial networks, are pervasive in biological systems. Examples are 2D retinal fundus images and 3D optical micrographs of neurons. The approach proposed in [4] builds a set of candidate arborescences over many different subsets of points likely to belong to the optimal delineation and then chooses the best one according to a global objective function that combines image evidence with geometric priors (see, for example, Fig. 68.2). The solution of the MWRA problem (with additional hard and soft constraints) plays an important role in this process. Therefore, developing better algorithms for the MWRA problem may help in devising better techniques for the automated reconstruction of consistent tree structures from noisy images.

Fig. 68.2 a,b (a) A 2D image of the retina of a human eye. The problem consists in the automatic reconstruction (or delineation) of the vascular structure. (b) The reconstruction of the vascular structure as produced by the algorithm proposed in [4]

3 DP-Heur: A Heuristic Approach to the MWRA Problem

In this section, we propose a new heuristic approach for solving the MWRA problem. First, starting from the root vertex $v_r$, a spanning arborescence $\hat{T}$ in $G$ is constructed as outlined in lines 2–9 of Algorithm 68.1. Second, a DP algorithm is applied to $\hat{T}$ in order to obtain the minimum-weight arborescence $T$ that is contained in $\hat{T}$ and rooted in $v_r$. The DP algorithm from [8] is used for this purpose. Given an undirected tree $T = (V_T, E_T)$ with vertex and/or edge weights, and any integer $k \in [0, |V_T| - 1]$, this DP algorithm provides – among all trees with exactly $k$ edges in $T$ – the minimum-weight tree. The first step of the DP algorithm consists in artificially converting the input tree into a rooted arborescence; therefore, the DP algorithm can directly be applied to arborescences. Moreover, as a side product, the DP algorithm also provides the minimum-weight trees for all $l$ with $0 \leq l \leq k$, as well as the minimum-weight arborescences rooted in $v_r$ for all $l$ with $0 \leq l \leq k$. Therefore, given an arborescence of maximal size $\hat{T}$, which has $|V| - 1$ arcs (where $V$ is the vertex set of the input graph $G$), the DP algorithm is applied with $k = |V| - 1$. Then, among all the minimum-weight arborescences rooted in $v_r$ for $l \leq |V| - 1$, the one with minimum weight is chosen as the output of the DP algorithm. In this way, the DP algorithm generates the minimum-weight arborescence $T$ (rooted in $v_r$) that can be found in arborescence $\hat{T}$. The heuristic described above is henceforth labeled DP-Heur. As a final remark, let us mention that this description assumes that all vertices of the input graph are reachable from $v_r$; appropriate changes have to be applied to the heuristic if this is not the case.
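The DP algorithm of [8] is more general than what DP-Heur needs, as it handles vertex and/or edge weights on arbitrary undirected trees. The following C++ sketch covers only the special case used here, under the assumption that the input is an arc-weighted arborescence rooted at $v_r$: for every size $l$ it computes the weight of the minimum-weight sub-arborescence rooted at $v_r$ with exactly $l$ arcs, and then returns the minimum over all $l$. All names are illustrative.

```cpp
#include <algorithm>
#include <limits>
#include <utility>
#include <vector>

const int INF = std::numeric_limits<int>::max() / 2;  // "no tree of this size"

// children[v] lists pairs (child, weight of arc v -> child) of T^hat.
// Returns best[j] = minimum weight of an arborescence rooted at v that uses
// exactly j arcs of v's subtree; best[0] = 0 is the single-vertex tree.
std::vector<int> solveSubtree(
        int v, const std::vector<std::vector<std::pair<int, int>>>& children) {
    std::vector<int> best(1, 0);
    for (auto [c, w] : children[v]) {
        std::vector<int> sub = solveSubtree(c, children);
        std::vector<int> merged(best.size() + sub.size(), INF);
        for (std::size_t j = 0; j < best.size(); ++j) {
            if (best[j] >= INF) continue;
            merged[j] = std::min(merged[j], best[j]);  // skip child c entirely
            for (std::size_t t = 0; t < sub.size(); ++t)
                if (sub[t] < INF)  // take arc (v,c) plus t arcs below c
                    merged[j + 1 + t] =
                        std::min(merged[j + 1 + t], best[j] + w + sub[t]);
        }
        best = std::move(merged);
    }
    return best;
}

// DP step of DP-Heur: minimum weight over all arborescences rooted at the
// root (vertex 0 by convention) contained in the spanning arborescence T^hat.
int minWeightRootedSubarborescence(
        const std::vector<std::vector<std::pair<int, int>>>& children) {
    std::vector<int> best = solveSubtree(0, children);
    return *std::min_element(best.begin(), best.end());
}
```

The inner loop is the usual tree-knapsack combination of child tables, so this sketch runs in $O(|V|^2)$ time on a tree with $|V|$ vertices.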

Algorithm 68.1 Heuristic DP-Heur for the MWRA problem

 1: input: a DAG G = (V, A), and a root node v_r
 2: T_0 := (V_0 = {v_r}, A_0 = ∅)
 3: A_pos := {a = (v_q, v_l) ∈ A | v_q ∈ V_0, v_l ∉ V_0}
 4: for i = 1, …, |V| − 1 do
 5:   a* = (v_q, v_l) := argmin{w(a) | a ∈ A_pos}
 6:   A_i := A_{i−1} ∪ {a*}
 7:   V_i := V_{i−1} ∪ {v_l}
 8:   T_i := (V_i, A_i)
 9:   A_pos := {a = (v_q, v_l) ∈ A | v_q ∈ V_i, v_l ∉ V_i}
10: end for
11: T := Dynamic_Programming(T_{|V|−1}, k = |V| − 1)
12: output: arborescence T
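A direct C++ transcription of the construction loop (lines 2–9) might look as follows; it reuses the illustrative Arc type from above and assumes every vertex is reachable from the root. The linear scan over all arcs in each iteration is kept for clarity; a priority queue would be the more efficient choice.

```cpp
#include <set>
#include <vector>

struct Arc { int tail, head, weight; };

// Lines 2-9 of Algorithm 68.1: starting from {v_r}, repeatedly add the
// cheapest arc leading from a covered vertex to an uncovered one.
std::vector<Arc> greedySpanningArborescence(int numVertices, int root,
                                            const std::vector<Arc>& arcs) {
    std::set<int> covered = {root};   // V_0 = {v_r}
    std::vector<Arc> tree;            // A_0 = {}
    while (static_cast<int>(tree.size()) < numVertices - 1) {
        const Arc* best = nullptr;    // a* = argmin{w(a) | a in A_pos}
        for (const Arc& a : arcs)
            if (covered.count(a.tail) && !covered.count(a.head))
                if (best == nullptr || a.weight < best->weight) best = &a;
        tree.push_back(*best);        // A_i := A_{i-1} + {a*}
        covered.insert(best->head);   // V_i := V_{i-1} + {v_l}
    }
    return tree;
}
```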

4 Ant Colony Optimization for the MWRA Problem

The ACO approach for the MWRA problem described in the following is a MAX–MIN Ant System (MMAS) [9] implemented in the hyper-cube framework (HCF) [10]. The algorithm, whose pseudocode can be found in Algorithm 68.2, works roughly as follows. At each iteration, $n_a$ solutions to the problem are probabilistically constructed based on both pheromone and heuristic information. The second algorithmic component executed at each iteration is the pheromone update, in which some of the constructed solutions – namely the iteration-best solution $T^{ib}$, the restart-best solution $T^{rb}$, and the best-so-far solution $T^{bs}$ – are used to modify the pheromone values. This is done with the goal of focusing the search over time on high-quality areas of the search space. Like any other MMAS algorithm, our approach employs restarts, consisting of a re-initialization of the pheromone values. Restarts are controlled by the so-called convergence factor (cf) and a Boolean control variable called bs_update. The main functions of our approach are outlined in detail in the following.

Algorithm 68.2 Ant Colony Optimization for the MWRA problem

 1: input: a DAG G = (V, A), and a root node v_r
 2: T^bs := ({v_r}, ∅), T^rb := ({v_r}, ∅), cf := 0, bs_update := false
 3: τ_a := 0.5 for all a ∈ A
 4: while termination conditions not met do
 5:   S := ∅
 6:   for i = 1, …, n_a do
 7:     T_i := Construct_Solution(G, v_r)
 8:     S := S ∪ {T_i}
 9:   end for
10:   T^ib := argmin{f(T) | T ∈ S}
11:   if f(T^ib) < f(T^rb) then T^rb := T^ib
12:   if f(T^ib) < f(T^bs) then T^bs := T^ib
13:   ApplyPheromoneUpdate(cf, bs_update, 𝒯, T^ib, T^rb, T^bs)
14:   cf := ComputeConvergenceFactor(𝒯)
15:   if cf > 0.99 then
16:     if bs_update = true then
17:       τ_a := 0.5 for all a ∈ A
18:       T^rb := ({v_r}, ∅)
19:       bs_update := false
20:     else
21:       bs_update := true
22:     end if
23:   end if
24: end while
25: output: T^bs, the best solution found by the algorithm

Construct_Solution(G, v_r): This function first constructs a spanning arborescence $\hat{T}$ in the way shown in lines 2–9 of Algorithm 68.1. However, the choice of the next arc to be added to the current arborescence at each step (see line 5 of Algorithm 68.1) is made in a different way: instead of deterministically choosing the arc from $A_{pos}$ with the smallest weight, the choice is made probabilistically, based on pheromone and heuristic information. The pheromone model $\mathcal{T}$ used for this purpose contains a pheromone value $\tau_a$ for each arc $a \in A$. The heuristic information $\eta(a)$ of an arc $a$ is computed as follows. First, let

$w_{\max} := \max\{w(a) \mid a \in A\}$
(68.2)

Based on this maximal weight of all arcs in G, the heuristic information is defined as follows:

$\eta(a) := w_{\max} + 1 - w(a)$
(68.3)

In this way, the heuristic information of all arcs is a positive number. Moreover, the arc with minimal weight has the highest heuristic value. Given an arborescence $T_i$ (obtained after the $i$th construction step) and the nonempty set of arcs $A_{pos}$ that may be used for extending $T_i$, the probability of choosing arc $a \in A_{pos}$ is defined as

$p(a \mid T_i) := \frac{\tau_a \cdot \eta(a)}{\sum_{\hat{a} \in A_{pos}} \tau_{\hat{a}} \cdot \eta(\hat{a})}$
(68.4)

However, instead of always choosing an arc from $A_{pos}$ probabilistically, the following scheme is applied at each construction step. First, a value $r \in [0, 1]$ is chosen uniformly at random. Second, $r$ is compared to the so-called determinism rate $\delta \in [0, 1]$, which is a fixed parameter of the algorithm. If $r \leq \delta$, the arc $a^* \in A_{pos}$ with the maximum probability is chosen, that is,

$a^* := \operatorname{argmax}\{p(a \mid T_i) \mid a \in A_{pos}\}$
(68.5)

Otherwise, that is, when $r > \delta$, arc $a^* \in A_{pos}$ is chosen probabilistically according to the probability values of (68.4).

The output $T$ of the function Construct_Solution(G, v_r) is the minimum-weight arborescence encountered during the construction of $\hat{T}$, that is,

$T := \operatorname{argmin}\{f(T_i) \mid i = 0, \ldots, |V| - 1\}$.
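The arc-selection step just described can be sketched in C++ as follows. Each candidate from $A_{pos}$ carries its pheromone value $\tau_a$ and heuristic value $\eta(a)$ computed via (68.3); the struct and function names are illustrative.

```cpp
#include <random>
#include <vector>

struct Candidate {
    int arcIndex;  // index of the arc in the DAG's arc list
    double tau;    // pheromone value tau_a
    double eta;    // heuristic information eta(a) = w_max + 1 - w(a)
};

// One construction step: with probability delta pick the argmax of (68.4),
// otherwise sample an arc proportionally to tau_a * eta(a).
int chooseArc(const std::vector<Candidate>& aPos, double delta,
              std::mt19937& rng) {
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    if (uni(rng) <= delta) {  // deterministic choice, (68.5)
        std::size_t best = 0;
        for (std::size_t i = 1; i < aPos.size(); ++i)
            if (aPos[i].tau * aPos[i].eta > aPos[best].tau * aPos[best].eta)
                best = i;
        return aPos[best].arcIndex;
    }
    std::vector<double> scores;  // probabilistic choice, (68.4)
    scores.reserve(aPos.size());
    for (const Candidate& c : aPos) scores.push_back(c.tau * c.eta);
    std::discrete_distribution<int> pick(scores.begin(), scores.end());
    return aPos[pick(rng)].arcIndex;
}
```

Note that std::discrete_distribution normalizes the scores internally, which corresponds exactly to the normalization in (68.4).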

ApplyPheromoneUpdate(cf, bs_update, $\mathcal{T}$, $T^{ib}$, $T^{rb}$, $T^{bs}$): The pheromone update is performed in the same way as in all MMAS algorithms implemented in the HCF. The three solutions $T^{ib}$, $T^{rb}$, and $T^{bs}$ (as described at the beginning of this section) are used for the pheromone update. The influence of these three solutions on the pheromone update is determined by the current value of the convergence factor cf, whose computation is described below. Each pheromone value $\tau_a \in \mathcal{T}$ is updated as follows:

$\tau_a := \tau_a + \rho \cdot (\xi_a - \tau_a)$ ,
(68.6)

where

$\xi_a := \kappa_{ib} \cdot \Delta(T^{ib}, a) + \kappa_{rb} \cdot \Delta(T^{rb}, a) + \kappa_{bs} \cdot \Delta(T^{bs}, a)$ ,
(68.7)

where $\kappa_{ib}$ is the weight of solution $T^{ib}$, $\kappa_{rb}$ that of solution $T^{rb}$, and $\kappa_{bs}$ that of solution $T^{bs}$. Moreover, $\Delta(T, a)$ evaluates to 1 if and only if arc $a$ is a component of arborescence $T$; otherwise, it evaluates to 0. Note also that the three weights must be chosen such that $\kappa_{ib} + \kappa_{rb} + \kappa_{bs} = 1$. After the application of (68.6), pheromone values that exceed $\tau_{\max} = 0.99$ are set back to $\tau_{\max}$, and pheromone values that have fallen below $\tau_{\min} = 0.01$ are set back to $\tau_{\min}$. This prevents the algorithm from reaching a state of complete convergence. Finally, note that the exact values of the weights depend on the convergence factor cf and on the value of the Boolean control variable bs_update. The standard schedule shown in Table 68.1 has been adopted for our algorithm.
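A sketch of the update rule (68.6)–(68.7), including the clamping to $[\tau_{\min}, \tau_{\max}]$, is given below. The $\kappa$ weights are passed in as parameters, since their concrete values follow the cf-dependent schedule of Table 68.1; all names are illustrative.

```cpp
#include <algorithm>
#include <vector>

// Apply (68.6) and (68.7) to every pheromone value. inIb, inRb and inBs
// encode the Delta function: per arc, membership in T^ib, T^rb and T^bs.
void applyPheromoneUpdate(std::vector<double>& tau,
                          const std::vector<bool>& inIb,
                          const std::vector<bool>& inRb,
                          const std::vector<bool>& inBs,
                          double kappaIb, double kappaRb, double kappaBs,
                          double rho,  // learning rate
                          double tauMin = 0.01, double tauMax = 0.99) {
    for (std::size_t a = 0; a < tau.size(); ++a) {
        double xi = kappaIb * inIb[a] + kappaRb * inRb[a] + kappaBs * inBs[a];
        tau[a] += rho * (xi - tau[a]);               // (68.6)
        tau[a] = std::clamp(tau[a], tauMin, tauMax); // keep off the bounds 0/1
    }
}
```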

Table 68.1 Setting of κ ib , κ rb , and κ bs depending on the convergence factor cf and the Boolean control variable bs_update

ComputeConvergenceFactor($\mathcal{T}$): The convergence factor cf is computed on the basis of the pheromone values:

$cf := 2 \left( \frac{\sum_{\tau_a \in \mathcal{T}} \max\{\tau_{\max} - \tau_a,\; \tau_a - \tau_{\min}\}}{|\mathcal{T}| \cdot (\tau_{\max} - \tau_{\min})} - 0.5 \right)$ .

This results in cf = 0 when all pheromone values are equal to 0.5. On the other hand, when all pheromone values are equal to either $\tau_{\min}$ or $\tau_{\max}$, then cf = 1. In all other cases, cf has a value in $(0, 1)$. This completes the description of all components of the proposed algorithm, which is henceforth labeled Aco.
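The convergence-factor computation and the restart logic of lines 15–23 of Algorithm 68.2 can be sketched as follows; names are illustrative.

```cpp
#include <algorithm>
#include <vector>

// cf = 0 when all pheromone values equal 0.5; cf = 1 when every value sits
// at tau_min or tau_max; anything in between yields cf in (0, 1).
double computeConvergenceFactor(const std::vector<double>& tau,
                                double tauMin = 0.01, double tauMax = 0.99) {
    double sum = 0.0;
    for (double t : tau) sum += std::max(tauMax - t, t - tauMin);
    return 2.0 * (sum / (tau.size() * (tauMax - tauMin)) - 0.5);
}

// Restart handling (lines 15-23): on first convergence switch to best-so-far
// updates; on the next one, re-initialize the pheromone values. Returns true
// when the caller must also reset the restart-best solution T^rb.
bool handleConvergence(std::vector<double>& tau, bool& bsUpdate, double cf) {
    if (cf <= 0.99) return false;
    if (bsUpdate) {
        std::fill(tau.begin(), tau.end(), 0.5);
        bsUpdate = false;
        return true;
    }
    bsUpdate = true;
    return false;
}
```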

5 Experimental Evaluation

The algorithms proposed in this chapter – that is, DP-Heur and Aco – were implemented in ANSI C++, using GCC 4.4 for compiling the software. Moreover, the heuristic proposed in [5] was reimplemented. As mentioned before, this heuristic – henceforth labeled VenSri – is the only existing algorithm that can directly be applied to the MWRA problem. All three algorithms were experimentally evaluated on a cluster of PCs equipped with Intel Xeon X3350 processors at 2667 MHz and 8 GB of memory. In the following, we first describe the set of benchmark instances used to test the three algorithms. Afterward, the algorithm tuning and the experimental results are described in detail.

5.1 Benchmark Instances

A diverse set of benchmark instances was generated in the following way. Three parameters are necessary for the generation of a benchmark instance $G = (V, A)$: $n$ and $m$ indicate, respectively, the number of vertices and the number of arcs of $G$, while $q \in [0, 1]$ indicates the probability for the weight of an arc to be positive (rather than nonpositive). The generation of an instance starts by constructing a random arborescence $T$ with $n$ vertices, whose root vertex is called $v_r$. Each of the remaining $m - n + 1$ arcs was generated by randomly choosing two vertices $v_i$ and $v_j$ and adding the corresponding arc $a = (v_i, v_j)$ to $T$. In this context, $a = (v_i, v_j)$ may be added to $T$ if and only if its addition produces no directed cycle and neither $(v_i, v_j)$ nor $(v_j, v_i)$ already forms part of the graph. The weight of each arc was chosen by first deciding with probability $q$ whether the weight is positive (or nonpositive). In the case of a positive weight, the weight value was chosen uniformly at random from $[1, 100]$; in the case of a nonpositive weight, it was chosen uniformly at random from $[-100, 0]$.
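This generation procedure can be sketched in C++ as follows; the DFS-based reachability test used to rule out directed cycles and all names are illustrative simplifications.

```cpp
#include <functional>
#include <random>
#include <set>
#include <vector>

struct Arc { int tail, head, weight; };

// Depth-first reachability test: does 'from' reach 'to' in the current
// graph? Adding arc u -> v creates a directed cycle iff v already reaches u.
bool reaches(int from, int to, const std::vector<std::vector<int>>& adj) {
    std::vector<bool> seen(adj.size(), false);
    std::function<bool(int)> dfs = [&](int v) {
        if (v == to) return true;
        seen[v] = true;
        for (int w : adj[v])
            if (!seen[w] && dfs(w)) return true;
        return false;
    };
    return dfs(from);
}

std::vector<Arc> generateInstance(int n, int m, double q, std::mt19937& rng) {
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    std::vector<Arc> arcs;
    std::vector<std::vector<int>> adj(n);
    std::set<std::pair<int, int>> used;
    auto addArc = [&](int u, int v) {
        int w = (uni(rng) < q) ? 1 + static_cast<int>(uni(rng) * 100)  // [1,100]
                               : -static_cast<int>(uni(rng) * 101);    // [-100,0]
        arcs.push_back({u, v, w});
        adj[u].push_back(v);
        used.insert({u, v});
    };
    for (int v = 1; v < n; ++v)  // random arborescence rooted at vertex 0
        addArc(std::uniform_int_distribution<int>(0, v - 1)(rng), v);
    while (static_cast<int>(arcs.size()) < m) {  // the remaining m - n + 1 arcs
        int u = std::uniform_int_distribution<int>(0, n - 1)(rng);
        int v = std::uniform_int_distribution<int>(0, n - 1)(rng);
        if (u == v || used.count({u, v}) || used.count({v, u})) continue;
        if (reaches(v, u, adj)) continue;  // arc u -> v would close a cycle
        addArc(u, v);
    }
    return arcs;
}
```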

In order to generate a diverse set of benchmark instances, the following values for n, m, and q were considered:

  • $n \in \{20, 50, 100, 500, 1000, 5000\}$;

  • $m \in \{2n, 4n, 6n\}$;

  • $q \in \{0.25, 0.5, 0.75\}$.

For each combination of $n$, $m$, and $q$, 10 problem instances were generated, resulting in a total of 540 problem instances, that is, 180 instances for each value of $q$.

5.2 Algorithm Tuning

The proposed ACO algorithm has several parameters that require appropriate values. The following parameters, which are crucial for the behavior of the algorithm, were chosen for tuning:

  • $n_a \in \{3, 5, 10, 20\}$: the number of ants (solution constructions) per iteration;

  • $\rho \in \{0.05, 0.1, 0.2\}$: the learning rate;

  • $\delta \in \{0.0, 0.4, 0.7, 0.9\}$: the determinism rate.

For tuning, we chose the first problem instance (out of 10) for each combination of $n$, $m$, and $q$. A full factorial design was utilized; that is, ACO was applied (exactly once) with each parameter combination to each of the problem instances chosen for tuning. The stopping criterion was fixed to 20000 solution evaluations for each application of ACO. For analyzing the results, we used a rank-based analysis. However, as the set of problem instances is quite diverse, this rank-based analysis was performed separately for six subsets of instances. For defining these subsets, we refer to the instances with $n \in \{20, 50, 100\}$ as small instances and to the remaining ones as large instances. With this definition, each of the three subsets of instances corresponding to the three different values of $q$ was further separated into two subsets according to instance size. For each of these six subsets, we used the parameter setting with which ACO achieved the best average rank on the corresponding tuning instances. These parameter settings are given in Table 68.2.

Table 68.2 Parameter setting (concerning ACO) used for the final experiments

5.3 Results

The three algorithms considered for the comparison were applied exactly once to each of the 540 problem instances of the benchmark set. Although Aco is a stochastic search algorithm, this is a valid choice, because results are averaged over groups of instances that were generated with the same parameters. As in the case of the tuning experiments, the stopping criterion for Aco was fixed to 20000 solution evaluations. Tables 68.3–68.5 present the results averaged – for each algorithm – over the 10 instances for each combination of $n$ and $m$ (as indicated in the first two table columns). Four table columns are used for presenting the results of each algorithm. The column with heading value provides the average of the objective function values of the best solutions found by the respective algorithm for the 10 instances of each combination of $n$ and $m$. The second column (with heading std) contains the corresponding standard deviation. The third column (with heading size) indicates the average size (in terms of the number of arcs) of the best solutions found by the respective algorithm (remember that solutions – that is, arborescences – may have any number of arcs between 0 and $|V| - 1$, where $|V|$ is the number of vertices of the input DAG $G = (V, A)$). Finally, the fourth column (with heading time (s)) contains the average computation time (in seconds). For all three algorithms, the computation time refers to the time at which the algorithm terminated. In the case of Aco, an additional table column (with heading evals) indicates at which solution evaluation, on average, the best solution of a run was found. Finally, for each combination of $n$ and $m$, the result of the best-performing algorithm is indicated in bold font.

Table 68.3 Experimental results for the 180 instances with q = 0.25. Aco is compared to the heuristic proposed in this work (DP-Heur) and to the algorithm from [5] (VenSri)
Table 68.4 Experimental results for the 180 instances with q = 0.5. Aco is compared to the heuristic proposed in this work (DP-Heur) and to the algorithm from [5] (VenSri)
Table 68.5 Experimental results for the 180 instances with q = 0.75. Aco is compared to the heuristic proposed in this work (DP-Heur) and to the algorithm from [5] (VenSri)

Concerning the 180 instances with $q = 0.25$, the results allow us to make the following observations. First, Aco is the best-performing algorithm for all combinations of $n$ and $m$. Averaged over all problem instances, Aco obtains an improvement of 29.8% over VenSri. Figure 68.3a shows the average improvement of Aco over VenSri for three groups of input instances corresponding to the different arc densities. It is interesting to observe that the advantage of Aco over VenSri seems to grow when the arc density increases. On the downside, these improvements are obtained at the cost of a significantly increased computation time. Concerning heuristic DP-Heur, we can observe that it improves over VenSri for all combinations of $n$ and $m$, apart from $(n = 100, m = 2n)$ and $(n = 500, m = 2n)$. This seems to indicate that, also for DP-Heur, the sparse instances pose more of a challenge than the dense ones. Averaged over all problem instances, DP-Heur obtains an improvement of 18.6% over VenSri. The average improvement of DP-Heur over VenSri is also shown for the three groups of input instances in Fig. 68.3a. Concerning a comparison of the computation times, we can state that DP-Heur has a clear advantage over VenSri, especially for large-size problem instances.

Fig. 68.3 a–c Average improvement (in %) of Aco and DP-Heur over VenSri. Positive values correspond to an improvement, while negative values indicate that the respective algorithm is inferior to VenSri. The improvement is shown for the three different arc densities considered in the benchmark set, that is, m = 2n, m = 4n, and m = 6n

Concerning the remaining 360 instances (q = 0.5 and q = 0.75), we can make the following additional observations. First, both Aco and DP-Heur seem to experience a downgrade in performance (in comparison to the performance of VenSri) when $q$ increases. This holds especially for rather large and rather sparse graphs. While both algorithms still obtain an average improvement over VenSri in the case of $q = 0.5$ – that is, 19.9% improvement in the case of Aco and 7.3% in the case of DP-Heur – both algorithms are on average inferior to VenSri in the case of $q = 0.75$.

Finally, Fig. 68.4 presents the information contained in column size of Tables 68.3–68.5 in graphical form. It is interesting to observe that the solutions produced by DP-Heur consistently seem to be the smallest ones, while those produced by VenSri generally seem to be the largest ones. The size of the solutions produced by Aco is generally in between these two extremes. Moreover, with growing $q$, the difference in the size of the solutions produced by the three algorithms seems to be more pronounced. We currently have no explanation for this aspect, which certainly deserves further investigation.

Fig. 68.4 These graphics show, for each combination of n and m, information about the average size – in terms of the number of arcs – of the solutions produced by DP-Heur, Aco, and VenSri

6 Conclusions and Future Work

In this work, we have proposed a heuristic and an ACO approach for the minimum-weight rooted arborescence problem. The heuristic makes use of dynamic programming as a subordinate procedure and may therefore be regarded as a hybrid algorithm. In contrast, the proposed ACO algorithm is a pure metaheuristic approach. The experimental results show that both approaches are superior to an existing heuristic from the literature when the fraction of arcs with positive weights is not too high, as well as on rather dense graphs. However, for sparse graphs with a rather large fraction of positive weights, the existing heuristic from the literature seems to have advantages over the algorithms proposed in this chapter.

Concerning future work, we plan to develop a hybrid ACO approach that makes use of dynamic programming as a subordinate procedure, in a way similar to the proposed heuristic. Moreover, we plan to implement an integer programming model for the tackled problem – along the lines of the model proposed in [11] for a related problem – and to solve it with an efficient integer programming solver.