1 Introduction

Partitioning a set of objects into two or more subsets constitutes an important class of problems in combinatorial optimization. A member of this class can often be modeled by a graph whose vertices represent objects and whose edges link pairs of vertices that have some kind of relationship. Let \(G=(V,E)\) be an undirected graph with vertex set V and edge set E. For an integer \(m\geqslant 2\), a solution to a graph partitioning problem is a set \(p=\{V_1,\ldots ,V_m\}\) in which \(V_i\), \(i\in M=\{1,\ldots ,m\}\), are nonempty and mutually disjoint subsets of V such that \(\cup _{i\in M} V_i=V\). The various graph partitioning problems differ in the objective function and/or in the constraints. In many cases, the objective function incorporates the total weight of the edges connecting vertices from different subsets in partition p. Let the edge weights of graph G be denoted by \(c_{uv}\), \((u,v)\in E\). For \(V_k,V_l\subset V\), \(V_k\cap V_l=\emptyset \), the sum \(C(V_k,V_l)=\sum _{u\in V_k,v\in V_l} c_{uv}\) is called the cut between subsets \(V_k\) and \(V_l\). There are several graph partitioning problems that require minimizing the cut between partition subsets (or clusters). One of them is the ratio cut graph partitioning problem. For a graph G and a fixed integer \(m\geqslant 2\), it is stated as

$$\begin{aligned} \min _{p\in \varPi } F_r(p)= \sum _{k=1}^m C(V_k)/|V_k| \end{aligned}$$
(1)

where \(C(V_k)\) is shorthand for \(C(V_k,V\setminus V_k)\) and \(\varPi \) is the set of all partitions of V into m nonempty subsets. Thus, the contribution of a partition subset \(V_k\) to (1) is represented by the ratio of the cut between \(V_k\) and the rest of the graph to the cardinality of \(V_k\). Problem (1) was first considered by Wei and Cheng [69] and Hagen and Kahng [25] for \(m=2\) and then generalized to a multiway ratio cut (i.e., for \(m>2\)) by Chan et al. [11]. Later, Shi and Malik [58] introduced another objective function called normalized cut. The corresponding graph partitioning problem is expressed as follows:

$$\begin{aligned} \min _{p\in \varPi } F_n(p)= \sum _{k=1}^m C(V_k)/d(V_k) \end{aligned}$$
(2)

where \(d(V_k)=\sum _{v\in V_k} d_v\) and the sum \(d_v=\sum _{u\in V} c_{vu}\) is referred to as the weighted degree of the vertex \(v\in V\). In (2), the term for a partition subset \(V_k\) is the ratio of the cut between \(V_k\) and the rest of the graph to the sum of the weighted degrees of the vertices in \(V_k\). Yu and Shi [70] extended the normalized cut model to \(m>2\) partition subsets.
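
To make the two objectives concrete, the following minimal Python sketch evaluates both \(F_r\) and \(F_n\) for a given partition. The weight matrix and the partition at the bottom are illustrative examples of our own, not data used elsewhere in this paper.

```python
import numpy as np

def cut(W, subset, rest):
    """C(V_k, V_l): sum of edge weights between two disjoint vertex sets."""
    return W[np.ix_(subset, rest)].sum()

def ratio_cut(W, partition):
    """F_r(p) = sum_k C(V_k) / |V_k|, as in Eq. (1)."""
    n = W.shape[0]
    total = 0.0
    for subset in partition:
        rest = [v for v in range(n) if v not in subset]
        total += cut(W, subset, rest) / len(subset)
    return total

def normalized_cut(W, partition):
    """F_n(p) = sum_k C(V_k) / d(V_k), as in Eq. (2)."""
    n = W.shape[0]
    d = W.sum(axis=1)                      # weighted degrees d_v
    total = 0.0
    for subset in partition:
        rest = [v for v in range(n) if v not in subset]
        total += cut(W, subset, rest) / d[subset].sum()
    return total

# Illustrative 5-vertex weighted graph and a partition into 2 subsets.
W = np.array([[0, 2, 1, 0, 0],
              [2, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 3],
              [0, 0, 0, 3, 0]], dtype=float)
p = [[0, 1, 2], [3, 4]]
print(ratio_cut(W, p))       # 1/3 + 1/2 = 0.8333...
print(normalized_cut(W, p))  # 1/9 + 1/7 = 0.2539...
```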

A salient feature of both the ratio cut and normalized cut models is that no constraints are imposed on the partition subset size. This makes a significant difference versus graph partitioning problems in which the size of each subset is bounded from above and below. A well-studied problem of this sort is the maximally diverse grouping problem (MDGP). Many algorithms for solving the MDGP have been proposed, including variable neighborhood search [5, 51], hybrid genetic algorithms [51, 60], multistart simulated annealing [51], tabu search with strategic oscillation [22], the artificial bee colony algorithm [55], iterated tabu search [52], iterated maxima search [36], and neighborhood decomposition-based variable neighborhood search and tabu search [37]. Other related graph partitioning problems include maximum k-cut [15], ratio association [16], minimum conductance graph partitioning [10, 43, 44], overlapping normalized cut [63], cohesive clustering [7], partition size constrained minimum cut [1], and edge-ratio clustering [6].

Many applications of the ratio cut and normalized cut graph partitioning problems have been identified in the literature. Perhaps the main application of these graph partitioning models can be observed in clustering. Ratio cut and normalized cut-based clustering methods can be used in a variety of domains, such as image data analysis [14], pattern recognition [33, 61], web search [63], data mining [3], image segmentation [28, 58], and gene network analysis [16]. Other applications include community detection [32, 39, 48], data classification [47, 73], VLSI design [25], tree segmentation [35], salient object detection [21], robot swarm dynamic regrouping [53], Bayesian-type statistics [59], water distribution network partitioning [71], and detecting similar groups in heterogeneous networks [40].

It is well known that both the ratio cut and normalized cut graph partitioning problems are NP-hard [25, 30]. Therefore, it is necessary to develop heuristic algorithms for these problems. Exact methods can be used to solve problem instances with very small sizes only. Fan and Pardalos [17] presented quadratically constrained programs that can be used to find optimal ratio and normalized cuts. They reported computational results for graphs with 10 vertices. For large graphs, one must resort to heuristic algorithms, which provide good but not necessarily optimal solutions. Most of the efforts in this direction have been focused on developing spectral methods for graph partitioning. The basic idea of these methods is to relax (1) and (2) into continuous optimization problems. The latter can be solved by first computing the eigenvectors of the Laplacian matrix of the graph and then finding the final partition using k-means or other suitable algorithms. Many studies have been devoted to the design and performance evaluation of spectral algorithms for ratio cut and normalized cut, including those by Hagen and Kahng [25], Chan et al. [11], Fan and Pardalos [17], Hochbaum [30], Merkurjev et al. [47], Zhang et al. [72], and Han et al. [26]. Many authors have proposed various improvements to the standard spectral clustering method. Lu et al. [42] presented nonnegative and sparse spectral clustering algorithms based on the ratio cut and normalized cut criteria. Experiments have shown that these algorithms outperform the standard spectral clustering technique. Chen et al. [12] proposed a direct normalized cut algorithm exploiting the idea of directly optimizing the normalized cut model. The algorithm has only a quadratic time complexity. Chen et al. [13] considered a new normalized cut model with balance regularization to avoid a trivial solution. They developed an iterative method to solve the new model without using eigendecomposition. More comprehensive discussions of the spectral method for graph clustering and partitioning can be found in the tutorial by von Luxburg [64] and survey papers by Nascimento and de Carvalho [49] and Gallier [23].

Although the traditional approach to ratio cut and normalized cut graph partitioning relies on using the spectral method, there are also algorithms that are based on different principles. Jia et al. [33] proposed an approximate weighted kernel k-means algorithm for the normalized cut. The algorithm avoids the direct eigendecomposition of the Laplacian matrix and is suitable for handling very large graphs. Lorente-Leyva et al. [41] outlined two alternative approaches for the normalized cut partitioning problem. One approach is a heuristic search procedure, and the other is a quadratic formulation-based method. Dhillon et al. [16] designed a fast multilevel algorithm that can be tuned to minimize specific objectives, including ratio cut and normalized cut. The refinement step of the algorithm employs the weighted kernel k-means technique. Fan and Pardalos [17] presented several semidefinite programming relaxations for ratio and normalized cut models. They developed a graph partitioning algorithm whose main step involves solving one of these relaxations. Compared with the spectral clustering method, this algorithm can obtain improved solutions.

However, the literature offers only a few metaheuristic-based approaches for finding the minimum ratio cut or normalized cut of a graph. One such approach was proposed by Hansen et al. [27]. They developed a variable neighborhood search (VNS) algorithm for the normalized cut model. Their local search (LS) procedure within the VNS framework analyzes all the possibilities of moving a vertex from one partition subset to another. The algorithm also employs a fast LS technique that considers only moves between connected subsets (two vertex subsets \(V_k\) and \(V_l\) are connected if there is an edge with one endpoint in \(V_k\) and the other in \(V_l\)). During VNS execution, this technique is combined with the complete LS strategy. The experimental results show that the VNS method of Hansen et al. [27] is an efficient approach for solving the normalized cut graph partitioning problem. Metaheuristic-based approaches have also been developed for several related problems. Cafieri et al. [6] proposed a VNS algorithm for graph bipartitioning with the edge-ratio criterion. The algorithm is embedded into a hierarchical divisive heuristic to obtain a partition of the vertex set of the graph into a larger number of subsets. Mu et al. [48] and Ji et al. [32] presented ant colony optimization algorithms for community detection in complex networks. In their approach, the ratio cut is included as a linear term in the objective function (modularity density) of the problem. Lu et al. [43] developed a hybrid evolutionary algorithm for finding the minimum conductance of a graph. The algorithm employs a tabu search technique as a local optimization procedure.

Literature analysis shows that the ratio cut and normalized cut graph partitioning problems have been mostly addressed by using various heuristic techniques, most often obtained through continuous relaxations of the ratio cut or normalized cut models. Very little research has been devoted to metaheuristic optimization approaches, although graph partitioning and clustering have many applications in a wide range of areas, as outlined earlier in this section. Considering these observations, our motivation is to implement and investigate the performance of metaheuristic-based algorithms for graph partitioning with the ratio cut and normalized cut criteria. The main significance of this work lies in the development and experimental comparison of several metaheuristic algorithms for these graph partitioning problems. In the case of the normalized cut criterion, the comparison includes both the new algorithms and the best existing method. The significance of comparative analysis of various approaches is that it can help identify promising directions for designing better algorithms. The results of our computational study underscore the potential of evolutionary methods for solving graph partitioning problems. The main contribution of our work consists of three diverse algorithms for ratio cut and normalized cut graph partitioning. A simulated annealing (SA) algorithm is selected because of its popularity and success in solving complex combinatorial optimization problems. To be able to apply a CPU time-based termination rule, we implemented SA as a multistart procedure. The iterated tabu search (ITS) algorithm is chosen because tabu search (TS) is one of the most widely used local search methods. To potentially achieve better performance, we apply TS iteratively. Population-based evolutionary approaches in our study are represented by the memetic algorithm (MA). We prefer MA over the genetic algorithm (GA) because MA usually demonstrates faster convergence and better optimization performance than GA. The algorithm choice was guided by two considerations. First, our focus is on metaheuristics that have been in use for a long time and have shown excellent performance in solving numerous combinatorial optimization problems. Second, we consider the diversity factor and do not limit ourselves to evolutionary techniques only. It seems plausible that some of the newer population-based methods could perform equally well or even outperform the ITS and SA algorithms. In recent years, many effective strategies have been proposed to address optimization problems. They include a memetic algorithm with competition [68], iterated local search with tabu search [50], a multistart iterated tabu search algorithm [4], brain storm optimization with an orthogonal learning mechanism [45], and a decomposition-based algorithm using a localized control variable analysis approach [46].

We experimentally compare the proposed algorithms on two different sets of graphs: random graphs and benchmark graphs from the literature. We report results for both ratio cut and normalized cut graph partitioning scenarios. We also present comparison results between our best algorithm and the state-of-the-art VNS-based algorithm of Hansen et al. [27].

The remainder of the paper is organized as follows. In Sects. 2, 3, and 4, we present the multistart simulated annealing (MSA), iterated tabu search, and memetic algorithms, respectively. Section 5 is devoted to experimental analysis and comparisons of algorithms. Concluding remarks are given in Sect. 6.

2 Multistart simulated annealing

In this section, we present an implementation of the SA method for ratio cut and normalized cut graph partitioning. This method is based on an analogy between the metallurgical process of annealing in thermodynamics and the process of searching for the global extremum of a function. This analogy has been effectively exploited by Kirkpatrick et al. [34] and Černý [8]. The guiding principle of the SA method is to escape a local optimum by accepting worsening moves with a certain probability. A recent overview of different SA variants can be found in [20].

Fig. 1 Illustration of the relocation move: vertex v is transferred to subset \(V_3\)

Fig. 2 Illustration of the swap move: the vertices u and v are interchanged

The idea of simulated annealing is to generate trial solutions in the neighborhood of the current solution and either accept or reject them. A new solution is always accepted if it improves the current solution. Otherwise, a decision on acceptance or rejection is made at random with a probability depending on the difference between the objective function values of the two solutions and the current temperature of the cooling process. Because the search starts with a high initial temperature, this probability is higher at the initial steps of the algorithm. As the algorithm proceeds, the temperature is reduced after each iteration. The algorithm stops whenever the temperature becomes very close to zero. At each temperature level, a certain number of trial solutions are evaluated. In our implementation of SA, the initial temperature is obtained by generating a random solution and repeatedly calculating the absolute difference between the objective function values of the random solution and a solution in its neighborhood. The temperature is initialized with the largest of these differences. Another important design feature of our algorithm is that SA is executed multiple times.
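
The acceptance rule described above is the classical Metropolis criterion. A minimal sketch for a minimization problem, with `delta` denoting the move gain and `T` the current temperature:

```python
import math
import random

def accept(delta, T):
    """Metropolis criterion: improving moves (delta <= 0) are always
    accepted; worsening moves are accepted with probability exp(-delta/T),
    which shrinks as the temperature T decreases."""
    return delta <= 0 or random.random() < math.exp(-delta / T)
```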

Without loss of generality, we present our multistart simulated annealing algorithm in terms of the ratio cut metric. The adaptation of the algorithm to the case of normalized cut graph partitioning requires only very minor changes that come from using a different objective function. The algorithm employs two move types, relocating a vertex from its current subset to another subset in a partition and swapping two vertices located in two distinct subsets. The relocation move is illustrated in Fig. 1, where vertex \(v\in V_1\) is moved to partition subset \(V_3\). An example of the swap move is shown in Fig. 2, where the solution on the right is obtained by swapping vertex \(v\in V_3\) with vertex \(u\in V_1\). The choice of move type at each iteration is controlled by a parameter \(Q\in [0,1]\), which defines the probability of selecting a swap move. Accordingly, the relocation move is performed with probability \(1-Q\). Assume that a vertex \(v\in V_k\), \(k\in M\), is relocated from its current subset \(V_k\) to subset \(V_l\), \(l\ne k\), of the partition \(p=\{V_1,\ldots ,V_m\}\in \varPi \). Let the resulting partition be denoted by \(p'\). Naturally, it is assumed that v belongs to the set \(V'(p)\), formed as the union of all subsets in p having cardinality greater than one. Thus, \(v\in V_k\subseteq V'(p)=\cup _{i=1,|V_i|>1}^m V_i\). For \(p\in \varPi \), we can define the relocation neighborhood \(N_1(p)\) of p as the set of all partitions that can be obtained from p by relocating a single vertex. The change in cost incurred by applying a move is called the move gain. In the case of the above-defined relocation move, it is denoted by \(\delta (p,p')\). Formally, \(\delta (p,p')=F_r(p')-F_r(p)\). To provide an expression for \(\delta \), we rewrite the objective function of (1) as \(F_r(p)=\sum _{i=1}^m R(V_i)\) where

$$\begin{aligned} R(V_i)=C(V_i)/|V_i|. \end{aligned}$$
(3)

For a vertex \(v\in V\) and a subset \(V_i\in p\), let \(c_v(V_i)=\sum _{u\in V_i} c_{vu}\). This number represents the sum of weights of edges between v and vertices in subset \(V_i\). Of course, the sum \(c_v(V_i)\) is also defined for \(v\in V_i\). The sums \(c_v(V_i)\) along with the cut weights \(C(V_i)\) and ratios \(R(V_i)\) can be used to compute \(\delta \) as follows.

Proposition 1

For a vertex \(v\in V_k\) and a subset \(V_l,l\ne k\), in the partition p,

$$\begin{aligned} \begin{aligned}&\delta (p,p')=(C(V_k)-d_v+2c_v(V_k))/(|V_k|-1) \\&+(C(V_l)+d_v-2c_v(V_l))/(|V_l|+1)-R(V_k)-R(V_l). \end{aligned} \end{aligned}$$
(4)

Proof

Let us denote the cut weights for partition \(p'\) by \(C'(V_i)\), \(V_i\in p'\). Imagine that in the first step of the relocation operation, vertex v is temporarily removed from graph G (and thus out of \(V_k\)). Then, the weight of the cut between \(V_k\) and \(V\setminus V_k\) decreases by \(d_v-c_v(V_k)\) and that between \(V_l\) and \(V\setminus V_l\) decreases by \(c_v(V_l)\). In the second step, the vertex v is inserted into \(V_l\). This increases the weight of the cut between \(V_k\) and \(V\setminus V_k\) by \(c_v(V_k)\) and that between \(V_l\) and \(V\setminus V_l\) by \(d_v-c_v(V_l)\). Thus

$$\begin{aligned} C'(V_k)=C(V_k)-d_v+2c_v(V_k) \end{aligned}$$
(5)

and

$$\begin{aligned} C'(V_l)=C(V_l)+d_v-2c_v(V_l). \end{aligned}$$
(6)

Considering the fact that \(C'(V_i)=C(V_i)\) for all \(i\ne k,l\), we can write

$$\begin{aligned}&\delta (p,p')=C'(V_k)/(|V_k|-1)+C'(V_l)/(|V_l|+1)\nonumber \\&\quad -R(V_k)-R(V_l). \end{aligned}$$
(7)

Substituting (5) and (6) into (7), we obtain (4). \(\square \)

To illustrate the computation of \(\delta \) by (4), we use Fig. 1 in which edge weights greater than one are indicated close to the edges. For \(k=1\) and \(l=3\), we have \(C(V_k)=3\), \(C(V_l)=3\), \(c_v(V_k)=2\), \(c_v(V_l)=2\), \(d_v=4\), \(R(V_k)=0.75\), and \(R(V_l)=1.5\). Putting these values in (4), we obtain \(\delta (p,p')=(3-4+4)/3+(3+4-4)/3-0.75-1.5=-0.25\).
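
This computation is easy to reproduce. The following snippet evaluates (4) from the aggregate quantities listed above; the subset sizes \(|V_k|=4\) and \(|V_l|=2\) are implied by the ratios \(R(V_k)=0.75\) and \(R(V_l)=1.5\).

```python
def relocation_gain(C_k, C_l, d_v, cv_k, cv_l, size_k, size_l):
    """Relocation gain delta(p, p') of Eq. (4), ratio cut case:
    move vertex v from subset V_k to subset V_l."""
    R_k, R_l = C_k / size_k, C_l / size_l
    return ((C_k - d_v + 2 * cv_k) / (size_k - 1)
            + (C_l + d_v - 2 * cv_l) / (size_l + 1)
            - R_k - R_l)

# Aggregate quantities read off Fig. 1 (k = 1, l = 3).
print(relocation_gain(C_k=3, C_l=3, d_v=4, cv_k=2, cv_l=2,
                      size_k=4, size_l=2))   # -0.25
```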

Upon acceptance of the move, the cut weights of the subsets \(V_k\) and \(V_l\) are updated according to Equations (5) and (6): \(C(V_k)\) is decreased by \(d_v-2c_v(V_k)\) and \(C(V_l)\) is increased by \(d_v-2c_v(V_l)\). The values of \(c_w(V_k)\) and \(c_w(V_l)\), \(w\in V\setminus \{v\}\), are updated by setting

$$\begin{aligned}&c_w(V_k):=c_w(V_k)-c_{wv}, \nonumber \\&c_w(V_l):=c_w(V_l)+c_{wv}. \end{aligned}$$
(8)

In the case of normalized cut graph partitioning, Equation (4) is slightly modified as follows:

$$\begin{aligned} \begin{aligned}&\delta (p,p')=(C(V_k)-d_v+2c_v(V_k))/(d(V_k)-d_v) \\&\quad +(C(V_l)+d_v-2c_v(V_l))/(d(V_l)+d_v)\\&\quad -C(V_k)/d(V_k)-C(V_l)/d(V_l). \end{aligned} \end{aligned}$$
(9)

If the move is accepted, then \(d_v\) is subtracted from \(d(V_k)\) and added to \(d(V_l)\). Certainly, the operations defined by Equations (5), (6), and (8) are also applied.
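
The following sketch (our own illustration, not the paper's code) applies these update rules on a small random instance and cross-checks them against recomputation from scratch. Here `q[v]` stores the index of the subset containing vertex v; this mapping is introduced formally later in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3
W = rng.random((n, n)); W = np.triu(W, 1); W += W.T     # symmetric weights, zero diagonal
q = np.array([0, 0, 0, 1, 1, 2, 2, 2])                  # q[v]: index of the subset of v
d = W.sum(axis=1)                                       # weighted degrees d_v
c = np.stack([W[:, q == i].sum(axis=1) for i in range(m)], axis=1)  # c[v, i] = c_v(V_i)
C = np.array([(d[q == i] - c[q == i, i]).sum() for i in range(m)])  # cut weights C(V_i)

def relocate(v, l):
    """Move v from its current subset k to subset l, updating the cut
    weights by (5) and (6) and the sums c_w(.) by (8)."""
    k = q[v]
    C[k] += -d[v] + 2 * c[v, k]          # Eq. (5)
    C[l] += d[v] - 2 * c[v, l]           # Eq. (6)
    c[:, k] -= W[:, v]                   # Eq. (8); the row of v itself only
    c[:, l] += W[:, v]                   # changes by W[v, v] = 0, so it is safe
    q[v] = l

relocate(2, 1)
# Cross-check the incremental bookkeeping against recomputation from scratch.
c_ref = np.stack([W[:, q == i].sum(axis=1) for i in range(m)], axis=1)
C_ref = np.array([(d[q == i] - c_ref[q == i, i]).sum() for i in range(m)])
assert np.allclose(c, c_ref) and np.allclose(C, C_ref)
```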

Next we consider the swap move. Assume that a vertex \(v\in V_k\) is interchanged with a vertex \(u\in V_l\), \(l\ne k\). As before, we denote the initial partition by p and the resulting partition by \(p'\). All the partitions that can be obtained from p in this way form the swap neighborhood \(N_2(p)\) of p. The gain of the swap move is denoted by \(\varDelta (p,p')\). By definition, \(\varDelta (p,p')=F_r(p')-F_r(p)\). The following statement presents a formula for \(\varDelta \).

Proposition 2

For a vertex \(v\in V_k\) and a vertex \(u\in V_l\), \(l\ne k\),

$$\begin{aligned} \varDelta (p,p')&=(C(V_k)+d_u-d_v+2(c_v(V_k)-c_u(V_k)+c_{uv}))/|V_k| \\&\quad +(C(V_l)+d_v-d_u+2(c_u(V_l)-c_v(V_l)+c_{uv}))/|V_l| \\&\quad -R(V_k)-R(V_l). \end{aligned}$$
(10)

Proof

We split the swap operation into four steps. Assume first that vertex v is temporarily removed from graph G. This action decreases \(C(V_k)\) by \(d_v-c_v(V_k)\) and \(C(V_l)\) by \(c_v(V_l)\). In the second step, vertex u is removed from graph G. As a result, \(C(V_k)\) is further decreased by \(c_u(V_k)-c_{uv}\) and \(C(V_l)\) by \(d_u-c_{uv}-c_u(V_l)\). The graph is restored by first adding vertex v to subset \(V_l\). This step increases \(C(V_k)\) by \(c_v(V_k)\). Since vertex u is removed from graph G, the current weighted degree of vertex v is \(d_v-c_{uv}\), and the current sum of edge weights between v and \(V_l\) is \(c_v(V_l)-c_{uv}\). Therefore, \(C(V_l)\) increases by \(d_v-c_{uv}-(c_v(V_l)-c_{uv})=d_v-c_v(V_l)\). In the last step, the vertex u is inserted into \(V_k\). Now, the sum of edge weights between u and \(V_k\) is \(c_u(V_k)-c_{uv}\). It follows that \(C(V_k)\) is increased by \(d_u-(c_u(V_k)-c_{uv})=d_u-c_u(V_k)+c_{uv}\). Since vertex v has been moved to subset \(V_l\), it also follows that \(C(V_l)\) is increased by \(c_u(V_l)+c_{uv}\). Now, let \(C'(V_i)\), \(V_i\in p'\), stand for the cut weights for partition \(p'\). Summing the changes in \(C(V_k)\) for each step of the above procedure, we obtain

$$\begin{aligned} C'(V_k)&=C(V_k)-(d_v-c_v(V_k))-(c_u(V_k)-c_{uv})+c_v(V_k)+(d_u-c_u(V_k)+c_{uv}) \\&=C(V_k)+d_u-d_v+2(c_v(V_k)-c_u(V_k)+c_{uv}). \end{aligned}$$
(11)

Analogously

$$\begin{aligned} C'(V_l)&=C(V_l)-c_v(V_l)-(d_u-c_{uv}-c_u(V_l))+(d_v-c_v(V_l))+(c_u(V_l)+c_{uv}) \\&=C(V_l)+d_v-d_u+2(c_u(V_l)-c_v(V_l)+c_{uv}). \end{aligned}$$
(12)

Clearly, \(C'(V_i)=C(V_i)\) for all \(i\ne k,l\). Therefore, substituting (11) and (12) into equation \(\varDelta (p,p')=C'(V_k)/|V_k|+C'(V_l)/|V_l|-R(V_k)-R(V_l)\) gives (10). \(\square \)

Figure 2 provides an example of the swap move. As before, only edge weights greater than 1 are shown. For \(k=3\) and \(l=1\), we have \(C(V_k)=5\), \(C(V_l)=5\), \(c_v(V_k)=1\), \(c_v(V_l)=3\), \(c_u(V_k)=2\), \(c_u(V_l)=1\), \(d_v=5\), \(d_u=3\), and \(R(V_k)=R(V_l)=5/3\). Then, using (10), we can calculate the swap move gain \(\varDelta (p,p')\), which is equal to \((5+3-5+2(1-2+1))/3+(5+5-3+2(1-3+1))/3-5/3-5/3=-2/3\).
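
This computation can be reproduced analogously to the relocation case. The edge weight \(c_{uv}=1\) is read off the worked computation above.

```python
def swap_gain(C_k, C_l, d_v, d_u, cv_k, cv_l, cu_k, cu_l, c_uv, size_k, size_l):
    """Swap gain Delta(p, p') of Eq. (10), ratio cut case:
    interchange vertex v in V_k with vertex u in V_l."""
    R_k, R_l = C_k / size_k, C_l / size_l
    return ((C_k + d_u - d_v + 2 * (cv_k - cu_k + c_uv)) / size_k
            + (C_l + d_v - d_u + 2 * (cu_l - cv_l + c_uv)) / size_l
            - R_k - R_l)

# Aggregate quantities read off Fig. 2 (k = 3, l = 1, |V_3| = |V_1| = 3).
print(swap_gain(C_k=5, C_l=5, d_v=5, d_u=3, cv_k=1, cv_l=3,
                cu_k=2, cu_l=1, c_uv=1, size_k=3, size_l=3))  # -2/3
```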

If the solution \(p'\) is accepted, then first, the cut weights of the subsets \(V_k\) and \(V_l\) are updated using Equations (11) and (12) (assuming that \(C'(V_k)\) and \(C'(V_l)\) represent new values of \(C(V_k)\) and \(C(V_l)\), respectively). Then, since vertices v and u are swapped, the following assignments are performed:

$$\begin{aligned}&c_v(V_k):=c_v(V_k)+c_{uv}, c_v(V_l):=c_v(V_l)-c_{uv}, \end{aligned}$$
(13)
$$\begin{aligned}&c_u(V_k):=c_u(V_k)-c_{uv}, c_u(V_l):=c_u(V_l)+c_{uv}, \end{aligned}$$
(14)
$$\begin{aligned}&c_w(V_k):=c_w(V_k)+c_{uw}-c_{vw},\nonumber \\&c_w(V_l):=c_w(V_l)+c_{vw}-c_{uw},\nonumber \\&w\in V\setminus \{u,v\}. \end{aligned}$$
(15)

For a normalized cut, the swap move gain is computed as follows:

$$\begin{aligned} \varDelta (p,p')&=(C(V_k)+d_u-d_v+2(c_v(V_k)-c_u(V_k)+c_{uv}))/(d(V_k)-d_v+d_u) \\&\quad +(C(V_l)+d_v-d_u+2(c_u(V_l)-c_v(V_l)+c_{uv}))/(d(V_l)+d_v-d_u) \\&\quad -C(V_k)/d(V_k)-C(V_l)/d(V_l). \end{aligned}$$
(16)

Equations (13)–(15) are applicable for both objective functions, \(F_r\) and \(F_n\).

Before presenting the MSA algorithm, we need to define a mapping, denoted by q, from vertex set V to set M, which represents the indices of the partition subsets. We will write \(q(v)=k\) to indicate that vertex v belongs to subset \(V_k\) of partition p. The pseudocode of MSA is given in Algorithm 1. In addition to the aforementioned move type selection probability Q, the parameters of MSA are the cooling factor \(\alpha \), the final temperature \(T_{\mathrm {min}}\), and the number of moves, \(\beta \), to be attempted at each temperature level. The initial temperature \(T_{\mathrm {max}}\) is computed using the formula \(T_{\mathrm {max}}=\max (\max _{p'\in N'_1} |\delta (p,p')|, \max _{p'\in N'_2} |\varDelta (p,p')|)\), where p is a starting partition generated in Line 2 of Algorithm 1 and \(N'_1\) (\(N'_2\)) is a sequence of partitions obtained by applying randomly chosen relocation (respectively, swap) moves to the partition p (thus \(N'_1\) and \(N'_2\) are extracted from the neighborhoods \(N_1(p)\) and \(N_2(p)\), respectively). We fixed the length of both sequences at 5000. Having the initial and final temperatures of the annealing process, we can calculate the number of temperature levels \({\bar{\gamma }}\) (Line 5).
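
A short sketch of these two computations follows. The closed-form expression for the number of levels of a geometric schedule is our reading of Line 5, not a formula stated in the paper.

```python
import math

def initial_temperature(sample_gains):
    """T_max: the largest absolute move gain over sampled random
    relocation and swap moves (5000 of each kind in our setting)."""
    return max(abs(g) for g in sample_gains)

def num_temperature_levels(T_max, T_min, alpha):
    """Number of applications of T <- alpha * T needed to go from
    T_max down to T_min (our reading of what Line 5 computes)."""
    return math.ceil(math.log(T_min / T_max) / math.log(alpha))

print(num_temperature_levels(T_max=50.0, T_min=1e-4, alpha=0.95))  # 256
```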

[Algorithm 1: pseudocode of the MSA procedure]

As seen in the pseudocode, our SA algorithm is implemented as a multistart procedure. The reason for such a choice is to fairly compare the algorithm with the other presented algorithms using a time-based stopping criterion. There is no explicit way to control the number of SA restarts: this number decreases as the value of the parameter \(\beta \) increases. Usually, \(\beta \) is assumed to be dependent on the size of the problem instance. Our multistart mechanism is straightforward. At the start of each iteration, a random partition of the graph is generated. This is done by first randomly generating a permutation of the vertices of G and then splitting its entries (vertices) into m subsets of approximately equal sizes. The latter operation assigns the first \(\lceil n/m \rceil \) vertices in the permutation to the first subset, the next \(\lceil n/m \rceil \) (or \(\lfloor n/m \rfloor \)) vertices to the second subset and so on. The obtained partition of the graph is passed to the SA method. Additional operations must be performed before entering SA for the first time. They include initialization of the best solution \(p^{*}\) and calculation of the initial temperature \(T_{\mathrm {max}}\). This temperature is reused in subsequent SA restarts. It can be noted that SA with multiple starts requires no additional parameters beyond those of classical SA and the probability parameter Q.
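
The random starting partition can be generated as follows; this sketch mirrors the permutation-splitting procedure just described.

```python
import random

def random_partition(n, m):
    """Randomly permute the vertices 0..n-1 and split the permutation
    into m subsets of nearly equal size (sizes differ by at most one,
    with the larger subsets coming first)."""
    perm = list(range(n))
    random.shuffle(perm)
    base, extra = divmod(n, m)   # the first `extra` subsets get one more vertex
    partition, start = [], 0
    for i in range(m):
        size = base + (1 if i < extra else 0)
        partition.append(perm[start:start + size])
        start += size
    return partition

print([len(s) for s in random_partition(11, 3)])  # [4, 4, 3]
```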

From the pseudocode, we see that SA is wrapped in the “while” statement, which loops until a stop condition is met. We use a stopping criterion under which the search terminates when a maximum time limit is reached. At the beginning of the iteration, the values of the sums \(c_w(V_i)\), \(w\in V\), \(i\in M\), and the cut weights \(C(V_i)\), \(i\in M\), for the generated solution p are initialized. The best solution found by the algorithm is denoted as \(p^{*}\), and its value is \(f^{*}\). This solution is memorized in Line 28. At each iteration, either a relocation move (Lines 11 and 12) or a swap move (Lines 14 and 15) is selected and evaluated. This choice is controlled by the probability parameter Q. If the move is accepted, then the current solution p and related information are updated in Lines 18–27. The temperature is changed after performing \(\beta \) iterations. As is common practice in SA implementations, the temperature is decreased according to a geometric schedule (Line 31). We note that in the normalized cut case, Equations (4) and (10) should be replaced with Equations (9) and (16), respectively (Lines 12 and 15). Considering only the innermost loop in Algorithm 1, we can obtain the following statement.

Proposition 3

The computational complexity of the body of loop 9–30 of MSA is O(n).

Proof

The major bottleneck of computations in Lines 10–29 of MSA is updating \(c_w(V_k),c_w(V_l)\), \(w\in V\) (either in Line 20 or in Line 24). This clearly takes O(n) time. The same upper bound holds on the number of operations required to save the improved solution when such a solution is found (Line 28). Other steps inside the loop are less time-consuming. In particular, only O(1) time is needed to compute the gain of a move (Lines 12 and 15). \(\square \)

We remark that the parameter \(\beta \) in many SA implementations depends linearly on the size of the problem. This is also the case in our experiments with MSA (see Sect. 5.2). Therefore, all operations inside the loop 8–32 of MSA can be performed in \(O(n^2)\) time. However, evaluating the time complexity of all executions of this loop, which implements SA, is difficult because the number of its repetitions depends on the initial temperature \(T_{\mathrm {max}}\). In our algorithm, this temperature is a quantity dependent on the character of the problem instance being solved.

3 Iterated tabu search

A possible alternative to the SA technique is the widely used TS method [24]. We apply it to minimize the ratio cut and normalized cut objective functions iteratively.

The main idea of our TS implementation is to repeatedly execute tabu search and solution perturbation procedures. Under this strategy, only in the first iteration does TS start from a fully random solution. The solution perturbation procedure starts with a relatively good solution and proceeds by performing a number of random moves. The resulting solution is passed to the TS component of the approach. The key idea of tabu search is to allow accepting worsening solutions to prevent getting stuck in local minima. To avoid cycling in the search process, for each move performed, the reverse move is forbidden for a specified number of iterations. For this purpose, we use two tabu tables, one for relocation moves and another for swap moves. The TS procedure stops when a predefined number of iterations is reached. This number is an important parameter of the algorithm. A noteworthy feature of our approach is that the TS procedure is enhanced with an integrated LS technique. Both procedures explore the relocation and swap neighborhoods at each iteration.

As in the case of SA, we present the ITS algorithm for the ratio cut partitioning problem. The pseudocode of the top level of our TS implementation is shown in Algorithm 2. The search starts from a solution generated using the same procedure as for the MSA method. The best solution is denoted as \(p^{*}\) and its value is \(f^{*}\). The TS algorithm and solution perturbation procedure, named get_restart_partition, are executed repeatedly within the loop 3–6. To stop the loop, a CPU time-based termination condition is used.

[Algorithm 2: top-level pseudocode of the ITS procedure]
[Algorithm 3: pseudocode of the TS component]

Algorithm 3 gives the pseudocode of the TS component of the approach. This component accepts and returns the current partition p, the best-found partition \(p^{*}\), and their objective function values f and \(f^{*}\). It starts with the initialization of the tabu tables \(t=(t_{vu})\) and \({\tilde{t}}=({\tilde{t}}_{vl})\) by setting their entries to \(-\tau \), where \(\tau \) is the tabu tenure parameter. As the TS algorithm proceeds, the entry \(t_{vu}\) of the tabu table t is used to store the last iteration number in which the vertices v and u have been swapped. Similarly, \({\tilde{t}}_{vl}\) contains the last iteration number in which vertex v has been moved from partition subset \(V_l\) to a different subset. Another parameter of the algorithm is the number of iterations I. In each iteration, both neighborhood \(N_1\) and neighborhood \(N_2\) are explored. The selected move is specified by the vertex \(v^{*}\) and partition subset \(V_{l^{*}}\) in the case of relocation and by the vertices \(v^{*}\) and \(u^{*}\) in the case of the swap operation. To distinguish which move type is selected, we use the flag variable \(\xi \), which equals 0 for a relocation move and 1 for a swap move. The gain incurred by the selected move is denoted by \(\delta _{\mathrm {min}}\). In the gain calculation, p is the current solution, and \(p'\) is either the solution obtained by relocating vertex v from subset \(V_k\) to subset \(V_l\) (Line 7) or the solution obtained by swapping vertices v and u (Line 16). The variable \(\rho \) is used to indicate whether an improving solution has been found. If this is the case, \(\rho \) is positive. If two or more improving solutions are discovered, then one of them (or, more precisely, the corresponding move) is selected probabilistically (Lines 10 and 19). The role of the set U in the algorithm (Lines 3, 5, and 15) is to guarantee that each pair of vertices (vu) is considered only once. The algorithm also uses the set \(V'\) and mapping q, which we defined in the previous section.

Once neighborhoods \(N_1\) and \(N_2\) of the current solution p have been explored, the algorithm proceeds by updating the tabu tables t and \({\tilde{t}}\) (Line 25) and performing either the relocation or swap operation, depending on the value of \(\xi \) (Line 26). These operations are implemented in the same way as those in MSA (Lines 19–27 of Algorithm 1). Specifically, Equations (5), (6), and (8) are used for \(v=v^{*}\), \(k=q(v^{*})\), and \(l=l^{*}\) in the case of the relocation move, and Equations (11)–(15) are used for \(v=v^{*}\), \(u=u^{*}\), \(k=q(v^{*})\), and \(l=q(u^{*})\) if the swap move has been selected. In the same step, the current partition p is updated. If an improved solution has been found (\(\rho >0\)), then the TS algorithm attempts to further improve this solution by applying a local search procedure (Line 29). A description of this procedure is given later in this section.
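
A minimal sketch of how such tabu tables can be maintained and queried is given below. The exact tabu test and the aspiration rule (allowing a tabu move that improves the best solution) are common conventions; we do not claim they match Algorithm 3 verbatim.

```python
class TabuTables:
    """Tabu bookkeeping sketch: entries store the iteration at which a move
    was last performed, initialized to -tau so that nothing is tabu at the
    start. A reverse move stays forbidden while fewer than tau iterations
    have elapsed since it was recorded."""

    def __init__(self, n, m, tau):
        self.tau = tau
        self.t = [[-tau] * n for _ in range(n)]     # t[v][u]: last swap of v and u
        self.tt = [[-tau] * m for _ in range(n)]    # tt[v][l]: v last left subset l

    def swap_allowed(self, v, u, it, improves_best=False):
        return improves_best or it - self.t[v][u] >= self.tau

    def move_back_allowed(self, v, l, it, improves_best=False):
        # Relocating v into subset l is tabu while v has left l recently.
        return improves_best or it - self.tt[v][l] >= self.tau

    def record_swap(self, v, u, it):
        self.t[v][u] = self.t[u][v] = it

    def record_relocation(self, v, k, it):
        self.tt[v][k] = it    # v has just been moved out of subset k
```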

Looking at the TS pseudocode, we can see that the statements inside the outermost loop can be split into two sequences: those in Lines 3–27 and those in Lines 28–31. The first of these sequences has a worst-case runtime \(O(n^2)\). The second of these sequences includes a call to the LS procedure. However, its worst-case time complexity is unknown. A remark on the computational complexity of the basic parts of local_search is given at the end of this section.

[Algorithm 4: pseudocode of the get_restart_partition procedure]

To periodically direct the search toward unexplored regions of the solution space, the ITS algorithm applies a solution perturbation procedure get_restart_partition. Its pseudocode is presented in Algorithm 4. The input to the procedure includes a partition p, which is the last solution recorded by the TS component of the approach. At the initialization step, get_restart_partition first draws uniformly at random an integer \(K_{\mathrm {max}}\) from the interval \([n\kappa _1,n\kappa _2]\) and then integers K from the interval \([K_{\mathrm {min}},K_{\mathrm {max}}]\) and L from the interval \([L_{\mathrm {min}},L_{\mathrm {max}}]\). In these calculations, \(\kappa _1,\kappa _2,K_{\mathrm {min}},L_{\mathrm {min}}\), and \(L_{\mathrm {max}}\) are ITS parameters. The first two parameters, \(\kappa _1\) and \(\kappa _2\), define the range for the maximum number of relocation moves to be performed. To bound this number from below, the parameter \(K_{\mathrm {min}}\) is used. At each iteration of the solution perturbation procedure, a list of the best relocation moves is built. The parameters \(L_{\mathrm {min}}\) and \(L_{\mathrm {max}}\) restrict the length of this list to be bounded from below and above, respectively. The role of all these ITS parameters is to guide the diversification of the search. Their appropriate values are selected experimentally. The parameters K and L of the get_restart_partition procedure, when used together, control the level of degradation of the objective function value due to the perturbation of the current partition p. The quality of the generated partition deteriorates with the increase in K and L. In contrast, small values of K and L may lead to producing a partition that is too close to the input solution p. From the pseudocode, it should be clear that the use of the set U guarantees that each vertex is moved from one subset to another at most once. Provided L is a constant and K is proportional to n (such choices are made in Sect. 5.2), the computational complexity of the procedure get_restart_partition is \(O(n^2m)\).
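
The sketch below is our brute-force reading of this perturbation scheme, using the parameter values chosen in Sect. 5.2. For clarity, the move gains are obtained by recomputing \(F_r\) from scratch, whereas ITS uses the incremental formulas of Sect. 2; the `max` guard on \(K_{\mathrm {max}}\) is a sketch-level safety, not part of the paper's procedure.

```python
import random

def F_r(W, q, m):
    """Ratio cut of the partition encoded by q (q[v] = subset of v);
    all m subsets are assumed nonempty."""
    n = len(W)
    C, size = [0.0] * m, [0] * m
    for v in range(n):
        size[q[v]] += 1
        C[q[v]] += sum(W[v][u] for u in range(n) if q[u] != q[v])
    return sum(C[i] / size[i] for i in range(m))

def get_restart_partition(W, q, m, kappa1=0.7, kappa2=1.0,
                          K_min=15, L_min=10, L_max=500):
    """Perform K random relocations, each drawn uniformly from a list of
    the L best candidate moves; every vertex is relocated at most once,
    and moves that would empty a subset are skipped."""
    n = len(W)
    K_max = max(K_min, random.randint(int(n * kappa1), int(n * kappa2)))
    K = random.randint(K_min, K_max)
    L = random.randint(L_min, L_max)
    moved = set()
    for _ in range(K):
        f0, cand = F_r(W, q, m), []
        for v in range(n):
            if v in moved or sum(1 for u in range(n) if q[u] == q[v]) == 1:
                continue
            k = q[v]
            for l in range(m):
                if l != k:
                    q[v] = l
                    cand.append((F_r(W, q, m) - f0, v, l))
            q[v] = k
        if not cand:
            break
        _, v, l = random.choice(sorted(cand)[:L])
        q[v] = l
        moved.add(v)
    return q

random.seed(1)
W = [[0, 2, 1, 0], [2, 0, 1, 0], [1, 1, 0, 3], [0, 0, 3, 0]]
print(get_restart_partition(W, [0, 0, 1, 1], m=2))
```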

[Algorithm 5: pseudocode of the local_search procedure]

We end this section by describing our implementation of the LS algorithm for the considered graph partitioning problems. As seen before, this algorithm is employed within the TS framework. However, perhaps more importantly, the LS procedure is the key ingredient of an MA, which is presented in the next section. The pseudocode of the LS procedure is given in Algorithm 5. As is typical in most LS implementations, the procedure executes in a number of iterations. Each iteration performs all the operations contained in the outer “while” loop (Lines 3–36). For a partition p at the beginning of the loop, let us consider a partition subset \(V_k\), \(k\in M\). Assume that the content of \(V_k\) does not change during the execution of the iteration. More precisely, no vertex \(v\in V_k\) moves to another subset (Line 14), and no vertex \(u\in V\setminus V_k\) is added to \(V_k\) (Line 14 when \(l=k\)). Moreover, no vertex \(v\in V_k\) is interchanged with a vertex \(u\in V\setminus V_k\) (Line 30). In this case, the entry \(s_k\) of the vector \(S=(s_1,\ldots ,s_m)\) at the end of the iteration (Line 36) is equal to 0. If at least one of the above conditions is not satisfied, then \(s_k=1\). Within the loop, some entries of S may temporarily be set to 2. Assume now that \(s_k=0\) and \(s_l=0\) for subsets \(V_k\) and \(V_l\). Then, in the next iteration, all the gains \(\delta \) and \(\varDelta \) expressed by (4) and (10) for \(V_k\) and \(V_l\), respectively, have nonnegative values. It follows that, in the next iteration, there is no need to examine moves in which both subsets, \(V_k\) and \(V_l\), are involved. This strategy, implemented in the local_search procedure, reduces the computational time needed to reach a locally optimal solution. A similar technique for accelerating neighborhood examination was proposed by Lai et al. [37] for the MDGP. These authors, however, considered relocation moves and swap moves separately. They used two 0–1 matrices of size \(m\times m\). The (i, j)-entry of the first matrix takes value 0 if and only if no relocation of a vertex from the ith group to the jth group has resulted in an improving solution. Similarly, the second matrix is defined with respect to the swap operation. Lai et al. [37] provided computational results that clearly demonstrated the efficiency of the proposed neighborhood exploration strategy.

As can be seen in Algorithm 5, each iteration of our LS procedure is composed of two phases. In the first of them (Lines 4–19), the neighborhood \(N_1\) of the current partition p is explored. In Line 4, \(M'(p)\) refers to the set of subsets in partition p whose size is greater than 1. Formally, \(M'(p)=\{k \mid k\in M, V_k\in p, |V_k|>1\}\), where \(M=\{1,\ldots ,m\}\) as before. If an improving solution is found, then flag \(\mu \) is set to true (Line 10). As a result, the current solution p and its value f are updated (Lines 14 and 15). In the second phase (Lines 20–35), a better-quality solution is searched for in neighborhood \(N_2(p)\). If such a solution is found among the neighbors of p, then it is used to replace p (Lines 30 and 31).

It can be observed from the pseudocode that the time complexity of the body (Lines 7–17) of the first inner “while” loop is O(n) and that of the body (Lines 23–33) of the second inner “while” loop is \(O(n^2)\). In obtaining these estimates, it is assumed that in the worst case, the size of partition subsets can be proportional to the graph order n. The number of times the “while” loops are executed depends on the graph and on the starting solution p. This number, however, is difficult to estimate.

4 A memetic algorithm

Evolutionary algorithms are an important class of approaches for solving various optimization problems. Among them, the memetic algorithm is one of the most successful techniques. The fundamental concept of the MA is to apply the genetic operators in combination with an LS procedure. In this section, we present an MA for ratio cut and normalized cut graph partitioning. As before, a description is given in terms of the first of these problems.

Like a GA, the MA manipulates a population of individuals where each individual represents a solution to the optimization problem being solved. To generate new members of the population, a crossover operator comes into play. Usually, it is a binary operator that combines the genes of two parents in some manner to produce an offspring. In the context of ratio cut and normalized cut, individuals in the population correspond to partitions of vertex set V. A convenient method for coding an individual is to use the mapping q introduced in Sect. 2. Recall that for a partition \(p=\{V_1,\ldots ,V_m\}\) and a vertex \(v\in V_i\), the value \(q(v)=i\) points to the partition subset to which vertex v belongs.

One of the key ideas of our proposed algorithm is to employ the crossover operator that is used in grouping genetic algorithms (GGAs) (see [31, 54]). In a GGA, the chromosomes represent the allocation of certain objects (e.g., vertices of the graph) to groups. When generating offspring from two parents, a GGA manipulates groups instead of group members. Our crossover operator randomly selects two individuals as parents and repeatedly transfers a subset of vertices from one of the selected parents to the offspring. If the resulting collection of subsets does not cover all the vertices of the graph, then a repair mechanism is triggered to obtain a feasible partition. The offspring partition replaces the worst individual in the current population if it is not worse than the worst individual and differs from all individuals in the population. Another component of our memetic algorithm is the LS procedure, which is precisely the same as that used in the ITS algorithm. Certainly, this procedure is applied to each generated offspring. Moreover, LS is used to improve randomly generated graph partitions when constructing an initial population of solutions.

[Algorithm 6: pseudocode of the get_offspring crossover procedure]

Let \(p'\) and \(p^{\prime \prime }\) be two partitions in \(\varPi \) that are submitted as an input to the crossover operator. The pseudocode of our GGA crossover implementation is given in Algorithm 6, where r, \(r'\), and \(r^{\prime \prime }\) are the current number of subsets in the offspring partition p and parent partitions \(p'\) and \(p^{\prime \prime }\), respectively, and where the resulting offspring is represented by the mapping q. The most important part of the procedure (the “while” loop spanning Lines 3–13) serves to transfer genetic information from the parents to the offspring. First, it equiprobably selects one of the parents and then finds a partition subset of this parent with the smallest value of the ratio R (given by (3)). Ties among subsets are broken by random selection. The chosen subset, \(V_k\), is moved from the parent partition \(p'\) (or \(p^{\prime \prime }\)) to the offspring partition (Lines 7 and 8 in the case of \(p'\)). Assume that the parent \(p'\) is involved in this operation. Then, each vertex of \(V_k\) is removed from the corresponding subset of the partition \(p^{\prime \prime }\) (Line 9). The iteration terminates by deleting empty subsets of \(p^{\prime \prime }\), if any, and updating their number \(r^{\prime \prime }\) accordingly. If the parent \(p^{\prime \prime }\) is selected, then similar operations with respect to \(p^{\prime \prime }\) are performed (Line 11). Upon emptying both \(p'\) and \(p^{\prime \prime }\) (then \(r'=r^{\prime \prime }=0\)), there may still be some vertices uncovered by offspring subsets. The set of such vertices in the pseudocode is denoted by U. If \(U\ne \emptyset \) and \(r<m\), then the priority is to create single-vertex subsets (Lines 17, 18). When r reaches m, every remaining vertex of U is assigned to a randomly chosen partition subset (Lines 20, 21). The resulting partition p completely covers the vertices of the graph. However, it may occur that p consists of an insufficient number of subsets. In this case, to repair the partition p, an additional “while” loop is used (Lines 27–36). In this part of the pseudocode of the algorithm, \({\tilde{M}}\) is the set of indices of the partition subsets that are used as potential candidates for splitting, and \(Z({\tilde{M}})\) denotes the average size of these subsets. Before entering the loop, \(Z({\tilde{M}})=n/r\) because initially \(\cup _{i\in {\tilde{M}}}V_i=V\) (Line 26). Assume that \(r<m\) at this point. Then, the partition p is refined by applying the subset splitting operation \(m-r\) times. Only subsets of size at least \(Z({\tilde{M}})\) (and trivially at least 2) are candidates for splitting. Each iteration starts by randomly selecting one of the subsets (Line 28). The chosen subset is randomly split into two subsets of nearly equal size (Line 32). Its index is removed from \({\tilde{M}}\), and the new value of \(Z({\tilde{M}})\) is calculated (Line 31). If no suitable subset is found, then a new round of splitting begins with an enlarged set \({\tilde{M}}\) (Line 34). Concluding the description of the crossover operator, we remark that an offspring can be generated efficiently.
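
To convey the flavor of this group-based crossover, we give a deliberately simplified sketch. The loop bounds, tie-breaking, and the splitting rule here are our own assumptions (we assume \(n\geqslant m\)); Algorithm 6 differs in its detailed handling of the two parents.

```python
import random

def gga_crossover(W, p1, p2, m):
    """Simplified sketch of a grouping crossover: whole subsets (lowest
    ratio R(V_i) = C(V_i)/|V_i| first) are transferred from a randomly
    chosen parent to the offspring, and the transferred vertices are
    deleted from the other parent. Leftover vertices become singletons
    while fewer than m subsets exist, then go to random subsets; finally,
    the largest subsets are split until the offspring has m subsets."""
    n = len(W)
    def R(S):
        out = [v for v in range(n) if v not in S]
        return sum(W[u][v] for u in S for v in out) / len(S)
    parents = [[set(S) for S in p1], [set(S) for S in p2]]
    offspring = []
    while len(offspring) < m and (parents[0] or parents[1]):
        j = random.choice([i for i in (0, 1) if parents[i]])
        S = min(parents[j], key=R)                 # smallest-ratio subset
        parents[j].remove(S)
        offspring.append(S)
        parents[1 - j] = [T - S for T in parents[1 - j] if T - S]
    uncovered = set(range(n)) - set().union(*offspring)
    for v in uncovered:
        if len(offspring) < m:
            offspring.append({v})                  # priority: singleton subsets
        else:
            random.choice(offspring).add(v)        # then random assignment
    while len(offspring) < m:                      # repair: split largest subset
        S = max(offspring, key=len)
        half = set(random.sample(sorted(S), len(S) // 2))
        offspring.remove(S)
        offspring += [half, S - half]
    return offspring

random.seed(3)
W = [[0] * 6 for _ in range(6)]
for (u, v, w) in [(0, 1, 2), (1, 2, 1), (2, 3, 1), (3, 4, 3), (4, 5, 2), (0, 5, 1)]:
    W[u][v] = W[v][u] = w
print(gga_crossover(W, [{0, 1, 2}, {3, 4, 5}], [{0, 1}, {2, 3}, {4, 5}], m=3))
```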

Proposition 4

The computational complexity of the procedure get_offspring is \(O(n^2)\).

Proof

Observe that get_offspring performs \(O(n^2)\) operations to compute the ratios \(R(V_i)\), \(i\in M\). This can be performed at the initialization stage of the procedure. Other parts of the crossover have less complexity. In particular, all iterations of the “while” loops 3–13 and 27–36 take O(nm) time. \(\square \)

[Algorithm 7: top-level pseudocode of the MA procedure]

The pseudocode of the top level of the MA for ratio cut graph partitioning is presented in Algorithm 7. After constructing an initial population P (Line 2), the algorithm iterates over the following four steps until a termination condition is satisfied: selection of a pair of individuals in the current generation as parents (Line 4), crossover of the parents, generation of an offspring p (Line 5), execution of an LS algorithm on the offspring p (Line 7), and evaluation of p (Line 8). Similar to MSA and ITS, the iterations stop when a maximum time limit is reached. The best solution in MA is denoted as \(p^{*}\) and its value is \(f^{*}\).

[Algorithm 8: pseudocode of the init_population procedure]

MA is based on the application of four procedures. Two of them have already been described: get_offspring earlier in this section and local_search in Sect. 3. The pseudocode of init_population is given in Algorithm 8. This procedure creates an initial population P of size z, where z is a parameter of MA. In the pseudocode, \(p^{*}\) is the best solution in the population, and \(f^{*}\) is its objective function value. At each iteration, the candidates for inclusion in P are a partition \({\tilde{p}}\in \varPi \) generated at random using the same routine as for MSA and a locally optimal solution p produced by local_search applied to \({\tilde{p}}\). A priority is assigned to partition p. If p differs from all solutions currently in P, then it is appended to the population. Otherwise, an attempt is made to add the partition \({\tilde{p}}\) to P.

The condition in Line 6 of Algorithm 8 is checked using an \(m\times m\) matrix, \(H=(h_{ij})\), whose entry \(h_{ij}\) is the number of vertices \(v\in V\) such that \(q(v)=i\) and \(q'(v)=j\), where q and \(q'\) are the mappings corresponding to partitions p and \(p'\), respectively. The following statement demonstrates that this condition as well as that in Line 8 can be checked rather easily.

Proposition 5

The computational complexity of checking whether partitions \(p\in \varPi \) and \(p'\in \varPi \) are different is O(n).

Proof

Clearly, constructing the matrix H takes only O(n) operations. The partitions p and \(p'\) are different if and only if the matrix H constructed for them has more than m nonzero entries. Taken together, these two observations prove the claim. \(\square \)
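
The check is easy to implement; the sketch below uses a set of occupied (i, j) cells in place of the explicit matrix H and assumes, as in the paper's context, that both partitions consist of m nonempty subsets.

```python
def partitions_differ(q1, q2, m):
    """O(n) check of Proposition 5: the partitions encoded by q1 and q2
    (q[v] = subset index of v) are different iff the contingency matrix H
    has more than m nonzero entries."""
    nonzero = {(i, j) for i, j in zip(q1, q2)}
    return len(nonzero) > m    # equal partitions occupy exactly m cells

# The same partition under a relabeling of subsets is not "different":
print(partitions_differ([0, 0, 1, 1, 2], [2, 2, 0, 0, 1], m=3))  # False
print(partitions_differ([0, 0, 1, 1, 2], [0, 1, 1, 2, 2], m=3))  # True
```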

[Algorithm 9: pseudocode of the evaluate_offspring procedure]

Algorithm 9 gives the pseudocode of the procedure evaluate_offspring. It attempts to replace the worst individual in population P with generated offspring p. This procedure is also responsible for memorizing the best solution found.

5 Computational experiments

The purpose of this section is to examine the computational performance of the described algorithms for ratio cut and normalized cut graph partitioning. We present comparative experiments on both randomly generated graphs and well-known benchmark graphs taken from the literature.

5.1 Experimental setup

All the algorithms were coded in the C++ programming language, and the tests were carried out on a laptop with an Intel Core i5-6200U CPU running at 2.30 GHz. We remark that the code is designed to be applicable to graphs of arbitrary density. To meet this requirement, the graph is represented in memory by its edge-weight matrix. This puts a limit on the size of the graphs the code can deal with. Our computer can handle graphs with up to approximately \(10^4\) vertices.

We performed our experiments on the following two sets of graphs:

  (a)

    complete undirected graphs with edge weights generated in the following two steps. First, each vertex is assigned a point whose coordinates are sampled randomly and uniformly from a rectangle. Then, for each pair of vertices \(u,v\in V\), the weight of the edge \((u,v)\in E\) is computed as \(c_{uv}=\min (1/d_{uv},100)\), where \(d_{uv}\) is the Euclidean distance between the points corresponding to the vertices u and v (a minimal generator sketch is given after this list). The ensemble of graphs generated in this way is split into five subsets according to the number of vertices, which is 200, 500, 1000, 2000, and 3000, respectively. Each subset consists of five graphs.

  (b)

    A collection of 36 benchmark graphs. It consists of the following three sets: the 10th DIMACS Implementation Challenge Benchmark [2] (first 18 graphs in the tables to follow, that is, from karate to add32, inclusively), 12 graphs from the Network Data Repository [56] (from Trefethen-200 to bio-dmela in the tables), and 6 social network samples [9] (last 6 graphs in the tables, that is, from soc52 to pokec_2000). We remark that the graph named netscience has more than 100 isolated vertices. We deleted them, reducing the size of netscience from 1589 to 1461.
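
The following sketch generates one graph of set (a). The rectangle dimensions are our assumption; the paper only states that a rectangle is used.

```python
import math
import random

def random_geometric_weights(n, width=100.0, height=100.0, cap=100.0):
    """Set (a) generator sketch: vertices get uniform random points in a
    rectangle; edge weights are c_uv = min(1/d_uv, cap), with d_uv the
    Euclidean distance between the points of u and v."""
    pts = [(random.uniform(0, width), random.uniform(0, height)) for _ in range(n)]
    W = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            d = math.dist(pts[u], pts[v])
            W[u][v] = W[v][u] = min(1.0 / d, cap) if d > 0 else cap
    return W

W = random_geometric_weights(200)   # one graph of the smallest size class
```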

The dataset (a) and the source codes of the presented algorithms are publicly available at http://www.personalas.ktu.lt/~ginpalu/grpart.html.

In our computational experiments, we ran each algorithm 10 times on each graph in sets (a) and (b). Maximum CPU time limits for a run of an algorithm were as follows: 30 s for \(n\leqslant 200\), 200 s for \(200<n\leqslant 500\), 500 s for \(500<n\leqslant 1000\), 1000 s for \(1000<n\leqslant 2000\), 2000 s for \(2000<n\leqslant 4000\), 4000 s for \(4000<n\leqslant 6000\), and 8000 s for \(n>6000\). To assess the performance of the algorithms, we use the following measures: the objective function value of the best solution out of 10 runs, the average objective function value of 10 solutions, and the average time taken to find the best solution in a run.

5.2 Parameter settings

The main control parameters of the MSA algorithm are the cooling factor \(\alpha \), the final temperature \(T_{\mathrm {min}}\), and the number of iterations at each temperature level, \(\beta \). Following recommendations from the SA literature [57, 62], we set \(\alpha \) to 0.95, \(T_{\mathrm {min}}\) to 0.0001, and \(\beta \) to 100n. The specific parameter of our implementation of SA is the move type selection probability Q. We tested values of Q from 0 to 1 in increments of 0.1. To this end, we ran MSA on a training sample consisting of 10 complete graphs whose edge weights are generated using the same procedure as for set (a). Five of the graphs in this sample are of order \(n=200\), and the other five are of order \(n=500\). The number of partition subsets is 5 for \(n=200\) and 10 for \(n=500\). The sample is disjoint from set (a), which is reserved for the testing stage. The performance of the MSA configurations was measured in terms of the average objective function value over 10 runs. The experiment in the case of ratio cut has shown that our MSA algorithm is quite robust to variation in the parameter Q in the interval [0, 0.9]. We also found that its solutions are substantially worse for \(Q=1\). This implies that using only swap operations is not a good strategy. Within the interval [0, 0.9], marginally better performance was observed for \(Q=0.1\). Therefore, we set Q to 0.1. A similar experiment was conducted with MSA for the case of normalized cut graph partitioning. The conclusions were analogous to those of the previous experiment. Based on the results obtained, we decided, for the normalized cut case, to set Q to 0.2.

The parameters of our ITS algorithm are \(\tau \), \(\kappa _1\), \(\kappa _2\), \(K_{\mathrm {min}}\), \(L_{\mathrm {min}}\), \(L_{\mathrm {max}}\), and I (see Sect. 3). To determine good values for these parameters, we experimented with the same training set of graphs as used for the MSA algorithm. We relied on a simple parameter-setting procedure: one parameter at a time is allowed to take a number of predefined values while the other parameters are kept fixed at reasonable values chosen during preliminary tests. We ran ITS for both problems, ratio cut and normalized cut minimization. Because of their close similarity, it was possible to identify a common set of good parameter values for both problems. First, we varied \(\kappa _1\) from 0.1 to 0.9 in increments of 0.1. The related parameter \(\kappa _2\) (\(>\kappa _1\)) was fixed at 1, which is the maximum possible value. The range of acceptable values for \(\kappa _1\) was found to be 0.5 to 0.8. We fixed \(\kappa _1\) at 0.7. Then, we attempted to decrease the value of \(\kappa _2\). Respecting the condition \(\kappa _2>\kappa _1\), we tested two values of \(\kappa _2\): \(\kappa _2=0.8\) and \(\kappa _2=0.9\). However, none of them led to better results. Therefore, we fixed \(\kappa _2\) at 1 for all further experiments with ITS. We continued by examining the following 6 values of the parameter \(K_{\mathrm {min}}\): 1, 5, 10, 15, 20, and 50. The results showed little sensitivity of ITS to this parameter. We arbitrarily set \(K_{\mathrm {min}}\) to 15. The next step was to analyze the effect of the parameter \(L_{\mathrm {min}}\). We ran ITS with \(L_{\mathrm {min}}\in \{3,5,10,25,50,100,200\}\). We observed that the algorithm was fairly robust to the choice of \(L_{\mathrm {min}}\). Slightly better results were obtained with smaller values of \(L_{\mathrm {min}}\). Based on this finding, we set \(L_{\mathrm {min}}=10\). In the next experiment, we tried the following values of the parameter \(L_{\mathrm {max}}\): 10, 25, 50, 100, 200, 300, 500, 700, 1000, 1500, 2000, and 3000. The ITS algorithm showed consistently good performance when \(L_{\mathrm {max}}\) varied from 200 to 2000. Its performance deteriorated for \(L_{\mathrm {max}}\leqslant 100\) and \(L_{\mathrm {max}}=3000\). We decided to fix \(L_{\mathrm {max}}\) at 500. Further, we investigated how the performance of ITS depends on the tabu tenure parameter \(\tau \). We varied \(\tau \) from 5 to 30 with a step of 5. Additionally, we tested \(\tau =3\). The range of acceptable values for \(\tau \) was found to be quite wide. Results of very similar quality were obtained for \(\tau \in \{10,\ldots ,30\}\). We fixed \(\tau \) at 20, which was the middle value in this set. Perhaps a more important parameter of TS in the ITS framework is the number of iterations I. We ran ITS for the following values of I: 20, 50, 100, 200, 400, and 600. The best performance of ITS was observed when using \(I\in \{100,200\}\), with a slight edge to \(I=100\). The algorithm performed worse for \(I=600\) and, especially, for \(I=20\). Considering these findings, we elected to set I to 100.

Table 1 Number of partition subsets (m) and time limit (in seconds) for random graphs
Table 2 Best results for ratio cut when running MSA, ITS, and MA on random graphs
Table 3 Average results for ratio cut when running MSA, ITS, and MA on random graphs (the time is in seconds)
Fig. 3 Best solutions for g-200-5. a Ratio cut (\(F_r=27.843409, F_n=1.460579\)). b Normalized cut (\(F_r=48.911503, F_n=1.266439\))

The only parameter of our memetic algorithm is the population size z. We performed trial runs of MA with values of z up to 500 individuals. Based on the results obtained, we fixed z at 100. The performance of MA decreased when using smaller values of z. By increasing z above 100, the results on the training set of graphs appeared to be equally good or even marginally better than in the case of \(z=100\). However, MA with such z values becomes unduly time-consuming when used to partition large graphs. This comes from the fact that each individual of the initial population is submitted to an LS procedure that, for large graphs, takes a significant amount of time. Thus, the population size in our memetic algorithm should not be too large, and \(z=100\) is a good choice.

5.3 Numerical results for ratio cut

In this section, we present the computational results obtained by the proposed algorithms for ratio cut graph partitioning. To compare each pair of algorithms, we apply the Wilcoxon signed-rank test.
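For reproducibility, a minimal C++ implementation of the test, using the usual normal approximation, might look as follows. This is an illustrative sketch rather than the statistical routine we actually used, and the function names are ours.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Two-sided p-value of the Wilcoxon signed-rank test for paired samples
// x and y (e.g., per-graph objective values of two algorithms), using the
// normal approximation, which is adequate for our instance counts.
double wilcoxonSignedRank(const std::vector<double>& x,
                          const std::vector<double>& y) {
    std::vector<double> d;
    for (std::size_t i = 0; i < x.size(); ++i)
        if (x[i] != y[i]) d.push_back(x[i] - y[i]);  // drop zero differences
    const std::size_t n = d.size();
    if (n == 0) return 1.0;                          // identical results

    // Rank |d| in increasing order, averaging ranks within tie groups.
    std::vector<std::size_t> idx(n);
    for (std::size_t i = 0; i < n; ++i) idx[i] = i;
    std::sort(idx.begin(), idx.end(), [&](std::size_t a, std::size_t b) {
        return std::fabs(d[a]) < std::fabs(d[b]);
    });
    std::vector<double> rank(n);
    for (std::size_t i = 0; i < n;) {
        std::size_t j = i;
        while (j + 1 < n &&
               std::fabs(d[idx[j + 1]]) == std::fabs(d[idx[i]])) ++j;
        const double avg = 0.5 * (i + j) + 1.0;      // average rank, 1-based
        for (std::size_t k = i; k <= j; ++k) rank[idx[k]] = avg;
        i = j + 1;
    }

    // Sum of the ranks of the positive differences.
    double wPlus = 0.0;
    for (std::size_t k = 0; k < n; ++k)
        if (d[k] > 0.0) wPlus += rank[k];

    // Normal approximation to the null distribution of W+.
    const double mean = n * (n + 1) / 4.0;
    const double sd = std::sqrt(n * (n + 1) * (2.0 * n + 1.0) / 24.0);
    const double z = (wPlus - mean) / sd;
    return std::erfc(std::fabs(z) / std::sqrt(2.0)); // = 2 * (1 - Phi(|z|))
}
```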

Table 4 Summary of the results obtained by the tested algorithms for ratio cut on random graphs
Fig. 4 Time taken by MSA, ITS, and MA to find the best solution in a run (the case of ratio cut). a Random graphs. b Benchmark graphs

We began our comparison by testing the performance of each algorithm on random graphs (set (a) in Sect. 5.1). The size of the graphs, n, and the number of partition subsets, m, are shown in Table 1; the time limits (last column of the table) were specified in Sect. 5.1. Tables 2 and 3 summarize the results of the experiment, where the first column identifies the graph and the remaining columns report the performance of MSA, ITS, and MA. Columns 2–4 of Table 2 contain the objective function value of the best solution out of 10 runs, denoted as \(F_r^{*}\). The best value of \(F_r^{*}\) for each instance is highlighted in boldface. The two columns of Table 3 for each algorithm give the average objective function value of the 10 solutions, denoted as \({\bar{F}}_r\), and the average time (in seconds) taken to reach the last improvement in solution quality. The best value of \({\bar{F}}_r\) among all the algorithms is indicated in boldface.

Table 5 Best results for ratio cut when running MSA, ITS, and MA on benchmark graphs from the literature
Table 6 Average results for ratio cut when running MSA, ITS, and MA on benchmark graphs from the literature (the time is in seconds)

An example solution for a random graph is presented in Fig. 3a. It is the best solution for g-200-5 obtained by the tested algorithms. It was found in two runs (out of 10) of MSA, two runs of ITS, and all 10 runs of MA.

Table 2 shows that MA performs considerably better than the other two algorithms. MA obtained the best solution for more than \(90\%\) of the graphs (23 out of 25), whereas MSA and ITS produced the best result for 5 and 4 graphs, respectively. Comparing average values (Table 3), we observe an even more pronounced dominance of MA, which yielded the best results for 24 graphs. The average performance of the other two algorithms was much worse: ITS obtained the best average values for only two graphs, whereas MSA failed to do so for any graph in the set.

The results of the pairwise comparison of the algorithms are shown in Table 4, where the names of the algorithms are given in the first column. The second column identifies the values being compared: “Best” denotes the values displayed in the columns of Table 2 labeled \(F_r^{*}\), and “Average” denotes those in the columns of Table 3 labeled \({\bar{F}}_r\). Columns 3 to 5 of Table 4 show the comparison results: #wins, #ties, and #losses count the number of graphs on which the first algorithm in the pair finds a better, an equally good, or an inferior solution compared with the second algorithm. Additionally, we applied the Wilcoxon signed-rank test to each pair of algorithms; the resulting p-values are reported in the penultimate column. At the standard significance level of 0.05, a “Yes” in the last column indicates that the results of the first algorithm in the pair are significantly better than those of the second. Table 4 clearly shows that MA is superior to both MSA and ITS. Comparing the latter two approaches, a clear advantage of MSA over ITS is observed.

Figure 4a compares the tested algorithms in terms of computational time. Each point in the plot represents the average value taken over 5 graphs. The results indicate that the algorithms can be ranked, from fastest to slowest, in the following order: ITS, MSA, and MA.

In Tables 5 and 6, we compare the algorithms computationally on the graphs of set (b). The second and third columns give the size of the graph and the number of partition subsets, respectively. The labels of the other columns have the same meaning as in Tables 2 and 3. Analyzing the results in Tables 5 and 6, we find that MA performs much better than the other two algorithms: it produced the best solution for 35 graphs (out of 36), whereas ITS and MSA obtained the best solutions for 12 and 9 benchmark graphs, respectively. Moreover, MA achieved the best average result for 35 graphs, whereas each of MSA and ITS did so for only 7 graphs. The only graph on which MA was defeated (by MSA) was add20. We also notice that Table 5 includes 7 graphs (karate, dolphins, polbooks, football, can-292, ia-infect-dublin, and soc52) for which all three heuristics were capable of finding the best solution. A summary of the pairwise comparison results is given in Table 7, where the entry “No” in the last column means that there is no significant difference between the results of the two compared algorithms. From the table, it can be concluded that MA is by far the best performing algorithm for the (b) set of graphs: its number of losses is at most 1. We can also see that there is no marked difference between the results of MSA and ITS at the selected significance level of 0.05.

Table 7 Summary of the results obtained by the tested algorithms for ratio cut on benchmark graphs from the literature

The computational time taken by each of the tested algorithms is depicted in Fig. 4b. To construct the plots shown there, we split the interval [250, 7750] into subintervals of length 500 and then, for each subinterval, identified all graphs whose number of vertices falls within it. In this way, we obtained 10 nonempty subsets of graphs; for example, the subset corresponding to the subinterval [250, 750) consists of 10 graphs. Each point in the plot represents the average time taken over all graphs in the corresponding subset and is placed at the x-coordinate equal to the midpoint of the corresponding subinterval. We excluded from consideration the 8 smallest graphs (of order less than 250) because the algorithms, especially ITS and MA, require very little time to partition them. The rightmost points in the plots were obtained for a single graph, namely bio-dmela, and should perhaps be treated as outliers when comparing the algorithms. We can see in Fig. 4b that ITS tends to take less CPU time than MSA and MA. The latter two algorithms are comparable in terms of computation time: MA was slightly faster than MSA for graphs of order \(n<2700\), and MSA was faster than MA for graphs with \(n>2700\). The running times of the tested algorithms for individual graphs are listed in Table 6. A sketch of the binning procedure used to build these plots is given below.
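The following C++ fragment sketches this binning; the GraphResult record and the function names are illustrative stand-ins, not our plotting code.

```cpp
#include <map>
#include <utility>
#include <vector>

struct GraphResult { int n; double time; };  // graph order, time to best

// Group the graphs by order into subintervals of length 500 covering
// [250, 7750]; each nonempty group contributes one point at the subinterval
// midpoint whose y-value is the average time over the group.
std::map<double, double> binnedAverages(const std::vector<GraphResult>& rs) {
    std::map<double, std::pair<double, int>> acc;  // midpoint -> (sum, count)
    for (const auto& r : rs) {
        if (r.n < 250 || r.n >= 7750) continue;    // outside the plotted range
        const int bin = (r.n - 250) / 500;         // subinterval index
        const double mid = 500.0 + 500.0 * bin;    // midpoint of [250+500*bin, ...)
        acc[mid].first += r.time;
        acc[mid].second += 1;
    }
    std::map<double, double> points;
    for (const auto& [mid, sc] : acc) points[mid] = sc.first / sc.second;
    return points;
}
```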

5.4 Numerical results for normalized cut

In the second phase of experimentation, we tested the developed algorithms for the graph partitioning problem with the normalized cut objective function. Their performance was evaluated using the same scenario as in the previous section.

Table 8 Best results for normalized cut when running MSA, ITS, and MA on random graphs
Table 9 Average results for normalized cut when running MSA, ITS, and MA on random graphs (the time is in seconds)

First, we tested MSA, ITS, and MA on the set of random graphs. The results are reported in Tables 8 and 9. The objective function value of the best solution (out of 10 runs) is denoted by \(F_n^{*}\), and the average objective function value of the 10 solutions is denoted by \({\bar{F}}_n\). Contrasting the results in Tables 2 and 3 with those in Tables 8 and 9, we see that MA showed equally strong performance in solving both partitioning problems on random graphs. Table 8 shows that MA produced the best solution for 24 graphs in the set, whereas MSA and ITS obtained the best result for 3 and 6 graphs, respectively. From Table 9, we observe that MA provided the best average solutions for 23 random graphs, whereas each of the other two algorithms achieved the best average value for a single graph. The results of the pairwise comparison of the algorithms, shown in Table 10, statistically support the assertion that MA performs better than MSA and ITS. Looking at the statistics for MSA and ITS, we observe a significant difference in favor of MSA only in the case of average solutions.

Table 10 Summary of the results obtained by the tested algorithms for normalized cut on random graphs

Figure 3b shows the best solution found for graph g-200-5. This solution was achieved by all three heuristics; in particular, MA arrived at it in each of its 10 runs. The solutions for ratio cut (Fig. 3a) and normalized cut (Fig. 3b) are quite different. The ratio cut objective function value of the second solution is 48.911503, much worse than that of the first one (27.843409). In contrast, the first solution is inferior to the second in terms of the normalized cut objective function: its value is 1.460579, against 1.266439 for the second solution.

Fig. 5 Time taken by MSA, ITS, and MA to find the best solution in a run (the case of normalized cut). a Random graphs. b Benchmark graphs

The execution time of the tested algorithms is plotted in Fig. 5a. The ranking of the algorithms from fastest to slowest was ITS, MSA, and MA, exactly the same as in the case of ratio cut minimization. It can also be noted that MA was marginally faster than ITS and MSA for small graphs (with \(n\leqslant 500\)).

Table 11 Best results for normalized cut when running MSA, ITS, and MA on benchmark graphs from the literature
Table 12 Average results for normalized cut when running MSA, ITS, and MA on benchmark graphs from the literature (the time is in seconds)

Tables 11 and 12 summarize the results of MSA, ITS, and MA on the set of benchmark graphs from the literature. Inspecting these tables reveals the clear superiority of MA over the other approaches in our evaluation. This algorithm produced the best partition for each of the 36 graphs, whereas MSA and ITS yielded the best solution for 8 and 12 graphs in the set, respectively. Looking at the average values \({\bar{F}}_n\), we found that MA obtained the best result for all graphs, while MSA and ITS matched this performance for only 6 and 8 benchmark graphs, respectively. We also performed pairwise comparisons of the heuristics. The results are displayed in Table 13, from which it is apparent that MA dominates the other two algorithms; its superiority is even more pronounced than in the random graph case. From the last two rows in Table 13, we can conclude that the quality of the solutions delivered by MSA and ITS is comparable.

Table 13 Summary of the results obtained by the tested algorithms for normalized cut on benchmark graphs from the literature
Table 14 Comparison of the memetic algorithm with the state-of-the-art algorithm, VNS, of Hansen et al. [27] for normalized cut partitioning of random graphs (the time is in seconds)
Table 15 Comparison of MA versus VNS: a summary of the results

Figure 5b illustrates the computational time of the tested algorithms as a function of the graph size. As in the ratio cut case, ITS took less time to find the best solution in a run than MSA and MA; in particular, ITS dominated MSA over the whole range of graph sizes. We also observe that MA was the fastest approach for small graphs (with \(n\leqslant 1500\)); however, with increasing n, it became the slowest algorithm. The timing results for individual graphs are shown in Table 12.

5.5 Comparison with the state of the art

For comparison purposes, we implemented the VNS algorithm of Hansen et al. [27]. Like the algorithms in this paper, VNS was coded in the C++ programming language. The experiments were conducted with the parameter values used in [27]. We restricted ourselves to comparing the best of our algorithms, MA, against VNS. We recall that Hansen et al. [27] developed their algorithm for solving the normalized cut graph partitioning problem.

In Table 14, we report the results achieved by VNS and MA for random graphs. The three columns for each algorithm report \(F_n^{*}\), \({\bar{F}}_n\), and the average time for reaching the best result. The best value of \(F_n^{*}\) and \({\bar{F}}_n\) for each instance is shown in boldface and in italics, respectively. It can be seen that, in terms of \(F_n^{*}\), MA performs better than or equally well as VNS for all graphs of order up to 2000 except g-2000-4. A different conclusion is reached for graphs of order 3000, where VNS produced better partitions than MA. We also observe that, for \(n\geqslant 1000\), the average time taken by VNS to find the best solution in a run is significantly shorter than that of MA. This means that, in the remainder of a run, VNS spends more time than MA without improving the quality of its output solution.

We summarize the results of the experiment in the first two rows of Table 15. As before, the Wilcoxon signed-rank test at the significance level of 0.05 was used to compare the performance of the two algorithms. The test demonstrated a statistically significant difference in the average quality of the solutions obtained by MA and VNS; indeed, MA outperformed VNS on 21 of the 25 random graphs. However, there was no statistically significant difference between the results of the two algorithms in the case of the best solutions.

Table 16 Comparison of the memetic algorithm with the state-of-the-art algorithm, VNS, of Hansen et al. [27] for normalized cut partitioning of benchmark graphs (the time is in seconds)

Table 16 shows the results of the experiment on the set of benchmark graphs. The last six columns of this table have the same structure as in Table 14. We observe that MA surpassed or matched the VNS algorithm on all graphs except uk. We also see that VNS could find the best result for 12 of the 36 graphs; however, almost all of these graphs are small. To compare the quality of the solutions produced by the algorithms more accurately, we calculate the relative difference between their objective function values using the following formula

$$\begin{aligned} e=\frac{F_{\mathrm {VNS}}-F_{\mathrm {MA}}}{\min (F_{\mathrm {VNS}},F_{\mathrm {MA}})}\cdot 100\%, \end{aligned}$$
(17)

where e is the relative difference, and \(F_{\mathrm {VNS}}\) and \(F_{\mathrm {MA}}\) denote the objective function values achieved by VNS and MA, respectively. Figure 6 shows the relative differences calculated using (17) for the \(F_{\mathrm {VNS}}\) and \(F_{\mathrm {MA}}\) values contained in the columns of Table 16 labelled \(F_n^{*}\). In this figure, we omit netscience, for which e is not defined, as well as the graphs for which \(e<2\%\). The results demonstrate that MA significantly outperforms the VNS algorithm: the average relative difference (calculated over 35 graphs) between the \(F_n^{*}\) values of VNS and MA is \(35\%\).
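For illustration, with hypothetical values \(F_{\mathrm {VNS}}=1.50\) and \(F_{\mathrm {MA}}=1.20\), formula (17) gives \(e=((1.50-1.20)/1.20)\cdot 100\%=25\%\). Positive values of e thus indicate instances on which MA produced the better (smaller) objective function value, and negative values indicate the opposite.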

Fig. 6 Relative difference between the best objective function values obtained by the MA and VNS methods

From Table 16, we find that the advantage of MA over VNS becomes even more prominent when considering the average performance of the algorithms. The VNS heuristic achieved the best average result only for uk and the 5 smallest graphs. There are a number of graphs for which VNS, unlike MA, failed to find the best solution in all 10 runs; three of them (jazz, Trefethen-200, gplus_200) have 200 or fewer vertices. Figure 7 depicts the percentage differences calculated by substituting the \({\bar{F}}_n\) values of VNS and MA into (17). The figure lists only those graphs for which \(e>4\%\). The results again indicate a very clear superiority of MA over the VNS algorithm: the average relative difference between VNS and MA in terms of the \({\bar{F}}_n\) value is \(48\%\). Statistically significant differences between the algorithms are confirmed by the Wilcoxon signed-rank test in the last two rows of Table 15.

Fig. 7 Relative difference between the average objective function values of solutions found by the MA and VNS methods

Evaluating the computational speed of the MA and VNS algorithms reveals that they take a similar amount of time to find the best solution in a run (compare the time columns in Table 16); this is especially evident for graphs of large order. Overall, the average time taken by VNS was 1064 s, and that taken by MA was 1004 s.

5.6 Comparison with a different variant of the crossover operation

The numerical results reported in Sects. 5.4 and 5.5 demonstrate the effectiveness and excellent performance of the MA in solving the normalized cut partitioning problem for benchmark graphs as well as for random graphs of order up to 2000. However, MA was less successful for random graphs with 3000 vertices. It is difficult to pinpoint the reason for this shortcoming. We investigated the behavior of the offspring generation procedure get_offspring. In the first stage of this procedure, all vertices of the graph are assigned to offspring subsets. If the required number of subsets is reached, the procedure stops (Line 25 of Algorithm 6); otherwise, a subset splitting operation is performed until m partition subsets are created (Lines 26–36 of Algorithm 6). We observed a much more frequent use of this operation for random graphs with \(n=3000\) than for benchmark graphs and small random graphs. For random graphs of order 3000, the splitting operation was applied on average to \(79.9\%\) of all generated offspring (the average is taken over 5 graphs and 10 runs for each of them). This percentage falls to \(49.1\%\) for random graphs with \(n=2000\) and is less than \(1\%\) for all benchmark graphs except football and soc52, which are very easy to solve for all tested algorithms. One might conjecture that frequent use of the subset splitting operation has a negative impact on the performance of the proposed memetic algorithm.

In the process of MA design, we developed a version of the algorithm with a different variant of the crossover operation, one that differs from those typically used in GGAs. We restrict ourselves to a brief outline of this alternative version of the MA because it generally showed inferior performance compared to the MA presented in Sect. 4. The alternative crossover procedure uses the matrix \(H=(h_{ij})\) introduced in Sect. 4. Suppose that H is constructed for parent partitions \(p'\) and \(p^{\prime \prime }\), and let \(q'\) and \(q^{\prime \prime }\) be the mappings corresponding to \(p'\) and \(p^{\prime \prime }\), respectively. The procedure first identifies the m largest entries of the matrix H. Their set (denoted by \(\varPsi \)) is used to construct a one-to-one mapping g from \(\{(i,j)\mid h_{ij}\in \varPsi \}\) to M; basically, g can simply be viewed as an arbitrary labeling of the elements of \(\varPsi \) by the integers in the set \(\{1,\ldots ,m\}\). Having the mapping g, the procedure defines the sets \(A_i=\{g(i,k)\mid k\in \{1,\ldots ,m\}, h_{ik}\in \varPsi \}\), \(i\in M\), and \(B_j=\{g(k,j)\mid k\in \{1,\ldots ,m\}, h_{kj}\in \varPsi \}\), \(j\in M\). Then, the vertices of the graph are processed one by one. For \(v\in V\), let \(i=q'(v)\) and \(j=q^{\prime \prime }(v)\). If i and j are such that \(h_{ij}\in \varPsi \), then q(v) is set to g(i, j). Otherwise, q(v) is assigned a partition subset index chosen randomly from the set \(A_i\cup B_j\) if this set is nonempty, and from the set M otherwise. The resulting offspring partition is represented by the mapping q. We use MA\('\) to refer to the version of the MA with the just outlined crossover operator; a sketch of the operator is given below.
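For concreteness, the following C++ sketch reconstructs this crossover from the description above. It assumes, as in Sect. 4, that \(h_{ij}\) counts the vertices assigned to subset i by \(p'\) and to subset j by \(p^{\prime \prime }\); subset indices are 0-based here, ties among the m largest entries are broken arbitrarily, and all names are ours rather than those of the actual implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Alternative crossover of MA': builds the offspring mapping q from parent
// mappings q1 (= q') and q2 (= q''), both of size n with values in {0..m-1}.
std::vector<int> alternativeCrossover(const std::vector<int>& q1,
                                      const std::vector<int>& q2,
                                      int m, std::mt19937& rng) {
    const int n = static_cast<int>(q1.size());

    // Matrix H: h[i][j] = number of vertices in subset i of p' and subset j of p''.
    std::vector<std::vector<int>> h(m, std::vector<int>(m, 0));
    for (int v = 0; v < n; ++v) ++h[q1[v]][q2[v]];

    // Psi: positions of the m largest entries of H (ties broken arbitrarily).
    std::vector<std::pair<int, std::pair<int, int>>> entries;
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < m; ++j)
            entries.push_back({h[i][j], {i, j}});
    std::partial_sort(entries.begin(), entries.begin() + m, entries.end(),
                      [](const auto& a, const auto& b) { return a.first > b.first; });

    // g: arbitrary one-to-one labeling of the elements of Psi by {0..m-1}.
    std::map<std::pair<int, int>, int> g;
    for (int k = 0; k < m; ++k) g[entries[k].second] = k;

    // A[i] collects labels of Psi entries in row i, B[j] those in column j.
    std::vector<std::vector<int>> A(m), B(m);
    for (const auto& [cell, label] : g) {
        A[cell.first].push_back(label);
        B[cell.second].push_back(label);
    }

    // Process the vertices one by one.
    std::vector<int> q(n);
    for (int v = 0; v < n; ++v) {
        const int i = q1[v], j = q2[v];
        const auto it = g.find({i, j});
        if (it != g.end()) { q[v] = it->second; continue; }  // h_ij in Psi
        std::vector<int> cand = A[i];                        // A_i U B_j
        cand.insert(cand.end(), B[j].begin(), B[j].end());
        if (cand.empty()) {                                  // fall back to M
            q[v] = std::uniform_int_distribution<int>(0, m - 1)(rng);
        } else {
            q[v] = cand[std::uniform_int_distribution<std::size_t>(
                0, cand.size() - 1)(rng)];
        }
    }
    return q;
}
```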

Table 17 Comparison of MA and VNS with MA\('\) for normalized cut partitioning of large random graphs

We experimentally compared MA\('\) with MA and concluded that the performance of MA\('\) was inferior to that of MA for benchmark graphs and for smaller random graphs. However, MA\('\) is more effective than MA when solving the normalized cut graph partitioning problem for random graphs of order 2000 and 3000. This can clearly be seen in Table 17, where the results for VNS from Table 14 are repeated for ease of comparison. MA\('\) produced the best solution for 8 graphs, whereas MA and VNS yielded the best partition for only one graph. We also observe that MA\('\) obtained the best average results for 9 of the 10 graphs. By comparing the time columns for MA in Table 14 and MA\('\) in Table 17, we found that MA\('\) was more than twice as fast as MA. However, as mentioned before, MA\('\) could not successfully compete with MA when the comparison was performed over all graphs in our test suite; in particular, MA yielded solutions of better or equal quality compared to MA\('\) for almost all (34 out of 36) benchmark graphs. We therefore do not provide detailed experimental results for MA\('\) in order to save space.

5.7 Comparisons with the genetic algorithm and multistart local search

To further evaluate the MA performance, we compared it with a genetic algorithm for normalized cut graph partitioning, constructed simply by removing all calls to the LS procedure from MA. We refer to this algorithm as GA1. We also tested another configuration of the genetic algorithm, obtained from MA by removing the calls to local_search from the offspring generation steps only; in other words, we modified MA by deleting Step 7 of Algorithm 7 while still using local_search to generate the initial population. We call this configuration GA2. Additionally, we evaluated the performance of the multistart local search method, which we refer to as MLS. This simple and straightforward method consists of repeatedly applying local_search to randomly generated starting partitions of the graph (a sketch is given below).
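A minimal C++ skeleton of MLS under these conventions might look as follows; the localSearch and objective callables stand for the paper's procedures, and both they and the crude repair step in randomPartition are assumptions of this illustration.

```cpp
#include <chrono>
#include <functional>
#include <limits>
#include <random>
#include <vector>

using Partition = std::vector<int>;  // vertex -> subset index (0-based)

// Random starting partition; a crude repair pass guarantees that none of
// the m subsets is empty (n >= m is assumed).
Partition randomPartition(int n, int m, std::mt19937& rng) {
    std::uniform_int_distribution<int> pick(0, m - 1);
    Partition p(n);
    for (int v = 0; v < n; ++v) p[v] = pick(rng);
    for (int k = 0; k < m; ++k) p[k] = k;  // vertex k seeds subset k
    return p;
}

// MLS: restart local search from random partitions until the time limit
// expires and return the best local optimum encountered.
Partition multistartLS(int n, int m, double timeLimitSec,
                       const std::function<void(Partition&)>& localSearch,
                       const std::function<double(const Partition&)>& objective,
                       std::mt19937& rng) {
    const auto start = std::chrono::steady_clock::now();
    Partition best;
    double bestVal = std::numeric_limits<double>::infinity();
    while (std::chrono::duration<double>(std::chrono::steady_clock::now() -
                                         start).count() < timeLimitSec) {
        Partition p = randomPartition(n, m, rng);
        localSearch(p);                        // descend to a local optimum
        const double val = objective(p);       // F_n of the improved partition
        if (val < bestVal) { bestVal = val; best = p; }
    }
    return best;
}
```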

To avoid unnecessarily long computations, we performed the comparison on a set of smaller graphs from our testbed, consisting of 10 random graphs with 200 and 500 vertices and all benchmark graphs of order less than 1000. We ran each algorithm 10 times on each graph with the same cutoff time as before. To assess the performance of GA1, GA2, and MLS, we used MA as the reference method.

Table 18 Difference in solution values between each tested algorithm (GA1, GA2, and MLS) and the reference method MA for normalized cut graph partitioning

Table 18 summarizes the results of the computational experiments carried out for GA1, GA2, and MLS. We write \(F_{\mathrm {GA1}}-F_{\mathrm {MA}}\) to denote the difference in the \(F_n^{*}\) values between GA1 and MA in the column under the heading “Best”, and the difference in the \({\bar{F}}_n\) values between GA1 and MA in the column under the heading “Average”. The differences \(F_{\mathrm {GA2}}-F_{\mathrm {MA}}\) and \(F_{\mathrm {MLS}}-F_{\mathrm {MA}}\) are defined similarly. The three bottom rows of the table serve as a summary that includes the differences between objective function values averaged over all 29 graphs, the number of graphs for which the two algorithms achieve the same accuracy, and the number of graphs for which the tested algorithm (GA1, GA2, or MLS) produced worse solutions than MA.

The main observation from the table is that MA shows great superiority over both genetic algorithm configurations and the multistart local search approach. The pure genetic algorithm, GA1, was by far the worst algorithm in the comparison; it was the only one that failed to match the performance of MA on every graph in the test suite. Another conclusion is that the other two algorithms, GA2 and MLS, demonstrate comparable performance in terms of solution quality; however, each of them was significantly surpassed by the proposed memetic algorithm. Therefore, we conclude that the main reason for the success of MA lies in the integration of the grouping genetic algorithm with the fast LS technique. As Table 18 shows, these components, taken individually, produce inferior results.

Table 19 Difference in solution values between the MSA algorithm with \(\beta '=100\) and MSA configurations with \(\beta '=500\) and 1000 for normalized cut graph partitioning
Table 20 Difference in solution values between the ITS algorithm with \(I=100\) and ITS configurations with \(I=20,50\), and 500 for normalized cut graph partitioning

5.8 Analysis of the main parameters

In this section, we study the effect of the main parameters on the performance of the developed algorithms. Again, we focus on the normalized cut graph partitioning problem. We conducted computational experiments on the same set of graphs as used in the previous section. However, we do not report results for graphs for which all variations of the tested algorithm gave the best result. Additionally, we ran our memetic algorithm on several large graphs. The main configurations of the algorithms were used as a basis for comparison.

5.8.1 Usefulness of the restart strategy in simulated annealing

The parameter of MSA that controls the number of SA restarts is the number of moves attempted at each temperature level. Its value in the main experiment (Sect. 5.4) was set to \(\beta =100n\). Let us write this parameter as \(\beta =\beta 'n\); by increasing \(\beta '\), we decrease the number of SA restarts that fit within the time limit. We performed an additional experiment with \(\beta '\) set to 500 and 1000. We denote the MSA configurations for the tested \(\beta '\) values by MSA(100), MSA(500), and MSA(1000). The restart mechanism is sketched below.
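The following C++ skeleton illustrates the relationship between \(\beta '\) and the number of restarts; runSA stands for one complete annealing run (attempting \(\beta \) moves per temperature level) and, like the other names, is a placeholder rather than our actual MSA code.

```cpp
#include <chrono>
#include <functional>
#include <limits>

// Multistart loop of MSA: with a fixed time limit, a larger beta' makes each
// annealing run longer and therefore reduces the number of completed restarts.
int multistartSA(long long n, long long betaPrime, double timeLimitSec,
                 const std::function<double(long long)>& runSA,
                 double& bestVal) {
    const long long beta = betaPrime * n;    // moves per temperature level
    const auto start = std::chrono::steady_clock::now();
    int restarts = 0;
    bestVal = std::numeric_limits<double>::max();
    while (std::chrono::duration<double>(std::chrono::steady_clock::now() -
                                         start).count() < timeLimitSec) {
        const double val = runSA(beta);      // one complete SA run
        if (val < bestVal) bestVal = val;
        ++restarts;
    }
    return restarts;                         // number of completed SA runs
}
```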

Table 19 compares the results of MSA(500) and MSA(1000) with those of MSA(100). Columns 2–5 of this table were obtained in the same manner as Columns 2–7 of Table 18. We denote the objective function value achieved by MSA(\(\beta '\)), \(\beta '\in \{100,500,1000\}\), by \(F_{\mathrm {MSA}(\beta ')}\); as before, this value is \(F_n^{*}\) for the columns labeled “Best” and \({\bar{F}}_n\) for the columns labeled “Average”. The \(F_{\mathrm {MSA}(100)}\) values were extracted from Tables 8, 9, 11, and 12. The last columns of Table 19 display the number of SA restarts (averaged over 10 runs) in the three scenarios considered. The bottom rows of the table show the wins/ties/losses of MSA(500) and MSA(1000) versus MSA(100).

As observed in Table 19, the difference in solution values between MSA(\(\beta '\)) with \(\beta '>100\) and MSA(100) increased with increasing \(\beta '\). This is especially visible when comparing average results. Thus, performing a larger number of shorter SA runs from different starting points was better than performing a smaller number of longer SA runs.

5.8.2 The effect of the number of TS iterations on the ITS performance

As we observed in Sect. 5.2, the ITS algorithm is fairly robust to the choice of the parameters that it depends upon. From preliminary experiments, it was found that the number of TS iterations, I, is one of the most important parameters of the ITS approach. The results given in Sect. 5.4 for ITS were obtained with \(I=100\). To learn more about the influence of I on the performance of the algorithm, we ran ITS with \(I=20\), 50, and 500.

Table 21 Difference in solution values between the MA algorithm with population size \(z=100\) and MA configurations with \(z=50\) and 500 for normalized cut graph partitioning

Table 20 shows the comparison results of the four variations of the ITS algorithm. To distinguish between these variations, we use the same naming convention as in the previous section (thus, for example, ITS(20) stands for ITS with \(I=20\)). The differences shown in Table 20 were obtained similarly to those in Tables 18 and 19. The last three rows of the table evaluate the tested configurations of the ITS algorithm versus ITS(100).

From the table, we note that replacing \(I=100\) with \(I\in \{20,50,500\}\) makes the ITS algorithm less efficient in terms of the average solution quality over 10 runs. Comparing the best results produced by the four versions of ITS, we found that ITS(500) performs as well as ITS(100); in this respect, the other two variations (with \(I=50\) and especially \(I=20\)) were worse than ITS with \(I=100\).

5.8.3 The effect of the population size on the MA performance

The only parameter of our memetic algorithm is the population size (denoted by z in Sect. 4). In our main experiments, we fixed this parameter at 100. To examine its influence on the performance of the MA, we conducted additional experiments in which MA was run with z set to 50 and to 500. The results are summarized in Table 21, whose structure is similar to that of Table 20. We denote the different configurations of MA by MA(z), \(z\in \{50,100,500\}\).

A close inspection of Table 21 allows us to conclude that the behavior of MA depends on the graph size. For smaller graphs (the first 15 rows of the table), the performance of MA improves with increasing z: MA(100) is better than MA(50) (there are many positive entries in the second and third columns), and MA(500) is better than MA(100) (all but one entry in the last two columns of the first 15 rows are nonpositive). However, the picture is completely different for large graphs (the last 5 graphs in the table), for which MA(100) drastically outperforms MA(500). This can be explained by the fact that MA(500) takes a significant amount of time to create an initial population that is 5 times larger than that of MA(100). The population is composed of solutions obtained by first randomly generating graph partitions and then improving them by applying the LS procedure; however, LS is not sufficiently fast when applied to random initial solutions for large graphs. Since MA(500) initializes a large population, the amount of time left for offspring generation is significantly reduced, and, for large graphs, MA(500) therefore produces a considerably smaller number of offspring than MA(100). For the uk graph, for example, this number is, on average, 771, 1830, and 2496 for MA(500), MA(100), and MA(50), respectively. Producing fewer offspring leads to a reduction in the quality of the solutions found by MA(500).

From the table, we can also see that the performance of MA(50) is comparable to that of MA(100). The latter obtains a smaller average objective function value on a larger number of graphs; on the other hand, the average values of \(F_n^{*}\) and \({\bar{F}}_n\) achieved by MA(50) are slightly smaller than those of MA(100). This advantage of MA(50) is due to its excellent performance on the power graph.

5.9 Analysis of the LS strategy

In this section, we discuss the impact of using the vector S in our LS procedure on the performance of the memetic algorithm. The role of S is to accelerate the neighborhood exploration process by evaluating only a subset of all possible relocation and swap moves. The vector S marks the partition subsets that were changed in the previous LS iteration; in the current iteration, it is then redundant to evaluate moves that relocate a vertex from one unmarked subset to another unmarked subset, and a similar observation applies to swap moves. The idea is sketched below.
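The following simplified C++ fragment illustrates the filtering idea for relocation moves only. The gain callable is assumed to return the objective improvement of moving vertex v to a given subset under the current partition; the swap-move handling and the incremental gain updates of Algorithm 5 are omitted, and all names are ours.

```cpp
#include <functional>
#include <vector>

using Partition = std::vector<int>;  // vertex -> subset index (0-based)

// One pass over relocation moves. S[k] is true if subset k was changed in
// the previous LS iteration (initially all true, so that every move is
// evaluated in the first pass). Moves between two unmarked subsets are
// skipped. Returns true if at least one improving move was applied.
bool relocationPass(Partition& p, int m, std::vector<bool>& S,
                    const std::function<double(int v, int to)>& gain) {
    const int n = static_cast<int>(p.size());
    std::vector<bool> Snew(m, false);
    bool improved = false;
    for (int v = 0; v < n; ++v) {
        const int from = p[v];
        for (int to = 0; to < m; ++to) {
            if (to == from) continue;
            if (!S[from] && !S[to]) continue;  // neither subset changed: skip
            if (gain(v, to) > 0.0) {           // improving relocation found
                p[v] = to;
                Snew[from] = true;             // both subsets are changed now
                Snew[to] = true;
                improved = true;
                break;                         // move on to the next vertex
            }
        }
    }
    S = Snew;                                  // marks for the next iteration
    return improved;
}
```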

In order to show the effectiveness of the proposed LS strategy, we experimentally compared the developed memetic algorithm against a version obtained by replacing the LS algorithm of Sect. 3 with a simplified LS procedure that does not use the vector S. Its pseudocode is obtained from Algorithm 5 by removing the initialization of S (Line 1), the conditions imposed on S (Lines 4 and 20), and the statements in Lines 16, 32, and 36. It is important to note that both LS implementations (i.e., that given in Algorithm 5 and the simplified procedure) return the same locally optimal solution; however, it is reasonable to expect the former to be faster than the latter. We refer to the version of MA with the use of S disabled in LS as MA-SD.

Table 22 Difference in solution values between MA-SD and MA for normalized cut graph partitioning

Table 22 compares the performance of MA and MA-SD on 10 random graphs and 10 benchmark graphs from our test suite. We provide results only for large graphs because, for small graphs, the performance difference between MA and MA-SD is less significant. The second and third columns of the table provide the same kind of statistics as Table 21. The column labeled “#impr” gives the number of runs (out of 10) in which MA yielded a better solution than MA-SD; in the remaining runs (if \(\mathrm {\#impr}<10\)), MA and MA-SD produced the same solution. Figure 8 compares the objective function values achieved by MA and MA-SD for two graphs, g-3000-4 and road-minnesota.

Fig. 8 Comparison between MA and MA-SD in terms of the objective function value. a g-3000-4. b road-minnesota

Table 22 and Fig. 8 show that the proposed LS strategy leads to significantly better performance of MA compared to the use of a traditional LS procedure. The reason is that, with a faster LS technique, the memetic algorithm is able to generate more offspring, which in many cases allows it to obtain better results. As can be seen in the last column of Table 22, MA improved on the MA-SD solutions in all 10 runs for 8 of the 20 graphs, and the average improvement rate is \(85\%\). Thus, the proposed LS acceleration technique has a positive impact on the performance of the memetic algorithm.

6 Concluding remarks

The main focus of the paper was to investigate the capabilities of the most popular metaheuristic approaches when applied to the ratio cut and normalized cut graph partitioning problems. Three specific algorithms were developed: multistart simulated annealing, iterated tabu search, and a memetic algorithm. In each of the algorithms, dedicated optimization strategies were adopted. By calculating and updating the cut weights of the partitions efficiently, the time complexity of the inner loop of SA was significantly reduced, which allowed more SA restarts to be performed. Both ITS and MA used a local search procedure; to speed it up, we applied a technique that reduces the effort required for neighborhood examination. In the MA, we implemented a version of the crossover operator used in GGAs.

We carried out computational experiments on two sets of graphs. The results show that the MA is unequivocally superior to both MSA and ITS with respect to solution quality. This conclusion holds for both models (ratio cut and normalized cut) and both graph types (random graphs and benchmark graphs from the literature). In some experiments, MA found solutions better than or equal to those of ITS and MSA for all graphs in a given set. We also concluded that, in terms of solution quality, MSA performed better than ITS on random graphs; however, there was no statistically significant difference between the results of these two algorithms on the considered set of benchmark graphs. Finally, our study showed that ITS was the fastest algorithm in the comparison: it tends to find reasonable solutions more quickly than the other tested algorithms. Nevertheless, the MA is not unacceptably slow compared to ITS, and for smaller graphs MA is even faster than ITS and MSA. Additionally, we experimentally compared MA, our best algorithm, with the VNS algorithm of Hansen et al. [27], which is the state-of-the-art algorithm for the normalized cut model, and found that MA provided better overall performance. Based on the obtained results, we believe that the MA is an excellent approach to solving the ratio cut and normalized cut models.

There are promising directions for further research. One of them is to develop innovative evolutionary algorithms for minimum cut problems. Some recent population-based metaheuristics could be used to solve these problems, such as monarch butterfly optimization [18, 19], the earthworm optimization algorithm [67], elephant herding optimization [66], the moth search algorithm [65], the slime mold algorithm [38], and Harris hawks optimization [29]. The proposed LS procedure could be embedded in such algorithms and used as a powerful technique for search intensification. Another promising avenue is to construct hybrid algorithms by combining two (or more) metaheuristic methods to benefit from the strengths of each of them. Finally, an important direction for further work is to apply the ideas of the ratio cut and normalized cut algorithms to the development of metaheuristic-based techniques for solving other graph partitioning problems.