1 Introduction

Partitioning a set of objects into two or more subsets constitutes an important class of problems in combinatorial optimization. A member of this class can often be modeled by a graph whose vertices represent objects and whose edges link pairs of vertices that have some kind of relationship. Let \(G=(V,E)\) be an undirected graph with vertex set V and edge set E. For an integer \(m\geqslant 2\), a solution to a graph partitioning problem is a set \(p=\{V_1,\ldots ,V_m\}\) in which \(V_i\), \(i\in M=\{1,\ldots ,m\}\), are nonempty and mutually disjoint subsets of V such that \(\cup _{i\in M} V_i=V\). The various graph partitioning problems differ in the objective function and/or in the constraints. In many cases, the objective function incorporates the total weight of the edges connecting vertices from different subsets in partition p. Let the edge weights of graph G be denoted by \(c_{uv}\), \((u,v)\in E\). For \(V_k,V_l\subset V\), \(V_k\cap V_l=\emptyset \), the sum \(C(V_k,V_l)=\sum _{u\in V_k,v\in V_l} c_{uv}\) is called the cut between subsets \(V_k\) and \(V_l\). There are several graph partitioning problems that require minimizing the cut between partition subsets (or clusters). One of them is the ratio cut graph partitioning problem. For a graph G and a fixed integer \(m\geqslant 2\), it is stated as

$$\begin{aligned} \min _{p\in \varPi } F_r(p)= \sum _{k=1}^m C(V_k)/|V_k| \end{aligned}$$
(1)

where \(C(V_k)\) is shorthand for \(C(V_k,V\setminus V_k)\) and \(\varPi \) is the set of all partitions of V into m nonempty subsets. Thus, the contribution of a partition subset \(V_k\) to (1) is represented by the ratio of the cut between \(V_k\) and the rest of the graph to the cardinality of \(V_k\). Problem (1) was first considered by Wei and Cheng [69] and Hagen and Kahng [25] for \(m=2\) and then generalized to a multiway ratio cut (i.e., for \(m>2\)) by Chan et al. [11]. Later, Shi and Malik [58] introduced another objective function called normalized cut. The corresponding graph partitioning problem is expressed as follows:

$$\begin{aligned} \min _{p\in \varPi } F_n(p)= \sum _{k=1}^m C(V_k)/d(V_k) \end{aligned}$$
(2)

where \(d(V_k)=\sum _{v\in V_k} d_v\) and the sum \(d_v=\sum _{u\in V} c_{vu}\) is referred to as the weighted degree of the vertex \(v\in V\). In (2), the term for a partition subset \(V_k\) is the ratio of the cut between \(V_k\) and the rest of the graph to the sum of the weighted degrees of the vertices in \(V_k\). Yu and Shi [70] extended the normalized cut model to \(m>2\) partition subsets.
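
To make the two objectives concrete, the following minimal Python sketch evaluates both \(F_r\) and \(F_n\) for a given partition. The weight matrix and the partition at the bottom are illustrative examples of our own, not data used elsewhere in this paper.

```python
import numpy as np

def cut(W, subset, rest):
    """C(V_k, V_l): sum of edge weights between two disjoint vertex sets."""
    return W[np.ix_(subset, rest)].sum()

def ratio_cut(W, partition):
    """F_r(p) = sum_k C(V_k) / |V_k|, as in Eq. (1)."""
    n = W.shape[0]
    total = 0.0
    for subset in partition:
        rest = [v for v in range(n) if v not in subset]
        total += cut(W, subset, rest) / len(subset)
    return total

def normalized_cut(W, partition):
    """F_n(p) = sum_k C(V_k) / d(V_k), as in Eq. (2)."""
    n = W.shape[0]
    d = W.sum(axis=1)                      # weighted degrees d_v
    total = 0.0
    for subset in partition:
        rest = [v for v in range(n) if v not in subset]
        total += cut(W, subset, rest) / d[subset].sum()
    return total

# Illustrative 5-vertex weighted graph and a partition into 2 subsets.
W = np.array([[0, 2, 1, 0, 0],
              [2, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 3],
              [0, 0, 0, 3, 0]], dtype=float)
p = [[0, 1, 2], [3, 4]]
print(ratio_cut(W, p))       # 1/3 + 1/2 = 0.8333...
print(normalized_cut(W, p))  # 1/9 + 1/7 = 0.2539...
```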

A salient feature of both the ratio cut and normalized cut models is that no constraints are imposed on the partition subset size. This makes a significant difference versus graph partitioning problems in which the size of each subset is bounded from above and below. A well-studied problem of this sort is the maximally diverse grouping problem (MDGP). Many algorithms for solving the MDGP have been proposed, including variable neighborhood search [5, 51], hybrid genetic algorithms [51, 60], multistart simulated annealing [51], tabu search with strategic oscillation [22], the artificial bee colony algorithm [55], iterated tabu search [52], iterated maxima search [36], and neighborhood decomposition-based variable neighborhood search and tabu search [37]. Other related graph partitioning problems include maximum k-cut [15], ratio association [16], minimum conductance graph partitioning [10, 43, 44], overlapping normalized cut [63], cohesive clustering [7], partition size constrained minimum cut [1], and edge-ratio clustering [6].

Many applications of the ratio cut and normalized cut graph partitioning problems have been identified in the literature. Perhaps the main application of these graph partitioning models can be observed in clustering. Ratio cut and normalized cut-based clustering methods can be used in a variety of domains, such as image data analysis [14], pattern recognition [33, 61], web search [63], data mining [3], image segmentation [28, 58], and gene network analysis [16]. Other applications include community detection [32, 39, 48], data classification [47, 73], VLSI design [25], tree segmentation [35], salient object detection [21], robot swarm dynamic regrouping [53], Bayesian-type statistics [59], water distribution network partitioning [71], and detecting similar groups in heterogeneous networks [40].

It is well known that both the ratio cut and normalized cut graph partitioning problems are NP-hard [25, 30]. Therefore, it is necessary to develop heuristic algorithms for these problems. Exact methods can be used to solve problem instances with very small sizes only. Fan and Pardalos [17] presented quadratically constrained programs that can be used to find optimal ratio and normalized cuts. They reported computational results for graphs with 10 vertices. For large graphs, one must resort to heuristic algorithms, which provide good but not necessarily optimal solutions. Most of the efforts in this direction have been focused on developing spectral methods for graph partitioning. The basic idea of these methods is to relax (1) and (2) into continuous optimization problems. The latter can be solved by first computing the eigenvectors of the Laplacian matrix of the graph and then finding the final partition using k-means or other suitable algorithms. Many studies have been devoted to the design and performance evaluation of spectral algorithms for ratio cut and normalized cut, including those by Hagen and Kahng [25], Chan et al. [11], Fan and Pardalos [17], Hochbaum [30], Merkurjev et al. [47], Zhang et al. [72], and Han et al. [26]. Many authors have proposed various improvements to the standard spectral clustering method. Lu et al. [42] presented nonnegative and sparse spectral clustering algorithms based on the ratio cut and normalized cut criteria. Experiments have shown that these algorithms outperform the standard spectral clustering technique. Chen et al. [12] proposed a direct normalized cut algorithm exploiting the idea of directly optimizing the normalized cut model. The algorithm has only a quadratic time complexity. Chen et al. [13] considered a new normalized cut model with balance regularization to avoid a trivial solution. They developed an iterative method to solve the new model without using eigendecomposition. More comprehensive discussions of the spectral method for graph clustering and partitioning can be found in the tutorial by von Luxburg [64] and survey papers by Nascimento and de Carvalho [49] and Gallier [23].

Although the traditional approach to ratio cut and normalized cut graph partitioning relies on using the spectral method, there are also algorithms that are based on different principles. Jia et al. [33] proposed an approximate weighted kernel k-means algorithm for the normalized cut. The algorithm avoids the direct eigendecomposition of the Laplacian matrix and is suitable for handling very large graphs. Lorente-Leyva et al. [41] outlined two alternative approaches for the normalized cut partitioning problem. One approach is a heuristic search procedure, and the other is a quadratic formulation-based method. Dhillon et al. [16] designed a fast multilevel algorithm that can be tuned to minimize specific objectives, including ratio cut and normalized cut. The refinement step of the algorithm employs the weighted kernel k-means technique. Fan and Pardalos [17] presented several semidefinite programming relaxations for ratio and normalized cut models. They developed a graph partitioning algorithm whose main step involves solving one of these relaxations. Compared with the spectral clustering method, this algorithm can obtain improved solutions.

However, the literature offers only a few metaheuristic-based approaches for finding the minimum ratio cut or normalized cut of a graph. One such approach was proposed by Hansen et al. [27]. They developed a variable neighborhood search (VNS) algorithm for the normalized cut model. Their local search (LS) procedure within the VNS framework analyzes all the possibilities of moving a vertex from one partition subset to another. The algorithm also employs a fast LS technique that considers only moves between connected subsets (two vertex subsets \(V_k\) and \(V_l\) are connected if there is an edge with one endpoint in \(V_k\) and the other in \(V_l\)). During VNS execution, this technique is combined with the complete LS strategy. The experimental results show that the VNS method of Hansen et al. [27] is an efficient approach for solving the normalized cut graph partitioning problem. Metaheuristic-based approaches have also been developed for several related problems. Cafieri et al. [6] proposed a VNS algorithm for graph bipartitioning with the edge-ratio criterion. The algorithm is embedded into a hierarchical divisive heuristic to obtain a partition of the vertex set of the graph into a larger number of subsets. Mu et al. [48] and Ji et al. [32] presented ant colony optimization algorithms for community detection in complex networks. In their approach, the ratio cut is included as a linear term in the objective function (modularity density) of the problem. Lu et al. [43] developed a hybrid evolutionary algorithm for finding the minimum conductance of a graph. The algorithm employs a tabu search technique as a local optimization procedure.

Literature analysis shows that the ratio cut and normalized cut graph partitioning problems have been mostly addressed by using various heuristic techniques, most often obtained through continuous relaxations of the ratio cut or normalized cut models. Very little research has been devoted to metaheuristic optimization approaches, although graph partitioning and clustering have many applications in a wide range of areas, as outlined earlier in this section. Considering these observations, our motivation is to implement and investigate the performance of metaheuristic-based algorithms for graph partitioning with the ratio cut and normalized cut criteria. The main significance of this work lies in the development and experimental comparison of several metaheuristic algorithms for these graph partitioning problems. In the case of the normalized cut criterion, the comparison includes both the new algorithms and the best existing method. The significance of comparative analysis of various approaches is that it can help identify promising directions for designing better algorithms. The results of our computational study underscore the potential of evolutionary methods for solving graph partitioning problems. The main contribution of our work consists of three diverse algorithms for ratio cut and normalized cut graph partitioning. A simulated annealing (SA) algorithm is selected because of its popularity and success in solving complex combinatorial optimization problems. To be able to apply a CPU time-based termination rule, we implemented SA as a multistart procedure. The iterated tabu search (ITS) algorithm is chosen because tabu search (TS) is one of the most widely used local search methods. To potentially achieve better performance, we apply TS iteratively. Population-based evolutionary approaches in our study are represented by the memetic algorithm (MA). We prefer MA over the genetic algorithm (GA) because MA usually demonstrates faster convergence and better optimization performance than GA. The algorithm choice was guided by two considerations. First, our focus is on metaheuristics that have been in use for a long time and have shown excellent performance in solving numerous combinatorial optimization problems. Second, we consider the diversity factor and do not limit ourselves to evolutionary techniques only. It seems plausible that some of the newer population-based methods could perform equally well or even outperform the ITS and SA algorithms. In recent years, many effective strategies have been proposed to address optimization problems. They include a memetic algorithm with competition [68], iterated local search with tabu search [50], a multistart iterated tabu search algorithm [4], brain storm optimization with an orthogonal learning mechanism [45], and a decomposition-based algorithm using a localized control variable analysis approach [46].

We experimentally compare the proposed algorithms on two different sets of graphs: random graphs and benchmark graphs from the literature. We report results for both ratio cut and normalized cut graph partitioning scenarios. We also present comparison results between our best algorithm and the state-of-the-art VNS-based algorithm of Hansen et al. [27].

The remainder of the paper is organized as follows. In Sects. 2, 3, and 4, we present the multistart simulated annealing (MSA), iterated tabu search, and memetic algorithms, respectively. Section 5 is devoted to experimental analysis and comparisons of algorithms. Concluding remarks are given in Sect. 6.

2 Multistart simulated annealing

In this section, we present an implementation of the SA method for ratio cut and normalized cut graph partitioning. This method is based on an analogy between the metallurgical process of annealing in thermodynamics and the process of searching for the global extremum of a function. This analogy has been effectively exploited by Kirkpatrick et al. [34] and Černý [8]. The guiding principle of the SA method is to escape a local optimum by accepting worsening moves with a certain probability. A recent overview of different SA variants can be found in [20].

Fig. 1 Illustration of the relocation move: vertex v is transferred to subset \(V_3\)

Fig. 2 Illustration of the swap move: the vertices u and v are interchanged

The idea of simulated annealing is to generate trial solutions in the neighborhood of the current solution and either accept or reject them. A new solution is always accepted if it improves the current solution. Otherwise, a decision on acceptance or rejection is made at random with a probability depending on the difference between the objective function values of the two solutions and the current temperature of the cooling process. Because the search starts with a high initial temperature, this probability is higher at the initial steps of the algorithm. As the algorithm proceeds, the temperature is reduced after each iteration. The algorithm stops whenever the temperature becomes very close to zero. At each temperature level, a certain number of trial solutions are evaluated. In our implementation of SA, the initial temperature is obtained by generating a random solution and repeatedly calculating the absolute difference between the objective function values of the random solution and a solution in its neighborhood. The temperature is initialized with the largest of these differences. Another important design feature of our algorithm is that SA is executed multiple times.
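
The acceptance rule described above is the classical Metropolis criterion. A minimal sketch for a minimization problem, with `delta` denoting the move gain and `T` the current temperature:

```python
import math
import random

def accept(delta, T):
    """Metropolis criterion: improving moves (delta <= 0) are always
    accepted; worsening moves are accepted with probability exp(-delta/T),
    which shrinks as the temperature T decreases."""
    return delta <= 0 or random.random() < math.exp(-delta / T)
```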

Without loss of generality, we present our multistart simulated annealing algorithm in terms of the ratio cut metric. The adaptation of the algorithm to the case of normalized cut graph partitioning requires only very minor changes that come from using a different objective function. The algorithm employs two move types, relocating a vertex from its current subset to another subset in a partition and swapping two vertices located in two distinct subsets. The relocation move is illustrated in Fig. 1, where vertex \(v\in V_1\) is moved to partition subset \(V_3\). An example of the swap move is shown in Fig. 2, where the solution on the right is obtained by swapping vertex \(v\in V_3\) with vertex \(u\in V_1\). The choice of move type at each iteration is controlled by a parameter \(Q\in [0,1]\), which defines the probability of selecting a swap move. Accordingly, the relocation move is performed with probability \(1-Q\). Assume that a vertex \(v\in V_k\), \(k\in M\), is relocated from its current subset \(V_k\) to subset \(V_l\), \(l\ne k\), of the partition \(p=\{V_1,\ldots ,V_m\}\in \varPi \). Let the resulting partition be denoted by \(p'\). Naturally, it is assumed that v belongs to the set \(V'(p)\), formed as the union of all subsets in p having cardinality greater than one. Thus, \(v\in V_k\subseteq V'(p)=\cup _{i=1,|V_i|>1}^m V_i\). For \(p\in \varPi \), we can define the relocation neighborhood \(N_1(p)\) of p as the set of all partitions that can be obtained from p by relocating a single vertex. The change in cost incurred by applying a move is called the move gain. In the case of the above-defined relocation move, it is denoted by \(\delta (p,p')\). Formally, \(\delta (p,p')=F_r(p')-F_r(p)\). To provide an expression for \(\delta \), we rewrite the objective function of (1) as \(F_r(p)=\sum _{i=1}^m R(V_i)\) where

$$\begin{aligned} R(V_i)=C(V_i)/|V_i|. \end{aligned}$$
(3)

For a vertex \(v\in V\) and a subset \(V_i\in p\), let \(c_v(V_i)=\sum _{u\in V_i} c_{vu}\). This number represents the sum of weights of edges between v and vertices in subset \(V_i\). Of course, the sum \(c_v(V_i)\) is also defined for \(v\in V_i\). The sums \(c_v(V_i)\) along with the cut weights \(C(V_i)\) and ratios \(R(V_i)\) can be used to compute \(\delta \) as follows.

Proposition 1

For a vertex \(v\in V_k\) and a subset \(V_l,l\ne k\), in the partition p,

$$\begin{aligned} \begin{aligned}&\delta (p,p')=(C(V_k)-d_v+2c_v(V_k))/(|V_k|-1) \\&+(C(V_l)+d_v-2c_v(V_l))/(|V_l|+1)-R(V_k)-R(V_l). \end{aligned} \end{aligned}$$
(4)

Proof

Let us denote the cut weights for partition \(p'\) by \(C'(V_i)\), \(V_i\in p'\). Imagine that in the first step of the relocation operation, vertex v is temporarily removed from graph G (and thus out of \(V_k\)). Then, the weight of the cut between \(V_k\) and \(V\setminus V_k\) decreases by \(d_v-c_v(V_k)\) and that between \(V_l\) and \(V\setminus V_l\) decreases by \(c_v(V_l)\). In the second step, the vertex v is inserted into \(V_l\). This increases the weight of the cut between \(V_k\) and \(V\setminus V_k\) by \(c_v(V_k)\) and that between \(V_l\) and \(V\setminus V_l\) by \(d_v-c_v(V_l)\). Thus

$$\begin{aligned} C'(V_k)=C(V_k)-d_v+2c_v(V_k) \end{aligned}$$
(5)

and

$$\begin{aligned} C'(V_l)=C(V_l)+d_v-2c_v(V_l). \end{aligned}$$
(6)

Considering the fact that \(C'(V_i)=C(V_i)\) for all \(i\ne k,l\), we can write

$$\begin{aligned}&\delta (p,p')=C'(V_k)/(|V_k|-1)+C'(V_l)/(|V_l|+1)\nonumber \\&\quad -R(V_k)-R(V_l). \end{aligned}$$
(7)

Substituting (5) and (6) into (7), we obtain (4). \(\square \)

To illustrate the computation of \(\delta \) by (4), we use Fig. 1 in which edge weights greater than one are indicated close to the edges. For \(k=1\) and \(l=3\), we have \(C(V_k)=3\), \(C(V_l)=3\), \(c_v(V_k)=2\), \(c_v(V_l)=2\), \(d_v=4\), \(R(V_k)=0.75\), and \(R(V_l)=1.5\). Putting these values in (4), we obtain \(\delta (p,p')=(3-4+4)/3+(3+4-4)/3-0.75-1.5=-0.25\).
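
This computation is easy to reproduce. The following snippet evaluates (4) from the aggregate quantities listed above; the subset sizes \(|V_k|=4\) and \(|V_l|=2\) are implied by the ratios \(R(V_k)=0.75\) and \(R(V_l)=1.5\).

```python
def relocation_gain(C_k, C_l, d_v, cv_k, cv_l, size_k, size_l):
    """Relocation gain delta(p, p') of Eq. (4), ratio cut case:
    move vertex v from subset V_k to subset V_l."""
    R_k, R_l = C_k / size_k, C_l / size_l
    return ((C_k - d_v + 2 * cv_k) / (size_k - 1)
            + (C_l + d_v - 2 * cv_l) / (size_l + 1)
            - R_k - R_l)

# Aggregate quantities read off Fig. 1 (k = 1, l = 3).
print(relocation_gain(C_k=3, C_l=3, d_v=4, cv_k=2, cv_l=2,
                      size_k=4, size_l=2))   # -0.25
```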

Upon acceptance of the move, the cut weights of the subsets \(V_k\) and \(V_l\) are updated according to Equations (5) and (6): \(C(V_k)\) is decreased by \(d_v-2c_v(V_k)\) and \(C(V_l)\) is increased by \(d_v-2c_v(V_l)\). The values of \(c_w(V_k)\) and \(c_w(V_l)\), \(w\in V\setminus \{v\}\), are updated by setting

$$\begin{aligned}&c_w(V_k):=c_w(V_k)-c_{wv}, \nonumber \\&c_w(V_l):=c_w(V_l)+c_{wv}. \end{aligned}$$
(8)

In the case of normalized cut graph partitioning, Equation (4) is slightly modified as follows:

$$\begin{aligned} \begin{aligned}&\delta (p,p')=(C(V_k)-d_v+2c_v(V_k))/(d(V_k)-d_v) \\&\quad +(C(V_l)+d_v-2c_v(V_l))/(d(V_l)+d_v)\\&\quad -C(V_k)/d(V_k)-C(V_l)/d(V_l). \end{aligned} \end{aligned}$$
(9)

If the move is accepted, then \(d_v\) is subtracted from \(d(V_k)\) and added to \(d(V_l)\). Certainly, the operations defined by Equations (5), (6), and (8) are also applied.
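
The following sketch (our own illustration, not the paper's code) applies these update rules on a small random instance and cross-checks them against recomputation from scratch. Here `q[v]` stores the index of the subset containing vertex v; this mapping is introduced formally later in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3
W = rng.random((n, n)); W = np.triu(W, 1); W += W.T     # symmetric weights, zero diagonal
q = np.array([0, 0, 0, 1, 1, 2, 2, 2])                  # q[v]: index of the subset of v
d = W.sum(axis=1)                                       # weighted degrees d_v
c = np.stack([W[:, q == i].sum(axis=1) for i in range(m)], axis=1)  # c[v, i] = c_v(V_i)
C = np.array([(d[q == i] - c[q == i, i]).sum() for i in range(m)])  # cut weights C(V_i)

def relocate(v, l):
    """Move v from its current subset k to subset l, updating the cut
    weights by (5) and (6) and the sums c_w(.) by (8)."""
    k = q[v]
    C[k] += -d[v] + 2 * c[v, k]          # Eq. (5)
    C[l] += d[v] - 2 * c[v, l]           # Eq. (6)
    c[:, k] -= W[:, v]                   # Eq. (8); the row of v itself only
    c[:, l] += W[:, v]                   # changes by W[v, v] = 0, so it is safe
    q[v] = l

relocate(2, 1)
# Cross-check the incremental bookkeeping against recomputation from scratch.
c_ref = np.stack([W[:, q == i].sum(axis=1) for i in range(m)], axis=1)
C_ref = np.array([(d[q == i] - c_ref[q == i, i]).sum() for i in range(m)])
assert np.allclose(c, c_ref) and np.allclose(C, C_ref)
```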

Next we consider the swap move. Assume that a vertex \(v\in V_k\) is interchanged with a vertex \(u\in V_l\), \(l\ne k\). As before, we denote the initial partition by p and the resulting partition by \(p'\). All the partitions that can be obtained from p in this way form the swap neighborhood \(N_2(p)\) of p. The gain of the swap move is denoted by \(\varDelta (p,p')\). By definition, \(\varDelta (p,p')=F_r(p')-F_r(p)\). The following statement presents a formula for \(\varDelta \).

Proposition 2

For a vertex \(v\in V_k\) and a vertex \(u\in V_l\), \(l\ne k\),

$$\begin{aligned} \varDelta (p,p')&=(C(V_k)+d_u-d_v+2(c_v(V_k)-c_u(V_k)+c_{uv}))/|V_k| \\&\quad +(C(V_l)+d_v-d_u+2(c_u(V_l)-c_v(V_l)+c_{uv}))/|V_l| \\&\quad -R(V_k)-R(V_l). \end{aligned}$$
(10)

Proof

We split the swap operation into four steps. Assume first that vertex v is temporarily removed from graph G. This action decreases \(C(V_k)\) by \(d_v-c_v(V_k)\) and \(C(V_l)\) by \(c_v(V_l)\). In the second step, vertex u is removed from graph G. As a result, \(C(V_k)\) is further decreased by \(c_u(V_k)-c_{uv}\) and \(C(V_l)\) by \(d_u-c_{uv}-c_u(V_l)\). The graph is restored by first adding vertex v to subset \(V_l\). This step increases \(C(V_k)\) by \(c_v(V_k)\). Since vertex u is removed from graph G, the current weighted degree of vertex v is \(d_v-c_{uv}\), and the current sum of edge weights between v and \(V_l\) is \(c_v(V_l)-c_{uv}\). Therefore, \(C(V_l)\) increases by \(d_v-c_{uv}-(c_v(V_l)-c_{uv})=d_v-c_v(V_l)\). In the last step, the vertex u is inserted into \(V_k\). Now, the sum of edge weights between u and \(V_k\) is \(c_u(V_k)-c_{uv}\). It follows that \(C(V_k)\) is increased by \(d_u-(c_u(V_k)-c_{uv})=d_u-c_u(V_k)+c_{uv}\). Since vertex v has been moved to subset \(V_l\), it also follows that \(C(V_l)\) is increased by \(c_u(V_l)+c_{uv}\). Now, let \(C'(V_i)\), \(V_i\in p'\), stand for the cut weights for partition \(p'\). Summing the changes in \(C(V_k)\) for each step of the above procedure, we obtain

$$\begin{aligned} C'(V_k)&=C(V_k)-(d_v-c_v(V_k))-(c_u(V_k)-c_{uv})+c_v(V_k)+(d_u-c_u(V_k)+c_{uv}) \\&=C(V_k)+d_u-d_v+2(c_v(V_k)-c_u(V_k)+c_{uv}). \end{aligned}$$
(11)

Analogously

$$\begin{aligned} C'(V_l)&=C(V_l)-c_v(V_l)-(d_u-c_{uv}-c_u(V_l))+(d_v-c_v(V_l))+(c_u(V_l)+c_{uv}) \\&=C(V_l)+d_v-d_u+2(c_u(V_l)-c_v(V_l)+c_{uv}). \end{aligned}$$
(12)

Clearly, \(C'(V_i)=C(V_i)\) for all \(i\ne k,l\). Therefore, substituting (11) and (12) into equation \(\varDelta (p,p')=C'(V_k)/|V_k|+C'(V_l)/|V_l|-R(V_k)-R(V_l)\) gives (10). \(\square \)

Figure 2 provides an example of the swap move. As before, only edge weights greater than 1 are shown. For \(k=3\) and \(l=1\), we have \(C(V_k)=5\), \(C(V_l)=5\), \(c_v(V_k)=1\), \(c_v(V_l)=3\), \(c_u(V_k)=2\), \(c_u(V_l)=1\), \(d_v=5\), \(d_u=3\), and \(R(V_k)=R(V_l)=5/3\). Then, using (10), we can calculate the swap move gain \(\varDelta (p,p')\), which is equal to \((5+3-5+2(1-2+1))/3+(5+5-3+2(1-3+1))/3-5/3-5/3=-2/3\).
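
This computation can be reproduced analogously to the relocation case. The edge weight \(c_{uv}=1\) is read off the worked computation above.

```python
def swap_gain(C_k, C_l, d_v, d_u, cv_k, cv_l, cu_k, cu_l, c_uv, size_k, size_l):
    """Swap gain Delta(p, p') of Eq. (10), ratio cut case:
    interchange vertex v in V_k with vertex u in V_l."""
    R_k, R_l = C_k / size_k, C_l / size_l
    return ((C_k + d_u - d_v + 2 * (cv_k - cu_k + c_uv)) / size_k
            + (C_l + d_v - d_u + 2 * (cu_l - cv_l + c_uv)) / size_l
            - R_k - R_l)

# Aggregate quantities read off Fig. 2 (k = 3, l = 1, |V_3| = |V_1| = 3).
print(swap_gain(C_k=5, C_l=5, d_v=5, d_u=3, cv_k=1, cv_l=3,
                cu_k=2, cu_l=1, c_uv=1, size_k=3, size_l=3))  # -2/3
```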

If the solution \(p'\) is accepted, then first, the cut weights of the subsets \(V_k\) and \(V_l\) are updated using Equations (11) and (12) (assuming that \(C'(V_k)\) and \(C'(V_l)\) represent new values of \(C(V_k)\) and \(C(V_l)\), respectively). Then, since vertices v and u are swapped, the following assignments are performed:

$$\begin{aligned}&c_v(V_k):=c_v(V_k)+c_{uv}, c_v(V_l):=c_v(V_l)-c_{uv}, \end{aligned}$$
(13)
$$\begin{aligned}&c_u(V_k):=c_u(V_k)-c_{uv}, c_u(V_l):=c_u(V_l)+c_{uv}, \end{aligned}$$
(14)
$$\begin{aligned}&c_w(V_k):=c_w(V_k)+c_{uw}-c_{vw},\nonumber \\&c_w(V_l):=c_w(V_l)+c_{vw}-c_{uw},\nonumber \\&w\in V\setminus \{u,v\}. \end{aligned}$$
(15)

For a normalized cut, the swap move gain is computed as follows:

$$\begin{aligned} \varDelta (p,p')&=(C(V_k)+d_u-d_v+2(c_v(V_k)-c_u(V_k)+c_{uv}))/(d(V_k)-d_v+d_u) \\&\quad +(C(V_l)+d_v-d_u+2(c_u(V_l)-c_v(V_l)+c_{uv}))/(d(V_l)+d_v-d_u) \\&\quad -C(V_k)/d(V_k)-C(V_l)/d(V_l). \end{aligned}$$
(16)

Equations (13)–(15) are applicable for both objective functions, \(F_r\) and \(F_n\).

Before presenting the MSA algorithm, we need to define a mapping, denoted by q, from vertex set V to set M, which represents the indices of the partition subsets. We will write \(q(v)=k\) to indicate that vertex v belongs to subset \(V_k\) of partition p. The pseudocode of MSA is given in Algorithm 1. In addition to the aforementioned move type selection probability Q, the parameters of MSA are the cooling factor \(\alpha \), the final temperature \(T_{\mathrm {min}}\), and the number of moves, \(\beta \), to be attempted at each temperature level. The initial temperature \(T_{\mathrm {max}}\) is computed using the formula \(T_{\mathrm {max}}=\max (\max _{p'\in N'_1} |\delta (p,p')|, \max _{p'\in N'_2} |\varDelta (p,p')|)\), where p is a starting partition generated in Line 2 of Algorithm 1 and \(N'_1\) (\(N'_2\)) is a sequence of partitions obtained by applying randomly chosen relocation (respectively, swap) moves to the partition p (thus \(N'_1\) and \(N'_2\) are extracted from the neighborhoods \(N_1(p)\) and \(N_2(p)\), respectively). We fixed the length of both sequences at 5000. Having the initial and final temperatures of the annealing process, we can calculate the number of temperature levels \({\bar{\gamma }}\) (Line 5).
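
A short sketch of these two computations follows. The closed-form expression for the number of levels of a geometric schedule is our reading of Line 5, not a formula stated in the paper.

```python
import math

def initial_temperature(sample_gains):
    """T_max: the largest absolute move gain over sampled random
    relocation and swap moves (5000 of each kind in our setting)."""
    return max(abs(g) for g in sample_gains)

def num_temperature_levels(T_max, T_min, alpha):
    """Number of applications of T <- alpha * T needed to go from
    T_max down to T_min (our reading of what Line 5 computes)."""
    return math.ceil(math.log(T_min / T_max) / math.log(alpha))

print(num_temperature_levels(T_max=50.0, T_min=1e-4, alpha=0.95))  # 256
```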

[Algorithm 1: pseudocode of the MSA procedure]

As seen in the pseudocode, our SA algorithm is implemented as a multistart procedure. The reason for such a choice is to fairly compare the algorithm with the other presented algorithms using a time-based stopping criterion. There is no explicit way to control the number of SA restarts: this number decreases as the value of the parameter \(\beta \) increases. Usually, \(\beta \) is assumed to be dependent on the size of the problem instance. Our multistart mechanism is straightforward. At the start of each iteration, a random partition of the graph is generated. This is done by first randomly generating a permutation of the vertices of G and then splitting its entries (vertices) into m subsets of approximately equal sizes. The latter operation assigns the first \(\lceil n/m \rceil \) vertices in the permutation to the first subset, the next \(\lceil n/m \rceil \) (or \(\lfloor n/m \rfloor \)) vertices to the second subset and so on. The obtained partition of the graph is passed to the SA method. Additional operations must be performed before entering SA for the first time. They include initialization of the best solution \(p^{*}\) and calculation of the initial temperature \(T_{\mathrm {max}}\). This temperature is reused in subsequent SA restarts. It can be noted that SA with multiple starts requires no additional parameters beyond those of classical SA and the probability parameter Q.
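
The random starting partition can be generated as follows; this sketch mirrors the permutation-splitting procedure just described.

```python
import random

def random_partition(n, m):
    """Randomly permute the vertices 0..n-1 and split the permutation
    into m subsets of nearly equal size (sizes differ by at most one,
    with the larger subsets coming first)."""
    perm = list(range(n))
    random.shuffle(perm)
    base, extra = divmod(n, m)   # the first `extra` subsets get one more vertex
    partition, start = [], 0
    for i in range(m):
        size = base + (1 if i < extra else 0)
        partition.append(perm[start:start + size])
        start += size
    return partition

print([len(s) for s in random_partition(11, 3)])  # [4, 4, 3]
```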

From the pseudocode, we see that SA is wrapped in the “while” statement, which loops until a stop condition is met. We use a stopping criterion under which the search terminates when a maximum time limit is reached. At the beginning of the iteration, the values of the sums \(c_w(V_i)\), \(w\in V\), \(i\in M\), and the cut weights \(C(V_i)\), \(i\in M\), for the generated solution p are initialized. The best solution found by the algorithm is denoted as \(p^{*}\), and its value is \(f^{*}\). This solution is memorized in Line 28. At each iteration, either a relocation move (Lines 11 and 12) or a swap move (Lines 14 and 15) is selected and evaluated. This choice is controlled by the probability parameter Q. If the move is accepted, then the current solution p and related information are updated in Lines 18–27. The temperature is changed after performing \(\beta \) iterations. As is common practice in SA implementations, the temperature is decreased according to a geometric schedule (Line 31). We note that in the normalized cut case, Equations (4) and (10) should be replaced with Equations (9) and (16), respectively (Lines 12 and 15). Considering only the innermost loop in Algorithm 1, we can obtain the following statement.

Proposition 3

The computational complexity of the body of loop 9–30 of MSA is O(n).

Proof

The major bottleneck of computations in Lines 10–29 of MSA is updating \(c_w(V_k),c_w(V_l)\), \(w\in V\) (either in Line 20 or in Line 24). This clearly takes O(n) time. The same upper bound holds on the number of operations required to save the improved solution when such a solution is found (Line 28). Other steps inside the loop are less time-consuming. In particular, only O(1) time is needed to compute the gain of a move (Lines 12 and 15). \(\square \)

We remark that the parameter \(\beta \) in many SA implementations depends linearly on the size of the problem. This is also the case in our experiments with MSA (see Sect. 5.2). Therefore, all operations inside the loop 8–32 of MSA can be performed in \(O(n^2)\) time. However, evaluating the time complexity of all executions of this loop, which implements SA, is difficult because the number of its repetitions depends on the initial temperature \(T_{\mathrm {max}}\). In our algorithm, this temperature is a quantity dependent on the character of the problem instance being solved.

3 Iterated tabu search

A possible alternative to the SA technique is the widely used TS method [24]. We apply it to minimize the ratio cut and normalized cut objective functions iteratively.

The main idea of our TS implementation is to repeatedly execute tabu search and solution perturbation procedures. Under this strategy, only in the first iteration does TS start from a fully random solution. The solution perturbation procedure starts with a relatively good solution and proceeds by performing a number of random moves. The resulting solution is passed to the TS component of the approach. The key idea of tabu search is to allow accepting worsening solutions to prevent getting stuck in local minima. To avoid cycling in the search process, for each move performed, the reverse move is forbidden for a specified number of iterations. For this purpose, we use two tabu tables, one for relocation moves and another for swap moves. The TS procedure stops when a predefined number of iterations is reached. This number is an important parameter of the algorithm. A noteworthy feature of our approach is that the TS procedure is enhanced with an integrated LS technique. Both procedures explore the relocation and swap neighborhoods at each iteration.

As in the case of SA, we present the ITS algorithm for the ratio cut partitioning problem. The pseudocode of the top level of our TS implementation is shown in Algorithm 2. The search starts from a solution generated using the same procedure as for the MSA method. The best solution is denoted as \(p^{*}\) and its value is \(f^{*}\). The TS algorithm and solution perturbation procedure, named get_restart_partition, are executed repeatedly within the loop 3–6. To stop the loop, a CPU time-based termination condition is used.

[Algorithm 2: top-level pseudocode of the ITS procedure]
[Algorithm 3: pseudocode of the TS component]

Algorithm 3 gives the pseudocode of the TS component of the approach. This component accepts and returns the current partition p, the best-found partition \(p^{*}\), and their objective function values f and \(f^{*}\). It starts with the initialization of the tabu tables \(t=(t_{vu})\) and \({\tilde{t}}=({\tilde{t}}_{vl})\) by setting their entries to \(-\tau \), where \(\tau \) is the tabu tenure parameter. As the TS algorithm proceeds, the entry \(t_{vu}\) of the tabu table t is used to store the last iteration number in which the vertices v and u have been swapped. Similarly, \({\tilde{t}}_{vl}\) contains the last iteration number in which vertex v has been moved from partition subset \(V_l\) to a different subset. Another parameter of the algorithm is the number of iterations I. In each iteration, both neighborhood \(N_1\) and neighborhood \(N_2\) are explored. The selected move is specified by the vertex \(v^{*}\) and partition subset \(V_{l^{*}}\) in the case of relocation and by the vertices \(v^{*}\) and \(u^{*}\) in the case of the swap operation. To distinguish which move type is selected, we use the flag variable \(\xi \), which equals 0 for a relocation move and 1 for a swap move. The gain incurred by the selected move is denoted by \(\delta _{\mathrm {min}}\). In the gain calculation, p is the current solution, and \(p'\) is either the solution obtained by relocating vertex v from subset \(V_k\) to subset \(V_l\) (Line 7) or the solution obtained by swapping vertices v and u (Line 16). The variable \(\rho \) is used to indicate whether an improving solution has been found. If this is the case, \(\rho \) is positive. If two or more improving solutions are discovered, then one of them (or, more precisely, the corresponding move) is selected probabilistically (Lines 10 and 19). The role of the set U in the algorithm (Lines 3, 5, and 15) is to guarantee that each pair of vertices (vu) is considered only once. The algorithm also uses the set \(V'\) and mapping q, which we defined in the previous section.

Once neighborhoods \(N_1\) and \(N_2\) of the current solution p have been explored, the algorithm proceeds by updating the tabu tables t and \({\tilde{t}}\) (Line 25) and performing either the relocation or swap operation, depending on the value of \(\xi \) (Line 26). These operations are implemented in the same way as those in MSA (Lines 19–27 of Algorithm 1). Specifically, Equations (5), (6), and (8) are used for \(v=v^{*}\), \(k=q(v^{*})\), and \(l=l^{*}\) in the case of the relocation move, and Equations (11)–(15) are used for \(v=v^{*}\), \(u=u^{*}\), \(k=q(v^{*})\), and \(l=q(u^{*})\) if the swap move has been selected. In the same step, the current partition p is updated. If an improved solution has been found (\(\rho >0\)), then the TS algorithm attempts to further improve this solution by applying a local search procedure (Line 29). A description of this procedure is given later in this section.
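
A minimal sketch of how such tabu tables can be maintained and queried is given below. The exact tabu test and the aspiration rule (allowing a tabu move that improves the best solution) are common conventions; we do not claim they match Algorithm 3 verbatim.

```python
class TabuTables:
    """Tabu bookkeeping sketch: entries store the iteration at which a move
    was last performed, initialized to -tau so that nothing is tabu at the
    start. A reverse move stays forbidden while fewer than tau iterations
    have elapsed since it was recorded."""

    def __init__(self, n, m, tau):
        self.tau = tau
        self.t = [[-tau] * n for _ in range(n)]     # t[v][u]: last swap of v and u
        self.tt = [[-tau] * m for _ in range(n)]    # tt[v][l]: v last left subset l

    def swap_allowed(self, v, u, it, improves_best=False):
        return improves_best or it - self.t[v][u] >= self.tau

    def move_back_allowed(self, v, l, it, improves_best=False):
        # Relocating v into subset l is tabu while v has left l recently.
        return improves_best or it - self.tt[v][l] >= self.tau

    def record_swap(self, v, u, it):
        self.t[v][u] = self.t[u][v] = it

    def record_relocation(self, v, k, it):
        self.tt[v][k] = it    # v has just been moved out of subset k
```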

Looking at the TS pseudocode, we can see that the statements inside the outermost loop can be split into two sequences: those in Lines 3–27 and those in Lines 28–31. The first of these sequences has a worst-case runtime \(O(n^2)\). The second of these sequences includes a call to the LS procedure. However, its worst-case time complexity is unknown. A remark on the computational complexity of the basic parts of local_search is given at the end of this section.

[Algorithm 4: pseudocode of the get_restart_partition procedure]

To periodically direct the search toward unexplored regions of the solution space, the ITS algorithm applies a solution perturbation procedure get_restart_partition. Its pseudocode is presented in Algorithm 4. The input to the procedure includes a partition p, which is the last solution recorded by the TS component of the approach. At the initialization step, get_restart_partition first draws uniformly at random an integer \(K_{\mathrm {max}}\) from the interval \([n\kappa _1,n\kappa _2]\) and then integers K from the interval \([K_{\mathrm {min}},K_{\mathrm {max}}]\) and L from the interval \([L_{\mathrm {min}},L_{\mathrm {max}}]\). In these calculations, \(\kappa _1,\kappa _2,K_{\mathrm {min}},L_{\mathrm {min}}\), and \(L_{\mathrm {max}}\) are ITS parameters. The first two parameters, \(\kappa _1\) and \(\kappa _2\), define the range for the maximum number of relocation moves to be performed. To bound this number from below, the parameter \(K_{\mathrm {min}}\) is used. At each iteration of the solution perturbation procedure, a list of the best relocation moves is built. The parameters \(L_{\mathrm {min}}\) and \(L_{\mathrm {max}}\) restrict the length of this list to be bounded from below and above, respectively. The role of all these ITS parameters is to guide the diversification of the search. Their appropriate values are selected experimentally. The parameters K and L of the get_restart_partition procedure, when used together, control the level of degradation of the objective function value due to the perturbation of the current partition p. The quality of the generated partition deteriorates with the increase in K and L. In contrast, small values of K and L may lead to producing a partition that is too close to the input solution p. From the pseudocode, it should be clear that the use of the set U guarantees that each vertex is moved from one subset to another at most once. Provided L is a constant and K is proportional to n (such choices are made in Sect. 5.2), the computational complexity of the procedure get_restart_partition is \(O(n^2m)\).
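
The sketch below is our brute-force reading of this perturbation scheme, using the parameter values chosen in Sect. 5.2. For clarity, the move gains are obtained by recomputing \(F_r\) from scratch, whereas ITS uses the incremental formulas of Sect. 2; the `max` guard on \(K_{\mathrm {max}}\) is a sketch-level safety, not part of the paper's procedure.

```python
import random

def F_r(W, q, m):
    """Ratio cut of the partition encoded by q (q[v] = subset of v);
    all m subsets are assumed nonempty."""
    n = len(W)
    C, size = [0.0] * m, [0] * m
    for v in range(n):
        size[q[v]] += 1
        C[q[v]] += sum(W[v][u] for u in range(n) if q[u] != q[v])
    return sum(C[i] / size[i] for i in range(m))

def get_restart_partition(W, q, m, kappa1=0.7, kappa2=1.0,
                          K_min=15, L_min=10, L_max=500):
    """Perform K random relocations, each drawn uniformly from a list of
    the L best candidate moves; every vertex is relocated at most once,
    and moves that would empty a subset are skipped."""
    n = len(W)
    K_max = max(K_min, random.randint(int(n * kappa1), int(n * kappa2)))
    K = random.randint(K_min, K_max)
    L = random.randint(L_min, L_max)
    moved = set()
    for _ in range(K):
        f0, cand = F_r(W, q, m), []
        for v in range(n):
            if v in moved or sum(1 for u in range(n) if q[u] == q[v]) == 1:
                continue
            k = q[v]
            for l in range(m):
                if l != k:
                    q[v] = l
                    cand.append((F_r(W, q, m) - f0, v, l))
            q[v] = k
        if not cand:
            break
        _, v, l = random.choice(sorted(cand)[:L])
        q[v] = l
        moved.add(v)
    return q

random.seed(1)
W = [[0, 2, 1, 0], [2, 0, 1, 0], [1, 1, 0, 3], [0, 0, 3, 0]]
print(get_restart_partition(W, [0, 0, 1, 1], m=2))
```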

[Algorithm 5: pseudocode of the local_search procedure]

We end this section by describing our implementation of the LS algorithm for the considered graph partitioning problems. As seen before, this algorithm is employed within the TS framework. However, perhaps more importantly, the LS procedure is the key ingredient of an MA, which is presented in the next section. The pseudocode of the LS procedure is given in Algorithm 5. As is typical in most LS implementations, the procedure executes in a number of iterations. Each iteration performs all the operations contained in the outer “while” loop (Lines 3–36). For a partition p at the beginning of the loop, let us consider a partition subset \(V_k\), \(k\in M\). Assume that the content of \(V_k\) does not change during the execution of the iteration. More precisely, no vertex \(v\in V_k\) moves to another subset (Line 14), and no vertex \(u\in V\setminus V_k\) is added to \(V_k\) (Line 14 when \(l=k\)). Moreover, no vertex \(v\in V_k\) is interchanged with a vertex \(u\in V\setminus V_k\) (Line 30). In this case, the entry \(s_k\) of the vector \(S=(s_1,\ldots ,s_m)\) at the end of the iteration (Line 36) is equal to 0. If at least one of the above conditions is not satisfied, then \(s_k=1\). Within the loop, some entries of S may temporarily be set to 2. Assume now that \(s_k=0\) and \(s_l=0\) for subsets \(V_k\) and \(V_l\). Then, in the next iteration, all the gains \(\delta \) and \(\varDelta \) expressed by (4) and (10) for \(V_k\) and \(V_l\), respectively, have nonnegative values. It follows that, in the next iteration, there is no need to examine moves in which both subsets, \(V_k\) and \(V_l\), are involved. This strategy, implemented in the local_search procedure, reduces the computational time needed to reach a locally optimal solution. A similar technique for accelerating neighborhood examination was proposed by Lai et al. [37] for the MDGP. These authors, however, considered relocation moves and swap moves separately. They used two 0–1 matrices of size \(m\times m\). The (i, j)-entry of the first matrix takes value 0 if and only if no relocation of a vertex from the ith group to the jth group has resulted in an improving solution. Similarly, the second matrix is defined with respect to the swap operation. Lai et al. [37] provided computational results that clearly demonstrated the efficiency of the proposed neighborhood exploration strategy.

As can be seen in Algorithm 5, each iteration of our LS procedure is composed of two phases. In the first of them (Lines 4–19), the neighborhood \(N_1\) of the current partition p is explored. In Line 4, \(M'(p)\) refers to the set of subsets in partition p whose size is greater than 1. Formally, \(M'(p)=\{k \mid k\in M, V_k\in p, |V_k|>1\}\), where \(M=\{1,\ldots ,m\}\) as before. If an improving solution is found, then flag \(\mu \) is set to true (Line 10). As a result, the current solution p and its value f are updated (Lines 14 and 15). In the second phase (Lines 20–35), a better-quality solution is searched for in neighborhood \(N_2(p)\). If such a solution is found among the neighbors of p, then it is used to replace p (Lines 30 and 31).

It can be observed from the pseudocode that the time complexity of the body (Lines 7–17) of the first inner “while” loop is O(n) and that of the body (Lines 23–33) of the second inner “while” loop is \(O(n^2)\). In obtaining these estimates, it is assumed that in the worst case, the size of partition subsets can be proportional to the graph order n. The number of times the “while” loops are executed depends on the graph and on the starting solution p. This number, however, is difficult to estimate.

4 A memetic algorithm

Evolutionary algorithms are an important class of approaches for solving various optimization problems. Among them, the memetic algorithm is one of the most successful techniques. The fundamental concept of the MA is to apply the genetic operators in combination with an LS procedure. In this section, we present an MA for ratio cut and normalized cut graph partitioning. As before, a description is given in terms of the first of these problems.

Like a GA, the MA manipulates a population of individuals where each individual represents a solution to the optimization problem being solved. To generate new members of the population, a crossover operator comes into play. Usually, it is a binary operator that combines the genes of two parents in some manner to produce an offspring. In the context of ratio cut and normalized cut, individuals in the population correspond to partitions of vertex set V. A convenient method for coding an individual is to use the mapping q introduced in Sect. 2. Recall that for a partition \(p=\{V_1,\ldots ,V_m\}\) and a vertex \(v\in V_i\), the value \(q(v)=i\) points to the partition subset to which vertex v belongs.

One of the key ideas of our proposed algorithm is to employ the crossover operator that is used in grouping genetic algorithms (GGAs) (see [31, 54]). In a GGA, the chromosomes represent the allocation of certain objects (e.g., vertices of the graph) to groups. When generating offspring from two parents, a GGA manipulates groups instead of group members. Our crossover operator randomly selects two individuals as parents and repeatedly transfers a subset of vertices from one of the selected parents to the offspring. If the resulting collection of subsets does not cover all the vertices of the graph, then a repair mechanism is triggered to obtain a feasible partition. The offspring partition replaces the worst individual in the current population if it is not worse than the worst individual and differs from all individuals in the population. Another component of our memetic algorithm is the LS procedure, which is precisely the same as that used in the ITS algorithm. Certainly, this procedure is applied to each generated offspring. Moreover, LS is used to improve randomly generated graph partitions when constructing an initial population of solutions.

[Algorithm 6: pseudocode of the get_offspring crossover procedure]

Let \(p'\) and \(p^{\prime \prime }\) be two partitions in \(\varPi \) that are submitted as an input to the crossover operator. The pseudocode of our GGA crossover implementation is given in Algorithm 6, where r, \(r'\), and \(r^{\prime \prime }\) are the current number of subsets in the offspring partition p and parent partitions \(p'\) and \(p^{\prime \prime }\), respectively, and where the resulting offspring is represented by the mapping q. The most important part of the procedure (the “while” loop spanning Lines 3–13) serves to transfer genetic information from the parents to the offspring. First, it equiprobably selects one of the parents and then finds a partition subset of this parent with the smallest value of the ratio R (given by (3)). Ties among subsets are broken by random selection. The chosen subset, \(V_k\), is moved from the parent partition \(p'\) (or \(p^{\prime \prime }\)) to the offspring partition (Lines 7 and 8 in the case of \(p'\)). Assume that the parent \(p'\) is involved in this operation. Then, each vertex of \(V_k\) is removed from the corresponding subset of the partition \(p^{\prime \prime }\) (Line 9). The iteration terminates by deleting empty subsets of \(p^{\prime \prime }\), if any, and updating their number \(r^{\prime \prime }\) accordingly. If the parent \(p^{\prime \prime }\) is selected, then similar operations with respect to \(p^{\prime \prime }\) are performed (Line 11). Upon emptying both \(p'\) and \(p^{\prime \prime }\) (then \(r'=r^{\prime \prime }=0\)), there may still be some vertices uncovered by offspring subsets. The set of such vertices in the pseudocode is denoted by U. If \(U\ne \emptyset \) and \(r<m\), then the priority is to create single-vertex subsets (Lines 17, 18). When r reaches m, every remaining vertex of U is assigned to a randomly chosen partition subset (Lines 20, 21). The resulting partition p completely covers the vertices of the graph. However, it may occur that p consists of an insufficient number of subsets. In this case, to repair the partition p, an additional “while” loop is used (Lines 27–36). In this part of the pseudocode of the algorithm, \({\tilde{M}}\) is the set of indices of the partition subsets that are used as potential candidates for splitting, and \(Z({\tilde{M}})\) denotes the average size of these subsets. Before entering the loop, \(Z({\tilde{M}})=n/r\) because initially \(\cup _{i\in {\tilde{M}}}V_i=V\) (Line 26). Assume that \(r<m\) at this point. Then, the partition p is refined by applying the subset splitting operation \(m-r\) times. Only subsets of size at least \(Z({\tilde{M}})\) (and trivially at least 2) are candidates for splitting. Each iteration starts by randomly selecting one of the subsets (Line 28). The chosen subset is randomly split into two subsets of nearly equal size (Line 32). Its index is removed from \({\tilde{M}}\), and the new value of \(Z({\tilde{M}})\) is calculated (Line 31). If no suitable subset is found, then a new round of splitting begins with an enlarged set \({\tilde{M}}\) (Line 34). Concluding the description of the crossover operator, we remark that an offspring can be generated efficiently.
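
To convey the flavor of this group-based crossover, we give a deliberately simplified sketch. The loop bounds, tie-breaking, and the splitting rule here are our own assumptions (we assume \(n\geqslant m\)); Algorithm 6 differs in its detailed handling of the two parents.

```python
import random

def gga_crossover(W, p1, p2, m):
    """Simplified sketch of a grouping crossover: whole subsets (lowest
    ratio R(V_i) = C(V_i)/|V_i| first) are transferred from a randomly
    chosen parent to the offspring, and the transferred vertices are
    deleted from the other parent. Leftover vertices become singletons
    while fewer than m subsets exist, then go to random subsets; finally,
    the largest subsets are split until the offspring has m subsets."""
    n = len(W)
    def R(S):
        out = [v for v in range(n) if v not in S]
        return sum(W[u][v] for u in S for v in out) / len(S)
    parents = [[set(S) for S in p1], [set(S) for S in p2]]
    offspring = []
    while len(offspring) < m and (parents[0] or parents[1]):
        j = random.choice([i for i in (0, 1) if parents[i]])
        S = min(parents[j], key=R)                 # smallest-ratio subset
        parents[j].remove(S)
        offspring.append(S)
        parents[1 - j] = [T - S for T in parents[1 - j] if T - S]
    uncovered = set(range(n)) - set().union(*offspring)
    for v in uncovered:
        if len(offspring) < m:
            offspring.append({v})                  # priority: singleton subsets
        else:
            random.choice(offspring).add(v)        # then random assignment
    while len(offspring) < m:                      # repair: split largest subset
        S = max(offspring, key=len)
        half = set(random.sample(sorted(S), len(S) // 2))
        offspring.remove(S)
        offspring += [half, S - half]
    return offspring

random.seed(3)
W = [[0] * 6 for _ in range(6)]
for (u, v, w) in [(0, 1, 2), (1, 2, 1), (2, 3, 1), (3, 4, 3), (4, 5, 2), (0, 5, 1)]:
    W[u][v] = W[v][u] = w
print(gga_crossover(W, [{0, 1, 2}, {3, 4, 5}], [{0, 1}, {2, 3}, {4, 5}], m=3))
```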

Proposition 4

The computational complexity of the procedure get_offspring is \(O(n^2)\).

Proof

Observe that get_offspring performs \(O(n^2)\) operations to compute the ratios \(R(V_i)\), \(i\in M\). This can be performed at the initialization stage of the procedure. Other parts of the crossover have less complexity. In particular, all iterations of the “while” loops 3–13 and 27–36 take O(nm) time. \(\square \)

[Algorithm 7: top-level pseudocode of the MA procedure]

The pseudocode of the top level of the MA for ratio cut graph partitioning is presented in Algorithm 7. After constructing an initial population P (Line 2), the algorithm iterates over the following four steps until a termination condition is satisfied: selection of a pair of individuals in the current generation as parents (Line 4), crossover of the parents, generation of an offspring p (Line 5), execution of an LS algorithm on the offspring p (Line 7), and evaluation of p (Line 8). Similar to MSA and ITS, the iterations stop when a maximum time limit is reached. The best solution in MA is denoted as \(p^{*}\) and its value is \(f^{*}\).

[Algorithm 8: pseudocode of the init_population procedure]

MA is based on the application of four procedures. Two of them have already been described: get_offspring earlier in this section and local_search in Sect. 3. The pseudocode of init_population is given in Algorithm 8. This procedure creates an initial population P of size z, where z is a parameter of MA. In the pseudocode, \(p^{*}\) is the best solution in the population, and \(f^{*}\) is its objective function value. At each iteration, the candidates for inclusion in P are a partition \({\tilde{p}}\in \varPi \) generated at random using the same routine as for MSA and a locally optimal solution p produced by local_search applied to \({\tilde{p}}\). A priority is assigned to partition p. If p differs from all solutions currently in P, then it is appended to the population. Otherwise, an attempt is made to add the partition \({\tilde{p}}\) to P.

The condition in Line 6 of Algorithm 8 is checked using an \(m\times m\) matrix, \(H=(h_{ij})\), whose entry \(h_{ij}\) is the number of vertices \(v\in V\) such that \(q(v)=i\) and \(q'(v)=j\), where q and \(q'\) are the mappings corresponding to partitions p and \(p'\), respectively. The following statement demonstrates that this condition as well as that in Line 8 can be checked rather easily.

Proposition 5

The computational complexity of checking whether partitions \(p\in \varPi \) and \(p'\in \varPi \) are different is O(n).

Proof

Clearly, constructing the matrix H takes only O(n) operations. The partitions p and \(p'\) are different if and only if the matrix H constructed for them has more than m nonzero entries. Taken together, these two observations prove the claim. \(\square \)
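
The check is easy to implement; the sketch below uses a set of occupied (i, j) cells in place of the explicit matrix H and assumes, as in the paper's context, that both partitions consist of m nonempty subsets.

```python
def partitions_differ(q1, q2, m):
    """O(n) check of Proposition 5: the partitions encoded by q1 and q2
    (q[v] = subset index of v) are different iff the contingency matrix H
    has more than m nonzero entries."""
    nonzero = {(i, j) for i, j in zip(q1, q2)}
    return len(nonzero) > m    # equal partitions occupy exactly m cells

# The same partition under a relabeling of subsets is not "different":
print(partitions_differ([0, 0, 1, 1, 2], [2, 2, 0, 0, 1], m=3))  # False
print(partitions_differ([0, 0, 1, 1, 2], [0, 1, 1, 2, 2], m=3))  # True
```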

[Algorithm 9: pseudocode of the evaluate_offspring procedure]

Algorithm 9 gives the pseudocode of the procedure evaluate_offspring. It attempts to replace the worst individual in population P with generated offspring p. This procedure is also responsible for memorizing the best solution found.

5 Computational experiments

The purpose of this section is to examine the computational performance of the described algorithms for ratio cut and normalized cut graph partitioning. We present comparative experiments on both randomly generated graphs and well-known benchmark graphs taken from the literature.

5.1 Experimental setup

All the algorithms were coded in the C++ programming language, and the tests were carried out on a laptop with an Intel Core i5-6200U CPU running at 2.30 GHz. We remark that the code is designed to be applicable to graphs of arbitrary density. To meet this requirement, the graph is represented in memory by its edge-weight matrix. This puts a limit on the size of the graphs the code can deal with. Our computer can handle graphs with up to approximately \(10^4\) vertices.

We performed our experiments on the following two sets of graphs:

  (a)

    complete undirected graphs with edge weights generated in the following two steps. First, each vertex is assigned a point whose coordinates are sampled randomly and uniformly from a rectangle. Then, for each pair of vertices \(u,v\in V\), the weight of the edge \((u,v)\in E\) is computed as \(c_{uv}=\min (1/d_{uv},100)\), where \(d_{uv}\) is the Euclidean distance between the points corresponding to the vertices u and v (a minimal generator sketch is given after this list). The ensemble of graphs generated in this way is split into five subsets according to the number of vertices, which is 200, 500, 1000, 2000, and 3000, respectively. Each subset consists of five graphs.

  (b)

    A collection of 36 benchmark graphs. It consists of the following three sets: the 10th DIMACS Implementation Challenge Benchmark [2] (first 18 graphs in the tables to follow, that is, from karate to add32, inclusively), 12 graphs from the Network Data Repository [56] (from Trefethen-200 to bio-dmela in the tables), and 6 social network samples [9] (last 6 graphs in the tables, that is, from soc52 to pokec_2000). We remark that the graph named netscience has more than 100 isolated vertices. We deleted them, reducing the size of netscience from 1589 to 1461.
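
The following sketch generates one graph of set (a). The rectangle dimensions are our assumption; the paper only states that a rectangle is used.

```python
import math
import random

def random_geometric_weights(n, width=100.0, height=100.0, cap=100.0):
    """Set (a) generator sketch: vertices get uniform random points in a
    rectangle; edge weights are c_uv = min(1/d_uv, cap), with d_uv the
    Euclidean distance between the points of u and v."""
    pts = [(random.uniform(0, width), random.uniform(0, height)) for _ in range(n)]
    W = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            d = math.dist(pts[u], pts[v])
            W[u][v] = W[v][u] = min(1.0 / d, cap) if d > 0 else cap
    return W

W = random_geometric_weights(200)   # one graph of the smallest size class
```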

The dataset (a) and the source codes of the presented algorithms are publicly available at http://www.personalas.ktu.lt/~ginpalu/grpart.html.

In our computational experiments, we ran each algorithm 10 times on each graph in sets (a) and (b). Maximum CPU time limits for a run of an algorithm were as follows: 30 s for \(n\leqslant 200\), 200 s for \(200<n\leqslant 500\), 500 s for \(500<n\leqslant 1000\), 1000 s for \(1000<n\leqslant 2000\), 2000 s for \(2000<n\leqslant 4000\), 4000 s for \(4000<n\leqslant 6000\), and 8000 s for \(n>6000\). To assess the performance of the algorithms, we use the following measures: the objective function value of the best solution out of 10 runs, the average objective function value of 10 solutions, and the average time taken to find the best solution in a run.

5.2 Parameter settings

The main control parameters of the MSA algorithm are the cooling factor \(\alpha \), the final temperature \(T_{\mathrm {min}}\), and the number of iterations at each temperature level, \(\beta \). Following recommendations from the SA literature [57, 62], we set \(\alpha \) to 0.95, \(T_{\mathrm {min}}\) to 0.0001, and \(\beta \) to 100n. The specific parameter of our implementation of SA is the move type selection probability Q. We tested values of Q from 0 to 1 in increments of 0.1. To this end, we ran MSA on a training sample consisting of 10 complete graphs whose edge weights are generated using the same procedure as for set (a). Five of the graphs in this sample are of order \(n=200\), and the other five are of order \(n=500\). The number of partition subsets is 5 for \(n=200\) and 10 for \(n=500\). The sample is disjoint from set (a), which is reserved for the testing stage. The performance of the MSA configurations was measured in terms of the average objective function value over 10 runs. The experiment in the case of ratio cut has shown that our MSA algorithm is quite robust to variation in the parameter Q in the interval [0, 0.9]. We also found that its solutions are substantially worse for \(Q=1\). This implies that using only swap operations is not a good strategy. Within the interval [0, 0.9], marginally better performance was observed for \(Q=0.1\). Therefore, we set Q to 0.1. A similar experiment was conducted with MSA for the case of normalized cut graph partitioning. The conclusions were analogous to those of the previous experiment. Based on the results obtained, we decided, for the normalized cut case, to set Q to 0.2.

The parameters of our ITS algorithm are \(\tau \), \(\kappa _1\), \(\kappa _2\), \(K_{\mathrm {min}}\), \(L_{\mathrm {min}}\), \(L_{\mathrm {max}}\), and I (see Sect. 3). To determine good values for these parameters, we experimented with the same training set of graphs as used for the MSA algorithm. We relied on a simple parameter-setting procedure: one parameter at a time is allowed to take a number of predefined values while the other parameters are kept fixed at reasonable values chosen during preliminary tests. We ran ITS for both problems, ratio cut and normalized cut minimization. Because of their close similarity, it was possible to identify a common set of good parameter values for both problems. First, we varied \(\kappa _1\) from 0.1 to 0.9 in increments of 0.1. The related parameter \(\kappa _2\) (\(>\kappa _1\)) was fixed at 1, which is the maximum possible value. The range of acceptable values for \(\kappa _1\) was found to be 0.5 to 0.8. We fixed \(\kappa _1\) at 0.7. Then, we attempted to decrease the value of \(\kappa _2\). Respecting the condition \(\kappa _2>\kappa _1\), we tested two values of \(\kappa _2\): \(\kappa _2=0.8\) and \(\kappa _2=0.9\). However, none of them led to better results. Therefore, we fixed \(\kappa _2\) at 1 for all further experiments with ITS. We continued by examining the following 6 values of the parameter \(K_{\mathrm {min}}\): 1, 5, 10, 15, 20, and 50. The results showed little sensitivity of ITS to this parameter. We arbitrarily set \(K_{\mathrm {min}}\) to 15. The next step was to analyze the effect of the parameter \(L_{\mathrm {min}}\). We ran ITS with \(L_{\mathrm {min}}\in \{3,5,10,25,50,100,200\}\). We observed that the algorithm was fairly robust to the choice of \(L_{\mathrm {min}}\). Slightly better results were obtained with smaller values of \(L_{\mathrm {min}}\). Based on this finding, we set \(L_{\mathrm {min}}=10\). In the next experiment, we tried the following values of the parameter \(L_{\mathrm {max}}\): 10, 25, 50, 100, 200, 300, 500, 700, 1000, 1500, 2000, and 3000. The ITS algorithm showed consistently good performance when \(L_{\mathrm {max}}\) varied from 200 to 2000. Its performance deteriorated for \(L_{\mathrm {max}}\leqslant 100\) and \(L_{\mathrm {max}}=3000\). We decided to fix \(L_{\mathrm {max}}\) at 500. Further, we investigated how the performance of ITS depends on the tabu tenure parameter \(\tau \). We varied \(\tau \) from 5 to 30 with a step of 5. Additionally, we tested \(\tau =3\). The range of acceptable values for \(\tau \) was found to be quite wide. Results of very similar quality were obtained for \(\tau \in \{10,\ldots ,30\}\). We fixed \(\tau \) at 20, which was the middle value in this set. Perhaps a more important parameter of TS in the ITS framework is the number of iterations I. We ran ITS for the following values of I: 20, 50, 100, 200, 400, and 600. The best performance of ITS was observed when using \(I\in \{100,200\}\), with a slight edge to \(I=100\). The algorithm performed worse for \(I=600\) and, especially, for \(I=20\). Considering these findings, we elected to set I to 100.

Table 1 Number of partition subsets (m) and time limit (in seconds) for random graphs
Table 2 Best results for ratio cut when running MSA, ITS, and MA on random graphs
Table 3 Average results for ratio cut when running MSA, ITS, and MA on random graphs (the time is in seconds)
Fig. 3 Best solutions for g-200-5. a Ratio cut (\(F_r=27.843409, F_n=1.460579\)). b Normalized cut (\(F_r=48.911503, F_n=1.266439\))

The only parameter of our memetic algorithm is the population size z. We performed trial runs of MA with values of z up to 500 individuals. Based on the results obtained, we fixed z at 100. The performance of MA decreased when using smaller values of z. By increasing z above 100, the results on the training set of graphs appeared to be equally good or even marginally better than in the case of \(z=100\). However, MA with such z values becomes unduly time-consuming when used to partition large graphs. This comes from the fact that each individual of the initial population is submitted to an LS procedure that, for large graphs, takes a significant amount of time. Thus, the population size in our memetic algorithm should not be too large, and \(z=100\) is a good choice.

5.3 Numerical results for ratio cut

In this section, we present the computational results obtained by the proposed algorithms for ratio cut graph partitioning. To compare each pair of algorithms, we apply the Wilcoxon signed-rank test.
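For reproducibility, a minimal C++ implementation of the test, using the usual normal approximation, might look as follows. This is an illustrative sketch rather than the statistical routine we actually used, and the function names are ours.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Two-sided p-value of the Wilcoxon signed-rank test for paired samples
// x and y (e.g., per-graph objective values of two algorithms), using the
// normal approximation, which is adequate for our instance counts.
double wilcoxonSignedRank(const std::vector<double>& x,
                          const std::vector<double>& y) {
    std::vector<double> d;
    for (std::size_t i = 0; i < x.size(); ++i)
        if (x[i] != y[i]) d.push_back(x[i] - y[i]);  // drop zero differences
    const std::size_t n = d.size();
    if (n == 0) return 1.0;                          // identical results

    // Rank |d| in increasing order, averaging ranks within tie groups.
    std::vector<std::size_t> idx(n);
    for (std::size_t i = 0; i < n; ++i) idx[i] = i;
    std::sort(idx.begin(), idx.end(), [&](std::size_t a, std::size_t b) {
        return std::fabs(d[a]) < std::fabs(d[b]);
    });
    std::vector<double> rank(n);
    for (std::size_t i = 0; i < n;) {
        std::size_t j = i;
        while (j + 1 < n &&
               std::fabs(d[idx[j + 1]]) == std::fabs(d[idx[i]])) ++j;
        const double avg = 0.5 * (i + j) + 1.0;      // average rank, 1-based
        for (std::size_t k = i; k <= j; ++k) rank[idx[k]] = avg;
        i = j + 1;
    }

    // Sum of the ranks of the positive differences.
    double wPlus = 0.0;
    for (std::size_t k = 0; k < n; ++k)
        if (d[k] > 0.0) wPlus += rank[k];

    // Normal approximation to the null distribution of W+.
    const double mean = n * (n + 1) / 4.0;
    const double sd = std::sqrt(n * (n + 1) * (2.0 * n + 1.0) / 24.0);
    const double z = (wPlus - mean) / sd;
    return std::erfc(std::fabs(z) / std::sqrt(2.0)); // = 2 * (1 - Phi(|z|))
}
```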

Table 4 Summary of the results obtained by the tested algorithms for ratio cut on random graphs
Fig. 4 Time taken by MSA, ITS, and MA to find the best solution in a run (the case of ratio cut). a Random graphs. b Benchmark graphs

We began our comparison by testing the performance of each algorithm on random graphs (set (a) in Sect. 5.1). The size of the graphs, n, and the number of partition subsets, m, are shown in Table 1; the time limits (last column of the table) were specified in Sect. 5.1. Tables 2 and 3 summarize the results of the experiment, where the first column identifies the graph and the remaining columns report the performance of MSA, ITS, and MA. Columns 2–4 of Table 2 contain the objective function value of the best solution out of 10 runs, denoted as \(F_r^{*}\). The best value of \(F_r^{*}\) for each instance is highlighted in boldface. The two columns of Table 3 for each algorithm give the average objective function value of the 10 solutions, denoted as \({\bar{F}}_r\), and the average time (in seconds) taken to reach the last improvement in solution quality. The best value of \({\bar{F}}_r\) among all the algorithms is indicated in boldface.

Table 5 Best results for ratio cut when running MSA, ITS, and MA on benchmark graphs from the literature
Table 6 Average results for ratio cut when running MSA, ITS, and MA on benchmark graphs from the literature (the time is in seconds)

An example solution for a random graph is presented in Fig. 3a. It is the best solution for g-200-5 obtained by the tested algorithms. It was found in two runs (out of 10) of MSA, two runs of ITS, and all 10 runs of MA.

Table 2 shows that MA performs considerably better than the other two algorithms. MA obtained the best solution for more than \(90\%\) of the graphs (23 out of 25), whereas MSA and ITS produced the best result for 5 and 4 graphs, respectively. Comparing average values (Table 3), we observe an even more pronounced dominance of MA, which yielded the best results for 24 graphs. The average performance of the other two algorithms was much worse: ITS obtained the best average values for only two graphs, whereas MSA failed to do so for any graph in the set.

The results of the pairwise comparison of the algorithms are shown in Table 4, where the names of the algorithms are given in the first column. The second column identifies the values being compared: “Best” denotes the values displayed in the columns of Table 2 labeled \(F_r^{*}\), and “Average” denotes those in the columns of Table 3 labeled \({\bar{F}}_r\). Columns 3 to 5 of Table 4 show the comparison results: #wins, #ties, and #losses count the number of graphs on which the first algorithm in the pair finds a better, an equally good, or an inferior solution compared with the second algorithm. Additionally, we applied the Wilcoxon signed-rank test to each pair of algorithms; the resulting p-values are reported in the penultimate column. At the standard significance level of 0.05, a “Yes” in the last column indicates that the results of the first algorithm in the pair are significantly better than those of the second. Table 4 clearly shows that MA is superior to both MSA and ITS. Comparing the latter two approaches, a clear advantage of MSA over ITS is observed.

Figure 4a compares the tested algorithms in terms of computational time. Each point in the plot represents the average value taken over 5 graphs. The results indicate that the algorithms can be ranked, from fastest to slowest, in the following order: ITS, MSA, and MA.

In Tables 5 and 6, we compare the algorithms computationally on the graphs of set (b). The second and third columns give the size of the graph and the number of partition subsets, respectively. The labels of the other columns have the same meaning as in Tables 2 and 3. Analyzing the results in Tables 5 and 6, we find that MA performs much better than the other two algorithms: it produced the best solution for 35 graphs (out of 36), whereas ITS and MSA obtained the best solutions for 12 and 9 benchmark graphs, respectively. Moreover, MA achieved the best average result for 35 graphs, whereas each of MSA and ITS did so for only 7 graphs. The only graph on which MA was defeated (by MSA) was add20. We also notice that Table 5 includes 7 graphs (karate, dolphins, polbooks, football, can-292, ia-infect-dublin, and soc52) for which all three heuristics were capable of finding the best solution. A summary of the pairwise comparison results is given in Table 7, where the entry “No” in the last column means that there is no significant difference between the results of the two compared algorithms. From the table, it can be concluded that MA is by far the best performing algorithm for the (b) set of graphs: its number of losses is at most 1. We can also see that there is no marked difference between the results of MSA and ITS at the selected significance level of 0.05.

Table 7 Summary of the results obtained by the tested algorithms for ratio cut on benchmark graphs from the literature

The computational time taken by each of the tested algorithms is depicted in Fig. 4b. To construct the plots shown there, we split the interval [250, 7750] into subintervals of length 500 and then, for each subinterval, identified all graphs whose number of vertices falls within it. In this way, we obtained 10 nonempty subsets of graphs; for example, the subset corresponding to the subinterval [250, 750) consists of 10 graphs. Each point in the plot represents the average time taken over all graphs in the corresponding subset and is placed at the x-coordinate equal to the midpoint of the corresponding subinterval. We excluded from consideration the 8 smallest graphs (of order less than 250) because the algorithms, especially ITS and MA, require very little time to partition them. The rightmost points in the plots were obtained for a single graph, namely bio-dmela, and should perhaps be treated as outliers when comparing the algorithms. We can see in Fig. 4b that ITS tends to take less CPU time than MSA and MA. The latter two algorithms are comparable in terms of computation time: MA was slightly faster than MSA for graphs of order \(n<2700\), and MSA was faster than MA for graphs with \(n>2700\). The running times of the tested algorithms for individual graphs are listed in Table 6. A sketch of the binning procedure used to build these plots is given below.
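The following C++ fragment sketches this binning; the GraphResult record and the function names are illustrative stand-ins, not our plotting code.

```cpp
#include <map>
#include <utility>
#include <vector>

struct GraphResult { int n; double time; };  // graph order, time to best

// Group the graphs by order into subintervals of length 500 covering
// [250, 7750]; each nonempty group contributes one point at the subinterval
// midpoint whose y-value is the average time over the group.
std::map<double, double> binnedAverages(const std::vector<GraphResult>& rs) {
    std::map<double, std::pair<double, int>> acc;  // midpoint -> (sum, count)
    for (const auto& r : rs) {
        if (r.n < 250 || r.n >= 7750) continue;    // outside the plotted range
        const int bin = (r.n - 250) / 500;         // subinterval index
        const double mid = 500.0 + 500.0 * bin;    // midpoint of [250+500*bin, ...)
        acc[mid].first += r.time;
        acc[mid].second += 1;
    }
    std::map<double, double> points;
    for (const auto& [mid, sc] : acc) points[mid] = sc.first / sc.second;
    return points;
}
```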

5.4 Numerical results for normalized cut

In the second phase of experimentation, we tested the developed algorithms for the graph partitioning problem with the normalized cut objective function. Their performance was evaluated using the same scenario as in the previous section.

Table 8 Best results for normalized cut when running MSA, ITS, and MA on random graphs
Table 9 Average results for normalized cut when running MSA, ITS, and MA on random graphs (the time is in seconds)

First, we tested MSA, ITS, and MA on the set of random graphs. The results are reported in Tables 8 and 9. The objective function value of the best solution (out of 10 runs) is denoted by \(F_n^{*}\), and the average objective function value of the 10 solutions is denoted by \({\bar{F}}_n\). Contrasting the results in Tables 2 and 3 with those in Tables 8 and 9, we see that MA showed equally strong performance in solving both partitioning problems on random graphs. Table 8 shows that MA produced the best solution for 24 graphs in the set, whereas MSA and ITS obtained the best result for 3 and 6 graphs, respectively. From Table 9, we observe that MA provided the best average solutions for 23 random graphs, whereas each of the other two algorithms achieved the best average value for a single graph. The results of the pairwise comparison of the algorithms, shown in Table 10, statistically support the assertion that MA performs better than MSA and ITS. Looking at the statistics for MSA and ITS, we observe a significant difference in favor of MSA only in the case of average solutions.

Table 10 Summary of the results obtained by the tested algorithms for normalized cut on random graphs

Figure 3b shows the best solution found for graph g-200-5. This solution was achieved by all three heuristics; in particular, MA arrived at it in each of its 10 runs. The solutions for ratio cut (Fig. 3a) and normalized cut (Fig. 3b) are quite different. The ratio cut objective function value of the second solution is 48.911503, much worse than that of the first one (27.843409). In contrast, the first solution is inferior to the second in terms of the normalized cut objective function: its value is 1.460579, against 1.266439 for the second solution.

Fig. 5 Time taken by MSA, ITS, and MA to find the best solution in a run (the case of normalized cut). a Random graphs. b Benchmark graphs

The execution time of the tested algorithms is plotted in Fig. 5a. The ranking of the algorithms from fastest to slowest was ITS, MSA, and MA, exactly the same as in the case of ratio cut minimization. It can also be noted that MA was marginally faster than ITS and MSA for small graphs (with \(n\leqslant 500\)).

Table 11 Best results for normalized cut when running MSA, ITS, and MA on benchmark graphs from the literature
Table 12 Average results for normalized cut when running MSA, ITS, and MA on benchmark graphs from the literature (the time is in seconds)

Tables 11 and 12 summarize the results of MSA, ITS, and MA on the set of benchmark graphs from the literature. Inspecting these tables reveals the clear superiority of MA over the other approaches in our evaluation. This algorithm produced the best partition for each of the 36 graphs, whereas MSA and ITS yielded the best solution for 8 and 12 graphs in the set, respectively. Looking at the average values \({\bar{F}}_n\), we found that MA obtained the best result for all graphs, while MSA and ITS matched this performance for only 6 and 8 benchmark graphs, respectively. We also performed pairwise comparisons of the heuristics. The results are displayed in Table 13, from which it is apparent that MA dominates the other two algorithms; its superiority is even more pronounced than in the random graph case. From the last two rows in Table 13, we can conclude that the quality of the solutions delivered by MSA and ITS is comparable.

Table 13 Summary of the results obtained by the tested algorithms for normalized cut on benchmark graphs from the literature
Table 14 Comparison of the memetic algorithm with the state-of-the-art algorithm, VNS, of Hansen et al. [27] for normalized cut partitioning of random graphs (the time is in seconds)
Table 15 Comparison of MA versus VNS: a summary of the results

Figure 5b illustrates the computational time of the tested algorithms as a function of the graph size. As in the ratio cut case, ITS took less time to find the best solution in a run than MSA and MA; in particular, ITS dominated MSA over the whole range of graph sizes. We also observe that MA was the fastest approach for small graphs (with \(n\leqslant 1500\)); however, with increasing n, it became the slowest algorithm. The timing results for individual graphs are shown in Table 12.

5.5 Comparison with the state of the art

For comparison purposes, we implemented the VNS algorithm of Hansen et al. [27]. Like the algorithms in this paper, VNS was coded in the C++ programming language. The experiments were conducted with the parameter values used in [27]. We restricted ourselves to comparing the best of our algorithms, MA, against VNS. We recall that Hansen et al. [27] developed their algorithm for solving the normalized cut graph partitioning problem.

In Table 14, we report the results achieved by VNS and MA for random graphs. The three columns for each algorithm report \(F_n^{*}\), \({\bar{F}}_n\), and the average time for reaching the best result. The best value of \(F_n^{*}\) and \({\bar{F}}_n\) for each instance is shown in boldface and in italics, respectively. It can be seen that, in terms of \(F_n^{*}\), MA performs better than or equally well as VNS for all graphs of order up to 2000 except g-2000-4. A different conclusion is reached for graphs of order 3000, where VNS produced better partitions than MA. We also observe that, for \(n\geqslant 1000\), the average time taken by VNS to find the best solution in a run is significantly shorter than that of MA. This means that, in the remainder of a run, VNS spends more time than MA without improving the quality of its output solution.

We summarize the results of the experiment in the first two rows of Table 15. As before, the Wilcoxon signed-rank test at the significance level of 0.05 was used to compare the performance of the two algorithms. The test demonstrated a statistically significant difference in the average quality of the solutions obtained by MA and VNS; indeed, MA outperformed VNS on 21 of the 25 random graphs. However, there was no statistically significant difference between the results of the two algorithms in the case of the best solutions.

Table 16 Comparison of the memetic algorithm with the state-of-the-art algorithm, VNS, of Hansen et al. [27] for normalized cut partitioning of benchmark graphs (the time is in seconds)

Table 16 shows the results of the experiment on the set of benchmark graphs. The last six columns of this table have the same structure as in Table 14. We observe that MA surpassed or matched the VNS algorithm on all graphs except uk. We also see that VNS could find the best result for 12 of the 36 graphs; however, almost all of these graphs are small. To compare the quality of the solutions produced by the algorithms more accurately, we calculate the relative difference between their objective function values using the following formula

$$\begin{aligned} e=\frac{F_{\mathrm {VNS}}-F_{\mathrm {MA}}}{\min (F_{\mathrm {VNS}},F_{\mathrm {MA}})}\cdot 100\%, \end{aligned}$$
(17)

where e is the relative difference, and \(F_{\mathrm {VNS}}\) and \(F_{\mathrm {MA}}\) denote the objective function values achieved by VNS and MA, respectively. Figure 6 shows the relative differences calculated using (17) for the \(F_{\mathrm {VNS}}\) and \(F_{\mathrm {MA}}\) values contained in the columns of Table 16 labelled \(F_n^{*}\). In this figure, we omit netscience, for which e is not defined, as well as the graphs for which \(e<2\%\). The results demonstrate that MA significantly outperforms the VNS algorithm: the average relative difference (calculated over 35 graphs) between the \(F_n^{*}\) values of VNS and MA is \(35\%\).
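For illustration, with hypothetical values \(F_{\mathrm {VNS}}=1.50\) and \(F_{\mathrm {MA}}=1.20\), formula (17) gives \(e=((1.50-1.20)/1.20)\cdot 100\%=25\%\). Positive values of e thus indicate instances on which MA produced the better (smaller) objective function value, and negative values indicate the opposite.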

Fig. 6 Relative difference between the best objective function values obtained by the MA and VNS methods

From Table 16, we find that the advantage of MA over VNS becomes even more prominent when considering the average performance of the algorithms. The VNS heuristic achieved the best average result only for uk and the 5 smallest graphs. There are a number of graphs for which VNS, unlike MA, failed to find the best solution in all 10 runs; three of them (jazz, Trefethen-200, gplus_200) have 200 or fewer vertices. Figure 7 depicts the percentage differences calculated by substituting the \({\bar{F}}_n\) values of VNS and MA into (17). The figure lists only those graphs for which \(e>4\%\). The results again indicate a very clear superiority of MA over the VNS algorithm: the average relative difference between VNS and MA in terms of the \({\bar{F}}_n\) value is \(48\%\). Statistically significant differences between the algorithms are confirmed by the Wilcoxon signed-rank test in the last two rows of Table 15.

Fig. 7 Relative difference between the average objective function values of solutions found by the MA and VNS methods

Evaluating the computational speed of the MA and VNS algorithms reveals that they take a similar amount of time to find the best solution in a run (compare the time columns in Table 16); this is especially evident for graphs of large order. Overall, the average time taken by VNS was 1064 s, and that taken by MA was 1004 s.

5.6 Comparison with a different variant of the crossover operation

The numerical results reported in Sects. 5.4 and 5.5 demonstrate the effectiveness and excellent performance of the MA in solving the normalized cut partitioning problem for benchmark graphs as well as for random graphs of order up to 2000. However, MA was less successful for random graphs with 3000 vertices. It is difficult to pinpoint the reason for this shortcoming. We investigated the behavior of the offspring generation procedure get_offspring. In the first stage of this procedure, all vertices of the graph are assigned to offspring subsets. If the required number of subsets is reached, the procedure stops (Line 25 of Algorithm 6); otherwise, a subset splitting operation is performed until m partition subsets are created (Lines 26–36 of Algorithm 6). We observed a much more frequent use of this operation for random graphs with \(n=3000\) than for benchmark graphs and small random graphs. For random graphs of order 3000, the splitting operation was applied on average to \(79.9\%\) of all generated offspring (the average is taken over 5 graphs and 10 runs for each of them). This percentage falls to \(49.1\%\) for random graphs with \(n=2000\) and is less than \(1\%\) for all benchmark graphs except football and soc52, which are very easy to solve for all tested algorithms. One might conjecture that frequent use of the subset splitting operation has a negative impact on the performance of the proposed memetic algorithm.

In the process of MA design, we developed a version of the algorithm with a different variant of the crossover operation, one that differs from those typically used in GGAs. We restrict ourselves to a brief outline of this alternative version of the MA because it generally showed inferior performance compared to the MA presented in Sect. 4. The alternative crossover procedure uses the matrix \(H=(h_{ij})\) introduced in Sect. 4. Suppose that H is constructed for parent partitions \(p'\) and \(p^{\prime \prime }\), and let \(q'\) and \(q^{\prime \prime }\) be the mappings corresponding to \(p'\) and \(p^{\prime \prime }\), respectively. The procedure first identifies the m largest entries of the matrix H. Their set (denoted by \(\varPsi \)) is used to construct a one-to-one mapping g from \(\{(i,j)\mid h_{ij}\in \varPsi \}\) to M; basically, g can simply be viewed as an arbitrary labeling of the elements of \(\varPsi \) by the integers in the set \(\{1,\ldots ,m\}\). Having the mapping g, the procedure defines the sets \(A_i=\{g(i,k)\mid k\in \{1,\ldots ,m\}, h_{ik}\in \varPsi \}\), \(i\in M\), and \(B_j=\{g(k,j)\mid k\in \{1,\ldots ,m\}, h_{kj}\in \varPsi \}\), \(j\in M\). Then, the vertices of the graph are processed one by one. For \(v\in V\), let \(i=q'(v)\) and \(j=q^{\prime \prime }(v)\). If i and j are such that \(h_{ij}\in \varPsi \), then q(v) is set to g(i, j). Otherwise, q(v) is assigned a partition subset index chosen randomly from the set \(A_i\cup B_j\) if this set is nonempty, and from the set M otherwise. The resulting offspring partition is represented by the mapping q. We use MA\('\) to refer to the version of the MA with the just outlined crossover operator; a sketch of the operator is given below.
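For concreteness, the following C++ sketch reconstructs this crossover from the description above. It assumes, as in Sect. 4, that \(h_{ij}\) counts the vertices assigned to subset i by \(p'\) and to subset j by \(p^{\prime \prime }\); subset indices are 0-based here, ties among the m largest entries are broken arbitrarily, and all names are ours rather than those of the actual implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Alternative crossover of MA': builds the offspring mapping q from parent
// mappings q1 (= q') and q2 (= q''), both of size n with values in {0..m-1}.
std::vector<int> alternativeCrossover(const std::vector<int>& q1,
                                      const std::vector<int>& q2,
                                      int m, std::mt19937& rng) {
    const int n = static_cast<int>(q1.size());

    // Matrix H: h[i][j] = number of vertices in subset i of p' and subset j of p''.
    std::vector<std::vector<int>> h(m, std::vector<int>(m, 0));
    for (int v = 0; v < n; ++v) ++h[q1[v]][q2[v]];

    // Psi: positions of the m largest entries of H (ties broken arbitrarily).
    std::vector<std::pair<int, std::pair<int, int>>> entries;
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < m; ++j)
            entries.push_back({h[i][j], {i, j}});
    std::partial_sort(entries.begin(), entries.begin() + m, entries.end(),
                      [](const auto& a, const auto& b) { return a.first > b.first; });

    // g: arbitrary one-to-one labeling of the elements of Psi by {0..m-1}.
    std::map<std::pair<int, int>, int> g;
    for (int k = 0; k < m; ++k) g[entries[k].second] = k;

    // A[i] collects labels of Psi entries in row i, B[j] those in column j.
    std::vector<std::vector<int>> A(m), B(m);
    for (const auto& [cell, label] : g) {
        A[cell.first].push_back(label);
        B[cell.second].push_back(label);
    }

    // Process the vertices one by one.
    std::vector<int> q(n);
    for (int v = 0; v < n; ++v) {
        const int i = q1[v], j = q2[v];
        const auto it = g.find({i, j});
        if (it != g.end()) { q[v] = it->second; continue; }  // h_ij in Psi
        std::vector<int> cand = A[i];                        // A_i U B_j
        cand.insert(cand.end(), B[j].begin(), B[j].end());
        if (cand.empty()) {                                  // fall back to M
            q[v] = std::uniform_int_distribution<int>(0, m - 1)(rng);
        } else {
            q[v] = cand[std::uniform_int_distribution<std::size_t>(
                0, cand.size() - 1)(rng)];
        }
    }
    return q;
}
```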

Table 17 Comparison of MA and VNS with MA\('\) for normalized cut partitioning of large random graphs

We experimentally compared MA\('\) with MA and concluded that the performance of MA\('\) was inferior to that of MA for benchmark graphs and for smaller random graphs. However, MA\('\) is more effective than MA when solving the normalized cut graph partitioning problem for random graphs of order 2000 and 3000. This can clearly be seen in Table 17, where the results for VNS from Table 14 are repeated for ease of comparison. MA\('\) produced the best solution for 8 graphs, whereas MA and VNS yielded the best partition for only one graph. We also observe that MA\('\) obtained the best average results for 9 of the 10 graphs. By comparing the time columns for MA in Table 14 and MA\('\) in Table 17, we found that MA\('\) was more than twice as fast as MA. However, as mentioned before, MA\('\) could not successfully compete with MA when the comparison was performed over all graphs in our test suite; in particular, MA yielded solutions of better or equal quality compared to MA\('\) for almost all (34 out of 36) benchmark graphs. We therefore do not provide detailed experimental results for MA\('\) in order to save space.

5.7 Comparisons with the genetic algorithm and multistart local search

To further evaluate the MA performance, we compared it with a genetic algorithm for normalized cut graph partitioning, constructed simply by removing all calls to the LS procedure from MA. We refer to this algorithm as GA1. We also tested another configuration of the genetic algorithm, obtained from MA by removing the calls to local_search from the offspring generation steps only; in other words, we modified MA by deleting Step 7 of Algorithm 7 while still using local_search to generate the initial population. We call this configuration GA2. Additionally, we evaluated the performance of the multistart local search method, which we refer to as MLS. This simple and straightforward method consists of repeatedly applying local_search to randomly generated starting partitions of the graph (a sketch is given below).
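A minimal C++ skeleton of MLS under these conventions might look as follows; the localSearch and objective callables stand for the paper's procedures, and both they and the crude repair step in randomPartition are assumptions of this illustration.

```cpp
#include <chrono>
#include <functional>
#include <limits>
#include <random>
#include <vector>

using Partition = std::vector<int>;  // vertex -> subset index (0-based)

// Random starting partition; a crude repair pass guarantees that none of
// the m subsets is empty (n >= m is assumed).
Partition randomPartition(int n, int m, std::mt19937& rng) {
    std::uniform_int_distribution<int> pick(0, m - 1);
    Partition p(n);
    for (int v = 0; v < n; ++v) p[v] = pick(rng);
    for (int k = 0; k < m; ++k) p[k] = k;  // vertex k seeds subset k
    return p;
}

// MLS: restart local search from random partitions until the time limit
// expires and return the best local optimum encountered.
Partition multistartLS(int n, int m, double timeLimitSec,
                       const std::function<void(Partition&)>& localSearch,
                       const std::function<double(const Partition&)>& objective,
                       std::mt19937& rng) {
    const auto start = std::chrono::steady_clock::now();
    Partition best;
    double bestVal = std::numeric_limits<double>::infinity();
    while (std::chrono::duration<double>(std::chrono::steady_clock::now() -
                                         start).count() < timeLimitSec) {
        Partition p = randomPartition(n, m, rng);
        localSearch(p);                        // descend to a local optimum
        const double val = objective(p);       // F_n of the improved partition
        if (val < bestVal) { bestVal = val; best = p; }
    }
    return best;
}
```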

To avoid unnecessarily long computations, we performed the comparison on a set of smaller graphs from our testbed, consisting of 10 random graphs with 200 and 500 vertices and all benchmark graphs of order less than 1000. We ran each algorithm 10 times on each graph with the same cutoff time as before. To assess the performance of GA1, GA2, and MLS, we used MA as the reference method.

Table 18 Difference in solution values between each tested algorithm (GA1, GA2, and MLS) and the reference method MA for normalized cut graph partitioning

Table 18 summarizes the results of the computational experiments carried out for GA1, GA2, and MLS. We write \(F_{\mathrm {GA1}}-F_{\mathrm {MA}}\) to denote the difference in the \(F_n^{*}\) values between GA1 and MA in the column under the heading “Best”, and the difference in the \({\bar{F}}_n\) values between GA1 and MA in the column under the heading “Average”. The differences \(F_{\mathrm {GA2}}-F_{\mathrm {MA}}\) and \(F_{\mathrm {MLS}}-F_{\mathrm {MA}}\) are defined similarly. The three bottom rows of the table serve as a summary that includes the differences between objective function values averaged over all 29 graphs, the number of graphs for which the two algorithms achieve the same accuracy, and the number of graphs for which the tested algorithm (GA1, GA2, or MLS) produced worse solutions than MA.

The main observation from the table is that MA shows great superiority over both genetic algorithm configurations and the multistart local search approach. The pure genetic algorithm, GA1, was by far the worst algorithm in the comparison; it was the only one that failed to match the performance of MA on every graph in the test suite. Another conclusion is that the other two algorithms, GA2 and MLS, demonstrate comparable performance in terms of solution quality; however, each of them was significantly surpassed by the proposed memetic algorithm. Therefore, we conclude that the main reason for the success of MA lies in the integration of the grouping genetic algorithm with the fast LS technique. As Table 18 shows, these components, taken individually, produce inferior results.

Table 19 Difference in solution values between the MSA algorithm with \(\beta '=100\) and MSA configurations with \(\beta '=500\) and 1000 for normalized cut graph partitioning
Table 20 Difference in solution values between the ITS algorithm with \(I=100\) and ITS configurations with \(I=20,50\), and 500 for normalized cut graph partitioning

5.8 Analysis of the main parameters

In this section, we study the effect of the main parameters on the performance of the developed algorithms. Again, we focus on the normalized cut graph partitioning problem. We conducted computational experiments on the same set of graphs as used in the previous section. However, we do not report results for graphs for which all variations of the tested algorithm gave the best result. Additionally, we ran our memetic algorithm on several large graphs. The main configurations of the algorithms were used as a basis for comparison.

5.8.1 Usefulness of the restart strategy in simulated annealing

The parameter of MSA that controls the number of SA restarts is the number of moves attempted at each temperature level. Its value in the main experiment (Sect. 5.4) was set to \(\beta =100n\). Let us write this parameter as \(\beta =\beta 'n\); by increasing \(\beta '\), we decrease the number of SA restarts that fit within the time limit. We performed an additional experiment with \(\beta '\) set to 500 and 1000. We denote the MSA configurations for the tested \(\beta '\) values by MSA(100), MSA(500), and MSA(1000). The restart mechanism is sketched below.
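The following C++ skeleton illustrates the relationship between \(\beta '\) and the number of restarts; runSA stands for one complete annealing run (attempting \(\beta \) moves per temperature level) and, like the other names, is a placeholder rather than our actual MSA code.

```cpp
#include <chrono>
#include <functional>
#include <limits>

// Multistart loop of MSA: with a fixed time limit, a larger beta' makes each
// annealing run longer and therefore reduces the number of completed restarts.
int multistartSA(long long n, long long betaPrime, double timeLimitSec,
                 const std::function<double(long long)>& runSA,
                 double& bestVal) {
    const long long beta = betaPrime * n;    // moves per temperature level
    const auto start = std::chrono::steady_clock::now();
    int restarts = 0;
    bestVal = std::numeric_limits<double>::max();
    while (std::chrono::duration<double>(std::chrono::steady_clock::now() -
                                         start).count() < timeLimitSec) {
        const double val = runSA(beta);      // one complete SA run
        if (val < bestVal) bestVal = val;
        ++restarts;
    }
    return restarts;                         // number of completed SA runs
}
```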

Table 19 compares the results of MSA(500) and MSA(1000) with those of MSA(100). Columns 2–5 of this table were obtained in the same manner as Columns 2–7 of Table 18. We denote the objective function value achieved by MSA(\(\beta '\)), \(\beta '\in \{100,500,1000\}\), by \(F_{\mathrm {MSA}(\beta ')}\); as before, this value is \(F_n^{*}\) for the columns labeled “Best” and \({\bar{F}}_n\) for the columns labeled “Average”. The \(F_{\mathrm {MSA}(100)}\) values were extracted from Tables 8, 9, 11, and 12. The last columns of Table 19 display the number of SA restarts (averaged over 10 runs) in the three scenarios considered. The bottom rows of the table show the wins/ties/losses of MSA(500) and MSA(1000) versus MSA(100).

As observed in Table 19, the difference in solution values between MSA(\(\beta '\)) with \(\beta '>100\) and MSA(100) increased with increasing \(\beta '\). This is especially visible when comparing average results. Thus, performing a larger number of shorter SA runs from different starting points was better than performing a smaller number of longer SA runs.

5.8.2 The effect of the number of TS iterations on the ITS performance

As we observed in Sect. 5.2, the ITS algorithm is fairly robust to the choice of the parameters that it depends upon. From preliminary experiments, it was found that the number of TS iterations, I, is one of the most important parameters of the ITS approach. The results given in Sect. 5.4 for ITS were obtained with \(I=100\). To learn more about the influence of I on the performance of the algorithm, we ran ITS with \(I=20\), 50, and 500.

Table 21 Difference in solution values between the MA algorithm with population size \(z=100\) and MA configurations with \(z=50\) and 500 for normalized cut graph partitioning

Table 20 shows the comparison results of the four variations of the ITS algorithm. To distinguish between these variations, we use the same naming convention as in the previous section (thus, for example, ITS(20) stands for ITS with \(I=20\)). The differences shown in Table 20 were obtained similarly to those in Tables 18 and 19. The last three rows of the table evaluate the tested configurations of the ITS algorithm versus ITS(100).

From the table, we note that replacing \(I=100\) with \(I\in \{20,50,500\}\) makes the ITS algorithm less efficient in terms of the average solution quality over 10 runs. Comparing the best results produced by the four versions of ITS, we found that ITS(500) performs as well as ITS(100); in this respect, the other two variations (with \(I=50\) and especially \(I=20\)) were worse than ITS with \(I=100\).

5.8.3 The effect of the population size on the MA performance

The only parameter of our memetic algorithm is the population size (denoted by z in Sect. 4). In our main experiments, we fixed this parameter at 100. To examine its influence on the performance of the MA, we conducted additional experiments in which MA was run with z set to 50 and to 500. The results are summarized in Table 21, whose structure is similar to that of Table 20. We denote the different configurations of MA by MA(z), \(z\in \{50,100,500\}\).

A close inspection of Table 21 allows us to conclude that the behavior of MA depends on the graph size. For smaller graphs (the first 15 rows of the table), the performance of MA improves with increasing z: MA(100) is better than MA(50) (there are many positive entries in the second and third columns), and MA(500) is better than MA(100) (all but one entry in the last two columns of the first 15 rows are nonpositive). However, the picture is completely different for large graphs (the last 5 graphs in the table), for which MA(100) drastically outperforms MA(500). This can be explained by the fact that MA(500) takes a significant amount of time to create an initial population that is 5 times larger than that of MA(100). The population is composed of solutions obtained by first randomly generating graph partitions and then improving them by applying the LS procedure; however, LS is not sufficiently fast when applied to random initial solutions for large graphs. Since MA(500) initializes a large population, the amount of time left for offspring generation is significantly reduced, and, for large graphs, MA(500) therefore produces a considerably smaller number of offspring than MA(100). For the uk graph, for example, this number is, on average, 771, 1830, and 2496 for MA(500), MA(100), and MA(50), respectively. Producing fewer offspring leads to a reduction in the quality of the solutions found by MA(500).

From the table, we can also see that the performance of MA(50) is comparable to that of MA(100). The latter obtains a smaller average objective function value on a larger number of graphs; on the other hand, the average values of \(F_n^{*}\) and \({\bar{F}}_n\) achieved by MA(50) are slightly smaller than those of MA(100). This advantage of MA(50) is due to its excellent performance on the power graph.

5.9 Analysis of the LS strategy

In this section, we discuss the impact of using the vector S in our LS procedure on the performance of the memetic algorithm. The role of S is to accelerate the neighborhood exploration process by evaluating only a subset of all possible relocation and swap moves. The vector S marks the partition subsets that were changed in the previous LS iteration; in the current iteration, it is then redundant to evaluate moves that relocate a vertex from one unmarked subset to another unmarked subset, and a similar observation applies to swap moves. The idea is sketched below.
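The following simplified C++ fragment illustrates the filtering idea for relocation moves only. The gain callable is assumed to return the objective improvement of moving vertex v to a given subset under the current partition; the swap-move handling and the incremental gain updates of Algorithm 5 are omitted, and all names are ours.

```cpp
#include <functional>
#include <vector>

using Partition = std::vector<int>;  // vertex -> subset index (0-based)

// One pass over relocation moves. S[k] is true if subset k was changed in
// the previous LS iteration (initially all true, so that every move is
// evaluated in the first pass). Moves between two unmarked subsets are
// skipped. Returns true if at least one improving move was applied.
bool relocationPass(Partition& p, int m, std::vector<bool>& S,
                    const std::function<double(int v, int to)>& gain) {
    const int n = static_cast<int>(p.size());
    std::vector<bool> Snew(m, false);
    bool improved = false;
    for (int v = 0; v < n; ++v) {
        const int from = p[v];
        for (int to = 0; to < m; ++to) {
            if (to == from) continue;
            if (!S[from] && !S[to]) continue;  // neither subset changed: skip
            if (gain(v, to) > 0.0) {           // improving relocation found
                p[v] = to;
                Snew[from] = true;             // both subsets are changed now
                Snew[to] = true;
                improved = true;
                break;                         // move on to the next vertex
            }
        }
    }
    S = Snew;                                  // marks for the next iteration
    return improved;
}
```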

In order to show the effectiveness of the proposed LS strategy, we experimentally compared the developed memetic algorithm against a version obtained by replacing the LS algorithm of Sect. 3 with a simplified LS procedure that does not use the vector S. Its pseudocode is obtained from Algorithm 5 by removing the initialization of S (Line 1), the conditions imposed on S (Lines 4 and 20), and the statements in Lines 16, 32, and 36. It is important to note that both LS implementations (i.e., that given in Algorithm 5 and the simplified procedure) return the same locally optimal solution; however, it is reasonable to expect the former to be faster than the latter. We refer to the version of MA with the use of S disabled in LS as MA-SD.

Table 22 Difference in solution values between MA-SD and MA for normalized cut graph partitioning

Table 22 compares the performance of MA and MA-SD on 10 random graphs and 10 benchmark graphs from our test suite. We provide results only for large graphs because, for small graphs, the performance difference between MA and MA-SD is less significant. The second and third columns of the table provide the same kind of statistics as Table 21. The column labeled “#impr” gives the number of runs (out of 10) in which MA yielded a better solution than MA-SD; in the remaining runs (if \(\mathrm {\#impr}<10\)), MA and MA-SD produced the same solution. Figure 8 compares the objective function values achieved by MA and MA-SD for two graphs, g-3000-4 and road-minnesota.

Fig. 8 Comparison between MA and MA-SD in terms of the objective function value. a g-3000-4. b road-minnesota

Table 22 and Fig. 8 show that the proposed LS strategy leads to significantly better performance of MA compared to the use of a traditional LS procedure. The reason is that, with a faster LS technique, the memetic algorithm is able to generate more offspring, which in many cases allows it to obtain better results. As can be seen in the last column of Table 22, MA improved on the MA-SD solutions in all 10 runs for 8 of the 20 graphs, and the average improvement rate is \(85\%\). Thus, the proposed LS acceleration technique has a positive impact on the performance of the memetic algorithm.

6 Concluding remarks

The main focus of the paper was to investigate the capabilities of the most popular metaheuristic approaches when applied to the ratio cut and normalized cut graph partitioning problems. Three specific algorithms were developed: multistart simulated annealing, iterated tabu search, and a memetic algorithm. In each of the algorithms, dedicated optimization strategies were adopted. By calculating and updating the cut weights of the partitions efficiently, the time complexity of the inner loop of SA was significantly reduced, which allowed more SA restarts to be performed. Both ITS and MA used a local search procedure; to speed it up, we applied a technique that reduces the effort required for neighborhood examination. In the MA, we implemented a version of the crossover operator used in GGAs.

We carried out computational experiments on two sets of graphs. The results show that the MA is unequivocally superior to both MSA and ITS with respect to solution quality. This conclusion holds for both models (ratio cut and normalized cut) and both graph types (random graphs and benchmark graphs from the literature). In some experiments, MA found solutions better than or equal to those of ITS and MSA for all graphs in a given set. We also concluded that, in terms of solution quality, MSA performed better than ITS on random graphs; however, there was no statistically significant difference between the results of these two algorithms on the considered set of benchmark graphs. Finally, our study showed that ITS was the fastest algorithm in the comparison: it tends to find reasonable solutions more quickly than the other tested algorithms. Nevertheless, the MA is not unacceptably slow compared to ITS, and for smaller graphs MA is even faster than ITS and MSA. Additionally, we experimentally compared MA, our best algorithm, with the VNS algorithm of Hansen et al. [27], which is the state-of-the-art algorithm for the normalized cut model, and found that MA provided better overall performance. Based on the obtained results, we believe that the MA is an excellent approach to solving the ratio cut and normalized cut models.

There are promising directions for further research. One of them is to develop innovative evolutionary algorithms for minimum cut problems. Some recent population-based metaheuristics could be used to solve these problems, such as monarch butterfly optimization [18, 19], the earthworm optimization algorithm [67], elephant herding optimization [66], the moth search algorithm [65], the slime mold algorithm [38], and Harris hawks optimization [29]. The proposed LS procedure could be embedded in such algorithms and used as a powerful technique for search intensification. Another promising avenue is to construct hybrid algorithms by combining two (or more) metaheuristic methods to benefit from the strengths of each of them. Finally, an important direction for further work is to apply the ideas of the ratio cut and normalized cut algorithms to the development of metaheuristic-based techniques for solving other graph partitioning problems.