
1 Introduction

The Maximum Clique (MC) problem is a well-known NP-hard problem [17]. Given a simple undirected graph \(G = (V, E)\), a clique C is a subset of V such that every pair of vertices in C is mutually adjacent. The MC problem is to find a clique of maximum size. An important generalization of the MC problem is the Maximum Vertex Weight Clique (MVWC) problem, in which each vertex is associated with a positive integer weight and the goal is to find a clique with the greatest total vertex weight. This generalization arises in many real-world applications [1, 3, 4, 8, 9, 18, 19, 23]. In this paper we are concerned with finding a clique whose total vertex weight is as great as possible.

Both MC and MVWC are NP-hard, and the state-of-the-art approximation algorithm achieves an approximation ratio of only \(O(n(\log \log n)^2/(\log n)^3)\) [15]. Thus various heuristic methods have been developed to find a “good” clique within reasonable time. Up to now, algorithms for the MVWC problem fall into two types: complete algorithms [2, 14, 21, 30] and incomplete ones [7, 10, 22, 27, 28].

The aim of this work is to develop a local search MVWC solver named LMY-GRS to deal with large crafted graphs. First we incorporate random walk into the multi-neighborhood greedy search of [28]. Then we propose novel data structures to achieve greater efficiency.

We used the large crafted benchmark in [27] to test our solver, taking two state-of-the-art solvers, MN/TS [28] and LSCC [27], as competitors. Experimental results show that LMY-GRS attained all the best-known solutions on this benchmark. Moreover, on a large proportion of the graphs LMY-GRS achieved better average solution quality, and on four graphs it reports new best-known solutions.

2 Preliminaries

2.1 Basic Notations

Given a graph \(G = (V, E)\) where \(V = \{v_1, \ldots , v_n\}\), an edge is a 2-element subset of V. Given an edge \(e = \{u, v\}\), we say that u and v are adjacent to each other, or neighbors, and we use \(N(v) = \{u \mid \{u, v\} \in E\}\) to denote the set of v’s neighbors. The degree of a vertex v, denoted by d(v), is defined as |N(v)|. We use \(d_{\tiny max }(G)\) to denote the maximum degree of G, suppressing G when it is clear from the context. A clique C of G is a subset of V whose vertices are pairwise adjacent. The empty set is regarded as an empty clique, and a set containing a single vertex is called a single-vertex clique.
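The notation above can be made concrete with a small sketch. This is a hypothetical Python illustration (the toy graph and function names are ours, not from the paper), representing a graph as an adjacency-set map:

```python
# A toy undirected graph as an adjacency-set map (hypothetical example).
G = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2},
    4: set(),      # an isolated vertex
}

def neighbors(G, v):
    """N(v): the set of v's neighbors."""
    return G[v]

def degree(G, v):
    """d(v) = |N(v)|."""
    return len(G[v])

def is_clique(G, C):
    """A clique is a vertex set whose members are pairwise adjacent.
    Note the empty set vacuously satisfies this (the empty clique)."""
    return all(u in G[v] for u in C for v in C if u != v)

d_max = max(degree(G, v) for v in G)   # d_max(G)
```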

Given a weighting function \(w:V\rightarrow Z^+\), the weight of a clique C, denoted by w(C), is defined as \(\sum _{v\in C}w(v)\). We use age(v) to denote the number of steps since the last time v changed its state (inside or outside the candidate clique).

2.2 The Large Crafted Benchmark

MC and MVWC solvers are often tested on the DIMACS [16] and BHOSLIB [29] benchmarks. Recently, large real-world benchmarks have become very popular. Many of these graphs have millions of vertices and dozens of millions of edges. They have been used to test Graph Coloring and Minimum Vertex Cover algorithms [11, 24, 26], as well as MC [25] and MVWC [27] algorithms.

In this paper we focus on the large crafted benchmark. These graphs were originally unweighted; to obtain the corresponding MVWC instances, we use the same method as in [22, 27, 28]: the i-th vertex \(v_i\) receives weight \(w(v_i)=(i \bmod 200) + 1\).

2.3 Multi-neighborhood Greedy Search

The local search typically moves from one clique to another until the cutoff time is reached, and then returns the best clique found. Three operators, \(\texttt {add}\), \(\texttt {swap}\) and \(\texttt {drop}\), guide the local search. To ensure that these operators preserve the clique property, two sets are defined as follows.

  1. \(AddS = \{v | v \not \in C, v \in N(u)\text { for all }u \in C\}\);

  2. \(SwapS = \{(u, v) | u \in C, v \not \in C, \{u, v\} \not \in E, v \in N(w)\text { for all } w \in C \backslash \{u\}\}.\)

So the \(\texttt {add}\) operator can only take an element from AddS as its operand. Similarly the \(\texttt {swap}\) (resp. \(\texttt {drop}\)) operator can only take an element from SwapS (resp. C).

Proposition 1

If \((u_1, v) \in SwapS\) and \((u_2, v) \in SwapS\), then \(u_1=u_2\).

That is, if a vertex v can go into C through a \(\texttt {swap}\) operation, then the vertex to be removed is unique. Then we have

Lemma 1

For any \(P\subseteq SwapS\), \(|P| = |\{v| (u, v) \in P\}|\).

So P can be projected to V by considering the second element in the swap-pairs.

We use \(\varDelta _{add}\), \(\varDelta _{swap}\) and \(\varDelta _{drop}\) to denote the increase of w(C) due to the operations \(\texttt {add}\), \(\texttt {swap}\) and \(\texttt {drop}\) respectively. Obviously, we have: (1) for a vertex \(v \in AddS\), \(\varDelta _{add}(v) = w(v)\); (2) for a vertex \(u \in C\), \(\varDelta _{drop}(u) = -w(u)\); (3) for a vertex pair \((u, v) \in SwapS\), \(\varDelta _{swap}(u, v) = w(v) - w(u)\). Both MN/TS and LSCC obtain the best local move as in Algorithm 1.

[Algorithm 1: obtaining the best local move]
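The move selection of Algorithm 1 can be sketched as follows. This is a hypothetical Python illustration, not the authors' implementation; in particular, the tabu and SCC filters that MN/TS and LSCC apply before scoring moves are omitted:

```python
def best_move(C, add_s, swap_s, w):
    """Return (delta, kind, operand) for the move with the largest weight gain.

    C: current clique; add_s: AddS; swap_s: SwapS as (u, v) pairs;
    w: dict mapping vertices to weights.
    """
    best = None
    for v in add_s:                       # add: delta = w(v)
        cand = (w[v], 'add', v)
        if best is None or cand[0] > best[0]:
            best = cand
    for (u, v) in swap_s:                 # swap: delta = w(v) - w(u)
        cand = (w[v] - w[u], 'swap', (u, v))
        if best is None or cand[0] > best[0]:
            best = cand
    for u in C:                           # drop: delta = -w(u)
        cand = (-w[u], 'drop', u)
        if best is None or cand[0] > best[0]:
            best = cand
    return best
```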

It was stated in [27] that SwapS is usually very large when solving large sparse graphs. We will show that this is usually not the case.

2.4 The Strong Configuration Checking Strategy

[27] proposed a strategy named strong configuration checking (SCC): (1) at the beginning of the search, confChange(v) is set to 1 for each vertex v; (2) when v is added, confChange(n) is set to 1 for all \(n \in N(v)\); (3) when v is dropped, confChange(v) is set to 0; (4) when \(u\in C\) and \(v\not \in C\) are swapped, confChange(u) is set to 0.
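The four SCC rules above can be sketched directly; a hypothetical Python illustration (flag and function names are ours), with `N` the adjacency map:

```python
def scc_init(V):
    """Rule (1): every confChange flag starts at 1."""
    return {v: 1 for v in V}

def scc_on_add(conf, N, v):
    """Rule (2): adding v re-enables all of v's neighbors."""
    for n in N[v]:
        conf[n] = 1

def scc_on_drop(conf, v):
    """Rule (3): a dropped vertex is disabled."""
    conf[v] = 0

def scc_on_swap(conf, u):
    """Rule (4): the vertex u removed from C by a swap is disabled."""
    conf[u] = 0
```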

2.5 Best from Multiple Selections (BMS)

BMS is equivalent to deterministic tournament selection in genetic algorithms [20]. Given a set S and a positive integer k, it works as follows: randomly select k elements from S with replacement and return the best one.
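As a minimal sketch (hypothetical Python; the scoring function is left abstract), BMS is just sampling with replacement followed by an argmax:

```python
import random

def bms(S, k, score):
    """Best from Multiple Selections: draw k elements from S with
    replacement and return the best one under `score`."""
    items = list(S)
    best = None
    for _ in range(k):
        x = random.choice(items)     # sampling with replacement
        if best is None or score(x) > score(best):
            best = x
    return best
```

With k much larger than |S|, BMS almost always returns the true best element; with small k it trades greediness for diversification.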

3 Local Move Yielded by Greedy and Random Selections

Based on Lemma 1, we have the following theorem, which bounds |SwapS|.

Theorem 1

  1. If \(C = \{w\}\) for some w, then \(|SwapS| = |V| - |N(w)| - 1\);

  2. If \(|C| > 1\), then \(|SwapS| \le 2d_{\tiny max }\).

Proof

The first item is trivial, so we only prove the second. Since \(|C|>1\), there exist two distinct vertices u and w in C. We partition SwapS into \(P_1 = \{(x, y) \in SwapS \mid x = u\}\) and \(P_2=\{(x, y) \in SwapS \mid x \ne u\}\).

For any pair \((x, y) \in P_1\), since \(x = u\) and thus \(w \in C \backslash \{x\}\), y must be a neighbor of w. Therefore \(\{y \mid (x, y) \in P_1\} \subseteq N(w)\), so \(|\{y \mid (x, y) \in P_1\}| \le |N(w)|\). By Lemma 1, \(|P_1| = |\{y \mid (x, y) \in P_1\}| \le d(w)\), and thus \(|P_1| \le d_{\tiny max }\).

For any pair \((x, y) \in P_2\), since \(x \ne u\) and thus \(u \in C \backslash \{x\}\), y must be a neighbor of u. Therefore \(\{y \mid (x, y) \in P_2\} \subseteq N(u)\), so \(|\{y \mid (x, y) \in P_2\}| \le |N(u)|\). By Lemma 1, \(|P_2| = |\{y \mid (x, y) \in P_2\}| \le d(u)\), and thus \(|P_2| \le d_{\tiny max }\).

Therefore, \(|SwapS| = |P_1| + |P_2| \le 2d_{\tiny max }\).

Since most large real-world graphs are sparse [5, 12, 13], we have \(d_{\tiny max }\ll |V|\). Moreover, over all the graph instances in our experiments, \(d_{\tiny max }< 3,000\). So most of the time SwapS is a small set.

SwapS becomes huge only when C contains a single vertex, say u. In this situation, any vertex v outside C and not adjacent to u can enter C through a \(\texttt {swap}\) operation, with \(\varDelta _{swap}(u, v) = w(v) - w(u)\). Thus picking the best pair is essentially equivalent to picking the outside vertex with the greatest weight, which yields a single-vertex clique containing a vertex of the greatest or near-greatest weight.

Usually only a tiny proportion of the vertices have the greatest or near-greatest weight. So when \(|C|=1\), picking the best swap-pair restricts the search to a tiny proportion of the single-vertex cliques. Therefore, when \(|C|=1\) occurs many times, the local search may revisit some areas.

We note that when \(|C|=1\), [27] uses BMS, SCC and age to help diversify the local search. Yet it is unclear whether greedy search is necessary here, and we believe random walk is a feasible alternative. In our solver we abandon BMS and use random walk instead, and the experimental results are satisfactory.

4 The LMY-GRS Algorithm

LMY-GRS works with the three operators \(\texttt {add}\), \(\texttt {swap}\) and \(\texttt {drop}\). With the current clique denoted by C, two sets \(S_{add}\) and \(S_{swap}\), slightly different from AddS and SwapS, are defined as follows.

$$\begin{aligned} S_{add} = \left\{ \begin{array}{ll} \{v | v \not \in C, v \in N(u)\text { for all }u \in C\} &{} \text {if } |C|>0;\\ \emptyset &{} \text {if } |C|=0. \end{array} \right. \end{aligned}$$
$$\begin{aligned} S_{swap} = \left\{ \begin{array}{ll} \{(u, v) | u \in C, v \not \in C, \{u, v\} \not \in E, v \in N(w)\text { for all } w \in C \backslash \{u\}\} &{} \text {if }|C|>1;\\ \emptyset &{} \text {if }|C|\le 1. \end{array} \right. \end{aligned}$$

Obviously we have

Proposition 2

\(|S_{add}| \le d_{\tiny max }\), and \(|S_{swap}| \le 2d_{\tiny max }\).

In our algorithm, the operands of the operations are clear from the context and thus omitted.

[Algorithm 2: the LMY-GRS algorithm]

LMY-GRS first initializes a clique via \(\texttt {RandInitClique(}G\texttt {)}\) and then uses local search to find better cliques. The initialization procedure, shown in Algorithm 3, is the same as in MN/TS and LSCC.

[Algorithm 3: RandInitClique]
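Random maximal-clique initialization of this kind can be sketched as follows. This is a hypothetical Python illustration under our own naming, not the authors' Algorithm 3 verbatim: start from a random vertex and repeatedly add a random vertex adjacent to everything in the clique until none remains.

```python
import random

def rand_init_clique(G):
    """Generate a random maximal clique of G (adjacency-set map)."""
    V = list(G)
    C = {random.choice(V)}
    # candidates: vertices adjacent to every vertex currently in C
    cand = set.intersection(*(G[v] for v in C)) - C
    while cand:
        v = random.choice(list(cand))
        C.add(v)
        cand = (cand & G[v]) - {v}   # keep only vertices adjacent to v too
    return C
```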

Like MN/TS and LSCC, we exploit the multi-restart strategy: every L steps we restart the local search. Unlike previous restarting methods, we simply drop all vertices from C and regenerate a random maximal clique.

In LMY-GRS, C may sometimes become empty. In this situation, we add a random vertex to C and proceed to search for a better clique.

5 Data Structures

5.1 Connect Clique Degrees and Clique Neighbors

Definition 1

Given a clique C and a vertex \(v \not \in C\), we define the connect clique degree of v to be \(\kappa (v, C) = |\{u \in C \mid u \text{ and } v \text{ are neighbors}\}|\).

So the connect clique degree of v is the number of vertices in C that are adjacent to v. We can then maintain \(\kappa (v, C)\) when a vertex is added to or dropped from C, based on the following proposition.

Proposition 3

  1. \(\kappa (v, C\backslash \{v\}) = |C| - 1\) for any \(v\in C\);

  2. \(\kappa (v, C\backslash \{w\}) = \kappa (v, C) - 1\) for all \(v \in (N(w) \backslash C)\);

  3. \(\kappa (v, C \cup \{w\}) = \kappa (v, C) + 1\) for all \(v \in (N(w) \backslash C)\).

In [6] there is a notion called the number of missing connections; the number of v’s missing connections is exactly \(|C| - \kappa (v, C)\).
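The incremental maintenance described by Proposition 3 can be sketched as follows. This is a hypothetical Python illustration (names are ours): on adding or dropping a vertex w, only w's neighbors outside C, plus w itself on a drop, need updating.

```python
def kappa_on_add(kappa, G, C, w):
    """Add w to clique C, updating kappa (Proposition 3, item 3)."""
    for v in G[w] - C:                 # neighbors of w outside C gain one
        kappa[v] = kappa.get(v, 0) + 1
    C.add(w)
    kappa.pop(w, None)                 # kappa is tracked only outside C

def kappa_on_drop(kappa, G, C, w):
    """Drop w from clique C, updating kappa (Proposition 3, items 1-2)."""
    C.remove(w)
    for v in G[w] - C:                 # neighbors of w outside C lose one
        kappa[v] = kappa.get(v, 0) - 1
    # item 1: w was in the clique, so it is adjacent to all remaining members
    kappa[w] = len(C)
```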

Definition 2

Given a clique C, we define the clique neighbor set to be

$$\begin{aligned} \mathcal {N}(C)= \left\{ \begin{array}{ll} \{v \not \in C|v \in N(u)\textit{ for some } u \in C\}&{}\textit{if }|C|>0;\\ \emptyset &{}\textit{if }|C|= 0. \end{array} \right. \end{aligned}$$

So clique neighbors are those vertices outside C which are adjacent to at least one vertex inside C.

When a vertex u is added to or dropped from C, \(\mathcal {N}(C)\) is updated based on the proposition below.

Proposition 4

  1. For any vertex u, \(u \not \in \mathcal {N}(C\cup \{u\})\);

  2. for any \(u \in C\), \(|C| > 1\) iff \(u \in \mathcal {N}(C\backslash \{u\})\);

  3. for any \(v\not \in C\), \(\kappa (v, C) > 0\) iff \(v \in \mathcal {N}(C)\).

Lastly, we use the following proposition to maintain \(S_{add}\) and \(S_{swap}\).

Proposition 5

  1. For any v, \(v \in S_{add}\) iff \(v \in \{w \in \mathcal {N}(C)| \kappa (w, C) = |C|\}\);

  2. there exists \(u\in C\) s.t. \((u, v)\in S_{swap}\), iff \(v \in \{w \in \mathcal {N}(C)| \kappa (w, C) = |C| - 1\}\).

This tells us that we can maintain \(S_{add}\) and \(S_{swap}\) simply by traversing \(\mathcal {N}(C)\) rather than V. Previous local search solvers for MC or MVWC traverse all the vertices in V to maintain these sets, so our implementation can sometimes be much more efficient. Notice that \(|\mathcal {N}(C)| \ll |V|\) usually holds in huge sparse graphs.
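Proposition 5 (together with Propositions 1 and 2) can be sketched as a scan over the clique-neighbor set only; a hypothetical Python illustration (names are ours), where `clique_neighbors` is \(\mathcal{N}(C)\) and `kappa` holds the maintained connect clique degrees:

```python
def compute_sets(C, clique_neighbors, kappa, G):
    """Recover S_add and S_swap by scanning N(C), per Proposition 5."""
    # item 1: v is addable iff it is adjacent to every vertex of C
    s_add = {v for v in clique_neighbors if kappa[v] == len(C)}
    s_swap = set()
    if len(C) > 1:                      # S_swap is empty when |C| <= 1
        for v in clique_neighbors:
            if kappa[v] == len(C) - 1:  # item 2: v misses exactly one member
                # by Proposition 1 the vertex to remove is unique
                u = next(x for x in C if x not in G[v])
                s_swap.add((u, v))
    return s_add, s_swap
```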

5.2 A Hash Table for Determining Neighbor Relations

Graph algorithms frequently need the following procedure: given a graph G and two vertices u and v, determine whether u and v are neighbors. In MN/TS, LSCC and LMY-GRS this procedure is called very frequently, so it must be implemented efficiently. However, storing large sparse graphs as adjacency matrices is impractical. Therefore, we propose the following data structure, which is both memory and time efficient.

We employ a one-to-one function \(f:(Z^+, Z^+)\rightarrow Z^+\) and a hash table to implement this procedure. Given a graph \(G = (V, E)\), for each \(\{v_i, v_j\} \in E\) with \(i < j\), we store f(i, j) in a hash table \(\mathcal {T}_h\). Then, whenever we need to determine whether \(v_k\) and \(v_l\,(k < l)\) are neighbors, we simply check whether f(k, l) is in \(\mathcal {T}_h\): if so, they are neighbors; otherwise, they are not.

In LMY-GRS we adopt Cantor’s pairing function as the f above, \(\textit{i.e.}\), \(f(x, y) = (x+y)(x+y+1)/2+y\). Then we can determine whether two vertices are neighbors in O(1) time on average.
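A minimal sketch of this scheme (hypothetical Python; a set stands in for the hash table \(\mathcal{T}_h\)):

```python
def cantor(x, y):
    """Cantor's pairing function: a bijection (Z+, Z+) -> Z+."""
    return (x + y) * (x + y + 1) // 2 + y

def build_edge_table(edges):
    """Store f(i, j) with i < j, so each edge gets one canonical key."""
    return {cantor(min(e), max(e)) for e in edges}

def adjacent(table, u, v):
    """O(1) average-case adjacency test via one hash lookup."""
    return cantor(min(u, v), max(u, v)) in table
```

Memory usage is proportional to |E| rather than \(|V|^2\), which is what makes this workable on huge sparse graphs.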

With the data structures above, our solver is able to perform steps faster than LSCC by orders of magnitude on huge sparse graphs.

6 Experimental Evaluation

In this section, we carry out extensive experiments to evaluate LMY-GRS on large crafted graphs, compared against the state-of-the-art local search MVWC algorithms MN/TS and LSCC.

6.1 Experiment Setup

All the solvers were compiled with g++ 4.6.3 using the ‘-O3’ option. For the search depth L, MN/TS, LSCC and LMY-GRS all set \(L = 4,000\). Both MN/TS and LSCC employ the BMS heuristic, with the parameter k set to 100 as in [27]. MN/TS also employs a tabu heuristic, with the tabu tenure TL set to 7 as in [28]. The experiments were conducted on a cluster equipped with Intel(R) Xeon(R) X5650 CPUs @2.67 GHz and 8 GB RAM, running Red Hat Santiago OS.

Each solver was executed on each instance with a time limit of 1,000 s, with seeds from 1 to 100. For each algorithm on each instance, we report the maximum weight (“\(w_{max}\)”) and the average weight (“\(w_{avg}\)”) of the cliques found. To make the comparisons clearer, we also report the difference (“\(\delta _{max}\)”) between the maximum weight of the cliques found by LMY-GRS and that found by LSCC; similarly, \(\delta _{avg}\) denotes the difference between the average weights. A positive \(\delta _{max}\) (resp. \(\delta _{avg}\)) indicates that LMY-GRS performed better, while a negative value indicates that it performed worse.

Table 1. Results where LSCC and LMY-GRS returned different \(w_{max}\) or \(w_{avg}\) values

6.2 Main Results

We show the main experimental results in Tables 1 and 2. For the sake of space, we do not report results on graphs with fewer than 1,000 vertices.

Quality Improvements. Table 1 shows the performances on the instances where LSCC and LMY-GRS returned different \(w_{max}\) or \(w_{avg}\) values.

From the results in Table 1, we observe that:

  1. LMY-GRS attained the best-known solutions on all the graphs;

  2. on a large proportion of the graphs, LMY-GRS returned solutions of better average quality;

  3. LMY-GRS found new best-known solutions on 4 graphs.

In fact these 4 graphs are the largest in the benchmark, each with at least \(10^6\) vertices. Since LMY-GRS and LSCC attain different solution qualities on these instances, it is difficult to evaluate the individual impacts of the heuristics and the data structures on them.

Time Improvements and Individual Impacts. Table 2 compares the performances on the instances where LSCC and LMY-GRS returned the same \(w_{max}\) and \(w_{avg}\) values. We also present the average number of steps needed to locate a solution, and the number of steps executed per millisecond.

Table 2. Performances on instances where they returned the same \(w_{max}\) and \(w_{avg}\) values

Over these instances, we find that

  1. the time columns show that LMY-GRS and LSCC performed comparably;

  2. the step columns show that the heuristic in LMY-GRS is as good as the one in LSCC, since both needed roughly the same number of steps;

  3. the last two columns show that LMY-GRS sometimes performed steps faster than LSCC, and sometimes slower.

To show the power of our data structures, we present the average number of steps per millisecond on some of the largest instances in Table 3. The table shows that LMY-GRS is able to perform steps faster than LSCC by orders of magnitude on some instances.

Table 3. The number of steps per millisecond on huge sparse instances

Based on the discussions above, we conclude that using random walk is roughly as good as using BMS, and our data structures are very powerful on huge sparse graphs.

7 Conclusions and Future Work

In this paper, we developed a local search MVWC solver named LMY-GRS, which significantly outperforms state-of-the-art solvers on large sparse graphs. It attains best-known solutions on all the graphs in the experiments, and it finds new best-known solutions on some of them.

Our contributions are threefold. First, we rigorously showed that the swap set is usually small even when solving large sparse graphs. Second, we incorporated random walk into the multi-neighborhood greedy search and showed that it performs satisfactorily. Third, we proposed efficient data structures that work well on huge sparse graphs.

In the future we will improve SCC to further avoid search cycles, and we will introduce more diversification strategies into LMY-GRS.