Keywords

1 Introduction

Given a simple undirected graph \( G = (V,E) \), with n vertices and m edges, \( N(v) \) refers to the neighbor set of vertices adjacent to a vertex \( v \in V \) and \( G[U] \) to the subgraph induced by a vertex set \( U\,\subseteq\,V \). A complete subgraph (or clique) is an induced subgraph such that all its vertices are pairwise adjacent, and the maximum clique problem (MCP) consists in finding a clique of maximum cardinality. The MCP is a theoretical deeply studied NP-hard problem which has also found many applications [1]. Examples of clique search appear in network analysis, coding theory, fingerprint matching, bioinformatics [2], robotics [3, 4] and computer vision [5].

In the last decade there has been rising interest in the exact solving of the MCP. At present, almost all successful algorithms are branch-and-bound and use a greedy vertex coloring procedure as bound [6,7,8,9,10,11,12,13,14,15]. Interestingly, the reports provided by these practical algorithms average a much better performance than is to be expected by the theoretical algorithms alone. A brief overview of theoretical results now follows:

  • The MCP is NP-hard and can not even be approximated in polynomial time within a factor of \( |V|^{1/3 - \varepsilon } \), for any \( \varepsilon > 0 \)(unless \( P = NP \)) [16].

  • There can be as much as \( 3^{{{n \mathord{\left/ {\vphantom {n 3}} \right. \kern-0pt} 3}}} \) distinct maximal cliques to enumerate in a graph of size n [17], which is an upper bound for all enumerative solvers.

  • Branch-and-bound solvers do not need to explore all the maximal cliques. An important number of them may be discarded when, during construction, it is shown that they cannot improve the current incumbent clique. In [18], the worst case running time was set to \( O(2^{{{n \mathord{\left/ {\vphantom {n 3}} \right. \kern-0pt} 3}}} ) \) and, at present, the value of this bound is \( O(2^{{{n \mathord{\left/ {\vphantom {n 4}} \right. \kern-0pt} 4}}} ) \) [19], in both cases with exponential space consumption. We note that these algorithms have not been implemented in practice and literature provides no experimental evaluation.

  • An interesting recent work is [20] which is concerned with the family of branch-and-bound solvers which use coloring as bounding function; these are also the reference algorithms for this work. [20] sets \( \Omega (2^{{{n \mathord{\left/ {\vphantom {n 5}} \right. \kern-0pt} 5}}} ) \) as lower bound for this family of solvers.

The latter result is especially relevant because it is related to state-of-the-art exact algorithms and, as far as we are aware, the gap between the good performances shown by the experimental algorithms and the theoretical result remains to be explained. Motivated by this fact, this work contributes with an empirical study of cutting-edge infra-chromatic bounding functions which have recently been described for the MCP. These functions are able to bind the clique number of a given subproblem below its chromatic number by additional computational effort. More precisely, this implies that the theoretical \( \Omega (2^{{{n \mathord{\left/ {\vphantom {n 5}} \right. \kern-0pt} 5}}} ) \) bound does not hold for these algorithms.

Leading infra-chromatic experimental algorithms for the MCP at present are MaxCLQ [14], and its improved variant IncMaxCLQ [15], as well as bit-parallel BBMCX [11] — see recent comparison surveys [21, 22]. The contribution of this work is experimental. First we describe a way to enhance BBMCX with an additional infra-chromatic bounding function closely related to the one described in MaxCLQ. This bounding function can also be applied to earlier BBMC variants [8,9,10]. The paper reports performances of two enhanced variants (including BBMCX), together with IncMaxCLQ.

The remaining part of this work is structured in the following way: Sects. 2 and 3 cover prior related algorithms. The new bounding function for exact maximum clique is described in Sect. 4 and validated empirically in Sect. 5. Finally, Sect. 6 briefly discusses the reported results and Sect. 7 presents some conclusions.

2 Related Bounding Functions for the MCP

In the last decade, greedy sequential vertex coloring (SEQ) has proved to be a good bounding function for exact branch-and-bound solvers for the MCP. SEQ is an \( O(n^{2} ) \) worst-case bounded procedure which iteratively assigns the lowest possible color number to each vertex that is consistent with the current partial coloring. The first maximum clique algorithm of this type described in literature is Fahle’s [6]. Since then, much research has been devoted to improve the basic SEQ bounding function.

We denote by \( C(G) = \left\{ {C_{1} ,C_{2} , \ldots ,C_{k} } \right\} \) a k-coloring of graph G, where \( C_{i} \) refers to the (color) set which contains vertices assigned color number i. For a given k-coloring \( C(G) \), maximum clique solvers use a recoloring bounding function to lower the color number k of a vertex \( v_{1} \in C_{k} \) by a swap movement with two already colored vertices \( v_{2} \in C_{i} \, \) and \( v_{3} \in C_{j} \) where \( i < j < k \). A successful recoloring leads to the new coloring \( v_{1} \in C_{i} \) and \( v_{2} ,\,v_{3} \in C_{j} \, \). In practice, recoloring can be applied to every vertex of the SEQ coloring, as described originally for algorithm MCS [12], or relaxed to a particular subset, as in BBMCR [9].

A bounding function that could produce bounds below the chromatic number of the bounded subproblem was first described in algorithm MaxCLQ [14], and later improved in IncMaxCLQ [15]. It was defined in terms of a reduction of the MCP over a colored graph to a partial maximum satisfiability problem (PMAX-SAT) as follows: (I) Each vertex of the graph maps to a logical variable. (II) Each PMAX-SAT hard clause consists of two negated literals and represents a non-adjacency relation between the pair of corresponding vertices. (III) Each PMAX-SAT soft clause represents a distinct color set and contains the corresponding positive literals that map to vertices of the set.

Figure 1 provides a simple example of the reduction when applied to an odd-cycle:

Fig. 1.
figure 1

An example of an MCP reduction to PMAX-SAT. (Color figure online)

The PMAX-SAT problem’s equivalence to the MCP consists in finding an assignment of literals which satisfy the maximum number of soft clauses (the color sets) and all hard clauses (the graph structure) — see [14, 15] for a more detailed description. The cleverness of this reduction for branch-and-bound exact MCP solvers is that well-known pruning strategies for PMAX_SAT may be used as bounding functions for the MCP. The main inference is based on propagating literal assignments of unit soft clauses (singleton color sets) until an empty clause is reached. When this occurs, the soft clauses (color sets) that make part of the reasoning are said to be inconsistent. Intuitively, the subgraph induced by an inconsistent set of k colors cannot contain a clique of size k or bigger, so the bound provided by the number of colors in the original coloring may be decremented by one for each inconsistent set of colors found.

In the example of Fig. 1, one such inconsistent set of soft clauses is \( \left\{ {s_{1} ,s_{2} ,s_{3} } \right\} \), which can be obtained by the following inference: (I) \( x_{5} \leftarrow TRUE \) since \( s_{3} \) is a unit clause. (II) Propagating \( x_{5} \leftarrow TRUE \) value leads to \( s_{2} = x_{4} \), according to hard clause \( h_{4} \), and \( s_{1} = x_{1} \), according to hard clause \( h_{5} \). (III) \( x_{4} \leftarrow TRUE \), since \( s_{2} \) is a unit clause. IV) Propagating \( x_{4} \leftarrow TRUE \) leads to the empty clause \( s_{1} = \emptyset \), according to the hard clause \( h_{2} \). Consequently, the original color bound of the graph, 3, can be reduced to 2. Note that the chromatic number of the graph \( \chi (G) = 3 \) is higher than the new bound.

The term infra-chromatic describing bounding functions for the MCP was first employed in the bit-parallel BBMCX algorithm. There, the authors propose an inference procedure in terms of color sets which integrates well with the bitstring encoding and branching of the family of BBMC algorithms [8,9,10,11]. More precisely, BBMCX looks for triplets of color sets \( \left\{ {C_{x} ,C_{y} ,C_{z} } \right\} \) such that \( \left| {C_{x} } \right| = 1 \) and no triangles exist in the induced graph \( G[C_{x} \cup C_{y} \cup C_{z} ] \). The reported results show that this particular case of inconsistency can be computed very efficiently achieving a very good tradeoff between pruning ability and computational effort.

3 The Prior Infra-Chromatic Procedure

Rising interest in exact maximum clique search combines exhaustive enumeration of maximal cliques with a bounding function based on approximate coloring. A step of the search (typically a recursive call to the algorithm) branches on a vertex to enlarge a clique, and a branch of the search tree corresponds to a maximal clique. The solution to the MCP is any leaf node of maximum cardinality.

In this work we borrow the notation for sets employed in BBMCX [11]. More specifically:

  • S denotes the clique to be enlarged at any point during search.

  • S max is the largest clique found at any point during search.

  • U denotes the set of vertices of the current subproblem.

  • \( U_{v} \) denotes the set of vertices of the child subproblem resulting from branching on vertex \( v \in U \).

  • \( G_{v} = G[U_{v} ] \) refers to the induced graph subproblem which results from branching on vertex v.

  • F is a set of (forbidden) color numbers employed in the infra-chromatic bounding function described in BBMCX.

Of interest to this work is the bounding function called for each subproblem U during search. Algorithm 1 outlines the bounding procedure which roughly corresponds with the one described in BBMCX. Procedure UPPERBOUND computes the SEQ coloring of the subproblem \( G[U_{v} ] \) in the main loop (steps 1 to 12), by enlarging one color set at a time. Once all color sets below a certain pruning threshold \( k_{min} \) have been constructed, each selected vertex goes through the infra-chromatic filter INFRACHROM_I described in Algorithm 2.

The filter function looks for pairs of distinct color classes (i, j) such that vertex v is adjacent to exactly just one vertex w in color i and color j contains no common neighbors to both v and w. If the filter succeeds (and returns SUCCESS) the vertex is removed from the original coloring and the other two inconsistent color sets are tagged as forbidden, so that they do not take part in further inferences. We refer the reader to [11] for additional design and implementation details.

4 The Enhanced Infra-Chromatic Procedure

We enhance the previously described bounding function with another infra-chromatic bound which employs unit clause propagation described in PMAX-SAT based solvers. Algorithms 3 and 4 describe the general outline of the new enhanced function. The coloring C obtained after applying INFRACHROM_I to SEQ is now filtered again by new INFRACHROM_II. The function searches for inconsistent color sets using unit clause inferences and returns the number of conflicts found. As explained in the introductory section, the resulting new bound is the difference between the previous bound (the size of the coloring) and the number of conflicts outputted by INFRACHROM_II.

In the current implementation of INFRACHROM_II, the forbidden color set F and the color and neighbor sets are encoded as bitsets. The main inference is the set intersection operation in step 7 which benefits from bit-masking, as well as the initial step 1. The procedure runs in \( O(\left| V \right|^{2} ) \) in the worst-case. In the current implementation, we obtain the best performance when color sets are examined in non-increasing order. This strategy tends to find conflicts earlier because the highest color sets of the SEQ coloring are expected to have a small size. We note that it is easy to envisage more greedy variants of Algorithm 4 by filtering out color sets from the reasoning with additional constraints. It is also worth noting that the integration of INFRACHROM_II in the prior bounding function (see Algorithm 3) is not restricted to BBMCX, and may also make part of other exact approximate-color exact solvers such as MCS, BBMC, BBMCR and other published BBMC variants.

5 Experiments

To evaluate the new proposed infra-chromatic bounding function we have considered the following algorithms in this research:

  1. 1.

    BBMC [8, 9]: The original bit-parallel algorithm for the MCP, which uses only the SEQ color bound.

  2. 2.

    BBMC+I: BBMC enhanced with the new PMAX-SAT based infra-chromatic bound described in Algorithm 3.

  3. 3.

    BBMCR [9]+I: The BBMC recoloring variant enhanced with the new infra-chromatic bound.

  4. 4.

    BBMCX [11]+I: The BBMC infra-chromatic variant enhanced with the new infra-chromatic bound.

  5. 5.

    IncMaxCLQ [15]: The latest PMAX-SAT based algorithm, as provided by its main developer.

All the algorithms were tested over a big number of structured graphs from public datasets which may be found in other maximum clique reports elsewhere. The majority of structured instances were presented at the Second DIMACS Implementation ChallengeFootnote 1. The rest are taken from the BHOSHLIB benchmarkFootnote 2 (more specifically, the frb30-15 collection). The algorithms were also run against uniform random graphs, with sizes ranging from 150 to 15000 vertices and varying densities.

The hardware used in the experiments was a 20 core XEON with 64 GB of RAM running a Linux OS and all algorithms considered were run on a single core of this machine. With respect to initial configurations, the enhanced BBMC infra-chromatic variants use the more recent preprocessing ideas, that is, the initial sorting of vertices described in IncMaxCLQ, and a strong initial solution computed by a state-of-the-art heuristic, as recommended in [13].

Due to space constraints we report in this manuscript the performance of algorithms BBMCR+I, BBMCX+I and IncMaxCLQ against 37 representative structured graphs — solved in more than 0.01 s by at least one algorithm.

6 Discussion

Table 1 reports the performance of the enhanced algorithms BBMCR+I, BBMCX+I as well as IncMaxCLQ. Interestingly, out of the 37 instances reported, BBMCR+I is faster than BBMCX+I in 25 of them. This would indicate that recoloring (BBMCR), followed by a single infra-chromatic filter, is more appropriate than the double infra-chromatic filter of BBMCX+I. This is corroborated by the number of steps taken by both algorithms.

Table 1. A comparison of different infra-chromatic exact maximum clique algorithms. Time is measured in seconds with precision of milliseconds. Header Inc/BBMC compares performance as a ratio. Header p refers to uniform density and header ω is the size of the maximum clique. Cells in bold show the best times in each row. A step is a recursive call in the BBMC enhanced algorithms.

In relation to IncMaxCLQ, the enhanced algorithms perform better in 27 of the 37 instances. Worth noting is that IncMaxCLQ clearly outperforms the latter in keller5 and C250.9. Overall, IncMaxCLQ produces smaller search trees but with higher computational cost, the tradeoff being favorable only in very dense graphs (that is, \( p \ge 0.9 \)) with the exception of keller5. It is also worth noting that the number of steps taken by both enhanced algorithms are much less, on average, than those reported for BBMCX in [11].

7 Conclusions

This work describes a simple way to enhance a leading bit-parallel infra-chromatic solver for the maximum clique problem with a PMAX-SAT based infra-chromatic filter, and compares the performance of two enhanced published solvers with another state-of-the-art algorithm. The contribution is a step forward in the search of an adequate explanation for the gap between theoretical results and the excellent performance shown by experimental algorithms for this problem. Moreover, the enhanced algorithms show some of the best times published, to the best of our knowledge, for a number of structured graphs from well known public data sets.