1 Introduction

One of the most fundamental problems in computer science is k-Clique: given an n-node graph, decide if there are k nodes that form a clique, i.e. that have all \(\binom{k}{2}\) edges between them. Our interest is in the case where \(3 \le k \ll n\) is a small constant. This is the “SAT of parameterized complexity”, being the canonical problem of the W[1] class of “fixed-parameter intractable” problems, and its basic nature makes it a core task in countless applications where we seek a small sub-structure defined by pairwise relations.

The naïve algorithm checks all subsets of k nodes and runs in \(O(k^2 \binom{n}{k})\) time, which is \(\varTheta (n^k)\) for constant k. Whether and how this bound can be beaten (in terms of worst-case asymptotic time complexity) is a quintessential form of the question: can we beat exhaustive search?
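As a point of reference, the exhaustive search can be written out in a few lines (a minimal Python sketch; the function name is our own):

```python
from itertools import combinations

def has_k_clique_naive(adj, k):
    """Check all k-subsets of vertices for a clique.

    adj is a symmetric 0/1 adjacency matrix.  Verifying the binom(k, 2)
    edges of each of the binom(n, k) subsets gives O(k^2 * binom(n, k))
    time, i.e. Theta(n^k) for constant k.
    """
    n = len(adj)
    for subset in combinations(range(n), k):
        if all(adj[u][v] for u, v in combinations(subset, 2)):
            return True
    return False

# A complete graph on 3 nodes has a triangle; a 4-cycle does not.
assert has_k_clique_naive([[0, 1, 1], [1, 0, 1], [1, 1, 0]], 3)
assert not has_k_clique_naive([[0, 1, 0, 1], [1, 0, 1, 0],
                               [0, 1, 0, 1], [1, 0, 1, 0]], 3)
```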

The asymptotically fastest algorithms gain a speedup by exploiting fast matrix multiplication – one of the most powerful techniques for beating exhaustive search. In particular, for the important special case of \(k=3\), i.e. the Triangle Detection problem, the running time is \(O(n^{\omega })\) where \(2 \le \omega < 2.3719\) [23] is the exponent in the time complexity of multiplying two \(n \times n\) binary matrices.Footnote 1 For larger \(k>3\), there is a reduction to the \(k=3\) case by Nešetřil and Poljak [31] that produces graphs of size \(O(n^{\lceil k/3 \rceil })\).Footnote 2 The resulting time bound is \(O(n^{\lceil \omega k /3 \rceil })\). Except for improvements for k that is not a multiple of 3 [24], and the developments in fast matrix multiplication algorithms reducing the value of \(\omega \) over the years, this classical algorithm remains the state-of-the-art.
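To illustrate the principle (not the \(O(n^{\omega })\) algorithm itself), here is a sketch of triangle detection via matrix squaring; numpy's dense product merely stands in for a fast multiplication routine, and the function name is our own:

```python
import numpy as np

def has_triangle(adj):
    """(A^2)[i][j] counts the common neighbors of i and j, so the graph
    contains a triangle iff some edge (i, j) also has (A^2)[i][j] > 0.
    Plugging in an O(n^omega)-time matrix product yields
    O(n^omega)-time triangle detection."""
    A = np.array(adj, dtype=np.int64)
    return bool(np.any((A > 0) & ((A @ A) > 0)))

assert has_triangle([[0, 1, 1], [1, 0, 1], [1, 1, 0]])      # a triangle
assert not has_triangle([[0, 1, 0], [1, 0, 1], [0, 1, 0]])  # a path
```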

The one general technique underlying all fast matrix multiplication, starting with Strassen’s algorithm [34], is to find some clever formula to exploit cancellations in order to replace multiplications with additions. To date, this is the only technique capable of beating exhaustive search by a polynomial \(n^{\varepsilon }\) factor for the k-Clique problem. All techniques have their limitations, and so does Strassen’s; we defer a detailed discussion on this to the full paper due to space constraints. Consequently, much research has gone into finding “combinatorial algorithms” that beat exhaustive search by other techniques. Existing techniques have only led to polylogarithmic speedups, leading the community to the following conjectures that have become the basis for many conditional lower bounds.

Conjecture 1 (Combinatorial BMM)

Combinatorial algorithms cannot solve Triangle Detection in time \(O(n^{3-\varepsilon })\) for any \(\varepsilon >0\).Footnote 3

A reduction of Vassilevska and Williams [37] shows that this conjecture is equivalent to the classical conjecture that combinatorial algorithms cannot solve Boolean Matrix Multiplication (BMM) in truly subcubic time [28, 33]. Following their reduction, many conditional lower bounds were based on this conjecture, e.g. [5, 16, 19, 21] (we refer to the survey [36] for a longer list).

Conjecture 2

(Combinatorial \(\boldsymbol{k}\)-Clique). Combinatorial algorithms cannot solve k-Clique in time \(O(n^{k-\varepsilon })\) for any \(k \ge 3\) and \(\varepsilon >0\).

The latter conjecture is stronger than the former, in the sense that faster algorithms for \(k=3\) imply faster algorithms for larger \(k>3\) but the converse is not known. The first use of this conjecture as a basis for conditional lower bounds was by Chan [17] to prove an \(n^{k-o(1)}\) lower bound for a problem in computational geometry. Later, Abboud, Backurs, and Vassilevska Williams [2] used it to prove \(n^{3-o(1)}\) lower bounds in P. Several other papers have used it since then, e.g. [1, 4, 9, 11,12,13,14, 20, 22, 26, 29].

Previous Combinatorial Bounds. The previous bounds for Triangle Detection (\(k=3\)) fall under three conceptual techniques (see the full paper for more details). We will omit \(\log \log n\) factors in this paragraph.

  1. The Four-Russians technique [6] from 1970 gives an \(O(n^3 / \log ^2 n)\) bound, and is used in all later developments.

  2. In 2010, Bansal and Williams [7] used pseudoregular partitions to shave off an additional \(\log ^{1/4}n\) factor.

  3. In 2014, Chan [18] introduced a simple divide-and-conquer technique to get an \(O(n^3 / \log ^3 n)\) bound, and a year later, Yu [38] optimized this technique to achieve a bound of \(O(n^3 / \log ^4 n)\).

For \(k>3\) there are two options: (1) we either apply these algorithms inside the aforementioned reduction to Triangle, getting a bound of \(O(n^k/\log ^4{n})\), or (2) we apply these combinatorial techniques directly to k-Clique. An early work of Vassilevska [35] from 2009 applied the Four-Russians technique directly to get an \(O(n^{k}/\log ^{k-1}{n})\) bound. Note that this generalizes the \(\log ^2 n\) shaving from the first bullet naturally to all k, and is favorable to the algorithms from option (1) for \(k>5\). Vassilevska's bound remains state-of-the-art, and in this work, we address the challenge of generalizing the other combinatorial techniques to k-Clique.

1.1 Our Results

The first result of this paper is a faster combinatorial algorithm for k-Clique for all \(k>3\) based on a generalization of the divide-and-conquer technique from Chan's and Yu's algorithms for \(k=3\). We use divide-and-conquer to design a more efficient reduction from k-Clique to the \(k=3\) case. The main feature of this reduction is that we get an additional log factor shaving each time k increases by one; this should be contrasted with the classical reduction from option (1) above, in which we gain nothing when k grows.

Theorem 1

(Reduction from \(\boldsymbol{k}\)-Clique to Triangle). Let \(k \ge 3\), and let a, b be reals such that there is a combinatorial triangle detection algorithm running in time \(O(n^3 (\log n)^a (\log \log n)^b)\). Then there is a combinatorial k-clique detection algorithm running in time \(O(n^k (\log n)^{a - (k-3)} (\log \log n)^{b + k - 3})\).

Combining our reduction with Yu’s state-of-the-art combinatorial algorithm for Triangle detection, we improve Vassilevska’s bound by two log factors.

Corollary 1

(Faster Combinatorial \(\boldsymbol{k}\)-Clique Detection). There is a k-clique detection algorithm running in time \(O(n^{k} (\log n)^{-(k+1)} (\log \log n)^{k+3})\).

It may be interesting to note that our reduction can even be combined with the naïve \(O(n^3)\) algorithm for Triangle detection, giving a \((\log n)^{k-3}\) shaving for k-Clique without using the Four-Russians technique.

Another interesting implication of our reduction concerns the framework of Bansal and Williams [7]. Their algorithm can be improved if better dependencies for regularity/triangle removal lemmas are achieved. The best known upper bound on \(f(\varepsilon )\) in a triangle removal lemma is of the form \(\frac{c}{(\log ^*(1/\varepsilon ))^\delta }\) for some constants \(c>1\) and \(\delta >0\).Footnote 4 Due to this dependency, their first algorithm [7, Theorem 2.1] only shaves a \(\log ^*(n)\) factor from the running times achieved with the standard Four-Russians technique. However, it is not ruled out that much better dependencies can be achieved that would accelerate their algorithm to the point where, combined with our reduction, a k-clique algorithm with faster running times than Corollary 1 is obtained.Footnote 5

A primary reason to seek combinatorial algorithms for k-Clique is that the techniques may generalize in ways fast matrix multiplication cannot (see full paper for detailed discussion). Our second set of results exhibits this phenomenon by shaving logarithmic factors over state-of-the-art for general (non-combinatorial) algorithms.

One limitation of the \(O(n^{\omega })\) algorithm for Triangle detection is that it does not solve the Triangle listing problem: we cannot specify a parameter t and get all triangles in the graph in time \(O(n^{\omega } + t)\) assuming their number is up to t. Listing triangles in an input graph is not only a natural problem, but it is also connected to the fundamental 3SUM problem (given n numbers, decide if there are three that sum to zero). A reduction from 3SUM [27, 32] shows that in order to beat the longstanding \(O(n^{2}/\log ^2{n})\) bound over integers [8] it is enough to shave a \(\log ^{6+\varepsilon }{n}\) factor for Triangle listing – i.e., achieve a running time of \(O(\frac{n^{3}}{\log ^{6+\varepsilon }{n}} + t)\) for some \(\varepsilon > 0\). Although research has seen some results on triangle listing [10], we are not aware of any previous \(o(n^3)+O(t)\) time bound for this problem (even with non-combinatorial techniques). Our second result produces such a time bound, showing that the other combinatorial techniques (namely Four-Russians and regularity lemmas) can be exploited. We shave a \(\log ^{2.25}{n}\) factor for this problem, generalizing the Bansal-Williams bound for BMM. Note that we use the non-standard notation \(\hat{O}(\cdot )\) to suppress \(\operatorname {poly}(\log \log n)\) factors.

Theorem 2

(Faster Triangle Listing). There is a randomized combinatorial algorithm that lists up to t triangles in a given graph in time \(\hat{O}\left( \frac{n^{3}}{\log ^{2.25} n} + t \right) \), and succeeds with probability \(1 - n^{-100}\).

Another well-known limitation of Strassen-like techniques is that they are ineffective for detecting hypergraph cliques. They fail to give any speedup even for the first generalization in this direction: detecting a 4-clique in a 3-uniform hypergraph (i.e. a hypergraph where each hyper-edge is a set of three nodes). We are not aware of any non-trivial \(o(n^4)\) algorithm for this problem (even with non-combinatorial techniques). The conjecture that \(O(n^{4-\varepsilon })\) time cannot be achieved has been used to prove conditional lower bounds, e.g. [15, 29]. Our third result is a \(\log ^{1.5}{n}\) factor shaving for this problem. The following theorem provides our general bound and strengthens the result for listing (detection can be obtained by setting \(t=1\)).

Theorem 3

(Faster \(\boldsymbol{k}\)-Hyperclique Listing). There is an algorithm for listing up to t k-hypercliques in an r-uniform hypergraph in time

$$\begin{aligned} O\left( \frac{n^{k}}{(\log n)^{\frac{k-1}{r-1}}} + t \right) \end{aligned}$$

(assuming a word RAM model with word size \(w = \varOmega (\log n)\)).

Subsequent Work. Shortly after this work, Abboud, Fischer, Kelley, Lovett, and Meka announced a combinatorial algorithm for BMM with \(O(\frac{n^{3}}{2^{(\log n)^\varepsilon }})\) running time [3]. This implies an improvement for k-Clique as well that is stronger than any poly-log speedup and thus improves over Corollary 1 (by using pseudo-regularity techniques rather than divide-and-conquer). Moreover, building on our proof of Theorem 2, the authors present a speedup for triangle listing as well. However, our result for hypergraphs in Theorem 3 remains unbeaten.

1.2 Outline

We start with some preliminaries in Sect. 2. In Sect. 3 we provide our improved combinatorial k-Clique algorithm. In Sects. 4 and 5 we provide the high-level ideas of our improvements for Triangle Listing and k-Hyperclique Detection; due to space constraints we are forced to defer the technical details to the full paper.

2 Preliminaries

Let \([n] := \lbrace 1, \dots , n \rbrace \). We write \(\widetilde{O}(\cdot )\) to suppress polylogarithmic factors and use the non-standard notation \(\hat{O}(\cdot )\) to suppress \(\operatorname {poly}(\log \log n)\) factors.

Throughout we consider undirected, unweighted graphs. In the k-clique problem, we are given a k-partite graph \((V_1, \dots , V_k, E)\) and the goal is to determine whether there exist k vertices \(v_1 \in V_1, \dots , v_k \in V_k\) such that there is an edge \((v_i, v_j)\in E\) for every pair \(i \ne j\). Note that the assumption that the input graphs are k-partite is without loss of generality, and can be achieved by a trivial transformation of any non-k-partite graph \(G= (V, E)\): We create k copies \(V_1, \dots ,V_k\) of the vertex set and for every \((u,v) \in E\) we add the edges \((u_i, v_j)\) for every \(i\ne j\). Another typical relaxation is that we only design an algorithm that detects the presence of a k-clique (without actually returning one). It is easy to transform a detection algorithm into a finding algorithm using binary search without asymptotic overhead.Footnote 6
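The trivial transformation to a k-partite instance can be sketched as follows (the encoding of copy i of vertex v as i*n + v is our own illustrative choice):

```python
def to_k_partite(n, edges, k):
    """Create k copies V_1, ..., V_k of the vertex set (copy i of
    vertex v is encoded as i * n + v) and, for every original edge
    (u, v) and every pair of distinct parts i != j, add (u_i, v_j).
    The result has a k-clique with one vertex per part iff the
    original graph has a k-clique."""
    new_edges = set()
    for (u, v) in edges:
        for i in range(k):
            for j in range(k):
                if i != j:
                    new_edges.add((i * n + u, j * n + v))
                    new_edges.add((i * n + v, j * n + u))
    return new_edges

# A triangle on {0, 1, 2} becomes a tripartite triangle on copies 0, 4, 8.
E = to_k_partite(3, [(0, 1), (0, 2), (1, 2)], 3)
assert (0, 4) in E and (4, 8) in E and (0, 8) in E
assert not any(u // 3 == v // 3 for (u, v) in E)  # no edges inside a part
```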

We additionally define the following notation for a k-partite graph as before: For a vertex v, let \(N_{i}(v) = \lbrace u \in V_{i} : (v,u) \in E \rbrace \) denote the neighbourhood of v in \(V_{i}\) and \(d_{i}(v) = |N_{i}(v)|\) denote the degree of v in \(V_{i}\). Moreover, for a vertex set \(V'\subseteq V\) we let \(G[V']\) denote the subgraph of G induced by the vertex set \(V'\). Throughout we further let \(n = |V_1| + \dots + |V_k|\) denote the total number of vertices in the graph.

An r-uniform hypergraph is a pair (V, E), where V is a vertex set and \(E \subseteq \left( {\begin{array}{c}V\\ r\end{array}}\right) \) is a set of hyperedges. In the r-uniform k-hyperclique problem we need to decide whether in a k-partite hypergraph \((V_1, \dots , V_k, E)\) there are vertices \(v_1 \in V_1, \dots , v_k \in V_k\) such that all hyperedges on \(\lbrace v_1, \dots , v_k \rbrace \) are present. Similarly, the assumption that the hypergraph is k-partite is without loss of generality.

We use the standard word RAM model with word size \(w \in \varTheta (\log n)\). In this model a random-access machine can perform arithmetic and bitwise operations on w-bit words in constant time.

3 Combinatorial Log-Shaves for \(\boldsymbol{k}\)-Clique

In this section we provide our improved algorithmic reduction from k-clique to triangle detection (see Theorem 1). At its core, our approach follows a divide-and-conquer strategy for k-clique reminiscent of Chan's algorithm for triangle detection [18], with a simple analysis. We start with the following observation:

Observation 1

(Trivial Reduction from \(\boldsymbol{k}\)-Clique to \(\boldsymbol{(k\mathord -1)}\)-Clique). Let \(k \ge 4\), let f(n) be a nondecreasing function, and assume that there is a combinatorial \((k-1)\)-clique detection algorithm running in time \(O(n^{k-1} / f(n))\). Then there is a combinatorial k-clique detection algorithm running in time

$$\begin{aligned} O\left( \sum _{v \in V_1} \frac{d_2(v)\cdot \dots \cdot d_k(v)}{f(\min \lbrace d_2(v),\dots ,d_k(v) \rbrace )} \right) . \end{aligned}$$

Proof

The algorithm is simple: For each vertex \(v \in V_1\), we construct the subgraph \(G_{v} = G[N_2(v) \cup \dots \cup N_k(v)]\) consisting of all neighbors of v and test whether \(G_{v}\) contains a \((k-1)\)-clique. Let \(n_{v} = d_2(v) + \dots + d_k(v)\) denote the number of vertices in \(G_{v}\). Our intention is to use the efficient \((k-1)\)-clique algorithm; however, simply running the algorithm in time \(O(n_{v}^{k-1} / f(n_{v}))\) is possibly too slow. Instead, we partition each of the \(k-1\) vertex parts in \(G_{v}\) into blocks of size \(d_{v} := \min \lbrace d_2(v), \dots , d_k(v) \rbrace \) (plus one final block of smaller size, respectively). Then, for each combination of \(k-1\) blocks, we use the efficient \((k-1)\)-clique detection algorithm. It is clear that the algorithm is correct, since we exhaustively test every tuple \((v_1, v_2, \dots , v_k)\). For the running time, note that testing whether \(G_{v}\) contains a \((k-1)\)-clique takes time

$$\begin{aligned} \left\lceil \frac{d_2(v)}{d_{v}} \right\rceil \cdot \dots \cdot \left\lceil \frac{d_k(v)}{d_{v}} \right\rceil \cdot O\left( \frac{(d_{v})^{k-1}}{f(d_{v})} \right) = O\left( \frac{d_2(v)\cdot \dots \cdot d_k(v)}{f(\min \lbrace d_2(v),\dots ,d_k(v) \rbrace )} \right) , \end{aligned}$$

and thus the total running time is indeed

$$\begin{aligned} O\left( \sum _{v \in V_1} \frac{d_2(v)\cdot \dots \cdot d_k(v)}{f(\min \lbrace d_2(v),\dots ,d_k(v) \rbrace )} \right) \end{aligned}$$

(possibly after preprocessing the graph in time \(O(n^2)\) to allow for constant-time edge queries. Note that this also covers the cost of constructing \(G_v\) for every \(v\in V_1\)).    \(\square \)
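The reduction in this proof can be sketched compactly; below, the hypothetical callback `kminus1_detect` stands in for the assumed \((k-1)\)-clique detector, and the function name is our own:

```python
from itertools import product

def detect_via_blocks(parts, adj, kminus1_detect):
    """Observation 1, sketched: for each v in V_1, restrict the other
    parts to v's neighborhoods, cut each neighborhood into blocks of
    size d_v = min_i d_i(v), and run the assumed (k-1)-clique detector
    on every combination of blocks."""
    V1, rest = parts[0], parts[1:]
    for v in V1:
        neigh = [[u for u in Vi if adj[v][u]] for Vi in rest]
        d = min(len(N) for N in neigh)
        if d == 0:
            continue  # v cannot participate in a k-clique
        blocks = [[N[i:i + d] for i in range(0, len(N), d)] for N in neigh]
        if any(kminus1_detect(list(combo), adj) for combo in product(*blocks)):
            return True  # a (k-1)-clique among v's neighbors => k-clique
    return False

# Tiny check with k = 3: the (k-1)-clique detector is just an edge test.
adj = [[0] * 6 for _ in range(6)]
for u, v in [(0, 2), (0, 4), (2, 4)]:
    adj[u][v] = adj[v][u] = 1
edge_between = lambda ps, adj: any(adj[a][b] for a in ps[0] for b in ps[1])
assert detect_via_blocks([[0, 1], [2, 3], [4, 5]], adj, edge_between)
```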

Before moving to the formal proof of Theorem 1, let us give a simplified high-level description of this algorithmic reduction in the specific case of 4-clique. For a given 4-partite graph \((V_{1},V_{2},V_{3},V_{4})\), the core idea is the following: If the degrees in \(V_{1}\) tend to be small, i.e. if for every \(v\in V_{1}\) we have \(d_{2}(v)\cdot d_{3}(v)\cdot d_{4}(v)\le \alpha \cdot |V_{2}|\cdot |V_{3}|\cdot |V_{4}|\) for some fraction \(\alpha \approx \frac{1}{\log n}\), then we can apply Observation 1. Otherwise, there is a heavy vertex \(v\in V_{1}\) with \(d_{2}(v)\cdot d_{3}(v)\cdot d_{4}(v)>\alpha \cdot |V_{2}|\cdot |V_{3}|\cdot |V_{4}|\). In this case, we will check every triplet of the form \((u,w,z)\in N_{2}(v)\times N_{3}(v)\times N_{4}(v)\). If any of these triplets form a triangle, we have detected a 4-clique. Otherwise, we have learned that no triplet in \(N_{2}(v)\times N_{3}(v)\times N_{4}(v)\) is part of a 4-clique. We will therefore recurse in such a way that ensures we never test these triplets again and thereby make sufficient progress.

Proof

Assume that there is a combinatorial triangle detection algorithm which runs in time \(O(n^3 (\log n)^a (\log \log n)^b)\). We prove the claim by induction on k. The base case (\(k = 3\)) is immediate from this assumed triangle detection algorithm.

For the inductive step, consider the following recursive algorithm to detect a k-clique in a given k-partite graph \((V_1, \dots , V_k, E)\). Let D and \(\alpha \) be parameters to be determined later and let d be initialized to 0.

KCliqueRec\((G=(V_1,\ldots ,V_k,E), d)\):

  1. If \(d=D\), meaning depth D in the recursion is reached, perform exhaustive search. Return YES if a k-clique was detected, otherwise NO.

  2. Test whether there is some \(v \in V_1\) with \(d_2(v) \cdot \ldots \cdot d_k(v) \ge \alpha \cdot |V_2| \cdot \ldots \cdot |V_k|\). If such a vertex exists:

     a. Test whether the subgraph \(G_v\) induced by \(N_2(v) \cup \dots \cup N_k(v)\) contains a \((k-1)\)-clique by exhaustive search. If it does, return YES, since this means we have found a k-clique involving v.

     b. For \(2 \le i \le k\), partition \(V_i\) into \(V_{i, 0} = V_i \setminus N_i(v)\) and \(V_{i, 1} = V_i \cap N_i(v)\). Recursively solve the \(2^{k-1} - 1\) subproblems on \((V_1, V_{2, i_2}, \dots , V_{k, i_k})\) for \((i_2, \dots , i_k) \in \lbrace 0, 1 \rbrace ^{k-1} \setminus \lbrace 1^{k-1} \rbrace \), while incrementing the depth. In other words, for each \((i_2, \dots , i_k) \in \lbrace 0, 1 \rbrace ^{k-1} \setminus \lbrace 1^{k-1} \rbrace \), call KCliqueRec\((G[V_1 \cup V_{2, i_2} \cup \dots \cup V_{k, i_k}], d+1)\).

     c. If any of the calls returned YES, return YES. Otherwise, return NO.

  3. Solve the instance using Observation 1.
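For intuition, KCliqueRec can be sketched in code. This is a simplified sketch: step 3 falls back to exhaustive search instead of Observation 1, which preserves correctness but not the claimed running time, and the function names are our own.

```python
from itertools import product, combinations

def is_clique(vs, adj):
    return all(adj[a][b] for a, b in combinations(vs, 2))

def kclique_rec(parts, adj, alpha, depth, D):
    """Sketch of KCliqueRec on parts = [V_1, ..., V_k]."""
    V1, rest = parts[0], parts[1:]

    def exhaustive():
        return any(is_clique((v,) + tup, adj)
                   for v in V1 for tup in product(*rest))

    if depth == D:                       # step 1: depth cap reached
        return exhaustive()
    for v in V1:                         # step 2: look for a heavy vertex
        neigh = [set(u for u in Vi if adj[v][u]) for Vi in rest]
        dprod = sprod = 1
        for N, Vi in zip(neigh, rest):
            dprod, sprod = dprod * len(N), sprod * len(Vi)
        if dprod >= alpha * sprod:
            # 2a: a (k-1)-clique among v's neighbors yields a k-clique
            if any(is_clique(tup, adj) for tup in product(*neigh)):
                return True
            # 2b: recurse on all combinations except (V_{2,1}, ..., V_{k,1})
            for bits in product((0, 1), repeat=len(rest)):
                if all(bits):
                    continue
                sub = [[u for u in Vi if (u in N) == bool(b)]
                       for Vi, N, b in zip(rest, neigh, bits)]
                if kclique_rec([V1] + sub, adj, alpha, depth + 1, D):
                    return True
            return False                 # 2c: no recursive call succeeded
    return exhaustive()                  # step 3 (Observation 1 in the proof)
```

A small tripartite instance (k = 3) exercises both the heavy-vertex branch and the recursion; the triangle 1-3-5 is found inside a subproblem after the heavy vertex 0 fails step 2a.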

Correctness. As soon as the algorithm reaches recursion depth D, the algorithm will correctly detect a k-clique in step 1. In earlier levels of the recursion, the algorithm first attempts to find a vertex v with \(d_2(v) \cdot \ldots \cdot d_k(v) \ge \alpha \cdot |V_2| \cdot \ldots \cdot |V_k|\) in step 2. If this succeeds, we test whether v is involved in a k-clique (and terminate in this case). Otherwise, we recurse on \((V_1, V_{2, i_2}, \dots , V_{k, i_k})\) for all combinations \((i_2, \dots , i_k) \in \lbrace 0, 1 \rbrace ^{k-1} \setminus \lbrace 1^{k-1} \rbrace \). Note that we can indeed ignore the instance \((V_1, V_{2, 1}, \dots , V_{k, 1})\) knowing that \((V_{2, 1}, \dots , V_{k, 1})\) does not contain a \((k-1)\)-clique. If the condition in step 2 is not satisfied, we instead correctly solve the instance by means of Observation 1 (which reduces the problem to an instance of \((k-1)\)-clique).

Running Time. Imagine a recursion tree in which every node corresponds to an execution of the algorithm; the root corresponds to the initial call and child nodes correspond to recursive calls. Thus, every node in the tree is either a leaf (indicating that this execution does not spawn recursive calls), or an internal node with fan-out exactly \(2^{k-1} - 1\). The time at a node is the running time of the respective call of the algorithm (ignoring the cost of further recursive calls). In other words, the time at a node is the amount of local work performed in the corresponding call. To bound the total running time of the algorithm, we bound the total time across all nodes in the recursion tree.

We analyze the contributions of all steps individually. Let us introduce some notation first: At a node x in the recursion tree, let \((V_1^x, \dots , V_k^x)\) denote the instance associated to the respective invocation. We similarly write \(d_2^x(v), \dots , d_k^x(v)\).

Cost of Step 1. Note that at any node x at depth D in the recursion tree, the time is \(O(|V_1^x| \cdot \ldots \cdot |V_k^x|)\) since we solve the instance by exhaustive search. Next, observe that for any internal node x in the recursion tree, we have that

$$\begin{aligned} & |V_1^x| \cdot \ldots \cdot |V_k^x| = |V_1^x| \cdot \sum _{(i_2, \dots , i_k) \in \lbrace 0, 1 \rbrace ^{k-1}} |V_{2, i_2}^x| \cdot \ldots \cdot |V_{k, i_k}^x| \\ & \quad \,\,\,\, \ge |V_1^x| \cdot d_2^x(v) \cdot \ldots \cdot d_k^x(v) + \sum _{y\text { child of }x} |V_1^y| \cdot \ldots \cdot |V_k^y| \\ & \quad \,\,\,\, \ge \alpha \cdot |V_1^x| \cdot \ldots \cdot |V_k^x| + \sum _{y\text { child of }x} |V_1^y| \cdot \ldots \cdot |V_k^y|, \end{aligned}$$

and thus

$$\begin{aligned} \sum _{y\text { child of }x} |V_1^y| \cdot \ldots \cdot |V_k^y| \le (1 - \alpha ) \cdot |V_1^x| \cdot \ldots \cdot |V_k^x|. \end{aligned}$$

It follows by induction that at any depth \(d \le D\) in the recursion tree, we have that

$$\begin{aligned} \sum _{x\text { at depth }d} |V_1^x| \cdot \ldots \cdot |V_k^x| \le (1 - \alpha )^d n^k. \end{aligned}$$

In particular, the total time of all nodes at depth D is bounded by \(O((1 - \alpha )^D n^k)\).

Cost of Step 2. Note that the number of nodes in our recursion tree is at most \(2^{k D}\) since the recursion tree has degree \(\le 2^k\) and the recursion depth is capped at D. At each node, the time of step 2a is bounded by \(O(n^{k-1})\) and the cost of step 2b is bounded by \(O(n^2)\). Therefore, the total time of step 2 across all nodes is bounded by \(O(2^{k D} n^{k-1})\).

Cost of Step 3. By induction we have obtained a \((k-1)\)-clique algorithm in time \(O(n^{k-1} / f(n))\), where \(f(n) = (\log n)^{-a + k - 4} (\log \log n)^{-b - (k - 4)}\). Therefore, by Observation 1 the total time of step 3 across all nodes x in the recursion tree is

$$\begin{aligned} O\left( \sum _{x\text { leaf}} \sum _{v \in V_1^x} \frac{d_2^x(v) \cdot \ldots \cdot d_k^x(v)}{f(\min \lbrace d_2^x(v), \dots , d_k^x(v) \rbrace )} \right) . \end{aligned}$$

To bound this quantity, we distinguish two subcases: A pair (x, v) (where x is a leaf in the recursion tree and \(v \in V_1^x\)) is called relevant if \(d_2^x(v), \dots , d_k^x(v) \ge \sqrt{n}\) (where n is the initial number of vertices). On the one hand, it is easy to bound the total cost of all irrelevant pairs by

$$\begin{aligned} O\left( \sum _{(x, v)\text { irrelevant}} \frac{d_2^x(v) \cdot \ldots \cdot d_k^x(v)}{f(\min \lbrace d_2^x(v), \dots , d_k^x(v) \rbrace )} \right) \le O(2^{kD} n^{k-1/2}), \end{aligned}$$

since there are at most \(2^{kD}\) nodes in the recursion tree. On the other hand, for any relevant pair (xv), we have \(\min \lbrace d_2^x(v), \dots , d_k^x(v) \rbrace \ge \sqrt{n}\). Moreover, since we reach step 3 of the algorithm we further know that \(d_2^x(v) \cdot \ldots \cdot d_k^x(v) \le \alpha |V_2^x| \cdot \ldots \cdot |V_k^x|\) (as otherwise the condition in step 2 had triggered). It follows that

$$\begin{aligned} & O\left( \sum _{(x, v)\text { relevant}} \frac{d_2^x(v) \cdot \ldots \cdot d_k^x(v)}{f(\min \lbrace d_2^x(v), \dots , d_k^x(v) \rbrace )} \right) \\ & \quad \,\,\,\, \le O\left( \sum _{(x, v)\text { relevant}} \frac{\alpha |V_2^x| \cdot \ldots \cdot |V_k^x|}{f(\sqrt{n})} \right) \\ & \quad \,\,\,\, \le O\left( n^k \cdot \frac{\alpha }{f(\sqrt{n})} \right) . \end{aligned}$$

Choosing the Parameters. Summing over all contributions computed before, the total running time is bounded by

$$\begin{aligned} O\left( n^k \cdot (1 - \alpha )^D + n^k \cdot \frac{\alpha }{f(\sqrt{n})} + n^{k - 1/2} \cdot 2^{kD} \right) . \end{aligned}$$

We pick \(D = \log n / (4k)\) such that the latter term becomes \(n^{k - 1/4}\). Next, we pick \(\alpha = (k - a) \log \log n / D = \varTheta ((\log n)^{-1} \log \log n)\) such that the first term becomes

$$\begin{aligned} n^k \cdot (1 - \alpha )^D \le n^k \cdot 2^{-\alpha D} \le n^k (\log n)^{a-k}. \end{aligned}$$

All in all, the total running time is dominated by the second term

$$\begin{aligned} & n^k \cdot \frac{\alpha }{f(\sqrt{n})} \le O(n^k \cdot \alpha \cdot (\log n)^{a-(k-4)} (\log \log n)^{b+k-4}) \\ & \quad \,\,\,\, \le O(n^k (\log n)^{a-(k-3)} (\log \log n)^{b+k-3}), \end{aligned}$$

which is as claimed.    \(\square \)

4 Combinatorial Log-Shaves for Triangle Listing by Weak Regularity

In this section we briefly outline our triangle listing algorithm, which is based on Bansal and Williams' BMM algorithm [7]. Our contribution lies in reformulating and reanalyzing their algorithm for the purpose of triangle listing, achieving Theorem 2. Note that we cannot achieve the running time stated in the theorem by applying state-of-the-art black-box reductions from triangle listing to Boolean matrix multiplication [37].

The two key ingredients are pseudoregularity and the following lemma, which applies the Four-Russians technique to sparse graphs (see the full paper for a discussion of pseudoregularity and the proof of the lemma).

Lemma 1

(Sparse Four-Russians). There is an algorithm which lists up to t triangles in a given graph \((V_1, V_2, V_3, E)\) (with \(n = \min \lbrace |V_1|, |V_2|, |V_3| \rbrace \)) within the time bound stated in the full paper, which improves on \(O(n^3)\) when the graph is sparse.

Let us give an informal overview of the algorithm. For a given tripartite graph \(G = (V_1, V_2, V_3, E)\), we first compute an \(\varepsilon \)-pseudoregular partition of the bipartite graph \(G[V_2 \cup V_3]\). We then distinguish between two types of pieces: pieces with low density (less than \(\sqrt{\varepsilon }\)) and pieces with high density. Based on this we divide the instance into two triangle listing instances: \(G_L\), which only includes edges connecting low-density parts between \(V_2\) and \(V_3\) in G, and its complement \(G_H\), consisting of edges connecting the high-density parts between \(V_2\) and \(V_3\). In the former case we can benefit from the sparseness (by construction, the total number of edges in \(G_L\) is at most \(\sqrt{\varepsilon } n^2\)). In the latter case, due to the pseudoregularity, there must be many triangles in \(G_H\). We can thus charge the extra cost of computing with \(G_H\) towards the output size. For the complete specification, refer to the full paper.
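The density split itself can be sketched as follows, assuming the partition of \(V_2\) and \(V_3\) is already given (computing an \(\varepsilon \)-pseudoregular partition is the nontrivial ingredient deferred to the full paper; the function name is our own):

```python
import math

def split_by_density(V2_parts, V3_parts, adj, eps):
    """Classify each pair of parts (P, Q) between V_2 and V_3 as low
    density (< sqrt(eps)) or high density, and split the edges
    accordingly into the sparse instance G_L and the dense instance
    G_H, as in the overview above."""
    low_edges, high_edges = [], []
    thresh = math.sqrt(eps)
    for P in V2_parts:
        for Q in V3_parts:
            pair_edges = [(u, w) for u in P for w in Q if adj[u][w]]
            if len(pair_edges) < thresh * len(P) * len(Q):
                low_edges += pair_edges   # edges of G_L
            else:
                high_edges += pair_edges  # edges of G_H
    return low_edges, high_edges

# One sparse pair of parts: a single edge out of 4 has density 0.25 < 0.5.
adj = [[0] * 4 for _ in range(4)]
adj[0][2] = adj[2][0] = 1
lo, hi = split_by_density([[0, 1]], [[2, 3]], adj, 0.25)
assert lo == [(0, 2)] and hi == []
```

By construction, the edges collected in `low_edges` number at most \(\sqrt{\varepsilon }\,|V_2||V_3|\), matching the sparsity bound used above.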

5 Combinatorial Log-Shaves for \(\boldsymbol{k}\)-Hyperclique

In this section we give an intuitive description of the algorithm in the simplest case \(k=4\), \(r=3\) (detecting a 4-clique in a 3-uniform hypergraph faster than \(O(n^4)\) time); for the complete and general specification, refer to the full paper. We are given a 4-partite 3-uniform hypergraph \(G = (V_1, V_2, V_3, V_4, E)\) with vertex sets of size n. For each \(v\in V_{1}\), we can define a tripartite graph \(G_{v} = (V_{2},V_{3},V_{4},E')\) in which we draw an edge between two vertices if and only if they share a hyperedge with v in G. It is easy to check that there is a 4-hyperclique in G if and only if there are vertices \(v_2, v_3, v_4\) that form a triangle in \(G_v\) and additionally form a hyperedge in G. The naïve search for such a triplet would take \(O(n^3)\) time, and we present an algorithm that accelerates this search:

  1. Let \(s=\sqrt{c \log n}\) for some small constant \(c>0\), and partition \(V_{2}\), \(V_{3}\) and \(V_{4}\) each into \(g=\lceil n/s\rceil \) blocks of size at most s. We let \(V_{i,j}\) denote the j-th block in \(V_{i}\).

  2. For every combination \(j_{2},j_{3},j_{4}\in \left[ g\right] \):

     a. Create a lookup table \(T_{j_{2},j_{3},j_{4}}\) with an entry for every possible tripartite graph on the vertex sets \(V_{2,j_{2}},V_{3,j_{3}},V_{4,j_{4}}\) (there are \(2^{s^{2}}= n^{c}\) such graphs).

     b. For every entry corresponding to a graph \(G'\), store whether \(G'\) has a triangle that is a hyperedge in G.

Note that this preprocessing is fast: We construct \(\frac{n^{3}}{s^{3}}\) tables, each consisting of \(n^{c}\) entries, and each entry takes \(O(s^{3})\) time to determine. So, the total preprocessing time is \(O(n^{3+c})\). Given these tables we can now search for a 4-clique more efficiently: For each \(v\in V_{1}\) we break \(G_{v}\) into triples of blocks as before, and query \(T_{j_{2},j_{3},j_{4}}\) for the graphs \(G_{v}[V_{2,j_{2}} \cup V_{3,j_{3}} \cup V_{4,j_{4}}]\), for all \(j_2, j_3, j_4\). If one of the answers is positive, we have found a hyperclique. Assuming every query is performed in constant time, the running time is determined by the number of queries, which is

$$\begin{aligned} O\left( n \cdot \frac{n^{3}}{s^{3}} \right) = O\left( \frac{n^{4}}{(\log n)^{1.5}} \right) . \end{aligned}$$

All that is left now is to justify the assumption that every query is performed in constant time. The main question is: given \(v\in V_{1}\) and a combination of blocks \(V_{2,j_{2}},V_{3,j_{3}},V_{4,j_{4}}\), how can we determine the key corresponding to \(G_{v}[V_{2,j_{2}} \cup V_{3,j_{3}} \cup V_{4,j_{4}}]\) in \(T_{j_{2},j_{3},j_{4}}\) in constant time? For this purpose, we define in the proof a compact representation of tripartite graphs (on vertex sets of size s) used to index the tables \(T_{j_2, j_3, j_4}\). This compact representation is chosen in a way that allows us to efficiently precompute the compact representations of all such graphs \(G_v[V_{2,j_{2}} \cup V_{3,j_{3}} \cup V_{4,j_{4}}]\).
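To illustrate such a compact representation (our own simplified variant, not necessarily the paper's exact scheme), one can pack the up to \(3s^2\) cross-block edges of a tripartite graph into the bits of an integer and index the table by that integer:

```python
from itertools import product

def block_key(vs2, vs3, vs4, edge):
    """Compact representation: encode a tripartite graph on three small
    blocks as an integer, one bit per potential cross-block edge."""
    key, bit = 0, 0
    for A, B in [(vs2, vs3), (vs2, vs4), (vs3, vs4)]:
        for a in A:
            for b in B:
                if edge(a, b):
                    key |= 1 << bit
                bit += 1
    return key

def build_table(vs2, vs3, vs4, hyperedges):
    """T[key] = True iff the graph encoded by key has a triangle
    (v2, v3, v4) that is also a hyperedge of G."""
    pairs = [(a, b) for A, B in [(vs2, vs3), (vs2, vs4), (vs3, vs4)]
             for a in A for b in B]          # same bit order as block_key
    pos = {p: i for i, p in enumerate(pairs)}
    return [
        any((key >> pos[(a, b)]) & 1 and (key >> pos[(a, c)]) & 1
            and (key >> pos[(b, c)]) & 1
            and frozenset((a, b, c)) in hyperedges
            for a, b, c in product(vs2, vs3, vs4))
        for key in range(1 << len(pairs))
    ]
```

At query time, the key of \(G_v[V_{2,j_{2}} \cup V_{3,j_{3}} \cup V_{4,j_{4}}]\) would be computed with the edge predicate "u and w share a hyperedge with v"; precomputing all these keys within the stated time budget is exactly what the proof in the full paper supplies.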