Abstract
The girth of a graph is the length of its shortest cycle. Due to its relevance in graph theory, network analysis and practical fields such as distributed computing, girth-related problems have been the object of attention in both past and recent literature. In this paper, we consider the problem of listing connected subgraphs with bounded girth. As a large girth is an indicator of sparsity, this allows one to extract sparse structures from the input graph. We propose two algorithms, for enumerating respectively vertex induced subgraphs and edge induced subgraphs with bounded girth, both running in \(O(n)\) amortized time per solution and using \(O(n^3)\) space. Furthermore, the algorithms can be easily adapted to relax the connectivity requirement and to deal with weighted graphs. As a byproduct, the second algorithm can be used to answer the well known question of finding the densest n-vertex graph(s) of girth k.
1 Introduction
We consider the problem of finding all subgraphs and induced subgraphs with girth at least k of a graph. The girth is a measure of sparsity, as graphs with large girth are inherently sparse. This corresponds to finding sparse substructures of the given graph, a problem that has been considered in several forms [5, 9] and has applications in network analysis. In particular, this problem generalizes two well studied problems, i.e., listing all subtrees and induced subtrees [7, 13,14,15]. Indeed, any graph with girth larger than n cannot contain a cycle, i.e., it is a tree or a forest.
A subgraph enumeration problem, given a graph G and some constraint \(\mathcal {R}\), consists in outputting all the subgraphs satisfying \(\mathcal {R}\) without duplicates. The efficiency of enumeration algorithms is often measured with respect to both the size of the input and that of the output, i.e., the number of solutions: an enumeration algorithm is called an amortized polynomial time algorithm if it runs in \(O(M\cdot poly(N))\) time, where N is the input size and M is the number of solutions. Furthermore, the algorithm is said to have polynomial delay if the maximum time elapsed between two consecutive outputs is polynomial.
In this paper, we present two amortized polynomial time algorithms for enumerating subgraphs of girth at least k. The first, EBG-IS, enumerates induced subgraphs, while the second, EBG-S, enumerates edge subgraphs (also simply called subgraphs). Both EBG-IS and EBG-S run in \(O(n\left| \mathcal {S}\right| )\) time using \(O(n^3)\) space, where n is the number of nodes in G and \(\mathcal {S}\) is the set of all solutions. The proposed algorithms consider the enumeration of connected subgraphs in simple graphs. However, with trivial changes both algorithms can be applied to the enumeration of non-connected subgraphs and to weighted graphs, with the same time and space complexity. In these problems, the upper bounds on the number of solutions are \(O(2^n)\) and \(O(2^m)\), respectively, where m is the number of edges. Hence, the brute force algorithms are optimal if we evaluate the efficiency of algorithms only in terms of the input size. To obtain a more efficient algorithm, it is therefore important to reduce the amortized complexity [10]. Indeed, our implementation of EBG-S (see Note 1) is almost 560 times faster than the brute force algorithm when the input graph is the complete graph \(K_8\) and the girth is four.
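As a rough illustration of the size of the search space in the \(K_8\) experiment, assume that the brute force baseline tests every edge subset of the input (an assumption on our part; the baseline is not specified in more detail here). Then

\[ m = \binom{8}{2} = 28, \qquad 2^{m} = 2^{28} = 268{,}435{,}456, \]

so such a baseline would examine about \(2.7 \times 10^{8}\) edge subsets, most of which contain a triangle and are therefore not solutions for girth four.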
While the problem of efficiently enumerating subgraphs with bounded girth has been considered for directed graphs [6], to the best of our knowledge, there is no known efficient algorithm for the undirected version of the problem (see Note 2).
An early result on girth computation is the algorithm by Itai and Rodeh [8], that finds the girth of a graph in \(O(nm)\) time. In more recent work, the problem was also solved in linear time for planar graphs [4]. However, the problem we consider involves computing the girth of many subgraphs, so relying on these algorithms is not efficient.
A prominent question related to the girth is finding exactly how dense a graph of given girth can be: the maximum number of edges in a d-regular graph with girth k is bounded by the well known Moore bound [2], which Alon et al. later extended to irregular graphs [1]. Erdős conjectured that there exists a graph with \(\varOmega (n^{1 + 1/k})\) edges and girth \(2k + 1\) [12]. On the other hand, some have focused on giving practical lower bounds, i.e., finding ways to generate graphs of given girth as dense as possible [3, 11]. We remark that our proposed algorithm EBG-S can match theory and practice: the densest n-vertex graph of girth k can be found as a subgraph of the complete graph \(K_n\). While this may not be practical for large values of n, it significantly improves upon the brute force approach by avoiding the generation of subgraphs with girth less than k.
2 Preliminaries
Let \(G = (V(G), E(G))\) be a simple undirected graph with no self-loops, with vertex set V(G) and edge set \(E(G) \subseteq V(G)\times V(G)\). Two vertices u and v are adjacent (or neighbors) if there is an edge \(e = \{u,v\} \in E(G)\) joining them. We call e incident to v and we denote the set of edges incident to v by E(v). The set of neighbors of u in G is called its neighborhood and denoted by \(N_G(u)\), and the size of \(N_G(u)\) is called the degree of u in G. Let \(N_G[u] = N_G(u) \cup \{u\}\) be the closed neighborhood of u. The set of neighbors of \(U \subseteq V\) is defined as \(N_G(U) = \bigcup _{u \in U}N_G(u) \setminus U\). Similarly, \(N_G[U]\) denotes \(N_G(U) \cup U\). For any vertex subset \(S \subseteq V\), we call \(G[S] = (S, E[S])\) an induced subgraph, where \(E[S] = E(G) \cap (S \times S)\). Since G[S] is uniquely determined by S, we sometimes identify G[S] with S. For any edge subset \(E' \subseteq E\), we call \(G[E'] = (V'(E'), E')\) an edge induced subgraph, where \(V'(E') = \bigcup _{\{u, v\} \in E'} \{u, v\}\). We define \(G \setminus \{e\} = (V, E \setminus \{e\})\) and \(G\setminus \{v\} = G[V\setminus \{v\}]\). For simplicity, we use \(v \in G\) and \(e \in G\) to refer to \(v \in V(G)\) and \(e \in E(G)\), respectively. If G is clear from the context, we will also use simplified notation such as V, E, N(u) instead of V(G), E(G), \(N_G(u)\).
A sequence \(P = (v_1, \dots , v_{k+1})\) of distinct vertices is a path from \(v_1\) to \(v_{k+1}\) (\(v_1\)-\(v_{k+1}\) path for short) in \(G = (V, E)\) if for any \(i \in [1, k]\), \(\{v_i, v_{i+1}\} \in E\). P is a shortest path between two vertices if there is no shorter path between them. Let us denote by V(P) and E(P) the set of vertices and edges in P, respectively. We say that G is connected if for any two vertices \(u, v \in V\), there is a u-v path. We say that a sequence \(C = (v_1,\dots , v_{k+1})\) of vertices is a cycle if \((v_1, \dots , v_{k})\) is a \(v_1\)-\(v_{k}\) path, \(v_{k+1} = v_1\), and \(\{v_k, v_{k+1}\} \in E\). The length of a path or cycle is defined by its number of edges. The distance between two vertices is the length of a shortest path between them. The girth of G, denoted by g(G), is the length of a shortest cycle in G. For simplicity, we say that G has girth k if \(g(G)\ge k\). The girth of acyclic graphs is usually assumed to be \(\infty \).
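The definition of girth above can be made concrete with a small sketch. The function below is a simple BFS-from-every-vertex computation, not the faster algorithms cited in this paper; the name `girth` and the adjacency-dictionary representation are assumptions made for illustration.

```python
from collections import deque
from math import inf


def girth(adj):
    """Length of a shortest cycle of a simple undirected graph (inf if acyclic).

    adj maps every vertex to the set of its neighbours (hypothetical
    representation chosen for this sketch).
    """
    best = inf
    for s in adj:
        dist = {s: 0}
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # Non-tree edge {u, v}: together with the two BFS tree
                    # paths it closes a walk of this length containing a cycle.
                    best = min(best, dist[u] + dist[v] + 1)
    return best
```

Each BFS costs \(O(n + m)\) time, so this sketch runs in \(O(n(n+m))\) time overall. Taking the minimum over all start vertices is what makes the result exact, since a BFS started on a shortest cycle finds its true length.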
We define our problems as follows; Fig. 1 shows examples of solutions to Problem 1 and Problem 2. If we stored all outputs, it would be easy to avoid duplicates; our algorithms avoid duplicates while using only polynomial space.
Problem 1
(k-girth connected induced subgraph enumeration). Enumerate all connected induced subgraphs S of a graph G with \(g(S)\ge k\), without duplicates.
Problem 2
(k-girth connected subgraph enumeration). Enumerate all connected subgraphs S of a graph G with \(g(S)\ge k\), without duplicates.
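To make the two problem statements concrete, the following is a minimal exhaustive-search sketch that tests every vertex subset (Problem 1) or edge subset (Problem 2). It is only a reference baseline under our own naming (`brute_force_induced`, `brute_force_edge` and the helpers are hypothetical), not one of the algorithms proposed in this paper.

```python
from collections import deque
from itertools import combinations
from math import inf


def girth(adj):
    """Shortest cycle length via BFS from every vertex (inf if acyclic)."""
    best = inf
    for s in adj:
        dist, parent, queue = {s: 0}, {s: None}, deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v], parent[v] = dist[u] + 1, u
                    queue.append(v)
                elif parent[u] != v:
                    best = min(best, dist[u] + dist[v] + 1)
    return best


def is_connected(adj):
    """True if the graph is connected (the empty graph counts as connected)."""
    if not adj:
        return True
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for y in adj[stack.pop()]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(adj)


def brute_force_induced(adj, k):
    """Problem 1: all vertex sets S with G[S] connected and g(G[S]) >= k."""
    solutions = []
    vertices = sorted(adj)
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            S = set(subset)
            H = {v: adj[v] & S for v in S}
            if is_connected(H) and girth(H) >= k:
                solutions.append(frozenset(S))
    return solutions


def brute_force_edge(edges, k):
    """Problem 2: all edge sets F with G[F] connected and g(G[F]) >= k."""
    solutions = []
    for r in range(len(edges) + 1):
        for subset in combinations(edges, r):
            H = {}
            for u, v in subset:
                H.setdefault(u, set()).add(v)
                H.setdefault(v, set()).add(u)
            if is_connected(H) and girth(H) >= k:
                solutions.append(frozenset(subset))
    return solutions
```

On a triangle with \(k = 4\), both functions return seven solutions: everything except the triangle itself. The empty set is counted as a solution, in line with the convention used later for the initial iteration.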
3 Enumeration by Binary Partition
The binary partition method is one of the fundamental frameworks for designing enumeration algorithms. Typically, a binary partition algorithm \(\mathcal {A}\) has the following structure: first \(\mathcal {A}\) picks an element x of the input, then divides the search space into two disjoint spaces, one containing the solutions that include x, and one those that do not. \(\mathcal {A}\) recursively executes the above step until all elements are picked. Whenever the search space contains exactly one solution, \(\mathcal {A}\) outputs it. We call each dividing step an iteration.
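The generic scheme can be sketched on a simpler problem. The example below enumerates the independent sets of a graph by binary partition; the choice of independent sets and the function name `independent_sets` are ours, used only to illustrate the include/exclude structure described above.

```python
def independent_sets(adj):
    """Yield every independent set of the graph exactly once.

    adj maps every vertex to the set of its neighbours.
    """
    vertices = sorted(adj)

    def rec(i, chosen):
        if i == len(vertices):
            # Every element has been picked: the search space now
            # contains exactly one solution, so output it.
            yield frozenset(chosen)
            return
        x = vertices[i]
        # Subspace 1: solutions that do not include x.
        yield from rec(i + 1, chosen)
        # Subspace 2: solutions that include x (skipped if it cannot
        # contain any solution, i.e. x has a neighbour in chosen).
        if not (adj[x] & chosen):
            chosen.add(x)
            yield from rec(i + 1, chosen)
            chosen.remove(x)

    yield from rec(0, set())
```

Because the two subspaces of every iteration are disjoint, no duplicate check is needed and the memory used is proportional to the recursion depth.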
Algorithm EBG, detailed in Algorithm 1, represents a basic strategy for Problem 1. Algorithm 1 is based on binary partition, although each iteration divides the search space in more than two subspaces. While EBG enumerates solutions by picking vertices on each iteration, we can obtain an enumeration algorithm for Problem 2 by modifying EBG so that it picks edges instead.
Let G, X, and S(X) be respectively an input graph, an iteration, and the solution received by the iteration X. A vertex \(v \notin S(X)\) is a candidate vertex for S(X) if \(g(S(X) \cup \{v\}) \ge k\) and \(S(X)\cup \{v\}\) is connected, that is, the addition of a candidate vertex generates a new solution. Let \(C\left( S(X)\right) \) be a set of candidate vertices for S(X). We call \(C\left( S(X)\right) \) the candidate set of S(X). Now, suppose that X generates new iterations \(Y_1, \dots , Y_d\) by adding vertices in \(C\left( S(X)\right) = \{v_1, \dots , v_d\}\) on line 7. For each i, we say that X is the parent of \(Y_i\), and \(Y_i\) is a child of X. Note that, on iteration \(Y_i\) and its descendant iterations, EBG outputs solutions that do not include \(v_1, \dots , v_{i-1}\) but do include \(v_i\). This implies that the solution space of \(Y_i\) is disjoint from those of each \(Y_{j<i}\) created so far, i.e., EBG divides the solution space of X in d disjoint subspaces. The only iteration without a parent is the one generated on line 2, which we call the initial iteration and denote by I. We remark that \(S(I) = \emptyset \) and that \(\emptyset \) is a solution.
By using the above parent-child relation, we introduce the enumeration tree \(\mathcal {T}(G) = \mathcal {T} = (\mathcal {V}, \mathcal {E})\). Here, \(\mathcal {V}\) is the set of iterations of EBG for G and \(\mathcal {E}\) is a subset of \(\mathcal {V} \times \mathcal {V}\). For any pair of iterations X and Y, \((X, Y) \in \mathcal {E}\) if and only if X is the parent of Y. We can observe that \(\mathcal {T}\) has no cycles since every child iteration of X receives a solution whose size is larger than that of S(X). In addition, each iteration other than the initial iteration has exactly one parent. This implies that the initial iteration is an ancestor of all iterations and thus \(\mathcal {T}\) is connected. Thus, \(\mathcal {T}\) forms a tree. The next three lemmas show the correctness of EBG. Due to space limitations, we omit some proofs (which can be found in the appendix).
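The strategy described above can be sketched as follows. This is our own reconstruction from the prose, not a transcription of Algorithm 1: the names `ebg`, `candidates`, `excluded` and the helper functions are assumptions, and the candidate set is recomputed from scratch with a plain BFS-based girth routine.

```python
from collections import deque
from math import inf


def girth(adj):
    """Shortest cycle length via BFS from every vertex (inf if acyclic)."""
    best = inf
    for s in adj:
        dist, parent, queue = {s: 0}, {s: None}, deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v], parent[v] = dist[u] + 1, u
                    queue.append(v)
                elif parent[u] != v:
                    best = min(best, dist[u] + dist[v] + 1)
    return best


def is_connected(adj):
    """True if the graph is connected (the empty graph counts as connected)."""
    if not adj:
        return True
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for y in adj[stack.pop()]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(adj)


def ebg(adj, k):
    """Yield each connected induced subgraph of girth >= k as a vertex set."""

    def candidates(S, excluded):
        # Vertices whose addition yields a new solution; excluded vertices
        # play the role of vertices already removed from the graph.
        result = []
        for v in sorted(adj):
            if v in S or v in excluded:
                continue
            T = S | {v}
            H = {x: adj[x] & T for x in T}
            if is_connected(H) and girth(H) >= k:
                result.append(v)
        return result

    def rec(S, excluded):
        yield frozenset(S)            # every iteration outputs its solution
        done = set()
        for v in candidates(S, excluded):
            # The child for v must avoid all candidates tried before v,
            # which keeps the subspaces of the children disjoint.
            yield from rec(S | {v}, excluded | done)
            done.add(v)

    yield from rec(set(), set())
```

On the 4-cycle this sketch yields 13 solutions for \(k = 5\) and 14 for \(k = 4\); the only difference is the cycle itself.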
Lemma 1
Let G be a simple undirected graph and k a positive integer. Then, every output of \({\mathtt {EBG}}\) induces a connected subgraph of girth k.
Lemma 2
If X and Y are two distinct iterations on \({\mathtt {EBG}}\), then \(S(X) \ne S(Y)\).
Lemma 3
Let G be a simple undirected graph and k a positive integer. \({\mathtt {EBG}}\) \(\mathtt {(}{G,k}\mathtt {)}\) outputs all connected induced subgraphs with girth k in G exactly once.
Proof
By Lemma 1, \({\mathtt {EBG}}\) outputs only solutions, and by Lemma 2 it outputs no solution more than once. We show that \({\mathtt {EBG}}\) outputs all solutions by induction. Let S be a solution. If \(\left| S\right| = 0\), \({\mathtt {EBG}}\) outputs the empty set.
Otherwise, there is an iteration \(X_0\) such that \(S(X_0)\subseteq S\) and \(S\subseteq V(G)\) (that is, no vertex of S has been removed from G). This is trivially true, e.g. for \(X_0 = I\), since \(S(I) = \emptyset \) and nothing has been removed from G. Note that every subgraph of a graph with girth at least k must also have girth at least k, thus every \(v\in S\setminus S(X_0)\) such that \(G[S(X_0)\cup \{v\}]\) is connected must be in \(C\left( S(X_0)\right) \). As S is connected there is at least one such v in \(C\left( S(X_0)\right) \).
Consider the first execution of Line 7 in \(X_0\) for which a vertex \(v\in S\setminus S(X_0)\) is considered to generate a child iteration \(X_1\). As no vertex of S was added to done in \(X_0\), we still have that \(S(X_1)\subseteq S\) and \(S\subseteq V(G)\) in iteration \(X_1\), but \(|S(X_1)| = |S(X_0)|+1\). Hence, by induction, EBG will eventually find S. \(\square \)
Using the algorithm of Itai and Rodeh [8] to compute the girth of a graph in \(O(mn)\) time, we can obtain a first trivial complexity bound for Algorithm 1.
Theorem 1
\({\mathtt {EBG}}\) solves Problem 1 with delay \(O(n^2m)\).
Non-induced, weighted, and non-connected case. Let us briefly show how EBG also applies to some variants of the problem. Firstly, we can solve Problem 2, i.e., enumerate edge subgraphs, by modifying EBG as follows: each solution is a set of edges \(S\subseteq E\), and the candidate set \(C\left( S(X)\right) \) becomes \(C\left( S(X)\right) = \{e \in E(X) \mid G[S(X) \cup \{e\}] \text { is connected and } g(G[S(X) \cup \{e\}]) \ge k\}\). It is straightforward to see that Lemma 3 still holds (replacing the word induced with edge in the statement), and that the modified algorithm will solve Problem 2 in polynomial delay and polynomial space.
Furthermore, we can consider the weighted version of the problem, where the length of a cycle is the sum of the weights of its edges: we can find the girth in this case by adapting the Floyd-Warshall algorithm, and thus still enumerate all solutions for both the induced and edge subgraph version of the problem, in polynomial delay and polynomial space.
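One standard way to adapt Floyd-Warshall to the weighted girth is sketched below, assuming positive edge weights and vertices numbered \(0, \dots, n-1\). The function name `weighted_girth` and the input format are assumptions made for illustration; the paper does not fix these details.

```python
from math import inf


def weighted_girth(n, weighted_edges):
    """Minimum total weight of a cycle in a simple undirected graph.

    weighted_edges is a list of (u, v, weight) triples with weight > 0.
    Returns inf if the graph is acyclic.
    """
    w = [[inf] * n for _ in range(n)]
    for u, v, c in weighted_edges:
        w[u][v] = w[v][u] = c
    dist = [row[:] for row in w]
    for i in range(n):
        dist[i][i] = 0

    best = inf
    for k in range(n):
        # Here dist[i][j] only uses intermediate vertices smaller than k,
        # so an i-j shortest path plus edges {i, k} and {k, j} is a cycle
        # whose largest vertex is k.
        for i in range(k):
            for j in range(i + 1, k):
                best = min(best, dist[i][j] + w[i][k] + w[k][j])
        # Usual Floyd-Warshall relaxation through k.
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return best
```

This runs in \(O(n^3)\) time per call. With all weights equal to one it returns the ordinary girth.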
Finally, we consider the non-connected case, i.e., where the solutions are all induced or edge subgraphs of girth k, and not just the connected ones: this is trivially done by redefining the candidate set as \(C\left( S(X)\right) = \{v \in V(G) \mid g(G[S(X) \cup \{v\}]) \ge k\}\) for Problem 1, and similarly for Problem 2. If G[S] is not connected, its girth is the minimum among those of its connected components, thus we can still use the algorithm of Itai and Rodeh (or Floyd-Warshall if weighted edges are considered as well), and again obtain polynomial delay and polynomial space.
4 Induced Subgraph Enumeration
The bottleneck of EBG is the computation of the candidate set. In this section, we present a more efficient algorithm EBG-IS for Problem 1. EBG-IS is based on EBG, but each iteration exploits information from the parent iteration, and maintains distances in order to improve the computation of the candidate set. The procedure is shown in Algorithm 2.
EBG-IS uses the second distance between vertices, defined as follows. Let v be a vertex in \(C\left( S\right) \cup S\), and u and \(u'\) be vertices in \(C\left( S\right) \). We denote by \(D^{(1)}_{uv}(S)\) the distance between v and u in \(G[S \cup \{v, u\}]\), and by \(D^{(2)}_{uu'}(S)\) the distance between u and \(u'\) in \(G[S \cup \{u, u'\}] \setminus \{e_0\}\), where \(e_0 = (u, \cdot )\) is the first edge on a shortest path between u and \(u'\). Note that for any vertices \(x \in G\setminus (C\left( S\right) \cup S)\), \(y\in G \setminus C\left( S\right) \), and \(y'\in G \setminus C\left( S\right) \), \(D^{(1)}_{xy}(S) = \infty \) and \(D^{(2)}_{yy'}(S) = \infty \). In particular, we call \(D^{(2)}_{uu'}(S)\) the second distance between u and \(u'\) in \(G[S \cup \{u, u'\}]\). In addition, we call a path whose length is the second distance a second shortest path. Moreover, we write \(D^{(1)}_{uwv}(S)\) and \(D^{(2)}_{uwv}(S)\) for the distance and the second distance from u to v via a vertex w, respectively. Let P and \(P'\) be respectively a v-u shortest path and a v-u second shortest path. Since P and \(P'\) do not share \(e_0\) but do share their ends, H must have a cycle including v and u, where H is a subgraph of G such that \(V(H) = V(P)\cup V(P')\) and \(E(H) = E(P) \cup E(P')\). Figure 2(C) shows an example of a cycle made by P and \(P'\). To compute the candidate set efficiently, we will use the following lemmas. In the following lemmas, let X and Y be two iterations such that X is the parent of Y, and v be a vertex in \(C\left( S(X)\right) \) such that \(S(Y) = S(X) \cup \{v\}\).
Lemma 4
Let u and w be two vertices in \(C\left( S(X)\right) \) and \(k= g(G[S(X)])\). (A) \(g(G[S(X) \cup \{u, w\}]) \ge k\) if and only if (B) \(D^{(1)}_{uw}(S(X)) + D^{(2)}_{uw}(S(X)) \ge k\).
Proof
Clearly, (A) \(\rightarrow \) (B) holds by definition of \(D^{(1)}_{}(S(X))\) and \(D^{(2)}_{}(S(X))\), since a shortest u-w path together with a second shortest u-w path contains a cycle of length at most \(D^{(1)}_{uw}(S(X)) + D^{(2)}_{uw}(S(X))\). For the direction (B) \(\rightarrow \) (A), consider a shortest cycle C in \(G[S(X) \cup \{u, w\}]\) in the following three cases: (I) \(u, w \notin C\): \(\left| C\right| \ge k\) since \(g(G[S(X)])\ge k\). (II) Exactly one of u and w is in C: \(\left| C\right| \ge k\) since u and w belong to \(C\left( S(X)\right) \). (III) Both u and w are in C: C can be decomposed into two u-w paths P and Q. Without loss of generality, \(\left| P\right| \le \left| Q\right| \). If P is a u-w shortest path, then \(\left| C\right| \ge k\) from (B), since Q is at least as long as the second distance \(D^{(2)}_{uw}(S(X))\). Otherwise, there is a u-w shortest path \(P'\) and a cycle \(C'\) consisting of a part of P (or Q) and a part of \(P'\). If \(C'\) contains w, then \(\left| C'\right| = \left| C\right| \ge k\) since C is a shortest cycle. If \(C'\) does not contain w, then \(C'\) is a cycle in \(G[S(X) \cup \{u\}]\), thus \(\left| C'\right| \ge k\) because \(u\in C\left( S(X)\right) \). \(\square \)
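The test in Lemma 4 can be sketched directly from the definitions of the two distances. The code below recomputes both distances with BFS rather than maintaining them incrementally as EBG-IS does, and it takes \(e_0\) from whichever shortest path the BFS happens to find; the function names are hypothetical.

```python
from collections import deque
from math import inf


def bfs(adj, allowed, src, banned_edge=None):
    """BFS distances and parents from src inside G[allowed].

    If banned_edge is given, that single edge is ignored.
    """
    dist, parent = {src: 0}, {src: None}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in allowed or y in dist:
                continue
            if banned_edge is not None and {x, y} == set(banned_edge):
                continue
            dist[y], parent[y] = dist[x] + 1, x
            queue.append(y)
    return dist, parent


def first_and_second_distance(adj, S, u, w):
    """Return (D1, D2) for distinct u, w in G[S + {u, w}]."""
    allowed = set(S) | {u, w}
    # BFS from w, so parent[u] is the vertex following u on a shortest
    # u-w path and (u, parent[u]) is the first edge e_0 of that path.
    dist, parent = bfs(adj, allowed, w)
    if u not in dist:
        return inf, inf
    e0 = (u, parent[u])
    dist_without_e0, _ = bfs(adj, allowed, w, banned_edge=e0)
    return dist[u], dist_without_e0.get(u, inf)


def pair_keeps_girth(adj, S, u, w, k):
    """Condition (B) of Lemma 4: D1 + D2 >= k."""
    d1, d2 = first_and_second_distance(adj, S, u, w)
    return d1 + d2 >= k
```

For example, on the 5-cycle with \(S = \{0, 1, 2\}\), \(u = 3\) and \(w = 4\), the two distances are 1 and 4, so the pair passes the test for \(k = 5\) and fails it for \(k = 6\), in agreement with the girth of the cycle.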
Lemma 5
\({\mathtt {EBG\text {-}IS}}\) computes \(C\left( S(Y)\right) \) in \(O(\left| C\left( S(X)\right) \right| + \left| N(v)\right| )\) time.
Proof
From Lemma 4, a vertex u in \(C\left( S(X)\right) \) belongs to \(C\left( S(Y)\right) \) if and only if \(D^{(1)}_{uv}(S(X)) + D^{(2)}_{uv}(S(X)) \ge k\). This check takes constant time per vertex. In addition, from the connectivity of G[S(Y)], \(C\left( S(Y)\right) \setminus C\left( S(X)\right) \subseteq N(v)\). Thus, we can compute \(C\left( S(Y)\right) \) in \(O(\left| C\left( S(X)\right) \right| + \left| N(v)\right| )\) time. \(\square \)
Next, we consider how to update the values of \(D^{(1)}_{}(S(Y))\) and \(D^{(2)}_{}(S(Y))\) when adding v to S(X). We can update the old distances to the ones after adding v as in the Floyd-Warshall algorithm (see Algorithm 2), meaning that we can compute \(D^{(1)}_{}(S(Y))\) in \(O(\left| S(X)\cup C\left( S(X)\right) \right| \cdot \left| C\left( S(X)\right) \right| )\) time. By the following lemma, the values of \(D^{(2)}_{}(S(Y))\) can be updated in \(O(\left| S(Y)\right| )\) time for each pair of vertices in \(C\left( S(Y)\right) \).
Lemma 6
Let u and w be two vertices in \(C\left( S(X)\right) \), \(e_0\) be an edge in a u-w shortest path in \(G[S(X) \cup \{u, w\}]\), and \(H = G[S(X) \cup \{u, w\}] \setminus \{e_0\}\). If \(N_H(u) = \emptyset \), then \(D^{(2)}_{uw}(S(X)) = \infty \). Otherwise, \(D^{(2)}_{uw}(S(X)) = \min _{y \in N_H(u)}\{D^{(1)}_{yw}(S(X)) + 1\}\).
Proof
From the definition of \(D^{(2)}_{uw}(S(X))\), if \(N_H(u) = \emptyset \), then \(D^{(2)}_{uw}(S(X)) = \infty \). We assume \(\left| N_H(u)\right| \ge 1\). Let \(y \in N_H(u)\) and \(f = \{u, y\}\). Since \(u \notin S(X)\), every shortest path between u and w in \(G[S(X) \cup \{w\}] \cup \{f\}\) contains f. Hence, \(D^{(1)}_{yw}(S(X)) + 1\) is equal to the distance between u and w in \(G[S(X) \cup \{w\}] \cup \{f\}\). Taking the minimum over all \(y \in N_H(u)\), the statement holds. \(\square \)
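As a small worked instance of Lemma 6 (our own example, not taken from the paper), let G be the 5-cycle on vertices \(0, 1, 2, 3, 4\) with edges \(\{i, i+1 \bmod 5\}\), and let \(S(X) = \{0, 1, 2\}\), \(u = 3\), \(w = 4\). The only shortest u-w path is the edge \(\{3, 4\}\), so \(e_0 = \{3, 4\}\) and \(N_H(3) = \{2\}\). The distance between 2 and 4 in \(G[S(X) \cup \{4\}]\) is 3, along the path \(2, 1, 0, 4\). Therefore

\[ D^{(2)}_{34}(S(X)) = \min _{y \in N_H(3)}\{D^{(1)}_{y4}(S(X)) + 1\} = D^{(1)}_{24}(S(X)) + 1 = 3 + 1 = 4, \]

which is the length of the path \(3, 2, 1, 0, 4\), the only 3-4 path that avoids \(e_0\).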
The next lemma implies that if \(D^{(1)}_{uw}(S(X)) + D^{(2)}_{uw}(S(X)) < k\), i.e., \(G[S(X) \cup \{u, w\}]\) is not a solution, then computing \(D^{(2)}_{uw}(S(Y))\) takes constant time.
Lemma 7
Let u and w be two vertices in \(C\left( S(Y)\right) \). If \(p_1 + p_3 < k\), then \(D^{(2)}_{uw}(S(Y)) = \min \{\max \{p_1, p_2\}, p_3\}\), where \(p_1 = D^{(1)}_{uw}(S(X))\), \(p_2 = D^{(1)}_{uvw}(S(Y))\), and \(p_3 = D^{(2)}_{uw}(S(X))\).
Proof
Let \(G_X = G[S(X) \cup \{u, w\}]\) and \(G_Y = G[S(Y) \cup \{u, w\}]\). Note that \(p_1 \le p_3\). We consider the following cases: (I) \(p_1 < p_2\): Let \(e = \{u, x\}\) be the first edge of a u-w shortest path P in \(G_Y\). Note that P cannot contain v. (I.a) There exists a u-v-w shortest path Q that does not contain e: clearly, \(D^{(2)}_{uw}(S(Y)) = \min \{\left| Q\right| = p_2, p_3\}\). (I.b) Every u-v-w shortest path Q contains e: there always exists a cycle C in \(S(Y) \cup \{w\}\) such that \(V(C) \subseteq (V(P) \cup V(Q))\setminus \{u\}\) and C does not contain u. Note that \(\left| C\right| < p_1 + p_2\). If \(p_2 \le p_3\), then this contradicts \(w \in C\left( S(Y)\right) \) since \(\left| C\right| < k\). Thus, \(p_2 > p_3\). This implies that \(\left| Q\right| - 1 \ge p_3\). Hence, \(D^{(2)}_{uw}(S(Y)) = p_3\). (II) \(p_2 \le p_1\): this assumption implies that there exists a u-w shortest path P in \(G_Y\) that contains v, and \(p_1 + p_2 < k\). Let e be the first edge of P in \(G_Y\) and Q be a u-v-w shortest path in \(G_Y \setminus \{e\}\). Now, we can see \(\left| Q\right| > p_1\) since if \(\left| Q\right| \le p_1\), then \(u \notin C\left( S(Y)\right) \) since P and Q make a cycle C containing u with \(\left| C\right| < k\). Thus, the length of a u-w shortest path in \(G_Y\setminus \{e\}\) is \(p_1\), and \(D^{(2)}_{uw}(S(Y)) = p_1\) holds. \(\square \)
Algorithm 2 shows in detail the update of the candidate set, \(D^{(1)}_{}(\cdot )\), and \(D^{(2)}_{}(\cdot )\) (done using Lemma 7). We analyze the time complexity of EBG-IS. Let ch(X) be the set of children of X and \(\#gch(X)\) be the number of grandchildren of X. The next lemma shows the time complexity for updating \(D^{(2)}_{}(S(X))\).
Lemma 8
We can compute \(D^{(2)}_{}(S(Y))\) from \(D^{(2)}_{}(S(X))\) in \(O(\#gch(Y)\cdot \left| S(Y)\right| + \left| C\left( S(Y)\right) \right| ^2)\) time.
Proof
Let u and w be two vertices in \(C\left( S(Y)\right) \). Two cases are possible:
(I) \(D^{(1)}_{uw}(S(X)) + D^{(2)}_{uw}(S(X)) \ge k\): By Lemma 6, computing \(D^{(2)}_{uw}(S(Y))\) takes \(O(\left| S(Y)\right| )\) time, checking only vertices in S(Y). As the number of pairs (u, w) that fit this case is bounded by \(\#gch(Y)\), EBG-IS needs \(O(\#gch(Y)\cdot \left| S(Y)\right| )\) time to compute this part. (II) \(D^{(1)}_{uw}(S(X)) + D^{(2)}_{uw}(S(X)) < k\): From Lemma 7, computing \(D^{(2)}_{uw}(S(Y))\) takes constant time, for a total complexity of \(O(\left| C\left( S(Y)\right) \right| ^2)\), which proves the statement. \(\square \)
Theorem 2
\({\mathtt {EBG\text {-}IS}}\) enumerates all solutions in \(O(\sum _{S \in \mathcal {S}}\left| N[S]\right| )\) time using \(O(\max _{S \in \mathcal {S}}\{\left| N[S]\right| ^3\})\) space, where \(\mathcal {S}\) is the set of all solutions.
Proof
The correctness of EBG-IS follows from Lemma 3. We first consider the space complexity. In an iteration X, EBG-IS uses \(O(\left| C\left( S(X)\right) \cup S(X)\right| ^2)\) space for storing values of \(D^{(1)}_{}(\cdot )\) and \(D^{(2)}_{}(\cdot )\). In addition, the height of \(\mathcal {T}\) is at most \(\max _{S \in \mathcal {S}}\{\left| S\right| \}\). Therefore, EBG-IS uses \(O(\max _{S \in \mathcal {S}}\{\left| N[S]\right| ^3\})\) space.
Let c(X) be \(\left| C\left( S(X)\right) \right| \) and T(X, Y) be the time needed to generate Y from X, i.e., an execution of NextC() (Algorithm 2). From Lemma 5, Lemma 6, and the Floyd-Warshall algorithm, T(X, Y) is \(O(c(X) + \left| N(v)\right| + c(Y)\cdot \left| S(X)\right| + \#gch(Y)\cdot \left| S(Y)\right| + c(Y)^2)\) time. In addition, \(\left| N[S(X)]\right| \le \left| N[S(Y)]\right| \), \(\left| N(v)\right| = O(\left| N[S(Y)]\right| )\), and \(c(X) = O(\left| N[S(X)]\right| )\) since every vertex in the candidate set has a neighbor in S(X). Thus, \(T(X, Y) = O(\left| N[S(Y)]\right| (c(Y) + \#gch(Y)))\) time. Note that the sum of the numbers of children and grandchildren over all iterations is at most \(2\left| \mathcal {V}\right| \). Thus, by distributing the \(O(\left| N[S(Y)]\right| )\) time from X to the children and grandchildren of Y, each iteration needs \(O(\left| N[S(Y)]\right| )\) time since each iteration receives costs only from its parent and its grandparent. In addition, each iteration outputs a solution, and hence the total time is \(O( \sum _{S \in \mathcal {S}}\left| N[S]\right| )\). \(\square \)
5 Subgraph Enumeration
We propose an algorithm, EBG-S, for enumerating all subgraphs with girth k in a given graph G, detailed in Algorithm 3. A trivial adaptation of EBG-IS would run in \(O(m)\) time per solution, as the candidate sets are sets of edges, whose size is \(O(m)\). To improve this running time, EBG-S selects candidates in a certain order, so that the number of candidate edges does not exceed the number of nodes in the previous solution G[S].
Let S be the current solution. Note that S is an edge set. We first define an inner edge and an outer edge as follows: an edge \(e = \{u, v\}\) is an inner edge for S if \(u, v \in G[S]\), and an outer edge otherwise (see Fig. 4). Let \(C_\mathrm{{in}}{\left( S\right) }\) and \(C_\mathrm{{out}}{\left( S\right) }\) be the sets of inner edges and outer edges in \(C\left( S\right) \), respectively. We first consider the case when EBG-S picks an outer edge. In the following lemmas, let X be an iteration in the enumeration tree \(\mathcal {T}\), e be an edge not in S(X), and Y be the child iteration of X satisfying \(S(Y) = S(X) \cup \{e\}\).
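The inner/outer distinction can be sketched as a small helper. The function name `split_inner_outer` and the edge-list representation are assumptions for illustration; in EBG-S the split is applied to candidate edges, while this sketch classifies any edges that are not already in S.

```python
def split_inner_outer(S, edges):
    """Split the edges not in S into (inner, outer) with respect to S.

    S and edges are iterables of vertex pairs. An edge is inner if both
    of its endpoints are vertices of G[S], and outer otherwise.
    """
    covered = {x for e in S for x in e}        # V(G[S])
    in_solution = {frozenset(e) for e in S}
    inner, outer = [], []
    for u, v in edges:
        if frozenset((u, v)) in in_solution:
            continue                            # already part of the solution
        if u in covered and v in covered:
            inner.append((u, v))
        else:
            outer.append((u, v))
    return inner, outer
```

Adding an inner edge leaves the vertex set of the solution unchanged, which is why inner edges are the only ones that can close a new cycle.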
Lemma 9
Let \(e = \{x, y\}\) be an outer edge such that \(x \in V(G[S(X)])\). Then \(C\left( S(Y)\right) \subseteq (C\left( S(X)\right) \cup E(y)) \setminus \{e\}\), where E(y) are the edges incident to y.
Proof
An edge \(g \notin E(y) \cup C\left( S(X)\right) \) cannot be added to S(Y) as the resulting subgraph would be disconnected, and \(e\not \in C\left( S(Y)\right) \) since \(e\in S(Y)\). \(\square \)
From Lemma 9, EBG-S manages the candidate set \(C\left( S(Y)\right) \) in \(O(\left| C\left( S(Y)\right) \right| + \left| V(G[S(X)])\right| )\) time when EBG-S picks an outer edge e, since we can add every edge \(e' \notin S(X) \cup C\left( S(X)\right) \) incident to y such that \(S(Y) \cup \{e'\}\) is a solution. Moreover, at most \(\left| V(G[S(X)])\right| \) edges are removed, since all removed edges have a vertex in V(G[S(X)]). In this case, EBG-S can obtain \(C_\mathrm{{in}}{\left( S(Y)\right) }\) and \(C_\mathrm{{out}}{\left( S(Y)\right) }\) in \(O(\left| S(X)\right| )\) time and \(O(\left| C\left( S(Y)\right) \right| )\) time, respectively. Next, we consider the case where EBG-S picks an inner edge e. When we pick an inner edge, the candidate set can only shrink.
Lemma 10
If e is an inner edge, then \(C_\mathrm{{in}}{\left( S(Y)\right) } \subset C_\mathrm{{in}}{\left( S(X)\right) }\) and \(C_\mathrm{{out}}{\left( S(Y)\right) } = C_\mathrm{{out}}{\left( S(X)\right) }\).
Proof
Since e is an inner edge \(V(G[S(Y)]) = V(G[S(X)])\), thus there is no edge \(f \in C_\mathrm{{in}}{\left( S(Y)\right) } \setminus C_\mathrm{{in}}{\left( S(X)\right) }\). Since \(e \notin C_\mathrm{{in}}{\left( S(Y)\right) }\) and no edge in \(C_\mathrm{{out}}{\left( S(X)\right) }\) is in \(C_\mathrm{{in}}{\left( S(Y)\right) }\), \(C_\mathrm{{in}}{\left( S(Y)\right) } \subset C_\mathrm{{in}}{\left( S(X)\right) }\). Moreover, there is no cycle including \(f \in C_\mathrm{{out}}{\left( S(X)\right) }\) in \(G[S(Y) \cup \{f\}]\), hence \(C_\mathrm{{out}}{\left( S(Y)\right) } = C_\mathrm{{out}}{\left( S(X)\right) }\). \(\square \)
Next, for any pair of edges e and f not in G[S(X)], we consider the computation of the girth of \(G[S(X) \cup \{e, f\}]\) in EBG-S. Let \(A(X) = \{v \in V(G[S(X)]) \mid E(v) \cap C\left( S(X)\right) \ne \emptyset \}\). In a similar fashion to EBG-IS, EBG-S uses \(D^{(3)}_{}(S(X))\) for A(X). The definition of \(D^{(3)}_{}(S(X))\) is as follows: for any pair of vertices u and v in A(X), \(D^{(3)}_{uv}(S(X))\) is the distance between u and v in \(G[S(X)]\). Note that a shortest path between u and v may contain a vertex in \(G[S(X)] \setminus A(X)\). The next lemmas show that by using \(D^{(3)}_{}(S(X))\), we can compute \(C\left( S(Y)\right) \) in \(O(\left| V(G[S(Y)])\right| )\) time from \(C\left( S(X)\right) \).
Lemma 11
For any iteration X, \(\left| C_\mathrm{{in}}{\left( S(X)\right) }\right| \le \left| V(G[S(X)])\right| \).
Proof
The proof follows from these facts: (A) Initially, \(C_\mathrm{{in}}{\left( S(X)\right) }=\emptyset \). (B) Choosing \(e \in C_\mathrm{{in}}{\left( S(X)\right) }\) decreases \(|C_\mathrm{{in}}{\left( S(Y)\right) }|\). (C) \(e = \{x,y\} \in C_\mathrm{{out}}{\left( S(X)\right) }\) is chosen iff \(\left| C_\mathrm{{in}}{\left( S(X)\right) }\right| =0\), and (assuming wlog \(y\not \in V(G[S(X)])\)) it increases \(|C_\mathrm{{in}}{\left( S(Y)\right) }|\) by at most \(\left| \{ \{y,z\} : z\in V(G[S(X)])\}\right| < \left| V(G[S(X)])\right| \). \(\square \)
Lemma 12
\(\left| C_\mathrm{{out}}{\left( S(X)\right) } \setminus C_\mathrm{{out}}{\left( S(Y)\right) }\right| + \left| C_\mathrm{{out}}{\left( S(Y)\right) } \setminus C_\mathrm{{out}}{\left( S(X)\right) }\right| \le \left| V(G[S(Y)])\right| \).
Proof
We consider two cases: (I) \(C_\mathrm{{in}}{\left( S(X)\right) } \ne \emptyset \): EBG-S picks \(e \in C_\mathrm{{in}}{\left( S(X)\right) }\), and thus, from Lemma 10, \(C_\mathrm{{out}}{\left( S(Y)\right) } = C_\mathrm{{out}}{\left( S(X)\right) }\). (II) \(C_\mathrm{{in}}{\left( S(X)\right) } = \emptyset \): EBG-S picks \(e = \{u, v\} \in C_\mathrm{{out}}{\left( S(X)\right) }\). Without loss of generality, we can assume that \(u \in V(G[S(X)])\) and \(v \notin V(G[S(X)])\). Let f be an edge \(\{v, w\}\) incident to v. Now, \(w \in V(G[S(Y)])\). This implies that the number of edges that are added to \(C_\mathrm{{out}}{\left( S(Y)\right) }\) or removed from \(C_\mathrm{{out}}{\left( S(X)\right) }\) is at most \(\left| V(G[S(Y)])\right| \). \(\square \)
Note that \(\left| V(G[S(X)])\right| \le \left| V(G[S(Y)])\right| \). Hence, from the above lemmas, we can obtain the following lemma.
Lemma 13
\(C\left( S(Y)\right) \) can be computed in \(O(\left| V(G[S(Y)])\right| )\) time from \(C\left( S(X)\right) \).
Theorem 3
\({\mathtt {EBG\text {-}S}}\) enumerates all connected subgraphs with girth k in \(O(\sum _{S \in \mathcal {S}}\left| V(G[S])\right| )\) total time using \(O(\max _{S \in \mathcal {S}}\{\left| V(G[S])\right| ^3\})\) space.
Proof
The proof can be obtained by adapting that of Theorem 2. A more detailed proof can be found in the appendix. \(\square \)
6 Conclusion
In this paper, we addressed the k-girth connected induced/edge subgraph enumeration problems. We proposed two algorithms: EBG-IS for induced subgraphs and EBG-S for edge subgraphs. Both algorithms run in \(O(n)\) amortized time per solution and require \(O(n^3)\) space (exact bounds are reported in Table 1). The algorithms can easily be adapted to relax the connectivity constraint and consider weighted graphs. Other possibilities include applying the algorithms to network analysis and considering the more challenging problem of enumerating maximal subgraphs.
Notes
1. The implementation of EBG-S is available in the GitHub repository: https://github.com/ikn-lab/EnumerationAlgorithms/tree/master/BoundedGirth/.
2. We remark that the techniques in [6] do not extend to undirected graphs, thus motivating a separate study. In directed graphs, a u-v path and a v-u path are distinct. However, a u-v path and a v-u path may be the same in undirected graphs.
References
1. Alon, N., Hoory, S., Linial, N.: The Moore bound for irregular graphs. Gr. Comb. 18(1), 53–57 (2002)
2. Bollobás, B.: Extremal Graph Theory. Courier Corporation (2004)
3. Chandran, L.S.: A high girth graph construction. SIAM J. Discrete Math. 16(3), 366–370 (2003)
4. Chang, H.-C., Lu, H.-I.: Computing the girth of a planar graph in linear time. SIAM J. Comput. 42(3), 1077–1094 (2013)
5. Conte, A., Kanté, M.M., Otachi, Y., Uno, T., Wasa, K.: Efficient enumeration of maximal k-degenerate subgraphs in a chordal graph. In: Cao, Y., Chen, J. (eds.) COCOON 2017. LNCS, vol. 10392, pp. 150–161. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-62389-4_13
6. Conte, A., Kurita, K., Wasa, K., Uno, T.: Listing acyclic subgraphs and subgraphs of bounded girth in directed graphs. In: Gao, X., Du, H., Han, M. (eds.) COCOA 2017. LNCS, vol. 10628, pp. 169–181. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71147-8_12
7. Ferreira, R., Grossi, R., Rizzi, R.: Output-sensitive listing of bounded-size trees in undirected graphs. In: Demetrescu, C., Halldórsson, M.M. (eds.) ESA 2011. LNCS, vol. 6942, pp. 275–286. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23719-5_24
8. Itai, A., Rodeh, M.: Finding a minimum circuit in a graph. SIAM J. Comput. 7(4), 413–423 (1978)
9. Johnson, D.S., Yannakakis, M., Papadimitriou, C.H.: On generating all maximal independent sets. Inf. Process. Lett. 27(3), 119–123 (1988)
10. Kurita, K., Wasa, K., Arimura, H., Uno, T.: Efficient enumeration of dominating sets for sparse graphs. arXiv preprint arXiv:1802.07863 (2018)
11. Lazebnik, F., Ustimenko, V.A., Woldar, A.J.: A new series of dense graphs of high girth. Bull. Am. Math. Soc. 32(1), 73–79 (1995)
12. Parter, M.: Bypassing Erdős’ girth conjecture: hybrid stretch and sourcewise spanners. In: Esparza, J., Fraigniaud, P., Husfeldt, T., Koutsoupias, E. (eds.) ICALP 2014. LNCS, vol. 8573, pp. 608–619. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-43951-7_49
13. Read, R.C., Tarjan, R.E.: Bounds on backtrack algorithms for listing cycles, paths, and spanning trees. Networks 3(5), 237–252 (1975)
14. Shioura, A., Tamura, A., Uno, T.: An optimal algorithm for scanning all spanning trees of undirected graphs. SIAM J. Comput. 26(3), 678–692 (1997)
15. Wasa, K., Arimura, H., Uno, T.: Efficient enumeration of induced subtrees in a K-degenerate graph. In: Ahn, H.-K., Shin, C.-S. (eds.) ISAAC 2014. LNCS, vol. 8889, pp. 94–102. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-13075-0_8
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Kurita, K., Wasa, K., Conte, A., Uno, T., Arimura, H. (2018). Efficient Enumeration of Subgraphs and Induced Subgraphs with Bounded Girth. In: Iliopoulos, C., Leong, H., Sung, WK. (eds) Combinatorial Algorithms. IWOCA 2018. Lecture Notes in Computer Science(), vol 10979. Springer, Cham. https://doi.org/10.1007/978-3-319-94667-2_17
DOI: https://doi.org/10.1007/978-3-319-94667-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94666-5
Online ISBN: 978-3-319-94667-2
eBook Packages: Computer Science, Computer Science (R0)