Keywords

1 Introduction

The clique-width has been recently studied as an invariant that maintains its properties under graph isomorphism, belonging to the theory of parameterized complexity [1], which is a branch of computational complexity theory that classifies the difficulty of problems according to multiple input or output parameters, specifically in graph theory it measures the difficulty of decomposing a graph into a tree-like structure. Computing the clique-width of a graph consists of construct a finite term which represents the graph.

Courcelle et al. [2] presents a set of four operations to construct such term: 1) the creation of labels for vertices, 2) the disjoint union of graphs, 3) the creation of edges and 4) the re-labeling of vertices. The number of labels used to construct the finite term is commonly denoted by k. The minimum k number used to construct the term, also called k-expression, defines the clique-width. Finding the best combination which minimizes the k-expression is a NP-complete problem [3], furthermore, there is no constant error in the approximation algorithms for the calculation of the clique-width [3]. As the clique-width increases, the complexity of the problem in graphs also increases, in the same way the difficulty to achieve the decomposition of the graph increases, in fact, for some automata that represent certain problems in graphs (according to the main Courcelle’s theorem), its computation rapidly consumes memory.

In recent years, the clique-width has been studied in different classes of graphs showing the behavior of this invariant under certain operations. For example, Golumbic et al. [4] showed that for every graph with hereditary distance (a graph in which the distances between all connected induced subgraphs is the same as in the original graph), the clique-width (cwd for short) is smaller or equal to 3, so the following problems have a linear solution in this type of graph: minimum dominant set, minimum connected dominant set, minimum Steiner tree, maximum weight clique, domestic number for a fixed k, vertices cover and colorability for a fixed k. Examples can also be found in [5] for graphs with cwd 3 or 4.

The importance of classifying these types of graphs according to their cwd allows us to have a set of problems in graph theory that allow solutions in polynomial time, which indicates that they are into the tractable problems category. A variant of the problem, which also belongs to the NP-complete class, is to decide whether a graph has cwd of size k, for a fixed k number. For graphs with cwd bounded, in [6] is shown that there is an algorithm in polynomial time \((O(n^2m))\) that recognizes graphs of cwd less than or equal to 3. However, the authors refer that the problem remains open for \(k\ge 4\). In the other hand, there is a classification for the graphs of \(cwd \le 2\), since the graphs of \( cwd = 2 \) are precisely the cographs (graphs that can be generated by a graph \( K_1 \) by complementation or disjoint union).

Similarly, the same result holds for the other invariant in graphs, the treewidth [7], however, the cwd is more general in the sense that graphs with a small tree-width also have small cwd. Algorithms used to calculate the cwd in these graphs require a previous certificate that indicates that they have small cwd, however, calculating this certificate or deciding if the cwd is bounded by a given number is also a complicated problem in the combinatorial sense. In other hand, the cwd of a graph with n vertices of degree greater than 2 cannot be approximated by a polynomial algorithm with an absolute error of \(n^\epsilon \) unless \( P = NP \) [3].

Recently, González-Ruiz et al. [8] showed that the cwd for Cactus graphs is less or equal to 4 and a polynomial time algorithm was presented that calculates exactly the \( 4-expression \). Furthermore, in [9] they studied graphs called Polygonal Trees, which consist of simple cycles joined by at most one edge, showing that the cwd of this type of graph is less or equal to 5 and a polynomial time algorithm was presented that calculates the \(5-expression\).

Additionally Heule et al. [10] present a procedure to transform the cwd problem to the SAT problem. Its conversion algorithm is polynomial, however, as is well known, the Propositional Satisfaction problem (SAT) remains NP-complete. They calculate the cwd of relevant graphs in the literature for which the exact cwd was not known, using SAT solvers. The graphs considered have known topologies such as Brinkmann, Dodecahedron, Frutch, Kittell, McGee, Desargues, among others.

In this article we present a polynomial algorithm that approximates the cwd of simple graphs based on induced paths and show a comparison with the exact results of Heule et al.

2 Preliminares

All the graphs in this article are simple, that is, finite, without loops or multiple edges, and without direction. A graph is a pair \( G = (V, E) \), where V is a set of elements called vertices and E an unordered set of pairs of vertices called edges. An edge e is denoted as uv where u and v are vertices. As we know |A| is used to denote the cardinality of a set A. The degree of a v vertex in a graph G is the number of edges of G incident at v. Also the maximum degree of the vertices of G is denoted by \( \Delta (G) \). In other hand a graph is connected if for every partition of its set of vertices into two non-empty sets X and Y, there is at least one edge with an end in X and another end in Y. Additionally a graph \( G'\), whose set of vertices and edges build a subset of vertices and edges of a given graph G, is called a subgraph of G. An abstract graph represents a class of isomorphic graphs.

Let G be a graph and vw vertices of G, a path from v to w, denoted by path(vw) , is a sequence of edges: \( v_0 v_1, v_1 v_2, \dots , v _ {(n-1)}, v_n \) such that \( v = v_0 \), \( v_n = w \) and \( v_k \) is adjacent to \( v_{(k + 1)}\), for \( 0 \le k \le n \). The length of the path is n. In other way, a simple path is a path where \( v_0, v_1, \dots , v_{(n-1)}, v_n \) are all different. A cycle is a nonempty path such that the first and last vertex are identical, and a simple cycle is a cycle in which there is no repeated vertex, except for the first and last. \( P_n \) and \( C_n \) denote a path and a simple loop respectively, where n denotes the number of vertices.

Dijkstra’s algorithm called minimum paths algorithm determines the shortest path given a starting vertex to the rest of the vertices of a given graph [7], the complexity of this algorithm is \( O (n ^ 2) \). In the Algorithm 1 we present the pseudo-code of the procedure of E. Dijkstra (1959) to find the shortest path between a pair of vertices.

figure a

In other hand, a spanning tree of a connected graph of n vertices is a subset of \( n-1 \) edges that forms a tree. Given a graph \( G = (V, E) \), let \( T_G \) be one of its spanning trees. The edges in \( T_G \) are called tree edges, while the edges \( E (G) \setminus E (T_G) \) are called fronds. Let \( e \in E (G)\setminus E (T_G) \) be a frond edge, the union of the path in \( T_G \) between the end points of e with the same edge e form a simple cycle, such a cycle it is called the fundamental of G with respect to \( T_G \). Each frond \( e = vw \) fulfills the maximum path contained in the fundamental cycle of which it is part. The set of fundamental cycles of G will be denoted by C. Let L be a countable set of labels. A labeled graph is a \( (G, \gamma ) \) pair where \( \gamma \) is a function that maps V(G) to the set L. A labeled graph can also be defined by a triplet \( G = (V, E, \gamma ) \) and its labeling function is denoted by \( \gamma (G) \). We can say that G is D -labelled if D is finite and \( \gamma (G) \subseteq D \).

We now introduce the notion of cwd (cwd, for short). Let \(\mathscr {C}\) be a countable set of labels. A labeled graph is a pair \((G,\gamma )\) where \(\gamma \) maps each element of V(G) into \(\mathscr {C}\). A labeled graph can also be defined as a triple \(G=(V(G),E(G),\gamma (G))\) and its labeling function is denoted by \(\gamma (G)\). We say that G is C-labeled if C is finite and \(\gamma (G)(V)\subseteq C\). We denote by \(\mathscr {G}(C)\) the set of undirected C-labeled graphs. A vertex with label a will be called an a-port.

We introduce the following symbols:

  • a nullary symbol a(v) for every \(a\in \mathscr {C}\) and \(v\in V\);

  • a unary symbol \(\rho _{a\rightarrow b}\) for all \(a,b\in \mathscr {C},\) with \(a\ne b\);

  • a unary symbol \(\eta _{a,b}\) for all \(a,b\in \mathscr {C},\) with \(a\ne b\);

  • a binary symbol \(\oplus \).

These symbols are used to denote operations on graphs as follows: a(v) creates a vertex with label a corresponding to the vertex v, \(\rho _{a\rightarrow b}\) renames the vertex a by b, \(\eta _{a,b}\) creates an edge between a and b, and \(\oplus \) is a disjoint union of graphs.

For \(C\subseteq \mathscr {C}\) we denote by T(C) the set of finite well-formed terms written with the symbols \(\oplus ,a,\rho _{a\rightarrow b},\eta _{a,b}\) for all \(a,b\in C,\) where \(a\ne b\). Each term in T(C) denotes a set of labeled undirected graphs. Since any two graphs denoted by the same term t are isomorphic, one can also consider that t defines a unique abstract graph.

The following definitions are given by induction on the structure of t. We let val(t) be the set of graphs denoted by t.

If \(t\in T(C)\) we have the following cases:

  1. 1.

    \(t=a\in C\): val(t) is the set of graphs with a single vertex labeled by a;

  2. 2.

    \(t=t_{1}\oplus t_{2}\): val(t) is the set of graphs \(G=G_{1}\cup G_{2}\) where \(G_{1}\) and \(G_{2}\) are disjoint and \(G_{1}\in val(t_1)\), \(G_{2}\in val(t_{2})\);

  3. 3.

    \(t=\rho _{a\rightarrow b}(t'):\) \(val(t)=\{\rho _{a\rightarrow b}(G)|G\in val(t')\}\) where for every graph G in \(val(t')\), the graph \(\rho _{a\rightarrow b}(G)\) is obtained by replacing every vertex label a by b in G;

  4. 4.

    \(t=\eta _{a,b}(t'):\) \(val(t)=\{\eta _{a,b}(G)|G\in val(t')\}\) where for every undirected labeled graph \(G=(V,E,\gamma )\) in \(val(t')\), we let \(\eta _{a,b}(G)=(V,E',\gamma )\) such that \(E'=E\cup \{\{x,y\}|x,y\in V,x\ne y,\gamma (x)=a,\gamma (y)=b\}\), e.g. \(\eta _{a,b}(G)\) adds an edge between each pair of vertices a and b in G.

For every labeled graph G we let:

\(cwd(G)=min\{|C||G\in val(t),t\in T(C)\}.\)

A term \(t \in T(C)\) such that \(|C| \,= cwd\, (G) \) and \(G = val(t)\) is called optimal expression of G [11] and written as |C|-expression.

In other words, the cwd of a graph G is the minimum number of different labels needed to construct a vertex-labeled graph isomorphic to G using the four mentioned operations.

As an example we show the computing of cwd for a simple graph Fig. 1 that consists of a cycle of size 4. Firstable in Fig. 2 shows the creation of vertices 3 and 2 of the original graph by the expression \( a (3) \oplus a (2) \), next in Fig. 3 the labels b(1) and b(4) are created. Then, in Fig. 4 the disjoint union of the vertices created in steps 1 and 2 is made using the expression \( (b (1) \oplus b (4)) \oplus (a (3) \oplus a (2)) \). Finally, in Fig. 5 the edges between the labeled vertices are created, resulting in an abstract graph isomorphic to the original by the expression \( \eta _ {(a, b)} ((b (1) \oplus b (4)) \oplus (a (3) \oplus a (2))) \).

As we can see, 2 labels were used to build the cycle of size 4. It is obvious that a single label cannot be used to build the cycle of 4, since edges cannot be created between the same vertices. Therefore, the cwd of a cycle with size 4 is 2.

3 Shortest Path Procedure

In this section we show a procedure based on shortest paths, which allows, throughout the construction, to use a smaller number of labels, thus optimizing the cwd computation.

Let \( G = (V, E) \) be a simple graph and \( \mathcal {C} \) be the set of fundamental cycles of G. If \( C_i, C_j \in \mathcal {C} \) and \( E (C_i) \cap E (C_j) \ne 0 \) then \( C_i \triangle C_j = (E (C_i) \cup E (C_j)) - ( E (C_i) \cap E (C_j)) \) forms a compound loop, where \( \triangle \) denotes the symmetric difference operation between the set of edges in both loops.

Let \( G = (V, E) \) be a simple graph and \( C_1 \) the smallest fundamental cycle, the result of the decomposition of the graph by a spanning tree and its co-tree.

Looking to find an optimal way to construct a graph from the minimum, element of G, i.e. the smallest fundamental cycle, we have proposed the following inductive construction of subgraphs representing together G:

\(G'_1=C_1.\)

\(G'_n=G'_{(n-1)} \cup min \{ Dijkstra((V(G),E(G)\setminus \bigcup _{i=1}^{n-1}E(G'_i)),v_i,v_j)|v_i,v_j \in G'_{n-1},v_i\ne v_j \}.\)

Fig. 1.
figure 1

Simple graph

Fig. 2.
figure 2

Step 1: \(a(3)\oplus a(2)\)

Fig. 3.
figure 3

Step 2: \(b(1) \oplus b(4)\)

Fig. 4.
figure 4

Step 3: \((b(1)\oplus b(4))\oplus ((a(3)\oplus a(2))\)

Fig. 5.
figure 5

Step 4: \((\eta _{a,b}(b(1)\oplus b(4))\oplus ((a(3)\oplus a(2)))\)

Each graph \( G'_i \) is induced from the original graph and is contained as shown by the proposition 1.

Proposition 1 \(G'_1 \subset G'_2 \dots \subset G'_n=G.\)

Proof

\(V(G'_i) \subseteq V(G'_{i+1})\) y \(E(G'_i) \subseteq E(G'_{i+1}), \forall i,1 \le i<n.\) \(\square \)

Now, to simplify the notation we use:

\(P_{v_{ini},v_{fin}}^{i}= min \{ Dijkstra((V(G),E(G) \bigcup _{i=1}^{n-1} E(G'_i)),v_i,v_j )|v_i,v_j \in G'_{n-1},v_i \ne v_j\}\)

where \( v_{ini}, v_{fin} \) are the initial and final vertex respectively of the shortest path found by Dijkstra’s method. Therefore, we can establish the following theorem.

Theorem 1

Let G a simple graph, and \(G'_1 \subset G'_2 \dots \subset G'_n=G\), we can build each \({k_{G'_n}}_{1}^{n}\)-expression to calculate the cwd(G), inductively represented by:

\(G'_1=C_1.\)

\(G'_n=G'_{n-1} \cup P_{v_{ini} ,v_{fin}}^{n-1}.\)

Proof

The proof will be carried out detailing the procedure:

Let L be a set of labels. We construct the k-expression of each induced subgraph \( G'_i \) as follows:

  • For all \(v_i \in G'_1\), take a new \(a_i \in L\) and create \(a_i (v_i)\).

  • The k-expression of the graph \(G'_1\) is: \(k (G'_1 )=\eta _{a_1,a_l} (\eta _{a_{i-1},a_i} (a_1 (v_1 )\oplus \dots \oplus a_l (v_l ))\) where \(l=|V(G'_1 )|,1<i\le l\).

  • Let \(K=\{a_i | 1 \le i \le l\}\), that is, the labels that have been used to construct the k-expression of the graph \( G'_ {j-1} \). The set of labels that can be reused in the construction of the \( k_ {G'_j} \)-expression starting from the expression \( k_ {G'_ {j-1}} \) are defined as: \(R=\{a_i \in K | a_i (v_k ) \in k_{G'_{j-1}} ,\delta (v_k) \in G \setminus E(G'_{j-1} )=0 \}\).

  • Let \( a_s \in R \) the label that will be used to rename all other R in the \( G'_ {j-1} \) graph. The resulting expression after relabeling is: \(\rho _{a_t \rightarrow a_s} (k_{G'_{j-1}}) \forall a_t \in R,a_t \ne a_s\).

  • For each \(v \in P_{v_{ini} ,v_{fin}}^{n-1},v \ne v_{ini} \ne v_{fin}\), take a \(a_i \in R\) with \(a_i \ne a_s\) if exists, if not, take a new \(a_i \in L\) and create \(a_i(v_i)\).

  • As \(P_{v_{ini} ,v_{fin}}^{j-1} \setminus \{v_{ini},v_{fin} \}\) is a path that can be associated with each vertex, considering its adjacency, an index of the sequence \(\{1,\cdots ,r \}\) where r is the number of vertices of the path. With this arrangement create the k-expression: \(k_{P^{j-1}} =\eta _{a_{i-1},a_i } (a_1 (v_1 )\oplus \cdots \oplus a_r (v_r ))\) where \(1<i \le r\).

  • Thus \(k_{G'_j }=\eta _{a_r,v_{fin}} (\eta _{a_1,v_{ini}} (k_{G'_{j-1}} \oplus k_{P^{j-1}}))\), where \(a_1,a_r \in P^{j-1}\) are the two labeled vertices of the ends of \(P^{j-1}\). \(\square \)

Now for a better understanding, we present the algorithm 2 that constructs the k-expression given a simple graph. As input, the algorithm receives a graph and its decomposition into fundamental cycles. It starts using the smallest size cycle and calculating its k-expression, later the vertices already used in the construction are stored in the set A, using these vertices the shortest path between them is searched, when finding it, compute its k-expression and add the vertices of the path to the set A. This method is repeated until the original graph is built, creating edges between already labeled vertices and releasing labels already covered.

Each step of the 2 algorithm is described as follows. From lines 1 to 2 the algorithm starts with G as input and \( C_1 \) is the smallest fundamental cycle of G. In line 3 each vertex of \( C_1 \) is added to the set of vertices A. In line 4 the vertices involved in the cycle \( C_1 \) are deleted from the original graph G. In line 5 we begin to build the k-expression using different labels for each vertex in \( C_1 \). From lines 6 to 23 we have the main procedure as long as the number of edges in the resulting G is different from 0. In this procedure, in line 7, the shortest path P is found between each vertex of A on the G graph using the well known Dijkstra algorithm, if you have more than one path with the same cardinality then you can choose the last one found. In line 8 the new k-expression of the path P from step 7 is constructed, at this moment we already have the first and last label of P since it is an element of A, each vertex between the first and last on the way must be different. In line 9 the edges involved in the path P mentioned above are deleted from the current G. In line 10 each vertex of the path P found (except the extremes) is added to A.

Now the following condition allows to release labels to be able to reuse them, from 11 to 14 it is compared if the degree of each vertex in A is 0, if it is true, the corresponding labels are released and the vertex of the A set is deleted since has been covered in its entirety. The above is useful to do less operations. The next condition from 15 to 22 is to verify if the elements of A are connected in the current G, if so, an edge is created using the \(\eta \) operation, which will join the different k-expressions that have already been built. After that, on line 17 the edges found in the last step of the current G are deleted. Finally the same condition is repeated from 18 to 21 as it was done in lines 11 to 14. The while instructions will be repeated until all edges are deleted from the graph G.

Furthermore, the complexity of the method presented in this paper is given by two main methods, the first is the well-known Dijkstra Algorithm for simple graphs which is of order \( O (n ^ 2) \). In other hand, the proposed method is in the worst case \( O (n-3) \), removing the first minimum cycle of size 3. Therefore, the complexity of these methods is \( O (n ^ 3) \).

figure b

4 Example

This section shows how our algorithm works when considering the 8-cubic graph, which is illustrated in Fig. 6 also called a trivalent graph whose vertices have degree 3. This graph contains 8 vertices and 12 edges. In Fig. 7 we start with the cycle \( C_1 = [4,6,5] \) so 3 labels are used and the set \( A = \{4,6,5 \} \). In Fig. 8 the computing of the shortest path is shown with the following steps: the shortest path between the vertices of A is calculated, the resulting path is \( P = [5,1, 7,6] \) and 2 more labels are used, as a result \( A = \{4,5,6,1,7 \} \), now 2 labels can be released, the ones corresponding to vertex 6 and 5, one of them will be left as a residual label for the entire graph and we have a free one, as a result \( A = \{4,1,7 \} \), 5 labels have been used and 1 remains free to use. Later in Fig. 9 the following algorithm computing is shown, the shortest path is found \( P = [7,8,2,1] \), the free label is used and a new one, and the vertices are added to the set A, so far \( A = \{4,1,7,8,2 \} \), at this moment labels 7 and 1 can be released, so there are 2 free labels left and the set would be \( A = \{4,8,2 \} \). In Fig. 10 the following path found is shown \( P = [8,3,2] \), one of the two available labels is used and the set is as follows: \( A = \{4,8,2,3 \} \), labels 8 and 2 can be released, the resulting set would be: \( A = \{4,3 \} \). Finally, in Fig. 11 an edge is created between 4 and 3, to later release these two vertices leaving the set as \( A = \{\} \), ending the algorithm and resulting in the \( cwd \,(8\; cubic\; graph\; 2) \le 6 \).

Fig. 6.
figure 6

8 cubic graph 2

Fig. 7.
figure 7

\(C_1= [4,6,5]\)

Fig. 8.
figure 8

\(P=[5,1,7,6]\)

Fig. 9.
figure 9

\(P=[7,8,2,1]\)

Fig. 10.
figure 10

\(P=[8,3,2]\)

Fig. 11.
figure 11

\(\eta _{(4,3)}\)

5 Conclusions

This article presents a proposal to approximate the cwd of simple graphs. The proposed algorithm is based on the classic Dijkstra algorithm together with the calculation of the shortest paths of induced graphs. To get a conclusion about the efficiency of our algorithm, the results reported by Heule [10] were considered and compared with those generated by our proposal. In Table 1, is shown the name of the graph with its number of vertices and edges, followed by the exact result of Heule et al. and the result of our proposal.

Table 1. Comparison between approach algorithm and exact CWD

On the other hand, the error between an exact method and an approximation method is given by an equation in [3] as follows: \( | A (G) -cwd (G) | \le n ^ {\epsilon } \), where A(G) is the result of the cwd using an approximation algorithm (which is our case), the cwd(G) is the exact result, n is the number of vertices and finally \( \epsilon \) is the error, clearing \( \epsilon \) gives the error of the last column. As can be seen in Table 1, in 8 of the 13 cases the cwd with an \({\approx }0\%\) of error was obtained, having in the worst case an error of \( 30.1\% \) and in the average case an error of \( 9.8\% \).

The advantage of this algorithm is that its execution time is polynomial so the approximation has a complexity of the order \( O (n ^ 3) \) where n is the number of vertices of the input graph.