1 Introduction

Graph coloring problems are well studied in the literature. The traditional vertex coloring problem asks for a coloring of the vertices of a graph with the minimum number of colors such that adjacent vertices get different colors. There are many variants of coloring problems. Recently, Zhang and Li [10] studied a coloring problem in which adjacent vertices are allowed to get the same color. The proposed problems have applications related to homophyly in networks (see Chapter 4 of [4]).

Given an undirected graph \(G = (V, E)\) and a vertex coloring, a vertex is happy if the vertex and all its adjacent vertices have the same color and unhappy otherwise. An edge is happy if its end vertices have the same color and unhappy otherwise.
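The two definitions can be checked mechanically on a small instance. The following is a minimal sketch, assuming the graph is given as an adjacency dictionary `adj` and the coloring as a dictionary `color`; both names and the function `count_happy` are illustrative and not part of the paper.

```python
def count_happy(adj, color):
    """Return (happy vertices, happy edges) for a full coloring.

    adj   : dict mapping each vertex to an iterable of its neighbours
            (undirected, so every edge appears in both directions).
    color : dict mapping each vertex to its color.
    """
    # A vertex is happy if every neighbour shares its color.
    happy_vertices = sum(
        all(color[u] == color[v] for u in adj[v]) for v in adj
    )
    # Each happy edge is seen once from each endpoint, hence the halving.
    happy_edges = sum(
        color[u] == color[v] for v in adj for u in adj[v]
    ) // 2
    return happy_vertices, happy_edges
```

For the path a, b, c colored 1, 1, 2, only a is happy and only the edge between a and b is happy.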

For \(S \subseteq V\), let \( c_{p} : S \rightarrow \{1, 2, \dots , k\} \) be a partial vertex coloring. A coloring \(c_{f} : V \rightarrow \{1, 2, \dots , k\} \) is an extended full coloring for \(c_{p}\), if \(c_{f}(v) = c_{p}(v), \forall v \in S\).

Given a set \(S \subseteq V\) and a partial coloring \(c_{p}\), the Maximum Happy Vertices (MHV) (respectively, Maximum Happy Edges (MHE)) problem asks for an extended full coloring \(c_{f}\) such that the number of happy vertices (respectively, happy edges) is maximized. As k is also an input parameter, the problem is also referred to as k-MHV (respectively, k-MHE).

Definition 1

Multiway-Cut

(Instance) We are given an undirected graph \(G = (V, E)\) and a terminal set \(S = \{s_{1}, s_{2}, \dots , s_{k}\} \subseteq V\).

(Goal) Find a set of edges \(C \subseteq E\) with minimum cardinality whose removal disconnects all the terminals from each other.

Definition 2

Multiway-Uncut

(Instance) We are given an undirected graph \(G = (V, E)\) and a terminal set \(S = \{s_{1}, s_{2}, \dots , s_{k}\} \subseteq V\).

(Goal) Find a partition \(\{V_{1}, V_{2}, \dots , V_{k}\}\) of V such that each part contains exactly one terminal and the number of edges not cut by the partition is maximized.

The k-MHE problem is a generalization of the Multiway Uncut problem [7] which is the complement of Multiway Cut problem [1, 2]. The Multiway Uncut problem is a special case of k-MHE problem in which there is exactly one pre-colored vertex (terminal) for each color.

Both the k-MHV and k-MHE problems are NP-Hard [10] on arbitrary graphs for \(k \ge 3\). In [10], \(O(mn^7 \log n)\) and \(O(\min \{n^{\frac{2}{3}}m, m^{\frac{3}{2}}\})\) time algorithms are presented for 2-MHV and 2-MHE respectively. To this end, the authors of [10] used submodular function minimization (2-MHV) [6] and max-flow algorithms (2-MHE) [5]. Zhang and Li [10] presented approximation algorithms with approximation ratios \(\max \{ \frac{1}{k}, \varOmega ({\varDelta ^{-3}})\}\) and \(\frac{1}{2}\) for k-MHV and k-MHE respectively. Here, \(\varDelta \) is the maximum degree of the graph. Later, Zhang et al. [9] presented approximation algorithms with approximation ratios \(\frac{1}{\varDelta + 1}\) and \((\frac{1}{2} + \frac{\sqrt{2}}{4} f(k)) \ge 0.8535\) for k-MHV and k-MHE respectively.

1.1 Our Results

Apart from the results in [9, 10], the MHV and MHE problems do not seem to have been addressed for any class of graphs. In this paper, we study these problems for trees. We propose dynamic programming based algorithms for both k-MHV and k-MHE. For an arbitrary k, the proposed algorithms take \(O(nk \log k)\) and O(nk) time respectively. When k is fixed, the algorithms run in linear time. We also extend our algorithms to generate all the optimal colorings of the tree. Generating each optimal coloring takes polynomial time.

Using the result from [2] we observe that, for an arbitrary k, the k-MHE problem is NP-Hard for planar graphs. Using the result from [3] we infer that, when the number of pre-colored vertices is bounded, the k-MHE problem can be solved in linear time for graphs with bounded branch width.

The rest of the paper is organized as follows. In Sect. 2 we discuss the algorithm for the k-MHV problem. In Sect. 3 we discuss the algorithm for the k-MHE problem and the related observations. We conclude with Sect. 4. Throughout the paper we assume that the input graph is a tree (T). We use integers from 1 to k to denote the colors.

2 Algorithm for k-MHV Problem

We root the tree at an arbitrary vertex. Let \(T_{v}\) denote the subtree rooted at a vertex v. Before presenting the algorithm we give a simple reduction rule, which can be executed in linear time.

 

Rule 1:

If a leaf vertex is uncolored, remove it and count the leaf vertex as happy.

 

We can give the color of its parent to the uncolored leaf to make it happy. Hence, without loss of generality we can assume that all the leaves are colored.
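A minimal sketch of Rule 1 follows. It assumes the rule is applied repeatedly, so that a parent that becomes an uncolored leaf after a removal is pruned as well; that reading, the adjacency-dictionary input, and the name `prune_uncolored_leaves` are assumptions made for illustration, not details given in the text.

```python
def prune_uncolored_leaves(adj, precolor):
    """Apply Rule 1 until every remaining leaf is pre-colored.

    adj      : dict vertex -> iterable of neighbours (a tree).
    precolor : dict vertex -> color for the pre-colored vertices.
    Returns (reduced adjacency dict, number of removed vertices);
    each removed vertex is counted as happy.
    """
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    queue = [v for v in adj if len(adj[v]) == 1 and v not in precolor]
    removed = 0
    while queue:
        v = queue.pop()
        if v not in adj or len(adj[v]) != 1:
            continue                      # already removed, or no longer a leaf
        (p,) = adj[v]                     # the unique neighbour of the leaf
        del adj[v]
        adj[p].discard(v)
        removed += 1
        if len(adj[p]) == 1 and p not in precolor:
            queue.append(p)               # p has become an uncolored leaf
    return adj, removed
```

Each vertex enters the queue a bounded number of times, so the sketch runs in time linear in the size of the tree.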

We process the vertices of the rooted tree in post-order. At each vertex v, we maintain a list of 2k integer values. The maximum of these 2k values gives the maximum number of happy vertices in \(T_{v}\), the subtree rooted at v. The maximum of the 2k values associated with the root gives the maximum number of happy vertices of the tree. The corresponding optimal coloring can be recovered by tracing the computation back from the root. The list of 2k values is defined as follows, for \(1 \le i \le k\):

  • \(T_{v}[i, H]\): The maximum number of happy vertices in the subtree \(T_{v}\), when v is colored i and is happy in \(T_{v}\). That is, when v and all its children are colored i. Note that, here we focus on v being happy in the subtree \(T_{v}\). The vertex v can become unhappy in the tree T because its parent gets another color.

  • \(T_{v}[i, U]\): The maximum number of happy vertices in \(T_{v}\), when v is colored i and is unhappy in \(T_{v}\). That is, when one or more children of v are colored with a color other than i.

Note that, if a vertex or some of its children are already colored, then some of the 2k values are invalid. We use \(-1\) to denote an invalid value. We keep these 2k values in an array to access any specific item in constant time. The values are indexed in the order, \(T_{v}[1, H]\), \(T_{v}[1, U]\), \(T_{v}[2, H]\), \(T_{v}[2, U]\), ..., \(T_{v}[k, H]\), \(T_{v}[k, U]\).

The following expressions are defined to simplify some of the equations:

  • \(T_{v}[i, *]\): The maximum number of happy vertices in the subtree \(T_{v}\), when v is colored i. v may be happy or unhappy. That is:

    $$\begin{aligned} T_{v}[i, *] = \max \{T_{v}[i, H], T_{v}[i, U] \}. \end{aligned}$$
    (1)
  • \(T_{v}[i, -]\): The maximum number of happy vertices in \(T_{v}\) excluding v, when v is colored i.

    $$\begin{aligned} T_{v}[i, -] = \max \{T_{v}[i, H]-1, T_{v}[i, U] \}. \end{aligned}$$
    (2)
  • \(T_{v}[\overline{\imath }, *]\): The maximum number of happy vertices in the subtree \(T_{v}\), when v is colored with color other than i.

    $$\begin{aligned} T_{v}[\overline{\imath }, *] = \max _{r \ne i} \{T_{v}[r, *]\}. \end{aligned}$$
    (3)
  • \(T_{v}[\overline{\imath }, -]\): The maximum number of happy vertices in the subtree \(T_{v}\) excluding v, when v is colored with color other than i.

    $$\begin{aligned} T_{v}[\overline{\imath }, -] = \max _{r \ne i} \{T_{v}[r, -]\}. \end{aligned}$$
    (4)
  • \(T_{v}[*, *]\): The maximum number of happy vertices in \(T_{v}\). That is:

    $$\begin{aligned} T_{v}[*, *] = \max \{ T_{v}[1, *], T_{v}[2, *], \dots , T_{v}[k, *] \}. \end{aligned}$$
    (5)
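The derived quantities in Eqs. (1) to (5) can be written as small helpers over the two tables of a single vertex. This is a sketch under assumed conventions: `H` and `U` are lists indexed by color 1..k (index 0 unused), and \(-1\) marks an invalid entry as in the text. The function names are illustrative.

```python
INVALID = -1  # sentinel for an invalid table entry, as in the text

def star(H, U, i):
    """Eq. (1): best value when v has color i, happy or unhappy."""
    return max(H[i], U[i])

def minus(H, U, i):
    """Eq. (2): best value excluding v itself when v has color i."""
    h = H[i] - 1 if H[i] != INVALID else INVALID
    return max(h, U[i])

def other_star(H, U, i, k):
    """Eq. (3): best value when v has a color other than i."""
    return max((star(H, U, r) for r in range(1, k + 1) if r != i),
               default=INVALID)

def other_minus(H, U, i, k):
    """Eq. (4): best value excluding v when v has a color other than i."""
    return max((minus(H, U, r) for r in range(1, k + 1) if r != i),
               default=INVALID)

def best(H, U, k):
    """Eq. (5): best value over all colors of v."""
    return max(star(H, U, r) for r in range(1, k + 1))
```

For example, with k = 3, `H = [None, 4, -1, 3]` and `U = [None, 2, 5, -1]`, the helper `minus` gives 3, 5 and 2 for colors 1, 2 and 3. The explicit guard in `minus` only makes the handling of invalid entries visible; applying Eq. (2) directly to a \(-1\) entry yields the same maximum.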

Now we explain the process to compute these 2k values at each vertex. As a leaf vertex is pre-colored, it is always happy alone as a subtree with a single vertex. Only one out of 2k values is valid. Suppose the color of the leaf is i, then the only valid value is \(T_{v}[i, H] = 1\).

The following subsections consider the case when v is a non leaf vertex. Let \(v_{1}, v_{2}, \dots , v_{d}\) be the children of v. The values \(T_{v}[i, H]\) and \(T_{v}[i, U]\) are invalid, if v is pre-colored with a color \(r \ne i\). Otherwise, we compute \(T_{v}[i, H]\) and \(T_{v}[i, U]\) as follows:

2.1 Computing \(T_{v}[i, H]\)

Computing \(T_{v}[i, H]\) has two cases:

figure a

 

Case 1:

For some child \(v_{j}\), \(T_{v_{j}}[i,*] = -1\).

This means that the child \(v_{j}\) is pre-colored with a color other than i. In this case, v becomes unhappy when it gets color i. So \(T_{v}[i, H]\) is invalid.

Case 2:

For every child \(v_{j}\), \(T_{v_{j}}[i,*] > -1\).

In this case, we use the following equation to compute \(T_{v}[i, H]\).

$$\begin{aligned} T_{v}[i, H] = 1+\sum _{v_{j}} T_{v_{j}}[i,*]. \end{aligned}$$
(6)

 

2.2 Computing \(T_{v}[i, U]\)

Computing \(T_{v}[i, U]\) has three cases:

figure b

 

Case 1:

Every child \(v_{j}\) is pre-colored with color i.

In this case, we cannot make v unhappy by giving color i to v. Hence \(T_{v}[i, U]\) is invalid.

Case 2:

For some child \(v_{j'}\), \(T_{v_{j'}}[*, *] \ne T_{v_{j'}}[i, *]\).

That is, the child \(v_{j'}\) has color \(r \ne i\) in the optimal coloring of \(T_{v_{j'}}\). When v is colored i and \(v_{j'}\) is colored r, irrespective of the colors of the other children, v will certainly be unhappy. In this case, we use the following expression to compute \(T_{v}[i, U]\).

$$\begin{aligned} T_{v}[i, U]&= T_{v_{j'}}[r, -]\; + \mathop {\sum _{v_j \text { child of }v,}}_{v_{j} \ne v_{j'}} \max \{T_{v_{j}}[1, -], \dots , T_{v_{j}}[i, *], \dots , T_{v_{j}}[k, -]\} \end{aligned}$$
(7)
$$\begin{aligned}&= \mathop {\sum _{v_j \text { child of }v}} \max \{T_{v_{j}}[1, -], \dots , T_{v_{j}}[i, *], \dots , T_{v_{j}}[k, -]\}. \end{aligned}$$
(8)
Case 3:

For every child \(v_{j}\), \(T_{v_{j}}[*, *] = T_{v_{j}}[i, *]\).

For each \(v_{j}\), if we pick \(T_{v_{j}}[i, *]\), v will become happy, but we need v to be unhappy. To avoid this situation, for some child we pick a value with color other than i as follows:

 

For each \(v_{j}\), we define \(\text {diff}(v_{j}, i)\) as follows:

$$\begin{aligned} \text {diff}(v_{j}, i) = T_{v_{j}}[i, *] - T_{v_{j}}[\overline{\imath }, -]. \end{aligned}$$
(9)

We pick the child (say \(v_{\ell }\)) with minimum \(\text {diff}(v_{j}, i)\) value. Suppose, \(T_{v_{\ell }}[\overline{\imath }, -]\) = \(T_{v_{\ell }}[q, -]\), we replace \(T_{v_{\ell }}[i, *]\) with \(T_{v_{\ell }}[q, -]\). The new expression is:

$$\begin{aligned} T_{v}[i, U] = T_{v_{\ell }}[q, -] + \sum _{v_{j} \ne v_{\ell }} T_{v_{j}}[i, *]. \end{aligned}$$
(10)
figure c
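The recurrences of Sects. 2.1 and 2.2 can be put together in a short sketch. It is not a transcription of the paper's algorithm: Cases 2 and 3 for \(T_{v}[i, U]\) are folded into one rule (every child takes its better option, and the smallest loss is paid to force at least one child away from color i), uncolored leaves are handled directly instead of through Rule 1, and \(T_{v_{j}}[\overline{\imath }, -]\) is recomputed naively, so this version runs in \(O(nk^{2})\) time rather than the bound of Theorem 1. The input format (`children`, `root`, `precolor`) and the function name are assumptions.

```python
INVALID = -1  # sentinel from the text for an infeasible entry

def max_happy_vertices(children, root, precolor, k):
    """children: dict vertex -> list of children of a rooted tree.
    precolor: dict vertex -> color in 1..k for pre-colored vertices."""
    order, stack = [], [root]
    while stack:                        # a parent is appended before its descendants
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    H, U = {}, {}
    for v in reversed(order):           # reversed order visits children first
        H[v] = [INVALID] * (k + 1)      # index 0 unused; colors are 1..k
        U[v] = [INVALID] * (k + 1)
        kids = children.get(v, [])
        for i in range(1, k + 1):
            if precolor.get(v, i) != i:
                continue                # v is pre-colored with another color
            same, other = [], []
            for c in kids:
                same.append(max(H[c][i], U[c][i]))            # T_c[i, *]
                other.append(max(                             # T_c[ibar, -]
                    (max(H[c][r] - 1 if H[c][r] != INVALID else INVALID,
                         U[c][r])
                     for r in range(1, k + 1) if r != i),
                    default=INVALID))
            if all(s != INVALID for s in same):
                H[v][i] = 1 + sum(same)                       # Eq. (6)
            # Loss incurred by moving a child to a color other than i.
            gaps = [max(s, o) - o for s, o in zip(same, other) if o != INVALID]
            if gaps:                    # some child can take a color other than i
                U[v][i] = (sum(max(s, o) for s, o in zip(same, other))
                           - min(gaps))
    return max(max(H[root][i], U[root][i]) for i in range(1, k + 1))
```

On a path whose end vertices are pre-colored 1 and 2 and whose middle vertex is free, the function returns 1: whichever color the middle vertex takes, exactly one end vertex is happy.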

Theorem 1

There is an \(O(nk \log k)\) time algorithm for the k-MHV problem for trees.

Proof

We evaluate the time spent at a particular vertex v to compute \(T_{v}[i, H]\) and \(T_{v}[i, U]\), for \(1 \le i \le k\). Let \(v_{1}, v_{2}, \dots , v_{d}\) be the children of v.

Computing \(T_{v}[i, H]\): The \(T_{v_{j}}[i, H]\) and \(T_{v_{j}}[i, U]\) values are accessible in constant time for each child \(v_j\). Time to compute \(T_{v}[i, H]\), \(\forall 1 \le i \le k\) is:

$$\begin{aligned} \sum _{1 \le i \le k} O(d) = O(kd). \end{aligned}$$
(11)

Computing \(T_{v}[i, U]\): For each child \(v_{j}\), we sort its 2k values in descending order. For any child \(v_j\), \(T_{v_{j}}[i, *]\) is available in constant time from the original array. From the sorted array, \(T_{v_{j}}[*, *]\) and \(T_{v_{j}}[\overline{\imath }, *]\) are available in constant time. Hence \(T_{v}[i, U]\), \(\forall 1 \le i \le k\), can be computed in:

$$\begin{aligned} O(dk \log k) + \sum _{1 \le i \le k} O(d) = O(dk \log k). \end{aligned}$$
(12)

Hence the total time is:

$$\begin{aligned} \sum _{v} (dk + dk \log k) \le \sum _{v} 2dk \log k = 2k \log k \sum _{v} d = O(nk \log k). \end{aligned}$$
(13)

   \(\square \)
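The constant-time lookup of "the best value over all colors other than i" used in the proof can be illustrated with a small helper. The sketch below swaps in a different technique from the sorting step of the proof: it tracks only the largest and second largest entries of a child's array. It is shown as an illustration of the lookup only, and no change to the bound of Theorem 1 is claimed. The function name is hypothetical.

```python
def max_excluding_each(values):
    """For every index i, return the maximum of values[j] over j != i.

    Uses one pass to find the largest and second largest entries,
    so the whole list of answers costs O(len(values)) time.
    """
    best = second = float('-inf')
    best_idx = -1
    for idx, x in enumerate(values):
        if x > best:
            second = best
            best, best_idx = x, idx
        elif x > second:
            second = x
    # Only the position of the maximum has to fall back to the runner-up.
    return [second if idx == best_idx else best
            for idx in range(len(values))]
```

Applied to the per-color values \(T_{v_{j}}[r, *]\) of one child, entry i of the result plays the role of \(T_{v_{j}}[\overline{\imath }, *]\).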

The correctness of the value \(T_{v}[*, *]\) for every vertex v implies the correctness of the algorithm. The correctness of the value \(T_{v}[*, *]\) follows from the correctness of the 2k values \(T_{v}[1, H]\), \(T_{v}[1, U]\), \(T_{v}[2, H]\), \(T_{v}[2, U]\), ..., \(T_{v}[k, H]\), \(T_{v}[k, U]\) associated with v.

Theorem 2

Algorithm 3 correctly computes the values \(T_{v}[i, H]\) and \(T_{v}[i, U]\) for every v and \(1 \le i \le k\).

Proof

We prove the theorem by using induction on the size of the subtrees. For a leaf vertex v, the algorithm correctly computes the values \(T_{v}[i, H]\) and \(T_{v}[i, U]\) for \(1 \le i \le k\). Since the leaf vertices are pre-colored, each leaf vertex has only one valid value (this value being 1).

For a non-leaf vertex v, let \(v_{1}, v_{2}, \dots , v_{d}\) be the children of v. By induction on the size of the sub-trees, all the 2k values associated with each child \(v_{j}\) of v are correctly computed. Let x be the value computed by the algorithm for \(T_{v}[i,H]\) (or \(T_{v}[i,U]\)) for any color i. If x is not the optimal value, it will contradict the optimality of at least one value of a child of v. Hence the algorithm correctly computes the values \(T_{v}[i, H]\) and \(T_{v}[i, U]\) for every v and \(1 \le i \le k\).    \(\square \)

2.3 Generating All Optimal Happy Vertex Colorings

Our algorithm can also be extended to generate all the optimal happy vertex colorings of the tree. Among the 2k values associated with a vertex v, there may be multiple values equal to the optimal value. So, while generating an optimal happy vertex coloring, we can choose any of these values to generate a different optimal coloring. For example, let \(T_{v}[i,H]\) be an optimal value for the vertex v. Let \(v_{j}\) be a child of v such that both \(T_{v_{j}}[i,H]\) and \(T_{v_{j}}[i,U]\) are optimal. Then we can generate one optimal coloring by picking \(T_{v_{j}}[i,H]\) and another optimal coloring by picking \(T_{v_{j}}[i,U]\). There may be exponentially many optimal colorings, but generating each optimal coloring takes polynomial time (linear time for fixed k).

3 Algorithm for k-MHE Problem

Before presenting the algorithm we give simple reduction rules, which can be executed in linear time.

 

Rule 2:

Let v be a pre-colored vertex with degree more than 1. Let \(v_{1}, v_{2}, \dots , v_{d}\) be the neighbours of v in T. We can divide T into d edge disjoint subtrees \(T_{1}, T_{2}, \dots , T_{d}\) and all these trees share only the vertex v.

$$\begin{aligned} k\text {-MHE}(T) = k\text {-MHE}(T_{1}) + k\text {-MHE}(T_{2}) + \dots + k\text {-MHE}(T_{d}). \end{aligned}$$
(14)

 

With the application of Rule 2, without loss of generality we can assume that T does not have a pre-colored vertex with degree more than 1.
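Applying Rule 2 at every pre-colored vertex at once amounts to grouping the edges so that two edges stay in the same piece exactly when they are linked through uncolored vertices. The following sketch does this with a union-find structure over edge indices; the edge-list input, the set `precolored`, and the name `split_at_precolored` are assumptions for illustration.

```python
def split_at_precolored(edges, precolored):
    """Split a tree into edge-disjoint pieces that meet only at
    pre-colored vertices.

    edges      : list of (u, v) pairs of the tree.
    precolored : set of pre-colored vertices.
    Returns a list of edge lists, one per piece.
    """
    parent = list(range(len(edges)))      # union-find over edge indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_edge = {}                        # uncolored vertex -> an incident edge
    for idx, (u, v) in enumerate(edges):
        for w in (u, v):
            if w in precolored:
                continue                   # pieces are cut apart at w
            if w in first_edge:
                parent[find(idx)] = find(first_edge[w])
            else:
                first_edge[w] = idx

    pieces = {}
    for idx, e in enumerate(edges):
        pieces.setdefault(find(idx), []).append(e)
    return list(pieces.values())
```

By Eq. (14), the optimum for T is the sum of the optima of the returned pieces, and in each piece every pre-colored vertex has degree 1.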

Now, we root the tree at an arbitrary vertex with degree more than 1.

  • Rule 3: (Similar to Rule 1 in Sect. 2) If a leaf vertex is uncolored, remove it and count the edge connecting the leaf vertex as happy.

With Rule 2 and Rule 3, without loss of generality, all the leaves of the rooted tree T are pre-colored and no non-leaf vertex is pre-colored.

Our algorithm for the k-MHE problem has two phases. In the first phase, we visit the vertices in post-order and populate a list of tentative colors for each vertex. In the second phase, we visit the vertices in pre-order and assign a color to each vertex.

figure d

 

Phase 1:

We visit the vertices according to post order traversal. At each vertex v, we keep a list of tentative colors to assign to the vertex v in the optimal solution. The size of this list is at most k. Let L(v) denote the list of tentative colors associated with the vertex v.

If the vertex v is a leaf, then it is pre-colored and we add that color to L(v). Otherwise, let \(v_{1}, v_{2}, \dots , v_{d}\) be the children of v. The lists of tentative colors \(L(v_{j})\) of the children \(v_{j}\) are already computed. For each child \(v_{j}\), we traverse the list \(L(v_{j})\) and compute the frequency of each color in the multiset that is the union of these lists. Let frequency(i) denote the frequency of color i. We add all the colors with maximum frequency to L(v). The process is captured in Algorithm 4.

 

figure e

 

Phase 2:

We visit the vertices in pre-order to assign a color to each vertex. Let v be the current vertex in this order. If \(|L(v)| = 1\), then we fix the color of v to the only color in L(v). Otherwise, we check whether the color of the parent of v is present in L(v), and assign it to v if it is. If it is not, we pick an arbitrary color from L(v) and assign it to v. The process is captured in Algorithm 5.
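The two phases can be sketched together as follows. The sketch assumes the normalized input produced by Rules 2 and 3: a rooted tree whose leaves are exactly the pre-colored vertices and whose root is an internal vertex. The input format (`children`, `root`, `leaf_color`), the function name, and the use of `min` as the "arbitrary" choice are assumptions made so that the example is deterministic.

```python
from collections import Counter

def max_happy_edges(children, root, leaf_color):
    """children: dict vertex -> list of children; leaf_color: dict leaf -> color.
    Returns (number of happy edges, full coloring)."""
    parent, order, stack = {root: None}, [], [root]
    while stack:                          # pre-order: parents before children
        v = stack.pop()
        order.append(v)
        for c in children.get(v, []):
            parent[c] = v
            stack.append(c)

    # Phase 1: tentative color lists, children before parents.
    L = {}
    for v in reversed(order):
        kids = children.get(v, [])
        if not kids:
            L[v] = {leaf_color[v]}
        else:
            freq = Counter(c for u in kids for c in L[u])
            top = max(freq.values())
            L[v] = {c for c, f in freq.items() if f == top}

    # Phase 2: fix colors, parents before children.
    color, happy = {}, 0
    for v in order:
        p = parent[v]
        if p is not None and color[p] in L[v]:
            color[v] = color[p]           # reuse the parent's color when allowed
            happy += 1                    # the edge (p, v) is happy
        else:
            color[v] = min(L[v])          # any color of L(v) would do
    return happy, color
```

For a star whose three leaves are pre-colored 1, 1 and 2, the center receives color 1 and two edges are happy.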

 

Theorem 3

There is an O(nk) time algorithm for the k-MHE problem for trees.

Proof

At each vertex with degree d, we spend O(kd) time in Phase 1 and O(k) time in Phase 2. The time complexity is:

$$\begin{aligned} \sum _{v}O(kd) = O(nk). \end{aligned}$$
(15)

   \(\square \)

The correctness of the algorithm can be proved using induction on the size of the sub-tree similar to Theorem 2.

3.1 Generating All Optimal Happy Edge Colorings

Our algorithm can be extended to generate all the optimal happy edge colorings. We keep a list of tentative colors at each vertex. At a vertex v, if color(parent(v)) is present in L(v), then we assign color(parent(v)) to v in the optimal coloring. Otherwise, we can generate a different optimal coloring for each color in L(v). We point out that this scheme may miss some optimal colorings when color(parent(v)) is not present in L(v) but is present in the set of colors whose frequency is one less than the maximum frequency. In this case, we can assign color(parent(v)) to v even though it is not present in L(v). A special case of this scenario is a vertex v whose children all have distinct colors (the maximum frequency being 1). Even though color(parent(v)) is not present in L(v), we can assign color(parent(v)) to v, as it has zero frequency at v.

There may be exponentially many optimal happy edge colorings. Generating each optimal coloring takes polynomial time (linear time for fixed k).

3.2 k-MHE for Planar Graphs and Graphs with Bounded Branch Width

The Multiway-Cut problem is NP-Hard for planar graphs [2] when k, the number of terminals, is not fixed. This implies the following theorem on hardness of k-MHE for planar graphs for an arbitrary k.

Theorem 4

For an arbitrary k, the k-MHE problem is NP-Hard for planar graphs.

In [8], Robertson and Seymour introduced the notions of tree width and branch width. They showed that these two quantities are always within a constant factor of each other. Many graph problems that are NP-Hard for general graphs have been shown to be solvable in polynomial time for graphs with bounded tree width or equivalently bounded branch width. For more formal definitions of branch width and tree width we refer the readers to [8].

Definition 3

Multi-Multiway Cut

(Instance) We are given an undirected graph \(G = (V, E)\) and c sets of vertices \(S_{1}, S_{2}, \dots , S_{c}\).

(Goal) Find a set of edges \(C \subseteq E\) with minimum cardinality whose removal disconnects every pair of vertices in each set \(S_{i}\).

When \(c = 1\), the Multi-Multiway Cut problem is equivalent to the Multiway Cut problem. The k-MHE problem can also be formulated as a Multi-Multiway Cut problem, by creating one vertex set for every pair of pre-colored vertices with different colors. In [3], Deng et al. studied the Multi-Multiway Cut problem for graphs with bounded branch width and presented an \(O(b^{2b+2} \cdot 2^{2bc} \cdot |G|)\) time algorithm, where b is the branch width of the graph and c is the number of vertex sets. The algorithm runs in linear time when the branch width and the number of vertex sets are fixed.
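The vertex sets of this formulation are easy to list explicitly. The sketch below assumes the pre-coloring is given as a dictionary `precolor` with sortable vertex names; the function name is illustrative.

```python
from itertools import combinations

def terminal_pairs(precolor):
    """Return one two-element vertex set for every pair of pre-colored
    vertices that carry different colors.

    precolor : dict vertex -> color for the pre-colored vertices.
    """
    return [
        {u, v}
        for u, v in combinations(sorted(precolor), 2)
        if precolor[u] != precolor[v]
    ]
```

With p pre-colored vertices there are at most \(p(p-1)/2 \le p^{2}\) such sets.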

Theorem 5

When the branch width of the graph and the number of pre-colored vertices are bounded, there is a linear time algorithm for the k-MHE problem.

Proof

Let the number of pre-colored vertices be p and the branch width be b. For this instance of k-MHE, we can formulate a Multi-Multiway Cut problem with at most \(p^2\) vertex sets. Hence, the k-MHE problem can be solved in time \(O(b^{2b+2} \cdot 2^{2b p^{2}} \cdot |G|)\). In particular, when both the number of pre-colored vertices and the branch width are constants, the k-MHE problem can be solved in linear time.    \(\square \)

4 Conclusions

In this paper, we study the Maximum Happy Vertices (k-MHV) and Maximum Happy Edges (k-MHE) problems for trees. We have presented \(O(nk \log k)\) and O(nk) time algorithms for k-MHV and k-MHE problems respectively. Our algorithms run in linear time when k is fixed. Our algorithms can be extended to generate all the optimal colorings of the tree.

As a future direction, it would be interesting to study the hardness of the k-MHV problem for planar graphs. For fixed k, the Multiway Cut problem has a polynomial time algorithm for planar graphs [2]. So, for planar graphs and fixed k, polynomial time algorithms might be possible for k-MHV and k-MHE. Finding a linear time algorithm for graphs with bounded tree width (branch width) without the constraint on the number of pre-colored vertices is another direction.