1 Introduction

Given a hypergraph \({\mathcal {H}}=({\mathcal {V}},{\mathcal {E}} )\) with \(e\subseteq {\mathcal {V}}\) for all \(e\in {\mathcal {E}} \), a capacity \(u\in {\mathbb {N}}_{> 0 }\), and an upper bound \(k\in {\mathbb {N}}_{>0} \cup \{\infty \}\), the capacitated (or balanced) hypergraph vertex separator problem (CHVS) is to find a minimum cardinality subset of vertices \(\mathcal {S} \subset {\mathcal {V}}\) such that the remaining vertices decompose into at most \(k\) components (not necessarily connected) with at most \(u\) vertices each, i.e., no hyperedge is incident to more than one component. These components are called shores. The goal is equivalent to maximizing the number of vertices in the shores.

The CHVS is not only \({\mathcal {NP}}\)-hard, but also hard to approximate to within an additive term of the optimum, even when restricted to graphs with maximum node degree three and \(k=2\). This is an immediate consequence of Theorem 4.3 in [10].

We abbreviate \([n]:=\{1,\ldots ,n\}\) for any \(n \in {\mathbb {N}}_{>0}\) and \({\mathcal {E}} ({R}) :=\{ e\in {\mathcal {E}} : R\cap e\ne \emptyset \}\) for any \(R\subseteq {\mathcal {V}}\). For a single vertex \(v\in {\mathcal {V}}\) we write \({\mathcal {E}} (v)\) instead of \({\mathcal {E}} (\{v\})\). Furthermore, we assume w.l.o.g. that there are no isolated vertices.

Applications and literature Our main motivation for studying the \(\text{ CHVS } \) comes from a matrix decomposition problem: given a matrix \(A\in {\mathbb {R}}^{n\times m}\), a number k of blocks, and a block capacity \(u\), assign as many rows as possible to one of the blocks such that the number of rows assigned to each block is at most \(u\). Two rows assigned to different blocks must not share a column having a nonzero entry in both of them. The set of unassigned rows is called the border. By re-arranging the rows and columns blockwise the matrix attains the so-called single-bordered block-diagonal form. Identifying rows with vertices and columns with nets (spanning exactly the vertices whose corresponding rows have a nonzero entry in this column), we obtain a bijection between instances (and solutions) of CHVS and the matrix decomposition problem, see Fig. 1. The single-bordered block-diagonal form itself has a vast number of applications, e.g., in numerical linear algebra; see [19] for a survey. Examples of particular interest are the parallelized QR factorization [1] and determining how to apply a Dantzig-Wolfe reformulation to a mixed-integer linear program [8].

Fig. 1: Exploiting a relation to the CHVS, we are able to re-arrange a matrix (left) to single-bordered block-diagonal form (right). Black dots in the matrices represent non-zero entries. A smallest vertex separator (here: vertices a and f) corresponds to a minimum-size border
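
To make the row/column correspondence concrete, here is a small Python sketch (our own illustration, not code from the paper; the function name matrix_to_hypergraph and the toy matrix are hypothetical). Every row becomes a vertex and every column becomes a net spanning the rows with a nonzero entry in that column.

```python
# Build the CHVS hypergraph of a matrix: rows -> vertices, columns -> nets.
# Illustration only; names and data are hypothetical.

def matrix_to_hypergraph(A):
    """A is a list of rows (lists of numbers). Returns (vertices, nets)."""
    n_rows = len(A)
    n_cols = len(A[0]) if A else 0
    vertices = list(range(n_rows))
    nets = []
    for j in range(n_cols):
        net = frozenset(i for i in range(n_rows) if A[i][j] != 0)
        if net:                      # skip empty columns
            nets.append(net)
    return vertices, nets

# A toy sparsity pattern: a separator of rows corresponds to a border.
A = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]
V, E = matrix_to_hypergraph(A)
print(V)  # [0, 1, 2, 3]
print(E)  # one net per column with at least one nonzero entry
```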

The matrix decomposition problem also motivated the only exact approach to the \(\text{ CHVS } \) so far. Borndörfer et al. [9] propose a binary program which they solve by a tailored branch-and-cut algorithm. It is based on binary variables \(x^\ell _v\) that equal 1 if and only if vertex v is part of shore \(\ell \). Their model reads as follows.

$$\begin{aligned} \max \quad&\displaystyle \sum _{\ell \in \left[ k\right] } \sum _{v\in {\mathcal {V}}} x^\ell _v \nonumber \\ \text {s.t.}\quad&\sum _{\ell \in \left[ k\right] } x^\ell _v \le 1 \quad \forall v\in {\mathcal {V}}\end{aligned}$$
(B.1)
$$\begin{aligned}&\sum _{v \in {\mathcal {V}}} x^\ell _v \le u\quad \forall \ell \in \left[ k\right] \end{aligned}$$
(B.2)
$$\begin{aligned}&x^\ell _v + x^{\ell '}_{v'} \le 1 \quad \forall \ell ,\ell '\in \left[ k\right] ,\quad \ell \not = \ell ',\quad \forall v,v': {\mathcal {E}} (v) \cap {\mathcal {E}} (v') \ne \emptyset , \\&x^\ell _v\, \in \{0,1\}, \quad \forall \ell \in \left[ k\right] ,\quad \forall v\in {\mathcal {V}}.\nonumber \end{aligned}$$
(B.3)

Every vertex can be part of at most one shore via Eq. (B.1), and each shore is capacitated by Eq. (B.2). Central to this model is the conflict that vertices belonging to a common net cannot be part of different shores, see Eq. (B.3). Based on these latter packing-type constraints, the model is strengthened by several classes of valid inequalities in [9]. Since formulation (B) is symmetric in the shore index \(\ell \), even the tailored approach struggles with larger k. Oosten et al. [25] suggest a non-symmetric binary program; however, they assume \(k=\infty \). For the variant where the difference between component cardinalities is bounded by a parameter, Cornaz et al. [11] present a compact and an exponential-size formulation. The latter can be seen as an intermediate step towards what we propose in this paper. Yet, their approach also struggles with larger k. Most recently, Cornaz et al. [12] independently used similar ideas for the uncapacitated problem variant on graphs, cutting into at least k components.

The CHVS appears in several other contexts, in particular when restricted to graphs, where network structure is of interest: removing few vertices from a graph such that a certain number of components remain is a common topic in graph clustering [25] and partitioning [15]. In communication applications such vertices have a certain criticality for the network and the cardinality of a separator is a proxy for network robustness [6].

An overview of heuristic approaches, mainly for \(k=2\), is given in [15]. Polyhedral results (other than those in [9] already mentioned) can be found in [3, 13] (\(k=2\)) and [25] (\(k=\infty \)). For \(k=\infty \), the problem is polynomially solvable for several graph classes, also for the version with vertex weights [6].

Our contribution We model the \(\text{ CHVS } \) as an exponential-size binary program in which the symmetry in k is eliminated. Our approach is the first to consistently solve instances with larger k and thus complements previous exact approaches that work better/only for smaller k. We design a branch-and-price algorithm, for which it is remarkable that branching on so-called aggregated original variables works well. One key component of this (theoretically incomplete) branching scheme is a repair algorithm that may solve an auxiliary Bin Packing problem to find an integer solution based on a fractional one. We discuss the complexity of the pricing problem, a variant of the Next Release Problem [2], and solve it by heuristic and exact approaches. The optimal re-arrangements of matrices into single-bordered block-diagonal form constitute a contribution of their own.

2 Branch-and-price algorithm

A slight modification of formulation (B) for the CHVS reads as follows.

$$\begin{aligned} \max \quad&\displaystyle \sum _{\ell \in \left[ k\right] } \sum _{v\in {\mathcal {V}}} x^\ell _v\nonumber \\ \text {s.t.} \quad&\sum _{\ell \in \left[ k\right] } y^\ell _e\le 1 \quad \forall e\in {\mathcal {E}} , \end{aligned}$$
(P.1)
$$\begin{aligned}&\sum _{v\in {\mathcal {V}}} x^\ell _v \le u\quad \forall \ell \in \left[ k\right] , \end{aligned}$$
(P.2)
$$\begin{aligned}&x^\ell _v - y^{\ell }_{e} \le 0 \quad \forall \ell \in \left[ k\right] ,\, \forall e\in {\mathcal {E}} ,\quad \forall v \in e,\\&x^\ell _v ,\, y^\ell _e\in \{0,1\} \quad \forall v \in {\mathcal {V}},\, \forall e\in {\mathcal {E}} ,\quad \forall \ell \in \left[ k\right] .\nonumber \end{aligned}$$
(P.3)

Binary variable \(x^\ell _v\) equals 1 if and only if vertex v is part of shore \(\ell \). There is a binary variable \(y^\ell _e\) for all \(e\in {\mathcal {E}} \) and \(\ell \in [k]\) that equals 1 if \(x^\ell _v = 1\) for some \(v \in e\), which is enforced by constraints (P.3). The inequalities (P.1) enforce that every hyperedge touches at most one shore. Notice that since there are no isolated vertices, every vertex is assigned to at most one shore. Furthermore, constraints (P.2) ensure that every shore contains at most \(u\) vertices. The objective function maximizes the number of vertices assigned to some shore.
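
For illustration, formulation (P) can be written down with a generic modeling library; the following sketch uses PuLP (our choice, not the paper's implementation), and all function and variable names are ours.

```python
# Sketch of the compact formulation (P) with PuLP (illustrative library choice).
import pulp

def solve_P(vertices, nets, k, u):
    """vertices: list of vertex ids; nets: list of sets of vertices."""
    prob = pulp.LpProblem("CHVS_P", pulp.LpMaximize)
    x = {(v, l): pulp.LpVariable(f"x_{v}_{l}", cat="Binary")
         for v in vertices for l in range(k)}
    y = {(e, l): pulp.LpVariable(f"y_{e}_{l}", cat="Binary")
         for e in range(len(nets)) for l in range(k)}

    # objective: number of vertices assigned to some shore
    prob += pulp.lpSum(x.values())
    # (P.1): every hyperedge touches at most one shore
    for e in range(len(nets)):
        prob += pulp.lpSum(y[e, l] for l in range(k)) <= 1
    # (P.2): shore capacity
    for l in range(k):
        prob += pulp.lpSum(x[v, l] for v in vertices) <= u
    # (P.3): assigning v to shore l forces y for every incident net
    for e, net in enumerate(nets):
        for v in net:
            for l in range(k):
                prob += x[v, l] <= y[e, l]

    prob.solve()
    return [[v for v in vertices if x[v, l].value() > 0.5] for l in range(k)]
```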

The drawbacks of this formulation are two-fold. Firstly, the linear programming (LP) relaxation is weak: assigning \(x^\ell _v:=y^\ell _e:=\min \{ \frac{u}{m},\frac{1}{k}\}\) yields a feasible solution with objective value equal to \(\min \left\{ ku, m\right\} \), both of which are trivial bounds. Secondly, the formulation is highly symmetric, as for any feasible solution of (P) every permutation of the shore indices \(\ell \) yields another feasible solution of (P).

2.1 A shore based formulation

Let \({\mathcal {R}} :=\{ R \subseteq {\mathcal {V}} : |R|\le u\}\) denote the set of all possible shores, i.e., vertex subsets with cardinality at most \(u\). We consider the following natural shore-based ILP formulation (M), which formally is a Dantzig-Wolfe reformulation of (P): for every \(\ell \in \left[ k\right] \) one reformulates the corresponding constraints (P.2) and (P.3) into a separate subproblem, resulting in k identical subproblems that are aggregated into one single subproblem, thereby eliminating the symmetry of (P). The remaining constraints (P.1) form the master problem.

$$\begin{aligned} \max \quad&\displaystyle \sum _{R \in \mathcal {R}} \vert R |\lambda _R \nonumber \\ \text {s.t.} \quad&\sum _{\begin{array}{c} R \in \mathcal {R}: \\ e\cap R \ne \emptyset \end{array}} \lambda _R \le 1\quad (\beta _e) \quad \forall e\in {\mathcal {E}} \end{aligned}$$
(M.1)
$$\begin{aligned}&\sum _{R \in \mathcal {R}} \lambda _R \le k\quad (\gamma ) \\&\lambda _R \in \{0,1\} \quad \forall R \in \mathcal {R}.\nonumber \end{aligned}$$
(M.2)

Variable \(\lambda _R\) takes value 1 if and only if we select a shore consisting exactly of the vertices in R. The objective function maximizes the number of vertices assigned to some shore.

Constraints (M.1) ensure that for every hyperedge \(e\) there is at most one shore including a vertex incident to \(e\). Hence there are no two shores sharing a hyperedge. Constraint (M.2) assures that at most \(k\) shores are chosen. The LP relaxation of (M) is denoted by (MLP), in which the upper bounds on the variables need not be explicitly stated because of Eq. (M.1). The dual variables to the respective constraints are indicated in brackets.

Formulation (M) has \( \sum _{i=0}^u\left( {\begin{array}{c}m\\ i\end{array}}\right) \ge 2^u\) variables, therefore we solve it by branch-and-price, i.e., the relaxation (MLP) is solved by column generation [23]. We assume the reader to be familiar with both concepts. The restricted master problem arises from (M) by only considering a subset \({{\bar{\mathcal {R}}}}\) of \(\mathcal {R}\). Its LP relaxation is denoted by (RMLP). If there is no \(R\in \mathcal {R}\backslash {{\bar{\mathcal {R}}}}\) such that \(\lambda _R\) has positive reduced cost, an optimal solution of (RMLP) is also optimal for (MLP). Otherwise, we add at least one R to \({{\bar{\mathcal {R}}}}\) with \(\lambda _R\) having positive reduced cost, and solve (RMLP) again, see Sect. 2.3. Note that, in principle, (RMLP) is feasible for \( {{\bar{\mathcal {R}}}} = \emptyset \).
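
The following schematic Python sketch shows how (RMLP) and the pricing loop interact. It is a didactic reconstruction, not the paper's implementation (which is based on SCIP); the pricing routine price is treated as a black box, the initial columns and names are ours, and dual sign conventions may need adjustment depending on the LP solver.

```python
# Schematic column generation loop for (MLP); pricing is a black-box callback.
import pulp

def solve_MLP(vertices, nets, k, u, price):
    """price(beta, gamma) returns a shore R with positive reduced cost, or None.
    beta maps net index -> dual of (M.1), gamma is the dual of (M.2)."""
    columns = [frozenset([v]) for v in vertices]      # initial shores: singletons
    while True:
        # rebuild (RMLP) from scratch each round (fine for a sketch)
        rmlp = pulp.LpProblem("RMLP", pulp.LpMaximize)
        lam = [pulp.LpVariable(f"lam_{i}", lowBound=0) for i in range(len(columns))]
        rmlp += pulp.lpSum(len(R) * lam[i] for i, R in enumerate(columns))
        for e, net in enumerate(nets):                # (M.1)
            rmlp += pulp.lpSum(lam[i] for i, R in enumerate(columns)
                               if R & net) <= 1, f"edge_{e}"
        rmlp += pulp.lpSum(lam) <= k, "cardinality"   # (M.2)
        rmlp.solve(pulp.PULP_CBC_CMD(msg=False))

        beta = {e: rmlp.constraints[f"edge_{e}"].pi for e in range(len(nets))}
        gamma = rmlp.constraints["cardinality"].pi
        R = price(beta, gamma)                        # solve the pricing problem Pr
        if R is None:                                 # no positive reduced cost left
            return columns, [l.value() for l in lam]
        columns.append(frozenset(R))
```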

2.2 Branching

When (MLP) is solved to optimality, the optimal \(\lambda '\) might not be integer, i.e., \(\lambda ' \notin \{0,1\}^{\mathcal {R}}\). When there is a \(v \in {\mathcal {V}}\) with \(z_v^{\lambda '} :=\sum _{R \in \mathcal {R}: v \in R} \lambda '_R \in (0,1)\) we branch on the dichotomy that v is either part of the separator or a shore. This is realized by imposing constraints \(z_v^{\lambda '} = 0\) and \(z_v^{\lambda '} = 1\), respectively, in the two child nodes in the branch-and-price tree. Note that this can be interpreted as branching on aggregated original x-variables of (P): \(\sum _{\ell =1}^k x^\ell _v \notin \{0,1\}\) for \(v \in {\mathcal {V}}\). However, this branching scheme is not complete in theory, as the following example shows.

Example 1

Let \({\mathcal {H}} = (\{a,b,c,d,e,f\}, \{ \{a,b\}, \{c,d\}, \{e,f\}\})\), \(k=2\), and \(u=6\). Then, for \({{\bar{\mathcal {R}}}} :=\{ \{a,b\}, \{c,d\}, \{e,f\}, \{a,b,c,d,e,f\} \}\) the solution \(\lambda '_R = 0.5\) for \(R\in {{\bar{\mathcal {R}}}}\) and \(\lambda '_R = 0\) for \(R\in \mathcal {R}\backslash {{\bar{\mathcal {R}}}}\) is basic feasible and \(z_v^{\lambda '} =1 \) for every vertex \(v\). For a visualization of the solution and the relevant full rank part of the basis matrix see Fig. 2.

Fig. 2: Example for incomplete branching, i.e., \(\lambda ' \notin \{0,1\}^\mathcal {R}\) but \(z_v^{\lambda '} \in \{0,1 \}\)

In the case that \(\lambda ' \notin \{0,1\}^\mathcal {R}\) but \(z_v^{\lambda '} \in \{0,1 \}\) we are able (under mild assumptions) to retrieve an integral solution of the same objective function value as \(\lambda '\) by solving an auxiliary Bin Packing problem. Define \(V^{\lambda '} :=\{v \in {\mathcal {V}}: z_v^{\lambda '} = 1\}\) and let \(\mathcal {H}[V^{\lambda '}]\) denote the hypergraph induced by \(V^{\lambda '}\).

Proposition 1

Let \({\lambda '}\) be a solution of (RMLP) with \(z_v^{\lambda '} \in \{0,1 \}\). Then for every connected component C of \({\mathcal {H}}[V^{\lambda '} ]\) it holds that \(|C|\le u\).

Proof

Let C be a connected component of \({\mathcal {H}} [V^{\lambda '}]\) and let \(v \in C\) and \(e\in {\mathcal {E}} \) with \(v \in e\). Define \({\mathcal {R}}^+_w :=\{ R \in {\mathcal {R}} : \lambda '_R > 0, w \in R \}\). Since \(\{R \in {\mathcal {R}}: v \in R \} \subseteq \{R \in {\mathcal {R}}: e\cap R \ne \emptyset \} \) we have \(1 = z^{\lambda '}_v = \sum _{R \in {\mathcal {R}}: v \in R} \lambda '_R \le \sum _{R \in {\mathcal {R}}: e\cap R \ne \emptyset } \lambda '_R \le 1\), so both inequalities hold with equality. In particular, every \(R\) with \(\lambda '_R > 0\) and \(e\cap R \ne \emptyset \) contains \(v\). Hence \({\mathcal {R}}^+_{v_1} ={\mathcal {R}}^+_{v_2} \) for adjacent vertices \(v_1, v_2 \in C\), and since C is connected also \({\mathcal {R}}^+_{w_1} ={\mathcal {R}}^+_{w_2} \) for arbitrary vertices \(w_1, w_2 \in C\). Therefore, \(C\subseteq R\) for all \(R\in {\mathcal {R}}^+_{v}\) (since \(R\in {\mathcal {R}}^+_{w}\) for every \(w \in C\) and thus \(w \in R\)) and \(|C |\le u\). \(\square \)

Recall that for the Bin Packing problem items of non-negative weight must be assigned to bins such that the total weight in each bin does not exceed the bin capacity and the number of used bins is minimum.

Definition 1

Let \({\mathcal {H}}\) be a hypergraph with connected components \(C_1,\ldots , C_h\), and \(u\in {\mathbb {N}}\). The Bin Packing instance associated to \(({\mathcal {H}}, u)\) has h items of size \(|C_i |\) for item i and bin capacity \(u\).

The classical Gilmore-Gomory formulation [17] to solve such an instance reads:

$$\begin{aligned}&w^*= \min \displaystyle \sum _{P \in \mathcal {P}} \mu _P \\&\text {s.t.}\quad \sum _{\begin{array}{c} P \in \mathcal {P}: \\ i \in P \end{array} } \mu _P = 1 \quad \forall i \in [h] \\&\mu _P \in \{0,1\}\quad \forall P \in {\mathcal {P}} , \end{aligned}$$

where we define \(\mathcal {P} = \{ P \subseteq [h]: \sum _{i \in P} |C_i| \le u\}\). Let \(w^*_{\text {LP}}\) denote the optimum of its LP relaxation.

Following the general definition from [5], an instance of the Bin Packing problem has the integer round-up property if \(w^* = \lceil w^*_{\text {LP}} \rceil \) for that instance.

Proposition 2

Let \(\lambda '\) be a solution of (RMLP) with \(z_v^{\lambda '} \in \{0,1 \}\). If the Bin Packing instance associated to \(({\mathcal {H}}[V^{\lambda '}], u)\) has the integer round-up property, there exists an integer solution \({{\bar{\lambda }}}\) of (M) with \(\sum _{R\in \mathcal {R}} |R |\lambda '_R = \sum _{R\in \mathcal {R}} |R |{{\bar{\lambda }}}_R \).

Proof

Let \(\lambda '\) be a solution of (RMLP) with \(z_v^{\lambda '} \in \{0,1 \}\), \(v \in {\mathcal {V}}\). Let \(C_1,\ldots ,C_h\) be the connected components of \({\mathcal {H}}[V^{\lambda '}]\) and let the Bin Packing instance B associated to \(({\mathcal {H}}[V^{\lambda '}], u)\) have the integer round-up property. We have seen in the proof of Proposition 1 that for all connected components C of \({\mathcal {H}} [V^{\lambda '}]\) with \(R\cap C \ne \emptyset \) for R with \(\lambda '_R> 0\) it holds that \(C\subseteq R\). We define \(P(R):=\{i\in [h]: C_i\subseteq R\}\) for \(R\in {\mathcal {R}}\) with \(\lambda '_R> 0\). Since \(C_{i_1} \cap C_{i_2} = \emptyset \) for \(i_1,i_2\in [h]\) with \(i_1 \ne i_2\) we have \(\sum _{i\in P(R)} |C_i |\le |R |\le u\). We define \(\mu \) such that \(\mu _{P(R)}:=\lambda '_R\) for R with \(\lambda '_R> 0\) and \(\mu _P :=0\) for all remaining \(P\in {\mathcal {P}}\) [with \(P\ne P(R)\) for all R with \(\lambda '_R> 0\)]. Then \(\mu \) is a fractional solution of the Gilmore-Gomory formulation of B with objective value \(w_{\lambda '}=\sum _{R\in {\mathcal {R}}}\lambda '_R\). With \(w^*\) and \(w^*_\text {LP}\) denoting the optimum values of the Gilmore-Gomory formulation of B and its LP relaxation, respectively, we get \(k\ge \lceil w_{\lambda '}\rceil \ge \lceil w^*_\text {LP} \rceil = w^* \). The last equality holds since B has the integer round-up property. Every solution for B with objective function value \(\ell \) can be translated to a solution of CHVS using \(\ell \) shores by assigning all vertices of components whose corresponding items lie in the same bin to the same shore. Let \({{\bar{\lambda }}}\) be such a solution originating from an optimal solution of B. Then \({{\bar{\lambda }}}\) is also feasible for (M) with \(V^{\lambda '} = V^{{{\bar{\lambda }}}}\) and thus \(\sum _{R\in \mathcal {R}} |R |\lambda '_R = \sum _{R\in \mathcal {R}} |R |{{\bar{\lambda }}}_R\). \(\square \)

Based on these results, we formulate Algorithm 1 to retrieve an integer solution from a fractional solution \(\lambda '\), showing that no branching is necessary for the current branch-and-price node when \(z_v^{\lambda '} \in \{0,1 \}\). As a consequence of Proposition 1, Algorithm 1 never terminates in line 3 if the input hypergraph \({\mathcal {H}}\) is of the form \({\mathcal {H}} = {\mathcal {H}}[V^{\lambda '}]\) with \(z_v^{\lambda '} \in \{0,1 \}\) for \(\lambda '\). Hence, it terminates either (a) by finding a feasible solution with the same objective function value as an optimal solution of (RMLP) in the current node, or (b) in line 9. Case (b) happens if and only if the integer round-up property does not hold for the Bin Packing instance associated to \(({\mathcal {H}}, u)\). It is an open question whether this can actually occur; in our numerous computational tests it never did. However, we believe that a pathological example can be crafted, so we state:

Conjecture 1

The Bin Packing instances constructed for use in Algorithm 1 do not always have the integer round-up property.

To be on the safe side, we present in “Appendix A.2” (online only) a fallback branching rule, essentially the Ryan-Foster rule [26], to cover this case theoretically.

Algorithm 1 (pseudocode figure)
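
Since Algorithm 1 is only reproduced as a figure above, the following Python sketch reconstructs the repair idea from the surrounding text. It is our own simplification: in particular, it packs components by first-fit decreasing instead of solving the auxiliary Bin Packing problem exactly, so it may give up in cases the exact variant would still handle; all names are ours.

```python
# Sketch of the repair idea behind Algorithm 1 (our reconstruction; first-fit
# decreasing replaces the exact Bin Packing solve used in the paper).
from collections import defaultdict

def aggregate_z(vertices, columns, lam_values):
    """z_v = sum of lambda'_R over shores R containing v."""
    z = {v: 0.0 for v in vertices}
    for R, val in zip(columns, lam_values):
        for v in R:
            z[v] += val
    return z

def connected_components(vertices, nets):
    """Components of the sub-hypergraph induced by `vertices` (union-find)."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for net in nets:
        net = [v for v in net if v in parent]
        for w in net[1:]:
            parent[find(w)] = find(net[0])
    comps = defaultdict(list)
    for v in vertices:
        comps[find(v)].append(v)
    return list(comps.values())

def repair(kept_vertices, nets, k, u):
    """Try to split the kept vertices (those with z_v = 1) into <= k shores of size <= u."""
    comps = connected_components(kept_vertices, nets)
    if any(len(C) > u for C in comps):
        return None          # cannot happen here by Proposition 1
    bins = []                # first-fit decreasing on component sizes
    for C in sorted(comps, key=len, reverse=True):
        for b in bins:
            if sum(len(B) for B in b) + len(C) <= u:
                b.append(C)
                break
        else:
            bins.append([C])
    if len(bins) > k:
        return None          # would trigger the fallback branching
    return [[v for C in b for v in C] for b in bins]
```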

Remark 1

The first Bin Packing instance that does not have the integer round-up property was found by Marcotte [24] in 1986. More recently, Kartak et al. [18] computationally showed that there are instances with 10 items that do not have the integer round-up property and that all instances with nine or fewer items have it.

2.3 Pricing

In the pricing problem we find a variable with (maximum) positive reduced cost to add to (RMLP), or prove that none exists. In the root node of the branch-and-price tree the reduced cost \({\bar{c}}_R\) of variable \(\lambda _R\) for \(R \subseteq {\mathcal {V}}\) with \(|R |\le u\) is \(|R |- \sum _{e\in {\mathcal {E}} ({R}) } \beta _e- \gamma \). Branching decisions further down the tree can easily be respected (this is also true for the fallback branching as described in “Appendix A.2”): if \(z_v^{\lambda '}=0\) for some \(v \in {\mathcal {V}}\), vertex v cannot be chosen for any shore; therefore we simply do not consider it. If \(z_v^{\lambda '}=1\) for some \(v \in {\mathcal {V}}\), vertex v has to be chosen for exactly one shore. Hence we have to consider the value of the corresponding dual variable \(\alpha _v\) of \(\sum _{R \in \mathcal {R}: v \in R} \lambda _R = 1\) and the reduced costs become

$$\begin{aligned} {\bar{c}}_R = \displaystyle |R |- \left( \sum _{v\in R} \alpha _v + \sum _{e\in {\mathcal {E}} ({R}) } \beta _e+ \gamma \right) = \sum _{v\in R} (1 - \alpha _v) - \sum _{e\in {\mathcal {E}} ({R}) } \beta _e- \gamma . \end{aligned}$$

Hence, the pricing problem is to find a subset of vertices \(R\subseteq {\mathcal {V}}\) with \(|R |\le u\) such that an objective function of the form \(c^*_R:=\sum _{v\in R} p_v - \sum _{e\in {\mathcal {E}} ({R}) } c_e\) for \(p\in {\mathbb {R}}^{\mathcal {V}}\), and \(c\in {\mathbb {R}}^{\mathcal {E}} _{+}\) is maximized. We denote this problem as \(\textsc {Pr}\) and a specific instance as \(\textsc {Pr}(\mathcal {H}, p,c,u)\). Note that we can assume w.l.o.g. that \(p_v > 0\) (v could otherwise be excluded from any optimal solution) and \(c_e\ge 0\) (since \(\beta _e\ge 0\)).
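
For reference, evaluating the pricing objective of a candidate shore is straightforward; the following helper (ours, purely illustrative) computes \(c^*_R\) for given profit and cost vectors.

```python
# Evaluate the pricing objective c*_R = sum_{v in R} p_v - sum_{e in E(R)} c_e.
def pricing_value(R, nets, p, c):
    """p maps vertices to profits, c maps net indices to non-negative costs."""
    R = set(R)
    touched = [e for e, net in enumerate(nets) if net & R]   # the set E(R)
    return sum(p[v] for v in R) - sum(c[e] for e in touched)
```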

2.3.1 Applications

Problem Pr can be seen as a variant of the Next Release Problem [2]: we are given a set of customers I and a set of possible software enhancements R, with profits \(p_i\) for every customer \(i \in I\), programming costs \(c_j\) for every enhancement \(j \in R\), and a set of demanded enhancements \(R_i \subseteq R\) for every customer i. The task is to find a subset of customers S such that the total profit \(\sum \nolimits _{i \in S} p_i\) is maximum and the needed programming costs \(\sum \nolimits _{j \in \bigcup \nolimits _{i \in S} R_i} c_j\) do not exceed a given value. In problem \(\textsc {Pr}\) the programming costs are part of the objective function and the number of chosen customers is bounded by a given value.

2.3.2 Complexity

Proposition 3

The pricing problem \(\textsc {Pr}\) is \({\mathcal {NP}}\)-hard.

Proof

We reduce the Clique problem to Pr. Consider an instance of the decision variant of Clique, i.e., an integer \(\ell \) and an undirected graph \(G=(V,E)\). The task is to find a clique in G with at least \(\ell \) nodes if one exists. The reduction works as follows: we take G as input for Pr, assigning unit costs \(c_e= 1, e\in E\), to the edges and “irresistible” profits \(p_v= |E |+ 1, v \in V\), to the vertices. By solving \(\textsc {Pr}(G,p,c,u) \) one gets a subset of nodes R with \(|R |= u\) such that the number \(|{\mathcal {E}} (R)|\) of edges having at least one endpoint in R is minimum. Hence \(V\backslash R\) is a subset of nodes with \(|V\backslash R|= |V |- u\) such that the number of edges with both endpoints in \(V\backslash R\) is maximized. Therefore it can be checked whether there exists a clique in G with exactly \(|V |- u\) nodes by checking if \(|E |- |{\mathcal {E}} (R) |= \left( {\begin{array}{c} |V |- u\\ 2\end{array}}\right) \). Thus by solving \(\textsc {Pr}(G,p,c,u)\) for \(u=0,1,\ldots ,|V |- \ell \) one can check whether G has a clique with at least \(\ell \) nodes. \(\square \)

The following result (we give a proof in “Appendix A.1”, online only) was already observed by Barahona and Jensen [4], who found bounds for a location problem with inventory cost by applying Dantzig-Wolfe decomposition. The pricing problem they solved is \(\textsc {Pr}\) with relaxed cardinality constraint, which is polynomially solvable, and in particular an optimal solution can be calculated by finding a minimum s-t cut:

Proposition 4

The problem \(\textsc {Pr}\) is polynomially solvable for \(u= |\mathcal {V} |\).
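
One standard way to solve the uncapacitated case is a project-selection (maximum-closure) construction via a minimum s-t cut. The paper's proof is in the online appendix, so the sketch below is our assumption of how such a reduction can look, written with networkx; all node labels and names are ours.

```python
# Solve Pr without the cardinality constraint (u = |V|) via a minimum s-t cut
# (project-selection construction; our sketch, assuming p_v > 0 and c_e >= 0).
import networkx as nx

def solve_uncapacitated_pr(vertices, nets, p, c):
    G = nx.DiGraph()
    for v in vertices:
        G.add_edge("s", ("v", v), capacity=p[v])        # profit of selecting v
    for e, net in enumerate(nets):
        G.add_edge(("e", e), "t", capacity=c[e])        # cost of touching net e
        for v in net:
            G.add_edge(("v", v), ("e", e))              # no capacity = infinite
    cut_value, (source_side, _) = nx.minimum_cut(G, "s", "t")
    R = [v for v in vertices if ("v", v) in source_side]
    objective = sum(p[v] for v in vertices) - cut_value
    return R, objective
```

The design rests on the identity \(\max _R c^*_R = \sum _v p_v - \min _R \big (\sum _{v\notin R} p_v + \sum _{e\in {\mathcal {E}}(R)} c_e\big )\), whose right-hand minimum is exactly a minimum s-t cut in the network above.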

2.3.3 Preprocessing the pricing problem

The following proposition states that unprofitable vertices can be identified by solving what we call the uncapacitated instance \(\textsc {Pr}({\mathcal {H}},p,c, |{\mathcal {V}}|)\), i.e., with \(u=|{\mathcal {V}}|\). We denote the set of optimal solutions to \(\textsc {Pr}({\mathcal {H}},p,c,u)\) by \(\mathcal {R}^*_{u}\).

Proposition 5

Let \(R'_\infty \in \mathcal {R}^*_{|{\mathcal {V}}|}\) denote an optimal solution to the uncapacitated instance \(\textsc {Pr}({\mathcal {H}},p,c, |{\mathcal {V}}|)\). For all \(u\in {\mathbb {N}}\) there exists an optimal solution \(R \in \mathcal {R}^*_{u}\) with \(R \subseteq R'_\infty \).

Proof

Consider an optimal solution \( R'_u\in \mathcal {R}^*_{u}\) for the capacitated instance. We show that \(R:=R'_u\cap R'_\infty \) is an optimal solution for \(\textsc {Pr}({\mathcal {H}},p,c,u)\). Clearly, the vertex set \(R_\infty :=R'_u\cup R'_\infty \) is a solution for \(\textsc {Pr}({\mathcal {H}},p,c, |{\mathcal {V}}|)\). Since \({\mathcal {E}} (R) = {\mathcal {E}} (R'_u \cap R'_\infty ) \subseteq {\mathcal {E}} (R'_u) \cap {\mathcal {E}} (R'_\infty ) \), and \(c_e\ge 0 \) for all \(e\in {\mathcal {E}} \), we obtain

$$\begin{aligned} c^*_{R} + c^*_{R_\infty }&= c^*_{R} + c^*_{R'_u} + c^*_{R'_\infty } - \sum _{v \in R'_u \cap R'_\infty } p_v + \sum _{e\in {\mathcal {E}} (R'_u) \cap {\mathcal {E}} (R'_\infty ) } c_e\\&\ge c^*_{R} + c^*_{R'_u} + c^*_{R'_\infty } - \sum _{v \in R } p_v + \sum _{e\in {\mathcal {E}} (R) } c_e\\&= c^*_{R'_u} + c^*_{R'_\infty } . \\ \end{aligned}$$

By definition, R is feasible for \(\textsc {Pr}({\mathcal {H}},p,c,u)\) and \(R_\infty \) for \(\textsc {Pr}({\mathcal {H}},p,c,|{\mathcal {V}}|)\). Since \(c^*_{R}\le c^*_{R'_u}\) and \(c^*_{R_\infty }\le c^*_{R'_\infty }\) by optimality of \(R'_u\) and \(R'_\infty \), the inequality above must hold with equality in both terms, so R and \(R_\infty \) are optimal for \(\textsc {Pr}({\mathcal {H}},p,c,u)\) and \(\textsc {Pr}({\mathcal {H}},p,c,|{\mathcal {V}}|)\), respectively. \(\square \)

Remark 2

We use Propositions 4 and 5 to preprocess the instance of the pricing problem. More precisely, we solve \(\textsc {Pr}({\mathcal {H}},p,c, |{\mathcal {V}}|)\) and exclude (for this single pricing iteration) all vertices that are not part of an optimal solution of \(\textsc {Pr}({\mathcal {H}},p,c, |{\mathcal {V}}|)\).

In the following, we present several algorithms that solve the pricing problem heuristically or exactly. All of them will be used in our implementation.

2.3.4 Greedy heuristic

For a subset of vertices \(R\subseteq {\mathcal {V}}\) and a vertex \(v \notin R\) we compute by \(c(R,v):=p_v - \sum \nolimits _{e\in {\mathcal {E}} (v)\backslash {\mathcal {E}} (R)} c_e\) the change of the objective function value that would occur by including v in R. Starting with an initially empty set, Algorithm 2 greedily adds a vertex that is locally most profitable until \(u\) vertices are included. For every intermediate R the corresponding variable \(\lambda _R\) is added if its reduced cost is positive.

Algorithm 2 (pseudocode figure)
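
Since Algorithm 2 is only available as a figure, the following sketch is our reconstruction of the greedy pricing heuristic from the text; the callback reduced_cost (computing \({\bar{c}}_R\) from the current dual values) is an assumption of ours.

```python
# Sketch of the greedy pricing heuristic (our reconstruction of Algorithm 2).
def greedy_pricing(vertices, nets, p, c, u, reduced_cost):
    """Collects intermediate shores whose reduced cost is positive."""
    R, covered = set(), set()                 # covered = E(R)
    candidates = []
    while len(R) < min(u, len(vertices)):
        best_v, best_gain = None, None
        for v in vertices:
            if v in R:
                continue
            new_edges = [e for e, net in enumerate(nets)
                         if v in net and e not in covered]
            gain = p[v] - sum(c[e] for e in new_edges)    # c(R, v)
            if best_gain is None or gain > best_gain:
                best_v, best_gain = v, gain
        R.add(best_v)
        covered.update(e for e, net in enumerate(nets) if best_v in net)
        if reduced_cost(frozenset(R)) > 0:
            candidates.append(frozenset(R))
    return candidates
```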

2.3.5 Multi-start iterated local search

The following Algorithm 3 is a multi-start iterated local search with changing neighborhoods. It is started for several runs with a random initial solution, and in each iteration a neighbor of largest improvement is chosen. For the number of neighborhood types we choose \({{\bar{\ell }}} = 4\). In every iteration we have a feasible solution \(R \in \mathcal {R}\) for \(\textsc {Pr}({\mathcal {H}},p,c, u)\) and a neighborhood level \(\ell \in [{{\bar{\ell }}}]\). If an improving neighbor exists, i.e., \(\max \nolimits _{ R' \in \mathcal {N}_\ell (R)} {\bar{c}}_{ R'} > {\bar{c}}_{R}\), then R is set to a most profitable neighbor in \(\mathop {{\mathrm{arg}}\,{\mathrm{max}}}\nolimits _{R' \in \mathcal {N}_\ell (R)} {\bar{c}}_{R'}\), found by enumeration of \(\mathcal {N}_\ell (R)\), and the neighborhood level \(\ell \) is reset to 1. Otherwise, the neighborhood level is increased by one if \(\ell \ne {{\bar{\ell }}}\). If all neighborhoods are exhausted we add \(\lambda _R\) to (RMLP) if \({\bar{c}}_R>0\).

Algorithm 3 (pseudocode figure)

The neighborhood types we use are \(\mathcal {N}_1 :=\mathcal {N}^{{\mathcal {V}}}_1 \), \(\mathcal {N}_2 :=\mathcal {N}^{{\mathcal {E}} }_1 \), \(\mathcal {N}_3 :=\mathcal {N}^{{\mathcal {E}} }_2 \), \(\mathcal {N}_4 :=\mathcal {N}^{{\mathcal {E}} }_3 \), with \(\mathcal {N}^{{\mathcal {V}}}_\ell (R) :=\{ {\bar{R}} \in \mathcal {R}: |R \triangle {\bar{R}}|= \ell \} \) and \(\mathcal {N}^{{\mathcal {E}} }_\ell (R) :=\{ {\bar{R}} \in \mathcal {R}: |{\mathcal {E}} (R) \triangle {\mathcal {E}} ({{\bar{R}}})|= \ell \} \) with the symmetric difference \(A\triangle B := (A\backslash B ) \cup (B \backslash A )\). In our implementation the neighborhood \(\mathcal {N}_4\) is explored at most once in every run of the algorithm.
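
As a small illustration of the neighborhood definitions (our own sketch, not part of the paper's implementation), the simplest neighborhood \(\mathcal {N}^{{\mathcal {V}}}_1\) consists of all feasible shores that differ from R in exactly one vertex:

```python
# Enumerate N^V_1(R): shores differing from R in exactly one vertex.
def vertex_neighborhood(R, vertices, u):
    R = frozenset(R)
    neighbors = [R - {v} for v in R]                  # remove one vertex
    neighbors += [R | {v} for v in vertices
                  if v not in R and len(R) < u]       # add one vertex, keep |R| <= u
    return neighbors
```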

2.3.6 Integer linear program

The following binary program (Pr) can be used to solve \(\textsc {Pr}({\mathcal {H}}, p,c, u)\) if \(c_e\ge 0\) for all \(e\in {\mathcal {E}} \):

$$\begin{aligned} \max \quad&\displaystyle \sum _{v\in {\mathcal {V}}} p_v x_v - \sum _{e\in {\mathcal {E}} } c_ey_e \end{aligned}$$
(Pr.1)
$$\begin{aligned} \text {s.t.} \quad&\sum _{v\in {\mathcal {V}}} x_v \le u \end{aligned}$$
(Pr.2)
$$\begin{aligned}&x_v - y_e\le 0 \quad \forall v \in {\mathcal {V}}, e\in {\mathcal {E}} : v \in e, \end{aligned}$$
(Pr.3)
$$\begin{aligned}&x_v ,\, y_e\in \{0,1\}, \quad \forall v\in {\mathcal {V}},\ e\in {\mathcal {E}} . \end{aligned}$$
(Pr.4)

Variable \(x_v\) equals 1 if and only if \(v \in R\), and Eq. (Pr.2) ensures that at most \(u\) vertices are chosen. Variable \(y_e\) attains value 1 if \(e \in {\mathcal {E}} ({R})\), which is guaranteed by Eq. (Pr.3).
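
A direct way to state (Pr) in code is again a modeling-library sketch (PuLP, our choice; all names are ours):

```python
# Sketch of the exact pricing ILP (Pr) with PuLP (illustrative library choice).
import pulp

def solve_pricing_ilp(vertices, nets, p, c, u):
    prob = pulp.LpProblem("Pr", pulp.LpMaximize)
    x = {v: pulp.LpVariable(f"x_{v}", cat="Binary") for v in vertices}
    y = {e: pulp.LpVariable(f"y_{e}", cat="Binary") for e in range(len(nets))}
    # (Pr.1): profits of chosen vertices minus costs of touched hyperedges
    prob += pulp.lpSum(p[v] * x[v] for v in vertices) \
          - pulp.lpSum(c[e] * y[e] for e in range(len(nets)))
    prob += pulp.lpSum(x.values()) <= u                     # (Pr.2)
    for e, net in enumerate(nets):                          # (Pr.3)
        for v in net:
            prob += x[v] <= y[e]
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [v for v in vertices if x[v].value() > 0.5]
```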

2.3.7 Complementary pricing

The greedy and local search heuristics and finally the exact ILP (Pr) are run in cascade. Additionally, after any one pricing algorithm has found a variable \(\lambda _R\) with positive reduced cost, this algorithm is restarted on \({\mathcal {H}}' = ({\mathcal {V}} \backslash R,\ {\mathcal {E}} [{\mathcal {V}} \backslash R] )\). This can be applied repeatedly in rounds. If no variable with positive reduced cost is found, no restart takes place. The resulting complementary subsets are expected to combine well into integer feasible solutions, see e.g., [16].
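
A minimal sketch of the complementary restart loop (our illustration; price_one stands for any one of the pricing algorithms, and the default of 8 rounds reflects the setting reported in Sect. 3.1):

```python
# Complementary pricing: after a shore R with positive reduced cost is found,
# restart the same pricing algorithm on the hypergraph without R.
def complementary_rounds(vertices, nets, price_one, max_rounds=8):
    """price_one(vertices, nets) returns a shore with positive reduced cost or None."""
    found = []
    remaining = set(vertices)
    for _ in range(max_rounds):
        nets_rem = [set(net) & remaining for net in nets]
        nets_rem = [net for net in nets_rem if net]     # drop emptied nets
        R = price_one(remaining, nets_rem)
        if R is None:
            break                                       # no restart takes place
        found.append(R)
        remaining -= set(R)
    return found
```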

2.4 Preprocessing the hypergraph

We preprocess \(\mathcal {H}\) in two phases: in phase 1 we only remove hyperedges that are contained in others. If there are identical hyperedges we remove all but one of them. This can easily be done by enumeration.

The idea of phase 2 is to express the conflicts between vertices (imposed by hyperedges) by a smaller set of hyperedges. Consider the clique graph of \({\mathcal {H}}\) defined as \({G}({\mathcal {H}}) :=({\mathcal {V}}, E :=\{vw \mid \exists e\in {\mathcal {E}} : v,w \in e\})\). Two hypergraphs \({\mathcal {H}}, {\mathcal {H}}_1\) with identical clique graph \(G({\mathcal {H}}) = G({\mathcal {H}}_1)\) yield identical solution spaces for the CHVS. In order to find such a simpler hypergraph \(\mathcal {H}_1\) we search for a minimum clique edge cover in \({G}({\mathcal {H}})\). A (minimum) clique edge cover \(\mathcal {C} = \{C_1,\ldots ,C_\ell \}\) of a graph is a (minimum cardinality) set of cliques such that each edge \(vw \in E\) is contained in at least one clique, i.e., there is an \(h \in [\ell ]\) with \(\{v,w\}\subseteq C_h\). Then we replace each clique \(C \in \mathcal {C}\) by a hyperedge spanning exactly the vertices \(q\in C\). Thus we obtain \(\mathcal {H}_1\) with \(G({\mathcal {H}}) = G({\mathcal {H}}_1)\).

Since the hyperedges of \({\mathcal {H}}\) represent a clique edge cover in \(G({\mathcal {H}})\) of cardinality m, we can assume that \(\ell \le m\), and hence the number of hyperedges does not increase if the clique edge cover is minimum. In practice we use the polynomial-time heuristic of Kou et al. [22], which is based on a heuristic by Kellerman [20], and replace the original hyperedges by the found cliques if this decreases their number.
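
The following sketch illustrates phase 2. It uses a plain greedy clique cover as a stand-in for the Kou et al. heuristic actually used in the paper, and, like the paper, only replaces the hyperedges if their number decreases; all names are ours.

```python
# Simplified sketch of phase 2: cover the edges of the clique graph G(H) by
# cliques and turn each clique into one hyperedge (greedy stand-in for [22]).
from itertools import combinations

def phase2(vertices, nets):
    adj = {v: set() for v in vertices}
    for net in nets:                       # build the clique graph G(H)
        for v, w in combinations(net, 2):
            adj[v].add(w)
            adj[w].add(v)
    uncovered = {frozenset((v, w)) for v in vertices for w in adj[v]}
    cliques = []
    while uncovered:
        v, w = tuple(next(iter(uncovered)))
        clique = {v, w}
        for x in vertices:                 # greedily extend to a maximal clique
            if x not in clique and all(x in adj[y] for y in clique):
                clique.add(x)
        cliques.append(frozenset(clique))
        uncovered -= {frozenset(pair) for pair in combinations(clique, 2)}
    return cliques if len(cliques) < len(nets) else nets
```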

2.5 Primal heuristic

Algorithm 1 can be used to verify whether a subset of vertices S disconnects \({\mathcal {H}}\) into at most k components of cardinality at most \(u\). We exploit this fact by using it as a primal heuristic during the solution of the restricted master problem. In order to get a potential separator S we randomly round the possibly fractional solution values \(z_v^{\lambda '} = \sum _{R \in \mathcal {R}: v \in R} \lambda '_R\). More specifically, a vertex \(v \in {\mathcal {V}}\) is added to S with probability \(z_v^\lambda + \delta \), where \(\lambda \) is the current LP solution and \(\delta \in \{-\,0.001, 0, 0.05, 0.1,0.2, 0.3 \}\) is chosen uniformly at random and fixed for this run of the heuristic. The number of runs is 200 and the heuristic is called directly after solving a branch-and-price node and after every 50 column generation iterations.

2.6 Exchange vectors

For a given subset of vertices \(R\subseteq {\mathcal {V}}\) and a hyperedge \(e\in {\mathcal {E}} (R)\) one can easily construct \(R'(R,e):=R\backslash e\). By adding an artificial variable \(\nu _e\) for every \(e\in {\mathcal {E}} \), corresponding to removing \(e\) with all \(v \in e\) from a shore, one can implicitly use the pattern variables \(\bigcup \nolimits _{ R\in \bar{\mathcal {R}} } \bigcup \nolimits _{e\in {\mathcal {E}} } R'(R, e) \) that are not an explicit part of the model when solving (RMLP). Note again that in formulation (MLP) the upper bound constraints \(\lambda _R\le 1\) are already implied by Eq. (M.1) (since there are no isolated vertices). We obtain the following augmented master LP formulation (AMLP):

$$\begin{aligned} \max \quad&\displaystyle \sum _{R \in \mathcal {R}} |R |\lambda _R - \sum _{e\in {\mathcal {E}} } |e|\nu _e\nonumber \\ \text {s.t.} \quad&\sum _{\begin{array}{c} R \in \mathcal {R}: e\in {\mathcal {E}} (R) \end{array}} \lambda _R - \nu _e\le 1 \quad \forall e\in {\mathcal {E}}&\end{aligned}$$
(AMLP.1)
$$\begin{aligned}&\sum _{R \in \mathcal {R}} \lambda _R \le k\\&\lambda _R \ge 0 \quad \forall R \in \mathcal {R}&\nonumber \\&\nu _e\ge 0 \quad \forall e\in {\mathcal {E}}&\nonumber \end{aligned}$$
(AMLP.2)

The new variables, whose coefficient columns are called exchange vectors, translate to the following constraints in the dual (called dual-optimal inequalities [7]):

$$\begin{aligned} \beta _e\le |e|\quad \forall e\in {\mathcal {E}} \end{aligned}$$

The following proposition states their validity, i.e., that (MLP) and (AMLP) are equivalent.

Proposition 6

Let \(z_{\text {MLP}}\) and \(z_{\text {AMLP}}\) denote the respective optima of (MLP) and (AMLP). Then \(z_{\text {MLP}} = z_{\text {AMLP}}\).

Proof

We use the dual (DMLP) of formulation (MLP)

$$\begin{aligned} \min \quad&\displaystyle \sum _{e\in {\mathcal {E}} } \beta _e+ k\gamma \nonumber \\ \text {s.t.}\quad&\sum _{e\in {\mathcal {E}} ({R}) } \beta _e+ \gamma \ge |R |\quad \forall R \in \mathcal {R}\\&\beta _e, \gamma \ge 0 \quad \forall e\in {\mathcal {E}} ,\nonumber \end{aligned}$$
(DMLP.1)

and show that the inequalities \(\beta _e\le |e|\) for \(e\in {\mathcal {E}} \) are fulfilled by all optimal solutions of (DMLP). Assume by contradiction that there is an optimal solution \((\beta ^*,\gamma ^*)\) to (DMLP) with \(\beta ^*_{e'} > |e' |\) for some \(e' \in {\mathcal {E}} \). Then, there exists at least one subset of vertices \(R' \in \mathcal {R}\) with \(e' \cap {R'} \ne \emptyset \) and \(\sum _{e\in {\mathcal {E}} ({R'})} \beta ^*_{e} + \gamma ^* = |{R'} |\) (otherwise \(\beta ^*_{e'}\) could be reduced, contradicting optimality). In particular, \(|{R'} |\ge \beta ^*_{e'} > |e' |\) and hence, \({\tilde{R}} :={R'} \backslash {e'}\) is nonempty and \(|{\tilde{R}} |+ |e' |\ge |{R'} |\). Since \({\mathcal {E}} ({R'}) \supseteq {\mathcal {E}} ({{\tilde{R}}}) \cup \{e'\} \), we get

$$\begin{aligned}&\sum _{e\in {\mathcal {E}} ({\tilde{R}})} \beta ^*_{e} + \gamma ^* + |e' |\ge |{\tilde{R}} |+ |e' |\ge |R' |\\&\qquad = \sum _{e\in {\mathcal {E}} ({R'})} \beta ^*_{e} + \gamma ^* \ge \sum _{e\in {\mathcal {E}} ({{\tilde{R}}})} \beta ^*_{e} + \beta ^*_{e'} + \gamma ^*, \\ \end{aligned}$$

contradicting \(\beta ^*_{e'} > |e' |\). \(\square \)

3 Computational results

We first introduce our computational environment. We then compare our algorithm with a commercial solver applied to the original formulation for different values of k. Thirdly, we investigate the influence of different algorithmic ingredients proposed in the previous section. Finally, we visualize some matrix decompositions complementing those known from the literature. This section contains mainly aggregate information; details can be found in the online appendix.

3.1 Implementation

Our algorithm, denoted as base, is a branch-and-price algorithm to solve formulation (M). Its default settings were derived in extensive experiments with the described features, see Sect. 3.7. The branching is executed as described in Sect. 2.2. The three pricing algorithms (Sects. 2.3.4–2.3.6) run in the following order: greedy heuristic, local search heuristic, and integer linear program (Pr). If a variable with positive reduced cost is found, the remaining pricing algorithm(s) will not be called. We set the maximal number of complementary pricing rounds (Sect. 2.3.7) for each algorithm to 8. The local search pricing heuristic is restarted three times for each complementary round. The preprocessing of the pricing problem (described in Sect. 2.3.3) is executed before each call of the exact pricing algorithm, i.e., solving formulation (Pr). The preprocessing of the hypergraph and the primal heuristic are implemented as described in Sects. 2.4 and 2.5, respectively. The exchange vectors (cf. Sect. 2.6) are disabled by default.

3.2 Environment

The branch-and-price algorithm is implemented in SCIP 3.2.0 with CPLEX 12.6.1, running on a single thread with default settings (with the exception that the dual simplex optimizer is used), as a solver for the IP subproblems. The original formulation is also solved by CPLEX 12.6.1 under exactly the same conditions. All computations were performed on Intel Core i7-2600 CPUs with 16 GB of RAM on openSUSE 12.1 workstations running Linux kernel 3.1.10. The default time limit is 1800 s.

3.3 Instances

We consider four different groups of instances (details are in the online “Appendix A.3”):

netlib These 55 instances arise from basis matrices of linear programs in the NETLIB. These were also used by Borndörfer et al. [9]. We select all instances with up to 500 vertices (rows) that are not too small (more than 50 vertices). See Table 1 for more details on the instances.

Table 1 Instance information of netlib testset

dimacs The second group of 40 instances originates from graph coloring problems that were solved at the second DIMACS challenge. These graphs were used in [13]. We select all non-tiny (at least 20 vertices) instances with up to 500 vertices. A detailed description can be found in Table 2.

Table 2 Instance information of dimacs testset
Table 3 Instance information of miplib testset

miplib The third group consists of 37 coefficient matrices originating from presolved mixed integer programs from MIPLIB2010 [21]. Borndörfer et al. [9] use similar instances from MIPLIB 3.0. Our instances are presolved by SCIP 3.2.0 with default settings. We report some characteristics in Table 3.

random Finally, we randomly constructed a test set of 50 hypergraphs based on 10 groups with different characteristics. The construction works as follows: for group \(i \in \{1,\ldots ,5\}\) or group \(j\in \{6,\ldots ,10\}\), the number of vertices is i.i.d. in \([25(i+1),25(i+2)]\) or \([25(j-4),25(j-3)]\), respectively. The number of hyperedges is i.i.d. in [50, 75] for groups 1 through 5 and from [75, 100] for groups 6 through 10. The cardinality of each hyperedge is i.i.d. in [2, 4] and the spanning vertices are chosen randomly as well. Details of the resulting instances can be found in Table 4.

Table 4 Instance information of random testset

Summarized instance information The following table displays some aggregated instance information for the testsets. Columns headed \(|\mathcal {V}_i|\) list the arithmetic mean of the number of vertices before (\(i=0\)) and after (\(i=1\)) presolving. Columns headed \(|\mathcal {E}_i|\) contain the arithmetic mean of the number of hyperedges in the original hypergraph (\(i=0\)), and in the presolved hypergraph after phase 1 (\(i=1\)) and phase 2 (\(i=2\)), see Sect. 2.4. The columns indicated by \( D^{\mathcal {E}} _i \), \(i\in \{0,1,2\}\), contain the average cardinality of a hyperedge in the respective hypergraph. The last column shows the density d of the clique graph \(G(\mathcal {H})\).

Testset | \(|{\mathcal {V}}_0 |\) | \(|{\mathcal {V}}_1 |\) | \(|{\mathcal {E}} _0 |\) | \(|{\mathcal {E}} _1 |\) | \(|{\mathcal {E}} _2 |\) | \( D^{\mathcal {E}} _0 \) | \( D^{\mathcal {E}} _1 \) | \( D^{\mathcal {E}} _2 \) | d
miplib | 277.7 | 277.7 | 2421.2 | 1237.8 | 968.6 | 6.0 | 9.1 | 18.6 | 0.264
netlib | 218.1 | 206.3 | 218.1 | 150.7 | 146.4 | 5.1 | 6.7 | 7.1 | 0.136
random | 112.3 | 94.5 | 76.3 | 75.0 | 74.1 | 3.0 | 3.0 | 3.0 | 0.056
dimacs | 168.4 | 151.5 | 3507.6 | 3507.6 | 311.4 | 2.0 | 2.0 | 8.3 | 0.236

3.4 Fixed maximum number of shores

When reporting on a fixed maximum number \(k\) of shores we set the capacity to \(u:= \lceil \frac{|{\mathcal {V}}|}{k} \rceil \), see Fig. 3. With increasing k the ratio of optimally solved instances also increases for every testset. Furthermore, the order of the testsets according to the ratio of optimally solved instances is the same for every k. Taking this as a measure of difficulty (according to base), we get, in increasing order of difficulty: random, netlib, dimacs, miplib. Note that for \(k\le 12\) there are instances in every testset that could not be solved optimally.

Fig. 3: Ratio of optimally solved instances in each testset for different values of k

Figure 4 gives more detail on solution quality. We plot the ratio of instances that were solved with an optimality gap worse than \(\sigma \) for different values of k. On the one hand, for \(\sigma \ge 50\%\) the ratio of instances with a gap worse than \(\sigma \) varies little; on the other hand, for \(\sigma \le 25\%\) this ratio decreases with increasing k. This shows that finding some solution is generally not so hard, but closing the gap is easier for larger k. For every k, about \(5\%\) of the instances could not be solved with a gap better than \(200\%\).

In Fig. 5 we display the dependence on the clique graph density d. The instances are grouped according to their density. The performance for instances with \(d\le 0.05\) seems to be most k-sensitive in the sense that these instances are the hardest to solve for \(k=4\) and the easiest to solve for \(k=256\). In contrast, the difficulty of instances with \(d > 0.15\) seems to be least k-sensitive in the sense that the ratio of optimally solved instances changes the least with increasing k.

Fig. 4: Ratio of instances solved with optimality gap worse than \(\sigma \) for different values of \(\sigma \) and k

Fig. 5: Ratio of optimally solved instances for different density d and k

Aggregated report of results for all instances In the following we compare the performance of base with the performance of CPLEX 12.6 working on the original formulation (P). We call this algorithm cplex. In order to compare the performance on all instances for every k we use performance profiles [14], displayed in Fig. 6. Algorithms cplex and base are represented by the dashed and the solid line, respectively. We observe that for \(k<8\) algorithm cplex performs better, while for \(k=8\) both algorithms have a similar performance. For \(k>8\), however, algorithm base outperforms algorithm cplex. Furthermore, we observe that with increasing k instances overall get easier to solve for base. In contrast, for algorithm cplex instances get harder to solve with increasing k (up to \(k=32\), then easier again). The results for each testset are similar to the aggregated ones. Performance profiles for each testset can be found in “Appendix A.4” (online only). We point out that Borndörfer et al. [9] also solved instances from the netlib testset for \(k=4\), but since base is outperformed by cplex for \(k=4\), we omit a comparison between base and the algorithm in [9].

Fig. 6: Performance difference between branch-and-price and branch-and-cut, over all instances: base (solid) on model (M) versus cplex (dashed) on model (P)

3.5 Arbitrarily many shores

In the next experiment we do not bound k and adapt u accordingly, but consider the reverse setting: we set a specific capacity and allow arbitrarily many shores. We report on \(u\in \{4,12,36,\lceil \frac{|{\mathcal {V}}|}{4}\rceil , \lceil 1.05\frac{|{\mathcal {V}}|}{2}\rceil \}\). In Fig. 7 the success rate for the different testsets and values of \(u\) is displayed. For \(u\in \{4,12\} \) all random and a large share of netlib instances can be solved optimally. We further find that for all testsets, with increasing \(u\) fewer instances can be solved optimally. This is to be expected as the pricing problems get combinatorially richer. One might expect the results for \(k=4\) and for \(k=\infty \) with \(u=\lceil \frac{|{\mathcal {V}}|}{4}\rceil \) to be similar, but in fact the latter are much better. A possible reason is the absence of constraint (M.2) for \(k=\infty \).

Oosten et al. [25] tested their approach for a subset of the netlib instances for \(u=\lceil \frac{|{\mathcal {V}}|}{4}\rceil \). Algorithm base solves all instances they tested within 25 s in total (including the three instances scsd1, beaconfd, and share2b that could not be solved within the time limit of 60 min in [25]).

Fig. 7: Ratio of optimally solved instances by algorithm base for \(k=\infty \) and varying capacity \(u\)

Fig. 8: Impact of hypergraph preprocessing as described in Sect. 2.4: algorithm base (with phase 2 preprocessing) versus original (only phase 1 preprocessing) on all instances for \(k\in \{4,8,12\}\)

Fig. 9: Impact of working with a hypergraph-based formulation (algorithm base) versus an edge-based formulation using clique graphs (algorithm edge) on all instances for \(k\in \{4,8,12\}\)

3.6 Strength of formulations

Phase 2 of hypergraph preprocessing may find an alternative hypergraph with the same clique graph as the input hypergraph (Sect. 2.4). The corresponding formulations (M) have the same integer solutions but the respective LP relaxations may differ in strength. To evaluate the impact of that reformulation we compare against the case with only phase 1 preprocessing enabled (called algorithm original). As a third reference we also compare against using the corresponding clique graph in formulation (M) (called algorithm edge).

For a fixed maximum number of shores the particularly interesting (since difficult) shore numbers are \(k\in \{4,8,12\}\). The performance profiles in Figs. 8 and 9 reveal that for these k algorithm base outperforms original and massively outclasses edge. More specifically, algorithm base solves between 10 and 20 percent more instances for \(k\in \{4,8,12\}\) than algorithm original.

In order to get an idea of the strength of the formulations we look at the integrality gap at the root node (for instances whose root node is solved). Figure 10 shows the shifted (by 1) geometric mean for \(k\in \{4,8,12\}\) and each instance set. Algorithm base delivers the smallest shifted geometric mean of the integrality gap in almost every case.

Fig. 10: Shifted (by 1) geometric mean of the integrality gap in the root node (in percent, for instances with solved root node) for combinations of k and testset (abbreviated by their first letter)

3.7 Feature impact

Presolving of pricing problem The pricing problem is preprocessed as described in Sect. 2.3.3. The influence of the preprocessing is displayed in the three performance profiles in Fig. 11. As one can see, the performance is slightly improved, in particular for \(k=8\).

Fig. 11: Performance profiles base versus base with unpresolved pricing problem on all instances for \(k\in \{4,8,12\}\)

Exchange vectors We tested the model modifications discussed in Sect. 2.6. Unfortunately, it turns out that the described exchange vectors (which in fact are rather removal vectors) do not contribute enough and lead to a slightly worse performance. Therefore we decided to disable them by default, see Fig. 12.

Fig. 12: Performance profiles base versus base with exchange vectors on all instances for \(k\in \{4,8,12\}\)

Fig. 13: Performance profiles base versus base without complementary pricing on all instances for \(k\in \{4,8,12\}\)

Fig. 14: Performance profiles base versus base without some heuristic pricing algorithm(s) on all instances for \(k\in \{4,8,12\}\)

Fig. 15: Examples of (optimally) decomposed matrices corresponding to graphs of dimacs instances

Complementary pricing In the performance profiles in Fig. 13 one can see the influence of the complementary pricing described in Sect. 2.3.7. For \(k\in \{4,8,12\}\), base outperforms the variant with disabled complementary pricing.

Pricing algorithms The performance profiles in Fig. 14 give an overview of the performance when disabling one or both of the heuristic pricing approaches for \(k\in \{4,8,12\}\) over all instances. It turns out that enabling both heuristic pricing algorithms overall outperforms the other approaches.

3.8 Optimal decompositions of matrices

Examples of decompositions of coefficient matrices of dimacs, miplib, and netlib instances into single-bordered block-diagonal form with a minimum cardinality border are shown in Figs. 15, 16 and 17.

Fig. 16: Examples of (optimally) decomposed coefficient matrices of presolved MIPLIB2010 mixed integer programs

Fig. 17: Examples of (optimally) decomposed coefficient matrices of netlib instances

4 Conclusion

In this paper we studied the capacitated vertex separator problem on hypergraphs (CHVS). We presented a branch-and-price approach for a fixed and an arbitrary number of shores, and compared our results to the existing approaches whenever possible. It is the first successful algorithm for the CHVS for a large number of shores \(k>8\), which is especially interesting for the matrix decomposition application. It uses state-of-the-art methods that highlight the impact of exploiting problem structure, e.g., in preprocessing. We tested on a large set of instances from several applications. We studied the complexity of the pricing problem, which has an interesting application of its own, and gave three approaches to solve it. The non-trivial branching scheme uses results on the integer round-up property for Bin Packing.

More than 20 years have passed since the first presentation of an exact algorithm for the CHVS [9]. At the time, elaborate valid inequalities were needed to strengthen the LP relaxation. Such cutting planes are part of generic solvers nowadays and make them successful tools, as can be seen in our experiments for fixed \(k\le 12\). For larger k, our exponential-size reformulation and the resulting branch-and-price algorithm still significantly outperform the standard state-of-the-art solver. We may hope that 20 years from now, such reformulations will be part of generic solvers as well.