1 Introduction

Modeling with disjunctive constraints is a standard technique in the theory and practice of mathematical programming. The concept was introduced by Balas (1979), and has become a flourishing research area with many applications (Balas, 2018).

A central question is the following: given a finite set of polyhedra \(P_1,\ldots ,P_m\subseteq \mathbb {R}^n_{\ge 0}\), how can we express the constraint

$$\begin{aligned} x \in \bigcup _{i=1}^m P_i \end{aligned}$$
(1)

in a linear program? Balas proposed lifting this problem into a higher-dimensional space. Suppose \(P_i =\{x \in \mathbb {R}^n\ |\ A^i x \le b^i\}\) for \(i \in \llbracket m \rrbracket \). Let \(\Gamma = \llbracket m \rrbracket \), and \(\Gamma ^\star = \{i \in \Gamma \ |\ P_i \ne \emptyset \}\). Let \(\mathcal {S}(\Gamma ^\star )\) be the set of those vectors \(x \in \mathbb {R}^n\) such that there exist vectors \((y^i, \lambda _i)_{i\in \Gamma ^\star }\) satisfying

$$\begin{aligned} x - \sum _{i\in \Gamma ^\star } y^i= & 0, \nonumber \\ A^i y^i - b^i\lambda _i\le & 0,\quad \forall i\in \Gamma ^\star ,\nonumber \\ \sum _{i\in \Gamma ^\star }\lambda _i= & 1, \nonumber \\ \lambda _i\ge & 0, \quad \forall i\in \Gamma ^\star . \end{aligned}$$
(2)
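To make system (2) concrete, the following Python sketch checks its constraints on a toy instance. The intervals \(P_1=[0,1]\) and \(P_2=[2,3]\), the helper name `satisfies_system_2`, and the tolerance are assumptions made only for this illustration.

```python
def satisfies_system_2(x, ys, lams, polys, tol=1e-9):
    """Return True if (x, (y^i, lambda_i)_i) satisfies every constraint of (2).

    polys[i] is a pair (A, b): A is a list of rows, b the right-hand sides.
    """
    n = len(x)
    # x - sum_i y^i = 0
    for j in range(n):
        if abs(x[j] - sum(y[j] for y in ys)) > tol:
            return False
    # sum_i lambda_i = 1 and lambda_i >= 0
    if abs(sum(lams) - 1.0) > tol or any(l < -tol for l in lams):
        return False
    # A^i y^i - b^i lambda_i <= 0 for every i
    for (A, b), y, lam in zip(polys, ys, lams):
        for row, rhs in zip(A, b):
            if sum(a * yj for a, yj in zip(row, y)) - rhs * lam > tol:
                return False
    return True


# Hypothetical toy data: each interval lo <= x <= hi is written as x <= hi, -x <= -lo.
P1 = ([[1.0], [-1.0]], [1.0, 0.0])     # 0 <= x <= 1
P2 = ([[1.0], [-1.0]], [3.0, -2.0])    # 2 <= x <= 3

# x = 1.5 lies in neither interval but in their convex hull:
# half of the point 1 in P_1 plus half of the point 2 in P_2.
ok = satisfies_system_2([1.5], [[0.5], [1.0]], [0.5, 0.5], [P1, P2])
# Putting all of x into y^1 violates the scaled constraints.
bad = satisfies_system_2([1.5], [[1.5], [0.0]], [0.5, 0.5], [P1, P2])
```

In the accepted certificate, \(y^i/\lambda _i\) is a point of \(P_i\) for each i, which is the usual reading of the scaled constraints in (2).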

Theorem 1

(Balas (1985)) \(\mathrm {cl\ conv}(\bigcup _{i=1}^m P_i) = \mathcal {S}(\Gamma ^\star )\).

It may not be straightforward to determine the subset \(\Gamma ^\star \) of \(\Gamma \). Balas has given necessary and sufficient conditions for replacing \(\Gamma ^\star \) by \(\Gamma \) in (2). For our purposes, the following condition suffices.

Theorem 2

(Balas (1985)) If for every \(i \in \Gamma \), some subset of the set of inequalities \(A^i x \le b^i\) defines a bounded nonempty polyhedron, then \(\mathrm {cl\ conv}(\bigcup _{i=1}^m P_i) = \mathcal {S}(\Gamma )\).

From now on, we assume that each \(P_i\) is a polytope.

An important property of the linear system (2) is that for every basic (or extreme) solution, there exists an index \(i \in \Gamma \) such that \(x = y^i\) and \(\lambda _i = 1\), while \(y^{i'} = 0\) and \(\lambda _{i'} = 0\) for all \(i' \in \Gamma {\setminus } \{i\}\), see Balas (1998).

A drawback of the extended formulation (2) is that the x variables are copied m times, which may considerably increase the size of the LP formulation. To alleviate this burden, Vielma introduced the concept of embedding formulations (Vielma, 2018). Let \(P_i^\textrm{emb} = P_i\times \epsilon _i\) be the embedding of polytope \(P_i\) into \(\mathbb {R}^{n+m}\), where \(\epsilon _i \in \mathbb {R}^m\) has a 1 in position i, and 0 otherwise. Then

$$\begin{aligned} P^\textrm{emb} = \textrm{conv}\left( \bigcup _{i=1}^m P^\textrm{emb}_i\right) , \end{aligned}$$
(3)

is the Cayley Embedding of the union of the \(P_i\) (Huber et al., 2000; Vielma, 2018).

In Kis and Horváth (2022) the concept of network-flow representable polytope has been defined, and explored to characterize the facets of (3). We recapitulate the fundamental definitions and key properties in Sect. 2.

Recall the notions of ideal and non-extended formulations from Vielma (2018; 2019), Huchette and Vielma (2019a; b). Let \(Ax + By +Cz \le b,\, z\in \mathbb {Z}^k\) be a formulation of (3) and let Q denote the polyhedron determined by its LP relaxation. We call the formulation ideal if z is integral in the extreme points of Q, and non-extended if it contains no y variables (otherwise extended). Our formulation of (3) is ideal, since it gives the convex hull of feasible points, and non-extended, since besides the original problem variables, there are only m new binary variables to select the \(P_i\). Note that Balas’ formulation (2) is ideal and extended.

In the present paper we complement the results of Kis and Horváth (2022) by efficient separation algorithms that can be used in branch-and-cut solvers. To demonstrate the effectiveness of the new cuts, and to test the separation procedures, we summarize our computational experiments on a set of benchmark problems.

The structure of the paper is as follows. Section 2 describes the necessary background and some basic results to be used throughout the paper. We define the special cases \(P^{\textrm{emb}}_\le \) and \(P^{\textrm{emb}}_=\) of \(P^{\textrm{emb}}\) here as well. We characterize those disjunctive constraints that admit the proposed network flow representation in Sect. 3. The affine hull of \(P^{\textrm{emb}}\) is determined in Sect. 4. In Sect. 5 we recall the characterization of facets of \(P^{\textrm{emb}}_\le \) from Kis and Horváth (2022), and give a new characterization of the non-trivial facets of \(P^{\textrm{emb}}_=\), which is more general than the result in Kis and Horváth (2022). Separation procedures are provided along with proofs of their correctness in Sect. 6. In Sect. 7 we describe a general test problem for the separation algorithm and summarize our computational results. We also provide a comparison to other approaches, including Balas’ extended formulation.

2 Background and preliminary results

A network representation for \(P^\textrm{emb}\) consists of a network \(G = (V,E)\) along with a capacity function \(c_{x,\lambda }\) on the arcs, where G is of special structure. The set of nodes V comprises a source node \(s\in V\) and a unique sink node \(t\in V\). Let \(V_t= \{v_1,\ldots ,v_n\}\) denote the neighboring nodes of t, and we will call them the terminal nodes. We assume that \(G\setminus \{t\}\) can be decomposed into m subgraphs \(G_1,\ldots ,G_m\). Each \(G_i\) is a directed tree rooted at s with all leaves in \(V_t\). In addition, we assume that for each node \(v_j \in V_t\) there exists at least one subgraph \(G_i\) containing \(v_j\). Moreover, for \(i \ne i'\), \(V(G_i) \cap V(G_{i'})\) comprises node s, and possibly some nodes from \(V_t\), but they have no other nodes in common. This implies that each \(v\in V(G_i) \setminus \left\{ s\right\} \) has a unique in-arc in \(G_i\), denoted by \(e_i\left( v\right) \), and a unique parent node \(p_i\left( v\right) \), where \(e_i(v) = (p_i(v),v)\).

The capacity function \(c_{x,\lambda }\) has the following properties: if \(e\in E(G_i)\), then \(c_{x,\lambda }(e) = \beta _{e} \lambda _i\), where \(\beta _{e} > 0\) is a constant. The sum of the capacities of the arcs emanating from s in subgraph \(G_i\) is denoted by \(\alpha _i\lambda _i\). The capacity of the arc \((v_j,t)\) is \(c_{x,\lambda }(v_j,t) = x_j\). Hence, the sum of the capacities of the arcs entering t is \(\sum _{j=1}^n x_j\).

For simplicity, for a fixed graph G we will denote the network by \(N\left( x,\lambda \right) \), where x and \(\lambda \) are the parameters of the capacity function. We say that \(N\left( x,\lambda \right) \) represents \(P^{\textrm{emb}}\), if for each i, when setting \(\lambda _i =1\), and the other coordinates of \(\lambda \) to 0, and for any \(x\ge 0\), \((x,\lambda ) \in P^{\textrm{emb}}\) if and only if the network N parametrized by x and \(\lambda \) as above, admits a feasible \(s-t\) flow of value \(\sum _{j=1}^n x_j\), and \((x,\lambda )\) satisfies the valid equations for \(P^{\textrm{emb}}\).

Throughout the paper, we will consider two special cases of the polyhedron \(P^{\textrm{emb}}\). Either the only valid equation for \(P^{\textrm{emb}}\) is

$$\begin{aligned} \sum _{i=1}^m \lambda _i = 1, \end{aligned}$$
(4)

or all \((x,\lambda )\in P^{\textrm{emb}}\) also satisfy

$$\begin{aligned} \sum _{i=1}^m \alpha _i\lambda _i = \sum _{j=1}^n x_j. \end{aligned}$$
(5)

We distinguish these two cases by \(P^{\textrm{emb}}_\le \) and \(P^{\textrm{emb}}_=\), respectively. Moreover, \(P^{\textrm{emb}}_{\star }\) will denote either of these two polytopes. Let \(R_{\le } := \{(x,\lambda ) \in \mathbb {R}^{n+m}_{\ge 0}\ |\ (4) \}\), \(R_{=} := \{(x,\lambda ) \in R_{\le }\ |\ (5)\}\), and \(R_{\star }\) will denote either of them.

We continue with a couple of definitions and notations used throughout the paper. The set U will denote a subset of the terminal nodes \(V_t\), and \(\overline{U}\) will always denote \(V_t\setminus U\). For an arc \(e=(u,v)\) we call u the tail of the arc and v the head of the arc. In graph G, for any \(X \subset V(G)\), we denote the set of those arcs (uv) such that \(v \in X\) and \(u \in V\setminus X\) by \(\rho (X)\) and the set of those arcs (uv) such that \(u \in X\), and \(v \in V\setminus X\) by \(\delta (X)\). Note that for \(v \in V(G_i) {\setminus } (\{s\}\cup V_t)\), \(\delta (v)\) and \(\rho (v)\) consist of arcs in \(E(G_i)\) only. For any node \(w \in V(G)\), let \(\sigma (w)\) denote the subset of nodes in \(V(G)\setminus \{t\}\) consisting of w and all those nodes reachable from w by a directed path.

An \(s-t\) cut in this network is a partitioning \(\left[ S,\overline{S}\right] \) of V such that \(s\in S\) and \(t\in \overline{S}:= V\setminus S\). We call S the source side of the cut and \(\overline{S}\) the sink side of the cut. Let \(E\left( S,\overline{S}\right) \) be the set of cut-arcs, with tail in S and head in \(\overline{S}\), and let \(E_{i}\left( S,\overline{S}\right) =E(G_i)\cap E\left( S,\overline{S}\right) \). For any subset of nodes \(S\subset V(G)\), let E(S) be the set of those arcs from E(G) with both end-points in S, and let \(E_i(S) = E(G_i)\cap E(S)\). For \(w \in S\cap V(G_i)\), let \(c^i_w(S)=\sum _{e\in E_{i}\left( S\cap V(\sigma (w)),\overline{S}\right) }\beta _{e}\). When the \(s-t\) cut is clear from the context, we simply put \(c^i_w\). For \(w \in V_t\), \(c^i_w(S) = 0\).

An \(s-t\) cut \(\left[ S,\overline{S}\right] \) in G induces the following inequality:

$$\begin{aligned} \sum _{e\in E\left( S,\overline{S}\right) }c_{x,\lambda }(e) \ge \sum _{j=1}^n x_j. \end{aligned}$$
(6)

Consider a network \(N\left( x,\lambda \right) \). We define \(Q_\le (N)\) as the set of vectors \(\left( x,\lambda \right) \) that satisfy (4) and all the inequalities (6):

$$\begin{aligned} Q_\le (N) = \left\{ \left( x,\lambda \right) \in \mathbb {R}_{\ge 0}^{n+m}\ \left| \ \begin{aligned}&\sum _{i=1}^m \lambda _i = 1,\\&\sum _{e\in E\left( S,\overline{S}\right) } c_{x,\lambda }(e) \ge \sum _{j=1}^n x_j,\ \forall \ s-t \text { cut } \left[ S,\overline{S}\right] \text { in } G \end{aligned} \right. \right\} . \end{aligned}$$
(7)

We define \(Q_=(N)\) by requiring also (5):

$$\begin{aligned} Q_=(N) = \left\{ (x,\lambda )\in Q_\le (N)\ \left| \ \sum _{i=1}^m\alpha _i\lambda _i = \sum _{j=1}^n x_j\right. \right\} . \end{aligned}$$
(8)

If the network is understood from the context, then we use the notation \(Q_\le \) and \(Q_=\). For the sake of brevity, let \(Q_{\star }\) denote either \(Q_\le \) or \(Q_=\). An important property of these polyhedra is that in their vertices \(\lambda =\epsilon _i\) for some \(i\in \llbracket m\rrbracket \), see Kis and Horváth (2022).

Theorem 3

(Corollary 1 of Kis and Horváth (2022)) Network N represents the polytope \(P^{\textrm{emb}}_{\star }\) if and only if \(P^{\textrm{emb}}_{\star }=Q_{\star }(N)\).

We continue with reduction rules for the capacity function \(c_{x,\lambda }\). We call a capacity function reduced if for all \(i \in \llbracket m\rrbracket \), and \(v\in V(G_i){\setminus }(\left\{ s\right\} \cup V_t)\) the following holds:

$$\begin{aligned} \begin{aligned}&\beta _{e} \le \beta _{e_i\left( v\right) } \quad \forall e\in \delta (v), \\&\beta _{e_i\left( v\right) } < \sum _{e\in \delta (v)} \beta _{e}. \end{aligned} \end{aligned}$$
(9)
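Condition (9) is easy to test on a single tree \(G_i\). The sketch below assumes a hypothetical encoding in which `tree[u]` lists the pairs `(v, beta_uv)` for the out-arcs of node u, and terminal nodes do not appear as keys; the function name `is_reduced` is likewise an assumption.

```python
def is_reduced(tree, root="s"):
    """Check condition (9) at every node of the tree other than the root and the terminals."""
    # beta of the unique in-arc e_i(v) of each non-root node v
    in_beta = {child: b for arcs in tree.values() for child, b in arcs}
    for v, out_arcs in tree.items():
        if v == root:
            continue
        out = [b for _child, b in out_arcs]
        # first line of (9): no out-arc may have a larger coefficient than the in-arc
        if any(b > in_beta[v] for b in out):
            return False
        # second line of (9): the in-arc must be strictly smaller than the sum of the out-arcs
        if not in_beta[v] < sum(out):
            return False
    return True


# Hypothetical example: x1 <= 2, x2 <= 3, x1 + x2 <= 4, x3 <= 1.
reduced_tree = {"s": [("a", 4.0), ("v3", 1.0)], "a": [("v1", 2.0), ("v2", 3.0)]}
# Here the in-arc of "a" is redundant because 6 >= 2 + 3.
non_reduced_tree = {"s": [("a", 6.0)], "a": [("v1", 2.0), ("v2", 3.0)]}
```

Intuitively, a violation of the first line means that an out-arc can never be saturated, and a violation of the second line means that the in-arc never restricts the flow.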

Proposition 1

A network \(N=(G,c_{x,\lambda })\) with non-reduced capacity function can always be transformed to a network \(N'=(G',c_{x,\lambda }')\) with a reduced capacity function such that \(P^{\textrm{emb}}_{\star }(N)=P^{\textrm{emb}}_{\star }(N')\).

The proofs of Propositions 1–6 can be found in Appendix A. For an \(s-t\) cut \([S,\overline{S}]\), let \(k_i\) denote the sum of coefficients of the cut arcs that belong to \(E(G_i)\), i.e.,

$$\begin{aligned} k_i = \sum _{e\in E_{i}\left( S,\overline{S}\right) }\beta _{e} \end{aligned}$$
(10)

For a subset \(U \subset V_t\) an \(s-t\) cut with respect to U is an \(s-t\) cut \(\left[ S,\overline{S}\right] \) such that \(U\subseteq S\), and \(V_t{\setminus } U=\overline{U}\subseteq \overline{S}\). There may be several \(s-t\) cuts w.r.t. U, and we will be interested only in those of smallest capacity.

Definition 1

(Dominating cut) A dominating \(s-t\) cut w.r.t. U is an \(s-t\) cut such that \(k_i\) is minimal for each \(i\in \llbracket m\rrbracket \), where the minimum is taken over all \(s-t\) cuts w.r.t. U.

It can be shown that there always exists a dominating \(s-t\) cut for any \(U \subset V_t\), see Kis and Horváth (2022). The dominating \(s-t\) cuts w.r.t. U are not unique in general. We will denote the set of dominating \(s-t\) cuts w.r.t. U by \(\mathcal {C}_{\min }\left( U\right) \). All \(s-t\) cuts in \(\mathcal {C}_{\min }\left( U\right) \) have the same capacity by definition. We can also define a partial order on \(\mathcal {C}_{\min }\left( U\right) \) based on the set inclusion relation.

Definition 2

(Minimal/maximal dominating cut) An \(s-t\) cut \(\left[ S,\overline{S}\right] \) is minimal dominating w.r.t. U if

$$\begin{aligned} S\subseteq S'\quad \forall \left[ S',\overline{S'}\right] \in \mathcal {C}_{\min }\left( U\right) \end{aligned}$$

and it is maximal dominating if

$$\begin{aligned} S'\subseteq S\quad \forall \left[ S',\overline{S'}\right] \in \mathcal {C}_{\min }\left( U\right) . \end{aligned}$$

The minimal and the maximal dominating \(s-t\) cuts w.r.t. U are unique as proved in Proposition 4 of Kis and Horváth (2022). Throughout this paper, \(\left[ S^-,\overline{S^-}\right] \) and \(\left[ S^+,\overline{S^+}\right] \) will denote the unique minimal and maximal dominating \(s-t\) cut w.r.t. U, respectively.

Next, we give a constructive characterization of these cuts. To this end, we define a special capacity function \(c^U\) on the arcs of G as follows:

$$\begin{aligned} c^U_{uv} = {\left\{ \begin{array}{ll} \beta _{uv} & \text { if } v\ne t,\\ 0 & \text { if } v = t \text { and } u\in U,\\ M & \text { if } v = t \text { and } u\in \overline{U}\\ \end{array}\right. }, \end{aligned}$$
(11)

where M is a very large number, for example \(M=\sum _{i=1}^m\alpha _i+1\). With the new arc capacities \(c^U_{uv}\), in any minimum capacity \(s-t\) cut, the arcs \((v_j,t)\) are always cut-arcs for all \(v_j\in U\) and the arcs \((v_{j'},t)\) are never cut-arcs for any \(v_{j'}\in \overline{U}\). Thus for any minimum capacity \(s-t\) cut \(\left[ S,\overline{S}\right] \) we have \(U\subset S\), and \(\overline{U} \subset \overline{S}\). Let f denote a maximum \(s-t\) flow in the network with arc capacities \(c^U\). We call arc e tight if \(\beta _{e}=f_e\). It is well-known that in any minimum capacity \(s-t\) cut, the cut-arcs are tight. Observe that tight arcs occur only on paths from s to \(\overline{U}\) by the definition of \(c^U\). Our goal is to find sets \(S^-,S^+\) such that \(s\in S^-\cap S^+\), \(S^-\) is minimal, \(S^+\) is maximal and all arcs leaving \(S^-\) or \(S^+\) are tight.
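The construction of \(c^U\) and the notion of tight arcs can be sketched in Python as follows. We read (11) as assigning 0 or M only to the arcs entering t. The arc dictionary, the helper names, and the example flow are assumptions for illustration; the flow is supplied by hand rather than computed.

```python
def capacities_c_U(beta, terminals, U, alphas, sink="t"):
    """Build c^U as in (11): beta on arcs not entering t, 0 or M on the arcs (v_j, t)."""
    M = sum(alphas) + 1                       # the choice of M suggested in the text
    cap = {arc: b for arc, b in beta.items() if arc[1] != sink}
    for v in terminals:
        cap[(v, sink)] = 0 if v in U else M
    return cap


def tight_arcs(beta, flow, tol=1e-9):
    """Arcs e with f_e = beta_e for a given flow f (arcs entering t are not considered)."""
    return [e for e, b in beta.items() if abs(flow.get(e, 0.0) - b) <= tol]


# Hypothetical single-tree network with alpha_1 = 4 + 1 = 5 and U = {v3}.
beta = {("s", "a"): 4.0, ("a", "v1"): 2.0, ("a", "v2"): 3.0, ("s", "v3"): 1.0}
cap = capacities_c_U(beta, ["v1", "v2", "v3"], U={"v3"}, alphas=[5.0])

# One maximum flow for these capacities (value 4, matching the cut {(s, a), (v3, t)}).
flow = {("s", "a"): 4.0, ("a", "v1"): 2.0, ("a", "v2"): 2.0}
tight = tight_arcs(beta, flow)
```

In this example the tight arcs lie on paths from s to \(\overline{U}=\{v_1,v_2\}\), in line with the observation above.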

Proposition 2

Let \(\left[ S^-,\overline{S^-}\right] \) be the minimal dominating \(s-t\) cut w.r.t. U. Let f be an \(s-t\) flow of maximum value in the network \((G,c^U)\). The set of cut-arcs \(E^-{:}{=}E\left( S^-,\overline{S^-}\right) \) has the following properties:

  1. All arcs from U to t belong to \(E^-\),

  2. No arc from \(\overline{U}\) to t is in \(E^-\),

  3. The arcs in \(E^-\) are tight and cover all paths from s to \(\overline{U}\),

  4. \(E^-\) is minimal in the sense that no proper subset of it covers all paths from s to t,

  5. For all \(uv\in E^-\) with \(v \ne t\), there is no tight arc on the unique \(s - u\) path in G.

In a similar manner, we can derive the key properties of the maximal dominating \(s-t\) cuts w.r.t. U.

Proposition 3

Let \(\left[ S^+,\overline{S^+}\right] \) be a maximal dominating \(s-t\) cut w.r.t. U. Let f be a maximum flow in the network \((G,c^U)\). The set of cut-arcs \(E^+{:}{=}E\left( S^+,\overline{S^+}\right) \) has the following properties:

  1. All arcs from U to t are in \(E^+\),

  2. No arc from \(\overline{U}\) to t belongs to \(E^+\),

  3. The arcs in \(E^+\) are tight and cover all paths from s to \(\overline{U}\),

  4. \(E^+\) is maximal in the sense that no proper subset of it covers all paths from s to t,

  5. For all \(uv\in E^+\) with \(v \ne t\), there exists a path from v to \(\overline{U}\) with no tight arc.

The minimal and the maximal dominating \(s-t\) cuts, respectively, satisfy the following properties:

Proposition 4

For \(U\subset U'\subseteq V_t\) and minimal dominating \(s-t\) cuts \(\left[ S^-,\overline{S^-}\right] \),\(\left[ Z^-,\overline{Z^-}\right] \) w.r.t. U and \(U'\), respectively, we have \(S^-\subset Z^-\).

Proposition 5

For \(U\subset U'\subseteq V_t\) and maximal dominating \(s-t\) cuts \(\left[ S^+,\overline{S^+}\right] \),\(\left[ Z^+,\overline{Z^+}\right] \) w.r.t. U and \(U'\), respectively, we have \(S^+\subset Z^+\).

Finally, consider \(S^+\) and \(S^-\), where \(\left[ S^+,\overline{S^+}\right] \) and \(\left[ S^-,\overline{S^-}\right] \) are maximal and minimal dominating \(s-t\) cuts w.r.t. \(U\subseteq V_t\).

Proposition 6

The connected components of the subgraph spanned by \(S^+{\setminus } S^-\) are rooted trees. If \(v \in S^+{\setminus } S^-\) is the root node of such a tree, then \(e_i\left( v\right) \) emanates from \(S^-\), and \(\beta _{e_i\left( v\right) }=c^i_v(S^+)\).

Corollary 1

Let \(\left[ S^+,\overline{S^+}\right] ,\left[ S^-,\overline{S^-}\right] \) be the maximal and minimal dominating \(s-t\) cuts w.r.t. U, respectively. For all \(i\in \llbracket m\rrbracket \), we have

$$\begin{aligned} \sum _{e\in E_{i}\left( S^+\setminus S^-,\overline{S^+}\right) }\beta _{e} = \sum _{e\in E_{i}\left( S^-,S^+\setminus S^-\right) }\beta _{e}. \end{aligned}$$
(12)

Now we consider the inequalities induced by dominating \(s-t\) cuts. For any \(U \subset V_t\), the dominating \(s-t\) cuts in \(\mathcal {C}_{\min }(U)\) uniquely induce the inequality

$$\begin{aligned} \sum _{i=1}^m k_i\lambda _i+\sum _{j:v_j\in U} x_j \ge \sum _{j=1}^n x_j. \end{aligned}$$
(13)

We can simplify this by subtracting \(\sum _{j:v_j\in U} x_j\) from both sides:

$$\begin{aligned} \sum _{i=1}^m k_i\lambda _i\ge \sum _{j:v_j\in \overline{U}} x_j. \end{aligned}$$
(14)
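Because each \(G_i\) is a tree, the coefficient \(k_i\) of a dominating cut w.r.t. U can be obtained by a simple recursion: the cut-arcs in \(G_i\) must cover every path from s to a terminal in \(\overline{U}\) at minimum total \(\beta \). The Python sketch below is only an illustration of (14) under this observation and is not meant to reproduce the separation procedures of Sect. 6. The child-list encoding (`tree[u]` holds pairs `(v, beta_uv)`, terminals are not keys) and all names are assumptions.

```python
def min_cover(tree, U, v="s", in_beta=None):
    """Minimum total beta of arcs covering all paths from v to terminals outside U."""
    if v not in tree:                          # v is a terminal node
        return 0.0 if v in U else in_beta      # the in-arc of a terminal in U-bar must be paid
    below = sum(min_cover(tree, U, child, b) for child, b in tree[v])
    # either cut the in-arc of v, or cover the paths further down the tree
    return below if in_beta is None else min(in_beta, below)


def cut_inequality(trees, U, x, lam, tol=1e-9):
    """Return (k, holds), where holds tells whether (x, lam) satisfies inequality (14)."""
    k = [min_cover(tree, U) for tree in trees]
    lhs = sum(k_i * lam_i for k_i, lam_i in zip(k, lam))
    rhs = sum(val for v, val in x.items() if v not in U)
    return k, lhs >= rhs - tol


# Hypothetical instance with m = 2 trees sharing the terminals v1 and v3.
T1 = {"s": [("a", 4.0), ("v3", 1.0)], "a": [("v1", 2.0), ("v2", 3.0)]}
T2 = {"s": [("v1", 5.0), ("v3", 2.0)]}
U = {"v3"}

k, holds = cut_inequality([T1, T2], U, {"v1": 2.0, "v2": 2.0, "v3": 0.5}, [1.0, 0.0])
_, violated_ok = cut_inequality([T1, T2], U, {"v1": 3.0, "v2": 2.0, "v3": 0.0}, [0.5, 0.5])
```

For \(U=\{v_3\}\) the recursion yields \(k_1=4\) and \(k_2=5\), so (14) reads \(4\lambda _1+5\lambda _2\ge x_1+x_2\) on this instance.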

For any subset \(U \subset V_t\), let \(F_U\) be the set of vectors \((x,\lambda )\in P^{\textrm{emb}}_{\star }\) that satisfy the equation

$$\begin{aligned} \sum _{i=1}^m k_i\lambda _i = \sum _{j:v_j\in \overline{U}} x_j. \end{aligned}$$
(15)

Note that (15) is obtained from (13) or (14) by turning the inequality into an equation. We call \(U \subset V_t\) facet inducing for \(P^{\textrm{emb}}_{\star }\) if \(F_U\) is a facet of \(P^{\textrm{emb}}_{\star }\).

In the \(P^{\textrm{emb}}_=\) polytope, the points of \(F_U\) also satisfy

$$\begin{aligned} \sum _{i=1}^m\left( \alpha _i-k_i\right) \lambda _i = \sum _{j:v_j\in U} x_j, \end{aligned}$$
(16)

which can be derived by subtracting (15) from (5).

Let \((x,\lambda )\in P^{\textrm{emb}}_{\star }\) be a vector that satisfies an inequality induced by a dominating \(s-t\) cut \(\left[ S,\overline{S}\right] \) (for some \(U\subset V_t\)) at equality. Let \(\phi \) be a maximum \(s-t\) flow in \(N\left( x,\lambda \right) \). Then \(\phi _e=c_{x,\lambda }(e)\) for all \(e\in E\left( S,\overline{S}\right) \) and \(\phi _e=0\) for all \(e\in E\left( \overline{S},S\right) \). For polytope \(P^{\textrm{emb}}_=\) we also have \(\phi _{v_jt}=x_j\) for all \(j\in \llbracket n\rrbracket \).

3 Network flow representable polytopes

In this section we characterize the family of those polytopes that admit a network flow representation. Recall that a family of sets \(\mathcal {S}\) is called laminar if

$$\begin{aligned} X\cap Y\ne \emptyset \implies X\subseteq Y\text { or }Y\subseteq X \end{aligned}$$
(17)

holds for all \(X,Y\in \mathcal {S}\), see e.g., section 13.4 of Schrijver (2004).
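Condition (17) can be verified by a pairwise test. The following Python sketch is a direct transcription of the definition; the function name `is_laminar` and the example families are assumptions.

```python
from itertools import combinations


def is_laminar(family):
    """Check (17): any two intersecting members of the family are nested."""
    sets = [frozenset(s) for s in family]
    return all(not (X & Y) or X <= Y or Y <= X for X, Y in combinations(sets, 2))


nested = is_laminar([{3}, {3, 4}, {3, 4, 5}, {1, 2, 3, 4, 5}, {1, 2}])
crossing = is_laminar([{1, 2}, {2, 3}])   # {1, 2} and {2, 3} intersect without nesting
```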

Fig. 1. Example for polytope \(P_i\), the corresponding laminar system \(\mathcal {L}_i\), and the rooted tree representation \(G_i\) along with the capacity function \(c_{x,\lambda }\)

Theorem 4

For \(i\in \llbracket m\rrbracket \), let \(\mathcal {L}_i\) be a laminar family on \(\llbracket n \rrbracket \) and for each \(L\in \mathcal {L}_i\), \(b_L\) a positive number, and define a polytope

$$\begin{aligned} P_i = \left\{ x\in \mathbb {R}^n_{\ge 0}\ \left| \ \sum _{j\in L}x_j \le b_L,\,\forall L\in \mathcal {L}_i\right. \right\} , \end{aligned}$$
(18)

for each of these families. Then \(P^{\textrm{emb}}_\le = \textrm{conv}\left( \cup _{i=1}^m P_i^{\textrm{emb}}\right) \) is network flow representable, where \(P_i^\textrm{emb} = P_i\times \epsilon _i\).

Conversely, given a network flow representable polytope \(P^{\textrm{emb}}_\le = \textrm{conv}\left( \cup _{i=1}^m P_i^{\textrm{emb}}\right) \), we can find laminar families \(\mathcal {L}_i\), \(i\in \llbracket m\rrbracket \) and \(b_L > 0\) for each \(L \in \mathcal {L}_i\), such that the \(P_i\) satisfy (18).

Proof

To prove the statement, we first show that for a polytope \(P^{\textrm{emb}}_\le \) or \(P^{\textrm{emb}}_=\), we can construct a network N representing it. Then we show that from a network N, we can retrieve the systems of inequalities of (18).

Recall that a laminar family \(\mathcal {L}_i\) admits a rooted tree representation, see e.g., Edmonds and Giles (1977). Let the graph \(G_i\) be the rooted tree representation of \(\mathcal {L}_i\). Note that every inequality \(\sum _{j\in L}x_j \le b_L\) is represented by a node of \(G_i\), and every leaf \(v_j\) of \(G_i\) corresponds to a variable \(x_j\). Let G be the graph we get by joining the graphs \(G_i\) for all i on their leaves. Then we add a source node s and a sink node t, and add arcs from every leaf to t and from s to every node with no in-arcs. Define the capacity function \(c_{x,\lambda }\) as \(c_{x,\lambda }(u,v) = b_L\lambda _i\) if node v corresponds to \(\sum _{j\in L}x_j \le b_L\) and \(L\in \mathcal {L}_i\). If \(v = v_j\) is a leaf, and there is no explicit upper bound on it defined by (18), then \(c_{x,\lambda }(u,v_j) = b_L\lambda _i\), where L is the minimal set in \(\mathcal {L}_i\) that contains j. For a leaf node \(v_j\), we set \(c_{x,\lambda }(v_j,t)=x_j\). See Fig. 1 for an illustration. Let \(N=(G,c_{x,\lambda })\) be the network flow representation.

We argue that for any \((x,\lambda )\in P^{\textrm{emb}}_{\star }\) with \(\lambda _i=1\), we have \(\lambda _{i'}=0\) for all \(i'\ne i\), and there exists an \(s-t\)-flow f in N with flow value \(\sum _{j=1}^n x_j\), that is,

$$\begin{aligned} f_e = {\left\{ \begin{array}{ll} x_j & \text {if } e=(v_j,t)\\ \sum _{j\in L} x_j & \text {if } e=(u,v), \text { and } v \text { represents the inequality } \sum _{j \in L} x_j\le b_L\\ \end{array}\right. } \end{aligned}$$
(19)

for all arcs e of \(G_i\), and \(f_e=0\) if e is not in \(G_i\). Since \((x,\lambda )\) satisfies the inequalities (18), the flow values do not exceed the capacity on any arc, i.e., f is a feasible \(s-t\) flow.
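The flow (19) and its feasibility check can be sketched in Python for a single tree \(G_i\) with \(\lambda _i = 1\). The child-list encoding (`tree[u]` holds pairs `(v, b)` with b the coefficient of the arc, terminals are not keys) and the function names are assumptions for this illustration.

```python
def flow_from_x(tree, x, root="s"):
    """Flow (19) on the arcs of G_i: each arc carries the sum of x_j over the terminals below its head."""
    flow = {}

    def below(v):
        if v not in tree:                      # v is a terminal node v_j
            return x.get(v, 0.0)
        total = 0.0
        for child, _b in tree[v]:
            flow[(v, child)] = below(child)
            total += flow[(v, child)]
        return total

    below(root)
    return flow


def feasible_for_tree(tree, x, tol=1e-9):
    """True if x >= 0, x vanishes outside the terminals of G_i, and (19) respects all capacities."""
    beta = {(u, v): b for u, arcs in tree.items() for v, b in arcs}
    leaves = {v for (_u, v) in beta if v not in tree}
    if any(val < -tol for val in x.values()):
        return False
    if any(val > tol for v, val in x.items() if v not in leaves):
        return False
    flow = flow_from_x(tree, x)
    return all(flow[e] <= beta[e] + tol for e in beta)


# Hypothetical P_i given by x1 <= 2, x2 <= 3, x1 + x2 <= 4, x3 <= 1.
tree = {"s": [("a", 4.0), ("v3", 1.0)], "a": [("v1", 2.0), ("v2", 3.0)]}
f = flow_from_x(tree, {"v1": 2.0, "v2": 2.0, "v3": 0.5})
inside = feasible_for_tree(tree, {"v1": 2.0, "v2": 2.0, "v3": 0.5})
outside = feasible_for_tree(tree, {"v1": 2.0, "v2": 3.0, "v3": 0.0})   # x1 + x2 = 5 > 4
```

Since every node of \(G_i\) other than s has a unique in-arc, the flow on each arc is determined by x, so no maximum flow computation is needed in this special case.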

For the other direction, we use again the fact that there is a one-to-one correspondence between laminar families and rooted trees. Let \(N=(G,c_{x,\lambda })\) be the network flow representation of polytope \(P^{\textrm{emb}}_{\star }\). Let \(G_i\) be the subgraph of G in which the capacities of the arcs depend on \(\lambda _i\). For any node v of \(G_i\), let \(L_v\) denote the set of indices of the terminal nodes that can be reached from v on a directed path. Notice that \(\left( L_v\right) _{v\in V(G_i)}\) is a laminar family. Let \(b_v\) denote the coefficient of \(\lambda _i\) on the in-arc of node \(v\in V(G_i)\). We have to show that \(P_i = \left\{ x\in \mathbb {R}^n_{\ge 0}\ \left| \ \sum _{j\in L_v} x_j\le b_v,\,\forall v\in V(G_i)\right. \right\} \). Consider \(P_i^{\textrm{emb}} = P_i \times \epsilon _i\), and fix \(\lambda _i=1\) and \(\lambda _{i'}=0\) for all \(i'\ne i\) and some \(x\in \mathbb {R}^n_+\) such that there is a feasible \(s-t\) flow in \(N(x,\lambda )\) of value \(\sum _{j=1}^n x_j\). Then \((x,\lambda ) \in P_i^\textrm{emb}\), which implies \(\left\{ x\in \mathbb {R}^n_{\ge 0}\ \left| \ \sum _{j\in L_v} x_j\le b_v,\,\forall v\in V(G_i)\right. \right\} \subseteq P_i\). The converse inclusion is obvious by the definition of network flow representability. \(\square \)

Theorem 4 can be straightforwardly extended to the polytopes

$$\begin{aligned} P_i^= = \left\{ x\in \mathbb {R}^n_{\ge 0}\ \left| \ \sum _{j=1}^nx_j=\alpha _i,\,\sum _{j\in L}x_j \le b_L,\,\forall L\in \mathcal {L}_i\right. \right\} ,\quad i\in \llbracket m \rrbracket , \end{aligned}$$

where \(\alpha _i>0\) for \(i\in \llbracket m\rrbracket \), since the defining equations \(\sum _{j=1}^nx_j=\alpha _i\) have no impact on the laminar family induced by the inequalities.

The depth of the laminar family \(\mathcal {L}\) is the length of the longest chain, i.e., the number of nodes of the longest path from s to t in the network minus 2. In Fig. 1, a longest chain is \(\left\{ (x_3),\,(x_3,x_4),\,(x_3,x_4,x_5),\,(x_1,x_2,x_3,x_4,x_5)\right\} \) with length 4, while the longest path from s to t has 6 nodes.
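The depth can be computed directly from the family as the length of a longest chain of strictly nested sets. The Python sketch below assumes the input family is laminar; the function name `depth` is hypothetical, and the example family contains only the chain quoted above plus one additional set \(\{1,2\}\) chosen for illustration, not the full family of Fig. 1.

```python
def depth(family):
    """Length of the longest chain X_1 < X_2 < ... of strictly nested sets in the family."""
    sets = sorted({frozenset(s) for s in family}, key=len)
    longest = {}
    for i, X in enumerate(sets):
        # every strict subset of X has fewer elements, hence appears earlier in the order
        longest[X] = 1 + max((longest[Y] for Y in sets[:i] if Y < X), default=0)
    return max(longest.values(), default=0)


d = depth([{3}, {3, 4}, {3, 4, 5}, {1, 2, 3, 4, 5}, {1, 2}])
```

The value agrees with the path-based description: a chain of length 4 corresponds to an \(s-t\) path with 4 + 2 = 6 nodes.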

4 The affine hull of \(P^{\textrm{emb}}_{\star }\)

Observe that the only valid equation for \(P^{\textrm{emb}}_\le \) is (4) provided that all terminal nodes in \(V_t\) are connected to the source node s of the network. On the other hand, the structure of the network may imply some additional valid equations for \(P^{\textrm{emb}}_=\) beside (4) and (5).

Let \(\kappa \) be the number of the connected components (in the undirected sense) of the graph \(G\setminus \left\{ s,t\right\} \) and \(T_1,\ldots ,T_\kappa \) be the components. By definition, there are no arcs between \(T_p\) and \(T_q\) for any \(p\ne q\). Let \(U_\ell \) be the subset of nodes of \(T_\ell \) that belong to \(V_t\). Note that \(\cup _{\ell =1}^\kappa U_\ell =V_t\). Let \(a_i^\ell \) denote the sum of the coefficients of those edges of subgraph \(G_i\) that start in s and end in \(T_\ell \), i.e.

$$\begin{aligned} a^\ell _i = \sum _{\begin{array}{c} e\in E_{i}\left( s,T_\ell \right) \end{array}} \beta _{e}. \end{aligned}$$
(20)

See Fig. 2 for an illustration.

Fig. 2. Illustration of the \(a_i^\ell \) values

Proposition 7

The equations

$$\begin{aligned} \sum _{i=1}^m a^\ell _i\lambda _i =\sum _{j:v_j\in U_\ell }x_j \quad \forall \ell \in \llbracket \kappa \rrbracket \end{aligned}$$
(21)

are all valid for \(P^{\textrm{emb}}_=\).
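The components \(T_\ell \) and the coefficients \(a_i^\ell \) of (20) can be computed with a union-find pass over the arcs not incident to s. The following Python sketch uses the same hypothetical child-list encoding as before (`tree[u]` holds pairs `(v, beta_uv)`, terminals are not keys, and the arcs into t are left implicit); components are identified by the frozenset of their nodes, which is an arbitrary choice for this illustration.

```python
def coefficients_by_component(trees, root="s"):
    """Return (a, comp_of): a[i][T] is a_i^l for component T, comp_of maps a node to its component."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]      # path halving
            v = parent[v]
        return v

    # join the end-points of every arc that does not start in s
    for tree in trees:
        for u, arcs in tree.items():
            for v, _beta in arcs:
                find(v)
                if u != root:
                    parent[find(u)] = find(v)

    members = {}
    for v in list(parent):
        members.setdefault(find(v), set()).add(v)
    comp_of = {v: frozenset(members[find(v)]) for v in parent}

    # sum the coefficients of the arcs of G_i that leave s and enter each component
    a = []
    for tree in trees:
        coeff = {}
        for v, beta in tree.get(root, []):
            coeff[comp_of[v]] = coeff.get(comp_of[v], 0.0) + beta
        a.append(coeff)
    return a, comp_of


# Hypothetical instance: the components are {a, v1, v2} and {v3}.
T1 = {"s": [("a", 4.0), ("v3", 1.0)], "a": [("v1", 2.0), ("v2", 3.0)]}
T2 = {"s": [("v1", 5.0), ("v3", 2.0)]}
a, comp_of = coefficients_by_component([T1, T2])
```

On this instance, Eq. (21) reads \(4\lambda _1+5\lambda _2=x_1+x_2\) and \(\lambda _1+2\lambda _2=x_3\), and the two equations sum to (5) with \(\alpha _1=5\) and \(\alpha _2=7\).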

Proof

We have to prove that every \((x,\lambda )\in P^{\textrm{emb}}_=\) satisfies all of the equations (21). For any \((x,\lambda )\in P^{\textrm{emb}}_=\) we have

$$\begin{aligned} \sum _{\ell =1}^\kappa \sum _{i=1}^m a^\ell _i\lambda _i = \sum _{i=1}^m\alpha _i\lambda _i = \sum _{j=1}^n x_j = \sum _{\ell =1}^\kappa \sum _{j:v_j\in U_\ell }x_j, \end{aligned}$$

since \(V_t=\cup _{\ell =1}^\kappa U_\ell \) and \(\sum _{\ell =1}^\kappa a_i^\ell =\alpha _i\) for all \(i\in \llbracket m\rrbracket \). We will prove that for any \(\ell \in \llbracket \kappa \rrbracket \) and for all \((x,\lambda )\in P^{\textrm{emb}}_=\), the sum of the capacities of the arcs that enter \(T_\ell \) is exactly the sum of the capacities of the arcs that leave \(T_\ell \). In any \(s-t\) flow of value \(\sum _{j=1}^n x_j\) in \(N\left( x,\lambda \right) \), the arcs that leave s and enter \(T_\ell \) are saturated, otherwise Eq. (5) is not satisfied. Also, the arcs leaving \(T_\ell \) and entering t are saturated, for the same reason. No flow that enters \(T_\ell \) enters any other \(T_k\), and flow conservation implies that the sum of the capacities of the arcs entering \(T_\ell \) is equal to the sum of the capacities of those arcs leaving \(T_\ell \) and entering t, which is the statement we wanted to prove. \(\square \)

Observation 1

Consider any face F of \(P^{\textrm{emb}}_=\). Then F consists of those points \((x,\lambda ) \in P^{\textrm{emb}}_\le \) that satisfy (5) and some equations of the form (15) for distinct subsets \(U\subset V_t\).

Note that a face \(F_U\) may be equal to the intersection of several \(F_{U'}\) for distinct subsets \(U'\) of \(V_t\).

Proposition 8

The system of equations consisting of (4) and (21) constitutes a maximal linearly independent equation system for \(P^{\textrm{emb}}_=\).

Proof

By Observation 1, \(P^{\textrm{emb}}_=\) consists of those points of \(P^{\textrm{emb}}_\le \) that satisfy (5) and some equations of the form (15) for distinct subsets \(U \subset V_t\). Suppose there exists a subset \(U'\) of \(V_t\) other than \((U_\ell )_{\ell =1}^\kappa \) such that all \((x,\lambda )\in P^{\textrm{emb}}_=\) satisfy the equation

$$\begin{aligned} \sum _{i=1}^m k'_i\lambda _i=\sum _{j:v_j\in \overline{U'}} x_j, \end{aligned}$$
(22)

where \(k'_i\) is the coefficient of \(\lambda _i\) in Eq. (15) for \(U'\). We distinguish two cases: (i) there exists some index \(\ell \) such that \(U'\subset U_\ell \), and (ii) the set \(U'\) intersects multiple \(U_\ell \) sets. We will prove that case (i) is impossible, and we will reduce case (ii) to case (i). Suppose that case (i) holds. By subtracting (22) from (21) we obtain

$$\begin{aligned} \sum _{i=1}^m (a^\ell _i-k'_i)\lambda _i = \sum _{j:v_j\in U'} x_j. \end{aligned}$$
(23)

Now, since \(T_\ell \) is connected, there exists \(G_i\) and \(v\in V(G_i)\cap V(T_\ell )\) such that the parent node of v is s and at least one node from both \(U'\) and \(U_\ell {\setminus } U'\) is reachable from v on a directed path. Let \(p_1\) and \(p_2\) be two directed paths connecting v with nodes \(v_{j'} \in U'\), and \(v_{j''} \in U_\ell {\setminus } U'\), respectively.

We fix \(\lambda _i = 1\), and \(\lambda _k=0\) for all \(k\ne i\). We construct a feasible flow \(\phi \) in the network as follows. We set \(\phi _{su}=\beta _{su}\) for \(su\in E(G_i)\), otherwise \(\phi _{su}=0\). Then, for each \(w \in V(G_i)\setminus (V_t \cup \{s\})\), we split the flow entering w among the out-arcs of w proportionally to their capacities, i.e.,

$$\begin{aligned} \phi _e = \frac{\beta _{e}}{\sum _{e'\in \delta (w)}\beta _{e'}} \phi _{e_i\left( w\right) }, \ \forall e \in \delta (w). \end{aligned}$$

We let \(x_j\) equal the flow entering \(v_j\), and also let \(\phi _{v_jt} = \phi _{e_i\left( v_j\right) }\). Finally, we let \(\phi _e = 0\) for all other arcs. This defines a feasible flow in \(N\left( x,\lambda \right) \). Since \(\sum _{j=1}^n x_j = \alpha _i\) by construction, we have \((x,\lambda ) \in P^{\textrm{emb}}_=\). Hence, \((x,\lambda )\) must satisfy (22).
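The proportional flow used in this construction can be sketched in Python for a single tree \(G_i\) with \(\lambda _i=1\). The child-list encoding (`tree[u]` holds pairs `(v, beta_uv)`, terminals are not keys) and the function name `proportional_flow` are assumptions for this illustration.

```python
def proportional_flow(tree, root="s"):
    """Saturate the out-arcs of s, then split the inflow of each inner node proportionally to beta."""
    flow, x = {}, {}

    def push(v, inflow):
        if v not in tree:                      # terminal v_j: x_j is the flow entering it
            x[v] = inflow
            return
        total = sum(b for _child, b in tree[v])
        for child, b in tree[v]:
            f = b if v == root else inflow * b / total
            flow[(v, child)] = f
            push(child, f)

    push(root, None)
    return flow, x


# Hypothetical reduced tree with alpha_i = 4 + 1 = 5.
tree = {"s": [("a", 4.0), ("v3", 1.0)], "a": [("v1", 2.0), ("v2", 3.0)]}
phi, x = proportional_flow(tree)
```

Here the arcs below node a carry 1.6 and 2.4, strictly between 0 and their coefficients 2 and 3, and the resulting x sums to \(\alpha _i = 5\).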

Observe that \(\phi _e = \beta _e\) for all \(e \in \delta _{G_i}(s)\) by construction (i.e., all the out-arcs of s are saturated). Moreover, for each \(w \in V(G_i){\setminus } \{s\}\), \(0< \phi _e < \beta _e\) for all \(e \in \delta (w)\). Hence, \(0< \phi _e < \beta _e\) for all arcs e of \(p_1\) and \(p_2\). Let \(\varepsilon = \min \{ \phi _e, \beta _e - \phi _e\ |\ e \in E(p_1) \cup E(p_2)\}\).

We perturb \(\phi \) as follows: we decrease the value of \(\phi \) on the arcs of \(p_2\) by \(\varepsilon \) and increase \(\phi \) on the arcs of \(p_1\) by \(\varepsilon \):

$$\begin{aligned} \phi '_e= {\left\{ \begin{array}{ll} \phi _e+\varepsilon & e\in p_1\\ \phi _e-\varepsilon & e\in p_2\\ \phi _e & \text {otherwise} \end{array}\right. } \end{aligned}$$
(24)

Clearly, \(\phi '\) is a feasible flow. Now let \((x',\lambda )\) be the point with

$$\begin{aligned} x'_j = {\left\{ \begin{array}{ll} x_{j'} + \varepsilon & \text { if } j = j'\\ x_{j''} - \varepsilon & \text { if } j = j''\\ x_{j}& \text { otherwise. }\\ \end{array}\right. } \end{aligned}$$
(25)

Note that \(\sum _{j=1}^n x_j = \sum _{j=1}^n x'_j\), but \(\sum _{j\in U'} x_j \ne \sum _{j\in U'} x'_j\). Since \(\phi '\) is a feasible flow in \(N_=(x',\lambda )\) of value \(\sum _{j=1}^n x'_j\), \((x',\lambda )\) belongs to \(P^{\textrm{emb}}_=\). However, (22) is not satisfied by \((x',\lambda )\), since \(\sum _{j\in U'} x_j \ne \sum _{j\in U'} x'_j\). Hence, Eq. (22) is not valid for all points of \(P^{\textrm{emb}}_=\), a contradiction.

As for case (ii), we can assume that \(U'\) intersects exactly two of the sets \(U_1,\ldots ,U_\kappa \) (otherwise it can be reduced to this case by induction), and that the two sets are \(U_1\) and \(U_2\) (by possibly re-labeling \(U_1,\ldots ,U_\kappa \)). Let \(\left[ S,\overline{S}\right] \) be a dominating \(s-t\) cut w.r.t. \(V_t{\setminus } U'\). Let \(T'_1=\overline{S}\cap T_1,T'_2=\overline{S}\cap T_2\) and \(U'_1=U_1\cap U',U'_2=U_2\cap U'\). Let \(k^1_i\) denote the sum of the coefficients on those cut-arcs that enter \(T'_1\) and belong to \(G_i\), and define \(k^2_i\) similarly for \(T'_2\). Since there are no arcs between \(T_1'\) and \(T_2'\) in any direction, we can decompose Eq. (22) in the following way:

$$\begin{aligned} \sum _{i=1}^m k^1_i\lambda _i+\sum _{i=1}^m k^2_i\lambda _i = \sum _{j:v_j\in U'_1} x_j + \sum _{j:v_j\in U'_2} x_j, \end{aligned}$$
(26)

which is exactly the sum of the equations

$$\begin{aligned} \sum _{i=1}^m k^1_i\lambda _i&= \sum _{j:v_j\in U'_1} x_j \end{aligned}$$
(27a)
$$\begin{aligned} \sum _{i=1}^m k^2_i\lambda _i&= \sum _{j:v_j\in U'_2} x_j. \end{aligned}$$
(27b)

This means that there is an implied valid equation acting on \(U'_1\), which is a proper subset of \(U_1\), thus reducing it to case (i). \(\square \)

The above propositions give a simple method to determine a maximal system of linearly independent valid equations for \(P^{\textrm{emb}}_=\).

5 Characterization of facet inducing \(s-t\) cuts of \(P^{\textrm{emb}}_\le \) and \(P^{\textrm{emb}}_=\)

In this section we give new characterizations of the facet inducing \(s-t\) cuts for the \(P^{\textrm{emb}}_\le \) and \(P^{\textrm{emb}}_=\) polytopes, respectively.

Theorem 5

Let U be an arbitrary subset of the terminal nodes, \([S^-, \overline{S^-}]\) and \([S^+,\overline{S^+}]\) the minimal and the maximal dominating \(s-t\) cut w.r.t. U, respectively. U induces a facet of \(P^{\textrm{emb}}_\le \) if and only if

  1. The subgraph of G induced by \(\overline{S^+}\setminus \{t\}\) is connected in the undirected sense, and

  2. The subgraph of G induced by \(S^-\) is connected in the undirected sense.

Proof

These conditions are equivalent to those of Theorem 1 in Kis and Horváth (2022). \(\square \)

Now we turn to \(P^{\textrm{emb}}_=\). In order to characterize its facets, we may assume that a maximal, linearly independent system of valid equations for \(P^{\textrm{emb}}_=\) consists of (4) and (5), see Kis and Horváth (2022), which is equivalent to the condition that \(G\setminus \{s,t\}\) is connected (by Proposition 8). In Kis and Horváth (2022), the characterization of non-trivial facets of \(P^{\textrm{emb}}_=\) implicitly assumes that the node-set of the network can be partitioned as \(\{s,t\}\cup V_s \cup V_t\), where \(V_s\) and \(V_t\) are the neighbors of the source s, and sink t, respectively, and the three subsets, namely, \(V_s\), \(V_t\), and \(\{s,t\}\), are pairwise disjoint. Below we give a new characterization which is valid for any network.

Theorem 6

Suppose that the only valid equations for \(P^{\textrm{emb}}_=\) are (4) and (5). Fix some \(U\subset V_t\), and let \([S^-, \overline{S^-}]\) and \([S^+,\overline{S^+}]\) be the minimal and the maximal dominating \(s-t\) cut w.r.t. U, respectively. U induces a facet of \(P^{\textrm{emb}}_=\) if and only if

  1. the subgraph of G induced by \(\overline{S^+}\setminus \{t\}\) is connected in the undirected sense, and

  2. the subgraph of G induced by \(S^-\setminus \{s\}\) is connected in the undirected sense.

For the proof of Theorem 6, see Appendix B.

These characterizations lead to polynomial-time separation algorithms, which is the topic of the next section.
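The two connectivity conditions in Theorems 5 and 6 can be checked with a plain graph search once the minimal and maximal dominating cuts are known. The following is a minimal sketch, not the authors' implementation: it assumes that the arc list of G and the node sets \(S^-\) and \(\overline{S^+}\) are already available, and the function names and the node labels `"s"` and `"t"` are hypothetical.

```python
from collections import deque


def is_connected_undirected(nodes, arcs):
    """True if the subgraph induced by `nodes` is connected when arc directions are ignored."""
    nodes = set(nodes)
    if not nodes:
        return True  # convention chosen for this sketch: the empty set counts as connected
    adj = {v: set() for v in nodes}
    for u, w in arcs:
        if u in nodes and w in nodes:  # keep only arcs of the induced subgraph
            adj[u].add(w)
            adj[w].add(u)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == nodes


def theorem5_conditions(arcs, s_minus, s_plus_bar, t="t"):
    """Conditions 1 and 2 of Theorem 5 for given S^- and complement of S^+."""
    return (is_connected_undirected(set(s_plus_bar) - {t}, arcs)
            and is_connected_undirected(s_minus, arcs))


def theorem6_conditions(arcs, s_minus, s_plus_bar, s="s", t="t"):
    """Conditions 1 and 2 of Theorem 6; the source s is removed from S^- first."""
    return (is_connected_undirected(set(s_plus_bar) - {t}, arcs)
            and is_connected_undirected(set(s_minus) - {s}, arcs))


# Toy network (hypothetical): two disjoint s-t paths.
example_arcs = [("s", "a"), ("s", "b"), ("a", "v1"), ("b", "v2"),
                ("v1", "t"), ("v2", "t")]
```

On the toy network, the sink side \(\{v1, v2, t\}\) fails condition 1 because v1 and v2 are joined only through t, whereas the sink side \(\{v1, t\}\) passes.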

6 Separation algorithms

The separation algorithms for both \(P^{\textrm{emb}}_\le \) and \(P^{\textrm{emb}}_=\) aim to solve the following problem: given a point \((x,\lambda ) \in R_*\), decide if \((x,\lambda ) \in P^{\textrm{emb}}_{\star }\), and if not, then determine a violated facet inducing inequality. Both algorithms are based on the network flow representation of \(P^{\textrm{emb}}_{\star }\), and use the \(s-t\) cuts of the network.

We can decide if a vector \((x,\lambda )\in R_*\) belongs to \(P^{\textrm{emb}}_{\star }\) by computing the capacity C of a minimum capacity \(s-t\) cut in \(N\left( x,\lambda \right) \) and comparing it to \(\sum _{i=1}^n x_i\). If

$$\begin{aligned} C < \sum _{i=1}^n x_i, \end{aligned}$$
(28)

then \((x,\lambda )\) is not contained in the polytope, otherwise it is, see Sect. 1.
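The membership test based on (28) can be sketched as follows. This is only an illustration under stated assumptions: we assume that in \(N(x,\lambda )\) an arc e of \(G_i\) has capacity \(\beta _e\lambda _i\) and the arc from terminal \(v_j\) to t has capacity \(x_j\), which is how the cut inequalities in this section read. The max-flow routine is a textbook Edmonds-Karp implementation, and all function names and the toy data are hypothetical.

```python
from collections import defaultdict, deque
from fractions import Fraction


def max_flow(capacity, s, t):
    """Edmonds-Karp max flow; `capacity` maps arcs (u, w) to nonnegative numbers."""
    residual = defaultdict(lambda: defaultdict(int))
    for (u, w), c in capacity.items():
        residual[u][w] += c
        residual[w][u] += 0  # make sure the reverse residual arc exists
    value = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for w, c in residual[u].items():
                if c > 0 and w not in parent:
                    parent[w] = u
                    queue.append(w)
        if t not in parent:
            return value  # by max-flow min-cut, this is the min-cut capacity C
        path, w = [], t
        while parent[w] is not None:
            path.append((parent[w], w))
            w = parent[w]
        delta = min(residual[u][w] for u, w in path)
        for u, w in path:
            residual[u][w] -= delta
            residual[w][u] += delta
        value += delta


def build_network(arcs, beta, lam, x, t="t"):
    """Assumed capacities: beta_e * lambda_i on arcs of G_i, x_j on arcs (v_j, t)."""
    capacity = defaultdict(int)
    for i, arcs_i in enumerate(arcs):
        for e in arcs_i:
            capacity[e] += beta[i][e] * lam[i]
    for v, x_v in x.items():
        capacity[(v, t)] += x_v
    return capacity


def passes_cut_test(arcs, beta, lam, x, s="s", t="t"):
    """False exactly when C < sum of x, i.e. when (28) holds."""
    C = max_flow(build_network(arcs, beta, lam, x, t), s, t)
    return not (C < sum(x.values()))


# Toy data (hypothetical): two subgraphs G_0, G_1 sharing s and the terminals v1, v2.
arcs = [[("s", "a0"), ("a0", "v1"), ("a0", "v2")],
        [("s", "a1"), ("a1", "v1"), ("a1", "v2")]]
beta = [{("s", "a0"): 3, ("a0", "v1"): 2, ("a0", "v2"): 2},
        {("s", "a1"): 2, ("a1", "v1"): 1, ("a1", "v2"): 2}]
lam = [Fraction(1, 2), Fraction(1, 2)]
```

With \(x = (1, 1)\) the minimum cut has capacity 2 and the test passes; with \(x = (2, 0)\) the arcs entering v1 carry at most 3/2, so (28) holds and the point is rejected. Exact rational arithmetic is used here to avoid tolerance issues; a floating-point version would need a small tolerance in the comparison.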

First, in Sect. 6.1, we describe two methods for finding a minimal and a maximal dominating \(s-t\) cut w.r.t. \(U\subset V_t\). These are needed in the separation algorithms for \(P^{\textrm{emb}}_\le \) and \(P^{\textrm{emb}}_=\), which are presented in Sects. 6.2 and 6.3, respectively, along with proofs of correctness.

6.1 Finding minimal and maximal dominating \(s-t\) cuts w.r.t. a given set U

We will use the properties described in Propositions 2 and 3 for finding minimal and maximal dominating \(s-t\) cuts, respectively. The algorithm for finding the minimal dominating \(s-t\) cut w.r.t. U is depicted in Algorithm 1. Initially we let \(S^- = V{\setminus }(\overline{U}\cup \left\{ t\right\} ),\,\overline{S^-} = \overline{U}\cup \left\{ t\right\} \) and \(E^-=E\left( S^-,\overline{S^-}\right) \). We process each subgraph \(G_i\) separately. Let \(W_i\) denote \(V(G_i)\cap \overline{U}\). We want to cover all paths from s to \(W_i\) in \(G_i\) with arcs of minimum total capacity. We choose a node v from \(W_i\), and examine the path from s to v in \(G_i\), which we denote by \(\pi _i(s,v)\). Let \(e=uw\) be the first tight arc of the path \(\pi _i(s,v)\). Note that all such paths contain at least one tight arc by Proposition 2. We extend the set of cut-arcs \(E^-\) with arc e and delete those cut-arcs from \(E^-\) that are in the subgraph \(\sigma (w)\). We delete the nodes of \(\sigma (w)\) from \(S^-\) and extend \(\overline{S^-}\) with the same nodes. We delete the terminal nodes of \(\sigma (w)\) from \(W_i\). This is repeated until \(W_i\) is empty. By Proposition 2, this algorithm finds a minimal dominating \(s-t\) cut.

Algorithm 1
figure a

Find minimal dominating \(s-t\) cut w.r.t. U

For finding the maximal dominating \(s-t\) cut w.r.t. U, the procedure is the same, except that we choose the last tight arc of the path \(\pi _i(s,v)\). The corresponding algorithm is depicted in Algorithm 2.

Algorithm 2
figure b

Find maximal dominating \(s-t\) cut w.r.t. U

6.2 Facet separation algorithm for polytope \(P^{\textrm{emb}}_\le \)

Algorithm 3
figure c

Find violated facet for polytope \(P^{\textrm{emb}}_\le \)

Consider Algorithm 3. The following results ensure that the algorithm always gives the correct answer in polynomial time.

Consider the minimal dominating \(s-t\) cut \(\left[ S^-,\overline{S^-}\right] \) w.r.t. U, and suppose that the subgraph of G induced by \(S^-\) is not connected. Let \(S_0,S_1,\dots ,S_N\) denote the node sets of its connected components such that \(s\in S_0\). Let \(U_0=S_0\cap V_t\). See Fig. 3.

Proposition 9

For each \(\ell \in \llbracket 1,N\rrbracket \) there exists \(v\in V_t\) such that \(S_\ell =\left\{ v\right\} \).

Proof

We argue that \(S_\ell \subset V_t\) for each \(\ell \in \llbracket 1,N\rrbracket \), from which the statement follows. If not, then observe that there is no arc uv of the graph G such that \(u \in S_0\), \(v \in S_\ell \), since \(S_\ell \) is the node set of a connected component of \(G(S^-)\). Hence,

$$\begin{aligned} c_{x,\lambda }\left[ S^-\setminus (S_\ell \setminus V_t),\overline{S^-} \cup (S_\ell \setminus V_t)\right] \le c_{x,\lambda }\left[ S^-,\overline{S^-}\right] . \end{aligned}$$

Consequently, \(\left[ S^-{\setminus } (S_\ell {\setminus } V_t),\overline{S^-} \cup (S_\ell {\setminus } V_t)\right] \) is a dominating \(s-t\) cut w.r.t. U, while \(S^-\setminus (S_\ell \setminus V_t)\) is a proper subset of \(S^-\), a contradiction. \(\square \)

For convenience, let \(x_\ell \) denote the capacity of the unique arc in \(E\left( S_\ell ,t\right) \) (Figs. 3 and 4).

Fig. 3
figure 3

Illustration of the node sets \(S^-,\overline{S^-}{\setminus } \{t\},U_0\) and \(S_0,S_1,\dots ,S_N\)

Proposition 10

If \((x^*,\lambda ^*)\in R_{\le }\) violates the dominating \(s-t\) cuts w.r.t. U, then it also violates the dominating \(s-t\) cuts w.r.t. \(U_0\).

Proof

The minimal dominating \(s-t\) cut w.r.t. \(U_0\) is \(\left[ S_0,\overline{S^-}\cup \left( \bigcup _{\ell =1}^N S_\ell \right) \right] \). There exists no arc uv such that \(u\in S_0\) and \(v\in S_\ell \) for any \(\ell \in \llbracket N\rrbracket \), since \(S^-\) is not connected. Also, there exists no arc uv such that \(u\in S_\ell \) and \(v\in \overline{S^-}\setminus \{t\}\) by Proposition 9. The inequality induced by \(\left[ S^-,\overline{S^-}\right] \) is violated by \((x^*,\lambda ^*)\), i.e.

$$\begin{aligned} \sum _{i=1}^mk_i\lambda ^*_i < \sum _{j:v_j\in \overline{U}}x^*_j, \end{aligned}$$
(29)

and the inequality induced by \(\left[ S_0,\overline{S^-}\cup \left( \bigcup _{\ell =1}^N S_\ell \right) \right] \) is

$$\begin{aligned} \sum _{i=1}^mk_i\lambda _i \ge \sum _{j:v_j\in \overline{U}}x_j + \sum _{\ell =1}^Nx_\ell , \end{aligned}$$
(30)

which is also violated by \((x^*,\lambda ^*)\). \(\square \)

From now on we assume that \(G(S^-)\) is connected.

Now suppose that \(\left[ S,\overline{S}\right] \in \mathcal {C}_{\min }\left( U\right) \) is violated by \((x^*,\lambda ^*)\), and the subgraph induced by \(\overline{S^+}\setminus \{t\}\) is not connected in the undirected sense, and the node sets of its connected components are \(T_1,\dots ,T_K\) for some \(K>1\). Let

$$\begin{aligned} U_\ell = U\cup \left( \bigcup _{k\in \llbracket K\rrbracket \setminus \{\ell \}}\left( V_t\cap T_k\right) \right) \end{aligned}$$

for all \(\ell \in \llbracket K\rrbracket \). We will show that at least one of the sets \(U_\ell \) induces a facet of \(P^{\textrm{emb}}_\le \) violated by \((x^*,\lambda ^*)\).

Proposition 11

There exists \(\ell \in \llbracket K\rrbracket \) such that the dominating \(s-t\) cut w.r.t. \(U_\ell \) is violated by \((x^*,\lambda ^*)\).

Proof

Since \((x^*,\lambda ^*)\) violates the dominating \(s-t\) cuts w.r.t. U, we have

$$\begin{aligned} \sum _{i=1}^m k_i\lambda ^*_i < \sum _{j:v_j\in V_t\setminus U} x^*_j. \end{aligned}$$
(31)

Since there are no arcs between \(T_r\) and \(T_q\) for any \(r\ne q\), the above inequality decomposes into

$$\begin{aligned} \sum _{i=1}^m\sum _{\ell =1}^K k_i^\ell \lambda ^*_i < \sum _{\ell =1}^K\sum _{j:v_j\in T_\ell \cap V_t} x^*_j \end{aligned}$$
(32)

where \(k_i^\ell \) denotes the portion of \(k_i\) that enters \(T_\ell \), i.e.,

$$\begin{aligned} k^\ell _i = \sum _{e\in E_{i}\left( S^-,T_\ell \right) }\beta _{e}. \end{aligned}$$

Suppose for a contradiction that for each \(\ell \in \llbracket K\rrbracket \) the set \(U_\ell \) does not induce a violated cut, i.e.,

$$\begin{aligned} \sum _{i=1}^m k_i^\ell \lambda ^*_i \ge \sum _{j:v_j\in T_\ell \cap V_t}x^*_j. \end{aligned}$$
(33)

Summing up these inequalities gives

$$\begin{aligned} \sum _{i=1}^m\sum _{\ell =1}^K k_i^\ell \lambda ^*_i \ge \sum _{\ell =1}^K\sum _{j:v_j\in T_\ell \cap V_t} x^*_j, \end{aligned}$$

which contradicts inequality (32). Hence, there is at least one index \(\ell \) for which (33) does not hold. \(\square \)
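The construction of the sets \(U_\ell \) and the counting argument in the proof of Proposition 11 can be illustrated with a short sketch. The function names and the toy data are hypothetical; the coefficients \(k_i^\ell \) are taken as given inputs (in the text they are sums of \(\beta _e\) over cut-arcs entering \(T_\ell \)).

```python
def enlarged_sets(U, V_t, components):
    """U_l = U united with (V_t ∩ T_k) for all k != l, one set per component T_l."""
    U, V_t = set(U), set(V_t)
    return [U.union(*(V_t & set(T) for k, T in enumerate(components) if k != l))
            for l in range(len(components))]


def indices_violating_33(k, lam, x, V_t, components):
    """Indices l for which (33) fails: sum_i k[i][l]*lam[i] < sum of x over T_l ∩ V_t."""
    V_t = set(V_t)
    violated = []
    for l, T in enumerate(components):
        lhs = sum(k[i][l] * lam[i] for i in range(len(lam)))
        rhs = sum(x[v] for v in set(T) & V_t)
        if lhs < rhs:
            violated.append(l)
    return violated


# Toy data (hypothetical): two components on the sink side, terminals v1, v2, v3.
components = [{"a", "v1"}, {"b", "v2"}]
V_t = {"v1", "v2", "v3"}
U = {"v3"}
k = [[1, 2], [3, 1]]          # k[i][l] plays the role of k_i^l
lam = [0.5, 0.5]
x = {"v1": 1.5, "v2": 2.5, "v3": 0.0}
```

In this toy instance the aggregated inequality (32) reads 3.5 < 4.0, so at least one component must violate (33); here it is the second one (index 1 in zero-based numbering), and the corresponding set is \(U_\ell = \{v3, v1\}\).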

Theorem 7

If the subgraph induced by \(S^-\) is connected, then for any \(\ell \in \llbracket K\rrbracket \), the dominating \(s-t\) cut w.r.t. \(U_\ell \) is facet inducing for \(P^{\textrm{emb}}_\le \).

Proof

Let \(\left[ Z^+,\overline{Z^+}\right] \) denote the maximal dominating \(s-t\) cut w.r.t. \(U_\ell \). We first prove that \(U_\ell \) satisfies condition (i) of Theorem 5. To this end, it suffices to verify that \(Z^+=S^+\cup \left( \bigcup _{r\ne \ell }T_r\right) \), since then \(\overline{Z^+}=T_\ell \cup \left\{ t\right\} \), and \(T_\ell \) is connected by definition. Recall that \(\overline{S^+} = \cup _{r=1}^K T_r \cup \{t\}\) and \(S^+\subset Z^+\) by Proposition 5. We have to prove that we get \(Z^+\) from \(S^+\) by moving all \(T_r\), \(r\ne \ell \), to the source side of the cut.

Firstly, we argue that \(T_\ell \cap Z^+=\emptyset \). Suppose for contradiction that \(\emptyset \ne T_\ell \cap Z^+\). There is no path from any node of \(T_\ell \cap Z^+\) to \(T_r\) for any \(r\ne \ell \). Let \(\left[ S',\overline{S'}\right] \) denote the \(s-t\) cut we get as the result of extending \(S^+\) by \(Z^+\cap T_\ell \). Observe that \(S' \cap V_t = U\), since \(T_\ell \cap Z^+ \cap V_t = \emptyset \). Let \(Z^\star = Z^+ {\setminus } T_\ell \). Then \(\left[ Z^\star ,\overline{Z^\star }\right] \) is an \(s-t\) cut w.r.t. \(U_\ell \). Recall the definition of \(k_i\) in (10), and define \(k'_i\) analogously for \([S',\overline{S'}]\). If \(\left[ S^+,\overline{S^+}\right] \) dominates \([S',\overline{S'}]\), then there exists i such that \(k_i<k_i'\). Observe that

$$\begin{aligned} k_i'= k_i + \sum _{e\in E_{i}\left( Z^+\cap T_\ell ,T_\ell \setminus Z^+\right) }\beta _{e} - \sum _{e\in E_{i}\left( S^+,Z^+\cap T_\ell \right) }\beta _{e}. \end{aligned}$$
(34)

Since \(k_i<k_i'\) by assumption, we have

$$\begin{aligned} \sum _{e\in E_{i}\left( Z^+\cap T_\ell ,T_\ell \setminus Z^+\right) }\beta _{e} > \sum _{e\in E_{i}\left( S^+,Z^+\cap T_\ell \right) }\beta _{e}. \end{aligned}$$
(35)

We define the coefficients \(k_i^+\) and \(k_i^\star \) for \(\left[ Z^+,\overline{Z^+}\right] \) and \(\left[ Z^\star ,\overline{Z^\star }\right] \) using (10), respectively. We have

$$\begin{aligned} k_i^\star = k_i^+ - \sum _{e\in E_{i}\left( Z^+\cap T_\ell ,T_\ell \setminus Z^+\right) }\beta _{e} + \sum _{e\in E_{i}\left( S^+,Z^+\cap T_\ell \right) }\beta _{e}. \end{aligned}$$
(36)

Now inequality (35) implies that \(k_i^+>k_i^\star \), which means that \(\left[ Z^+,\overline{Z^+}\right] \) is dominated by \(\left[ Z^\star ,\overline{Z^\star }\right] \), a contradiction. Otherwise, if \([S',\overline{S'}]\) dominates \(\left[ S^+,\overline{S^+}\right] \), then \(\left[ S^+,\overline{S^+}\right] \) is not a dominating \(s-t\) cut w.r.t. U, a contradiction. Finally, if \([S',\overline{S'}]\) neither dominates \(\left[ S^+,\overline{S^+}\right] \), nor is dominated by \(\left[ S^+,\overline{S^+}\right] \), then they have the same capacity. But then, \(\left[ S^+,\overline{S^+}\right] \) is not a maximal dominating \(s-t\) cut w.r.t. U, since \(S^+\subset \left( S^+\cup \left( Z^+\cap T_\ell \right) \right) \).

Finally, we argue that for all \(r\ne \ell \) we have \(T_r\subseteq Z^+\). Suppose for contradiction that \(T_r{\setminus } Z^+ \ne \emptyset \). By definition, there is no path between \(T_r\) and \(T_\ell \), however, there exists a path from \(S^+ \subset Z^+\) to \(T_r\), hence moving \(T_r\setminus Z^+\) to \(Z^+\) decreases the capacity of the resulting \(s-t\) cut, which contradicts that the \(s-t\) cut \(\left[ Z^+,\overline{Z^+}\right] \) is dominating.

Now we prove that condition (ii) of Theorem 5 holds for \(Z^-\), where \(\left[ Z^-,\overline{Z^-}\right] \) denotes the minimal dominating \(s-t\) cut w.r.t. \(U_\ell \). We claim that there is no cut arc on any path from s to \(\cup _{r\ne \ell }T_r\) in the \(s-t\) cut \(\left[ Z^-,\overline{Z^-}\right] \). Let \(c^U,c^{U_\ell }\) be the capacity functions described by Eq. (11) w.r.t. U and \(U_\ell \), respectively, and let f be a maximum flow in the network \((G,c^U)\). We construct a maximum flow g in \((G,c^{U_\ell })\). First let \(g=f\). Let \(\pi =s,v_0,\dots ,v_n\) be a directed path in subgraph \(G_i\) where \(v_n\in \cup _{r\ne \ell }T_r\cap V_t\). Since \(c^{U_\ell }(v_nt)=0\), we have \(g_{v_{n-1}v_n}=0\). For each arc e of \(\pi \), let \(g_e=f_e-f_{v_{n-1}v_n}\). Repeat this for all paths from s to the terminal nodes in \(\cup _{r\ne \ell }T_r\), for all subgraphs \(G_i\). The flow g is feasible in \((G,c^{U_\ell })\), and it is a maximum flow, since we cannot increase the flow value on any path from s to \(T_\ell \cap V_t\), because f, and thus g, saturates all cut arcs separating \(S^+\) from \(T_\ell \). Now \(g_e<f_e\) holds for all arcs of all directed paths from s to \(\cup _{r\ne \ell }T_r\). Hence, none of these arcs is a cut arc in \(\left[ Z^-,\overline{Z^-}\right] \), because those cut-arcs are saturated in every maximum flow in \((G,c^{U_\ell })\), see Fig. 4. Since \(S^-\subseteq Z^-\) by Proposition 4, and \(T_r\), \(r\ne \ell \), is a subset of \(Z^-\) by the above arguments, it follows that \(Z^-\) is connected.

Fig. 4
figure 4

There is a path (in the undirected sense) unsaturated by g from \(U_\ell \setminus U\) to s and from s to U

\(\square \)

6.3 Facet separation algorithm for \(P^{\textrm{emb}}_=\)

The transformation of a set U into a facet inducing set \(U'\) is similar to the one described in Sect. 6.2. For simplicity we assume that \(\kappa =1\), i.e. \(G\setminus \{s,t\}\) is connected in the undirected sense, otherwise we run the algorithm for the components separately. See Algorithm 4.

Algorithm 4
figure d

Facet separation for polytope \(P^{\textrm{emb}}_=\)

The following statements ensure the correctness of the algorithm. First, assume that \(G(S^-\setminus \{s\})\) is not connected, and let \(S_1,\dots ,S_N\) be the node-sets of its connected components. Let \(U_\ell =S_\ell \cap V_t\).

Proposition 12

There exists \(\ell \in \llbracket N\rrbracket \) such that the inequality induced by \([S_\ell , \overline{S_\ell }]\) is violated by \((x^*,\lambda ^*)\).

Proof

Suppose \((x^*,\lambda ^*)\) violates inequality (14) for set U. By Eq. (5), this is equivalent to

$$\begin{aligned} \sum _{i=1}^m\left( \alpha _i-k_i\right) \lambda ^*_i > \sum _{j:v_j\in U} x^*_j. \end{aligned}$$
(37)

We will decompose the \(\alpha _i\) and the \(k_i\) in the following way. Let

$$\begin{aligned} \begin{array}{rrrclr} & & k_i^\ell & = & \displaystyle {\sum _{e\in E_{i}\left( S_\ell ,\overline{S^-}\right) } \beta _{e}} & \forall \ell \in \llbracket N\rrbracket ,\,\forall i\in \llbracket m\rrbracket \\ & & \alpha _i^\ell & = & \displaystyle {\sum _{e\in E_{i}\left( s,S_\ell \right) } \beta _{e}}& \forall \ell \in \llbracket N\rrbracket ,\,\forall i\in \llbracket m\rrbracket \\ k_i^0 & = & \alpha _i^0 & = & \displaystyle {\sum _{e\in E_{i}\left( s,\overline{S^-}\right) } \beta _{e}}& \forall i\in \llbracket m\rrbracket . \end{array} \end{aligned}$$
(38)

See Fig. 5 for an illustration. Observe that

$$\begin{aligned} \alpha _i=\alpha _i^0+\sum _{\ell =1}^N\alpha _i^\ell ,\quad k_i=k_i^0+\sum _{\ell =1}^N k_i^\ell . \end{aligned}$$

We can decompose the left hand side of inequality (37) as follows:

$$\begin{aligned} \sum _{i=1}^m\left( \alpha _i-k_i\right) \lambda ^*_i&= \sum _{i=1}^m\left( \alpha _i^0+\sum _{\ell =1}^N\alpha _i^\ell -k_i^0-\sum _{\ell =1}^N k_i^\ell \right) \lambda ^*_i\\&= \sum _{i=1}^m\left( \sum _{\ell =1}^N\alpha _i^\ell -\sum _{\ell =1}^N k_i^\ell \right) \lambda ^*_i\\&=\sum _{\ell =1}^N\sum _{i=1}^m\left( \alpha _i^\ell -k_i^\ell \right) \lambda ^*_i, \end{aligned}$$

and the right hand side as

$$\begin{aligned} \sum _{j:v_j\in U}x^*_j = \sum _{\ell =1}^N\sum _{j:v_j\in U_\ell } x^*_j. \end{aligned}$$

Therefore, inequality (37) implies that there exists an \(\ell \) such that

$$\begin{aligned} \sum _{i=1}^m\left( \alpha _i^\ell -k_i^\ell \right) \lambda ^*_i > \sum _{j:v_j\in U_\ell } x^*_j. \end{aligned}$$

Subtracting this from equality (5), we get

$$\begin{aligned} \sum _{i=1}^m \left( \alpha _i - \alpha _i^\ell +k_i^\ell \right) \lambda ^*_i < \sum _{j:v_j\in V_t\setminus U_\ell } x^*_j. \end{aligned}$$
(39)

Observe that (39) means that the inequality induced by the \(s-t\) cut \([S_\ell , \overline{S_\ell }]\) is violated by \((x^*,\lambda ^*)\). \(\square \)

Fig. 5
figure 5

Illustration of the \(k_i^0,k_i^1,\dots ,k_i^N\) and \(\alpha _i^0,\alpha _i^1,\dots ,\alpha _i^N\) values

Now suppose that \(S^-\setminus \{s\}\) is connected, but \(\overline{S^+}\setminus \{t\}\) is not, and the node-sets of its connected components are \(T_1,\dots ,T_M\). Now let

$$\begin{aligned} S^+_\ell =S^+\cup \left( \bigcup _{ r \in \llbracket M\rrbracket \setminus \{\ell \}} T_r\right) \text { and } U_\ell = V_t\setminus T_\ell . \end{aligned}$$

Proposition 13

There exists \(\ell \in \llbracket M\rrbracket \) such that the inequality induced by \([S^+_\ell ,\overline{S^+_\ell }]\) is violated by \((x^*,\lambda ^*)\).

Proof

Recall that the dominating cut w.r.t. U being violated is equivalent to inequality (31). We decompose \(k_i\) the following way. Let

$$\begin{aligned} k_i^\ell = \sum _{e\in E_{i}\left( S^+,T_\ell \right) }\beta _{e}\quad \forall \ell \in \llbracket M\rrbracket ,\,\forall i\in \llbracket m\rrbracket . \end{aligned}$$
(40)

Observe that \(k_i=\sum _{\ell =1}^Mk_i^\ell \). Now we can decompose inequality (31):

$$\begin{aligned} \sum _{\ell =1}^M\sum _{i=1}^m k_i^\ell \lambda ^*_i < \sum _{\ell =1}^M\sum _{j:v_j\in V_t\setminus U_\ell }x^*_j. \end{aligned}$$
(41)

Therefore, for at least one index \(\ell \) we have

$$\begin{aligned} \sum _{i=1}^m k_i^\ell \lambda ^*_i < \sum _{j:v_j\in V_t\setminus U_\ell }x^*_j, \end{aligned}$$
(42)

which implies the statement. \(\square \)

Theorem 8

If the subgraph induced by \(S^-\setminus \{s\}\) is connected, then for any \(\ell \in \llbracket M\rrbracket \) the set \(U_\ell \) is facet inducing.

Proof

We prove that conditions (i) and (ii) of Theorem 6 hold for \(U_\ell \) for any \(\ell \in \llbracket M\rrbracket \). Let \(\left[ Z^+,\overline{Z^+}\right] \) denote the maximal dominating \(s-t\) cut w.r.t. \(U_\ell \). First we prove that \(Z^+=S^+\cup \left( \bigcup _{r\ne \ell }T_r\right) \) and consequently \(\overline{Z^+}=T_\ell \cup \left\{ t\right\} \). Note that this implies that \(U_\ell \) satisfies condition (i) of Theorem 6, since \(T_\ell \) is connected by definition.

We define the arc capacities \(c^U\), \(c^{U_\ell }\), and \(s-t\) flows f and g similarly to those in the proof of Theorem 7. Then f carries 0 flow to the nodes in U, and maximum flow to the nodes in \(V_t\setminus U\), and likewise, g carries 0 flow to the nodes in \(U_\ell \) and maximum flow to the nodes in \(V_t\setminus U_\ell \). First we prove that for all \(r\ne \ell \) we have \(T_r\subseteq Z^+\). Since there is a cut arc between \(S^+\) and \(T_r\) which is saturated by f by construction, there is a directed path \(\pi \) from s to some \(v_j \in T_r \cap U_\ell \) with a positive f-flow. Since \(U \subset U_\ell \), g does not carry any flow to \(v_j\) by construction, \(g_e < f_e\) for each arc e of \(\pi \), and thus g does not saturate any arc on \(\pi \). This means that there is no cut-arc (for the capacities \(c^{U_\ell }\)) on any path from s to any \(T_r\), \(r\ne \ell \), hence \(T_r\subseteq Z^+\) for \(r\ne \ell \). Moreover, this shows that \(T_r\subseteq Z^-\).

Second, we prove that \(T_\ell \cap Z^+=\emptyset \). The arcs that cut \(T_\ell \) from \(S^+\) are saturated by g, hence they cut \(T_\ell \) from \(Z^+\). Hence, \(Z^+=S^+\cup \left( \bigcup _{r\ne \ell }T_r\right) \) as claimed.

Let \(\left[ Z^-,\overline{Z^-}\right] \) denote the minimal dominating \(s-t\) cut w.r.t. \(U_\ell \). We now verify condition (ii) of Theorem 6, starting with the following:

Claim 1

The subgraph spanned by \(S^+\setminus \{s\}\) is connected in the undirected sense.

Proof

Suppose for contradiction that the subgraph spanned by \(S^+{\setminus } \{s\}\) has multiple components, \(S_0,S_1,\dots ,S_K\). Since the subgraph spanned by \(S^-\setminus \{s\}\) is connected by assumption, one of these components contains \(S^-{\setminus } \{s\}\). Let \(S_0\) be that component. Since \(S^+\cap V_t=U=S^-\cap V_t\), no other component contains any terminal nodes. There exists at least one arc that leaves \(S_k\) and enters \(T_r\) for some \(k\in \llbracket K\rrbracket \) and \(r\in \llbracket M\rrbracket \), since otherwise no terminal node would be reachable from the nodes of \(S_k\), contradicting the fact that at least one terminal node is reachable from \(S_k\). Since \(S_k\) contains neither terminal nodes nor node s, it is a rooted tree with root \(w\in V(G_i)\) for some i. If \(c_w>\beta _{e_i\left( w\right) }\), then the \(s-t\) cut \(\left[ S^+,\overline{S^+}\right] \) is dominated by the \(s-t\) cut \(\left[ S^+\setminus S_k,\overline{S^+}\cup S_k\right] \), a contradiction. If \(c_w\le \beta _{e_i\left( w\right) }\), then the capacity function is not reduced, since all arcs leaving \(S_k\) enter \(\overline{S^+}\), again a contradiction. Hence, no such \(S_k\) exists, and \(S^+\setminus \{s\}\) is connected. \(\square \)

Fig. 6
figure 6

If \(p_i\left( v\right) \in S^-\setminus \{s\}\), there is an undirected path unsaturated by g from \(U_\ell \setminus U\) to \(p_i\left( v\right) \) and from \(p_i\left( v\right) \) to U

To finish the proof of the theorem, it suffices to prove the following:

Claim 2

For each \(r\ne \ell \), there exists an undirected path connecting a node in \(T_r\cap U_\ell \) with a node in U containing no cut arc in \(\left[ Z^-,\overline{Z^-}\right] \).

Proof

First suppose there exists an arc e between \(S^-\) and \(T_r\). Then there is an undirected path \(\pi \) containing e, which connects some node in U to some node in \(T_r\cap U_\ell \). No arc of \(\pi \) is saturated by g, since the part of \(\pi \) in \(S^-\) contains no saturated arc by Proposition 2 5), the part of \(\pi \) in \(T_r\) has zero flow by definition of g, and similarly \(g_e=0\). Since no arc of \(\pi \) is saturated by g, it contains no cut arc.

If the arc e above does not exist, then there is an undirected path between \(T_r\) and \(S^-\) which contains a node from \(S^+\setminus S^-\). Recall that by Proposition 6, the subgraph spanned by \(S^+\setminus S^-\) decomposes into rooted trees. Let \(\tau \) denote the rooted tree component of \(S^+\setminus S^-\) that contains a part of the said undirected path, and let v be its root node. The parent node \(p_i\left( v\right) \) of v is either in \(S^-\setminus \{s\}\) or it is s. If \(p_i\left( v\right) \in S^-\setminus \{s\}\), then there is an undirected path from \(T_r\cap U_\ell \) to U with no saturated arc: any path from s to \(T_r\) has no arcs saturated by g, as seen in the proof of Theorem 7, hence the path from \(p_i\left( v\right) \) to \(T_r\) has no saturated arcs, and any path from \(p_i\left( v\right) \) to U within \(S^-\) has no saturated arcs by Proposition 2 5), see Fig. 6.

If \(p_i\left( v\right) =s\), then there is at least one arc from \(\tau \) to \(S^-\), since \(S^+\setminus \{s\}\) is connected as shown above. Since there is a path from node v to \(T_r\) with arcs unsaturated by flow g, it is enough to show that there exists a path from v to \(S^-\) with arcs unsaturated by g. Suppose for contradiction, that all paths from v to \(S^-\) contain at least one arc saturated by g, and hence saturated by f, see Fig. 7a. Observe that \(\tau \) admits a maximal subtree \(\tau '\) rooted at v such that the flow f saturates no arc in \(\tau '\). Then there exists no arc from any node of \(\tau '\) to some node in \(S^-\) by our indirect assumption. Since all arcs leaving \(\tau '\) are saturated by f, we have

$$\begin{aligned} \beta _{e_i\left( v\right) } = f_{e_i\left( v\right) } = \sum _{e\in \delta (\tau ')} f_e = \sum _{e\in \delta (\tau ')} \beta _{e}. \end{aligned}$$
(43)

Hence, all arcs of \(\tau '\) are saturated by f, therefore \(V(\tau ')=\{v\}\) (see Fig. 7b), and thus the network G is not reduced (see Proposition 1), a contradiction. Therefore, there must exist at least one path from v to \(S^-\) with all arcs unsaturated by flow f, and consequently unsaturated by flow g.

Fig. 7
figure 7

The wavy lines symbolize paths with no arcs saturated by g, the thick straight lines symbolize paths with at least one saturated arc by g

This completes the proof of the theorem. \(\square \)

7 Computational experiments

To assess the computational efficiency of the proposed facet separation procedure (Algorithm 4), we implemented it within a branch-and-cut framework, and we performed computational experiments on a set of problem instances described below. The goal of these experiments is to demonstrate the potential computational advantage of the separation algorithm, and we chose a rather general test problem to do so.

7.1 Test problem

Let \(K,m,n\in \mathbb {N}_{>0}\) and \(\mathcal {L}^k_i\subseteq 2^{\llbracket n\rrbracket }\) be a laminar family of subsets of \(\llbracket n\rrbracket \) for \(k\in \llbracket K\rrbracket \) and \(i\in \llbracket m \rrbracket \). For all \(L\in \mathcal {L}_i^k\) define \(b_L \ge 0\), and for all \(k\in \llbracket K\rrbracket \) and \(i\in \llbracket m \rrbracket \), let \(\alpha _i^k > 0\). For each \(i \in \llbracket m\rrbracket \) and \(k\in \llbracket K\rrbracket \), we define the polytope \(P(i,k)\) as the convex hull of the non-negative solutions of the system of inequalities described by the laminar family \(\mathcal {L}^k_i\), \(\alpha _i^k\) and \(\left\{ b_L\right\} _{L\in \mathcal {L}^k_i}\), as in Theorem 4, i.e.,

$$\begin{aligned} P(i,k) = \left\{ x\in \mathbb {R}_{\ge 0}^n\ \left| \ \sum _{j=1}^n x_j = \alpha _i^k,\,\sum _{j\in L}x_j\le b_L\ \forall L\in \mathcal {L}^k_i\right. \right\} . \end{aligned}$$
(44)

By Theorem 4, \(\textrm{conv}\left( \bigcup _{i=1}^m P(i,k)^{\textrm{emb}}\right) \) has a network flow representation.

Let \(w_j, \beta _j>0\) for all \(j\in \llbracket n \rrbracket \), while the \(c^k_i\) (for \(k \in \llbracket K \rrbracket ,\,i\in \llbracket m \rrbracket \)) are arbitrary rational numbers. Our goal is to solve the following problem:

$$\begin{aligned} \begin{array}{rrcllll} \min & \displaystyle {\sum _{j=1}^n} w_j y_j & + & \displaystyle {\sum _{k=1}^K \sum _{i=1}^m} c^k_i \lambda ^k_i \\ \mathrm {s.t.} & \displaystyle {\sum _{k=1}^{K}} x^k_j & \le & \beta _j+y_j,& \forall j\in \llbracket n\rrbracket \\ & (x^k,\lambda ^k) & \in & \displaystyle {\bigcup _{i=1}^{m}} P(i,k)^\textrm{emb},& \forall k \in \llbracket K\rrbracket . \end{array} \end{aligned}$$
(45)

In this problem we have to find a set of vectors \(x^k\in \mathbb {R}^n\), \(k\in \llbracket K\rrbracket \), such that each \(x^k\) is in one of the polytopes \(P(i,k)\), \(i \in \llbracket m\rrbracket \), the \(\lambda ^k\) indicates which one, and the \(x^k\) also satisfy a set of linear constraints, with the objective of minimizing the violation of these constraints and the total cost of the chosen alternatives.

We can restate (45) as a mixed integer-linear program:

$$\begin{aligned} \begin{array}{rrcllll} \min & \displaystyle {\sum _{j=1}^n} w_jy_j & + & \displaystyle {\sum _{k=1}^K \sum _{i=1}^m} c^k_i \lambda ^k_i\\ \mathrm {s.t.} & \displaystyle {\sum _{k=1}^{K}} x^k_j & \le & \beta _j+y_j,& \forall j\in \llbracket n\rrbracket \\ & \displaystyle {\sum _{j\in L} x^k_j} & \le & b_L + (1-\lambda ^k_i) M_k,\quad & \forall L\in \mathcal {L}^k_i,\,\forall i \in \llbracket m \rrbracket ,\,\forall k \in \llbracket K\rrbracket \\ & \displaystyle {\sum _{j=1}^n x^k_j} & =& \displaystyle {\sum _{i=1}^{m}\alpha ^k_i\lambda ^k_i},& \forall k \in \llbracket K\rrbracket \\ & \lambda _i^k & \in & \left\{ 0,1\right\} , & \forall i\in \llbracket m\rrbracket ,\, \forall k \in \llbracket K\rrbracket \\ & x^k_j & \ge & 0, & \forall j\in \llbracket n\rrbracket ,\,\forall k \in \llbracket K\rrbracket . \end{array} \end{aligned}$$
(46)

The constant \(M_k\) in (46) equals \(\sum _{i=1}^{m}\alpha ^k_i +1\) for each \(k\in \llbracket K\rrbracket \). We will refer to (46) as the big-M formulation.
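To make the role of \(M_k\) concrete, the following minimal sketch evaluates, for one fixed k, the constraints displayed in (46) that involve only \(x^k\) and \(\lambda ^k\); the coupling constraint with \(y_j\) is left out. The data layout (`families[i]` as a list of pairs of an index set L and its bound \(b_L\)), the function names, the tolerance and the toy numbers are assumptions made for this illustration only.

```python
def big_m_constant(alpha):
    """M_k = sum_i alpha_i^k + 1, as stated after (46)."""
    return sum(alpha) + 1


def satisfies_big_m(x, lam, alpha, families, tol=1e-9):
    """Check the x^k / lambda^k constraints of (46) for a single k.

    x: list of x_j^k, lam: list of lambda_i^k, alpha: list of alpha_i^k,
    families[i]: list of (L, b_L) with L a set of indices j.
    """
    M = big_m_constant(alpha)
    if any(l not in (0, 1) for l in lam):
        return False  # lambda must be binary
    if any(xj < -tol for xj in x):
        return False  # x must be non-negative
    if abs(sum(x) - sum(a * l for a, l in zip(alpha, lam))) > tol:
        return False  # sum_j x_j = sum_i alpha_i * lambda_i
    for i, family in enumerate(families):
        for L, b in family:
            # The bound b_L is enforced when lam[i] == 1 and relaxed by M otherwise.
            if sum(x[j] for j in L) > b + (1 - lam[i]) * M + tol:
                return False
    return True


# Toy data (hypothetical): n = 3 variables, m = 2 alternatives.
alpha = [4, 6]
families = [[({0, 1}, 3), ({2}, 2)],
            [({0}, 1), ({1, 2}, 5)]]
```

One way to see that this choice of \(M_k\) is large enough: for \(\lambda ^k_i = 0\), any partial sum \(\sum _{j\in L} x^k_j\) is at most \(\sum _{j} x^k_j = \sum _{i}\alpha ^k_i\lambda ^k_i \le \sum _i \alpha ^k_i < M_k\), so the relaxed constraint is never binding since \(b_L \ge 0\).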

7.2 Methods compared and test environment

We implemented our separation procedure (Algorithm 4) in C++. We used FICO XPRESS v9.4.1 for solving the mixed-integer linear program (46) with branch-and-cut. In our method, called D-cuts, we used presolve with default settings, except for dual reductions, which were turned off to enable cut generation. The built-in cuts of XPRESS were disabled. We separated disjunctive cuts in the root node of the search tree in at most 20 rounds, and then in one round in every node of depth at most 20. A violated cut found by Algorithm 4 was added to the LP relaxation of a node only if the absolute violation was at least 0.1. We compared the performance of our D-cuts method to three other approaches.

The second method, called B &B, in our comparison was the branch-and-bound procedure of XPRESS applied to (46) with presolve turned on (no cut generation at all).

The third approach, called XPRS-cuts, is the default branch-and-cut of the XPRESS solver applied to (46), using presolve and the built-in cuts of the solver.

In the fourth method we applied Balas’ reformulation (Theorem 2) to each disjunctive constraint in (45), and solved the resulting MIP with presolve turned on.

The experiments were performed on a notebook computer with an i7-8850H CPU @ 2.60GHz and a Windows operating system. The root relaxation was always solved using 8 CPU threads by the barrier solver and 1 thread by the dual simplex method, whereas the subsequent tree search and cut generation used only one CPU thread. The run-time limit was set to 240 s for the smaller instances and 1200 s for the larger ones.

7.3 Design and evaluation of computational experiments

The test problems were generated as follows. The laminar families defining each polyhedron P(ik) have depth 3 in all test instances, cf. Sect. 3. We generated 10 random problem instances for each \(n \in \{30,40,50,60,70, 100, 130\}\), while m and K were both set to n in all cases. The laminar sets \(\mathcal {L}_i^k\), the parameters \(\alpha _i^k\), and the right-hand sides \(b_L\) for each \(L \in \mathcal {L}_i^k\) were chosen randomly while ensuring that the corresponding network be reduced. Finally, the parameters \(\beta _j\), and the weights \(w_j\) and \(c^k_i\) were chosen as follows: \(\beta _j\) is a random number from the interval \([\underline{b}_j/2, 1.5 \underline{b}_j]\), where \(\underline{b}_j = \sum _{k=1}^K \min \{ b_L\ |\ j \in L,\, L\in \mathcal {L}_i^k,\,i \in \llbracket m\rrbracket \}\) is the sum of the smallest non-zero upper bounds on \(x_j\) over all the disjunctive constraints. The weights \(w_j\) and \(c^k_i\) are chosen uniformly at random from the intervals [0, 10.0] and \([-200,+200]\), respectively. These choices were made after some preliminary tests in order to obtain difficult instances.
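The sampling of \(\beta _j\), \(w_j\) and \(c^k_i\) described above can be sketched as follows. This is a hedged illustration, not the authors' generator: the data layout `families[k][i]` (a list of `(L, b_L)` pairs for the laminar family \(\mathcal {L}^k_i\)), the function names, the uniform distribution for \(\beta _j\), and the treatment of an index \(j\) that is covered by no set for some \(k\) are all assumptions. The generation of the laminar families themselves is not covered.

```python
import random


def min_bound_sum(j, families):
    """Sum over k of the smallest non-zero b_L among all sets L containing j."""
    total = 0.0
    for alternatives in families:              # one entry per k
        bounds = [b
                  for family in alternatives   # one laminar family per alternative i
                  for (L, b) in family
                  if j in L and b > 0]
        if bounds:                             # assumption: skip k if no set covers j
            total += min(bounds)
    return total


def draw_parameters(n, families, rng):
    """Draw beta_j, w_j and c^k_i from the intervals given in the text."""
    K, m = len(families), len(families[0])
    beta = []
    for j in range(n):
        b_low = min_bound_sum(j, families)
        beta.append(rng.uniform(0.5 * b_low, 1.5 * b_low))  # uniform is an assumption
    w = [rng.uniform(0.0, 10.0) for _ in range(n)]
    c = [[rng.uniform(-200.0, 200.0) for _ in range(m)] for _ in range(K)]
    return beta, w, c


# Toy data: n = 2 variables, K = 2 disjunctions, m = 2 alternatives each.
families = [
    [[({0, 1}, 4.0), ({0}, 2.0)], [({0, 1}, 3.0)]],
    [[({1}, 5.0)], [({0, 1}, 6.0)]],
]
beta, w, c = draw_parameters(2, families, random.Random(0))
```

Passing a seeded `random.Random` instance keeps a generated instance reproducible, which is convenient when several methods are to be compared on the same data.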

7.3.1 Results on small and medium size instances

In this section we summarize our computational results on those instances where \(30 \le n \le 70\). We ran all four methods on each problem instance in this class. Table 1 shows for each n average values over the 10 instances of the group. For each method and group of instances, we provide the average optimality gap (gap), the average running time in seconds (time), the average number of search tree nodes (nodes), the average number of cuts (cuts), and the average root gap (root gap). The optimality gap of a method on a problem instance is computed as \((ub - lb)/ub\), where ub and lb are the best upper and lower bounds obtained by the method, respectively. A dash ’-’ in the column ’gap’ indicates that the optimum was found for all instances in the group. In case of a positive average gap, we provide in parentheses the number of instances solved optimally among the 10 instances in the group. The content of the column cuts depends on the method. For the D-cuts method, it contains the average number of cuts separated; for B&B it is a ’-’ throughout, since no cuts are generated at all. In the case of the XPRS-cuts method, only the built-in cuts of the solver are used, but we have no data available for their number (n.a.), whereas when using Balas’ reformulation, no built-in cuts were generated in most cases, except for one instance for \(n=70\). The root gap is calculated by the formula \((ub - root\_lb)/ub\), where ub is the best upper bound, and \(root\_lb\) is the lower bound after processing the root node, including the separation of cuts when it applies.
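The two gap measures and the convention for the ’gap’ column can be written down in a few lines. The sketch below is illustrative only: the function names, the tolerance used to decide that an instance was solved to optimality, and the number of printed decimals are assumptions. The formulas are taken verbatim from the text and require \(ub \ne 0\).

```python
def optimality_gap(ub, lb):
    """Relative gap (ub - lb) / ub between the best upper and lower bounds."""
    return (ub - lb) / ub


def root_gap(ub, root_lb):
    """Relative gap (ub - root_lb) / ub using the lower bound after the root node."""
    return (ub - root_lb) / ub


def gap_cell(gaps, tol=1e-9):
    """Format the 'gap' entry for one group of instances.

    Returns '-' if every instance was solved to optimality, otherwise the
    average gap followed by the number of optimally solved instances.
    The tolerance and the output format are assumptions.
    """
    solved = sum(1 for g in gaps if abs(g) <= tol)
    if solved == len(gaps):
        return "-"
    average = sum(gaps) / len(gaps)
    return f"{average:.4f} ({solved})"


# Made-up bounds for illustration.
g = optimality_gap(200.0, 150.0)
cell_all_solved = gap_cell([0.0, 0.0])
cell_partial = gap_cell([0.0, 0.5])
```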

In all the approaches, the node LPs were presolved by the XPRESS solver, which reduced the computation times significantly. The root LPs were solved with default settings in all cases, which means that the solver applied the dual simplex using one thread, and the parallel barrier solver using 8 threads. After solving the root LP, the D-cuts method strengthened it by at most 20 rounds of cut generation, which led to the optimal solution in most cases. This can be seen from the low average number of search tree nodes. We can observe that as n increases, the average computation time and the average number of cuts increase as well, but the average number of search tree nodes remains low. In contrast, without cut generation, the B&B method found the optimum only for small values of n. As n increases, the number of instances solved optimally decreases, while the average optimality gap increases. This is due to the run-time limit of 240 s, since as n increases, the size of the MIP formulation increases as well. By the same token, the average solution time increases, and the number of search tree nodes visited decreases. The XPRS-cuts method clearly outperforms the B&B method, but performs worse than D-cuts, especially for larger values of n.

Balas’ extended formulation has a stable behavior. Note that the resulting MIP has many more variables and constraints than the big-M formulation. The reason is that the \(x^k\) vector variables are copied as many times as the number of alternatives in the disjunctive constraints, cf. (2). In practice, this means that if \(x^k\in \mathbb {R}^n\), then there are n copies of \(x^k\), since we have \(m = n\) alternatives in every disjunctive constraint in all the problem instances. The number of constraints is multiplied analogously. We also observed that presolve could not reduce the size of the resulting MIPs. However, solving the root LPs using 9 CPU threads helped a lot in reducing the computation times. As it turned out, the barrier solver implementation of XPRESS exploited parallel processing to a great extent. Without this feature, Balas’ extended formulation would not be a competitive method for solving the problem at hand. In most cases, the solution of the root LP and presolve found an optimal solution, and the built-in cuts were separated only for one problem instance with \(n=70\).
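To give a feeling for the growth in size, the following sketch counts variables only. It assumes that the big-M formulation (46) has the variables \(x^k_j\), \(\lambda ^k_i\) and \(y_j\), and that the extended formulation keeps these and adds one copy of \(x^k\) per alternative, as in (2). The exact model built by the authors may differ, and the function name is hypothetical.

```python
def variable_counts(n, m, K):
    """Rough variable counts for the big-M and the extended formulation."""
    big_m = K * n + K * m + n      # x^k_j, lambda^k_i, y_j
    copies = K * m * n             # one copy of x^k per alternative i, cf. (2)
    return {"big_m": big_m, "copies": copies, "extended": big_m + copies}


# The test instances use m = K = n.
small = variable_counts(30, 30, 30)
medium = variable_counts(70, 70, 70)
```

Under these assumptions the number of copied variables grows as \(n^3\) when \(m = K = n\), while the big-M formulation grows only as \(n^2\).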

Based on the results, we can conclude that for \(n \ge 40\), the fastest method is D-cuts, and the second place is shared between XPRS-cuts and Balas’ extended formulation. Note though that with Balas’ extended formulation the optimum was found in the root node in most cases, and it had a more stable behavior than XPRS-cuts. The longer running time with Balas’ extended formulation is primarily due to the significantly larger problem sizes the solver must handle with this approach. However, parallel processing also plays a crucial role in efficiently solving the root LPs, particularly when using Balas’ extended formulation.

Table 1 Results on small and medium size instances

7.3.2 Results on large instances

We also tested our methods on instances with \(n \in \{100, 130\}\) jobs. We ran four methods on each problem instance: D-cuts, XPRS-cuts, Balas’ reformulation, and a combination of D-cuts and XPRS-cuts, which we call “D-cuts + XPRS-cuts”. In this method, we configured the XPRESS solver to use built-in cuts as well as our separation procedure for D-cuts. Our separation subroutine was called by the solver after generating built-in cuts. In the root node of the search tree, D-cuts were separated in at most 10 rounds, and in one round in all other nodes. We have also tested B&B, but it was always inferior to all other methods, so we do not provide detailed results for that method. We set the run-time limit to 1200 s.

Table 2 presents the results, highlighting that the “D-cuts + XPRS-cuts” method is the fastest on average, followed closely by D-cuts. Both of these methods generate only a small number of search tree nodes. In contrast, XPRS-cuts is slower on average, while Balas’ reformulation, though the slowest, generates the fewest search tree nodes.

Figure 8 illustrates the average run time and variance across the 10 instances for \(n \in \{100,130\}\). Notably, D-cuts and Balas’ reformulation exhibit the lowest run time variance across both instance groups, while XPRS-cuts shows the highest variance.

We conclude that on the large instances, the best method is “D-cuts + XPRS-cuts”, and D-cuts is the second best.

Table 2 Results on large instances
Fig. 8 Average run time and variance

8 Final remarks

In this paper, we have described polynomial time exact separation algorithms for disjunctive constraints with a network-flow representation. We have also identified the disjunctive constraints that can be represented in this manner. Our computational experiments on a set of benchmark problems demonstrate the superiority of our approach. The results indicate that the new cuts can significantly reduce the computational time of a branch-and-cut procedure, outperforming general cutting planes applied to the same formulation.

Balas’ reformulation shows a solid performance. In most cases, the optimum of a test problem was found already in the root node without cutting and branching (with the help of presolve and heuristics). In fact, the LP relaxation of Balas’ reformulation of (45) is tighter than that of (46) without adding our cutting planes. Nevertheless, when a disjunctive constraint can be represented by a network flow, our modeling approach and separation procedures constitute a viable alternative to Balas’ extended formulation, as our preliminary computational results demonstrate.

A further approach for solving MILPs with disjunctive constraints is Dantzig–Wolfe reformulation with subsequent column generation. For instance, Sadykov and Vanderbeck (2013) applied branch-and-price for solving bin packing problems with conflicts, while Almathkour et al. (2024) proposed branch-and-cut-and-price for the 2-connected subgraph problem with disjunctive constraints.

Our methodology can be applied to any disjunctive constraint with a network-flow representation, facilitated by our general separation procedure that only requires a weighted network as specified in Sect. 2.

In the future, we plan to apply our methodology to specific problems that include network-flow representable disjunctive constraints.