On cutting planes for cardinality-constrained linear programs

Kim, Jinhak; Tawarmalani, Mohit; Richard, Jean-Philippe P.

doi:10.1007/s10107-018-1306-0

On cutting planes for cardinality-constrained linear programs

Full Length Paper
Series A
Published: 06 June 2018

Volume 178, pages 417–448, (2019)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Mathematical Programming Submit manuscript

On cutting planes for cardinality-constrained linear programs

Download PDF

Jinhak Kim¹,
Mohit Tawarmalani ORCID: orcid.org/0000-0003-3085-0084² &
Jean-Philippe P. Richard³

1101 Accesses
3 Citations
Explore all metrics

Abstract

We derive cutting planes for cardinality-constrained linear programs. These inequalities can be used to separate any basic feasible solution of an LP relaxation of the problem, assuming that this solution violates the cardinality requirement. To derive them, we first relax the given simplex tableau into a disjunctive set, expressed in the space of nonbasic variables. We establish that coefficients of valid inequalities for the closed convex hull of this set obey ratios that can be computed directly from the simplex tableau. We show that a transportation problem can be used to separate these inequalities. We then give a constructive procedure to generate violated facet-defining inequalities for the closed convex hull of the disjunctive set using a variant of Prim’s algorithm.

Cutting planes from extended LP formulations

Article 31 March 2016

A new family of facet defining inequalities for the maximum edge-weighted clique problem

Article 22 June 2016

Intersection Cuts for Polynomial Optimization

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

A cardinality-constrained optimization problem (CCOP) is an optimization problem with a constraint requiring that, in feasible solutions, the number of variables taking nonzero values does not exceed a given threshold.

Cardinality constraints appear in a large number of important applications in fields as diverse as computational finance, supply chain management, statistical data analysis, and machine learning. They are used in cardinality-constrained optimal portfolio selection problems in quantitative finance [11, 13, 17, 19, 33, 38,39,40]. These problems adapt the Markowitz mean–variance model where the objective is to minimize a quadratic risk measure under linear constraints along with a restriction that the number of securities chosen for investment is sufficiently small. Another application is in index tracking investment strategies [8, 23, 31, 32, 46, 48]. These problems are modeled as time series optimization models where the objective is to minimize a quadratic tracking error under budget constraints and a restriction that the number of securities selected for investment is small. Facility location problems are classical supply chain management models where a company must decide where to locate facilities. The variant of the problem where at most p warehouses can be opened is known as the p-median problem, and has been extensively studied in the literature [1, 7, 16, 20, 30, 35, 41]. In statistical data analysis, principal component analysis (PCA) is a well-known technique for dimension reduction. It finds principal components as linear combinations of the original variables. When the coefficients of many variables in these linear combinations are nonzero, the principal components can be difficult to interpret. In order to find principal components that are easier to explain, a cardinality constraint (referred to as a sparsity constraint) is sometimes imposed on the original problem. The resulting problem is known as sparse principal component analysis (sparse PCA); see [18, 27, 34, 54]. Ensemble pruning [52] and variable selection in multiple regression [9, 10] are also often modeled as CCOPs.

For a given vector x, we define the cardinality of x, denoted as $\mathop {\mathrm{card}}(x)$, as the number of its components that are nonzero. In this paper, we focus on CCOPs, where the optimization problem is linear and refer to them as cardinality-constrained linear programs (CCLPs). A CCLP can be formulated as

$$\begin{aligned} \max \bigl \{c^{\intercal }x + d^{\intercal }y\bigm | Ax+By\le b, x\ge 0, y\ge 0, \text {card}(x)\le K\bigr \} \end{aligned}$$

where c, $x\in \mathbb {Re}^p$, d, $y\in \mathbb {Re}^q$, $b\in \mathbb {Re}^m$, $A \in \mathbb {Re}^{m\times p}$, $B \in \mathbb {Re}^{m\times q}$, and K is a fixed positive integer with $K<p$. The above model, which we study in this paper, contains a single cardinality constraint. However, since it is a relaxation of problems with multiple constraints, it can be used to derive cuts for such problems.

Although CCOPs find uses in a variety of applications, they are hard to solve to global optimality. Perhaps the simplest of these problems, which involves optimizing a linear function over the intersection of a continuous knapsack polytope and a cardinality constraint, is already NP-hard [22]. Further, large instances of practical problems are computationally challenging to solve [11, 22, 40].

Various strategies have been proposed to model cardinality constraints, and to leverage classical mixed integer programming (MIP) branch-and-cut methodologies in the solution of cardinality-constrained problems. When variables x are bounded, auxiliary binary variables $z\in \{0,1\}^p$ can be introduced to model the cardinality constraint. The bound constraints on variables, say $0\le x\le 1$, are then replaced with $0\le x\le z$ while the cardinality constraint $\mathop {\mathrm{card}}(x)\le K$ is replaced with $\mathbb {1}^{\intercal }z \le K$, where $\mathbb {1}$ is a vector of dimension p whose entries are all equal to one. When all constraints of the initial problem are linear, such an approach allows the use of branch-and-cut or greedy search algorithms developed for MIPs. This reformulation also allows for the use of cutting planes derived for cardinality-constrained problems; see [50, 51].

In [6] a specialized branch-and-bound algorithm was proposed to solve problems with cardinality constraints where $K\in \{1,2\}$. These techniques were adapted to logically constrained linear programs [37], mixed integer quadratic programs [11], and to cardinality-constrained knapsack problems (CCKPs) [22]. Moreover, [22] develops valid inequalities for CCKPs that can be used for CCLPs.

In this paper, we follow a similar line of research, as we conduct a polyhedral study of CCLPs in the space of original problem variables. In particular, we use information contained in feasible simplex tableaux of LP relaxations of CCLPs to construct strong valid inequalities. Our underlying motivation is that working in the initial space of variables allows us to derive inequalities without the assumption that cardinality-constrained variables are bounded, an assumption that is required for MIP formulations. Further, avoiding the introduction of unnecessary indicator variables will help maintain the original problem structure, and might lead to streamlined solution approaches for these problems. Other advantages of not introducing binary variable are discussed in [21]. Finally, our cuts can be generated in a low-order polynomial time and can augment the cutting plane strategies adopted in commercial MIP solvers.

Although we are not aware of previous studies of tableau-based cuts for CCOPs, such inequalities have been proposed in the literature in the context of MIPs, quadratic programming, concave programming, and linear complementarity problems [2, 5, 24, 25, 29, 44, 49].

The remainder of this paper is organized as follows. In Sect. 3, we show that violated cuts for CCLPs can be generated from a disjunctive relaxation of any simplex tableau corresponding to a basic feasible solution violating the cardinality requirement. In Sect. 4, we provide a closed form representation of the convex hull of the relaxation. In Sect. 5, we transform the convex hull into the dual of a transportation problem through a nonlinear map. In Sect. 6, we prove that nontrivial facets of the closed convex hull of the disjunctive set correspond to label-connected spanning trees of the bipartite network associated with the transportation problem. This result yields, in Sect. 7, an explicit polynomial-time constructive procedure for the derivation of nontrivial facet-defining inequalities. We give concluding remarks in Sect. 8.

2 Overview of results

This paper studies the closure convex hull of a disjunctive subset Q of $\mathbb {Re}^n$ consisting of $K+1$ disjuncts, each of which excludes the origin and is defined as the intersection of a single half-space and the nonnegative orthant. This disjunctive set appears as a natural relaxation of simplex tableaux of CCLPs. In this context, identifying a facet-defining inequality of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ that separates Q from the origin is relevant. We characterize the extreme points and extreme rays of each disjunct in Proposition 5. Then, by taking their union, we obtain in Corollary 2 an inner representation for the closure convex hull of the disjunctive set using points and rays. By homogenizing the set ($Q^0$), we express the valid inequalities as those whose normal vector belongs to the intersection of the normal cone of the extreme rays. Then, in Theorem 1, we relate the coefficients of the valid inequalities to extreme points of a polyhedron whose explicit form is readily derived from the defining inequalities of the disjuncts. This polyhedron, which has at most two non-zero coefficients per constraint, can be converted into the dual of a transportation polyhedron via a logarithmic transformation. The face-lattice of the transformed polyhedron is isomorphic to that of the original polyhedron.

Using this formulation, we show in Theorem 2 that the separation problem reduces to a transportation problem. Assume the homogenized nontrivial inequalities describing each disjunct are such that each requires a specific linear function to be non-negative. The variables are then partitioned into two sets ${\mathcal {I}}_+$ and ${\mathcal {I}}_-$, where ${\mathcal {I}}_-$ consists of variables whose coefficients are negative in all the defining inequalities of the disjuncts and ${\mathcal {I}}_+$ consists of the remaining variables. The transportation problem is then defined over the bipartite graph $({\mathcal {I}}_+,{\mathcal {I}}_-)$. The constraint set controls the ratio of coefficients of variables, one from each set. In particular, the absolute ratio of the coefficients of variables with negative coefficients must be no more than the minimum such ratio $w_{ij}, i\in {\mathcal {I}}_+, j\in {\mathcal {I}}_-$ in an inequality where they appear with opposing signs. The cost on edge $(i,j)\in {\mathcal {I}}_+\times {\mathcal {I}}_-$ is set to be $\log (w_{ij})$ and the supply and the demand vectors are chosen so that they are positive and their sums balance. The transportation problem identifies the ratios that guarantee that the cut is valid while ensuring that multiplicative factors determining the coefficients of the variables are small; see (15). Since the common disjunctive cut, which we refer to as c-max cut, corresponds to a feasible solution to the transportation problem but may not be one of its extreme point, the cuts we generate are at least as tight.

We then use the structure of the transportation problem to develop a deeper understanding of these inequalities. In particular, if we set the homogenizing variable (corresponding to the right-hand-side) at some value, the coefficients of the remaining variables are determined using ratios from the tree-structure that corresponds to an optimal extreme point. Moreover, we show in Theorem 3 that the ratios arising from a particular disjunct can be arranged to form a connected component in the tree. An inequality is not a facet-defining inequality when it does not have sufficiently many tight edges to form a tree. In Proposition 16, we develop an algorithm that expresses any separating valid inequality that is not facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ as either a conic combination of two valid inequalities or the combination of a valid inequality and a trivial increase of a coefficient. Finally, in Theorem 4 we use this result to develop a Prim-type algorithm that, given a valid inequality, adds sufficiently many ratios to form a connected tree and thus produce a facet-defining inequality. This algorithm yields a characterization of when the c-max cut is facet-defining, and also provides a way to tighten it when it is not. This characterization was not even known for complementarity problems, which are a special case of the cardinality problems we treat here.

3 Disjunctive relaxation of a cardinality-constrained simplex tableau

Given an LP relaxation of a CCLP, a basic feasible solution that violates the cardinality constraint, $\mathop {\mathrm{card}}(x)\le K$, and the associated simplex tableau, we discuss how we can obtain an inequality valid for the CCLP that cuts off this solution. Denoting the basic variables in this tableau by v (indexed by set $\mathbf{{\mathcal {M}}}$), and the nonbasic variables by t (indexed by set $\mathbf{{\mathcal {N}}}$), we write the simplex tableau as

$$\begin{aligned} \begin{array}{ll} v_l = v_{l}^*-\sum _{i \in \mathbf{{\mathcal {N}}}} f_{li}t_i, \forall l\in \mathbf{{\mathcal {M}}}; \qquad v_l \ge 0, \forall l \in \mathbf{{\mathcal {M}}}; \qquad t_i \ge 0, \forall i \in \mathbf{{\mathcal {N}}}; \end{array} \end{aligned}$$

(1)

where $v_{l}^* \ge 0$ for $l \in \mathbf{{\mathcal {M}}}$. Since we have assumed that the current basic solution $(v,t)=(v^*,0)$ does not satisfy the cardinality constraint, there exists a subset ${\mathcal {L}}\subseteq \mathbf{{\mathcal {M}}}$ of basic variables such that (i) $|{\mathcal {L}}|=K+1$, (ii) variables $v_l$ for $l \in {\mathcal {L}}$ appear in the cardinality constraint, and (iii) $v_l^*>0$ for $l \in {\mathcal {L}}$. We construct the desired disjunctive relaxation ${\bar{Q}}$ by (i) relaxing the cardinality constraint $\mathop {\mathrm{card}}(x)\le K$ into the disjunction $\bigvee _{l\in {\mathcal {L}}} (v_l\le 0)$, which forces one of the $K+1$ variables in ${\mathcal {L}}$ to be nonpositive, (ii) removing the nonnegativity requirements on basic variables, and (iii) omitting tableau constraints associated with basic variables $v_{\mathbf{{\mathcal {M}}}\backslash {\mathcal {L}}}$:

$$\begin{aligned} {\bar{Q}} := \bigl \{(v,t) \in \mathbb {Re}^{|{\mathcal {L}}|}\times \mathbb {Re}^{|\mathbf{{\mathcal {N}}}|}_+\bigm | \begin{array}{ll} v_l = v_{l}^*-\sum _{i \in \mathbf{{\mathcal {N}}}} f_{li}t_i, \forall l\in {\mathcal {L}};\; \bigvee _{l\in {\mathcal {L}}} (v_l \le 0) \end{array}\bigr \},\nonumber \\ \end{aligned}$$

(2)

where each equality corresponds to a basic variable in ${\mathcal {L}}$, and represents it as an affine function of the nonbasic variables. If a nonbasic variable is a slack variable for a constraint in $Ax+By\le b$, then an inequality valid for ${\bar{Q}}$ can be written in the space of original problem variables using the defining inequality for the slack variable. The relaxation steps applied to the initial simplex tableau in order to obtain ${\bar{Q}}$ resemble those made to obtain the corner relaxation of an MIP; see [26].

Remark 1

Our procedure applies more generally to the setup where we separate an extreme point of a polyhedron from $K+1$ disjuncts not containing the point, each of which is defined by a single inequality. Separation is possible because the extreme point cannot be expressed as a convex combination of points feasible in the disjuncts. The disjuncts in (2) are not always parallel. We will show that the c-max cut, described in (3), does not suffice to yield the closed convex hull of ${\bar{Q}}$, unlike the case of 0–1 disjunction, where [4] showed that similar disjunctive cuts are sufficient. This lack of correspondence was also observed in [36] while deriving a procedure that identifies disjunctive cuts.

Since ${\bar{Q}}$ is a finite union of polyhedra, $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}({\bar{Q}})$ is a polyhedron; see for instance Theorem 19.6 in [45].

Proposition 1

The set $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}({\bar{Q}})$ is a polyhedron. $\square $

The convex hull of ${\bar{Q}}$ is not necessarily closed, as the following example shows:

$$\begin{aligned} {\bar{Q}} := \bigl \{(v,t) \in \mathbb {Re}^2\times \mathbb {Re}^2_+\,\bigm |\, \begin{array}{ll} v_1 = 2 - t_1 + 2t_2, v_2 = 3 - t_1 + t_2, (v_1 \le 0)\vee (v_2 \le 0) \end{array}\bigr \}. \end{aligned}$$

Hence, we will characterize the closure convex hull of ${\bar{Q}}$. Since $(v_1,v_2,t_1,t_2) = (0,0,1,1)$ is a recession direction for the disjunct $v_2\le 0$ but not for $v_1\le 0$, ${\bar{Q}}$ is not MIP representable. Lack of upper bounds on t variables is a key reason why ${\bar{Q}}$ is not MIP representable and its convex hull is not closed. Therefore, MIP cuts and those in [22] do not directly apply to our setting.

A linear inequality is valid for ${\bar{Q}}$ if and only if it is valid for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}({\bar{Q}})$. We characterize these inequalities by studying $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ where Q is the projection of ${\bar{Q}}$ onto the space of nonbasic variables t. Formally, for each $l\in {\mathcal {L}}$, define $Q_l:= \big \{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}}|}_+ \bigm | \sum _{i \in \mathbf{{\mathcal {N}}}} f_{li}t_i\ge v_{l}^*\bigr \}$ and set $Q:=\bigcup _{l\in {\mathcal {L}}} Q_l$. Without loss of generality, we assume that $\mathbf{{\mathcal {N}}}=\{1,\ldots ,n\}$. Let $h^*(t)=\begin{pmatrix}v^*\\ 0\end{pmatrix}+\begin{pmatrix}-F\\ I\end{pmatrix}t$, where the entry (l, i) of matrix F is $f_{li}$, as used in the definition of ${\bar{Q}}$ in (2). It is clear that $h^*(\cdot )$ is an affine map, and that ${\bar{Q}}=h^*(Q)$.

Proposition 2

For $i=1,\dots ,p$, let $P_i\in \mathbb {Re}^n$ be nonempty polyhedra. Also let $h:\mathbb {Re}^n\rightarrow \mathbb {Re}^m$ be an affine map. Then $\textstyle \mathop {\mathrm{cl}}\mathop {\mathrm{conv}}\left( \bigcup _{i=1}^p h(P_i)\right) =\textstyle h\left( \mathop {\mathrm{cl}}\mathop {\mathrm{conv}}\left( \bigcup _{i=1}^p P_i\right) \right) $. Consequently, $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}({\bar{Q}})= h^*\left( \mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)\right) $. $\square $

In the remainder of this paper, we restrict our attention to the study of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ since Proposition 2 shows that this is sufficient to characterize $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}({\bar{Q}})$.

Since $v_{l}^*>0$ for $l \in {\mathcal {L}}$, we may scale each constraint so that $v_{l}^*=1$. That is, for $l \in {\mathcal {L}}$, $Q_l= \left\{ t\mid \sum _{i \in \mathbf{{\mathcal {N}}}} f_{li} t_i \ge 1,\, t_i\ge 0, \forall i \in \mathbf{{\mathcal {N}}}\right\} $. For each $l\in {\mathcal {L}}$, define

$$\begin{aligned} {\mathcal {I}}_+^l=\{ i\in \mathbf{{\mathcal {N}}}\mid f_{li}>0\},\quad {\mathcal {I}}_-^l=\{ i\in \mathbf{{\mathcal {N}}}\mid f_{li}<0\},\quad {\mathcal {I}}_0^l=\{ i\in \mathbf{{\mathcal {N}}}\mid f_{li}=0\}. \end{aligned}$$

Throughout the paper, we assume without loss of generality that $Q_l\ne \emptyset $ for each $l\in {\mathcal {L}}$. In fact, if $Q_l=\emptyset $ for some $l\in {\mathcal {L}}$, then we can simply drop the corresponding set from the disjunction. Clearly, $Q_l=\emptyset $ if and only if $f_{li}\le 0$ for all $i\in \mathbf{{\mathcal {N}}}$. We therefore make the following assumption in the rest of the paper.

Assumption 1

For each $l\in {\mathcal {L}}$, ${\mathcal {I}}_+^l\ne \emptyset $.

Proposition 3

Polyhedron $Q_l$ is full-dimensional for $l\in {\mathcal {L}}$. Further, $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ is full-dimensional.

Proof

By Assumption 1, ${\mathcal {I}}_+^l\ne \emptyset $. Choose $i\in {\mathcal {I}}_+^l$ and consider the point $\left( \frac{1}{f_{li}}+1\right) e_i+\sum _{k\in \mathbf{{\mathcal {N}}}{\setminus }\{i\}} \epsilon e_k$, where $\epsilon $ is positive but sufficiently small and $e_i$ is the ith unit vector. This point is in the interior of $Q_l$ because it satisfies all the constraints in $Q_l$ with strict inequalities. Therefore, $Q_l$ is full-dimensional because a small enough ball centered at this point lies entirely in $Q_l$; see Chapter 8 of [47], for example. Further, since $Q_l\subseteq Q$, then $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ is also full-dimensional. $\square $

We next argue that there are valid inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ that can be used to separate the basic feasible solution associated with the initial simplex tableau (1), if this solution violates the cardinality requirement. For instance, consider

$$\begin{aligned} \sum _{i \in \mathbf{{\mathcal {N}}}} (\mathop {\hbox {c-max}})_i t_i \ge 1 \end{aligned}$$

(3)

where $(\mathop {\hbox {c-max}})_i=\max \{f_{li}\mid l\in {\mathcal {L}}\}$ for $i \in \mathbf{{\mathcal {N}}}$. This inequality, which we refer to hereafter as c-max cut was introduced in [29] for complementarity problems. Complementarity problems are special instances of cardinality problems requiring that at most one of two variables takes a nonzero value. The c-max cut is valid for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ because $\sum _{i\in \mathbf{{\mathcal {N}}}} (\mathop {\hbox {c-max}})_i t_i \ge \sum _{i\in \mathbf{{\mathcal {N}}}} f_{li} t_i \ge 1$ for all $t\in Q_l$ and $l\in {\mathcal {L}}$, i.e., it is valid for each disjunct $Q_l$. Moreover, it separates the closed convex hull from $t=0$ because this point violates (3). For the particular case where $|{\mathcal {L}}|=2$, [42] observed that the c-max cut is not always facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$. In this paper, we provide a complete description of the nontrivial facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$, each of which cuts off the current basic feasible solution of (1), and we precisely characterize when the c-max cut is strong.

4 A characterization of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$

In this section, we provide a characterization of the facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$. Recall that Minkowski–Weyl’s theorem, see Theorem 7.13 in [28] for instance, establishes that a polyhedron can be represented in two forms, either using its vertices and extreme rays or as a finite intersection of half-spaces. Following [53], we refer to the former as a ${\mathcal {V}}$-polyhedron, and to the latter as a ${\mathcal {H}}$-polyhedron.

Instead of Q, for notational convenience, we study its homogenization and show that this change is without loss of generality; see Proposition 6. Let $\mathbf{{\mathcal {N}}_0}:=\mathbf{{\mathcal {N}}}\cup \{0\}$. Let $Q_l^0$ be the homogenization of $Q_l$ obtained as $Q_l^0:=\bigl \{t:=(t_1,\dots ,t_n,t_0)\in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}_+\bigm | \sum _{i \in \mathbf{{\mathcal {N}}}} f_{li}t_i\ge t_0\bigr \}$. After defining $f_{l0}:=-1$ and $f_l:=(f_{l1},\dots ,f_{ln},f_{l0})^{\intercal }$, we can rewrite $Q_l^0=\bigl \{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid f_l^{\intercal }t\ge 0,\,\, t\ge 0\bigr \}$. We refer to $f_l^{\intercal }t \ge 0$ as the nontrivial constraint of disjunct l. It is clear that $Q_l^0$ is a polyhedral cone. Referring to $\bigcup _{l\in {\mathcal {L}}}Q_l^0$ as $Q^0$, it is clear that $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ is a cone. We relate these cones to the sets we originally introduced. For a nonempty convex set C, we let $K(C):=\{\lambda (d,1)\mid d\in C, \lambda >0\}$.

Proposition 4

It holds that $Q_l^0=\mathop {\mathrm{cl}}(K(Q_l))$ and $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)=\mathop {\mathrm{cl}}(K(\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)))$.

$\square $

Propositions 3 and 4 directly yield

Corollary 1

Polyhedron $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ is full-dimensional. $\square $

In Proposition 5, we present ${\mathcal {V}}$-polyhedron representations of $Q_l^0$ and $Q_l$. We then obtain similar representations for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ and $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ in Corollary 2.

Proposition 5

It holds that $Q_l^0=\mathop {\mathrm{cone}}(R_l^0)$ and $Q_l=\mathop {\mathrm{conv}}\left( V_l\right) +\mathop {\mathrm{cone}}(R_l)$, where $R_l^0=\{f_{li}e_j-f_{lj}e_i\in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid i\in {\mathcal {I}}_+^l, j\in {\mathcal {I}}_-^l\cup \{0\}\}\cup \{e_k\in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid k\in {\mathcal {I}}_+^l\cup {\mathcal {I}}_0^l\}$, $V_l=\bigl \{\frac{1}{f_{li}}e_i\in \mathbb {Re}^{|\mathbf{{\mathcal {N}}}|}\mid i\in {\mathcal {I}}_+^l\bigr \}$, and $R_l=\{f_{li}e_j-f_{lj}e_i\in \mathbb {Re}^{|\mathbf{{\mathcal {N}}}|}\mid i\in {\mathcal {I}}_+^l, j\in {\mathcal {I}}_-^l\}\cup \{e_k\in \mathbb {Re}^{|\mathbf{{\mathcal {N}}}|}\mid k\in {\mathcal {I}}_+^l\cup {\mathcal {I}}_0^l\}$. Furthermore, each point and ray used in the representation is extremal.

Proof

We first show that $Q_l^0=\mathop {\mathrm{cone}}(R_l^0)$. Since $Q_l^0$ is a cone in the nonnegative orthant, it is pointed. This implies that all the points in the cone can be written as a conic combination of its extreme rays. Let r be a ray of $Q_l^0$. Then, r is extreme if and only if it belongs to the intersection of $n=|\mathbf{{\mathcal {N}}_0}|-1$ independent hyperplanes among $\{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid f_l^{\intercal }t=0\}$ and $\{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid t_k=0\}$, for $k \in \mathbf{{\mathcal {N}}_0}$. First, for each $i\in \mathbf{{\mathcal {N}}_0}$, suppose that these n hyperplanes are $\{t\mid t_k=0\}$ for $k\ne i$. Then $r_k=0$ for all $k\ne i$ and hence $r=\rho e_i$ with $\rho > 0$. In order to be a ray, this vector must satisfy $f_l^{\intercal }r\ge 0$, i.e., i must be chosen in ${\mathcal {I}}_+^l\cup {\mathcal {I}}_0^l$ . Next, suppose that these n hyperplanes are $\{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid f_l^{\intercal }t=0\}$ and $\{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid t_k=0\}$ for $k\ne i, j$ for some $i, j\in \mathbf{{\mathcal {N}}_0}$. Then the face defined by the intersection of these hyperplanes is $F:=\{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid f_{li}t_i+f_{lj}t_j=0, t\ge 0, t_k=0, \forall k\ne i, j\}$. In order for r to be a ray, $F\ne \{0\}$ and hence $f_{li}f_{lj}\le 0$. By independence, $f_{li}\ne 0$ or $f_{lj}\ne 0$. If $f_{li}=0$ or $f_{lj}=0$ then we have that $r=e_k$ for some $k\in {\mathcal {I}}_0^l$. Now assume that $f_{li}f_{lj}<0$. Without loss of generality, assume that $f_{li}>0$ and $f_{lj}<0$. Then, $r=f_{li}e_j-f_{lj}e_i$ where $i\in {\mathcal {I}}_+^l$ and $j\in {\mathcal {I}}_-^l\cup \{0\}$. We conclude that $R_l^0$ is precisely the collection of extreme rays of $Q_l^0$, and therefore $Q_l^0= \mathop {\mathrm{cone}}(R_l^0)$.

It follows directly from Proposition 4 and Lemma 5.41 in [28] that $Q_l=\mathop {\mathrm{conv}}\left( V_l\right) +\mathop {\mathrm{cone}}(R_l)$. Extremality follows from the extremality of rays in $R_l^0$. $\square $

The result of Proposition 5 yields a ${\mathcal {V}}$-polyhedron representation for the closed convex hull of the union of the associated disjuncts.

Corollary 2

It holds that $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)=\mathop {\mathrm{cone}}(R^0)$ and $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)= \mathop {\mathrm{conv}}(V)+\mathop {\mathrm{cone}}(R)$, where $R^0 := \bigcup _{l\in {\mathcal {L}}} R_l^0, V:=\bigcup _{l\in {\mathcal {L}}} V_l$, and $R:=\bigcup _{l\in {\mathcal {L}}} R_l$.

The coefficient vectors $\beta \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}}|}$ and $\beta ' \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}$ that give rise to strong valid inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ and $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, respectively, are closely related. There is a straightforward one-to-one correspondence between valid inequalities for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ and $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. The following result shows further that characterizing the facets of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ is equivalent to characterizing the facets of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$.

Proposition 6

Inequality

$$\begin{aligned} \sum _{i \in \mathbf{{\mathcal {N}}}} \beta _i t_i\ge \gamma \end{aligned}$$

(4)

is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ if and only if inequality

$$\begin{aligned} \sum _{i \in \mathbf{{\mathcal {N}}}} \beta _i t_i\ge \gamma t_0 \end{aligned}$$

(5)

is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ and is not a scalar multiple of $t_0 \ge 0$.

Proof

The fact that validity is preserved is clear. Suppose now that (4) is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$. Then there exist $n=|\mathbf{{\mathcal {N}}}|$ affinely independent points $w^1,\dots , w^n$ of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ that satisfy (4) at equality. Points $(w^j,1) $ belong to $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ for all $j \in \mathbf{{\mathcal {N}}}$ and satisfy (5) at equality. Since $\{w^j\mid j \in \mathbf{{\mathcal {N}}}\}$ are affinely independent, $\{(w^j,1)\mid j \in \mathbf{{\mathcal {N}}}\}$ are linearly independent. This proves that (5) is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Clearly, (5) is not $t_0 \ge 0$ as otherwise (4) would be $0^{\intercal }t \ge -1$, which is not facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$.

Conversely, suppose that (5) is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Since $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ is a full-dimensional polyhedral cone, there exist n linearly independent extreme rays $(r^j, r^j_0)$ of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ that satisfy (5) at equality. Suppose $r^j_0=0$ for all $j \in \mathbf{{\mathcal {N}}}$. Observe that $\{r^j \mid j \in \mathbf{{\mathcal {N}}}\}$ are linearly independent and $\beta ^{\intercal }r^j=0$ for all $j \in \mathbf{{\mathcal {N}}}$. This shows that $\beta =0$. However, this is not possible as (5) would then correspond to the face of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ induced by $t_0 \ge 0$. Therefore, there must exist $j \in \mathbf{{\mathcal {N}}}$ such that $r^j_0\ne 0$. Define $I_1=\{{j^\prime } \in \mathbf{{\mathcal {N}}}\mid r^{j^\prime }_0\ne 0\}(\ne \emptyset )$ and $I_2=\{{j^\prime } \in \mathbf{{\mathcal {N}}}\mid r^{j^\prime }_0=0\}$. Then, for $j\in I_1$, $\beta ^{\intercal }\frac{r^j}{r^j_0} = \frac{1}{r^j_0} \beta ^{\intercal }r^j = \gamma $. Further, for $k\in I_2$, $\beta ^{\intercal }r^k = 0$. Fix $j_0\in I_1$, and consider the sets of points

It is clear that these points satisfy (4) at equality and that they belong to $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$ by Proposition 4. It remains to prove that they are affinely independent, which can be established easily because the linear independence of vectors

follows from the assumed independence of $\{(r^j,r^0), j\in \mathbf{{\mathcal {N}}}\}$. Therefore, (4) is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)$. $\square $

In the remainder of this paper, we prefer to study $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ because, being homogeneous, it allows for a unified treatment of the extreme points and extreme rays of Q, and thus permits a more streamlined presentation.

We are now ready to further investigate the structure of coefficient vectors associated with facet-defining inequalities (5) of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. In particular, we will show in Proposition 8 that, except for some simple inequalities we describe next, most facet-defining inequalities (5) are such that $\gamma >0$.

For $i\in \mathbf{{\mathcal {N}}_0}$, we refer to the inequalities $t_i\ge 0$ of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ as trivial. For notational convenience, we redefine ${\mathcal {I}}_-^l:={\mathcal {I}}_-^l\cup \{0\}$ because $f_{l0}=-1$. Hence ${\mathcal {I}}_+^l$, ${\mathcal {I}}_-^l$, and ${\mathcal {I}}_0^l$ partition $\mathbf{{\mathcal {N}}_0}$. We also define

$$\begin{aligned}\begin{array}{cll} {\mathcal {I}}_+&{}=\{i\in \mathbf{{\mathcal {N}}_0}\mid f_{li}> 0 \,\hbox {for some} \,l\in {\mathcal {L}}\} &{}= \textstyle \bigcup _{l \in {\mathcal {L}}} {\mathcal {I}}_+^l,\\ {\mathcal {I}}_-&{}=\{i\in \mathbf{{\mathcal {N}}_0}\mid f_{li}<0 \,\hbox {for all} \,l\in {\mathcal {L}}\} &{}= \textstyle \bigcap _{l \in {\mathcal {L}}} {\mathcal {I}}_-^l,\\ {\mathcal {I}}_0&{}= \mathbf{{\mathcal {N}}_0}{\setminus } ({\mathcal {I}}_+\cup {\mathcal {I}}_-). \end{array}\end{aligned}$$

It is clear that $0 \in {\mathcal {I}}_-$ and it follows from Assumption 1 that ${\mathcal {I}}_+\ne \emptyset $. In the next proposition, we provide necessary and sufficient conditions under which trivial inequalities are facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$.

Proposition 7

Inequality $t_i\ge 0$ is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ if and only if

1.
$i \in {\mathcal {I}}_-\cup {\mathcal {I}}_0$, or
2.
$i \in {\mathcal {I}}_+$ and $|{\mathcal {I}}_+|\ge 2$.

Proof

Inequality $t_i \ge 0$ is clearly valid for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Assume first that $i\in {\mathcal {I}}_-\cup {\mathcal {I}}_0$. Since ${\mathcal {I}}_+\ne \emptyset $, there exists $j \in {\mathcal {I}}_+$ and $l \in {\mathcal {L}}$ such that $f_{lj}>0$. Consider the point

$$\begin{aligned} \sum _{k \in \mathbf{{\mathcal {N}}_0} \backslash \{i,j\}} \epsilon e_k + \frac{1-\sum _{k \in \mathbf{{\mathcal {N}}_0} \backslash \{i,j\}}\epsilon f_{lk}}{f_{lj}}e_j \end{aligned}$$

(6)

for $\epsilon $ positive and sufficiently small. This point is in the relative interior of $Q_l^0\cap \{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid t_i=0\}$. Hence, it is in the relative interior of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)\cap \{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid t_i=0\}$. It follows that $t_i\ge 0$ is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Next, assume that $i\in {\mathcal {I}}_+$. If $|{\mathcal {I}}_+|\ge 2$, there exists $j\in {\mathcal {I}}_+{\setminus }\{i\}$ and $l \in {\mathcal {L}}$ such that $f_{lj}>0$. Then, (6) is an interior point of $Q_l^0\cap \{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid t_i=0\}$ and hence $t_i\ge 0$ is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Suppose ${\mathcal {I}}_+=\{i\}$ and $j\in {\mathcal {I}}_-$. Then, for each $l \in {\mathcal {L}}$, every point in $Q_l^0\cap \{t \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid t_i=0\}$ satisfies $t_j=0$. It follows that $t_i \ge 0$ defines a face of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ of dimension at least two less than that of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, showing that this inequality is not facet-defining. $\square $

Proposition 7 shows that trivial inequalities $t_i\ge 0$ are facet-defining unless $i \in {\mathcal {I}}_+$ and $|{\mathcal {I}}_+|=1$. In the remainder of this paper, we consider $\beta $ to be a vector in $\mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}$. We show next that the sign of the entries of coefficient vectors $\beta $ for nontrivial facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ can be deduced directly from the sets ${\mathcal {I}}_+, {\mathcal {I}}_-$, and ${\mathcal {I}}_0$.

Proposition 8

Let

$$\begin{aligned} \sum _{i\in \mathbf{{\mathcal {N}}_0}} \beta _i t_i \ge 0 \end{aligned}$$

(7)

be a nontrivial facet-defining inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Then

1.
$\beta _i\ge -\max \{f_{li}\mid l\in {\mathcal {L}}\}\beta _0$ for $i\in {\mathcal {I}}_+$,
2.
$\beta _j<0$ for $j\in {\mathcal {I}}_-$
3.
$\beta _k=0$ if $\max \{f_{lk}\mid l\in {\mathcal {L}}\}=0$.

In particular, $\beta _i > 0$ for $i\in {\mathcal {I}}_+$.

Proof

Consider a nontrivial facet-defining inequality (7). Observe that $\beta _i\ge 0$ for $i\in {\mathcal {I}}_+\cup {\mathcal {I}}_0$ because $e_i$ is a ray of $Q^0$.

We first prove 1. Choose $j'\in {\mathcal {I}}_-$ with $\beta _{j'}<0$. Such a $j'$ exists because otherwise, (7) is implied by trivial inequalities. Let $i\in {\mathcal {I}}_+^l$ for some $l\in {\mathcal {L}}$. Since $f_{li}e_{j'}-f_{lj'}e_{i}$ is a ray for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, it follows that $\beta _i \ge \max \{f_{li}\mid l\in {\mathcal {L}}\}\frac{\beta _{j'}}{f_{lj'}} > 0$. Remember now that $0\in {\mathcal {I}}_-$. If $\beta _0<0$, Part 1 follows easily since $f_{l0}=-1$. If $\beta _0=0$, Part 1 simply states that $\beta _i \ge 0$ while the inequality just proven for $j'$ is stronger.

We now prove 2 and 3. Consider $j\in {\mathcal {I}}_-\cup {\mathcal {I}}_0$. There exists an extreme ray r of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ such that $\beta ^\intercal r=0$ and $r_j>0$ because otherwise, (7) is a trivial inequality. Proposition 5 shows that this ray is one of two forms. First, let $r=f_{li}e_j-f_{lj}e_{i}$ for some $l\in {\mathcal {L}}$ and $i\in {\mathcal {I}}_+^l$. As shown, $\beta _i > 0$. It follows from $\beta ^\intercal r=0$ that $\beta _j < 0$. This shows Part 2 when $j\in {\mathcal {I}}_-$ and shows that it is not the desired ray when $j\in {\mathcal {I}}_0$ as it contradicts the established relation $\beta _j\ge 0$. Now, consider $j\in {\mathcal {I}}_0$. We must have that $r=e_j$. This shows that $\beta _j = 0$ proving Part 3. $\square $

Example 1

Consider the set $Q^0$ with disjuncts defined by the constraints

$$\begin{aligned} \begin{array}{rrrrrrrl} 5t_1 &{} -3t_2 &{} + 0t_3 &{}+1t_4 &{}-5t_5&{}-t_0&{}\ge 0\,\,\\ 3t_1 &{} -1t_2 &{} + 2t_3 &{}-3t_4&{}-3t_5 &{}-t_0&{}\ge 0\,\,\\ 4t_1 &{} -6t_2 &{} + 4t_3 &{}-2t_4&{}+0t_5&{}-t_0&{}\ge 0\,\,\\ 2t_1 &{} -2t_2 &{} -2t_3 &{}+0t_4 &{}-2t_5&{}-t_0&{}\ge 0.\\ \end{array} \end{aligned}$$

(8)

Then ${\mathcal {I}}_+=\{1,3,4\},\quad {\mathcal {I}}_-=\{2,0\},\text { and } {\mathcal {I}}_0=\{5\}$. We use PORTA [14, 15] to obtain the extreme rays of each disjunct and run it again to obtain all facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, from which the nontrivial ones below:

$$\begin{aligned} \begin{array}{rrrrrrl} 5t_1 &{} -\frac{5}{3}t_2 &{} + 4t_3 &{}+t_4 &{}+0t_5&{}-t_0&{}\ge 0 \\ 9t_1 &{} -3t_2 &{} + 6t_3 &{}+t_4&{}+0t_5 &{}-t_0&{}\ge 0 \\ 6t_1 &{} -2t_2 &{} + 4t_3 &{}+t_4&{}+0t_5 &{}-t_0&{}\ge 0. \\ \end{array} \end{aligned}$$

(9)

We observe that, as argued in Proposition 8, $\beta _i>0$ for $i \in {\mathcal {I}}_+$, $\beta _i<0$ for $i \in {\mathcal {I}}_-$ and $\beta _5=0$ in all nontrivial facet-defining inequalities (9). $\square $

Proposition 8 shows that, for nontrivial facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, $\beta _k$ equals zero for each index k for which the tableau coefficients satisfy $f_{lk}\le 0$ for all $l\in {\mathcal {L}}$ and $f_{l^\prime k}=0$ for some $l^\prime \in {\mathcal {L}}$. Then, $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)=\{t=(t_{-k},t_k)\mid t_{-k}\in \mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q_{-k}), t_k\in \mathbb {Re}_+\}$ where $t_{-k}$ is the vector obtained by dropping component $t_k$ from t and $Q_{-k}:=\mathop {\mathrm{proj}}_{t_{-k}}(Q)$. Thus, it is sufficient to study $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q_{-k})$. We therefore make the following assumption in the remainder of the paper.

Assumption 2

${\mathcal {I}}_0=\emptyset $.

With Assumption 2, it follows that ${\mathcal {I}}_+$ and ${\mathcal {I}}_-$ partition $\mathbf{{\mathcal {N}}_0}$.

We next derive an ${\mathcal {H}}$-polyhedron representation of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. We obtain the linear inequalities of this representation by considering the dual cone of its ${\mathcal {V}}$-polyhedron representation, which was obtained in Corollary 2. For a given cone $C \subseteq \mathbb {Re}^n$, we denote the dual cone of C by $C^*$. Recall that $C^*=\{y \in \mathbb {Re}^n\mid y^{\intercal }x \ge 0,\,\,\forall x\in C\}$. As we established in Corollary 2 that $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)=\mathop {\mathrm{cone}}(R^0)$ where $R^0:=\bigcup _{l\in {\mathcal {L}}} R_l^0$, it is easy to see that $\beta ^{\intercal }t \ge 0$ is a valid inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ if and only if $\beta ^{\intercal }r \ge 0$ for all $r\in R^0$. Therefore, the coefficient vectors of valid inequalities for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ belong to

(10)

where we use $B_{1}$ as a shorthand notation for $[\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)]^*$.

Among the facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, trivial inequalities are not useful in practice, since they do not cut off the basic solution associated with simplex tableau (1). We therefore concentrate on nontrivial facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, which have $\beta _0<0$ as shown in Proposition 8. Therefore, by scaling if necessary, we may assume that $\beta _0=-1$. For this reason, we focus our ensuing study on $B_{2}:=B_{1}\cap \{\beta \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid \beta _0=-1\}$, and show that the description of this polyhedron requires fewer constraints than those given in (10).

Proposition 9

For $(i,j) \in {\mathcal {I}}_+\times {\mathcal {I}}_-$, define

(11)

Then

(12)

Proof

Just as in the proof of Proposition 8, when $\beta _0 < 0$, the inequalities $\beta _k \ge 0$ for $k\in {\mathcal {I}}_+^l\cup {\mathcal {I}}_0^l$ do not support $B_1$ and can therefore be dropped. Now, for any $i\in {\mathcal {I}}_+^l$ and $j\in {\mathcal {I}}_-^l$, $\beta _i \ge \frac{f_{li}}{f_{lj}}\beta _j$. This inequality is redundant if $j\in {\mathcal {I}}_+$ because, as argued above, $\beta _i > 0$. Therefore, $j\in {\mathcal {I}}_-$. Maximizing $\frac{f_{li}}{f_{lj}}\beta _j$ yields (12). $\square $

It is easy to see that the coefficients $\beta \in B_{2}$ are sign-constrained. Therefore, $B_{2}$ has no lines. Because $B_{2}$ does not have a line, it has at least one extreme point; see Corollary 18.5.3 in [45]. We mention that $B_{2}$ does also have rays, including vectors $e_i$ for $i \in {\mathcal {I}}_+\cup {\mathcal {I}}_-\backslash \{0\}$.

We next show that there is a one-to-one correspondence between the nontrivial facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ and the extreme points of $B_{2}$.

Theorem 1

Any inequality $\beta ^{\intercal }t\ge 0$ with $\beta _0=-1$ is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ if and only if $\beta $ is an extreme point of $B_{2}$.

Proof

For a facet-defining inequality, $\beta ^{\intercal }t\ge 0$ of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, $\beta $ is an extreme point of $B_{2}$ because of the n linearly independent tight constraints $\beta ^{\intercal }r^j = 0$, one for each tight linearly independent extreme ray $r^j$ of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ and the equality constraint $\beta _0=-1$. For the reverse inclusion, the tight constraints, besides $\beta _0=-1$, each yield a linearly independent extreme ray tight for the inequality. $\square $

Extreme rays of $B_{2}$ also lead to valid inequalities for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. In fact, consider a solution $\beta $ and an extreme ray $\rho $ of $B_{2}$. Clearly, $\rho _0=0$. For all $\tau \ge 0$, $\beta + \tau \rho \in B_{2}$, and therefore the inequality $(\beta + \tau \rho )^{\intercal }t \ge 0$ is valid for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Dividing throughout by $\tau $ and letting $\tau \rightarrow \infty $, we then conclude that $\rho ^{\intercal }t \ge 0$, an inequality with $\rho _0=0$, is valid for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. If this inequality is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, then it must be one of the trivial ones. However, extreme rays, unlike extreme points, do not necessarily yield facet-defining inequalities for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. We next illustrate these observations, together with the statement of Theorem 1.

Example 1

(continued) For the set $Q^0$ with disjuncts defined by (8) and where variable $t_5$ has been removed, we compute that $w_{12}:=\min \left\{ \frac{3}{5}, \frac{1}{3}, \frac{6}{4}, \frac{2}{2} \right\} =\frac{1}{3}$, $w_{10}:=\min \left\{ \frac{1}{5}, \frac{1}{3}, \frac{1}{4}, \frac{1}{2}\right\} =\frac{1}{5}$, $w_{32}:=\min \left\{ \frac{1}{2}, \frac{6}{4}\right\} =\frac{1}{2}$, $w_{30}:=\min \left\{ \frac{1}{2}, \frac{1}{4}\right\} =\frac{1}{4}$, $w_{42} :=3$, and $w_{40} :=1$. It then follows from Proposition 9 that

Coefficient vectors of all facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ that cut off the solution (0, 0, 0, 0, 1) belong to $B_{2}$. For instance, the coefficient vector $\beta =(5,-\frac{5}{3},4,1,-1)$ belongs to $B_{2}$. Further, it satisfies the following system of linearly independent equations $\beta _2 + \frac{1}{3}\beta _1 = 0$, $\beta _0 + \frac{1}{5}\beta _1 = 0$, $\beta _0 + \frac{1}{4}\beta _3 = 0$, $\beta _0 + \beta _4 = 0$, and $\beta _0=-1$. Since the system has a unique solution, $\beta $ is an extreme point of $B_{2}$. This extreme point is the coefficient vector of the first facet-defining inequality of (9) (where we have omitted the coefficient $\beta _5$ since ${\mathcal {I}}_0=\{5\}$). It can also be verified that $(3,-1,2,\frac{1}{3},0)$ is an extreme ray of $B_{2}$. It corresponds to the valid inequality $3t_1-t_2+2t_3+\frac{1}{3}t_4 \ge 0$, which is not facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ since it can be obtained as a conic combination of the second facet-defining inequality of (9) and $t_0 \ge 0$ with equal weights of $\frac{1}{3}$. $\square $

5 Dual network formulation of $B_{2}$

In this section, we present a nonlinear transformation that maps (a subset of) the polyhedron $B_{2}$ to the feasible region of the dual of a transportation problem. We show that this transformation preserves the face-lattice of $B_{2}$ (see below for a definition). We use these results in Sect. 6 to establish a correspondence between the extreme points of $B_{2}$, i.e., the nontrivial facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, and certain spanning trees of a suitably defined transportation network.

We have shown in Proposition 8 that if $\beta $ is an extreme point of $B_{2}$, $\beta _i>0$ for all $i\in {\mathcal {I}}_+$ and $\beta _j<0$ for all $j\in {\mathcal {I}}_-$. Define $A=\{\beta \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid \beta _i>0, \beta _j<0, \forall i\in {\mathcal {I}}_+, \forall j\in {\mathcal {I}}_-\}$. Observe that, for any $\beta \in B_{2}\cap A$ and for $(i,j) \in {\mathcal {I}}_+\times {\mathcal {I}}_-$,

$$\begin{aligned} \beta _j+w_{ij}\beta _i\ge 0 \Longleftrightarrow \frac{-\beta _j}{\beta _i} \le w_{ij} \Longleftrightarrow \log (-\beta _j)-\log (\beta _i) \le \log (w_{ij}). \end{aligned}$$

All the logarithms computed above are well-defined under the conditions of A. Define $T:A\rightarrow \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}$ by $[T(\beta )]_k := \log |\beta _k|.$ Its inverse transformation $T^{-1}$ is then $[T^{-1}(\delta )]_k=e^{\delta _k}$ if $k\in {\mathcal {I}}_+$ and $-e^{\delta _k}$ if $k\in {\mathcal {I}}_-$. After introducing the new variables $\delta _i=\log (\beta _i)$, for $i\in {\mathcal {I}}_+$ and $\delta _j=\log (-\beta _j)$, for $j\in {\mathcal {I}}_-$, and the constants $c_{ij}=\log (w_{ij})$, for $(i,j) \in {\mathcal {I}}_+\times {\mathcal {I}}_-$, we define

Proposition 10

It holds that $T(B_{2}\cap A) = D_{2}$. $\square $

It is clear that for $\beta \in B_{2} \cap A$ and $\delta =T(\beta ) \in D_{2}$,

$$\begin{aligned} \beta _j + w_{ij} \beta _i = 0\iff & {} \delta _j - \delta _i = c_{ij}, \end{aligned}$$

(13)

$$\begin{aligned} \beta _j + w_{ij} \beta _i \le 0\iff & {} \delta _j - \delta _i \le c_{ij}. \end{aligned}$$

(14)

Let H(E) be the subgraph of the complete bipartite graph $G:=({\mathcal {I}}_+, {\mathcal {I}}_-)$ with edge set $E\subseteq {\mathcal {I}}_+\times {\mathcal {I}}_-$. Let $P, Q\in \mathbb {Re}^{|{\mathcal {I}}_+|\times |{\mathcal {I}}_-|}$ be two matrices. We create the $|E|\times (n+1)$ matrix M(H(E), P, Q) by fixing an ordering of the edges of E (say lexicographical) and by assigning the row of M(H(E), P, Q) corresponding to edge $\{i,j\} \in E$ to be the vector $P_{ij}e_j^\intercal +Q_{ij}e_i^\intercal $.

Lemma 1

Assume that H(E) is a subforest of G. Assume also that $P_{ij}\ne 0$ and $Q_{ij}\ne 0$ for all $\{i,j\}\in E$. Then M(H(E), P, Q) has full rank.

Proof

Suppose H(E) is a subforest of G. Since $|E|< n+1$, we only need to prove independence of the rows of M(H(E), P, Q). For a positive integer $k=1,\dots ,|E|-1$, observe that the $(k+1)$th row of M(H(E), P, Q) introduces a new nonzero entry, which was zero in the first k rows because H(E) does not contain a cycle, $P_{ij}\ne 0$ and $Q_{ij}\ne 0$. This shows that the rows of M(H(E), P, Q) are independent. $\square $

Define $\mathbb {J}$ to be the $|{\mathcal {I}}_+|\times |{\mathcal {I}}_-|$ matrix of ones and $\mathbb {W}$ to be the $|{\mathcal {I}}_+|\times |{\mathcal {I}}_-|$ matrix whose (i, j) entry is $w_{ij}$. For any $E \subseteq {\mathcal {I}}_+\times {\mathcal {I}}_-$ such that H(E) is a forest, Lemma 1 shows that both matrices $M(H(E), \mathbb {J}, -\mathbb {J})$ and $M(H(E), \mathbb {J}, \mathbb {W})$ have full rank.

Proposition 11

Let H(E) be a subgraph of G with $n=|\mathbf{{\mathcal {N}}_0}|-1$ edges such that $\mathop {\mathrm{rank}}(M(H(E), \mathbb {J}, \mathbb {W}))=n$ and $M(H(E), \mathbb {J}, \mathbb {W}) \beta =0$ for some $\beta \in B_{2}$. Then H(E) is a tree of G.

Proof

Assume by contradiction that H(E) has a cycle C, and let $\beta ^C$ be the components of $\beta $ associated with nodes of C. Let $M'$ be the $n\times n$ submatrix of $M(H(E), \mathbb {J}, \mathbb {W})$ associated with cycle C. Then it is easy to verify that $M'$ is nonsingular and $M'\beta ^C=0 $ which implies that $\beta ^C=0$. Since G is bipartite, C contains a node $k \in {\mathcal {I}}_+$, and so $\beta _k=0$. Because $\beta \in B_{2}$, it satisfies $\beta _0 + w_{k0} \beta _k \ge 0$, which implies that $\beta _0 \ge 0$. This is a contradiction to the fact that $\beta _0=-1$. Since H(E) has n edges, $n+1$ nodes and no cycle, it is a tree. $\square $

A finite partially ordered set $(S,\le )$, or poset, is the association of a finite set S with a relation “$\le $” which is (i) reflexive: $x\le x$ for all $x\in S$, (ii) transitive: $x\le y$ and $y\le z$ imply $x\le z$, and (iii) antisymmetric: $x\le y$ and $y\le x$ imply $x=y$. The face-lattice of a polyhedron P is the poset of its faces, partially ordered by inclusion. We say that two posets $(S,\le )$ and $(S',\preceq )$ are isomorphic if there is a bijection $T(\cdot )$ from S to $S'$ such that $s_1 \le s_2$ if and only if $T(s_1) \preceq T(s_2)$. Moreover, we say two polyhedra are isomorphic if their face-lattices are isomorphic.

Proposition 12

Polyhedra $D_{2}$ and $B_{2}$ are isomorphic.

Proof

Given a polyhedron $P \subseteq \mathbb {Re}^n$, we define ${\mathcal {F}}(P)$ to be the set of faces of P. Given $E \subseteq {\mathcal {I}}_+\times {\mathcal {I}}_-$, we define

$$\begin{aligned} B_{2}|_E= & {} \{ \beta \in B_{2} \,|\, \beta _j + w_{ij} \beta _i = 0, \forall (i,j) \in E \} \\ D_{2}|_E= & {} \{ \delta \in D_{2} \,|\, \delta _j - \delta _i = c_{ij}, \forall (i,j) \in E \}. \end{aligned}$$

Clearly, $B_{2}|_E$ and $D_{2}|_E$ are (possibly empty) faces of $B_{2}$ and $D_{2}$, respectively. Given a nonempty face F of $B_{2}$, we denote by $\mathbb {E}(F)$ the largest subset $E \subseteq {\mathcal {I}}_+\times {\mathcal {I}}_-$ such that $F = B_{2}|_E$. In particular, for every point $\beta ^*$ in the relative interior of F, $\beta ^*_j + w_{ij} \beta ^*_i < 0$ for $(i,j) \in ({\mathcal {I}}_+\times {\mathcal {I}}_-) \backslash \mathbb {E}(F)$. Similarly, given a nonempty face $F^{\prime }$ of $D_{2}$, we denote by $\mathbb {E}'(F^{\prime })$ the largest subset $E^{\prime }\subseteq {\mathcal {I}}_+\times {\mathcal {I}}_-$ such that $F^{\prime } = D_{2}|_{E^{\prime }}$. For every point $\delta ^*$ in the relative interior of $F^{\prime }$, $\delta _j - \delta _i < c_{ij}$ for $(i,j) \in ({\mathcal {I}}_+\times {\mathcal {I}}_-) \backslash \mathbb {E}'(F^{\prime })$.

Next, we define $\varphi : {\mathcal {F}}(B_{2}) \mapsto {\mathcal {F}}(D_{2})$ to be such that $\varphi (F) = D_{2}|_{\mathbb {E}(F)}$ for any nonempty face $F \in {\mathcal {F}}(B_{2})$ and $\varphi (\emptyset )=\emptyset $. We show that $\varphi $ is a bijection by constructing an inverse to $\varphi $. Define $\psi : {\mathcal {F}}(D_{2}) \mapsto {\mathcal {F}}(B_{2})$ to be such that $\psi (F^{\prime }) = B_{2}|_{\mathbb {E}'(F^{\prime })}$ for any nonempty face $F^{\prime } \in {\mathcal {F}}(D_{2})$ and $\psi (\emptyset )=\emptyset $. First, we argue that if $F \in {\mathcal {F}}(B_{2})$ and $F^{\prime }=\varphi (F)$, then $\mathbb {E}(F)=\mathbb {E}'(F^{\prime })$. Consider a point ${\bar{\beta }}$ in the relative interior of F and an extreme point ${\tilde{\beta }}$ of that face. The line segment $[{\bar{\beta }},{\tilde{\beta }})$ is in the relative interior of F; see Theorem 6.1 of [45]. Further, there exists a point $\beta $ on this line segment, sufficiently close to ${\tilde{\beta }}$, that belongs to A. Define $\delta = T(\beta )$. It follows from (13) and (14) that $\delta $ belongs to the relative interior of $F^{\prime }$ and that $\mathbb {E}(F)=\mathbb {E}'(F^{\prime })$. Similarly, if $F^{\prime } \in {\mathcal {F}}(D_{2})$ and $F^{\prime \prime }=\psi (F^{\prime })$, then $\mathbb {E}'(F^{\prime })=\mathbb {E}(F^{\prime \prime })$. Second, we argue that for each $F \in {\mathcal {F}}(B_{2})$, $\psi (\varphi (F))=F$. The result is clear when $F=\emptyset $. When $F \ne \emptyset $, define $F^{\prime }=\varphi (F)$ and $F^{\prime \prime }=\psi (F^{\prime })$. It follows from the above discussion that $\mathbb {E}(F)=\mathbb {E}'(F^{\prime })=\mathbb {E}(F^{\prime \prime })$ Therefore $F^{\prime \prime }=B_{2}|_{\mathbb {E}(F^{\prime \prime })}=B_{2}|_{\mathbb {E}(F)} = F$.

To conclude the proof, consider two faces $F_1$ and $F_2$ of ${\mathcal {F}}(B_{2})$ such that $F_1 \subseteq F_2$. Define $F^{\prime }_1=\varphi (F_1)$ and $F^{\prime }_2=\varphi (F_2)$. Since $F_1 \subseteq F_2$, then $\mathbb {E}(F_1) \supseteq \mathbb {E}(F_2)$. It follows from the above discussion that $\mathbb {E}'(F_1^{\prime }) \supseteq \mathbb {E}'(F_2^{\prime })$, showing that $\varphi (F_1)=F^{\prime }_1 \subseteq F^{\prime }_2=\varphi (F_2)$. $\square $

It is shown in Theorem 10.1 of [12] that two isomorphic polytopes have the same dimension, and that faces matched through the bijection $T(\cdot )$ have identical dimensions. The proof idea extends to our setting.

The proof of Proposition 12 therefore implies that there is a one-to-one correspondence between the faces of dimension one of $B_{2}$ and $D_{2}$. We obtain

Corollary 3

If u (resp. v) is an extreme point of $D_{2}$ (resp. $B_{2}$) then $T^{-1}(u)$ (resp. T(v)) is an extreme point of $B_{2}$ (resp. $D_{2}$). $\square $

Extreme points of $D_{2}$ can be exposed as unique optimal solutions to certain linear programs (LPs) over $D_{2}$, or equivalently can be obtained from optimal solutions of certain LPs over $D_{1}$. In order for such LPs to have an optimal extreme point solution, the objective coefficient vector should be chosen in the polar cone of the recession cone of the feasible set. The recession cones of $D_{1}$ and $D_{2}$ are

$$\begin{aligned} \mathop {\mathrm{rec}}(D_{1})= & {} \{\delta \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid \delta _j-\delta _i\le 0, \forall (i,j) \in {\mathcal {I}}_+\times {\mathcal {I}}_-\},\\ \mathop {\mathrm{rec}}(D_{2})= & {} \{\delta \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid \delta _j-\delta _i\le 0, \delta _0=0, \forall (i,j) \in {\mathcal {I}}_+\times {\mathcal {I}}_-\}. \end{aligned}$$

We next derive a ${\mathcal {V}}$-polyhedron description of $\mathop {\mathrm{rec}}(D_{1})$ and $\mathop {\mathrm{rec}}(D_{2})$. In this result, we let $\mathbb {1}=\sum _{k\in \mathbf{{\mathcal {N}}_0}} e_k$. For a set of vectors V, we define $\mathop {\mathrm{lin}}(V)$ to be the linear subspace generated by V. For notational convenience, we use $\mathop {\mathrm{lin}}(v)$ to denote $\mathop {\mathrm{lin}}(\{v\})$ for a vector v.

Proposition 13

1.
Let $R_2=\{e_i\mid i\in {\mathcal {I}}_+\} \cup \{-e_j\mid j\in {\mathcal {I}}_-{\setminus }\{0\}\} \cup \{\mathbb {1}-e_0\}$. Then $\mathop {\mathrm{rec}}(D_{2})=\mathop {\mathrm{cone}}(R_2)$.
2.
Let $R_1=\{e_i\mid i\in {\mathcal {I}}_+\} \cup \{-e_j\mid j\in {\mathcal {I}}_-\}$. Then $\mathop {\mathrm{rec}}(D_{1})=\mathop {\mathrm{cone}}(R_1)+\mathop {\mathrm{lin}}(\mathbb {1})$.

Proof

Assume that $\delta \in \mathop {\mathrm{rec}}(D_{2})$. Let $a=\min \{\delta _i\mid i\in {\mathcal {I}}_+\}$ and $b=\max \{\delta _j\mid j\in {\mathcal {I}}_-\}$. Then, $b\ge \delta _0=0$. Furthermore, $\delta _j \le b\le a\le \delta _i$, for $i\in {\mathcal {I}}_+$ and $j \in {\mathcal {I}}_-$. We can then write

$$\begin{aligned} \delta = b(\mathbb {1}-e_0)+\sum _{j\in {\mathcal {I}}_-{\setminus }\{0\}}(b-\delta _j)(-e_j)+\sum _{i\in {\mathcal {I}}_+}(\delta _i-b)e_i, \end{aligned}$$

which shows that $\mathop {\mathrm{rec}}(D_{2})\subseteq \mathop {\mathrm{cone}}(R_2)$. Observe next that $R_2\subseteq \mathop {\mathrm{rec}}(D_{2})$ since the elements of $R_2$ are rays of $D_{2}$. Therefore $\mathop {\mathrm{cone}}(R_2) \subseteq \mathop {\mathrm{rec}}(D_{2})$, proving 13. We next show that $\mathop {\mathrm{rec}}(D_{1})=\mathop {\mathrm{rec}}(D_{2})+\mathop {\mathrm{lin}}(\mathbb {1})$, which will prove 13 since $\mathbb {1}-e_0 \in -e_0 + \mathop {\mathrm{lin}}(\mathbb {1})$. To prove the forward inclusion ($\subseteq $), consider $\delta ^\prime $ in $\mathop {\mathrm{rec}}(D_{1})$. Then $\delta ^{\prime }-\delta _0^{\prime }\mathbb {1}$ belongs to $\mathop {\mathrm{rec}}(D_{2})$. To prove the reverse inclusion $(\supseteq )$, consider $\delta ^\prime $ in $\mathop {\mathrm{rec}}(D_{2})$ and $t \in \mathbb {Re}$. It is clear that $\delta ^{\prime }+t\mathbb {1}\in \mathop {\mathrm{rec}}(D_{1})$. $\square $

By definition of polar cone,

$$\begin{aligned} (\mathop {\mathrm{rec}}(D_{1}))^o:= & {} \{y\mid y^{\intercal }x\le 0, x\in \mathop {\mathrm{rec}}(D_{1})\}\\= & {} \{y\mid y^{\intercal }x\le 0, x\in R_1 \cup \{-\mathbb {1},\mathbb {1}\}\}\\= & {} \left\{ y\left| y_i\le 0, \forall i\in {\mathcal {I}}_+;\; y_j\ge 0, \forall j\in {\mathcal {I}}_-;\; \sum _{k\in \mathbf{{\mathcal {N}}_0}} y_k= 0 \right. \right\} . \end{aligned}$$

Similar to $B_{2}$, it is simple to verify that $D_{2}$ does not contain lines, and therefore has at least one extreme point. We next show that each extreme point of $D_{2}$ can be derived from an optimal solution of an LP over $D_{1}$ by setting an appropriate objective vector y in $\mathop {\mathrm{ri}}\left( \mathop {\mathrm{rec}}(D_{1})^o\right) $. Define ${\mathbf {s}}:=-y_{{\mathcal {I}}_+}$ and ${\mathbf {d}}:=y_{{\mathcal {I}}_-}$. The desired LP is

$$\begin{aligned} \max&\displaystyle -\sum _{i\in {\mathcal {I}}_+} {\mathbf {s}}_i \delta _i +\sum _{j\in {\mathcal {I}}_-} {\mathbf {d}}_j \delta _j&\nonumber \\ \text {s.t.}&\delta _j-\delta _i\le c_{ij},&\forall (i,j) \in {\mathcal {I}}_+\times {\mathcal {I}}_-. \end{aligned}$$

(15)

Its dual is the transportation problem:

$$\begin{aligned} \min&\displaystyle \sum _{i\in {\mathcal {I}}_+}\sum _{j\in {\mathcal {I}}_-} c_{ij} x_{ij}&\nonumber \\ \text {s.t.}&\displaystyle \sum _{j\in {\mathcal {I}}_-}x_{ij} = {\mathbf {s}}_i,&\forall i\in {\mathcal {I}}_+,\nonumber \\&\displaystyle \sum _{i\in {\mathcal {I}}_+}x_{ij} = {\mathbf {d}}_j,&\forall j\in {\mathcal {I}}_-,\\&x_{ij}\ge 0&\forall (i,j) \in {\mathcal {I}}_+\times {\mathcal {I}}_-.\nonumber \end{aligned}$$

(16)

We next argue that both primal (16) and dual (15) problems are feasible, thereby showing that optimal primal and dual solutions exist. The primal problem (16) is feasible because $\sum _{j\in {\mathcal {I}}_-}{\mathbf {d}}_j=\sum _{i\in {\mathcal {I}}_+}{\mathbf {s}}_i$. The c-max cut shows that the dual problem (16) is feasible. In fact, let $\mathop {\hbox {c-max}}$ be the coefficient vector of the c-max cut. This vector is in $B_{2}$. Furthermore, $(\mathop {\hbox {c-max}})_i>0$ for $i\in {\mathcal {I}}_+$ and $(\mathop {\hbox {c-max}})_j<0$ for $j\in {\mathcal {I}}_-$. Therefore, $(\mathop {\hbox {c-max}})\in B_{2}\cap A$. It follows that $T(\mathop {\hbox {c-max}})\in D_{2}\subseteq D_{1}$. The fact that $D_{1}$ is nonempty also follows from Proposition 12.

Proposition 13 shows that $D_{1}$ has a lineality. It follows that the faces of $D_{1}$ of smallest dimension are edges. Because (15) has an optimal solution, it must therefore be that it has an edge of optimal solutions. Let $\delta ^\prime $ be a solution on this edge. There are n active constraints of $D_{1}$ at $\delta ^\prime $. Now define $\delta ^*=\delta ^\prime -\delta ^\prime _0\mathbb {1}$. Then,

$$\begin{aligned} \displaystyle -\sum _{i\in {\mathcal {I}}_+} {\mathbf {s}}_i \delta _i^*+\sum _{j\in {\mathcal {I}}_-} {\mathbf {d}}_j \delta _j^*= & {} -\displaystyle \sum _{i\in {\mathcal {I}}_+} {\mathbf {s}}_i \left( \delta _i'-\delta _0\right) +\sum _{j\in {\mathcal {I}}_-} {\mathbf {d}}_j \left( \delta _j'-\delta _0\right) \\= & {} \displaystyle -\sum _{i\in {\mathcal {I}}_+} {\mathbf {s}}_i \delta _i'+\sum _{j\in {\mathcal {I}}_-} {\mathbf {d}}_j\delta _j' +\delta _0\left( \sum _{i\in {\mathcal {I}}_+}{\mathbf {s}}_i-\sum _{j\in {\mathcal {I}}_-} {\mathbf {d}}_j\right) \\= & {} \displaystyle -\sum _{i\in {\mathcal {I}}_+} {\mathbf {s}}_i \delta _i'+\sum _{j\in {\mathcal {I}}_-} {\mathbf {d}}_j\delta _j'. \end{aligned}$$

Hence $\delta ^*$ has the same objective function value as $\delta '$. Moreover, $\delta ^*$ satisfies all the constraints in (15) because $\delta ^*_j-\delta ^*_i = (\delta ^\prime _j-\delta ^\prime _0)-(\delta ^\prime _i-\delta ^\prime _0)=\delta ^\prime _j-\delta ^\prime _i\le c_{ij}$ for all $(i,j)\in {\mathcal {I}}_+\times {\mathcal {I}}_-$. Clearly, $\delta ^*$ is an extreme point of $D_{2}$ since it satisfies $\delta ^*_0=0$ in addition to the n independent constraints active at $\delta ^{\prime }$. Proposition 12 then implies that $\beta ^* = T^{-1}(\delta ^*)$ is an extreme point of $B_{2}$, i.e., the coefficient vector of a facet-defining inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ that cuts off $(t_1,\dots ,t_n,t_0)=(0,\dots ,0,1)$.

Theorem 2

A solution to the separation problem (which consists of finding a hyperplane that separates $(t_1,\dots ,t_n,t_0)$ and $Q^0$) is $(\beta ^*)^\intercal t\ge 0$ where $\beta ^*=T^{-1}(\delta ^*)$ and $\delta ^*$ is an optimal solution to the dual transportation problem (15) with $\delta ^*_0=0$. Moreover, this solution yields a facet-defining inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. $\square $

Because basic feasible solutions of (16) correspond to certain spanning trees of G, it is natural to suspect that facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ can be associated to those spanning trees. We explore this connection in Sect. 6.

6 Label-connected trees and facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$

In this section, we show that facet-defining inequalities for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ correspond to certain subtrees of the complete undirected bipartite graph $G=({\mathcal {I}}_+,{\mathcal {I}}_-)$. Recall that, in (11), we associated a weight$w_{ij}$ to each arc $\{i,j\}$ where $i \in {\mathcal {I}}_+$ and $j \in {\mathcal {I}}_-$. To streamline notation, we define $w_{ji}:=\frac{1}{w_{ij}}$ for $(i,j) \in {\mathcal {I}}_+\times {\mathcal {I}}_-$.

Consider a spanning tree S of G. Then for any node $i\in {\mathcal {I}}_+\cup {\mathcal {I}}_-{\setminus } \{0\}$ there exists a unique path from node 0 to node i in S. We denote this path $P_{0i}$ by

$$\begin{aligned} (0=) i_0 - i_1 - i_2 - \dots - i_p (=i). \end{aligned}$$

We say that an inequality $\beta ^{\intercal }t\ge 0$ (or the associated coefficient vector $\beta $) is induced by the spanning tree S if $\beta _i := (-1)^p\beta _0 w_{i_0i_1}w_{i_1i_2}\dots w_{i_{p-1}i_p}$ for each $i\in \mathbf{{\mathcal {N}}}$. It follows directly from the definition of induced inequality that, on path $P_{0i}$, if $0 \le q < r \le p$

$$\begin{aligned} \beta _{i_r}= & {} (-1)^{r-q} \beta _{i_q} w_{i_qi_{q+1}} w_{i_{q+1}i_{q+2}} \ldots w_{i_{r-1}i_r}. \end{aligned}$$

(17)

In particular, if two distinct spanning trees S and $S^{\prime }$ of G share the same path from node $i_q$ to node $i_r$, then it follows from (17) that $\frac{\beta ^S_{i_r}}{\beta ^S_{i_q}} = \frac{\beta ^{S^{\prime }}_{i_r}}{\beta ^{S^{\prime }}_{i_q}}$ where $\beta ^S$ and $\beta ^{S^{\prime }}$ represent the coefficient vectors induced by spanning trees S and $S^{\prime }$, respectively.

In Proposition 14 and Example 2, we show that every facet-defining inequality is induced by a spanning tree of G, but some spanning trees do not induce valid inequalities.

Example 2

Consider the set $Q^0$ with disjuncts defined by $4t_1+3t_2-t_3-t_0\ge 0$$5t_1+t_2-2t_3-t_0\ge 0$, and $5t_1+2t_2-2t_3-t_0\ge 0$. We have that ${\mathcal {I}}_+=\{1,2\}$ and ${\mathcal {I}}_-=\{3,0\}$. Further, edge weights can be computed to be $w_{13}=\frac{1}{4}$,$w_{10}=\frac{1}{5}$, $w_{23}=\frac{1}{3}$, and $w_{20}=\frac{1}{3}$. Two spanning trees of G are shown in Fig. 1. The inequality induced by the subtree of Fig. 1a is $5t_1+3t_2-t_3-t_0\ge 0$. This inequality is the c-max cut and, hence, is valid for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Furthermore, it can be verified to be facet-defining for this set. The inequality induced by the subtree of Fig. 1b is $5t_1+3t_2-\frac{5}{4}t_3-t_0\ge 0$, which is not valid because it cuts off the feasible point $(0,1,3,0) \in Q^0_1 \subseteq \mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. $\square $

Example 2 shows that not all spanning trees of G induce a valid inequality. The reason is that the induced coefficients may violate an inequality corresponding to an edge that is not included in the spanning tree. We refer to a spanning tree that induces a valid inequality as a feasible spanning tree. We next show that any inequality induced by a feasible spanning tree is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$.

Proposition 14

Inequality $\beta ^{\intercal }t\ge 0$ with $\beta _0=-1$ is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ if and only if $\beta $ is induced by a feasible spanning tree of G.

Proof

Let $\beta ^{\intercal }t\ge 0$ with $\beta _0=-1$ be a facet-defining inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Then, by Theorem 1, $\beta $ is an extreme point of $B_{2}$. Since $\beta $ is an extreme point of $B_{2}$, it belongs to $n=|\mathbf{{\mathcal {N}}_0}|-1$ hyperplanes of the form $\{\beta \in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid \beta _j + w_{ij}\beta _i=0\}$ whose coefficient vectors are linearly independent, in addition to $\beta _0=-1$. By Proposition 11, the subgraph with respect to $\beta $ forms a spanning tree of G.

For the converse, suppose $\beta ^{\intercal }t\ge 0$ with $\beta _0=-1$ is induced by a feasible spanning tree. The validity of $\beta ^{\intercal }t\ge 0$ follows directly from the definition of a feasible spanning tree. By construction, see (17), coefficients $\beta $ satisfy n equations of the form $\beta _j+w_{ij}\beta _i=0$, one for each edge of the tree. Lemma 1 shows that these n coefficient vectors are independent. Therefore, $\beta $ is an extreme point of $B_{2}$. Hence, Theorem 1 implies that $\beta ^{\intercal }t\ge 0$ is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. $\square $

We next introduce the notion of label-connectivity. Let S be a spanning tree of G with edge set $E\subseteq {\mathcal {I}}_+\times {\mathcal {I}}_-$. A function $L:E\rightarrow {\mathcal {L}}$ is called a label-function if for each $\{i,j\} \in E$. In words, $L(\{i,j\})$ returns the index l of an inequality in the description of $Q^0$ with $f_{li}>0$ and the property that the ratio of the coefficient of $t_j$ over that of $t_i$ equals $-w_{ij}$. Because the ratio $w_{ij}$ might be achieved in different rows, several label-functions might be associated with a single spanning tree. For this reason, we define the set of all the label-functions of spanning tree S by $\mathbb {L}(S)$. We write S(E, L) to refer to a specific spanning tree with edge set E and label-function L. We say there is a label-disconnection for label l in S(E, L) if the subgraph of S(E, L) induced by the edges of label l is disconnected. This definition is equivalent to stating that there exists a path in S(E, L) where two edges with label l are connected within the tree using a path whose edges do not have label l. Finally, we say that a spanning tree S with edge set E is label-connected if there exists a label-function $L \in \mathbb {L}(S)$ such that S(E, L) does not exhibit label-disconnection for any $l\in {\mathcal {L}}$. Otherwise we say it is label-disconnected.

Example 2

(continued) In Fig. 2, we add all possible valid edge labels to the edges of the spanning trees presented in Fig. 1. In Fig. 2a, we observe that there are two possible labels for edge $\{1,0\}$, each of which determines that $w_{10}=\frac{1}{5}$. We see that, independent of the choice of label for edge $\{1,0\}$, the spanning tree does not exhibit any label-disconnection. It is therefore label-connected. In Fig. 2b, we observe that independent of the choice of label for edge $\{1,0\}$, the spanning tree will exhibit a label-disconnection for label 1 along the path 3–1–0–2. We conclude that this spanning tree is label-disconnected. $\square $

Label-connected spanning trees do not necessarily induce valid inequalities and not all feasible trees that induce a facet-defining inequality are label-connected. However, we show next via an example and later prove that, for facet-defining inequalities, there exists a feasible spanning tree that is label-connected.

Example 3

Consider the set $Q^0$ with disjuncts defined by $\frac{25}{4}t_1-\frac{5}{2}t_2+\frac{5}{16}t_3+\frac{15}{4}t_4-t_0\ge 0$ and $5t_1-\frac{5}{2}t_2+t_3+\frac{7}{2}t_4-t_0\ge 0$. Here, ${\mathcal {I}}_+=\{1,3,4\}$; ${\mathcal {I}}_-=\{2,0\}$; $w_{10}=\frac{4}{25}$, $w_{30}=1$, $w_{40}=\frac{4}{15}$, $w_{12}=\frac{2}{5}$, $w_{32}=\frac{5}{2}$, $w_{42}=\frac{2}{3}$; $l_{10}=1$, $l_{30}=2$, $l_{40}=1$, $l_{12}=1$, $l_{32}=2$, and $l_{42}=1$. The facet-defining inequality

$$\begin{aligned} \nicefrac {1}{4} \left( 25 t_1 - 10t_2+4t_3+15t_4-4t_0\right) \ge 0 \end{aligned}$$

(18)

is induced by the spanning tree of Fig. 3a, which is label-disconnected, and also by that in Fig. 3b, which is label-connected. $\square $

Lemma 2

Consider a facet-defining inequality induced by a spanning tree S for which there is a label-disconnection for label l. Let $C_1$ and $C_2$ be any two distinct components in the subgraph induced by edges with label l. Then, there exists a non-empty subtree of $C_2$ that can be detached from $C_2$ and attached to $C_1$, using an edge with label l, without changing the rest of the tree or the corresponding facet-defining inequality.

Proof

Since the given facet-defining inequality is induced by a spanning tree, there exists a unique path from a node in $C_1$ to a node in $C_2$ that contains no edge from $C_1$ or $C_2$. Let the starting node be $i_1\in C_1$ and the ending node be $j_1\in C_2$. Further, let $i_2$ be a neighbor of $i_1$ in $C_1$, and $j_2$ be a neighbor of $j_1$ in $C_2$. Let $i^{\prime }\in {\mathcal {I}}_+\cap \{i_1,i_2\}, j^{\prime }\in \{i_1, i_2\}{\setminus }\{i^\prime \}$ and let $i^{\prime \prime }\in {\mathcal {I}}_+\cap \{j_1,j_2\}, j^{\prime \prime }\in \{j_1, j_2\}{\setminus }\{i^{\prime \prime }\}$. Since edges $(i^{\prime }, j^{\prime })$ and $(i^{\prime \prime }, j^{\prime \prime })$ have label l, it follows that

$$\begin{aligned} \beta _{j^{\prime }} =\beta _{i^{\prime }}\frac{f_{lj^{\prime }}}{f_{li^{\prime }}},\quad \text {and} \quad \beta _{j^{\prime \prime }} = \beta _{i^{\prime \prime }}\frac{f_{lj^{\prime \prime }}}{f_{li^{\prime \prime }}}. \end{aligned}$$

(19)

Further, since the spanning tree yields a valid inequality,

$$\begin{aligned} \beta _{i^{\prime \prime }}\frac{f_{lj^{\prime }}}{f_{li^{\prime \prime }}} \le \beta _{j^{\prime }},\quad \text {and}\quad \beta _{i^{\prime }}\frac{f_{lj^{\prime \prime }}}{f_{li^{\prime }}} \le \beta _{j^{\prime \prime }}. \end{aligned}$$

(20)

We write $\beta _{i^{\prime \prime }}\frac{f_{lj^{\prime }}}{f_{li^{\prime \prime }}}\le \beta _{j^{\prime }} = \beta _{i^{\prime }}\frac{f_{lj^{\prime }}}{f_{li^{\prime }}} = \beta _{i^{\prime }}\frac{f_{lj^{\prime \prime }}}{f_{li^{\prime }}}\frac{f_{lj^{\prime }}}{f_{lj^{\prime \prime }}} \le \beta _{j^{\prime \prime }}\frac{f_{lj^{\prime }}}{f_{lj^{\prime \prime }}} = \beta _{i^{\prime \prime }}\frac{f_{lj^{\prime }}}{f_{li^{\prime \prime }}}$, where the inequalities hold because of (20) and the equalities holds because of (19). Therefore, equality holds throughout and $\beta _{j^{\prime \prime }}= \beta _{i^{\prime }}\frac{f_{lj^{\prime \prime }}}{f_{li^{\prime }}}$, and $\beta _{j^{\prime }}= \beta _{i^{\prime \prime }}\frac{f_{lj^{\prime }}}{f_{li^{\prime \prime }}}$. Now create a new spanning tree by deleting arc $(j_1,j_2)$ from S and by connecting $j_2$ to the one node among $i_1$ and $i_2$ that belongs to the other partition of the bipartite graph. Call this node k and refer to the resulting spanning tree as $S^{\prime }$. Clearly $S'$ contains a label-connected component for label l that subsumes $C_1$ and has at least one more arc. Further, the label of both the edge added and the edge removed is l, while all other edges and their labels remain unchanged.

For any node i, $\beta _i$ is obtained by taking products of $-w_{i'j'}$ for edges $\{ i', j'\}$ along the path from 0 assuming $\beta _0=-1$. We split this path into three parts from 0 to ${\bar{i}}$, ${\bar{i}}$ to ${\bar{j}}$, and ${\bar{j}}$ to i, where ${\bar{i}}$ (resp. ${\bar{j}}$) is the first (resp. last) of the nodes $\{i_1,i_2,j_1,j_2\}$ encountered along this path. Since the arcs from 0 to ${\bar{i}}$ and those from ${\bar{j}}$ to i are untouched, the ratios $\frac{\beta _{{\bar{i}}}}{\beta _{0}}$ and $\frac{\beta _{i}}{\beta _{{\bar{j}}}}$ are preserved. We showed that the tree preserves $\frac{\beta _{{\bar{j}}}}{\beta _{{\bar{i}}}}$. Taking a product, we see that $\beta _i$ is preserved. $\square $

Example 3

(continued) We have seen that the spanning tree of Fig. 3a is feasible, but is label-disconnected. Label-1 disconnection occurs on the path $1 - 2 - 3 - 0 - 4$, as $L(\{1,2\})=L(\{4,0\})=1$ and $L(\{3,2\})=L(\{3,0\})=2$. Consider edge $\{4,2\}$. It is shown in Example 3 that $L(\{4,2\})=1$. Replacing edge $\{4,0\}$ with $\{4,2\}$ in the spanning tree does not change the induced inequality and yields the label-connected spanning tree shown in Fig. 3b. $\square $

Theorem 3

Let $\beta ^{\intercal }t\ge 0$ be a non-trivial facet-defining inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Then, there exists a label-connected feasible spanning tree that induces it.

Proof

If $\beta ^{\intercal }t\ge 0$ is a nontrivial facet-defining inequality, Proposition 14 shows that it is induced by a feasible spanning tree. We prove the existence of a label-connected feasible spanning tree by contradiction. Let ${\mathcal {T}}$ be the set of all feasible spanning trees that induce this inequality. Note that ${\mathcal {T}}\ne \emptyset $, ${\mathcal {T}}$ is a finite set, and each tree in ${\mathcal {T}}$ is disconnected for some label. For any tree $T\in {\mathcal {T}}$, let l(T) be the smallest label index for which it exhibits disconnection. Let $l^\prime =\max \{l(T)\mid T\in {\mathcal {T}}\}$ and let C(T, l) be the size of the largest connected component of label l in T. Choose $T^\prime \in \arg \max \{C(T, l^\prime )\mid T\in {\mathcal {T}}, l(T)=l^\prime \}$. Using Lemma 2, we can construct $T^{\prime \prime }$ from $T^\prime $ by choosing $C_1$ as a component of size $C(T^\prime , l^\prime )$. Since $T^{\prime \prime }$ is obtained without altering labels on any arc with labels other than $l^\prime $, labels that were previously connected remain connected. Further, $T^{\prime \prime }$ has a connected component for label $l'$ of size larger than $C(T^{\prime }, l^\prime )$. The existence of $T^{\prime \prime }$ contradicts the definition of $T^\prime $, proving that there must exist a label-connected feasible spanning tree in ${\mathcal {T}}$. $\square $

Example 4

Consider the set $Q^0$ defined in Example 1, where variable $t_5$ has been omitted. We record all spanning trees of $G({\mathcal {I}}_+,{\mathcal {I}}_-)$ in Table 1. In particular, the columns of Table 1 contain the edges of each spanning tree, the coefficient $\beta $ this spanning tree induces, and, in the case where the tree is infeasible, one edge that $\beta $ violates. We conclude that $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ has only three nontrivial facet-defining inequalities, which were previously listed in (9). It can be easily verified that the three feasible spanning trees are label-connected. $\square $

Table 1 Feasible spanning trees for Example 4

Full size table

7 A fast algorithm to generate a facet-defining inequality

This section presents a constructive algorithm to strengthen a given inequality to a facet-defining inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ using label-connected spanning trees, which were introduced in Sect. 6.

We first discuss the case of complementarity problems, which is a special case of our setting and has been recently studied in [42]. Later, we will show that our results substantially generalize this work and yield new insights even for complementarity problems. For $f_1, f_2 \in \mathbb {Re}^n$, $f_1\not \le 0$ and $f_2\not \le 0$, [42] considered , and proposed the Equate-and-Relax (E&R) procedure to construct $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^c)$. Set $Q^c$ arises, similarly to our derivations in Sect. 3, while relaxing a complementarity constraint between basic variables in a simplex tableau. The E&R procedure has two steps. In the E-step, either the right-hand-side, or a variable $t_i$ whose coefficients $f_{1i}$ and $f_{2i}$ are of the same sign is chosen. The nontrivial disjunct constraints $f_1^{\intercal }t \ge 1$ and $f_2^{\intercal }t \ge 1$ are then multiplied by suitable nonnegative scalars $\alpha $ and $\gamma $ so that their right-hand-sides or the coefficients of variable $t_i$ become equal, i.e., $\alpha =\gamma $ or $\alpha f_{1i} = \gamma f_{2i}$. In the R-step, a valid inequality is created by setting the coefficient of each variable to be the maximum of its coefficients in the scaled inequalities. The right-hand-side of the inequality is set to the minimum of the right-hand-sides of the scaled inequalities. The valid inequality produced is

$$\begin{aligned} \sum _{i=1}^n \max \{\alpha f_{1i}, \gamma f_{2i}\}t_i\ge \min \{\alpha ,\gamma \}. \end{aligned}$$

(21)

When $\alpha =\gamma >0$, (21) is the c-max cut described in Sect. 3. It is shown in [42] that the family of E&R cuts characterizes $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^c)$. The requirement that $\alpha =\gamma $ or $\alpha f_{1i} = \gamma f_{2i}$ for some i, which is not explicit in traditional disjunctive programming constructs, allows for the set of multipliers to be restricted to a finite collection. Although they collectively describe $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^c)$, not all E&R inequalities are facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^c)$. In [42], partial results are presented as to when E&R inequalities are facet-defining for complementarity problems. However, this characterization is incomplete, even for the case of the c-max cut. Next, we will first provide a precise characterization of when an E&R inequality is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^c)$. En route, we will generalize E&R to the cardinality setting. Recall that we are interested in $Q^0= \bigcup _{l\in {\mathcal {L}}} Q_l^0$ with $|{\mathcal {L}}|=K+1$ where $Q_l^0=\{t\in \mathbb {Re}^{|\mathbf{{\mathcal {N}}_0}|}\mid f_l^{\intercal }t\ge 0, t\ge 0\}$ and $f_{l0}=-1$ for all $l\in {\mathcal {L}}$. For multipliers $u_l\ge 0$, where $l\in {\mathcal {L}}$, we derive the following valid inequality

$$\begin{aligned} \sum _{i\in \mathbf{{\mathcal {N}}}}\max _{l\in {\mathcal {L}}}\left\{ u_l f_{li}\right\} t_i\ge 0. \end{aligned}$$

(22)

It follows from [3] that the collection of inequalities of the form (22) characterizes $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. We show next that it is sufficient to consider weights associated with feasible label-connected spanning trees. We first illustrate the result on an example.

Example 5

Consider the set $Q^0$ defined in Example 1, where variable $t_5$ has been omitted. Using multipliers $\left( 1, \frac{5}{3}, 1, 1\right) $, we obtain

$$\begin{aligned} 5t_1-\frac{5}{3}t_2+4t_3+t_4-t_0\ge 0, \end{aligned}$$

(23)

an inequality that is facet-defining inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$; see (9)a. $\square $

Given a nontrivial facet-defining inequality, we next describe how to derive it using (22) by computing the appropriate multipliers $u_l$ for $l \in {\mathcal {L}}$. If $\beta ^{\intercal }t \ge 0$ is a nontrivial facet-defining inequality of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, then by Theorem 3 there exists a label-connected feasible spanning tree T that induces it. Consider 0 as the root of spanning tree T. For each label $l\in {\mathcal {L}}$ that appears in T, let ${\dot{\curlywedge }}_l$ be the node with smallest distance (measured in number of arcs) from 0 among all nodes incident to an arc with label l. Let $\{\ell _n\}_{n=1,\dots , r}$ be the sequence of distinct labels encountered on the path from 0 to ${\dot{\curlywedge }}_l$ and let $\ell _{r+1}$ be l. Compute

$$\begin{aligned} u_{l} = \left\{ \begin{array}{ll} \displaystyle \prod \nolimits _{j=2}^{r+1}\frac{f_{\ell _{j-1} {\dot{\curlywedge }}_{\ell _j}}}{f_{\ell _j {\dot{\curlywedge }}_{\ell _j}}} &{}\quad \text {if }r\ge 1\\ 1&{}\quad \text {if }r=0. \end{array}\right. \end{aligned}$$

(24)

For labels l that do not appear in the spanning tree T, choose

(25)

We now explain the procedure intuitively before providing a formal proof. For each $l \in {\mathcal {L}}$, the subgraph induced by all arcs with label l is a (possibly empty) tree because T is label-connected. We refer to it as $S_l$ and to its node set as $N(S_l)$. If $S_l$ is empty, then disjunct l does not play an active role in the derivation of the inequality. If $S_l$ is not empty, the valid inequality produced is such that the coefficients of variables $t_i$ for $i \in N(S_l)$ are a common multiple $u_l$ of their coefficients in the nontrivial constraint of disjunct l. This multiple $u_l$ is chosen to be 1 if $0 \in N(S_l)$. Otherwise $u_l$ is computed so that the scaled coefficients in disjuncts l and $\ell _r$ of the variable $t_{{\dot{\curlywedge }}_l}$ are equal. In the complementarity case, where $|{\mathcal {L}}|=2$, this procedure reduces to aggregating scaled constraints using (21) such that either the right-hand-sides match or one variable has the same coefficient in both constraints.

Proposition 15

Let $\beta ^{\intercal }t \ge 0$ be a nontrivial facet-defining inequality of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. Let T be any feasible label-connected spanning tree that induces it. Then $\beta ^{\intercal }t \ge 0$ can be obtained as (22) by selecting weights $u_l$ for $l \in {\mathcal {L}}$ as in (24) and (25).

Proof

First, we argue that weights $u_l$ are well-defined for $l \in {\mathcal {L}}$. For labels l that appear in T, weights are uniquely defined by (24) since label-connectedness implies that ${\dot{\curlywedge }}_l$ is uniquely defined. For labels l that do not appear in T, the interval described in (25) is nonempty because T is feasible and therefore $f_{li}\beta _j-f_{lj}\beta _i\ge 0$ implies that $\frac{\beta _i}{f_{li}}\ge \frac{\beta _j}{f_{lj}}$ for all $i\in {\mathcal {I}}_+$ with $f_{li}>0$ and $j\in {\mathcal {I}}_-$.

Second, we show that the given weights are nonnegative. When l does not appear in T, $u_l$ is chosen according to (25). The lower bound of this interval is positive, proving the claim. Assume therefore that label l appears in the tree T. Let k be a node incident to an arc of label l. Assume that the path from 0 to k is $(k_0(=0), \dots , k_s, k)$ with sequence of labels $\{\ell ^\prime _n\}_{n=1,\dots ,s+1}$ and sequence of distinct labels $\{\ell _n\}_{n=1,\dots ,r}$ where $\ell _{r+1}=l$. Note that the node associating $\ell _n$ and $\ell _{n+1}$ for $n=1,\dots ,r$ is ${\dot{\curlywedge }}_{\ell _{n+1}}$ and hence

$$\begin{aligned} \beta _k= & {} \displaystyle (-1)^{s+1} w_{k_0,k_1}\dots w_{k_s,k}\beta _0 =\displaystyle \frac{f_{\ell ^\prime _1 k_1}}{f_{\ell ^\prime _1 k_0}}\dots \frac{f_{\ell ^\prime _s k_s}}{f_{\ell ^\prime _s k_{s-1}}} \frac{f_{\ell ^\prime _{s+1} k}}{f_{\ell ^\prime _{s+1} k_s}} \beta _0\nonumber \\= & {} \displaystyle \frac{f_{\ell _1 {\dot{\curlywedge }}_{\ell _2}}}{f_{\ell _1 {\dot{\curlywedge }}_{\ell _1}}}\frac{f_{\ell _2 {\dot{\curlywedge }}_{\ell _3}}}{f_{\ell _2 {\dot{\curlywedge }}_{\ell _2}}} \dots \frac{f_{\ell _r {\dot{\curlywedge }}_{\ell _{r+1}}}}{f_{\ell _r {\dot{\curlywedge }}_{\ell _r}}} \frac{f_{\ell _{r+1} k}}{f_{\ell _{r+1}{\dot{\curlywedge }}_{\ell _{r+1}}}}\beta _0\nonumber \\= & {} \displaystyle \frac{f_{\ell _1 {\dot{\curlywedge }}_{\ell _2}}}{-1}\frac{f_{\ell _2 {\dot{\curlywedge }}_{\ell _3}}}{f_{\ell _2 {\dot{\curlywedge }}_{\ell _2}}}\dots \frac{f_{\ell _r {\dot{\curlywedge }}_{\ell _{r+1}}}}{f_{\ell _r {\dot{\curlywedge }}_{\ell _r}}} \frac{f_{\ell _{r+1} k}}{f_{\ell _{r+1} {\dot{\curlywedge }}_{\ell _{r+1}}}}(-1)\nonumber \\= & {} \left\{ \begin{array}{ll}\left( \prod _{j=2}^{r+1}\frac{f_{\ell _{j-1} {\dot{\curlywedge }}_{\ell _j}}}{f_{\ell _{j} {\dot{\curlywedge }}_{\ell _j}}} \right) f_{\ell _{r+1} k} &{}\quad \text {if}\; r\ge 1\\ f_{\ell _{r+1}k} &{}\quad \text {if}\; r=0\end{array}\right. \nonumber \\= & {} u_{l} f_{l k}. \end{aligned}$$

(26)

If $k \in {\mathcal {I}}_-$, $\beta _k<0$ and $f_{lk}<0$. It follows from (26) that $u_l=\frac{\beta _k}{f_k}>0$. If $k \in {\mathcal {I}}_+$, then $\beta _k>0$ and $f_{lk}>0$. It follows from (26) that $u_l=\frac{\beta _k}{f_k}>0$.

Finally, we show that with the given weights, (22) yields the desired inequality. Let $k \in \mathbf{{\mathcal {N}}_0}$. It follows directly from (26) that $\beta _k \le \max _{l\in {\mathcal {L}}} \{u_{l} f_{l k}\}$. We next show that $\beta _k \ge \max \{u_{l} f_{l k} \mid l\in {\mathcal {L}}\}$. If l is not in the tree, the definition of the interval (25) directly implies that $\beta _k\ge u_l f_{lk}$. Consider therefore the situation where l is in the tree. Assume for a contradiction that $\beta _k < u_l f_{lk}$. Let $C_1$ be the set of nodes in the connected component for label l. Then, for any $i\in C_1$, $\beta _i = u_l f_{li}$. This shows that $k\notin C_1$. Choose a node $k^\prime $ in $C_1$ that belongs to ${\mathcal {I}}_-$ if $k\in {\mathcal {I}}_+$ and that belongs to ${\mathcal {I}}_+$ if $k\in {\mathcal {I}}_-$. Because $k^\prime \in C_1$, $u_l=\frac{\beta _{k^{\prime }}}{f_{lk^{\prime }}}$. Our assumption then implies that $\beta _k < \frac{\beta _{k^{\prime }}}{f_{lk^{\prime }}} f_{lk}$, which is a contradiction to the fact that T is feasible. $\square $

Example 5

(continued) Consider the label-connected feasible spanning tree in Fig. 4. Because labels 1 and 3 are adjacent to node 0, we set $u_1=1$ and $u_3=1$. Then, $u_2=\frac{5}{3}$ because $f_{21}=3$ and $f_{11}=5$. Finally, $u_4$ can be any value in [1, 5 / 2], because label 4 does not appear in the tree. Using these weights yields (23). $\square $

We next describe a procedure that expresses a valid inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ that is not facet-defining as a conic combination of “stronger” valid inequalities. In order to express this result, given a vector $\beta \in B_{2}$, we introduce the notation $d_{B_{2}}(\beta ) = \dim (F)$, where F is the face of $B_2$ that contains $\beta $ in its relative interior. Although this result can be proven in a more general setting, the specialized proof we give has the advantage of yielding a low-order polynomial time algorithm for strengthening a valid inequality of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ into a facet-defining inequality.

Proposition 16

Let $\beta ^{\intercal }t \ge 0$ be a valid inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ with $\beta _0<0$ that is not facet-defining, i.e., $d_{B_{2}}(\beta )=k>0$. Then either

1.
There exist two valid inequalities ${\bar{\beta }}^{\intercal }t \ge 0$ and ${\tilde{\beta }}^{\intercal }t \ge 0$ and $\theta \in (0,1)$ such that $\beta = \theta {\bar{\beta }} + (1-\theta ) {\tilde{\beta }}$, $d_{B_{2}}({\bar{\beta }})<k$ and $d_{B_{2}}({\tilde{\beta }})<k$, or
2.
There exists a valid inequality ${\bar{\beta }}^{\intercal }t \ge 0$ such that ${\bar{\beta }} \le \beta $ and $d_{B_{2}}({\bar{\beta }})<k$.

Proof

Let $\beta ^{\intercal }t\ge 0$ be the given inequality. We may assume that $\beta \in B_2,\,\,\beta _i\ge 0$ for $i\in {\mathcal {I}}_+$, and $\beta _j\le 0$ for $j\in {\mathcal {I}}_-$. Define $\delta = \log (|\beta |)$ where $\log (0):=-\infty $.

Given $\delta $, we construct the subgraph $G_{\delta }({\mathcal {I}}_+,{\mathcal {I}}_-)$ of $G({\mathcal {I}}_+,{\mathcal {I}}_-)$ induced by the edges (i, j) for which inequality $\delta _j-\delta _i\le c_{ij}$ is satisfied at equality by $\delta $. Subgraph $G_{\delta }({\mathcal {I}}_+,{\mathcal {I}}_-)$ is disconnected. In fact, if it was connected, any spanning tree would induce $\beta ^{\intercal }t \ge 0$, and being feasible, would contradict the fact that this inequality is not facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$; see Proposition 14.

Let $C_1$ and $C_2$ be the partition of $\mathbf{{\mathcal {N}}_0}$ where $C_1$ is the node set of the connected component of $G_{\delta }({\mathcal {I}}_+,{\mathcal {I}}_-)$ that contains 0 and $C_2=\mathbf{{\mathcal {N}}_0}{\setminus } C_1$. Compute

There is at least one arc connecting $C_1$ with $C_2$ in $G({\mathcal {I}}_+,{\mathcal {I}}_-)$. If not, $C_2\cap {\mathcal {I}}_+= \emptyset $, which means $C_1\cap {\mathcal {I}}_+\supseteq {\mathcal {I}}_+\ne \emptyset $, yielding a contradiction to $C_2\ne \emptyset $. Let $\chi (C)$ denote the indicator vector of C. Clearly, at least one of $\varDelta ^+$ and $\varDelta ^-$ is well-defined. When $\varDelta ^+$ is not well defined then $\chi (C_2)$ (resp. $-\chi (C_1)$) is a recession direction of $D_2$ when $C_2\cap {\mathcal {I}}_-= \emptyset $ (resp. $C_1\cap {\mathcal {I}}_+= \emptyset $) and we express $\delta = (\delta +\varDelta ^-\chi (C_2)) - \varDelta ^-\chi (C_2)$ (resp. $\delta = (\delta -\varDelta ^-\chi (C_1)) + \varDelta ^-\chi (C_1)$). Similarly, when $\varDelta ^-$ is not well defined, $C_2\cap {\mathcal {I}}_+=\emptyset $ then $-\chi (C_2)$ is a recession direction for $D_2$ and we express $\delta = (\delta +\varDelta ^+\chi (C_2)) - \varDelta ^+\chi (C_2)$. Finally, when $\varDelta ^+$ and $\varDelta ^-$ are well defined, we express $\delta = \frac{\varDelta ^+}{\varDelta ^+-\varDelta ^-}(\delta + \varDelta ^-\chi (C_2)) - \frac{\varDelta ^-}{\varDelta ^+-\varDelta ^-}(\delta + \varDelta ^+\chi (C_2))$. The result still works after the transformation $\beta =e^{\delta }$ because for $\varDelta ', \varDelta '' \ge 0$ and some set of nodes C, the perturbation $\delta +\varDelta '\chi (C)$ (resp. $\delta -\varDelta ''\chi (C)$) yields an inequality $\beta '=e^{\varDelta '\chi (C)}\circ \beta $ (resp. $\beta ''=e^{-\varDelta ''\chi (C)}\circ \beta $), where $\circ $ denotes Hadamard product. Since $e^{-\varDelta ''} \le 1\le e^{\varDelta '}$, $\beta $ can be expressed as a convex combination of $\beta '$ and $\beta ''$. The case with recession cones also follows by letting $\varDelta \rightarrow \infty $. Since the size of $C_1$ increases each time, the inequalities we use (if not the trivial recession directions) come from a smaller dimension face of $D_2$ and, hence of $B_2$. $\square $

We now illustrate the procedure used in the proof of Proposition 16.

Example 5

(continued) Consider the inequality $21t_1 -7t_2 + 20t_3 +4t_4 -4t_0\ge 0$ which is valid for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. We next express this inequality as a conic combination of facet-defining inequalities of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ using the procedure in the proof of Proposition 16. Let $\beta =(21,-7,20,4,-4)$. For this vector $\beta $, only the two inequalities $\beta _2+w_{12}\beta _1 \ge 0$ and $\beta _0 + w_{40} \beta _4 \ge 0$ are satisfied at equality. It follows that $C_1=\{4,0\}$ and $C_2 = \{1,2,3\}$. For $f=\nicefrac {12}{7}$ and $g=\nicefrac {20}{21}$, define $\beta ^\prime = (21f,-7f,20f,4,-4)$ and $\beta ^{\prime \prime } = (21g,-7g,20g,4,-4)$ and express

$$\begin{aligned} \beta =\frac{1-g}{f-g}\beta ^{\prime } + \frac{f-1}{f-g} \beta ^{\prime \prime } = \frac{1}{16} \beta ^{\prime } + \frac{15}{16} \beta ^{\prime \prime }. \end{aligned}$$

(27)

We recursively treat $\beta ^1=\beta ^{\prime }$ and $\beta ^2=\beta ^{\prime \prime }$. For $\beta ^1$, $C_1=\{1,2,4,0\}$ and $C_2 = \{3\}$. With $g'=\nicefrac {7}{10}$, let $\beta ^{1,\prime \prime } = (36,-12,\nicefrac {240}{7} \,g',4,-4)$. For $\beta ^2$, $C_1=\{1,2,4,0\}$ and $C_2 = \{3\}$. With $g''=\nicefrac {21}{25}$, we define $\beta ^{2,\prime \prime } = (20,-\nicefrac {20}{3},\nicefrac {400}{21} \,g'',4,-4)$. Then

$$\begin{aligned} \beta ^1= & {} \beta ^{1,\prime \prime } + (0,0,\nicefrac {240}{7}\,(1-g'),0,0)=\beta ^{1,\prime \prime } + (0,0,\nicefrac {72}{7},0,0), \end{aligned}$$

(28)

$$\begin{aligned} \beta ^2= & {} \beta ^{2,\prime \prime } + (0,0,\nicefrac {400}{21}\,(1-g''),0,0)=\beta ^{2,\prime \prime } + (0,0,\nicefrac {64}{21},0,0). \end{aligned}$$

(29)

Both $\beta ^{1,\prime \prime }$ and $\beta ^{2,\prime \prime }$ are facet-defining. Combining (27), (28) and (29), we obtain $\beta = \nicefrac {1}{16} \beta ^{1,\prime \prime } + \nicefrac {15}{16} \beta ^{2,\prime \prime } + \nicefrac {7}{2} (0,0,1,0,0)$. Thus, our valid inequality is a conic combination of (9)b, (9)a and $t_3 \ge 0$ with weights $\nicefrac {4}{16}$, $\nicefrac {60}{16}$, $\nicefrac {7}{2}$, respectively. $\square $

When the c-max cut does not define a facet, which can occur even when $|{\mathcal {L}}|=2$ [42], we use the algorithm in the proof of Proposition 16 to strengthen its coefficients to those of a facet-defining inequality.

Proposition 17

Any c-max cut can be expressed as a conic combination of a single nontrivial facet-defining inequality together with trivial inequalities. Moreover, the coefficients of the c-max cut and those of the single nontrivial facet-defining inequality are identical for each $i\in {\mathcal {I}}_+$.

Proof

In the proof of Proposition 16, when $C_1\supseteq {\mathcal {I}}_+$, $\varDelta ^-$ is not defined and the inequality is expressed as a conic combination of a tighter valid inequality and a trivial inequality. The coefficients of the variables in $C_1$ do not change. Thus, we only need to show that throughout $C_1\supseteq {\mathcal {I}}_+$. This is true at the beginning because the coefficient for each $i\in {\mathcal {I}}_+$ is derived from inequality $l'\in \arg \max _{l} f_{li}$. It is also true at each subsequent step because $C_1$ grows at each step. $\square $

The question of when the c-max cut is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ with $|{\mathcal {L}}|=2$, was raised but left open in [42]. The proofs of Propositions 16 and 17answer this question in the general case where $|{\mathcal {L}}| \ge 2$. A c-max cut $\beta ^{\intercal }t\ge 0$ is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ precisely when $C_2=\emptyset $ at the first step in the proof of Proposition 16. Since ${\mathcal {I}}_+\subseteq C_1$, this condition can be restated as each node $j \in {\mathcal {I}}_-{\setminus }\{0\}$ is such that $\beta _j + w_{ij} \beta _i=0$ for some $i \in {\mathcal {I}}_+$, yielding the following.

Corollary 4

A c-max cut $\beta ^T t\ge 0$ is facet-defining for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ if and only if for each $j \in {\mathcal {I}}_-\backslash \{0\}$, $\beta _i=f_{li}$ and $\beta _j=f_{lj}$ for some $l\in {\mathcal {L}}$ and $i\in {\mathcal {I}}_+$. $\square $

We may also use the constructive procedure used in the proof of Proposition 16 to design an algorithm to “tighten” a valid inequality for $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$ into a facet-defining of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$. We choose to develop such an algorithm in the space of $\delta $ variables. A similar procedure could be developed in the space of $\beta $ variables. The underlying idea is to expand the subgraph of tight equalities in $D_{1}$ for the given $\delta $ into a connected graph, while maintaining feasibility for the non-tight inequalities.

The constructive procedure is presented in Algorithm 1 and is a variation of Prim’s algorithm for minimum weight spanning trees; see [43]. It requires sets ${\mathcal {I}}_+$ and ${\mathcal {I}}_-$, a coefficient vector $\delta \in D_{1}$, and the matrix $C=[c_{ij}]$ where $c_{ij}=\log (w_{ij})$. Define $s_{ji}=s_{ij}=c_{ij}-\delta _j+\delta _i$ for $i\in {\mathcal {I}}_+$ and $j\in {\mathcal {I}}_-$. Since $\delta \in D_{1}$, $s_{ji}=s_{ij} \ge 0$. If $\text {key}^*$ is the key on termination then $\delta +\text {key}^*$ gives the coefficient vector of the desired facet-defining inequality. We refer to “Appendix” for more detail.

Theorem 4

From any valid inequality of $\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)$, Algorithm 1 constructs a facet-defining inequality in time $O(e+n\log n)$, where e and n are the number of edges and nodes of $G({\mathcal {I}}_+,{\mathcal {I}}_-)$, respectively. Further, the facet obtained contains the face defined by the initial inequality. $\square $

8 Conclusion

Considerable attention has been paid to optimization problems with cardinality constraints. Given an LP relaxation and a basic solution that does not satisfy the cardinality requirement, we derive violated valid inequalities, which are facet-defining for a disjunctive relaxation of the problem. Separation is carried out by solving a network-flow problem in the original problem space instead of a higher-dimensional cut-generation LP. We show that facet-defining inequalities can be associated with label-connected feasible spanning trees of a suitably defined bipartite graph and, consequently, derive various insights into their structure and validity. Using these insights, we modify the recently proposed E&R procedure, which generates cuts for complementarity problems, to the more general setting involving cardinality constraints. Our analysis reveals conditions under which the c-max cut, a cut widely used in the complementarity literature, is not facet-defining and can be improved using a simple procedure. More generally, we develop a fast separation procedure that tightens valid inequalities into facet-defining inequalities for our relaxation using a Prim-type combinatorial algorithm.

References

Avella, P., Sassano, A.: On the $p$-median polytope. Math. Program. 89(3), 395–411 (2001)
MathSciNet MATH Google Scholar
Balas, E.: Intersection cuts—a new type of cutting planes for integer programming. Oper. Res. 19(1), 19–39 (1971)
MathSciNet MATH Google Scholar
Balas, E.: Disjunctive programming. Ann. Discrete Math. 5, 3–51 (1979)
MathSciNet MATH Google Scholar
Balas, E., Perregaard, M.: A precise correspondence between lift-and-project cuts, simple disjunctive cuts, and mixed integer gomory cuts for 0–1 programming. Math. Program. 94(2–3), 221–245 (2003)
MathSciNet MATH Google Scholar
Bazaraa, M.S., Sherali, H.D.: On the use of exact and heuristic cutting plane methods for the quadratic assignment problem. J. Oper. Res. Soc. 33(11), 991–1003 (1982)
MathSciNet MATH Google Scholar
Beale, E.M.L., Tomlin, J.A.: Special facilities in a general mathematical programming system for non-convex problems using ordered sets of variables. In: Lawrence, J. (ed.) Prodeedings of the Fifth International Conference on Operational Research, pp. 447–454. Tavistock Publications, London (1970)
Google Scholar
Beasley, J.E.: A note on solving large $p$-median problems. Eur. J. Oper. Res. 21(2), 270–273 (1985)
MathSciNet MATH Google Scholar
Beasley, J.E., Meade, N., Chang, T.J.: An evolutionary heuristic for the index tracking problem. Eur. J. Oper. Res. 148(3), 621–643 (2003)
MathSciNet MATH Google Scholar
Bertsimas, D., King, A., Mazumder, R.: Best subset selection via a modern optimization lens. Ann. Stat. 44(2), 813–852 (2016)
MathSciNet MATH Google Scholar
Bertsimas, D., Shioda, R.: Algorithm for cardinality-constrained quadratic optimization. Comput. Optim. Appl. 43(1), 1–22 (2009)
MathSciNet MATH Google Scholar
Bienstock, D.: Computational study of a family of mixed-integer quadratic programming problems. Math. Program. 74(2), 121–140 (1996)
MathSciNet MATH Google Scholar
Brøndsted, A.: An Introduction to Convex Polytopes, vol. 90. Springer, Berlin (1983)
MATH Google Scholar
Chang, T.J., Meade, N., Beasley, J.E., Sharaiha, Y.M.: Heuristics for cardinality constrained portfolio optimisation. Comput. Oper. Res. 27(13), 1271–1302 (2000)
MATH Google Scholar
Christof, T.: Ein verfahren zur transformation zwischen polyederdarstellungen. Ph.D. thesis, Diplomarbeit, Universität Augsburg (1991)
Christof, T., Löbel, A.: PORTA—polyhedron representation transformation algorithm. Available at https://comopt.ifi.uni-heidelberg.de/software/PORTA/ (2008)
Christofides, N., Beasley, J.E.: A tree search algorithm for the $p$-median problem. Eur. J. Oper. Res. 10(2), 196–204 (1982)
MathSciNet MATH Google Scholar
Crama, Y., Schyns, M.: Simulated annealing for complex portfolio selection problems. Eur. J. Oper. Res. 150(3), 546–571 (2003)
MATH Google Scholar
d’Aspremont, A., El Ghaoui, L., Jordan, M.I., Lanckriet, G.R.: A direct formulation for sparse PCA using semidefinite programming. SIAM Rev. 49(3), 434–448 (2007)
MathSciNet MATH Google Scholar
Ehrgott, M., Klamroth, K., Schwehm, C.: An MCDM approach to portfolio optimization. Eur. J. Oper. Res. 155(3), 752–770 (2004)
MathSciNet MATH Google Scholar
de Farias Jr, I.R.: A family of facets for the uncapacitated $p$-median polytope. Oper. Res. Lett. 28(4), 161–167 (2001)
MathSciNet MATH Google Scholar
de Farias Jr, I.R., Johnson, E.L., Nemhauser, G.L.: Branch-and-cut for combinatorial optimization problems without auxiliary binary variables. Knowl. Eng. Rev. 16(1), 25–39 (2001)
MATH Google Scholar
de Farias Jr, I.R., Nemhauser, G.L.: A polyhedral study of the cardinality constrained knapsack problem. Math. Program. 96(3), 439–467 (2003)
MathSciNet MATH Google Scholar
Gilli, M., Këllezi, E.: The threshold accepting heuristic for index tracking. In: Pardalos, P.M., Tsitsiringos, V.K. (eds.) Financial Engineering, E-Commerce and Supply Chain, pp. 1–18. Springer, Berlin (2002)
MATH Google Scholar
Glover, F.: Convexity cuts and cut search. Oper. Res. 21(1), 123–134 (1973)
MathSciNet MATH Google Scholar
Gomory, R.: An algorithm for the mixed integer problem. Tech. Rep. RM-2597, The Rand Corporation (1960)
Gomory, R.E.: Some polyhedra related to combinatorial problems. Linear Algebra Appl. 2, 341–375 (1969)
MathSciNet MATH Google Scholar
Guan, Y., Dy, J.G.: Sparse probabilistic principal component analysis. In: International Conference on Artificial Intelligence and Statistics, pp. 185–192 (2009)
Güler, O.: Foundations of Optimization, vol. 258. Springer, Berlin (2010)
MATH Google Scholar
Ibaraki, T.: The use of cuts in complementary programming. Oper. Res. 21(1), 353–359 (1973)
MathSciNet MATH Google Scholar
Järvinen, P., Rajala, J., Sinervo, H.: A branch-and-bound algorithm for seeking the $p$-median. Oper. Res. 20(1), 173–178 (1972)
MATH Google Scholar
Jeurissen, R., Van den Berg, J.: Index tracking using a hybrid genetic algorithm. In: 2005 ICSC Congress on Computational Intelligence Methods and Applications, pp. 1–6. IEEE (2005)
Jeurissen, R., van den Berg, J.: Optimized index tracking using a hybrid genetic algorithm. In: IEEE Congress on Evolutionary Computation (CEC 2008), pp. 2327–2334. IEEE (2008)
Jobst, N.J., Horniman, M.D., Lucas, C.A., Mitra, G.: Computational aspects of alternative portfolio selection models in the presence of discrete asset choice constraints. Quant. Finance 1(5), 489–501 (2001)
MathSciNet MATH Google Scholar
Journée, M., Nesterov, Y., Richtárik, P., Sepulchre, R.: Generalized power method for sparse principal component analysis. J. Mach. Learn. Res. 11, 517–553 (2010)
MathSciNet MATH Google Scholar
Kariv, O., Hakimi, S.L.: An algorithmic approach to network location problems. II: the $p$-medians. SIAM J. Appl. Math. 37(3), 539–560 (1979)
MathSciNet MATH Google Scholar
Kis, T.: Lift-and-project for general two-term disjunctions. Discrete Optim. 12, 98–114 (2014)
MathSciNet MATH Google Scholar
Laundy, R.: Some logically constrained mathematical programming problems. Ph.D. thesis, University of Southampton (1983)
Loraschi, A., Tomassini, M., Tettamanzi, A., Verda, P.: Distributed genetic algorithms with an application to portfolio selection problems. In: Pearson, D.W., Steele, N.C., Albrecht, R.F. (eds.) Artificial Neural Nets and Genetic Algorithms, pp. 384–387. Springer, Berlin (1995)
Google Scholar
Mansini, R., Speranza, M.G.: Heuristic algorithms for the portfolio selection problem with minimum transaction lots. Eur. J. Oper. Res. 114(2), 219–233 (1999)
MATH Google Scholar
Moral-Escudero, R., Ruiz-Torrubiano, R., Suarez, A.: Selection of optimal investment portfolios with cardinality constraints. In: IEEE Congress on Evolutionary Computation (CEC 2006), pp. 2382–2388. IEEE (2006)
Narula, S.C., Ogbu, U.I., Samuelsson, H.M.: An algorithm for the $p$-median problem. Oper. Res. 25(4), 709–713 (1977)
MathSciNet MATH Google Scholar
Nguyen, T.T.: Convexification techniques for complementarity and multilinear constraints. Ph.D. thesis, University of Florida (2013)
Prim, R.C.: Shortest connection networks and some generalizations. Bell Syst. Tech. J. 36(6), 1389–1401 (1957)
Google Scholar
Raghavachari, M.: On connections between zero-one integer programming and concave programming under linear constraints. Oper. Res. 17(4), 680–684 (1969)
MathSciNet MATH Google Scholar
Rockafellar, R.T.: Convex Analysis, vol. 28. Princeton University Press, Princeton (1997)
MATH Google Scholar
Ruiz-Torrubiano, R., Suárez, A.: A hybrid optimization approach to index tracking. Ann. Oper. Res. 166(1), 57–71 (2009)
MathSciNet MATH Google Scholar
Schrijver, A.: Theory of Linear and Integer Programming. Wiley, New York (1998)
MATH Google Scholar
Shapcott, J.: Index Tracking: Genetic Algorithms for Investment Portfolio Selection. Edinburgh Parallel Computing Centre, Edinburgh (1992)
Google Scholar
Tuy, H.: Concave programming under linear constraints. Sov. Math. 5, 1437–1440 (1964)
MATH Google Scholar
Zeng, B., Richard, J.-P.P.: A polyhedral study on 0–1 knapsack problems with disjoint cardinality constraints: facet-defining inequalities by sequential lifting. Discrete Optim. 8(2), 277–301 (2011)
MathSciNet MATH Google Scholar
Zeng, B., Richard, J.-P.P.: A polyhedral study on 0–1 knapsack problems with disjoint cardinality constraints: strong valid inequalities by sequence-independent lifting. Discrete Optim. 8(2), 259–276 (2011)
MathSciNet MATH Google Scholar
Zhang, Y., Burer, S., Street, W.N.: Ensemble pruning via semi-definite programming. J. Mach. Learn. Res. 7, 1315–1338 (2006)
MathSciNet MATH Google Scholar
Ziegler, G.M.: Lectures on Polytopes, vol. 152. Springer, Berlin (1995)
MATH Google Scholar
Zou, H., Hastie, T., Tibshirani, R.: Sparse principal component analysis. J. Comput. Graph. Stat. 15(2), 265–286 (2006)
MathSciNet Google Scholar

Download references

Author information

Authors and Affiliations

Mitchell College of Business, University of South Alabama, 307 N. University Blvd., Mobile, AL, 36608-3074, USA
Jinhak Kim
Krannert School of Management, Purdue University, 100 S. Grant Street, West Lafayette, IN, 47907-2076, USA
Mohit Tawarmalani
Department of Industrial and Systems Engineering, University of Florida, 303 Weil Hall, Gainesville, FL, 32611-6595, USA
Jean-Philippe P. Richard

Authors

Jinhak Kim
View author publications
You can also search for this author in PubMed Google Scholar
Mohit Tawarmalani
View author publications
You can also search for this author in PubMed Google Scholar
Jean-Philippe P. Richard
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Jinhak Kim or Mohit Tawarmalani.

Additional information

This work was supported by NSF CMMI Grants 1234897, 1235236, 1727989 and 1728031.

Appendix: Proof of Theorem 4

Proof

For a given node set $X\subseteq \mathbf{{\mathcal {N}}_0}$, the operation Extract-Min(X) (resp. Extract-Max(X)) removes and returns the element of X with the smallest (resp. largest) key. We also define $\text {Min}(X):=\min _{i\in X}\text {key}[i]$ and $\text {Max}(X):=\max _{i\in X}\text {key}[i]$. We denote by ${\mathcal {Q}}_+$ (resp. ${\mathcal {Q}}_-$) the max-priority queue in ${\mathcal {I}}_+$ (resp. min-priority queue in ${\mathcal {I}}_-$) whose keys are not yet finalized. We let ${\mathcal {Q}}={\mathcal {Q}}_+\cup {\mathcal {Q}}_-$. For a node v, $\text {Adj}[v]$ is the set of nodes adjacent to v in ${\mathcal {Q}}$.

Let H represent the sequence of nodes extracted during the successive iterations of the while loop. For nodes i and j, we write $j \prec i$ if j occurs in H before i. We define $\text {key}[0]=0$. Then, for $i\in {\mathcal {Q}}_+$ (resp. ${\mathcal {Q}}_-\backslash \{0\}$), we define $\text {key}[i]$ to be $\max _{j\in {\mathcal {I}}_-\backslash {\mathcal {Q}}_-, j \prec i} -s_{ij} + \text {key}[j]$ (resp. $\min _{j\in {\mathcal {I}}_+\backslash {\mathcal {Q}}_+, j\prec i} s_{ij}+\text {key}[j]$). We use induction to show that keys follow this definition. The base case can be verified via the initial assignment of keys and the convention that $\min \{\emptyset \}=\infty $. If we assume that the keys satisfy the above definition before the iteration, then since $k\prec i$ for any $i\in {\mathcal {Q}}\backslash \{k\}$, the definition remains valid after the step 11 (resp. step 13). It is clear that once a node is extracted, the key of the node never changes in the remainder of the algorithm.

We first show that $\min ({\mathcal {Q}}_-)-\text {key}[k]\ge 0$ and $\text {key}[k] - \max ({\mathcal {Q}}_+)\ge 0$ at step 4 of the algorithm. This is clearly true for the base case. We assume that these inequalities are true and we choose to extract $k'$ at either step 5 or step 7. We will only argue that the above inequalities hold for $k'$ selected at step 5 because the other case is similar. We first argue that the result holds before step 9. If $k'$ was selected at step 5, i.e., $k'\in {\mathcal {Q}}_-$, then the first inequality holds because $k'$ was chosen to be the node with minimum key in ${\mathcal {Q}}_-$. The second inequality holds because $\text {key}[k']-\text {key}[k]\ge 0$ and $\text {key}[k]-\max ({\mathcal {Q}}_+)\ge 0$ by induction hypothesis. Now, we show that these inequalities continue to hold until the step 4 of the next iteration. In particular, observe that for $j\in {\mathcal {Q}}_-$ (resp. $j\in {\mathcal {Q}}_+$), since $\text {key}[j]\ge \text {key}[k']$ (resp. $\text {key}[j]\le \text {key}[k']$) before the update in step 13 (resp. step 11) and $s_{jk'}\ge 0$, it remains so after the update as well.

Now, we show that at each iteration of the algorithm where node $k'$ is extracted, $\delta + \sum _{j \preceq k'} \text {key}[j]\chi (\{j\}) + \text {key}[k']\chi ({\mathcal {Q}})$ defines a valid cut. This is trivially true for the base case. We now consider the case when $k'$ is extracted. The incremental change to the vector is $(\text {key}[k']-\text {key}[k])\chi ({\mathcal {Q}}\cup \{k'\})$, where k immediately precedes $k'$ in H. Clearly, this change does not affect any inequality in $D_{1}$ expressed for nodes i and j which both precede $k'$ or both succeed $k'$. Therefore, we only need to concern ourselves with an inequality with respect to i and j where $i\preceq k'\preceq j$. Assume $j\in {\mathcal {Q}}_+$. If $k'\in {\mathcal {I}}_+$, then the result follows because $0\le s_{ij} + \text {key}[j] - \text {key}[i] \le s_{ij} + \text {key}[k'] - \text {key}[i]$ because $k'$ is the maximizer in ${\mathcal {Q}}_+$. On the other hand, if $k'\in {\mathcal {I}}_-$, then $s_{ij} + \text {key}[k'] - \text {key}[i] \ge 0$ if $k'=i$ and $s_{ij} + \text {key}[k'] - \text {key}[i] \ge s_{ij} + \text {key}[k] - \text {key}[i] \ge 0$, where the first inequality follows because $\text {key}[k']-\text {key}[k]\ge 0$ by our earlier proof and the second inequality by the induction hypothesis and because $\text {key}[i]$ was not updated. The proof for the case $j\in {\mathcal {Q}}_-$ is similar.

It follows from the definition of keys that at least one of the inequalities with respect to $k'$ and its predecessors becomes tight. Since the procedure only stops when all the nodes are visited, it follows that the graph of tight inequalities is connected at the end and $\delta + \sum _{j\in \mathbf{{\mathcal {N}}_0}} \text {key}^*[j]\chi (\{j\})$ defines a facet-defining inequality.

We now show that all the tight inequalities remain tight during the procedure. In particular, assume $s_{ij} = 0$. Assume $j\preceq i$ where $j\in {\mathcal {I}}_-$ (the proof for $j \in {\mathcal {I}}_+$ is similar). Clearly, when $k=j$ at step 4, $\text {key}[j]\ge \text {key}[i]$. However, $\text {key}[i]\ge \text {key}[j]$ because of the previous update at step 11. Therefore, $\text {key}[j]-\text {key}[i]=0$. Then, because of the condition in step 4, the keys added match $\text {key}[i]$. Therefore, there is no update to $\text {key}[i]$ because $\text {key}[i]\ge -s_{ik} + \text {key}[k]$ follows from $s_{ik}\ge 0$ and $\text {key}[i]=\text {key}[k]$.

The above algorithm can be implemented using heaps for both ${\mathcal {Q}}_+$ and ${\mathcal {Q}}_-$. If the graph $G({\mathcal {I}}_+,{\mathcal {I}}_-)$ has n nodes and e edges, it requires O(n) Insert, O(n) Min, O(n) Max, O(n) Extract-Min and Extract-Max, and O(e) Decrease-Key operations. With Fibonacci heaps, the running time is $O(e+n\log n)$ which exactly matches that of Prim’s algorithm. $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Kim, J., Tawarmalani, M. & Richard, JP.P. On cutting planes for cardinality-constrained linear programs. Math. Program. 178, 417–448 (2019). https://doi.org/10.1007/s10107-018-1306-0

Download citation

Received: 12 May 2016
Accepted: 23 May 2018
Published: 06 June 2018
Issue Date: November 2019
DOI: https://doi.org/10.1007/s10107-018-1306-0

Keywords

Mathematics Subject Classification

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

On cutting planes for cardinality-constrained linear programs

Abstract

Similar content being viewed by others

Cutting planes from extended LP formulations

A new family of facet defining inequalities for the maximum edge-weighted clique problem

Intersection Cuts for Polynomial Optimization

1 Introduction

2 Overview of results

3 Disjunctive relaxation of a cardinality-constrained simplex tableau

Remark 1

Proposition 1

Proposition 2

Assumption 1

Proposition 3

Proof

4 A characterization of \(\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q)\)

Proposition 4

Corollary 1

Proposition 5

Proof

Corollary 2

Proposition 6

Proof

Proposition 7

Proof

Proposition 8

Proof

Example 1

Assumption 2

Proposition 9

Proof

Theorem 1

Proof

Example 1

5 Dual network formulation of \(B_{2}\)

Proposition 10

Lemma 1

Proof

Proposition 11

Proof

Proposition 12

Proof

Corollary 3

Proposition 13

Proof

Theorem 2

6 Label-connected trees and facet-defining inequalities of \(\mathop {\mathrm{cl}}\mathop {\mathrm{conv}}(Q^0)\)

Example 2

Proposition 14

Proof

Example 2

Example 3

Lemma 2

Proof

Example 3

Theorem 3

Proof

Example 4

7 A fast algorithm to generate a facet-defining inequality

Example 5

Proposition 15

Proof

Example 5

Proposition 16

Proof

Example 5

Proposition 17

Proof

Corollary 4

Theorem 4

8 Conclusion

References

Author information

Authors and Affiliations

Corresponding authors

Additional information

Appendix: Proof of Theorem 4

Appendix: Proof of Theorem 4

Proof

Rights and permissions