1 Introduction

The simplex method is one of the most important methods for solving linear programs (LPs), that is, optimization problems of the form \(\max \{{\langle {\mathbf {c}}, {\mathbf {x}}\rangle : \mathbf {x} \in P}\}\) where P is a polyhedron defined by linear constraints. Starting from an initial vertex \(\mathbf {v}\), a simplex algorithm provides a rule for moving from vertex to vertex along edges of the graph or 1-skeleton of P until an optimal vertex \(\mathbf {w}\) (or an unbounded ray) is found.

A long standing open question is whether there exists a polynomial-time simplex algorithm for LPs. The first obstacle in proving the existence (or non-existence) of such a method is the following fundamental question:

Question 1

Given any two vertices \(\mathbf {v},\mathbf {w}\) of a polyhedron P, what is the best possible bound on the length of the shortest path between them, as a function of the dimension n and the number of constraints m?

The polynomial Hirsch conjecture posits that the diameter of the graph of a polyhedron is bounded by a polynomial in m and n. The best known general upper bounds are however much larger. Barnette [3] and Larman [14] proved a bound of \(O(2^n m)\), and Todd [20] recently proved a bound of \((m - n)^{\log n}\), slightly improving an earlier bound of Kalai and Kleitman [12, 13]. The original Hirsch conjecture, which posited a bound of \(m-n\), was recently disproved for polytopes (i.e.  bounded polyhedra) by Santos [15, 17], who gave a lower bound of \((1+\varepsilon )(m-n)\) (only slightly violating the conjectured bound).

Given the difficulty of the general question, much research has been aimed at bounding the diameter of special classes of polyhedra. For example, polynomial bounds have been given for 0 / 1 polytopes [16], transportation polytopes [2, 6, 9], and flag polytopes [1].

Another important class, which has recently received much attention and is directly related to this work, are polyhedra whose constraint matrices are “well-conditioned”. Dyer and Frieze [10] showed that the diameter of totally unimodular polyhedra—i.e. having integer constraint matrices with all subdeterminants in \(\{{0,\pm 1}\}\)—is bounded by \(O(n^{16}m^3(\log nm)^3)\). Their work also contains a polynomial time randomized simplex algorithm that solves linear programs over totally unimodular polyhedra.

The diameter bound of Dyer and Frieze was both generalized and improved in the work of Bonifas et al. [4]. They showed that polyhedra with integer constraint matrices and all subdeterminants bounded by \(\Delta \) have diameter \(O(\Delta ^2 n^4 \log (n \Delta ))\) if they are unbounded and \(O(\Delta ^2 n^{3.5} \log (n \Delta ))\) if they are bounded, where \(\Delta =1\) corresponds to the totally unimodular setting. Their proof used certain expansion properties of the polyhedral graph and was non-constructive.

In an attempt to make the bound of [4] constructive, Brunsch and Röglin [7] showed that given any two vertices \(\mathbf {v},\mathbf {w}\) on such a polyhedron P, a path between them of length \(O(m \Delta ^4 n^4)\) (note the dependence on m) can be constructed using the shadow simplex method. In fact, they give a more general bound based on the so-called \(\delta \)-distance property of the constraint matrix, which measures how “well spread” the rows of the constraint matrix are.Footnote 1 Using this parameter they give a bound of \(O(m n^2/\delta ^2)\) on the length of the constructed path, and recover the previous bound by the relationship \(\delta \ge 1/(n \Delta ^2)\). Note that this relationship gives \(\delta \ge 1/n\) for any n-dimensional totally unimodular polyhedron.

Most recently, Eisenbrand and Vempala [11] provided a different approach to making the Bonifas et al. [4] result constructive, which more closely resembles the random walk approach of Dyer and Frieze and also extends to optimization. When the constraint matrix satisfies the \(\delta \)-distance property, they show that given an initial vertex and objective, an optimal vertex can be computed using \(\mathrm{poly}(n,1/\delta )\) random walk steps (no dependence on m). Furthermore, an initial feasible vertex can be computed using m calls to their optimization algorithm over subsets of the original constraints.

2 Results

Building and improving upon the works of Bonifas et al. [4], Brunsch and Röglin [7], and Vempala and Eisenbrand [11], we give an improved (constructive) diameter bound and simplex algorithm for polyhedra satisfying the \(\delta \)-distance and other related properties. We also make improvements in the treatment of unbounded polyhedra and degeneracy. All our results are based on a new variant and analysis of the shadow simplex method.

We now introduce the “discrete curvature measures” we use along with the corresponding results. We list these measures in order of increasing strength. In the next section, we shall explain our variant of the shadow simplex method and compare it with previous implementations.

Let \(\Vert \mathbf {x}\Vert = \sqrt{\sum _{i=1}^n \mathbf {x}_i^2}\), for \(\mathbf {x} \in \mathbb {R}^n\), denote the Euclidean norm, \(\mathcal {B}_2^n = \{{\mathbf {x} \in \mathbb {R}^n: \Vert \mathbf {x}\Vert \le 1}\}\) the unit ball, and \(\mathbb {S}^{n-1} = \partial \mathcal {B}_2^n\) the unit sphere. For vectors \(\mathbf {y}_1,\dots ,\mathbf {y}_k \in \mathbb {R}^n\), we define \(\mathrm{cone}\,(\mathbf {y}_1,\dots ,\mathbf {y}_k) = \big \{{\sum _{i=1}^k \lambda _i \mathbf {y}_i: \lambda _i \ge 0, i \in [k]}\big \}\) to be the cone they generate.

Let \(P = \{{\mathbf {x} \in \mathbb {R}^n: A\mathbf {x} \le \mathbf {b}}\}\), \(A \in \mathbb {R}^{m \times n}\), \(\mathbf {b} \in \mathbb {R}^m\) be a pointed polyhedron (A has full column rank \(\Leftrightarrow \) P has vertices). For a vertex \(\mathbf {v}\) of P, the normal cone at \(\mathbf {v}\) is \(N_\mathbf {v} = \big \{{\sum _{i \in I_{\mathbf {v}}} \lambda _i \mathbf {a}_i: \lambda _i \ge 0, i \in I_{\mathbf {v}}}\big \}\), where \(I_{\mathbf {v}} = \{{i \in [m]: \langle {\mathbf {a}_i}, {\mathbf {v}}\rangle = \mathbf {b}_i}\}\) is the set of tight constraints. Equivalently, \(N_{\mathbf {v}}\) is the set of all linear objective functions whose maximum over P is attained at \(\mathbf {v}\). \(N_{\mathbf {v}}\) is simplicial (non-degenerate) if it is generated by a basis of A, that is, if exactly n linearly independent constraints of P are tight at \(\mathbf {v}\). The normal fan of P is the collection of all the vertex normal cones, and the support of the normal fan N(P) is their union. A polyhedron is simple (or non-degenerate) if all its vertex normal cones are simplicial.

Definition 2

(\(\tau \)-Wide Polyhedra) We say that a cone C is \(\tau \)-wide if it contains a Euclidean ball of radius \(\tau \) centered on the unit sphere. We define a polyhedron P to have a \(\tau \)-wide normal fan (or simply P to be \(\tau \)-wide) if every vertex normal cone is \(\tau \)-wide.

figure a

Roughly speaking, having a \(\tau \)-wide normal fan enforces that facets always intersect at “sharp angles” (i.e. angle bounded away from \(\pi \)). In particular, for any vertex \(\mathbf {v}\) of P, the angle between any two rays emanating from \(\mathbf {v}\) and (non-trivially) passing through P is at most \(\pi -2\tau \). Hence one can interpret this condition as a discrete form of curvature for polyhedra. We now state our diameter bound for \(\tau \)-wide polyhedra.

Theorem 3

(Diameter Bound, see Theorem 11) Let \(P \subseteq \mathbb {R}^n\) be an n-dimensional pointed polyhedron having a \(\tau \)-wide normal fan. Then the graph of P has diameter bounded by \(8n/\tau (1+\ln (1/\tau ))\). Furthermore, a path of this expected length can be constructed via the shadow simplex method.

Restricting to n-dimensional polyhedra with subdeterminants bounded by \(\Delta \), using the relation \(\tau \ge 1/(n\Delta )^2\) (see Lemma 26) we achieve a bound of \(O(n^3\Delta ^2\ln (n\Delta ))\), improving on the existential bounds of Bonifas et al. [4]. In contrast to [4], we note that our bound (and proof) is the same for polytopes and unbounded polyhedra.

While our bound is constructive—we follow a shadow simplex path—it is in general only efficiently implementable when the polyhedron is simple. In the presence of degeneracy, we note that computing a single edge of the path is essentially as hard as solving linear programming. Furthermore, standard techniques for removing degeneracy, such as the perturbation or lexicographic method, may unfortunately introduce a large number of extra simplex pivots.

Interestingly, our diameter bound can take advantage of degeneracy in situations where it makes the normal cones wider. While degeneracy does not occur for “generic polyhedra”, it is very common for combinatorial polytopes. Furthermore, it can occur in ways that are useful to our diameter bound. For example, we remark that using degeneracy one can prove that the normal fan of the perfect matching polytope is \(\Omega (1/\sqrt{|E|})\)-wide, see Sect. 7.

To solve linear optimization problems via the shadow simplex method, we will need more than a wide normal fan. In fact, we will have different requirements for the two phases of the simplex algorithm: Phase 1, which finds an initial feasible vertex, will require more than Phase 2, which finds an optimal vertex with respect to the objective starting from a feasible vertex.

Definition 4

(\(\delta \)-Distance property) A set of linearly independent vectors \(\mathbf {v}_1,\dots ,\mathbf {v}_k \in \mathbb {R}^n\) satisfies the \(\delta \)-distance property if for every \(i \in [k]\), the vector \(\mathbf {v}_i\) is at Euclidean distance at least \(\delta \Vert \mathbf {v}_i\Vert \) from the span of \(\{{\mathbf {v}_j: j \in [k] \setminus \left\{ i\right\} }\}\).

For a polyhedron \(P = \{{\mathbf {x} \in \mathbb {R}^n: A\mathbf {x} \le \mathbf {b}}\}\), we define P to satisfy the local \(\delta \)-distance property if every feasible basis of A, i.e. the rows of A defining a vertex of P, satisfies the \(\delta \)-distance property.

We say that a set of vectors \(\mathbf {v}_1,\ldots ,\mathbf {v}_m \in \mathbb {R}^n\) satisfies the global \(\delta \)-distance property if every linearly independent subset satisfies the \(\delta \)-distance property. We say that a matrix \(A \in \mathbb {R}^{m \times n}\) satisfies the global \(\delta \)-distance property if its row vectors do.

Lemma 5

Let \(\mathbf {v}_1,\dots ,\mathbf {v}_n \in \mathbb {S}^{n-1}\) be a basis satisfying the \(\delta \)-distance property. Then \(\mathrm{cone}(\mathbf {v}_1,\dots ,\mathbf {v}_n)\) is \(\delta /n\)-wide.

Proof

Let \(\mathbf {v}_1^*,\dots ,\mathbf {v}_n^*\) be the dual basis satisfying \(\langle {\mathbf {v}_i,} {\mathbf {v}_j^*}\rangle = 1\) if \(i=j\) and 0 otherwise. By the \(\delta \)-distance property

$$\begin{aligned} \langle {\mathbf {v}_i}, {\mathbf {v}_i^*/\Vert \mathbf {v}_i^*\Vert }\rangle \ge \delta \quad \Rightarrow \quad \Vert \mathbf {v}_i^*\Vert \le 1/\delta . \end{aligned}$$

Note that \(\mathbf {x} \in \mathrm{cone}(\mathbf {v}_1,\dots ,\mathbf {v}_n)\) iff \(\langle {\mathbf {v}_i^*}, {\mathbf {x}}\rangle \ge 0\), for all \(i \in [n]\). Let \(\bar{\mathbf {v}} = \sum _{i=1}^n \mathbf {v}_i/n\), and note that \(\Vert \bar{\mathbf {v}}\Vert \le 1\), since it is an average of unit vectors.

We will show that \(\bar{\mathbf {v}} + \frac{\delta }{n} \mathcal {B}_2^n \subseteq \mathrm{cone}(\mathbf {v}_1,\dots ,\mathbf {v}_n)\), which suffices to prove the lemma. Take any vector \(\mathbf {e}\), \(\Vert \mathbf {e}\Vert \le \delta /n\). Then for any \(i \in [m]\), note that

$$\begin{aligned} \langle {\mathbf {v}_i^*}, {\bar{\mathbf {v}}+\mathbf {e}}\rangle = 1/n + \langle {\mathbf {v}_i^*}, {\mathbf {e}}\rangle \ge 1/n - \Vert \mathbf {v}_i^*\Vert \Vert \mathbf {e}\Vert \ge 0. \end{aligned}$$

Hence \(\bar{\mathbf {v}} + \mathbf {e} \in \mathrm{cone}(\mathbf {v}_1,\dots ,\mathbf {v}_n)\), as needed. \(\square \)

The definitions differ in strength mainly based on the sets of bases to which they apply. The local \(\delta \)-distance property is stronger than the \(\tau \)-wide property for \(\tau = \delta /n\), because it implies that all triangulations of the normal fan are \(\tau \)-wide.Footnote 2 The global property is stronger than the local property since it applies also to infeasible bases, which allows one to control the geometry of polyhedra related to P, such as polyhedra obtained by removing a subset of constraints, which will be needed for Phase 1.

We now state our main result for Phase 2 simplex.

Theorem 6

(Optimization via Shadow Simplex, see Theorem 25) Let \(P = \{{\mathbf {x} \in \mathbb {R}^n: A\mathbf {x} \le \mathbf {b}}\}\) be an n-dimensional polytope with m constraints satisfying the local \(\delta \)-distance property. Then, given an objective \(\mathbf {c} \in \mathbb {R}^n\) and a vertex \(\mathbf {v}\) of P, an optimal vertex can be computed using an expected \(O((n^3 / \delta ) \ln (n/\delta ))\) shadow simplex pivots, where each pivot requires O(mn) arithmetic operations.

Our Phase 2 algorithm above is faster than the algorithms in [7, 11] and relies on a weaker assumption than [11]. The \(\mathbf {v}\),\(\mathbf {w}\) path finding algorithm of Brunsch and Röglin [7] is in fact a special case of the above, since we can choose \(\mathbf {c}\) to be any objective maximized at \(\mathbf {w}\). Comparing to the Phase 2 algorithm of Eisenbrand and Vempala [11], we require only the local \(\delta \)-distance property instead of the global one. Whether one could rely only on the local property was left as open question in [11], which we resolve in the affirmative.

A small technical caveat is that as stated, the algorithm requires knowledge of \(\delta \). Since \(\delta \le 1\), we can always guess a number \(\delta ' \le \delta \le 2\delta '\) by trying \(O(\ln 1/\delta )\) different values, incurring an \(O(\ln 1/\delta )\) factor increase in running time (overestimating \(\delta \) only affects correctness, not runtime). For simplicity, we shall henceforth assume that \(\delta \) is known.

A more important caveat is that the above algorithm requires that P be a polytope (i.e. bounded). This restriction is due to the fact that we can only generate the randomness required for our bounds efficiently (that is, without solving a general LP) when the support of the normal fan equals \(\mathbb {R}^n\).

The unbounded setting can be reduced to the bounded setting, in the standard way, by adding one or more constraints to make P bounded while not cutting off any of its vertices.

Definition 7

Let \(P = \{{ \mathbf {x} \in \mathbb {R}^n : A\mathbf {x} \le \mathbf {b}}\}\) be a pointed polyhedron. Then a polytope \(P' = \{{ \mathbf {x} \in \mathbb {R}^n : A\mathbf {x} \le \mathbf {b}, A' \mathbf {x} \le \mathbf {b}' }\}\) is LP equivalent to P if every vertex \(\mathbf {v} \in P\) satisfies \(\langle {\mathbf {a}_i'}, {\mathbf {v}}\rangle < \mathbf {b}_i'\) for all i; in particular, \(\mathbf {v}\) is a vertex of \(P'\).

Given an optimal vertex \(\mathbf {v}\) of \(P'\) as above, one can easily check whether \(\mathbf {v}\) is a vertex of P. If it is not, the original LP must be unbounded. In general, however, adding constraints to P happens at the expense of a degraded \(\delta \). In particular, the standard reduction of adding a large box constraint can degrade \(\delta \) arbitrarily, hence the constraints must be added with care. We state the guarantees we can achieve below.

Lemma 8

(Removing Unboundedness, see Sect. 8) Let \(P = \{{\mathbf {x} \in \mathbb {R}^n: A\mathbf {x} \le \mathbf {b}}\}\) be an n-dimensional pointed polyhedron with m constraints. Let \(\mathbf {a}_1,\dots ,\mathbf {a}_m\) denote the rows of A and \(b_\mathrm{max} = \max _{i \in [m]} |\mathbf {b}_i|/\Vert \mathbf {a}_i\Vert \).

  1. 1.

    Assume that P satisfies the local \(\delta \)-distance property and that \(I \subseteq [m]\), \(|I|=n\), indexes the rows of a feasible basis. Letting \(\mathbf {w} = -1/n \sum _{i \in I} \mathbf {a}_i/\Vert \mathbf {a}_i\Vert \), we have that

    $$\begin{aligned} P' = \big \{{\mathbf {x} \in \mathbb {R}^n: A\mathbf {x} \le \mathbf {b}, \langle {\mathbf {w}}, {\mathbf {x}}\rangle \le n b_\mathrm{max}/\delta }\big \}, \end{aligned}$$

    is a polytope that is LP equivalent to P and satisfies the local \(\delta ^2/(2n)\)-distance property.

  2. 2.

    Assume that A satisfies the global \(\delta \)-distance property. Then

    $$\begin{aligned} P' = \big \{\mathbf {x} \in \mathbb {R}^n: -n \Vert \mathbf {a}_i\Vert b_\mathrm{max}/\delta - 1 \le \langle {\mathbf {a}_i}, {\mathbf {x}}\rangle \le \mathbf {b}_i\,\forall i \in [m]\big \} \end{aligned}$$

    is a polytope that is LP equivalent to P and satisfies the global \(\delta \)-distance property.

Finally, we use standard techniques for reducing feasibility to Phase 2 type optimization. As this generally requires pivoting over infeasible bases, we will require global instead of local properties here. Interestingly, for LPs with bounded subdeterminants, we get that the number of simplex pivots is completely independent of the number of constraints.

Theorem 9

(Feasibility via Shadow Simplex, see Sect. 6) Let \(P = \{{\mathbf {x} \in \mathbb {R}^n: A\mathbf {x} \le b}\}\) be an n-dimensional polyhedron whose constraint matrix has full column rank and satisfies the global \(\delta \)-distance property. Then a feasible solution to P can be computed using an expected \(O((mn^3/\delta ) \ln (n/\delta ))\) shadow simplex pivots. Furthermore, if A is integral and has subdeterminants bounded by \(\Delta \), a feasible solution can be computed using an expected \(O(n^5\Delta ^2 \ln (n\Delta ))\) shadow simplex pivots.

Shadow Simplex Method

Our main technical contribution is a new analysis and variant of the shadow simplex method, which utilizes (rather unexpectedly) an approach developed in [8] for navigating over the Voronoi graph of a Euclidean lattice (see related work section).

The shadow simplex has been at the heart of many theoretical attempts to explain the surprising efficiency of the simplex method in practice. It has been shown to give polynomial bounds for the simplex method over random and smoothed linear programs [5, 19, 21]. As mentioned above, Brunsch and Röglin [7] already showed that it yields short paths for the polyhedra we consider here.

At a high level, the shadow simplex over a polyhedron P works as follows. Given an initial objective function \(\mathbf {c}\), a vertex \(\mathbf {v}\) of P which maximizes this objective, and a target objective function \(\mathbf {d}\), the shadow simplex interpolates between the objective functions \(\mathbf {c}\) and \(\mathbf {d}\) and performs a pivot step whenever the optimal vertex changes (hence the alternative name parametric simplex method referring to the parameterization \(\mathbf {c}(\lambda ) = (1 - \lambda ) \mathbf {c} + \lambda \mathbf {d}\) of the objective function, where \(\lambda \) grows from 0 to 1 over the course of the algorithm).

Traditionally, this method is understood and analyzed with a primal interpretation: The polyhedron P is orthogonally projected onto the 2-dimensional plane spanned by \(\mathbf {c}\) and \(\mathbf {d}\) (hence the term “shadow”), and the algorithm is understood in terms of the boundary of the projection \(P'\). The optimal vertices for \(\mathbf {c}\) and \(\mathbf {d}\) project to the boundary of \(P'\), and as long as \(\mathbf {c}\) and \(\mathbf {d}\) are in sufficiently general position, edges of \(P'\) lift to edges of P so that the boundary can be followed efficiently by an algorithm that performs simplex pivots in the original space. The number of pivot steps is then typically bounded in terms of the lengths of edges or in terms of angles between edges of \(P'\).

Our analysis is substantially different and based on a dual perspective: The shadow simplex method follows the line segment \([\mathbf {c}, \mathbf {d}]\) through the normal fan of P, pivoting whenever the segment crosses into a different n-dimensional normal cone. We express the number of crossings, that is, the number of intersections between \([\mathbf {c}, \mathbf {d}]\) and the facets of the normal fan of P, in terms of certain surface area measures of translates of the normal fan. The bounds we obtain on the number of intersections are stated below.

Theorem 10

(Intersection bounds, see Lemmas 15 and 18) Let \(\mathcal {T}= (C_1,\ldots ,C_k)\) be a partition of a cone \(\Sigma \) into polyhedral \(\tau \)-wide cones. Let \(\mathbf {c}, \mathbf {d} \in \mathbb {R}^n\) and let \(X \in \mathbb {R}^n\) be exponentially distributed on \(\Sigma \) (see Sect. 3.1).

  1. 1.

    The expected number of facets hit by the shifted line segment \([\mathbf {c} + X, \mathbf {d} + X]\) satisfies

    $$\begin{aligned} {{\mathrm{\mathbb {E}}}}[ |\partial \mathcal {T}\cap [\mathbf {c} + X, \mathbf {d} + X]| ] \le \frac{\Vert \mathbf {d} - \mathbf {c}\Vert }{\tau } . \end{aligned}$$
  2. 2.

    Let \(\alpha \in (0,1)\). Then

    $$\begin{aligned} {{\mathrm{\mathbb {E}}}}[|\partial \mathcal {T}\cap [\mathbf {c} + \alpha X, \mathbf {c} + X]|] \le \frac{2n}{\tau } \ln {\frac{1}{\alpha }} . \end{aligned}$$

To achieve the above bounds, the main idea is to relate the probability that the above random line segments pass through a normal cone to the probability that the associated perturbation vector lands in the cone (or some joint shift). Under the \(\tau \)-wideness condition, we can in fact uniformly upper bound these proportionality factors. Since the jointly shifted normal cones are all disjoint, we can deduce the desired bounds from the fact that the sum of their measures is \(\le \)1.

We compose these bounds in a way that also departs from the classic template by using three consecutive shadow simplex paths instead of just one. For given vertices \(\mathbf {v}\) and \(\mathbf {w}\) we first pick objectives \(\mathbf {c}\) and \(\mathbf {d}\) that are “deep inside” the respective normal cones. From here, we sample an exponentially distributed perturbation vector X and traverse three paths through the normal fan in sequence:

$$\begin{aligned} \mathbf {c} \quad \mathop {\longrightarrow }\limits ^{(a)}\quad \mathbf {c} + X \quad \mathop {\longrightarrow }\limits ^{(b)}\quad \mathbf {d} + X \quad \mathop {\longrightarrow }\limits ^{(c)}\quad \mathbf {d} \end{aligned}$$

The perturbation X will be quite large and hence almost always large enough to push \(\mathbf {c}\) and \(\mathbf {d}\) away from their normal cones. Indeed, the high level intuition behind our path is that in order to avoid unusually long paths from \(\mathbf {c}\) to \(\mathbf {d}\), we first travel to a “random intermediate location”.

We note that in phases (a) and (c), randomness is only used to perturb one of the objectives. As far as we are aware, this paper provides the first successful analysis of the shadow simplex path in this setting. Furthermore, this extension is crucial to achieving our improved diameter bound. Previous algorithms were constrained to random perturbations that kept \(\mathbf {c}\) and \(\mathbf {d}\) inside their respective normal cones, making the amount of randomness they could take advantage of much smaller.

We now use the bounds from Theorem 10 to derive the diameter bound.

Theorem 11

Let \(P \subseteq \mathbb {R}^n\) be a pointed full dimensional polyhedron with \(\tau \)-wide normal cones. Then P has diameter bounded by \(\frac{8n}{\tau }(1+\ln 1/\tau )\).

Proof

Let \(\mathbf {v}_1,\mathbf {v}_2\) be vertices of P with normal cones \(N_{\mathbf {v}_1},N_{\mathbf {v}_2}\). Let \(\mathbf {c}_1,\mathbf {c}_2 \in \mathbb {S}^{n-1}\) satisfy \(\mathbf {c}_i+\tau \mathcal {B}_2^n \subseteq N_{\mathbf {v}_i}\), \(i \in \left\{ 1,2\right\} \). Let \(\Sigma = N(P)\) denote the support of the normal fan of P, and let X be exponentially distributed over \(\Sigma \).

We will construct a path from \(\mathbf {v}_1\) to \(\mathbf {v}_2\) by following the sequence of vertices optimizing the objectives in the segments \([s \mathbf {c}_1, s \mathbf {c}_1 + X]\), \([s \mathbf {c}_1 + X, s \mathbf {c}_2 + X]\), \([s \mathbf {c}_2 + X, s \mathbf {c}_2]\), where \(s > 0\) is a scalar to be chosen later. We will condition on the event that \(\Vert X\Vert \le 2n\). Since \({{\mathrm{\mathbb {E}}}}[\Vert X\Vert ] = n\) (see Lemma 13), by Markov’s inequality this occurs with probablity at least 1 / 2. Under this event, by \(\tau \)-wideness, we will not pivot in the segments \([s \mathbf {c}_1, s \mathbf {c}_1 + \frac{s\tau }{2n} X]\) and \([s \mathbf {c}_2 + \frac{s\tau }{2n} X, s \mathbf {c}_2]\). Using Theorem 10, the number of pivots along the segments \([s \mathbf {c}_1 + \frac{s\tau }{2n} X, s\mathbf {c}_1 + X]\), \([s \mathbf {c}_1 + X, s \mathbf {c}_2 + X]\), \([s \mathbf {c}_2 + X, s \mathbf {c}_2 + \frac{s\tau }{2n} X]\), is bounded by

$$\begin{aligned} \frac{\Big (\frac{s\Vert \mathbf {c}_2-\mathbf {c}_1\Vert }{\tau } + \frac{4n}{\tau } \ln \big (\frac{2n}{s\tau }\big ) \Big )}{\Pr [\Vert X\Vert \le 2n]} \le 2 \Big (\frac{s\Vert \mathbf {c}_2-\mathbf {c}_1\Vert }{\tau } + \frac{4n}{\tau } \ln \Big (\frac{2n}{s\tau }\Big ) \Big ) . \end{aligned}$$

Setting \(s = \frac{4n}{\Vert \mathbf {c}_2-\mathbf {c}_1\Vert }\), the above expression is bounded by

$$\begin{aligned} \frac{8n}{\tau }\Big (1+\ln \Big (\frac{\Vert \mathbf {c}_2-\mathbf {c}_1\Vert }{2\tau }\Big )\Big ) \le \frac{8n}{\tau }\Big (1+ \ln \frac{1}{\tau } \Big )\quad \text {as needed.} \end{aligned}$$

\(\square \)

Related Work

In a surprising connection, we borrow techniques developed in a recent work of Bonifas and the first named author [8] for a totally different purpose, namely, for solving the Closest Vector Problem with Preprocessing on Euclidean lattices. In [8], a 3-step “perturbed” line path was analyzed to navigate over the Voronoi graph of the lattice, where lattice points are connected if their associated Voronoi cells touch in a facet.

In the current work, we show a strikingly close analogy between analyzing the number of intersections of a random straight line path with a Voronoi tiling of space and the intersections of a shadow simplex path with the normal fan of a polyhedron. This unexpected connection makes us hopeful that these ideas may have even broader applicability.

Organization

In Sect. 3, we regroup all the necessary notation and definitions. In Sect. 4, we prove the intersection bounds for our shadow simplex method. In Sect. 5, we give our shadow simplex based optimization algorithm. In Sect. 6, we give our shadow simplex based algorithms for LP feasibility. In Sect. 7, we give lower bounds on the width of the normal fan of the perfect matching polytope. In Sect. 8, we give our reductions from unbounded \(\delta \)-wide LPs to bounded ones. In Sect. 9, we show how to implement shadow simplex pivots and how to deal with degeneracy.

3 Notation and Definitions

For vectors \(\mathbf {x},\mathbf {y} \in \mathbb {R}^n\), we let \(\langle {\mathbf {x}}, {\mathbf {y}}\rangle = \sum _{i=1}^n \mathbf {x}_i \mathbf {y}_i\) denote their inner product. We denote the linear span of a set \(A \subseteq \mathbb {R}^n\) by \(\mathrm{span}(A)\). We use the notation \(\mathrm{I}[\mathbf {x} \in A]\) for the indicator of \(\mathbf {x} \in A\), that is \(\mathrm{I}[\mathbf {x} \in A]\) is 1 if \(\mathbf {x} \in A\) and 0 otherwise. For a set of scalars \(S \subseteq \mathbb {R}\), we write \(SA = \{{s\mathbf {a}: s \in S, \mathbf {a} \in A}\}\). For two sets \(A,B \subseteq \mathbb {R}^n\), we define their Minkowski sum \(A+B = \{{\mathbf {a}+\mathbf {b}: \mathbf {a} \in A, \mathbf {b} \in B}\}\). We let \(d(A,B) = \inf \left\{ \Vert \mathbf {x}-\mathbf {y}\Vert : \mathbf {x} \in A, \mathbf {y} \in B\right\} \), denote the Euclidean distance between A and B. For vectors \(\mathbf {a},\mathbf {b} \in \mathbb {R}^n\) we write \([\mathbf {a},\mathbf {b}]\) for the closed line segment and \([\mathbf {a},\mathbf {b})\) for the half-open line segment from \(\mathbf {a}\) to \(\mathbf {b}\).

Definition 12

(Cone) A cone \(\Sigma \subseteq \mathbb {R}^n\) satisfies the following three properties:

  • \(\mathbf {0} \in \Sigma \).

  • \(\mathbf {x}+\mathbf {y} \in \Sigma \) if \(\mathbf {x}\) and \(\mathbf {y}\) are in \(\Sigma \).

  • \(\lambda \mathbf {x} \in \Sigma \) if \(\mathbf {x} \in \Sigma \) and \(\lambda \ge 0\).

For vectors \(\mathbf {y}_1,\dots ,\mathbf {y}_k \in \mathbb {R}^n\), we define the closed cone they generate as

$$\begin{aligned} \mathrm{cone}(\mathbf {y}_1,\dots ,\mathbf {y}_k) = \Big \{\sum _{i=1}^m \lambda _i \mathbf {y}_i: \lambda _i \ge 0, i \in [m]\Big \} . \end{aligned}$$

A cone is polyhedral if it can be generated by a finite number of vectors, and is simplicial if the generators are linearly independent. By convention, we let \(\mathrm{cone}(\emptyset ) = \mathbf {0}\). A simplicial cone has the \(\delta \)-distance property if its extreme rays satisfy the \(\delta \)-distance property.Footnote 3

The faces of a convex set \(K \subseteq \mathbb {R}^n\) are its subsets of the form \(F = \{{\mathbf {x} \in K : \langle {\mathbf {a}}, {\mathbf {x}}\rangle = \beta }\}\) where \(\mathbf {a} \in \mathbb {R}^n\) and \(\beta \in \mathbb {R}\) satisfy \(\langle {\mathbf {a}}, {\mathbf {x}}\rangle \le \beta \) for all \(\mathbf {x} \in K\). Faces of co-dimension 1 are called facets. For a simplicial cone C, we note that its non-empty faces are exactly all the subcones generated by any subset of the generators of C.

A set of cones \(\mathcal {T}= \{{C_1,\dots ,C_k}\}\) is an n-dimensional cone partition if:

  • Each \(C_i \subseteq \mathbb {R}^n\), \(i \in [k]\), is a closed n-dimensional cone.

  • Any two cones \(C_i,C_j\), \(i \ne j\), meet in a shared face.

  • The support of \(\mathcal {T}\), \(\mathrm{sup}(\mathcal {T}) \mathbin {\mathop {=}\limits ^\mathrm{def}}\bigcup _{i \in [k]} C_i\), is a closed cone.

We say that F is a face of \(\mathcal {T}\) if it is a face of one of its contained cones. A cone partition \(\mathcal {T}\) is \(\tau \)-wide if every \(C_i\) is \(\tau \)-wide. It is simplicial if every \(C_i\) is simplicial. In this case, we also call \(\mathcal {T}\) a cone triangulation. A cone triangulation satisfies the local \(\delta \)-distance property if every \(C_i\) satisfies it. We define the boundary of \(\mathcal {T}\), \(\partial \mathcal {T}= \bigcup _{i=1}^k \partial C_i\). We say that a cone triangulation \(\mathcal {T}\) triangulates a cone partition \(\mathcal {P}\) if \(\mathcal {T}\) and \(\mathcal {P}\) have the same support and every cone \(C \in \mathcal {T}\) is generated by a subset of the extreme rays of some cone of \(\mathcal {P}\). This means that \(\mathcal {T}\) partitions (“refines”) every cone of \(\mathcal {P}\) into simplicial cones.

3.1 Exponential Distribution

We say that a random variable \(X \in \mathbb {R}^n\) is exponentially distributed on a cone \(\Sigma \) if

$$\begin{aligned} \Pr [X \in S] = \int _S \zeta _\Sigma (\mathbf {x}) \mathrm{d}\mathbf {x} \end{aligned}$$

for every measurable \(S \subseteq \mathbb {R}^n\), where \(\zeta _{\Sigma }(\mathbf {x}) = c_\Sigma e^{-\Vert \mathbf {x}\Vert } \mathrm{I}[\mathbf {x} \in \Sigma ]\). A standard computation, which we include for completeness, yields the normalizing constant and the expected norm.

Lemma 13

The normalizing constant is \(c_\Sigma ^{-1} = n! \mathrm{vol}_n(\mathcal {B}_2^n \cap \Sigma )\). For X exponentially distributed on \(\Sigma \), we have that \({{\mathrm{\mathbb {E}}}}[\Vert X\Vert ] = n\).

Proof

For the first part,

$$\begin{aligned} c_\Sigma ^{-1}&= \int _{\Sigma } e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathbf {x} = \int _{\Sigma } \int _{\Vert \mathbf {x}\Vert }^\infty e^{-t} \mathrm{dt}\, \mathrm{d}\mathbf {x} \\&= \int _0^\infty e^{-t} \int _{\Sigma } \mathrm{I}[\Vert \mathbf {x}\Vert \le t] \mathrm{d}\mathbf {x}\,\mathrm{dt} = \int _0^\infty e^{-t} t^n \mathrm{vol}_n(\mathcal {B}_2^n \cap \Sigma )\mathrm{dt} = n! \mathrm{vol}_n(\mathcal {B}_2^n \cap \Sigma ) . \end{aligned}$$

For the expected norm,

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[\Vert X\Vert ]&= c_\Sigma \int _{\Sigma } \Vert \mathbf {x}\Vert e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathbf {x} = c_\Sigma \int _{\Sigma } \Vert \mathbf {x}\Vert \int _{\Vert \mathbf {x}\Vert }^\infty e^{-t} \mathrm{dt}\, \mathrm{d}\mathbf {x} \\&= c_\Sigma \int _0^\infty e^{-t} \int _{t(\mathcal {B}_2^n \cap \Sigma )} \Vert \mathbf {x}\Vert \mathrm{d}\mathbf {x}\, \mathrm{dt} = c_\Sigma \int _0^\infty e^{-t} t^{n+1} \mathrm{dt} \int _{\mathcal {B}_2^n \cap \Sigma } \Vert \mathbf {x}\Vert \mathrm{d}\mathbf {x} \\&= c_\Sigma (n+1)! \int _0^1 (1-s^n) \mathrm{vol}_n(\mathcal {B}_2^n \cap \Sigma ) \mathrm{ds} \\&= c_\Sigma (n+1)! \mathrm{vol}_n(\mathcal {B}_2^n \cap \Sigma ) \frac{n}{n+1} = n . \end{aligned}$$

\(\square \)

4 Intersection Bounds

Lemma 14

Let C be a polyhedral cone containing \(\mathbf {u} + \tau \mathcal {B}_2^n\), where \(\Vert \mathbf {u}\Vert =1\). Let \(\mathbf {c}, \mathbf {d} \in \mathbb {R}^n\) and let \(X \in \mathbb {R}^n\) be exponentially distributed on a full dimensional cone \(\Sigma \ni \mathbf {u}\). Then the expected number of times the shifted line segment \([\mathbf {c} + X, \mathbf {d} + X]\) hits the boundary of C is at most

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[|\partial C \cap [\mathbf {c} + X, \mathbf {d} + X]|] \le \frac{\Vert \mathbf {d} - \mathbf {c}\Vert }{\tau } \int _0^1 \int _{(C -((1-\lambda )c + \lambda d)) \cap \Sigma } \zeta _\Sigma (\mathbf {x}) \mathrm{d}\mathbf {x}\, \mathrm{d}\lambda . \end{aligned}$$

Proof

Let F be a facet of C. Note that with probability 1, the line segment \([\mathbf {c}+X,\mathbf {d}+X]\) passes through F at most once. By linearity, we see that

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[|\partial C \cap [\mathbf {c} + X, \mathbf {d} + X]|] = \sum _{F \text { facet of } C} \Pr [(F \cap [\mathbf {c}+X,\mathbf {d}+X]) \ne \emptyset ] . \end{aligned}$$
(1)

We now bound the crossing probability for any facet F.

We first calculate the hitting probability as

$$\begin{aligned}&\Pr [ F \cap [\mathbf {c} + X, \mathbf {d} + X] \ne \emptyset ]\nonumber \\&\quad = \Pr [ X \in -[\mathbf {c},\mathbf {d}] + F ] \nonumber \\&\quad = \int _{-[\mathbf {c},\mathbf {d}] + F} \zeta _\Sigma (\mathbf {x}) \mathrm{d}\mathbf {x}\nonumber \\&\quad = |\langle {\mathbf {n}}, {\mathbf {d}-\mathbf {c}}\rangle | \int _0^1 \int _{F - ((1-\lambda )\mathbf {c} + \lambda \mathbf {d})} \zeta _\Sigma (\mathbf {x}) \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x}) \,\mathrm{d}\lambda \nonumber \\&\quad \le \Vert \mathbf {d}-\mathbf {c}\Vert \int _0^1 \int _{(F - ((1-\lambda )\mathbf {c} + \lambda \mathbf {d})) \cap \Sigma } c_\Sigma e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x}) \,\mathrm{d}\lambda , \end{aligned}$$
(2)

where \(\mathbf {n} \in \mathbb {R}^n\) is a unit normal vector to F and we use \(\mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x})\) to indicate an integral with respect to the usual \((n-1)\)-dimensional measure on the affine hyperplane spanned by the integration domain. Bounding the hitting probability therefore boils down to bounding the measure of a shift of the facet F. Letting \(h = |\langle {\mathbf {n}}, {\mathbf {u}}\rangle | \ge \tau \) (which holds by assumption on \(\mathbf {u}\)), for any shift \(\mathbf {t} \in \mathbb {R}^n\) we have that

$$\begin{aligned} \int _{(F + \mathbf {t} + \mathrm{cone}(\mathbf {u})) \cap \Sigma } e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathbf {x}&\ge \int _{((F + \mathbf {t}) \cap \Sigma ) + \mathrm{cone}(\mathbf {u})} e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathbf {x} \quad (\text {since}\,\mathbf {u} \in \Sigma ) \nonumber \\&= \int _0^\infty \int _{((F + \mathbf {t}) \cap \Sigma ) + \frac{r}{h} \mathbf {u}} e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x})\, \mathrm{d}r \nonumber \\&= \int _0^\infty \int _{(F + \mathbf {t}) \cap \Sigma } e^{-\Vert \mathbf {x} + \frac{r}{h} \mathbf {u}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x})\, \mathrm{d}r \nonumber \\&\ge \int _0^\infty e^{-r/h} \mathrm{d}r \int _{(F + \mathbf {t}) \cap \Sigma } e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x}) \nonumber \\&\ge \tau \int _{(F + \mathbf {t}) \cap \Sigma } e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x}). \end{aligned}$$
(3)

The lemma now follows by combining (1), (2), (3), using the fact that the \(F + \mathrm{cone}(\mathbf {u})\) partition the cone C up to sets of measure 0. \(\square \)

Lemma 15

Let \(\mathcal {T}= (C_1,\ldots ,C_k)\) be a partition of a cone \(\Sigma \) into polyhedral \(\tau \)-wide cones. Let \(\mathbf {c}, \mathbf {d} \in \mathbb {R}^n\) and let \(X \in \mathbb {R}^n\) be exponentially distributed on \(\Sigma \). Then the expected number of facets hit by the shifted line segment \([\mathbf {c} + X, \mathbf {d} + X]\) satisfies

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[ |\partial \mathcal {T}\cap [\mathbf {c} + X, \mathbf {d} + X]| ] \le \frac{\Vert \mathbf {d} - \mathbf {c}\Vert }{\tau } . \end{aligned}$$

Proof

Using Lemma 14, we bound

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[ |\partial \mathcal {T}\cap [\mathbf {c} + X, \mathbf {d} + X]| ]&\le \sum _{i=1}^k {{\mathrm{\mathbb {E}}}}[|\partial C_i \cap [\mathbf {c} + X, \mathbf {d} + X]|] \\&\le \sum _{i=1}^k \frac{\Vert \mathbf {d} - \mathbf {c}\Vert }{\tau } \int _0^1 \int _{(C_i-((1-\lambda )\mathbf {c}+\lambda \mathbf {d})) \cap \Sigma } \zeta _{\Sigma }(\mathbf {x})\mathrm{d}\mathbf {x}\, \mathrm{d}\lambda \\&\le \frac{\Vert \mathbf {d} - \mathbf {c}\Vert }{\tau } \int _0^1 \int _{\Sigma } \zeta _{\Sigma }(\mathbf {x})\mathrm{d}\mathbf {x}\, \mathrm{d}\lambda \\&\le \frac{\Vert \mathbf {d} - \mathbf {c}\Vert }{\tau }, \end{aligned}$$

as needed. \(\square \)

We will need the following simple lemma about the exponential distribution.

Lemma 16

Let Y be exponentially distributed on \(\mathbb {R}_+\). Then for any \(c \in \mathbb {R}\), \({{\mathrm{\mathbb {E}}}}[|Y-c|] \ge |c|/2\).

Proof

Since \(Y \ge 0\), the inequality is trivial if \(c \le 0\). Hence we may assume that \(c \ge 0\). Using integration by parts, we have that

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[|Y-c|]&= \int _0^c (c-x)e^{-x}\mathrm{d}x + \int _c^\infty (x-c)e^{-x} \mathrm{d}x \\&= (x-c)e^{-x}\big |_0^c - \int _0^c e^{-x} \mathrm{d}x + (c-x)e^{-x}\big |_c^\infty + \int _c^\infty e^{-x} \mathrm{d}x \\&= (x-c+1)e^{-x}\big |_0^c + (c-x-1)e^{-x}\big |_c^\infty = 2e^{-c} + c - 1 . \end{aligned}$$

We wish to show that \(2e^{-c} + c - 1 \ge c/2\), hence it suffices to show \(2e^{-c} + c/2 - 1 \ge 0\) for all \(c \ge 0\). This function is minimized at \(c = \ln 4\) where it achieves value \((\ln 4 - 1)/2 > 0\). \(\square \)

While we could choose \(\mathbf {c}\) and \(\mathbf {d}\) such that \(\mathbf {c} + X\) and \(\mathbf {d} + X\) lie in the same cones as \(\mathbf {c}\) and \(\mathbf {d}\) with high probability, and hence no facets are hit by the segments \([\mathbf {c}, \mathbf {c} + X]\) and \([\mathbf {d}, \mathbf {d} + X]\), this would require us to choose \(\Vert \mathbf {d} - \mathbf {c}\Vert \) quite large. We will show a better way to bound the number of facets that are hit by the segment \([\mathbf {c}, \mathbf {c} + X]\).

Lemma 17

Let \(C \subseteq \mathbb {R}^n\) be a polyhedral cone containing \(\mathbf {u}+\tau \mathcal {B}_2^n\), where \(\Vert \mathbf {u}\Vert =1\). Let \(\mathbf {c} \in \mathbb {R}^n\) and let \(X \in \mathbb {R}^n\) be exponentially distributed on a cone \(\Sigma \ni \mathbf {u}\). Then for every \(\alpha \in (0,1)\) we have

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[|\partial C \cap [\mathbf {c} + \alpha X, \mathbf {c} + X]|] \le \frac{2}{\tau } \int _1^{1/\alpha } \frac{1}{s} \int _{(C - s\mathbf {c})\cap \Sigma } \Vert \mathbf {x}\Vert \zeta _{\Sigma }(\mathbf {x}) \mathrm{d}\mathbf {x}\, \mathrm{d}s. \end{aligned}$$

Proof

As in the proof of Lemma 14, we will decompose the expectation over the facets of C, where we have

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[|\partial C \cap [\mathbf {c} + \alpha X, \mathbf {c} + X]|] = \sum _{F \text { facet of } C} \Pr [F \cap [\mathbf {c} + \alpha X, \mathbf {c} + X] \ne \emptyset ]. \end{aligned}$$
(4)

Take a facet F of C and let \(\mathbf {n}\) denote a unit normal to F pointing in the direction of the cone (i.e., \(\left\langle {n, u}\right\rangle > 0\)).

(5)

Again, we have to bound an integral over a shifted facet, similar to the proof of Lemma 14. Letting \(h = |\langle {\mathbf {n}}, {\mathbf {u}}\rangle | \ge \tau \), we have that

$$\begin{aligned} \int _{(F+\mathbf {t}+\mathrm{cone}(\mathbf {u})) \cap \Sigma } \Vert \mathbf {x}\Vert e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathbf {x}&\ge \int _{((F+\mathbf {t}) \cap \Sigma )+\mathrm{cone}(\mathbf {u})} \Vert \mathbf {x}\Vert e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathbf {x} \quad (\text {since } \mathbf {u} \in \Sigma )\nonumber \\&= \int _0^\infty \int _{((F+\mathbf {t}) \cap \Sigma ) + \frac{r}{h} \mathbf {u}} \Vert \mathbf {x}\Vert e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x}) \,\mathrm{d}r \nonumber \\&= \int _0^\infty \int _{(F+\mathbf {t}) \cap \Sigma } \Vert \mathbf {x} + \tfrac{r}{h}\mathbf {u}\Vert e^{-\Vert \mathbf {x} + \frac{r}{h} \mathbf {u}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x})\, \mathrm{d}r \nonumber \\&\ge \int _0^\infty \int _{(F + \mathbf {t}) \cap \Sigma } |\langle {\mathbf {n}}, {\mathbf {x} + \tfrac{r}{h}\mathbf {u}}\rangle | e^{-r/h} e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x})\, \mathrm{d}r \nonumber \\&= h^2 \int _0^\infty |\langle {\mathbf {n}}, {\mathbf {t}}\rangle /h + s| e^{-s} \mathrm{d}s \int _{(F+\mathbf {t}) \cap \Sigma } e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x})\nonumber \\&\ge \frac{h}{2} \int _{(F+\mathbf {t}) \cap \Sigma } |\langle {\mathbf {n}}, {\mathbf {t}}\rangle |e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x}) \quad (\text {by Lemma 16})\nonumber \\&\ge \frac{\tau }{2} \int _{(F+\mathbf {t}) \cap \Sigma }|\langle {\mathbf {n}}, {\mathbf {t}}\rangle | e^{-\Vert \mathbf {x}\Vert } \mathrm{d}\mathrm{vol}_{n-1}(\mathbf {x})\\\nonumber \end{aligned}$$
(6)

The lemma now follows by combining (4),(5),(6). \(\square \)

Lemma 18

Let \(\mathcal {T}= (C_1,\ldots ,C_k)\) be partition of a cone \(\Sigma \) into polyhedral \(\tau \)-wide cones. Let \(\mathbf {c} \in \mathbb {R}^n\) and \(\alpha \in (0,1)\) be fixed and let \(X \in \mathbb {R}^n\) be exponentially distributed over \(\Sigma \). Then

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[|\partial \mathcal {T}\cap [\mathbf {c} + \alpha X, \mathbf {c} + X]|] \le \frac{2n}{\tau } \ln {\frac{1}{\alpha }} . \end{aligned}$$

Proof

By Lemmas 13 and 17, we have that

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[|\partial \mathcal {T}\cap [\mathbf {c} + \alpha X, \mathbf {c} + X]|]&\le \sum _{i=1}^k {{\mathrm{\mathbb {E}}}}[|\partial C_i \cap [\mathbf {c} + \alpha X, \mathbf {c} + X]|] \\&\le \frac{2}{\tau } \sum _{i=1}^k \int _1^{1/\alpha } \frac{1}{s} \int _{(C_i - s\mathbf {c})\cap \Sigma } \Vert \mathbf {x}\Vert \zeta _{\Sigma }(\mathbf {x}) \mathrm{d}\mathbf {x} \,\mathrm{d}s \\&\le \frac{2}{\tau } \int _1^{1/\alpha } \frac{1}{s} \int _{\Sigma } \Vert \mathbf {x}\Vert \zeta _{\Sigma }(\mathbf {x}) \mathrm{d}\mathbf {x}\, \mathrm{d}s \\&\le \frac{2}{\tau } \int _1^{1/\alpha } \frac{1}{s} {{\mathrm{\mathbb {E}}}}[\Vert X\Vert ] \mathrm{d}s = \frac{2n}{\tau } \ln \frac{1}{\alpha }. \end{aligned}$$

\(\square \)

5 Optimization

While bounding the number of intersections of line segments \([\mathbf {c}, \mathbf {d}]\) with the facets of the normal fan of \(P = \{{ \mathbf {x} \in \mathbb {R}^n : A\mathbf {x} \le \mathbf {b}}\}\) is sufficient to obtain existential bounds on the diameter of P, we also need to be able to efficiently compute the corresponding pivots to obtain efficient algorithms. The following summarizes the required results, the technical details of which are found in Sect. 9.

Theorem 19

(Shadow simplex, see Theorem 35) Let \(P = \{{ \mathbf {x} \in \mathbb {R}^n : A\mathbf {x} \le \mathbf {b}}\}\) be pointed, \(\mathbf {c}, \mathbf {d} \in \mathbb {R}^n\), and B an optimal basis for \(\mathbf {c}\). If every intersection of \([\mathbf {c}, \mathbf {d})\) with a facet F of a cone spanned by a feasible basis of P lies in the relative interior of F, the Shadow Simplex can be used to compute an optimal basis for \(\mathbf {d}\) in \(O(mn^2 + Nmn)\) arithmetic operations, where N is the number of intersections of \([\mathbf {c},\mathbf {d}]\) with some triangulation \(\mathcal {T}\) of the normal fan of P, where \(\mathcal {T}\) contains the cone spanned by the initial basis B.

As explained in Sect. 2, we want to follow segments \([\mathbf {c}, \mathbf {c} + X]\), \([\mathbf {c} + X, \mathbf {d} + X]\), \([\mathbf {d} + X, \mathbf {d}]\) in the normal fan. Our intersection bounds from Theorem 10 are not quite sufficient to bound the number of steps on the first and last segments entirely. This is easily dealt with for the first segment, because we can control the initial objective function \(\mathbf {c}\) so that it lies deep in the initial normal cone.

For the final segment, we follow the approach of Eisenbrand and Vempala [11], who showed that if A satisfies the global \(\delta \)-distance property, then an optimal facet for \(\mathbf {d}\) can be derived from a basis that is optimal for some \({\tilde{\mathbf {d}}}\) with \(\Vert \mathbf {d} - {\tilde{\mathbf {d}}}\Vert \le \frac{\delta }{n}\). Recursion can then be used on a problem of reduced dimension to move from \({\tilde{\mathbf {d}}}\) to \(\mathbf {d}\). We strengthen their result (thereby answering a question left open by [11]) and show that the local \(\delta \)-distance property is sufficient to get the same result as long as \(\Vert \mathbf {d} - {\tilde{\mathbf {d}}}\Vert \le \frac{\delta }{n^2}\).Footnote 4

Lemma 20

Let \(\mathbf {x}_1,\dots ,\mathbf {x}_m \in \mathbb {S}^{n-1}\) be a set of vectors. Then the following are equivalent:

  1. 1.

    \(\mathbf {x}_1,\dots ,\mathbf {x}_m\) satisfy the \(\delta \)-distance property.

  2. 2.

    \(\forall ~ I \subseteq [m]\) for which \(\{{\mathbf {x}_i: i \in I}\}\) are linearly independent and \(\forall ~ (a_i \in \mathbb {R}: i \in I)\)

    $$\begin{aligned} \big \Vert \sum _{i \in I} a_i \mathbf {x}_i \big \Vert \ge \delta \max _{i \in I} |a_i| . \end{aligned}$$

Proof

\((1) \Rightarrow (2)\) Let \(I \subseteq [m]\) for which \(\{{\mathbf {x}_i: i \in I}\}\) are linearly independent, and examine the linear combination \(\sum _{i \in I} a_i \mathbf {x}_i\). Letting \(j = {{\mathrm{arg\,max}}}_{i \in I} |a_i|\), we have that

$$\begin{aligned} \big \Vert a_j\mathbf {x}_j + \sum _{i \in I \setminus \left\{ j\right\} } a_i\mathbf {x}_i\big \Vert&\ge d(a_j\mathbf {x}_j,\!\!~\mathrm{span}(\{{\mathbf {x}_i: i \in I \setminus \left\{ j\right\} }\})) \\&= |a_j| d(\mathbf {x}_j,\!\!~\mathrm{span}(\{{\mathbf {x}_i: i \in I \setminus \left\{ j\right\} }\})) \ge \delta |a_j|\quad (\text {by property} (1)) \end{aligned}$$

as needed.

\((2) \Rightarrow (1)\) Take \(i \in [m]\) and \(J \subseteq [m]\), such that \(\mathbf {x}_i \not \in \mathrm{span}(\{{\mathbf {x}_j: j \in J}\})\). Since we need only prove a lower bound on \(d(\mathbf {x}_i, \mathrm{span}(\{{\mathbf {x}_j: j \in J}\}))\), we may clearly assume that \(\{{\mathbf {x}_j: j \in \left\{ i\right\} \cup J}\}\) are linearly independent. Given this, we have that

$$\begin{aligned} d(\mathbf {x}_i, \mathrm{span}(\{{\mathbf {x}_j: j \in J}\}))&= \min \Big \{{\mathbf {x}_i - \sum _{j \in J} a_j \mathbf {x}_j: a_j \in \mathbb {R}, j \in J\Big \}} \\&\ge \min \Big \{{\delta \max \Big \{{1, \max _{j \in J} |a_j|}\Big \}: a_j \in \mathbb {R}, j \in J\Big \}} \quad (\text {by property (2)}) \\&\ge \delta \,\text {as needed.} \end{aligned}$$

\(\square \)

Definition 21

Let F be a face of a cone triangulation \(\mathcal {T}\) and let \(\mathbf {x}\) be a vector in the support of \(\mathcal {T}\). Let \(G = \mathrm{cone}(\mathbf {x}_1, \ldots , \mathbf {x}_k)\), \(\Vert \mathbf {x}_i\Vert = 1\), be the minimal face of \(\mathcal {T}\) that contains \(\mathbf {x}\) and consider the unique conic combination \(\mathbf {x} = \lambda _1 \mathbf {x}_1 + \dots + \lambda _k \mathbf {x}_k\). We define

$$\begin{aligned} \alpha _F(\mathbf {x}) := \sum _{i : \mathbf {x}_i \not \in F} \lambda _i. \end{aligned}$$

In particular, \(\alpha _F(\mathbf {x}) \ge 1\) if \(\mathbf {x}\) is a unit vector and the minimal face containing it is disjoint from F, and \(\alpha _F(\mathbf {x}) = 0\) if \(\mathbf {x} \in F\).

Lemma 22

Let F be a cone of an n-dimensional cone triangulation \(\mathcal {T}\) satisfying the local \(\delta \)-distance property. Let \(\mathbf {x}\) be a point in the support of \(\mathcal {T}\). Then \(d(\mathbf {x},F) \ge \alpha _F(\mathbf {x}) \cdot \frac{\delta }{n}\).

Proof

Let \(\mathbf {y} \in F\) be the (unique) point with \(d(\mathbf {x},\mathbf {y}) = d(\mathbf {x},F)\). Note that by convexity, the segment \([\mathbf {x},\mathbf {y}]\) is contained in the support of \(\mathcal {T}\). By considering the cones of \(\mathcal {T}\) that contain points on the segment \([\mathbf {x},\mathbf {y}]\), we obtain a sequence of points

$$\begin{aligned} \mathbf {x} = \mathbf {x}_0, \mathbf {x}_1, \ldots , \mathbf {x}_r = \mathbf {y} \end{aligned}$$

on the segment \([\mathbf {x},\mathbf {y}]\) and (full-dimensional) cones \(G_1, \ldots , G_r\) such that

$$\begin{aligned} G_i \cap [\mathbf {x},\mathbf {y}] = [\mathbf {x}_{i-1}, \mathbf {x}_i]. \end{aligned}$$

The result of the lemma will follow from the lower bounds

$$\begin{aligned} d(\mathbf {x}_{i-1}, \mathbf {x}_i) \ge |\alpha _F(\mathbf {x}_{i-1}) - \alpha _F(\mathbf {x}_i)| \cdot \frac{\delta }{n}~\, \forall i \in [r]. \end{aligned}$$
(7)

To derive the lemma, we combine the above bounds in the following manner:

$$\begin{aligned} d(\mathbf {x},\mathbf {y})= & {} \sum _{i=1}^r d(\mathbf {x}_{i-1}, \mathbf {x}_i) \nonumber \\\ge & {} \sum _{i=1}^n |\alpha _F(\mathbf {x}_{i-1}) - \alpha _F(\mathbf {x}_i)| \cdot \frac{\delta }{n} \ge (\alpha _F(\mathbf {x})-\alpha _F(\mathbf {y})) \cdot \frac{\delta }{n} = \alpha _F(\mathbf {x}) \cdot \frac{\delta }{n} , \end{aligned}$$

where the last equality follows since \(\alpha _F(\mathbf {y}) = 0\).

We now prove (7). Fix some \(G_i = \mathrm{cone}(\mathbf {y}_1, \ldots , \mathbf {y}_n)\). By relabelling, we may assume that \(\mathrm{cone}(\mathbf {y}_1,\dots ,\mathbf {y}_k) = G_i \cap F\) (since \(G_i\) and F are both faces of \(\mathcal {T}\)), for some \(0 \le k \le n\) (if \(k=0\) then \(G_i \cap F = \left\{ \mathbf {0}\right\} \)).

For every \(\mathbf {z} \in G_i\), the minimal cone containing \(\mathbf {z}\) is a face of \(G_i\). Therefore, using the unique conic combination \(\mathbf {z} = \sum _{i=1} \lambda _i \mathbf {y}_i\), we have that \(\alpha _F(\mathbf {z}) = \sum _{k < i \le n} \lambda _i\).

Writing \(\mathbf {x}_{i-1} = \sum _{i=1}^n a_i \mathbf {y}_i\) and \(\mathbf {x}_i = \sum _{i=1}^n b_i \mathbf {y}_i\), by Lemma 20 we have that

$$\begin{aligned} d(\mathbf {x}_{i-1},\mathbf {x}_i)&\ge \delta \max _{1 \le i \le n} |a_i-b_i| \ge \delta \max _{k< i \le n} |a_i-b_i| \ge \frac{\delta }{n} \sum _{k < i \le n} |a_i - b_i| \\&\ge \frac{\delta }{n} |\alpha _F(\mathbf {x}_{i-1})-\alpha _F(\mathbf {x}_i)| , \end{aligned}$$

which completes the proof of the claim. \(\square \)

Lemma 23

Let F be a cone of a triangulation \(\mathcal {T}\) satisfying the local \(\delta \)-distance property and let \(\mathbf {x}\) be a point in the support of \(\mathcal {T}\) with \(d(\mathbf {x},F) \le \frac{\delta }{n^2}\). Let \(G = \mathrm{cone}(\mathbf {x}_1, \ldots , \mathbf {x}_n)\), \(\Vert \mathbf {x}_i\Vert = 1\), be a cone of \(\mathcal {T}\) containing \(\mathbf {x}\) and let

$$\begin{aligned} \mathbf {x} = \lambda _1 \mathbf {x}_1 + \dots + \lambda _n \mathbf {x}_n \end{aligned}$$

be the corresponding conic combination. Then for every \(i \in [n]\) with \(\lambda _i > \frac{1}{n}\) one has \(\mathbf {x}_i \in F\).

Proof

Suppose there is some i with \(\lambda _i > \frac{1}{n}\) and \(\mathbf {x}_i \not \in F\). Then \(\alpha _F(\mathbf {x}) > \frac{1}{n}\) and by Lemma 22 we get \(d(\mathbf {x},F) > \frac{\delta }{n^2}\), which is a contradiction. \(\square \)

For the recursion on a facet, we let \(\pi _i(\mathbf {x}) := \mathbf {x} - \frac{\langle {\mathbf {x}}, {\mathbf {a}_i}\rangle }{\langle {\mathbf {a}_i}, {\mathbf {a}_i}\rangle } \mathbf {a}_i\) be the orthogonal projection onto the subspace orthogonal to \(\mathbf {a}_i\) and we let \(F_i\) be the facet of P defined by \(\langle {\mathbf {a}_i}, {\mathbf {x}}\rangle = \mathbf {b}_i\).

Lemma 24

Let \(\mathbf {v}_1, \ldots , \mathbf {v}_k \in \mathbb {R}^n\) be linearly independent vectors that satisfy the \(\delta \)-distance property and let \(\pi \) be the orthogonal projection onto the subspace orthogonal to \(\mathbf {v}_k\). Then \(\pi (\mathbf {v}_1), \ldots , \pi (\mathbf {v}_{k-1})\) satisfy the \(\delta \)-distance property.

Proof

Let \(i \in \{1, \dots , k-1\}\). Let \(S_i = \mathrm{span}\{{ \pi (\mathbf {v}_j) : j \ne i, k}\}\). First, observe

$$\begin{aligned} \mathrm{span}(\{{ \mathbf {v}_k } \cup S_i) = \mathrm{span}(\{{ \mathbf {v}_j : j \ne i}\}). \end{aligned}$$

Assume that \(d(\pi (\mathbf {v}_i), S_i) < \delta \Vert \pi (\mathbf {v}_i)\Vert \). So there exists some \(\mathbf {x} \in S_i\) such that \(d(\pi (\mathbf {v}_i), \mathbf {x}) < \delta \Vert \pi (\mathbf {v}_i)\Vert \). Since we can write \(\mathbf {v}_i = \pi (\mathbf {v}_i) + \lambda \mathbf {v}_k\) for some \(\lambda \in \mathbb {R}\), we get

$$\begin{aligned} d(\mathbf {v}_i, \mathrm{span}(\{{ \mathbf {v}_j : j \ne i}\})) \le d(\mathbf {v}_i, \mathbf {x} + \lambda \mathbf {v}_k) = d(\pi (\mathbf {v}_i), \mathbf {x}) < \delta \Vert \pi (\mathbf {v}_i)\Vert \le \delta \Vert \mathbf {v}_i\Vert , \end{aligned}$$

which contradicts the \(\delta \)-distance property of \(\mathbf {v}_1, \ldots , \mathbf {v}_k\). So we must in fact have \(d(\pi (\mathbf {v}_i), S_i) \ge \delta \Vert \pi (\mathbf {v}_i)\Vert \) for all \(i \in \{1, \ldots , k-1\}\), which completes the proof. \(\square \)

This lemma, which was already used byin [11], implies that if P satisfies the local \(\delta \)-distance property then so does \(F_i\), where the definition of local \(\delta \)-distance is understood relative to the affine hull of \(F_i\),Footnote 5 because the normal vectors of \(F_i\) arise from orthogonal projections of the normal vectors of P.

figure b

Theorem 25

If P satisfies the local \(\delta \)-distance property, then Algorithm 1 correctly computes an optimal basis for \(\mathbf {d}\) using an expected \(O((n^3/\delta ) \ln (n/\delta ))\) shadow simplex pivots.

Proof

For correctness, let \(\mathcal {T}\) be some triangulation of the normal fan of P and let C be a cone in \(\mathcal {T}\) that contains \(\mathbf {d}\). We have \(\Vert \frac{\delta }{2n^3} X\Vert \le \frac{\delta }{n^2}\) and therefore \(d({\tilde{\mathbf {d}}}, C) \le d({\tilde{\mathbf {d}}}, \mathbf {d}) \le \frac{\delta }{n^2}\). Furthermore, \(\Vert {\tilde{\mathbf {d}}}\Vert \ge \Vert \mathbf {d}\Vert - \frac{\delta }{n^2} > 1\) implies that \(\sum _{i \in B} \lambda _i > 1\) so that there is some i with \(\lambda _i > \frac{1}{n}\). Applying Lemma 23 yields that \(\mathbf {a}_{i^\star }\) is a generator of C, which means that \(i^\star \) is contained in some optimal basis for \(\mathbf {d}\). This implies that recursion on \(F_{i^\star }\) yields the correct result.

In order to bound the number of pivots, let C be the cone of the initial basis and observe that \(\mathbf {c} + \delta \mathcal {B}_2^n \subseteq C\) by the proof of Lemma 5. Hence the segment \([\mathbf {c} + \frac{\delta }{2n} X)\) does not cross a facet of the triangulation \(\mathcal {T}_1\) of the normal fan that is implicitly used by the first leg of the shadow simplex path.

If X were exponentially distributed (without the conditioning on \(\Vert X\Vert \le 2n\)), Theorem 10 together with Lemma 5 would bound the expected number of pivot steps along the three segments by

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[ N ] \le \frac{2n^2}{\delta } \ln \frac{2n}{\delta } + \frac{n \Vert \mathbf {d} - \mathbf {c}\Vert }{\delta } + \frac{2n^2}{\delta } \ln \frac{2n^3}{\delta } \le O\Big (\frac{n^2}{\delta } \ln \Big (\frac{n}{\delta }\Big )\Big ). \end{aligned}$$

Since \({{\mathrm{\mathbb {E}}}}[\Vert X\Vert ] = n\) we have \(\Pr [ \Vert X\Vert \le 2n ] \ge \frac{1}{2}\) by Markov’s inequality and therefore

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}[ N ~|~ \Vert X\Vert \le 2n ] \le 2 {{\mathrm{\mathbb {E}}}}[ N ] \le O\Big (\frac{n^2}{\delta } \ln \Big (\frac{n}{\delta }\Big )\Big ). \end{aligned}$$

The bound on the total expected number of pivot steps follows from the depth n of recursion. \(\square \)

6 Feasibility

The following result is already implicit in [4]. We provide it here for completeness.

Lemma 26

Let \(A \in \mathbb {R}^{m \times n}\) be an integral matrix whose entries are bounded by \(\Delta _1\) and whose \((n-1)\times (n-1)\) subdeterminants are bounded by \(\Delta _{n-1}\) in absolute value. Then A satisfies the global \(\delta \)-distance property with \(\delta = 1/(n\Delta _1\Delta _{n-1})\). Furthermore, any polyhedron with constraint matrix A is \(\tau \)-wide with \(\tau = 1/(n^2\Delta _1\Delta _{n-1})\).

Proof

It is sufficient to consider the case where \(A \in \mathbb {Z}^{n \times n}\) is invertible. Let \(\mathbf {a}_1, \ldots , \mathbf {a}_n\) be the rows of A and let \(H_i = \mathrm{span}\{{ \mathbf {a}_j : j \ne i}\}\). The vector \(\mathbf {u}_i\) satisfying \(A\mathbf {u}_i = |\!\det (A)| \mathbf {e}_i\) is a normal vector of \(H_i\) with \(\mathbf {u}_i \in \mathbb {Z}^n\) and \(\Vert \mathbf {u}_i\Vert _\infty \le \Delta _{n-1}\) by Cramer’s rule. We can compute

$$\begin{aligned} d(\mathbf {a}_i/\Vert \mathbf {a}_i\Vert , H_i) = \frac{\langle {\mathbf {a}_i}, {\mathbf {u}_i}\rangle }{\Vert \mathbf {a}_i\Vert \Vert \mathbf {u}_i\Vert } \ge \frac{1}{(\sqrt{n} \Delta _1) (\sqrt{n} \Delta _{n-1})} = \frac{1}{n \Delta _1 \Delta _{n-1}} \end{aligned}$$

using the fact that \(\Vert \mathbf {a}_i\Vert _\infty \le \Delta _1\).

The “furthermore” part of the statement of the Lemma follows from Lemma 5. \(\square \)

We now provide our LP feasibility algorithms which prove Theorem 9 in Sect. 2. We break the proof up into the following two lemmas. The first part of Theorem 9 is given by Lemma 27 and the furthermore is given by Lemma 28.

Lemma 27

Let \(P = \{{ \mathbf {x} \in \mathbb {R}^n : A \mathbf {x} \le \mathbf {b}}\}\) where \(A \in \mathbb {R}^{m \times n}\) satisfies the global \(\delta \)-distance property. One can compute a feasible basis of P or decide infeasibility using an expected \(O((mn^3/\delta ) \ln n/\delta )\) shadow simplex pivots.

Proof

We proceed by iteratively adding the constraints of P, one at a time. First, observe that we can find a basis B of A efficiently using Gauss elimination, which gives us the unique feasible basis of \(P_B = \{{ \mathbf {x} \in \mathbb {R}^n : A_B \mathbf {x} \le \mathbf {b}_B}\}\).

Now suppose we already found a feasible basis B of \(P_I = \left\{ \mathbf {x} \in \mathbb {R}^n : A_I \mathbf {x} \le \mathbf {b}_I \right\} \), \(I \subsetneq [m]\). Let \(i \not \in I\). Since \(P_I\) satisfies the global \(\delta \)-distance property, we can combine Lemma 8 with Theorem 6 to solve the linear program

$$\begin{aligned} \min \{{\langle {\mathbf {a}_i}, {\mathbf {x}}\rangle : \mathbf {x} \in P_I }\} = \gamma \end{aligned}$$

using an expected \(O((n^3/\delta ) \ln n/\delta )\) shadow simplex pivots. If \(\gamma > \mathbf {b}_i\), this implies that \(P_{I + i}\) is empty and therefore P is empty. Otherwise, the solution of the linear program yields a point \(\mathbf {x} \in P_{I + i}\) (if \(\gamma = -\infty \), we find a suitable point on an unbounded ray) which we can round to a feasible basis of \(P_{I + i}\) using a standard ray-casting procedure if necessary.

Applying this procedure iteratively until \(I = [m]\), we use an expected \(O((mn^3/\delta ) \ln n/\delta )\) shadow simplex pivots to obtain a feasible basis of P. Observe that the time required for those pivots dominates the time required for any intermediate ray-casting. \(\square \)

Lemma 28

Let \(P = \{{ \mathbf {x} \in \mathbb {R}^n : A \mathbf {x} \le \mathbf {b}}\}\) where \(A \in \mathbb {Z}^{m \times n}\) is an integral matrix of full column rank with subdeterminants bounded by \(\Delta \) in absolute value. One can compute a feasible basis of P or decide infeasibility using an expected \(O(n^5 \Delta ^2 \ln (n\Delta ))\) shadow simplex pivots.

Proof

Consider the linear program

The \((m + 1) \times (n + 1)\)-constraint matrix is integral of full column rank and has \(n \times n\)-subdeterminants bounded by \(n\Delta \). Therefore, it satisfies the global \(\delta \)-distance property with \(\delta = \frac{1}{n^2\Delta ^2}\) by Lemma 26. The point \((\mathbf {0}, -\min (\left\{ 0\right\} \cup \left\{ \mathbf {b}_i : i \in [m]\right\} ))\) is feasible, and so a feasible basis can be found using a standard ray-casting procedure. Lemma 8 implies that we can construct an LP equivalent polytope with the same parameter \(\delta \), which we can then optimize in \(O(n^5 \Delta ^2 \ln n\Delta )\) shadow simplex pivots by Theorem 6. If the optimal solution we found satisfies \(s = 0\), we can read off a feasible basis of P; otherwise, we know that P is empty. \(\square \)

7 The Perfect Matching Polytope

The perfect matching polytope \(P_G \subset \mathbb {R}^E\) of an undirected graph \(G = (V, E)\), \(|V| = 2n\), is the convex hull of the characteristic vectors \(\chi _M\) of perfect matchings \(M \subseteq E\) (see [18, Chap. 25] for a collection of fundamental results on \(P_G\)). It is described by the system of inequalities

We will also use the fact that two vertices \(\chi _M\) and \(\chi _N\) of \(P_G\) are adjacent if and only if \(M \triangle N\) is a cycle.

We will show that even though this polytope has an exponential number of facets, so that its normal fan contains an exponential number of extreme rays, every normal cone is rather wide. This is due to the high level of degeneracy of the polytope. While a much better bound on the diameter of \(P_G\) follows directly from the combinatorial adjacency structure noted above, it is interesting to see that some of our techniques can be applied to \(P_G\). As far as we know, this is the first example of a combinatorial polytope that satisfies this kind of “discrete curvature bound” without having a constraint matrix with small subdeterminants.

Since \(P_G\) is not full-dimensional, there are two different but essentially equivalent definitions for the normal cones of \(P_G\). One can treat \(P_G\) as a polytope in the ambient space \(\mathbb {R}^E\), keeping the definition of the normal cone \(N_v\) of a vertex \(\mathbf {v} \in P_G\) as the set of objective functions \(\mathbf {c} \in \mathbb {R}^E\) that are maximized at \(\mathbf {v}\). Hence \(N_v\) is not pointed, and its lineality space \(L^\bot \) is the set of vectors that are orthogonal to the affine hull \(\mathrm{aff}(P_G)\) of \(P_G\). Alternatively, one may treat \(P_G\) as a full-dimensional polytope within \(\mathrm{aff}(P_G)\), in which case the normal cone \(N_v'\) is simply the restriction of \(N_v\) to the linear space L of vectors parallel to \(\mathrm{aff}(P_G)\). The choice of definition does not affect \(\tau \)-width: given a ball \(\mathbf {w} + \tau B_2^{E} \subseteq N_v\), \(\Vert \mathbf {w}\Vert = 1\), the orthogonal projection of the ball onto L is \(\mathbf {w}' + \tau B_2^L \subseteq N_v'\) with \(\Vert \mathbf {w}'\Vert \le 1\).

Theorem 29

\(P_G\) is \(\tau \)-wide for \(\tau = 1/(3\sqrt{|E|})\).

Fig. 1
figure 1

A perfect matching and a selection of tight odd sets

Proof

Let \(\chi _M \in P_G\) be a vertex and let \(N_{\chi _M}\) be its normal cone. Let us label the vertices of G such that

$$\begin{aligned} M = \{ u_0v_0, u_1v_1, \ldots , u_{n-1}v_{n-1}\}. \end{aligned}$$

For an odd set \(U \subseteq V\) of vertices, let \(\mathbf {a}_U\) be the corresponding row of the constraint matrix in normal form, i.e. \(\mathbf {a}_U = - \chi _{\delta (U)}\). We consider the following 3-element sets, see Fig. 1:

$$\begin{aligned} \mathcal {U}:= \big \{{\{{ u_k, v_k, w}\} : 0 \le k \le n -1, w \in \{{ u_{k + 1\!\!\!\!\pmod {n}}, v_{k + 1\!\!\!\!\pmod {n}}}}\}\big \}. \end{aligned}$$

Since the corresponding constraints are tight at \(\chi _M\), we have that

$$\begin{aligned} \mathbf {w} := \sum _{U \in \mathcal {U}} \mathbf {a}_U \in N_{\chi _M}. \end{aligned}$$

Note that for \(e \in M\), we have \(\mathbf {w}(e) = -2\), and for \(e \in E \setminus M\), we have \(\mathbf {w}(e) \in \{ -4, -6 \}\) (every vertex is contained in exactly 3 of the sets in \(\mathcal {U}\), and there can be at most one set in \(\mathcal {U}\) that contains both endpoints of \(e \not \in M\)). In particular, \(\Vert \mathbf {w}\Vert \le 6 \sqrt{|E|}\).

Every facet F of \(N_{\chi _M}\) corresponds to an edge from \(\chi _M\) to some other vertex \(\chi _N\). We know that \(M \triangle N\) is a cycle C. The direction \(\mathbf {v}\) of the edge from \(\chi _N\) to \(\chi _M\), which is (inner) normal to F, satisfies

$$\begin{aligned} \mathbf {v}(e) = {\left\{ \begin{array}{ll} 1, &{} e \in C \cap M, \\ -1, &{} e \in C \setminus M, \\ 0, &{} e \not \in C. \end{array}\right. } \end{aligned}$$

We compute

$$\begin{aligned} \left\langle {\mathbf {v}, \mathbf {w}}\right\rangle&= \sum _{e \in C \cap M} \mathbf {w}(e) - \sum _{e \in C \setminus M} \mathbf {w}(e) \ge -2 |C \cap M| + 4 |C \setminus M| = |C|, \end{aligned}$$

where we use the fact that C alternates between edges in M and edges in N for the last equation. Let H be the affine span of F. We obtain

$$\begin{aligned} d(\mathbf {w}, H) = \frac{\langle {\mathbf {v}}, {\mathbf {w}}\rangle }{\Vert \mathbf {v}\Vert } \ge \frac{|C|}{\Vert \mathbf {v}\Vert } = \sqrt{|C|} \ge 2. \end{aligned}$$

In other words, \(N_{\chi _M}\) contains a ball of radius 2 around w, and so \(N_{\chi _M}\) is \(\tau \)-wide for \(\tau = 2/\Vert \mathbf {w}\Vert \ge 1/(3\sqrt{|E|})\). \(\square \)

8 Removing Unboundedness

In this section, we discuss two approaches to add constraints to a pointed polyhedron \(P = \{{ \mathbf {x} \in \mathbb {R}^n : A\mathbf {x} \le \mathbf {b} }\}\) to construct an LP equivalent polytope \(P'\). This requires the additional constraints to be strictly valid for all vertices of P.

Lemma 30

Let \(M \in \mathbb {R}^{n \times n}\) be a matrix whose rows are \(\mathbf {v}_1, \ldots , \mathbf {v}_n \in \mathbb {S}^{n-1}\). Then \(\mathbf {v}_1, \ldots , \mathbf {v}_n\) satisfy the \(\delta \)-distance property if and only if the columns \(\mathbf {u}_1,\ldots ,\mathbf {u}_n\) of \(M^{-1}\) satisfy \(\Vert \mathbf {u}_j\Vert \le 1/\delta \) for all j.

Proof

Let \(H_i = \mathrm{span}\{{ \mathbf {v}_j : j \ne i}\}\). Then

$$\begin{aligned} 1 = \langle {\mathbf {v}_i}, {\mathbf {u}_i}\rangle = \Vert \mathbf {u}_i\Vert d(\mathbf {v}_i, H_i) \iff d(\mathbf {v}_i, H_i) = \frac{1}{\Vert \mathbf {u}_i\Vert } \end{aligned}$$

implies the statement. \(\square \)

Lemma 31

Let \(P = \{{ \mathbf {x} \in \mathbb {R}^n : A\mathbf {x} \le \mathbf {b} }\}\) satisfy the local \(\delta \)-distance property and let \(b_{\max } = \max \{{ \frac{|\mathbf {b}_i|}{\Vert \mathbf {a}_i\Vert } : i \in [m]}\}\). Then every vertex \(\mathbf {x} \in P\) satisfies \(\Vert \mathbf {x}\Vert \le \frac{n b_{\max }}{\delta }\).

Proof

We may assume without loss of generality that \(\Vert \mathbf {a}_i\Vert = 1\) for all \(i \in [m]\). Let \(\mathbf {x}\) be a vertex and let B be a basis for \(\mathbf {x}\). Let \(\mathbf {u}_1, \ldots , \mathbf {u}_n \in \mathbb {R}^n\) be the columns of \(A_B^{-1}\). The triangle inequality and Lemma 30 imply \(\Vert \mathbf {x}\Vert = \Vert A_B^{-1} \mathbf {b}_B\Vert \le \sum _{i \in B} \Vert \mathbf {u}_i\Vert b_{\max } \le n b_{\max } / \delta \). \(\square \)

Proof

of Lemma 8

  1. 1.

    Let \(P = \{{\mathbf {x} \in \mathbb {R}^n : A\mathbf {x} \le \mathbf {b}}\}\) satisfy the local \(\delta \)-distance property. For some feasible basis I we let

    $$\begin{aligned} \mathbf {w} = - \frac{1}{n} \sum _{i \in I} \frac{\mathbf {a}_i}{\Vert \mathbf {a}_i\Vert } \end{aligned}$$

    and \(P' = \{{ \mathbf {x} \in P : \langle {\mathbf {w}}, {\mathbf {x}}\rangle \le \frac{n b_{\max }}{\delta }}\}\). The normal fan of \(P'\) covers \(\mathbb {R}^n\), because \(\mathbf {0}\) lies in the interior of \(\mathrm{conv}(\{{\mathbf {a}_i : i \in I} \cup \left\{ \mathbf {w}\right\} \})\), and so \(P'\) is a polytope. Since \(\Vert \mathbf {w}\Vert < 1\), we have \(\langle {\mathbf {w}}, {\mathbf {x}}\rangle < \Vert \mathbf {x}\Vert \le \frac{n b_{\max }}{\delta }\) for every vertex \(\mathbf {x} \in P\) by Lemma 31, so \(P'\) is LP equivalent to P. Every vertex of \(P'\) is either a vertex of P or the intersection of an unbounded ray of P with the new constraint. Consider a feasible basis B of \(P'\). Either B is already feasible for P, in which case it satisfies the \(\delta \)-distance property. Otherwise, it is of the form \(\mathbf {a}_1, \ldots , \mathbf {a}_{n-1}, \mathbf {w}\), where \(\mathbf {a}_1, \ldots , \mathbf {a}_n\) is a feasible basis for P, after a suitable renumbering of indices. In this case, \(\mathrm{cone}\left\{ \mathbf {a}_1, \ldots , \mathbf {a}_{n-1}\right\} \) is on the boundary of the support of the normal fan of P, and so by the proof of Lemma 5 we have \(d(\mathbf {w}, H) \ge \delta / n\), where \(H = \mathrm{span}\left\{ \mathbf {a}_1, \ldots , \mathbf {a}_{n-1}\right\} \). Using an orthogonal transformation, we may assume without loss of generality that \(\mathbf {a}_{1n} = \dots = \mathbf {a}_{(n-1),n} = 0\) and so the matrix M whose rows are the basis vectors normalized to unit length is of the form

    $$\begin{aligned} M = \begin{pmatrix} A' &{} 0 \\ \mathbf {w}'^T &{} h \end{pmatrix} \in \mathbb {R}^{n \times n}, \end{aligned}$$

    where \(h = d(\mathbf {w}/\Vert \mathbf {w}\Vert , H) \ge \delta / n\) and \(\Vert \mathbf {w}'\Vert < 1\). We compute

    $$\begin{aligned} M^{-1} = \begin{pmatrix} A'^{-1} &{} 0 \\ -\mathbf {w}'^T A'^{-1} / h &{} 1/h \end{pmatrix}. \end{aligned}$$

    By Lemma 30, it is sufficient to show that the norms of the columns of \(M^{-1}\) are bounded by \(2n/\delta ^2\). This is immediate for the last column. Let \(\mathbf {u}_1, \ldots , \mathbf {u}_{n-1}\) be the columns of \(A'^{-1}\). We have \(\Vert \mathbf {u}_i\Vert \le 1/\delta \) by Lemma 30. Furthermore, the i-th entry of the last row of \(M^{-1}\) is bounded in absolute value by

    $$\begin{aligned} \Big | \frac{\langle {\mathbf {w}'}, {\mathbf {u}_i}\rangle }{h} \Big | \le \frac{\Vert \mathbf {u}_i\Vert n}{\delta } \le \frac{n}{\delta ^2}. \end{aligned}$$

    By the triangle inequality, the norms of the first \(n-1\) columns of \(M^{-1}\) are bounded by \(1/\delta + n/\delta ^2 \le 2n/\delta ^2\). This completes the proof that \(P'\) satisfies the local \(\delta ^2/(2n)\)-distance property.

  2. 2.

    Now suppose that A satisfies the global \(\delta \)-distance property. For every vertex \(\mathbf {x} \in P\), we have

    $$\begin{aligned} \langle {-\mathbf {a}_i}, {\mathbf {x}}\rangle \le \Vert \mathbf {a}_i\Vert \Vert \mathbf {x}\Vert \le \frac{n \Vert \mathbf {a}_i\Vert b_{\max }}{\delta } \end{aligned}$$

    by Lemma 31, and so the polytope

    $$\begin{aligned} P' = \Big \{\mathbf {x} \in \mathbb {R}^n: -\frac{n\Vert \mathbf {a}_i\Vert b_\mathrm{max}}{\delta } - 1 \le \langle {\mathbf {a}_i}, {\mathbf {x}}\rangle \le \mathbf {b}_i \quad \forall i \in [m]\Big \} \end{aligned}$$

    is LP equivalent to P. Furthermore, every (not necessarily feasible) basis of the constraint matrix of \(P'\) is equal to a basis of A up to sign changes, which do not affect the \(\delta \)-distance property. Therefore, \(P'\) satisfies the global \(\delta \)-distance property.

\(\square \)

9 The Shadow Simplex Method with Symbolic Perturbation

In this section, we will give a self-contained presentation of the algorithmic details of the shadow simplex method, including the details of coping with degenerate P using a perturbation of the right-hand sides \(\mathbf {b}\). For clarity of presentation, we will first consider the case of simple polyhedra. We will also assume that \(\mathbf {c}\) and \(\mathbf {d}\) are in general position as made precise in the precondition of Algorithm 2.

figure c

Lemma 32

Let \(A \in \mathbb {R}^{m \times n}\), \(\mathbf {b} \in \mathbb {R}^m\), and \(U \in \mathbb {R}^{n \times n}\) invertible. Then a basis B is optimal for \(\max \{{\langle {\mathbf {c}}, {\mathbf {x}}\rangle : A\mathbf {x} \le \mathbf {b}}\}\) if and only if it is optimal for \(\max \{{ \langle {U^T \mathbf {c}}, {\mathbf {x}}\rangle : AU\mathbf {x} \le \mathbf {b}}\}\).

Proof

Let \(\mathbf {a}_1, \dots , \mathbf {a}_m \in \mathbb {R}^n\) be the rows of A. The basis B is optimal for the first problem if and only if \(\mathbf {c} \in \mathrm{cone}\,\{{ \mathbf {a}_i : i \in B }\}\). This is equivalent to \(U^T \mathbf {c} \in \mathrm{cone}\,\{{ U^T \mathbf {a}_i : i \in B}\}\). Since the \(U^T \mathbf {a}_i\) are the rows of AU, this is equivalent to B being an optimal basis for the second problem. \(\square \)

Theorem 33

Algorithm 2 is correct as specified and requires \(O(mn^2 + Nmn)\) arithmetic operations, where N is the number of normal cone facets intersected by \([\mathbf {c},\mathbf {d}]\).

Proof

The initial Gauss elimination requires \(O(mn^2)\) arithmetic operations. Each iteration is dominated by the computation of \(j^\star \) and the rank-1 Gauss elimination update, both of which require O(mn) arithmetic operations.

We will show the invariant that B is an optimal basis for \(\max \{{ \langle {\mathbf {c}_\lambda }, {\mathbf {x}}\rangle : A\mathbf {x} \le \mathbf {u} }\}\), where \(\mathbf {c}_\lambda = (1-\lambda ) \mathbf {c} + \lambda \mathbf {d}\). The invariant initially holds by definition of the input and remains unchanged by the Gauss elimination steps due to Lemma 32.

Fig. 2
figure 2

One iteration of the Shadow Simplex

A typical iteration is illustrated in Fig. 2. The vertex corresponding to B is \(\mathbf {b}_B\), its normal cone is the positive orthant \(\mathbb {R}^n_+\). The invariant implies that \(\mathbf {c}_{\lambda } \in \mathbb {R}^n_+\). If \(\mathbf {d} - \mathbf {c} \ge 0\), the algorithm returns B. Indeed, B is an optimal basis for \(\mathbf {d}\) in this case because \(\mathbf {d} \in \mathbb {R}^n_+\).

Otherwise, the ray from \(\mathbf {c}_\lambda \) through \(\mathbf {d}\) eventually leaves the positive orthant, and in fact \(\mathbf {c}_{\lambda ^\star }\) is the last point contained in \(\mathbb {R}^n_+\). If \(\lambda ^\star \ge 1\) we know that \(\mathbf {d} = \mathbf {c}_1 \in \mathbb {R}^n_+\) and so the current basis is optimal for \(\mathbf {d}\). The index \(i^\star \) is the index into the basis whose contribution to the conic combination describing \(\mathbf {c}_{\lambda ^\star }\) is 0.Footnote 6

Letting \(B' = B \setminus \{ B_{i^\star } \} \cup \{ j^\star \}\), where \(B_{i^\star }\) is the \(i^\star \)-th index in the basis, it is trivially true that \(\mathbf {c}_{\lambda ^\star } \in \mathrm{cone}A_{B'}^T\), so it only remains to show that \(B'\) is a feasible basis. The edge of the polyhedron described by the constraints \(B \setminus \{ B_{i^\star } \}\) is contained in the ray starting at the vertex \(\mathbf {b}_B\) in direction \(-\mathbf {e}_{i^\star }\). This ray can only be cut off by constraints \(\mathbf {a}_j \mathbf {x} \le \mathbf {b}_j\) with \(\mathbf {a}_{ji^\star } < 0\). At least one such constraint must exist by the condition that \(\mathbf {d}\) lies in the support of the normal fan, i.e. the corresponding linear program is bounded. The fraction \(\frac{\langle {\mathbf {a}_j}, {\mathbf {b}_B}\rangle - \mathbf {b}_j}{\mathbf {a}_{ji^\star }}\) is the Euclidean distance from \(\mathbf {b}_B\) to the intersection of the constraint \(\langle {\mathbf {a}_j}, {\mathbf {x}}\rangle \le \mathbf {b}_j\) with the ray, and so \(B'\) is feasible.Footnote 7 This completes the proof of the invariant and thus the proof of correctness.

If the initial objective \(\mathbf {c}\) lies in the facet of a normal cone we may get \(\lambda ^\star = \lambda \) in the first iteration. Nevertheless, due to the precondition on \([\mathbf {c}, \mathbf {d})\), every computed value \(\lambda ^\star \) is distinct and, except for the last one, corresponds to one intersection point of \([\mathbf {c},\mathbf {d}]\) with a facet of a normal cone. Therefore, the number of iterations is bounded by \(N + 1\). \(\square \)

If the input polyhedron were non-simple, the proof of correctness and termination would still work given \(\mathbf {c}\) in sufficiently general position. However, the \(\lambda ^\star \) would then correspond to points of intersection between \([\mathbf {c}, \mathbf {d}]\) and cones corresponding to bases. Those cones are merely subsets of normal cones, and they may not be mutually consistent with a single triangulation of the normal fan. For this reason, we consider a perturbed polyhedron

$$\begin{aligned} P_\varepsilon := \{{ \mathbf {x} \in \mathbb {R}^n : A\mathbf {x} \le \mathbf {b} + \mathbf {\gamma }(\varepsilon )}\}. \end{aligned}$$

where \(\mathbf {\gamma }(\varepsilon ) := (\varepsilon , \varepsilon ^2, \ldots , \varepsilon ^m)\) for \(\varepsilon > 0\).

Lemma 34

If \(\varepsilon > 0\) is sufficiently small, one has that

  1. 1.

    every feasible basis B for \(P_\varepsilon \) is also feasible for P,

  2. 2.

    the normal fan of \(P_\varepsilon \) triangulates the normal fan of P, and

  3. 3.

    if \(B = \{ m - n + 1, \ldots , m \}\) is a feasible basis for P, then it is also feasible for \(P_\varepsilon \).

Proof

If \(P_\varepsilon \) is not simple, there is some basis B and a constraint \(j \not \in B\) such that

$$\begin{aligned} \langle { \mathbf {a}_j }, { A_B^{-1} (\mathbf {b}_B + \mathbf {\gamma }(\varepsilon )_B)}\rangle = \mathbf {b}_j + \varepsilon ^j. \end{aligned}$$

In other words, \(\varepsilon \) is a root of a non-zero polynomial. Since there are only finitely many pairs of B and \(j \not \in B\), we have that \(P_\varepsilon \) is simple for sufficiently small \(\varepsilon > 0\). In particular, the normal fan of \(P_\varepsilon \) is a triangulation. For the second claim, it remains to show that for all vertices \(\mathbf {x} \in P_\varepsilon \) one has \(N_\mathbf {x} \subseteq N_\mathbf {y}\) for some vertex \(\mathbf {y} \in P\). In fact, this is implied by the first claim, which we show next.

Let B be an infeasible basis for P, i.e. there is some \(j \not \in B\) such that

$$\begin{aligned} \langle { \mathbf {a}_j }, { A_B^{-1} (\mathbf {b}_B + \mathbf {\gamma }(\varepsilon )_B)}\rangle > \mathbf {b}_j + \varepsilon ^j \end{aligned}$$

holds for \(\varepsilon = 0\). Since both sides of the inequality are continuous as functions in \(\varepsilon \), strict inequality also holds for all sufficiently small \(\varepsilon > 0\). There are only finitely many infeasible bases for P, so we have that all of them are infeasible for \(P_\varepsilon \) when \(\varepsilon > 0\) is sufficiently small. This implies the first claim.

Finally, if \(B = \{ m - n + 1, \ldots , m \}\) is feasible for P, we have that

$$\begin{aligned} \langle { \mathbf {a}_j }, { A_B^{-1} (\mathbf {b}_B + \mathbf {\gamma }(\varepsilon )_B)}\rangle - \mathbf {b}_j - \varepsilon ^j \le 0 \end{aligned}$$

holds for all \(j \le m - n\) when \(\varepsilon = 0\). The left hand side is a polynomial in \(\varepsilon \) whose lowest-degree non-constant monomial is \(-\varepsilon ^j\). This implies that the inequality also holds when \(\varepsilon > 0\) is sufficiently small, hence B is feasible for \(P_\varepsilon \). \(\square \)

Without explicit bounds on the coefficients describing P, we cannot give a quantitative bound for \(\varepsilon \). We avoid the need for such a bound by applying the perturbation symbolically. Since the right-hand sides \(\mathbf {b}\) never appear in divisors, we can perform related computations in the polynomial ring \(R := \mathbb {R}[\varepsilon ]\). The order \(\le \) on \(\mathbb {R}\) naturally extends to a lexicographic order on R such that \(a \le b\) holds for \(a, b \in R\) if and only if \(a(\varepsilon ) \le b(\varepsilon )\) holds over the reals for all sufficiently small \(\varepsilon > 0\).

Theorem 35

Let \(P = \{{ \mathbf {x} \in \mathbb {R}^n : A\mathbf {x} \le \mathbf {b} }\}\) be pointed, \(\mathbf {c}, \mathbf {d} \in \mathbb {R}^n\), and B an optimal basis for \(\mathbf {c}\). If every intersection of \([\mathbf {c}, \mathbf {d})\) with a facet F of a cone spanned by a feasible basis of P lies in the relative interior of F, the Shadow Simplex can be used to compute an optimal basis for \(\mathbf {d}\) in \(O(mn^2 + Nmn)\) arithmetic operations, where N is the number of intersections of \([\mathbf {c},\mathbf {d}]\) with some triangulation of the normal fan of P that contains the cone spanned by the initial basis B.

Proof

Rearrange the rows of \(A\mathbf {x} \le \mathbf {b}\) so that \(B = \{{ m - n + 1, \ldots , m }\}\) and apply Lemma 34. This gives us all preconditions of the Shadow Simplex algorithm, and the proof of Theorem 33 applies with a single caveat: the computation of \(j^\star \) involves computations and comparisons of terms of the form \(\frac{\langle {\mathbf {a}_j}, {\mathbf {b}_B + \mathbf {\gamma }(\varepsilon )_B}\rangle - \mathbf {b}_j - \varepsilon ^j}{\mathbf {a}_{ji^\star }} \in R\). The resulting polynomials contain at most \(n+2\) monomials. By storing them as sparse sorted vectors, we can compute each term and compare it to the previous best in time O(n), so that the computation of \(j^\star \) still requires only O(mn) arithmetic operations. \(\square \)