1 Introduction

In order to eliminate redundant information and strengthen the formulation of an integer program, solvers apply a number of techniques before the first linear programming relaxation of an instance is solved. This step is referred to as presolving or preprocessing. The solvers then work with this reduced formulation rather than the original and recover the values of the original variables afterwards. Presolving techniques are not only applied before solving the linear programming relaxation at the root node of the branch-and-bound tree, but may also be performed in a reduced fashion, called node presolving, at other nodes of the tree. In this paper, however, we concentrate on techniques that we apply in the very first step of the solution process.

Presolving has been applied for solving linear and mixed integer programming problems for decades. Brearley et al. [13] and Williams [27] discussed bound tightening, row elimination, and variable fixings in mathematical programming systems, while Andersen and Andersen [5] presented presolving techniques in the context of linear programming. In addition, presolving techniques on zero-one inequalities have been studied by Guignard and Spielberg [18], Johnson and Suhl [21], Crowder et al. [14], and Hoffman and Padberg [19]. Williams [28] developed a projection method for the elimination of integer variables and Savelsbergh [25] investigated preprocessing and probing techniques for mixed integer programming problems. An overview of different presolving techniques can be found in the books of Nemhauser [24] and Wolsey [29], in Fügenschuh and Martin [17] and in Mahajan [23]. Details on implementing presolving techniques effectively within a mixed integer linear programming solver are presented in Suhl and Szymanski [26], Atamtürk and Savelsbergh [7], and Achterberg [2].

The impact of presolving on the entire solution process of mixed integer linear programming problems was analyzed by Bixby and Rothberg [10]. Disabling root presolving resulted in a mean performance degradation of about a factor of ten. Only cutting planes had a bigger influence on the solving process.

This paper is organized as follows. In Sect. 2 we present our notation. Section 3 describes a presolving technique we call Stuffing Singleton Columns, where continuous variables with only one non-zero coefficient in the constraint matrix are fixed at a suitable bound. In Sect. 4 we characterize a column-based method called Dominating Columns that works on a partial order; using this dominance relation, new fixings and bounds can be derived. In Sect. 5, a technique based on Connected Components is presented that allows splitting the whole problem into subproblems which can be solved independently. In Sect. 6 we show computational results for all three presolving techniques with the state-of-the-art non-commercial MIP solver SCIP [3] on supply chain management instances and a test set comprised of all benchmark instances from the last three MIPLIB versions [4, 9, 22], comparing the achieved reductions as well as the overall performance improvements. We close our paper with some conclusions in Sect. 7.

2 Notation and basics

Consider a mixed integer program (MIP)

$$\begin{aligned} \begin{array}{lll} &{} \text{ min } &{} c^{T}x \\ &{} \text{ s.t. } &{} Ax\le b \\ &{} &{} 0 \le \ell \le x \le u \\ &{} &{}x \in \mathbb {Z}^{p} \times \mathbb {R}^{n-p} \end{array}\end{aligned}$$
(1)

with \(c \in \mathbb {R}^{n}, \ell , u \in \mathbb {R}^{n}_+\cup \left\{ +\infty \right\} , A \in \mathbb {R}^{m \times n}, b\in \mathbb {R}^{m}\) and \(p \in \{0,1,\ldots ,n\}\).

We use the notation \(A_{\cdot j}\) for column j of the matrix A and \(A_{i\cdot }\) for the i-th row vector. The entry in the i-th row and j-th column of A is denoted by \(a_{ij}\).

For \(x \in \mathbb {R}^{n}\), \(\text{ supp }\,(x)=\{i \in \{1,2,\ldots ,n\} \mid x_i\not = 0\}\) denotes the support of x.

In [13], a procedure for tightening the bounds of variables was presented. Also important for our algorithms are the so-called maximal (2) and minimal (3) activities of a linear constraint \(A_{r\cdot }x\le b_r\):

$$\begin{aligned} U_r= & {} \sum _{\scriptstyle \forall k,\, a_{rk}>0} a_{rk}u_k + \sum _{\scriptstyle \forall k,\, a_{rk}<0} a_{rk}\ell _{k} \end{aligned}$$
(2)
$$\begin{aligned} L_r= & {} \sum _{\scriptstyle \forall k,\, a_{rk}>0} a_{rk}\ell _k + \sum _{\scriptstyle \forall k,\, a_{rk}<0} a_{rk}u_{k} \end{aligned}$$
(3)

\(L_r\) may be \(-\infty \) and \(U_r\) may be \(+\infty \). Obviously, \(L_r \le A_{r\cdot }x \le U_r\) holds. By using the minimal activity \(L_r\), it is possible to calculate upper bounds \(u_{j}^*\) and lower bounds \(\ell _{j}^*\) for variable \(x_j\). For all feasible solutions x the inequalities

$$\begin{aligned} x_j\le & {} \frac{b_r- L_r + a_{rj}\ell _j }{a_{rj}}=u'_{rj}, \forall r \quad \text{ with } a_{rj}>0 \end{aligned}$$
(4)
$$\begin{aligned} x_j\ge & {} \frac{b_r- L_r + a_{rj}u_j }{a_{rj}}=\ell '_{rj}, \forall r \quad \text{ with } a_{rj}<0 \end{aligned}$$
(5)

hold. Hence we obtain potentially new bounds by

$$\begin{aligned} u^{*}_j = \min \left\{ u_j,\min _{\forall r, a_{rj}>0}\{u'_{rj}\}\right\} \quad {\text {and}}\quad \ell ^{*}_j = \max \left\{ \ell _j,\max _{\forall r, a_{rj}<0}\{\ell '_{rj}\}\right\} . \end{aligned}$$

For integer variables we may also apply rounding and use \(\lfloor u_j^* \rfloor \) or \(\lceil \ell _j^*\rceil \) instead. Thus, we may assume that integer variables have integer bounds. However, we do not require that bound tightening is applied prior to the methods described in the following sections.
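The bound-tightening pass can be sketched in a few lines of Python. The dense list-of-lists matrix, the function name, and the use of math.inf for infinite bounds are illustrative assumptions and not part of any solver's API; the sketch performs a single pass over the rows using (3), (4) and (5).

```python
import math

def tighten_bounds(A, b, lb, ub, is_int):
    """One pass of bound tightening based on (3), (4) and (5).

    A, b describe the rows of Ax <= b; lb, ub are the current bounds
    (math.inf / -math.inf for infinite ones); is_int marks integer variables.
    """
    m, n = len(A), len(lb)
    for r in range(m):
        # minimal activity L_r of row r, see (3)
        L = sum(A[r][k] * (lb[k] if A[r][k] > 0 else ub[k])
                for k in range(n) if A[r][k] != 0)
        if L == -math.inf:
            continue  # nothing can be deduced from this row
        for j in range(n):
            a = A[r][j]
            if a > 0:    # candidate upper bound, see (4)
                cand = (b[r] - L + a * lb[j]) / a
                ub[j] = min(ub[j], math.floor(cand) if is_int[j] else cand)
            elif a < 0:  # candidate lower bound, see (5)
                cand = (b[r] - L + a * ub[j]) / a
                lb[j] = max(lb[j], math.ceil(cand) if is_int[j] else cand)
    return lb, ub
```

In practice, such a pass is iterated until no bound changes anymore or a work limit is reached.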

3 Stuffing singleton columns

A singleton column is a column of the matrix A with \(|\text{ supp }\,(A_{\cdot j})|=1\). In this section we analyze a set of singleton columns of continuous variables \(x_j\) within a row r and try to fix them at one of their bounds.

To illustrate the basic idea, consider the linear programming relaxation of the binary knapsack problem. Items, each with a certain profit and size, need to be selected to be put into a knapsack of a given capacity. In contrast to the binary knapsack problem, it is possible to pack any fraction of an item. An optimal solution can be obtained by sorting the items by decreasing profit/size ratio and selecting them in this order until the knapsack capacity is reached [15]. Transferring this idea to a general MIP of the form (1) causes two difficulties. Integer variables are present and variables usually appear in more than one row. So we cannot simply proceed in the same manner as in the linear programming relaxation of the binary knapsack problem, but need to modify the idea as described in the following.

Suppose there is a column j of problem (1) with \(c_j\ge 0\) satisfying \(a_{rj}\ge 0\) for all rows \(r \in \{1,\ldots ,m\}\). If \(\ell _j > -\infty \), we can set variable \(x_j\) to its lower bound. If \(\ell _j = -\infty \) and \(c_j>0\), then the problem is unbounded. The analogous argument applies to a column j with \(c_j \le 0\) and \(a_{rj}\le 0\) for all rows r. If \(u_j < \infty \), we can set variable \(x_j\) to its upper bound. In case \(u_j=\infty \) and \(c_j<0\), the problem is unbounded. This presolving technique is called duality fixing [17]. Thus, duality fixing already covers the cases where \(c_j/a_{rj}\ge 0\). We use additional information about the rows in order to treat the remaining cases where \(c_j/a_{rj}<0\).
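Before turning to those remaining cases, a sketch of duality fixing in Python may help to delineate which sign patterns are already covered; the dense matrix layout and the function name are illustrative assumptions, not a solver API.

```python
import math

def duality_fixing(A, c, lb, ub):
    """Duality fixing [17]: treat columns whose objective coefficient and
    all constraint coefficients point in the same direction."""
    m, n = len(A), len(c)
    fixings = {}
    for j in range(n):
        col = [A[r][j] for r in range(m)]
        if c[j] >= 0 and all(a >= 0 for a in col):
            if lb[j] > -math.inf:
                fixings[j] = lb[j]   # x_j can be fixed to its lower bound
            elif c[j] > 0:
                raise ValueError("unbounded (if the problem is feasible)")
        elif c[j] <= 0 and all(a <= 0 for a in col):
            if ub[j] < math.inf:
                fixings[j] = ub[j]   # x_j can be fixed to its upper bound
            elif c[j] < 0:
                raise ValueError("unbounded (if the problem is feasible)")
    return fixings
```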

Let us first focus on the case \(a_{rj}>0\) and \(c_j<0\). For a given row r, consider the set of variable indices

$$\begin{aligned} J(r)= \{j \in \{1,\ldots ,n\} \mid x_j \in \mathbb {R}\wedge |\text{ supp }\,(A_{\cdot j})|=1 \wedge a_{rj}>0 \wedge c_j <0\}. \end{aligned}$$
(6)

Furthermore, we define the following restricted maximal activity, which is similar to the maximal activity (2) of row r except that continuous singleton columns \(x_j\) with \(j \in J(r)\) are considered to be at their lower bounds.

$$\begin{aligned} \tilde{U}_r= & {} \sum _{\begin{array}{c} j \in J(r) \end{array}} a_{rj}\ell _j + \sum _{\begin{array}{c} j \notin J(r) \\ a_{rj}>0 \end{array}} a_{rj} u_j + \sum _{\begin{array}{c} j \notin J(r)\\ a_{rj}<0 \end{array}} a_{rj}\ell _j \end{aligned}$$
(7)

The values \(\tilde{U}_r\) can now be used to determine optimal variable values for singleton columns. First, we sort the indices \(j\in J(r)\) by their cost/size ratios \(c_j/a_{rj}\) in non-decreasing order. Let s be the first index in this order. Define \(\varDelta =a_{rs} (u_s - \ell _s)\) to be the difference in the contribution of variable \(x_{s}\) to constraint r when setting \(x_{s}\) to its upper instead of its lower bound. If \(\varDelta \le b_r - \tilde{U}_r\), \(x_s\) can be set to its upper bound. After this step, s is removed from J(r), \(\tilde{U}_r\) is updated by adding \(\varDelta \) and the procedure is iterated.
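The following sketch implements this loop for a single row in Python; the dense data layout and the function name are illustrative assumptions, and only the case \(a_{rj}>0\), \(c_j<0\) treated above is covered.

```python
def stuff_singleton_columns(A, b, c, lb, ub, row, J):
    """Stuffing for one row `row`: J is the index set J(row) from (6),
    i.e. continuous singleton columns with a_{row,j} > 0 and c_j < 0.
    Returns the variables that may be fixed at their upper bound."""
    J = set(J)
    # restricted maximal activity (7): J at lower bounds, all other
    # variables at their worst-case (maximal) contribution
    U = 0.0
    for j in range(len(c)):
        a = A[row][j]
        if a == 0:
            continue
        U += a * lb[j] if j in J else a * (ub[j] if a > 0 else lb[j])
    fixed = []
    # candidates sorted by non-decreasing cost/size ratio c_j / a_{row,j}
    for j in sorted(J, key=lambda k: c[k] / A[row][k]):
        delta = A[row][j] * (ub[j] - lb[j])
        if delta <= b[row] - U:   # enough slack even in the worst case
            fixed.append(j)       # x_j can be fixed to u_j (Theorem 1)
            U += delta
        else:
            break                 # Theorem 1 only covers the smallest remaining ratio
    return fixed
```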

Theorem 1

Given a MIP of the form (1), some row r, the index set J(r) as defined in (6), \(\tilde{U}_r\) as defined in (7), and the index \(s\in J(r)\) with the smallest ratio \(c_s/a_{rs}\). If \(\varDelta = a_{rs} (u_s - \ell _s) \le b_r - \tilde{U}_r\), then \(x_s = u_s\) holds for at least one optimal solution.

Proof

Let \(x^*\) be an optimal solution with \(x_s^*<u_s\). If \(x_j^*=\ell _j\) for all \(j \in J(r)\!\!\setminus \!\!\{s\}\), then a new solution \(x'\) constructed by setting \(x_s\) to \(u_s\) is feasible because

$$\begin{aligned} \sum _{\forall j} a_{rj} x_j'= & {} \sum _{\forall j, j \ne s}a_{rj}x_j' + a_{rs} u_s \le \tilde{U}_r + \varDelta \le b_r. \end{aligned}$$

Additionally, the objective function value improves because \(c_s < 0\). This contradicts our assumption of \(x^*\) being optimal. Hence there exists a \(j \in J(r)\!\!\setminus \!\! \{s\}\) with \(x_j^* > \ell _j\). In this case we can construct a new solution \(x'\) in which we decrease the value of \(x_j^*\) to \(x_j'\) while at the same time increasing the value of \(x_s\) such that \(A_{r\cdot }x' = A_{r\cdot }x^{*}\). In particular, \(a_{rs}(x_s'-x_s^*) = a_{rj}(x_j^*-x_j')\) holds. The change of the objective function can be estimated by

$$\begin{aligned} c_s x_s' + c_j x_j'= & {} c_s x_s^* + c_j x_j^* + c_s(x_s'-x_s^*) - c_j(x_j^*-x_j') \\= & {} c_s x_s^* + c_j x_j^* + \frac{c_s a_{rs}}{a_{rs}}(x_s'-x_s^*) - \frac{c_j a_{rj}}{a_{rj}}(x_j^*-x_j') \\\le & {} c_s x_s^* + c_j x_j^* + \frac{c_s a_{rs}}{a_{rs}}(x_s'-x_s^*) - \frac{c_s a_{rj}}{a_{rs}}(x_j^*-x_j') \\= & {} c_s x_s^* + c_j x_j^* + \frac{c_s}{a_{rs}}(a_{rs}(x_s'-x_s^*) - a_{rj}(x_j^*-x_j')) \\= & {} c_s x_s^* + c_j x_j^*. \end{aligned}$$

If \(x_s'=u_s\), the theorem is proved. Otherwise, \(x_j'=\ell _j\) holds. Applying this argument iteratively results in an optimal solution with \(x_s'=u_s\) or \(x_j'=\ell _j\) for all \(j \in J(r)\!\!\setminus \!\! \{s\}\). But as we have shown before, the latter case contradicts the optimality of \(x'\). \(\square \)

A similar procedure applies to the case \(a_{rj}<0\) and \(c_j>0\). We define

$$\begin{aligned} J(r)= \{j \in \{1,\ldots ,n\} \mid x_j \in \mathbb {R}\wedge |\text{ supp }\,(A_{\cdot j})|=1 \wedge a_{rj}<0 \wedge c_j >0\} \end{aligned}$$

and

$$\begin{aligned} {\tilde{L}}_r= & {} \sum _{\begin{array}{c} j \in J(r) \end{array}} a_{rj}\ell _j +\sum _{\begin{array}{c} j \notin J(r) \\ a_{rj}<0 \end{array}} a_{rj} u_j +\sum _{\begin{array}{c} j \notin J(r) \\ a_{rj}>0 \end{array}} a_{rj}\ell _j. \end{aligned}$$

Now, we begin with the index \(s \in J(r)\) corresponding to the greatest ratio \(c_j/a_{rj}\). If \(\varDelta \ge b_r - \tilde{L}_r\), \(x_s\) can be set to its upper bound. We update the set J(r) and the value \(\tilde{L}_r\) according to their definition and repeat the process.

Sorting the ratios takes \(O(|J(r)| \log |J(r)|)\) and computing whether the variables can be set to the upper or lower bound requires O(|J(r)|). Furthermore, the size of J(r) is usually small and hence the algorithm does not deteriorate the performance on instances where no or only very few reductions are found. This is especially true for most MIPLIB instances. For certain practical problems, such as supply chain management, stuffing singleton columns may, however, find fixings quite often (see Sect. 6).

4 Dominating columns

This presolving technique is based on a \(\le \)-relation between the coefficients of two variables. Because this relation is reflexive, antisymmetric and transitive, it defines a partial order (poset). The relation implies a consecutive behavior of the variable values, in the sense that one variable is pushed to one of its bounds before the other leaves its bound, and can thus be seen as a dominance relation. The idea of exploiting a dominance relation between variables for presolving is not new. Andersen and Andersen [5] used dominating columns in the presolving of linear programming problems and Borndörfer [12] applied this technique in the context of set partitioning problems. In addition, Babayev and Mardanov [8] and Zhu and Broughan [31] introduced procedures based on comparing pairs of columns for reducing the number of integer variables, mostly applied to knapsack problems. Our method can be seen as a generalization and extension of existing dominating columns approaches for mixed integer programming problems.

4.1 Dominance relation

Definition 1

Let a MIP of the form (1) with two different variables \(x_j\) and \(x_i\) of the same type, i.e., binary, integer or continuous, be given. We say \(x_j\) dominates \(x_i\) (\(x_j \succ x_i\)), if

  1. (i)

    \(c_j \le c_i\) and

  2. (ii)

    \(a_{rj}\le a_{ri}\) for every constraint r.

We call \(x_j\) the dominating variable and \(x_i\) the dominated variable.
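As a small illustration, the two conditions of Definition 1 translate directly into a pairwise check; the dense matrix layout and the function name below are illustrative only.

```python
def dominates(A, c, j, i):
    """Return True if x_j dominates x_i in the sense of Definition 1.
    Both variables are assumed to be of the same type; A is a dense
    row-major coefficient matrix."""
    if c[j] > c[i]:                            # condition (i): c_j <= c_i
        return False
    return all(row[j] <= row[i] for row in A)  # condition (ii): a_rj <= a_ri for all r
```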

The following example illustrates Definition 1.

Example 1

In this example \(x_1 \succ x_2\), \(x_3 \succ x_4\) and \(x_5 \succ x_6\) and the optimal solution is \(x_1=4, x_2=0, x_3=3.5, x_4=1, x_5=1,x_6=0\) with value \(-5.5\).

Example 1 suggests that one of the variables involved in a dominance relation is at one of its bounds in an optimal solution. Indeed, this is a general property of the dominance relation that we will prove in the following. To this end, we first show that increasing the dominating variable and decreasing the dominated variable by the same amount preserves feasibility and optimality, provided the variable bounds are still satisfied.

Lemma 1

Let \(\bar{x}\) be a feasible solution for (1) and \(x_j \succ x_i\). For \(0 < \alpha \in \mathbb {R}\), we define \(x^{*}\) with

$$\begin{aligned} x^{*}_{k} = {\left\{ \begin{array}{ll} \bar{x}_{k} + \alpha &{}\quad k = j,\\ \bar{x}_{k} - \alpha &{}\quad k = i,\\ \bar{x}_{k} &{}\quad \mathrm{else}. \end{array}\right. } \end{aligned}$$

If \(x^{*}_{j} = \bar{x}_{j} + \alpha \le u_j\) and \(x^{*}_{i} = \bar{x}_{i} - \alpha \ge \ell _i\), then \(x^{*}\) is feasible and its objective value is not worse than the one of \(\bar{x}\).

Proof

For every constraint \(A_{r\cdot }x \le b_r\), we get

$$\begin{aligned} \sum _{k = 1}^{n} a_{rk}x^{*}_{k}= & {} \sum _{\begin{array}{c} k = 1\\ k \not = i,j \end{array}}^{n}a_{rk}\bar{x}_k + a_{rj}(\bar{x}_j + \alpha ) + a_{ri} (\bar{x}_i - \alpha ) \\= & {} \underbrace{\sum _{\begin{array}{c} k = 1 \end{array}}^{n}a_{rk}\bar{x}_k}_{\le b_r} +~\alpha \underbrace{(a_{rj}-a_{ri})}_{\le 0} \le b_r. \end{aligned}$$

By assumption the bounds of the variables are fulfilled, hence \(x^{*}\) is feasible. Additionally, we know from Definition 1 that \(c_j\le c_i\), thus \(c^{T}x^{*} = c^{T}\bar{x} + \alpha (c_{j} - c_{i}) \le c^{T}\bar{x}\), i.e., the objective value is not getting worse. \(\square \)

This leads us to the following theorem, stating that the dominated variable is at its lower bound or the dominating variable is at its upper bound in at least one optimal solution.

Theorem 2

Let \(x_j \succ x_i\), then there always exists an optimal solution \(x^{*}\) for (1) with

$$\begin{aligned} x^{*}_j = u_j \vee x^{*}_i = \ell _i. \end{aligned}$$

Proof

Let \(\bar{x}\) be an optimal solution with \(\bar{x}_j < u_j \wedge \bar{x}_i > \ell _i\). We construct a feasible solution \(x^{*}\) by defining \(\alpha = \min \{\bar{x}_i - \ell _i, u_j - \bar{x}_j\}\) and applying Lemma 1. Since \(\bar{x}\) is optimal and \(c^{T}x^{*} \le c^{T}\bar{x}\), \(x^{*}\) is optimal. By definition of \(\alpha \), also \(x^{*}_j = u_j \vee x^{*}_i = \ell _i\) holds. \(\square \)

4.2 Predictive bound analysis

Based on Theorem 2, we describe sufficient conditions that, in combination with Definition 1, allow us to tighten bounds or fix variables. We first extend the maximal and minimal row activities (2) and (3) to functions of one variable \(x_t\).

Definition 2

Let a linear constraint \(A_{r\cdot }x \le b_r\) of (1) and a variable \(x_{t}\) be given. We denote by

$$\begin{aligned} U^{t}_r(x_{t})= & {} \sum _{\begin{array}{c} k=1\\ k\not =t\\ a_{rk}>0 \end{array}}^{n} a_{rk}u_k + \sum _{\begin{array}{c} k=1\\ k\not =t\\ a_{rk}<0 \end{array}}^{n} a_{rk}\ell _{k} + a_{rt}x_{t} \end{aligned}$$

the conditional maximal activity of the linear constraint w.r.t. \(x_t\) and by

$$\begin{aligned} L^{t}_r(x_{t})= & {} \sum _{\begin{array}{c} k=1\\ k\not =t\\ a_{rk}>0 \end{array}}^{n} a_{rk}\ell _k + \sum _{\begin{array}{c} k=1\\ k\not =t\\ a_{rk}<0 \end{array}}^{n} a_{rk}u_{k} + a_{rt}x_{t} \end{aligned}$$

the conditional minimal activity of the linear constraint w.r.t. \(x_t\).

Definition 2 is now used to define specific functions, which predict the bounds of variable \(x_s\) depending on the value of another variable \(x_t\). We call this approach predictive bound analysis.

Definition 3

Let (1) and two variables \(x_{s}\) and \(x_{t}\) be given. We define the functions

$$\begin{aligned} \begin{array}{lcrl} \textsc {maxl}^{t}_{s}(x_{t}) &{}=&{} \displaystyle \max _{r=1,\ldots ,m} \left\{ \frac{b_r - L^{t}_r(x_{t}) + a_{rs}u_s}{a_{rs}} \mid a_{rs},a_{rt} < 0 \right\} \!,&{}\\ \textsc {maxu}^{t}_{s}(x_{t}) &{}=&{} \displaystyle \max _{r=1,\ldots ,m} \left\{ \frac{b_r - U^{t}_r(x_{t}) + a_{rs}\ell _s}{a_{rs}} \mid a_{rs},a_{rt} < 0 \right\} \!,&{}\\ \textsc {minl}^{t}_{s}(x_{t}) &{}=&{} \displaystyle \min _{r=1,\ldots ,m} \left\{ \frac{b_r - L^{t}_r(x_{t}) + a_{rs}\ell _s}{a_{rs}} \mid a_{rs},a_{rt} > 0 \right\} &{} \text{ and } \\ \textsc {minu}^{t}_{s}(x_{t}) &{}=&{} \displaystyle \min _{r=1,\ldots ,m} \left\{ \frac{b_r - U^{t}_r(x_{t}) + a_{rs}u_s}{a_{rs}} \mid a_{rs},a_{rt} > 0 \right\} \!.&{} \end{array} \end{aligned}$$

\(\textsc {minl}^{t}_{s}(x_{t})\) takes into account all constraints in which \(x_{s}\) and \(x_{t}\) have positive coefficients, i.e., a subset of the constraints that imply an upper bound on \(x_{s}\). Similar to the bound tightening (see (4)), the upper bound on \(x_{s}\) is computed for each constraint, but instead of using the minimal activity, the conditional minimal activity w.r.t. \(x_{t}\) is used. Therefore, each constraint gives an upper bound for \(x_{s}\) subject to the value of \(x_{t}\). Minimizing over these bounds, \(\textsc {minl}^{t}_{s}(x_{t})\) gives the tightest implied upper bound on \(x_{s}\) as a function of the value of \(x_{t}\). Analogously, \(\textsc {maxl}^{t}_{s}(x_{t})\) gives the tightest implied lower bound on \(x_{s}\) as a function of the value of \(x_{t}\).

The other two functions \(\textsc {maxu}^{t}_{s}(x_{t})\) and \(\textsc {minu}^{t}_{s}(x_{t})\) take into account the maximal instead of the minimal activity. Since the maximal activity is the worst-case when regarding feasibility of a \(\le \)-constraint, we obtain a larger lower and a smaller upper bound for \(x_s\) subject to \(x_t\). This range of values for \(x_s\) is feasible for all constraints, independent of the other variables. It may, however, exceed the variable’s domain.
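The following Python sketch evaluates the conditional activities of Definition 2 and the four functions of Definition 3 for a dense matrix; returning \(\pm \infty \) when no row qualifies corresponds to the case excluded by the finiteness assumption below, and all names are illustrative.

```python
import math

def cond_activity(A, lb, ub, r, t, xt, maximal):
    """Conditional activity of row r w.r.t. x_t (Definition 2):
    U^t_r(x_t) if maximal is True, otherwise L^t_r(x_t)."""
    act = A[r][t] * xt
    for k in range(len(lb)):
        a = A[r][k]
        if k == t or a == 0:
            continue
        if maximal:
            act += a * (ub[k] if a > 0 else lb[k])
        else:
            act += a * (lb[k] if a > 0 else ub[k])
    return act

def maxl(A, b, lb, ub, s, t, xt):
    vals = [(b[r] - cond_activity(A, lb, ub, r, t, xt, False) + A[r][s] * ub[s]) / A[r][s]
            for r in range(len(b)) if A[r][s] < 0 and A[r][t] < 0]
    return max(vals) if vals else -math.inf   # no qualifying row

def maxu(A, b, lb, ub, s, t, xt):
    vals = [(b[r] - cond_activity(A, lb, ub, r, t, xt, True) + A[r][s] * lb[s]) / A[r][s]
            for r in range(len(b)) if A[r][s] < 0 and A[r][t] < 0]
    return max(vals) if vals else -math.inf

def minl(A, b, lb, ub, s, t, xt):
    vals = [(b[r] - cond_activity(A, lb, ub, r, t, xt, False) + A[r][s] * lb[s]) / A[r][s]
            for r in range(len(b)) if A[r][s] > 0 and A[r][t] > 0]
    return min(vals) if vals else math.inf

def minu(A, b, lb, ub, s, t, xt):
    vals = [(b[r] - cond_activity(A, lb, ub, r, t, xt, True) + A[r][s] * ub[s]) / A[r][s]
            for r in range(len(b)) if A[r][s] > 0 and A[r][t] > 0]
    return min(vals) if vals else math.inf
```

The fixing criteria derived below (Corollary 1) then reduce to comparing these values with the variable bounds.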

In the following, we assume \(\textsc {maxl}^{t}_{s}(x_t), \textsc {maxu}^{t}_{s}(x_t), \textsc {minl}^{t}_{s}(x_t)\), and \(\textsc {minu}^{t}_{s}(x_t)\) to be finite, in particular, that there are rows satisfying the requirements on the coefficients as demanded in Definition 3.

Next, we show that these four functions are strictly monotonically decreasing. This property is fundamental: each function attains its largest value when \(x_t\) is at its lower bound and its smallest value when \(x_t\) is at its upper bound.

Lemma 2

\(\textsc {maxl}^{t}_s(x_{t})\), \(\textsc {maxu}^{t}_{s}(x_{t})\), \(\textsc {minl}^{t}_{s}(x_{t})\) and \(\textsc {minu}^{t}_{s}(x_{t})\) are strictly monotonically decreasing functions, i.e., for \(\ell _t \le x_{t}'< x_{t}''\le u_t\) it holds that

$$\begin{aligned} \textsc {maxl}^{t}_s(x_{t}')> & {} \textsc {maxl}^{t}_s(x_{t}''),\\ \textsc {maxu}^{t}_s(x_{t}')> & {} \textsc {maxu}^{t}_s(x_{t}''),\\ \textsc {minl}^{t}_s(x_{t}')> & {} \textsc {minl}^{t}_s(x_{t}'') \text { and} \\ \textsc {minu}^{t}_s(x_{t}')> & {} \textsc {minu}^{t}_s(x_{t}''). \end{aligned}$$

Proof

We only prove the first inequality; the others can be shown analogously. Let \(\tilde{r}\) be one row defining the maximum in the computation of \(\textsc {maxl}^{t}_s(x_{t}'')\). Since \(L^{t}_{\tilde{r}}(x_{t}'') - L^{t}_{\tilde{r}}(x_{t}') = a_{{\tilde{r}}t}(x_{t}'' - x_{t}')\) and \(a_{{\tilde{r}}s}, a_{{\tilde{r}}t} < 0\) by the definition of \(\textsc {maxl}^{t}_s(x_{t})\), the following holds:

$$\begin{aligned} \textsc {maxl}^{t}_s(x_{t}') - \textsc {maxl}^{t}_s(x_{t}'')= & {} \max _{r=1,\dots ,m} \left\{ \frac{b_r - L^{t}_r(x_{t}') + a_{rs}u_s}{a_{rs}} \big | a_{rs},a_{rt} < 0 \right\} \\&- \max _{r=1,\dots ,m} \left\{ \frac{b_r - L^{t}_r(x_{t}'') + a_{rs}u_s}{a_{rs}} \big | a_{rs},a_{rt} < 0 \right\} \\\ge & {} \frac{b_{\tilde{r}} - L^{t}_{\tilde{r}}(x_{t}') + a_{{\tilde{r}}s}u_s}{a_{{\tilde{r}}s}} - \frac{b_{\tilde{r}} - L^{t}_{\tilde{r}}(x_{t}'') + a_{{\tilde{r}}s}u_s}{a_{{\tilde{r}}s}} \\= & {} \frac{a_{{\tilde{r}}t}}{a_{{\tilde{r}}s}}(x_{t}'' - x_{t}') \\> & {} 0 \end{aligned}$$

\(\square \)

These functions can help us to infer bounds for the dominating or the dominated variable in an optimal solution.

Theorem 3

Let two continuous variables \(x_j,x_i\) of (1) and \(x_j \succ x_i\) be given, then the bounds

  1. (i)

    \(x_j \le \textsc {minl}^{i}_j(\ell _i)\),

  2. (ii)

    \( x_i \ge \textsc {maxl}^{j}_i(u_j)\),

  3. (iii)

    \(x_j \ge \min \{u_{j},\textsc {maxl}^{i}_j(\ell _i)\}\),

  4. (iv)

    \(x_i \le \max \{\ell _{i},\textsc {minl}^{j}_i(u_j)\}\),

  5. (v)

    If \(c_j \le 0\) then \(x_j \ge \min \{u_{j}, \textsc {minu}^{i}_j(\ell _i)\}\) and

  6. (vi)

    If \(c_i \ge 0\) then \(x_i \le \max \{\ell _{i},\textsc {maxu}^{j}_i(u_j)\}\)

hold for at least one optimal solution.

Proof

  1. (i)

    By Definition 3, \(a_{rj}\) and \(a_{ri}\) are positive for all rows regarded for the computation of \(\textsc {minl}^{i}_j(\ell _i)\). Therefore setting \(x_{i}\) to \(\ell _i\) does not change the minimal activity (3) of the row. Thus, by (4), \(\textsc {minl}^{i}_j(\ell _i)\) is an upper bound on \(x_j\).

  2. (ii)

    By Definition 3, \(a_{rj}\) and \(a_{ri}\) are negative for all rows regarded for the computation of \(\textsc {maxl}^{j}_i(u_j)\). Therefore setting \(x_{j}\) to \(u_j\) does not change the minimal activity (3) of the row. Thus, by (5), \(\textsc {maxl}^{j}_i(u_j)\) is a lower bound on \(x_i\).

  3. (iii)

    We only treat the interesting case \(\ell _j \le \textsc {maxl}^{i}_j(\ell _i)\). By Definition 3, there exists one row r with \(a_{rj}\textsc {maxl}^{i}_j(\ell _i) + L^{i}_r(\ell _{i}) - a_{rj}u_j = b_r\). Let \(x^{*}\) be an optimal solution with \(\alpha = \min \{u_{j},\textsc {maxl}^{i}_j(\ell _i)\} - x_j^{*} > 0\). Assuming \(x_i^{*} = \ell _i\), then \(a_{rj}\textsc {maxl}^{i}_j(\ell _i) + L^{i}_r(\ell _{i}) - a_{rj}u_j = b_r < a_{rj}x_j^{*}+ L^{i}_r(\ell _{i})-a_{rj}u_j\) since \(a_{rj}<0\). This leads to a contradiction, because \(x^{*}\) was chosen to be feasible. Thus \(x_i^{*} = \ell _i+\beta \) with \(\beta >0\) and \(a_{rj}x_j^{*} + a_{ri}(\ell _i + \beta ) + L_r -a_{rj}u_j -a_{ri}u_i \le b_r\). Together with \(a_{rj}(x_j^{*} +\alpha ) + L^{i}_r(\ell _{i}) - a_{rj}u_j \ge a_{rj}\textsc {maxl}^{i}_j(\ell _i) + L^{i}_r(\ell _{i}) - a_{rj}u_j = b_r\) we get the inequality

    $$\begin{aligned} a_{rj}x_j^{*} + a_{ri}(\ell _i + \beta ) + L_r -a_{rj}u_j -a_{ri}u_i \le a_{rj}(x_j^{*} +\alpha ) + L^{i}_r(\ell _{i}) - a_{rj}u_j, \end{aligned}$$

    resulting in \(\beta \ge \alpha \cdot a_{rj}/a_{ri}\) with \(a_{rj}/a_{ri} \ge 1\). By Lemma 1, we can increase \(x_j^{*}\) by \(\alpha \) and decrease \(x_i^{*}\) by \(\alpha \) without losing feasibility or optimality.

  4. (iv)

    We only treat the interesting case \(\textsc {minl}^{j}_i(u_j)\le u_i\). By Definition 3, there exists one row r with \(a_{ri}\textsc {minl}^{j}_i(u_j) + L^{j}_r(u_{j}) - a_{ri}\ell _i = b_r\). Let \(x^{*}\) be an optimal solution with \(\alpha = x_i^{*} - \max \{\ell _{i},\textsc {minl}^{j}_i(u_j)\} > 0\). Assuming \(x_j^{*}=u_j\), then \(a_{ri}\textsc {minl}^{j}_i(u_j) + L^{j}_r(u_{j}) - a_{ri}\ell _i = b_r < a_{ri}x_i^{*} + L^{j}_r(u_{j}) - a_{ri}\ell _i\) since \(a_{ri}>0\). This leads to a contradiction, because \(x^{*}\) was chosen to be optimal, and therefore also feasible. Thus \(x_j^{*}=u_j-\beta \) with \(\beta >0\) and \(a_{rj}(u_j-\beta ) +a_{ri}x_i^{*} + L_r -a_{rj}\ell _j - a_{ri}\ell _i \le b_r\). Together with \(a_{ri}(x_i^{*}-\alpha ) + L^{j}_r(u_{j}) - a_{ri}\ell _i \ge a_{ri}\textsc {minl}^{j}_i(u_j) + L^{j}_r(u_{j}) - a_{ri}\ell _i = b_r\) we get the inequality

    $$\begin{aligned} a_{rj}(u_j-\beta ) +a_{ri}x_i^{*} + L_r -a_{rj}\ell _j - a_{ri}\ell _i \le a_{ri}(x_i^{*}-\alpha ) + L^{j}_r(u_{j}) - a_{ri}\ell _i \end{aligned}$$

    yielding \(\beta \ge \alpha \cdot a_{ri}/a_{rj}\) with \(a_{ri}/a_{rj}\ge 1\). By Lemma 1, we can decrease \(x_i^{*}\) by \(\alpha \) and increase \(x_j^{*}\) by \(\alpha \) without losing feasibility or optimality.

  5. (v)

    We only treat the interesting case \(\ell _j \le \textsc {minu}^{i}_j(\ell _i)\). By Definition 3, there exists one row r with \(a_{rj}\textsc {minu}^{i}_j(\ell _i) + U^{i}_r(\ell _{i}) - a_{rj}u_j = b_r\). Suppose there is an optimal solution \(x^{*}\) with \(x_j^{*} < \min \{u_{j}, \textsc {minu}^{i}_j(\ell _i)\}\). Let \(\alpha _{j} = \min \{u_{j}, \textsc {minu}^{i}_j(\ell _i)\} - x_j^{*}\), \(\alpha _{i} = x_i^{*} - \ell _{i}\) and \(\alpha = \min \{\alpha _{i},\alpha _{j}\}\). By Lemma 1, we can increase \(x_j^{*}\) by \(\alpha \) and decrease \(x_i^{*}\) by \(\alpha \) without losing feasibility or optimality. If \(\alpha = \alpha _{j}\), we are finished because we constructed an optimal solution with \(x_j = \min \{u_{j}, \textsc {minu}^{i}_j(\ell _i)\}\). Otherwise, we get an optimal solution \(x^{*}\) with \(x_i^{*} = \ell _{i}\). Now, we show that \(\bar{x}\) with \(\bar{x}_{j} = \min \{u_{j}, \textsc {minu}^{i}_j(\ell _i)\}\), \(\bar{x}_{i}=\ell _{i}\) and \(\bar{x}_{k} = x_k^{*}\) for \(k\ne j,i\) is also an optimal solution. Because \(x^{*}\) is feasible and by definition of \(\bar{x}_{j}\), \(\bar{x}\) fulfills all bounds. By increasing \(x_{j}\), we can only lose feasibility for rows r with \(a_{rj} > 0\). From \(x_j \succ x_i\) we know \(0 < a_{rj} \le a_{ri}\), so these rows are exactly the rows regarded in the definition of \(\textsc {minu}^{i}_j(\ell _i)\). Assume one of these rows is violated, i.e., \(a^{T}_{r} \bar{x} > b_{r}\), then

    $$\begin{aligned} 0> & {} b_{r} - \sum _{\begin{array}{c} k=1 \end{array}}^{n} a_{rk} \bar{x}_{k} = b_{r} - \left( \sum _{\begin{array}{c} k=1\\ k\not =i\\ a_{rk}>0 \end{array}}^{n} a_{rk}\bar{x}_{k} + \sum _{\begin{array}{c} k=1\\ k\not =i\\ a_{rk}<0 \end{array}}^{n} a_{rk}\bar{x}_{k} + a_{ri}\ell _{i}\right) \\\ge & {} b_{r} - \left( \sum _{\begin{array}{c} k=1\\ k\not =i\\ a_{rk}>0 \end{array}}^{n} a_{rk}u_k + \sum _{\begin{array}{c} k=1\\ k\not =i\\ a_{rk}<0 \end{array}}^{n} a_{rk}\ell _{k} + a_{ri}\ell _{i} - a_{rj}u_{j} + a_{rj}\bar{x}_j \right) \\= & {} b_{r} - U^{i}_{r}(\ell _{i}) + a_{rj}u_{j} - a_{rj}\bar{x}_j \end{aligned}$$

    It follows that \(\bar{x}_j > (b_{r} - U^{i}_{r}(\ell _{i}) + a_{rj}u_{j})/a_{rj} \ge \textsc {minu}^{i}_j(\ell _i)\), but this contradicts the definition of \(\bar{x}_{j}\), so all rows must still be feasible. \(\bar{x}\) is also optimal since we get \(c^T\bar{x} \le c^Tx^{*}\) from \(\bar{x}_{j} > x_j^{*}\) and \(c_j \le 0\).

  6. (vi)

    We only treat the interesting case \(\textsc {maxu}^{j}_i(u_j) \le u_i\). By Definition 3, there exists one row r with \(a_{ri}\textsc {maxu}^{j}_i(u_j) + U^{j}_r(u_{j}) - a_{ri}\ell _i = b_r\). Suppose there is an optimal solution \(x^{*}\) with \(x_i^{*} > \max \{\ell _{i},\textsc {maxu}^{j}_i(u_j)\}\). Let \(\alpha _{i} = x_i^{*} - \max \{\ell _{i},\textsc {maxu}^{j}_i(u_j)\}\), \(\alpha _{j} = u_{j} - x_j^{*}\), and \(\alpha = \min \{\alpha _{i},\alpha _{j}\}\). By Lemma 1, we can decrease \(x_i^{*}\) by \(\alpha \) and increase \(x_j^{*}\) by \(\alpha \) without losing feasibility or optimality. If \(\alpha = \alpha _{i}\), then we are finished because we constructed an optimal solution with \(x_i = \max \{\ell _{i},\textsc {maxu}^{j}_i(u_j)\}\). Otherwise, we get an optimal solution \(x^{*}\) with \(x_j^{*} = u_{j}\). Now, we show that \(\bar{x}\) with \(\bar{x}_{i} = \max \{\ell _{i},\textsc {maxu}^{j}_i(u_j)\}\), \(\bar{x}_{j}=u_j\) and \(\bar{x}_{k} = x_k^{*}\) for \(k\ne i,j\) is also an optimal solution. Because \(x^{*}\) is feasible and by definition of \(\bar{x}_{i}\), \(\bar{x}\) fulfills all bounds. By decreasing \(x_{i}\), we can only lose feasibility for rows r with \(a_{ri} < 0\). From \(x_j \succ x_i\) we know \(a_{rj} \le a_{ri} < 0\), so these rows are exactly the rows regarded in the definition of \(\textsc {maxu}^{j}_i(u_j)\). Assume one of these rows is violated, i.e., \(a^{T}_{r} \bar{x} > b_{r}\), then

    $$\begin{aligned} 0> & {} b_{r} - \sum _{\begin{array}{c} k=1 \end{array}}^{n} a_{rk} \bar{x}_{k} = b_{r} - \left( \sum _{\begin{array}{c} k=1\\ k\not =j\\ a_{rk}>0 \end{array}}^{n} a_{rk}\bar{x}_{k} + \sum _{\begin{array}{c} k=1\\ k\not =j\\ a_{rk}<0 \end{array}}^{n} a_{rk}\bar{x}_{k} + a_{rj}u_{j}\right) \\\ge & {} b_{r} - \left( \sum _{\begin{array}{c} k=1\\ k\not =j\\ a_{rk}>0 \end{array}}^{n} a_{rk}u_k + \sum _{\begin{array}{c} k=1\\ k\not =j\\ a_{rk}<0 \end{array}}^{n} a_{rk}\ell _{k} + a_{rj}u_{j} - a_{ri}\ell _{i} + a_{ri}\bar{x}_i \right) \\= & {} b_{r} - U^{j}_{r}(u_{j}) + a_{ri}\ell _{i} - a_{ri}\bar{x}_i \end{aligned}$$

    Since \(a_{ri} < 0\), it follows that \(\bar{x}_i < (b_{r} - U^{j}_{r}(u_{j}) + a_{ri}\ell _{i})/a_{ri} \le \textsc {maxu}^{j}_i(u_j)\), but this contradicts the definition of \(\bar{x}_{i}\), so all rows must still be feasible. \(\bar{x}\) is also optimal since we get \(c^T\bar{x} \le c^Tx^{*}\) from \(\bar{x}_{i} < x_i^{*}\) and \(c_i \ge 0\). \(\square \)

Whenever the minimum or maximum in Theorem 3, (iii)–(vi), is attained by the first argument, the variable can be fixed. Fixing has the highest impact regarding presolving, as it reduces the problem size, and in this case we do not need to pay attention to rounding \(\textsc {maxl}^{t}_{s}(x_t), \textsc {maxu}^{t}_{s}(x_t), \textsc {minl}^{t}_{s}(x_t)\) and \(\textsc {minu}^{t}_{s}(x_t)\) for integer variables; we therefore summarize the fixing criteria.

Corollary 1

Let a MIP of form (1) be given as well as two variables \(x_j, x_i\) of the same type, i.e., binary, integer or continuous, with \(x_j \succ x_i\). In the following cases, we can fix a variable while preserving at least one optimal solution.

  1. (i)

    \(\textsc {maxl}^{i}_j(\ell _i) \ge u_{j} \Rightarrow x_j\) can be set to \(u_{j}\).

  2. (ii)

    \(\textsc {minl}^{j}_i(u_j) \le \ell _{i} \Rightarrow x_i\) can be set to \(\ell _{i}\).

  3. (iii)

    \(c_j \le 0\) and \(\textsc {minu}^{i}_j(\ell _i) \ge u_{j} \Rightarrow x_j\) can be set to \(u_{j}\).

  4. (iv)

    \(c_i \ge 0\) and \(\textsc {maxu}^{j}_i(u_j) \le \ell _{i} \Rightarrow x_i\) can be set to \(\ell _{i}\).

We now apply Corollary 1 to Example 1. First we need to calculate some conditional activities \(L_{1}^{1}(u_1)=1, L_{2}^{4}(\ell _4)=-11, U_{3}^{6}(\ell _6)=4.5, U_{4}^{5}(u_5)=1.5\), which allows us to determine the values \(\textsc {minl}^{1}_2(u_1)=0, \textsc {maxl}^{4}_3(\ell _4)=3.5, \textsc {minu}^{6}_5(\ell _6)=1.25\) and \(\textsc {maxu}^{5}_6(u_5)=0\). Thus we can set \(x_2\) and \(x_6\) to their lower bounds and \(x_3\) and \(x_5\) to their upper bounds.

The following criteria can be used instead of Corollary 1. By having two alternative criteria for each variable fixing, we can select the one that fits better in a given situation. In particular, an infinite upper bound is more common than an infinite lower bound since many problems are modeled using non-negative variables.

Corollary 2

Given a MIP of the form (1) and two variables \(x_j, x_i\) of the same type, i.e., binary, integer or continuous, with \(x_j \succ x_i\). In the following cases, we can fix a variable while preserving at least one optimal solution.

  1. (i)

    \(\textsc {maxl}^{j}_i(u_j) \ge \ell _i \Rightarrow x_j\) can be set to \(u_j \).

  2. (ii)

    \(\textsc {minl}^{i}_j(\ell _i) \le u_j \Rightarrow x_i\) can be set to \(\ell _i\).

  3. (iii)

    \(c_j \le 0\) and \(\textsc {minu}^{j}_i(u_j) \ge \ell _i \Rightarrow x_j\) can  be  set to \(u_j \).

  4. (iv)

    \(c_i \ge 0\) and \(\textsc {maxu}^{i}_j(\ell _i) \le u_j \Rightarrow x_i\) can be set to \(\ell _i \).

Proof

  1. (i)

    If \(\textsc {maxl}^{j}_i(u_j) \ge \ell _i\), then by Definition 3 and Lemma 2 it follows that

    $$\begin{aligned} \textsc {maxl}^{i}_j(\ell _i) \ge \textsc {maxl}^{i}_j(\textsc {maxl}^{j}_i(u_j)) = u_j. \end{aligned}$$

    From \(\textsc {maxl}^{j}_i(u_j) < \ell _i\) follows

    $$\begin{aligned} \textsc {maxl}^{i}_j(\ell _i) < \textsc {maxl}^{i}_j(\textsc {maxl}^{j}_i(u_j)) = u_j. \end{aligned}$$

    This is the statement of Corollary 1(i).

(ii)–(iv) are similar to case (i). \(\square \)

4.3 Utilize conflict information for binary variables

For binary variables we can use information from a conflict graph [6] to fix additional variables in connection with the dominance relation. Using this information has the advantage that it has already been extracted during preceding presolving rounds.

An undirected graph \(G=(V,E)\) is called a conflict graph of (1), if for every binary variable \(x_i\) there is a vertex \(v_i \in V\) and a vertex \(\bar{v}_i \in V\) for its complement \(\bar{x}_i=1-x_i\). The edge set E consists of edges \(v_i \bar{v}_i\) for all binary variables \(x_i\) and edges between two vertices when at most one of the corresponding variables or complements can be equal to 1 in an optimal solution.

Theorem 4

  1. (i)

    Let two binary variables \(x_j,x_i\) of (1) with \(x_j \succ x_i\) and \(v_j v_i \in E\) be given, then \(x_i\) can be set to 0.

  2. (ii)

    Let two binary variables \(x_j,x_i\) of (1) with \(x_j \succ x_i\) and \(\bar{v}_j \bar{v}_i \in E\) be given, then \(x_j\) can be set to 1.

Proof

  1. (i)

    With two binary variables, four variable assignments are possible. Because \(x_j=1 \wedge x_i=1\) is not allowed due to \(v_j v_i \in E\), only the possibilities \(x_j=1 \wedge x_i=0\), \(x_j=0 \wedge x_i=0\) and \(x_j=0 \wedge x_i=1\) remain. From Definition 1 and Lemma 1 we know that it is possible to increase \(x_j\) and decrease \(x_i\) accordingly, thereby staying feasible and optimal. Thus, only the cases \(x_j=1 \wedge x_i=0\) and \(x_j=0 \wedge x_i=0\) remain. In both cases, \(x_i\) is at its lower bound.

  2. (ii)

    The case is similar to (i). Finally, the logical conjunctions \(x_j=1 \wedge x_i=1\) and \(x_j=1 \wedge x_i=0\) are left. In both cases, \(x_j\) is at its upper bound. \(\square \)
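A minimal sketch of Theorem 4 in Python, assuming the conflict graph is given as a set of undirected edges between literals; the literal encoding ('x', i) for \(x_i\) and ('xbar', i) for \(\bar{x}_i\) is our own convention, not SCIP's data structure.

```python
def fix_by_conflicts(dominances, conflict_edges):
    """dominances: pairs (j, i) with x_j dominating x_i (Definition 1);
    conflict_edges: set of frozensets of literals, e.g. {('x', j), ('x', i)}.
    Returns fixings implied by Theorem 4 as {variable index: value}."""
    fixings = {}
    for j, i in dominances:
        if frozenset({('x', j), ('x', i)}) in conflict_edges:
            fixings[i] = 0      # Theorem 4 (i): x_i can be set to 0
        if frozenset({('xbar', j), ('xbar', i)}) in conflict_edges:
            fixings[j] = 1      # Theorem 4 (ii): x_j can be set to 1
    return fixings
```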

4.4 Finding a dominance relation

The complexity of an algorithm that operates on a partial order (poset) is mainly determined by the width w of the poset. w is defined to be the maximum cardinality of an anti-chain, which is a subset of mutually incomparable or non-dominating elements. In [16] an algorithm was introduced that sorts a width-w poset of size n in \(O(n(w + \log n))\). Its representation has size O(wn) and permits retrieval of the relation between any two elements in time O(1). Since we cannot assume to verify the relation between any two columns in O(1) and w is usually large in our case, we decided to proceed heuristically. This approach works well in practice, as the computational results of Sect. 6 show.

Let \(R^{=}\subseteq \{1,\ldots ,m\}\) be the set of row indices of equalities and \(C^{=}=\{j\in \{1,\ldots ,n\} \mid a_{kj}\not = 0,k\in R^{=}\}\) be the set of columns of matrix \(A \in \mathbb {R}^{m \times n}\) having non-zero entries within equalities. \(C^{\le } = \{1,\ldots ,n\}\!\!\setminus \!\! C^{=}\) is the set of columns having non-zeros only within inequalities. The approach consists of two stages. First we subdivide \(C^{=}\) into parallel classes \(C^{=}_p\) with respect to the non-zero entries within equalities. The parallel classes are detected by an algorithm [11] developed for detecting parallel columns in \(O(\sum _{k\in R^{=}} \left|\text{ supp }\,(A_{k\cdot })\right|\log \left|\text{ supp }\,(A_{k\cdot })\right|)\). Then we compare all columns of every \(C^{=}_p\) in \(O(m\cdot {|C^{=}_p|\atopwithdelims ()2})\). The first stage is guaranteed to find all dominance relations within \(C^{=}\). The second stage considers only columns of \(C^{\le }\) and takes advantage of the sparsity of A by analyzing the rows in order of increasing \(\left|\text{ supp }\,(A_{r\cdot })\right|\) in \(O(m\cdot {\left|\tiny \text{ supp }(A_{r\cdot })\right|\atopwithdelims ()2})\) per row. After a row has been processed, the columns therein are not compared to other columns anymore. Thus the number of columns that need to be compared is usually much smaller than n. In cases where many columns would have to be compared, a mechanism monitors the number of fixings per number of pairwise comparisons; if not enough fixings are found, the row is not investigated further.
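A simplified sketch of the second stage might look as follows; the work limit, the omission of the variable-type check, and all names are our own simplifications and do not reflect SCIP's actual implementation.

```python
def detect_dominance_pairs(A, c, max_pairs_per_row=10000):
    """Compare columns pairwise, row by row in order of increasing support.
    Returns pairs (j, i) such that x_j dominates x_i (type check omitted)."""
    m, n = len(A), len(A[0])
    processed, pairs = set(), []
    rows = sorted(range(m), key=lambda r: sum(1 for a in A[r] if a != 0))
    for r in rows:
        support = [j for j in range(n) if A[r][j] != 0 and j not in processed]
        budget = max_pairs_per_row              # crude stand-in for the fixing monitor
        for p in range(len(support)):
            for q in range(p + 1, len(support)):
                if budget == 0:
                    break
                budget -= 1
                j, i = support[p], support[q]
                if c[j] <= c[i] and all(A[k][j] <= A[k][i] for k in range(m)):
                    pairs.append((j, i))        # x_j dominates x_i
                elif c[i] <= c[j] and all(A[k][i] <= A[k][j] for k in range(m)):
                    pairs.append((i, j))        # x_i dominates x_j
            if budget == 0:
                break
        processed.update(support)
    return pairs
```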

5 Connected components

The connected components presolver aims at identifying small subproblems that are independent of the remaining part of the problem and tries to solve those to optimality during the presolving phase. After a component is solved to optimality, the variables and constraints forming the component can be removed from the remaining problem. This reduces the size of the problem and the linear program to be solved at each node.

Although a well-modeled problem should in general not contain independent components, they occur regularly in practice. Even if a problem cannot be split into components at the beginning, it might decompose after some rounds of presolving, e.g., because constraints connecting independent problems are detected to be redundant and can be removed. Figure 1 depicts the constraint matrices of two real-world instances at some point during presolving, reordered in a way such that independent components can easily be identified.

Fig. 1

Matrix structures of one instance from MIPLIB 2010 and one supply chain management instance: columns and rows were permuted to visualize the block structure. Dots represent non-zero entries while gray rectangles represent the blocks, which are ordered by their size from top left to bottom right

We detect independent subproblems by first transferring the structure of the problem to an undirected graph \(G\) and then searching for connected components as in [20]. The graph \(G\) is constructed as follows: for every variable \(x_{i}\), we create a node \(v_{i}\), and for each constraint, we add edges to \(G\) connecting the variables with non-zero coefficients in the constraint. Thereby, we do not add an edge for each pair of these variables but, in order to reduce the graph size, add a single path in the graph connecting all these variables. More formally, the graph is defined as follows: \(G = (V,E)\) with

$$\begin{aligned} V&= \left\{ v_{i}\right\} _{i=1,\dots ,n}, \\ E&= \bigcup _{k = 1}^{m}\left\{ \left( v_{i},v_{j}\right) \mid 1 \le i < j \le n:\; a_{k i} \ne 0 \;\wedge \; a_{k j} \ne 0 \;\wedge \; a_{k \ell } = 0 \;\;\forall \ell \in \left\{ i + 1, \dots , j - 1\right\} \right\} . \end{aligned}$$

Given this graph, we identify connected components using depth first search. By definition, each constraint contains variables of only one component and can easily be assigned to the corresponding subproblem.
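In Python, the construction and the search can be sketched as follows; the adjacency-list representation and the function name are illustrative and do not mirror SCIP's implementation.

```python
def connected_components(A):
    """Return the variable index sets of the connected components of the
    graph G described above (one path per constraint, DFS afterwards)."""
    m, n = len(A), len(A[0])
    adj = [[] for _ in range(n)]
    for r in range(m):
        support = [j for j in range(n) if A[r][j] != 0]
        for u, v in zip(support, support[1:]):   # a path instead of a clique
            adj[u].append(v)
            adj[v].append(u)
    components, seen = [], [False] * n
    for start in range(n):
        if seen[start]:
            continue
        stack, comp = [start], []
        seen[start] = True
        while stack:                             # iterative depth-first search
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if not seen[w]:
                    seen[w] = True
                    stack.append(w)
        components.append(comp)
    return components
```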

The size of the graph is linear in the number of variables and non-zeros. It has n nodes and—due to the representation of a constraint as a path—exactly \(z - m\) edges, where z is the number of non-zeros in the constraint matrix. The connected components of a graph can be computed in linear time w.r.t. the number of nodes and edges of the graph [20], which is also linear in the number of variables and non-zeros of the MIP.

If we identify more than one subproblem, we try to solve the small ones immediately. In general, we would expect a better performance by solving all subproblems to optimality one after another rather than solving the complete original problem to optimality. However, this has the drawback that we do not compute valid primal and dual bounds until we start solving the last subproblem. In practical applications, we often do not need to find an optimal solution, but a time limit is applied or the solving process is stopped when a small optimality gap is reached. In this case, it is preferable to only solve easy components to optimality during presolving and solve remaining larger problems together, thereby computing valid dual and primal bounds for the complete problem.

To estimate the computational complexity of the components, we count the number of discrete variables. In case this number is larger than a specific threshold, we do not solve this particular component separately, in order to avoid spending too much time in this step. In particular, subproblems containing only continuous variables are always solved, regardless of their size.

However, the number of discrete variables is not a reliable indicator for the complexity of a problem and the time needed to solve it to optimality. Therefore, we also limit the number of branch-and-bound nodes for every single subproblem. If the node limit is hit, we merge the component back into the remaining problem and try to transfer as much information to the original problem as possible; however, most insight is typically lost. Hence, it is important to choose the parameters in a way such that this scenario is avoided.

6 Computational results

In this section, we present computational results that show the impact of the new presolving methods on the presolving performance as well as on the overall solution process.

We implemented three new presolving techniques, which were already included in the SCIP 3.0 release. The stuffing algorithm is implemented within the dominating columns presolver, because it makes use of the same data structures.

The experiments were performed on a cluster of Intel Xeon X5672 3.20 GHz computers, with 12 MB cache and 48 GB RAM, running Linux (in 64 bit mode). We used two different test sets: a set of real-world supply chain management instances provided by our industry partner and the MMM test set, which is the union of MIPLIB  3 [9], MIPLIB  2003 [4], and the benchmark set of MIPLIB  2010 [22]. For the experiments, we used the development version 3.0.1.2 of SCIP  [3] (git hash 7e5af5b) with SoPlex [30] version 1.7.0.4 (git hash 791a5cc) as the underlying LP solver and a time limit of two hours per instance. In the following, we distinguish two versions of presolving: the basic and the advanced version. The basic version performs all the presolving steps implemented in SCIP (for more details, we refer to [2]), but disables the techniques newly introduced in this paper, which are included in the advanced presolving. This measures the impact of the new methods within an environment that already contains various presolving methods. SCIP triggers a so-called restart if during root node processing a certain fraction of integer variables has been fixed. In this case presolving is also applied once more. Restarts were disabled to prevent further calls of presolvers during the solving process, thereby ensuring an unbiased comparison of the methods.

Fig. 2

Size of the presolved supply chain management instances relative to the original number of variables and constraints

Figure 2 illustrates the presolve reductions for the supply chain management instances. For each of the instances, the percentage of remaining variables (Fig. 2a) and remaining constraints (Fig. 2b) after presolving is shown, both for the basic as well as the advanced presolving. While the new presolving methods find additional reductions for every instance, the amount of reductions varies heavily. On the one hand, only a few additional reductions are found for the 1- and 3-series as well as parts of the 4-series; on the other hand, the size of some instances, in particular from the 2- and 5-series, is reduced to less than 1 % of the original size. The reason for this is that these instances decompose into up to 1000 independent subproblems, most of which the connected components presolver easily solves to optimality during presolving. Average results including presolving and solving time are listed in Table 1; detailed instance-wise results can be found in Table 2 in Appendix A. This also includes statistics about the impact of the new presolvers. On average, the advanced presolving reduces the number of variables and constraints by about 59 and 64 %, respectively, while the basic presolving only removes about 33 and 43 %, respectively. The components presolver fixes on average about 18 % of the variables and removes 16 % of the constraints; the dominating columns and stuffing presolvers fix 3.5 and 0.9 % of the variables, respectively. This increases the shifted geometric mean of the presolving time from 2.12 to 3.18 s, but pays off since the solving time can be reduced by almost 50 %. For a definition and discussion of the shifted geometric mean, we refer to [2].

Table 1 Comparison of basic and advanced presolving on the supply chain management test set and the MMM test set, complete as well as divided into instances with equal presolving reductions and instances where the new presolvers found additional reductions

The structure of the supply chain management instances often allows the new presolving methods to find many reductions. This is different for the instances from the more general MMM test set, where, on average, the advanced presolving removes about 3 % more variables and 1 % more constraints. It allows one more instance to be solved within the time limit and reduces the solving time from 335 to 317 s in the shifted geometric mean. This slight improvement can also be seen in the performance diagram shown in Fig. 3.

Fig. 3

Performance diagram for the MMM test set. The graph indicates the number of instances solved within a certain time

However, many of the instances in the MMM test set do not contain a structure that can be exploited by the new presolving techniques: they are able to find reductions for less than a quarter of the instances. On the set of instances where no additional reductions are found, the time spent in presolving as well as the total time are almost the same, see row MMM:eq in Table 1. Slight differences are due to inaccurate time measurements. When regarding only the set of instances where the advanced presolving finds additional reductions, the effects become clearer: while increasing the presolving time by about 50 % in the shifted geometric mean, it removes 14.1 % more variables and 4.5 % more constraints from the problem. This is depicted in Fig. 4. The majority of the variables is removed by the dominating columns presolver, which removes about 11 % of the variables on average; the connected components presolver and the stuffing have a smaller impact, with less than 1 % of the variables and constraints removed. Often, the reductions found by the new techniques also allow other presolving methods to find additional reductions. As an example, see bley_xl1, where the dominating columns presolver finds 76 reductions, which results in more than 4200 additionally removed variables and 135,000 additionally removed constraints. On this set of instances, the advanced presolving reduces the shifted geometric mean of the solving time by 21 % in the end.

Fig. 4

Size of the presolved instances relative to the original number of variables and constraints for all instances from the MMM test set where the new presolving techniques find reductions

7 Conclusions

In this paper, we reported on three presolving techniques for mixed integer programming which were implemented in the state-of-the-art academic MIP solver SCIP. They were initially developed with a focus on a set of real-world supply chain management instances. Many of these contain independent subproblems which the connected components presolver can identify, solve, and remove from the problem during presolving. The dominating columns presolver finds reductions for all the instances considered, removing about a quarter of the variables from some of the problems. In addition, the stuffing singleton columns presolver finds reductions, although not as many as the dominating columns presolver. Together, they help to significantly improve SCIP’s overall performance on this class of instances.

Besides this set of supply chain management instances, we also considered a set of general MIP instances from various contexts. On this set, we cannot expect the presolving steps to work on all or even a majority of the instances, because many of them lack the required structure. As a consequence, it is very important that the new presolvers do not cause a large overhead when the structure is missing, a goal our implementation achieves. On those instances where the new presolvers do find reductions, however, they notably speed up the solution process.

Our results show that there is still a need for new presolving techniques, even in an environment which already incorporates various such techniques. In spite of the maturity of MIP solvers, these results should motivate further research in this area, especially since presolving is one of the most important components of a MIP solver.