1 Derivation

Let \(A \in \mathbb{R}^{m \times n}\), with n > m, have full rank, and let \(b \in \mathbb{R}^{m}\) and \(c \in \mathbb{R}^{n}\). Consider the Linear Program (LP)

$$\displaystyle \begin{aligned} z^{\star} = \min &\ z = c^{\mathrm{T}}x\\ \mathrm{s.t.} &\ Ax =b\\ &\ x\geq 0, \end{aligned} $$

whose feasible set is assumed to be non-empty. Let a non-optimal non-degenerate basic feasible solution be at hand. With a proper variable ordering, the solution corresponds to the partitioning A = (B, N) and \(c = (c_B^{\mathrm {T}}, c_N^{\mathrm {T}})^{\mathrm {T}}\), where \(B \in \mathbb{R}^{m \times m}\) is the non-singular basis matrix and \(N \in \mathbb{R}^{m \times (n-m)}\) is the matrix of non-basic columns. Let \(x = (x_B^{\mathrm {T}}, x_N^{\mathrm {T}})^{\mathrm {T}}\). Introducing the reduced cost vector \(\bar {c}_N^{\mathrm {T}}=c_N^{\mathrm {T}}-u^{\mathrm {T}}N\), where \(u^{\mathrm {T}}=c_B^{\mathrm {T}}B^{-1}\) is the complementary dual solution, and letting \(I_m\) be the identity matrix of size m, problem LP is equivalent to

$$\displaystyle \begin{aligned} z^{\star} = \min &\ z = c_B^{\mathrm{T}}B^{-1}b+\bar{c}_N^{\mathrm{T}}x_N\\ \mathrm{s.t.} &\ I_m x_{B}+B^{-1}Nx_{N}=B^{-1}b\\ &\ x_B, ~x_N \geq 0. \end{aligned} $$

The basic solution is \(x_B = B^{-1}b\) and \(x_N = 0\). By assumption, \(B^{-1}b > 0\) and \(\bar {c}_N\ngeq 0\) hold.

Let \(\mathscr {N}=\{ m+1,\ldots ,n \}\), that is, the index set of the non-basic variables, and let \(a_j\) be column \(j \in \mathscr {N}\) of the matrix A. Geometrically, the given basic feasible solution corresponds to an extreme point of the feasible polyhedron of problem LP, and a variable \(x_j\), \(j \in \mathscr {N}\), that enters the basis corresponds to the movement along a feasible direction that follows an edge from the extreme point. The edge direction is given by, see e.g. [4],

$$\displaystyle \begin{aligned} \eta_j = \begin{pmatrix} -B^{-1}a_j \\ e_{j-m} \end{pmatrix}, \end{aligned} $$

where \(e_{j-m} \in \mathbb{R}^{n-m}\) is a unit vector with a one entry in position j − m. The directional derivative of the objective function along an edge direction of unit length is \(c^{\mathrm {T}}\eta _j / \| \eta _j \| = (c_B^{\mathrm {T}}, c_N^{\mathrm {T}}) \eta _j / \| \eta _j \| = (- c_B^{\mathrm {T}}B^{-1}a_j + c_j) / \| \eta _j \|= \bar {c}_j / \| \eta _j \|\) (where ∥⋅∥ is the Euclidean norm). This is the rationale for the steepest-edge criterion [3], which in the simplex method finds a variable \(x_r\), \(r \in \mathscr {N}\), to enter the basis such that \(r \in \operatorname{arg\,min}_{j \in \mathscr {N}} \ \bar {c}_j / \| \eta _j \|\).
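As a concrete illustration, the steepest-edge choice can be computed directly from B and N. The following NumPy sketch (the function name and interface are our own, not taken from the references) applies the criterion to the slack-basis example studied later in the text:

```python
import numpy as np

def steepest_edge_entering(B, N, c_B, c_N):
    """Return the position within N of the variable chosen by the
    steepest-edge criterion, i.e. the minimizer of c_bar_j / ||eta_j||."""
    Binv_N = np.linalg.solve(B, N)     # columns B^{-1} a_j, one solve for all j
    u = np.linalg.solve(B.T, c_B)      # dual solution, u^T = c_B^T B^{-1}
    c_bar = c_N - N.T @ u              # reduced costs
    # ||eta_j||^2 = ||B^{-1} a_j||^2 + 1 (the unit-vector part contributes 1)
    norms = np.sqrt((Binv_N ** 2).sum(axis=0) + 1.0)
    return int(np.argmin(c_bar / norms))

# Slack basis B = I_3 with non-basic columns a_1 = (5,-2,2), a_2 = (-2,4,1):
B = np.eye(3)
N = np.array([[5.0, -2.0], [-2.0, 4.0], [2.0, 1.0]])
print(steepest_edge_entering(B, N, np.zeros(3), np.array([-1.0, -2.0])))  # 1, i.e. x_2
```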

We consider feasible directions that are constructed from non-negative linear combinations of the edge directions. To this end, let \(w = (w_j)_{j \in \mathscr {N}} \geq 0\) and consider the direction

$$\displaystyle \begin{aligned} \eta(w) = \sum_{j \in \mathscr{N}} w_j \eta_j = \begin{pmatrix} -B^{-1}Nw \\ I_{n-m}w \end{pmatrix}, \end{aligned} $$

where \(I_{n-m}\) is the identity matrix of size n − m. Note that any feasible solution to LP is reachable from the given basic feasible solution along some direction η(w), and that \(c^{\mathrm {T}}\eta (w)= \bar {c}_{N}^{\mathrm {T}} w\). Our development is founded on the problem and theorem below.

Define the Steepest-Edge Problem (SEP)

$$\displaystyle \begin{aligned} \min &\ \bar{c}_{N}^{\mathrm{T}} w \\ \mathrm{s.t.} & \ \| \eta(w)\|^{2} \leq 1 \\ & \ \mathrm{supp}(w) =1 \\ & \ w\geq 0, \end{aligned} $$

where supp(⋅) is the support of a vector, that is, its number of nonzero components.

Theorem 1

An index \(r\in \mathscr {N}\) fulfils the steepest-edge criterion if and only if the solution

$$\displaystyle \begin{aligned} w_j = \left\{ \begin{array}{ll} 1/ \| \eta_j \| & \text{if } j=r\\ 0 & \text{otherwise} \end{array} \right., \quad j \in \mathscr{N}, \end{aligned}$$

solves SEP.

The theorem follows directly from an enumeration of all feasible solutions to problem SEP. Note that the optimal value of SEP is the steepest-edge slope \(\bar {c}_r / \| \eta _{r} \|\).
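The enumeration behind Theorem 1 can be carried out numerically for a small instance: a support of size one forces \(w_j = 1/\|\eta_j\|\), so each feasible solution of SEP has objective value \(\bar{c}_j/\|\eta_j\|\). A NumPy sketch, using the data of the example that appears below (slack basis, so \(B = I_3\)):

```python
import numpy as np

# Non-basic columns and reduced costs of the example LP (slack basis, B = I_3).
N = np.array([[5.0, -2.0], [-2.0, 4.0], [2.0, 1.0]])
c_bar = np.array([-1.0, -2.0])

# For support {j}, the normalization forces w_j = 1/||eta_j||, so the SEP
# objective is c_bar_j / ||eta_j||; enumerate these and take the minimum.
norms = np.sqrt((N ** 2).sum(axis=0) + 1.0)   # ||eta_j|| with B = I
slopes = c_bar / norms
print(round(min(slopes), 3))                  # -0.426, i.e. -2/sqrt(22)
```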

To find a feasible direction that is steeper than the steepest edge, the support constraint in problem SEP is relaxed. The relaxed problem, called Direction-Finding Problem (DFP), can be stated as

$$\displaystyle \begin{aligned} \min &\ \bar{c}_N^{\mathrm{T}}w\\ \mathrm{s.t.} & \ w^{\mathrm{T}}Qw \leq 1 {}\\ & \ w \geq 0, \end{aligned} $$
(1)

where the matrix \(Q = N^{\mathrm{T}}B^{-\mathrm{T}}B^{-1}N + I_{n-m}\), so that \(w^{\mathrm{T}}Qw = \| \eta(w) \|^2\), and its optimal solution gives the steepest feasible direction. Because Q is symmetric and positive definite, and \(\bar {c}_N\ngeq 0\) holds, the problem DFP has a unique, nonzero optimal solution, which fulfils the normalization constraint (1) with equality and has a negative objective value. Further, the optimal solution will in general yield a feasible direction that is a non-trivial non-negative linear combination of the edge directions and has a directional derivative that is strictly better than that of the steepest edge.

As an example, we study the linear program

$$\displaystyle \begin{aligned} \min \left\{ -x_1-2x_2 ~|~ 5x_1-2x_2 \leq 10; ~ -2x_1+4x_2 \leq 8; ~ 2x_1+x_2 \leq 6; ~ x_1,x_2 \geq 0 \right\}, \end{aligned}$$

which is illustrated to the left in Fig. 1. For the extreme point at the origin, which has the slack basis with \(B = I_3\), we have \(\eta_1 = (-5, 2, -2, 1, 0)^{\mathrm{T}}\) and \(\eta_2 = (2, -4, -1, 0, 1)^{\mathrm{T}}\), with \(\bar {c}_1 / \| \eta _1 \| = -1 / \sqrt {34}\) and \(\bar {c}_2 / \| \eta _2 \| = -2 / \sqrt {22}\). If using the steepest-edge criterion, the variable \(x_2\) would therefore enter the basis. The feasible set of DFP is shown to the right in Fig. 1. The optimal solution of DFP is \(w^* \approx (0.163, 0.254)^{\mathrm{T}}\) with \(\bar {c}_N^{\mathrm {T}} w^* = c^{\mathrm {T}}\eta (w^*)\approx -0.672\), which should be compared with \(-2 / \sqrt {22} \approx -0.426\). The feasible direction is \(\eta (w^*) \approx (-0.309, -0.690, -0.581, 0.163, 0.254)^{\mathrm{T}}\). The maximal feasible step in this direction yields the boundary point \((x_1, x_2) \approx (1.687, 2.625)\), whose objective value is better than those of the extreme points that are adjacent to the origin. (Further, this boundary point is close to the extreme point \((1.6, 2.8)^{\mathrm{T}}\), which is optimal.)
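The numbers in this example can be reproduced with a few lines of NumPy. In this particular instance the unconstrained LR minimizer \(-Q^{-1}\bar{c}_N/\mu\) happens to be nonnegative, so the bounds are inactive and DFP admits a closed-form solution (a sketch valid for this instance only, not a general DFP solver):

```python
import numpy as np

# Data of the example: non-basic columns, reduced costs, right-hand side.
N = np.array([[5.0, -2.0], [-2.0, 4.0], [2.0, 1.0]])
c_bar = np.array([-1.0, -2.0])
b = np.array([10.0, 8.0, 6.0])

Q = N.T @ N + np.eye(2)              # Q = N^T B^{-T} B^{-1} N + I with B = I_3
w_mu = np.linalg.solve(Q, -c_bar)    # LR minimizer for mu = 1; it is
                                     # nonnegative here, so w >= 0 is inactive
w = w_mu / np.sqrt(w_mu @ Q @ w_mu)  # rescale so that ||eta(w)|| = 1

eta = np.concatenate([-N @ w, w])    # steepest feasible direction eta(w*)
theta = np.min(b / (N @ w))          # ratio test: maximal feasible step length
x = theta * w                        # lands at (x1, x2) = (1.6875, 2.625)
```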

Fig. 1

Illustration of steepest feasible direction and problem DFP

We next establish that problem DFP can be solved by means of a relaxed problem. Let \(\mu/2 > 0\) be an arbitrary value of a Lagrangian multiplier for the constraint (1), and consider the Lagrangian Relaxation (LR)

$$\displaystyle \begin{aligned} \min_{w \geq 0} &\ r(w)= \bar{c}_{N}^{\mathrm{T}}w +\frac{\mu}{2} w^{\mathrm{T}}Qw \end{aligned} $$

of problem DFP (ignoring the constant \(-\mu/2\)). Since r is a strictly convex function, problem LR has a unique optimum, denoted \(w^*(\mu)\). The following result can be shown.

Theorem 2

If \(w^*(\mu)\) is optimal in problem LR, then \(w^* = w^*(\mu)/\|\eta(w^*(\mu))\|\) is the optimal solution to problem DFP.

The proof is straightforward; both problems are convex and have interior points, and it can be verified that if \(w^*(\mu)\) satisfies the Karush–Kuhn–Tucker conditions for problem LR then \(w^* = w^*(\mu)/\|\eta(w^*(\mu))\|\) satisfies these conditions for problem DFP.

Hence, the steepest feasible direction can be found by solving the simply constrained quadratic program LR, for any choice of μ > 0. The following result, which is easily verified, gives an interesting characterization of the gradient of the objective function r. It should be useful if LR is approached by an iterative descent method, such as a projected Newton method [1].
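For the small example above, even a plain projected-gradient iteration solves LR, and normalizing per Theorem 2 recovers the DFP solution independently of μ (a sketch, not the projected Newton method of [1]; the step length is chosen ad hoc for this instance):

```python
import numpy as np

# Example data (slack basis, B = I_3).
N = np.array([[5.0, -2.0], [-2.0, 4.0], [2.0, 1.0]])
c_bar = np.array([-1.0, -2.0])
Q = N.T @ N + np.eye(2)
mu = 1.0

# Projected gradient on r(w) = c_bar^T w + (mu/2) w^T Q w over w >= 0.
w = np.zeros(2)
step = 0.02                          # below 2 / (mu * lambda_max(Q)), so stable
for _ in range(500):
    w = np.maximum(0.0, w - step * (c_bar + mu * (Q @ w)))

w_dfp = w / np.sqrt(w @ Q @ w)       # Theorem 2: normalization solves DFP
print(np.round(w_dfp, 3))            # [0.163 0.254], independent of mu
```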

Proposition 1

Let \(\eta_B(w) = -B^{-1}Nw\) and \(\varDelta u^{\mathrm{T}} = \mu\,\eta_B(w)^{\mathrm{T}}B^{-1}\). Then

$$\displaystyle \begin{aligned} \nabla r(w)= c_{N}-N^{\mathrm{T}}\left(u+\varDelta u \right)+\mu w. \end{aligned} $$

Note that the expression for Δu is similar to that of the dual solution, and that the pricing mechanism of the simplex method is used to compute ∇r(w), but with a modified dual solution. Further, \(-\eta_B(w) = B^{-1}Nw = \sum_{j \in \mathscr{N}} w_j B^{-1}a_j\), that is, a non-negative linear combination of the original columns expressed in the current basis.
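Proposition 1 is easy to check numerically: the pricing-style expression must agree with the direct gradient \(\bar{c}_N + \mu Qw\). An illustrative NumPy check (the basis B, costs, and point w below are hypothetical data chosen for the test):

```python
import numpy as np

B = np.array([[2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]])  # invertible
N = np.array([[5.0, -2.0], [-2.0, 4.0], [2.0, 1.0]])
c_B = np.array([1.0, 0.0, 2.0])
c_N = np.array([-1.0, -2.0])
w = np.array([0.3, 0.5])
mu = 2.0

u = np.linalg.solve(B.T, c_B)                 # u^T = c_B^T B^{-1}
Binv_N = np.linalg.solve(B, N)
Q = Binv_N.T @ Binv_N + np.eye(2)
grad_direct = (c_N - N.T @ u) + mu * (Q @ w)  # c_bar_N + mu Q w

eta_B = -Binv_N @ w                           # eta_B(w) = -B^{-1} N w
delta_u = mu * np.linalg.solve(B.T, eta_B)    # Delta u^T = mu eta_B(w)^T B^{-1}
grad_prop = c_N - N.T @ (u + delta_u) + mu * w

print(np.allclose(grad_direct, grad_prop))    # True
```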

In order to use a feasible direction within the simplex method, it is converted into an external column [2], which is a non-negative linear combination of the original columns in LP. Letting \(w^*\) be an (approximate) optimal solution to problem DFP, we define the external column by \(c_{n+1}=c^{\mathrm {T}}_Nw^*\) and \(a_{n+1} = Nw^*\). Letting \(\bar {c}_{n+1} = c_{n+1} - u^{\mathrm {T}}a_{n+1}\), the problem LP is augmented with the variable \(x_{n+1}\), giving

$$\displaystyle \begin{aligned} z^{\star} = \min &\ z = c_B^{\mathrm{T}}B^{-1}b+\bar{c}_N^{\mathrm{T}}x_N + \bar{c}_{n+1}x_{n+1}\\ \mathrm{s.t.} &\ I_m x_{B}+B^{-1}Nx_{N}+B^{-1}a_{n+1}x_{n+1}=B^{-1}b\\ &\ x_B, ~x_N, ~x_{n+1} \geq 0, \end{aligned} $$

where \(\bar {c}_{n+1} < 0\). By letting the external column enter the basis, the feasible direction will be followed. Note that the augmented problem has the same optimal value as the original one. (If \(B^{-1}a_{n+1} \leq 0\) holds, then \(z^{\star} = -\infty\).) Further, if the external column is part of an optimal solution to the augmented problem, then it is easy to recover an optimal solution to the original problem [2].
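Continuing the example, constructing the external column and its reduced cost takes only a few lines (a sketch with our own variable names; at the slack basis u = 0, so \(\bar{c}_{n+1}\) coincides with \(c_{n+1}\)):

```python
import numpy as np

N = np.array([[5.0, -2.0], [-2.0, 4.0], [2.0, 1.0]])
c_N = np.array([-1.0, -2.0])
u = np.zeros(3)                              # dual solution at the slack basis
w_star = np.array([0.163394, 0.254168])      # (approximate) DFP solution

a_ext = N @ w_star                           # a_{n+1} = N w*
c_ext = c_N @ w_star                         # c_{n+1} = c_N^T w*
c_bar_ext = c_ext - u @ a_ext                # reduced cost of external column
print(round(c_bar_ext, 3))                   # -0.672 < 0, so x_{n+1} may enter
```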

The approach presented above is related to those in [5] and [2], which both use auxiliary primal variables for following a feasible direction. (The term external column is adopted from the latter reference.) These two works, however, use ad hoc rules for constructing the feasible direction, for example rules based only on reduced costs, instead of solving a direction-finding problem with the purpose of finding a steep direction.

2 Numerical Illustration and Conclusions

It is in practice reasonable to use a version of LR that contains only a restricted number of edge directions. Letting \(J \subseteq \mathscr {N}\), \(w_J = (w_j)_{j \in J}\), \(\bar {c}_J=(\bar {c}_j) _{j\in J}\), \(N_J = (a_j)_{j \in J}\) and \( Q_{J} = N_{J}^{\mathrm {T}}B^{-\mathrm {T}}B^{-1}N_{J}+I_{|J|}\), the restricted LR is given by \(\min _{w_J \geq 0} r_J(w_J) = \bar {c}_J^{\mathrm {T}}w_J+\frac {\mu }{2} w_J^{\mathrm {T}}Q_Jw_J\).

An external column should be advantageous to use compared to an ordinary non-basic column, but it is computationally more expensive. Further, pivots on external columns will lead to points that are not extreme points in the original polyhedron. It is therefore reasonable to combine pivots on external columns with ordinary pivots. Considering the choice of |J|, a high value makes the restricted LR computationally more expensive, but it then also has the potential to yield a better feasible direction. Hence, there is a trade-off between the computational burden of creating the external columns, both with respect to frequency of external columns and the choice of |J|, and the reduction in simplex iterations that they may lead to.

To make an initial assessment of the potential usefulness of generating external columns as described above, we made a simple implementation in MATLAB of the revised simplex method, using the Dantzig entering variable criterion but with the option to replace ordinary pivots with pivots on external columns, which are found by solving restricted versions of LR. Letting k be the number of edge directions to be included in the restriction, we consider two ways of making the selection: by finding the k most negative values among \(\{\bar {c}_j\}_{j \in \mathscr {N}}\) or among \(\{\bar {c}_j / \| \eta _j \|\}_{j \in \mathscr {N}}\). These ways are called Dantzig selection and steepest-edge selection, respectively. The latter is computationally more expensive. The restricted problem LR is solved using the built-in solver quadprog.
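The two selection rules and the restricted LR can be sketched as follows (a NumPy stand-in for the MATLAB implementation; the restricted problem is solved here by projected gradient rather than quadprog, and the function name and defaults are our own):

```python
import numpy as np

def restricted_lr(B, N, c_bar, k, mu=1.0, steepest=False, iters=2000):
    """Select k non-basic indices (Dantzig or steepest-edge selection), then
    solve the restricted LR over those indices by projected gradient."""
    Binv_N = np.linalg.solve(B, N)
    if steepest:   # k most negative c_bar_j / ||eta_j||
        score = c_bar / np.sqrt((Binv_N ** 2).sum(axis=0) + 1.0)
    else:          # k most negative reduced costs (Dantzig selection)
        score = c_bar
    J = np.argsort(score)[:k]
    Q_J = Binv_N[:, J].T @ Binv_N[:, J] + np.eye(k)
    step = 1.0 / (mu * np.linalg.eigvalsh(Q_J).max())   # stable step length
    w_J = np.zeros(k)
    for _ in range(iters):
        w_J = np.maximum(0.0, w_J - step * (c_bar[J] + mu * (Q_J @ w_J)))
    return J, w_J
```

With k equal to the number of non-basic variables, this reduces to the full LR; the returned w_J then only needs the Theorem 2 normalization to give the external column weights.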

We used test problem instances that are randomly generated according to the principle in [2], which allows the simplex method to start with the slack basis. The external columns are constructed from original columns only (although it is in principle possible to also include earlier external columns). We study the number of simplex iterations and running times required for reaching optimality when using the standard simplex method, and when using the simplex method with external columns, with different numbers of edge directions used to generate the external columns and with external columns generated either only once or repeatedly. Table 1 shows the results. Figure 2 shows the convergence histories for the smallest problem instance when using k = 200.

Fig. 2

Objective value versus iteration and running time, respectively

Table 1 Simplex iterations and running times

Our results indicate that the use of approximate steepest feasible directions can considerably reduce both the number of simplex iterations and the total running times if the directions are based on many edge directions and created infrequently; if the directions are based on few edge directions and created frequently, then the overall performance can instead worsen. These findings clearly call for further investigation. Further, our feasible direction approach must be extended to properly handle degeneracy, and tailored algorithms for fast solution of the restricted problem LR should be developed.