A partly convex problem as formulated in Sect. 8.2 of the preceding chapter can also be viewed as a convex optimization problem depending upon a parameter \( x \in \mathbb{R}^{n}. \) The dimension n of the parameter space is referred to as the nonconvexity rank of the problem. Roughly speaking, this is the number of “nonconvex variables” in the problem. As we saw in Sect. 8.2, a partly convex problem of rank n can be efficiently solved by a BB decomposition algorithm with branching performed in \( \mathbb{R}^{n}. \) This chapter discusses decomposition methods for partly convex problems with small nonconvexity rank n. It turns out that these problems can be solved by streamlined decomposition methods based on parametric programming.

1 Functions Monotonic w.r.t. a Convex Cone

Let X be a convex set in \( \mathbb{R}^{n} \) and K a closed convex cone in \( \mathbb{R}^{n}. \) A function \( f: \mathbb{R}^{n} \rightarrow \mathbb{R} \) is said to be monotonic on X with respect to K (or K-monotonic on X for short) if

$$ \displaystyle{ x,x' \in X,\ x' - x \in K\quad \Rightarrow \quad f(x') \geq f(x). } $$
(9.1)

In other words, for every x ∈ X and u ∈ K the univariate function θ ↦ f(x +θ u) is nondecreasing in the interval \( (0,\overline{\theta }), \) where \( \overline{\theta } =\sup \{\theta \vert \ x +\theta u \in X\}. \) When \( X = \mathbb{R}^{n} \) we simply say that f(x) is K-monotonic.

Let L be the lineality space of K, and Q an r × n matrix of rank r such that L = { x |  Qx = 0}. From (9.1) it follows that

$$ \displaystyle{ x,x' \in X,\ Qx = Qx'\quad \Rightarrow \quad f(x) = f(x'), } $$
(9.2)

i.e., f(x) is constant on every set {x ∈ X |  Qx = const}. Denote \( Q(X) =\{ y \in \mathbb{R}^{r}\vert \) y = Qx, x ∈ X}. 

Proposition 9.1

Let K ⊂ recX. A function f(x) is K-monotonic on X if and only if

$$ \displaystyle{ f(x) = F(Qx)\quad \forall x \in X, } $$
(9.3)

where \( F: Q(X) \rightarrow \mathbb{R} \) satisfies

$$ \displaystyle{ y,y' \in Q(X),\ y' - y \in Q(K)\quad \Rightarrow \quad F(y') \geq F(y). } $$
(9.4)

Proof

If f(x) is K-monotonic on X then we can define a function \( F: Q(X) \rightarrow \mathbb{R} \) by setting, for every y ∈ Q(X): 

$$ \displaystyle{ F(y) = f(x)\ \mbox{ for any}\ x \in X\ \mbox{ such that}\ y = Qx. } $$
(9.5)

This definition is correct in view of (9.2). Obviously, f(x) = F(Qx). For any y, y′ ∈ Q(X) such that y′ − y ∈ Q(K) we have y = Qx, y′ = Qx′, with Q(x′ − x) = Qu,  u ∈ K, i.e., Qx′ = Q(x + u),  u ∈ K. Here x + u ∈ X because u ∈ recX, hence by (9.1), f(x′) = f(x + u) ≥ f(x), i.e., F(y′) ≥ F(y). Conversely, if f(x) = F(Qx) and \( F: Q(X) \rightarrow \mathbb{R} \) satisfies (9.4), then for any x, x′ ∈ X such that x′ − x ∈ K we have y = Qx, y′ = Qx′, with y′ − y ∈ Q(K), hence F(y′) ≥ F(y), i.e., f(x′) ≥ f(x).  □ 

The above representation of K-monotonic functions explains their role in global optimization. Indeed, using this representation, any optimization problem in \( \mathbb{R}^{n} \) of the form

$$ \displaystyle{ \min \{\,f(x)\vert \quad x \in D\} } $$
(9.6)

where D ⊂ X, and f(x) is K-monotonic on X, can be rewritten as

$$ \displaystyle{ \min \{F(y)\vert \quad y \in Q(D)\} } $$
(9.7)

which is a problem in \( \mathbb{R}^{r}. \) Assume further that f(x) is quasiconcave on X, i.e., by (9.5), F(y) is quasiconcave on Q(X). 

Proposition 9.2

Let K ⊂ recX. A quasiconcave function \( f: X \rightarrow \mathbb{R} \) on X is K-monotonic on X if and only if for every x 0 ∈ X with f(x 0 ) = γ:

$$ \displaystyle{x^{0} + K \subset X(\gamma ):=\{ x \in X\vert \ f(x) \geq \gamma \}.} $$

Proof

If f(x) is K-monotonic on X, then for any x ∈ X(γ) and u ∈ K we have x + u ∈ X (because K ⊂ recX) and condition (9.1) implies that f(x + u) ≥ f(x) ≥ γ, hence x + u ∈ X(γ), i.e., u is a recession direction for X(γ). Conversely, if for any γ ∈ f(X), K is contained in the recession cone of X(γ), then for any x′, x ∈ X such that x′ − x ∈ K one has, for γ = f(x): x ∈ X(γ), hence x′ ∈ X(γ), i.e., f(x′) ≥ f(x).  □ 

Proposition 9.1 provides the foundation for reducing the dimension of problem (9.6), whereas Proposition 9.2 suggests methods for restricting the global search domain. Indeed, since local information is generally insufficient to verify the global optimality of a solution, the search for a global optimal solution must be carried out, in principle, over the entire feasible set. If, however, the objective function f(x) in (9.6) is K-monotonic, then, once a solution x 0 ∈ D is known, one can ignore the whole set D ∩ (x 0 + K) ⊂ { x ∈ D |  f(x) ≥ f(x 0)}, because no better feasible solution than x 0 can be found in this set. Such kind of information is often very helpful and may drastically simplify the problem by limiting the global search to a restricted region of the feasible domain. In the next sections, we shall discuss how efficient algorithms for K-monotonic problems can be developed based on these observations.

Below are some important classes of quasiconcave monotonic functions encountered in the applications.

Example 9.1

f(x) = φ(x 1, , x p ) + d T x, 

with φ(y 1, , y p ) a continuous concave function of y = (y 1, , y p ). 

Here F(y) = φ(y 1, , y p ) + y p+1, Qx = (x 1, , x p , d T x)T. Monotonicity holds with respect to the cone K = { u |  u i  = 0(i = 1, , p),  d T u ≥ 0}. It has long been recognized that concave minimization problems with objective functions of this form can be efficiently solved only by taking advantage of the presence of the linear part in the objective function (Rosen 1983; Tuy 1984).

Example 9.2

\( f(x) = -\sum _{i=1}^{p}[\langle c^{i},x\rangle ]^{2} + \langle c^{p+1},x\rangle. \) 

Here \( F(y) = -\sum _{i=1}^{p}y_{i}^{2} + y_{p+1}, \) Qx = (〈c 1, x〉, …, 〈c p+1, x〉)T. Monotonicity holds with respect to K = { u |  〈c i, u〉 = 0 (i = 1, …, p),  〈c p+1, u〉 ≥ 0}. For p = 1 such a function f(x) cannot in general be written as a product of two affine functions and finding its minimum over a polytope is a problem known to be NP-hard (Pardalos and Vavasis 1991); it is, however, quite practically solvable by a parametric algorithm (see Sect. 9.5).

Example 9.3

\( f(x) =\prod _{ i=1}^{r}[\langle c^{i},x\rangle + d_{i}]^{\alpha _{i}} \) with α i  > 0,  i = 1, , r. 

This class includes functions which are products of affine functions. Since \( \log f(x) =\sum _{i=1}^{r}\alpha _{i}\log [\langle c^{i},x\rangle + d_{i}] \) for every x such that 〈c i, x〉 + d i  > 0, it is easily seen that f(x) is quasiconcave on the set X = { x |  〈c i, x〉 + d i  ≥ 0, i = 1, …, r}. Furthermore, f(x) is monotonic on X with respect to the cone K = { u |  〈c i, u〉 ≥ 0,  i = 1, …, r}. Here f(x) has the form (9.3) with \( F(y) =\prod _{ i=1}^{r}(y_{i} + d_{i})^{\alpha _{i}}. \) Problems of minimizing functions f(x) of this form (with α i  = 1,  i = 1, …, r) under linear constraints are termed linear multiplicative programs and will be discussed later in this chapter (Sect. 9.5).
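The representation (9.3) and the K-monotonicity claimed in this example are easy to check numerically. The following sketch (a hypothetical illustration with made-up random data, not part of the original text) draws an instance of this class and verifies (9.1) for directions u with Qu ≥ 0:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 4, 2
C = rng.standard_normal((r, n))          # rows c^1, ..., c^r of Q
d = rng.uniform(1.0, 2.0, size=r)        # offsets d_i > 0
alpha = rng.uniform(0.5, 1.5, size=r)    # exponents alpha_i > 0

def f(x):
    # f(x) = prod_i (<c^i, x> + d_i)^alpha_i, defined where all factors are >= 0
    factors = C @ x + d
    assert np.all(factors >= 0), "x must lie in X"
    return np.prod(factors ** alpha)

# draw x in X and a direction u in K = {u | Cu >= 0}, then check (9.1)
for _ in range(1000):
    x = rng.standard_normal(n)
    if np.any(C @ x + d < 0):
        continue                          # keep only points of X
    s = rng.uniform(0.0, 1.0, size=r)     # prescribed nonnegative values of Cu
    u = C.T @ np.linalg.solve(C @ C.T, s) # one solution of Cu = s, hence u in K
    assert f(x + u) >= f(x) - 1e-9        # K-monotonicity (9.1)
print("K-monotonicity verified on random samples")
```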

Example 9.4

\( f(x) = -\sum _{i=1}^{r}\theta _{i}e^{\langle c^{i},x\rangle } \) with θ i  > 0, i = 1, , r.

Here \( F(y) = -\sum _{i=1}^{r}\theta _{i}e^{y_{i}} \) is concave in y, so f(x) is concave in x. Monotonicity holds with respect to the cone K = { x |  〈c i, x〉 ≤ 0,  i = 1, , r}. Functions of this type appear when dealing with geometric programs with negative coefficients (“signomial programs,” cf Example 5.6).

Example 9.5

f(x) quasiconcave and differentiable, with

$$ \displaystyle{\nabla f(x) =\sum _{ i=1}^{r}p_{ i}(x)c^{i},\quad c^{i} \in \mathbb{R}^{n},\ i = 1,\ldots,r.} $$

For every x, x′ ∈ X satisfying 〈c i, x′ − x〉 = 0, i = 1, …, r, there is some θ ∈ (0, 1) such that \( f(x') - f(x) =\langle \nabla f(x +\theta (x' - x)),x' - x\rangle =\sum _{i=1}^{r}p_{i}(x +\theta (x' - x))\langle c^{i},x' - x\rangle = 0, \) hence f(x) is monotonic with respect to the space L = { x |   〈c i, x〉 = 0,  i = 1, …, r}. If, in addition, \( p_{i}(x) \geq 0\ \ \forall x\ (i = 1,\ldots,r) \) then monotonicity holds with respect to the cone K = { u |   〈c i, u〉 ≥ 0, i = 1, …, r}. 

Example 9.6

f(x) = min{y T Qx |  y ∈ E}

where E is a convex subset of \( \mathbb{R}^{m} \) and Q is an m × n matrix of rank r. 

This function appears when transforming rank r bilinear programs into concave minimization problems (see Sect. 10.4). Here monotonicity holds with respect to the subspace K = { u |  Qu = 0}. If \( E \subset \mathbb{R}_{+}^{m} \) then Qu ≥ 0 implies y T Q(x + u) ≥ y T Qx, hence f(x + u) ≥ f(x), and so f(x) is monotonic with respect to the cone K = { u |  Qu ≥ 0}. 

Example 9.7

f(x) = sup{d T y |   Ax + By ≤ q}

where \( A \in \mathbb{R}^{p\times n} \) with rank\( A = r,\ B \in \mathbb{R}^{p\times m},\ y \in \mathbb{R}^{m} \) and \( q \in \mathbb{R}^{p}. \)

This function appears in bilevel linear programming, for instance, in the max-min problem (Falk 1973b):

$$ \displaystyle{\min _{x}\max _{y}\{c^{T}x + d^{T}y\vert \ \ Ax + By \leq q,x \geq 0,y \geq 0\}.} $$

Obviously, Au ≤ 0 implies A(x + u) ≤ Ax, hence {y |   Ax + By ≤ q} ⊂ {y |   A(x + u) + By ≤ q}, hence f(x + u) ≥ f(x). Therefore, monotonicity holds with respect to the cone K = { u |  Au ≤ 0}. 

Note that by Corollary 1.15, rankQ = n − dimL = dimK°. A K-monotonic function f(x) on X with rankQ = r is also said to possess the rank r monotonicity property.

2 Decomposition by Projection

Let D be a compact subset of a closed convex set X in \( \mathbb{R}^{n} \) and let \( f: \mathbb{R}^{n} \rightarrow \mathbb{R} \) be a quasiconcave function, monotonic on X with respect to a polyhedral convex cone K contained in the recession cone of X. Assume that the lineality space of K is L = { x |  Qx = 0}, where Q is an r × n matrix of rank r. Consider the quasiconcave monotonic problem

$$ \displaystyle{ (QCM)\qquad \qquad \min \{f(x)\vert \quad x \in D\} } $$
(9.8)

In this and the next two sections, we shall discuss decomposition methods for solving this problem, under the additional assumption that D is a polytope. The idea of decomposition is to transform the original problem into a sequence of subproblems, each involving a reduced number of nonconvex variables. This can be done either by projection, or by dualization (polyhedral annexation), or by parametrization (for r ≤ 3). We first present decomposition methods by projection.

As has been previously observed, by setting

$$ \displaystyle{ F(y) = f(x)\ \mbox{ for any}\ x \in X\ \mbox{ satisfying}\ y = Qx, } $$
(9.9)

we unambiguously define a quasiconcave function \( F: Q(X) \rightarrow \mathbb{R} \) such that problem (QCM) is equivalent to

$$ \displaystyle{ \min F(y)\quad \mathrm{s.t.}\quad y = (y_{1},\ldots,y_{r}) \in G } $$
(9.10)

where G = Q(D) is a polytope in \( \mathbb{R}^{r}. \) The algorithms presented in Chaps. 5 and 6 can of course be adapted to solve this quasiconcave minimization problem.

We first show how to compute F(y). Since rankQ = r, by writing Q = [Q B , Q N ], \( x = \left [\begin{array}{c} x_{B} \\ x_{N} \end{array} \right ], \) where Q B is an r × r nonsingular matrix, the equation Qx = y yields

$$ \displaystyle{x = \left [\begin{array}{c} Q_{B}^{-1} \\ 0 \end{array} \right ]y+\left [\begin{array}{c} - Q_{B}^{-1}Q_{ N}x_{N}\\ x_{ N} \end{array} \right ]} $$

Setting \( Z = \left [\begin{array}{c} Q_{B}^{-1} \\ 0 \end{array} \right ] \) and \( u = \left [\begin{array}{c} - Q_{B}^{-1}Q_{ N}x_{N}\\ x_{ N} \end{array} \right ] \) we obtain

$$ \displaystyle{ x = Zy + u\quad \mathrm{with}\ Qu = -Q_{N}x_{N} + Q_{N}x_{N} = 0. } $$
(9.11)

Since Qx = Q(Zy) with Zy = x − u ∈ X − L = X, it follows that f(x) = f(Zy). Thus, the objective function of (9.10) can be computed by the formula

$$ \displaystyle{ F(y) = f(Zy). } $$
(9.12)

We shall assume that f(x) is continuous on some open set in \( \mathbb{R}^{n} \) containing D, so F(y) is continuous on some open set in \( \mathbb{R}^{r} \) containing the constraint polytope G = Q(D) of (9.10). The peculiar form of the latter set (image of a polytope D under the linear map Q) is a feature which requires special treatment, but does not cause any particular difficulty.
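As an illustration of (9.11)–(9.12), the following sketch computes Z and the reduced objective F(y) = f(Zy). The helper names are hypothetical, and the sketch simply assumes that the first r columns of Q form the nonsingular block Q_B (in general the columns would first have to be permuted):

```python
import numpy as np

def build_Z(Q):
    """Return Z such that QZ = I_r, taking Q_B as the first r columns of Q.

    Assumes these columns are linearly independent; otherwise permute the
    columns (or select a basis, e.g., by QR with pivoting) before calling."""
    r, n = Q.shape
    Q_B = Q[:, :r]
    Z = np.zeros((n, r))
    Z[:r, :] = np.linalg.inv(Q_B)        # Z = [Q_B^{-1}; 0] as in (9.11)
    return Z

def make_F(f, Q):
    """Return the reduced objective F(y) = f(Zy) of problem (9.10), cf. (9.12)."""
    Z = build_Z(Q)
    return lambda y: f(Z @ np.asarray(y))

# tiny usage example: f depends on x only through Qx
Q = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])
f = lambda x: -(Q[0] @ x) ** 2 + (Q[1] @ x)
F = make_F(f, Q)
x = np.array([0.3, -1.2, 0.5])
assert abs(F(Q @ x) - f(x)) < 1e-12      # F(Qx) = f(x) as in (9.3)
print(F(Q @ x))
```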

2.1 Outer Approximation

To solve (9.10) by outer approximation, the key point is: given a point \( \overline{y} \in Q(\mathbb{R}^{n}) \subset \mathbb{R}^{r} \), determine whether \( \overline{y} \in G = Q(D), \) and if \( \overline{y}\notin G \) then construct a linear inequality (cut) l(y) ≤ 0 to exclude \( \overline{y} \) without excluding any point of G. 

Observe that \( \overline{y}\notin Q(D) \) if and only if the linear system \( Qx = \overline{y} \) has no solution in D, so the existence of l(y) is ensured by the separation theorem or any of its equivalent forms (such as the Farkas–Minkowski Lemma). However, since we are not so much interested in the existence as in the effective construction of l(y), the best way is to use the duality theorem of linear programming (which is another form of the separation theorem). Specifically, assuming D = { x |  Ax ≤ b,  x ≥ 0}, with \( A \in \mathbb{R}^{m\times n},\ b \in \mathbb{R}^{m}, \) consider the dual pair of linear programs

$$ \displaystyle\begin{array}{rcl} & \min \{\langle h,x\rangle \vert \ \ Qx = \overline{y},Ax \leq b,x \geq 0\}&{}\end{array} $$
(9.13)
$$ \displaystyle\begin{array}{rcl} & \max \{\langle \overline{y},v\rangle -\langle b,w\rangle \vert \ Q^{T}v - A^{T}w \leq h,\ w \geq 0\}.&{}\end{array} $$
(9.14)

where h is any vector chosen so that (9.14) is feasible (for example, h = 0) and can be solved with least effort.

Proposition 9.3

Solving (9.14) either yields a finite optimal solution or an extreme direction \( (\overline{v},\overline{w}) \) of the cone Q T v − A T w ≤ 0, w ≥ 0, such that \( \langle \overline{y},\overline{v}\rangle -\langle b,\overline{w}\rangle > 0. \) In the former case, \( \overline{y} \in G = Q(D); \) in the latter case, the affine function

$$ \displaystyle{ l(y) =\langle y,\overline{v}\rangle -\langle b,\overline{w}\rangle } $$
(9.15)

satisfies

$$ \displaystyle{ l(\overline{y}) > 0,\quad l(y) \leq 0\quad \forall y \in G. } $$
(9.16)

Proof

Immediate from the duality theorem of linear programming. □ 
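For a concrete picture of this separation step, here is a small sketch with D = {x | Ax ≤ b, x ≥ 0}. Since a generic LP solver does not return extreme directions, the unbounded case of (9.14) is handled below through a box-normalized version of the homogeneous dual (h = 0), which produces the same cut (9.15); the function name and the tolerance are assumptions of the sketch, not part of the text.

```python
import numpy as np
from scipy.optimize import linprog

def separate(Q, A, b, y_bar):
    """Decide whether y_bar lies in G = Q(D), D = {x | Ax <= b, x >= 0}.

    Solves max <y_bar, v> - <b, w>  s.t.  Q^T v - A^T w <= 0, w >= 0,
    with the box normalization -1 <= v <= 1, 0 <= w <= 1.
    Returns (True, None) if y_bar in G, otherwise (False, (v, w)) defining the
    cut l(y) = <y, v> - <b, w> <= 0, valid for G and violated by y_bar."""
    r, n = Q.shape
    m = A.shape[0]
    c = np.concatenate([-y_bar, b])                  # minimize -<y_bar,v> + <b,w>
    A_ub = np.hstack([Q.T, -A.T])                    # Q^T v - A^T w <= 0
    b_ub = np.zeros(n)
    bounds = [(-1.0, 1.0)] * r + [(0.0, 1.0)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    v, w = res.x[:r], res.x[r:]
    if y_bar @ v - b @ w <= 1e-9:                    # no separating cut exists
        return True, None
    return False, (v, w)

# usage: D the unit square in R^2, Q the identity, y_bar inside / outside Q(D)
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([1.0, 1.0, 0.0, 0.0])
Q = np.eye(2)
print(separate(Q, A, b, np.array([0.5, 0.5])))       # inside
print(separate(Q, A, b, np.array([2.0, 0.5])))       # outside -> cut returned
```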

On the basis of this proposition a finite OA procedure for solving (9.10) can be carried out in the standard way, starting from an initial r-simplex P 1 ⊃ G = Q(D) with a known or readily computable vertex set. In most cases, one can take this initial simplex to be

$$ \displaystyle{ P_{1} = \left \{y\vert \ \ y_{i} \geq \alpha _{i},\ (i = 1,\ldots,r),\ \sum _{i=1}^{r}(y_{ i} -\alpha _{i}) \leq \beta \right \} } $$
(9.17)

where

$$ \displaystyle\begin{array}{rcl} & \alpha _{i} =\min \{ y_{i}\vert \ \ y = Qx,x \in D\},i = 1,\ldots,r,&{}\end{array} $$
(9.18)
$$ \displaystyle\begin{array}{rcl} & \beta =\max \left \{\sum _{i=1}^{r}(y_{i} -\alpha _{i})\vert \ \ y = Qx,x \in D\right \}.&{}\end{array} $$
(9.19)

Remark 9.1

A typical case of interest is the concave program with few nonconvex variables (Example 7.1):

$$ \displaystyle{ \min \{\varphi (y) + dz\vert \ \ Ay + Bz \leq b,\ y \geq 0,\ z \geq 0\} } $$
(9.20)

where the function \( \varphi: \mathbb{R}^{p} \rightarrow \mathbb{R} \) is concave and \( (y,z) \in \mathbb{R}^{p} \times \mathbb{R}^{q}. \) By rewriting this problem in the form (9.10):

$$ \displaystyle{\min \{\varphi (y) + t\vert \ \ dz \leq t,\ Ay + Bz \leq b,\ y \geq 0,\ z \geq 0\}} $$

we see that for checking the feasibility of a vector \( (\overline{y},\overline{t}) \in \mathbb{R}_{+}^{p} \times \mathbb{R} \) the best choice of h in (9.13) is h = d. This leads to the pair of dual linear programs

$$ \displaystyle\begin{array}{rcl} & \min \{dz\vert \ \ Bz \leq b - A\overline{y},\ z \geq 0\}&{}\end{array} $$
(9.21)
$$ \displaystyle\begin{array}{rcl} & \max \{\langle A\overline{y} - b,v\rangle \vert \ \ - B^{T}v \leq d,\ v \geq 0\}.&{}\end{array} $$
(9.22)

We leave it to the reader to verify that:

  1. 1.

    If \( \overline{v} \) is an optimal solution to (9.22) and \( \langle A\overline{y} - b,\overline{v}\rangle \leq \overline{t} \) then \( (\overline{y},\overline{t}) \) is feasible;

  2. 2.

    If \( \overline{v} \) is an optimal solution to (9.22) and \( \langle A\overline{y} - b,\overline{v}\rangle > \overline{t} \) then the cut \( \langle Ay - b,\overline{v}\rangle \leq t \) excludes \( (\overline{y},\overline{t}) \) without excluding any feasible point;

  3. 3.

    If (9.22) is unbounded, i.e., \( \langle A\overline{y} - b,\overline{v}\rangle > 0 \) for some extreme direction \( \overline{v} \) of the feasible set of (9.22), then the cut \( \langle Ay - b,\overline{v}\rangle \leq 0 \) excludes \( (\overline{y},\overline{t}) \) without excluding any feasible point.

    Sometimes the matrix B has a favorable structure such that for each fixed y ≥ 0 the problem (9.20) is solvable by efficient specialized algorithms (see Sect. 9.8). Since the above decomposition preserves this structure in the auxiliary problems, the latter problems can be solved by these algorithms.

2.2 Branch and Bound

A general rule in the branch and bound approach to nonconvex optimization problems is to branch upon the nonconvex variables, except for rare cases where there are obvious reasons to do otherwise. Following this rule the space to be partitioned for solving (QCM) is \( Q(\mathbb{R}^{n}) = \mathbb{R}^{r}. \) Given a partition set \( M \subset \mathbb{R}^{r} \) a basic operation is to compute a lower bound for F(y) over the feasible points in M. To this end, select an appropriate affine minorant l M (y) of F(y) over M and solve the linear program

$$ \displaystyle{\min \{l_{M}(y)\vert \ \ Qx = y,x \in D,y \in M\}.} $$

If β(M) is the optimal value of this linear program then obviously

$$ \displaystyle{\beta (M) \leq \min \{ F(y)\vert \ \ y \in Q(D) \cap M\}.} $$

According to condition (6.4) in Sect. 6.2 this can serve as a valid lower bound, provided, in addition, β(M) < +∞ whenever Q(D) ∩ M ≠ ∅, which is the case when M is bounded as, e.g., when using simplicial or rectangular partitioning.

Simplicial Algorithm

If M = [u 1, …, u r+1] is an r-simplex in \( Q(\mathbb{R}^{n}) \) then l M (y) can be taken to be the affine function that agrees with F(y) at u 1, …, u r+1. Since \( l_{M}(y) =\sum _{i=1}^{r+1}t_{i}F(u^{i}) \) for \( y =\sum _{i=1}^{r+1}t_{i}u^{i} \) with t = (t 1, …, t r+1) ≥ 0, \( \sum _{i=1}^{r+1}t_{i} = 1, \) the above linear program can also be written as

$$ \displaystyle{\min \left \{\sum _{i=1}^{r+1}t_{ i}F(u^{i})\vert \ \ x \in D,Qx =\sum _{ i=1}^{r+1}t_{ i}u^{i},\sum _{ i=1}^{r+1}t_{ i} = 1,t \geq 0\right \}.} $$
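Written out for D = {x | Ax ≤ b, x ≥ 0}, this bound is an ordinary linear program in the joint variables (x, t); the following is a hedged sketch (hypothetical helper), returning +∞ when Q(D) ∩ M = ∅ as required by condition (6.4):

```python
import numpy as np
from scipy.optimize import linprog

def simplex_lower_bound(F, Q, A, b, U):
    """Lower bound of F over Q(D) ∩ M for the r-simplex M = [u^1, ..., u^{r+1}].

    U is r x (r+1) with columns u^i.  Solves
      min sum_i F(u^i) t_i  s.t.  Ax <= b, Qx = U t, sum_i t_i = 1, x, t >= 0.
    Returns +inf if Q(D) ∩ M is empty."""
    r, n = Q.shape
    m = A.shape[0]
    k = U.shape[1]                                   # = r + 1
    Fu = np.array([F(U[:, i]) for i in range(k)])
    c = np.concatenate([np.zeros(n), Fu])
    A_ub = np.hstack([A, np.zeros((m, k))])
    A_eq = np.vstack([np.hstack([Q, -U]),            # Qx - Ut = 0
                      np.hstack([np.zeros((1, n)), np.ones((1, k))])])
    b_eq = np.concatenate([np.zeros(r), [1.0]])
    res = linprog(c, A_ub=A_ub, b_ub=b, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + k), method="highs")
    return res.fun if res.status == 0 else np.inf
```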

Simplicial subdivision is advisable when f(x) = f 0(x) + dx, where f 0(x) is concave and monotonic with respect to the cone Qx ≥ 0. Although f(x) is then actually monotonic with respect to the cone Qx ≥ 0, dx ≥ 0, by writing (QCM) in the form

$$ \displaystyle{\min \{F_{0}(y) + dx\vert \ \ Qx = y,x \in D\}} $$

one can see that it can be solved by a simplicial algorithm operating in \( Q(\mathbb{R}^{n}) \) rather than \( Q(\mathbb{R}^{n}) \times \mathbb{R}. \) As initial simplex M 1 one can take the simplex (9.17).

Rectangular Algorithm

When F(y) is concave separable, i.e.,

$$ \displaystyle{F(y) =\sum _{ i=1}^{r}F_{ i}(y_{i}),} $$

where each F i (⋅ ), i = 1, …, r is a concave univariate function, this structure can be best exploited by rectangular subdivision. In that case, for any rectangle M = [p, q] = { y |  p ≤ y ≤ q}, l M (y) can be taken to be the uniquely defined affine function which agrees with F(y) at the corners of M. This function is \( l_{M}(y) =\sum _{i=1}^{r}l_{i,M}(y_{i}) \) where each l i, M (t) is the affine univariate function that agrees with F i (t) at the endpoints p i , q i of the segment [p i , q i ]. The initial rectangle can be taken to be M 1 = [α 1, β 1] ×⋯× [α r , β r ], where α i is defined as in (9.18) and

$$ \displaystyle{\beta _{i} =\sup \{ y_{i}\vert \ y = Qx,x \in D\}.} $$
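The coordinatewise construction of l_M is elementary; a small numeric sketch (hypothetical names) that also checks the underestimation property guaranteed by concavity of the F i:

```python
import numpy as np

def rectangular_minorant(F_list, p, q):
    """Affine l_M(y) = sum_i (s_i * y_i + t_i) agreeing with each concave F_i
    at the endpoints p_i, q_i of the box M = [p, q]."""
    s = np.array([(F(qi) - F(pi)) / (qi - pi) for F, pi, qi in zip(F_list, p, q)])
    t = np.array([F(pi) - si * pi for F, pi, si in zip(F_list, p, s)])
    return lambda y: s @ np.asarray(y) + t.sum()

# example: F(y) = sqrt(y_1) + log(1 + y_2), concave and separable
F_list = [np.sqrt, lambda u: np.log1p(u)]
p, q = np.array([0.0, 0.0]), np.array([4.0, 3.0])
l_M = rectangular_minorant(F_list, p, q)
for y in np.random.default_rng(1).uniform(p, q, size=(100, 2)):
    assert l_M(y) <= np.sqrt(y[0]) + np.log1p(y[1]) + 1e-12   # chord below concave F
```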

Conical Algorithm

In the general case, since the global minimum of F(y) is achieved on the boundary of Q(D), conical subdivision is most appropriate. Nevertheless, instead of bounding, we use a selection operation, so that the algorithm is a branch and select rather than branch and bound procedure.

The key point in this algorithm is to decide, at each iteration, which cones in the current partition are ‘non promising’ and which one is the ‘most promising’ (selection rule).

Let \( \overline{y} \) be the current best solution (CBS) at a given iteration and \( \gamma:= F(\overline{y}). \) Assume that all the cones are vertexed at 0 and F(0) > γ, so that

$$ \displaystyle{0 \in \mathrm{int}\varOmega _{\gamma },} $$

where Ω γ  = { y ∈ Ω |  F(y) ≥ γ}. Let l M (y) = 1 be the equation of the hyperplane in \( \mathbb{R}^{r} \) passing through the intersections of the edges of M with the surface F(y) = γ. Denote by ω(M) a basic optimal solution and by μ(M) the optimal value of the linear program

$$ \displaystyle{LP(M,\gamma )\qquad \qquad \quad \max \{l_{M}(y)\vert \ \ y \in Q(D) \cap M\}.} $$

Note that if u 1, , u r are the directions of the edges of M and U = (u 1, , u r) is the matrix of columns u 1, , u r then M = { y |  y = Ut, t = (t 1, , t r )T ≥ 0}, so LP(M,γ) can also be written as

$$ \displaystyle{\max \{l_{M}(Ut)\vert \ Ax \leq b,\ Qx = Ut,\ x \geq 0,\ t \geq 0\}.} $$

The deletion and selection rules are based on the following:

Proposition 9.4

If μ(M) ≤ 1 then M cannot contain any feasible point better than CBS. If μ(M) > 1 and ω(M) lies on an edge of M then ω(M) is a better feasible solution than CBS.

Proof

Clearly Q(D) ∩ M is contained in the simplex Z := M ∩{ y |  l M (y) ≤ μ(M)}. If μ(M) ≤ 1, then the vertices of Z other than 0 lie inside Ω γ , hence the values of F(y) at these vertices are at least equal to γ. But by quasiconcavity of F(y), its minimum over Z must be achieved at some vertex of Z, hence must be at least equal to γ. This means that F(y) ≥ γ for all y ∈ Z, and hence, for all y ∈ Q(D) ∩ M. To prove the second assertion, observe that if ω(M) lies on an edge of M and μ(M) > 1, then ω(M) ∉ Ω γ , i.e., F(ω(M)) < γ; furthermore, ω(M) is obviously feasible. □ 

The algorithm is initialized from a partition of \( \mathbb{R}^{r} \) into r + 1 cones vertexed at 0 (take, e.g., the partition induced by the r-simplex [e 0, e 1, , e r] where e i, for i = 1, , r, is the i-th unit vector of \( \mathbb{R}^{r} \) and \( e^{0} = -\frac{1} {r}(e^{1} +\ldots +e^{r})). \) On the basis of Proposition 9.4, at iteration k a cone M will be fathomed if μ(M) ≤ 1, while the most promising cone M k (the candidate for further subdivision) is the unfathomed cone that has largest μ(M). If ω k = ω(M k ) lies on any edge of M k it yields, by Proposition 9.4, a better feasible solution than CBS and the algorithm can go to the next iteration with this improved CBS. Otherwise, M k can be subdivided via the ray through ω k. 

As proved in Chap. 7 (Theorem 7.3), the above described conical algorithm converges, in the sense that any accumulation point of the sequence of CBS that it generates is a global optimal solution. Note that no bound is needed in this branch and select algorithm. Although a bound for F(y) over the feasible points in M can be easily derived from the constructions used in Proposition 9.4, it would require the computation of the values of F(y) at r additional points (the intersections of the edges of M with the hyperplane l M (y) = μ(M)), which may entail a nonnegligible extra cost.
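For completeness, here is a hedged sketch of the subproblem LP(M, γ) only (not of the full branch and select procedure). It assumes D = {x | Ax ≤ b, x ≥ 0} has been translated so that the cones are vertexed at 0, that F(0) > γ, and that every edge direction u^i leaves the level set {F ≥ γ} within a prescribed bracket, so that θ_i can be located by bisection; all names are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def edge_intersection(F, u, gamma, theta_max=1e3, tol=1e-9):
    """Bisection for theta with F(theta * u) = gamma, assuming F(0) > gamma
    and F(theta_max * u) < gamma (the edge meets the surface F = gamma)."""
    lo, hi = 0.0, theta_max
    while hi - lo > tol * (1.0 + hi):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if F(mid * u) >= gamma else (lo, mid)
    return 0.5 * (lo + hi)

def conical_subproblem(F, Q, A, b, U, gamma):
    """Solve LP(M, gamma): max l_M(y) over y in Q(D) ∩ M, with M = cone(U).

    l_M(Ut) = sum_i t_i / theta_i, where theta_i * u^i lies on F(y) = gamma.
    Returns (mu(M), omega(M)) = (optimal value, optimal point y = Ut)."""
    r = U.shape[1]
    n = Q.shape[1]
    theta = np.array([edge_intersection(F, U[:, i], gamma) for i in range(r)])
    # variables (x, t): maximize sum_i t_i/theta_i, i.e. minimize the negative
    c = np.concatenate([np.zeros(n), -1.0 / theta])
    A_ub = np.hstack([A, np.zeros((A.shape[0], r))])
    A_eq = np.hstack([Q, -U])                        # Qx = Ut
    res = linprog(c, A_ub=A_ub, b_ub=b, A_eq=A_eq, b_eq=np.zeros(Q.shape[0]),
                  bounds=[(0, None)] * (n + r), method="highs")
    if res.status != 0:
        return -np.inf, None                         # Q(D) ∩ M empty
    t = res.x[n:]
    return -res.fun, U @ t
```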

3 Decomposition by Polyhedral Annexation

The decomposition method by outer approximation is conceptually simple. However, a drawback of this method is its inability to be restarted. Furthermore, the construction of the initial polytope involves solving r + 1 linear programs. An alternative decomposition method which allows restarts and usually performs better is by dualization through polyhedral annexation.

Let \( \overline{x} \) be a feasible solution and \( C:=\{ x \in X\vert \ \ f(x) \geq f(\overline{x})\}. \) By quasiconcavity and continuity of f(x), C is a closed convex set. By translating if necessary, assume that 0 is a feasible point such that \( f(0) > f(\overline{x}), \) so 0 ∈ D ∩intC. The PA method (Chap. 8, Sect. 8.1) uses as a subroutine Procedure DC* for solving the following subproblem:

Find a point x ∈ D∖C (i.e., a better feasible solution than \( \overline{x}) \) or else prove that D ⊂ C (i.e., \( \overline{x} \) is a global optimal solution).

The essential idea of Procedure DC* is to build up, starting with a polyhedron P 1 ⊃ C° (Lemma 8.1), a nested sequence of polyhedrons P 1 ⊃ P 2 ⊃ ⋯ ⊃ C° that yields eventually either a polyhedron P k  ⊂ D° (in that case C° ⊂ D°, hence D ⊂ C) or a point x k ∈ D ∖ C. In order to exploit the monotonicity structure, we now try to choose the initial polyhedron P 1 so that P 1 ⊂ L  ⊥  (orthogonal complement of L). If this can be achieved, then the whole procedure will operate in L  ⊥ , which allows much computational saving since dimL  ⊥  = r < n. 

Let c 1, …, c r be the rows of Q (so that L  ⊥  is the subspace generated by these vectors) and let \( c^{0} = -\sum _{i=1}^{r}c^{i}. \) Let \( \pi: L^{\perp }\rightarrow \mathbb{R}^{r} \) be the linear map such that \( y =\sum _{i=1}^{r}t_{i}c^{i}\ \Leftrightarrow \ \pi (y) = t. \) For each i = 0, 1, …, r, since 0 ∈ intC, the halfline from 0 through c i intersects C. If this halfline meets ∂ C, we define α i to be the positive number such that z i = α i c i ∈ ∂ C; otherwise, we set α i  = +∞. Let I = { i |  α i  < +∞}, S 1 = conv{0, z i, i ∈ I} + cone{c i, i ∉ I}. 

Lemma 9.1

  1. (i)

    The polar P 1 of S 1 + L is an r-simplex in L  ⊥  defined by

    $$ \displaystyle{ \pi (P_{1}) = \left \{t \in \mathbb{R}^{r}\vert \ \sum _{ i=1}^{r}t_{ i}\langle c^{i},c^{j}\rangle \leq \frac{1} {\alpha _{j}},\ j = 0,1,\ldots,r\right \}, } $$
    (9.23)

    (as usual \( \frac{1} {+\infty } = 0). \)

  2. (ii)

    C° ⊂ P 1 and if V is the vertex set of any polytope P such that C° ⊂ P ⊂ P 1 , then

    $$ \displaystyle{ \max \{\langle v,x\rangle \vert \quad x \in D\} \leq 1\quad \forall v \in V \quad \Rightarrow \quad C^{\circ }\subset D^{\circ }. } $$
    (9.24)

Proof

Clearly S 1 ⊂ L  ⊥  and 0 ∈ riS 1, so 0 ∈ int(S 1 + L), hence P 1 is compact. Since (S 1 + L)° = S 1° ∩ L  ⊥ , any y ∈ P 1 belongs to L  ⊥ , i.e., is of the form \( y =\sum _{i=1}^{r}t_{i}c^{i} \) for some t = (t 1, …, t r ). But by Proposition 1.28, y ∈ S 1° is equivalent to \( \langle c^{j},y\rangle \leq \frac{1} {\alpha _{j}}\ \forall j = 0,1,\ldots,r. \) Hence P 1 is the polyhedron defined by (9.23). We have \( 0 < \frac{1} {\alpha _{j}}\ \forall j = 0,1,\ldots,r, \) so dimP 1 = r (Corollary 1.16), and since P 1 is compact it must be an r-simplex. To prove (ii) observe that S 1 + L ⊂ (C ∩ L  ⊥ ) + L = C, so C° ⊂ (S 1 + L)° = P 1. If \( \max \{\langle v,x\rangle \vert \ x \in D\} \leq 1\quad \forall v \in V, \) then \( \max \{\langle y,x\rangle \vert \ x \in D\} \leq 1\ \forall y \in P, \) i.e., P ⊂ D°, hence C° ⊂ D° because C° ⊂ P.  □ 

Thus Procedure DC* can be applied, starting from P 1 as an initial polytope. By incorporating Procedure DC* (initialized from P 1) into the PA Algorithm for (BCP) (Sect. 7.1) we obtain the following decomposition algorithm for (QCM):

PA Algorithm for (QCM)

Initialization. Let \( \overline{x} \) be a feasible solution (the best available), \( C =\{ x \in X\vert \ f(x) \geq f(\overline{x})\}. \) Choose a feasible point x 0 of D such that \( f(x^{0}) > f(\overline{x}) \) and set D ← D − x 0,  C ← C − x 0. Let \( \tilde{P}_{1} =\pi (P_{1}) \) be the simplex in \( \mathbb{R}^{r} \) defined by (9.23) (or \( \tilde{P}_{1} =\pi (P_{1}) \) with P 1 being any polytope in L  ⊥  such that C° ⊂ P 1). Let V 1 be the vertex set of \( \tilde{P}_{1}. \) Set k = 1. 

Step 1.:

For every new t = (t 1, , t r ) ∈ V k solve the linear program

$$ \displaystyle{ \max \left \{\sum _{i=1}^{r}t_{ i}\langle c^{i},x\rangle \vert \ \ x \in D\right \} } $$
(9.25)

to obtain its optimal value μ(t) and a basic optimal solution x(t). 

Step 2.:

Let t k ∈ argmax{μ(t) |  t ∈ V k }. If μ(t k) ≤ 1 then terminate: \( \overline{x} \) is an optimal solution of (QCM).

Step 3.:

If μ(t k) > 1 and x k = x(t k) ∉ C, then update the current best feasible solution and the set C by resetting \( \overline{x} = x^{k}. \)

Step 4.:

Compute

$$ \displaystyle{\theta _{k} =\sup \{\theta \vert \ \ f(\theta x^{k}) \geq f(\overline{x})\}} $$

and define

$$ \displaystyle{\tilde{P}_{k+1} =\tilde{ P}_{k} \cap \left \{t\vert \ \sum _{i=1}^{r}t_{ i}\langle x^{k},c^{i}\rangle \leq \frac{1} {\theta _{k}}\right \}.} $$
Step 5.:

From V k derive the vertex set V k+1 of \( \tilde{P}_{k+1}. \) Set k ← k + 1 and go back to Step 1.

Remark 9.2

Reported computational experiments (Tuy and Tam 1995) seem to indicate that the PA method is quite practical for problems with fairly large n provided r is small (typically r ≤ 6). For r ≤ 3, however, specialized parametric methods can be developed which are more efficient. These will be discussed in the next section.

Case When K = { u |  Qu ≥ 0}

A case when the initial simplex P 1 can be constructed so that its vertex set is readily available is when K = { u |  Qu ≥ 0}, with an r × n matrix Q of rank r. Let c 1, …, c r be the rows of Q. Using formula (9.11) one can compute a point w satisfying Qw = −e, where e = (1, …, 1)T, and a value α > 0 such that \( \hat{w}=\alpha w \in C \) (for the efficiency of the algorithm, α should be taken to be as large as possible).

Lemma 9.2

The polar P 1 of the set \( S_{1} = \hat{w}+ K \) is an r-simplex with vertex set {0,−c 1 ∕α,…,−c r ∕α} and satisfies condition (ii) in Lemma  9.1.

Proof

Since K ⊂ recC and \( \hat{w}\in C \) it follows that \( S_{1} = \hat{w}+ K \subset C, \) and hence P 1 ⊃ C°. Furthermore, clearly S 1 = { x |  〈c i, x〉 ≥ −α,  i = 1, …, r} because x ∈ S 1 if and only if \( x -\hat{w}\in K, \) i.e., \( \langle c^{i},x -\hat{w} \rangle \geq 0\ i = 1,\ldots,r, \) i.e., \( \langle c^{i},x\rangle \geq \langle c^{i}, \hat{w} \rangle = -\alpha,\ i = 1,\ldots,r. \) From Proposition 1.28 we then deduce that S 1° = conv{0, −c 1∕α, …, −c r∕α}. The rest is immediate. □ 

Also note that in this case K° = cone{ − c 1, …, −c r} and so \( \pi (K^{\circ }) = \mathbb{R}_{-}^{r} \) while \( \pi (P_{1}) =\{ t \in \mathbb{R}_{-}^{r}\vert \ \sum _{i=1}^{r}t_{i} \geq -1/\alpha \}. \)

4 The Parametric Programming Approach

In the previous sections, it was shown how a quasiconcave monotonic problem can be transformed into an equivalent quasiconcave problem of reduced dimension. In this section an alternative approach will be presented in which a quasiconcave monotonic problem is solved through the analysis of an associated linear program depending on a parameter (of the same dimension as the reduced problem in the former approach).

Consider the general problem formulated in Sect. 8.2:

$$ \displaystyle{ (QCM)\qquad \qquad \min \{f(x)\vert \quad x \in D\} } $$
(9.26)

where D is a compact set in \( \mathbb{R}^{n}, \) not necessarily even convex, and f(x) is a quasiconcave monotonic function on a closed convex set X ⊃ D with respect to a polyhedral cone K with lineality space L = { x |  Qx = 0}. To this problem we associate the following program, where v is an arbitrary vector in −K°: 

$$ \displaystyle{ LP(v)\qquad \qquad \min \{\langle v,x\rangle \vert \quad x \in D\}. } $$
(9.27)

Denote by \( K_{b}^{\circ } \) any compact set such that \( \mathrm{cone}(K_{b}^{\circ }) = K^{\circ }. \) 

Theorem 9.1

For every v ∈ −K°, let x v be an arbitrary optimal solution of LP(v). Then

$$ \displaystyle{ \min \{f(x)\vert \ x \in D\} =\min \{ f(x^{v})\vert \ v \in -K_{ b}^{\circ }\}. } $$
(9.28)

Proof

Let γ = min{f(x v) |  v ∈ −K b °}. Clearly γ = min{f(x v) |  v ∈ −K°∖{0}}. Denote by E the convex hull of the closure of the set of all x v, v ∈ −K°∖{0}. Since D is compact, so is E and it is easy to see that the convex set G = E + K is closed. Indeed, for any sequence x ν = u ν + v ν → x 0 with u ν ∈ E, v ν ∈ K, one has, by passing to subsequences if necessary, u ν → u 0 ∈ E, hence v ν = x ν − u ν → x 0 − u 0 ∈ K, which implies that x 0 = u 0 + (x 0 − u 0) ∈ E + K = G. Thus, G is a closed convex set. By K-monotonicity we have f(y + u) ≥ f(y) for all y ∈ E, u ∈ K, hence f(x) ≥ γ for all x ∈ E + K = G. Suppose now that γ > min{f(x) |  x ∈ D}, so that for some z ∈ D we have f(z) < γ, i.e., z ∈ D ∖ G. By Proposition 1.15 on the normals to a convex set there exists x 0 ∈ ∂ G such that v := x 0 − z satisfies

$$ \displaystyle{ \langle v,x - x^{0}\rangle \geq 0\quad \forall x \in G,\quad \langle v,z - x^{0}\rangle < 0. } $$
(9.29)

For every u ∈ K, since x 0 + u ∈ G + K = G, we have 〈v, u〉 ≥ 0, hence v ∈ −K°∖{0}. But, since z ∈ D, it follows from the definition of x v that 〈v, x v〉 ≤ 〈v, z〉 < 〈v, x 0〉 by (9.29), hence 〈v, x v − x 0〉 < 0, conflicting with the fact x v ∈ E ⊂ G and (9.29). □ 

The above theorem reduces problem (QCM) to minimizing the function φ(v) := f(x v) over −K b °. Though K° ⊂ L  ⊥ , and dimL  ⊥  = r may be small, this is still a difficult problem even when D is convex, because φ(v) is generally highly nonconvex. The point, however, is that in many important cases the problem (9.28) is more amenable to computational analysis than the original problem. For instance, as we saw in the previous section, when D is a polytope the PA Method solves (QCM) through solving a suitable sequence of linear programs (9.25) which can be recognized to be identical to (9.27) with \( v = -\sum _{i=1}^{r}t_{i}c^{i}. \) 
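The principle behind Theorem 9.1 can be illustrated by a brute-force sketch for the case K = {u | Qu ≥ 0}, where −K° = cone{c^1, …, c^r} and one may take −K_b° = conv{c^1, …, c^r}: scan weight vectors t on the unit simplex, solve LP(v) for v = ∑ t_i c^i, and keep the best f(x^v). The grid search below (hypothetical helper and data) is only a rough stand-in for the exact PA and parametric methods of this chapter.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def parametric_scan(f, C, A, b, grid=10):
    """min f over D = {x | Ax <= b, x >= 0} for f quasiconcave and monotonic
    w.r.t. K = {u | Cu >= 0}: scan v = sum_i t_i c^i over a simplex grid."""
    r, n = C.shape
    best_val, best_x = np.inf, None
    # integer compositions of `grid` into r parts give a grid on the simplex
    for comp in itertools.product(range(grid + 1), repeat=r):
        if sum(comp) != grid:
            continue
        t = np.array(comp, dtype=float) / grid
        v = t @ C
        res = linprog(v, A_ub=A, b_ub=b, bounds=[(0, None)] * n, method="highs")
        if res.status != 0:
            continue
        val = f(res.x)
        if val < best_val:
            best_val, best_x = val, res.x
    return best_val, best_x

# example: minimize the product <c^1,x><c^2,x> over a polytope (an LMP instance)
C = np.array([[1.0, 2.0], [2.0, 1.0]])
A = np.array([[-1.0, -1.0], [1.0, 1.0]]); b = np.array([-1.0, 3.0])
f = lambda x: (C[0] @ x) * (C[1] @ x)
print(parametric_scan(f, C, A, b))
```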

An important consequence of Theorem 9.1 can be drawn when:

$$ \displaystyle{ \left \vert \begin{array}{l} D\ \mbox{ is a polyhedron}; \\ K =\{ x\vert \ \ \langle c^{i},x\rangle \geq 0,i = 1,\ldots,p;\ \langle c^{i},x\rangle = 0,i = p + 1,\ldots,r\} \end{array} \right. } $$
(9.30)

(c 1, , c r being linearly independent vectors of \( \mathbb{R}^{n}). \) Let

$$ \displaystyle\begin{array}{rcl} & & \varLambda =\{\lambda \in \mathbb{R}_{+}^{p}\vert \ \lambda _{ p} = 1\}, {}\\ & & W =\{\alpha = (\alpha _{p+1},\ldots,\alpha _{r})\vert \ \underline{\alpha }_{i} \leq \alpha _{i} \leq \overline{\alpha }_{i},\ i = p + 1,\ldots,r\}, {}\\ & & \underline{\alpha }_{i} =\inf _{x\in D}\langle c^{i},x\rangle;\quad \overline{\alpha }_{ i} =\sup _{x\in D}\langle c^{i},x\rangle \ \ i = p + 1,\ldots,r. {}\\ \end{array} $$

Corollary 9.1

For each (λ,α) ∈Λ× W let x λ,α be an arbitrary basic optimal solution of the linear program

$$ \displaystyle{LP(\lambda,\alpha )\quad \left \vert \begin{array}{ll} \mathrm{min}&\sum _{i=1}^{p-1}\lambda _{i}\langle c^{i},x\rangle +\langle c^{p},x\rangle \\ \mathrm{s.t.} &x \in D,\ \langle c^{i},x\rangle =\alpha _{i},\ i = p + 1,\ldots,r. \end{array} \right.} $$

Then

$$ \displaystyle{ \min \{f(x)\vert \ \ x \in D\} =\min \{ f(x^{\lambda,\alpha })\vert \ \ \lambda \in \varLambda,\ \alpha \in W\}. } $$
(9.31)

Proof

Clearly

$$ \displaystyle{\min \{f(x)\vert \ \ x \in D\} =\min _{\alpha }\inf \{f(x)\vert \ \ x \in D \cap H_{\alpha }\},} $$

where H α  = { x |  〈c i, x〉 = α i ,  i = p + 1, …, r}. Let \( \tilde{K} =\{ x\vert \ \langle c^{i},x\rangle \geq 0, \) i = 1, …, p}. Since f(x) is quasiconcave \( \tilde{K} \)-monotonic on X ∩ H α , we have by Theorem 9.1:

$$ \displaystyle{ \min \{f(x)\vert \ \ x \in D \cap H_{\alpha }\} =\min \{ f(x^{v,\alpha })\vert \ \ v \in -\tilde{K}_{ b}^{\circ }\}, } $$
(9.32)

where x v, α is any basic optimal solution of the linear program

$$ \displaystyle{ \min \{\langle v,x\rangle \vert \ \ x \in D \cap H_{\alpha }\}. } $$
(9.33)

Noting that \( \tilde{K}^{\circ } = -\mathrm{cone}\{c^{1},\ldots,c^{p}\} \) we can take \( \tilde{K}_{b}^{\circ } = -[c^{1},\ldots,c^{p}]. \) Also we can assume \( v \in -\mathrm{ri}\tilde{K}_{b}^{\circ } \) since for any \( v \in -\tilde{K}_{b}^{\circ } \) there exists \( v' \in -\mathrm{ri}\tilde{K}_{b}^{\circ } \) such that x v′, α is a basic optimal solution of (9.33). So \( v =\sum _{i=1}^{p}t_{i}c^{i} \) with t p  > 0 and the desired result follows by setting \( \lambda _{i} = \frac{t_{i}} {t_{p}}. \) □ 

As mentioned above, by Theorem 9.1 one can solve (QCM) by finding first the vector \( \overline{v} \in -K_{b}^{\circ } \) that minimizes f(x v) over all v ∈ −K b °. In certain methods, as, for instance, in the PA method when D is polyhedral, an optimal solution \( \overline{x} = x^{\overline{v}} \) is immediately known once \( \overline{v} \) has been found. In other cases, however, \( \overline{x} \) must be sought by solving the associated program \( \min \{\langle \overline{v},x\rangle \vert \ x \in D\}. \) The question then arises as to whether any optimal solution of the latter program will solve (QCM). 

Theorem 9.2

If, in addition to the already stated assumptions, D ⊂ int X, while the function f(x) is continuous and strictly quasiconcave on X, then for any optimal solution \( \overline{x} \) of (QCM) there exists a \( \overline{v} \in -K^{\circ }\setminus \{0\} \) such that \( \overline{x} \) is an optimal solution of \( LP(\overline{v}) \) and any optimal solution to \( LP(\overline{v}) \) will solve (QCM).

(A function f(x) is said to be strictly quasiconcave on X if for any two x,x′ ∈ X,f(x′) < f(x) always implies f(x′) < f(θx + (1 −θ)x′) ∀θ ∈ (0,1))

Proof

It suffices to consider the case when f(x) is not constant on D. Let \( \overline{x} \) be an optimal solution of (QCM) and \( X(\overline{x}) =\{ x \in X\vert \ f(x) \geq f(\overline{x})\}. \) By quasiconcavity and continuity of f(x), this set \( X(\overline{x}) \) is closed and convex. By optimality of \( \overline{x}, \) \( D \subset X(\overline{x}) \) and again by continuity of f(x), \( \{x \in X\vert \ f(x) > f(\overline{x})\} \subset \mathrm{int}X(\overline{x}). \) Furthermore, since f(x) is not constant on D one can find x′ ∈ D such that \( f(x') > f(\overline{x}), \) so if \( \overline{x} \in \mathrm{int}X(\overline{x}) \) then for θ > 0 small enough, \( x^{\theta } = \overline{x} -\theta (x^{{\prime}}-\overline{x}) \in X(\overline{x}),\ f(x^{\theta }) < f(x^{{\prime}}), \) hence, by strict quasiconcavity of f(x), \( f(x^{\theta }) < f(\overline{x}), \) conflicting with the definition of \( X(\overline{x}). \) Therefore, \( \overline{x} \) lies on the boundary of \( X(\overline{x}). \) By Theorem 1.5 on supporting hyperplanes there exists then a vector \( \overline{v}\neq 0 \) such that

$$ \displaystyle{ \langle \overline{v},x -\overline{x}\rangle \geq 0\quad \forall x \in X(\overline{x}). } $$
(9.34)

For any u ∈ K we have \( \overline{x} + u \in D + K \subset X(\overline{x}), \) so \( \langle \overline{v},u\rangle \geq 0, \) and consequently, \( \overline{v} \in -K^{\circ }\setminus \{0\}. \) If \( \tilde{x} \) is any optimal solution of \( LP(\overline{v}), \) then \( \langle \overline{v},\tilde{x} -\overline{x}\rangle = 0, \) and in view of (9.34), \( \tilde{x} \) cannot be an interior point of \( X(\overline{x}) \) for then \( \langle \overline{v},x -\overline{x}\rangle = 0\ \ \forall x \in \mathbb{R}^{n}, \) conflicting with \( \overline{v}\neq 0. \) So \( \tilde{x} \) is a boundary point of \( X(\overline{x}), \) and hence \( f(\tilde{x}) = f(\overline{x}), \) because \( \{x \in X\vert \ f(x) > f(\overline{x})\} \subset \mathrm{int}X(\overline{x}). \) Thus, any optimal solution \( \tilde{x} \) to LP\( (\overline{v}) \) is optimal to (QCM). □ 

Remark 9.3

If f(x) is differentiable and pseudoconcave then it is strictly quasiconcave (see, e.g., Mangasarian 1969). In that case, since \( X(\overline{x}) =\{ x \in X\vert f(x) \geq f(\overline{x})\} \) one must have \( \langle \nabla f(\overline{x}),x -\overline{x}\rangle \geq 0\ \forall x \in X(\overline{x}) \) and since one can assume that there is x ∈ X satisfying \( f(x) > f(\overline{x}) \) the strict quasiconcavity of f(x) implies that \( \nabla f(\overline{x})\neq 0 \) (Mangasarian 1969). So \( \nabla f(\overline{x}) \) satisfies (9.34) and one can take \( \overline{v} = \nabla f(\overline{x}). \) This result was established in Sniedovich (1986) as the foundation of the so-called C-programming.

Corollary 9.2

Let \( F: \mathbb{R}^{q} \rightarrow \mathbb{R} \) be a continuous, strictly quasiconcave and K-monotonic function on a closed convex set \( Y \subset \mathbb{R}^{q}, \) where \( K \subset \mathbb{R}^{q} \) is a polyhedral convex cone contained in the recession cone of Y. Let \( g: D \rightarrow \mathbb{R}^{q} \) be a map defined on a set \( D \subset \mathbb{R}^{n} \) such that g(D) ⊂ int Y. If F(g(x)) attains a minimum on D then there exists t ∈ −K° such that any optimal solution of

$$ \displaystyle{ \min \{\langle t,g(x)\rangle \vert \quad x \in D\} } $$
(9.35)

will solve

$$ \displaystyle{ \min \{F(g(x))\vert \quad x \in D\}. } $$
(9.36)

Proof

If \( \overline{x} \) is a minimizer of F(g(x)) over D, then \( \overline{y} = g(\overline{x}) \) is a minimizer of F(y) over G = g(D), and the result follows from Theorem 9.2. □ 

In particular, when \( K \supset \mathbb{R}_{+}^{q} \) and D is a convex set while g i (x), i = 1, , q are convex functions, problem (9.35) is a parametric convex program (t ≥ 0 because \( -K^{\circ }\subset \mathbb{R}_{+}^{q}). \)

Example 9.8

Consider the problem (Tanaka et al. 1991):

$$ \displaystyle{ \min _{x}\left \{\left (\sum _{i=1}^{n}f_{ 1,i}(x_{i})\right ).\left (\sum _{i=1}^{n}f_{ 2,i}(x_{i})\right )\vert \ \ x_{i} \in X_{i},\ i = 1,\ldots,n\right \} } $$
(9.37)

where \( X_{i} \subset \mathbb{R}_{++} \) and f 1, i , f 2, i are positive-valued functions (i = 1, …, n). Applying the Corollary for g(x) = (g 1(x), g 2(x)),  \( g_{1}(x) =\sum _{i=1}^{n}f_{1,i}(x_{i}), \) \( g_{2}(x) =\sum _{i=1}^{n}f_{2,i}(x_{i}), \) F(y) = y 1 y 2 for every \( y = (y_{1},y_{2}) \in \mathbb{R}_{++}^{2}, \) we see that if this problem is solvable then there exists a number t > 0 such that any optimal solution of

$$ \displaystyle{ \min _{x}\left \{\sum _{i=1}^{n}(f_{ 1,i}(x_{i}) + tf_{2,i}(x_{i}))\vert \ \ x_{i} \in X_{i},\ i = 1,\ldots,n\right \} } $$
(9.38)

will solve (9.37). For fixed t > 0 the problem (9.38) splits into n one-dimensional subproblems

$$ \displaystyle{\min \{f_{1,i}(x_{i}) + tf_{2,i}(x_{i})\vert \ \ x_{i} \in X_{i}\}.} $$

In the particular case of the problem of optimal ordering policy for jointly replenished products as treated in the mentioned paper of Tanaka et al., f 1, i (x i ) = a∕n + a i ∕x i , f 2, i (x i ) = b i x i ∕2 (a, a i , b i  > 0) and X i is the set of positive integers, so each of the above subproblems can easily be solved.
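A hedged sketch of this decomposition: the callables f1[i], f2[i], the finite candidate sets X[i], and the grid over t are all hypothetical stand-ins (the exact method would locate the optimal t rather than scan a grid):

```python
import numpy as np

def solve_product_problem(f1, f2, X, t_grid):
    """min (sum_i f1[i](x_i)) * (sum_i f2[i](x_i)), x_i in the finite set X[i].

    For each fixed t > 0 the surrogate (9.38) splits into n independent
    one-dimensional subproblems  min f1[i](x_i) + t * f2[i](x_i)."""
    best_val, best_x = np.inf, None
    for t in t_grid:
        x = [min(Xi, key=lambda xi, i=i: f1[i](xi) + t * f2[i](xi))
             for i, Xi in enumerate(X)]
        val = (sum(f1[i](xi) for i, xi in enumerate(x))
               * sum(f2[i](xi) for i, xi in enumerate(x)))
        if val < best_val:
            best_val, best_x = val, x
    return best_val, best_x

# toy instance in the spirit of the joint replenishment model:
# f1 decreasing in x_i, f2 increasing in x_i, x_i a positive integer
n = 3
a, b = [4.0, 1.0, 9.0], [2.0, 1.0, 0.5]
f1 = [lambda xi, ai=ai: ai / xi for ai in a]
f2 = [lambda xi, bi=bi: bi * xi / 2.0 for bi in b]
X = [range(1, 50)] * n
print(solve_product_problem(f1, f2, X, t_grid=np.linspace(0.05, 5.0, 100)))
```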

5 Linear Multiplicative Programming

In this section we discuss parametric methods for solving problem (QCM) under assumptions (9.30). Let Q be the matrix of rows c 1, , c r. In view of Proposition 9.1, the problem can be reformulated as

$$ \displaystyle{ \min \{F(\langle c^{1},x\rangle,\ldots,\langle c^{r},x\rangle )\vert \ \ x \in D\}, } $$
(9.39)

where D is a polyhedron in \( \mathbb{R}^{n}, \) and \( F: \mathbb{R}^{r} \rightarrow \mathbb{R} \) is a quasiconcave function on a closed convex set Y ⊃ Q(D), such that for any y, y′ ∈ Y: 

$$ \displaystyle{ \left.\begin{array}{ll} y_{i} \geq y'_{i}&(i = 1,\ldots,p) \\ y_{i} = y'_{i}&(i = p + 1,\ldots,r) \end{array} \right \} \Rightarrow F(y) \geq F(y'). } $$
(9.40)

A typical problem of this class is the linear multiplicative program:

$$ \displaystyle{ (LMP)\qquad \qquad \min \left \{\prod _{i=1}^{r}\langle c^{i},x\rangle \vert \quad Ax = b,x \geq 0\right \} } $$
(9.41)

which corresponds to the case \( F(y) =\prod _{i=1}^{r}y_{i}. \) Aside from (LMP), a wide variety of other problems can be cast in the form (9.39), as shown by the many examples in Sect. 7.1. In spite of that, it turns out that solving (9.39) with different functions F(y) reduces to comparing the objective function values on a finite set of extreme points of D dependent upon the vectors c 1, …, c r but not on the specific form of F(y).

Due to their relevance to applications in various fields (multiobjective programming, bond portfolio optimization, VLSI design, etc.), the above class of problems, including (LMP) and its extensions, has been the subject of quite a bit of research. Parametric methods for solving (LMP) date back to Swarup (1966a), Swarup (1966b), Swarup (1966c), Forgo (1975), and Gabasov and Kirillova (1980). However, intensive development in the framework of global optimization began only with the works of Konno and Kuno (1990, 1992). The basic idea of the parametric method can be described as follows. By Corollary 9.1 one can solve (LMP) by solving the problem

$$ \displaystyle{ \min \{f(x^{\lambda,\alpha })\vert \ \ \lambda \in \varLambda,\alpha \in W\}, } $$
(9.42)

where, in the notation in Corollary 9.1, x λ, α is an arbitrary optimal solution of the linear program

$$ \displaystyle{LP(\lambda,\alpha )\quad \left \vert \begin{array}{ll} \mathrm{min}&\sum _{i=1}^{p}\lambda _{i}\langle c^{i},x\rangle \\ \mathrm{s.t.} &x \in D,\ \langle c^{j},x\rangle =\alpha _{j},\ j = p + 1,\ldots,r. \end{array} \right.} $$

For solving (9.42) the parametric method exploits the property of LP(λ, α) that the parameter space is partitioned in finitely many polyhedrons over each of which the minimum of the function φ(λ, α) = f(x λ, α) can be computed relatively easily. Specifically, by writing the constraints in this linear program in the form

$$ \displaystyle{ Ax = b + S\alpha,\quad x \geq 0, } $$
(9.43)

where S is a suitable diagonal matrix, denote the collection of all basis matrices of this system by \( \mathcal{B}. \)

Lemma 9.3

Each basis B of the system (9.43) determines a set Π B ×Δ B ⊂Λ× W which is a polyhedron whenever nonempty, and an affine map \( x^{B}:\varDelta _{B} \rightarrow \mathbb{R}^{n}, \) such that x B (α) is a basic optimal solution of LP(λ,α) for all (λ,α) ∈Π B ×Δ B . The collection of polyhedrons Π B ×Δ B corresponding to all bases \( B \in \mathcal{B} \) covers all (λ,α) such that LP(λ,α) has an optimal solution.

Proof

Let B be any basis of this system. If \( A = (B,N),x = \left (\begin{array}{l} x_{B} \\ x_{N} \end{array} \right ), \) so that Bx B + Nx N  = b + S α, then

$$ \displaystyle{ x_{B} = B^{-1}(b + S\alpha ) - B^{-1}Nx_{ N} } $$
(9.44)

and the associated basic solution of (9.43) is

$$ \displaystyle{ x_{B} = B^{-1}(b + S\alpha ),\ x_{ N} = 0. } $$
(9.45)

This basic solution is feasible for all α ∈ W such that B −1(b + S α) ≥ 0 (nonnegativity condition) and is dual feasible for all λ ∈ Λ such that \( \sum _{i=1}^{p}\lambda _{i}\big[(c_{N}^{i})^{T} - (c_{B}^{i})^{T}B^{-1}N\big] \geq 0 \) (optimality criterion), where \( c^{i} = \left (\begin{array}{l} c_{B}^{i} \\ c_{N}^{i} \end{array} \right ). \) If we denote the vector (9.45) by x B(α) and define the polyhedrons

$$ \displaystyle\begin{array}{rcl} \varPi _{B}& =& \left \{\lambda \in \varLambda \vert \ \sum _{i=1}^{p}\lambda _{ i}\big[(c_{N}^{i})^{T} - (c_{ B}^{i})^{T}B^{-1}N\big] \geq 0\right \} {}\\ \varDelta _{B}& =& \{\alpha \in W\vert \ B^{-1}(b + S\alpha ) \geq 0\} {}\\ \end{array} $$

then whenever both Π B and Δ B are nonempty the affine map α ∈ Δ B ↦ x B(α) ∈ \( \mathbb{R}^{n} \) is such that x B(α) is a basic optimal solution of LP(λ, α) for all λ ∈ Π B ,  α ∈ Δ B . The last assertion of the Lemma is obvious from the fact that if LP(λ, α) has a basic optimal solution, then (λ, α) ∈ Π B ×Δ B for the corresponding basis B.  □ 

Each polyhedron Π B ×Δ B is called a cell. Thus, by taking x λ, α = x B(α) for every (λ, α) ∈ Π B ×Δ B , the function φ(λ, α) = f(x λ, α) is constant in λ ∈ Π B for every fixed α ∈ Δ B and quasiconcave in α ∈ Δ B for every fixed λ ∈ Π B . Hence its minimum over the cell Π B ×Δ B is equal to the minimum of f(x B(α)) over Δ B and is achieved at a vertex of the polytope Δ B . Therefore, if the collection of cells can be effectively computed then the problem (9.42) can be solved through scanning the vertices of the cells. Unfortunately, however, when r > 3 the cells are very complicated and there is in general no practical method for computing these cells, except in rare cases where additional structure (like network constraints, see Sect. 9.8) may simplify this computation. Below we briefly examine the method for problems with r ≤ 3. For more detail the reader is referred to the cited papers of Konno and Kuno.

5.1 Parametric Objective Simplex Method

When p = r = 2 the problem (9.39) is

$$ \displaystyle{ \min \{F(\langle c^{1},x\rangle,\langle c^{2},x\rangle )\vert \ \ x \in D\} } $$
(9.46)

where D is a polyhedron contained in the set {x |  〈c 1, x〉 ≥ 0, 〈c 2, x〉 ≥ 0}, and F(y) is a quasiconcave function on the orthant \( \mathbb{R}_{+}^{2} \) such that

$$ \displaystyle{ y,y' \in \mathbb{R}_{+}^{2},y \geq y' \Rightarrow F(y) \geq F(y'). } $$
(9.47)

As seen earlier, aside from the special case when F(y) = y 1 y 2 (linear multiplicative program), the class (9.46) includes problems with quite different objective functions, such as

$$ \displaystyle{(\langle c^{1},x\rangle )^{q_{1} }(\langle c^{2},x\rangle )^{q_{2} };\ \ -q_{1}e^{\langle c^{1},x\rangle } - q_{2}e^{\langle c^{2},x\rangle }\ (q_{1} > 0,q_{2} > 0)} $$

corresponding to different choices of F(y): 

$$ \displaystyle{y_{1}^{q_{1} }y_{2}^{q_{2} };\ -q_{1}e^{y_{1} } - q_{2}e^{y_{2} }.} $$

By Corollary 9.1 (with p = r = 2) the parametric linear program associated with (9.46) is

$$ \displaystyle{ LP(\lambda )\qquad \qquad \min \{\langle c^{1} +\lambda c^{2},x\rangle \vert \ \ x \in D\},\quad \lambda \geq 0. } $$
(9.48)

Since the parameter domain Λ is the nonnegative real line, the cells of this parametric program are intervals and can easily be determined by the standard parametric objective simplex method (see, e.g., Murty 1983) which consists in the following.

Suppose a basis B k is already available such that the corresponding basic solution x k is optimal to LP(λ) for all λ ∈ Π k : = [λ k−1, λ k ] but not for λ outside this interval. The latter means that if λ is slightly greater than λ k then at least one of the inequalities in the optimality criterion

$$ \displaystyle{(c^{1} +\lambda c^{2})_{ N_{k}} - (c^{1} +\lambda c^{2})_{ B_{k}}B_{k}^{-1}N_{ k} \geq 0} $$

becomes violated. Therefore, by performing a number of simplex pivot steps we will pass to a new basis B k+1 with a basic solution x k+1 which will be optimal to LP(λ) for all λ in a new interval Π k+1 ⊂ [λ k , +∞). In this way, starting from the value λ 0 = 0 one can determine the interval Π 1 = [λ 0, λ 1] then pass to the next Π 2 = [λ 1, λ 2], and so on, until the whole halfline is covered. Note that the optimum objective value function ξ(λ) in the linear program LP(λ) is a concave piecewise affine function of λ with Π k as linearity intervals. The points λ 0 = 0, λ 1, …, λ l are the breakpoints of ξ(λ). An optimal solution to (9.46) is then x k∗ where

$$ \displaystyle{ k{\ast}\in \mathrm{argmin}\{F(\langle c^{1},x^{k}\rangle,\langle c^{2},x^{k}\rangle )\vert \ k = 1,2,\ldots,l\}. } $$
(9.49)
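A simplified computational sketch of this scheme, in which the exact pivot-based enumeration of the breakpoints λ_0, λ_1, …, λ_l is replaced by a grid of λ values (so the result is only as good as the grid); D = {x | Ax ≤ b, x ≥ 0}, the helper name, and the data are assumptions of the sketch:

```python
import numpy as np
from scipy.optimize import linprog

def parametric_objective_scan(F, c1, c2, A, b, lambdas):
    """Approximate version of the parametric objective method for (9.46):
    for each lambda solve LP(lambda) = min <c1 + lambda*c2, x> over D and keep
    the candidate x minimizing F(<c1,x>, <c2,x>)."""
    n = len(c1)
    best_val, best_x = np.inf, None
    for lam in lambdas:
        res = linprog(c1 + lam * c2, A_ub=A, b_ub=b,
                      bounds=[(0, None)] * n, method="highs")
        if res.status != 0:
            continue
        x = res.x
        val = F(c1 @ x, c2 @ x)
        if val < best_val:
            best_val, best_x = val, x
    return best_val, best_x

# example: the linear multiplicative objective F(y1, y2) = y1 * y2
A = np.array([[-1.0, -1.0], [1.0, 2.0]]); b = np.array([-1.0, 4.0])
c1, c2 = np.array([1.0, 1.0]), np.array([3.0, 1.0])
print(parametric_objective_scan(lambda y1, y2: y1 * y2, c1, c2, A, b,
                                lambdas=np.linspace(0.0, 10.0, 101)))
```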

Remark 9.4

Condition (9.47) is a variant of (9.40) for p = r = 2. Condition (9.40) with p = 1, r = 2 always holds since it simply means F(y) = F(y′) whenever y = y′. Therefore, without assuming (9.47), Corollary 9.1 and hence the above method still applies. However, in this case, \( \varLambda = \mathbb{R}, \) i.e., the linear program LP(λ) should be considered for all λ ∈ (−∞, +∞). 

5.2 Parametric Right-Hand Side Method

Consider now problem (9.39) under somewhat weaker assumptions than previously, namely: F(y) is not required to be quasiconcave, while condition (9.47) is replaced by the following weaker one:

$$ \displaystyle{ y_{1} \geq y'_{1},\ y_{2} = y'_{2}\quad \Rightarrow \quad F(y) \geq F(y'). } $$
(9.50)

The latter condition implies that for fixed y 2 the univariate function F(. , y 2) is monotonic on the real line. It is then not hard to verify that Corollary 9.1 still holds for this case. In fact this is also easy to prove directly.

Lemma 9.4

Let \( \underline{\alpha }=\min \{\langle c^{2},x\rangle \vert \ x \in D\},\ \overline{\alpha } =\max \{\langle c^{2},x\rangle \vert \ x \in D\} \) and for every \( \alpha \in [\underline{\alpha },\overline{\alpha }] \) let x α be a basic optimal solution of the linear program

$$ \displaystyle{LP(\alpha )\qquad \qquad \min \{\langle c^{1},x\rangle \vert \ \ x \in D,\ \langle c^{2},x\rangle =\alpha \} \hspace{71.13188pt}.} $$

Then

$$ \displaystyle{\min \{F(\langle c^{1},x\rangle,\langle c^{2},x\rangle )\vert \ \ x \in D\} =\min \{ F(\langle c^{1},x^{\alpha }\rangle,\langle c^{2},x^{\alpha }\rangle )\vert \ \ \underline{\alpha }\leq \alpha \leq \overline{\alpha }\}.} $$

Proof

For every feasible solution x of LP(α) we have 〈c 1, xx α〉 ≥ 0 (because x α is optimal to LP(α)) and 〈c 2, xx α〉 = 0 (because x α is feasible to LP(α)). Hence, from (9.50) F(〈c 1, x〉, 〈c 2, x〉) ≥ F(〈c 1, x α〉, 〈c 2, x α〉) and the conclusion follows. □ 

Solving (9.39) thus reduces to solving LP(α) parametrically in α. This can be done by the following standard parametric right-hand side method (see, e.g., Murty 1983). Assuming that the constraints in LP(α) have been rewritten in the form

$$ \displaystyle{Ax = b +\alpha b_{0},\ x \geq 0,} $$

let B k be a basis of this system satisfying the dual feasibility condition

$$ \displaystyle{c_{N_{k}}^{1} - c_{ B_{k}}^{1}B_{ k}^{-1}N_{ k} \geq 0.} $$

Then the basic solution defined by this basis, i.e., the vector x k = u k +α v k such that \( x_{B_{k}}^{k} = B_{k}^{-1}(b +\alpha b_{0}),x_{N_{k}}^{k} = 0 \) [see (9.45)], is dual feasible. This basic solution x k is optimal to LP(α) if and only if it is feasible, i.e., if and only if it satisfies the nonnegativity condition

$$ \displaystyle{B_{k}^{-1}(b +\alpha b_{ 0}) \geq 0.} $$

Let Δ k : = [α k−1, α k ] be the interval formed by all values α satisfying the above inequalities. Then x k is optimal to LP(α) for all α in this interval, but for any value α slightly greater than α k at least one of the above inequalities becomes violated, i.e., at least one of the components of x k becomes negative. By performing then a number of dual simplex pivots, we can pass to a new basis B k+1 with a basic solution x k+1 which will be optimal for all α in some interval Δ k+1 = [α k , α k+1]. Thus, starting from the value \( \alpha _{0} =\underline{\alpha }, \) one can determine an initial interval \( \varDelta _{1} = [\underline{\alpha },\alpha _{1}], \) then pass to the next interval Δ 2 = [α 1, α 2], and so on, until the whole interval \( [\underline{\alpha },\overline{\alpha }] \) is covered. The generated points \( \alpha _{0} =\underline{\alpha },\alpha _{1},\ldots,\alpha _{h} = \overline{\alpha } \) are the breakpoints of the optimum objective value function η(α) in LP(α) which is a convex piecewise affine function. By Lemma 9.4 an optimal solution to (9.39) is then x k∗ where

$$ \displaystyle{ k^{{\ast}}\in \mathrm{argmin}\{F(\langle c^{1},x^{k}\rangle,\langle c^{2},x^{k}\rangle )\vert \ k = 1,2,\ldots,h\}. } $$
(9.51)

Remark 9.5

In both the objective and the right-hand side methods the sequence of points x k on which the objective function values have to be compared depends upon the vectors c 1, c 2 but not on the specific function F(y). Computational experiments (Konno and Kuno 1992; see also Tuy and Tam 1992) indicate that solving a problem (9.39) requires no more effort than solving just a few linear programs of the same size. Two particular problems of this class are worth mentioning:

$$ \displaystyle\begin{array}{rcl} & \min \{c^{1}x.c^{2}x\vert \quad x \in D\}&{}\end{array} $$
(9.52)
$$ \displaystyle\begin{array}{rcl} & \min \{-x_{1}^{2} + c_{0}x\vert \quad x \in D\},&{}\end{array} $$
(9.53)

where D is a polyhedron [contained in the domain c 1 x > 0, c 2 x > 0 for (9.52)]. These correspond to functions F(y) = y 1 y 2 and F(y) = −y 1 2 + y 2, respectively and have been shown to be NP-hard (Pardalos and Vavasis 1991; Matsui 1996), although, as we saw above, they are efficiently solved by the parametric method.

5.3 Parametric Combined Method

When r = 3 the parametric linear program associated to (9.39) is difficult to handle because the parameter domain is generally two-dimensional. In some particular cases, however, a problem (9.39) with r = 3 may be successfully tackled by the parametric method. As an example consider the problem

$$ \displaystyle{ \min \{c^{0}x + c^{1}x.c^{2}x\vert \quad x \in D\} } $$
(9.54)

which may not belong to the class (QCM) because we cannot prove the quasiconcavity of the function F(y) = y 1 + y 2 y 3. It is easily seen, however, that this problem is equivalent to

$$ \displaystyle{ \min \{\langle c^{0} +\alpha c^{1},x\rangle \vert \quad x \in D,\ \langle c^{2},x\rangle =\alpha,\ \ \underline{\alpha }\leq \alpha \leq \overline{\alpha }\} } $$
(9.55)

where \( \underline{\alpha }=\min _{x\in D}\langle c^{2},x\rangle,\ \overline{\alpha } =\max _{x\in D}\langle c^{2},x\rangle. \) By writing the constraints x ∈ D, 〈c 2, x〉 = α in the form

$$ \displaystyle{ Ax = b +\alpha b_{0}. } $$
(9.56)

we can prove the following:

Lemma 9.5

Each basis B of the system (9.56) determines an interval \( \varDelta _{B} \subset [\underline{\alpha },\overline{\alpha }] \) and an affine map \( x^{B}:\varDelta _{B} \rightarrow \mathbb{R}^{n} \) such that x B (α) is a basic optimal solution of the problem

$$ \displaystyle{P(\alpha )\qquad \qquad \quad \min \{\langle c^{0} +\alpha c^{1},x\rangle \vert \quad x \in D,\ \langle c^{2},x\rangle =\alpha \}} $$

for all α ∈ Δ B . The collection of all intervals Δ B corresponding to all bases B such that Δ B ≠ ∅ covers all α for which P(α) has an optimal solution.

Proof

Let B be a basis, \( A = (B,N),x = \left (\begin{array}{c} x_{B} \\ x_{N} \end{array} \right ), \) so that the associated basic solution is

$$ \displaystyle{ x_{B} = B^{-1}(b +\alpha b_{ 0}),\quad x_{N} = 0. } $$
(9.57)

This basic solution is feasible for all \( \alpha \in [\underline{\alpha },\overline{\alpha }] \) satisfying the nonnegativity condition

$$ \displaystyle{ B^{-1}(b +\alpha b_{ 0}) \geq 0. } $$
(9.58)

and is dual feasible for all α satisfying the dual feasibility condition (optimality criterion)

$$ \displaystyle{ (c^{0} +\alpha c^{1})_{ N} - (c^{0} +\alpha c^{1})_{ B}B^{-1}N \geq 0. } $$
(9.59)

Conditions (9.58) and (9.59) each determine an interval. If the intersection Δ B of the two intervals is nonempty, then the basic solution (9.57) is optimal for all α ∈ Δ B . The rest is obvious. □ 

Based on this Lemma, one can generate all the intervals Δ B corresponding to different bases B by a procedure combining primal simplex pivots with dual simplex pivots (see, e.g., Murty 1983). Specifically, for any given basis B k , denote by u k +α v k the vector (9.57) which is the basic solution of P(α) corresponding to this basis, and by Δ k  = [α k−1, α k ] the interval of optimality of this basic solution. When α is slightly greater than α k either the nonnegativity condition [see (9.58)] or the dual feasibility condition [see (9.59)] becomes violated. In the latter case we perform a primal simplex pivot step, in the former case a dual simplex pivot step, and repeat as long as necessary, until a new basis B k+1 with a new interval Δ k+1 is obtained. Thus, starting from an initial interval \( \varDelta _{1} = [\underline{\alpha },\alpha _{1}] \) we can pass from one interval to the next, until we reach \( \overline{\alpha }. \) Let Δ k , k = 1,…,l, be the collection of all intervals so generated. For each interval Δ k let u k +α v k be the associated basic solution which is optimal to P(α) for all α ∈ Δ k and let x k = u k +ξ k v k, where ξ k is a minimizer of the quadratic function φ(α): = 〈c 0 +α c 1, u k +α v k〉 over Δ k . Then from (9.55) and by Lemma 9.5, an optimal solution of (9.54) is \( x^{k^{\ast}} \) where

$$ \displaystyle{k^{{\ast}}\in \mathrm{argmin}\{c^{0}x^{k} + (c^{1}x^{k})(c^{2}x^{k})\vert \ \ k = 1,2,\ldots,l\}.} $$

Note that the problem (9.54) is also NP-hard (Pardalos and Vavasis 1991).
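To illustrate the final selection step of the combined method, here is a minimal Python sketch: the pieces (u k ,v k ,Δ k ) are assumed to have already been produced by the parametric simplex sweep described above (nothing else is implied about that sweep), each quadratic φ(α) is minimized in closed form over its interval, and the resulting candidates are compared. All vectors are assumed to be NumPy arrays.

import numpy as np

def argmin_quadratic(a, b, lo, hi):
    # minimize a*t**2 + b*t (constant term irrelevant) over [lo, hi]
    candidates = [lo, hi]
    if a > 0:                              # convex case: interior stationary point may win
        t = -b / (2.0 * a)
        if lo <= t <= hi:
            candidates.append(t)
    return min(candidates, key=lambda t: a * t * t + b * t)

def combine_pieces(c0, c1, c2, pieces):
    # pieces: list of (u, v, lo, hi) with x(alpha) = u + alpha*v optimal for P(alpha) on [lo, hi]
    best = None
    for u, v, lo, hi in pieces:
        a = c1 @ v                         # coefficient of alpha**2 in phi(alpha)
        b = c0 @ v + c1 @ u                # coefficient of alpha
        xi = argmin_quadratic(a, b, lo, hi)
        x = u + xi * v
        val = c0 @ x + (c1 @ x) * (c2 @ x)   # objective of (9.54)
        if best is None or val < best[0]:
            best = (val, x)
    return best                            # (best candidate value, candidate point x)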

6 Convex Constraints

So far we have been mainly concerned with solving quasiconcave monotonic minimization problems under linear constraints. Consider now the problem

$$ \displaystyle{ (QCM)\qquad \qquad \min \{f(x)\vert \quad x \in D\}.\hspace{71.13188pt} } $$
(9.60)

formulated in Sect. 9.2 [see (9.8)], where the constraint set D is compact convex but not necessarily polyhedral. Recall that, by Proposition 9.1, with \( Qx = (c^{1}x,\ldots,c^{r}x)^{T}, \) there exists a continuous quasiconcave function \( F: \mathbb{R}^{r} \rightarrow \mathbb{R} \) on a closed convex set Y ⊃ Q(D) satisfying condition (9.40) and such that

$$ \displaystyle{ f(x) = F(\langle c^{1},x\rangle,\ldots,\langle c^{r},x\rangle ). } $$
(9.61)

It is easy to extend the branch and bound algorithms in Sect. 8.2 and the polyhedral annexation algorithm in Sect. 8.3 to (QCM) with a compact convex constraint set D, and in fact, to the following more general problem:

$$ \displaystyle{ (GQCM)\qquad \qquad \min \{F(g(x))\vert \quad x \in D\}\hspace{85.35826pt} } $$
(9.62)

where D and F(y) are as previously (with g(D) replacing Q(D)), while \( g = (g_{1},\ldots,g_{r})^{T}: \mathbb{R}^{n} \rightarrow \mathbb{R}^{r} \) is a map such that g i (x), i = 1, , p, are convex on a closed convex set X ⊃ D and g i (x) = 〈c i, x〉 + d i ,  i = p + 1, , r. 

For the sake of simplicity we shall focus on the case p = r, i.e., we shall assume F(y) ≥ F(y′) whenever y, y′ ∈ Y, y ≥ y′. This occurs, for instance, for the convex multiplicative programming problem

$$ \displaystyle{ \min \left \{\prod _{i=1}^{r}g_{ i}(x)\vert \ \ x \in D\right \} } $$
(9.63)

where \( F(y) =\prod _{i=1}^{r}y_{i}, \) \( D \subset \{ x\vert \ g_{i}(x) \geq 0,\ i = 1,\ldots,r\}. \) Clearly problem (GQCM) can be rewritten as

$$ \displaystyle{ \min \{F(y)\vert \ \ g(x) \leq y,x \in D\} } $$
(9.64)

which is to minimize the quasiconcave \( \mathbb{R}_{+}^{r} \)-monotonic function F(y) on the closed convex set

$$ \displaystyle{ E =\{ y\vert g(x) \leq y,x \in D\} = g(D) + \mathbb{R}_{+}^{r}. } $$
(9.65)

6.1 Branch and Bound

If the function F(y) is concave (and not just quasiconcave) then simplicial subdivision can be used. In this case, a lower bound for F(y) over the feasible points y in a simplex M = [u 1, , u r+1] can be obtained by solving the convex program

$$ \displaystyle{\min \{l_{M}(y)\vert \ \ y \in E \cap M\},} $$

where l M (y) is the affine function that agrees with F(y) at the vertices of M (this function satisfies \( l_{M}(y) \leq F(y)\ \forall y \in M \) in view of the concavity of F(y)). Using this lower bounding rule, a simplicial algorithm for (GQCM) can be formulated following the standard branch and bound scheme. It is also possible to combine branch and bound with outer approximation of the convex set E by a sequence of nested polyhedrons P k  ⊃ E. 
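For instance, the affine function l M (y) can be obtained from one small linear system, as in the following sketch (Python; the concrete F and simplex are hypothetical choices made only for illustration):

import numpy as np

def affine_interpolant(vertices, F):
    # return (a, b) with <a, u> + b = F(u) at every vertex u of the simplex;
    # by concavity of F this affine function underestimates F on the simplex
    V = np.asarray(vertices, dtype=float)              # (r+1) x r array of vertices
    rhs = np.array([F(u) for u in V])
    coeff = np.linalg.solve(np.hstack([V, np.ones((V.shape[0], 1))]), rhs)
    return coeff[:-1], coeff[-1]

# example: F(y) = sqrt(y1*y2), concave on the positive orthant, M = conv{(1,1),(3,1),(1,3)}
a, b = affine_interpolant([[1, 1], [3, 1], [1, 3]], lambda y: np.sqrt(y[0] * y[1]))
print(a, b)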

If the function F(y) is concave separable, i.e., \( F(y) =\sum _{j=1}^{r}F_{j}(y_{j}), \) where each F j (⋅) is a concave function of one variable, then rectangular subdivision can be more convenient. However, in the general case, when F(y) is quasiconcave but not concave, an affine minorant of a quasiconcave function over a simplex (or a rectangle) is in general not easy to compute. In that case, one should use conical rather than simplicial or rectangular subdivision.

6.2 Polyhedral Annexation

Since F(y) is monotonic with respect to the orthant \( \mathbb{R}_{+}^{r}, \) we can apply the PA Algorithm using Lemma 9.2 for constructing the initial polytope P 1. 

Let \( \overline{y} \in E \) be an initial feasible point of (9.64) and \( C =\{ y\vert \ F(y) \geq F(\overline{y})\}. \) By translating, we can assume that 0 ∈ E and 0 ∈ intC, i.e., \( F(0) > F(\overline{y}). \) Note that instead of a linear program like (9.25) we now have to solve a convex program of the form

$$ \displaystyle{ \max \{\langle t,y\rangle \vert \ \ y \in E\} =\max \{\langle t,g(x)\rangle \vert \ \ x \in D\} } $$
(9.66)

where \( t \in (\mathbb{R}_{+}^{r})^{\circ } = -\mathbb{R}_{+}^{r}. \) Therefore, to construct the initial polytope P 1 (see Lemma 9.2) we compute α > 0 such that \( F(\alpha w) = F(\overline{y}) \) (where \( w = (-1,\ldots,-1)^{T} \in \mathbb{R}^{r}) \) and take the initial polytope P 1 to be the r-simplex in \( \mathbb{R}^{r} \) with vertex set V 1 = { 0, −e 1α, , −e rα}. We can thus state:

PA Algorithm (for (GQCM))

Initialization. Let \( \overline{y} \) be a feasible solution (the best available), \( C =\{ y\vert \ F(y) \geq F(\overline{y})\}. \) Choose a point y 0 ∈ E such that \( F(y^{0}) > F(\overline{y}) \) and set E ← E − y 0,  C ← C − y 0. Let P 1 be the simplex in \( \mathbb{R}^{r} \) with vertex set V 1. Set k = 1. 

Step 1.:

For every new \( t = (t_{1},\ldots,t_{r})^{T} \in V _{k}\setminus \{0\} \) solve the convex program (9.66), obtaining the optimal value μ(t) and an optimal solution y(t). 

Step 2.:

Let t k ∈ argmax{μ(t) |  t ∈ V k }. If μ(t k) ≤ 1, then terminate: \( \overline{y} \) is an optimal solution of (GQCM).

Step 3.:

If μ(t k) > 1 and y k: = y(t k) ∉ C (i.e., \( F(y^{k}) < F(\overline{y})), \) then update the current best feasible solution and the set C by resetting \( \overline{y} = y^{k}. \)

Step 4.:

Compute

$$ \displaystyle{ \theta _{k} =\sup \{\theta \vert \ \ F(\theta y^{k}) \geq F(\overline{y})\} } $$
(9.67)

and define

$$ \displaystyle{P_{k+1} = P_{k} \cap \{ t\vert \ \langle t,y^{k}\rangle \leq \frac{1} {\theta _{k}}\}.} $$
Step 5.:

From V k derive the vertex set V k+1 of P k+1. Set k ← k + 1 and go back to Step 1.

To establish the convergence of this algorithm we need two lemmas.

Lemma 9.6

μ(t k ) ↘ 1 as k → +∞.

Proof

Observe that μ(t) = max{〈t, y〉 |  y ∈ g(D)} is a convex function and that y k ∈ ∂ μ(t k) because \( \mu (t) -\mu (t^{k}) \geq \langle t,y^{k}\rangle -\langle t^{k},y^{k}\rangle =\langle t - t^{k},y^{k}\rangle \ \ \forall t. \) Denote l k (t) = 〈t, y k〉 − 1. We have l k (t k) = 〈t k, y k〉 − 1 = μ(t k) − 1 > 0, and for \( t \in [g(D)]^{\circ } \): l k (t) = 〈t, y k〉 − 1 ≤ μ(t) − 1 ≤ 0 because \( y^{k} \in g(D) + \mathbb{R}_{+}^{r}. \) Since g(D) is compact, there exists \( t^{0} \in \mathrm{int}[g(D)]^{\circ } \) (Proposition 1.21), i.e., t 0 such that l k (t 0) ≤ μ(t 0) − 1 < 0, and \( s^{k} \in [t^{0},t^{k})\setminus \mathrm{int}[g(D)]^{\circ } \) such that l k (s k) = 0. By Theorem 6.1 applied to the set \( [g(D)]^{\circ }, \) the sequence {t k} and the cuts l k (t), we conclude that t k − s k → 0. Hence every accumulation point t ∗ of {t k} satisfies μ(t ∗ ) = 1, i.e., μ(t k) ↘ μ(t ∗ ) = 1.  □ 

Denote by C k the set C at iteration k. 

Lemma 9.7

\( C_{k}^{\circ }\subset P_{k} \) for every k.

Proof

The inclusion \( C_{1}^{\circ }\subset P_{1} \) follows from the construction of P 1. Arguing by induction on k, suppose that \( C_{k}^{\circ }\subset P_{k} \) for some k ≥ 1. Since θ k y k ∈ C k+1, for all \( t \in C_{k+1}^{\circ } \) we must have 〈t, θ k y k〉 ≤ 1, and since C k  ⊂ C k+1, i.e., \( C_{k+1}^{\circ }\subset C_{k}^{\circ }\subset P_{k}, \) it follows that t ∈ P k ∩{ t |  〈t, θ k y k〉 ≤ 1} = P k+1.  □ 

Proposition 9.5

Let \( \overline{y}^{k} \) be the incumbent at iteration k. Either the above algorithm terminates by an optimal solution of (GQCM) or it generates an infinite sequence \( \{\overline{y}^{k}\} \) every accumulation point of which is an optimal solution.

Proof

Let \( \overline{y} \) be any accumulation point of the sequence \( \{\overline{y}^{k}\}. \) Since by Lemma 9.6 max{μ(t) |  t ∈ P k } ↘ 1, it follows from Lemma 9.7 that \( \max \{\mu (t)\vert \ t \in \cap _{k=1}^{\infty }C_{k}^{\circ }\}\leq 1. \) This implies that \( \cap _{k=1}^{\infty }C_{k}^{\circ }\subset E^{\circ }, \) and hence \( E \subset \cup _{k=1}^{\infty }C_{k}. \) Thus, for any y ∈ E there is k such that y ∈ C k , i.e., \( F(y) \geq F(\overline{y}^{k}) \geq F(\overline{y}), \) proving the optimality of \( \overline{y}. \) □ 

Incidentally, the above PA Algorithm shows that

$$ \displaystyle{ \min \{F(y)\vert \ \ y \in E\} =\min \{ F(g(x(t)))\vert \ \ t \in \mathbb{R}_{-}^{r}\}, } $$
(9.68)

where x(t) is an arbitrary optimal solution of (9.66), i.e., of the convex program

$$ \displaystyle{\min \{\langle -t,g(x)\rangle \vert \ \ x \in D\}.} $$

This result could also be derived from Theorem 9.1.

Remark 9.6

All the convex programs (9.66) have the same constraints. This fact should be exploited for an efficient implementation of the above algorithm. Also, in practice, these programs are usually solved approximately, so μ(t) is only an ɛ-optimal value, i.e., μ(t) ≥ max{〈t, y〉 |  y ∈ E} −ɛ for some tolerance ɛ > 0. 

6.3 Reduction to Quasiconcave Minimization

An important class of problems (GQCM) is constituted by problems of the form:

$$ \displaystyle{ \mathrm{minimize}\sum _{i=1}^{p}\prod _{ j=1}^{q_{i} }g_{ij}(x)\quad \mathrm{s.t}\quad x \in D, } $$
(9.69)

where D is a compact convex set and \( g_{ij}: D \rightarrow \mathbb{R}_{++},i = 1,\ldots,p,j = 1,\ldots,q_{i}, \) are continuous convex positive-valued functions on D. This class includes generalized convex multiplicative programming problems (Konno et al. 1994; Konno and Kuno 1995) which can be formulated as:

$$ \displaystyle{ \min \left \{g(x) +\prod _{ j=1}^{q}g_{ j}(x)\vert \ \ x \in D\right \}. } $$
(9.70)

or, alternatively as

$$ \displaystyle{ \min \left \{g(x) +\sum _{ i=1}^{p}g_{ i}(x)h_{i}(x)\vert \ \ x \in D\right \}, } $$
(9.71)

where all functions g(x), g i (x), h i (x) are convex positive-valued on D. Another special case worth mentioning is the problem of minimizing the scalar product of two vectors:

$$ \displaystyle{ \min \left \{\sum _{i=1}^{n}x_{ i}y_{i}\vert \ \ (x,y) \in D \subset \mathbb{R}_{++}^{n}\right \}. } $$
(9.72)

Let us show that any problem (9.69) can be reduced to concave minimization under convex constraints. For this we rewrite (9.69) as

$$ \displaystyle{ \begin{array}{ll} \mathrm{minimize}&\sum _{i=1}^{p}(\prod _{j=1}^{q_{i}}y_{ij})^{1/q_{i}} \\ \mathrm{s.t.} &[g_{ij}(x)]^{q_{i}} \leq y_{ij},\ i = 1,\ldots,p;j = 1,\ldots,q_{i} \\ \mbox{ } &y_{ij} \geq 0,\quad x \in D.\end{array} } $$
(9.73)

Since the functions \( \varphi _{j}(y) = y_{ij},\ j = 1,\ldots,q_{i}, \) are linear, it follows from Proposition 2.7 that each term \( (\prod _{j=1}^{q_{i}}y_{ij})^{1/q_{i}}, \) i = 1,…,p, is a concave function of \( y = (y_{ij}) \in \mathbb{R}^{q_{1}+\cdots +q_{p}}, \) hence their sum, i.e., the objective function F(y) in (9.73), is also a concave function of y. Furthermore, y ≥ y′ obviously implies F(y) ≥ F(y′), so F(y) is monotonic with respect to the cone {y | y ≥ 0}. Finally, since every function g ij (x) is convex and positive-valued, so is \( [g_{ij}(x)]^{q_{i}}. \) Consequently, (9.73) is a concave programming problem. If \( q = q_{1} +\cdots +q_{p} \) is relatively small this problem can be solved by methods discussed in Chaps. 5 and 6.

An alternative way to convert (9.69) into a concave program is to use the following relation already established in the proof of Proposition 2.7 [see (2.6)]:

$$ \displaystyle{\prod _{j=1}^{q_{i} }g_{ij}(x) = \frac{1} {q_{i}}\min _{\xi ^{i}\in T_{i}}\sum _{j=1}^{q_{i} }\xi _{ij}g_{ij}^{q_{i} }(x),} $$

with \( \xi ^{i} = (\xi _{i1},\ldots,\xi _{iq_{i}}),T_{i} =\{\xi ^{i}:\ \prod _{ j=1}^{q_{i}}\xi _{ij} \geq 1\}. \) Hence, setting \( \varphi _{i}(\xi ^{i},x) = \frac{1} {q_{i}}\sum _{j=1}^{q_{i} }\xi _{ij}g_{ij}^{q_{i} }(x), \) the problem (9.69) is equivalent to

$$ \displaystyle\begin{array}{rcl} & & \min _{x\in D}\sum _{i=1}^{p}\min _{ \xi ^{i}\in T_{i}}\varphi _{i}(\xi ^{i},x) \\ & & \quad =\min _{x\in D}\min \{\sum _{i=1}^{p}\varphi _{ i}(\xi ^{i},x):\ \xi ^{i} \in T_{ i},i = 1,\ldots,p\} \\ & & \quad =\min \{\min _{x\in D}\sum _{i=1}^{p}\varphi _{ i}(\xi ^{i},x):\ \xi ^{i} \in T_{ i},i = 1,\ldots,p\} \\ & & \quad =\min \{\varphi (\xi ^{1},\ldots,\xi ^{p})\vert \ \xi ^{i} \in T_{ i},i = 1,\ldots,p\} {}\end{array} $$
(9.74)

where

$$ \displaystyle{\varphi (\xi ^{1},\ldots,\xi ^{p}) =\min _{ x\in D}\sum _{i=1}^{p}\varphi _{ i}(\xi ^{i},x) =\min \left \{\sum _{ i=1}^{p}\sum _{ j=1}^{q_{i} } \frac{1} {q_{i}}\xi _{ij}g_{ij}^{q_{i} }(x)\vert \ x \in D\right \}.} $$

Since each function \( g_{ij}^{q_{i}}(x) \) is convex, φ(ξ 1, , ξ p) is the optimal value of a convex program. Furthermore, the objective function of this convex program is a linear function of (ξ 1, , ξ p) for fixed x. Therefore, φ(ξ 1, , ξ p) is a concave function (pointwise minimum of a family of linear functions). Finally, each T i , i = 1, , p, is a closed convex set in \( \mathbb{R}^{q_{i}} \) because

$$ \displaystyle{T_{i} = \left \{\xi ^{i}\vert \ \sum _{ j=1}^{q_{i} }\log \xi _{ij} \geq 0\right \}.} $$

Thus, problem (9.69) is equivalent to (9.74), which seeks to minimize the concave function φ(ξ 1,…,ξ p) over the convex set \( \prod _{i=1}^{p}T_{i} = T_{1} \times \cdots \times T_{p}. \) 
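Evaluating φ at a given ξ thus amounts to solving one convex program, as in the following sketch (Python with SciPy; the particular g ij , the box D, and the starting point are hypothetical choices made only for illustration):

import numpy as np
from scipy.optimize import minimize

def phi(xi, g, q, x0, bounds):
    # xi[i]: weights of length q[i]; g[i][j]: convex positive functions on D (here a box)
    def obj(x):
        return sum((1.0 / q[i]) * sum(xi[i][j] * g[i][j](x) ** q[i] for j in range(q[i]))
                   for i in range(len(q)))
    res = minimize(obj, x0, bounds=bounds)   # the objective is convex in x for xi >= 0
    return res.fun

# example with p = 1, q_1 = 2, g_11(x) = x1 + 1, g_12(x) = x2 + 1, D = [0, 2]^2:
g = [[lambda x: x[0] + 1.0, lambda x: x[1] + 1.0]]
print(phi([np.array([1.0, 1.0])], g, [2], x0=np.array([1.0, 1.0]),
          bounds=[(0.0, 2.0), (0.0, 2.0)]))     # = 1.0, attained at x = (0, 0)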

Example 9.9

As an illustration, consider the problem of determining a rectangle of minimal area enclosing the projection of a given compact convex set \( D \subset \mathbb{R}^{n}(n > 2) \) onto the plane \( \mathbb{R}^{2}. \) This problem arises from applications in packing and optimal layout (Gale 1981; Haims and Freeman 1970; Maling et al. 1982), especially when two-dimensional layouts are restricted by n − 2 factors. When the vertices of D are known, there exists an efficient algorithm based on computational geometry (Bentley and Shamos 1978; Graham 1972; Toussaint 1978). In the general case, however, the problem is more complicated.

Assume that D has full dimension in \( \mathbb{R}^{n} = \mathbb{R}^{2} \times \mathbb{R}^{n-2}. \) The projection of D onto \( \mathbb{R}^{2} \) is the compact convex set \( \mathrm{pr}D =\{ y \in \mathbb{R}^{2}\vert \ \exists z \in \mathbb{R}^{n-2},\ (y,z) \in D\}. \) For any vector \( x \in \mathbb{R}^{2} \) such that ∥ x ∥  = 1 define

$$ \displaystyle\begin{array}{rcl} g_{1}(x)& =& \max \{\langle x,y\rangle \vert \ (y,z) \in D\} {}\\ g_{2}(x)& =& \min \{\langle x,y\rangle \vert \ (y,z) \in D\}. {}\\ \end{array} $$

Then the width of prD in the direction x is

$$ \displaystyle{g(x) = g_{1}(x) - g_{2}(x).} $$

On the other hand, since Hx = (−x 2, x 1)T is the vector orthogonal to x such that ∥ H(x) ∥  = 1 if ∥ x ∥  = 1, the width of prD in the direction orthogonal to x is g(Hx) = g(−x 2, x 1). Hence the product g(x). g(Hx) measures the area of the smallest rectangle containing prD and having a side parallel to x. The problem can then be formulated as

$$ \displaystyle{ \min \{g(x).g(Hx)\vert \ x \in \mathbb{R}_{+}^{2},\ \|x\| = 1\}. } $$
(9.75)

Lemma 9.8

The function g(x) is convex and satisfies g(αx) = αg(x) for any α ≥ 0.

Proof

Clearly g 1(x) is convex as the pointwise maximum of a family of affine functions, while g 2(x) is concave as the pointwise minimum of a family of affine functions. It is also obvious that g i (α x) = α g i (x), i = 1, 2. Hence, the conclusion. □ 

Since \( H: \mathbb{R}^{2} \rightarrow \mathbb{R}^{2} \) is a linear map, it follows that g(Hx) is also convex and that g(H(α x)) = α g(Hx) for every α ≥ 0. Exploiting this property, a successive underestimation method for solving problem (9.75) has been proposed in Kuno (1993). In view of the above established property, problem (9.75) is actually equivalent to the convex multiplicative program

$$ \displaystyle{ \min \{g(x) \cdot g(Hx)\vert \quad x \in \mathbb{R}_{+}^{2},\|x\| \leq 1\}, } $$
(9.76)

and so can be transformed into the concave minimization problem

$$ \displaystyle{\begin{array}{ll} \min &\sqrt{t_{1 } t_{2}} \\ \mathrm{s.t.}&g(x) \leq t_{1},\ g(Hx) \leq t_{2} \\ \mbox{ } & x \in \mathbb{R}_{+}^{2},\ \|x\| \leq 1,\end{array} } $$

or, alternatively,

$$ \displaystyle{ \min \{\varphi (\xi _{1},\xi _{2})\vert \ \xi _{1}\xi _{2} \geq 1,\xi \in \mathbb{R}_{+}^{2}\}. } $$
(9.77)

where

$$ \displaystyle{ \varphi (\xi _{1},\xi _{2}) =\min \{\xi _{1}g(x) +\xi _{2}g(Hx)\vert \ x \geq 0,\|x\| \leq 1\}. } $$
(9.78)

In the special case when D is a polytope given by a system of linear inequalities Ay + Bz ≤ b the problem can be solved by an efficient parametric method (Kuno 1993). For any point \( x \in \mathbb{R}_{+}^{2} \) with ∥ x ∥  = 1, let σ(x) = (λ, 1 −λ) be the point where the halfline from 0 through x intersects the line segment joining (1, 0) and (0, 1). Since obviously \( \|\sigma (x)\|^{2} =\lambda ^{2} + (1-\lambda )^{2}, \) using Lemma 9.8 problem (9.76) becomes

$$ \displaystyle{ \min \{F(\lambda )\vert \ \lambda \in [0,1]\}, } $$
(9.79)

where

$$ \displaystyle{F(\lambda ) = \frac{g(\lambda,1-\lambda ).g(\lambda -1,\lambda )} {\lambda ^{2} + (1-\lambda )^{2}}.} $$

Furthermore, g(λ, 1 −λ) = g 1(λ, 1 −λ) − g 2(λ, 1 −λ) with

$$ \displaystyle\begin{array}{rcl} & g_{1}(\lambda,1-\lambda ) =\max \{\lambda y_{1} + (1-\lambda )y_{2}\vert \ Ay + Bz \leq b\}& {}\\ & g_{2}(\lambda,1-\lambda ) =\min \{\lambda y_{1} + (1-\lambda )y_{2}\vert \ Ay + Bz \leq b\}.& {}\\ \end{array} $$

So each g i (λ, 1 −λ), i = 1, 2 is the optimal value of a linear parametric program. Let \( 0,\lambda _{1}^{1i},\ldots,\lambda _{p_{1i}}^{1i},1 \) be the sequence of breakpoints of g i (λ, 1 −λ), i = 1, 2. Analogously, let \( 0,\lambda _{1}^{2i},\ldots,\lambda _{p_{2i}}^{2i},1 \) be the sequence of breakpoints of g i (λ − 1, λ). Finally let

$$ \displaystyle{ 0 =\lambda _{0},\lambda _{1},\lambda _{2},\ldots,\lambda _{N-1},1 =\lambda _{N} } $$
(9.80)

be the union of all these four sequences rearranged in the increasing order.

Proposition 9.6

The function F(λ) attains its global minimum over [0,1] at one of the points λ s ,s = 0,…,N.

Proof

For any interval [λ s , λ s+1], there exist two vectors \( u = x^{1} - x^{2},\ v = x^{3} - x^{4}, \) where x i, i = 1,…,4, are vertices of D, such that for every \( x \in \mathbb{R}_{+}^{2} \) of length ∥ x ∥  = 1, with σ(x) = (λ, 1 −λ), λ ∈ (λ s , λ s+1), we have g(x) = 〈x, u〉,  g(Hx) = 〈Hx, v〉. Then

$$ \displaystyle{F(\lambda ) = (\|u\|\cos \alpha ) \times (\|v\|\cos \beta ),} $$

where α = angle(x, u) (angle between the vectors x and u) and β = angle(Hx, v). Noting that angle(x, u) + angle(u, v) + angle(v, Hx) = angle(x, Hx) = π∕2, we deduce

$$ \displaystyle{ \beta =\alpha +\omega -\pi /2, } $$
(9.81)

where ω = angle(u, v). Thus, for all λ ∈ (λ s , λ s+1):

$$ \displaystyle{F(\lambda ) = (\|u\|\cos \alpha )(\|v\|\sin (\alpha +\omega )).} $$

Setting G(α) = cosαsin(α +ω), we have G′(α) = −sinαsin(α +ω) + cosαcos(α +ω) = cos(2α +ω). If F(. ) attains a minimum at λ, i.e., G(. ) attains a minimum at α, then one must have

$$ \displaystyle{ \cos (2\alpha +\omega ) = 0,\ \mathrm{hence}\ 2\alpha +\omega =\pi /2 \pm \pi. } $$
(9.82)

But from (9.81) it follows that α +ω = β +π∕2, hence 2α +ω = α +β +π∕2. In view of (9.82) this in turn implies that

$$ \displaystyle{\alpha +\beta = \pm \pi,} $$

contradicting the fact that 0 <  | α |  < π∕2,  0 <  | β |  < π∕2 (because 〈x, u〉 > 0,  〈Hx, v〉 > 0). Therefore, the minimum of F(λ) over an interval [λ s , λ s+1] cannot be attained at any λ such that λ s  < λ < λ s+1, but must be attained at either λ s or λ s+1. Hence the global minimum over the entire segment [0, 1] is attained at one point of the sequence λ 0, λ 1, , λ N .  □ 

Corollary 9.3

An optimal solution of (9.75) is

$$ \displaystyle{x^{{\ast}} = \frac{(\lambda _{{\ast}},1 -\lambda _{{\ast}})} {\sqrt{\lambda _{{\ast} }^{2 } + (1 -\lambda _{{\ast} } )^{2}}},\quad \lambda _{{\ast}}\in \mathrm{argmin}\{F(\lambda _{s}):\ s = 0,1,\ldots,N\}.} $$

Proof

Indeed, \( x^{\ast} =\sigma (x^{\ast})/\|\sigma (x^{\ast})\| \) and \( \sigma (x^{\ast}) = (\lambda _{\ast},1 -\lambda _{\ast}). \) 
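When the vertices of prD are available, the whole computation is elementary, as in the following sketch (Python; a uniform grid over [0, 1] is used here as a simple stand-in for the exact breakpoints λ 0,…,λ N of Proposition 9.6, so the result is only approximate):

import numpy as np

def min_area_rectangle(vertices, grid=1001):
    V = np.asarray(vertices, dtype=float)              # vertices of pr D, one per row
    def g(x):                                          # width of pr D in direction x
        proj = V @ np.asarray(x, dtype=float)
        return proj.max() - proj.min()
    lams = np.linspace(0.0, 1.0, grid)
    vals = [g((l, 1.0 - l)) * g((l - 1.0, l)) / (l ** 2 + (1.0 - l) ** 2) for l in lams]
    k = int(np.argmin(vals))
    lam = lams[k]
    x_star = np.array([lam, 1.0 - lam]) / np.hypot(lam, 1.0 - lam)
    return vals[k], x_star                             # minimal area and optimal direction

# unit square: the minimal enclosing rectangle has area 1
print(min_area_rectangle([[0, 0], [1, 0], [1, 1], [0, 1]]))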

7 Reverse Convex Constraints

In the previous sections we were concerned with low rank nonconvex problems with nonconvex variables in the objective function. We now turn to low rank nonconvex problems with nonconvex variables in the constraints. Consider the following monotonic reverse convex problem

$$ \displaystyle{(MRP)\qquad \qquad \quad \min \{\langle c,x\rangle \vert \quad x \in D,\ h(x) \leq 0\},\hspace{71.13188pt} } $$

where \( c \in \mathbb{R}^{n},D \) is a closed convex set in \( \mathbb{R}^{n}, \) and the function \( h: X \rightarrow \mathbb{R} \) is quasiconcave and K-monotonic on a closed convex set X ⊃ D. In this section we shall present decomposition methods for (MRP) by specializing the algorithms for reverse convex programs studied in Chaps. 6 and 7.

As in Sect. 9.2 assume that K ⊂ recX, with lineality L = { x |  Qx = 0}, where Q is an r × n matrix with r linearly independent rows c 1, , c r. By quasiconcavity of h(x) the set C = { x ∈ X |  h(x) ≥ 0} is convex and by Proposition 9.2 K ⊂ recC. Assume further that D = { x |  Ax ≤ b, x ≥ 0} is a polytope with a vertex at 0 and that:

  1. (a)

    min{〈c, x〉 |  x ∈ D, h(x) ≤ 0} > 0, so that h(0) > 0; 

  2. (b)

    The problem is regular, i.e.,

    $$ \displaystyle{D\setminus \mathrm{int}C = \mathrm{cl}(D\setminus C).} $$

7.1 Decomposition by Projection

By writing, as in Sect. 7.2 [see formulas (9.11) and (9.12)], Q = [Q B , Q N ], \( x = \left [\begin{array}{c} x_{B} \\ x_{N} \end{array} \right ], \) where Q B is an r × r nonsingular matrix, and setting

$$ \displaystyle{Z = \left [\begin{array}{c} Q_{B}^{-1} \\ 0 \end{array} \right ],\quad \varphi (y) = h(Zy)} $$

we define a function φ(y) such that φ(y) = h(x) for all x ∈ X satisfying y = Qx. The latter also implies that φ(y) is quasiconcave on Q(X). 

Denote by ψ(y) the optimal value of the linear program

$$ \displaystyle{\min \{\langle c,x\rangle \vert \ x \in D,\ Qx = y\}.} $$

Then (MRP) can be rewritten as a problem in y: 

$$ \displaystyle{ \min \{\psi (y)\vert \ y \in Q(D),\ \varphi (y) \leq 0\}. } $$
(9.83)

Since it is well known that ψ(y) is a convex piecewise affine function, (9.83) is a reverse convex program in the r-dimensional space \( Q(\mathbb{R}^{n}). \) If r is not too large this program can be solved in reasonable time by outer approximation or branch and bound.

7.2 Outer Approximation

Following the OA Algorithm for (CDC) (Sect. 6.3), we construct a sequence of polytopes \( P_{1} \supset P_{2} \supset \cdots \) outer approximating the convex set

$$ \displaystyle{\varOmega =\{ y \in Q(D)\vert \ \ \psi (y) \leq \tilde{\gamma }\},} $$

where \( \tilde{\gamma } \) is the optimal value of (MRP). Also we construct a sequence \( +\infty \geq \gamma _{1} \geq \gamma _{2} \geq \cdots \) such that

$$ \displaystyle\begin{array}{rcl} & & \gamma _{k} < +\infty \quad \Rightarrow \quad \exists \overline{y}^{k} \in Q(D),\ \varphi (\overline{y}^{k}) \leq 0,\psi (\overline{y}^{k}) =\gamma _{ k}\; {}\\ & & \ \ \ \ \ \ \ \ \{y \in Q(D)\vert \ \ \psi (y) \leq \gamma _{k}\} \subset P_{k}. {}\\ \end{array} $$

At iteration k we select (by vertex enumeration)

$$ \displaystyle{y^{k} \in \mathrm{argmin}\{\varphi (y)\vert \ \ y \in P_{ k}\}} $$

(note that this requires φ(y) to be defined on P 1). If φ(y k) ≥ 0 then (MRP) is infeasible when \( \gamma _{k} = +\infty, \) or \( \overline{y}^{k} \) is optimal when \( \gamma _{k} < +\infty. \) If φ(y k) < 0, then \( y^{k}\notin \varOmega \) and we construct an affine inequality strictly separating y k from Ω. The polytope P k+1 is defined by appending this inequality to P k . 

The key issue in this OA scheme is, given a point y k such that φ(y k) < 0, to construct an affine function l(y) satisfying

$$ \displaystyle{ l(y^{k}) > 0,\quad \quad l(y) \leq 0\quad \forall y \in \varOmega. } $$
(9.84)

Note that we assume (a) and 0 ∈ D, so

$$ \displaystyle{ \varphi (0) > 0,\quad \psi (0) <\min \{\psi (y)\vert \ y \in Q(D),\ \varphi (y) \leq 0\}. } $$
(9.85)

Now, since φ(y k) < 0 < φ(0) and \( \gamma _{k} -\psi (y^{k}) \leq 0 <\gamma _{k} -\psi (0), \) one can compute z k ∈ [0, y k] satisfying \( \min \{\varphi (z^{k}),\gamma _{k} -\psi (z^{k})\} = 0. \) Consider the linear program

$$ \displaystyle{(SP(z^{k}))\quad \max \{\langle z^{k},v\rangle -\langle b,w\rangle \vert Q^{T}v - A^{T}w \leq c,w \geq 0\}.} $$

Proposition 9.7

  1. (i)

    If SP(z k ) has a (finite) optimal solution (v k ,w k ) then z k ∈ Q(D) and (9.84) is satisfied by the affine function

    $$ \displaystyle{ l(y):=\langle v^{k},y - z^{k}\rangle. } $$
    (9.86)
  2. (ii)

    If SP(z k ) has no finite optimal solution, so that the cone \( Q^{T}v - A^{T}w \leq 0,\ w \geq 0 \) has an extreme direction (v k ,w k ) such that \( \langle z^{k},v^{k}\rangle -\langle b,w^{k}\rangle > 0, \) then z k ∉ Q(D) and (9.84) is satisfied by the affine function

    $$ \displaystyle{ l(y):=\langle y,v^{k}\rangle -\langle b,w^{k}\rangle. } $$
    (9.87)

Proof

If SP(z k) has an optimal solution (v k, w k) then its dual

$$ \displaystyle{(SP^{{\ast}}(z^{k}))\qquad \quad \min \{\langle c,x\rangle \vert \ Ax \leq b,\ Qx = z^{k},\ x \geq 0\}} $$

is feasible, hence z k ∈ Q(D). It is easy to see that v k ∈ ∂ ψ(z k). Indeed, for any y, since ψ(y) is the optimal value in SP ∗ (y), it must also be the optimal value in SP(y); i.e., ψ(y) ≥ 〈y, v k〉 −〈b, w k〉, hence ψ(y) −ψ(z k) ≥ 〈y, v k〉 −〈b, w k〉 −〈z k, v k〉 + 〈b, w k〉 = 〈v k, y − z k〉, proving the claim. For every y ∈ Ω we have \( \psi (y) \leq \tilde{\gamma }\leq \psi (z^{k}), \) so l(y): = 〈v k, y − z k〉 ≤ ψ(y) −ψ(z k) ≤ 0. On the other hand, y k = θ z k = θ z k + (1 −θ)0 for some θ ≥ 1, hence l(y k) = θ l(z k) + (1 −θ)l(0) > 0 because l(0) ≤ ψ(0) −ψ(z k) < 0, while l(z k) = 0. 

To prove (ii), observe that if SP(z k) is unbounded, then SP ∗ (z k) is infeasible, i.e., z k ∉ Q(D). Since the recession cone of the feasible set of SP(z k) is the cone \( Q^{T}v - A^{T}w \leq 0,\ w \geq 0, \)  SP(z k) may be unbounded only if an extreme direction (v k, w k) of this cone satisfies 〈z k, v k〉 −〈b, w k〉 > 0. Then the affine function (9.87) satisfies l(z k) > 0 while for any y ∈ Q(D),  SP ∗ (y) (hence SP(y)) must have a finite optimal value, and consequently, 〈y, v k〉 −〈b, w k〉 ≤ 0, i.e., l(y) ≤ 0.  □ 
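A minimal sketch of the cut construction in case (i) is given below (Python with SciPy; it assumes a SciPy version whose HiGHS backend exposes the dual values of the equality constraints as res.eqlin.marginals, i.e., as sensitivities of the optimal value with respect to the right-hand side, which is exactly the subgradient v k needed here):

import numpy as np
from scipy.optimize import linprog

def separation_cut(c, A, b, Q, z_k):
    # solve SP*(z_k): min <c, x> s.t. A x <= b, Q x = z_k, x >= 0
    res = linprog(c, A_ub=A, b_ub=b, A_eq=Q, b_eq=z_k, bounds=(0, None), method="highs")
    if res.status != 0:
        return None                          # infeasible or unbounded: case (ii) of Prop. 9.7
    v_k = np.asarray(res.eqlin.marginals)    # v_k is a subgradient of psi at z_k
    cut = lambda y: float(v_k @ (np.asarray(y) - np.asarray(z_k)))   # l(y) = <v_k, y - z_k>
    return v_k, cut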

The convergence of an OA method for (MRP) based on the above proposition follows from general theorems established in Sect. 6.3.

Remark 9.7

In the above approach we only used the constancy of h(x) on every manifold {x |  Qx = y}. In fact, since φ(y′) ≥ φ(y) for all y′ ≥ y, problem (MRP) can also be written as

$$ \displaystyle{ \min \{\psi (y)\vert \ x \in D,\ Qx \leq y,\ \varphi (y) \leq 0\}. } $$
(9.88)

Therefore, SP(z k) could be replaced by

$$ \displaystyle{\max \{\langle z^{k},v\rangle -\langle b,w\rangle \vert \ Q^{T}v - A^{T}w \geq 0,\ v \geq 0,\ w \geq 0\}.} $$

7.3 Branch and Bound

We describe a conical branch and bound procedure. A simplicial and a rectangular algorithm for the case when φ(y) is separable can be developed similarly.

From assumptions (a), (b) we have that

$$ \displaystyle{ \left \vert \begin{array}{l} 0 \in Q(D),\quad \varphi (0) > 0,\\ \psi (0) <\min \{\psi (y)\vert \ y \in Q(D),\ \varphi (y) \leq 0\}. \end{array} \right. } $$
(9.89)

A conical algorithm for solving (9.83) starts with an initial cone M 0 contained in Q(X) and containing the feasible set Q(D) ∩{ y |  φ(y) ≤ 0}.

Given any cone \( M \subset Q(\mathbb{R}^{n}) \) with base [u 1,…,u k] let θ i u i be the point where the i-th edge of M intersects the surface φ(y) = 0. If U denotes the matrix of columns u 1,…,u k then \( M =\{ Ut\vert \ t \in \mathbb{R}^{k},\ t \geq 0\} \) and, in the coordinates t, \( \sum _{i=1}^{k}t_{i}/\theta _{i} = 1 \) is the equation of the hyperplane passing through the intersections of the edges of M with the surface φ(y) = 0. Denote by β(M) and \( \overline{t}(M) \) the optimal value and a basic optimal solution of the linear program

$$ \displaystyle{LP(M)\qquad \quad \min \left \{\langle c,x\rangle \vert \ x \in D,\ Qx =\sum _{ i=1}^{k}t_{ i}u^{i},\ \sum _{ i=1}^{k}\frac{t_{i}} {\theta _{i}} \geq 1,t \geq 0\right \}.\qquad \qquad } $$

Proposition 9.8

β(M) gives a lower bound for ψ(y) over the set of feasible points in M. If \( \omega (M) = U\overline{t}(M) \) lies on an edge of M then

$$ \displaystyle{ \beta (M) =\min \{\psi (y)\vert \ y \in Q(D) \cap M,\varphi (y) \leq 0\}. } $$
(9.90)

Proof

The first part of the proposition follows from the inclusion \( \{y \in Q(D) \cap M\vert \ \varphi (y) \leq 0\} \subset \{ y = Ut \in Q(D) \cap M\vert \ \sum _{i=1}^{k}t_{i}/\theta _{i} \geq 1\} \) and the fact that min{ψ(y) |  y ∈ Q(D)∩M, φ(y) ≤ 0} = min{〈c, x〉 |  x ∈ D, Qx = y, y ∈ M, φ(y) ≤ 0}. On the other hand, if ω(M) lies on an edge of M it must lie on the portion of this edge outside the convex set {y |  φ(y) ≥ 0}, hence φ(ω(M)) ≤ 0. Consequently, ω(M) is feasible to the minimization problem in (9.90) and hence is an optimal solution of it. □ 

It can be proved that a conical branch and bound procedure using ω-subdivision or ω-bisection (see Sect. 7.1) with the above bounding is guaranteed to converge.

Example 9.10

$$ \displaystyle{\begin{array}{cl} \mathrm{min} &2x_{1} + x_{2} + 0.5x_{3} \\ \mbox{ s.t.}& - 2x_{1} + x_{3} - x_{4} \leq 2.5 \\ \mbox{ } &x_{1} - 3x_{2} + x_{4} \leq \ 2 \\ \mbox{ } &x_{1} + x_{2} \leq 2 \\ \mbox{ } &x_{1},\ x_{2},\ x_{3},\ x_{4} \geq 0 \\ \mbox{ } &h(x):= -(3x_{1} + 6x_{2} + 8x_{3})^{2} - (4x_{1} + 5x_{2} + x_{4})^{2} + 154 \leq \ 0. \end{array} } $$

Here the function h(x) is concave and monotonic with respect to the cone K = { y |  c 1 y = 0,  c 2 y = 0}, where c 1 = (3, 6, 8, 0), c 2 = (4, 5, 0, 1) (cf Example 7.2). So Q is the matrix of two rows c 1, c 2 and Q(D) = { y = Qx, x ∈ D} is contained in the cone M 0 = { y | y ≥ 0} (D denotes the feasible set). Since the optimal solution \( 0 \in \mathbb{R}^{4} \) of the underlying linear program satisfies h(0) > 0, conditions (9.89) hold, and M 0 = cone{e 1, e 2} can be chosen as an initial cone. The conical algorithm terminates after 4 iterations, yielding the optimal solution (0, 0, 1.54, 2.0) with optimal value 0.76. Note that branching is performed in \( \mathbb{R}^{2} \) (t-space) rather than in the original space \( \mathbb{R}^{4}. \)

7.4 Decomposition by Polyhedral Annexation

From L = { x |  Qx = 0} ⊂ K ⊂ C we have

$$ \displaystyle{ C^{\circ }\subset K^{\circ }\subset L^{\perp }, } $$
(9.91)

where L  ⊥  (the orthogonal complement of L) is the space generated by the rows c 1,…,c r of Q (assuming rankQ = r). For every \( \gamma \in \mathbb{R} \cup \{+\infty \} \) let D γ  = { x ∈ D |  〈c, x〉 ≤ γ}. By Theorem 7.6 the value \( \overline{\gamma } \) is optimal if and only if

$$ \displaystyle{ D_{\overline{\gamma }}\setminus \mathrm{int}C\neq \emptyset,\quad D_{\overline{\gamma }} \subset C. } $$
(9.92)

Following the PA Method for (LRC) (Sect. 8.1), to solve (MRP) we construct a nested sequence of polyhedrons \( P_{1} \supset P_{2} \supset \cdots \) together with a nonincreasing sequence of real numbers \( \gamma _{1} \geq \gamma _{2} \geq \cdots \) such that \( D_{\gamma _{k}}\setminus \mathrm{int}C\neq \emptyset,\ C^{\circ }\subset P_{k} \subset L^{\perp }\ \forall k, \) and eventually \( P_{l} \subset [D_{\gamma _{l}}]^{\circ } \) for some l: then \( D_{\gamma _{l}} \subset P_{l}^{\circ }\subset C, \) hence γ l is optimal by the criterion (9.92).

To exploit monotonicity [which implies (9.91)], the initial polytope P 1 is taken to be the r-simplex constructed as in Lemma 9.1 (note that 0 ∈ intC by assumption (a), so this construction is possible). Let \( \pi: L^{\perp }\rightarrow \mathbb{R}^{r} \) be the linear map \( y =\sum _{i=1}^{r}t_{i}c^{i}\mapsto \pi (y) = t. \) For any set \( E \subset L^{\perp } \) denote \( \tilde{E} =\pi (E) \subset \mathbb{R}^{r}. \) Then by Lemma 9.1 \( \tilde{P}_{1} \) is the simplex in \( \mathbb{R}^{r} \) defined by the inequalities

$$ \displaystyle{\sum _{i=1}^{r}t_{ i}\langle c^{i},c^{j}\rangle \leq \frac{1} {\alpha _{j}}\quad j = 0,1,\ldots,r} $$

where \( c^{0} = -\sum _{i=1}^{r}c^{i}, \) α j  = sup{α |  α c j ∈ C}. In the algorithm below, when a better feasible solution x k than the incumbent has been found, we can compute a feasible point \( \hat{x}^{k} \) at least as good as x k and lying on an edge of D (see Sect. 5.7). This can be done by carrying out a few steps of the simplex algorithm. We shall refer to \( \hat{x}^{k} \) as a point obtained by local improvement from x k. 

PA Algorithm for (MRP)

Initialization. Choose a vertex x 0 of D such that h(x 0) > 0 and set D ← D − x 0,  C ← C − x 0. Let \( \overline{x}^{1} = \) best feasible solution available, \( \gamma _{1} =\langle c,\overline{x}^{1}\rangle \) \( (\overline{x}^{1} =\emptyset,\ \gamma _{1} = +\infty \) if no feasible solution is available). Let P 1 be the simplex constructed as in Lemma 9.1, V 1 the vertex set of \( \tilde{P}_{1}. \) Set k = 1. 

Step 1. :

For every t = (t 1, , t r )T ∈ V k solve the linear program

$$ \displaystyle{\max \left \{\sum _{i=1}^{r}t_{ i}\langle c^{i},x\rangle \vert \ x \in D,\langle c,x\rangle \leq \gamma _{ k}\right \}} $$

to obtain its optimal value μ(t). (Only the new t ∈ V k should be considered if this step is entered from Step 3.) If \( \mu (t) \leq 1\ \forall t \in V _{k} \), then terminate: if \( \gamma _{k} < +\infty, \) \( \overline{x}^{k} \) is an optimal solution; otherwise (MRP) is infeasible.

Step 2. :

If μ(t k) > 1 for some t k ∈ V k , then let x k be a basic optimal solution of the corresponding linear program. If \( x^{k}\notin C, \) then let \( \hat{x} ^{k} \) be the solution obtained by local improvement from x k, reset \( \overline{x}^{k} = \hat{x} ^{k},\ \gamma _{k} =\langle c, \hat{x} ^{k}\rangle, \) and return to Step 1.

Step 3. :

If x k ∈ C then set \( \overline{x}^{k+1} = \overline{x}^{k},\ \gamma _{k+1} =\gamma _{k}, \) compute θ k  = sup{θ |  θ x k ∈ C} and define

$$ \displaystyle{\tilde{P}_{k+1} =\tilde{ P}_{k} \cap \left \{t\vert \ \sum _{i=1}^{r}t_{ i}\langle x^{k},c^{i}\rangle \leq \frac{1} {\theta _{k}}\right \}.} $$

From V k derive the vertex set V k+1 of \( \tilde{P}_{k+1}. \) Set k ← k + 1 and return to Step 1.

Proposition 9.9

The above PA Algorithm for (MRP) is finite.

Proof

Immediate, since this is a mere specialization of the PA Algorithm for (LRCP) in Sect. 6.4 to (MRP). □ 

Remark 9.8

If K = { x | Qx ≥ 0} then P 1 can be constructed as in Lemma 9.2. The above algorithm assumes regularity of the problem (assumption (b)). Without this assumption, we can replace C by C ɛ  = { x ∈ X |  h(x) ≥ ɛ}. Applying the above algorithm with C ← C ɛ will yield an ɛ-approximate optimal solution (cf Sect. 7.3), i.e., a point x ɛ satisfying x ɛ ∈ D, h(x ɛ) ≤ ɛ, and 〈c, x ɛ〉 ≤ min{〈c, x〉 | x ∈ D, h(x) ≤ 0}.

8 Network Constraints

Consider a special concave programming problem of the form:

$$ \displaystyle\begin{array}{rcl} (SCP)\qquad \qquad & \min & g(y_{1},\ldots,y_{r}) +\langle c,x\rangle \\ & \mbox{ s.t.}& y \in Y {}\end{array} $$
(9.93)
$$ \displaystyle\begin{array}{rcl} & \mbox{ }& Qx = y,\ Bx = d,\ x \geq 0.{}\end{array} $$
(9.94)

where \( g: \mathbb{R}_{+}^{r} \rightarrow \mathbb{R}^{+} \) is a concave function, Y a polytope in \( \mathbb{R}_{+}^{r} \), \( c,x \in \mathbb{R}^{n},d \in \mathbb{R}^{m},Q \) an r × n matrix of rank r, and B an m × n matrix. In an economic interpretation, y = (y 1, , y r )T may denote a production program to be chosen from a set Y of technologically feasible production programs, x a distribution–transportation program to be determined so as to meet the requirements (9.94). The problem is then to find a production–distribution–transportation program, satisfying conditions (9.94), with a minimum cost. A variant of this problem is the classical concave production–transportation problem

$$ \displaystyle{(PTP)\quad \left \vert \begin{array}{ll} \min &g(y) +\sum _{ i=1}^{r}\sum _{ j=1}^{m}c_{ ij}x_{ij} \\ \mbox{ s.t.}&\sum _{j=1}^{m}x_{ ij} = y_{i}\quad i = 1,\ldots,r \\ \mbox{ } &\sum _{i=1}^{r}x_{ ij} = d_{j}\quad j = 1,\ldots,m \\ \mbox{ } &x_{ij} \geq 0,\quad i = 1,\ldots,r,\ j = 1,\ldots,m \end{array} \right.} $$

where y i is the production level to be determined for factory i and x ij the amount to be sent from factory i to warehouse j in order to meet the demands d 1, , d m of the warehouses. Another variant of SCP is the minimum concave cost network flow problem which can be formulated as

$$ \displaystyle{(MCCNFP)\qquad \left \vert \begin{array}{ll} \min &\sum _{i=1}^{r}g_{i}(x_{i}) +\sum _{ i=r+1}^{n}c_{i}x_{i} \\ \mathrm{s.t.}&Ax = d,\ x \geq 0. \end{array} \right.} $$

where A is the node-arc incidence matrix of a given directed graph with n arcs, d the vector of node demands (with supplies interpreted as negative demands), and x i the flow value on arc i. Nodes j with d j  > 0 are the sinks, nodes j with d j  < 0 are the sources, and it is assumed that \( \sum _{j}d_{j} = 0 \) (balance condition). The cost of shipping t units along arc i is a nonnegative concave nondecreasing function g i (t) for i = 1,…,r, and a linear function c i t (with c i  ≥ 0) for i = r + 1,…,n. To cast (MCCNFP) into the form (SCP) it suffices to rewrite it as

$$ \displaystyle{ \left \vert \begin{array}{ll} \min &\sum _{i=1}^{r}g_{i}(y_{i}) +\sum _{ i=r+1}^{n}c_{i}x_{i} \\ \mathrm{s.t.}&x_{i} = y_{i}\ (i = 1,\ldots,r),\ \ Ax = d,\ x \geq 0. \end{array} \right. } $$
(9.95)

Obviously, Problem (SCP) is equivalent to

$$ \displaystyle\begin{array}{rcl} & \min & g(y) + t \\ & \mbox{ s.t.}& \langle c,x\rangle \leq t,\ Qx = y,\ Bx = d,\ x \geq 0,\ y \in Y.{}\end{array} $$
(9.96)

Since this is a concave minimization problem with few nonconvex variables, it can be handled by the decomposition methods discussed in the previous Sects. 7.2 and 7.3.

Note that in both (PTP) and (MCCNFP) the constraints (9.94) have a nice structure such that when y is fixed the problem reduces to a special linear transportation problem (for (PTP)) or a linear cost flow problem (for (MCCNFP)), which can be solved by very efficient specialized algorithms. Therefore, to take full advantage of the structure of (9.94) it is important to preserve this structure under the decomposition.

8.1 Outer Approximation

This method is initialized from a polytope P 1 = Y × [α, β], where α, β are, respectively, a lower bound and an upper bound of 〈c, x〉 over the polytope Bx = d, x ≥ 0. At iteration k, we have a polytope P k  ⊂ P 1 containing all (y, t) for which there exists x ≥ 0 satisfying (9.96). Let

$$ \displaystyle{(y^{k},t^{k}) \in \mathrm{argmin}\{t + g(y)\vert \ (y,t) \in P_{ k}\}.} $$

Following Remark 7.1 to check whether (y k, t k) is feasible to (9.96) we solve the pair of dual programs

$$ \displaystyle\begin{array}{rcl} & \min \{\langle c,x\rangle \vert \ \ Qx = y^{k},\ Bx = d,\ x \geq 0\}&{}\end{array} $$
(9.97)
$$ \displaystyle\begin{array}{rcl} & \max \{\langle y^{k},u\rangle +\langle d,v\rangle \vert \ \ Q^{T}u + B^{T}v \leq c\}&{}\end{array} $$
(9.98)

obtaining an optimal solution x k of (9.97) and an optimal solution (u k, v k) of the dual (9.98). If 〈c, x k〉 ≤ t k then (x k, y k) solves (SCP). Otherwise, we define

$$ \displaystyle{ P_{k+1} = P_{k} \cap \{ (y,t)\vert \ \langle y,u^{k}\rangle +\langle d,v^{k}\rangle \leq t\} } $$
(9.99)

and repeat the procedure with k ← k + 1. 

In the particular case of (PTP) the subproblem (9.97) is a linear transportation problem obtained from (PTP) by fixing y = y k, while in the case of (MCCNFP) the subproblem (9.97) is a linear min-cost network flow problem obtained from (MCCNFP) by fixing x i  = y i k for i = 1, , r (or equivalently, by deleting the arcs i = 1, , r and accordingly modifying the demands of the nodes which are their endpoints).

8.2 Branch and Bound Method

Assume that in problem (SCP) the function g(y) is concave separable, i.e., \( g(y) =\sum _{i=1}^{r}g_{i}(y_{i}). \) For each rectangle M = [a, b] let

$$ \displaystyle{l_{M}(y) =\sum _{ i=1}^{r}l_{ i,M}(y_{i})} $$

where l i, M (t) is the affine function that agrees with g i (t) at the endpoints a i , b i of the segment [a i , b i ], i.e.,

$$ \displaystyle{ l_{i,M}(t) = g_{i}(a_{i}) + \frac{g_{i}(b_{i}) - g_{i}(a_{i})} {b_{i} - a_{i}} (t - a_{i}). } $$
(9.100)

(see Proposition 7.4). Then a lower bound β(M) for the objective function in (SCP) over M is provided by the optimal value in the linear program

$$ \displaystyle{(LP(M))\qquad \min \left \{\sum _{i=1}^{r}l_{ i,M}(y_{i}) +\langle c,x\rangle \vert \ Qx = y,\ Bx = d,\ x \geq 0,\ y \in M\right \}.\qquad } $$

Using this lower bound and a rectangular ω-subdivision rule, a convergent rectangular algorithm has been developed in Horst and Tuy (1996) which is a refined version of an earlier algorithm by Soland (1974) and should be able to handle even large scale (PTP) provided r is relatively small. Also note that if the data are all integral, then it is well known that a basic optimal solution of the linear transportation problem LP(M) always exists with integral values. Consequently, in this case, starting from an initial rectangle with integral vertices the algorithm will generate only subrectangles with integral vertices, and so it will necessarily be finite.
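Since the chords (9.100) are affine, LP(M) is an ordinary linear program, and the bound β(M) can be computed as in the following sketch (Python with SciPy; the stacked variable (x,y), the explicit box y ∈ M, and the helper name beta are conveniences of the sketch, not part of the algorithm above):

import numpy as np
from scipy.optimize import linprog

def beta(a, b_box, g_list, c, Q, B, d):
    # lower bound over M = [a, b_box] for min sum_i g_i(y_i) + <c, x>, Qx = y, Bx = d, x >= 0
    a, b_box = np.asarray(a, float), np.asarray(b_box, float)
    slope = np.array([(g(bi) - g(ai)) / (bi - ai) if bi > ai else 0.0
                      for g, ai, bi in zip(g_list, a, b_box)])
    const = sum(g(ai) - s * ai for g, ai, s in zip(g_list, a, slope))   # intercepts of l_{i,M}
    r, n = Q.shape
    obj = np.concatenate([c, slope])                                    # variables (x, y)
    A_eq = np.block([[Q, -np.eye(r)], [B, np.zeros((B.shape[0], r))]])  # Qx = y, Bx = d
    b_eq = np.concatenate([np.zeros(r), d])
    bounds = [(0, None)] * n + list(zip(a, b_box))
    res = linprog(obj, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    return res.fun + const if res.status == 0 else np.inf               # +inf prunes empty M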

In practice, aside from production and transportation costs there may also be shortage/holding costs due to a failure to meet the demands exactly. Then we have the following problem (cf Example 5.1) which includes a formulation of the stochastic transportation–location problem as a special case:

$$ \displaystyle{(GPTP)\qquad \qquad \left \vert \begin{array}{ll} \min &\langle c,x\rangle +\sum _{ i=1}^{r}g_{i}(y_{i}) +\sum _{ j=1}^{m}h_{j}(z_{j}) \\ \mathrm{s.t.}&\sum _{j=1}^{m}x_{ ij} = y_{i}\quad (i = 1,\ldots,r) \\ \mbox{ } &\sum _{i=1}^{r}x_{ ij} = z_{j}\quad (j = 1,\ldots,m) \\ \mbox{ } &0 \leq y_{i} \leq s_{i}\quad \forall i, \\ \mbox{ } &x_{ij} \geq 0\quad \forall i,j. \end{array} \right.\qquad \qquad } $$

where g i (y i ) is the cost of producing y i units at factory i and h j (z j ) is the shortage/holding cost to be incurred if warehouse j receives an amount \( z_{j}\neq d_{j} \) (the demand of warehouse j). It is assumed that g i is a concave nondecreasing function satisfying g i (0) = 0 ≤ g i (0+) (so possibly g i (0+) > 0 as, e.g., for a fixed charge), while \( h_{j}: [0,s] \rightarrow \mathbb{R}_{+} \) is a convex function \( (s =\sum _{ i=1}^{r}s_{i}), \) such that h j (d j ) = 0, and the right derivative of h j at 0 has a finite absolute value ξ j . The objective function is then a d.c. function, which is a new source of difficulty. However, since the problem becomes convex when y is fixed, it can be handled by a branch and bound algorithm in which branching is performed upon y (Holmberg and Tuy 1995).

For any rectangle \( [a,b] \subset Y:=\{ y \in \mathbb{R}_{+}^{r}\vert \ 0 \leq y_{i} \leq s_{i},\ i = 1,\ldots,r\} \) let, as previously, l i, M (t) be defined by (9.100) and \( l_{M}(y) =\sum _{ i=1}^{r}l_{i,M}(y_{i}). \) Then a lower bound of the objective function over the feasible points in M is given by the optimal value β(M) in the convex program

$$ \displaystyle{CP(M)\qquad \qquad \left \vert \begin{array}{ll} \min &\langle c,x\rangle + l_{M}(y) +\sum _{ j=1}^{m}h_{j}(z_{j}) \\ \mathrm{s.t.}&\sum _{j=1}^{m}x_{ ij} = y_{i}\quad (i = 1,\ldots,r) \\ \mbox{ } &\sum _{i=1}^{r}x_{ ij} = z_{j}\quad (j = 1,\ldots,m) \\ \mbox{ } &a_{i} \leq y_{i} \leq b_{i}\quad \forall i, \\ \mbox{ } &x_{ij} \geq 0\quad \forall i,j. \end{array} \right.\qquad \qquad } $$

Note that if an optimal solution \( (\overline{x}(M),\overline{y}(M),\overline{z}(M)) \) of this convex program satisfies \( l_{M}(\overline{y}(M)) = g(\overline{y}(M)) \) then β(M) equals the minimum of the objective function over the feasible solutions in M. When y is fixed, (GPTP) is a convex program in x, z. If φ(y) denotes the optimal value of this program then (GPTP) amounts to minimizing φ(y) over y ∈ Y.

BB Algorithm for (GPTP)

Initialization. Start with a rectangle M 1 ⊂ Y. Let y 1 be the best feasible solution available, INV = φ(y 1). Set \( \mathcal{P}_{1} = \mathcal{S}_{1} =\{ M_{1}\},k = 1. \)

Step 1.:

For each \( M \in \mathcal{P}_{k} \) solve the associated convex program CP(M) obtaining its optimal value β(M) and optimal solution \( (\overline{x}(M),\overline{y}(M),\overline{z}(M)). \)

Step 2.:

Update INV and y k. 

Step 3.:

Delete all \( M \in \mathcal{S}_{k} \) such that β(M) ≥ INV −ɛ. Let \( \mathcal{R}_{k} \) be the collection of remaining rectangles.

Step 4.:

If \( \mathcal{R}_{k} =\emptyset, \) then terminate: y k is a global ɛ-optimal solution of (GPTP). 

Step 5.:

Select \( M_{k} \in \mathrm{argmin}\{\beta (M)\vert \ M \in \mathcal{R}_{k}\}. \) Let \( \overline{y}^{k} = \overline{y}(M_{k}), \)

$$ \displaystyle{ i_{k} \in \mathrm{arg}\max _{i}\{g_{i}(\overline{y}_{i}^{k}) - l_{ i,M_{k}}(\overline{y}_{i}^{k})\}. } $$
(9.101)

If \( g_{i_{k}}(\overline{y}_{i_{k}}^{k}) - l_{i_{k},M_{k}}(\overline{y}_{i_{k}}^{k}) = 0, \) then terminate: \( \overline{y}^{k} \) is a global optimal solution.

Step 6.:

Divide M k via \( (\overline{y}^{k},i_{k}) \) (see Sect. 5.6). Let \( \mathcal{P}_{k+1} \) be the partition of M k and \( \mathcal{S}_{k+1} = (\mathcal{R}_{k}\setminus \{M_{k}\}) \cup \mathcal{P}_{k+1}. \) Set k ← k + 1 and return to Step 1.

To establish the convergence of this algorithm, recall that \( \overline{y}^{k} = \overline{y}(M_{k}). \) Similarly denote \( \overline{x}^{k} = \overline{x}(M_{k}),\overline{z}^{k} = \overline{z}(M_{k}), \) so \( (\overline{x}^{k},\overline{y}^{k},\overline{z}^{k}) \) is an optimal solution of CP(M k ). Suppose the algorithm is infinite and let \( \overline{y} \) be any accumulation point of \( \{\overline{y}^{k}\} \), say \( \overline{y} =\lim _{q\rightarrow \infty }\overline{y}^{k_{q}}. \) Without loss of generality we may assume \( \overline{x}^{k_{q}} \rightarrow \overline{x},\overline{z}^{k_{q}} \rightarrow \overline{z} \) and also \( i_{k_{q}} = 1\ \forall q, \) so that

$$ \displaystyle{ 1 \in \mathrm{arg}\max _{i}\{g_{i}(\overline{y}_{i}^{k_{q} }) - l_{i,M_{k_{q}}}(\overline{y}_{i}^{k_{q} })\}. } $$
(9.102)

Clearly if for some q we have \( \overline{y}_{1}^{k_{q}} = 0 \), then \( g_{1}(\overline{y}_{1}^{k_{q}}) = l_{1,M_{ k_{q}}}(\overline{y}_{1}^{k_{q}}), \) and the algorithm terminates in Step 5. Therefore, since the algorithm is infinite, we must have \( \overline{y}_{1}^{k_{q}} > 0\ \forall q. \)

Lemma 9.9

If g 1 (0 + ) > 0 then

$$ \displaystyle{ \overline{y}_{1} =\lim _{q\rightarrow \infty }\overline{y}_{1}^{k_{q} } > 0. } $$
(9.103)

In other words, \( \overline{y}_{1} \) is always a continuity point of g 1 (t).

Proof

Let \( M_{k_{q},1} = [a_{1}^{q},b_{1}^{q}], \) so that \( \overline{y}_{1}^{k_{q}} \in [a_{1}^{q},b_{1}^{q}] \) and [a 1 q, b 1 q], q = 1, 2,  is a sequence of nested intervals. If \( 0\notin [a_{1}^{q_{0}},b_{1}^{q_{0}}] \) for some q 0, then \( 0\notin [a_{1}^{q},b_{1}^{q}]\ \forall q \geq q_{0}, \) and \( \overline{y}_{1}^{k_{q}} \geq a_{1}^{q_{0}} > 0\ \forall q \geq q_{0}, \) hence \( \overline{y}_{1} \geq a_{1}^{q_{0}} > 0, \) i.e., (9.103) holds. Thus, it suffices to consider the case

$$ \displaystyle{ a_{1}^{q} = 0,\ \overline{y}_{ 1}^{k_{q} } > 0\quad \forall q. } $$
(9.104)

Recall that ξ j denotes the absolute value of the right derivative of h j (t) at 0. Let

$$ \displaystyle{ \pi =\max _{j}\xi _{j} -\min _{j}c_{1j}. } $$
(9.105)

Since \( \sum _{j=1}^{m}\overline{x}_{1j}^{k_{q}} = \overline{y}_{1}^{k_{q}} > 0, \) there exists j 0 such that \( \overline{x}_{1j_{0}}^{k_{q}} > 0. \) Let \( (\tilde{x}^{k_{q}},\tilde{y}^{k_{q}},\tilde{z}^{k_{q}}) \) be a feasible solution to \( CP(M_{k_{q}}) \) obtained from \( (\overline{x}^{k_{q}},\overline{y}^{k_{q}},\overline{z}^{k_{q}}) \) by subtracting a very small positive amount \( \eta < \overline{x}_{1j_{0}}^{k_{q}} \) from each of the components \( \overline{x}_{1j_{0}}^{k_{q}},\overline{y}_{1}^{k_{q}},\overline{z}_{j_{ 0}}^{k_{q}}, \) and leaving all the other components unchanged. Then the production cost at factory 1 decreases by \( l_{1,M_{k_{q}}}(\overline{y}_{1}^{k_{q}}) - l_{1,M_{ k_{q}}}(\tilde{y}_{1}^{k_{q}}) = g_{ 1}(b_{1}^{k_{q}})\eta /b_{ 1}^{k_{q}} > 0, \) the transportation cost on the arc (1, j 0) decreases by \( c_{1j_{0}}\eta, \) while the penalty incurred at warehouse j 0 either decreases (if \( \overline{z}_{j_{0}}^{k_{q}} > d_{j_{0}}), \) or increases by \( h_{j_{0}}(\overline{z}_{j_{0}}^{k_{q}}-\eta ) - h_{j_{ 0}}(\overline{z}_{j_{0}}^{k_{q}}) \leq \xi _{ j_{0}}\eta. \) Thus the total cost decreases by at least

$$ \displaystyle{ \delta = \left [\frac{g_{1}(b_{1}^{k_{q}})} {b_{1}^{k_{q}}} + c_{1j_{0}} -\xi _{j_{0}}\right ]\eta. } $$
(9.106)

If π ≤ 0 then \( c_{1j_{0}} -\xi _{j_{0}} \geq 0, \) hence δ > 0 and \( (\tilde{x}^{k_{q}},\tilde{y}^{k_{q}},\tilde{z}^{k_{q}}) \) would be a better feasible solution than \( (\overline{x}^{k_{q}},\overline{y}^{k_{q}},\overline{z}^{k_{q}}), \) contradicting the optimality of the latter for \( CP(M_{k_{q}}). \) Therefore, π > 0. Now suppose g 1(0+) > 0. Since \( g_{1}(t)/t \rightarrow +\infty \) as t → 0+ (in view of the fact g 1(0+) > 0), there exists τ 1 > 0 satisfying

$$ \displaystyle{\frac{g_{1}(\tau _{1})} {\tau _{1}} >\pi.} $$

Observe that since \( M_{k_{q},1} = [0,b_{1}^{k_{q}}] \) is divided via \( \overline{y}_{1}^{k_{q}}, \) we must have \( [0,b_{1}^{k_{q}'}] \subset [0,\overline{y}_{1}^{k_{q}}] \) for all q′ > q, while \( [0,\overline{y}_{1}^{k_{q}}] \subset [0,b_{1}^{k_{q}}] \) for all q. With this in mind we will show that

$$ \displaystyle{ b_{1}^{k_{q} } \geq \tau _{1}\quad \forall q. } $$
(9.107)

By the above observation this will imply that \( \overline{y}_{1}^{k_{q}} \geq \tau _{1}\forall q, \) thereby completing the proof. Suppose (9.107) does not hold, i.e., \( b_{1}^{k_{q}} <\tau _{1} \) for some q. Then \( \overline{y}_{1}^{k_{q}} \leq b_{1}^{k_{q}} <\tau _{1}, \) and since g 1(t) is concave it is easily seen by Fig. 9.1 that \( g_{1}(b_{1}^{k_{q}}) \geq b_{1}^{k_{q}}g_{1}(\tau _{1})/\tau _{1}, \) i.e.,

$$ \displaystyle{\frac{g_{1}(b_{1}^{k_{q}})} {b_{1}^{k_{q}}} \geq \frac{g_{1}(\tau _{1})} {\tau _{1}}.} $$

This, together with (9.106) implies that

$$ \displaystyle{\delta \geq \left (\frac{g_{1}(\tau _{1})} {\tau _{1}} -\pi \right )\eta > 0} $$

hence \( (\tilde{x}^{k_{q}},\tilde{y}^{k_{q}},\tilde{z}^{k_{q}}) \) is a better feasible solution than \( (\overline{x}^{k_{q}},\overline{y}^{k_{q}},\overline{z}^{k_{q}}). \) This contradiction proves (9.107). □ 

Fig. 9.1 Consequence of concavity of g 1(t)

Theorem 9.3

The above algorithm for (GPTP) can be infinite only if ɛ = 0 and in this case it generates an infinite sequence \( \overline{y}^{k} = \overline{y}(M_{k}),k = 1,2,\ldots \) every accumulation point of which is an optimal solution.

Proof

For ɛ = 0 let \( \overline{y} =\lim _{q\rightarrow \infty }\overline{y}^{k_{q}} \) be an accumulation point of \( \{\overline{y}^{k}\} \) with, as previously, \( i_{k_{q}} = 1\ \forall q, \) and \( M_{k_{q},1} = [a_{1}^{q},b_{1}^{q}]. \) Since {a 1 q} is nondecreasing and {b 1 q} is nonincreasing, we have \( a_{1}^{q} \rightarrow a_{1}^{{\ast}},\ b_{1}^{q} \rightarrow b_{1}^{{\ast}}. \) For each q either of the following situations occurs:

$$ \displaystyle\begin{array}{rcl} & & a_{1}^{q}\ <\ a_{1}^{q+1}\ <\ b_{ 1}^{q+1}\ <\ \overline{y}_{ 1}^{k_{q} }\ <\ b_{1}^{q}{}\end{array} $$
(9.108)
$$ \displaystyle\begin{array}{rcl} & & a_{1}^{q}\ <\ \overline{y}_{ 1}^{k_{q} }\ <\ a_{1}^{q+1}\ <\ b_{ 1}^{q+1}\ <\ b_{ 1}^{q}{}\end{array} $$
(9.109)

If (9.108) occurs for infinitely many q then \( \lim _{q\rightarrow \infty }\overline{y}_{1}^{k_{q}} =\lim _{q\rightarrow \infty }b_{1}^{q} = b_{1}^{{\ast}}. \) If (9.109) occurs for infinitely many q then \( \lim _{q\rightarrow \infty }\overline{y}_{1}^{k_{q}} =\lim _{q\rightarrow \infty }a_{1}^{q} = a_{1}^{{\ast}}. \) Thus, \( \overline{y}_{1} \in \{ a_{1}^{{\ast}},b_{1}^{{\ast}}\}. \) By Lemma 9.9, the concave function g 1(t) is continuous at \( \overline{y}_{1}. \) Suppose, for instance, that \( \overline{y}_{1} = a_{1}^{{\ast}} \) (the other case is similar). Since \( \overline{y}_{1}^{k_{q}} \rightarrow a_{1}^{{\ast}} \) and \( a_{1}^{q} \rightarrow a_{1}^{{\ast}}, \) it follows from the continuity of g 1(t) at \( a_{1}^{{\ast}} \) that \( g_{1}(\overline{y}_{1}^{k_{q}}) - l_{1,M_{ k_{q}}}(\overline{y}_{1}^{k_{q}}) \rightarrow 0 \) and hence, by (9.102),

$$ \displaystyle{g_{i}(\overline{y}_{i}^{k_{q} }) - l_{i,M_{k_{q}}}(\overline{y}_{i}^{k_{q} }) \rightarrow 0\quad i = 1,\ldots,r} $$

as q →∞. This implies that \( g(\overline{y}^{k_{q}}) - l_{M_{ k_{q}}}(\overline{y}^{k_{q}}) \rightarrow 0 \) and hence, that

$$ \displaystyle{\beta (M_{k_{q}}) \rightarrow \langle c,\overline{x}\rangle + g(\overline{y}) + h(\overline{z})} $$

as q →∞. Since \( \beta (M_{k_{q}}) \leq \gamma:= \) optimal value of (GPTP) it follows that \( \langle c,\overline{x}\rangle + g(\overline{y}) + h(\overline{z}) \) is the sought minimum of (GPTP).  □ 

Computational experiments have shown that the above algorithm for (GPTP) can solve problems with up to r = 100 supply points and m = 500 demand points in reasonable time on a Sun SPARCstation SLC (Holmberg and Tuy 1995). To apply this algorithm to the case when \( h_{j}(z_{j}) = +\infty \) for \( z_{j}\neq d_{j} \) (e.g., the fixed charge problem) it suffices to redefine \( h_{j}(z_{j}) =\xi _{j}\vert d_{j} - z_{j}\vert \) where ξ j is finite but sufficiently large.

9 Some Polynomial-Time Solvable Problems

As mentioned previously, even quadratic programming with one negative eigenvalue is NP-hard (Pardalos and Vavasis 1991). There are, nevertheless, genuine concave programs which are polynomial-time solvable. In this section we discuss this topic, to illustrate the efficiency of the parametric approach for certain classes of low rank nonconvex problems.

9.1 The Special Concave Program CPL(1)

Consider the special concave program under linear constraints:

$$ \displaystyle{(CPL(1))\quad \left \vert \begin{array}{ll} \min &g(y) +\langle c,x\rangle \\ \mbox{ s.t.} &\sum _{ j=1}^{n}x_{ j} = y,\quad 0 \leq x_{j} \leq d_{j}\ \forall j. \end{array} \right.\qquad } $$

where g(y) is a concave function, \( c \in \mathbb{R}^{n} \) and \( d \in \mathbb{R}_{+}^{n}. \) Let \( s =\sum _{j=1}^{n}d_{j}. \) To solve this concave program by the parametric right-hand side method presented in Sect. 8.5, we first compute the breakpoints \( y_{0} = 0 \leq y_{1} \leq \cdots \leq y_{N} = s \) of the optimal value function of the associated parametric linear program

$$ \displaystyle{(LP(y))\qquad \min \{\langle c,x\rangle \vert \ \sum _{j=1}^{n}x_{ j} = y,\ 0 \leq x_{j} \leq d_{j}\ \forall j\},\quad 0 \leq y \leq s.\qquad \qquad.} $$

If \( \overline{x}^{i} \) is a basic optimal solution of LP(y i ) then, by Lemma 9.4, an optimal solution of CPL(1) is given by \( (\overline{x}^{i_{{\ast}}},y_{i_{ {\ast}}}) \) where

$$ \displaystyle{ i_{{\ast}}\in \mathrm{arg}\min \{g(y_{i}) + c\overline{x}^{i}\vert \ i = 0,1,\ldots,N\}. } $$
(9.110)

Thus, all is reduced to solving LP(y) parametrically in y ∈ [0, s]. But this is very easy because LP(y) is a continuous knapsack problem. In fact, by reindexing we can assume that the c j are ordered by increasing values:

$$ \displaystyle{c_{1}\ \leq \ c_{2}\ \leq \ldots \ \leq \ c_{n}.} $$

Let k be an index such that \( \sum _{j=1}^{k-1}d_{j} \leq y \leq \sum _{j=1}^{k}d_{j}. \) 

Lemma 9.10

An optimal solution of LP(y) is \( \overline{x} \) with

$$ \displaystyle{\left \vert \begin{array}{l} \overline{x}_{j} = d_{j}\quad \forall j = 1,\ldots,k - 1 \\ \overline{x}_{k} = y -\sum _{j=1}^{k-1}d_{j},\quad \overline{x}_{j} = 0\quad \forall j = k + 1,\ldots,n. \end{array} \right.} $$

Proof

This well-known result (see, e.g., Dantzig 1963) is rather intuitive because for any feasible solution x of LP(y), by noting that \( y =\sum _{j=1}^{n}x_{j} \) we can write

$$ \displaystyle\begin{array}{rcl} \langle c,\overline{x}\rangle & =& \sum _{j=1}^{k-1}c_{ j}d_{j} + c_{k}\sum _{j=1}^{n}x_{ j} - c_{k}\sum _{j=1}^{k-1}d_{ j} {}\\ & =& \sum _{j=1}^{k-1}c_{ j}d_{j} - c_{k}\sum _{j=1}^{k-1}(d_{ j} - x_{j}) + c_{k}\sum _{j=k}^{n}x_{ j} {}\\ & \leq & \sum _{j=1}^{k-1}c_{ j}d_{j} -\sum _{j=1}^{k-1}c_{ j}(d_{j} - x_{j}) + c_{k}\sum _{j=k}^{n}x_{ j} \leq \langle c,x\rangle. {}\\ \end{array} $$

 □ 

Proposition 9.10

  1. (i)

    The breakpoints of the optimal value function of LP(y), 0 ≤ y ≤ s, are among the points:

    $$ \displaystyle{y_{0} = 0,\ y_{k} = y_{k-1} + d_{k}\quad k = 1,\ldots,n} $$
  2. (ii)

    For every k = 0,1,…,n a basic optimal solution of LP(y k ) is the vector \( \overline{x}^{k} \) such that

    $$ \displaystyle{ \overline{x}_{j}^{k} = d_{ j},\ j = 1,\ldots,k;\quad \overline{x}_{j}^{k} = 0,\ j = k + 1,\ldots n. } $$
    (9.111)

Proof

From Lemma 9.10, for any y ∈ [y k , y k+1] a basic optimal solution of LP(y) is

$$ \displaystyle{x^{y} = \overline{x}^{k} +\theta (\overline{x}^{k+1} -\overline{x}^{k})\ \mbox{ with}\ \theta = \frac{y - y_{k}} {d_{k+1}}.} $$

Hence each segment [y k , y k+1] is a linear piece of the optimal value function φ(y) of LP(y), which proves (i). Part (ii) is immediate. □ 

As a result, the concave program CPL(1) can be solved by the following:

Parametric Algorithm for CPL(1)

Step 1. :

Order the coefficients \( c_{j} \) by increasing values:

$$ \displaystyle{c_{1}\ \leq \ c_{2}\ \leq \ldots \ \leq \ c_{n}.} $$
Step 2. :

For k = 0, 1, …, n compute

$$ \displaystyle{f_{k} = g\left (\sum _{j=1}^{k}d_{ j}\right ) +\sum _{ j=1}^{k}c_{ j}d_{j}.} $$
Step 3. :

Find \( f_{k^{{\ast}}} =\min \{ f_{0},f_{1},\ldots,f_{n}\}. \) Then \( (\overline{x}^{k^{{\ast}}},y_{k^{\ast}}) \) with \( y_{k^{\ast}} = \sum _{j=1}^{k^{\ast}}d_{j} \) is an optimal solution.

Proposition 9.11

The above algorithm requires \( O(n\log _{2}n) \) elementary operations and n evaluations of g(⋅).

Proof

The dominant part of the algorithm is Step 1, which requires \( O(n\log _{2}n) \) operations using, e.g., the procedure Heapsort (Ahuja et al. 1993). In addition, one evaluation of g(⋅) is needed for each number f k .  □ 

Thus, assuming the values of the nonlinear function g(⋅) are provided by an oracle, the above algorithm for CPL(1) runs in strongly polynomial time.
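For illustration, the following Python sketch implements the three steps above, with g supplied as a black-box (oracle) function; all names are illustrative and the concave cost used in the usage example is made up.

```python
import math

def solve_CPL1(c, d, g):
    """Parametric algorithm for CPL(1):
    min g(y) + <c, x>  s.t.  sum_j x_j = y,  0 <= x_j <= d_j.
    Returns (x_opt, y_opt, optimal value)."""
    n = len(c)
    # Step 1: sort the cost coefficients in increasing order, O(n log n)
    order = sorted(range(n), key=lambda j: c[j])
    # Step 2: evaluate f_k = g(y_k) + sum of the k cheapest terms c_j d_j
    best_val, best_k = g(0.0), 0
    y, lin = 0.0, 0.0
    for k, j in enumerate(order, start=1):
        y += d[j]
        lin += c[j] * d[j]
        f_k = g(y) + lin
        if f_k < best_val:
            best_val, best_k = f_k, k
    # Step 3: rebuild the optimal basic solution x^{k*}
    x = [0.0] * n
    for j in order[:best_k]:
        x[j] = d[j]
    return x, sum(x), best_val

# Example with the (made up) concave cost g(y) = 4*sqrt(y) - 2*y:
x, y, val = solve_CPL1(c=[3.0, -1.0, 2.0], d=[1.0, 2.0, 1.5],
                       g=lambda t: 4 * math.sqrt(t) - 2 * t)
print(x, y, val)   # here only the variable with the negative cost coefficient is saturated
```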

10 Applications

10.1 Problem PTP(2)

For convenience we shall refer to problem (PTP) with r factories as PTP(r). So PTP(2) is the problem

$$ \displaystyle{PTP(2)\quad \left \vert \begin{array}{ll} \min &g(y) +\sum _{ i=1}^{2}\sum _{ j=1}^{m}c_{ ij}x_{ij} \\ \mbox{ s.t.}&\sum _{j=1}^{m}x_{ 1j} = y \\ \mbox{ } &x_{1j} + x_{2j} = d_{j}\quad j = 1,\ldots,m, \\ \mbox{ } &0 \leq x_{ij} \leq d_{j}\quad \quad i = 1,2,\ j = 1,\ldots,m \end{array} \right.\qquad } $$

Substituting \( d_{j} - x_{1j} \) for \( x_{2j} \) and setting \( x_{j} = x_{1j} \), \( c_{j} = c_{1j} - c_{2j} \), we can convert this problem into a CPL(1) (up to the additive constant \( \sum _{j=1}^{m}c_{2j}d_{j} \)):

$$ \displaystyle{\begin{array}{ll} \min &g(y) +\langle c,x\rangle \\ \mbox{s.t.}&\sum _{ j=1}^{m}x_{ j} = y,\quad 0 \leq x_{j} \leq d_{j}\ \forall j.\end{array} } $$

Therefore,

Proposition 9.12

PTP(2) can be solved by an algorithm requiring O(mlog2 m) elementary operations and m evaluations of g(⋅).
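A minimal sketch of this reduction, reusing the solve_CPL1 routine sketched after the CPL(1) algorithm above (all names are illustrative): substitute \( x_{2j} = d_{j} - x_{1j} \), solve the resulting CPL(1), and add back the constant \( \sum _{j}c_{2j}d_{j} \).

```python
def solve_PTP2(g, c1, c2, d):
    """Solve PTP(2) by reduction to CPL(1) (uses solve_CPL1 from the sketch above).
    c1[j], c2[j]: unit costs from factories 1 and 2 to warehouse j; d[j]: demands."""
    const = sum(c2j * dj for c2j, dj in zip(c2, d))   # cost of serving everything from factory 2
    c = [c1j - c2j for c1j, c2j in zip(c1, c2)]       # reduced cost of x_j = x_{1j}
    x1, y, val = solve_CPL1(c, d, g)                  # y = total amount produced at factory 1
    x2 = [dj - x1j for dj, x1j in zip(d, x1)]         # recover x_{2j} = d_j - x_{1j}
    return x1, x2, val + const
```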

10.2 Problem FP(1)

We shall refer to Problem (MCCNFP) with a single source and r nonlinear arc costs as FP(r). So FP(1) is the problem of finding a feasible flow with minimum cost in a given network G with m nodes (indexed by j = 1, …, m) and n arcs (indexed by i = 1, …, n), where node j has a demand \( d_{j} \) (with \( d_{m} < 0 \), \( d_{j} \geq 0 \) for all other j, and \( \sum _{j=1}^{m}d_{j} = 0 \)), arc 1 has a concave nonlinear cost function \( g_{1}(t) \), while each arc i = 2, …, n has a linear cost \( g_{i}(t) = c_{i}t \) (with \( c_{i} \geq 0 \)). If \( A_{j}^{+} \) and \( A_{j}^{-} \) denote the sets of arcs entering and leaving node j, respectively, then the problem is

$$ \displaystyle{FP(1)\quad \left \vert \begin{array}{ll} \min &g_{1}(x_{1}) +\sum _{ i=2}^{n}c_{ i}x_{i} \\ \mbox{ s.t.}&\sum _{i\in A_{j}^{+}}x_{i} -\sum _{i\in A_{j}^{-}}x_{i} = d_{j},\quad j = 1,2,\ldots,m, \\ \mbox{ } &x \geq 0. \end{array} \right.\qquad } $$

Proposition 9.13

FP(1) can be reduced to a PTP(2) by means of \( O(m\log _{2}m + n) \) elementary operations.

Proof

To simplify the language we call the single arc with nonlinear cost the black arc. All the other arcs are called white, and the unit cost \( c_{i} \) attached to a white arc is called its length. A directed path through white arcs only is called a white path; its length is then the sum of the lengths of its arcs. A white path from a node j to a node j′ is called shortest if its length is smallest among all white paths from j to j′. Denote by \( Q_{0j} \) the shortest white path from the head of the black arc to sink j, by \( Q_{1j} \) the shortest white path from the source to sink j, and by P the shortest white path from the source to the tail of the black arc. Let \( c_{0j}, c_{1j}, p \) be the lengths of these paths, respectively (the length being +∞, or an arbitrarily large positive number, if the path does not exist). Assuming \( \{j\vert \ d_{j} > 0\} = \{1,\ldots,m_{1}\} \), let \( s =\sum _{ j=1}^{m_{1}}d_{j}. \) Consider now a problem PTP(2) defined for two factories \( F_{0},F_{1} \) and \( m_{1} \) warehouses \( W_{1},\ldots,W_{m_{1}} \) with demands \( d_{1},\ldots,d_{m_{1}}, \) where the cost of producing y units at factory \( F_{0} \) and s − y units at factory \( F_{1} \) is \( g_{1}(y) + py \), while the unit transportation cost from \( F_{i} \) to \( W_{j} \) is \( c_{ij} \) (i = 0, 1), i.e.,

$$ \displaystyle{ \left \vert \begin{array}{ll} \min &g_{1}(y) + py +\sum _{ i=0}^{1}\sum _{j=1}^{m_{1}}c_{ij}x_{ij} \\ \mbox{ s.t.}&\sum _{j=1}^{m_{1}}x_{0j} = y \\ \mbox{ } &x_{0j} + x_{1j} = d_{j}\quad j = 1,\ldots,m_{1} \\ \mbox{ } & 0 \leq x_{0j} \leq d_{j}\quad j = 1,\ldots,m_{1} \end{array} \right. } $$
(9.112)

(the corresponding reduced network is depicted in Fig. 9.2). Clearly from an optimal solution \( [\overline{x}_{ij}] \) of (9.112) we can derive an optimal solution of FP(1) by setting, in FP(1), 

$$ \displaystyle{x_{1} =\sum _{ j=1}^{m_{1} }\overline{x}_{0j}} $$

and solving the corresponding linear min-cost network flow problem. Thus, solving FP(1) is reduced to solving the PTP(2) just defined. To complete the proof, it remains to observe that the computation of \( c_{ij} \), i = 0, 1, j = 1, …, m 1, and p needed for the above transformation requires solving two single-source multiple-sink shortest path problems with nonnegative arc costs, hence a time of \( O(m\log _{2}m + n) \) (using Fredman and Tarjan's (1984) implementation of Dijkstra's algorithm). □ 

Fig. 9.2 Reduced network

It follows from Propositions 9.11 and 9.13 that FP(1) can be solved by a strongly polynomial algorithm requiring \( O(m\log _{2}m + m\log _{2}m + n) = O(m\log _{2}m + n) \) elementary operations and m evaluations of the function \( g_{1}(\cdot ) \). 
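The preprocessing in the proof of Proposition 9.13 amounts to two shortest-path computations on the white subgraph. Below is a minimal sketch using a binary-heap Dijkstra (O((m + n) log m), slightly weaker than the Fibonacci-heap bound quoted above); the graph representation and all names are assumptions made for illustration.

```python
import heapq

def dijkstra(white_adj, source):
    """Shortest white-path lengths from `source`.
    white_adj: dict node -> list of (neighbor, nonnegative length); every node,
    even one with no outgoing white arc, must appear as a key."""
    dist = {v: float("inf") for v in white_adj}   # inf plays the role of "path does not exist"
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist[u]:
            continue
        for v, w in white_adj[u]:
            if du + w < dist[v]:
                dist[v] = du + w
                heapq.heappush(heap, (dist[v], v))
    return dist

def fp1_reduction_data(white_adj, source, black_tail, black_head, sinks):
    """Data needed by the FP(1) -> PTP(2) reduction: the length p of the shortest
    white path from the source to the tail of the black arc, and, for every sink j,
    the shortest white-path lengths from the source and from the black arc's head."""
    from_source = dijkstra(white_adj, source)
    from_head = dijkstra(white_adj, black_head)
    p = from_source[black_tail]
    return p, {j: from_source[j] for j in sinks}, {j: from_head[j] for j in sinks}
```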

Remark 9.9

A polynomial (but not strongly polynomial) algorithm for FP(1) was first given by Guisewite and Pardalos (1992). Later a strongly polynomial algorithm based on solving linear min-cost network flow problems with a parametric objective function was obtained by Klinz and Tuy (1993). The algorithms presented in this section for PTP(2) and FP(1) were developed by Tuy et al. (1993a) and subsequently extended in a series of papers by Tuy et al. (1993b, 1994a, 1995b, 1996a) to prove the strong polynomial solvability of a class of network flow problems including PTP(r) and FP(r) with fixed r. 

10.3 Strong Polynomial Solvability of PTP(r)

The problem PTP(r), as formulated at the beginning of Sect. 8.8, is

$$ \displaystyle{PTP(r)\quad \left \vert \begin{array}{ll} \min &g\left (\sum _{j=1}^{m}x_{ 1j},\ldots,\sum _{j=1}^{m}x_{ rj}\right ) +\sum _{ i=1}^{r}\sum _{ j=1}^{m}c_{ ij}x_{ij} \\ \mbox{ s.t.}&\sum _{i=1}^{r}x_{ ij} = d_{j}\quad j = 1,\ldots,m \\ \mbox{ } &x_{ij} \geq 0,\quad i = 1,\ldots,r,\ j = 1,\ldots,m \end{array} \right.\qquad } $$

where g(y′) ≥ g(y) whenever y′ ≥ y. 

Following the approach in Sect. 8.5 we associate with PTP(r) the parametric problem

$$ \displaystyle{P(t)\quad \left \vert \begin{array}{ll} \min &\sum _{i=1}^{r}t_{ i}\sum _{j=1}^{m}x_{ ij} +\sum _{ i=1}^{r}\sum _{ j=1}^{m}c_{ ij}x_{ij} \\ \mbox{ s.t.}&\sum _{i=1}^{r}x_{ ij} = d_{j}\quad j = 1,\ldots,m \\ \mbox{ } &x_{ij} \geq 0,\quad i = 1,\ldots,r,\ j = 1,\ldots,m \end{array} \right.\qquad } $$

where \( t \in \mathbb{R}_{+}^{r}. \) The parametric domain \( \mathbb{R}_{+}^{r} \) is partitioned into a finite collection \( \mathcal{P} \) of polyhedrons (“cells”) such that for each cell \( \varPi \in \mathcal{P} \) there is a basic solution x Π which is optimal to P(t) for all t ∈ Π. Then an optimal solution of PTP(r) is \( x^{\varPi ^{{\ast}} } \) where

$$ \displaystyle{ \varPi ^{{\ast}}\in \mathrm{argmin}\left \{g\left (\sum _{ j=1}^{m}x_{ 1j}^{\varPi },\ldots,\sum _{ j=1}^{m}x_{ rj}^{\varPi }\right ) +\sum _{ i,j}c_{ij}x_{ij}^{\varPi }\vert \varPi \in \mathcal{P}\right \}. } $$
(9.113)

Following Tuy (2000b) we now show that the collection \( \mathcal{P} \) can be constructed and its cardinality is bounded by a polynomial in m.

Observe that the dual of P(t) is

$$ \displaystyle{P^{{\ast}}(t)\quad \left \vert \begin{array}{ll} \max &\sum _{j=1}^{m}d_{ j}u_{j} \\ \mbox{ s.t.}&u_{j} \leq t_{i} + c_{ij},\quad i = 1,\ldots,r,j = 1,\ldots,m \end{array} \right.\qquad } $$

Also, for any fixed \( t \in \mathbb{R}_{+}^{r} \) a basic solution of P(t) is a vector \( x^{t} \) such that for every j = 1, …, m there is an \( i_{j} \) satisfying

$$ \displaystyle{ x_{ij}^{t} = \left \{\begin{array}{ll} d_{j}&i = i_{j} \\ 0 &i\neq i_{j} \end{array} \right. } $$
(9.114)

By the duality of linear programming x t is a basic optimal solution of P(t) if and only if there exists a feasible solution u = (u 1, , u m ) of P (t) satisfying

$$ \displaystyle{u_{j}\left \{\begin{array}{ll} = t_{i} + c_{ij}&\quad i = i_{j} \\ \leq t_{i} + c_{ij}&\quad i\neq i_{j} \end{array} \right.} $$

or, alternatively, if and only if for every j = 1, …, m:

$$ \displaystyle{ i_{j} \in \mathrm{argmin}_{i=1,\ldots,r}\{t_{i} + c_{ij}\}. } $$
(9.115)

Now let \( I_{\ast}^{2} \) be the set of all pairs \( (i_{1},i_{2}) \) such that \( i_{1} < i_{2} \), \( i_{1},i_{2} \in \{1,\ldots,r\} \). Define a cell to be a polyhedron \( \varPi \subset \mathbb{R}_{+}^{r} \) which is the solution set of a linear system formed by taking, for every pair \( (i_{1},i_{2}) \in I_{\ast}^{2} \) and every j = 1, …, m, one of the following inequalities:

$$ \displaystyle{ t_{i_{1}} + c_{i_{1}j} \leq t_{i_{2}} + c_{i_{2}j},\quad t_{i_{1}} + c_{i_{1}j} \geq t_{i_{2}} + c_{i_{2}j}. } $$
(9.116)

Then for every j = 1, …, m the ordering of the sequence

$$ \displaystyle{t_{i} + c_{ij},\quad i = 1,\ldots,r} $$

remains unchanged as t varies over a cell Π. Hence the index \( i_{j} \) satisfying (9.115) remains the same for all t ∈ Π; in other words, \( x^{t} \) (the basic optimal solution of P(t)) equals a constant vector \( x^{\varPi } \) for all t ∈ Π. Let \( \mathcal{P} \) be the collection of all cells defined in that way. Since every \( t \in \mathbb{R}_{+}^{r} \) satisfies one of the inequalities (9.116) for every \( (i_{1},i_{2}) \in I_{\ast}^{2} \) and every j = 1, …, m, the collection \( \mathcal{P} \) covers all of \( \mathbb{R}_{+}^{r}. \) Let us estimate an upper bound for the number of cells in \( \mathcal{P} \).

Observe that for any fixed pair \( (i_{1},i_{2}) \in I_{\ast}^{2} \) we have \( t_{i_{1}} + c_{i_{1}j} \leq t_{i_{2}} + c_{i_{2}j} \) if and only if \( t_{i_{1}} - t_{i_{2}} \leq c_{i_{2}j} - c_{i_{1}j}. \) Let us sort the numbers \( c_{i_{2}j} - c_{i_{1}j},j = 1,\ldots,m, \) in increasing order

$$ \displaystyle{ c_{i_{2}j_{1}} - c_{i_{1}j_{1}} \leq c_{i_{2}j_{2}} - c_{i_{1}j_{2}} \leq \ldots \leq c_{i_{2}j_{m}} - c_{i_{1}j_{m}}. } $$
(9.117)

and let \( \nu _{i_{1},i_{2}}(j) \) be the position of \( c_{i_{2}j} - c_{i_{1}j} \) in this ordered sequence.

Proposition 9.14

A cell Π is characterized by a map \( k_{\varPi }: I_{\ast}^{2} \rightarrow \{1,\ldots,m + 1\} \) such that Π is the solution set of the linear system

$$ \displaystyle\begin{array}{rcl} & & t_{i_{1}} + c_{i_{1}j} \leq t_{i_{2}} + c_{i_{2}j}\quad \forall (i_{1},i_{2}) \in I_{\ast}^{2}\ \mathrm{s.t.}\ \nu _{ i_{1},i_{2}}(j) \geq k_{\varPi }(i_{1},i_{2}){}\end{array} $$
(9.118)
$$ \displaystyle\begin{array}{rcl} & & t_{i_{1}} + c_{i_{1}j} \geq t_{i_{2}} + c_{i_{2}j}\quad \forall (i_{1},i_{2}) \in I_{\ast}^{2},\ \mathrm{s.t.}\ \nu _{ i_{1},i_{2}}(j) < k_{\varPi }(i_{1},i_{2}){}\end{array} $$
(9.119)

Proof

Let \( \varPi \subset \mathbb{R}_{+}^{r} \) be a cell. For every pair \( (i_{1},i_{2}) \) with \( i_{1} < i_{2} \) denote by \( J_{\varPi }^{i_{1},i_{2}} \) the set of all j = 1, …, m such that the left inequality in (9.116) holds for all t ∈ Π, and define

$$ \displaystyle{k_{\varPi }(i_{1},i_{2}) = \left \{\begin{array}{ll} \min \{\nu _{i_{1},i_{2}}(j)\vert \ j \in J_{\varPi }^{i_{1},i_{2}}\} & \mathrm{if}\ J_{\varPi }^{i_{1},i_{2}}\neq \emptyset \\ m + 1 &\mathrm{otherwise} \end{array} \right.} $$

It is easy to see that Π is then the solution set of the system (9.118) and (9.119). Indeed, let t ∈ Π. If \( \nu _{i_{1},i_{2}}(j) \geq k_{\varPi }(i_{1},i_{2}) \), then \( k_{\varPi }(i_{1},i_{2})\neq m + 1 \), so \( k_{\varPi }(i_{1},i_{2}) =\nu _{i_{1},i_{2}}(l) \) for some \( l \in J_{\varPi }^{i_{1},i_{2}}. \) Then \( t_{i_{1}} + c_{i_{1}l} \leq t_{i_{2}} + c_{i_{2}l}, \) hence \( t_{i_{1}} - t_{i_{2}} \leq c_{i_{2}l} - c_{i_{1}l} \) and since the relation \( \nu _{i_{1},i_{2}}(j) \geq \nu _{i_{1},i_{2}}(l) \) means that \( c_{i_{2}j} - c_{i_{1}j} \geq c_{i_{2}l} - c_{i_{1}l} \) it follows that \( t_{i_{1}} - t_{i_{2}} \leq c_{i_{2}j} - c_{i_{1}j}, \) i.e., \( t_{i_{1}} + c_{i_{1}j} \leq t_{i_{2}} + c_{i_{2}j}. \) Therefore (9.118) holds. On the other hand, if \( \nu _{i_{1},i_{2}}(j) < k_{\varPi }(i_{1},i_{2}) \) then by the definition of a cell \( j\notin J_{\varPi }^{i_{1},i_{2}}, \) hence (9.119) holds, too [since from the definition of a cell, any t ∈ Π must satisfy one of the inequalities (9.116)]. Thus every t ∈ Π is a solution of the system (9.118) and (9.119). Conversely, if t satisfies (9.118) and (9.119) then for every \( (i_{1},i_{2}) \in I_{\ast}^{2} \), t satisfies the left inequality in (9.116) for \( j \in J_{\varPi }^{i_{1},i_{2}} \) and the right inequality for \( j\notin J_{\varPi }^{i_{1},i_{2}}, \) hence t ∈ Π. Therefore, each cell Π is determined by a map \( k_{\varPi }: I_{\ast}^{2} \rightarrow \{1,\ldots,m + 1\} \). Furthermore, it is easily seen that \( k_{\varPi }\neq k_{\varPi '} \) for two different cells Π, Π′. Indeed, if Π ≠ Π′ then at least for some \( (i_{1},i_{2}) \in I_{\ast}^{2} \) and some j = 1, …, m, one has \( j \in J_{\varPi }^{i_{1},i_{2}}\setminus J_{\varPi '}^{i_{1},i_{2}}. \) Then \( k_{\varPi }(i_{1},i_{2}) \leq \nu _{i_{1},i_{2}}(j) \) but \( k_{\varPi '}(i_{1},i_{2}) >\nu _{i_{1},i_{2}}(j). \) □ 

Corollary 9.4

The total number of cells is bounded above by \( (m + 1)^{r(r-1)/2} \).

Proof

The number of cells does not exceed the number of different maps \( k: I_{\ast}^{2} \rightarrow \{1,\ldots,m + 1\} \), and there are \( (m + 1)^{r(r-1)/2} \) such maps. □ 

To sum up, the proposed algorithm for solving PTP(r) involves the following steps:

  1. (1)

    Ordering the sequence \( c_{i_{2}j} - c_{i_{1}j},\ j = 1,\ldots,m \) for every pair \( (i_{1},i_{2}) \in I_{\ast}^{2} \) so as to determine \( \nu_{i_{1},i_{2}}(j),\ j = 1,\ldots,m,\ (i_{1},i_{2}) \in I_{{\ast}}^{2}. \)

  2. (2)

    Computing the vector \( x^{\varPi } \) for every cell \( \varPi \in \mathcal{P} \) (\( \mathcal{P} \) being the collection of cells determined by the maps \( k_{\varPi }: I_{\ast}^{2} \rightarrow \{1,\ldots,m + 1\} \)). 

  3. (3)

    Computing the values \( f(x^{\varPi }) \) and selecting \( \varPi ^{{\ast}} \) according to (9.113).

    Steps (1) and (2) obviously require a number of elementary operations bounded by a polynomial in m, while Step (3) requires \( m^{r(r-1)/2} \) evaluations of f(x). 
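The following Python sketch (assuming NumPy and SciPy are available; all names are illustrative) follows Steps (1)–(3) literally: it enumerates every threshold map of Proposition 9.14, uses a small LP to test whether the corresponding cell is nonempty and to pick a representative t in it, applies rule (9.115) to obtain \( x^{\varPi } \), and keeps the best candidate. Ties on cell boundaries are broken arbitrarily here; a careful implementation would treat degenerate cells more cautiously.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def solve_PTP_r(g, c, d):
    """Cell-enumeration sketch for PTP(r) (r >= 2).
    c: r x m matrix of unit costs, d: length-m demand vector,
    g: nondecreasing concave production cost, called as g(y) with y = (y_1,...,y_r).
    For fixed r at most (m+1)^(r(r-1)/2) cells are enumerated."""
    c, d = np.asarray(c, float), np.asarray(d, float)
    r, m = c.shape
    pairs = [(i1, i2) for i1 in range(r) for i2 in range(i1 + 1, r)]
    diffs = {p: np.sort(c[p[1]] - c[p[0]]) for p in pairs}   # sorted c_{i2 j} - c_{i1 j}
    best_val, best_x = np.inf, None
    # one threshold per pair = one candidate map k_Pi of Proposition 9.14
    for ks in itertools.product(range(m + 1), repeat=len(pairs)):
        A, b = [], []
        for (i1, i2), k in zip(pairs, ks):
            row = np.zeros(r)
            row[i1], row[i2] = 1.0, -1.0
            if k < m:     # t_{i1} - t_{i2} <= c_{i2 j} - c_{i1 j} for positions >= k
                A.append(np.append(row, 1.0)); b.append(diffs[(i1, i2)][k])
            if k > 0:     # reverse inequality for positions < k
                A.append(np.append(-row, 1.0)); b.append(-diffs[(i1, i2)][k - 1])
        # feasibility LP in (t, margin); maximizing the margin pushes t inside the cell
        res = linprog(np.append(np.zeros(r), -1.0), A_ub=np.array(A), b_ub=np.array(b),
                      bounds=[(0, None)] * r + [(0, 1)], method="highs")
        if not res.success:                                   # empty cell
            continue
        t = res.x[:r]
        assign = np.argmin(t[:, None] + c, axis=0)            # rule (9.115)
        y = np.array([d[assign == i].sum() for i in range(r)])
        val = g(y) + float((c[assign, np.arange(m)] * d).sum())
        if val < best_val:
            best_val, best_x = val, np.zeros((r, m))
            best_x[assign, np.arange(m)] = d
    return best_x, best_val

# Tiny made-up example: r = 2 factories, m = 3 customers, concave production costs
g = lambda y: 3.0 * np.sqrt(y[0]) + 2.0 * np.sqrt(y[1])
x, val = solve_PTP_r(g, c=[[1.0, 4.0, 2.0], [3.0, 1.0, 2.0]], d=[5.0, 5.0, 5.0])
```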

11 Exercises

9.1 Show that a differentiable function f(x) is K-monotonic on an open convex set \( X \subset \mathbb{R}^{n} \) if and only if \( -\nabla f(x) \in K^{\circ }\quad \forall x \in X. \)

9.2 Consider the problem min{f(x) |  x ∈ D} where D is a polyhedron in \( \mathbb{R}^{n} \) with a vertex at 0, and \( f: \mathbb{R}^{n} \rightarrow \mathbb{R} \) is a concave function such that for any real number γ ≤ f(0) the level set C γ  = { x |  f(x) ≥ γ} has a polar contained in the cone generated by two linearly independent vectors \( c^{1},c^{2} \in \mathbb{R}^{n}. \) Show that this problem reduces to minimizing a concave function of two variables on the projection of D on \( \mathbb{R}^{2}. \) Develop an OA method for solving the problem.

9.3 Consider the problem

$$ \displaystyle{\min \{g(t) +\langle d,y\rangle \vert \ ta + By \leq c,t \geq 0,y \geq 0\}} $$

where \( t \in \mathbb{R},y \in \mathbb{R}^{n},a \in \mathbb{R}^{m},B \in \mathbb{R}^{m\times n},c \in \mathbb{R}_{+}^{m}. \) Develop a parametric algorithm for solving the problem and compare with the polyhedral annexation method.

9.4 Show that a linearly constrained problem of the form

$$ \displaystyle{\min \{F(l(x))\vert \ \ x \in D\}} $$

where D is a polyhedron in \( \mathbb{R}^{n}, \) l(x) is an affine function, and F(t) is an arbitrary quasiconcave real-valued function on an interval Δ ⊃ l(D), reduces to at most two linear programs. If F(⋅) is monotonic on l(D) then the problem is in fact equivalent to a single linear program.

9.5 Solve the plant location problem

$$ \displaystyle\begin{array}{rcl} & \min & \sum _{i=1}^{3}f_{ i}(x_{i}) +\sum _{ i=1}^{3}\sum _{ j=1}^{5}d_{ ij}y_{ij} {}\\ & \mathrm{s.t.}& x_{1} + x_{2} + x_{3} = 203 {}\\ & \mbox{ } & y_{1j} + y_{2j} + y_{3j} = b_{j}\quad j = 1,\ldots,5 {}\\ & \mbox{ } & y_{i1} +\ldots +y_{i5} = x_{i}\quad i = 1,2,3 {}\\ & \mbox{ } & x_{i} \geq 0;\quad y_{ij} \geq 0\quad i = 1,2,3;j = 1,\ldots,5. {}\\ \end{array} $$

where f i (t) = 0 if t = 0,  f i (t) = r i + s i t if t > 0

$$ \displaystyle\begin{array}{rcl} & & r_{1} = 1,s_{1} = 1.7;\quad r_{2} = 88,s_{2} = 8.4;\quad r_{3} = 39;s_{3} = 4.7 {}\\ & & b = (62,65,51,10,15);\quad d = [d_{ij}] = \left [\begin{array}{ccccc} 6 &66&68&81& 4\\ 40 &20 &34 &83 &27 \\ 90&22&82&17& 8 \end{array} \right ]{}\\ \end{array} $$

9.6 Suppose that parts of n different kinds must be manufactured. The production cost of t items of the i-th kind is a concave function \( f_{i}(t) \). The number of items of the i-th kind required is \( a_{i} > 0 \), but one item of the i-th kind can be replaced by one item of any j-th kind with j > i. Determine the number \( x_{i} \) of items of the i-th kind, i = 1, …, n, to be manufactured so as to satisfy the demands with minimum cost. By setting \( \alpha _{i} =\sum _{j=1}^{i}a_{j}, \) this problem can be formulated as:

$$ \displaystyle\begin{array}{rcl} & & \min \ \sum _{i=1}^{n}f_{ i}(x_{i})\quad \mbox{ s.t.} {}\\ & & x_{1} \leq \alpha _{1},\ x_{1} + x_{2} \leq \alpha _{2},\ldots,x_{1} +\ldots +x_{n} \leq \alpha _{n} {}\\ & & x_{1},\ldots,x_{n} \geq 0 {}\\ \end{array} $$

(Zukhovitzki et al. 1968). Solve this problem.

(Hint: Any extreme point x of the feasible polyhedron can be characterized by the condition: for each i = 1, , n, either x i  = 0 or x 1 + + x i  = α i ).

9.7 Develop an algorithm for solving the problem

$$ \displaystyle{\min \{c^{0}x + (c^{1}x)^{1/2} \times (c^{2}x)^{1/3}\vert \quad x \in D\}} $$

where \( D \subset \mathbb{R}_{+}^{n} \) is a polytope, \( c^{1},c^{2} \in \mathbb{R}_{+}^{n}, \) \( c^{0} \in \mathbb{R}_{-}^{n} \) and cx denotes the inner product of c and x. Apply the algorithm to a small numerical example with n = 2.

9.8 Solve the numerical Example 9.10.

9.9 Solve by polyhedral annexation:

$$ \displaystyle\begin{array}{rcl} & & \min \ (0.5x_{1} + 0.75x_{2} + 0.75x_{3} + 1.25x_{4})\quad \mbox{ s.t.} {}\\ & & 0.25x_{2} + 0.25x_{3} + 0.75x_{4} \leq 2 {}\\ & & -0.25x_{2} - 0.25x_{3} + 0.25x_{4} \leq 0 {}\\ & & x_{1} - 0.5x_{2} + 0.5x_{3} + 1.5x_{4} \leq 1.5 {}\\ & & h(x) + 4x_{1} - x_{2} = 0 {}\\ & & x = (x_{1},\ldots,x_{4}) \geq 0 {}\\ \end{array} $$

where h(x) is the optimal value of the linear program

$$ \displaystyle\begin{array}{rcl} & & \min \ (-4y_{1} + y_{2})\quad \mbox{ s.t.}\hspace{142.26378pt} {}\\ & & y_{1} + y_{2} \leq 1.5 - 0.5x_{2} - 0.5x_{3} - 1.5x_{4} {}\\ & & y_{2} \leq x_{1} + x_{2};\quad y_{1},y_{2} \geq 0 {}\\ \end{array} $$

(Hint: Exploit the monotonicity property of h(x))

9.10 Solve the problem

$$ \displaystyle\begin{array}{rcl} & & \min \ \sum _{i=1}^{n} \frac{x_{i}} {y_{i}+\eta }\quad \mbox{ s.t.} {}\\ & & \sum _{i=1}^{n}x_{ i} = a,\quad \sum _{i=1}^{n}y_{ i} = b {}\\ & & 0 \leq x_{i} \leq \gamma,\quad 0 \leq y_{i} \leq 1\quad i = 1,\ldots,n\hspace{71.13188pt} {}\\ \end{array} $$

where 0 < η, 0 < a ≤ n γ, 0 < b ≤ n. 

Hint: Reduce the problem to the concave minimization problem

$$ \displaystyle{\min \left \{h(x)\vert \sum _{i=1}^{n}x_{ i} = a,0 \leq x_{i} \leq \gamma \right \}} $$

where

$$ \displaystyle{h(x) =\min \left \{\sum _{i=1}^{n} \frac{x_{i}} {y_{i}+\eta }\vert \ \sum _{i=1}^{n}y_{ i} = b,0 \leq y_{i} \leq 1\quad i = 1,\ldots,n\right \}} $$

and use the results on the special concave minimization CPL(1).

9.11 Solve the problem in Example 5.5 (page 144) where the constraints (5.24) are replaced by

$$ \displaystyle{\sum _{i=1}^{n}x_{ i} = a,\quad \sum _{i=1}^{n}y_{ i} = b.} $$