1 Introduction

Mixed-integer nonlinear programming (MINLP) has proven to be a powerful modeling paradigm and has received increased attention in recent years. However, despite the tremendous advances in the theoretical and algorithmic treatment of MINLPs, they are still significantly less scalable than their linear counterparts such that solving MINLPs of large sizes remains a challenge. In this work, we consider the exact solution of a class of generally nonconvex MINLPs whose structure is amenable to branch-and-price, hence allowing us to combine the strengths of two key enabling technologies: column generation for large-scale integer programming and global optimization of MINLPs. Specifically, we consider problems of the following form:

$$\begin{aligned} \mathop {\mathrm{minimize}}\limits _{x,y,z}&\quad c^{\top } x + f(y,z) \end{aligned}$$
(1a)
$$\begin{aligned} \mathrm{subject}\;\mathrm{to}&\quad A x + D y \ge b \end{aligned}$$
(1b)
$$\begin{aligned}&\quad g(y,z) \le 0 \end{aligned}$$
(1c)
$$\begin{aligned}&\quad y^{\min } \le y \le y^{\max } \end{aligned}$$
(1d)
$$\begin{aligned}&\quad x \in {\mathbb {R}}_+^m \times {\mathbb {Z}}_+^{\bar{m}}, \, y \in {\mathbb {Z}}^p, \, z \in {\mathbb {R}}^q \times {\mathbb {Z}}^{\bar{q}}, \end{aligned}$$
(1e)

where the vectors x and z can contain both continuous and integer variables, while the y-variables are all integer. Note that without loss of generality, x are constrained to be nonnegative. The objective function (1a) consists of a linear term, \(c^{\top }x\), and a nonlinear term, f(yz). While the linear constraints (1b) involve x and y, g(yz) in (1c) are nonlinear functions of y and z. The functions f and g can be nonconvex. Constraints (1d) ensure that y are bounded. We assume that problem (1) can be efficiently solved if constraints (1b), which we call the complicating constraints, are removed.

We are particularly interested in problems that, without the complicating constraints, result in smaller MINLPs that can be solved using state-of-the-art global solvers. This most commonly occurs when the problem has a decomposable structure, such that multiple independent subproblems are generated when the complicating constraints are removed, but can also occur when the number of linear constraints is much larger than the number of nonlinear constraints. We find that many relevant problems directly fall or can be reformulated into the aforementioned class of MINLPs. Examples can be found in process and product design, production capacity planning, dynamic facility location, stochastic programming, statistical learning, and many more application domains.

In this work, we investigate the computational feasibility of a branch-and-price approach to solving MINLPs of the given form. While the suitability of the proposed algorithm has been indicated in the literature [1,2,3], it has barely found any application. We hence aim to systematically analyze the theoretical basis of the branch-and-price approach, highlight critical algorithmic considerations, and examine the algorithm’s performance in extensive computational experiments.

The remainder of this paper is organized as follows: In Sect. 2, we review existing works for finding the exact solution of MINLPs and solving large-scale MILPs using branch-and-price. Next, we present an approach for reformulating problems of the form presented in. (1) using discretization in Sect. 3. This reformulation makes the problem amenable for solution using branch-and-price, and the algorithm for doing so is presented in Sect. 4. We show how this approach can be extended to problems where the pricing subproblem is decomposable in Sect. 5. In Sect. 6, the performance of the branch-and-price approach is assessed in four representative case studies. Finally, in Sect. 7 we provide some concluding remarks.

2 Literature review

The development of exact methods for the solution of mixed-integer linear programs (MILPs) dates back to the 1950s [4, 5] (for more details on the history of integer programming, see [6]). Over the last decades, MILP has reached a level of maturity that has made it the primary approach to solving many industrial and scientific problems of high complexity and dimensionality. Mixed-integer nonlinear programming is a more recent development and was initially motivated by applications in chemical and process systems engineering [7].

The development of MINLP algorithms has initially focused on the convex case, i.e. problems in which, loosely speaking, the continuous relaxation of the MINLP is convex. Methods for solving convex MINLPs include branch-and-bound [8, 9], generalized Benders decomposition [10], outer approximation [11, 12], LP/NLP-based branch-and-bound [13], extended cutting plane [14], and extended supporting hyperplane [15]. For recent reviews on convex MINLP, we refer the reader to Grossmann [16], Bonami et al. [17], and Kronqvist et al. [18].

Compared to convex MINLPs, solving nonconvex MINLPs is significantly more challenging due to the nonconvexity that remains even after relaxing the integer restrictions. Exact algorithms for nonconvex MINLP incorporate concepts from global continuous optimization, such as convex relaxations and spatial branch-and-bound [19]. Major improvements have been achieved with the development of the branch-and-reduce [20, 21] and \(\alpha \)-branch-and-bound methods [22, 23]. Furthermore, incorporating MILP relaxations and integer programming techniques for generating cutting planes has proven to be very effective [24]. These and other algorithmic advances are implemented in state-of-the-art global MINLP solvers, such as BARON [24], Couenne [25], LINDOGlobal [26], ANTIGONE [27], and SCIP [28]. For recent reviews focusing on nonconvex MINLP, see [29, 30].

Although the performance of global MINLP solvers has improved significantly over the last two decades, they are still by far not as scalable as state-of-the-art MILP solvers. Hence, to solve large-scale nonconvex MINLPs, one often resorts to decomposition methods that exploit special model structures. Popular decomposition approaches include different variants of Lagrangean decomposition [31] and progressive hedging [32]. However, these methods have to be considered heuristics since, although they can provide lower and upper bounds, it is not guaranteed that the duality gap can be closed. Exact decomposition algorithms for nonconvex MINLP are a rarity. One of those is bilevel decomposition, which iterates between a master MILP that is a relaxation of the original MINLP and an NLP or MINLP subproblem obtained by fixing integer variables. Integer and outer-approximation cuts (or tailored cuts that can be interpreted as such) are added to the master MILP at each iteration. The convergence behavior of bilevel decomposition strongly depends on the quality of the MILP relaxation and is therefore very application-specific [33,34,35]. Generalized Benders decomposition has been extended to solve two-stage stochastic nonconvex separable MINLPs in which only the continuous variables are involved in the nonconvex terms [36]. Cao and Zavala[37] propose a branch-and-bound scheme to address two-stage stochastic nonconvex MINLP with mixed-integer first-stage and continuous second-stage variables. Combining generalized Benders decomposition and branch-and-cut, Li and Grossmann [38] are able to consider the case with nonconvex constraints and mixed-binary variables in both stages. Finally, Rebennack et al. [39] propose a decomposition method based on column enumeration for nonconvex MINLPs with an assignment constraint; however, since the number of columns grows exponentially with the number of assignment decisions, this method is only suited for problems with a few assignment variables.

In the same spirit of decomposition, we apply in this work branch-and-price, which has its origin in the pioneering works of Dantzig and Wolfe [40], who introduced the fundamental idea of column generation for linear programming, and Desrosiers et al. [41], who were the first to embed column generation in a branch-and-bound framework to solve a large-scale MILP. Branch-and-price has been a success story in large-scale mixed-integer optimization, with applications in vehicle routing [42, 43], crew scheduling [44, 45], and fleet assignment [46, 47], to just name a few. For reviews on branch-and-price, see [1, 48].

The vast majority of existing works on branch-and-price consider integer and mixed-integer linear problems. However, there is significant flexibility in incorporating nonlinearities in the pricing problem, which has been mentioned (in form of brief side notes) in the literature [1, 2]. It is all the more surprising that there seems to be almost no published work on applying branch-and-price with mixed-integer nonlinear pricing problems. The only notable exception that we have been able to find is the work by Nowak et al. [3] in which a column-generation-based method for generating inner and outer approximations for nonconvex MINLPs is developed. We believe that there is considerable room for theory and algorithm development in the application of branch-and-price concepts to solving MINLPs.

3 Reformulation via discretization

To make the problem amenable to branch-and-price, we first apply a discretization approach [49] to derive an extensive formulation of problem (1). Consider the feasible set for y when disregarding constraints (1b):

$$\begin{aligned} {\mathcal {Y}} := \left\{ y: \exists z \in {\mathbb {R}}^q \times {\mathbb {Z}}^{\bar{q}} \;\; \text {such that} \;\; g(y,z) \le 0, \; y^{\min } \le y \le y^{\max }, \; y \in {\mathbb {Z}}^p\right\} , \end{aligned}$$
(2)

which is a finite set of integer points with cardinality \(K:=|{\mathcal {Y}}|\). Hence, \({\mathcal {Y}}\) can be equivalently expressed as:

$$\begin{aligned} {\mathcal {Y}} = \{\bar{y}_1, \dots , \bar{y}_K\} =\left\{ y: y = \sum _{k \in {\mathcal {K}}} \lambda _k \, \bar{y}_k, \; \sum _{k \in {\mathcal {K}}} \lambda _k = 1, \; \lambda _k \in \{0,1\} \; \forall \, k \in {\mathcal {K}} \right\} , \end{aligned}$$
(3)

where \(\bar{y}_k\) denotes a specific point in \({\mathcal {Y}}\) and \({\mathcal {K}} := \{1,\dots ,K\}\). Here, the binary variable \(\lambda _k\) is equal to 1 if \(y = \bar{y}_k\) and otherwise 0. Figure 1 shows a two-dimensional example of \({\mathcal {Y}}\), which is given by the dark-colored points. In this representative example, the shaded area, which represents one possible set \({\mathcal {Y}}\) without integrality restrictions on y, is clearly nonconvex.

Remark 1

Note that in the example shown in Fig. 1, \({\mathcal {Y}} \ne \mathrm {conv}({\mathcal {Y}}) \cap {\mathbb {Z}}^p\), where \(\mathrm {conv}({\mathcal {Y}})\) denotes the convex hull of \({\mathcal {Y}}\). In general, \({\mathcal {Y}} \subseteq \mathrm {conv}({\mathcal {Y}}) \cap {\mathbb {Z}}^p\), which is due to nonconvex constraint functions g or integer components in z. Hence, the traditional convexification technique [1] that is often used in Dantzig–Wolfe reformulations does not apply here. However, in the special case where all integer variables are binary, that is, where \(y^{\min }=0\) and \(y^{\max }=1\), the equality \({\mathcal {Y}} =\mathrm {conv} ({\mathcal {Y}})\cap {\mathbb {Z}}^p\) is guaranteed to hold.

Fig. 1
figure 1

The two-dimensional feasible set \({\mathcal {Y}}\) is given by the set of dark-colored points. The shaded area represents set \({\mathcal {Y}}\) without the integrality restrictions on y

For each \(k \in {\mathcal {K}}\), we can define an optimal cost:

$$\begin{aligned} \bar{f}_k := \min _{z \in {\mathbb {R}}^q \times {\mathbb {Z}}^{\bar{q}}} \left\{ f(\bar{y}_k,z): g(\bar{y}_k,z) \le 0 \right\} , \end{aligned}$$
(4)

which is the last ingredient that we need to reformulate (1) into the following master problem:

$$\begin{aligned} \mathrm {(MP):} \quad v^{\mathrm{MP}}&:= \min _{x,\lambda } \;\; c^{\top } x + \sum _{k \in {\mathcal {K}}} \lambda _k \bar{f}_k \end{aligned}$$
(5a)
$$\begin{aligned}&\quad \mathrm {s.t.} \;\; A x + D \sum _{k \in {\mathcal {K}}} \lambda _k \, \bar{y}_k \ge b \end{aligned}$$
(5b)
$$\begin{aligned}&\qquad \sum _{k \in {\mathcal {K}}} \lambda _k = 1 \end{aligned}$$
(5c)
$$\begin{aligned}&\qquad x \in {\mathbb {R}}_+^m \times {\mathbb {Z}}_+^{\bar{m}} \end{aligned}$$
(5d)
$$\begin{aligned}&\qquad \lambda _k \in \{0,1\} \quad \forall \, k \in {\mathcal {K}}, \end{aligned}$$
(5e)

which is a large-scale MILP as K grows exponentially with the dimension of y. Problems (MP) and (1) are equivalent in a sense that they have the same optimal value, \(v^{\mathrm{MP}}\), and every solution of (MP) can be mapped to a unique solution of (1) in the (xy)-space and a corresponding set of z-values that provide the same optimal value (the converse is trivially true).

4 The branch-and-price algorithm

In the branch-and-price algorithm, we solve (MP) using branch-and-bound, but solve the LP relaxation at each node via column generation. In the following, we describe the major elements of the algorithm.

4.1 Column generation

Consider the LP relaxation of (MP), which we denote by \((\overline{\mathrm {MP}})\). The large number of variables in \((\overline{\mathrm {MP}})\) typically prohibits solving it in full-space. Instead, we consider a restricted master problem, denoted by (RMP), which involves only a subset of columns \(\widehat{{\mathcal {K}}} \subseteq {\mathcal {K}}\). The LP relaxation of (RMP), denoted by \((\overline{\mathrm {RMP}})\), can be solved efficiently as long as the size of \(\widehat{{\mathcal {K}}}\) is sufficiently small. New columns are generated as needed by solving the following pricing problem:

$$\begin{aligned} \mathrm {(PP):} \quad \zeta&:= \min _{y,z} \;\; f(y,z) - \pi ^{\top } D y - \mu \end{aligned}$$
(6a)
$$\begin{aligned}&\quad \mathrm {s.t.} \;\; g(y,z) \le 0 \end{aligned}$$
(6b)
$$\begin{aligned}&\qquad y^{\min } \le y \le y^{\max } \end{aligned}$$
(6c)
$$\begin{aligned}&\qquad y \in {\mathbb {Z}}^p, \, z \in {\mathbb {R}}^q \times {\mathbb {Z}}^{\bar{q}}, \end{aligned}$$
(6d)

where \(\pi \) and \(\mu \) denote the values of the dual variables associated with constraints (5b) and (5c), respectively, at the optimal solution of \((\overline{\mathrm {RMP}})\). Problem (PP) minimizes the reduced cost; hence, if \(\zeta < 0\), the corresponding \(y^*\) may improve the solution to \((\overline{\mathrm {MP}})\) and is therefore added as a new column to \(\widehat{{\mathcal {K}}}\). Note that (PP) is a generally nonconvex MINLP.

Since \((\overline{\mathrm {RMP}})\) considers a restricted set of columns, its optimal value, \(v^{\overline{\mathrm {RMP}}}\), is an upper bound on the optimal value of \((\overline{\mathrm {MP}})\), \(v^{\overline{\mathrm {MP}}}\). Following standard duality arguments [50, p. 189], one can show that a lower bound is given by \(v^{\overline{\mathrm {RMP}}} + \zeta \). Hence, we have the following relationship:

$$\begin{aligned} v^{\overline{\mathrm {RMP}}} + \zeta \le v^{\overline{\mathrm {MP}}} \le v^{\overline{\mathrm {RMP}}}. \end{aligned}$$
(7)

Algorithm 1 shows the pseudocode of the algorithm for solving \((\overline{\mathrm {MP}})\). Here, the current lower and upper bounds on \(v^{\overline{\mathrm {MP}}}\) are denoted by \(\mathrm {LB}^{\overline{\mathrm {MP}}}\) and \(\mathrm {UB}^{\overline{\mathrm {MP}}}\), respectively. The column generation algorithm is finite and exact. If we set the tolerance \(\bar{\epsilon }\) to zero, \(\zeta \) will be zero at the last iteration and the algorithm terminates at the optimal solution of \((\overline{\mathrm {MP}})\).

figure a

Remark 2

If we solve (PP) using a global MINLP solver such as BARON, we obtain, if available, lower (l) and upper (u) bounds on \(\zeta \) at every iteration of the branch-and-bound algorithm. Using these bounds, we can generate new columns and compute valid bounds on \(v^{\overline{\mathrm {MP}}}\) without solving (PP) to optimality as follows: A column is added if \(u < 0\). In line 10 of Algorithm 1, \(\bar{f}_k\) is computed using u, i.e. \(\bar{f}_k \leftarrow u + \pi ^{\top } D y^* + \mu \). Then the following bounds can be computed:

$$\begin{aligned} v^{\overline{\mathrm {RMP}}} + l \le v^{\overline{\mathrm {RMP}}} + \zeta \le v^{\overline{\mathrm {MP}}} \le v^{\overline{\mathrm {RMP}}}. \end{aligned}$$
(8)

Only in the last iteration, (PP) has to be solved to \(\bar{\epsilon }\)-optimality in order to achieve a desired optimality gap. Not solving (PP) to optimality at every iteration can mitigate the impact of the tailing-off effect in solving the MINLP and significantly speed up the overall algorithm.

4.2 Branch-and-bound

If the solution of \((\overline{\mathrm {MP}})\) is integer feasible, it is also the optimal solution to (MP); remarkably, our computational experiments (see Sect. 6) show that solving \((\overline{\mathrm {MP}})\) often returns integer feasible solutions. In general, however, the solution may not satisfy the integrality restrictions, in which case we have to apply branch-and-bound. It is well known that branching on the \(\lambda \)-variables leads to unbalanced branch-and-bound trees; hence, it is recommended to design branching rules based on the original variables [49]. We comment, whenever appropriate, on branching rules in the case studies in Sect. 6 as they often have to be tailored to the specific application.

An outline of the branch-and-price algorithm is shown in Algorithm 2. Here, \({\mathcal {N}}\) denotes the set of nodes generated in the branch-and-bound tree, which is initialized with the root node 1. Furthermore, we start with a nonempty set of feasible columns \(\widehat{{\mathcal {K}}}\). At each node n, we solve the linear relaxation of the master problem associated with that node, denoted by \((\overline{\mathrm {MP}})_n\), using the corresponding pricing problem (PP)\(_n\). The solution of \((\overline{\mathrm {MP}})_n\) can be used to update the overall lower bound \(\mathrm {LB}^{\mathrm{MP}}\), and if it is integer feasible, it also provides an upper bound on \(v^{\mathrm{MP}}\); otherwise, we can solve (RMP)\(_n\), which considers the integrality constraints, to obtain a feasible solution to (MP) and hence an upper bound. The incumbent solution \((\hat{x}^*, \hat{\lambda }^*)\) is updated every time a new feasible solution is found. The algorithm terminates when an optimality gap smaller than \(\epsilon \) is reached; otherwise, branching rules are applied to update the node set \({\mathcal {N}}\), and the next node n to be evaluated is selected.

figure b

Remark 3

As mentioned in Remark 2, a lower bound on \(v^{(\overline{\mathrm {MP}})_n}\) is obtained at every iteration of solveRelaxedMP(\((\overline{\mathrm {MP}})_n\), (PP)\(_n\), \(\widehat{{\mathcal {K}}}\)). This information can be used to potentially prune the node before the column generation algorithm terminates.

Remark 4

Solving (RMP)\(_n\) after solving \((\overline{\mathrm {MP}})_n\) at every node (line 7 of Algorithm 2) is not required for convergence; however, it provides upper bounds that can help reduce the total number of nodes that need to be evaluated. In fact, (RMP)\(_n\) can be solved at any column generation iteration, but this is typically not worth the computational effort.

Remark 5

For ease of exposition, a global set of columns, \(\widehat{{\mathcal {K}}}\), is used in Algorithm 2. However, it is usually beneficial to consider node-specific column sets. For example, branching can render some of the columns infeasible, in which case they should be removed from the set.

5 Decomposable pricing problems

Most problems of interest that can be effectively solved using branch-and-price have a decomposable structure. Specifically, this means that the pricing problem decomposes into multiple smaller subproblems that can be solved independently and in parallel. Such an MINLP is of the following form:

$$\begin{aligned} \mathop {\mathrm{minimize}}\limits _{x,y,z}&\quad c^{\top } x + \sum _{i \in {\mathcal {I}}} f_i(y_i,z_i) \end{aligned}$$
(9a)
$$\begin{aligned} \mathrm{subject}\;\mathrm{to}&\quad A x + \sum _{i \in {\mathcal {I}}} D_i \, y_i \ge b \end{aligned}$$
(9b)
$$\begin{aligned}&\quad g_i(y_i,z_i) \le 0 \quad \forall \, i \in {\mathcal {I}} \end{aligned}$$
(9c)
$$\begin{aligned}&\quad y^{\min }_i \le y_i \le y^{\max }_i \quad \forall \, i \in {\mathcal {I}} \end{aligned}$$
(9d)
$$\begin{aligned}&\quad x \in {\mathbb {R}}_+^m \times {\mathbb {Z}}_+^{\bar{m}} \end{aligned}$$
(9e)
$$\begin{aligned}&\quad y_i \in {\mathbb {Z}}^{p_i}, \, z_i \in {\mathbb {R}}^{q_i} \times {\mathbb {Z}}^{\bar{q}_i} \quad \forall \, i \in {\mathcal {I}}, \end{aligned}$$
(9f)

where \({\mathcal {I}}\) denotes the set of subproblems. One can see that the problem decomposes into \(|{\mathcal {I}}|\) independent subproblems if constraints (9b) are removed.

Following an analogous derivation as in Sect. 3, we reformulate (9) into the following master problem:

$$\begin{aligned} v^{\mathrm{MP}}&= \min _{x,\lambda } \;\; c^{\top } x +\sum _{i \in {\mathcal {I}}} \sum _{k \in {\mathcal {K}}_i} \lambda _{ik} \, \bar{f}_{ik} \end{aligned}$$
(10a)
$$\begin{aligned}&\quad \mathrm {s.t.} \;\; A x + \sum _{i \in {\mathcal {I}}} D_i \sum _{k \in {\mathcal {K}}_i} \lambda _{ik} \, \bar{y}_{ik} \ge b \end{aligned}$$
(10b)
$$\begin{aligned}&\qquad \sum _{k \in {\mathcal {K}}_i} \lambda _{ik} = 1 \quad \forall \, i \in {\mathcal {I}} \end{aligned}$$
(10c)
$$\begin{aligned}&\qquad x \in {\mathbb {R}}_+^m \times {\mathbb {Z}}_+^{\bar{m}} \end{aligned}$$
(10d)
$$\begin{aligned}&\qquad \lambda _{ik} \in \{0,1\} \quad \forall \, i \in {\mathcal {I}}, \, k \in {\mathcal {K}}_i, \end{aligned}$$
(10e)

where \({\mathcal {K}}_i\) denotes the set of feasible columns in subproblem i, and each \(k \in {\mathcal {K}}_i\) is associated with a solution \(\bar{y}_{ik}\) and an optimal cost \(\bar{f}_{ik}\). The corresponding pricing subproblem i is as follows:

$$\begin{aligned} \zeta _i&:= \min _{y_i,z_i} \;\; f_i(y_i,z_i) - \pi ^{\top } D_i \, y_i - \mu _i \end{aligned}$$
(11a)
$$\begin{aligned}&\quad \mathrm {s.t.} \;\; g_i(y_i,z_i) \le 0 \end{aligned}$$
(11b)
$$\begin{aligned}&\qquad y^{\min }_i \le y_i \le y^{\max }_i \end{aligned}$$
(11c)
$$\begin{aligned}&\qquad y_i \in {\mathbb {Z}}^{p_i}, \, z_i \in {\mathbb {R}}^{q_i} \times {\mathbb {Z}}^{\bar{q}_i}. \end{aligned}$$
(11d)

In the decomposable case, the pricing subproblems are solved independently and columns are generated for each subproblem. The optimal value \(v^{\overline{\mathrm {MP}}}\) can then be bounded as follows:

$$\begin{aligned} v^{\overline{\mathrm {RMP}}} + \sum _{i \in {\mathcal {I}}} \zeta _i \le v^{\overline{\mathrm {MP}}} \le v^{\overline{\mathrm {RMP}}}. \end{aligned}$$
(12)

Note that not all \(\zeta _i\) have to be nonpositive, but their sum will be.

6 Computational experiments

In this section, the computational performance of solving many problem instances from four different case studies using the proposed branch-and-price algorithm is compared to the performance when solving the full-space problem using a state-of-the-art, off-the shelf global MINLP solver, BARON 19.7.9 [24]. All problems are solved using 25 cores on the Mesabi cluster of the Minnesota Supercomputing Institute, a Linux cluster equipped with a set of 2.5 GHz Intel Hasewell E5-2680v3 processors. For solving full-space problems without decomposition, BARON is used with the Threads option set to 25, allowing for the MILP subsolver (CPLEX) to make use of parallelization to speed up solution times. When problems are solved using branch-and-price, CPLEX 12.8 is used to solve restricted master problems, while BARON with a single thread is used to solve pricing subproblems in parallel. All computations are implemented in Julia using the JuMP modeling language, version 0.17.1 [51]. The stopping criteria used for all instances are a 0.1When the desired optimality gap is reached, we report wall time to reach that gap, and when the maximum wall time is passed, we report the optimality gap achieved at that time. For each case study presented, model formulations for the full-space problem, master problem, and pricing subproblems, as well as distributions used for the generation of random parameters, are given in the supplementary material. The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

6.1 Example 1: cutting circles from rectangles

This problem considers cutting out a set of circles \({\mathcal {J}}\) of different sizes from a set of given rectangles \({\mathcal {R}}\), which are also of different sizes. It decides which rectangles should be used and which circles should be cut from these rectangles in order to minimize the total trim loss, which is the area of the used rectangles that was not cut into circles. More generally, it is a simple nonlinear example of a generalized assignment problem, where tasks need to be assigned to different available resources, and can also be thought of as a nonlinear example of the cutting stock problem, one of the motivating examples for the development of branch-and-price [52]. This problem was originally considered by Rebennack et al. [39], where it is solved using column enumeration, which differs from the column generation approach used in this paper in that the master problem (5) is solved considering the full set of columns \({\mathcal {K}}\), instead of a dynamically updated subset of columns.

This problem decomposes into one pricing subproblem for each rectangle \(r\in {\mathcal {R}}\) which decides which circles should be cut from each respective rectangle. These decisions are returned as a column to the master problem, which decides which rectangles should actually be used given complicating assignment constraints that state that each circle must be assigned to exactly one rectangle. For this case study, the master problem convexity constraint (5c) can be relaxed to an inequality, where it is possible to not choose any column for a given rectangle if no circles are cut from that rectangle in the optimal solution. The overall size of the problem can be increased in two ways: increasing the number of rectangles, which increases the number of subproblems in the branch-and-price algorithm and should have minimal effect on solution times when subproblems are parallelized, and increasing the number of circles, which increases the size of subproblems in the branch-and-price algorithm and should result in an increase in solution times. Computational results for 25 instances of varying problem size are presented in Table 1. Note that for this case study, random problem instances are generated in the same way as in the previous work analyzing this problem.

For this case study, a feasible initial set of columns is required to ensure feasibility of the master problem in early iterations. The initial columns chosen are feasible (but likely suboptimal) cases where a single circle is cut from a rectangle, as optimal column costs can be trivially calculated for these cases as the rectangle area minus the circle area. The “number of columns generated” information presented in Table 1 includes these initial columns. To ensure a fair comparison between the two methods, the full-space problem is also initialized with a feasible (but likely suboptimal) solution where different rectangles are each assigned one circle. Nonetheless, for many of the problem instances with a large number of rectangles, BARON is unable to determine an upper bound on the objective for the problem within the time allotted. Overall, the performance of the branch-and-price algorithm is clearly superior to solving the full-space problem in BARON for the problems tested in this case study, as the full-space model cannot be solved to optimality in under 10,000 s for any instance, and only returns solutions with very large optimality gaps after that time. Conversely, 16 of the 25 instances are solved to global optimality using branch-and-price, and the worst gap returned from branch-and-price (78.8%) is still better than the best gap returned from the full-space solution (95.1%). We also note that, as expected, our solution times for the branch-and-price algorithm seem to scale greatly with the number of circles, or subproblem size, but only minimally with the number of rectangles, or number of subproblems. Note that 12 of the 25 instances tested in this case study require branching to solve to global optimality; here, the branching strategy used is to find the circle of the largest size with a noninteger assignment to a rectangle, and to branch on this variable.

Table 1 Computational performance of 25 instances of cutting circles from rectangles

6.2 Example 2: multiperiod capacity planning with congestion effects

This problem considers capacity planning at a facility that can use a set of candidate production units \({\mathcal {K}}\) to produce a set of demanded products \({\mathcal {J}}\). Demands for these products change over time throughout a set of time periods \({\mathcal {T}}\). Demands are placed into a production queue, and congestion within this queue results in uncertainties in lead times for meeting demands. It decides which units to build during which time points in order to meet demands at minimum cost. Nonlinear chance constraints are used to ensure lead times remain below a given threshold with some probability. This problem is a multiperiod extension of the capacity planning with congestion problem considered by Rajagopalan and Yu [53].

In extending the original formulation to the multiperiod case, we define two different sets of variables which are related to a unit being available for use: \(z_{kt}\), which corresponds to the number of units k newly built during time period t, and \(y_{kt}\), which corresponds to the number of units k that are actively used during time period t. These variables are related according to the following conservation equation:

$$\begin{aligned} y_{k,t}\le y_k^0+\sum _{t'=1}^tz_{kt'} \quad \forall \,k\in {\mathcal {K}},\,t\in {\mathcal {T}}, \end{aligned}$$
(13)

where \(y_{k}^0\) is the number of units k initially present. This problem decomposes into one pricing subproblem for each time period \(t\in {\mathcal {T}}\) which decides the assignment of orders to production units such that demands are met within the respective time periods. Subproblems are linked to the master problem via number of units actively used at each time period, and the master problem then determines which units should be built when subject to constraints (13). We again increase the overall size of the problem in two ways: by increasing the number of subproblems (increasing \(|{\mathcal {T}}|\)) and by increasing the size of subproblems (increasing \(|{\mathcal {K}}|\) and \(|{\mathcal {J}}|\)). Computational results for 20 instances of varying problem size are presented in Table 2. Note that for this case study, random parameters for problem instances are generated in the same way as in the previous work analyzing this problem, with the exception that demand is modified to ensure that it increases, on average, over time, and installation costs decrease over time in a manner consistent with typical notions of time value of money.

Table 2 Computational performance of 20 instances of capacity planning with congestion

To obtain an initial set of columns for the branch-and-price algorithm, the pricing subproblems were solved in the case where all dual variables are set to zero. We do not explicitly define starting values for the variables in either the initial subproblems or in the full-space problem. While subproblems are solved relatively quickly, we note that in all instances tested, we are unable to obtain even a feasible solution when solving the full-space problem using BARON. For this case study it is again seen that the branch-and-price approach is superior, with feasible solutions found in all instances and globally optimal solutions found in 17 of the 20 instances tested. Interestingly, we find that solution times tend to increase more strongly with number of subproblems than size of subproblems, which is the opposite of what is expected. We note that this is likely due to the fact that instances with smaller subproblem sizes seem to require more column generation iterations to converge, and that the time required per iteration of column generation seems to better conform to the expected trends. We also note that in all instances tested only column generation is needed to solve the problems, as the root node solutions of the relaxed master problem are integer. This observation will also be true for the next two case studies tested.

6.3 Example 3: multiscenario synthesis of integrated water networks

This problem considers the optimal design of a waste water network with process units which add impurities to the water, as well as remediation units which remove these impurities. It considers uncertainties in the amount of impurities added by the process units and removed by remediation units, which are manifested through a set of different scenarios \({\mathcal {S}}\). It decides which pipes should be used to connect different units and which remediation units should be installed in order to satisfy constraints on the allowable amount of impurities in the process inputs and system output at minimum cost. The problem considered here is an adaptation of the original problem presented by Karuppiah and Grossmann [54], using the same network structure but assuming that allowable unit and pipe sizes come from a discrete set \({\mathcal {V}}\), rather than being allowed to vary continuously.

To solve this problem using branch-and-price, we introduce copies of the design variables for each scenario \(s\in {\mathcal {S}}\). These variables are constrained by non-anticipativity constraints, which take into account the fact that we do not know a priori which of the scenarios will actually be realized when making design decisions, and as such designs must be the same for all scenarios:

$$\begin{aligned} y_{ivs}=y_{ivs'}\quad \forall \,i\in {\mathcal {I}}, \,v\in {\mathcal {V}},\,(s,s')\in {\mathcal {S}}, \end{aligned}$$
(14)

where \(y_{ivs}\) refers to a design decision to build unit or pipe i of size v in scenario s. Note that the set \({\mathcal {I}}\) refers to all different pipes and units. The problem then decomposes to one pricing subproblem for each scenario \(s\in {\mathcal {S}}\) which decides the optimal network design for a specific scenario. The master problem then considers the weights of all scenarios and finds a design which minimizes the expected cost subject to non-anticipativity constraints (14). The number of subproblems is increased by increasing \(|{\mathcal {S}}|\), and the subproblem size is increased by increasing \(|{\mathcal {V}}|\). Computational results for 20 instances of varying size are presented in Table 3. Note that for this case study, random parameters are used for different scenarios in different instances based on the ranges used in the previous work analyzing this problem.

Table 3 Compuational performance of 20 instances of multiscenario water network synthesis

For this problem, the knowledge of the non-anticipativity constraints is used to modify the algorithm for generating columns. Whenever a column with negative reduced cost is generated for one scenario, a cost for that column from each scenario is found by solving each pricing subproblem with design variables fixed at the value given by the new column. As such, this problem only consists of one set of columns, instead of multiple sets corresponding to each respective subproblem. Initial columns are generated using this strategy with dual variables equal to zero. This approach ensures that the master problem is always feasible. However, it is possible that columns generated from one scenario may be infeasible for another. These infeasible columns are kept in a separate set and integer cuts are introduced to the subproblems to ensure that they are not generated again. Note that the number of infeasible columns increases as the number of discrete unit sizes increases, which makes sense as this introduces a larger number columns that may be infeasible for a scenario. Again, the branch-and-price algorithm outperforms solving the full-space model using BARON for all instances. However, the time equired to solve the problem clearly scales very poorly with subproblem size in this case, to the point where when \(|{\mathcal {V}}|\ge 3\), no instance is solved to optimality within the time limit. Branch-and-price still finds a feasible solution in all cases, which is not the case when solving the full-space problem, w hich only finds a feasible solution in 10 out of 20 instances.

6.4 Example 4: multiscenario design of multiproduct batch plants

This problem considers the optimal design of a batch plant with a set of stages \({\mathcal {J}}\) that can produce multiple products. It considers uncertainties in the demands for each product, as well as in the operating parameters of each of the different batch stages, which are manifested through a set of different scenarios \({\mathcal {S}}\). It decides the number of batch units to include at each stage, as well as the batch sizes and times for each product, such that demands are met at minimum cost. The problem presented here is an adaptation of a problem presented in [7], assuming that the volume of the batch unit is fixed instead of being a decision variable, and adding a nonlinear operating cost term to the objective function.

To solve this problem using branch-and-price, we introduce copies of the variable corresponding to number of batch units built for each scenario \(s\in {\mathcal {S}}\). Like the previous example, these variables are constrained by non-anticipativity constraints. As such, we use the same modified algorithm for generating columns. To combat the possibility of generating infeasible columns, scenarios are grouped together into a set of bunches \({\mathcal {B}}\) such that each bunch roughly contains an equal amount of “easy-to-satisfy” scenarios where demands are low and “hard-to-satisfy” scenarios where demands are high. Additionally, we add logical cuts to all subproblems when an infeasible column is generated which state that if a certain design is infeasible, any new design must build more batch units than in the infeasible design in at least one stage:

$$\begin{aligned}&N_j+N^{inf}_{jc}z_{jc}\ge N^{inf}_{jc}+1\quad \forall \,j \in {\mathcal {J}},c\in {\mathcal {C}}_i, \end{aligned}$$
(15)
$$\begin{aligned}&\sum _{j\in {\mathcal {J}}}z_{jc}\le |{\mathcal {J}}| -1\quad \forall \,c\in {\mathcal {C}}_i, \end{aligned}$$
(16)

where \(N_j\) is the number of units built for stage j, \(N_{jc}^{inf}\) is the number of units built for stage j in infeasible column c, \(z_{jc}\) is a binary variable which is 0 only if \(N_j > N_{jc}^{inf}\), and \({\mathcal {C}}_i\) is a set of infeasible columns. The problem then decomposes into one pricing subproblem for each scenario bunch \(b\in {\mathcal {B}}\) which decides the optimal system design for the corresponding group of scenarios. The master problem then considers the weights of all scenario bunches and finds a design which minimizes the expected cost subject to non-anticipativity constraints. The number of subproblems is increased by increasing \(|{\mathcal {B}}|\), where each bunch contains 5 scenarios, and the size of subproblems is increased by increasing \(|{\mathcal {J}}|\). Computational results for 20 instances of varying size are presented in Table 4. Note that for this case study, the randomly generated stochastic parameters for different instances are generated using the deterministic parameter values from the previous work as a starting point for random perturbations.

Table 4 Compuational performance of 20 instances of multiscenario batch system design

For this problem, unlike the other three examples presented, the branch-and-price approach performs very similarly to solving the full-space problem using BARON, particularly as the subproblem size increases. For all problems, both approaches are able to find the same objective upper bound. We note that for some instances, BARON is able to find a slightly better objective value than the branch-and-price approach, although in all instances these differences are within either the 0.1% gap used as a global optimality stopping criteria, or within the reported gap after 10,000 s. It is also apparent that as the number of stages increases, the number of infeasible columns generated also increases, contributing to the added difficulty for solving this problem using branch and price. However, we do note the number of infeasible columns generated is reduced by bunching columns and adding logical infeasibility cuts, as this number can be as much as an order of magnitude greater when these options are not used.

7 Conclusions

Applied branch-and-price to a class of nonconvex MINLPs with linear complicating constraints and integer linking variables. We exploit the structure of the problem to construct a Dantzig-Wolfe reformulation via a discretization approach, which then allows the problem to be solved using a branch-and-price scheme. This approach is especially effective in cases where the pricing problem decomposes into multiple small subproblems such that solving each subproblem using a global MINLP solver is considerably more tractable than solving the original full-space MINLP. Through several case studies, we have shown that many relevant problems directly fall or can be reformulated into the given class of MINLPs, and have demonstrated the computational feasibility of the proposed algorithm. In most tested model instances, the branch-and- price algorithm clearly outperforms solving the full-space problem directly using a global MINLP solver, often achieving orders-of-magnitude speedups.