1 Introduction

We aim to solve a special class of nonconvex mixed-integer nonlinear programming (MINLP) problems to \(\varepsilon \)-global optimality, where \(\varepsilon \) is a non-zero tolerance. Problem P is a mixed-integer quadratically constrained problem (MIQCP) where nonlinearities are due to bilinear terms \(x_i x_j \) of continuous variables x with finite lower \(x^{L}\) and upper \(x^{U}\) bounds, and binary variables y appear linearly in the constraints:

$$\begin{aligned} \begin{array}{ll} f_{\mathbf{P}}^*=\min f_0 \left( {x,y} \right) &{}\\ \hbox {s.t.}\quad f_q \left( {x,y} \right) \le 0 &{}\quad \forall q\in \mathbf Q /\left\{ 0 \right\} \\ f_q \left( {x,y} \right) = \sum \nolimits _{\left( {i,j} \right) \in \mathbf BL } a_{ijq} w_{ij} +B_q x+C_q y+d_q &{}\quad \forall q\in \mathbf Q \\ w_{ij} =x_i x_j &{}\quad \forall \left( {i,j} \right) \in \mathbf BL \\ 0\le x^{L}\le x\le x^{U} &{}\\ x\in {\mathbb {R}}^{lx},\quad y\in \left\{ {0, 1} \right\} ^{ly}, \quad w\in {\mathbb {R}}^{\left| \mathbf{BL } \right| }&{}\\ \end{array} \end{aligned}$$
(P)

BL is an (i, j)-index set defining all bilinear terms, and \(\mathbf Q \) is the set of all functions appearing in the constraints and in the objective function (\(q=0\)); it excludes the auxiliary equations defining the bilinear variables w. The total number of original continuous variables is lx, while the number of original binary variables is ly. We assume that P is feasible with global optimal solution \(f_{\mathbf{P}}^*\).

Many relevant engineering problems can be formulated as P. Others closely resemble P, the difference being the presence of constraints with exponential terms for estimating the capital cost. Examples can be found in pooling problems [1,2,3,4], synthesis of general multicomponent process networks [5, 6], design of water networks [7,8,9,10,11], short-term planning of oil refineries [12, 13], scheduling of crude-oil blending operations [14,15,16] and hydro energy systems [17].

Nonconvex optimization problem P can present multiple local and global optima. Gradient-based methods cannot guarantee finding a global solution and they do not tell, at termination, how far the best feasible solution is from the best possible solution (i.e., the best estimate of the global optimum) [18]. Deterministic global optimization algorithms are required for such purposes, with their development being a very active research area.

Deterministic global optimization algorithms rely on a relaxation of P to compute estimates of the global solution, on various techniques to iteratively improve such estimates, and on methods to compute feasible solutions. They aim to reduce the relative difference between the best feasible and best possible solutions below \(\varepsilon \). The quality of the best possible solution depends on the tightness of the relaxation, which in turn depends on the size of the domain of the variables (i.e., \(x^{U}-x^{L}\)). The smaller the domain, the closer the relaxation is to the original nonconvex function.

Spatial branch-and-bound (SBB) is the most common method to systematically reduce the domain of the variables. In SBB, branching is applied on discrete variables, as well as on continuous variables involved in nonlinear terms [19, 20]. Branching occurs one variable at a time and it generates two child nodes, each with a smaller domain than the parent node, leading to potentially tighter relaxations. Whenever the best possible solution at a node becomes worse than the best feasible solution, the node is fathomed (pruned). Cutting planes and bound-tightening techniques have been incorporated in SBB algorithms to improve the relaxation and reduce the number of nodes to explore. Although SBB guarantees convergence to an \(\varepsilon \)-global solution, computational time can grow exponentially with problem size.

The tightest linear relaxation for a bilinear term is given by the McCormick envelopes [21]. They consist of four inequalities that have a low computational cost. Many deterministic global optimization algorithms employ McCormick envelopes as their only relaxation technique for bilinear terms [5, 12, 22, 23]. However, McCormick envelopes usually provide a weak relaxation when at least one of the variables involved in a bilinear term has a large domain. This led to the development of piecewise linear relaxation techniques that improve the quality of the relaxation by partitioning the variable domains (the larger the number of partitions, the tighter the relaxation). However, since piecewise linear relaxations introduce additional binary and continuous variables, there is a trade-off between tightness and the computational effort required to solve the resulting MILP to optimality.

The piecewise McCormick relaxation [24,25,26,27] partitions the domain of one of the variables involved in a bilinear term and constructs McCormick envelopes for each partition. The major drawback of piecewise McCormick is that the number of additional binary variables increases linearly with the number of partitions. This important issue fostered the development of relaxation techniques where the number of binary variables increases logarithmically with respect to the number of partitions [2, 28, 29], with one example being normalized multiparametric disaggregation. Piecewise linear relaxations have been used in SBB algorithms [2, 30, 31], but they can also be deployed as an alternative to the SBB framework [13, 28, 32,33,34], as well as in decomposition methods [26].

The semidefinite programming (SDP) relaxation of MIQCPs has also been extensively studied. Problem (P) is a lifted reformulation, with extra variables \(w_{ij} \) and non-convex constraints \(w_{ij} =x_i x_j \). It can be relaxed as a pair of inequalities, \(W-xx^{T} \succcurlyeq 0\) (convex) and \(xx^{T}-W \succcurlyeq 0\) (non-convex). In order to produce strong convex relaxations, Saxena et al. [35] use the convex SDP inequality to derive convex quadratic cuts and exploit the non-convex inequality to derive disjunctive cuts. Their cutting plane algorithm also relies on McCormick envelopes to strengthen the initial relaxation of the MIQCP, later removing all non-binding (at the solution of the convex relaxation) RLT inequalities when generating disjunctive cuts. In the companion paper [36], Saxena et al. study methods that capture the strength of such extended SDP relaxations but are defined only in the space of the x variables. By replacing the RLT convexification of (P) with an alternative that splits matrix A, defining bilinear terms \(x^{T}Ax\), as a difference of positive semidefinite and symmetric matrices, they show how to project the extended RLT formulation in the original space by solving linear programs (LPs). A similar procedure is performed when adding the convex inequality \(W-xx^{T} \succcurlyeq 0\) to the extended RLT formulation, leading to the solution of SDPs rather than LPs. For the GLOBALLib instances, relaxations from projected formulations are almost as strong as those from [35], with the advantage of being solved two orders of magnitude faster.

In this work, we present a deterministic global optimization algorithm to solve MINLP problems of type P. The main novelty is the use of a dynamic partitioning scheme for piecewise relaxations, not only to compute the lower bound but also to perform optimality-based bound tightening (OBBT) for all variables appearing in bilinear terms [33]. The extensive use of OBBT contrasts with commercial global optimization solvers [37,38,39], which apply some restricted version of it, always featuring the simplest bilinear envelopes [40]. Dynamic partitioning refers to changing the number of partitions between iterations. Although the same term is used in [34], there are major differences between the two algorithms, as can be seen in Fig. 1.

Fig. 1 Comparison of dynamic partitioning schemes

Nagarajan et al. [34] assume that a local solution \(( {x^{*},y^{*},w^{*}} \)) to (P) is given and divide their global optimization algorithm into two parts. In part one, a sequence of OBBT iterations is performed to reduce the domain of all x variables; the procedure stops when the bound improvement in consecutive iterations falls below a specified tolerance. Part two involves the solution of relaxation problems (PR) derived from piecewise McCormick envelopes. The domain of bilinearly appearing variables \(x_i \) and \(x_j \) is partitioned in a non-uniform way. Partitions are dynamically added around the current solution (the local solution \(x^{*}\) in the first iteration and the optimal solution \(x^{R}\) from the relaxation problem in subsequent iterations) until: (i) the normalized improvement in the lower bound LB from (PR) falls below a given tolerance; (ii) the \(x^{R}\) variables remain in the same partitions and those partitions are already very small; or (iii) the computation hits a time limit.

The algorithm proposed in this work does not assume a feasible solution is given. In the first iteration, it solves a simple relaxation problem (McCormick envelopes) to try to find one very quickly. This process is repeated in subsequent iterations to improve the upper bound UB, and consequently the bounds from OBBT (a step omitted from Fig. 1, since the focus is on comparing the lower bounding procedures). The integration of the OBBT and (PR) steps is the first major difference compared to [34]. The second difference is that our algorithm relies on univariate and uniform partitioning. Uniform partitioning, by giving the same importance to all regions of the domain, may protect us against frequent \(x^{R}\) movements from narrower to wider partitions, meaning potentially fewer iterations at the expense of more partitions (larger problems) per iteration. The next partitioning level is decided based on the computational requirements of (PR). Figure 1 assumes that (PR) solves quickly in the iterations before reaching \(N=8\) partitions in iteration 4, which coincides with OBBT becoming ineffective. To further improve domain reduction, it is thus worth trying piecewise relaxation strategies for OBBT, not considered in [34], by selecting \(N=2\). Iteration 5 also backtracks to \(N=4\) for (PR), illustrating that dynamic partitioning can go in both directions. Finally, to benefit from the better scaling of problem size with the number of partitions when reaching \(N=10\), the algorithm changes the relaxation technique from piecewise McCormick to normalized multiparametric disaggregation.

2 Computing lower bounds

If y variables remain binary and all original constraints \(q\in \mathbf Q /\left\{ 0 \right\} \) are kept, the simplest relaxation of P is obtained by removing equations \(w_{ij} =x_i x_j \). However, it is also the weakest. Narrowing the domain of variables \(w_{ij} \) to regions \(\mathbf WR _{{\varvec{ij}}} \) will potentially lead to a tighter relaxation. Since P is feasible, so is its relaxation PR. If \(f_{\mathbf{PR}}^R \) is the global optimal solution of PR, then \(f_{\mathbf{PR}}^R \le f_{\mathbf{P}}^*\), representing a lower bound on the global optimal solution of P.

$$\begin{aligned} \begin{array}{ll} f_{\mathbf{PR}}^R =\min f_0 \left( {x,y} \right) &{}\\ \hbox {s.t.}\quad f_q \left( {x,y} \right) \le 0 &{}\quad \forall q\in \mathbf Q /\left\{ 0 \right\} \\ f_q \left( {x,y} \right) = \sum \nolimits _{\left( {i,j} \right) \in \mathbf BL } a_{ijq} w_{ij} +B_q x+C_q y+d_q &{}\quad \forall q\in \mathbf Q \\ w_{ij} \in \mathbf WR _{{\varvec{ij}}} &{}\quad \forall \left( {i,j} \right) \in \mathbf BL \\ 0\le x^{L}\le x\le x^{U}&{}\\ x\in {\mathbb {R}}^{lx},\quad y\in \left\{ {0, 1} \right\} ^{ly}, \quad w\in {\mathbb {R}}^{\left| \mathbf{BL } \right| }&{}\\ \end{array} \end{aligned}$$
(PR)

Three alternative ways of defining regions \(\mathbf WR _{{\varvec{ij}}} \) for relaxation problem PR will be discussed next.

2.1 McCormick relaxation (SMCR)

The standard McCormick relaxation for bilinear function \(w_{ij} =x_i x_j \), represented in Fig. 2, is given by Eqs. (1–4). These equations define regions \(\mathbf WR _{{\varvec{ij}}} \) that form the convex hull of \(x_i x_j \) (see Fig. 3).

$$\begin{aligned} w_{ij}\ge & {} x_i x_j^L +x_j x_i^L -x_i^L x_j^L \quad \forall \left( {i,j} \right) \in \mathbf BL \end{aligned}$$
(1)
$$\begin{aligned} w_{ij}\ge & {} x_i x_j^U +x_j x_i^U -x_i^U x_j^U\quad \forall \left( {i,j} \right) \in \mathbf BL \end{aligned}$$
(2)
$$\begin{aligned} w_{ij}\le & {} x_i x_j^L +x_j x_i^U -x_i^U x_j^L\quad \forall \left( {i,j} \right) \in \mathbf BL \end{aligned}$$
(3)
$$\begin{aligned} w_{ij}\le & {} x_i x_j^U +x_j x_i^L -x_i^L x_j^U \quad \forall \left( {i,j} \right) \in \mathbf BL \end{aligned}$$
(4)
Fig. 2 Bilinear function \({x}_{i} {x}_{j}\) in [0, 1]\(^{2}\)

Fig. 3 Feasible region from McCormick envelopes for bilinear function \({x}_{i}{x}_{j}\) in [0, 1]\(^{2}\)
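To make Eqs. (1–4) concrete, the minimal sketch below evaluates the four envelopes numerically at a given point. It is an illustrative helper rather than part of the algorithm, and the function and variable names are our own:

```python
# Minimal numeric sketch of the standard McCormick envelopes, Eqs. (1-4).
# For any point (xi, xj) inside the box, the four inequalities bracket xi*xj.
def mccormick_envelopes(xi, xj, xiL, xiU, xjL, xjU):
    """Return (lower, upper) envelope values for w = xi*xj from Eqs. (1-4)."""
    lo = max(xi * xjL + xj * xiL - xiL * xjL,   # Eq. (1)
             xi * xjU + xj * xiU - xiU * xjU)   # Eq. (2)
    up = min(xi * xjL + xj * xiU - xiU * xjL,   # Eq. (3)
             xi * xjU + xj * xiL - xiL * xjU)   # Eq. (4)
    return lo, up

# Sanity check on the unit box of Figs. 2-3: the bilinear value always lies
# between the envelopes, and the envelopes are exact at the box corners.
for xi, xj in [(0.0, 0.0), (0.3, 0.7), (0.5, 0.5), (1.0, 1.0)]:
    lo, up = mccormick_envelopes(xi, xj, 0.0, 1.0, 0.0, 1.0)
    assert lo <= xi * xj <= up
```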

2.2 Piecewise linear relaxations

Piecewise McCormick and normalized multiparametric disaggregation are well-known examples of piecewise relaxation techniques that typically use the same number of partitions N for every partitioned variable \(x_j \). Both introduce additional binary variables into the problem, creating non-convex regions \(\mathbf WR _{{\varvec{ij}}} \).

2.2.1 Piecewise McCormick relaxation (PMCR)

Piecewise McCormick uses binary variable \(z_{jn} \) to identify the active partition (n) for variable \(x_j \). The McCormick envelopes in Eqs. (1–4) can then benefit from the tighter bounds \(x_j^L \le x_{jn}^L \) and \(x_{jn}^U \le x_j^U \) computed by Eqs. (5–6). The mixed-integer linear relaxation can be formulated as a disjunction and convex-hull reformulated [41] into Eqs. (7–15). Notice the new continuous disaggregated variables \(\hat{x}_{jn} \) and \(\hat{x}_{ijn} \). The feasible region associated with PMCR using 4 partitions is illustrated in Fig. 4. Notice that it is closer to the original bilinear function (Fig. 2) than SMCR (Fig. 3).

$$\begin{aligned} x_{jn}^L= & {} x_j^L +\frac{\left( {x_j^U -x_j^L } \right) \left( {n-1} \right) }{N}\quad \forall j:\left( {i,j} \right) \in {\mathbf{BL}},n\in \left\{ {1,\ldots ,N} \right\} \end{aligned}$$
(5)
$$\begin{aligned} x_{jn}^U= & {} x_j^L +\frac{\left( {x_j^U -x_j^L } \right) \left( n \right) }{N}\quad \forall j:\left( {i,j} \right) \in {\mathbf{BL}},n\in \left\{ {1,\ldots ,N} \right\} \end{aligned}$$
(6)
$$\begin{aligned} w_{ij}\ge & {} \sum \limits _{n=1}^N \left( {\hat{x}_{ijn} x_{jn}^L +\hat{x}_{jn} x_i^L -z_{jn} x_i^L x_{jn}^L } \right) \quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(7)
$$\begin{aligned} w_{ij}\ge & {} \sum \limits _{n=1}^N \left( { \hat{x}_{ijn} x_{jn}^U +\hat{x}_{jn} x_i^U -z_{jn} x_i^U x_{jn}^U } \right) \quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(8)
$$\begin{aligned} w_{ij}\le & {} \sum \limits _{n=1}^N \left( {\hat{x}_{ijn} x_{jn}^L +\hat{x}_{jn} x_i^U -z_{jn} x_i^U x_{jn}^L } \right) \quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(9)
$$\begin{aligned} w_{ij}\le & {} \sum \limits _{n=1}^N \left( {\hat{x}_{ijn} x_{jn}^U +\hat{x}_{jn} x_i^L -z_{jn} x_i^L x_{jn}^U } \right) \quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(10)
$$\begin{aligned} x_i= & {} \sum \limits _{n=1}^N \hat{x}_{ijn}\quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(11)
$$\begin{aligned} x_j = \sum \limits _{n=1}^N \hat{x}_{jn} \quad \forall j:\left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(12)
$$\begin{aligned} \sum \limits _{n=1}^N z_{jn}= & {} 1 \quad \forall j:\left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(13)
$$\begin{aligned} x_i^L z_{jn}\le & {} \hat{x}_{ijn} \le x_i^U z_{jn} \quad \forall \left( {i,j} \right) \in {\mathbf{BL}},n\in \left\{ {1,\ldots ,N} \right\} \end{aligned}$$
(14)
$$\begin{aligned} x_{jn}^L z_{jn}\le & {} \hat{x}_{jn} \le x_{jn}^U z_{jn} \quad \forall j:\left( {i,j} \right) \in {\mathbf{BL}},n\in \left\{ {1,\ldots ,N} \right\} \end{aligned}$$
(15)
Fig. 4 Feasible region from piecewise McCormick relaxation with 4 partitions for bilinear term \({x}_{i} {x}_{j}\) in [0, 1]\(^{2}\)
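For a fixed assignment of the binaries \(z_{jn}\), Eqs. (7–15) collapse to the standard envelopes on the active partition. The sketch below (reusing `mccormick_envelopes` from Sect. 2.1) illustrates the partition arithmetic of Eqs. (5–6) and the resulting tightening; a full MILP implementation would instead declare \(z_{jn}\), \(\hat{x}_{jn}\) and \(\hat{x}_{ijn}\) in a solver model:

```python
def partition_bounds(xjL, xjU, N):
    """Uniform partition bounds of Eqs. (5-6): list of (x_jn^L, x_jn^U)."""
    width = (xjU - xjL) / N
    return [(xjL + width * (n - 1), xjL + width * n) for n in range(1, N + 1)]

def piecewise_mccormick(xi, xj, xiL, xiU, xjL, xjU, N):
    """Envelopes of w = xi*xj on the partition containing xj (z_jn = 1)."""
    for lo_n, up_n in partition_bounds(xjL, xjU, N):
        if lo_n <= xj <= up_n:
            return mccormick_envelopes(xi, xj, xiL, xiU, lo_n, up_n)

# Partitioning tightens the relaxation: compare envelope widths at one point.
lo1, up1 = mccormick_envelopes(0.3, 0.7, 0.0, 1.0, 0.0, 1.0)
lo4, up4 = piecewise_mccormick(0.3, 0.7, 0.0, 1.0, 0.0, 1.0, N=4)
assert (up4 - lo4) < (up1 - lo1)
```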

2.2.2 Normalized multiparametric disaggregation (NMDT)

Normalized multiparametric disaggregation provides a relaxation equivalent to PMCR's but can be orders of magnitude more efficient computationally. However, the number of partitions is restricted to powers of ten, i.e. \(N=10^{-p}\), with \(p\in {\mathbb {Z}}^{-}\) being the accuracy parameter chosen by the user. The normalized \(\left[ {0,1} \right] \) domain of variable \(x_j \) is discretized considering all digits \(k\in \left\{ {0,\ldots ,9} \right\} \) of the decimal representation system and positions \(l\in \left\{ {p,\ldots ,-1} \right\} \). It is then linked to the real domain of \(x_j \) through continuous variable \(\lambda _j \) and global bounds \(x_j^L \) and \(x_j^U \). Note that continuous variable \(\Delta \lambda _j \) allows \(\lambda _j \) to take continuous values between discrete points. The active partition for \(x_j \) is identified by the non-zero values of (\(-p\)) binary variables \(z_{jkl} \), one per position l. The number of binary variables per partitioned variable is thus \(10\log _{10} N\), versus N when using PMCR. Equations (16–26) provide the NMDT relaxation, which also requires continuous variables \(v_{ij} \), \(\Delta v_{ij} \), and \(\hat{x}_{ijkl} \). It is illustrated in Fig. 5 for \(p=-1\) (\(N=10\)), which is already very similar to Fig. 2.

$$\begin{aligned} w_{ij}= & {} x_i x_j^L +v_{ij} \left( {x_j^U -x_j^L } \right) \quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(16)
$$\begin{aligned} x_j= & {} x_j^L +\lambda _j \left( {x_j^U -x_j^L } \right) \quad \forall j:\left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(17)
$$\begin{aligned} \lambda _j= & {} \Delta \lambda _j + \sum \limits _{l=p}^{-1} \mathop \sum \limits _{k=0}^9 10^{l}\cdot k\cdot z_{jkl} \quad \forall j:\left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(18)
$$\begin{aligned} 0\le & {} \Delta \lambda _j \le 10^{p} \quad \forall j:\left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(19)
$$\begin{aligned} v_{ij}= & {} \sum \limits _{l=p}^{-1} \mathop \sum \limits _{k=0}^9 10^{l}\cdot k\cdot \hat{x}_{ijkl} +\Delta v_{ij} \quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(20)
$$\begin{aligned} x_i^L \Delta \lambda _j\le & {} \Delta v_{ij} \le x_i^U \Delta \lambda _j \quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(21)
$$\begin{aligned} \Delta v_{ij}\le & {} 10^{p}\left( {x_i -x_i^L } \right) +x_i^L \Delta \lambda _j\quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(22)
$$\begin{aligned} \Delta v_{ij}\ge & {} 10^{p}\left( {x_i -x_i^U } \right) +x_i^U \Delta \lambda _j \quad \forall \left( {i,j} \right) \in {\mathbf{BL}} \end{aligned}$$
(23)
$$\begin{aligned} x_i= & {} \sum \limits _{k=0}^9 \hat{x}_{ijkl} \quad \forall \left( {i,j} \right) \in {\mathbf{BL}}, l\in \left\{ {p,\ldots ,-1} \right\} \end{aligned}$$
(24)
$$\begin{aligned} \sum \limits _{k=0}^9 z_{jkl}= & {} 1\quad \forall j:\left( {i,j} \right) \in {\mathbf{BL}}, l\in \left\{ {p,\ldots ,-1} \right\} \end{aligned}$$
(25)
$$\begin{aligned} x_i^L z_{jkl}\le & {} \hat{x}_{ijkl} \le x_i^U z_{jkl} \quad \forall \left( {i,j} \right) \in {\mathbf{BL}}, l\in \left\{ {p,\ldots ,-1} \right\} , k\in \left\{ {0,\ldots ,9} \right\} \end{aligned}$$
(26)
Fig. 5 Feasible region from normalized multiparametric disaggregation with \({p}=-{1}\) for bilinear term \({x}_{i}{x}_{j}\) in [0, 1]\(^{2}\)
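The core of NMDT is the digit-wise discretization of \(\lambda_j\) in Eq. (18). A small sketch of that decomposition follows; the helper is our own, with a tolerance guard against floating-point noise:

```python
def nmdt_digits(lam, p):
    """Digits of Eq. (18) for lambda in [0,1]: one active digit k per
    position l in {p,...,-1} (z_jkl = 1); the remainder is Delta lambda."""
    digits, rest = {}, lam
    for l in range(-1, p - 1, -1):
        k = min(int(rest / 10 ** l + 1e-12), 9)   # active digit at position l
        digits[l] = k
        rest -= k * 10 ** l
    return digits, max(rest, 0.0)                  # 0 <= Delta lambda <= 10^p

# Example with p = -2 (N = 100): lambda = 0.73 -> digits 7 and 3, residual 0.
digits, dlam = nmdt_digits(0.73, p=-2)
recon = sum(k * 10 ** l for l, k in digits.items()) + dlam   # Eq. (18)
assert digits == {-1: 7, -2: 3} and abs(recon - 0.73) < 1e-9
```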

3 Optimality-based bound tightening (OBBT)

For all three relaxation techniques described in Sect. 2, the volume of region \({\mathbf{WR}}_{{{\varvec{ij}}}} \) depends on bounds \(x_i^L \), \( x_i^U \), \(x_j^L \) and \(x_j^U \). It is thus desirable to strengthen such bounds (raise \(x_i^L \) and \(x_j^L \), and decrease \(x_i^U \) and \(x_j^U \)) to obtain a tighter relaxation (higher \(f_{\mathbf{PR}}^R \)). One way to do this is through optimality-based bound tightening (OBBT). For each variable \(h\in \mathbf{BLV}=\{h|\left( {i,j} \right) \in \mathbf{BL}\wedge \left( {h=i\vee h=j} \right) \}\) involved in a bilinear term, lower and upper bounds are computed by solving one minimization and one maximization problem, respectively. These problems, denoted PRB, are like relaxation problem PR but with a different objective function (now the variable to minimize/maximize) and an additional constraint, which imposes that the value of the objective function of P, \(f_0 \left( {x,y} \right) \), be less than or equal to the current upper bound UB.

$$\begin{aligned} \begin{array}{ll} x_h^L =\min x_h \left( {x_h^U =\max x_h } \right) &{}\\ \hbox {s.t.}\quad f_0 \left( {x,y} \right) \le {U}{B} &{}\\ f_q \left( {x,y} \right) \le 0 &{}\quad \forall q\in \mathbf Q /\left\{ 0 \right\} \\ f_q \left( {x,y} \right) =\sum \limits _{\left( {i,j} \right) \in \mathbf BL } a_{ijq} w_{ij} +B_q x+C_q y+d_q &{}\quad \forall q\in \mathbf Q \\ w_{ij} \in \mathbf WR _{{{\varvec{ij}}}} &{}\quad \forall \left( {i,j} \right) \in \mathbf BL \\ 0\le x^{L}\le x\le x^{U}&{}\\ x\in {\mathbb {R}}^{lx},y\in \mathcal{Y},w\in {\mathbb {R}}^{\left| \mathbf{BL } \right| }&{}\\ \end{array} \end{aligned}$$
(PRB)

Remark 1

Given that many problems may need to be solved, the complexity of problems PRB should be kept manageable. Region \(\mathbf WR _{{{\varvec{ij}}}} \) will be generated from either the standard or the piecewise McCormick envelopes with a low number of partitions (\(N<10\)). With the former, the binary variables are further relaxed, \(\mathcal{Y}=\left[ {0,1} \right] ^{ly}\), to work with linear programs (LPs) instead of MILPs (\(\mathcal{Y}=\left\{ {0,1} \right\} ^{ly}\)).

Other probing methods in the literature also solve bounded relaxations of the problem to extract further information about the variables, e.g. to identify conflicts between binary variables y [42]. They are not part of this work.
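One OBBT pass then amounts to two PRB solves per variable in BLV. A skeleton is sketched below; `solve_prb` is a placeholder for the LP/MILP solver call on PRB (per Remark 1), and the loop is embarrassingly parallel, a property exploited in Sect. 7.6:

```python
def obbt_pass(blv, bounds, UB, solve_prb):
    """Tighten [x_h^L, x_h^U] for every h in BLV by solving problems PRB.

    bounds maps h -> (lo, up); solve_prb(h, sense, bounds, UB) is assumed to
    minimize/maximize x_h over the relaxation with the cut f0(x, y) <= UB.
    """
    new_bounds = {}
    for h in blv:                       # independent problems: parallelizable
        lo = solve_prb(h, "min", bounds, UB)
        up = solve_prb(h, "max", bounds, UB)
        old_lo, old_up = bounds[h]
        new_bounds[h] = (max(lo, old_lo), min(up, old_up))  # only tighten
    return new_bounds
```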

4 Generating upper bounds

Previous work has shown that an effective way to compute a good feasible solution to non-convex MINLP problem P is to rely on a two-stage MILP/NLP strategy. Any feasible solution to MILP problem PR can be used to extract the values \(x^{R}\), \(y^{R}\) and \(w^{R}\) of variables x, y and w. Parameters \(y^{R}\) then replace binary variables y in P, reducing it to NLP problem PF. PF is solved by a local NLP solver, after initializing variables x and w with parameters \(x^{R}\) and \(w^{R}\) to facilitate convergence. Note that PF is a restricted version of P, and so it is not necessarily feasible. If feasible, the optimal solution \(\left( {x^{*},y^{*},w^{*}} \right) \) of PF provides an upper bound UB on the global solution of P, i.e. \(f_{\mathbf{PF}}^*\ge f_{\mathbf{P}}^*\).

$$\begin{aligned} \begin{array}{ll} f_{{\mathbf{PF}}}^*=\min f_0 \left( x \right) &{}\\ \hbox {s.t.}\quad f_q \left( x \right) \le 0 &{}\quad \forall q\in \mathbf Q /\left\{ 0 \right\} \\ f_q \left( x \right) = \sum \limits _{\left( {i,j} \right) \in \mathbf BL } a_{ijq} w_{ij} +B_q x+C_q y^{R}+d_q &{}\quad \forall q\in \mathbf Q \\ w_{ij} =x_i x_j &{}\quad \forall \left( {i,j} \right) \in \mathbf BL \\ 0\le x^{L}\le x\le x^{U}&{}\\ x\in {\mathbb {R}}^{lx}, \quad w\in {\mathbb {R}}^{\left| \mathbf{BL } \right| }&{}\\ \end{array} \end{aligned}$$
(PF)
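A sketch of this two-stage step, assuming a pool of relaxation solutions and a placeholder `solve_pf(yR, xR, wR)` that fixes y, initializes x and w, runs the local NLP solver, and returns a feasibility flag with the objective value:

```python
def upper_bound_from_pool(pool, solve_pf, UB=float("inf")):
    """Solve one restricted NLP (PF) per pool entry; return the best UB."""
    for xR, yR, wR in pool:             # independent NLPs, solved in parallel
        feasible, obj = solve_pf(yR, xR, wR)
        if feasible and obj < UB:       # PF restricts P, so obj >= f_P*
            UB = obj
    return UB
```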

5 Global optimization algorithm

We now propose a global optimization algorithm for the solution of any mixed-integer nonlinear program that can be written as problem P. It is summarized in Fig. 6 and detailed in Tables 1 and 2.

Fig. 6 Flowchart of the proposed global optimization algorithm

Assumed given are the selection of partitioned variables \(x_j \) in every bilinear term, variable bounds \(x^{L}\) and \(x^{U}\), and a variety of tuning parameters. Problem-specific settings include the pre-specified values that the number of partitions can take when solving PR (\(N_{ PR } \)) and PRB (\(N_{ PRB } \)), maximum computational time and relative optimality tolerance (e.g. \( TIME _{ PR }^{max} \), \(\varepsilon _{ PR } \)). The other parameters will be named while describing the algorithm.

Following the initialization step, the algorithm computes the lower bound \( LB \) using the simplest McCormick relaxation. In subsequent iterations, step 5 will typically involve piecewise linear relaxations. Note that once OBBT loses efficiency (flag \( LAST _{ PR } =1\)), the maximum computational time \( TIME _{ PR }^{max} \) will be reset to the remaining time to run the algorithm. Step 5 solves one MILP problem of type PR, gathering a maximum of npool solutions in a pool. If the optimal solution \(f_{{\mathbf{PR}}}^R \) is higher than the lower bound, the latter is updated.

Remark 2

Region \(\mathbf WR _{{{\varvec{ij}}}} \) in problem PR is computed using piecewise McCormick envelopes whenever \(N_{ PR } \in \left\{ {1,2,\ldots ,9} \right\} \). Normalized multiparametric disaggregation is used instead for \(N_{ PR } \in \left\{ {10,100,1000,\ldots } \right\} (p\in \left\{ {\ldots ,-3,-2,-1} \right\} )\).

Remark 3

The lower bound is updated using the best possible solution at termination for problem PR and not the best-found feasible solution. The same is true for problem PRB, when it is an MILP.

In step 6, we use the values \(\left( {x^{R},y^{R},w^{R}} \right) \) of the variables in the previous solutions to help compute upper bounds. A total of npool problems of type PF are solved in parallel using npar threads. Amongst those that are feasible, the one with the lowest objective function \(f_{{\mathbf{PF}}}^*\) can set the upper bound \( UB \). Note that PF is solved by a local NLP solver, and so this step is much faster than steps 4–5; it is the reason why no execution-time constraints are enforced.

With the lower and upper bounds, step 7 computes the relative optimality gap OptGap. Step 8 stops the algorithm if a termination criterion is met: either a relative optimality gap below \(\varepsilon \) or a computational time (\( TIME \)) above the maximum value \( TIME ^{max}\). Decisions related to the dynamic partitioning scheme are taken in step 9.
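Steps 7–8 reduce to a few lines; in the sketch below we assume the gap is measured relative to the upper bound, a convention not stated explicitly in the text:

```python
def optimality_gap(LB, UB):
    """Relative gap (%) between best feasible (UB) and best possible (LB)."""
    return (UB - LB) / max(abs(UB), 1e-10) * 100.0

def should_stop(LB, UB, elapsed, eps, time_max):
    """Step 8: stop on a small enough gap or on the wall-time limit."""
    return optimality_gap(LB, UB) <= eps or elapsed >= time_max
```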

The details of step 9 can be found in Table 2. Two flags are used: \( NN _{ PR }^{nec} =1\) indicates that we have the necessary conditions for increasing the number of partitions in problem PR; \( NN _{ PR } =1\) gives the sufficient condition for selecting the next setting in \(\left\{ {N_{PR,first} ,\ldots ,N_{PR,last} } \right\} \), see 9c. These are the initial values for the first entry in 9a, which checks the time spent solving PR (\( TIME _{ PR } \)).

If \( TIME _{ PR } \) is greater than or equal to \( TIME _{ PR }^{max} \), we should try to backtrack and reduce the number of partitions in the next iteration to lower the complexity of PR, unless we are already at the coarsest setting \(N_{ PR ,first} \) or have previously backtracked to the current \(N_{ PR } \); either way, we will definitely not increase \(N_{ PR } \), i.e. \( NN _{ PR } =0\). The same holds if \( TIME _{ PR } \) lies between \(tr_{ PR }^{max} \cdot TIME _{ PR }^{max} \) and \( TIME _{ PR }^{max} \) while \( LAST _{ PR } =0\). In both cases we also set \( NN _{ PR }^{nec} =0\), to later decide how to improve the lower bound.

If the number of partitions did not increase in the previous iteration (\( NN _{ PR } =0\)) and PR was solved rather fast (below the minimum time ratio \(tr_{ PR }^{min} \)), we will try to generate a better lower bound by raising \(N_{ PR } \) in the next iteration. This concludes step 9a.

Step 9b takes measures when the average domain reduction in OBBT falls below the minimum target \( ADR ^{min}\). This is not an issue if PR problems can be solved rather fast (\( NN _{ PR } =1\)): we simply avoid spending time in the next iteration on an inefficient OBBT by making \(DO_{ OBBT } =0\). On the other hand, if we do not meet the necessary condition to increase \(N_{ PR } \) (\( NN _{ PR }^{nec} =0\)), we may need to move towards termination of the algorithm.

If the next possible value of \(N_{ PRB } \) is lower than \(N_{ PR } \), then we might still be able to get a good domain reduction by increasing \(N_{ PRB } \). Counters of LP (\(C_{ PRB }^{LP} \)) and MILP (\(C_{ PRB }^{ MILP } \)) problems solved are then reset. Else, we increase the appropriate counter by one. We then proceed to the last if-then-else. If we have already solved at least one MILP in OBBT and found \( ADR < ADR ^{min}\), then the most reasonable course of action is to give all remaining time to PR by making \( LAST _{ PR } =1\). If the domain reduction was low but we have been solving LPs in OBBT, then we also move towards the end while allowing one more OBBT run, now solving MILPs, after selecting the next value of \(N_{ PRB } \).
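The sketch below condenses our reading of the step-9a/9b rules just described; Table 2 holds the authoritative logic, including the \(C_{ PRB }^{LP} \)/\(C_{ PRB }^{ MILP } \) counter handling, which we omit. The state dictionary mirrors the paper's flags and tuning parameters:

```python
def step9(s, levels_pr):
    """Update partitioning flags after one iteration; s is the state dict."""
    i = levels_pr.index(s["N_PR"])
    ratio = s["TIME_PR"] / s["TIME_PR_max"]
    if s["TIME_PR"] >= s["TIME_PR_max"]:          # PR hit its time limit
        if i > 0 and not s["backtracked"]:        # backtrack one level
            s["N_PR"], s["backtracked"] = levels_pr[i - 1], True
        s["NN_PR"], s["NN_PR_nec"] = 0, 0
    elif ratio >= s["tr_max"] and not s["LAST_PR"]:
        s["NN_PR"], s["NN_PR_nec"] = 0, 0         # too slow to refine further
    elif not s["NN_PR"] and ratio <= s["tr_min"]:
        if i + 1 < len(levels_pr):                # fast solve: refine next time
            s["N_PR"], s["NN_PR"] = levels_pr[i + 1], 1
    if s["ADR"] < s["ADR_min"]:                   # step 9b: OBBT lost traction
        if s["NN_PR"]:
            s["DO_OBBT"] = False                  # PR still improving: skip OBBT
        elif not s["NN_PR_nec"]:
            if s["next_N_PRB"] < s["N_PR"]:
                s["N_PRB"] = s["next_N_PRB"]      # tighter OBBT relaxation
            else:
                s["LAST_PR"] = 1                  # give remaining time to PR
    return s
```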

We then proceed to the next iteration in step 10. Steps 3 and 4 are the first procedures of iteration IT; they are skipped in the first iteration so that an optimality gap can be computed quickly.

Step 3 involves a depth search and is detailed in Sect. 5.1.

Step 4 executes optimality-based bound tightening to reduce the variables' domain. It is triggered by \( DO _{ OBBT } =1\) and involves solving two PRB problems per variable, after setting the number of partitions N to \(N_{ PRB } \). Since the number of variables involved in bilinear terms can be large, it is much more efficient to solve the multiple instances of problem PRB in parallel rather than sequentially (see results in Sect. 7.6). This procedure is repeated until OBBT has been applied to all \(x_h \) variables involved in bilinear terms. We then compute the average domain reduction \( ADR \) (%) using Eq. (27).

$$\begin{aligned} ADR =\frac{1}{|\mathbf{BLV}|} \sum \limits _{h\in \mathbf{BLV}} \left[ \frac{\left( {x_h^{U,previous} -x_h^{L,previous} } \right) -\left( {x_h^{U,updated} -x_h^{L,updated} } \right) }{x_h^{U,previous} -x_h^{L,previous} }\times 100 \right] \end{aligned}$$
(27)
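Eq. (27) transcribes directly into code; a small helper (names are ours) assuming bounds are stored as dicts mapping each \(h \in \mathbf{BLV}\) to (lower, upper) tuples:

```python
def average_domain_reduction(prev, updated):
    """ADR (%) of Eq. (27): mean relative shrinkage of [x_h^L, x_h^U]."""
    vals = []
    for h in prev:
        pl, pu = prev[h]          # bounds before the OBBT pass
        ul, uu = updated[h]       # bounds after the OBBT pass
        vals.append(((pu - pl) - (uu - ul)) / (pu - pl) * 100.0)
    return sum(vals) / len(vals)
```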

Remark 4

\(N_{ PRB } =0\) triggers the computation of relaxed region \(\mathbf WR _{{{\varvec{ij}}}} \) of bilinear function \(w_{ij} =x_i x_j \) from the McCormick envelopes with binary variables relaxed (recall Remark 1).

Table 1 Global optimization algorithm
Table 2 Global optimization algorithm—dynamic partitioning scheme

5.1 Depth search method

MIQCPs have two sources of complexity: (1) a combinatorial source from the binary variables; (2) a non-convexity source from the bilinear terms. A stronger combinatorial component is associated with greater difficulty in finding the global optimal solution and can be addressed by generating a larger number of feasible solutions for P. This should be done preferably in the earlier stages of the global optimization algorithm, since a better upper bound (\( UB \)) helps to improve the bounds computed by problem PRB. The depth search is activated in the second iteration (\( IT =2\)) or when \( ADR < ADR ^{min}\), provided general setting \( DO _{ DS } =1\).

The depth search method works by dynamically increasing the number of partitions in PR from the current \(N_{ PR } \) value. Note that PR need not be solved to optimality, since the focus here is not on the lower bound. Because the MILP solver normally finds multiple feasible solutions in the early nodes of the search tree, we stop at time \( TIME _{ PR }^{max} \). Solutions obtained after solving PR with more partitions are potentially better (higher \(f_{{\mathbf{PR}}}^R \)), leading to values of the model variables that are closer to the feasible region of P. As explained in Sect. 4, these values are then used to initialize PF, potentially leading to a better UB. The depth search method stops after \(IT_{DS}^{max} \) increments in the number of partitions, resetting \(N_{ PR } \) to its original value.

Overall, depth search is very similar to the search performed by the main algorithm. However, it does not use OBBT and it always increases the number of partitions from one iteration to the next.
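Under the same placeholder interfaces as before (a `solve_pr(N, time_limit)` returning a pool of relaxation solutions, plus `upper_bound_from_pool` from Sect. 4), the depth search could be sketched as:

```python
def depth_search(N_pr, levels, it_ds_max, t_max, solve_pr, solve_pf, UB):
    """Raise the partition level up to it_ds_max times, harvesting solutions."""
    i = levels.index(N_pr)
    for j in range(i + 1, min(i + 1 + it_ds_max, len(levels))):
        pool = solve_pr(levels[j], t_max)               # stopped at time limit,
        UB = upper_bound_from_pool(pool, solve_pf, UB)  # not at optimality
    return UB                             # caller restores N_PR afterwards
```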

6 Benchmark problems

Two different sets of MIQCP benchmark problems from the literature are used to evaluate the performance of the proposed global optimization algorithm.

The first set deals with the short-term scheduling of a hydroelectric system [17], where the aim is to maximize the daily profit considering hourly changing electricity prices and start-up costs for the power plants. Power generation is modelled as a bilinear function of discharge flowrate and head change, with binary variables identifying whether a plant is producing energy in a given hour (needed to enforce lower and upper bounds on power production and discharge flowrate) and identifying start-ups. As in our previous global optimization studies [31, 33], we consider the original problem with 7 reservoirs (HYD7) and simpler versions with 2 (HYD2) and 4 reservoirs (HYD4).

The second set consists of planning problems from a petroleum refinery [13]. The objective is to minimize the total operating cost of a system that includes processing units with alternative operating modes and storage tanks. Binary variables identify active modes and products being blended. Bilinear terms appear as products of volumetric flows and quality properties in the material balances. We solve seven problems with different crude-oil supply and product demand data. Three involve a single period of operation (SC1TP1–SC3TP1), while in the others the weekly time horizon is divided into three periods of fixed length (SC1TP3–SC4TP3).

The model statistics in Table 3 show that the ratio between the number of binary variables and bilinear terms varies significantly between the two sets of problems (1:2 vs. 1:50). For the hydro problems, the stronger combinatorial component is associated with greater difficulty in finding the global optimal solution and can be addressed by generating a larger number of feasible solutions of P. We thus activate the depth search method (\(DO_{DS} =1\)), with a maximum of five increments in the number of partitions (\(IT_{DS}^{max} =5\)). The refinery problems do not benefit from the time-consuming depth search method, and so \(DO_{DS} =0\).

Table 3 MIQCP model statistics

6.1 Tuning parameters

The optimization algorithm presented in Sect. 5 has a few parameters affecting its performance. Most of the values selected were independent of problem type, while one was tuned to adjust to instance size. It is beyond the scope of this paper to present a thorough computational study involving such parameters.

MILP problems PR were solved for a number of partitions \(N_{ PR } \in \{ 1,2,4,8,10,100,1000\}\). The termination criteria were either a relative optimality tolerance \(\varepsilon _{ PR } =0.0001\)% or a maximum time \( TIME _{ PR }^{max} \) equal to 400 s while OBBT is effective, or to the remaining time available otherwise. The number of partitions \(N_{ PR } \) increases in the next iteration if the time solving PR divided by \( TIME _{ PR }^{max} \) is less than or equal to \(tr_{ PR }^{min} =0.05\). On the other hand, if the time ratio is greater than or equal to \(tr_{ PR }^{max} =0.75\), \(N_{ PR } \) does not change. The solution pool option of the MILP solver was active, with a pool capacity of \(npool=\) 60, thus leading to a maximum of 60 instances of PF solved in parallel per iteration.

The OBBT step involves solving LPs, \(N_{ PRB } =0\), and MILP problems, \(N_{ PRB } \in \left\{ {2,3,4,5,6,7} \right\} \) (recall Remark 4). In the latter case, the relative tolerance for problems PRB is \(\varepsilon _{ PRB } =0.0001\)%, while the maximum time \( TIME _{ PRB }^{max} \) is instance dependent: 130, 135, 145, 45 and 70 s for problems HYD2, HYD4, HYD7, SC#TP1 and SC#TP3, respectively. A maximum of \(npar=\) 80 instances were solved in parallel and the minimum average domain reduction to consider OBBT effective was \(ADR^{min}=5\)%.

For the hydro problems, the algorithm terminates when the optimality gap falls below \(\varepsilon =0.0001\)% or upon reaching a wall time \( TIME ^{max}=18{,}000\) s. For the refinery problems, the values are 0.01% and 3600/10,800 s when dealing with one/three periods. The partitioned variables in the hydro problems are the discharge flowrates. In the refinery problems, the partitioned variables are the stream flowrates, the inventory levels in the storage tanks, and the quality variables associated with the specific gravity.

7 Numerical results

All mathematical models and the global optimization algorithm were implemented in GAMS 24.7.3, taking advantage of its parallel computing grid facility. The MILP problems were solved by CPLEX 12.6.3, running in parallel deterministic mode and using up to 8 threads. CONOPT 3.17A solved the NLP problems. The MINLP benchmark problems were also solved by commercial global optimization solvers BARON 16.5 [37] and ANTIGONE 1.1 [39] using the same termination criteria. The former is centered around spatial branch-and-bound, while the latter focuses more on solving piecewise linear relaxations, applying bound tightening techniques, and generating different types of cutting planes. The hardware consisted of a server with an AMD \(\hbox {Opteron}^{\mathrm{TM}}\) Processor 6386 SE (2.79 GHz), 32 available cores, 64 GB RAM, and running Windows Server 2008 R2 Enterprise.

7.1 Comparison to our previous algorithms

The global optimization algorithms in our previous work used piecewise relaxations in a different manner; see details in Table 4. They are responsible for the literature results in Table 5.

Results in [31] for the hydro problems came from a spatial branch-and-bound algorithm using the NMDT relaxation with \(N_{ PR } =10\) partitions. OBBT was called in every node of the tree, prior to solving the relaxation problem (as in the current work), and involved solving a sequence of LPs (\(N_{ PRB } =0\)).

The algorithm solving the refinery problems in [13] used dynamic partitioning in the relaxation step as a replacement for spatial B&B, similarly to the one proposed in this work. The difference is that the number of partitions only increased, until reaching the computational time limit. Now, we enforce timing constraints per iteration to use the available time more efficiently, backtracking on the number of partitions whenever the relaxation problem cannot be solved to optimality. The two algorithms also share the parallel solution strategy for the bound contracting problems. However, the current algorithm calls OBBT more often, once per iteration and while domain reduction remains effective, instead of only after a better feasible solution is found. More importantly, our new algorithm adjusts to problem complexity by dynamically switching between McCormick and piecewise McCormick relaxations. In the former case, binary variables y are relaxed, leading to LPs instead of the MILPs (\(N_{ PRB } =1\)) in [13].

Table 4 Features of current and previous algorithms
Table 5 Comparison between proposed algorithm, commercial global optimization solvers and literature results
Table 6 Detailed information about the performance of the new algorithm

7.2 Performance overview

Table 5 shows the optimality gap and computational time required by the different algorithms, together with results from the literature. The highlight is that the new algorithm always returns the lowest optimality gap. It solves four problems to the given tolerance, compared to three by ANTIGONE and one by BARON. Our previous attempts with algorithms featuring piecewise relaxation methods and optimality-based bound tightening solved none of these benchmark problems to optimality. The ability to find the global optimal solution is also an important performance metric. The commercial global optimization solvers do better in this respect, returning suboptimal solutions in three problems compared to our new algorithm's four. It is an indication that there is still room for improving the upper bounding procedure.

BARON takes three times longer to solve HYD2 and returns considerably larger gaps for the other problems. The poorer performance in the refinery problems might be due to the large number of variables involved in bilinear terms (see Table 3), and thus the potentially large number of nodes to explore in spatial branch-and-bound. ANTIGONE is an overall better performer than BARON and is considerably faster in the single-period refinery problems. One possible explanation for the latter behavior is that cutting planes, or other techniques, are more efficient at reducing the domain of the model variables than OBBT when the problem size is small.

7.3 More detailed performance information

To understand how the algorithm is solving the benchmark problems, we show in Table 6 information related to: the total number of iterations; total time spent solving problems PR (step 5) and PF (step 6); executing OBBT (step 4) and depth search procedures (step 3); number of PF and PRB instances solved; average domain reduction in first and last OBBT call, with respect to the initial bounds; and number of partitions used in PR and PRB (final setting).

Piecewise relaxations are explored further in HYD2, with the algorithm reaching the maximum defined number of partitions for PR (\(N_{ PR } =1000\)) and PRB (\(N_{ PRB } =7\)). This is not surprising, considering that HYD2 has the fewest bilinear terms and variables appearing in bilinear terms (recall Table 3). As a consequence, we obtain the largest domain reduction (99.5%). Notice that HYD2 is the only problem taking advantage of the relaxation from multiparametric disaggregation.

The final OBBT domain reduction is strongly dependent on problem size, decreasing to 67% and 32.9% when the number of reservoirs in the hydro problems increases from 2 to 4 and 7, and from above 95% to below 86% when switching from the single- to the three-period refinery problems.

The time spent performing optimality-based bound tightening typically far exceeds the time spent solving relaxation problems. The two exceptions are HYD4 and HYD7, for which domain reduction became ineffective for \(N_{ PRB } = 3\) and 2 (while reaching the \( TIME _{ PRB }^{max} \) limit), and all remaining time was allocated to the final PR problems with \(N_{ PR } =4\) partitions. Refinery problems SC#TP3 exhibited a similar behavior, with the larger number of PRB instances solved explaining the longer OBBT time. Options to improve the algorithm performance for such problems involve extending the time limit and reducing the number of PRB instances to be solved in parallel.

Fig. 7 Optimality gap versus wall time for hydro problems

7.4 Closing the gap

The extensive use of time-consuming yet very effective piecewise relaxation techniques by our algorithm is clearly visible when plotting the optimality gap as a function of wall time; see Figs. 7 and 8. Recall from Fig. 6 that the optimality gap is only updated after a sequence of procedures: OBBT (tightens the variable bounds); solving problem PR (computes the lower bound, which may only improve with respect to the incumbent \( LB \) in the last moments of solving the MILP to optimality); and solving NLP problems PF (computes the upper bound). The consequence is a stepwise profile with major drops in optimality gap, compared to the smoother profiles of the commercial solvers. Notice that there is still some progress towards the end of the search (SC#TP3 problems in Fig. 8), when the solvers have already plateaued. One disadvantage is that it may take a few hundred seconds to go below the gaps of ANTIGONE and BARON.

Fig. 8 Optimality gap versus wall time for refinery problems

7.5 Removing the effect of OBBT

The four problems that were solved by the proposed algorithm to global optimality have in common a reduction of the domain of the variables involved in bilinear terms by more than 95% of the initial ranges, on average. We now test the performance of the commercial solvers after setting the variables' domain to the final range obtained by our algorithm. The influence of the upper bound is also removed by initializing with the optimum.

The results in Table 7 show that the warm start helps ANTIGONE and BARON solve these four problems in less than a minute. Improvements for HYD4, HYD7 and the refinery problems (with ANTIGONE) are minor. For the latter, BARON reduces the gap to less than half the values in Table 5. Neither solver can reach optimality gaps as low as those of the proposed algorithm, highlighting the importance of piecewise relaxations.

Table 7 Performance of commercial solvers after warm start
Fig. 9 Optimality gap profiles for sequential versus parallel OBBT

7.6 Sequential versus parallel OBBT

It remains to explain our choice of a parallel rather than a sequential implementation of optimality-based bound tightening. Figure 9 shows the optimality gap versus time profiles for refinery problems SC1TP1 and SC1TP3 and a fixed number of partitions in problems PR (\(N_{ PR } =2\)) and PRB, leading to the solution of LP (\(N_{ PRB } =0\)) or MILP problems (\(N_{ PRB } =2\)).

Results for the easiest problem, SC1TP1, show that there are no major differences between the sequential and parallel implementations when solving LP problems. The optimality gaps are better when optimizing the bounds of one variable after the other (as expected), and not much time is lost compared to the parallel approach, for which the overhead of exchanging information between threads is high. After the second iteration, the gaps become very similar and the lower computational time starts to be noticeable. Switching to MILP problems improves the relaxation quality and makes the parallel implementation far more competitive, with three iterations of OBBT taking less time and returning significantly smaller gaps than one iteration with the sequential approach.

Sequential OBBT with a piecewise relaxation (\(N_{ PRB } =2\)) is no longer an option for the larger SC1TP3, as three hours are not enough to complete one iteration. We can still tackle one iteration with the parallel approach, but it is far more efficient to rely on the standard McCormick relaxation. Overall, the benefits of a parallel implementation of OBBT become increasingly important as the number of variables in bilinear terms and the number of partitions in PRB grow.

8 Conclusions

This paper has presented a new global optimization algorithm for mixed-integer quadratically constrained problems that does not employ spatial branch-and-bound. The novel aspect is the use of dynamic partitioning in piecewise relaxations, not only to compute lower bounds for the problem being minimized, but also to reduce the domain of the variables involved in bilinear terms. Relaxations range from the simplest bilinear envelopes at the start, to univariate piecewise McCormick, up to normalized multiparametric disaggregation, which is computationally more efficient for 10 partitions and beyond. The first type provides a quick lower bound, with the algorithm then switching to piecewise relaxations to refine that estimate. The number of partitions keeps increasing while the relaxation problem remains solvable to the given tolerance within the specified time. In case of a severe increase in complexity, the algorithm backtracks to the previous setting, focusing more on optimality-based bound tightening (OBBT). Heuristic rules are used to decide when to increase/decrease the number of partitions in OBBT.

The algorithm has been designed to take advantage of parallel computing when doing OBBT and when computing upper bounds. Rather than reducing the domains of the many variables that appear in bilinear terms one at a time, which leads to the tightest bounds, multiple variables are handled simultaneously to reduce the computational wall time. A solution pool is activated when solving the MILP relaxation problems, to generate alternative initialization points for solving restricted NLPs of the original non-convex problem that, if feasible, provide upper bounds. These NLPs are also solved in parallel.

The algorithm has been tested on ten industrially relevant benchmark problems: three hydroelectric scheduling problems with more discrete decisions and seven petroleum refinery planning problems with a larger number of bilinear terms. The computational results have shown that more problems (four) can be solved to \(\varepsilon \)-global optimality. For the other six problems, the final optimality gaps were better than the values reported in the literature and lower than those from the state-of-the-art commercial global optimization solvers ANTIGONE and BARON. Their gaps remained above our algorithm's even when starting from a reduced variable range (from our last OBBT iteration). This shows that, as problem size increases, piecewise relaxations with just 2 and 4 partitions can already provide tighter lower bounds than spatial branch-and-bound. Commercial solvers should thus use them to a greater extent.