Introduction

The design of heuristics for difficult optimization problems is itself a heuristic process that often involves the following main steps.

After a clever analysis of the problem at hand and of the acceptable simplifications in its definition, one tries to set up an effective mathematical programming (MP) model and to solve it by a general-purpose piece of software—often a mixed-integer linear programming (MIP) solver. Due to the impressive improvement of general-purpose solvers in recent years, this approach can actually solve the instances of interest to proven optimality (or with an acceptable approximation) within a reasonable computing time, in which case of course no further effort is needed.

If this is not the case, one can insist on the MP approach and try to obtain better and better results by improving the model and/or by enhancing the solver by specialized features (cutting planes, branching, etc.). Or one can forget about MP and resort to ad hoc heuristics not based on the MP model. In this latter case, the MP model is completely disregarded or just used for illustrating the problem characteristics and/or for getting an off-line indication of the typical approximation error on a set of sample instances.

A third approach is however possible that consists in using the MP solver as a basic tool within the heuristic framework. This hybridization of MP with metaheuristics leads to the matheuristic approach, where the heuristic is built around the MP model. Matheuristics became popular in recent years, as witnessed by the publication of dedicated volumes and journal special issues [8, 19, 24] and by the dedicated sessions at MP and metaheuristics conferences.

Designing an effective heuristic is an art that cannot be framed into strict rules. This is particularly true when addressing a matheuristic, which is not a rigid paradigm but a conceptual framework for the design of mathematically sound heuristics. In this chapter, we will therefore try to illustrate some of the main matheuristic features with the help of different examples of application.

Section “General-Purpose MIP-Based Heuristics” describes powerful general-purpose MIP heuristics that can be used within the matheuristic framework. Interestingly, these heuristics can themselves be viewed as the first successful applications of the matheuristic idea of hybridizing MP and metaheuristics. Indeed, as noticed in [8], one of the very first illustrations of the power of the matheuristic idea is the general-purpose local branching [7] paradigm, where a black-box MIP solver is used to explore a solution neighborhood defined by invalid constraints added to the MIP model for the sake of easing its solution.

Section “Application 1: Wind Farm Layout Optimization” addresses the design of a matheuristic for wind farm optimization. This application is used to illustrate the importance of the choice of the MIP model: models that are weak in polyhedral terms can be preferred to tighter—but computationally much harder—models when heuristic (as opposed to exact) solutions are required.

Section “Application 2: Prepack Optimization” addresses a packing problem where the model is nonlinear, and the matheuristic is based on various ways to linearize it after a heuristic fixing of some variables.

Finally, section “Application 3: Vehicle Routing” is used to illustrate an advanced feature of matheuristics, namely, the solution of auxiliary MP models that describe a subproblem in the solution process. In particular, we address a vehicle routing problem and derive a matheuristic based on a set-partitioning MIP model asking for the reallocation of a subset of customer sequences subject to capacity and distance constraints.

The present chapter is based on previously published work; in particular, sections “General-Purpose MIP-Based Heuristics”, “Application 1: Wind Farm Layout Optimization”, “Application 2: Prepack Optimization”, and “Application 3: Vehicle Routing” are based on [8, 11, 13, 15], respectively.

General-Purpose MIP-Based Heuristics

Heuristics for general-purpose MIP solvers form the basis of the matheuristic’s toolkit. Their relevance for our chapter is twofold. On the one hand, they are invaluable tools for the solution of the subproblems defined by the matheuristic when applied to a specific problem. On the other hand, they illustrate the benefits that a general-purpose MIP solver can derive from metaheuristic concepts such as local search and evolutionary methods.

Modern MIP solvers exploit a rich arsenal of tools to attack hard problems. It is widely accepted that the solution of hard MIPs can take advantage of the solution of a series of auxiliary linear programs (LPs) intended to enhance the performance of the overall MIP solver. For example, auxiliary LPs may be solved to generate powerful disjunctive cuts or to implement a strong branching policy. On the other hand, it is a common experience that finding good-quality heuristic MIP solutions often requires a computing time that is just comparable to that needed to solve the LP relaxation. So, it makes sense to think of exact/heuristic MIP solvers where auxiliary MIPs (as opposed to LPs) are heuristically solved on the fly, with the aim of bringing the MIP technology within the MIP solver itself. This leads to the idea of “translating into a MIP model” (MIPping in the jargon of [9]) some crucial decisions to be taken when designing a MIP-based algorithm.

We next describe the new generation of MIP heuristics that emerged in the late 1990s, which are based on the idea of systematically using a “black-box” external MIP solver to explore a solution neighborhood defined by invalid linear constraints. We address a generic MIP of the form

$$ (MIP)\qquad \min\ c^T x $$
(1)
$$ A x \geq b, $$
(2)
$$ x_j \in \{0,1\}, \quad \forall j \in \mathcal{B}, $$
(3)
$$ x_j \ \mbox{integer}, \quad \forall j \in \mathcal{G}, $$
(4)
$$ x_j \ \mbox{continuous}, \quad \forall j \in \mathcal{C}, $$
(5)

where A is an m × n input matrix and b and c are input vectors of dimension m and n, respectively. Here, the variable index set \(\mathcal {N}:=\{ 1,\dots , n \}\) is partitioned into \(({\mathcal {B}}, {\mathcal {G}}, {\mathcal {C}})\), where \(\mathcal {B}\) is the index set of the 0-1 variables (if any), while sets \(\mathcal {G}\) and \(\mathcal {C}\) index the general integer and the continuous variables, respectively. Removing the integrality requirement on variables indexed by \(\mathcal {I} := \mathcal {B} \cup \mathcal {G}\) leads to the so-called LP relaxation.

Local Branching

The local branching (LB) scheme of Fischetti and Lodi [7] appears to be one of the first general-purpose heuristics using a black-box MIP solver applied to subMIPs, and it can be viewed as a precursor of matheuristics. Given a reference solution \(\bar{x}\) of a MIP with \(\mathcal{B}\neq \emptyset\), one aims at finding an improved solution that is “not too far” from \(\bar{x}\), in the sense that not too many binary variables need to be flipped. To this end, one can define the k-opt neighborhood \(\mathcal{N}(\bar{x}, k)\) of \(\bar{x}\) as the set of the MIP solutions satisfying the invalid local branching constraint

$$\displaystyle \begin{aligned} \Delta(x,\bar{x}) := \sum_{j\in \mathcal{B}: \bar{x}_j=0} x_j +\sum_{j\in \mathcal{B}: \bar{x}_j=1} (1 - x_j) \leq k, \end{aligned} $$
(6)

for a small neighborhood radius k—an integer parameter typically set to 10 or 20. The neighborhood is then explored (possibly heuristically, i.e., with some small node or time limit) by means of a black-box MIP solver. Experimental results [10] show that the introduction of the local branching constraint typically has the positive effect of driving to integrality many components of the optimal solution of the LP relaxation, improving the so-called relaxation grip and hence the capability of the MIP solver to find (almost) optimal integer solutions within short computing times. Of course, this effect is lost if parameter k is set to a large value—a mistake that would make local branching completely ineffective.

LB is in the spirit of local search metaheuristics and, in particular, of large-neighborhood search (LNS) [29], with the novelty that neighborhoods are obtained through “soft fixing,” i.e., through invalid cuts to be added to the original MIP model. Diversification cuts can be defined in a similar way, thus leading to a flexible toolkit for the definition of metaheuristics for general MIPs.
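
To make the mechanics concrete, the following minimal sketch (in Python, using the open-source PuLP modeling library; the data structures are our own illustrative assumptions, not the implementation of [7]) adds the local branching constraint (6) around a reference solution and reoptimizes:

```python
import pulp

def local_branching_step(prob, x, x_bar, k=10):
    """One LB iteration: add constraint (6) around the reference solution
    x_bar and re-solve with a black-box solver. Here x is a dict of binary
    LpVariables and x_bar a dict of 0/1 values (illustrative assumptions)."""
    # Delta(x, x_bar) = sum_{j: x_bar_j=0} x_j + sum_{j: x_bar_j=1} (1 - x_j)
    delta = pulp.lpSum(x[j] if x_bar[j] == 0 else 1 - x[j] for j in x)
    prob += delta <= k, "local_branching"      # the invalid "soft fixing" cut
    prob.solve(pulp.PULP_CBC_CMD(msg=False))   # possibly under a node/time limit
    if pulp.LpStatus[prob.status] == "Optimal":
        # best solution found in the k-opt neighborhood of x_bar
        return {j: round(x[j].value()) for j in x}
    return None
```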

Relaxation-Induced Neighborhood Search

The relaxation-induced neighborhood search (RINS) heuristic of Danna, Rothberg, and Le Pape [4] also uses a black-box MIP solver to explore a neighborhood of a given solution \(\bar{x}\) and was originally designed to be integrated in a branch-and-bound solution scheme. At specified nodes of the branch-and-bound tree, the solution x of the current LP relaxation and the incumbent \(\bar{x}\) are compared, and all integer-constrained variables that agree in value are fixed. The resulting MIP is typically easy to solve, as fixing reduces its size considerably, and often provides improved solutions with respect to \(\bar{x}\).
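
A sketch of the fixing step (Python/PuLP; the maps lp_value and incumbent are illustrative assumptions, not part of the original implementation) could read:

```python
import pulp

def rins_submip(prob, int_vars, lp_value, incumbent, tol=1e-6):
    """RINS-style sub-MIP: fix every integer-constrained variable whose
    LP-relaxation value agrees with the incumbent, then re-solve the
    (much smaller) model; lp_value/incumbent map variable names to values."""
    for v in int_vars:
        if abs(lp_value[v.name] - incumbent[v.name]) < tol:
            v.lowBound = v.upBound = incumbent[v.name]   # hard variable fixing
    prob.solve(pulp.PULP_CBC_CMD(msg=False))             # explore the neighborhood
    return {v.name: v.value() for v in int_vars}
```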

Polishing a Feasible Solution

The polishing algorithm of Rothberg [27] implements an evolutionary MIP heuristic which is invoked at selected nodes of a branch-and-bound tree and includes all classical ingredients of genetic computation, namely:

  • Population: A fixed-size population of feasible solutions is maintained. Those solutions are either obtained within the branch-and-bound tree (by other heuristics) or computed by the polishing algorithm itself.

  • Combination: Two or more solutions (the parents) are combined with the aim of creating a new member of the population (the child) with improved characteristics. The RINS scheme is adopted, i.e., all variables whose value coincides in the parents are fixed, and the reduced MIP is heuristically solved by a black-box MIP solver within a limited number of branch-and-bound nodes. This scheme is clearly much more time-consuming than a classical combination step in evolutionary algorithms, but it guarantees feasibility of the child solution.

  • Mutation: Diversification is obtained by performing a classical mutation step that (i) randomly selects a “seed” solution in the population, (ii) randomly fixes some of its variables, and (iii) heuristically solves the resulting reduced MIP.

  • Selection: Selection of the two parents to be combined is performed by randomly picking a solution in the population and then choosing, again at random, the second parent among those solutions with a better objective value.

Proximity Search

Proximity search [10] is a “dual version” of local branching that tries to overcome the issues related to the choice of the neighborhood radius k. Instead of hard-fixing the radius, proximity search fixes the minimum improvement of the solution value and changes the objective function to favor the search of solutions at small Hamming distance with respect to the reference one.

The approach works in stages, each aimed at producing an improved feasible solution. As in LB or RINS, at each stage a reference solution \(\bar {x}\) is given, and one aims at improving it. To this end, an explicit cutoff constraint

$$\displaystyle \begin{aligned} c^T x \le c^T \bar{x} - \theta {} \end{aligned} $$
(7)

is added to the original MIP, where θ > 0 is a given tolerance that specifies the minimum improvement required. The objective function of the problem can then be replaced by the proximity function \(\Delta (x, \bar {x})\) defined in (6), to be minimized. One then applies the MIP solver, as a black box, to the modified problem in the hope of finding a solution better than \(\bar {x}\). Computational experience confirms that this approach is quite successful (at least, on some classes of problems), due to the action of the proximity objective function that improves the “relaxation grip” of the model.

A simple variant of the above scheme, called “proximity search with incumbent,” is based on the idea of providing \(\bar{x}\) to the MIP solver as a starting solution. To avoid \(\bar{x}\) being rejected because of the cutoff constraint (7), the latter is weakened to its “soft” version

$$ c^T x \le c^T \bar{x} - \theta (1-\xi) $$
(8)

while minimizing \(\Delta (x,\bar {x}) + M \xi \) instead of just \(\Delta (x,\bar {x})\), where ξ ≥ 0 is a continuous slack variable and M ≫ 0 is a large penalty.
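
As an illustration, the following minimal sketch (Python with the PuLP modeling library; the objective vector c and the variable dictionary x are our own illustrative assumptions) implements one proximity-search stage, in both the hard (7) and soft (8) variants:

```python
import pulp

def proximity_search_stage(prob, x, x_bar, c, theta, soft=True, M=1e6):
    """One stage of proximity search: impose the cutoff (7) (or its soft
    version (8)) and minimize the Hamming distance to x_bar instead of the
    original objective c^T x; x maps indices to binary LpVariables."""
    obj_bar = sum(c[j] * x_bar[j] for j in x)          # c^T x_bar
    delta = pulp.lpSum(x[j] if x_bar[j] == 0 else 1 - x[j] for j in x)
    if soft:   # "proximity search with incumbent": x_bar stays feasible
        xi = pulp.LpVariable("xi", lowBound=0)
        prob += pulp.lpSum(c[j] * x[j] for j in x) <= obj_bar - theta * (1 - xi)
        prob.setObjective(delta + M * xi)              # minimize Delta + M*xi
    else:      # hard cutoff (7): any feasible point improves by >= theta
        prob += pulp.lpSum(c[j] * x[j] for j in x) <= obj_bar - theta
        prob.setObjective(delta)
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {j: round(x[j].value()) for j in x}
```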

Application 1: Wind Farm Layout Optimization

Green energy has become a topic of great interest in recent years, as environmental sustainability calls for a considerable reduction in the use of fossil fuels. The wind farm layout optimization problem aims at finding an allocation of turbines in a given site so as to maximize power output. This strategic problem is extremely hard in practice, both for the size of the instances in real applications and for the presence of several nonlinearities to be taken into account. A typical nonlinear feature of this problem is the interaction among turbines, also known as the wake effect: if two turbines are located close to each other, the upwind one creates a shadow on the one behind. Interference is therefore of great importance in the design of the layout, as it results in a loss of power production for the turbine downstream.

We next outline the main steps in the design of a sound matheuristic scheme for wind farm layout optimization that is able to address the large-size instances arising in practical applications.

Choice of the MIP Model

Different models have been proposed in the literature to describe interference. We will consider first a simplified model from the literature [5], where the overall interference is the sum of pairwise interferences between turbine pairs. The model addresses the following constraints:

  (a) a minimum and maximum number of turbines that can be built is given;

  (b) there should be a minimal separation distance between two turbines to ensure that the blades do not physically clash (turbine distance constraints);

  (c) if two turbines are installed, their interference will cause a loss in the power production that depends on their relative position and on wind conditions.

Let V denote the set of possible positions for a turbine, called “sites” in what follows, and let

  • \(N_{\mathrm{MIN}}\) and \(N_{\mathrm{MAX}}\) be the minimum and maximum number of turbines that can be built, respectively;

  • \(D_{\mathrm{MIN}}\) be the minimum distance between two turbines;

  • \(dist(i, j)\) be the Euclidean distance between sites i and j;

  • \(I_{ij}\) be the interference (loss of power) experienced by site j when a turbine is installed at site i, with \(I_{jj} = 0\) for all \(j \in V\);

  • \(P_i\) be the power that a turbine would produce if built (alone) at site i.

In addition, let G I = (V, E I) denote the incompatibility graph with

$$\displaystyle \begin{aligned} E_I = \{ [i,j] \in V \times V: \, dist(i,j) < D_{\mathrm{MIN}}, \, i<j\}\end{aligned} $$

and let n := |V | denote the total number of sites. Two sets of binary variables are defined:

$$\displaystyle \begin{gathered} x_i = \left\{\begin{array}{ll} 1 & \, \mbox{if a turbine is built at site {$i$}}; \\ 0 & \, \mbox{otherwise} \end{array} \right. \quad(i \in V) \\ z_{ij} = \left\{\begin{array}{ll} 1 & \, \mbox{if two turbines are built at both sites {$i$} and {$j$}}; \\ 0 & \, \mbox{otherwise} \end{array} \right. \quad(i,j \in V, i < j) \end{gathered} $$

The model then reads

$$ \max \sum_{i \in V} P_i x_i - \sum_{i \in V}\sum_{j \in V, i < j} (I_{ij}+I_{ji})\, z_{ij} $$
(9)
$$ \mbox{s.t.}\quad N_{\mathrm{MIN}} \leq \sum_{i \in V} x_i \leq N_{\mathrm{MAX}} $$
(10)
$$ x_i + x_j \leq 1 \quad \forall [i,j] \in E_I $$
(11)
$$ x_i + x_j - 1 \leq z_{ij} \quad \forall i, j \in V,\ i<j $$
(12)
$$ x_i \in \{0, 1\} \quad \forall i \in V $$
(13)
$$ z_{ij} \in \{0, 1\} \quad \forall i,j \in V,\ i < j $$
(14)

Objective function (9) maximizes the total power production by taking interference losses \(I_{ij}\) into account. Constraints (11) model pairwise site incompatibility. Constraints (12) force \(z_{ij} = 1\) whenever \(x_i = x_j = 1\); because of the objective function, this is in fact equivalent to imposing \(z_{ij} = x_i x_j\).

The definition of the turbine power vector (P i) and of interference matrix (I ij) depends on the wind scenario considered, which greatly varies in time. Using statistical data, one can in fact collect a large number K of wind scenarios k, each associated with a pair (P k, I k) with a probability π k, and define the average power and interference to be used in the model as:

$$ P_i := \sum_{k=1}^K \pi_k P_{i}^k \quad \forall i \in V $$
(15)
$$ I_{ij} := \sum_{k=1}^K \pi_k I_{ij}^k \quad \forall i, j \in V $$
(16)

While (9)–(14) turns out to be a reasonable model when just a few sites have to be considered (say n ≈ 100), it becomes hopeless when n ≥ 1,000 because of the huge number of variables and constraints involved, which grows quadratically with n. Therefore, when facing instances with several thousand sites, an alternative (possibly weaker) model is required, where interference can be handled by a number of variables and constraints that grows just linearly with n. The model below is a compact reformulation of model (9)–(14) that follows a recipe of Glover [17] widely used, e.g., for the quadratic assignment problem [12, 32]. The original objective function (to be maximized), rewritten as

$$ \sum_{i \in V} P_i x_i - \sum_{i \in V} \left(\sum_{j \in V} I_{ij} x_j\right) x_i $$
(17)

is restated as

$$ \sum_{i \in V} (P_i x_i - w_i) $$
(18)

where

$$\displaystyle \begin{aligned} w_i := \left(\sum_{j \in V} I_{ij} x_j \right)\, x_i = \left\{\begin{array}{ll} \sum_{j \in V} I_{ij} x_j & \mbox{if {$x_i=1$}} \\ 0 & \mbox{if {$x_i=0$}} \end{array} \right. \end{aligned}$$

denotes the total interference caused by site i. Our compact model then reads

$$ \max\ z = \sum_{i \in V} (P_i x_i - w_i) $$
(19)
$$ \mbox{s.t.}\quad N_{\mathrm{MIN}} \leq \sum_{i \in V} x_i \leq N_{\mathrm{MAX}} $$
(20)
$$ x_i + x_j \leq 1 \quad \forall [i,j] \in E_I $$
(21)
$$ \sum_{j \in V} I_{ij} x_j \leq w_i + M_i (1-x_i) \quad \forall i \in V $$
(22)
$$ x_i \in \{0, 1\} \quad \forall i \in V $$
(23)
$$ w_i \geq 0 \quad \forall i \in V $$
(24)

where the big-M term \(M_i = \sum_{j \in V: [i,j] \not\in E_I} I_{ij}\) is used to deactivate constraint (22) when \(x_i = 0\), in which case \(w_i = 0\) because of the objective function.
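
For concreteness, a sketch of the compact model (19)–(24) in Python/PuLP follows; the data structures P, I, and E_I are illustrative assumptions, not taken from the original C implementation:

```python
import pulp

def compact_windfarm_model(P, I, E_I, n_min, n_max):
    """Compact wind farm model (19)-(24). P[i]: power at site i;
    I[i][j]: pairwise interference; E_I: set of incompatible pairs (i, j)
    with i < j (illustrative data layout)."""
    V = range(len(P))
    m = pulp.LpProblem("windfarm", pulp.LpMaximize)
    x = {i: pulp.LpVariable(f"x_{i}", cat=pulp.LpBinary) for i in V}
    w = {i: pulp.LpVariable(f"w_{i}", lowBound=0) for i in V}
    m += pulp.lpSum(P[i] * x[i] - w[i] for i in V)              # objective (19)
    m += pulp.lpSum(x[i] for i in V) >= n_min                   # (20)
    m += pulp.lpSum(x[i] for i in V) <= n_max
    for i, j in E_I:                                            # clashes (21)
        m += x[i] + x[j] <= 1
    for i in V:                                                 # big-M rows (22)
        M_i = sum(I[i][j] for j in V
                  if j != i and (min(i, j), max(i, j)) not in E_I)
        m += pulp.lpSum(I[i][j] * x[j] for j in V) <= w[i] + M_i * (1 - x[i])
    return m, x, w
```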

Choice of Ad Hoc Heuristics

A simple 1-opt heuristic can be designed along the following lines. At each step, we have an incumbent solution, say \(\tilde {x}\), that describes the best-known turbine allocation (\(\tilde {x}_i=1\) if a turbine is built at site i, 0 otherwise), and a current solution x. Let

$$\displaystyle \begin{aligned} z = \sum_{i \in V} P_i x_i - \sum_{i \in V}\sum_{j \in V} I_{ij}\, x_i \, x_j \end{aligned}$$

be the profit of the current solution, \(\gamma = \sum_{i \in V} x_i\) be its cardinality, and define for each \(j \in V\) the extra-profit \(\delta_j\) incurred when flipping \(x_j\), namely:

$$\displaystyle \begin{aligned} \delta_j = \left\{\begin{array}{ll} P_j - \displaystyle{\sum_{i\in V: x_i=1} (I_{ij}+I_{ji})} & \mbox{if {$x_j=0$};} \\*[1ex] -P_j + \displaystyle{\sum_{i\in V: x_i=1} (I_{ij}+I_{ji})} & \mbox{if {$x_j=1$}} \end{array} \right.\end{aligned} $$

where we assume \(I_{ij} = BIG\) for all incompatible pairs \([i, j] \in E_I\), where \(BIG > \sum_{i \in V} P_i\) is a large penalty value, while \(I_{ii} = 0\) as usual.

We start with x = 0, z = 0, and γ = 0 and initialize \(\delta_j = P_j\) for all \(j \in V\). Then, we iteratively improve x by a sequence of 1-opt moves, according to the following scheme. At each iteration, we look in O(n) time for the site j with maximum \(\delta_j + FLIP(j)\), where function FLIP(j) takes the cardinality constraints into account, namely

$$\displaystyle \begin{aligned} FLIP(j) = \left\{\begin{array}{ll} - HUGE & \mbox{if {$x_j=0$} and {$\gamma \geq N_{MAX}$}}\\ - HUGE & \mbox{if {$x_j=1$} and {$\gamma \leq N_{MIN}$}}\\ + HUGE & \mbox{if {$x_j=0$} and {$\gamma < N_{MIN}$}}\\ + HUGE & \mbox{if {$x_j=1$} and {$\gamma > N_{MAX}$}}\\ ~~0 & \mbox{otherwise} \end{array} \right.\end{aligned} $$

with HUGE ≫ BIG (recall that function \(\delta_j + FLIP(j)\) has to be maximized).

Once the best site has been found, say \(j^*\), if \(\delta_{j^*}+FLIP(j^*)>0\), we flip \(x_{j^*}\); update x, z, and γ in O(1) time; update all \(\delta_j\)’s in O(n) time (through the parametric technique described in [11]); and repeat. In this way, a sequence of improving solutions is obtained, until a locally optimal solution that cannot be improved by a single flip is found. To escape local minima, a simple perturbation scheme can be implemented; see again [11] for details.
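
A compact sketch of this 1-opt loop in plain Python (our own illustrative rendering of the scheme above; I is assumed to already contain the BIG penalty on incompatible pairs, with big > sum(P)):

```python
def one_opt(P, I, n_min, n_max, big):
    """1-opt for the wind farm layout problem: start from the empty layout
    and repeatedly flip the variable with the largest delta_j + FLIP(j),
    updating the delta vector parametrically in O(n) per move."""
    n = len(P)
    huge = 1000.0 * big                 # HUGE >> BIG
    x, gamma = [0] * n, 0
    delta = list(P)                     # delta_j = P_j when x = 0

    def flip(j):                        # the FLIP(j) cardinality term
        if x[j] == 0 and gamma >= n_max: return -huge
        if x[j] == 1 and gamma <= n_min: return -huge
        if x[j] == 0 and gamma < n_min:  return +huge
        if x[j] == 1 and gamma > n_max:  return +huge
        return 0.0

    while True:
        j = max(range(n), key=lambda t: delta[t] + flip(t))
        if delta[j] + flip(j) <= 0:     # 1-opt local optimum reached
            break
        sign = 1 if x[j] == 0 else -1   # build (+1) or remove (-1) a turbine
        x[j] = 1 - x[j]
        gamma += sign
        for t in range(n):              # parametric O(n) update of the deltas
            if t == j:
                delta[t] = -delta[t]
            else:
                adj = sign * (I[j][t] + I[t][j])
                delta[t] += adj if x[t] == 1 else -adj
    return x
```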

A 2-opt heuristic can similarly be implemented to allow a single turbine to move to a better site—a move that requires flipping two variables. Each 2-opt exchange requires \(O(n^2)\) time, as it amounts to trying n 1-opt exchanges and applying the best one.

The Overall Matheuristic

Our final approach is a mixture of ad hoc (1- and 2-opt) and general MIP (proximity search with incumbent) heuristics and works as shown in Algorithm 1.

Algorithm 1: The overall matheuristic framework

At Step 2, the heuristics of section “Choice of Ad Hoc Heuristics” are applied in their “initial-solution” mode where one starts with \(\tilde {x}=x=0\) and aborts execution when 1-opt is invoked 10,000 consecutive times without improving \(\tilde {x}\). At Step 4, instead, a faster “cleanup” mode is applied. As we already have a hopefully good incumbent \(\tilde {x}\) to refine, we initialize \(x=\tilde {x}\) and repeat the procedure until we count 100 consecutive 1-opt calls with no improvement of \(\tilde {x}\). As to time-consuming 2-opt exchanges, they are applied with a certain frequency and in any case just before the final \(\tilde {x}\) is returned.

Two different MIP models are used to feed the proximity-search heuristic at Step 6. During the first part of the computation, we use a simplified MIP model obtained from (19)–(24) by removing all interference constraints (22), thus obtaining a much easier problem. A short time limit is imposed for each call of proximity search when this simplified model is solved. In this way we aggressively drive the solution \(\tilde{x}\) to increase the number of built turbines, without being bothered by interference considerations and taking only pairwise incompatibility (21) into account. This approach quickly finds better and better solutions (even in terms of the true profit), until either (i) no additional turbine can be built or (ii) the addition of new turbines in fact reduces the true profit of the new solution because of the neglected interference. At this point we switch to the complete model (19)–(24) with all interference constraints, which is used in all subsequent executions of Step 6. Note that the simplified model is only used at Step 6, while all other steps of the procedure always use the true objective function that takes interference fully into account.

Computational Results

The following alternative solution approaches were implemented in C language, some of which use the commercial MIP solver IBM ILOG Cplex 12.5.1 [21]; because of the big-Ms involved in the models, all Cplex-based codes use zero as integrality tolerance (CPX_PARAM_EPINT = 0.0).

  (a) proxy: The matheuristic outlined in the previous section, built on top of Cplex with the following aggressive parameter tuning: all cuts deactivated, CPX_PARAM_RINSHEUR = 1, CPX_PARAM_POLISHAFTERTIME = 0.0, CPX_PARAM_INTSOLLIM = 2;

  (b) cpx_def: The application of IBM ILOG Cplex 12.5.1 in its default setting, starting from the same heuristic solution \(\tilde{x}\) available right after the first execution of Step 2 of Algorithm 1;

  (c) cpx_heu: Same as cpx_def, with the following internal tuning intended to improve Cplex’s heuristic performance: all cuts deactivated, CPX_PARAM_RINSHEUR = 100, CPX_PARAM_POLISHAFTERTIME = 20% of the total time limit;

  (d) loc_sea: A simple heuristic not based on any MIP solver, which just loops on Step 4 of Algorithm 1 and randomly removes installed turbines from the current best solution after 10,000 iterations without improvement of the incumbent.

For each algorithm, we recorded the best solution found within a given time limit.

In our view, loc_sea is representative of a clever but not oversophisticated metaheuristic, as typically implemented in practice, while cpx_def and cpx_heu represent a standard way of exploiting a MIP model once a good feasible solution is known.

Our test bed refers to an offshore 3,000 × 3,000 (m) square with \(D_{\mathrm{MIN}}\) = 400 (m) minimum turbine separation and no limit on the number of turbines to be built (i.e., \(N_{\mathrm{MIN}} = 0\) and \(N_{\mathrm{MAX}} = +\infty\)). Turbines are all of Siemens SWT-2.3-93 type (rotor diameter 93 m), which produces a power of 0.0 MW for wind speeds up to 3 m/s, of 2.3 MW for wind speeds greater than or equal to 16 m/s, and intermediate values for winds in the range 3–16 m/s according to a nonlinear power curve [30]. Pairwise interference (in MW) was computed using Jensen’s model [22], by averaging 250,000+ real-world wind samples. Those samples were grouped into about 500 macro-scenarios to reduce the computational time spent for the definition of the interference matrix. A pairwise average interference of 0.01 MW or less was treated as zero. The reader is referred to [6] for details.

We generated five classes of medium-to-large problems with n ranging from 1,000 to 20,000. For each class, ten instances have been considered by generating n uniformly random points in the 3,000 × 3,000 square. (Although in the offshore case turbine positions are typically sampled on a regular grid, we decided to randomly generate them to be able to compute meaningful statistics for each value of n.)

In what follows, reported computing times are CPU seconds on an Intel Xeon E3-1220 V2 quad-core PC with 16GB of RAM and do not take Step 1 of Algorithm 1 into account, as the interference matrix is assumed to be precomputed and reused at each run.

Computational results on our instances are given in Table 1, where each entry refers to the performance of a given algorithm at a given time limit. In particular, the left part of the table reports, for each algorithm and time limit, the number of wins, i.e., the number of instances for which a certain algorithm produced the best-known solution at the given time limit (ties allowed).

Table 1 Number of times each algorithm finds the best-known solution within the time limit (wins) and optimality ratio with respect to the best-known solution—the larger, the better

According to the table, proxy outperforms all competitors by a large margin on medium-to-large instances. As expected, cpx_heu performs better on instances with n = 1,000, as it is allowed to explore a large number of enumeration nodes for the original model and objective function. Note that loc_sea performs well for short time limits and/or for large instances, thus confirming its effectiveness, whereas cpx_heu is significantly better than loc_sea only for small instances and large time limits.

A different performance measure is given in the right-hand part of Table 1, where each entry gives the average optimality ratio, i.e., the average ratio between the solution value produced by an algorithm (on a given instance at a given time limit) and the best-known solution value for that instance—the closer to one, the better. It should be observed that an improvement of just 1% has a very significant economic impact due to the very large profits involved in the wind farm context. The results show that proxy is always able to produce solutions that are quite close to the best one. As before, loc_sea is competitive for large instances when a very small computing time is allowed, whereas cpx_def and cpx_heu exhibit a good performance only for small instances and are dominated even by loc_sea for larger ones.

Application 2: Prepack Optimization

Packing problems play an important role in industrial applications. In these problems, a given set of items has to be packed into one or more containers (bins) so as to satisfy a number of constraints and to optimize some objective function.

Most of the contributions from the literature are devoted to the case where all the items have to be packed into a minimum number of bins so as to minimize, e.g., transportation costs; within these settings, only loading costs are taken into account. The resulting problem is known as the bin packing problem and has been widely studied in the literature both in its one-dimensional version [25] and in its higher-dimensional variants [23].

We will next consider a different packing problem arising in inventory allocation applications, where the operational cost for packing the bins is comparable to, or even higher than, the cost of the bins themselves. This is the case, for example, for warehouses that have to manage a large number of different customers (e.g., stores), each requiring a given set of items. Assuming that automatic systems are available for packing, the required workforce is related to the number of different ways that are used to pack the bins to be sent to the customers. To limit this cost, a hard constraint can be imposed on the total number of different box configurations that are used.

Prepacking items into box configurations has obvious benefits in terms of easier and cheaper handling, as it reduces the amount of material handled by both the warehouse and the customers. However, the approach can considerably reduce the flexibility of the supply chain, leading to situations in which the set of items actually shipped to each customer may slightly differ from the required one—at the expense of some cost in the objective function. In addition, an upper bound on overstocking is usually imposed for each store.

The resulting problem, known as prepack optimization problem (POP), was recently addressed in [20], where a real-world application in the fashion industry is presented, and heuristic approaches are derived using both constraint programming (CP) and MIP techniques.

Mathematical Model

In this section we briefly formalize POP and review the mathematical model introduced in [20]. We are given a set I of product types and a set S of stores. Each store s ∈ S requires an integer number \(r_{is}\) of products of type i ∈ I. Bins with different capacities are available for packing items: we denote by \(K \subset \mathbb{Z}_+\) the set of available bin capacities.

Bins must be completely filled and are available in an unlimited number for each type. A box configuration describes the packing of a bin, in terms of number of products of each type that are packed into it. We denote by NB the maximum number of box configurations that can be used for packing all products and by B = {1, …, NB} the associated set.

The packing of products into boxes is described by integer variables \(y_{bi}\): for each product type i ∈ I and box configuration b ∈ B, the associated variable \(y_{bi}\) indicates the number of products of type i that are packed into the b-th box configuration. In addition, integer variables \(x_{bs}\) are used to denote the number of bins loaded according to box configuration b that have to be shipped to store s ∈ S.

Understocking and overstocking of product i at store s are expressed by decision variables \(u_{is}\) and \(o_{is}\), respectively. Positive costs α and β penalize each unit of under- and overstocking, respectively, and an upper bound \(\delta_{is}\) on the maximum overstocking of each product at each store is also imposed.

Finally, for each box configuration b ∈ B and capacity value k ∈ K, a binary variable \(t_{bk}\) is introduced that takes value 1 if box configuration b corresponds to a bin of capacity k.

Additional integer variables used in the model are \(q_{bis} = x_{bs} y_{bi}\) (the number of items of type i sent to store s in boxes loaded according to configuration b); hence, \(\sum_{b \in B} q_{bis}\) gives the total number of products of type i that are shipped to store s.

A mixed-integer nonlinear programming (MINLP) model then reads:

$$ \min \sum_{s \in S} \sum_{i \in I} (\alpha u_{is} + \beta o_{is}) $$
(25)
$$ q_{bis} = x_{bs}\, y_{bi} \quad (b\in B;\ i \in I;\ s \in S) $$
(26)
$$ \sum_{b\in B} q_{bis} - o_{is} + u_{is} = r_{is} \quad (i \in I;\ s \in S) $$
(27)
$$ \sum_{i \in I} y_{bi} = \sum_{k \in K} k \, t_{bk} \quad (b\in B) $$
(28)
$$ \sum_{k \in K} t_{bk} = 1 \quad (b\in B) $$
(29)
$$ o_{is} \leq \delta_{is} \quad (i \in I;\ s \in S) $$
(30)
$$ t_{bk} \in \{0, 1\} \quad (b\in B;\ k \in K) $$
(31)
$$ x_{bs} \geq 0 \ \mbox{integer} \quad (b\in B;\ s \in S) $$
(32)
$$ y_{bi} \geq 0 \ \mbox{integer} \quad (b\in B;\ i \in I) $$
(33)

The model is of course nonlinear, as the bilinear constraints (26) involve products of decision variables. To derive a linear MIP model, the following standard technique can be used. Each \(x_{bs}\) variable is decomposed into its binary expansion using binary variables \(v_{bsl}\) (l = 0, …, L), where L is easily computed from an upper bound on \(x_{bs}\). When these variables are multiplied by \(y_{bi}\), the corresponding products \(w_{bisl} = v_{bsl} y_{bi}\) are linearized through the addition of suitable constraints.

Our final MIP is therefore obtained from (25)–(33) by adding

$$ x_{bs} = \sum_{l=0}^L 2^l v_{bsl} \quad (b\in B;\ s \in S) $$
(34)
$$ v_{bsl} \in \{0, 1\} \quad (b\in B;\ s \in S;\ l=0,\ldots, L) $$
(35)

and by replacing each nonlinear equation (26) with the following set of new variables and constraints:

$$ q_{bis} = \sum_{l=0}^L 2^l w_{bisl} \quad (b \in B;\ i \in I;\ s \in S) $$
(36)
$$ w_{bisl} \leq \overline Y v_{bsl} \quad (b\in B;\ i \in I;\ s \in S;\ l=0,\ldots, L) $$
(37)
$$ w_{bisl} \leq y_{bi} \quad (b\in B;\ i \in I;\ s \in S;\ l=0,\ldots, L) $$
(38)
$$ w_{bisl} \geq y_{bi} - \overline Y (1-v_{bsl}) \quad (b\in B;\ i \in I;\ s \in S;\ l=0,\ldots, L) $$
(39)
$$ w_{bisl} \geq 0 \quad (b\in B;\ i \in I;\ s \in S;\ l=0,\ldots, L) $$
(40)

where \(\overline Y\) denotes an upper bound on the y variables.
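
The following sketch (Python/PuLP; a hypothetical helper of ours, not code from [20]) generates the expansion (34) and the product constraints (36)–(40) for a single pair of variables:

```python
import pulp

def linearize_product(m, x, y, x_ub, y_ub, tag):
    """Linearize q = x*y for integer x in [0, x_ub] and y in [0, y_ub]
    via the binary expansion (34) and constraints (36)-(40); returns q.
    m is a pulp.LpProblem; tag disambiguates variable names."""
    L = max(x_ub, 1).bit_length() - 1   # smallest L with 2^(L+1) - 1 >= x_ub
    v = [pulp.LpVariable(f"v_{tag}_{l}", cat=pulp.LpBinary) for l in range(L + 1)]
    w = [pulp.LpVariable(f"w_{tag}_{l}", lowBound=0) for l in range(L + 1)]
    m += x == pulp.lpSum(2**l * v[l] for l in range(L + 1))       # (34)
    for l in range(L + 1):
        m += w[l] <= y_ub * v[l]                                  # (37)
        m += w[l] <= y                                            # (38)
        m += w[l] >= y - y_ub * (1 - v[l])                        # (39)
    q = pulp.LpVariable(f"q_{tag}", lowBound=0)
    m += q == pulp.lpSum(2**l * w[l] for l in range(L + 1))       # (36)
    return q
```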

In case all capacities are even, the following constraint—though redundant—plays a very important role in improving the LP bound of our MIP model:

$$ \sum_{i \in I} (u_{is}+o_{is}) \ge 1 \quad \left(s \in S: \sum_{i\in I} r_{is} \mbox{ is odd}\right) $$
(41)

Matheuristics

The MIP model of the previous subsection is far too difficult to be addressed by standard solvers. As a matter of fact, for real-world cases even the LP relaxation at each node turns out to be very time consuming. So we designed ad hoc heuristic approaches that exploit the special structure of our MIP, following the matheuristic paradigm.

Each heuristic is based on the idea of iteratively solving a restricted problem obtained by fixing a subset of variables, so as to obtain a subproblem which is (reasonably) easy to solve by a commercial MIP solver, but still able to produce improved solutions.

Two kinds of heuristics can be implemented: constructive and refinement heuristics. Constructive heuristics are used to find a solution H starting from scratch. In a refinement heuristic, instead, we are given a heuristic solution \(H = (x^H, y^H)\) that we would like to improve. We first fix some x and/or y variables to their values in H, thus defining a solution neighborhood \({\mathcal{N}}(H)\) of H. We then search \({\mathcal{N}}(H)\) by applying a general-purpose MIP solver to the model resulting from the fixing. If an improved solution is found within the given time limit, we update H and repeat; otherwise, a new neighborhood is defined in the attempt to escape the local optimum.

Fixing all x or y Variables

A first obvious observation is that our basic MINLP model (25)–(33) reduces to a linear MIP if all the x (or all the y) variables are fixed, as constraints (26) trivially become linear. According to our experience, the resulting MIP (though nontrivial) is typically solved very quickly by a state-of-the-art solver, meaning that one can effectively solve a sequence of restricted MIPs where x and y are fixed, in turn, until no further improvement can be obtained.

Let \({\mathcal{N}}_y(H)\) and \({\mathcal{N}}_x(H)\) denote the solution neighborhoods of H obtained by leaving y or x free, i.e., by imposing \(x = x^H\) or \(y = y^H\), respectively.

A basic tool that we use in our heuristics is function \(REOPT(S', y^H)\), which considers a store subset S′ ⊆ S and a starting \(y^H\) and returns the best solution H obtained by iteratively optimizing over the neighborhoods \({\mathcal{N}}_x(H)\), \({\mathcal{N}}_y(H)\), \({\mathcal{N}}_x(H)\), etc., after having removed all stores not in S′, H being updated after each optimization.
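
In pseudocode form, REOPT can be sketched as follows (plain Python; solve_with_fixed is a hypothetical callback of ours that fixes one variable group, solves the resulting linear MIP, and returns the new solution with its cost):

```python
def reopt(solve_with_fixed, x_start, y_start, eps=1e-9):
    """REOPT: alternately fix all y (exploring N_x) and all x (exploring
    N_y), re-solving the linear MIP obtained in each case, until the
    objective stops improving; minimization is assumed."""
    x, y = x_start, y_start
    best = float("inf")
    while True:
        x, y, z = solve_with_fixed(fix="y", x=x, y=y)   # N_x(H): x free
        x, y, z = solve_with_fixed(fix="x", x=x, y=y)   # N_y(H): y free
        if z >= best - eps:                             # no further improvement
            return x, y, best
        best = z
```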

Fixing y Variables For All but One Configuration

Another interesting neighborhood, say \({\mathcal{N}}_x(H, y_{\beta})\), is obtained by leaving all x variables free and by fixing \(y_{bi}=y^H_{bi}\) for all i ∈ I and b ∈ B ∖{β}, for a given β ∈ B. In other words, we allow just one (out of NB) configurations of the current solution to change, and leave the solver the possibility of changing the x variables as well.

In our implementation, we first define a random permutation \(\{\beta_1, \ldots, \beta_{NB}\}\) of B. We then optimize, in a circular sequence, the neighborhoods \({\mathcal{N}}_x(H, y_{\beta_t})\) for t = 1, …, NB, 1, … Each time an improved solution is found, we update H and further refine it through function \(REOPT(S, y^H)\). The procedure is stopped when there is no hope of finding an improved solution, i.e., after NB consecutive optimizations that do not improve the current H.

A substantial speedup can be obtained by heuristically imposing a tight upper bound on the x variables, so as to reduce the number L + 1 of binary variables v bsl in the binary expansion (34). An aggressive policy (e.g., imposing x bs ≤ 1) is however rather risky as the optimal solution could be cut off; hence, the artificial bounds must be relaxed if an improved solution cannot be found.

Working with a Subset of Stores

Our basic constructive heuristic is based on the observation that removing stores can produce a substantially easier model. Note that a solution H′ = (x′, y′) with a subset of stores can easily be converted into a solution H = (x, y) of the whole problem by just invoking function REOPT(S, y′).

In our implementation, we first define a store permutation \(s_1, \ldots, s_{|S|}\) according to a certain criterion (to be discussed later). We then address, in sequence, the subproblem with store set \(S_t = \{s_1, \ldots, s_t\}\) for t = 0, …, |S|.

For t = 0, store set \(S_0\) is empty, and our MIP model just produces a y solution with random (possibly repeated) configurations.

For each subsequent t, we start with the best solution \(H_{t-1} = (x^{t-1}, y^{t-1})\) of the previous iteration and convert it into a solution \(H_t = (x^t, y^t)\) of the current subproblem through the refining function \(REOPT(S_t, y^{t-1})\). Then we apply the refinement heuristics described in the previous subsection to \(H_t\), reoptimizing one configuration at a time in a circular fashion. To reduce computing time, this latter step can be skipped with a certain frequency—except of course in the very last step, when \(S_t = S\).

Each time a solution \(H_t = (x^t, y^t)\) is found, we quickly compute a solution H = (x, y) of the overall problem through function \(REOPT(S, y^t)\) and update the overall incumbent, in which all stores are active.

As to store sequence s 1, …, s |S|, we have implemented three different strategies to define it. For each store pair a, b, let the dissimilarity index dist(a, b) be defined as the distance between the two demand vectors (r ia : i ∈ I) and (r ib : i ∈ I).

  • random: The sequence is just a random permutation of the integers 1, …, |S|;

  • most_dissimilar: We first compute the two most dissimilar stores (a, b), i.e., such that a < b and dist(a, b) is maximum, and initialize \(s_1 = a\). Then, for t = 2, …, |S|, we define \(S' = \{s_1, \ldots, s_{t-1}\}\) and let

    $$ s_t = \text{argmax}_{a\in S\setminus S'} \, \min\{ dist(a,b): b\in S' \} $$

  • most_similar: This is just the same procedure as in the previous item, with the max and min operators reverted.

The rationale behind the most_dissimilar policy is to attack first a “core problem” defined by the pairwise most dissimilar stores (those at the beginning of the sequence). The assumption here is that our method performs better in its first iterations (small values of t), as the size of the subproblem is smaller and we have plenty of configurations to accommodate the initial requests. The “similar stores” are therefore addressed only at a later time, in the hope that the configurations already found will work well for them.

A risk with the above policy is that the core problem soon becomes too difficult for our simple refining heuristic, so the current solution is not updated after the very first iterations. In this respect, policy most_similar is more conservative: taking for granted that we proceed by successively refining a previous solution with one less store, it seems reasonable not to inject too much innovation in a single iteration—as most_dissimilar does when it adds a new store whose demands are very different from the previous ones.
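
The two greedy sequencing policies can be sketched as follows (plain Python; dist is the dissimilarity index defined above, and the store identifiers are assumed to be comparable):

```python
def store_sequence(stores, dist, policy="most_dissimilar"):
    """Greedy store permutation: seed with one store of the most dissimilar
    (resp. most similar) pair, then repeatedly append the store maximizing
    (resp. minimizing) its min (resp. max) distance to the sequenced set."""
    outer = max if policy == "most_dissimilar" else min
    inner = min if policy == "most_dissimilar" else max
    pairs = [(a, b) for a in stores for b in stores if a < b]
    a, _ = outer(pairs, key=lambda p: dist(p[0], p[1]))   # seed store s_1
    seq, rest = [a], set(stores) - {a}
    while rest:
        nxt = outer(rest, key=lambda s: inner(dist(s, t) for t in seq))
        seq.append(nxt)
        rest.remove(nxt)
    return seq
```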

Computational Experiments

The heuristics described in the previous section have been implemented in C language. IBM ILOG CPLEX 12.6 [21] was used as MIP solver. Reported computing times are CPU seconds on an Intel Xeon E3-1220 V2 quad-core PC with 16GB of RAM. For each run, a time limit of 900 s (15 min) was imposed.

Four heuristic methods have been compared: the three constructive heuristics random, most_dissimilar, and most_similar of section “Working with a Subset of Stores,” plus a fourth method, fast_heu, a refinement heuristic based on the neighborhoods described in the previous subsections.

All heuristics are used in a multi-start mode, i.e., after completion they are restarted from scratch until the time limit is exceeded. At each restart, the internal random seed is not reset; hence, all methods natively using a random permutation (namely, fast_heu and random) will follow a different search path at each run, as the permutations will be different. As to most_dissimilar and most_similar, after each restart the sequence \(s_1, \ldots, s_{|S|}\) is slightly perturbed by five random pair swaps. In addition, after each restart the CPLEX random seed is changed so as to inject diversification among runs even within the MIP solver.

Due to their heuristic nature, our methods—though deterministic—exhibit a large dependency on the initial conditions, including the random seeds used both within our code and in CPLEX. We therefore repeated several times each experiment, starting with different (internal/CPLEX) random seeds at each run, and also report average figures.

In case all capacities are even (as is the case in our test bed), we compute the following trivial lower bound based on constraint (41)

$$ LB := \min\{\alpha, \beta\} \cdot \left| \left\{s \in S: \sum_{i\in I} r_{is} \mbox{ is odd} \right\} \right| $$
(42)

and abort the execution as soon as we find a solution whose value meets the lower bound.
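
Computing (42) is straightforward (plain Python sketch; the nested layout r[s][i], mapping each store to its demand vector, is an illustrative assumption):

```python
def parity_lower_bound(r, alpha, beta):
    """Trivial lower bound (42), valid when all bin capacities are even:
    every store whose total demand is odd pays at least min(alpha, beta)."""
    odd_stores = sum(1 for s in r if sum(r[s].values()) % 2 == 1)
    return min(alpha, beta) * odd_stores
```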

Test Bed

Our test bed coincides with the benchmark proposed in [20] and contains a number of subinstances of a real-world problem (named AllColor58) with 58 stores that require 24 (= 6 × 4) different items: T-shirts available in six different sizes and four different colors (black, blue, red, and green). The available box capacities are 4, 6, 8, and 10. Finally, each item has a given overstock limit (0 or 1) for all stores but no understock limits, and the overstock and understock penalties are β = 1 and α = 10, respectively.

Note that our testing environment is identical to that used in [20] (assuming the PCs used are substantially equivalent), so our results can fairly be benchmarked against those reported therein.

Comparison Metric

To better compare the performance of our different heuristics, we also make use of an indicator recently proposed in [1, 2] and aimed at measuring the trade-off between the computational effort required to produce a solution and the quality of the solution itself. In particular, let \(\tilde{z}_{opt}\) denote the optimal solution value, and let z(t) be the value of the best heuristic solution found up to time t. Then, a primal gap function p can be computed as

$$\displaystyle \begin{aligned} p(t) = \left\{\begin{array}{ll} 1 & \mbox{if no incumbent found until time {$t$}} \\ \gamma(z(t)) & \mbox{otherwise} \end{array} \right. \end{aligned} $$
(43)

where γ(⋅) ∈ [0, 1] is the primal gap, defined as follows

$$\displaystyle \begin{aligned} \gamma(z) = \left\{\begin{array}{ll} 0 & \mbox{if {$|\tilde{z}_{opt}| = |z| = 0$},} \\ 1 & \mbox{if {$\tilde{z}_{opt} \cdot z < 0$},} \\ \frac{z - \tilde{z}_{opt}}{\max\{|\tilde{z}_{opt}|, |z| \}} & \mbox{otherwise.} \end{array} \right. \end{aligned} $$
(44)

Finally, the primal integral of a run until time \(t_{\max }\) is defined as

$$\displaystyle \begin{aligned} P(t_{\max}) = \frac{\int_{0}^{t_{\max}} p(t) \, \mathrm{d} t}{t_{\max}} {}\end{aligned} $$
(45)

and is actually used to measure the quality of primal heuristics—the smaller \(P(t_{\max })\), the better the expected quality of the incumbent solution if we stopped computation at an arbitrary time before \(t_{\max }\).
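
For reference, the primal integral (45) can be computed from the incumbent trace of a run as follows (plain Python sketch; events is an assumed list of (time, incumbent value) pairs sorted by time):

```python
def primal_integral(events, z_opt, t_max, eps=1e-12):
    """Primal integral (45): integrate the step function p(t) of (43)-(44),
    with p(t) = 1 before the first incumbent is found."""
    def gap(z):                                   # primal gap (44)
        if abs(z_opt) < eps and abs(z) < eps:
            return 0.0
        if z_opt * z < 0:
            return 1.0
        return abs(z - z_opt) / max(abs(z_opt), abs(z))
    total, t_prev, p_prev = 0.0, 0.0, 1.0
    for t, z in events:
        if t > t_max:
            break
        total += p_prev * (t - t_prev)            # p is constant between events
        t_prev, p_prev = t, gap(z)
    total += p_prev * (t_max - t_prev)            # tail up to t_max
    return total / t_max
```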

Computational Results

We addressed the instances provided in [20], with the aim of benchmarking our matheuristics against the methods therein proposed. Results for the easiest cases involving only NB = 4 box configurations (namely, instances Black58, Blue58, Red58, and Green58) are not reported as the corresponding MIP model can be solved to proven optimality in less than one second by our solver—thus confirming the figures given in [20].

Tables 2 and 3 report the performance of various heuristics in terms of solution value and time and refer to a single run for each heuristic and for each instance.

Table 2 Performance of CPLEX and LNS heuristics from [20]. Single run for each instance. Times in CPU seconds (time limit of 900 s)
Table 3 Performance of matheuristics. Single run for each instance. Times in CPU seconds (time limit of 900 s)

Table 2 is taken from [20], where a two-phase hybrid metaheuristic was proposed. In the first phase, the approach uses a memetic algorithm to explore the solution space and builds a pool of interesting box configurations. In the second phase, a box-to-store assignment problem is solved to choose a subset of configurations from the pool—and to decide how many boxes of each configuration should be sent to each store. The box-to-store assignment problem is modeled as a (very hard in practice) MIP and heuristically solved either by a commercial solver (CPLEX) or by a sophisticated large-neighborhood search (LNS) approach.

Table 3 reports the performance of our four matheuristics, as well as the lower bound value LB computed through (42)—this latter value turned out to coincide with the optimal value for all instances under consideration in the present subsection.

Comparing Tables 2 and 3 shows that matheuristics outperform the LNS heuristics analyzed in [20]. In particular, fast_heu is able to find very good solutions (actually, an optimal one in 3 out of 4 cases) within very short computing times. For the largest instance (AllColor58), however, most_similar qualifies as the best heuristic both in terms of quality and speed.

To get more reliable information about the matheuristics’ performance, we ran them 100 times for each instance, with different random seeds, and took detailed statistics on each run. Table 4 reports, for each instance and for each heuristic, the average completion time (time), the average time to find its best solution (time_best), the primal integral after 900 s (pint, the lower the better), and the number of provably optimal solutions found (#opt) out of the 100 runs. Note that, for all instances, a solution matching the simple lower bound (42) was eventually found by at least one of our heuristics. The 100-run statistics confirm that fast_heu is very effective in all cases, though it is outperformed by most_similar on the largest instance AllColor58 with respect to the #opt criterion. The results suggest that a hybrid method running fast_heu and most_similar (possibly in parallel) qualifies as a robust heuristic with a very good performance on all instances.

Table 4 Average performance (out of 100 runs) of our heuristics

Application 3: Vehicle Routing

In this section we address the NP-hard distance-constrained capacitated vehicle routing problem (DCVRP) that can be defined as follows. We are given a central depot and a set of n − 1 customers, which are associated with the nodes of a complete undirected graph G = (V, E) where |V | = n, node 1 representing the depot. Each edge [i, j] ∈ E has an associated finite cost c ij ≥ 0. Each node j ∈ V has a request d j ≥ 0 (d 1 = 0 for depot node 1). Customers need to be served by k cycles (routes) passing through the depot, where k is fixed in advance. Each route must have a total duration (computed as the sum of the edge costs in the route) not exceeding a given limit D and can visit a subset S of customers whose total request ∑jS d j does not exceed a given capacity C. The problem then consists of finding a feasible solution covering exactly once all the nodes v ∈ V ∖{1} and having a minimum overall cost; see, e.g., [3, 31].

We will next outline the refinement matheuristic for DCVRP proposed in [15].

The ASSIGN Neighborhood for TSP

Sarvanov and Doroshko (SD) investigated in [28] the so-called ASSIGN neighborhood for the pure traveling salesman problem (TSP), i.e., for the problem of finding a min-cost Hamiltonian cycle in a graph. Given a certain TSP solution (viewed as the node sequence \(\langle v_1 = 1, v_2, \cdots, v_n \rangle\)), the neighborhood contains all the ⌊n∕2⌋! TSP solutions that can be obtained by permuting, in any possible way, the nodes in even position in the original sequence. In other words, any solution \((\psi_1, \psi_2, \cdots, \psi_n)\) in the neighborhood is such that \(\psi_i = v_i\) for all odd i. An interesting feature of the neighborhood is that it can be explored exactly in polynomial time, though it contains an exponential number of solutions. Indeed, for any given starting solution, the min-cost TSP solution in the corresponding ASSIGN neighborhood can be found efficiently by solving a min-cost assignment problem on a ⌊n∕2⌋×⌊n∕2⌋ cost matrix; see, e.g., [18]. Starting from a given solution, an improving heuristic then consists of exploring the ASSIGN neighborhood according to the following two phases:

  • node extraction, during which the nodes in even position (w.r.t. the current solution) are removed from the tour, thus leaving an equal number of “free holes” in the sequence;

  • node reinsertion, during which the removed nodes are reallocated to the available holes in an optimal way by solving a min-sum assignment problem.

The simple example in Fig. 1 illustrates the kind of “improving moves” involved in the method. The figure shows part of a tour, corresponding to the node sequence \(\langle v_1, v_2, \cdots, v_9 \rangle\). In the node extraction phase, the nodes in even position \(v_2\), \(v_4\), \(v_6\), and \(v_8\) are removed from the sequence, whereas all the other nodes retain their position. In Fig. 1b the black nodes represent the fixed ones, while the holes left by the extracted nodes are represented as white circles. If we use the symbol “−” to represent a free hole, the sequence corresponding to Fig. 1b is therefore \(\langle v_1, -, v_3, -, v_5, -, v_7, -, v_9 \rangle\). The second step of the procedure, i.e., the optimal node reallocation, is illustrated in Fig. 1c, where nodes \(v_4\) and \(v_6\) swap their positions, whereas \(v_2\) and \(v_8\) are reallocated as in the original sequence. This produces the improved partial tour \(\langle v_1, v_2, v_3, v_6, v_5, v_4, v_7, v_8, v_9 \rangle\).

Fig. 1 A simple example of node extraction and reallocation

In the example, the same final tour could have been obtained by a simple 2-opt move. However, in more realistic cases the number of possible reallocations is exponential in the number of extracted nodes; hence, the possible reallocation patterns are much more complex and allow, e.g., for a controlled worsening of some parts of the solution that is compensated by large improvements in other parts.
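
A compact sketch of the exact exploration of the ASSIGN neighborhood (Python with scipy; the nested cost matrix c and the tour encoding are illustrative assumptions) may help clarify the two phases:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def explore_assign_neighborhood(tour, c):
    """Extract the nodes in even position of the tour and reinsert them
    optimally into the free holes via a min-cost assignment: each hole's
    cost only involves its two fixed neighbors, so the problem decomposes
    into an assignment on a floor(n/2) x floor(n/2) matrix."""
    n = len(tour)
    holes = list(range(1, n, 2))              # 0-based indices of even positions
    extracted = [tour[p] for p in holes]
    # cost of placing node v in hole p = edges to the two fixed neighbors
    cost = np.array([[c[tour[p - 1]][v] + c[tour[(p + 1) % n]][v] for p in holes]
                     for v in extracted])
    rows, cols = linear_sum_assignment(cost)  # optimal node-to-hole matching
    new_tour = tour[:]
    for r, k in zip(rows, cols):
        new_tour[holes[k]] = extracted[r]
    return new_tour
```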

From TSP to DCVRP

One can conjecture that the ASSIGN neighborhood works well when applied to VRP problems. Indeed, due to the presence of several routes and of the associated route constraints, in VRP problems the node sequence is not the only issue to be considered when constructing a good solution: an equally important aspect of the optimization is to find a balanced distribution of the nodes among the routes. In this respect, heuristic refinement procedures involving complex patterns of node reallocations among the routes are likely to be quite effective in practice.

We can therefore extend the SD method to DCVRP so as to allow for more powerful move patterns, while generalizing its basic scheme so as to replace the overly simple min-sum assignment model for node reallocation with a more flexible reallocation model based on the (heuristic) solution of a more complex MIP. The resulting matheuristic will be introduced, step by step, with the help of some illustrative examples.

Let us consider Fig. 2, where a nonoptimal part of a VRP solution is depicted. It is clear that the position of node \(v_3\) is not very clever, in that inserting \(v_3\) between nodes \(v_1\) and \(v_2\) is likely to produce a better solution (assuming this new solution is not infeasible because of the route constraints). Even if \(v_3\) were an even-position node, however, this move would be beyond the possibility of the pure SD method, where the extracted nodes can only be assigned to a hole left free by the removal of another node—while no hole between \(v_1\) and \(v_2\) exists which could accommodate \(v_3\). The example then suggests a first extension of the basic SD method, consisting of removing the 1-1 correspondence between extracted nodes and empty holes. We therefore consider the concept of insertion point: after having extracted the selected nodes, we construct a restricted solution through the remaining nodes, obtained from the original one by short-cutting the removed nodes. All the edges in the restricted solution are then viewed as potential insertion points for the extracted nodes. In the example, removing \(v_3\) but not \(v_1\) and \(v_2\) would produce the restricted solution depicted in Fig. 3a, where all dashed edges are possible insertion points for the extracted nodes—this allows the method to produce the solution in Fig. 3b.
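As a small illustration of the insertion-point idea, the sketch below builds the restricted route and its insertion points by short-cutting the extracted nodes; the list-based route encoding (depot first, routes treated as closed cycles) is an assumption of ours, not a prescription from [15].

```python
def restrict_route(route, extracted):
    """Short-cut the nodes in `extracted` out of `route` and return the
    restricted route together with its insertion points.

    `route` is a list of node indices with the depot first; the route
    is closed, so the edge back to the depot is an insertion point too.
    """
    restricted = [v for v in route if v not in extracted]
    insertion_points = [(restricted[k], restricted[(k + 1) % len(restricted)])
                        for k in range(len(restricted))]
    return restricted, insertion_points

# Example: extracting node 3 from the route 0-1-3-2 (0 = depot) leaves
# restrict_route([0, 1, 3, 2], {3}) == ([0, 1, 2], [(0, 1), (1, 2), (2, 0)]),
# so node 3 may now be reinserted into any of the three remaining edges.
```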

Fig. 2

The assignment of node \(v_3\) to route \(r_1\) is nonoptimal

Fig. 3

Improving the solution depicted in Fig. 2

A second important extension comes from a more flexible node extraction scheme that allows for the removal of sequences of nodes; see Fig. 4 for an illustration. Once the node sequences have been extracted, one can use a heuristic procedure to generate new sequences through the extracted nodes, to be allocated to different insertion points. To be more specific, starting from the extracted node sequences, one can create new derived sequences that combine the extracted nodes in a different way and consider the possibility of assigning each derived sequence to a different insertion point. Of course, one never knows in advance which sequences are best, so all the (original and derived) sequences should be available when solving the reallocation problem.

Fig. 4

Removing a sequence of nodes to better reallocate them (possibly in a different order)

The above considerations call for a reallocation model which goes far beyond the scope of the original one, based on the solution of an easy min-cost assignment problem. Indeed, the new reallocation model becomes a MIP that receives as input the set of insertion points along with a (typically large) set of node sequences through the extracted nodes, and provides an (almost) optimal allocation of at most one sequence to each insertion point, with the constraint that each extracted node has to belong to one of the allocated sequences, while fulfilling the capacity and distance constraints on the routes. This model will be described in more detail in the next section.

The Overall Matheuristic

Here is a possible implementation of the ideas outlined in the previous section, leading to the so-called selection, extraction, recombination, and reallocation (SERR) matheuristic.

  (i) (Initialization). Apply a fast heuristic method to find a first (possibly infeasible, see below) DCVRP solution.

  (ii) (Selection). Apply one of the available criteria (to be described later) to determine the nodes to be extracted—the nodes need not be consecutive, and any node subset qualifies as a valid choice.

  (iii) (Extraction). Extract the nodes selected in the previous step and construct the corresponding restricted DCVRP solution obtained by short-cutting the extracted nodes. All edges in the restricted solution are put in the list \(\mathcal {I}\) of the available insertion points.

  (iv) (Recombination). The node sequences extracted in the previous step (called basic in the sequel) are initially stored in a sequence pool. Thereafter, heuristic procedures (to be described later) are applied to derive new sequences through the extracted nodes, which are added to the sequence pool. During this phase, dual information derived from the LP relaxation of the reallocation model can be used to find new profitable sequences—the so-called pricing step. Each sequence s in the final pool is then associated with a (heuristically determined) subset \(\mathcal {I}_s\) of the available insertion points in \(\mathcal {I}\). For each basic sequence s, we assume that \(\mathcal {I}_s\) contains (among others) the pivot insertion point associated with s in the original tour, so that the original solution can be retrieved by simply reallocating each basic sequence to its pivot insertion point.

  (v) (Reallocation). A suitable MIP (to be described later in greater detail) is set up and solved heuristically through a general-purpose MIP solver. This model has a binary variable \(x_{si}\) for each pair (s, i), where s is a sequence in the pool and \(i\in \mathcal {I}_s\), whose value 1 means that s has to be allocated to i. The constraints in the MIP stipulate that each extracted node has to be covered by exactly one of the selected sequences, while each insertion point i can be associated with at most one sequence. Further constraints impose the capacity and distance restrictions on each route. Once an (almost) optimal MIP solution has been found, the corresponding DCVRP solution is constructed and the current best solution is possibly updated (in which case each route in the new solution is processed by a 3-opt [26] exchange heuristic in an attempt to improve it further).

  (vi) (Termination). If at least one improved DCVRP solution has been found in the last n iterations, we repeat from step (ii); otherwise the method terminates.
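Putting steps (i)–(vi) together, the loop below sketches one possible SERR driver. Every helper function (select_nodes, extract, recombine, solve_reallocation, cost, three_opt) is a hypothetical placeholder for the corresponding component described above and detailed in the following subsections; the stall counter mirrors the termination rule of step (vi).

```python
import random

def serr(instance, first_solution, n_stall):
    """High-level sketch of the SERR loop; all helpers below are
    hypothetical placeholders for the steps described in the text."""
    best = current = first_solution                        # step (i)
    stall = 0
    while stall < n_stall:                                 # step (vi)
        scheme = random.choice(
            ["random-alternate", "scattered", "neighborhood"])
        extracted = select_nodes(current, scheme)          # step (ii)
        restricted, ins_pts = extract(current, extracted)  # step (iii)
        pool = recombine(extracted, ins_pts, instance)     # step (iv)
        candidate = solve_reallocation(pool, restricted, instance)  # step (v)
        if candidate is not None and cost(candidate) < cost(best):
            best = three_opt(candidate, instance)          # per-route 3-opt polish
            current, stall = best, 0
        else:
            stall += 1
    return best
```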

Finding a Starting Solution

Finding a DCVRP solution that is guaranteed to be feasible is an NP-hard problem; hence, we have to content ourselves with the construction of solutions that, in some hard cases, may be infeasible—typically because the total-distance-traveled constraint is violated for some routes. In this case, the overall infeasibility of the starting solution can hopefully be driven to zero by a modification of the SERR reallocation model where the capacity and distance constraints are treated in a soft way through the introduction of highly penalized slack variables.

As is customary in VRP problems, we assume that each node is assigned a coordinate pair (x, y) giving the geographical position of the corresponding customer/depot on a two-dimensional map.

One option for the initialization of the current solution required at step (i) of the SERR method is to apply the classical two-phase method of Fisher and Jaikumar (FJ) [14]. This method can be implemented in a very natural way in our context in that it is based on a (heuristic) solution of a MIP whose structure is close to that of the reallocation model.

According to computational experience, however, the solution provided by the FJ heuristic is sometimes “too balanced,” in the sense that the routes are filled so tightly that not enough freedom is left for the subsequent steps of the SERR procedure. Better results are sometimes obtained by starting from a less-optimized solution whose cost significantly exceeds the optimal one, such as the one obtained by a simplified SWEEP method [16].

A second possibility is instead to start from an extremely good solution provided by a highly effective (and time-consuming) heuristic or metaheuristic, in an attempt to improve this solution even further.

Node Selection Criteria

At each execution of step (ii), one of the following selection schemes is applied.

  • scheme Random-Alternate: This criterion is akin to the SD one and selects, in some randomly chosen routes, all the nodes in even position, while in the remaining routes the extracted nodes are those in odd position—the position parity being determined by visiting each route in a random direction.

  • scheme Scattered: Each node has a uniform probability of 50% of being extracted; this scheme allows for the removal of consecutive nodes, i.e., of route subsequences.

  • scheme Neighborhood: Here one concentrates on a seed node, say \(v^*\), and removes each node v with a probability that is inversely proportional to the distance \(c_{vv^*}\) of v from \(v^*\).

Schemes Random-Alternate and Scattered appear particularly suited to improving the first solutions, whereas the Neighborhood scheme seems more appropriate for the solutions available after the first iterations.
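For concreteness, the three schemes could be implemented along the following lines; the route encoding (lists with the depot at position 0) and the distance scaling used in the Neighborhood scheme are assumptions of this sketch, not details fixed by [15].

```python
import random

def random_alternate(routes):
    """Random-Alternate: per route, extract either the even- or the
    odd-position nodes, the parity being picked at random (the random
    visiting direction of each route is omitted for brevity)."""
    out = set()
    for r in routes:
        out |= set(r[random.choice((1, 2))::2])  # r[0] is the depot
    return out

def scattered(routes):
    """Scattered: every customer is extracted with probability 0.5."""
    return {v for r in routes for v in r[1:] if random.random() < 0.5}

def neighborhood(routes, c, seed):
    """Neighborhood: extract node v with probability inversely
    proportional to c[v][seed], scaled so that the node closest to the
    seed has probability 1 (the scaling is our own assumption)."""
    nodes = [v for r in routes for v in r[1:] if v != seed]
    dmin = min(c[v][seed] for v in nodes)
    return {seed} | {v for v in nodes if random.random() < dmin / c[v][seed]}
```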

Reallocation Model

Given the sequences stored in the pool and the associated insertion points (defined through the heuristics outlined in the next subsection), our aim is to reallocate the sequences so as to find a feasible solution of improved cost (if any). To this end, we need to introduce some additional notation.

Let \(\mathcal {F}\) denote the set of the extracted nodes, \(\mathcal {S}\) the sequence pool, and \(\mathcal {R}\) the set of routes r in the restricted solution. For any sequence \(s \in \mathcal {S}\), let c(s) be the sum of the costs of the edges in the sequence, and let d(s) be the sum of the requests \(d_j\) associated with the internal nodes of s.

For each insertion point \(i\in \mathcal {I}\), we define the extra-cost \(\gamma_{si}\) for assigning sequence s (in its best possible orientation) to the insertion point i. For each route \(r\in \mathcal {R}\) in the restricted solution, let \(\mathcal {I}(r)\) denote the set of the insertion points (i.e., edges) associated with r, and let \(\tilde {d}(r)\) and \(\tilde {c}(r)\) denote, respectively, the total request and distance computed for route r—still in the restricted tour. As already mentioned, our MIP model is based on the following decision variables:

$$\displaystyle \begin{aligned} x_{si}=\begin{cases} 1 & \mbox{if sequence } s \mbox{ is allocated to insertion point } i\in \mathcal{I}_s\\ 0 & \mbox{otherwise}\end{cases}{} \end{aligned} $$
(46)

The model then reads:

$$\displaystyle \begin{aligned} \min\; \sum_{r \in \mathcal{R}} \tilde{c}(r) + \sum_{s\in\mathcal{S}}\sum_{i\in \mathcal{I}_s} \gamma_{si} x_{si} {} \end{aligned} $$
(47)

subject to:

$$\displaystyle \begin{gathered} {\displaystyle \sum_{s\in\mathcal{S}:\, v\in s}\;\sum_{i\in\mathcal{I}_s}x_{si}=1\,\,\,\forall v\in\mathcal{F}}{} \end{gathered} $$
(48)
$$\displaystyle \begin{gathered} {\displaystyle \sum_{s\in\mathcal{S}: i \in \mathcal{I}_s}x_{si}\le1}\,\,\,\forall i\in\mathcal{I}{} \end{gathered} $$
(49)
$$\displaystyle \begin{aligned} {\displaystyle \tilde d(r)+\sum_{s\in\mathcal{S}}\sum_{i\in \mathcal{I}_s \cap \mathcal{I}(r)}d(s)x_{si}\le C ~~~~ \forall r\in\mathcal{R}}{} \end{aligned} $$
(50)
$$\displaystyle \begin{aligned} {\displaystyle \tilde{c}(r)+\sum_{s\in\mathcal{S}}\sum_{i\in \mathcal{I}_s \cap \mathcal{I}(r)}\gamma_{si} x_{si}\le D ~~~~ \forall r\in\mathcal{R}}{} \end{aligned} $$
(51)
$$\displaystyle \begin{aligned} x_{si}\in\{0,1\} ~~~~ \forall s\in\mathcal{S}, \ i\in\mathcal{I}_s{} \end{aligned} $$
(52)

The objective function, to be minimized, gives the cost of the final DCVRP solution. Indeed, each objective coefficient gives the MIP cost of an inserted sequence, including the linking cost, minus the cost of the “saved” edge in the restricted solution. Constraints (48) impose that each extracted node belongs to exactly one of the selected sequences, i.e., that it is covered exactly once in the final solution. Note that, in the case of triangular costs, one could replace = by ≥ in (48), thus obtaining a typically easier-to-solve MIP having the structure of a set-covering (instead of set-partitioning) problem with side constraints. Constraints (49) prevent the same insertion point from being used to allocate two or more sequences. Finally, constraints (50) and (51) impose that each route in the final solution fulfills the capacity and distance restrictions, respectively.

To avoid overloading the model with an excessive number of variables, particular attention has to be paid to reducing the number of sequences and, for each sequence, the number of associated insertion points.
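As an illustration, model (46), (47), (48), (49), (50), (51), and (52) could be stated with an open-source modeler such as PuLP as follows. This is a sketch, not the implementation of [15] (which relied on ILOG Cplex): all data containers are assumed to be prepared beforehand, and sequences are assumed to be tuples of nodes so that membership tests work.

```python
import pulp

def reallocation_mip(pool, routes, gamma, d_seq, d_tilde, c_tilde,
                     extracted, C, D):
    """Sketch of the reallocation model.

    pool: dict sequence s -> admissible insertion points I_s (sequences
    are tuples of nodes); routes: dict r -> set of insertion points
    I(r); gamma[s, i]: extra-cost of allocating s to i; d_seq[s]:
    demand of s; d_tilde[r], c_tilde[r]: demand/length of the
    restricted route r; extracted: set of removed nodes.
    """
    prob = pulp.LpProblem("reallocation", pulp.LpMinimize)
    keys = [(s, i) for s in pool for i in pool[s]]
    x = pulp.LpVariable.dicts("x", keys, cat="Binary")           # (46)/(52)
    # (47): the restricted-route costs sum(c_tilde) are a constant
    prob += pulp.lpSum(gamma[s, i] * x[s, i] for (s, i) in keys)
    for v in extracted:                                          # (48)
        prob += pulp.lpSum(x[s, i] for (s, i) in keys if v in s) == 1
    for i in {i for pts in pool.values() for i in pts}:          # (49)
        prob += pulp.lpSum(x[s, j] for (s, j) in keys if j == i) <= 1
    for r, I_r in routes.items():                                # (50)-(51)
        prob += d_tilde[r] + pulp.lpSum(
            d_seq[s] * x[s, i] for (s, i) in keys if i in I_r) <= C
        prob += c_tilde[r] + pulp.lpSum(
            gamma[s, i] * x[s, i] for (s, i) in keys if i in I_r) <= D
    prob.solve()
    return {(s, i) for (s, i) in keys if x[s, i].value() > 0.5}
```

The soft variant mentioned earlier for infeasible starting solutions would amount to adding nonnegative slack variables to the last two constraint families and penalizing them heavily in the objective.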

Node Recombination and Construction of Derived Sequences

This is a crucial step of the SERR method. It consists not just of generating new “good” sequences through the extracted nodes but also of associating each sequence with a clever set of possible insertion points that can conveniently accommodate it. There are therefore two complementary approaches to attack this problem: (a) start from the insertion points and, for each insertion point, try to construct a reasonable number of new sequences which are likely to “fit well” or (b) start from the extracted nodes and try to construct new sequences of small cost, regardless of the position of the insertion points. The following two-phase method turned out to be a good strategy in practice.

In the first phase, the sequence pool is initialized by means of the original (basic) sequences, and each of them is associated with its corresponding (pivot) insertion point. This choice guarantees that the current DCVRP solution can be reconstructed by simply selecting all the basic sequences and putting them back in their pivot insertion points. Moreover, when the Neighborhood selection scheme is used, a further set of sequences is generated as follows. Let \(v^*\) be the extracted seed node, and let \(N(v^*)\) contain \(v^*\) plus the, say, k closest extracted nodes (k = 4 in our implementation). A complete enumerative scheme is applied to generate all the sequences through \(N(v^*)\), which are added to the pool; a possible rendering of this enumeration is sketched below. This choice is intended to increase the chances of locally improving the current solution, by exploiting appropriate sequences to reallocate the nodes in \(N(v^*)\) in an optimal way.
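The sketch assumes, as before, that sequences are stored as tuples of nodes and that c is the distance matrix; the function name is hypothetical.

```python
from itertools import permutations

def enumerate_local_sequences(seed, extracted, c, k=4):
    """All sequences through N(seed) = seed plus its k closest extracted
    nodes; with k = 4 this yields 5! = 120 sequences for the pool."""
    others = sorted((v for v in extracted if v != seed),
                    key=lambda v: c[v][seed])
    N = [seed] + others[:k]
    return [tuple(p) for p in permutations(N)]
```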

The second phase is only applied for the Neighborhood and Scattered selection schemes and corresponds to a pricing loop based on the dual information available after solving the LP relaxation of the current reallocation model. The reader is again referred to [15] for details.

Examples

Extensive computational results for the SERR matheuristic have been reported in [15] and are omitted here for reasons of space. We only report, in Figs. 5 and 6, a plot of the incumbent SERR solution over time for some instances from the literature. Computing time is given in CPU seconds of an old AMD Athlon XP 2400+ PC with 1 GByte RAM, using ILOG Cplex 8.0 as MIP solver. The figures show that SERR is able to significantly improve the starting solution in the early part of its computation.

Fig. 5

Time evolution of the SERR solution for various CVRP instances, with FJ [14] initial solution

Fig. 6

Time evolution of the SERR solution for various CVRP instances, with SWEEP [16] initial solution

Conclusion

The contamination of metaheuristics with mathematical programming leads to the concept of “matheuristics.” The result is a general approach to designing mathematically sound heuristics. In this chapter we presented the main ideas underlying matheuristics and used some case studies to illustrate them. For each application, we described the specific problem at hand, the mathematical programming model that formalizes it, and the way the model—or a simplification of it—can be used to produce heuristic (as opposed to exact) solutions in an effective way.

Cross-References