1 Introduction

In multiobjective optimization, the goal is to generate a set of solutions that induces the nondominated frontier (NDF), also known as the Pareto front (Pareto 1996). The NDF is the set of nondominated points (NDPs), where an NDP is a vector of objective values, evaluated at a feasible solution, with the property that there exists no other feasible solution that is at least as good in all objective values and better in at least one of them. This paper focuses on multiobjective mixed integer linear programs (MOMIPs) defined by continuous and integer variables, linear constraints, and two linear objective functions, and on exact algorithms that produce the complete NDF.

Multiobjective integer linear programming problems have been studied for many decades; see, for example, the surveys of Ehrgott and Gandibleux (2000), Gandibleux (2006), Stidsen et al. (2014) and Halffmann et al. (2022). In this paper, we restrict our discussion to criterion space search methods, in which the search for the NDF operates in the space of the vectors of objective values, known as the criterion space. These methods benefit from advances in single-objective mathematical programming solvers, since single-objective linear programs (LPs) and integer linear programs (IPs) are repeatedly solved.

To date, there is a rich history of criterion space search algorithms for generating the NDF of biobjective (pure) IPs (Chankong et al. 2008; Ralphs et al. 2004; Boland et al. 2015a; Dai and Charkhgard 2018) and of problems with three or more objectives (Sylva and Crema 2004; Przybylski et al. 2010b; Kirlik and Sayın 2014; Klamroth et al. 2015; Dächert and Klamroth 2015; Boland et al. 2016, 2017; Tamby and Vanderpooten 2020). The mixed integer case poses quite different challenges. One challenge is that MOMIP frontiers can have a complex structure: the frontier of a biobjective mixed integer program (BOMIP), for example, can contain open, half-open, and closed line segments, in addition to isolated points (see Fig. 1 for an NDF of a BOMIP). These open, half-open, and closed line segments introduce many numerical challenges in the implementation of algorithms for generating the NDF of a BOMIP. In the last decade, several computationally effective criterion space algorithms have appeared for BOMIPs (Boland et al. 2015b; Soylu and Yıldız 2016; Fattahi and Turkay 2018; Soylu 2018; Perini et al. 2019; Emre 2020).

Recent work in MOMIPs includes generalizing to three or more objectives (Rasmi et al. 2017; Rasmi and Türkay 2019; Pettersson and Ozlen 2019; Ceyhan et al. 2023) and/or nonlinear objective functions (Cabrera-Guerrero et al. 2021; Diessel 2022; Eichfelder et al. 2023b).

Two-phase algorithms have played an important role in solving biobjective (pure) IPs; see, for example, Przybylski et al. (2010b) and Dai and Charkhgard (2018). The computational improvement of our two-stage approach (over the component methods) aligns with the recent two-stage application of the balanced box method and the \(\epsilon \)-constraint method (Dai and Charkhgard 2018).

Our work also contributes to expanding the library of BOMIP test instances; the generation of test instances for MOMIPs has only recently gained deeper attention (Eichfelder 2023a).

Fig. 1 The nondominated frontier of a BOMIP with minimization objectives. The shaded regions represent the images of feasible solutions with a common integer part. The nondominated frontier is darkened

In this paper, we offer the following contributions:

1. We extend and merge ideas from two BOMIP algorithms, namely the \(\epsilon \)-Tabu Method (Soylu and Yıldız 2016) and the Boxed Line Method (Perini et al. 2019). The result is a two-stage hybridization that is fast and, importantly, robust for generating the NDF of a BOMIP; it is robust in the sense that its two hyperparameters may be tuned so that it works well across a wide range of instances with different characteristics.

2. We demonstrate that the algorithm can produce a high-quality subset of the NDF in a fraction of the time it takes to generate the complete NDF. We rigorously explore this approximation of the NDF of a BOMIP and propose metrics to assess its quality.

3. We augment the existing classes of test instances (from two pre-existing classes to four classes in total), which diversifies the set of test instances for BOMIP algorithms and supports our claim that the algorithm is robust to a wide range of structural properties of the NDF.

4. We publish the code for this algorithm on GitHub (Pecin et al. 2022), providing a widely applicable and highly efficient BOMIP algorithm with just two tunable hyperparameters.

The remainder of the paper is organized as follows. In Sect. 2, we introduce notation and key definitions, and review existing methods, including the \(\epsilon \)-Tabu Method and the Boxed Line Method. In Sect. 3, we propose enhancements to the \(\epsilon \)-Tabu Method and the Boxed Line Method that improve their performance and robustness. In Sect. 4, we present the two-stage algorithm that merges ideas from these two methods. In Sect. 5, we discuss how the algorithm can be used to quickly produce an approximation of the NDF, together with metrics that assess the quality of that approximation. In Sect. 6, we present the results of a thorough computational study (full details are included in the Appendices). We conclude in Sect. 7 with some final remarks.

2 Definitions and overview of component methods

We consider the biobjective mixed integer linear program (BOMIP)

$$\begin{aligned} \min _{x\in \mathcal {X}} \lbrace z(x):= (z_1(x), z_2(x)) \rbrace \end{aligned}$$
(1)

where \(z_1(x), z_2(x)\) are linear in x and the feasible region \(\mathcal {X}\subseteq \mathbb {Z}^{n_I} \times \mathbb {R}^{n_C}\) is assumed to be nonempty and bounded. To differentiate between the integer and continuous components of \(x\in \mathcal {X}\), we use the convention \(x=(x_I,x_C)\) where \(x_I\in \mathbb {Z}^{n_I}\) and \(x_C\in \mathbb {R}^{n_C}\). Let the projections of \(\mathcal {X}\) onto the set of integer and real vectors be defined as \(\mathcal {X}_I:= \{x_I\in \mathbb {Z}^{n_I}: (x_I,x_C)\in \mathcal {X},\ x_C\in \mathbb {R}^{n_C}\}\) and \(\mathcal {X}_C:= \{x_C\in \mathbb {R}^{n_C}: (x_I,x_C)\in \mathcal {X},\ x_I\in \mathbb {Z}^{n_I}\}\), respectively. The feasible region \(\mathcal {X}\) lies in the decision space, \(\mathbb {R}^{n_I+n_C}\). The image of \(\mathcal {X}\) under \(z(\cdot )\), denoted by \(\mathcal {Y}:=\lbrace y\in \mathbb {R}^2: y=z(x),\ x\in \mathcal {X} \rbrace \), lies in the criterion space, \(\mathbb {R}^2\).

For \(x^1,x^2 \in \mathcal {X}\), if \(z_i(x^1) \le z_i(x^2)\) for \(i=1,2\) and \(z(x^1) \ne z(x^2)\), then \(z(x^1)\) dominates \(z(x^2)\). If \(x^N\in \mathcal {X}\) and there does not exist \(x\in \mathcal {X}\) such that \(z(x)\) dominates \(z(x^N)\), then \(z(x^N)\) is a nondominated point (NDP) and \(x^N\) is efficient. The set of all NDPs is the nondominated frontier (NDF), which we denote by \(\mathcal {N}\). Hereon, we focus on biobjective optimization, since our methods depend on a natural ordering of points in the NDF.
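
For intuition, dominance between criterion-space points is a direct coordinate comparison. The following minimal Python sketch (illustrative only, not the paper's implementation; it handles finite point sets, whereas a BOMIP frontier can also contain line segments) filters a point set down to its NDPs.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (z1, z2) in the criterion space

def dominates(a: Point, b: Point) -> bool:
    """True if a dominates b: a is at least as good in both
    objectives (minimization) and the two images differ."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

def nondominated(points: List[Point]) -> List[Point]:
    """Filter a finite set of criterion-space points down to its NDPs.
    Sorting by z1 (ties broken by z2) lets a single left-to-right pass
    keep exactly the points whose z2 value strictly improves."""
    best_z2 = float("inf")
    ndf = []
    for p in sorted(set(points)):
        if p[1] < best_z2:
            ndf.append(p)
            best_z2 = p[1]
    return ndf
```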

A single NDP of (1) can be found by solving single-objective IPs over \(\mathcal {X}\), for example, via lexicographic optimization or a weighted scalarization. The lexicographic IP hierarchically minimizes two objectives in turn. We denote the case of minimizing \(z_1(x)\) and then \(z_2(x)\) by

$$\begin{aligned} \eta = \text {lexmin}\lbrace (z_1(x), z_2(x)):\ x\in \mathcal {X} \rbrace . \end{aligned}$$
(2)

Solving (2) requires solving two IPs in sequence: first \(\eta _1 = \text {min}\lbrace z_1(x):\ x\in \mathcal {X} \rbrace \) and then \(\eta _2 = \text {min}\lbrace z_2(x):\ z_1(x)\le \eta _1, \ x\in \mathcal {X} \rbrace \), resulting in an NDP \(\eta =(\eta _1,\eta _2)\) of (1). In practice, the second IP tends to solve very quickly. For a given vector \(\lambda \in \mathbb {R}^2_+\), we refer to

$$\begin{aligned} \min \lbrace \lambda ^T z(x):\ x\in \mathcal {X} \rbrace \end{aligned}$$
(3)

as the scalarized IP with respect to \(\lambda \) (Zadeh 1963). If \(x^\lambda \) is an optimal solution to (3) with positive \(\lambda \), then \(z(x^\lambda )\) is an NDP of (1) (Geoffrion 1968). Not every NDP in the NDF can be found by such an IP: if, for a given NDP \(z(x^N)\), there exists a positive vector \(\lambda \) such that \(x^N\) is an optimal solution to (3), then the NDP is supported; otherwise, the NDP is unsupported.
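
The control flow of these two scalarization devices can be written down generically. The sketch below is ours (not the published C++/CPLEX implementation), and `solve_ip` is a hypothetical wrapper around any single-objective IP solver over \(\mathcal {X}\).

```python
def lexmin(c1, c2, solve_ip):
    """Lexicographic IP (2): minimize c1'x over X, then minimize c2'x
    subject to c1'x <= eta1.  solve_ip(c, bounds) is a hypothetical
    helper that minimizes cost vector c over X with extra constraints
    [(d, rhs)] meaning d'x <= rhs; it returns (value, solution), or
    None if the restricted problem is infeasible."""
    first = solve_ip(c1, bounds=[])
    if first is None:
        return None                                  # X is empty
    eta1, _ = first
    eta2, x = solve_ip(c2, bounds=[(c1, eta1)])      # feasible by construction
    return (eta1, eta2), x

def weighted_sum(c1, c2, lam1, lam2, solve_ip):
    """Scalarized IP (3): for lam1, lam2 > 0, the image of any optimal
    solution is a supported NDP (Geoffrion 1968)."""
    c = [lam1 * a + lam2 * b for a, b in zip(c1, c2)]
    return solve_ip(c, bounds=[])
```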

The NDF can be described by nondominated line segments, vertical gaps, and horizontal gaps. Define \(L(z^1,z^2)\) to be the line segment connecting endpoints \(z^1,z^2\in \mathbb {R}^2\), where the endpoints are ordered from left to right so that \(z^1_1 \le z^2_1\). The line segment may be open at both ends, half-open, or closed. Thus

$$\begin{aligned} \{\xi z^1 + (1-\xi )z^2: 0< \xi < 1 \} \subseteq L(z^1,z^2) \subseteq \{\xi z^1 + (1-\xi )z^2: 0\le \xi \le 1 \}. \end{aligned}$$

For each \(i=1, 2\), we refer to the endpoint \(z^i\) of \(L(z^1,z^2)\) as closed if \(z^i\) belongs to the line segment, i.e., if \(z^i\in L(z^1, z^2)\), and as open otherwise. In the case that \(z^1=z^2\), the line segment \(L(z^1,z^2)\) consists of a single NDP. If all the points in \(L(z^1,z^2)\) are nondominated and \(L(z^1,z^2)\) is maximal with respect to set inclusion, then we call \(L(z^1,z^2)\) a nondominated line segment (NLS).

The NDF may be described as a finite sequence of NLSs, \(L(y^1,z^1), \dots ,\) \( L(y^K,z^K)\), say, for some \(K\ge 1\), with \(z^k_1 \le y^{k+1}_1\) and \(z^k_2 \ge y^{k+1}_2\) for all \(k=1, \dots , K-1\), and for which

$$\begin{aligned} y^1_1< y^2_1< \dots < y^K_1 \qquad \text {and}\qquad y^1_2> y^2_2> \dots > y^K_2. \end{aligned}$$

A gap may appear between the second endpoint of one NLS and the first endpoint of the next: for some k, it may be that \(z^k_1 < y^{k+1}_1\), a gap in the horizontal direction, and/or \(z^k_2 > y^{k+1}_2\), a gap in the vertical direction. In the case that \(z^k_1 < y^{k+1}_1\), we define the interval \((z^k_1, y^{k+1}_1)\subset \mathbb {R}\) to be a horizontal gap; then there exists no NDP p with \(p_1\in (z^k_1, y^{k+1}_1)\), and \(z^k\) must be an NDP and hence a closed endpoint. In the case that \(z^k_2 > y^{k+1}_2\), we define the interval \((y^{k+1}_2,z^k_2)\subset \mathbb {R}\) to be a vertical gap; then there exists no NDP p with \(p_2\in (y^{k+1}_2,z^k_2)\), and \(y^{k+1}\) must be an NDP and hence a closed endpoint. If there is a horizontal gap but no vertical gap between \(z^k\) and \(y^{k+1}\), then \(y^{k+1}\) must be an open endpoint, and if there is a vertical gap but no horizontal gap between \(z^k\) and \(y^{k+1}\), then \(z^k\) must be an open endpoint.
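
This description suggests a natural data representation. The sketch below (with naming of our own choosing, not taken from the paper's code) stores each NLS with open/closed endpoint flags and scans consecutive NLSs for the two kinds of gaps.

```python
from dataclasses import dataclass

@dataclass
class NLS:
    """A nondominated line segment L(y, z) with endpoints ordered so
    that y[0] <= z[0]; each endpoint is flagged open or closed."""
    y: tuple          # left endpoint (y1, y2)
    z: tuple          # right endpoint (z1, z2)
    y_closed: bool
    z_closed: bool

def gaps(ndf):
    """Scan consecutive NLSs (ordered left to right) and report the
    horizontal and vertical gaps between them."""
    out = []
    for prev, nxt in zip(ndf, ndf[1:]):
        if prev.z[0] < nxt.y[0]:   # horizontal gap: prev.z is closed
            out.append(("horizontal", (prev.z[0], nxt.y[0])))
        if prev.z[1] > nxt.y[1]:   # vertical gap: nxt.y is closed
            out.append(("vertical", (nxt.y[1], prev.z[1])))
    return out
```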

Given \(x_I\in \mathcal {X}_I\), the biobjective linear program (BOLP) obtained by fixing the integer variables to \(x_I\) is called the slice problem for \(x_I\) (Belotti et al. 2013). We call its NDF, which consists of a connected set of closed line segments, the slice for \(x_I\). Common methods for solving the BOLP slice problem, each of which returns the slice for one \(x_I\), include dichotomic search (Aneja and Nair 1979) (see “Appendix A.1” for reference) and the parametric simplex method (Yu and Zeleny 1975). The NDF of the BOMIP is contained in the union, over all \(x_I\in \mathcal {X}_I\), of the slices for \(x_I\); it consists of all points in this union that are not dominated by any other point in the union.

2.1 \(\epsilon \)-Tabu method

The \(\epsilon \)-Tabu Method (\(\epsilon \)TM) is an extension of the \(\epsilon \)-constraint method for biobjective IPs (Haimes 1971; Chankong et al. 2008): it adds no-good or “Tabu” constraints to handle continuous portions of the frontier, and generates a BOMIP frontier from left to right (or vice versa). An abbreviated pseudocode (Algorithm 1) is included in “Appendix A”; for more details, see Soylu and Yıldız (2016).

To initialize, \(\epsilon \)TM solves a lexicographic IP to find the upper left NDP of the frontier, \(z^L = \text {lexmin}\lbrace (z_1(x),z_2(x)): x\in \mathcal {X}\rbrace \). Next, \(\epsilon \)TM repeats two steps: (1) solve the slice problem for the current integer solution, and (2) check (sequentially from left to right) whether each line segment in the resulting slice is dominated, using a (modified) lexicographic IP. Once a line segment is found to be dominated, \(\epsilon \)TM returns to (1) with the solution corresponding to the dominating image.

The latter step involves searching for the leftmost NDP “below” the line segment. If such an NDP is found, then \(\epsilon \)TM updates the rightmost endpoint of the line segment using the vertical projection of the new NDP onto the line segment; the line segment from the leftmost endpoint to the projected point is an NLS. Then, \(\epsilon \)TM processes the slice to which the new NDP belongs. If, on the other hand, no such NDP is found, then the full line segment is an NLS. Hence, \(\epsilon \)TM adds it to the frontier and proceeds to check the next line segment in the slice. If all line segments in the slice are confirmed to be nondominated, then \(\epsilon \)TM searches for the leftmost NDP “to the right” of the slice. If such an NDP is found, it defines the next slice. If no such NDP is found, the complete NDF has been found and \(\epsilon \)TM finishes.
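
A structural sketch of this loop follows; Algorithm 1 in “Appendix A” is authoritative, and all four callables here are hypothetical stand-ins for the slice problem and the (modified) lexicographic IPs.

```python
def trim_right(seg, ndp):
    """Vertical projection of ndp onto seg = ((y1, y2), (z1, z2)):
    keep the part of seg to the left of ndp's first coordinate."""
    (y1, y2), (z1, z2) = seg
    t = (ndp[0] - y1) / (z1 - y1)    # assumes y1 < ndp[0] <= z1
    return ((y1, y2), (ndp[0], y2 + t * (z2 - y2)))

def epsilon_tabu(find_upper_left, solve_slice, leftmost_ndp_below,
                 leftmost_ndp_right_of):
    """Control flow of eTM only; each argument is a hypothetical
    helper wrapping an IP solve or the slice problem."""
    ndf = []
    current = find_upper_left()          # z^L via a lexicographic IP
    while current is not None:
        segments = solve_slice(current)  # slice of current integer part
        dominated = False
        for seg in segments:             # checked from left to right
            hit = leftmost_ndp_below(seg)
            if hit is None:
                ndf.append(seg)          # the whole segment is an NLS
            else:
                ndf.append(trim_right(seg, hit))
                current = hit            # jump to the dominating slice
                dominated = True
                break
        if not dominated:                # slice fully nondominated
            current = leftmost_ndp_right_of(segments[-1])
    return ndf
```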

If an NDF contains many consecutive NLSs from the same slice, as is the case in some benchmark instances, then it may be advantageous (more efficient) to check whether a set of consecutive line segments from a slice is dominated or not. This new enhancement is presented in more detail in Sect. 3.1.

2.2 Boxed line method

The Boxed Line Method (BLM) (Perini et al. 2019) extended the Balanced Box Method (Boland et al. 2015a), which solves biobjective pure IPs, to handle continuous variables. Multiple variants of BLM were presented; the basic variant, described next, is also summarized in pseudocode (Algorithm 2) in “Appendix A”. To initialize, BLM solves two lexicographic IPs to find the upper left and lower right NDPs of the frontier. These define a rectangular region, or “box”, that is added to a queue. The outer loop of BLM iteratively processes boxes from this queue until it is empty, at which point the algorithm terminates.

The first step in processing a box is to search for the leftmost NDP in the lower half of the box. This is achieved by solving a lexicographic IP constrained to the region of the criterion space below a horizontal line that splits the box. If the NDP found lies strictly below the split line, the outer loop solves a mirrored lexicographic IP to find the rightmost NDP that is in the box and to the left of the NDP already found. Up to two new boxes are added to the queue, each with a newfound NDP as a corner point; boxes with width or height smaller than some numerical tolerance are discarded.

Otherwise, the NDP lies on the split line, and it must be that the split line intersects an NLS whose endpoints are as yet unknown. To find these endpoints, BLM first generates an overestimate (i.e., a superset) of the NLS by computing a line segment from a slice that contains the NDP. The inner loop is then invoked to refine this line segment, eliminating dominated sections until it is proved to be an NLS. The original inner loop procedure in BLM accomplishes this by solving weighted-sum scalarization IPs. Computational experiments have revealed that solve times for these IPs are unpredictable and can sometimes be (too) long. To eliminate this unpredictability, we present replacement IPs in Sect. 3.2.

BLM quickly partitions the criterion space into regions that are empty or dominated and boxes that are unexplored. If an iteration of the outer loop begins processing a box with area X, the sum of the areas of the new boxes added to the queue is at most X/2. Furthermore, since each box can be processed independently, BLM is easily parallelizable. These strengths facilitate a rapid approximation of the NDF.
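
The outer-loop bookkeeping can be sketched as follows (illustrative only: the lexicographic IPs and the inner loop are abstracted into hypothetical callables, and the boxes generated after segment refinement are omitted for brevity). Ordering the queue by area, as here, is also what makes the early approximation rapid.

```python
import heapq

def area(ul, lr):
    # box given by an upper-left NDP ul and a lower-right NDP lr
    return (lr[0] - ul[0]) * (ul[1] - lr[1])

def boxed_line_method(zL, zR, leftmost_below, rightmost_left_of,
                      refine_segment, tol=1e-6):
    """Outer loop only; leftmost_below and rightmost_left_of stand in
    for the two lexicographic IPs, refine_segment for the inner loop."""
    ndf = [zL, zR]
    queue = [(-area(zL, zR), (zL, zR))]          # max-heap on box area
    while queue:
        _, (ul, lr) = heapq.heappop(queue)
        split = (ul[1] + lr[1]) / 2.0            # horizontal split line
        p = leftmost_below(ul, lr, split)        # leftmost NDP, lower half
        if p is None:
            continue                             # defensive: nothing found
        if p[1] < split:                         # strictly below the line
            q = rightmost_left_of(ul, lr, p)     # mirrored lexicographic IP
            ndf += [p] + ([q] if q else [])
            for box in [(p, lr)] + ([(ul, q)] if q else []):
                bl, br = box                     # discard slivers below tol
                if br[0] - bl[0] > tol and bl[1] - br[1] > tol:
                    heapq.heappush(queue, (-area(*box), box))
        else:                                    # p sits on the split line,
            ndf.append(refine_segment(ul, lr, p))  # inside some NLS
    return ndf
```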

An additional variant of BLM, the recursive variant, is included in our computational study; it is described in detail in Perini et al. (2019).

2.3 Diversity of methods and singularity of instances

The literature on BOMIP algorithms has diversified far more than the set of test instances on which they are compared. The first algorithm to solve BOMIPs was the Triangle Splitting Algorithm (TSA) (Boland et al. 2015b); however, direct comparison to TSA has been ill-advised for reasons including its use of a relative error tolerance, as discussed in Fattahi and Turkay (2018) and Perini et al. (2019). The One-Direction Search (ODS) algorithm (Fattahi and Turkay 2018) is very close in design and spirit to \(\epsilon \)TM, and so we do not discuss it further. The computational study for the Search-and-Remove (SR) algorithm (Soylu 2018) compared SR against TSA and \(\epsilon \)TM on instances from Boland et al. (2015b). Emre (2020) presents a BOMIP method using the Pascoletti-Serafini scalarization, and the computational comparisons (although not performed on the same machine) are with SR and ODS.

The algorithms tested in this paper are coded efficiently for large-scale instances. The library of large-scale linear instances from Boland et al. (2015b) remains the standard for computational experimentation with BOMIP algorithms. While randomly generated, their NDFs have a consistent, monolithic structure, as first discussed in Perini et al. (2019). We argue that this simplicity of NDF structure leads to a lack of understanding of the robustness of algorithms to different structural characteristics.

3 Enhancements

This section presents enhancements for both component methods.

3.1 An enhanced implementation of the \(\epsilon \)-Tabu method

An enhanced implementation of \(\epsilon \)TM, denoted by \(\epsilon \)TM-PWL, solves a single lexicographic IP based on a mixed integer programming model of piecewise linear (PWL) functions. This single IP simultaneously considers multiple line segments of a slice: rather than investigating the line segments one by one, from “left to right”, we investigate them simultaneously.

Suppose that from NDP \(z^0 = z(x^0)\), the slice problem is solved for \(x^0_I\) to generate a representative list of \(K \ge 1\) line segments \(\{L(z^0,z^1), L(z^1,z^2), \ldots ,\) \(L(z^{K-1},z^K)\}\) ordered from left to right. To check a single line segment, say \(L(z^i,z^{i+1})\), where \(z^i\) is known to be nondominated, \(\epsilon \)TM solves the following lexicographic IP:

$$\begin{aligned} {\hat{z}}=\text {lexmin}\quad&(z_1(x),z_2(x)) \nonumber \\ \text {s.t.} \quad&z_1(x)\le \lambda z^i_1 + (1-\lambda )z^{i+1}_1,\nonumber \\&z_2(x)\le \lambda z^i_2 + (1-\lambda )z^{i+1}_2,\nonumber \\&x_I\ne x^0_I,\nonumber \\&x\in \mathcal {X}, 0\le \lambda \le 1, \end{aligned}$$
(4)

where \(x_I \ne x^0_I\) represents a no-good constraint. If (4) is feasible, then \({\hat{z}}\) is the leftmost NDP from a different slice that dominates some part of the line segment \(L(z^i,z^{i+1})\).

For \(\epsilon \)TM-PWL, we propose a new formulation that considers a set of (at most) K line segments in the slice simultaneously and models the piecewise linear slice function using binary variables (see, for example, Wolsey and Nemhauser 2014). Let \(K>0\) be the first hyperparameter we introduce. We introduce K binary variables, \(y_i\) for \(i = 1,\ldots ,K\), where \(y_i\) is associated with \(L(z^{i-1}, z^{i})\), and \(K+1\) continuous variables, \(\lambda _i\) for \(i = 0,\ldots ,K\), one for each of the endpoints of the line segments. The following minimization problem computes the “left-most” point from a different slice that dominates any of the line segments:

$$\begin{aligned} {\hat{z}}_1 = \min \quad&z_1(x)&\text {(5a)} \nonumber \\ \text {s.t.}\quad&\lambda _0 \le y_1,\ \ \lambda _i \le y_i + y_{i+1}\ \ (i=1,\dots ,K-1),\ \ \lambda _K \le y_K,\ \ \sum _{i=1}^{K} y_i = 1,\ \ \sum _{i=0}^{K} \lambda _i = 1,&\text {(5b)} \nonumber \\&z_1(x) \le \sum _{j=0}^{K} \lambda _j z^j_1,&\text {(5c)} \nonumber \\&z_2(x) \le \sum _{j=0}^{K} \lambda _j z^j_2,&\text {(5d)} \nonumber \\&x_I \ne x^0_I,\quad x\in \mathcal {X},\quad \lambda \ge 0,\quad y\in \{0,1\}^K.&\text {(5e)} \end{aligned}$$
(5)

In a feasible solution of this model, \(y_i=1\) indicates that the NDP found is “below” line segment \(L(z^{i-1}, z^{i})\). Constraints (5b), together with the binary constraints on \(y_i\) variables, ensure that at most two convex multipliers (\(\lambda _i\) variables) may be positive, the rest must be zero, and that any positive values must be consecutive: \(y_i=1\) for exactly one i, which forces \(y_j=0\) for all \(j\ne i\) and hence \(\lambda _j=0\) for all \(j\not \in \{i-1, i\}\). Since \(\lambda _{i-1}+\lambda _i=1\), \((\sum _{j=0}^K\lambda _jz^j_1,\sum _{j=0}^K\lambda _jz^j_2)\) is a convex combination of the points \(z^{i-1}\) and \(z^i\), and hence is a point on the ith line segment, \(L(z^{i-1}, z^i)\).

Constraints (5c) and (5d) thus ensure that the image z(x) either dominates or coincides with a point on this line segment, while the objective function ensures that x gives the left-most such image. If the IP is infeasible, then the slice is nondominated (as there exists no feasible solution “below” its frontier). Otherwise the IP is feasible, and solving

$$\begin{aligned} {{\hat{z}}}_2 = \min \{z_2(x)\;:\; z_1(x) \le {\hat{z}}_1,\ x_I \ne x_I^0,\ x \in \mathcal {X} \}, \end{aligned}$$

ensures that the result, \({{\hat{z}}}\), is an NDP. By construction, there is a unique \(i\in \{1,\dots ,K\}\) such that \(z^{i-1}_1 < {{\hat{z}}}_1 \le z^i_1\). Then \(L(z^{j-1},z^j)\) is an NLS for all \(j = 1,\dots ,i-1\), and \(L(z^{i-1},{{\tilde{z}}})\) is an NLS, where \({{\tilde{z}}}\) is the vertical projection of \({{\hat{z}}}\) onto \(L(z^{i-1},z^i)\).

Table 1 Average and maximum relative increase in total running time of the variations of \(\epsilon \)TM for all sets of instances

The proposed formulation adds \(K + 3\) new constraints and \(2K+1\) new variables (of which K are binary). Note that one may decide to work with a partial slice, i.e., with the left-most \({{\hat{K}}} < K\) line segments obtained when solving the slice problem. That is, stop solving the slice problem after \({{\hat{K}}}\) line segments have been found and check if that portion of the frontier is dominated. If no dominated point is found, then resume solving the slice problem starting from the \(({{\hat{K}}}+1)\)-th point in the frontier, and repeat the process until we find either a dominating point or we reach the end of the frontier. Observe that when \({\hat{K}}=1\), this enhancement collapses to the original \(\epsilon \)TM. Although we treat the hyperparameter K as a fixed value, it could instead be dynamically obtained and adjusted throughout the execution of the algorithm, as in Herszterg (2020).
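
A sketch of this partial-slice loop follows, with the slice problem and formulation (5) abstracted into hypothetical helpers of our own naming.

```python
def check_slice_in_chunks(next_segments, pwl_check, K_hat):
    """Confirm a slice from left to right, K_hat line segments at a
    time.  next_segments(k) resumes the slice problem and yields up to
    k further segments; pwl_check(chunk) solves (5) on the chunk and
    returns None (no point dominates the chunk) or (prefix, ndp): the
    confirmed-nondominated prefix and the leftmost dominating NDP.
    Returns the confirmed NLSs plus the dominating NDP, if any."""
    confirmed = []
    while True:
        chunk = next_segments(K_hat)
        if not chunk:
            return confirmed, None       # end of slice: all confirmed
        result = pwl_check(chunk)
        if result is None:
            confirmed.extend(chunk)      # chunk lies on the frontier
            continue
        prefix, ndp = result
        confirmed.extend(prefix)
        return confirmed, ndp            # continue from ndp's slice
```

With `K_hat` set to 1, this loop reduces to the segment-by-segment check of the original \(\epsilon \)TM, as noted above.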

Fig. 2 Comparing solve times of the lexicographic IP (6) (left) and the scalarized IPs (right) across instances, with respect to the area of the boxed regions. Times are measured during BLM (basic variant) and plotted as points. Correlation coefficients are given in the upper left of each graph. Note that only one lexicographic IP is solved on the Bent instance, so the correlation is not well defined even though it is marked as 0. The correlation is at least 0.7 for the scalarized IPs solved on the generated instances, i.e., Rand and Bent; elsewhere, the correlation is negligible

The computational results of this enhancement are reported in Sect. 6.2 and summarized in Table 1.

3.2 A purely lexicographic boxed line method

Recall that in the basic variant of BLM, the line generation procedure returns a line segment from a slice, which is then refined to the nondominated portion of the line segment by the inner loop that solves one or more scalarized IPs (see Perini et al. 2019 for details). Experiments with BLM have revealed that solving scalarized IPs can be computationally expensive; they are typically more expensive than solving a lexicographic IP (and sometimes much more expensive). Figure 2 shows for three instances (described in more detail in Sect. 6.1) all lexicographic and scalarized IPs solved during the execution of the basic variant of BLM. The x-axis represents the area of the active box when the IP is solved, and the y-axis represents the solve time of the IP. We observe that for all lexicographic IPs, the solve time is in the order of a few seconds, whereas for some of the scalarized IPs, the solve time is close to 500 s. Furthermore, for some instances (Rand and Bent), a correlation of at least 0.7 indicates that the solve time for scalarized IPs increases with box size, and hence BLM performance suffers early when boxes are still large. This suggests that a variant of BLM that only solves lexicographic IPs (i.e., avoids solving scalarized IPs) may be faster on average or, at least, have more “stable” solve times.

Therefore, we explore the benefits of a Purely Lexicographic Boxed Line Method (PureLex). PureLex replaces the weighted-sum scalarization IPs for refining the initial overestimate of the line segment with lexicographic IPs. It does so in a way similar to \(\epsilon \)TM, using one lexicographic IP solve per endpoint. Given a line segment \(L(z^1,z^2)\) containing NDP \(z^*=z(x^*)\) for some \(x^*\in {\mathcal {X}}\) in its (relative) interior, let \(\textbf{w}\) represent the gradient of \(L(z^1,z^2)\). PureLex solves the following lexicographic IP to update \(z^1\):

$$\begin{aligned} z^\alpha =\text {lexmin}\quad&(z_2(x),z_1(x)) \nonumber \\ \text {s.t.} \quad&\textbf{w}^T z(x) < \textbf{w}^T z^*,\nonumber \\&z_1(x)\le z^*_1,\nonumber \\&z_2(x)\le z^1_2,\nonumber \\&x_I \ne x^*_I, \quad x\in \mathcal {X}. \end{aligned}$$
(6)

Note that (4) and (6) both model a lexicographic optimization of two objectives over a criterion space set with the same structure. In both cases, the criterion space set is the intersection of three half spaces: two defined by the upper bounds on each objective and the third defined by the requirement to lie below a line (segment). The two formulations model this common structure differently: (6) represents the three half-space constraints directly, whereas (4) expresses them using a convex combination of the line segment endpoints, with a new continuous variable to model the weights in the convex combination. Both formulations impose a no-good constraint on the integer components of the solution. Unlike (4), formulation (6) uses a strict inequality (implemented as \( \textbf{w}^T z(x) \le \textbf{w}^T z^* - \epsilon \), for small \(\epsilon > 0\)) to make the current line segment infeasible; in theory, therefore, the no-good constraint is redundant. However, computational experiments have shown that including the no-good constraint in (6) provides better numerical stability. When (4) is replaced with (6) in \(\epsilon \)TM, experiments indicate that instances are solved slightly faster (an average improvement of 1% in running time). Note also that we solve either two single-objective IPs, finding an NDP that dominates a portion of \(L(z^1,z^*)\), or a single (infeasible) IP, proving that \(L(z^1,z^*)\) is nondominated.

If (6) is infeasible, then endpoint \(z^1\) does not need to be updated. Otherwise, the NDP \(z^\alpha \) is used to update the endpoint \(z^1\), using the horizontal projection of \(z^\alpha \) onto \(L(z^1,z^*)\), after which \(L(z^1,z^*)\) is guaranteed to be nondominated. Solving the second IP in the lexicographic minimization is needed only because \(z^\alpha \) is required to be nondominated in order to define the next box; to update the line segment, it suffices to solve the first IP.

A similar lexicographic IP is used to update \(z^2\):

$$\begin{aligned} z^\beta =\text {lexmin}\quad&(z_1(x),z_2(x)) \nonumber \\ \text {s.t.} \quad&\textbf{w}^T z(x) < \textbf{w}^T z^*,\nonumber \\&z_1(x)\le z^2_1,\nonumber \\&z_2(x)\le z^*_2,\nonumber \\&x_I \ne x^*_I, \quad x\in \mathcal {X}. \end{aligned}$$
(7)

If (7) is infeasible, then endpoint \(z^2\) does not need to be updated. Otherwise, the NDP \(z^\beta \) is used to update the endpoint \(z^2\), using the vertical projection of \(z^\beta \) onto \(L(z^*,z^2)\). After the update, \(L(z^*,z^2)\) is guaranteed to be nondominated.
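
Both endpoint updates are elementary projections onto the overestimated segment. For concreteness (our own illustrative code; it assumes the segment is non-degenerate in the relevant direction, so the divisions are well defined):

```python
def horizontal_projection(p, a, b):
    """Point on segment a-b with the same second coordinate as p;
    used to update the left endpoint z^1 from z^alpha.  Assumes
    a[1] != b[1] (segment not horizontal)."""
    t = (p[1] - a[1]) / (b[1] - a[1])
    return (a[0] + t * (b[0] - a[0]), p[1])

def vertical_projection(p, a, b):
    """Point on segment a-b with the same first coordinate as p;
    used to update the right endpoint z^2 from z^beta.  Assumes
    a[0] != b[0] (segment not vertical)."""
    t = (p[0] - a[0]) / (b[0] - a[0])
    return (p[0], a[1] + t * (b[1] - a[1]))
```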

Thus, PureLex identifies the nondominated portion of a line segment by solving a minimum of two and a maximum of four single-objective IPs, compared to the one or more (possibly expensive) scalarized IPs solved in BLM. Algorithm 3 presents the pseudo-code of PureLex.

4 A hybrid, two-phase algorithm

Reviewing the logic of \(\epsilon \)TM, we see that it is likely to solve fewer single-objective IPs than PureLex: when it processes a line segment, it solves either one (infeasible) IP, if the entire line segment belongs to the frontier, or two single-objective IPs, when the line segment has to be shortened to its nondominated portion. However, solving fewer single-objective IPs does not always result in shorter solve times, because run time also depends on the time it takes to solve each single-objective IP, which is impacted by the size of the box. The plots in Fig. 3 show a clear correlation between the solve time of a single-objective IP and the area of the active box. This explains, in part, why (as we shall see from the results in Sect. 6) \(\epsilon \)TM is more effective than PureLex when the frontier lies in a “small” region of the criterion space, i.e., when the initial box \(B(z^L,z^R)\) is “small”. Another contributing factor is that the frontier in a small region of the criterion space is likely to have fewer horizontal gaps, which is beneficial for \(\epsilon \)TM. Whenever \(\epsilon \)TM encounters a horizontal gap in the frontier, it must solve a lexicographic IP to find a new “starting” point, i.e., a new slice; this lexicographic IP is not restricted to a small part of the box, which is usually the case when \(\epsilon \)TM determines whether a line segment belongs to the frontier or needs to be shortened.

Fig. 3 The area of the box and the solve time for the lexicographic IPs solved by \(\epsilon \)TM, separated by instance, are plotted as points. The resulting correlation coefficient is given in the upper left of each graph

This suggests that an algorithm that switches from PureLex to \(\epsilon \)TM at some point during the generation of the frontier may be able to exploit the respective strengths of these methods and allay their respective weaknesses. We propose such an algorithm, which we call SPureLex. SPureLex starts by executing PureLex to quickly decompose the criterion space into many small as-yet unexplored boxes; then, when the total as-yet unexplored area of the criterion space becomes small, i.e., less than a fraction \(\rho \) (\(0< \rho < 1\)) of the area of the initial box \(B(z^L,z^R)\), it switches to \(\epsilon \)TM to rapidly generate the frontier in the remaining boxes (which define the as-yet unexplored area of the criterion space). Early computational experiments indicated that small values of \(\rho \) were ideal; in Sect. 6, we use \(\rho =0.005\). Algorithm 4 presents the pseudo-code of SPureLex. We call SPureLex-PWL the variant of SPureLex that uses the \(\epsilon \)TM-PWL method.
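
The switching rule itself amounts to simple area bookkeeping on the box queue. A sketch (with one PureLex iteration and \(\epsilon \)TM-on-a-box abstracted as hypothetical callables; Algorithm 4 is authoritative):

```python
import heapq

def spurelex(initial_box, rho, area, purelex_step, etm_on_box):
    """Phase 1: PureLex until the as-yet unexplored area falls below
    rho * area(initial box); phase 2: finish each remaining box with
    eTM.  purelex_step(box) returns (new NDPs, new boxes)."""
    total = area(initial_box)
    unexplored = total
    queue = [(-total, initial_box)]      # largest-area box first
    ndf = []
    while queue and unexplored > rho * total:
        neg_a, box = heapq.heappop(queue)
        unexplored += neg_a              # neg_a < 0: box leaves the queue
        new_ndps, new_boxes = purelex_step(box)
        ndf.extend(new_ndps)
        for b in new_boxes:
            unexplored += area(b)
            heapq.heappush(queue, (-area(b), b))
    for _, box in queue:                 # phase 2: eTM cleans up
        ndf.extend(etm_on_box(box))
    return ndf
```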

5 Approximating a mixed integer nondominated frontier

In the literature on multiobjective optimization, an approximation of the NDF can take any of several different forms. Here, we use the term to describe a subset of the NDF. In particular, we compare the parts of the NDF discovered by alternative exact algorithms prior to their completion. To do so, we require metrics that measure the quality of such an approximation. We introduce the following metrics, which illustrate both the progress of the algorithms and the improving quality of the approximation:

1. The as-yet unexplored area of the criterion space (i.e., the area of the criterion space that may still contain parts of the frontier) as a fraction of the area of the initial box \(B(z^L,z^R)\);

2. The fraction of isolated NDPs found;

3. The fraction of NLSs found;

4. The fraction of the total length of the NLSs found; and

5. The fraction of slices that contribute to the frontier found.

Note that metrics 2, 3, 4, and 5 are relative to the complete nondominated frontier, which is assumed to be known; a direct computation of these metrics is sketched below.
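
The bookkeeping format used here is hypothetical, chosen for this sketch rather than taken from the published code.

```python
from math import hypot

def seg_length(seg):
    """Euclidean length of an NLS given as an endpoint pair."""
    (a, b) = seg["endpoints"]
    return hypot(b[0] - a[0], b[1] - a[1])

def approximation_metrics(found, complete):
    """Metrics 2-5 relative to the complete NDF.  Both arguments are
    dicts of the form {'isolated': [ndp, ...],
    'nls': [{'endpoints': (y, z), 'slice': integer-part id}, ...]}."""
    frac = lambda part, whole: (len(part) / len(whole)
                                if whole else float("nan"))
    slices = lambda d: {s["slice"] for s in d["nls"]}
    total_len = sum(map(seg_length, complete["nls"])) or float("nan")
    return {
        "isolated NDPs": frac(found["isolated"], complete["isolated"]),
        "NLSs":          frac(found["nls"], complete["nls"]),
        "NLS length":    sum(map(seg_length, found["nls"])) / total_len,
        "slices":        frac(slices(found), slices(complete)),
    }
```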

When computing the complete NDF, the order in which boxes are processed is immaterial. However, processing the boxes in the queue in nonincreasing order of their area has two advantages when computing an approximation of the NDF: (1) it introduces diversification, in the sense that different parts of the criterion space are explored, and (2) by design of BLM (see Sect. 2.2), the unexplored area of the criterion space shrinks as quickly as possible. Therefore, terminating the algorithm after a fixed amount of time, or terminating it when the unexplored area of the criterion space drops below a certain threshold (e.g., below some fraction of the area of the initial box \(B(z^L,z^R)\)), are both effective strategies for quickly producing a high-quality approximation of the NDF.

6 Computational study

The goal of our computational study is twofold. First, it demonstrates the efficacy of \(\epsilon \)TM-PWL and SPureLex. Second, it investigates SPureLex's ability to efficiently produce high-quality approximations of the NDF. All algorithms are coded in C++ and solve the linear and integer programs using IBM CPLEX Optimizer 12.6. All experiments were conducted in a single thread on a dedicated Intel Xeon E5-2630 2.3 GHz machine with 50 GB RAM, running Red Hat Enterprise Linux Server 7.4. The runtime limit was set to 86,400 s (24 h).

6.1 Test instances

While the algorithms presented work for general integer variables, our test instances are restricted to binary variables. Four sets of instances were used: (1) the five largest instances (C320) from the benchmark instances proposed by Mavrotas and Diakoulaki (1998), which we refer to as “Historical”, (2) five modified versions of these instances, which we refer to as “Relaxed Historical”, (3) three large randomly generated instances using the generation scheme proposed by Perini et al. (2019), which we refer to as “Rand”, and (4) three new randomly generated instances, which we refer to as “Bent”.

The framework for the Historical instances, originally published by Mavrotas and Diakoulaki (1998), includes the following: half of the variables are binary and half are continuous; a randomly generated objective vector; and a set of knapsack constraints (equal in number to the variables) with randomly generated coefficients and right-hand sides. Each binary variable appears in exactly one knapsack constraint; half of the constraints involve only continuous variables. In addition, a constraint ensures that no more than a third of the binary variables can be set to 1. This framework was adapted to the biobjective case in Boland et al. (2015b) by generating an additional objective vector. This set of instances has since been used in many published studies of BOMIP algorithms, including Soylu and Yıldız (2016), Fattahi and Turkay (2018), Soylu (2018), and Perini et al. (2019). We use the largest instances in this set. All instances in the set share similar features: relatively few integer solutions contribute to the NDF, each of which produces a slice with many short line segments, and there is little dominance between slices. This structure poses complications from an accuracy perspective, due to numerical rounding errors (see Fattahi and Turkay 2018; Perini et al. 2019 for further discussion).

To increase diversity in the benchmark data set, we modify the Historical instances by relaxing the constraints, as follows. First, the constraints involving only continuous variables are removed. Second, the proportion of binary variables that can be set to 1 is increased from a third to a half. These modifications maintain the same number of integer and continuous variables (160 each) while reducing the number of constraints by approximately half. In four of the five instances, this relaxation increased the number of distinct integer solutions whose slice appears (at least in part) in the NDF by about 21% on average, while decreasing the number of NLSs per integer solution in the NDF by 20–34%, with an average of 26%. The NDF structure of the Historical instance numbered 24 barely changed: its ratio of NLSs per integer solution was reduced by less than 2% in the relaxed version. Thus, we omit it. The remaining four instances constitute the set we call Relaxed Historical.

Perini et al. (2019) introduced the instance generation scheme that produces the Rand instances to complement the Historical instances. For example, their ratio of NLSs per integer solution in the NDF is an order of magnitude smaller than for the Historical instances. The Rand instances were designed with the slices and final NDF in mind, and the BOMIP was then “reverse-engineered.” The slices in criterion space include: (1) a long line segment traversing from the top left to the bottom right of the NDF and (2) boundaries of cones with vertices that dominate (a point in) the line segment (see Fig. 4a). The NDF alternates between portions of the long line segment and boundaries of each cone, which includes many intersections. Classes of these instances are defined by the number of vertices or cones chosen to dominate the long line segment, which we simply call n. We include three relatively large instances of size \(n=7,500\) in this study, labeled “A”, “B”, and “C”. Each of the n vertices is chosen randomly, so NDFs from instances of the same size still vary.

Here we introduce a slight modification to the structure of the Rand instances that results in significant differences in the computational comparison of BOMIP algorithms. The slice consisting of the long line segment is replaced with a “bent” line segment, i.e., two line segments whose gradients differ only slightly, joined at a central vertex (see Fig. 4b). Note that this slice is now a cone, but one that is much wider than the cones associated with other integer solutions. Details of the method for generating the Bent instances are given in “Appendix B”. Computationally, the slight difference between the gradient vectors of the two line segments that form the “bent” line segment makes these instances much more difficult for algorithms that solve scalarized IPs. Our computational study uses three instances, with \(n=\) 5000, 7500, and 10,000, respectively.

Fig. 4 Random cone width instances (Perini et al. 2019) and the new Bent instances for \(n=2\)

6.2 Computing exact frontiers

We start by evaluating the benefits of the PWL enhancement to \(\epsilon \)TM. The results are summarized in Table 1. Here, for each set of instances, we report the average and maximum relative increase in running time of the algorithm over that of whichever algorithm is best for the instance: \(\epsilon \)TM or \(\epsilon \)TM-PWL. Specifically, if \(T_i\) denotes the running time for \(\epsilon \)TM on the ith instance in a set, and \(T^{PWL}_i\) denotes the running time for \(\epsilon \)TM-PWL on the same instance, then in the row of Table 1 for the \(\epsilon \)TM method, we will report

$$\begin{aligned} \tfrac{1}{N} \sum _i \frac{T_i - T_i^*}{T_i^*} \qquad \text {and} \qquad \max _i\ \frac{T_i - T_i^*}{T_i^*} \end{aligned}$$

in the “Average” and “Max” columns corresponding to that set of instances, where N is the number of instances in the set and \(T_i^*:=\min \{T_i, T^{PWL}_i\}\) is the best of the two runtimes. A zero relative increase indicates that the algorithm was best for every instance in the set; a positive value indicates that the algorithm was not best for at least one instance in the set. A high value indicates that the alternative algorithm gives a speed-up of a factor of one plus this value, on average (e.g., for a value of 1.87, the speed-up factor is 2.87). For the exact running times of the algorithms and other performance metrics, please refer to “Appendix C” (Tables 4, 5).
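
For concreteness, the statistic in each cell can be computed directly from recorded runtimes; a tiny sketch (ours, with made-up numbers in the example):

```python
def relative_increase(times, best_times):
    """Per-instance relative increase (T_i - T_i*) / T_i*, averaged
    and maximized over the instances in a set (times in seconds)."""
    rel = [(t - b) / b for t, b in zip(times, best_times)]
    return sum(rel) / len(rel), max(rel)

# Example with two instances: times of one method vs. the
# per-instance best of both methods.
avg, worst = relative_increase([12.0, 30.0], [10.0, 30.0])
# avg = 0.1, worst = 0.2: 10% slower on average, at most 20% slower
```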

Recall that the NDF of a Historical instance is composed of many small line segments but is defined by a small number of integer solutions. On average, the number of integer solutions defining the frontier of the Historical instances is about 370, and each contributes about 45 line segments. Hence, we set the hyperparameter K to 50 in our computational experiments. The enhancement was specifically designed to exploit this situation, which explains why the enhanced version of \(\epsilon \)TM performs so well on this set of instances. The same is true, to a lesser extent, for the Relaxed set of instances. On all Historical and Relaxed instances, the enhanced version was faster, with an average speed-up factor of around 2–3.

Unfortunately, these impressive gains are not observed for the Rand and Bent instances. Rather than being determined by a few integer solutions contributing many line segments each, the NDFs of these instances are determined by many integer solutions contributing few line segments each. Hence, K was fixed to 1 in our computational experiments. The overhead of setting up and solving an integer program to determine the “left-most” nondominated segment of a slice is too large when slices consist of only a few line segments. For the largest Bent instance, the variant of \(\epsilon \)TM with the enhancement is unable to solve the instance within 24 h.

Next, we evaluate the benefits of the variant of BLM in which only lexicographic IPs are solved, PureLex. The results are summarized in Table 2, which again gives statistics for each set of instances for the increase in running time of each variant relative to that of the best BLM variant for the instance.

Table 2 Average and maximum relative increase of total running time of the variations of BLM for all instance sets

The PureLex variant is clearly the most robust, being not much slower than the best in the cases where it is not best. For the Rand and Bent instances, the PureLex variant substantially outperforms the basic version. It also outperforms the recursive variant on the Bent instances: BLM-recursive struggles to solve the Bent instances within the 24 h time limit for each, whereas PureLex can solve all of them in less than 3 h. For the Historical and Relaxed instances, the recursive variant of BLM is faster than PureLex, but not by much. Note that BLM-recursive is faster than BLM on all but the Historical instances, and on those is not much slower; BLM-recursive is the most robust of the BLM variants prior to this work.

Finally, we compare the performance of the two best performing variants of \(\epsilon \)TM and BLM with two variants of the hybrid, two-phase method: SPureLex with \(\rho = 0.005\) and SPureLex-PWL with \(\rho = 0.005\). The results can be found in Table 3.

Table 3 Average and maximum increase in total running time of various solution methods relative to the fastest of the methods for the instance, for all instance sets

The first thing to observe is that the PWL enhancement to \(\epsilon \)TM still has a significant impact on the performance of the hybrid, two-phase method, SPureLex, even though \(\epsilon \)TM is only used to find (part of) the NDF in boxes that are relatively small (i.e., whose total area is less than half a percent of the area of the original box). The second thing to observe is that PureLex performs well on the Historical instances and outperforms SPureLex-0.005, although not by much. Finally, as expected, we see that the version of SPureLex that does not use the enhanced version of \(\epsilon \)TM outperforms the one that does, as the benefits of the enhancement are insufficient to overcome its overhead; however, the difference is very small. These results indicate that SPureLex-PWL-0.005, the hybrid, two-phase method that uses PureLex in the first phase and the enhanced version of \(\epsilon \)TM in the second phase, is the most robust and efficient method for solving BOMIPs.

It is interesting to observe that, compared to the variants of BLM, the number of times the Same-Integer-Solution enhancement is invoked by the variants of the hybrid, two-phase method is small. This indicates that more careful tuning of the point at which the hybrid, two-phase method switches may have some pay-off: by switching too early, the algorithm may miss opportunities to invoke the Same-Integer-Solution enhancement.

Overall, \(\epsilon \)TM solves the smallest number of IPs for these instances. However, because solving lexicographic IPs in large-area boxes can be costly (as shown in Fig. 3), \(\epsilon \)TM takes more time even though it solves significantly fewer IPs than PureLex, as several of its IPs are solved in large-area boxes. This happens because \(\epsilon \)TM generates the nondominated frontier from left to right, and solves a lexicographic IP with respect to the as-yet unprocessed portion of the line segment.

6.3 Computing approximate frontiers

Next, we compare the performance, in terms of their ability to produce high-quality approximations of the NDF, of \(\epsilon \)TM, the recursive variant of BLM, PureLex, and SPureLex with \(\rho = 0.005\). Rather than enforcing early termination of an algorithm to obtain an approximation of the NDF, we report statistics on the approximation of the NDF at different points in time during the execution of the algorithms. We do this for three instances, one from each of the Historical, Bent, and Rand sets: 21, Bent7500.A, and Rand7500.A, respectively. The results are representative of what happens for the other instances in the corresponding sets. Figure 5 summarizes the following statistics with respect to time (in seconds) on a log scale: Area, the resolved area as a percentage of the initial box \(B(z^L,z^R)\); INDP, the percentage of isolated NDPs found; NLS, the percentage of NLSs found; fTNLS, the fraction of the total length of NLSs found; and Slice, the percentage of slices that contribute to the frontier found. Tables 6, 7, 8 and 9 in “Appendix D” report more statistics.

Fig. 5 Approximation results for \(\epsilon \)TM, the recursive variant of BLM, PureLex, and SPureLex with \(\rho = 0.005\). Percentages are plotted on the y-axis; time (in s) is plotted on the x-axis on a log scale. There are no isolated NDPs in the NDF of the Historical instance, so this space is blank. No approximation metrics are reported for the recursive variant (blue) of BLM on the Bent instance

We observe that, as expected, PureLex and SPureLex produce a high-quality approximation of the NDF much more quickly than \(\epsilon \)TM. In less than 10 min, PureLex and SPureLex have explored more than 99.5% of the area of \(B(z^L,z^R)\), whereas \(\epsilon \)TM has explored a little more than 5% for the Historical instance, and 1% or less for the Rand and Bent instances.

Especially for Rand and Bent instances, multiple metrics illustrate how slowly \(\epsilon \)TM advances at the beginning (from an approximation perspective) but speeds up towards the end. This is due, in part, to the fact that in the beginning, the boxes cover a large area in criterion space, and the lexicographic IPs are time-consuming. As the area of the boxes decreases, the lexicographic IPs are solved faster, and, consequently, the statistics improve more rapidly. For the Historical instance, this is less noticeable because the solve times for the lexicographic IPs tend to be smaller (as the generated line segments are small).

Even though the recursive variant of BLM can be competitive when it comes to producing the complete nondominated frontier, it is not the ideal candidate for producing approximations. For the Historical instance, there is never any recursion, so the algorithm produces high-quality approximations throughout the execution. However, on the Rand instance, the first reported data point occurs late during the execution because the algorithm runs deep into recursion and only reports approximation metrics once it has returned to depth level zero. This is also the reason why no approximation metrics are reported for the Bent instance (since the entire execution time is spent on the first recursion without returning to depth level zero). After the algorithm returns to depth level zero for the first time, it quickly approximates the rest of the frontier, but the depth of recursion and resulting lag time of reporting makes the recursive variant of BLM less effective for approximating nondominated frontiers.

Another interesting observation is that PureLex and SPureLex quickly find a large fraction of the slices that contribute to the frontier for the Historical instance. This is a consequence of the structure of the nondominated frontier, in which each slice contributes many nondominated line segments to the frontier, and, because PureLex and SPureLex quickly decompose the criterion space into small boxes, they tend to find NDPs from different slices. Also observe from the Historical instance, however, that once SPureLex switches to \(\epsilon \)TM, the algorithm discovers new slices more slowly, at a rate comparable to \(\epsilon \)TM on the same instance. This is the trade-off: sometimes the switch from PureLex to \(\epsilon \)TM improves performance (e.g., Rand and Bent instances), and sometimes it worsens performance (e.g., Historical instances).

By unifying the two disparate algorithms into a two-phase hybrid approach with just two hyperparameters to tune (K and \(\rho \)), SPureLex is robust to a wider variety of instances than either of its component algorithms. Tuning the first hyperparameter, K, can be informed by information about the NDF, i.e., how many NLSs are expected per slice in the frontier, and/or it can be updated dynamically as the procedure evolves. We found that the second hyperparameter, \(\rho \), performs well for small values, e.g., \(\rho =0.005\), but this could be tested more rigorously via parameter tuning. Altogether, these results show that, also when it comes to finding approximate nondominated frontiers, SPureLex (or one of its variants) is fast and robust, and is the algorithm of choice.

7 Final remarks

The algorithm presented and analyzed in this paper shows how structural observations about an NDF, together with an understanding of computational bottlenecks, can lead to tactical enhancements in algorithm design. The evidence included in this work calls for a more diverse range of test instances with which to test, and improve, the robustness of multiobjective optimization algorithms.