1 Introduction

In this paper we consider a box-constrained global optimization problem of the form

$$\begin{aligned} \min _{\mathbf {x}\in D} f(\mathbf {x}), \end{aligned}$$
(1)

where \(f:\mathbb {R}^n \rightarrow \mathbb {R}\) denotes the objective function and the feasible region is an n-dimensional hyper-rectangle \( D = [ \mathbf {a}, \mathbf {b}] = \{ \mathbf {x} \in \mathbb {R}^n: a_j \le x_j \le b_j, j = 1, \dots , n\} \). We also assume that the objective function \(f(\mathbf {x})\) is Lipschitz-continuous, but it can be non-linear, non-differentiable, non-convex, and multi-modal. DIRECT is a popular partitioning-based Lipschitz optimization algorithm [8, 20, 21, 23, 25, 26, 33] extending the ideas of the Piyavskii algorithm [27] (independently rediscovered by Shubert [32]) to multidimensional derivative-free optimization. The DIRECT algorithm [9] seeks a global optimum by partitioning potentially optimal (the most promising) hyper-rectangles and evaluating the objective function at their centers. The simplicity and efficiency of the DIRECT algorithm have attracted considerable research interest. Although most DIRECT-type algorithms use hyper-rectangular partitions [6, 11, 13, 14, 15], simplicial partitions (the DISIMPL algorithm) [19, 22, 23] have several advantages [24]. Central sampling of the objective function can be replaced by a diagonal approach, which samples at the endpoints of a diagonal [10, 29, 30, 31]. Trisection of hyper-rectangles is usually used so that the objective function values at the center or at the endpoints of diagonals can be reused in descendant subregions. A bisection can ensure better-shaped hyper-rectangles, with a smaller variety of side lengths than trisection (which produces sides differing by a factor of three), but a specific sampling strategy is then necessary to enable the reuse of sample points [18].

The original DIRECT algorithm has two main weaknesses [12, 16, 19, 29]. First, on problems with many local minima, DIRECT sometimes spends an excessive number of function evaluations exploring suboptimal local minima, thereby delaying the discovery of the global minimum. To address this issue, a two-phase globally-biased technique was proposed [19, 29]. Second, DIRECT usually gets close to the global optimum quickly, but it can be slow to converge to it with high accuracy. To overcome the latter issue, a two-phase locally-biased technique [13] or hybrid versions of DIRECT-type algorithms enriched with local searches [14, 16] can be employed. In this paper, we propose an alternative strategy that overcomes both drawbacks without the need for local solvers or a two-phase scheme, which requires the introduction of new parameters.

The rest of the paper is organized as follows. In Sect. 2, we review existing ways of selecting potentially optimal hyper-rectangles in various DIRECT-type approaches. In Sect. 3, we describe a new strategy for the selection of the most promising hyper-rectangles, which addresses both mentioned weaknesses of DIRECT. A description of the new DIRECT-GL algorithm is given in Sect. 4. The results of a numerical investigation on 54 Hedar test problems [7] using four different convergence tolerances (216 variants in total) are discussed in Sect. 5. Finally, we conclude the paper in Sect. 6.

2 The selection of the most promising hyper-rectangles

The essential step in DIRECT-type algorithms is the identification of potentially optimal (the most promising) hyper-rectangles of the current partition, which at iteration k is defined as

$$\begin{aligned} \mathcal {P}_k = \{ D^i_k : i \in \mathbb {I}_k \}, \end{aligned}$$

where \( D^i_k = [\mathbf {a}^i, \mathbf {b}^i ] = \{ \mathbf {x} \in \mathbb {R}^n: 0 \le a^i_j \le x_j \le b^i_j \le 1, j = 1,\dots , n, \forall i \in \mathbb {I}_k \} \) and \( \mathbb {I}_k \) is the index set identifying the current partition \( \mathcal {P}_k \). The next partition \( \mathcal {P}_{k+1} \) is obtained after the subdivision of the selected potentially optimal hyper-rectangles from the current partition \( \mathcal {P}_k \).
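For concreteness, the partition bookkeeping can be represented by a simple container. The following Python sketch is purely illustrative (the names Rect and Partition are ours, not part of the algorithm description): it stores each hyper-rectangle \(D^i_k\) by its corners together with the objective value sampled at its center.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Rect:
    """One hyper-rectangle D^i_k of the current partition (scaled to the unit cube)."""
    a: List[float]        # lower corner a^i
    b: List[float]        # upper corner b^i
    f_center: float       # objective value f(c^i) sampled at the center

    @property
    def center(self) -> List[float]:
        # c^i is the midpoint of the hyper-rectangle
        return [(lo + hi) / 2.0 for lo, hi in zip(self.a, self.b)]

@dataclass
class Partition:
    """The current partition P_k = {D^i_k : i in I_k}."""
    rects: List[Rect] = field(default_factory=list)
```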

2.1 Potentially optimal hyper-rectangles in the original DIRECT algorithm

To select potentially optimal hyper-rectangles for subdivision, DIRECT assesses the goodness of each hyper-rectangle \( D^i_k \) based on a lower bound estimate for the objective function \( f(\mathbf {x}) \) over it. The requirement of potential optimality is stated formally in Definition 1.

Definition 1

(Potentially optimal hyper-rectangle) Let \( \mathbf {c}^i \) denote the center sampling point and \( \delta _i \) be a measure (distance, size) of the hyper-rectangle \( D^i_k \). Let \( \varepsilon > 0 \) be a positive constant and \(f_\mathrm{min}\) be the best currently known value of the objective function. A hyper-rectangle \( D^j_k, j \in \mathbb {I}_k \) is said to be potentially optimal if there exists some rate-of-change (Lipschitz) constant \( \tilde{L} > 0\) such that

$$\begin{aligned} f(\mathbf {c}^j) - \tilde{L}\delta _j \le f(\mathbf {c}^i) - \tilde{L}\delta _i, \quad \forall i \in \mathbb {I}_k, \end{aligned}$$
(2)
$$\begin{aligned} f(\mathbf {c}^j) - \tilde{L}\delta _j \le f_\mathrm{min} - \varepsilon |f_\mathrm{min}|, \end{aligned}$$
(3)

where the measure of the hyper-rectangle is

$$\begin{aligned} \delta _i = \frac{1}{2} \Vert \mathbf {b}^i - \mathbf {a}^i \Vert _2. \end{aligned}$$
(4)

The hyper-rectangle \( D^j_k \) is potentially optimal if the lower Lipschitz bound for the objective function, computed by the left-hand side of (2), is the smallest one among the hyper-rectangles of the current partition \( \mathcal {P}_k \) for some positive constant \(\tilde{L}\). In (3), the parameter \(\varepsilon \) is used to protect against excessive refinement of local minima [9, 19].
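To make Definition 1 concrete, the following Python sketch selects all potentially optimal hyper-rectangles by directly checking for an admissible constant \(\tilde{L}\) in (2) together with the sufficient-decrease condition (3). It is a minimal quadratic-time illustration under our own naming, not the modified Graham's scan used in efficient implementations:

```python
import numpy as np

def potentially_optimal(f_vals, deltas, eps=1e-4):
    """Indices of potentially optimal hyper-rectangles per Definition 1.

    f_vals -- objective values f(c^i) at the centers
    deltas -- measures delta_i from (4)
    eps    -- the epsilon parameter of condition (3)
    """
    f_vals, deltas = np.asarray(f_vals, float), np.asarray(deltas, float)
    f_min, selected = f_vals.min(), []
    for j in range(len(f_vals)):
        smaller, larger = deltas < deltas[j], deltas > deltas[j]
        # Among equally sized rectangles, j must have the lowest center value
        if f_vals[j] > f_vals[deltas == deltas[j]].min():
            continue
        # Feasible range [L_low, L_high] for the constant L~ in (2)
        L_low = 0.0
        if smaller.any():
            L_low = max(L_low, np.max((f_vals[j] - f_vals[smaller])
                                      / (deltas[j] - deltas[smaller])))
        L_high = np.inf
        if larger.any():
            L_high = np.min((f_vals[larger] - f_vals[j])
                            / (deltas[larger] - deltas[j]))
        if L_low > L_high:
            continue
        # Condition (3), checked with the largest admissible L~
        if np.isfinite(L_high) and \
           f_vals[j] - L_high * deltas[j] > f_min - eps * abs(f_min):
            continue
        selected.append(j)
    return selected
```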

2.2 Selection of the most promising hyper-rectangles in other DIRECT-type algorithms

In the original DIRECT algorithm, the size of a hyper-rectangle is measured by the Euclidean distance from its center to a corner or, equivalently, by half the length of its diagonal (see (4)). In DIRECT-l [6], the measure of a hyper-rectangle is instead the length of its longest side. Such a measure corresponds to the \(L^{\infty }\)-norm and allows the DIRECT-l algorithm to group more hyper-rectangles with the same measure. Thus, there are fewer distinct measures, and therefore fewer potentially optimal hyper-rectangles are selected. Moreover, DIRECT-l selects at most one hyper-rectangle from each group, even if the group contains more than one potentially optimal hyper-rectangle; this reduces the number of divisions within a group. The results presented in [6] and extended in [19] suggest that DIRECT-l performs well on lower-dimensional problems that do not have too many local and global minima.
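The difference between the two measures is easy to state in code; a small illustration (our naming, assuming each hyper-rectangle is given by its corner vectors):

```python
import numpy as np

def measure_direct(a, b):
    # Original DIRECT: half the length of the diagonal, as in (4)
    return 0.5 * np.linalg.norm(np.asarray(b) - np.asarray(a))

def measure_direct_l(a, b):
    # DIRECT-l: length of the longest side (L-infinity-type measure)
    return float(np.max(np.asarray(b) - np.asarray(a)))
```

Since differently shaped hyper-rectangles can share the same longest side while having different diagonals, the second measure yields fewer distinct groups.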

The main principle of the aggressive version of DIRECT [1] is to select and divide a hyper-rectangle of every measure \((\delta _i)\) in each iteration. The aggressive version requires many more function evaluations than the other versions of DIRECT, since the criteria for choosing hyper-rectangles to be divided are relaxed. Although this approach does not appear favorable for simple test problems, more difficult problems may be solved more easily by this strategy on a large parallel supercomputer [1].

In the PLOR algorithm [17], the set of considered Lipschitz constants (and herewith the set of potentially optimal hyper-rectangles) is reduced to just two: the maximal and the minimal one. In this way, the PLOR approach is independent of any user-defined parameters and balances local and global search equally during the optimization process.

The two-phase globally-biased [19, 29] and locally-biased [13] algorithms work, in one of their phases, in the same way as the original DIRECT algorithm, i.e., the selection procedure considers all hyper-rectangles from the current partition. In the second phase, however, they limit the selection of potentially optimal hyper-rectangles based on their measures. The globally-biased versions constrain themselves to the larger subregions (primarily addressing the first weakness), while the locally-biased version constrains itself to the smaller ones, in this way addressing the second weakness of DIRECT-type algorithms.

3 Extended set of potentially optimal hyper-rectangles

In this section, we present a new way to identify an extended set of potentially optimal hyper-rectangles. Using a new two-step strategy, we enlarge the set of the best hyper-rectangles by adding more medium-measured hyper-rectangles with the smallest function values at their centers and, additionally, those closest to the current minimum point. The first extension forces the algorithm to work more globally (compared to the selection procedure used in DIRECT), while the second ensures a faster and broader examination around the current minimum point. In this way, we address both weaknesses of DIRECT while staying within the same algorithmic framework. Let us state this formally.

Let \(\mathbb {L}_k\) be the set of all different indices at the current partition \( \mathcal {P}_k \), corresponding to the groups of hyper-rectangles having the same measure \((\delta _k)\). The minimum value \(l^\mathrm{min}_k \in \mathbb {L}_k\) corresponds to the group of hyper-rectangles having the smallest measure \(\delta _k^\mathrm{min}\), and the maximum value \(l^\mathrm{max}_k\) corresponds to the group of hyper-rectangles having the largest measure \(\delta _k^\mathrm{max}\), i.e., \(l^\mathrm{max}_k = \max \{\mathbb {L}_k\} < \infty \). Finally, let \(l^{i}_k \in \mathbb {L}_k\) be the index of the group to which the hyper-rectangle \(D^i_k\) belongs. With this notation, Definitions 2 and 3 formalize the new strategies for the identification of an extended set of potentially optimal hyper-rectangles from the current partition \( \mathcal {P}_k \).

Definition 2

(Enhancing the global search)

  • Step 1 Find an index \( j \in \mathbb {I}_k\) and a corresponding hyper-rectangle \(D_k^j\), such that

    $$\begin{aligned} D^j_k = \mathop {\mathrm{arg\,max}}\limits _{j} \left\{ l^{j}_{k} : j = \mathop {\mathrm{arg\,min}}\limits _{i \in \mathbb {I}_k : \; l^\mathrm{min}_k \le l^{i}_{k} \le l^\mathrm{max}_k} \{ f(\mathbf {c}^i) \} \right\} . \end{aligned}$$
    (5)
  • Step 2 Set \( l^\mathrm{min}_k = l^{j}_k + 1\). If \(l^\mathrm{min}_k \le l^\mathrm{max}_k\), repeat from Step 1; otherwise, terminate.

At Step 1, the hyper-rectangle containing the current minimum point \((\mathbf {x}^\mathrm{min})\) is selected first. If there are several hyper-rectangles with the same lowest objective value \(f(\mathbf {c}^i)\), preference is given to the one with the largest \(l^{j}_{k}\) value, i.e., the largest measure. After this, at Step 2, the minimum index \(l^\mathrm{min}_k = l^{j}_k + 1 \) is increased; thus all hyper-rectangles from groups with indices lower than the updated \(l^\mathrm{min}_k\) (whose measures are smaller than the measure of the \(l^\mathrm{min}_k\) group) are not considered in subsequent executions of Step 1. A geometrical interpretation and comparison of the original DIRECT and the globally enhanced (which we call DIRECT-G) versions are shown in the left-hand side and middle graphs of Fig. 1. By this strategy, we extend the number of medium-measured potentially optimal hyper-rectangles and force DIRECT-G to work more globally. Let us stress that, as opposed to the aggressive version of DIRECT, by Definition 2 DIRECT-G will not consider hyper-rectangles from groups whose minimum function value is larger than the minimum value found in the larger groups.
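A minimal Python sketch of the procedure from Definition 2 could look as follows; it assumes the group indices \(l^{i}_{k}\) have been precomputed, and the names are ours:

```python
def select_global(values, group_idx):
    """Definition 2 sketch: sweep the groups from l_min upward, each time
    taking the rectangle with the smallest `values` entry among the remaining
    groups, with ties broken toward the largest group (measure).

    values    -- f(c^i) for Definition 2 (or distances (6) for Definition 3)
    group_idx -- group index l^i_k of each rectangle (larger = larger measure)
    """
    l_min, l_max = min(group_idx), max(group_idx)
    selected = []
    while l_min <= l_max:
        candidates = [i for i, g in enumerate(group_idx) if l_min <= g <= l_max]
        best = min(values[i] for i in candidates)
        # prefer the largest measure among ties, as required by (5)
        j = max((i for i in candidates if values[i] == best),
                key=lambda i: group_idx[i])
        selected.append(j)
        l_min = group_idx[j] + 1   # Step 2: discard all smaller groups
    return selected
```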

Definition 3

(Enhancing the local search)

  • Step 1 At each iteration k, evaluate the Euclidean distance from the current minimum point \((\mathbf {x}^\mathrm{min})\) to other sampled points:

    $$\begin{aligned} d(\mathbf {x}^\mathrm{min},\mathbf {c}^i) = \sqrt{\sum _{j=1}^n (x^\mathrm{min}_j - c^i_j)^2} \end{aligned}$$
    (6)
  • Step 2 Apply the procedure described in Definition 2 in (5) using distances \(d(\mathbf {x}^\mathrm{min},\mathbf {c}^i)\) instead of objective function values.

A geometrical interpretation of the selection of potentially optimal hyper-rectangles using the locally enhanced strategy is shown on the right-hand side of Fig. 1. By this strategy, we extend the number of potentially optimal hyper-rectangles located close to the current minimum point \((\mathbf {x}^\mathrm{min})\). Moreover, by this strategy, we select the closest hyper-rectangles across various measures.
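Under the same assumptions, Definition 3 reuses the select_global sketch above verbatim, only feeding it the distances (6) instead of the objective values:

```python
import numpy as np

def select_local(centers, x_min, group_idx):
    """Definition 3 sketch: Definition 2's sweep over the distances d(x_min, c^i)."""
    d = [float(np.linalg.norm(np.asarray(c) - np.asarray(x_min))) for c in centers]
    return select_global(d, group_idx)
```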

Fig. 1 Geometric interpretation of the selection of potentially optimal hyper-rectangles using DIRECT (left), DIRECT-G (middle), and the locally enhanced strategy (right) on the Shekel 5 test problem at the fifth iteration of the corresponding algorithms/strategies

4 DIRECT-GL algorithm

In this section, we introduce a new DIRECT-type algorithm, which we call DIRECT-GL. The key feature of DIRECT-GL is that it performs the identification of potentially optimal hyper-rectangles twice in every iteration. First, using Definition 2, the globally enhanced set of potentially optimal candidates is determined and fully processed (sampled and partitioned). Second, using Definition 3, the locally enhanced set is identified and fully processed (sampled and partitioned) again. Thus, our new approach is based on the “Divide the best” strategy [28] and has the everywhere-dense type of convergence (like other DIRECT-type algorithms [4, 9, 18, 19, 29]). This follows from the fact that, using Definitions 2 and 3, DIRECT-GL always selects for partitioning hyper-rectangles from the group \((l^\mathrm{max}_k)\) with the largest measure \(\delta _k^\mathrm{max}\). Since each group contains only a finite number of hyper-rectangles, after a sufficient number of iterations all of them will be partitioned. The procedure is then repeated with the new group of largest hyper-rectangles, and so on, until the largest hyper-rectangles have a measure smaller than the required tolerance \(\varepsilon \).

The complete description of the DIRECT-GL algorithm is shown in Algorithm 1. The input for the algorithm is one (or more) stopping criteria: the required tolerance (\(\varepsilon _\mathrm{pe}\)), the maximal number of function evaluations (\(\texttt {M}_\mathrm{max}\)), and the maximal number of DIRECT-GL iterations (\(\texttt {K}_\mathrm{max}\)). After termination, DIRECT-GL returns the found objective value \(f_{\min }\) and the solution point \(\mathbf {x}^{\min }\), together with algorithmic performance measures: the final tolerance, i.e., the percent error (pe), the number of function evaluations (m), and the number of iterations (k).

Algorithm 1 The DIRECT-GL algorithm (pseudocode)
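Since Algorithm 1 appears only as a figure, the overall control flow can be summarized in the following schematic Python sketch. The helpers init_partition and trisect, as well as the Partition attributes, are hypothetical stand-ins for the sampling, subdivision, and bookkeeping steps; this is a sketch of the loop structure, not the authors' MATLAB implementation:

```python
def direct_gl(f, bounds, eps_pe=1e-4, m_max=10**6, k_max=10**4, f_star=None):
    """Schematic DIRECT-GL main loop (cf. Algorithm 1); helper names are ours."""
    part = init_partition(f, bounds)      # hypothetical: sample the center of D
    k = 0
    while k < k_max and part.m < m_max:
        # Phase 1: globally enhanced selection (Definition 2), then subdivide
        for j in select_global(part.f_vals, part.group_idx):
            trisect(part, j, f)           # hypothetical subdivision + sampling
        # Phase 2: locally enhanced selection (Definition 3), then subdivide
        for j in select_local(part.centers, part.x_min, part.group_idx):
            trisect(part, j, f)
        k += 1
        # optional stop on the percent error (7) when f* is known
        if f_star is not None and percent_error(part.f_min, f_star) < eps_pe:
            break
    return part.f_min, part.x_min, k
```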

5 Numerical investigation

The introduced DIRECT-G and DIRECT-GL, as well as the original DIRECT algorithm (Finkel's implementation [3]), were implemented in the MATLAB programming language. Note that for the DIRECT algorithm potentially optimal hyper-rectangles can be identified in at least two different ways: using the modified Graham's scan algorithm [2] or the rule described in Lemma 2.3 of [5]. Usually this does not make a significant difference, but occasionally it can, e.g., when higher precision is required. The selection procedure of potentially optimal hyper-rectangles in DIRECT-GL differs significantly; however, compared with the procedure used in DIRECT, this makes no notable difference to the overall performance. This means that, for the identification of the same quantity of potentially optimal hyper-rectangles, DIRECT and DIRECT-GL spend a similar amount of time.

Table 1 Key characteristics of the Hedar test problems

We compare the efficiency of the algorithms on the Hedar test set [7], which consists of 27 global optimization test functions. Some of the test problems have several variants (e.g., Bohachevsky, Hartman, Shekel), and some can be tested for different dimensionality. In Table 1 we report the main features of these problems: problem number (No.), name, dimensionality (n), feasible region (D), the number of local minima (if known), and the known minimum (\( f^* \)). Whenever the global minimum point of a problem lies at an initial sampling point of any tested algorithm, the feasible region was modified (increased). These modified problems are marked with a star sign (*).

Table 2 Number of function evaluations using DIRECT, DIRECT-G and DIRECT-GL algorithms solving Hedar test problems

Note that most of the test problems from the Hedar test set are multimodal and are therefore suitable for investigating how the introduced modifications help to overcome the first weakness. Since the global minima \( f^* \) of all Hedar test problems are known in advance, the investigated algorithms were stopped either when a point \( \bar{\mathbf {x}} \) was generated such that the percent error

$$\begin{aligned} pe = 100 \% \times \begin{cases} \frac{f(\bar{\mathbf {x}}) - f^*}{|f^*|}, & f^* \ne 0, \\ f(\bar{\mathbf {x}}), & f^* = 0, \end{cases} \end{aligned}$$
(7)

is smaller than the tolerance value \(\varepsilon _\mathrm{pe}\), or when the number of function evaluations exceeded the prescribed limit of \(10^6\). In our investigation, four different values of \(\varepsilon _\mathrm{pe}\) were considered: \(10^{-2}\), \(10^{-4}\), \(10^{-6}\), and \(10^{-8}\). By doing this, we investigate the algorithms' ability to avoid the second weakness. The comparison is based on the number of function evaluations, and the best (smallest) number for each problem is shown in bold font.
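A direct transcription of the stopping rule (7), under our naming (f_best is the best value found so far, f_star the known minimum \(f^*\)):

```python
def percent_error(f_best, f_star):
    """Percent error (7) of the best found value relative to the known optimum."""
    if f_star != 0:
        return 100.0 * (f_best - f_star) / abs(f_star)
    return 100.0 * f_best
```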

The results of the experiments are given in Table 2. First, observe that DIRECT-G and DIRECT-GL perform on average much better than DIRECT (see the Overall row in Table 2). This is especially evident when a lower percent error (pe), i.e., higher accuracy, is sought. Observe that the original DIRECT performs better on average only on simpler (unimodal) test problems (see the Unimodal row in Table 2). This is mainly because the set of potentially optimal hyper-rectangles per iteration is larger in DIRECT-G and DIRECT-GL; consequently, a greater number of function evaluations is needed.

For small dimensional problems (see the \(\mathbf {n \le 3}\) row in Table 2), DIRECT requires on average from 4.5 times (when \(\varepsilon _\mathrm{pe} = 10^{-2}\)) to 175 times (when \(\varepsilon _\mathrm{pe} = 10^{-8}\)) more function evaluations than DIRECT-GL. Observe that DIRECT-G performed worst when \(\varepsilon _\mathrm{pe} = 10^{-2}\) and \(\varepsilon _\mathrm{pe} = 10^{-4}\) were used. Again, for most of these problems DIRECT was able to converge after a small number of iterations. Therefore, extending the set of potentially optimal hyper-rectangles only globally (DIRECT-G) is not very efficient for low-dimensional problems. However, when \(\varepsilon _\mathrm{pe} = 10^{-6}\) and \(\varepsilon _\mathrm{pe} = 10^{-8}\) were used, DIRECT-G performed significantly better than DIRECT.

For higher dimensional (see the \(\mathbf {n \ge 4}\) row in Table 2) and multimodal problems (see the Multimodal row in Table 2), both introduced versions performed significantly better than DIRECT, and the best results were obtained using DIRECT-GL. Finally, DIRECT failed in total (see the Failed row in Table 2) in \(30.1\%\) (65/216) of the cases, mostly when a lower percent error tolerance was required (\(10^{-6}\) and \(10^{-8}\)) and the optimization problems were more challenging. Meanwhile, DIRECT-G and DIRECT-GL failed in total in \(18.1\%\) (39/216) and \(9.2\%\) (20/216) of the cases, respectively.

6 Concluding remarks

In this paper, we introduced a new strategy for selecting an extended set of potentially optimal hyper-rectangles in the DIRECT-type algorithmic framework. The proposed approach addresses two well-known weaknesses of DIRECT-type algorithms. The experimental results confirmed the well-known fact that DIRECT performs well on simpler problems, while on more challenging (higher dimensional) and multimodal problems the proposed DIRECT-GL performs significantly faster. Moreover, since the set of potentially optimal hyper-rectangles is larger (compared to DIRECT), the DIRECT-GL scheme also looks promising for more efficient parallelization.