1 Introduction

Context and motivation. In landscape analysis, the search space is regarded as an object having spatial and structural properties which are believed to characterize the intrinsic difficulties underlying the solving of an optimization problem and, hence, the dynamics and performance of recognized optimization techniques [16]. This aim may be achieved in an incremental manner: (i) gaining fundamental understanding of the information provided by the landscape structure, (ii) using this information to catalyze the good practice and design of search techniques, and (iii) developing out-of-the-box effective heuristics, for example, by contributing to the foundation of landscape-aware selection methods based on statistical and machine learning prediction models [11]. In this paper, we make a step towards leveraging the so-called local optima network model, with the primary goal of providing empirical evidence on its relevance and usefulness for multi-objective combinatorial optimization.

Overview of (single-objective) local optima network. Given a combinatorial optimization problem, it is widely accepted that understanding the underlying (landscape) properties of local optimal solutions is of high importance. This is especially because heuristics are essentially navigating through the landscape in search of high-quality local optima that are hopefully as close as possible to the global one. A major issue, however, comes from the curse of dimensionality, which makes it difficult to define metrics and extract statistics that are meaningful and easy to interpret. In single-objective optimization, the local optima network (LON) is a relatively new model [13, 14] that adapts the notion of the inherent network of energy surfaces in chemical physics [6] to compress the information given by the whole search space into a smaller and synthetic mathematical object described as a graph. The local optima are thereby the nodes of the graph and there is an edge or a transition (arc) between two nodes if the search process is potentially able to jump from one local optima (or basin of attraction) to another one. Variants of the definition of edges and nodes exist that capture different facets of search algorithms, however, these variants do not alter the primary purpose behind the analysis of LONs. The LON model not only aims at capturing in detail the number and distribution of local optima in the search space, but it also aims at enabling the definition of novel features having interesting correlation with the behavior of local search heuristics [4, 14].

Overview of related work in multi-objective optimization. To the best of our knowledge, there are no investigations on defining and studying LONs in multi-objective combinatorial optimization, although local optimality also constitutes a central issue in the multi-objective case. For instance, the distribution and the connection of local optima induced by a scalarizing (single-objective) function is studied in [1, 9], which is informative when multiple scalarizations of the objectives are considered. In [15], local optimality under Pareto dominance is defined, which is then related to the convergence of Pareto local search. Such a definition of Pareto local optimal solutions (PLOS) was later used in a number of studies. For example, it is shown in [18] how the characteristics of a problem instance can be related to the number of PLOS. Alternative definitions of local optima in terms of sets of nondominated solutions are also being investigated in relation to different multi-objective search paradigms [12]. A definition of local optimal sets for continuous multi-objective landscapes can also be found in [8]. In fact, although a multi-objective search heuristic aims at providing a whole set of solutions approximating the Pareto set, information about PLOS is found to influence the global search performance, especially when considering algorithms using local search and Pareto dominance as core components [3].

Paper contributions. The work described in this paper aims at pushing further the understanding of the structure of PLOS, eventually leading to new insights into what makes a multi-objective problem instance difficult and what makes a particular multi-objective search algorithm effective. More specifically, our contribution can be summarized following two lines:

  • We introduce a model, inspired by the single-objective LON, that describes the network of Pareto local optimal solutions (PLOS-net) for multi-objective optimization. Using a comprehensive set of \({\rho }mnk\)-landscapes, we conduct a preliminary visual inspection and comparison of the corresponding PLOS-nets, showing how such networks enlighten the nature of the multi-objective landscape where the search is expected to operate.

  • We further analyze the proposed PLOS-nets by introducing features from network analysis. Based on a comprehensive statistical analysis, the predictive importance of the newly-defined PLOS-net features is then studied with respect to the performance of pure Pareto local search [15] and of a global evolutionary multi-objective optimizer [10]. Our analysis indicates that PLOS, and more critically their connections in the PLOS-net, have the largest influence on the runtime of PLS, even larger than the number of objectives and their conflicting nature.

Outline. The rest of this paper is organized as follows. In Sect. 2, we recall some definitions for multi-objective optimization, and we describe the \({\rho }mnk\)-landscapes that will be later used in our empirical investigations. In Sect. 3, we define the notion of PLOS-net for multi-objective optimization, and we conduct a visual inspection of selected PLOS-nets. We also define features extracted from the PLOS-net, and we discuss how they relate with search performance. In Sect. 4, we study the effect and importance of PLOS-net features on algorithm performance. In Sect. 5, we conclude the paper and discuss open issues.

2 Multi-objective Optimization and \(\rho \) \(mnk\)-Landscapes

We assume that we are given a (black-box) optimization problem characterized by a set of feasible solutions X (the decision space) and an objective function , to be maximized. We denote by the image of X in the objective space. Given two solutions \(x, x^\prime \in X\), x is said to dominate \(x^\prime \) (\(x \prec x^\prime \)) iff \(f_i(x^\prime ) \leqslant f_i(x)\) for all \(i \in \{1,\cdots ,m\}\) and \(f_i(x^\prime ) < f_i(x)\) for at least one \(i \in \{1,\cdots ,m\}\). The set \(X^\star \subseteq X\) for which there exists no solution \(x \in X\) such that \(f(x) \prec f(x^\star )\) for all \(x^\star \in X^\star \), is the Pareto set and its image \(Z^\star = f(X^\star )\) in the objective space is the Pareto front.

We consider \({\rho }mnk\)-landscapes as a problem-independent model used for multi-objective multi-modal landscapes with objective correlation [18]. Solutions are binary strings of size n, i.e., \(X= \{0,1\}^n\). The objective vector \(f=(f_1, f_2, \cdots , f_m)\) is defined as \(f:\lbrace 0, 1 \rbrace ^{n} \mapsto [0,1]^m\) such that each objective function \(f_i\) is a pseudo-boolean function to be maximized. The value \(f_i(x)\) of a solution \(x=(x_1, x_2, \ldots , x_n)\) is the average value of the contributions associated with each variable \(x_j\). Given objective \(f_i\) and variable \(x_j\), a component function \(f_{ij}:\lbrace 0, 1 \rbrace ^{k+1} \mapsto [0,1]\) assigns a real-valued contribution to every combination of \(x_j\) and its k epistatic interactions \(\{x_{j_1}, \ldots , x_{j_k}\}\). For every \(i \in \{1, \ldots , m\}\), the objective function is then defined as \(f_i(x) = \frac{1}{n} \sum _{j=1}^{n} f_{ij}(x_j, x_{j_1}, \ldots , x_{j_k})\). The epistatic interactions, i.e., the k variables that influence the contribution of \(x_j\), are set uniformly at random among the \((n-1)\) variables other than \(x_j\) [7]. By increasing the number of epistatic interactions k from 0 to \((n-1)\), problem instances can be gradually tuned from smooth to rugged. In \({\rho }mnk\)-landscapes, \(f_{ij}\)-values follow a multivariate uniform distribution of dimension m, defined by an \(m \times m\) positive-definite symmetric covariance matrix \((c_{pq})\) such that \(c_{pp} = 1\) and \(c_{pq}=\rho \) for all \(p,q \in \{1, \ldots , m\}\) with \(p \not = q\), where \(\rho > \frac{-1}{m-1}\) defines the correlation among the objectives [18]. The positive (respectively, negative) objectives correlation \(\rho \) decreases (respectively, increases) the degree of conflict between the different objective function values. We use the same correlation coefficient, and the same epistatic degree and interactions for all objectives.

We generate 520 \({\rho }mnk\)-landscapes as follows. The problem size is set to \(n=16\); the problem non-linearity to \(k \in \{0, 1, 2, 4\}\), from linear to rugged landscapes; the number of objectives to \(m \in \{2,3\}\); and the objective correlation to \(\rho \in \{-0.7, -0.4, -0.2, 0, 0.2, 0.4, 0.7\}\) subject to \(\rho > \frac{-1}{m-1}\), from conflicting to correlated objectives. We generate 10 instances independently at random for each parameter combination.

3 Pareto Local Optimal Solutions Network

3.1 Definition and Visual Inspection of PLOS-net

The Pareto local optimal solutions network (PLOS-net) proposed in this paper can be constructed for a given optimization problem (Xf) and neighborhood relation \(\mathcal N:X\mapsto 2^X\). For \({\rho }mnk\)-landscapes, the neighborhood relation is the 1-bit-flip operator: two solutions are neighbors if the Hamming distance between them is one. Let us first define Pareto local optimal solution (PLOS) [15].

Definition 1

A solution \(x \in X\) is a Pareto local optimal solution if it is not dominated by any of its neighbors: \(\forall x^\prime \in \mathcal N(x)\), \(\lnot (x^\prime \prec x)\).

For \(m=1\), this is equivalent to the conventional definition of a single-objective local optimal solution. Based on PLOS, we then define a PLOS-net as follows.

Definition 2

A Pareto local optimal solutions network (PLOS-net) is a (undirected unweighted simple) graph \(G = (N, E)\), such that the set of vertices N are the Pareto local optimal solutions, and there is an edge \(e_{ij} \in E\) between two nodes \(x^i\) and \(x^j\) iff \(x^i \in \mathcal N(x^j)\) or \(x^j \in \mathcal N(x^i)\).

Two solutions connected by an edge in the PLOS-net are necessarily mutually nondominated. Moreover, the Pareto (global) optimal solutions (POS) are particular nodes of the PLOS-net.

A visual inspection of the so-defined PLOS-nets is given in Fig. 1 for some selected landscapes. The different PLOS-nets are extracted by full enumeration. The two first rows are for a (fixed) number of objectives (\(m=2\)), while the last two are for \(m=3\). In the first and third rows, the degree of objective correlation \(\rho \) varies for a fixed value of \(k=1\), whereas in the second and fourth rows, the value of k varies for a fixed value of \(\rho =0.0\). The x– and y–axes correspond to the first two objectives \((f_1, f_2)\) and the scale is intentionally different for the different instances for a better visualization. For \(m=3\), we see a 2D-projection while the color intensity depicts \(f_3\)–values.

By mapping the PLOS-nets into the objective space, we intend to illustrate the shape of the network while providing a first hint on the impact of instance parameters on the distribution of PLOS. As somewhat expected, the objective correlation \(\rho \) impacts the shape and region spanned by the Pareto front; e.g., nodes of the PLOS-net at the upper objective space limit for \(m=2\). More interestingly, the number of nodes in the PLOS-nets increases when the number of objectives m, when the degree of conflict between the objectives \(-\rho \), or when the problem non-linearity k increase. A similar trend for the number of connections between nodes can be observed. However, the increase in the PLOS-net density seems to be proportionally lower than the number of PLOS.

Fig. 1.
figure 1

Exemplary PLOS-nets. For \(m=3\), a two-dimensional projection is displayed, the darker the node color, the higher the \(f_3-\)value. Notice the different scales of axes.

This visual projection of PLOS-nets in the objective space is certainly not sufficient to elicit the structure and complexity of the underlying graphs in a comprehensive manner. For instance, at first sight, PLOS-nets might look fully connected. As it will be highlighted later in our analysis, this is definitely not true in general for the considered instances. Interestingly, the connectedness between PLOS is one critically important aspect for multi-objective local search. A closer investigation of the PLOS-net model will enable us to fully capture such an aspect, among others, in a very natural manner. Because of its roots in graph theory and complex networks [17], the PLOS-net model allows us to define informative metrics and statistics with respect to the structure of PLOS, as detailed in the following.

3.2 Definition of PLOS-net Features

Looking at the PLOS-net as a mathematical object, we propose to define and analyze a number of graph-based features inspired by previous studies from single-objective LON analysis [4, 13, 14] but taking into account the Pareto dominance relation when necessary, and hence accommodating the multi-objective nature of the considered problems.

The first two considered features give a general idea on the node degrees, that is the number of edges leaving a PLOS and connecting it to other PLOS.

  • node_prop: The number of nodes in the network, i.e., the number of PLOS, proportional to the search space size.

  • degree_avg: The average degree of a node, proportional to the number of nodes. This metric is equivalent to the density of edges, that is, the number of edges proportional to the maximum number of edges in a complete graph.

The next two features are related to the connectedness between PLOS. We argue that connectedness affects whether a local search with exhaustive neighborhood exploration will be able (or not) to find all PLOS.

  • comp_prop: The number of connected components in the graph, proportional to the number of nodes. A connected component is a maximal subgraph in which there exist a path between any pairs of nodes.

  • comp_max_size: The size of the largest connected components, proportional to the number of nodes.

The next three features deal with the paths that connect PLOS, either among them or to PLOS that are also Pareto (global) optimal solutions (POS). This is mostly intended to provide an information about how fast navigating between PLOS can lead to the Pareto front.

  • path_length: The average path length between any pair of nodes, proportional to the maximum distance between pairs of solutions. When the graph is not connected, only existing paths are considered.

  • path_pos_exist: The average number of nodes that are connected with (at least) one POS in the graph, proportional to the number of nodes.

  • path_length_pos: The average path length between nodes and their closest POS in the graph, proportional to the maximum distance between pairs of solutions. Nodes not connected to any POS are not taken into account.

The last three features describe the similarity of PLOS and are intended to capture how the PLOS may be clustered together, eventually inferring some preferred connections as observed in network communities.

  • assort_degree: The tendency of nodes to be connected to nodes with a similar degree.

  • assort_pos: The tendency of local (resp. global) optimal nodes to be connected to local (resp. global) optimal nodes.

  • assort_rank: The tendency of nodes to be connected to nodes with a similar rank. The rank of nodes is here defined in terms of nondominated sorting within the whole search space (dominance depth) [5].

Fig. 2.
figure 2

Distribution of feature-values with respect to instance parameters.

3.3 Exploratory Analysis

Based on the 10 aforementioned features, we are able to provide a more detailed analysis on the effect of instance parameters on the structure of the PLOS-net, as reported in Fig. 2. Confirming our visual inspection, one can clearly notice the positive correlation of the number of PLOS with \(-\rho \), m and k. This correlation is actually reverted when looking at the density of the graph degree_avg. This indicates that not only the number of PLOS increases with the intrinsic difficulty of an instance, but PLOS become more sparsely connected. A similar situation occurs when looking at the size of the induced connected components. For instances with a low degree of non-linearity (\(k=1\)), the proportional size of the largest connected component comp_max_size is 1 in most cases, meaning that all PLOS are connected. This is confirmed by the path_pos_exist feature, measuring how many PLOS are connected to (global) POS, which is also found to be close to 1 for \(k=1\). However, with the exception of 3-objective instances with highly conflicting objectives (\(\rho < 0\)), the PLOS-nets are clearly disconnected (path_pos_exist) and PLOS are relatively far from the Pareto set (path_length_pos). Surprisingly, the degree of connectedness seems to increase substantially for the highest value of \(k=4\). Finally, we can remark that the tendency of a node to be connected with nodes having the same degree (assert_degree) is mainly impacted by the degree of non-linearity k, while the other parameters \(\rho \) and m have only a marginal effect. This trend is roughly the same when looking at how the PLOS are mutually connected as a function of their nondominated rank (assort_rank). To summarize, we found that the connectedness of the PLOS-net is highly related to instance parameters, especially to the degree of non-linearity k, which is obviously related to search difficulty. In the next section, we go deeper in the analysis by providing a comprehensive statistical study of the predictive power of the features with respect to two conventional Pareto-based multi-objective search algorithms.

4 PLOS-net Features vs. Search Performance

4.1 Algorithms and Search Performance

In the following, we consider the relative impact of PLOS-net features on both Pareto Local Search (PLS) [15] and the Global Simple Evolutionary Multi-objective Optimizer (G-SEMO) [10]. For PLS, we are interested in the total number of evaluations performed by the algorithm before falling into a maximal Pareto local optimum set [15]. For G-SEMO, the stopping condition is arbitrarily set to \(10^4\) solutions evaluated. For both algorithms, we are interested in the quality of the approximation set, measured in terms of the Pareto front resolution, i.e., the proportion of nondominated solutions identified. We perform 30 independent runs of each algorithm per instance.

The expected number of evaluations performed by PLS is reported in Fig. 3. The expected Pareto front resolution obtained by PLS and G-SEMO is reported in Fig. 4. These two figures illustrate how strong is the effect of the intrinsic instance parameters on search performance. Notice in particular the linear slope of the PLS runtime as a function of \(\rho \), and the strong similar effect of k on the accuracy of PLS and G-SEMO, independently of \(\rho \) and m.

Fig. 3.
figure 3

Expected number of evaluations performed by PLS.

Fig. 4.
figure 4

Expected Pareto front resolution obtained by PLS and G-SEMO.

4.2 Effect of PLOS-net Features on Search Performance

The correlation of instance parameters (\(\rho \), m, k) and PLOS-net features with search performance is reported in Fig. 5. First, both the number of objectives m and their correlation degree \(\rho \) are highly correlated with the runtime of PLS and very little with search quality, while the opposite holds for the problem non-linearity k. Second, interesting correlations can clearly be observed for PLOS-net features. On the one hand, the number of connected components as well as the density of the network are correlated both to runtime and quality. We argue that this is due to the fact that local search is basically exploring the connected components in a pseudo-exhaustive manner. Hence, the performance of the considered algorithms is naturally correlated to how local optimal solutions are connected together. On the other hand, the features related to path length and node connection similarity have the lowest correlation with the runtime of PLS. However, this is no more true when examining the search resolution where the correlation of such features to the Pareto front resolution obtained by PLS and G-SEMO is consistently similar and more pronounced. In fact, the distance between local and global optima, as captured by path_lenght_pos, is a natural outcome to understand how likely a Pareto local search can find its way to Pareto optimal solutions when starting from a local optima and navigating throughout the PLOS-net under Pareto dominance.

Fig. 5.
figure 5

Spearman’s rank correlation coefficient between parameters/feature-values and the expected number of evaluations performed by PLS (left), the expected Pareto front resolution obtained by PLS (middle) and by G-SEMO (right).

4.3 Importance of PLOS-net Features on Search Performance

In this last section, we build a machine learning regression model to predict search performance based on three sets of input variables: (i) instance parameters only (i.e. \(\rho \), m, k), (ii) features from the network only (cf. Sect. 3.2), and (iii) the combination of the two. Given the non-linearity observed on the data, we decide to use random forest as a regression model [2], which also allows us to calculate the relative importance of input variables on the quality of the model. The proportion of variance explained by the random forest regression model for different input variables and search performance measures are reported in Table 1. As we can see, the addition of PLOS-net features as input variables largely improve the regression accuracy, and consequently the predictive power of the regression model. Additionally, the importance of model’s input variables [2] are depicted in Fig. 6. For a given search performance measure, the instance parameters and PLOS-net features are sorted in the decreasing order of importance, from top to bottom. Again, the PLOS-net features, especially those related to connectedness of PLOS, appear to have a high importance. By contrast, for the search quality measure, the number of objectives m and the objective correlation \(\rho \) has a relatively low importance, whereas the degree of non-linearity k has a relatively high importance. Finally, the number of PLOS, although being relatively important for the regression model, is less important than other PLOS-net features such as the structure of connected components.

Table 1. Variance explained by the regression model for different input variables.
Fig. 6.
figure 6

Relative importance (mean decrease in accuracy) of input variables (instance parameters and PLOS-net features) on the random forest regression model.

5 Conclusions

In this paper, we introduced the Pareto local optimal solutions network model as an alternative to capture the structure of multi-objective landscapes. We also investigated its relation with the performance of multi-objective search algorithms. Our work is to be viewed as the first step towards more systematic investigations on the accuracy of PLOS-nets as a powerful fundamental tool for enhancing our understanding and our practice of multi-objective optimization. In fact, several questions are left open. For example, it would be interesting to test whether our conclusions still hold for other state-of-the-art algorithms and standard multi-objective combinatorial optimization problems. Studying the scalability of the PLOS-net model for large-size problem instances is a challenging issue that should allow us to consider more practical prediction scenarios. It is for instance still unclear how to estimate the proportion of nodes in the PLOS-net and their relative connectedness. Considering adaptation of a simple solution-based Pareto adaptive walk [3] for this purpose can be a promising sampling methodology, which is worth investigating in the future. It would also be interesting to investigate how leveraging PLOS-net features could help improving the performance of the multi-objective optimization algorithms.