1 Introduction

Optimization [1] is a process concerned with exploring the best solution with respect to some performance criteria. These criteria are referred to as objective functions that measure solution quality for a target problem. The number of objectives determines the nature of the problem. A large body of optimization research focuses on problems with only one objective, i.e. single-objective optimization. However, the majority of real-world applications actually come with more than one objective. Such problems are categorized as multi-objective optimization problems (MOPs) [2]. Further categorization is possible: when the number of objectives is exactly two, the problem is a bi-objective optimization problem, and when the count exceeds three, the MOPs are denoted as many-objective optimization problems [3, 4].

The main challenge of having multiple objectives is that they are likely to be conflicting: improving one objective can degrade the quality of the remaining objectives. This leads to solution quality evaluation based on various performance indicators utilizing all the objectives. R2 [5, 6], Hypervolume (HV) [7], Generational Distance (GD) [8], Inverted/Inverse GD (IGD) [9], IGD\(+\) [10], Spread [11], and Epsilon [12] are well-known examples of such performance indicators. These indicators are mostly linked to Pareto fronts (PFs) where multiple solutions are maintained. A PF consists of the non-dominated solutions, i.e. those for which no other solution is at least as good in all objectives and strictly better in at least one. In that respect, the algorithms developed for MOPs mostly operate on populations of solutions, i.e. they are population-based algorithms. Multi-objective Evolutionary Algorithms (MOEAs) [13, 14] take the lead in that domain. Non-dominated Sorting Genetic Algorithm II (NSGA-II) [11, 15], Pareto Archived Evolution Strategy (PAES) [16], Strength Pareto Evolutionary Algorithm 2 (SPEA2) [17], Pareto Envelope-based Selection Algorithm II (PESA-II) [18] and MOEA based on Decomposition (MOEA/D) [19] are some examples from the literature. There are other population-based algorithms besides MOEAs, built on meta-heuristics such as Particle Swarm Optimization (MOPSO) [20] and Ant Colony Optimization [21]. It is also possible to see their hybridized variants [22,23,24].
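To make the dominance relation concrete, the following is a minimal sketch (in Python, assuming minimization of all objectives) of how a non-dominated set can be filtered out of a population; the points and the helper name `non_dominated` are illustrative rather than part of any cited algorithm.

```python
import numpy as np

def non_dominated(F):
    """Return a boolean mask of the non-dominated rows of F (minimization).

    A solution is dominated if another solution is at least as good in
    every objective and strictly better in at least one.
    """
    n = F.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # j dominates i  <=>  F[j] <= F[i] everywhere and F[j] < F[i] somewhere
        dominated = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        if dominated.any():
            mask[i] = False
    return mask

# Example: three 2-objective points; (1, 3) and (2, 1) form the Pareto front
F = np.array([[1.0, 3.0], [2.0, 1.0], [2.5, 3.5]])
print(non_dominated(F))  # [ True  True False]
```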

Despite these immense algorithm development efforts, it is unlikely that a truly best, i.e. always first-ranked, algorithm will emerge on the existing benchmark scenarios under fair experimental conditions. This practical observation is further supported theoretically by the No Free Lunch (NFL) theorem [25]. This study focuses on automatically determining the algorithm to be applied to each given MOP instance, through Algorithm Selection (AS) [26]. AS is a meta-algorithmic approach offering improved performance through selection. The idea is to automatically choose algorithms for given problem solving scenarios. The selection operations are carried out on a given algorithm set [27] consisting of the candidate methods to be picked. The traditional way of approaching AS is in the form of performance prediction models. In that respect, a suite of features is needed to characterize the target problem instances. These features are matched with the performance of the candidate algorithms on a group of training instances. While the use of human-engineered features is common for AS, Deep Learning (DL) has also been used for automatically extracting features [28].

AS has been applied to a variety of problem domains such as Boolean Satisfiability (SAT) [29], Constraint Satisfaction (CSP) [30], Blackbox Optimization [31], Nurse Rostering (NRP) [32], Graph Coloring (GCP) [33], Traveling Salesman (TSP) [34], Traveling Thief Problem (TTP) [35], and Game Playing [36]. The AS library (ASlib) [37] provides diverse and comprehensive problem sets for AS. There have been efforts to develop new AS systems for addressing these problems. SATzilla [29] is a well-known AS method, particularly popularized due to its success in the SAT competitions. Hydra [38] is an example aiming at constructing algorithm sets, a.k.a. Algorithm Portfolios [27], via configuring the given algorithms. The portfolio building task has been studied for different selection tasks [39,40,41,42]. 3S [43] delivers algorithm schedules, assigning runtime to the algorithms for each given problem instance. Unlike these AS level contributions, Autofolio [44] takes the search to a higher level by seeking the best AS setting across varying components and parameter configurations. As another high-level approach, AS has been used for performing per-instance selection across Selection Hyper-heuristics (SHHs) [45].

The present study performs AS to identify suitable algorithms for each given MOP instance. To be specific, the problem targeted here is the Large-scale MOP (LSMOP), where the number of decision variables can reach into the thousands. The instance set is based on 9 LSMOP benchmarks. Those base benchmarks are varied w.r.t. the number of objectives, i.e. 2 or 3, and the number of decision variables, which ranges between 46 and 1006, leading to 63 LSMOP instances. The task is to perform per-instance AS using an existing AS system named ALORS [46], among 4 candidate population-based algorithms. Hypervolume (HV) is used as the performance indicator. The experimental analysis illustrates that AS with only 4 basic features outperforms the constituent multi-objective algorithms.

In the remainder of the paper, Sect. 2 discusses the use of AS. An empirical analysis is reported in Sect. 3. Section 4 provides the concluding remarks and discusses future research ideas.

2 Method

ALORS [46] approaches the selection task as a recommender system (RS), specifically using Collaborative Filtering (CF) [47]. Unlike the existing AS systems, ALORS is able to operate on sparse/incomplete performance data, M, while maintaining performance comparable to that achieved with complete data. The performance data comes from running a set of algorithms, A, on a group of instances, I. Thus, the performance data is a matrix \(M_{|I| \times |A|}\). Decreasing the cost of generating such sparse data has been further targeted in [48, 49]. While the entries of the performance data vary from problem to problem, ALORS generalizes them by using rank data, \(\mathcal {M}\). Thus, any given performance data is first converted into rank data. Unlike the traditional AS systems, ALORS builds a prediction model with an intermediate feature-to-feature mapping step, instead of providing a direct rank prediction. The initial, hand-picked/designed features are mapped to a set of latent (hidden) features. These latent features are extracted directly from the rank performance data by using Singular Value Decomposition (SVD) [50]. SVD is a well-known Matrix Factorization (MF) strategy, used in various CF based RS applications [51]. SVD returns two matrices, U and V, besides a diagonal matrix accommodating the singular values, as \(\mathcal{M} = U \varSigma V^t \). U represents the rows of \(\mathcal {M}\), i.e. the instances, while V represents its columns, i.e. the algorithms, similarly to [52, 53]. Beyond representing those data elements, the idea is to reduce the dimensions to \(r \le min(|I|,|A|)\), hopefully eliminating the possible noise in \(\mathcal {M}\):

$$\begin{aligned} \mathcal{M} \approx U_r \varSigma _r V^t_r \end{aligned}$$
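The following sketch illustrates this factorization step on a small, made-up rank matrix using numpy; the matrix values and the choice of \(r\) are purely illustrative.

```python
import numpy as np

# Illustrative rank matrix M (instances x algorithms); values are made up.
# Rank 1 = best algorithm on that instance.
M = np.array([[1, 3, 2, 4],
              [2, 1, 4, 3],
              [1, 2, 3, 4],
              [3, 1, 2, 4]], dtype=float)

U, s, Vt = np.linalg.svd(M, full_matrices=False)

r = 2                                   # number of latent features, r <= min(|I|, |A|)
U_r, S_r, Vt_r = U[:, :r], np.diag(s[:r]), Vt[:r, :]

M_approx = U_r @ S_r @ Vt_r             # rank-r approximation of the rank data
print(np.round(M_approx, 2))
```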

ALORS maps a given initial set of instance features, F, to \(U_r\). The predicted performance ranks are calculated by multiplying \(U_r\) with the remaining matrices, \(\varSigma _r\) and \(V^t_r\). In that respect, for a new problem instance, ALORS essentially determines an array of values, i.e. a new row of \(U_r\). Its multiplication with \(\varSigma _r\) and \(V^t_r\) delivers the expected performance ranks of the candidate algorithms on this new problem instance.
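A minimal end-to-end sketch of this prediction scheme is given below. It assumes a random-forest regressor as the feature-to-latent mapping model and made-up feature/rank values; both are illustrative choices rather than the exact ALORS configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Rank data and its truncated SVD (same illustrative matrix as the sketch above).
M = np.array([[1, 3, 2, 4], [2, 1, 4, 3], [1, 2, 3, 4], [3, 1, 2, 4]], dtype=float)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
r = 2
U_r, S_r, Vt_r = U[:, :r], np.diag(s[:r]), Vt[:r, :]

# Hand-picked instance features (illustrative: number of objectives, variables).
F_train = np.array([[2, 46], [2, 106], [3, 212], [3, 112]], dtype=float)

# Feature-to-latent-feature mapping; a random forest is one possible regressor.
mapper = RandomForestRegressor(n_estimators=100, random_state=0)
mapper.fit(F_train, U_r)

# New instance: predict its row of U_r, multiply back to get the expected ranks.
f_new = np.array([[2, 206]], dtype=float)
u_new = mapper.predict(f_new)                 # shape (1, r)
predicted_ranks = u_new @ S_r @ Vt_r          # expected rank pattern over algorithms
print(int(np.argmin(predicted_ranks)))        # index of the recommended algorithm
```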

3 Computational Results

Although ALORS, as the sole Algorithm Selection (AS) approach, is capable of working with incomplete performance data, the instance \(\times \) algorithm rank data here has complete performance entries. The AS data is directly derived from [54]. The data on the Large-Scale Multi-objective Optimisation Problem (LSMOP) covers 4 algorithms. The candidate algorithms are Speed-constrained Multi-objective Particle Swarm Optimization (SMPSO) [55], Multi-objective Evolutionary Algorithm based on Decision Variable Analysis (MOEA/DVA) [56], Large-scale Many-objective Evolutionary Algorithm (LMEA) [57] and the Weighted Optimization Framework with SMPSO (WOF-SMPSO) [58]. The hypervolume (HV) indicator [59] is used as the performance metric.
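For the 2-objective case, the HV of a non-dominated front can be computed by summing the rectangles dominated by the front with respect to a reference point. The sketch below assumes minimization and uses illustrative points and reference; it is not the implementation used in [54].

```python
import numpy as np

def hypervolume_2d(front, ref):
    """HV of a 2-objective non-dominated front (minimization) w.r.t. a reference point."""
    # Sort by the first objective; sum the rectangles up to the reference point.
    pts = front[np.argsort(front[:, 0])]
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

front = np.array([[1.0, 4.0], [2.0, 2.0], [4.0, 1.0]])   # illustrative front
print(hypervolume_2d(front, ref=np.array([5.0, 5.0])))   # 11.0
```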

Table 1. The base LSMOP instances

Table 1 shows the specifications of the LSMOP benchmark functions [60]. The functions differ in terms of modality and separability. The 2-objective and 3-objective variants of each function are considered. Besides that, further variations of the functions are obtained using different numbers of decision variables. In total, 63 LSMOP instances are present. The instances are encoded as LSMOPX_m=a_n=b, where X is the base LSMOP index, m refers to the number of objectives and n to the number of decision variables. All these instances are represented using just 4 features: besides the modality and separability characteristics, the number of objectives and the number of variables are used as instance features.
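A small sketch of how such identifiers could be decoded into the 4-feature representation is shown below; the modality/separability entries in the lookup are placeholders rather than the actual values of Table 1.

```python
import re

# Placeholder lookup: modality and separability per base LSMOP function.
# The actual values come from Table 1; these entries are illustrative only.
BASE_TRAITS = {
    "LSMOP1": {"modality": "unimodal", "separability": "fully separable"},
    "LSMOP5": {"modality": "unimodal", "separability": "fully separable"},
}

def decode_instance(name):
    """Turn 'LSMOPX_m=a_n=b' into the 4-feature dictionary used for AS."""
    base, m, n = re.match(r"(LSMOP\d+)_m=(\d+)_n=(\d+)", name).groups()
    return {
        "modality": BASE_TRAITS[base]["modality"],
        "separability": BASE_TRAITS[base]["separability"],
        "objectives": int(m),
        "variables": int(n),
    }

print(decode_instance("LSMOP1_m=2_n=46"))
```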

Table 2. The average ranks of each constituent algorithm besides ALORS where the best per-benchmark performances are in bold (AVG: the average rank considering the average performance on each benchmark function; O-AVG: the overall average rank across all the instances)

Table 2 reports the performance of all the candidate algorithms besides ALORS as the automated selection method. The average performance on all the instances shows that ALORS offers the best performance with an average rank of 2.07. The closest approach, which is also the single best method, i.e. WOF-SMPSO, comes with an average rank of 2.44, while SMPSO shows the overall worst performance with an average rank of 3.98. Referring to the standard deviations, ALORS also comes with the most robust behaviour.
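The reported average ranks can be reproduced from an HV matrix along the following lines; the HV values here are illustrative, and ranks are taken on negated HV since larger HV is better.

```python
import numpy as np
from scipy.stats import rankdata

# Illustrative HV values (instances x algorithms); higher HV is better,
# so ranks are computed on the negated values (rank 1 = best).
hv = np.array([[0.61, 0.70, 0.72, 0.75],
               [0.55, 0.52, 0.68, 0.66],
               [0.40, 0.63, 0.59, 0.71]])

ranks = rankdata(-hv, axis=1, method="average")  # per-instance ranks
print(ranks.mean(axis=0))                        # average rank per algorithm
```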

Fig. 1. The selection frequencies of each algorithm by Oracle and ALORS

Figure 1 reports the selection frequencies of each constituent algorithm. Oracle denotes the optimal selection, i.e. choosing the best algorithm for each instance. The graph shows that ALORS behaves similarly to the Oracle, with minor variations. MOEA/DVA and WOF-SMPSO are the most frequently selected algorithms. Beyond the pure selection frequencies, ALORS does not utilize SMPSO at all, while the Oracle prefers it for two instances.
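The Oracle frequencies of Fig. 1 follow directly from the rank data, as sketched below with illustrative values; the ALORS frequencies are obtained in the same way from its predicted ranks.

```python
import numpy as np

# ranks: instances x algorithms rank matrix (1 = best); illustrative values.
ranks = np.array([[1, 3, 2, 4],
                  [2, 1, 4, 3],
                  [3, 1, 2, 4]])

oracle_choice = ranks.argmin(axis=1)                       # best algorithm per instance
freq = np.bincount(oracle_choice, minlength=ranks.shape[1])
print(freq)                                                # selection frequency per algorithm
```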

Figure 2 illustrates the importance of each single feature in terms of the Gini Index, derived by a Random Forest (RF). All four features happen to contribute to the selection model. That being said, separability comes as the most critical feature while modality is the least important one.
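A sketch of how such Gini-based importances can be obtained is given below, assuming the 4 (encoded) instance features as inputs and the per-instance best algorithm as the target; the data values are illustrative and sklearn's mean-decrease-in-impurity importances serve as the Gini measure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Illustrative encoded features: [modality, separability, objectives, variables]
X = np.array([[0, 2, 2, 46], [0, 2, 2, 106], [1, 1, 3, 212],
              [1, 0, 3, 112], [0, 2, 2, 206], [1, 1, 3, 52]])
y = np.array([3, 3, 1, 1, 3, 2])   # index of the best algorithm per instance

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
for name, imp in zip(["modality", "separability", "objectives", "variables"],
                     rf.feature_importances_):
    print(f"{name}: {imp:.3f}")     # mean decrease in Gini impurity
```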

Fig. 2. Importance of the initial, hand-picked LSMOP benchmark function instance features, using Gini Index/Importance

Fig. 3. Hierarchical clusters of instances using the latent features extracted from the performance data by SVD (\(k=3\))

Figure 3 reports the dis-/similarities of the LSMOP benchmark function instances. Linking to the feature importance analysis in Fig. 2, there is no single criterion/feature that determines instance dis-/similarity, yet it is still possible to see the effects of separability. As an example, consider the 10 most similar instances provided at the bottom right of the clustering figure. The instances are LSMOP1_m=2_n=46, LSMOP1_m=2_n=106, LSMOP5_m=3_n=212, LSMOP5_m=3_n=112, LSMOP5_m=3_n=52, LSMOP2_m=2_n=106, LSMOP5_m=2_n=1006, LSMOP1_m=2_n=206, LSMOP8_m=3_n=52 and LSMOP9_m=3_n=112. 8 of them are fully separable; the remaining 2 instances are partially separable and mixed, respectively. Referring to the second most important feature, i.e. the number of variables, the values range from 46 to 1006, although 1006 occurs only once. Half of the instances have 2 objectives while the other half have 3 objectives. As 2 out of the 3 fully separable benchmark functions are unimodal, 7 instances happen to be unimodal. The other 3 instances are mixed in terms of modality.
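The clusterings of Figs. 3 and 4 can be reproduced along the following lines, applying agglomerative (hierarchical) clustering to the SVD latent features of the instances and of the algorithms; the rank matrix, the value of \(r\) and the Ward linkage choice are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Illustrative rank data (instances x algorithms) and its latent features.
M = np.array([[1, 3, 2, 4], [2, 1, 4, 3], [1, 2, 3, 4],
              [3, 1, 2, 4], [1, 4, 2, 3]], dtype=float)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
r = 3
inst_latent = U[:, :r] * s[:r]               # latent instance features
algo_latent = (Vt[:r, :] * s[:r, None]).T    # latent algorithm features

# Cluster instances (Fig. 3) and algorithms (Fig. 4) into k = 3 clusters.
inst_clusters = fcluster(linkage(inst_latent, method="ward"), t=3, criterion="maxclust")
algo_clusters = fcluster(linkage(algo_latent, method="ward"), t=3, criterion="maxclust")
print(inst_clusters, algo_clusters)
```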

Figure 4 illustrates the hierarchical clustering of the candidate algorithms. Referring to the best performing standalone algorithm, i.e. WOF-SMPSO, there is a resemblance to SMPSO, which is the base approach of WOF-SMPSO. Although their performance levels differ, their performance variations across the tested instances are similar.

Fig. 4. Hierarchical clusters of algorithms using the latent features extracted from the performance data by SVD (\(k=3\))

4 Conclusion

This study utilizes Algorithm Selection (AS) for Large-Scale Multi-objective Optimization, using Hypervolume (HV) as the performance criterion. The majority of real-world optimization tasks involve multiple objectives. In that respect, there have been immense efforts in both problem modelling and algorithm development for multi-objective optimization. However, there is no ultimate multi-objective optimization algorithm that can outperform all competing algorithms under fair experimental settings. This practical fact reveals a clear performance gap that can be filled by AS. AS offers a way to automatically determine the best algorithm for each given problem instance.

The present work operates on 4 multi-objective optimization algorithms over 63 benchmark instances originating from 9 base problems. For the instance characterization required by AS, 4 simple instance features are determined. The corresponding computational analysis showed that AS is able to surpass those candidate algorithms. Further analysis carried out on the algorithm and instance spaces delivered insights on instance hardness, instance similarity and algorithm resemblance.

As the first study of using AS for multi-objective optimization, there is a variety of research tasks to be tackled as future research. The initial follow-up work is concerned with extending both the algorithm and the instance space. Additionally, the well-known multi-objective performance indicators will be incorporated, and the analysis on the algorithm and instance spaces will be extended accordingly. While an AS model will be derived for each indicator, the selection will also be achieved by taking all the indicators into account, in the manner of a Pareto frontier. The idea will then be reversed to devise AS as a multi-objective selection problem where the performance measures are common AS metrics such as the Par10 score and success rate.