Abstract
Many works on surrogate-assisted evolutionary multiobjective optimization have been devoted to problems where function evaluations are time-consuming (e.g., based on simulations). In many real-life optimization problems, mathematical or simulation models are not always available and, instead, we only have data from experiments, measurements or sensors. In such cases, optimization is to be performed on surrogate models built on the data available. The main challenge there is to fit an accurate surrogate model and to obtain meaningful solutions. We apply Kriging as a surrogate model and utilize corresponding uncertainty information in different ways during the optimization process. We discuss experimental results obtained on benchmark multiobjective optimization problems with different sampling techniques and numbers of objectives. The results show the effect of different ways of utilizing uncertainty information on the quality of solutions.
1 Introduction
Sometimes in real applications, multiple conflicting objectives should be optimized, but there is no mathematical or simulation model of the objectives involved. Instead, there is data, e.g., obtained via physical experiments. In such cases, surrogate models can be built using the given data and optimization is then performed with the surrogate models. In the literature, surrogate models such as Kriging [8], neural networks [18] and support vector regression [16] have been typically used for solving computationally expensive optimization problems [6, 10]. If we may conduct new (expensive) function evaluations when needed, this process is called online data-driven optimization [20]. When we do not have access to additional data during the optimization, we call it offline data-driven optimization [11].
When using surrogate models, the main challenge is to manage the models so that convergence and diversity improve without sacrificing too much model accuracy. In online data-driven optimization problems, an infill criterion [6] is maximized or minimized for updating the models iteratively during the optimization process. However, this is not applicable in offline data-driven optimization, where no further data is available during the optimization process. So far, little research has been conducted on solving optimization problems where no new data is available for managing the surrogates [4, 11, 20]. In such a case, the quality of the solutions obtained with the surrogate models depends entirely on the accuracy of the models and on the optimizer used.
When solving an offline data-driven problem with multiple conflicting objectives, one can fit models using all the data available for each objective function. Then an evolutionary multiobjective optimization (EMO) algorithm can be used on these models to find a set of approximated nondominated solutions. Essentially, in that case, an offline data-driven multiobjective optimization problem (MOP) can be divided into two major parts: model building and using an EMO algorithm.
Some surrogate models, like Kriging, provide uncertainty information (or standard deviation) about the predicted values. A low standard deviation implies that the actual objective function value has a higher chance of being close to the predicted value (though the actual function may remain unknown and the only information is the data available). Therefore, one possible way to improve the accuracy of the model is to utilize uncertainty in the fitted model as an additional objective to be optimized.
In this article, we study different ways to deal with the uncertainty information provided by the Kriging models in offline data-driven multiobjective optimization. Moreover, we consider the effect of using different initial sampling techniques on some benchmark test problems. In this study, we simulate offline problems by generating data for problems with known optimal solutions to be able to analyze the results. The results show the effect of utilizing uncertainty information on the quality of solutions.
The rest of this article is organized as follows. We summarize the basic concepts of data-driven optimization and the Kriging model in Sect. 2. In Sect. 3, we present different approaches for incorporating uncertainty information in the optimization problem, and we present and analyze the results in Sect. 4. Finally, we draw conclusions in Sect. 5.
2 Background
2.1 Generic Offline Data-Driven EMO
We consider MOPs of the following form:
\[\text {minimize } \{f_1(\mathbf {x}),\ldots ,f_k(\mathbf {x})\} \quad \text {subject to } \mathbf {x}\in S, \tag{1}\]
with k \((\ge 2)\) objective functions, where the feasible set S is a subset of the decision space \(\mathbb {R}^n\). For any feasible decision vector \(\mathbf {x}\) we have a corresponding objective vector \(f(\mathbf {x})=(f_1(\mathbf {x}),\ldots ,f_k(\mathbf {x}))\).
MOPs that are offline in nature can generally be solved by the approach given in Fig. 1. In what follows, we refer to it as a generic approach. As described in [11, 21], the solution process can be split into three major components: (1) data collection, (2) model building and management, and (3) the EMO method utilized. The collection of data may also incorporate data pre-processing, if it is required. Once the data has been obtained, the objectives and constraints of the MOP are formulated. The next stage is to build surrogate models (also known as meta-models), e.g., one for each objective function, using the available data. Finally, an EMO method is used to find nondominated solutions utilizing the surrogates as objective functions. As objectives to be optimized in (1) we have for \(i=1,\ldots ,k\) the predicted means \(\hat{f}_i\) of the surrogate of objective \(f_i\), and our objective vector is denoted by:
\[\hat{f}(\mathbf {x})=(\hat{f}_1(\mathbf {x}),\ldots ,\hat{f}_k(\mathbf {x})). \tag{2}\]
Selecting proper surrogate models is a challenging task in model management. In online data-driven EMO, the quality of the surrogate models can be assessed and updated as new data becomes available during the optimization process. However, for offline data-driven EMO this is not possible. It becomes even more challenging when the data is noisy [22], skewed [23], time-varying [2] or heterogeneous [3]. Thus, it is crucial to build, before optimization, surrogates that approximate the “true” objective functions as well as possible. One way to improve the accuracy of the surrogates is to enhance the quality of the data. In this research, our consideration is on a general level and we do not go into the characteristics of the data.
In offline data-driven EMO, the possible ways to improve the accuracy of the surrogate models are effective data pre-processing for noise removal [4], creating synthetic data [23], transferring knowledge [15] and applying advanced machine learning techniques [19, 20]. However, it is quite possible that the surrogate models are not good representations of the true objectives. It may even happen that the solutions obtained are actually worse than the data used for fitting the models.
2.2 Kriging
Kriging or Gaussian process regression has been widely used as a surrogate model for solving expensive optimization problems [6]. The main advantage of using Kriging is its ability to provide uncertainty information of the predicted values. Given a Kriging model, the approximated mean value \(y^*\) and its variance \(s^2\) for a sample (or decision variable value) \(\mathbf {x}^*\) are as follows:
\[y^* = \mathbf {k}(\mathbf {x}^*,\mathbf {X})\,K(\mathbf {X},\mathbf {X})^{-1}\mathbf {y},\]
\[s^2 = k(\mathbf {x}^*,\mathbf {x}^*) - \mathbf {k}(\mathbf {x}^*,\mathbf {X})\,K(\mathbf {X},\mathbf {X})^{-1}\mathbf {k}(\mathbf {x}^*,\mathbf {X})^T,\]
where \(\mathbf {X}\in \mathbb {R}^{N_I \times n}\) is the matrix of the given data with \(N_I\) items and n decision variables, \(\mathbf {y}\in \mathbb {R}^{N_I}\) is the vector of given objective values corresponding to the decision vectors in \(\mathbf {X}\), \(K(\mathbf {X},\mathbf {X})\) is the covariance matrix of \(\mathbf {X}\), \(\mathbf {k}(\mathbf {x}^*,\mathbf {X})\) is a vector of covariances between \(\mathbf {x}^*\) and \(\mathbf {X}\), and \(k(\mathbf {x}^*,\mathbf {x}^*)\) is the prior variance at \(\mathbf {x}^*\). For more details about Kriging, see [17].
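The predictor above can be illustrated with a small sketch. It is not the Matlab implementation used in the experiments: it assumes a zero-mean model, a Gaussian kernel with unit prior variance and a fixed, hypothetical `length_scale`, and it performs no hyperparameter fitting. All function names are illustrative, and plain Python is used to keep the example dependency-free.

```python
import math

def gaussian_kernel(a, b, length_scale=1.0):
    """Gaussian (squared-exponential) covariance with unit prior variance."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def kriging_predict(X, y, x_star, length_scale=1.0, jitter=1e-10):
    """Return (mean, variance) at x_star for a zero-mean Kriging model."""
    n = len(X)
    # K(X, X), with a tiny diagonal jitter for numerical stability (an assumption)
    K = [[gaussian_kernel(X[i], X[j], length_scale) + (jitter if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [gaussian_kernel(x_star, xi, length_scale) for xi in X]  # k(x*, X)
    alpha = solve(K, y)       # K(X, X)^{-1} y
    v = solve(K, k_star)      # K(X, X)^{-1} k(x*, X)^T
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    prior_var = gaussian_kernel(x_star, x_star, length_scale)
    var = prior_var - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, max(var, 0.0)

# Toy data: three one-dimensional samples of a single objective
X = [[0.0], [0.5], [1.0]]
y = [0.0, 0.25, 1.0]
mean_at_data, var_at_data = kriging_predict(X, y, [0.5])
mean_far, var_far = kriging_predict(X, y, [10.0])
```

At a point contained in the data the variance collapses towards zero, whereas far from the data the prediction falls back to the prior mean with variance close to the prior variance. This is the behaviour the approaches in Sect. 3 exploit.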
3 Approaches to Incorporate Uncertainty
As new data cannot be obtained in offline data-driven optimization, it is difficult to update the surrogates and enhance their accuracy. One approach is to build a very accurate surrogate model before the optimization process. Another possible approach is to provide a suitable metric in addition to final solutions after the optimization process, which can be used to measure the accuracy of solutions obtained. This approach can be beneficial when the surrogate models cannot provide a very exact representation of the true objective functions. One such instance can be when the data consists of optimal solutions. In such a case, the surrogate might not be a good representation of the actual objectives, which might lead to degraded final solutions. Providing a set of solutions together with the uncertainty information of predicted final solutions can be helpful in the decision making process.
As previously discussed, the two major components in offline data-driven optimization are building a surrogate model and using an EMO algorithm. In this research, we focus on a few variations of the optimization problem that try to minimize the uncertainty in the final solutions. As shown in Fig. 2, the uncertainties in the predicted values of the Kriging models are utilized as additional objective functions. By considering uncertainties in this way, the EMO method tries to minimize the predicted mean values from the fitted Kriging models while simultaneously minimizing the standard deviations of the predictions. Thus, the final set of nondominated solutions will consist of solutions with different levels of uncertainty.
We have tested three different approaches for utilizing uncertainties in the optimization. Approach 1 uses all the standard deviations given by each surrogate model as additional objectives. The resulting objective vector in Approach 1 is:
\[(\hat{f}_1(\mathbf {x}),\ldots ,\hat{f}_k(\mathbf {x}),s_1(\mathbf {x}),\ldots ,s_k(\mathbf {x})),\]
where \(\hat{f}_i(\mathbf {x})\) and \(s_i(\mathbf {x})\) are the predicted mean and the standard deviation values for the \(i^{th}\) objective. Final solutions are obtained by performing a nondominated sort on the archive of predicted solutions (predicted mean values and standard deviations) stored during the optimization. The solutions may have different uncertainties for different objectives. This approach doubles the number of objectives, which may increase the complexity of solving the resulting optimization problem.
Approach 2 utilizes the average of the standard deviations given by each of the surrogate models as an additional objective and the resulting objective vector is:
\[(\hat{f}_1(\mathbf {x}),\ldots ,\hat{f}_k(\mathbf {x}),\bar{s}(\mathbf {x})),\]
where \(\bar{s}(\mathbf {x})\) is the average of the standard deviations from the Kriging models built for each objective function. This method has fewer objectives than Approach 1; however, both approaches provide solutions with a range of uncertainty values. Both Approaches 1 and 2 can provide an option for filtering solutions based on the uncertainty information.
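As a minimal sketch of how the objective vectors of Approaches 1 and 2 could be assembled from the Kriging predictions, consider the following. The function names, the toy numbers and the threshold-based filter are illustrative assumptions and are not taken from the original implementation.

```python
def approach1_objectives(means, stds):
    """Approach 1: k predicted means followed by k standard deviations."""
    return list(means) + list(stds)

def approach2_objectives(means, stds):
    """Approach 2: k predicted means followed by the average standard deviation."""
    mean_std = sum(stds) / len(stds)
    return list(means) + [mean_std]

def filter_by_uncertainty(vectors, num_objectives, max_std):
    """Keep objective vectors whose uncertainty components are all <= max_std.

    Hypothetical post-processing step: the threshold max_std would be chosen
    by the decision maker.
    """
    return [v for v in vectors
            if all(s <= max_std for s in v[num_objectives:])]

# Predicted means and standard deviations of one solution for k = 2 objectives
means = [0.2, 0.7]
stds = [0.1, 0.3]
vec1 = approach1_objectives(means, stds)   # 2k = 4 objectives
vec2 = approach2_objectives(means, stds)   # k + 1 = 3 objectives
kept = filter_by_uncertainty([vec1, [0.3, 0.6, 0.05, 0.1]], 2, 0.2)
```

Here `vec1` is rejected by the filter because its second standard deviation exceeds the threshold, while the second vector is kept.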
Approach 3 utilizes the expected improvement (EI) [12] of every surrogate model as objectives to be optimized by the EMO algorithm, see, e.g., [9]. Expected improvement can be expressed as \(\text {EI}(\mathbf {x})=(f_{min}-\hat{f}(\mathbf {x}))\varPhi \left( \frac{f_{min}-\hat{f}(\mathbf {x})}{s(\mathbf {x})} \right) + s(\mathbf {x})\phi \left( \frac{f_{min}-\hat{f}(\mathbf {x})}{s(\mathbf {x})} \right) \), where \(\phi (\cdot )\) and \(\varPhi (\cdot )\) are the standard normal density and distribution functions, respectively, and \(f_{min}\) is the best value of the objective function in the given data. Over all objectives, these best values form a k-dimensional vector whose \(i^{th}\) component is the best value of the \(i^{th}\) objective function in the given data. The objective vector in this case is:
\[(\text {EI}_1(\mathbf {x}),\ldots ,\text {EI}_k(\mathbf {x})),\]
where \(\text {EI}_i(\mathbf {x})\) is the expected improvement value for the \(i^{th}\) objective. The EI criterion takes both the predicted mean value and the standard deviation into account.
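The EI expression can be evaluated directly from a predicted mean, a standard deviation and the best observed value. The sketch below uses only the Python standard library. The handling of a zero standard deviation and the helper `approach3_objectives` are assumptions added for illustration.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mean, std, f_min):
    """EI of a prediction (mean, std) over the best observed value f_min."""
    if std <= 0.0:
        # Assumption: with no uncertainty, EI reduces to the plain improvement
        return max(f_min - mean, 0.0)
    z = (f_min - mean) / std
    return (f_min - mean) * normal_cdf(z) + std * normal_pdf(z)

def approach3_objectives(means, stds, f_mins):
    """Approach 3: one EI value per objective (larger is better)."""
    return [expected_improvement(m, s, f) for m, s, f in zip(means, stds, f_mins)]

ei_equal_mean = expected_improvement(0.5, 1.0, 0.5)    # equals the density at 0
ei_no_uncertainty = expected_improvement(1.0, 0.0, 0.5)
ei_more_uncertain = expected_improvement(0.5, 2.0, 0.5)
ei_vector = approach3_objectives([0.5, 1.0], [1.0, 0.0], [0.5, 0.5])
```

Since EI is a quantity to be maximized, an EMO implementation that minimizes its objectives would typically work with the negated values. For a fixed predicted mean, EI grows with the standard deviation, so a highly uncertain solution can receive a high EI value.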
Now we have introduced three approaches for incorporating uncertainty information. Algorithm 1 shows the process of applying any of them in the offline optimization process, where k is the number of objectives and we can use the maximum number of evaluations using surrogate models as a stopping criterion.
4 Experimental Results
We compare the three approaches to each other and to the generic approach (see (2) in Subsect. 2.1), using test problems DTLZ2, DTLZ4–DTLZ7 with 2, 3 and 5 objectives. As said, we generate data for these problems and fit Kriging models to the data. The dimension of the decision variable space n is fixed to 10.
The size of the data set was 109, which corresponds to \(11n-1\) [5, 13, 24]. The sampling techniques for creating the data sets were Latin hypercube sampling (LHS), uniform random sampling and a special case of sampling which we call optimal-random sampling. In the latter, 50% of the data are nondominated solutions and the remaining 50% are uniform random samples. This kind of hypothetical sampling might resemble a special case where most of the samples in the given data set are close to optimal, and thus the optimization process can no longer improve the solutions. However, in such a scenario the offline optimization technique should not compute final solutions which are worse than the provided samples. A total of 31 independent runs were performed for each sampling technique in each case.
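A basic version of the first two sampling techniques can be sketched as follows, assuming decision variables scaled to the unit interval. The function names are illustrative and the LHS variant shown is the simplest one (one random point per stratum, no further optimization of the design). Optimal-random sampling is not covered, as it requires known nondominated solutions of the test problem.

```python
import random

def latin_hypercube(num_samples, num_vars, rng):
    """Basic LHS on [0, 1]^num_vars: one point per stratum in every variable."""
    columns = []
    for _ in range(num_vars):
        # Draw one value inside each of the num_samples equal-width strata
        col = [(i + rng.random()) / num_samples for i in range(num_samples)]
        rng.shuffle(col)  # pair the strata randomly across variables
        columns.append(col)
    return [[columns[j][i] for j in range(num_vars)] for i in range(num_samples)]

def uniform_random(num_samples, num_vars, rng):
    """Independent uniform random samples on [0, 1]^num_vars."""
    return [[rng.random() for _ in range(num_vars)] for _ in range(num_samples)]

n = 10                    # number of decision variables
data_size = 11 * n - 1    # 109 samples
rng = random.Random(0)    # fixed seed, for reproducibility of the sketch only
lhs_data = latin_hypercube(data_size, n, rng)
uniform_data = uniform_random(data_size, n, rng)

# In an LHS design every stratum of every variable is hit exactly once
strata_first_var = sorted(int(row[0] * data_size) for row in lhs_data)
```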
We used the indicator-based evolutionary algorithm (IBEA) [25] as the EMO method, as it has been demonstrated in [1] to perform well even for problems with a higher number of objectives. The selection criterion was \(I_{\epsilon +}\) (Step 6 in Algorithm 1) with \(\kappa \) parameter values 0.51, 0.87 and 0.48 for \(k= 2, 3 \) and 5, respectively, and a \(\kappa \) value of 0.5 for any other number of objectives. The population size was 100 and the maximum number of function evaluations was 40 000, following [1]. We used the Matlab implementation of Kriging models with first-order polynomial functions and a Gaussian kernel function.
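For readers unfamiliar with IBEA, the following sketch shows how a fitness value based on the additive epsilon indicator \(I_{\epsilon +}\) can be computed for a small population under minimization. It follows the general fitness assignment of [25] with scaling by the largest absolute indicator value. The normalization of objective values and the remaining steps of the algorithm are omitted, and the function names are illustrative assumptions.

```python
import math

def eps_indicator(a, b):
    """Additive epsilon indicator between two objective vectors (minimization):
    the smallest shift by which a must be improved to weakly dominate b."""
    return max(ai - bi for ai, bi in zip(a, b))

def ibea_fitness(population, kappa):
    """Indicator-based fitness; larger values are better."""
    n = len(population)
    ind = [[eps_indicator(population[i], population[j]) for j in range(n)]
           for i in range(n)]
    # Scale by the largest absolute indicator value (fall back to 1 if all are 0)
    c = max(abs(ind[i][j]) for i in range(n) for j in range(n) if i != j) or 1.0
    return [sum(-math.exp(-ind[j][i] / (c * kappa)) for j in range(n) if j != i)
            for i in range(n)]

# Two nondominated points and one point weakly dominated by both
population = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
fitness = ibea_fitness(population, kappa=0.51)   # kappa value used for k = 2
worst_index = fitness.index(min(fitness))
```

In this toy population the third point receives the lowest fitness and would be the first candidate for removal during environmental selection.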
For measuring the performance of the different approaches, we first performed a nondominated sort on the archive (also including the additional objective(s)). These nondominated solutions were then evaluated with the real objective functions. After obtaining their true objective function values, dominated solutions were removed, producing the final nondominated set. For comparing the quality of solutions for all the approaches, the inverted generational distance (IGD) metric was utilized with 5000 points in the reference set for all problems.
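The evaluation procedure described above can be sketched on a toy archive as follows. The archive values, the reference set and the function names are made up for illustration. In the experiments the reference set contains 5000 points and the true values come from the DTLZ functions.

```python
import math

def dominates(a, b):
    """True if a is no worse than b in all objectives and better in at least one."""
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))

def nondominated_indices(points):
    """Indices of the points that no other point dominates."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for q in points)]

def igd(reference_set, solutions):
    """Mean distance from each reference point to its nearest solution."""
    return sum(min(math.dist(r, s) for s in solutions)
               for r in reference_set) / len(reference_set)

# Toy archive with k = 2 predicted means plus one uncertainty objective
predicted = [(0.1, 0.9, 0.05), (0.9, 0.1, 0.05), (0.5, 0.5, 0.30), (0.6, 0.6, 0.40)]
# Hypothetical true objective values of the same four solutions
true_values = [(0.2, 1.0), (1.0, 0.5), (0.4, 0.4), (0.9, 0.9)]

# Step 1: nondominated sort on the archive, including the extra objective
candidates = nondominated_indices(predicted)
# Step 2: evaluate the candidates with the real objectives
evaluated = [true_values[i] for i in candidates]
# Step 3: remove solutions that are dominated in the true objective space
final_set = [evaluated[i] for i in nondominated_indices(evaluated)]
# Step 4: compare against a reference set on the true Pareto front
reference_set = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
igd_value = igd(reference_set, final_set)
```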
Table 1 compares the mean and standard deviation values of the IGD for the three approaches and the generic approach. Approaches 1 and 2 performed better than the generic approach with LHS and uniform random sampling for all the problems and numbers of objectives, with the exception of DTLZ6 and DTLZ7. With optimal-random sampling, Approaches 1 and 2 performed better than the generic approach for DTLZ2, DTLZ4 and DTLZ5, and for DTLZ6 and DTLZ7 only for some numbers of objectives. Approach 3 did not produce good results for any of the problems, numbers of objectives or sampling techniques.
Adding uncertainties as additional objectives poses a major problem in explaining the effect of optimization, as the fitness landscape of the uncertainties is mostly unknown. A possible explanation for why no noticeable performance improvement is observed in DTLZ6 when using Approaches 1 and 2 is that the problem has a non-uniform (or biased) [7] degenerate Pareto front. Adding uncertainty objectives makes the problem even harder to solve and fewer nondominated solutions are obtained. For DTLZ7, a possible explanation for the worse performance of Approaches 1 and 2 is that the objective functions are completely separable [14]. Thus, the additional objectives introduced by Approaches 1 and 2 only make the problem more difficult than in the generic approach.
For optimal-random sampling, the advantage of Approaches 1 and 2 was clearly visible. Although the initial sample also included nondominated solutions, the generic approach failed to provide good solutions. This is because the surrogate models do not provide a perfect representation of the true objectives. When utilizing EIs as objectives in Approach 3, the solutions were actually worse (comparing mean IGD values) in most of the cases. This is because EI tries to balance between convergence and diversity. Therefore, it can select a solution with a high uncertainty for achieving its goal.
Figure 3 shows the root mean square error (RMSE) of the final solutions obtained by different approaches with LHS sampling on problems with two objectives. It can be observed that the solutions obtained by Approaches 1 and 2 are more accurate in most of the cases. This means that using uncertainty as additional objective(s) helps to find solutions with a low approximation error. Therefore, using uncertainty in the optimization process can be considered as an advantage in solving an offline data-driven EMO problem where there is no possibility for updating the surrogate models. An illustration of solutions obtained after evaluating them with real objectives for the DTLZ2 problem with LHS and optimal-random sampling is shown in Fig. 4. Due to space limitations, further analysis is available at http://www.mit.jyu.fi/optgroup/extramaterial.html as additional material. The performance of the proposed approaches on other test problems (i.e., DTLZ1, DTLZ3, WFG1-WFG3, WFG5 and WFG9) can also be found at the above-mentioned website.
5 Conclusions
We have considered offline data-driven optimization with evolutionary multiobjective optimization. We used Kriging to fit surrogate models to data and proposed and tested three approaches to utilize uncertainty information from Kriging models in the optimization. A comparison was done with several benchmark problems, sampling techniques and numbers of objectives in solving offline data-driven multiobjective optimization problems. Adding uncertainty as one or more objectives showed improvements in the final solutions for certain problems in our benchmark testing. However, utilizing expected improvements as objectives (in Approach 3) did not seem to be effective in solving this kind of problem. The analysis also revealed that the solutions obtained in Approaches 1 and 2 are more accurate than the ones obtained using a generic approach (without uncertainty information).
Future work will include comparing the performance of the proposed approaches with bigger initial sample sizes, higher number of decision variables and higher number of objectives. Aiding the decision making process by giving a decision maker an option to select a final solution using the uncertainty information is another direction to work on. Moreover, filtering techniques can be applied to remove solutions with higher uncertainties. Testing on real-world data sets and exploring different ways to deal with uncertainties using other surrogate models will also be future research topics.
References
1. Bezerra, L.C.T., López-Ibáñez, M., Stützle, T.: A large-scale experimental evaluation of high-performing multi- and many-objective evolutionary algorithms. Evol. Comput. 26, 621–656 (2018)
2. Blackwell, T., Branke, J.: Multiswarms, exclusion, and anti-convergence in dynamic environments. IEEE Trans. Evol. Comput. 10(4), 459–472 (2006)
3. Castano, S., Antonellis, V.D.: Global viewing of heterogeneous data sources. IEEE Trans. Knowl. Data Eng. 13(2), 277–297 (2001)
4. Chugh, T., Chakraborti, N., Sindhya, K., Jin, Y.: A data-driven surrogate-assisted evolutionary algorithm applied to a many-objective blast furnace optimization problem. Mater. Manuf. Process. 32(10), 1172–1178 (2017)
5. Chugh, T., Jin, Y., Miettinen, K., Hakanen, J., Sindhya, K.: A surrogate-assisted reference vector guided evolutionary algorithm for computationally expensive many-objective optimization. IEEE Trans. Evol. Comput. 22(1), 129–142 (2018)
6. Chugh, T., Sindhya, K., Hakanen, J., Miettinen, K.: A survey on handling computationally expensive multiobjective optimization problems with evolutionary algorithms. Soft Comput. (to appear). https://doi.org/10.1007/s00500-017-2965-0
7. Coello, C., Lamont, G., Veldhuizen, D.: Evolutionary Algorithms for Solving Multi-Objective Problems, 2nd edn. Springer, New York (2007). https://doi.org/10.1007/978-0-387-36797-2
8. Forrester, A., Sobester, A., Keane, A.: Engineering Design via Surrogate Modelling. Wiley, Hoboken (2008)
9. Jeong, S., Obayashi, S.: Efficient global optimization (EGO) for multi-objective problem and data mining. In: 2005 IEEE Congress on Evolutionary Computation, vol. 3, pp. 2138–2145 (2005)
10. Jin, Y.: Surrogate-assisted evolutionary computation: recent advances and future challenges. Swarm Evol. Comput. 1, 61–70 (2011)
11. Jin, Y., Wang, H., Chugh, T., Guo, D., Miettinen, K.: Data-driven evolutionary optimization: an overview and case studies. IEEE Trans. Evol. Comput. (to appear). https://doi.org/10.1109/TEVC.2018.2869001
12. Jones, D.R., Schonlau, M., Welch, W.J.: Efficient global optimization of expensive black-box functions. J. Global Optim. 13(4), 455–492 (1998)
13. Knowles, J.: ParEGO: a hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems. IEEE Trans. Evol. Comput. 10(1), 50–66 (2006)
14. Li, K., Omidvar, M.N., Deb, K., Yao, X.: Variable interaction in multi-objective optimization problems. In: Handl, J., Hart, E., Lewis, P.R., López-Ibáñez, M., Ochoa, G., Paechter, B. (eds.) PPSN 2016. LNCS, vol. 9921, pp. 399–409. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45823-6_37
15. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010)
16. Pilat, M., Neruda, R.: Aggregate meta-models for evolutionary multiobjective and many-objective optimization. Neurocomputing 116, 392–402 (2013)
17. Rasmussen, C., Williams, C.: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press, Cambridge (2005)
18. Regis, R.G.: Evolutionary programming for high-dimensional constrained expensive black-box optimization using radial basis functions. IEEE Trans. Evol. Comput. 18(3), 326–347 (2014)
19. Sun, X., Gong, D., Jin, Y., Chen, S.: A new surrogate-assisted interactive genetic algorithm with weighted semisupervised learning. IEEE Trans. Cybern. 43(2), 685–698 (2013)
20. Wang, H., Jin, Y., Jansen, J.O.: Data-driven surrogate-assisted multiobjective evolutionary optimization of a trauma system. IEEE Trans. Evol. Comput. 20(6), 939–952 (2016)
21. Wang, H., Jin, Y., Sun, C., Doherty, J.: Offline data-driven evolutionary optimization using selective surrogate ensembles. IEEE Trans. Evol. Comput. (to appear). https://doi.org/10.1109/TEVC.2018.2834881
22. Wang, H., Zhang, Q., Jiao, L., Yao, X.: Regularity model for noisy multiobjective optimization. IEEE Trans. Cybern. 46(9), 1997–2009 (2016)
23. Wang, S., Minku, L.L., Yao, X.: Resampling-based ensemble methods for online class imbalance learning. IEEE Trans. Knowl. Data Eng. 27(5), 1356–1368 (2015)
24. Zhang, Q., Liu, W., Tsang, E., Virginas, B.: Expensive multiobjective optimization by MOEA/D with Gaussian process model. IEEE Trans. Evol. Comput. 14(3), 456–474 (2010)
25. Zitzler, E., Künzli, S.: Indicator-based selection in multiobjective search. In: Yao, X., et al. (eds.) PPSN 2004. LNCS, vol. 3242, pp. 832–842. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30217-9_84
Acknowledgements
This research is related to the thematic research area Decision Analytics utilizing Causal Models and Multiobjective Optimization (DEMO) at the University of Jyvaskyla. This work was partially supported by the Natural Environment Research Council [NE/P017436/1].
© 2019 Springer Nature Switzerland AG
Cite this paper
Mazumdar, A., Chugh, T., Miettinen, K., López-Ibáñez, M. (2019). On Dealing with Uncertainties from Kriging Models in Offline Data-Driven Evolutionary Multiobjective Optimization. In: Deb, K., et al. Evolutionary Multi-Criterion Optimization. EMO 2019. Lecture Notes in Computer Science(), vol 11411. Springer, Cham. https://doi.org/10.1007/978-3-030-12598-1_37
DOI: https://doi.org/10.1007/978-3-030-12598-1_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12597-4
Online ISBN: 978-3-030-12598-1
eBook Packages: Computer Science, Computer Science (R0)