1 Introduction

Simulation models are widely used in engineering design optimization. However, obtaining the Pareto solutions of multi-objective optimization problems usually requires many evaluations of the objective functions, and when those evaluations rely on time-consuming simulation models, the burden of the optimization task grows quickly. To alleviate this problem, surrogate models, such as Kriging models (Jerome et al. 1989; Liu et al. 2018; Williams et al. 2021), radial basis function models (Sóbester et al. 2005; Zhou et al. 2017), and support vector regression models (Shi et al. 2020), are commonly adopted to replace expensive simulations in engineering design optimization.

Bayesian optimization (BO) (Shahriari et al. 2016) is a typical surrogate-based optimization method that balances exploitation and exploration by combining the uncertainty term of the surrogate model with an acquisition function (Jeong et al. 2005). Commonly used acquisition functions include the probability of improvement (PI) (Ruan et al. 2020), expected improvement (EI) (Jones 2001; Zhan et al. 2022), and lower confidence bound (LCB) (Srinivas et al. 2012; Zheng et al. 2016). In recent years, extending these single-objective BO methods to multi-objective optimization has gained much attention. For example, Knowles (2006) proposed the ParEGO method, which extends the EI function to multi-objective optimization. Emmerich et al. (2011) combined the hypervolume improvement with the EI function and, by establishing its monotonicity properties, clarified the regions where the Pareto frontier can be improved. Zhan et al. (2017) expanded the EI function into a matrix form (EIM), in which the elements of the matrix are the EI values of the individual objectives. Shu et al. (2020) defined an acquisition function for multi-objective optimization problems that considers both the convergence and the diversity of the Pareto frontier.

However, these methods use only high-fidelity samples to build the surrogate model, which still incurs a high computational cost (Chen et al. 2016). A potential way to further reduce this expense is to adopt a multi-fidelity (MF) surrogate model (Zhang et al. 2018a, b), constructed from a few costly high-fidelity (HF) samples and plenty of cheap low-fidelity (LF) samples. There are three main MF surrogate modeling approaches: the scaling function method (Choi et al. 2009; Li et al. 2017), the space mapping method (Leifsson et al. 2015), and the Co-Kriging method (Kennedy et al. 2000; Perdikaris et al. 2015; Singh et al. 2017). Several new MF modeling methods have also been proposed in the past few years, such as the latent-map Gaussian process method of Eweis-Labolle et al. (2022), which outperforms the space mapping method. As these MF surrogate models have gained attention, several multi-fidelity BO methods have been developed. For example, Zhang et al. (2018a, b) proposed a multi-fidelity method based on the hierarchical Kriging (HK) model (Han et al. 2012), an extension of Co-Kriging that avoids constructing the cross covariance required by Co-Kriging. Jiang et al. (2019) proposed a method that selects both the new sample and its fidelity level, extending LCB-based BO to MF surrogate models. However, very little work addresses multi-objective optimization with BO based on an MF surrogate model. He et al. (2022) introduced a method for multi-objective problems that combines the HK model and the EIM criterion. The gap of extending LCB-based BO with an MF surrogate model to multi-objective optimization still remains to be filled. The bi-fidelity (BF) surrogate model, built from samples of two fidelity levels, is the simplest special case of the MF surrogate model and is therefore considered first.

In this paper, a bi-fidelity BO method for multi-objective optimization is proposed, which exploits the advantages of both the bi-fidelity surrogate model and the BO framework to further reduce the optimization expense. In this method, a novel acquisition function is developed, in which a cost coefficient based on the cost ratio between the fidelity levels balances the sampling cost and the information gained from a new sample. The proposed method is compared with four state-of-the-art BO methods on several numerical examples and two engineering design optimization problems. The results show that the proposed method markedly reduces the computational cost, especially on high-dimensional problems.

The remainder of the paper is organized as follows. In Sec. 2, the technical background on multi-objective optimization, the HK model, and Bayesian optimization is recalled. Sec. 3 introduces the proposed method in detail. In Sec. 4, the results of a comparative study between the proposed method and other state-of-the-art methods are presented. Sec. 5 summarizes the paper and outlines future work.

2 Background

2.1 Multi-objective optimization

A multi-objective optimization problem involves multiple conflicting objectives to optimize and can be formulated as follows:

$$\begin{array}{*{20}c} {\min F(x) = \left\{ {f_{1} (x),f_{2} (x), \ldots ,f_{i} (x), \ldots ,f_{M} (x)} \right\}} \\ {s.t.\;g_{j} \left( x \right) \le 0,j = 1,2, \ldots ,q} \\ {x_{lb} \le x \le x_{ub} } \\ \end{array}$$
(1)

where \(M\) is the number of objective functions in \(F({\text{x}})\), \({\text{x}} = (x_{1} ,x_{2} ,...,x_{N} )^{T}\) denotes the N-dimensional design variable vector with lower and upper bounds \({\text{x}}_{lb}\) and \({\text{x}}_{ub}\), and \({\text{g}} = (g_{1} ,g_{2} ,...,g_{q} )\) is the constraint vector, whose components may be linear or nonlinear.

Multi-objective improvement functions are usually employed as acquisition functions in BO for multi-objective optimization. The Euclidean distance improvement function (Keane 2006), the maximin distance improvement function (Svenson et al. 2016), and the hypervolume (HV) improvement function (Yang et al. 2019) are three state-of-the-art examples. The Euclidean distance improvement function can be expressed as follows:

$$I({\text{x}}) = \mathop {\min }\limits_{j = 1}^{k} \sqrt {\sum\limits_{i = 1}^{m} {(f_{i}^{j} - y_{i} ({\text{x}}))^{2} } }$$
(2)

where \(f_{i}^{j}\) is the value of the ith objective at the jth of the k current Pareto solutions, and \(y_{i} ({\text{x}})\) is the prediction of the ith objective.

The maximin distance improvement function is expressed as follows:

$$I({\text{x}}) = - \mathop {\max }\limits_{j = 1}^{k} \left[ {\mathop {\min }\limits_{i = 1}^{m} (y_{i} ({\text{x}}) - f_{i}^{j} )} \right]$$
(3)

Before the HV improvement function is introduced, the concept of the HV itself should be clarified. Given a reference point dominated by the Pareto set, the HV of the dominated region is measured as follows:

$$H(p) = {\text{Volume}} \left( \{ y \in \mathbb{R}^{M} \mid p \prec y \prec R \} \right)$$
(4)

where \(p \prec y \prec R\) delimits the region dominated by the current Pareto set \(p\) and bounded by the reference point \(R\), and Volume(·) is the HV indicator.

The HV improvement is the difference between the HV indicator of the current Pareto set and that of the Pareto set including the next sample. It can be formulated as follows:

$$I(y) = H(p \cup y) - H(p)$$
(5)

where \(p \cup y\) represents the Pareto set obtained by non-dominated sorting after the new sample \(y\) is added, and \(I(y)\) is the HV improvement beyond the current Pareto set. Figure 1 presents a 2D example of the HV improvement: the light gray area is the HV of the current Pareto set and the dark gray area illustrates \(I(y)\). A larger HV improvement means a better-quality improvement beyond the current Pareto set.

Fig. 1 A 2-dimensional example of the HV improvement

More information about these improvement functions can be found in Svenson's work (2016).
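To make Eqs. (4) and (5) concrete, the following is a minimal numpy sketch of the 2D case (both objectives minimized). The sweep-line HV computation and all function names are illustrative choices, not taken from the cited references.

```python
import numpy as np

def hypervolume_2d(front, ref):
    """HV of the region dominated by `front` and bounded by the reference
    point `ref` (both objectives minimized); assumes `front` (shape (k, 2))
    is already non-dominated."""
    front = front[np.all(front < ref, axis=1)]   # drop points beyond ref
    if front.size == 0:
        return 0.0
    front = front[np.argsort(front[:, 0])]       # sort by f1; f2 descends
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in front:
        hv += (ref[0] - f1) * (prev_f2 - f2)     # stack of rectangles
        prev_f2 = f2
    return hv

def hv_improvement(front, y_new, ref):
    """I(y) = H(p U y) - H(p), Eq. (5)."""
    union = np.vstack([front, y_new])
    nondom = [i for i, p in enumerate(union)     # non-dominated filter
              if not np.any(np.all(union <= p, axis=1) &
                            np.any(union < p, axis=1))]
    return hypervolume_2d(union[nondom], ref) - hypervolume_2d(front, ref)
```

If the new point is dominated by the current front, the non-dominated filter removes it and the improvement is zero, matching the geometry of Fig. 1.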

2.2 Hierarchical Kriging (HK) model

The Kriging model (Williams et al. 2006) is an interpolative surrogate model that originated in geostatistics and was later applied to fit expensive simulations (Jerome et al. 1989).

The HK model is an MF surrogate model based on the Kriging model, which takes the LF prediction as the overall trend of the MF model and uses the HF data as a correction. To build a surrogate model for the expensive HF function, a Kriging model is first built with the LF samples and is thereafter used to assist the construction of the MF model. The LF Kriging model is expressed as follows:

$$Y_{l} ({\text{x}}) = \beta_{0,l} + Z_{l} ({\text{x}})$$
(6)

where \(\beta_{0,l}\) is the mean term and \(Z_{l} ({\text{x}})\) is a random process with zero mean and variance \(\sigma^{2}\). One of the most commonly used spatial correlation functions between the error terms of two points \({\text{x}}\) and \({\text{x}}^{\prime}\), the Gaussian correlation function, is given as follows:

$${\text{cov}} (Z({\text{x}}),Z({\text{x}}^{\prime})) = \sigma^{2} \exp \left\{ { - \sum\limits_{i = 1}^{d} {\theta_{i} \left| {x_{i} - x_{i}^{\prime} } \right|^{2} } } \right\}$$
(7)

where \(\sigma^{2}\) is the variance of \(Z({\text{x}})\), \(d\) is the dimension of the design variables, and each \(\theta_{i}\) in \({\uptheta } = \left\{ {\theta_{1} ,\theta_{2} ,...,\theta_{d} } \right\}\) is a "roughness" parameter associated with dimension i.

The predicted mean and mean-squared error (MSE) of the LF model at an unobserved point can be formulated as follows:

$$\hat{y}_{l} (x) = \beta_{0,l} + {\text{r}}_{l}^{T} (x){\text{R}}_{l}^{ - 1} (f_{l} ({\text{x}}) - {1}\beta_{0,l} )$$
(8)
$$s_{l}^{2} (x) = \sigma^{2} (1 - {\text{r}}_{l}^{T} (x){\text{R}}_{l}^{ - 1} {\text{r}}_{l} (x) + \frac{{(1 - {1}^{T} {\text{R}}_{l}^{ - 1} {\text{r}}_{l} (x))^{2} }}{{{1}^{T} {\text{R}}_{l}^{ - 1} {1}}})$$
(9)

respectively, where \(\beta_{0,l} = ({1}^{T} {\text{R}}_{l}^{ - 1} {1})^{ - 1} {1}^{T} {\text{R}}_{l}^{ - 1} f_{l} ({\text{x}})\), \(f_{l} ({\text{x}})\) is the vector of LF responses at the samples \({\text{x}}\), \({1}\) is a vector of ones, \({\text{R}}_{l}\) is the covariance matrix with elements \(R_{l,mn} = {\text{cov}} (Z({\text{x}}^{m} ),Z({\text{x}}^{n} ))\), and \({\text{r}}_{l}\) is the vector of correlations between the unobserved point and the samples. The hyperparameters \(\beta_{0,l}\), \(\sigma^{2}\), and \(\theta_{i}\) above are obtained by maximum likelihood estimation (MLE).
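The following is a minimal numpy sketch of Eqs. (7)–(9), assuming the hyperparameters \(\theta\) and \(\sigma^2\) have already been fixed by the MLE step mentioned above; all names are illustrative, not from a particular library.

```python
import numpy as np

def gauss_corr(X1, X2, theta):
    """Gaussian correlation of Eq. (7), without the sigma^2 factor."""
    d2 = (X1[:, None, :] - X2[None, :, :]) ** 2        # |x_i - x_i'|^2
    return np.exp(-np.einsum('ijk,k->ij', d2, theta))

def lf_kriging_predict(x, X_l, f_l, theta, sigma2):
    """Mean and MSE of the LF Kriging model, Eqs. (8)-(9)."""
    n = len(X_l)
    R = gauss_corr(X_l, X_l, theta) + 1e-10 * np.eye(n)  # nugget for stability
    r = gauss_corr(X_l, x[None, :], theta).ravel()
    one = np.ones(n)
    Rinv_f, Rinv_1, Rinv_r = (np.linalg.solve(R, v) for v in (f_l, one, r))
    beta = (one @ Rinv_f) / (one @ Rinv_1)               # beta_{0,l}
    mean = beta + r @ (Rinv_f - beta * Rinv_1)           # Eq. (8)
    mse = sigma2 * (1.0 - r @ Rinv_r
                    + (1.0 - one @ Rinv_r) ** 2 / (one @ Rinv_1))  # Eq. (9)
    return mean, mse
```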

Taking the predicted value of the LF model as the prior mean of the HF Kriging model, the HK model is built as follows:

$$Y({\text{x}}) = \beta_{0} \hat{y}_{l} ({\text{x}}) + Z({\text{x}})$$
(10)

where \(\hat{y}_{l} ({\text{x}})\) is the prediction of the LF model scaled by a constant factor \(\beta_{0}\), and \(Z({\text{x}})\) is a random process with zero mean and covariance \({\text{cov}} (Z({\text{x}}),Z({\text{x}}^{\prime} ))\).

The predicted mean and mean-squared error (MSE) of the HK model at an unobserved point can be formulated as follows:

$$\hat{y}(x) = \beta_{0} \hat{y}_{l} ({\text{x}}) + {\text{r}}^{T} (x){\text{R}}^{ - 1} (f_{h} ({\text{x}}) - \beta_{0} f_{l} ({\text{x}}))$$
(11)
$$\begin{gathered} MSE\left\{ {\hat{y}(x)} \right\} = \sigma^{2} \left\{ {1 - {\text{r}}^{T} (x){\text{R}}^{ - 1} {\text{r}}(x) + } \right. \\ \left. {\left[ {{\text{r}}^{T} (x){\text{R}}^{ - 1} f_{l} ({\text{x}}) - \hat{y}_{l} ({\text{x}})} \right]\left[ {f_{l} ({\text{x}})^{T} {\text{R}}^{ - 1} f_{l} ({\text{x}})} \right]^{ - 1} \left[ {{\text{r}}^{T} (x){\text{R}}^{ - 1} f_{l} ({\text{x}}) - \hat{y}_{l} ({\text{x}})} \right]^{T} } \right\} \\ \end{gathered}$$
(12)

respectively, where \(\beta_{0} = \left[ {f_{l} ({\text{x}})^{T} {\text{R}}^{ - 1} f_{l} ({\text{x}})} \right]^{ - 1} f_{l} ({\text{x}})^{T} {\text{R}}^{ - 1} f_{h} ({\text{x}})\) is a scaling factor and \(f_{h} ({\text{x}})\) is the vector of high-fidelity responses at the sample points \({\text{x}}\).

More details about the hierarchical Kriging model are given in Han's work (2012).
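Building on the LF predictor sketched above, the HK mean prediction of Eq. (11) can be written as follows; `yl_at` is assumed to be a function returning the LF mean \(\hat{y}_{l}({\text{x}})\), and `theta_h` denotes the HF-level hyperparameters (illustrative names).

```python
import numpy as np

def hk_predict_mean(x, X_h, f_h, yl_at, theta_h):
    """HK mean prediction, Eq. (11); `yl_at(x)` returns the LF mean, e.g.
    yl_at = lambda z: lf_kriging_predict(z, X_l, f_l, theta_l, s2_l)[0]."""
    n = len(X_h)
    R = gauss_corr(X_h, X_h, theta_h) + 1e-10 * np.eye(n)
    r = gauss_corr(X_h, x[None, :], theta_h).ravel()
    F = np.array([yl_at(z) for z in X_h])      # LF trend at the HF sites
    Rinv_F = np.linalg.solve(R, F)
    Rinv_f = np.linalg.solve(R, f_h)
    beta0 = (F @ Rinv_f) / (F @ Rinv_F)        # scaling factor of Eq. (11)
    return beta0 * yl_at(x) + r @ (Rinv_f - beta0 * Rinv_F)
```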

2.3 Bayesian optimization based on the LCB function

Bayesian optimization is a surrogate-model-based method for expensive optimization problems. BO uses the prediction and uncertainty information from the surrogate model to quantify the improvement, or the probability of improvement, of a new sample over the current optimum, thereby balancing exploitation and exploration. The EI and LCB functions are commonly used acquisition functions (Tran et al. 2019). As the number of objectives increases, the non-dominated region becomes more complex, and a piecewise integral approach is often required to calculate the EI value, in which the integration region is decomposed into regular cells (Zhan et al. 2017). Improvement-based acquisition functions such as the multi-objective EI criteria above therefore become so complicated that the EI value is hard to compute. By contrast, an outstanding advantage of the LCB function is that it requires no such tedious piecewise integration, unlike the EI and PI functions. This property saves considerable computational resources otherwise spent on high-dimensional integrals.

Given the samples \(\left\{ {x_{1} ,x_{2} ,...,x_{N} } \right\}\) and their associated responses \(\left\{ {f_{1} ,f_{2} ,...,f_{N} } \right\}\), the LCB function is formulated as follows:

$$f_{LCB} ({\text{x}}) = \hat{f}({\text{x}}) - ks({\text{x}})$$
(13)

where \(\hat{f}({\text{x}})\) is the prediction of the surrogate model, \(s({\text{x}})\) is the square root of the predictive variance, and \(k\) is a parameter that balances exploitation and exploration. In multi-objective optimization, the LCB function takes a vector form (Shu et al. 2020):

$$\begin{gathered} f_{LCB} ({\text{x}}) = \left[ {f_{LCB,1} ({\text{x}}),f_{LCB,2} ({\text{x}}),...,f_{LCB,M} ({\text{x}})} \right] \\ where \, f_{LCB,i} ({\text{x}}) = \hat{f}_{i} ({\text{x}}) - k_{i} s_{i} ({\text{x}}) \\ \end{gathered}$$
(14)

where M is the number of objectives, each objective is fitted by its own surrogate model, and \(s_{i} (x)\) is the square root of the posterior predictive variance. A larger value of the coefficient \(k_{i}\) encourages the algorithm to search in uncertain regions (exploration), whereas a smaller value encourages exploitation around the current optimum.

More information about EI, PI, and other acquisition functions can be found in the following references (Jones et al. 1998; Ruan et al. 2020; Shahriari et al. 2016).
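As a minimal illustration of Eq. (14), assuming each objective's surrogate exposes a predictor returning its mean and MSE (names are illustrative):

```python
import numpy as np

def lcb_vector(x, models, k):
    """Multi-objective LCB, Eq. (14); `models[i](x)` is assumed to return
    the (mean, MSE) pair of the i-th objective's surrogate."""
    means, mses = zip(*(m(x) for m in models))
    return np.asarray(means) - np.asarray(k) * np.sqrt(np.maximum(mses, 0.0))
```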

3 Proposed method

In this section, a bi-fidelity BO method for multi-objective optimization is proposed. First, the cost and the HV improvement of adding a new sample at each fidelity level are analyzed; the new acquisition function is then introduced in detail. Finally, the termination condition and the procedure of the method are presented.

3.1 Predicted HV improvement after adding a new sample with different fidelities

The LCB function uses the predicted mean and uncertainty to measure the worth of each point in the unobserved region. In the proposed method, the HV improvement function of Eqs. (4) and (5) is employed to quantify the predicted quality improvement of a new sample, applied to the LCB vector as follows:

$$I_{LCB} (x) = H(p \cup f_{LCB} (x)) - H(p)$$
(15)

where \(f_{LCB} (x)\) is the LCB function defined in Eq. (14). Figure 2 presents the Pareto set and the predicted HV improvement after adding a new sample (\(I_{LCB} (x)\)) in a 2D objective space. The red point in Fig. 2 represents \(f_{LCB} ({\text{x}})\) at the sample x.

Fig. 2 A 2-dimensional example of the Pareto set and the predicted HV improvement

Figure 3 presents the HV improvement when the surrogate model has two fidelity levels; the corresponding HV improvement functions are formulated as follows:

$$I_{LCB}^{l} (x) = H(p \cup f_{LCB}^{l} (x)) - H(p)$$
(16)
$$I_{LCB}^{h} (x) = H(p \cup f_{LCB}^{h} (x)) - H(p)$$
(17)

where \(I_{LCB}^{l} (x)\) and \(I_{LCB}^{h} (x)\) denote the HV improvement after adding an LF and an HF sample, respectively, and \(f_{LCB}^{l} (x)\) and \(f_{LCB}^{h} (x)\) are expressed as follows:

$$f_{LCB}^{l} (x) = \left[ {f_{LCB,1}^{l} (x),f_{LCB,2}^{l} (x),...,f_{LCB,M}^{l} (x)} \right]$$
(18)
$$f_{LCB}^{h} (x) = \left[ {f_{LCB,1}^{h} (x),f_{LCB,2}^{h} (x),...,f_{LCB,M}^{h} (x)} \right]$$
(19)

respectively, where

$$\, f_{LCB,i}^{l} (x) = \hat{f}_{i} (x) - k_{i} \beta_{0} s_{i}^{l} (x)$$
(20)
$$f_{LCB,i}^{h} (x) = \hat{f}_{i} (x) - k_{i} s_{i}^{h} (x)$$
(21)
Fig. 3 The predicted HV improvement after adding HF and LF samples

where \(s_{i}^{h} (x)\) is the error term arising from the lack of an HF sample in the HK model, and \(\beta_{0} s_{i}^{l} (x)\) is the error of the LF model prediction due to the lack of an LF sample (Zhang et al. 2018a, b).

The light gray area in Fig. 3 is the HV of the current Pareto frontier. The blue and dark gray areas represent the HV improvement beyond the current Pareto set after adding the LF sample (\(I_{LCB}^{l} (x)\)) and the HF sample (\(I_{LCB}^{h} (x)\)), respectively. Note that \(I_{LCB}^{h} (x)\) includes the blue area, which denotes the HV of the region between \(f_{LCB}^{h} (x)\) and the Pareto set. In most circumstances, \(I_{LCB}^{h} (x)\) is larger than the corresponding \(I_{LCB}^{l} (x)\) because of the larger uncertainty term, so there is a conflict between the computational cost and the HV improvement of samples at different fidelities.
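A sketch of Eqs. (16)–(21) in the 2D case, reusing the `hv_improvement` helper from Sec. 2.1; `mean_at`, `s_l`, `s_h`, `beta0`, and `k` follow the notation of Eqs. (20)–(21) and are assumed to be supplied by the per-objective HK models.

```python
import numpy as np

def fidelity_hv_improvements(x, front, ref, mean_at, s_l, s_h, beta0, k):
    """Predicted HV improvement of an LF and an HF sample, Eqs. (16)-(21)."""
    mu = np.asarray(mean_at(x))                    # HK means, one per objective
    f_lcb_l = mu - k * beta0 * np.asarray(s_l(x))  # Eq. (20)
    f_lcb_h = mu - k * np.asarray(s_h(x))          # Eq. (21)
    I_l = hv_improvement(front, f_lcb_l, ref)      # Eq. (16)
    I_h = hv_improvement(front, f_lcb_h, ref)      # Eq. (17)
    return I_l, I_h
```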

3.2 The proposed acquisition function

As illustrated in Fig. 4, the predicted HV improvement can be partitioned into two parts: one contributed by the predicted mean and the other by the uncertainty region.

Fig. 4 The predicted HV improvement of the predicted mean after adding new samples

The green region in Fig. 4 is the HV improvement of the predicted mean after adding a new sample (\(I_{LCB}^{\mu } (x)\)), expressed as follows:

$$I_{LCB}^{\mu } (x) = H(p \cup \hat{f}(x)) - H(p)$$
(22)
$$\hat{f}({\text{x}}) = [\hat{f}_{1} ({\text{x}}),\hat{f}_{2} ({\text{x}}),...,\hat{f}_{i} ({\text{x}}),...,\hat{f}_{M} ({\text{x}})]$$
(23)

where \(\hat{f}_{i} (x)\) is the predicted mean of each objective. The blue and dark gray areas represent the same HV improvements as in Fig. 3, namely \(I_{LCB}^{l} (x)\) and \(I_{LCB}^{h} (x)\) calculated by Eqs. (16) and (17).

To balance the cost and the improvement of a new sample at each fidelity, a novel improvement function is introduced, which can be expressed as follows:

$$I(x,t) = \left\{ \begin{gathered} (I_{LCB}^{l} (x) - I_{LCB}^{\mu } (x)) \times CR(t) + I_{LCB}^{\mu } (x) \, t = 1 \hfill \\ (I_{LCB}^{h} (x) - I_{LCB}^{\mu } (x)) \times CR(t) + I_{LCB}^{\mu } (x) \, t = 2 \hfill \\ \end{gathered} \right.$$
(24)

where \(t = 1\) and \(t = 2\) select the improvement function at the low and the high fidelity, respectively, and \(CR(t)\) is the cost coefficient formulated as follows:

$$CR(t) = \left\{ \begin{gathered} c \, t = 1 \hfill \\ 1 \, t = 2 \hfill \\ \end{gathered} \right.$$
(25)

where \(c \ge 1\) is the ratio of the cost of an HF simulation to that of an LF simulation. With the cost coefficient applied, the HV improvement of an LF sample is expanded proportionally, bringing the computational cost of the two fidelities to the same level so that their HV improvements can be compared fairly and the more appropriate fidelity chosen. When the cost ratio \(c\) is larger than one, however, it would be unfair to compare \(I_{LCB}^{l} (x)\) with the corresponding \(I_{LCB}^{h} (x)\) after multiplying the whole improvement by \(CR(t)\), because the predicted mean is identical across fidelity levels in the BF surrogate model. Hence, the effect of the cost coefficient on the HV improvement of the predicted mean should be eliminated, which is why only the uncertainty part \(I_{LCB}^{l} (x) - I_{LCB}^{\mu } (x)\) is scaled in Eq. (24).

Since the cost coefficient \(CR(t)\) at the HF level equals 1, the \(t = 2\) branch of Eq. (24) reduces to

$$(I_{LCB}^{h} (x) - I_{LCB}^{\mu } (x)) \times 1 + I_{LCB}^{\mu } (x) = I_{LCB}^{h} (x)$$
(26)

and the \(I(x,t)\) can be simplified as follows:

$$I(x,t) = \left\{ \begin{gathered} (I_{LCB}^{l} (x) - I_{LCB}^{\mu } (x)) \times CR(t) + I_{LCB}^{\mu } (x) \, t = 1 \hfill \\ I_{LCB}^{h} (x) \, t = 2 \hfill \\ \end{gathered} \right.$$
(27)
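Eq. (27) then reduces to a few lines of code; in this sketch, `c` is the cost ratio of Eq. (25), `I_l` and `I_h` come from the helper in Sec. 3.1, and `I_mu` is the HV improvement of the predicted mean from Eq. (22), computed with the same `hv_improvement` helper.

```python
def improvement(I_l, I_h, I_mu, t, c):
    """Cost-weighted improvement, Eq. (27); t = 1 is LF, t = 2 is HF,
    c >= 1 is the HF/LF cost ratio of Eq. (25)."""
    if t == 2:
        return I_h                  # CR(2) = 1 collapses Eq. (24), see Eq. (26)
    return (I_l - I_mu) * c + I_mu  # only the uncertainty part is scaled
```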

When \(f_{LCB} ({\text{x}})\) lies in the dominated region, every term of \(I(x,t)\) equals zero, so the function cannot indicate which fidelity level is more appropriate. Moreover, the algorithm used to optimize the acquisition function, such as a genetic algorithm (GA) (Jin 2011), generates its initial population by random sampling, whereas Pareto solutions usually occupy only a narrow region of the design space. The GA may therefore fail to find a solution located near the Pareto set during the search, which reduces the search efficiency of the optimization process. Figure 5 illustrates the situation in which the LCB vector lies in the region dominated by the Pareto frontier.

Fig. 5 The LCB function located in the dominated region

To address this issue, following our previous work (Shu et al. 2020), a novel acquisition function is defined as follows:

$$a(x,t) = \left\{ \begin{gathered} I(x,t),\quad {\text{if }}f_{LCB} \,{\text{is non-dominated}} \hfill \\ - \mathop {\min }\limits_{i = 1,2, \ldots ,q} \left( \left\| {f_{LCB} (x) - f(X_{i} )} \right\|_{2} \right),\quad {\text{if }}f_{LCB} \,{\text{is dominated}} \hfill \\ \end{gathered} \right.$$
(28)

where \(I(x,t)\) is given by Eq. (27), \(X_{i}\) (\(i = 1,2,...,q\)) denotes the current Pareto solutions, and \(\left\| {f_{LCB} (x) - f(X_{i} )} \right\|_{2}\) is the Euclidean distance between \(f_{LCB} (x)\) and the Pareto solution \(f(X_{i})\). This acquisition function encourages the GA toward solutions close to the Pareto frontier, improving the optimization efficiency.
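A sketch of Eq. (28); `pareto_F` is assumed to hold the objective values \(f(X_i)\) of the current Pareto solutions, and `f_lcb` is the LCB vector of the candidate at the fidelity level under consideration.

```python
import numpy as np

def acquisition(f_lcb, t, pareto_F, I_l, I_h, I_mu, c):
    """Eq. (28): cost-weighted improvement where f_LCB is non-dominated,
    otherwise minus the distance to the nearest current Pareto point."""
    dominated = np.any(np.all(pareto_F <= f_lcb, axis=1) &
                       np.any(pareto_F < f_lcb, axis=1))
    if dominated:
        return -np.min(np.linalg.norm(pareto_F - f_lcb, axis=1))
    return improvement(I_l, I_h, I_mu, t, c)
```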

3.3 Procedure and termination condition

The proposed bi-fidelity multi-objective BO method, based on the BF model and the LCB function (BF-MOLCB), follows the process illustrated in Fig. 6. In this study, Latin hypercube sampling (LHS) (Wang 2003) is applied to generate the two initial sample sets, one per fidelity level. After the responses of the initial samples are evaluated, the current Pareto frontier is obtained by the non-dominated sorting method (Deb et al. 2002).

Fig. 6 Flowchart of the proposed method

The procedure of BF-MOLCB, summarized in the flowchart of Fig. 6, includes the following steps:

  • Step 1 Generate two sample sets for the LF and HF models by LHS.

  • Step 2 Evaluate the true responses of the corresponding samples.

  • Step 3 Construct the LF Kriging models for each objective using the DACE toolbox (Lophaven et al. 2002).

  • Step 4 Construct the hierarchical Kriging model based on the LF Kriging model and the HF samples. The initial value of each hyperparameter \(\theta\) is set to 1, and its search range is set to \([10^{ - 6} ,10^{3} ]\).

  • Step 5 Obtain the current Pareto frontier by the non-dominated sorting method.

  • Step 6 Check the termination criterion; if it is not satisfied, go to Step 7. Otherwise, stop the method and output the solutions.

  • Step 7 Maximize the proposed acquisition function to obtain a new sample and its fidelity level, then return to Step 2.

The optimization process is terminated after the following two conditions are satisfied:

  (1) The algorithm has found at least 20 Pareto solutions, so that the Pareto set provides enough choices for the decision maker, or the computational expense has reached the maximum budget, defined as 200 in this paper.

  (2) The discrepancy of the quality metric between two adjacent iterations represents the change of the current Pareto frontier. Hence, when the ratio of this difference to the current HV value is less than a specified threshold (0.1% in this paper), the algorithm is terminated. The ratio is defined as follows:

    $$HV_{T} = \frac{{HV_{i} - HV_{i - 1} }}{{HV_{i} }}$$
    (29)

where \(HV_{i}\) is the HV value at the ith iteration (Shu et al. 2020).
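One plausible reading of this two-part criterion in code form (how the budget cap interacts with the HV test is our interpretation, not spelled out in the text):

```python
def should_stop(hv_history, n_pareto, cost,
                tol=1e-3, min_solutions=20, max_budget=200):
    """Stop test: hard-stop at the budget cap; otherwise require at least
    `min_solutions` Pareto solutions and HV_T of Eq. (29) below `tol`."""
    if cost >= max_budget:
        return True
    if n_pareto < min_solutions or len(hv_history) < 2:
        return False
    hv_t = (hv_history[-1] - hv_history[-2]) / hv_history[-1]
    return hv_t < tol
```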

4 Examples and results analysis

In this section, four numerical examples and two engineering design problems are adopted to test the efficiency and applicability of the proposed method. The formulations and true Pareto solutions of the four numerical examples (Shu et al. 2019) are summarized in Table 1. Two single-fidelity BO methods (Sun et al. 2021; Zhan et al. 2017) that use only HF simulations and two bi-fidelity BO methods (He et al. 2022) are compared with the proposed BF-MOLCB method: (1) the Euclidean distance-based EIM method (EIMe), (2) the Euclidean distance-based LCBM method (LCBMe), (3) the hypervolume-based VFEIM method (VFEIMh), and (4) the VFEMHVI method.

Table 1 The formulation of the four numerical examples

4.1 Numerical examples

The ZDT1, ZDT2, FON, and POL examples are solved by the above methods and the results are compared in this subsection. The cost of the initial samples is set to 10n (n is the dimension of the design variables), where the cost of a bi-fidelity method is converted to an equivalent cost as follows:

$$COST = \frac{{N_{l} }}{CR} + N_{h}$$
(30)

where \(N_{l}\) and \(N_{h}\) denote the numbers of LF and HF samples, respectively. For the initial sampling, \(N_{l}\) and \(N_{h}\) are set to 5n·CR and 5n, respectively. \(CR\) is the ratio of cost between HF and LF samples, set to 4 in the following numerical test problems. To compute the HV indicator, a reference point is set for each test function, as shown in Table 2.
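For reference, the equivalent-cost bookkeeping of Eq. (30) in code; with the stated initial sampling, \(N_l/CR + N_h = 5n + 5n = 10n\), consistent with the 10n initial cost above.

```python
def equivalent_cost(n_l, n_h, cr):
    """Equivalent cost of a bi-fidelity run, Eq. (30)."""
    return n_l / cr + n_h

# Initial sampling for an n-dimensional problem with CR = 4:
n, cr = 3, 4
assert equivalent_cost(5 * n * cr, 5 * n, cr) == 10 * n   # matches 10n
```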

Table 2 Reference points to calculate HV indicator in each test function

First, a detailed comparison of these methods is demonstrated on the ZDT1 example (dimension \(n = 3\)). Each method is run 10 times, and the statistics of the HV indicator and the computational expense (COST) are recorded. The comparison results are summarized in Table 3, including the mean, median, and standard deviation (STD) of HV and COST over the 10 runs. The best indicators are shown in bold in the table. Figure 7 shows box charts of the HV and COST obtained by the different methods on the ZDT1 problem.

Table 3 The compared results of different methods for ZDT1 problem
Fig. 7 Box charts of the HV and COST. a HV, b COST

As the HV indicators in Table 3 show, the LCBMe method, the VFEMHVI method, and the proposed method achieve the best Pareto frontier approximation quality. Meanwhile, the smallest mean and STD of COST demonstrate that the proposed method attains this performance at the lowest computational expense: its mean COST is 55.67%, 43.74%, 46.03%, and 28.13% lower than that of EIMe, LCBMe, VFEIMh, and VFEMHVI, respectively. Figure 7 illustrates the comparison intuitively; the COST of the proposed method shows the least conspicuous outliers in the box chart.

Table 4 presents the average computational expense required to obtain different numbers of Pareto solutions on ZDT1, further comparing the efficiency of these methods in finding Pareto solutions. As shown in Table 4, the proposed method always requires the lowest average computational expense. Finding enough Pareto solutions is important to provide adequate options for designers.

Table 4 The average computational expense required to obtain different number of Pareto solutions for ZDT1 problem

The quality of the Pareto frontier is also an important metric in multi-objective optimization, as it directly determines the information provided to designers. Hence, it is necessary to compare the Pareto frontiers obtained by the different methods. Figure 8 shows the Pareto frontiers with the lowest HV indicator over the 10 runs. On the ZDT1 problem, the best Pareto frontiers are again obtained by the VFEMHVI method and the proposed method, but the advantage is not pronounced: the difference between the mean HV of the proposed method and that of VFEMHVI is below 0.01, and the other methods remain competitive in this case.

Fig. 8 Pareto frontier with the lowest HV indicator in 10 runs on the ZDT1 problem

The comparisons on the remaining examples are summarized in Tables 5 and 6. All methods perform satisfactorily in obtaining the Pareto solutions of the FON and ZDT2 problems with \(n = 3\), and the proposed method still has the lowest computational cost with competitive HV indicators. The POL problem shows a rather different picture. Although the discrepancy in cost among the methods is not large, the cost of the proposed method is slightly higher than that of the single-fidelity methods and the VFEIMh method. However, the proposed method attains the highest HV indicator, meaning it obtains a Pareto set of better quality than the other methods. Similar to Fig. 8, Fig. 9 shows the Pareto frontiers with the lowest HV indicator over the 10 runs on the POL problem. The Pareto solutions obtained by the proposed method are distributed uniformly on the true Pareto frontier, clearly better than those of VFEIMh and the single-fidelity methods. The VFEMHVI method behaves similarly, but its expense is 3.43% higher than that of the proposed method.

Table 5 The compared results of different methods for ZDT2 (n = 3), POL and FON problems
Table 6 The compared results of different methods for high-dimension ZDT2 problems
Fig. 9 Pareto frontier with the lowest HV indicator in 10 runs on the POL problem

Fig. 10 Parameterization of the torque arm (Shu et al. 2020)

Additionally, several high-dimensional test problems (the ZDT2 problem with dimensions \(n = 6, 8, 10\)) measure the ability to solve more complex optimization problems. The proposed method shows prominent performance on these problems, as reported in Table 6. The "/" symbol denotes that the termination criterion could not be satisfied before the computational expense reached the maximum budget of 200. For these high-dimensional problems, the proposed method always has the lowest computational expense together with a good-quality Pareto frontier. On the ZDT2 problem with \(n = 6\), the single-fidelity methods sometimes fail to obtain 20 Pareto solutions before the maximum budget is reached. The VFEMHVI and the proposed BF-MOLCB methods improve notably on this, the mean COST of the proposed method being 52.89%, 62.87%, 60.24%, and 34.70% lower than that of EIMe, LCBMe, VFEIMh, and VFEMHVI, respectively. The VFEMHVI method obtains the largest HV indicator, but the proposed method also obtains a nearly ideal Pareto frontier (the difference in HV between VFEMHVI and the proposed method is less than 0.01% of the whole HV) at a much lower computational expense. On the test problems with \(n = 8, 10\), none of the single-fidelity methods can meet the termination criterion before the maximum budget of 200 is reached, and the costs of the VFEIMh and VFEMHVI methods are also unsatisfactory. In particular, on ZDT2 with \(n = 10\), only the proposed method still obtains a satisfactory Pareto frontier at a controllable computational expense.

Comparing these test problems in terms of dimension, the low-dimensional problems (\(n = 2, 3\)) show a small difference in computational cost between the methods, whereas the discrepancy on the high-dimensional problems is much more pronounced. The bi-fidelity surrogate models therefore outperform single-fidelity models in approximating complex problems, which confirms the value of using bi-fidelity models in BO for multi-objective optimization.

4.2 Engineering case

In this section, two engineering examples, the design optimization of a torque arm and of a honeycomb-structure vibration isolator, are adopted to verify the performance of the proposed method. The torque arm is fixed at its left end, and the loads \(P_{1} = 8.0\,{\text{kN}}\) and \(P_{2} = 4.0\,{\text{kN}}\) are exerted at its right end (Fig. 10).

Six design variables are applied for the torque arm in this example: \(\alpha ,b_{1} ,D_{1} ,h,t_{1}\), and \(t_{2}\). The optimization problem is expressed as follows:

$$\begin{gathered} \min f_{1} = V(\alpha ,b_{1} ,D_{1} ,h,t_{1} ,t_{2} ) \\ \min f_{2} = \max \_d(\alpha ,b_{1} ,D_{1} ,h,t_{1} ,t_{2} ) \\ where \, 3\deg \le \alpha \le 4.5\deg ; \, 25mm \le b_{1} \le 35mm \\ 90mm \le D_{1} \le 120mm; \, 20mm \le h \le 30mm \\ 12mm \le t_{1} \le 22mm; \, 8mm \le t_{2} \le 12mm \\ \end{gathered}$$
(31)

where \(V\) is the total volume, \(\max \_d\) is the maximum displacement of the torque arm, the Young's modulus is 200 GPa, and the Poisson's ratio is 0.3. In this example, the reference point is [1000; 10] and the cost ratio between high- and low-fidelity samples is 4.

Each method is run 10 times, and the averages of the HV indicator and the computational expense are recorded, as shown in Table 7. All methods meet the termination condition; among them, the proposed method achieves the best Pareto solution quality, having the highest HV indicator. However, its cost is higher than that of the other methods. The reason is that the non-dominated solutions obtained early on by the single-fidelity methods and the VFEIMh method already satisfy the termination criterion, so these methods stop before obtaining an ideal Pareto set. Figure 11 shows the Pareto frontiers with the lowest HV indicator over the 10 runs on the torque arm problem. The Pareto solutions obtained by the proposed method are distributed uniformly on the true Pareto frontier, distinctly better than those of the single-fidelity methods and VFEIMh.

Table 7 The comparison with different methods for the torque arm problem
Fig. 11 Pareto frontier with the lowest HV indicator in 10 runs on the torque arm problem

To compare the methods more fairly, the quality of the Pareto solution sets is also compared under the same budget, set to 100 in this study. Each method is run 10 times, terminating when the optimization cost reaches 100, and the median, mean, and STD of the HV indicator are summarized in Table 8. The median and mean HV obtained by the proposed method are still better than those of the other methods, while the smallest STD reveals the stability of the proposed method. Even with the budget of 100, the HV indicators obtained by the single-fidelity methods and the VFEIMh method remain below the value the proposed method attains once it has found 20 Pareto solutions (5488.2470 on average), which costs the proposed method only 35.93 on average. The VFEMHVI method also performs well, with only a slight discrepancy from the proposed method.

Table 8 The comparison with the same budget for the torque arm problem

Figure 12 presents the Pareto frontiers with the lowest HV indicator under the same budget. The Pareto frontiers obtained by the single-fidelity methods are much better than before but are still distributed unevenly, and the diversity of their Pareto solutions is worse than that of the proposed method, with many non-dominated solutions concentrated in a small area. Compared with the other two bi-fidelity methods, the proposed method still performs best on the torque arm optimization problem.

Fig. 12 Pareto frontier of the lowest HV indicator with the same budget

In the vibration isolator optimization problem, six design variables are applied: \(\theta ,D,R_{f} ,L,t_{h}\), and \(t_{l}\), as shown in Fig. 13. The natural frequency ratio \(e\) and the equivalent strain \(\varepsilon_{\max }\) are the two objectives to be minimized. The nonlinear coefficient \(\xi\), the transverse-longitudinal stiffness ratio \(R_{k}\), the total height \(H\), and the dimensional coordination condition \(T_{s}\) are the constraints, of which \(\xi\) and \(R_{k}\) must be evaluated by computationally expensive simulations while \(H\) and \(T_{s}\) can be obtained from closed-form formulas. The optimization problem is expressed as follows:

$$\begin{gathered} \min \; e(\theta ,D,R_{f} ,L,t_{h} ,t_{l} ) = \left| {1 - \frac{f}{{f_{0} }}} \right| \\ \min \; \varepsilon_{\max } (\theta ,D,R_{f} ,L,t_{h} ,t_{l} ) \\ s.t. \, g_{1} = 1 - \frac{\xi }{0.75} \le 0,\; g_{2} = 1 - \frac{{R_{k} }}{1.2} \le 0,\; g_{3} = \frac{H}{70} - 1 \le 0,\; g_{4} = T_{s} \le 0 \\ \end{gathered}$$
(32)

where the parameters are given in Tables 9, 10, and 11.

Fig. 13 The vibration isolator and the geometry of a honeycomb cell (Qian et al. 2021)

Table 9 Fixed parameters
Table 10 Physical meaning and ranges of design variables
Table 11 Objective and constraint functions and their boundary conditions

The vibration isolator test problem is modeled and simulated in ANSYS 18.2, where the high- and low-fidelity levels are defined by adjusting the mesh density. The number of mesh divisions through the inclined beam thickness \(t_{l}\) is set to 6 in the high-fidelity analysis model and to 2 in the low-fidelity model, as shown in Fig. 14. The cost coefficient is set to 3. A penalty function handles the constraints obtained from closed-form formulas; the constraints requiring time-consuming simulations are handled based on the method proposed by Schonlau (1998), which was extended to multi-fidelity optimization in our previous work (Shu et al. 2021). The reference point is set to [10; 0.2].

Fig. 14 Mesh division for the high- and low-fidelity finite element models. a Fine mesh. b Coarse mesh (Qian et al. 2021)

Each method is run 10 times, and the averages of the HV indicator and the computational expense are recorded, as shown in Table 12. All methods meet the termination condition; among them, the proposed method achieves the best Pareto solution quality, having the highest HV indicator. Regarding the computational expense, the variation between the methods is small: EIMe has the lowest cost, VFEMHVI the highest, and the other three methods are at almost the same level. The proposed method nonetheless shows a notable advantage in the HV indicator, especially compared with the EIMe, LCBMe, and VFEIMh methods. Figure 15 shows the Pareto frontiers with the lowest HV indicator over the 10 runs; the Pareto frontier obtained by the proposed method is better than those obtained by the other methods.

Table 12 The comparison with different methods for the engineering problem
Fig. 15 Pareto frontier with the lowest HV indicator in 10 runs on the engineering problem

5 Conclusion

A bi-fidelity Bayesian optimization method for multi-objective optimization has been proposed. In the proposed method, LCB functions are adopted to exploit the uncertainty of the surrogate model. On this basis, a novel acquisition function is defined, in which a cost coefficient balances the computational expense and the accuracy of new sample points at different fidelity levels. The new sample is then determined by maximizing the proposed acquisition function.

Four numerical test problems of different dimensions and two real-world engineering design optimization problems were used to investigate the feasibility and efficiency of the proposed method. The main conclusions are as follows:

  1) The effect of samples of different fidelities on improving the quality of the current Pareto set is quantified in the proposed method;

  2) The proposed method balances the cost and the effect of samples of different fidelities, and can use LF samples to further reduce the computational expense;

  3) Compared with two state-of-the-art single-fidelity methods and two bi-fidelity methods, the proposed method shows prominent performance in computational cost and in the improvement of Pareto solutions, especially for solving high-dimensional problems.

In the future, several directions are worth exploring. First, acquisition functions for choosing new samples from more than two fidelity levels remain to be developed and tested. Second, following the research of Foumani et al. (2022), other acquisition functions such as EI and PI could be used in multi-objective optimization with multi-fidelity models. Last but not least, the proposed method could be extended to solve multi-objective robust optimization problems.