1 Introduction

Uncertainty propagation (UP) methods, which quantify the uncertainty in system output performance induced by random or noisy inputs, are of great importance for design under uncertainty, such as robust design and reliability-based design (Du and Chen 2002; Chen et al. 2015). A wide variety of UP techniques have been developed (Lee and Chen 2009), among which the polynomial chaos (PC) technique is one of the most popular approaches due to its mathematically rigorous concept, strong theoretical basis, and inherent ability to converge to computer calculation precision (Eldred 2009). With PC, a stochastic quantity can be represented as a polynomial chaos expansion, from which the statistical moments and reliability can be conveniently obtained. Oftentimes, the analysis models in practical engineering are highly nonlinear and computationally expensive, such as computational fluid dynamics (CFD) for aerodynamic analysis, resulting in intensive computational cost when implementing UP via PC; this becomes more severe for high-dimensional problems. Therefore, the straightforward application of PC to the expensive model for UP might be too costly and infeasible in practical applications.

Generally, a complicated physical process can be modeled using several methods with different levels of fidelity, or a computer code for a complex problem can be run at different levels of fidelity. For example, aircraft aerodynamic analysis can be simulated with different reduced-order physics (e.g., Euler model vs. potential flow model) or different numerical solvers (e.g., finite difference method vs. finite element analysis). A high-fidelity (HF) model takes more computational time but offers higher accuracy, whereas a low-fidelity (LF) model is faster at the cost of accuracy. Exploiting the availability of multiple models within a hierarchy of fidelity is popular in assisting optimization processes in engineering (Gratiet and Cannamela 2012; Huang et al. 2006; Gratiet et al. 2014). Recently, this scenario has been extended to UP via PC to improve computational efficiency, and it has received considerable interest (Shah et al. 2015; Ng and Eldred 2012; Zhu et al. 2017; Zhu et al. 2014). The earliest work on the multi-fidelity PC method was proposed by Ng and Eldred (Ng and Eldred 2012), in which the stochastic collocation technique was employed to construct the PC model, and the LF and correction PC expansions are integrated into a single expansion in an additive, multiplicative, or combined form to match the HF model values. As stated by Ng and Eldred (Ng and Eldred 2012), for the multiplicative and combined correction forms, the calculation of the multi-fidelity polynomial coefficients is much more complicated and the accuracy is generally worse or only comparable; thus, the additive form is widely studied and applied. The method with the additive form has been applied to UP for a vertical axis wind turbine under extreme gusts (Santiago Padron et al. 2014; Palar et al. 2018). Another similar technique is the multi-fidelity stochastic collocation that relies on Lagrange-polynomial interpolation, in which a greedy procedure based on information from the LF model is used to select "important" sample points for the HF simulations (Zhu et al. 2014, 2017).

In the widely studied method proposed by Ng and Eldred, the roots of the orthogonal polynomials are employed as the collocation points for the PC expansion. Therefore, the number and locations of the collocation points cannot be arbitrary, resulting in less flexibility in performing UP for a user with a limited computational budget. To address this issue, a multi-fidelity PC approach using regression has been developed based on the work of Ng and Eldred (Pramudita et al. 2016), and has been applied to multidisciplinary design optimization under uncertainty (West and Gumbert 2017) and aerodynamic robust optimization (Palar et al. 2015). All the above PC-based multi-fidelity modeling approaches for UP require the HF sample points to be a subset of the LF ones (i.e., nested sample points). To relax this constraint and improve flexibility, Berchier proposed to calculate the correction expansion term using the low-fidelity PC model rather than the LF sample points (Matteo 2016).

In the deterministic case, the most well-known multi-fidelity modeling method is the multi-level co-kriging approach proposed by Kennedy and O’Hagan (KOH for short in this work), which employs a discrepancy-based autoregressive multi-fidelity modeling formulation and the Gaussian process (GP) modeling technique (Kennedy and O’Hagan 2000). It has been widely recognized that KOH is more accurate and flexible than the classic additive and multiplicative correction forms for multi-fidelity modeling (Laurenceau and Sagaut 2008; Toal et al. 2011; Han et al. 2012; Huang et al. 2013; Toal and Keane 2015). One main reason is that the KOH framework employs the Gaussian process modeling method, which can flexibly capture the nonlinearity of the model, together with a scaling factor on the LF model, which helps improve the accuracy of the correction term and avoid the bumpiness issue (Fernández-Godino et al. 2016). This prompts the question of whether the KOH multi-fidelity modeling framework can be extended to the stochastic domain for UP, within which multi-fidelity PC can be implemented to improve the performance of UP. As is well known, the basic theoretical foundation of KOH is the GP modeling method, and the predicted response in KOH can be represented as a GP. Recently, the PC method has been extended to polynomial-chaos-kriging (PC-Kriging) by adding a GP term to the original PC model to more accurately capture the local variability of the response model (Schobi et al. 2015). It has been demonstrated that the PC-Kriging approach is more accurate than both PC and kriging in performing UP (Schobi et al. 2015; Kersaudy et al. 2015). Clearly, the success of introducing GP into PC for UP lays the foundation for extending the KOH framework to multi-fidelity PC. Therefore, a multi-fidelity PC approach is developed and studied using the KOH framework in this paper.

In addition, almost all works on multi-fidelity UP focus on model fusion within a hierarchy of model fidelities. However, in many applications it is not possible to rank models by their levels of fidelity a priori, i.e., the fidelities are non-hierarchical. For example, climate system models are developed by different research groups to understand and predict the system's behavior, based on disparate theories or mechanisms to incorporate the physics and chemistry of the atmosphere, ocean, and land surface (Allaire and Willcox 2012). A non-hierarchical multi-fidelity modeling approach from the deterministic point of view has been developed by Chen et al. using a spatial random process (Chen et al. 2016), which is extended to multi-fidelity PC with non-hierarchical fidelities in this work.

The objective of this work is to explore the applicability and effectiveness of Gaussian process modeling theory for multi-fidelity UP via the PC technique. For hierarchical fidelities, the well-known KOH framework is extended to multi-fidelity UP, in which the lowest-fidelity model and all the correction terms are each represented as a PC-Kriging model. For non-hierarchical fidelities, the weighted summation method proposed by Chen et al. is extended, in which all the lower-fidelity models and the correction term are each represented as a PC-Kriging model. Meanwhile, for high-dimensional problems, the hyperbolic truncation scheme is employed to reduce the number of orthogonal polynomials during the construction of the PC term in the PC-Kriging model, and thus to reduce the computational cost.

The remainder of this paper is organized as follows. A brief review of the PC-Kriging method combining PC and Gaussian process modeling for UP is given in Section 2. The proposed multi-fidelity UP method using PC and the Gaussian process modeling technique is presented in Section 3, in which the multi-fidelity UP strategies for hierarchical and non-hierarchical fidelities are explained in detail. Comparative studies on numerical problems are presented in Section 4, where the commonly used co-kriging method (Kennedy and O’Hagan 2000) and the multi-fidelity PC method (Ng and Eldred 2012) are also tested for comparison. In Section 5, the proposed method is applied to an aerodynamic robust optimization problem to further verify its effectiveness and applicability to practical problems. Conclusions are drawn in Section 6.

2 Review of PC-Kriging

The polynomial-chaos-kriging (PC-Kriging) method is a recently developed approach for UP that adds a GP term to the PC model. As is well known, the PC model, formulated as a weighted sum of a set of orthogonal polynomials, can efficiently capture the global behavior of the analysis model. The introduction of the GP term helps to capture the local variability and thus improves the accuracy of PC for UP. With PC-Kriging, a stochastic response y = g(x) with a d-dimensional input vector x = [x1, ..., xd] can be represented as follows:

$$ y\approx {M}^{(PCK)}\left(\mathbf{x}\right)=\sum \limits_{i=0}^P{b}_i{\varPhi}_i\left(\mathbf{x}\left(\xi \right)\right)+{\sigma}^2Z\left(\mathbf{x}\right) $$
(1)

where \( \sum \limits_{i=0}^P{b}_i{\varPhi}_i\left(\mathbf{x}\left(\xi \right)\right) \) is a PC model of order p describing the mean value of the Gaussian process, ξ is a standard random vector generated by mapping the original random vector x to the standard random space according to the distribution parameters of x, bi is the ith coefficient of the PC model, σ is the prior standard deviation of the Gaussian process, and Z(x) is a zero-mean, unit-variance stationary Gaussian process with autocorrelation function R.

The commonly used formulation of R in the literature is:

$$ R\left(\mathbf{x},{\mathbf{x}}^{\prime },\boldsymbol{\theta}, h\right)=\exp \left(-{\sum}_{k=1}^d{\theta}_k{\left|{x}_k-{x}_k^{\prime}\right|}^h\right) $$
(2)
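
To make this notation concrete, the following is a minimal Python sketch (not the authors' code) of the correlation function in (2) and of the resulting autocorrelation matrix; the function names and the plain double loop are illustrative assumptions only.

```python
# Minimal sketch of the correlation function (2); numpy arrays are assumed for
# x, x', and theta (length d), and h is a scalar exponent.
import numpy as np

def correlation(x, x_prime, theta, h):
    """R(x, x', theta, h) = exp(-sum_k theta_k |x_k - x'_k|^h)."""
    return np.exp(-np.sum(theta * np.abs(x - x_prime) ** h))

def correlation_matrix(X, theta, h):
    """Autocorrelation matrix R with R_ij = R(x_i, x_j, theta, h) over the rows of X."""
    n = X.shape[0]
    R = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            R[i, j] = correlation(X[i], X[j], theta, h)
    return R
```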

Building a PC-Kriging stochastic metamodel consists of two parts: (i) the construction of Φi(x(ξ)) in the PC term, and (ii) the estimation of the hyper-parameters (θ, h, σ) and b = [b0, ..., bP]T. For the first part, the commonly used method is the direct tensor product technique. For high-dimensional problems, the least angle regression method (Wang et al. 2016) or the hyperbolic truncation scheme (Blatman and Sudret 2011) shown in (3) can be employed to remove unimportant orthogonal polynomials and thus reduce the computational cost of PC:

$$ {A}_q^{d,\omega }=\left\{\boldsymbol{\upalpha} \in {\mathbb{N}}^d:{\left\Vert \boldsymbol{\upalpha} \right\Vert}_q={\left(\sum \limits_{i=1}^d{\alpha}_i^q\right)}^{\frac{1}{q}}\le \omega \right\} $$
(3)

where \( {A}_q^{d,\omega } \) is the truncation set of multi-indices α for PC, q is the sparse factor, ω is the highest order of the polynomials Φ(x(ξ)), and αi is the degree of the ith random variable in Φ(x(ξ)).
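
As an illustration of the truncation in (3), the short Python sketch below enumerates the retained multi-indices for given d, ω, and q; the brute-force enumeration over the full tensor grid is an assumption made only to keep the example compact.

```python
# Minimal sketch of the hyperbolic truncation set (3): keep every multi-index
# alpha in N^d whose q-quasi-norm does not exceed the maximum order omega.
from itertools import product

def hyperbolic_index_set(d, omega, q):
    """Return the multi-indices alpha with ||alpha||_q <= omega."""
    kept = []
    for alpha in product(range(omega + 1), repeat=d):
        if sum(a ** q for a in alpha) ** (1.0 / q) <= omega + 1e-12:
            kept.append(alpha)
    return kept

# For instance, d = 6, omega = 5, q = 0.5 (values similar to those used in
# Section 4.1) retains far fewer basis terms than the full total-degree basis.
print(len(hyperbolic_index_set(6, 5, 0.5)))
```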

For the second part, the maximum likelihood estimation (MLE) method or the cross validation (CV) method can be employed (Schobi et al. 2015), in which the hyper-parameters are obtained by minimizing the following criteria, respectively:

$$ {G}_{ML}\left(\boldsymbol{\theta}, h\right)=\underset{\theta, h}{\arg\ \min}\left[\frac{1}{N}{\left(\boldsymbol{y}-\mathbf{F}\boldsymbol{b}\right)}^T{\mathbf{R}}^{-1}\left(\boldsymbol{y}-\mathbf{F}\boldsymbol{b}\right){\left(\det \mathbf{R}\right)}^{\frac{1}{N}}\right] $$
(4)
$$ {G}_{CV}\left(\boldsymbol{\theta}, h\right)=\underset{\theta, h}{\arg\ \min}\left[{\boldsymbol{y}}^T{\mathbf{R}}^{-1}\mathit{\operatorname{diag}}{\left({\mathbf{R}}^{-1}\right)}^{-2}{\mathbf{R}}^{-1}\boldsymbol{y}\right] $$
(5)

where N is the number of sample points, y = [y1, ..., yN]T is the response vector at the input sample points, R is the autocorrelation matrix whose element in the ith row and jth column is Rij = R(xi, xj, θ, h), and F is an N × (P + 1) matrix with Fij = Φj(xi(ξ)) (i = 1, ..., N; j = 0, 1, ..., P); the PC coefficient vector b can be represented as follows:

$$ \boldsymbol{b}\left(\boldsymbol{\theta}, h\right)={\left({\mathbf{F}}^T{\mathbf{R}}^{-1}\mathbf{F}\right)}^{-1}{\mathbf{F}}^T{\mathbf{R}}^{-1}\boldsymbol{y} $$
(6)

Once all the parameters are obtained, the predicted response at any new input site xp can be calculated by (Rasmussen and Williams 2006):

$$ {y}_{(PCK)}\left({\mathbf{x}}_p\right)=\boldsymbol{\Phi} {\left({\mathbf{x}}_p\left(\boldsymbol{\xi} \right)\right)}^T\boldsymbol{b}+R\left({\mathbf{x}}_p,\mathbf{X},\boldsymbol{\theta}, h\right){\mathbf{R}}^{-1}\left(\boldsymbol{y}-\mathbf{F}\boldsymbol{b}\right) $$
(7)

where X is the matrix stacking all the collected input sample points.
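
As a small illustration of how (6) and (7) can be evaluated numerically, the following Python sketch computes the generalized least-squares coefficients and the PC-Kriging prediction for fixed (θ, h); build_basis (returning the row of PC basis values at a point) is an assumed helper, not part of the original formulation.

```python
# Minimal sketch of (6)-(7) for fixed hyper-parameters; F is the N x (P+1)
# matrix of PC basis evaluations at the training inputs X, y the responses.
import numpy as np

def correlation(x, x_prime, theta, h):           # same kernel as in (2)
    return np.exp(-np.sum(theta * np.abs(x - x_prime) ** h))

def fit_pck_coefficients(F, R, y):
    """Generalized least-squares estimate b = (F^T R^-1 F)^-1 F^T R^-1 y, cf. (6)."""
    Rinv_F = np.linalg.solve(R, F)
    Rinv_y = np.linalg.solve(R, y)
    return np.linalg.solve(F.T @ Rinv_F, F.T @ Rinv_y)

def predict_pck(x_new, X, F, R, y, b, build_basis, theta, h):
    """PC-Kriging prediction (7): PC trend plus GP interpolation of the residuals."""
    phi = build_basis(x_new)                     # [Phi_0(x_new), ..., Phi_P(x_new)]
    r = np.array([correlation(x_new, xi, theta, h) for xi in X])
    return phi @ b + r @ np.linalg.solve(R, y - F @ b)
```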

Then, Monte Carlo simulation (MCS) can be directly employed on the PC-Kriging metamodel to obtain the statistical moments and probabilistic distribution of random response y.

3 The proposed multi-fidelity UP method

3.1 Multi-fidelity UP for hierarchical fidelity

It is assumed that there exist s analysis models with responses yt(x), t = 1, ..., s, where a larger t corresponds to a higher level of fidelity and a larger computational cost. Therefore, y1(x) is the lowest-fidelity and cheapest model, while ys(x) is the highest-fidelity and most expensive one. x = [x1, x2, …, xd] ∈ ℝd represents a d-dimensional random input vector. A step-by-step description of the proposed multi-fidelity UP method using PC and the Gaussian process modeling technique for the hierarchical fidelity scenario is presented as follows.

  1. Step 1.

    According to the distribution information of the random input vector x, generate input sample points using methods such as Latin hypercube sampling or the Gaussian quadrature rule, and calculate the corresponding model responses at the different levels of fidelity.

    For the tth-level model (t = 1, ..., s), it is supposed that a set of response observations \( {\mathbf{d}}_t={\left[{y}^t\left({\mathbf{x}}_1^t\right),...,{y}^t\left({\mathbf{x}}_{n_t}^t\right)\right]}^T \) at input sites \( {\mathbf{D}}_t={\left[{\left({\mathbf{x}}_1^t\right)}^T,{\left({\mathbf{x}}_2^t\right)}^T,...,{\left({\mathbf{x}}_{n_t}^t\right)}^T\right]}^T \) has been collected. Let \( \mathbf{d}={\left[{\mathbf{d}}_1^T,...,{\mathbf{d}}_s^T\right]}^T \) denote all of the collected response data from all models at the input sites Γ = [D1; D2; ...; Ds]. Generally, the number of sample points nt decreases as t increases, considering the computational cost.

  2. Step 2.

    Construct the multi-fidelity PC-Kriging metamodel to replace the highest-fidelity model ys(x) by extending the KOH framework to UP.

The KOH formulation in constructing multi-level co-kriging (Kennedy and O’Hagan 2000) is:

$$ {y}^t\left(\mathbf{x}\right)={\rho}_{t-1}{y}^{t-1}\left(\mathbf{x}\right)+{\delta}^t\left(\mathbf{x}\right),t=2,\dots, s $$
(8)

where ρt − 1 represents the scaling factor between the model responses yt(x) and yt − 1(x), and the correction function δt(x) is a Gaussian process denoting the discrepancy between yt(x) and ρt − 1yt − 1(x).

Accordingly, the highest-fidelity output response ys(x) can be expressed as below based on (8):

$$ {y}^s\left(\mathbf{x}\right)=\left(\prod \limits_{i=1}^{s-1}{\rho}_i\right){y}^1(x)+\left(\prod \limits_{i=2}^{s-1}{\rho}_i\right){\delta}^2\left(\mathbf{x}\right)+\dots +{\rho}_{s-1}{\delta}^{s-1}\left(\mathbf{x}\right)+{\delta}^s\left(\mathbf{x}\right) $$
(9)

It is assumed that y1(x), δ2(x), ..., δs(x) can each be modeled by a GP, which are represented as PC-Kriging metamodels as follows:

$$ \Big\{{\displaystyle \begin{array}{c}{y}^1\left(\mathbf{x}\right)=\sum \limits_{i=0}^P{b}_i^1{\boldsymbol{\Phi}}_i^1\left(\mathbf{x}\left(\boldsymbol{\xi} \right)\right)+{\sigma}_1^2{Z}^1\left(\mathbf{x}\right)\sim \mathcal{GP}\left({M}^1\left(\mathbf{x}\right),{V}^1\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)\right)\\ {}{\delta}^2\left(\mathbf{x}\right)=\sum \limits_{i=0}^P{b}_i^2{\boldsymbol{\Phi}}_i^2\left(\mathbf{x}\left(\boldsymbol{\xi} \right)\right)+{\sigma}_2^2{Z}^2\left(\mathbf{x}\right)\sim \mathcal{GP}\left({M}^2\left(\mathbf{x}\right),{V}^2\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)\right)\\ {}\vdots \\ {}{\delta}^s\left(\mathbf{x}\right)=\sum \limits_{i=0}^P{b}_i^s{\boldsymbol{\Phi}}_i^s\left(\mathbf{x}\left(\boldsymbol{\xi} \right)\right)+{\sigma}_s^2{Z}^s\left(\mathbf{x}\right)\sim \mathcal{GP}\left({M}^s\left(\mathbf{x}\right),{V}^s\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)\right)\end{array}} $$
(10)

where \( M\left(\mathbf{x}\right)=\sum \limits_{i=0}^P{b}_i{\boldsymbol{\Phi}}_i\left(\mathbf{x}\left(\boldsymbol{\xi} \right)\right) \) is the mean function expressed as the weighted sum of Φi(x(ξ)) (i.e., the PC term), and V(x, x′) = σ2R(x, x′, θ, h) is the covariance function, representing the spatial covariance between any two inputs x and x′ of the GP.

Then, based on (9) and (10), the highest-fidelity model ys(x) can be further expressed as a GP:

$$ {\displaystyle \begin{array}{l}{y}^s\left(\mathbf{x}\right)\sim \mathcal{GP}\Big({M}^1\left(\mathbf{x}\right)\left(\prod \limits_{i=1}^{s-1}{\rho}_i\right)+{M}^2\left(\mathbf{x}\right)\left(\prod \limits_{i=2}^{s-1}{\rho}_i\right)+\dots +{M}^s\left(\mathbf{x}\right),\\ {}{\left(\prod \limits_{i=1}^{s-1}{\rho}_i\right)}^2{V}^1\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)+{\left(\prod \limits_{i=2}^{s-1}{\rho}_i\right)}^2{V}^2\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)+\dots +{V}^s\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)\Big)\end{array}} $$
(11)
  3. Step 3.

    Estimate all the unknown hyper-parameters by the maximum likelihood estimation method.

The unknown hyper-parameters in (11) are Δ = {B, σ, Θ, ρ, h}, where B = [(b1)T, …, (bi)T, ..., (bs)T]T is the polynomial coefficient vector with each element bi = [b0, b1, ..., bP]T (i = 1, ..., s), and the rest are σ = [σ1, …, σs]T, Θ = [θ1, …, θs]T, ρ = [ρ1, …, ρs − 1]T, and h = [h1, …, hs]T.

In this work, the method considering the full correlation of all the response models (Liu et al. 2018) is employed for parameter estimation. Based on the GP assumption, all the collected data d then follow a multivariate normal distribution, i.e.:

$$ \mathbf{d}\sim \mathcal{N}\left(\mathbf{HB},{\mathbf{V}}_d\right) $$
(12)

where H is defined as:

$$ \mathbf{H}=\left[\begin{array}{cccc}{\boldsymbol{\Phi}}^1\left({\mathbf{D}}_1\left(\boldsymbol{\upxi} \right)\right)& \mathbf{0}& \cdots & \mathbf{0}\\ {}{\rho}_1{\boldsymbol{\Phi}}^1\left({\mathbf{D}}_2\left(\boldsymbol{\upxi} \right)\right)& {\boldsymbol{\Phi}}^2\left({\mathbf{D}}_2\left(\boldsymbol{\upxi} \right)\right)& \mathbf{0}& \mathbf{0}\\ {}{\rho}_1{\rho}_2{\boldsymbol{\Phi}}^1\left({\mathbf{D}}_3\left(\boldsymbol{\upxi} \right)\right)& {\rho}_2{\boldsymbol{\Phi}}^2\left({\mathbf{D}}_3\left(\boldsymbol{\upxi} \right)\right)& {\boldsymbol{\Phi}}^3\left({\mathbf{D}}_3\left(\boldsymbol{\upxi} \right)\right)& \vdots \\ {}\vdots & \vdots & \vdots & \mathbf{0}\\ {}\left(\prod \limits_{i=1}^{s-1}{\rho}_i\right){\boldsymbol{\Phi}}^1\left({\mathbf{D}}_s\left(\boldsymbol{\upxi} \right)\right)& \left(\prod \limits_{i=2}^{s-1}{\rho}_i\right){\boldsymbol{\Phi}}^2\left({\mathbf{D}}_s\left(\boldsymbol{\upxi} \right)\right)& \cdots & {\boldsymbol{\Phi}}^s\left({\mathbf{D}}_s\left(\boldsymbol{\upxi} \right)\right)\end{array}\right] $$
(13)

and Φt(Dj(ξ)) is a matrix of size nj × (P + 1) (j, t = 1, 2, …, s), formulated as:

$$ {\boldsymbol{\Phi}}^t\left({\mathbf{D}}_j\left(\boldsymbol{\xi} \right)\right)=\left[{\varPhi}_0^t\left({\mathbf{D}}_j\left(\boldsymbol{\xi} \right)\right),{\varPhi}_1^t\left({\mathbf{D}}_j\left(\boldsymbol{\xi} \right)\right),\dots, {\varPhi}_P^t\left({\mathbf{D}}_j\left(\boldsymbol{\xi} \right)\right)\right] $$
(14)

The matrix Vd of size (n1 + ⋯ + ns) × (n1 + ⋯ + ns) is given as:

$$ {\mathbf{V}}_d=\left(\begin{array}{ccc}{V}_{1,1}& \cdots & {V}_{1,s}\\ {}\vdots & \ddots & \vdots \\ {}{V}_{s,1}& \cdots & {V}_{s,s}\end{array}\right) $$
(15)

where the tth diagonal block (nt × nt) is defined as:

$$ {V}_{t,t}={\sigma}_t^2{R}_t\left({\mathbf{D}}_t\right)+{\sigma}_{t-1}^2{\rho}_{t-1}^2{R}_{t-1}\left({\mathbf{D}}_t\right)+\dots +{\sigma}_1^2\left(\prod \limits_{i=1}^{t-1}{\rho}_i^2\right){R}_1\left({\mathbf{D}}_t\right) $$
(16)

with Ri(Dt) = Ri(Dt, Dt, θi, hi), and the off-diagonal block of size \( {n}_t\times {n}_{t^{\prime }} \) is given by:

$$ {V}_{t,{t}^{\prime }}=\underset{1\le t<{t}^{\prime}\le s}{\left(\prod \limits_{i=t}^{t^{\prime }-1}{\rho}_i\right)}\left({\sigma}_t^2{R}_t\left({\mathbf{D}}_t,{\mathbf{D}}_{t^{\prime }}\right)+\dots +{\sigma}_1^2\left(\prod \limits_{i=1}^{t-1}{\rho}_i^2\right){R}_1\left({\mathbf{D}}_t,{\mathbf{D}}_{t^{\prime }}\right)\right) $$
(17)
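
For readers who prefer code, the sketch below assembles H in (13) and V_d in (15)-(17) for the simplest two-level case (s = 2); the precomputed PC basis matrices and correlation blocks passed in as arguments are assumptions of this illustration.

```python
# Minimal sketch of (13) and (15)-(17) for s = 2; all PC basis matrices
# (Phi blocks) and correlation matrices (R blocks) are assumed precomputed.
import numpy as np

def assemble_H(Phi1_D1, Phi1_D2, Phi2_D2, rho1):
    """H of (13) for s = 2: block lower-triangular in the PC basis matrices."""
    zero_block = np.zeros((Phi1_D1.shape[0], Phi2_D2.shape[1]))
    top = np.hstack([Phi1_D1, zero_block])
    bottom = np.hstack([rho1 * Phi1_D2, Phi2_D2])
    return np.vstack([top, bottom])

def assemble_Vd(R1_D1D1, R1_D1D2, R1_D2D2, R2_D2D2, sigma1, sigma2, rho1):
    """V_d of (15): diagonal blocks from (16), off-diagonal block from (17)."""
    V11 = sigma1**2 * R1_D1D1
    V12 = rho1 * sigma1**2 * R1_D1D2
    V22 = sigma2**2 * R2_D2D2 + rho1**2 * sigma1**2 * R1_D2D2
    return np.block([[V11, V12], [V12.T, V22]])
```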

The maximum likelihood function is defined as follows:

$$ \mathcal{L}\left(\varDelta |\mathbf{d}\right)\propto {\left|{\mathbf{V}}_d\right|}^{-1/2}{\left|\mathbf{W}\right|}^{1/2}\mathit{\exp}\left\{-\frac{1}{2}{\left(\mathbf{d}-\mathbf{HB}\right)}^T{\mathbf{V}}_d^{-1}\left(\mathbf{d}-\mathbf{HB}\right)\right\} $$
(18)

where \( \mathbf{W}={\left({\mathbf{H}}^T{\mathbf{V}}_d^{-1}\mathbf{H}\right)}^{-1} \), and the PC coefficient matrix B in the PC-Kriging model can be derived using the first-order optimality condition:

$$ \overset{\frown }{\mathbf{B}}=\mathbf{W}{\mathbf{H}}^T{\mathbf{V}}_d^{-1}\mathbf{d} $$
(19)

The remaining parameters σ, Θ, ρ, and h can be obtained with a genetic algorithm or a simulated annealing algorithm by maximizing (18).
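
A minimal sketch of this estimation step is given below: B is concentrated out via (19), and the remaining hyper-parameters are searched globally by minimizing the negative log of (18) (here with scipy's differential evolution as a stand-in for the genetic algorithm mentioned above); the packed parameter vector, the build_H_Vd helper, and the bounds are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch of maximizing (18): minimize the negative log-likelihood with
# B eliminated through (19). build_H_Vd is an assumed helper that assembles
# H of (13) and V_d of (15)-(17) from the packed hyper-parameter vector.
import numpy as np
from scipy.optimize import differential_evolution

def neg_log_likelihood(params, d, build_H_Vd):
    H, Vd = build_H_Vd(params)
    Vd_inv_d = np.linalg.solve(Vd, d)
    Vd_inv_H = np.linalg.solve(Vd, H)
    W = np.linalg.inv(H.T @ Vd_inv_H)
    B = W @ H.T @ Vd_inv_d                        # (19)
    resid = d - H @ B
    _, logdet_Vd = np.linalg.slogdet(Vd)
    _, logdet_W = np.linalg.slogdet(W)
    # -log L up to an additive constant, cf. (18)
    return 0.5 * (logdet_Vd - logdet_W + resid @ np.linalg.solve(Vd, resid))

# Global search over (sigma, theta, rho, h); the bounds below are placeholders.
# result = differential_evolution(neg_log_likelihood, bounds, args=(d, build_H_Vd))
```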

  4. Step 4.

    MCS is conducted on the multi-fidelity PC-Kriging metamodel constructed above to obtain the stochastic properties of the random output y.

Based on the GP modeling theory, all the collected data d together with the to-be-predicted responses \( \boldsymbol{y}\left({\mathbf{x}}_p\right)={\left[y\left({\mathbf{x}}_1\right),...,y\left({\mathbf{x}}_{n_p}\right)\right]}^T \) follow a multivariate normal distribution:

$$ \left[\begin{array}{c}\mathbf{d}\\ {}y\left({\mathbf{x}}_p\right)\end{array}\right]\sim \mathcal{N}\left(\left[\begin{array}{c}\mathbf{H}\\ {}{\mathbf{H}}_p\end{array}\right]\mathbf{B},\left[\begin{array}{cc}{\mathbf{V}}_d& {\mathbf{T}}_p^T\\ {}{\mathbf{T}}_p& {\mathbf{V}}_p\end{array}\right]\right) $$
(20)

Then, the final prediction of y(xp) can be calculated by:

$$ \overset{\frown }{\boldsymbol{y}}\left({\mathbf{x}}_p\right)={\mathbf{H}}_p\overset{\frown }{\mathbf{B}}+{\mathbf{T}}_p{\mathbf{V}}_d^{-1}\left(\mathbf{d}-\mathbf{H}\overset{\frown }{\mathbf{B}}\right) $$
(21)

where:

$$ {\mathbf{H}}_p=\left(\left(\prod \limits_{i=1}^{s-1}{\rho}_i\right){\boldsymbol{\Phi}}^1\left({\mathbf{x}}_p\left(\xi \right)\right),\left(\prod \limits_{i=2}^{s-1}{\rho}_i\right){\boldsymbol{\Phi}}^2\left({\mathbf{x}}_p\left(\xi \right)\right),\dots, {\rho}_{s-1}{\boldsymbol{\Phi}}^{s-1}\left({\mathbf{x}}_p\left(\xi \right)\right),{\boldsymbol{\Phi}}^s\left({\mathbf{x}}_p\left(\xi \right)\right)\right) $$
(22)
$$ {\mathbf{T}}_p={\left({t}_1{\left({x}_p,{\mathbf{D}}_1\right)}^T,\dots, {t}_s{\left({x}_p,{\mathbf{D}}_s\right)}^T\right)}^T $$
(23)
$$ {t}_1\left({\mathbf{x}}_p,{\mathbf{D}}_1\right)=\left(\prod \limits_{i=1}^{s-1}{\rho}_i\right){\sigma}_1^2{R}_1{\left({\mathbf{x}}_p,{\mathbf{D}}_1\right)}^T $$
(24)
$$ {t}_t\left({\mathbf{x}}_p,{\mathbf{D}}_t\right)={\rho}_{t-1}{t}_{t-1}\left({\mathbf{x}}_p,{\mathbf{D}}_t\right)+\left(\prod \limits_{i=t}^{s-1}{\rho}_i\right){\sigma}_t^2{R}_t{\left({\mathbf{x}}_p,{\mathbf{D}}_t\right)}^T,t=2,\dots, s $$
(25)
$$ {\mathbf{V}}_p={V}^s\left({\mathbf{x}}_p,{\mathbf{x}}_p\right)+{\rho}_{s-1}^2{V}^{s-1}\left({\mathbf{x}}_p,{\mathbf{x}}_p\right)+{\rho}_{s-1}^2{\rho}_{s-2}^2{V}^{s-2}\left({\mathbf{x}}_p,{\mathbf{x}}_p\right)+\dots +{\left(\prod \limits_{i=1}^{s-1}{\rho}_i\right)}^2{V}^1\left({\mathbf{x}}_p,{\mathbf{x}}_p\right) $$
(26)

MCS is employed directly on (21) to calculate the mean, standard deviation and probabilistic distribution, etc., of the random response y.
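
A minimal sketch of this last step is shown below: draw Monte Carlo samples of the random inputs, evaluate the predictor (21), and estimate the first four moments; predict_mf and sample_inputs are placeholders for a vectorized implementation of (21)-(26) and for the input-distribution sampler.

```python
# Minimal sketch of Step 4: MCS on the multi-fidelity predictor (21).
import numpy as np
from scipy import stats

def mcs_moments(predict_mf, sample_inputs, n_mc=10**5, seed=0):
    rng = np.random.default_rng(seed)
    Xp = sample_inputs(n_mc, rng)          # n_mc x d matrix of random input samples
    y = predict_mf(Xp)                     # predictions from (21) at the samples
    return {"mean": np.mean(y),
            "variance": np.var(y),
            "skewness": stats.skew(y),
            "kurtosis": stats.kurtosis(y, fisher=False)}   # non-excess kurtosis
```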

3.2 Multi-fidelity UP for non-hierarchical fidelity

For the non-hierarchical fidelity case, it is assumed that there are m lower-fidelity models with responses yq(x), q = 1, ..., m, whose fidelity cannot be ranked in advance, and one high-fidelity model yH(x).

  1. Step 1.

    Similar to step 1 of the hierarchical multi-fidelity modeling in Section 3.1, collect input data \( {\mathbf{D}}_q={\left[{\left({\mathbf{x}}_1^q\right)}^T,{\left({\mathbf{x}}_2^q\right)}^T,...,{\left({\mathbf{x}}_{n_q}^q\right)}^T\right]}^T \) and output response data \( {\mathbf{d}}_q={\left[{y}^q\left({\mathbf{x}}_1^q\right),...,{y}^q\left({\mathbf{x}}_{n_q}^q\right)\right]}^T \) for each lower-fidelity model. Let \( {\mathbf{d}}_L={\left[{\mathbf{d}}_1^T,...,{\mathbf{d}}_m^T\right]}^T \) denote the collected response data from all lower-fidelity models at the input sites ΓL = [D1; D2; ...; Dm], and \( {\mathbf{d}}_H={\left[{y}^H\left({\mathbf{x}}_1^H\right),...,{y}^H\left({\mathbf{x}}_{n_H}^H\right)\right]}^T \) denote the collected response data from the high-fidelity model at the input sites \( {\mathbf{D}}_H={\left[{\left({\mathbf{x}}_1^H\right)}^T,...,{\left({\mathbf{x}}_{n_H}^H\right)}^T\right]}^T \).

  2. Step 2.

    Construct the multi-fidelity PC-Kriging metamodel to replace yH(x).

According to the weighted summation method (Chen et al. 2016), yH(x) is represented as the weighted summation of all the lower-fidelity models yq(x), q = 1, ..., m, plus a residual discrepancy function δ(x):

$$ {y}^H\left(\mathbf{x}\right)=\sum \limits_{q=1}^m{\rho}^q{y}^q\left(\mathbf{x}\right)+\delta \left(\mathbf{x}\right) $$
(27)

where ρq denotes the weighting coefficient of model yq(x).

It is assumed that all the lower-fidelity models yq(x), q = 1, ..., m, and δ(x) are a priori independent and can each be represented as a GP to simplify the model fusion process. Construct the stochastic metamodel for each lower-fidelity model yq(x) using the PC-Kriging method, during which the same Gaussian correlation function R(x, x′, θ, h) is employed considering that all the lower-fidelity models describe the same physical process. Similarly, construct the PC-Kriging metamodel for δ(x) with the Gaussian correlation function Rδ(x, x′, θδ, hδ).

Based on (27), yH(x) can then be further expressed as a GP:

$$ {y}^H\left(\mathbf{x}\right)\sim \mathcal{GP}\left(M\left(\mathbf{x}\right),V\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)\right) $$
(28)

where \( M\left(\mathbf{x}\right)=\sum \limits_{q=1}^m{\boldsymbol{\Phi}}^q{\left(\mathbf{x}\right)}^T{\boldsymbol{b}}^q{\rho}^q+{\boldsymbol{\Phi}}^{\delta }{\left(\mathbf{x}\right)}^T{\boldsymbol{b}}^{\delta } \), \( V\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)={\boldsymbol{\rho}}^T\mathbf{E}\boldsymbol{\rho } R\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)+{\sigma}_{\delta}^2{R}^{\delta}\left(\mathbf{x},{\mathbf{x}}^{\prime}\right) \), ρ = [ρ1,  … , ρm]T, \( \mathbf{E}=\left[\begin{array}{ccc}{\mathbf{E}}_{1,1}& \cdots & {\mathbf{E}}_{1,m}\\ {}\vdots & & \vdots \\ {}{\mathbf{E}}_{m,1}& \cdots & {\mathbf{E}}_{m,m}\end{array}\right] \), Ei,j is the unknown covariance between lower-fidelity models yi(x) and yj(x) calculated by \( {\mathbf{E}}_{i,j}={c}_{i,j}\sqrt{{\mathbf{E}}_{i,i}{\mathbf{E}}_{j,j}} \), and ci, j ∈ [−1, 1] is the unknown correlation coefficient.

  3. Step 3.

    Estimate the hyper-parameters by the maximum likelihood estimation method.

The hyper-parameters to be estimated are Δ = {b1, …, bm, bδ, Ε, ρ, θ, θδ, h, hδ}. Similar to step 3 of the hierarchical multi-fidelity modeling, all the collected response data d = [dL; dH] follow a multivariate Gaussian distribution as shown in (12), with some related matrices re-defined as follows:

$$ \mathbf{B}={\left[{\left({\boldsymbol{b}}^1\right)}^T,\dots, {\left({\boldsymbol{b}}^m\right)}^T,{\left({\boldsymbol{b}}^{\delta}\right)}^T\right]}^T $$
(29)
$$ \mathbf{H}=\left[\begin{array}{cccc}{\boldsymbol{\Phi}}^1\left({\mathbf{D}}_1\left(\xi \right)\right)& \cdots & \mathbf{0}& \mathbf{0}\\ {}\vdots & \ddots & \vdots & \vdots \\ {}\mathbf{0}& \cdots & {\boldsymbol{\Phi}}^m\left({\mathbf{D}}_m\left(\xi \right)\right)& \mathbf{0}\\ {}{\rho}^1{\boldsymbol{\Phi}}^1\left({\mathbf{D}}_H\left(\xi \right)\right)& \cdots & {\rho}^m{\boldsymbol{\Phi}}^m\left({\mathbf{D}}_H\left(\xi \right)\right)& {\boldsymbol{\Phi}}^{\delta}\left({\mathbf{D}}_H\left(\xi \right)\right)\end{array}\right] $$
(30)
$$ {\mathbf{V}}_d=\left[\begin{array}{cccc}{\boldsymbol{e}}_1^T\mathbf{E}{\boldsymbol{e}}_1R\left({\mathbf{D}}_1,{\mathbf{D}}_1\right)& \cdots & {\boldsymbol{e}}_1^T\mathbf{E}{\boldsymbol{e}}_mR\left({\mathbf{D}}_1,{\mathbf{D}}_m\right)& {\boldsymbol{e}}_1^T\mathbf{E}\boldsymbol{\rho } R\left({\mathbf{D}}_1,{\mathbf{D}}_H\right)\\ {}\vdots & \ddots & \vdots & \vdots \\ {}{\boldsymbol{e}}_m^T\mathbf{E}{\boldsymbol{e}}_1R\left({\mathbf{D}}_m,{\mathbf{D}}_1\right)& \cdots & {\boldsymbol{e}}_m^T\mathbf{E}{\boldsymbol{e}}_mR\left({\mathbf{D}}_m,{\mathbf{D}}_m\right)& {\boldsymbol{e}}_m^T\mathbf{E}\boldsymbol{\rho } R\left({\mathbf{D}}_m,{\mathbf{D}}_H\right)\\ {}{\boldsymbol{\rho}}^T\mathbf{E}{\boldsymbol{e}}_1R\left({\mathbf{D}}_H,{\mathbf{D}}_1\right)& \cdots & {\boldsymbol{\rho}}^T\mathbf{E}{\boldsymbol{e}}_mR\left({\mathbf{D}}_H,{\mathbf{D}}_m\right)& \begin{array}{l}{\boldsymbol{\rho}}^T\mathbf{E}\boldsymbol{\rho } R\left({\mathbf{D}}_H,{\mathbf{D}}_H\right)\\ {}+{\sigma}_{\delta}^2{R}^{\delta}\left({\mathbf{D}}_H,{\mathbf{D}}_H\right)\end{array}\end{array}\right] $$
(31)

where ei is an m-dimensional unit column vector whose ith element is 1 and whose other elements are 0.

Then, using the same MLE method in Section 3.1, the hyper-parameters can be estimated.
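
As an illustration of how (31) can be assembled, the sketch below covers the case of m = 2 lower-fidelity models; the dictionary of precomputed correlation blocks and the argument names are assumptions of this example.

```python
# Minimal sketch of (31) for m = 2 lower-fidelity models plus one HF model.
import numpy as np

def assemble_Vd_nonhier(R, R_delta_HH, E, rho, sigma_delta):
    """R: dict of correlation blocks keyed by pairs from {'1', '2', 'H'};
    E: 2 x 2 covariance matrix of the LF models; rho: [rho1, rho2]."""
    e = np.eye(2)
    rho = np.asarray(rho)
    def coeff(i, j):                 # e_i^T E e_j, or e_i^T E rho when j == 'H'
        return e[i] @ E @ (rho if j == "H" else e[j])
    V11 = coeff(0, 0) * R[("1", "1")]
    V12 = coeff(0, 1) * R[("1", "2")]
    V1H = coeff(0, "H") * R[("1", "H")]
    V22 = coeff(1, 1) * R[("2", "2")]
    V2H = coeff(1, "H") * R[("2", "H")]
    VHH = rho @ E @ rho * R[("H", "H")] + sigma_delta**2 * R_delta_HH
    return np.block([[V11, V12, V1H],
                     [V12.T, V22, V2H],
                     [V1H.T, V2H.T, VHH]])
```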

  4. Step 4.

    Predict the response values at the sample points xp similarly to step 4 of the hierarchical multi-fidelity modeling in Section 3.1. Meanwhile, some matrices are re-defined as follows:

$$ {\mathbf{H}}_p=\left[{\rho}^1{\boldsymbol{\Phi}}^1\left({\mathbf{x}}_p\left(\xi \right)\right),\dots, {\rho}^m{\boldsymbol{\Phi}}^m\left({\mathbf{x}}_p\left(\xi \right)\right),{\boldsymbol{\Phi}}^{\delta}\left({\mathbf{x}}_p\left(\xi \right)\right)\right] $$
(32)
$$ {\mathbf{T}}_p=\left[{\boldsymbol{\rho}}^T\mathbf{E}{e}_1R\left({\mathbf{x}}_p,{\mathbf{D}}_1\right),\dots, {\boldsymbol{\rho}}^T\mathbf{E}{e}_mR\left({\mathbf{x}}_p,{\mathbf{D}}_m\right),{\boldsymbol{\rho}}^T\mathbf{E}\boldsymbol{\rho } R\left({\mathbf{x}}_p,{\mathbf{D}}_H\right)+{\sigma}_{\delta}^2{R}^{\delta}\left({\mathbf{x}}_p,{\mathbf{D}}_H\right)\right] $$
(33)
$$ {\mathbf{V}}_p={\boldsymbol{\rho}}^T\mathbf{E}\boldsymbol{\rho } R\left({\mathbf{x}}_p,{\mathbf{x}}_p\right)+{\sigma}_{\delta}^2{R}^{\delta}\left({\mathbf{x}}_p,{\mathbf{x}}_p\right) $$
(34)

4 Comparative studies

In this section, the effectiveness of the two proposed multi-fidelity modeling approaches is tested on several mathematical examples with different nonlinearities and random input dimensions for UP.

4.1 Test for hierarchical fidelity

For hierarchical fidelity, the existing multi-fidelity PC method with the additive form proposed by Ng and Eldred (Ng and Eldred 2012) (denoted as MF-PC) and co-kriging, which has been commonly employed for metamodeling in the deterministic domain (Kennedy and O'Hagan 2000), are also employed for UP and compared to the proposed method (denoted as MF-PCK). The tested examples are shown in Table 1, in which \( \mathcal{B} \), \( \mathcal{U} \), and \( \mathcal{N} \) respectively represent the beta, uniform, and normal distributions. Examples 1 and 3–6 are adopted from the multi-fidelity modeling literature, while example 2 is created by the authors with a simple correction term to explore the effect of the scale factor in MF-PCK. Meanwhile, cases with nested (Di + 1 ⊆ Di, i = 1, …, s-1) and non-nested (Di + 1 ⊄ Di, i = 1, …, s-1) sample points are both tested to explore the impact of the sampling scheme on the accuracy of multi-fidelity modeling. The results generated by conducting MCS on the original high-fidelity response function (denoted as direct MCS for simplicity) are employed as the benchmark to validate the effectiveness of the proposed methods.

Table 1 Tested examples with hierarchical fidelity

Considering that high-fidelity sample points are often much fewer than low-fidelity ones, for examples 1, 2, 5, and 6 with two fidelity levels, the orders of the PC terms for the low-fidelity model y1(x) and the correction term between the two models are set to 5 and 3 (for δ2(x)), respectively. For examples 3 and 4 with three fidelity levels, the orders of the PC terms for the lowest-fidelity model y1(x) and the correction terms between the three models are set to 5, 3 (for δ2(x)), and 2 (for δ3(x)), respectively. The number of sample points employed for each response model during multi-fidelity modeling is listed in Table 3 for all the examples, considering the varying nonlinearity and complexity of each example.

Table 2 Random variables and distributions for example 6
Table 3 The number of sample points for each response model

Meanwhile, for the relatively high-dimensional problems (d = 6), the sparse index in (3) is set to q = 0.5 for example 5 and q = 0.25 for example 6 during the construction of all PC models/terms for both MF-PCK and MF-PC to save computational cost.

The first four statistical moments (mean, variance, skewness, and kurtosis) of the function response for the three multi-fidelity UP methods (MF-PCK, co-kriging, and MF-PC) are calculated by conducting MCS on the constructed multi-fidelity metamodels and are then compared to those obtained by direct MCS. Because the sample points for each model are generated randomly during the tests, the UP results may vary from run to run. Therefore, the simulation is repeated 20 times on each example to reduce the impact of sample randomness on UP. Taking one of the 20 simulations for illustration, the curves/surfaces of the function responses produced by the three methods for examples 1–4 (one- or two-dimensional problems) are shown in Figs. 1, 3, 5, 7, and 9, respectively, in which HF, MF-PCK, Co-kri, and MF-PC respectively denote the curves generated by the high-fidelity model, the proposed MF-PCK method, co-kriging, and the existing MF-PC method. The errors of the first four statistical moments relative to those of direct MCS are calculated as well, and their mean values over the 20 simulations for all the examples are shown in Tables 4, 5, 6, 7, 8, 9, and 10, respectively.

Fig. 1 Function response curves of example 1 (case 1)

Table 4 Relative errors of statistical moments for example 1 (case 1)
Table 5 Relative errors of statistical moments for example 1 (case 2)
Table 6 Relative errors of statistical moments for example 2
Table 7 Relative errors of statistical moments for example 3
Table 8 Relative errors of statistical moments for example 4
Table 9 Relative errors of statistical moments for example 5
Table 10 Relative errors of statistical moments for example 6
Fig. 2 Boxplots of statistical moments for example 1 (case 1)

Fig. 3 Function response curves of example 1 (case 2)

Fig. 4 Boxplots of statistical moments for example 1 (case 2)

Fig. 5 Function response curves for example 2

Fig. 6 Boxplots of statistical moments for example 2

Fig. 7 Function response curves of example 3

Fig. 8 Boxplots of statistical moments for example 3

Fig. 9 Function response curves of example 4

During the tests, it is found that the proposed MF-PCK method produces results that are very close to those of direct MCS, exhibiting high accuracy. However, with the same sample points for all three methods, the errors of the existing MF-PC method are evidently much larger than those of the other two approaches (MF-PCK and co-kriging) for most examples. Therefore, another UP test is conducted on MF-PC with an increased number of high-fidelity sample points; considering the space limit of this paper, only examples 1 and 2 are tested, with the number of high-fidelity sample points increased to 10 and 12, respectively. Correspondingly, the orders of the PC terms for the low-fidelity model and the correction term are increased to 10 and 8, respectively. The results of this test are shown in Tables 4, 5, and 6, in which "A," "B," "C," and "CI" respectively denote the results produced by MF-PCK, co-kriging, MF-PC, and MF-PC with increased high-fidelity sample points; "1" and "2" respectively denote the cases with nested and non-nested sample points; and em, ev, es, and ek respectively denote the relative errors of the mean, variance, skewness, and kurtosis of the output response with respect to MCS.

To more clearly show the robustness of the three multi-fidelity UP methods, the calculated statistical moments for all the examples are shown as boxplots in Figs. 2, 4, 6, 8, 10, 11, and 12. In the boxplots, the results of direct MCS are marked as MCS, the black dot represents the mean value of the data obtained from the 20 repeated UP runs, and the red star represents outliers (Figs. 6, 8, and 12). The top and bottom edges of each box are the upper and lower quartiles of the data from the 20 simulations, respectively, and the line within the box represents the median. Generally, the smaller the distance between the top and bottom edges of the box, the more concentrated the data and the more robust the method. From these tables (relative errors of statistical moments) and figures (function response curves/surfaces and boxplots of statistical moments), some noteworthy observations can be made.

Fig. 10 Boxplots of statistical moments for example 4

Fig. 11 Boxplots of statistical moments for example 5

Fig. 12 Boxplots of statistical moments for example 6

Firstly, it is noticed that for both nested and non-nested sample points with the same computational cost, the proposed MF-PCK method is generally the most accurate, followed by co-kriging and then MF-PC, with MF-PC producing relatively large errors compared to the other two approaches.

MF-PCK vs. co-kriging

MF-PCK is generally more accurate than co-kriging, and evidently so for examples 1, 4, 5, and 6. The interpretation is that although both approaches employ the GP modeling theory within the KOH framework to construct the metamodel for UP, a PC model and a constant are adopted to capture the global trend of the stochastic output response for MF-PCK and co-kriging, respectively. As is well known, the PC model can deal with various random input distribution types, including symmetric and asymmetric ones, as well as highly nonlinear output responses. Therefore, the conjunction of PC and GP in MF-PCK is more accurate for UP. Specifically, for example 1 (case 2), a beta (asymmetric) distribution is considered, which is difficult for co-kriging to handle, so the results of the proposed MF-PCK method are much more accurate. Example 6 is a high-dimensional problem with a large output variation, which is very difficult for co-kriging to approximate accurately; co-kriging therefore produces very large errors (see ev, es, and ek). However, it is observed that for example 2, although both approaches produce accurate results that are very close to those of direct MCS, some errors (es and ek) of co-kriging are slightly smaller than those of MF-PCK (see the bold, underlined numbers in Table 6). The reason is that example 2 involves a normal (symmetric) distribution and a relatively small output variation, so it is easier for co-kriging to describe the output response function with high accuracy. Meanwhile, because MF-PCK includes the PC term, more parameters must be estimated, which may introduce some numerical error during parameter estimation.

MF-PCK vs. MF-PC

The proposed MF-PCK method is clearly more accurate than the existing MF-PC method. By fusing some lower-fidelity data, the accuracy of the predicted response curves/surfaces, as well as of UP, can be evidently improved with only a few HF data within the proposed MF-PCK multi-fidelity framework. Although MF-PC can be expected to produce comparable results to MF-PCK (see the errors of CI-1 and CI-2 with increased HF sample points in Tables 4, 5, and 6), the number of HF sample points must clearly be increased. The interpretation is that the highly efficient KOH framework is employed for model fusion in MF-PCK, in which the scale factor ρ is optimized. The introduction of ρ can clearly reduce the nonlinearity and bumpiness of the correction term δt(x) (Park et al. 2016; Ren et al. 2016), which improves the accuracy of the multi-fidelity metamodel. For MF-PC, however, ρ is effectively fixed at ρ = 1. Taking examples 1 and 2 for illustration, as the ρ calculated by MF-PCK is ρ = 2 for example 1, the correction term δt(x) is completely linear and easy for PC-Kriging to approximate. For MF-PC, in contrast, ρ = 1 produces a nonlinear correction term δt(x), which evidently increases the difficulty of approximation. For example 2, since the ρ employed by MF-PCK and MF-PC are both ρ = 1, the correction terms δt(x) approximated by the two approaches are close to each other; therefore, the accuracy of MF-PC is clearly improved compared to that of example 1. In addition, in the proposed MF-PCK method a GP term is added to the PC model to enhance its local approximation ability, which can also improve the accuracy compared to MF-PC.

Secondly, from the boxplots shown in Figs. 2, 4, 6, 8, 10, and 11, it is observed that for MF-PCK and co-kriging with nested and non-nested sample points, the top and bottom edges of the boxes are generally very close to each other (some even appear to overlap, see Figs. 6 and 8), and the distances between the two edges are much smaller than those of MF-PC with the same sample points. This indicates that MF-PCK and co-kriging are more robust and stable than MF-PC for UP, as the data from the 20 simulations for the two approaches are more concentrated, which is attributed to the employment of the KOH framework in both approaches. Although MF-PC can also produce concentrated and comparable results, the order of PC and the number of high-fidelity sample points must be increased greatly (see the boxplots of CI-1 and CI-2 for examples 1 and 2). Meanwhile, it is also observed that the distances between the two edges for MF-PCK are generally smaller than those of co-kriging, indicating that MF-PCK is more robust and stable than co-kriging. However, for example 2, this distance for co-kriging is smaller than for MF-PCK (see Fig. 6). As stated above, this example is relatively easy to approximate, so the impact of the instability in parameter estimation for MF-PCK, which is weak in common cases, becomes more evident.

Thirdly, compared to the results with non-nested sample points, the accuracy with nested ones is generally slightly better for MF-PCK and co-kriging, while it is much better for MF-PC. The interpretation lies in two aspects. For MF-PC, the direct additive framework is employed, in which the PC model of the correction term δ(x) is directly constructed from the high-fidelity input sample points xH and the corresponding response difference values yH(xH) − yL(xH). For the nested case, yL(xH) is the exact response from the low-fidelity model, while for the non-nested one, yL(xH) is predicted by PC at xH, which inevitably introduces error; this error may be large when the PC model of the low-fidelity model is inaccurate. For the other two approaches, the KOH framework is employed, in which the PC-Kriging/kriging model of the correction term δ(x) is not directly and explicitly constructed from the response difference values as in MF-PC, but estimated by MLE based on all the collected response data from all the simulation models. Therefore, the prediction error is avoided, which helps ensure accuracy. In addition, the covariance among data with different fidelities is fully considered during the hyper-parameter estimation for MF-PCK and co-kriging (Liu et al. 2018), which can reduce the adverse impact of non-nested sample points to some extent. For MF-PC, by contrast, the PC coefficients of the LF model and the additive correction terms are directly calculated by regression, based on which the PC model of the HF model is constructed. It is also noticed that the results of MF-PC with nested sample points are much more accurate than those with non-nested ones for examples 3 and 4. The reason is that three multi-fidelity models are fused, involving two correction terms, which amplifies the impact of the prediction error on UP for MF-PC.

Fourthly, it is found from Figs. 1, 5, and 7 that the response curves generated by MF-PC are far from the HF response at the first and last HF points. The explanation is as follows. The weighted stochastic response surface method (WSRSM) (Xiong et al. 2011) is employed to construct the PC model for MF-PC, in which the Latin hypercube sampling method is used to generate sample points and the values of the joint probability density function at the sample points are used as the weights. It has been demonstrated that WSRSM, which considers the sample weights, is more accurate than the stochastic response surface method (SRSM) for UP. For example 1 (case 1, Fig. 1), example 2 (Fig. 5), and example 3 (Fig. 7), the random input follows a normal distribution, so the smallest weights are assigned to the first and last sample points, as they are furthest from the mean point. Therefore, the improvement in accuracy of the stochastic response surface at the first and last sample points is relatively smaller than at the points in the middle region.

As can be seen from Fig. 7, this phenomenon is more obvious for example 3. The reason is that example 3 involves three multi-fidelity models with normally distributed random inputs and three weighted stochastic response surfaces (two for the correction terms and one for the lowest-fidelity model), which amplifies the impact of the above prediction errors. For problems with uniformly distributed random inputs, this phenomenon is not very obvious. For the proposed MF-PCK method, the PC coefficients are calculated by the maximum likelihood estimation method rather than by WSRSM, so this phenomenon does not occur.

4.2 Test for non-hierarchical fidelity

For models with non-hierarchical fidelity, cases with nested (DH ⊆ Di, i = 1, …, m) and non-nested (DH ⊄ Di, i = 1, …, m) sample points are also tested to explore the impact of the sampling scheme on the accuracy of multi-fidelity modeling. The tested examples are displayed in Table 11. The MF-PC method can only be used when the accuracy of the multi-fidelity models can be ranked (the hierarchical-fidelity case), since it constructs the metamodel level by level, representing each higher-fidelity model as the sum of the adjacent lower-fidelity PC model and a PC correction model of the difference between them. For the non-hierarchical case, in which the accuracy of the multi-fidelity models cannot be ranked, MF-PC is no longer applicable; therefore, only the proposed MF-PCK method is tested.

Table 11 Tested examples with non-hierarchical fidelity

Similar to the hierarchical-fidelity cases, 20 repeated simulations are conducted. For all the examples, the orders of the PC terms for the lower-fidelity models and the correction term are set to 5 and 3, respectively. For example 1, the numbers of input sample points for the three models y1(x), y2(x), and y3(x) are set to n1 = 15, n2 = 15, and n3 = 8, and for example 2, n1 = 20, n2 = 20, and n3 = 10.

The response function curves, the errors of the first four statistical moments relative to those of MCS, and the boxplots of the statistical moments are shown in Figs. 13 and 15, Tables 12 and 13, and Figs. 14 and 16, respectively. From these results, it is observed that for the non-hierarchical case, the proposed non-hierarchical MF-PCK approach also produces sufficiently accurate and robust results. The response function curves approximated by MF-PCK are very close to those of the high-fidelity model, and the statistical moments produced by MF-PCK agree well with those of MCS while exhibiting high robustness. The existing MF-PC method, by contrast, cannot handle these cases.

Fig. 13 Function response curves of example 1

Table 12 Relative errors of statistical moments for example 1
Table 13 Relative errors of statistical moments for example 2
Fig. 14 Boxplots of statistical moments for example 1

Fig. 15 Function response curves of example 2

Fig. 16 Boxplots of statistical moments for example 2

In this work, the sample sizes for the multi-fidelity response models {n1, n2, …} in each example are determined by testing the accuracy of UP for different combinations of sample sizes; the combination that yields accurate UP results for MF-PCK at the minimum computational cost is selected. The other approaches (co-kriging and MF-PC) then use the same sample points for UP for comparison. As the tested examples are simple mathematical problems, it is relatively easy to obtain a good combination of sample sizes. However, this approach is impractical in real applications, especially for black-box problems. This paper aims to develop a new multi-fidelity modeling approach for UP to reduce the computational cost, and the focus lies in how to fuse models with different fidelities. How to efficiently determine the sample size for each response model is another research topic and is beyond the scope of this work. In the literature, many works related to this topic have been conducted. Some aim to develop sequential sampling strategies that obtain an optimal combination of sample sizes for a given computational budget and improve the accuracy of the multi-fidelity metamodel as much as possible (Guo et al. 2018; Zhang et al. 2018). Others propose resource allocation schemes to determine such a combination, so as to reduce the epistemic uncertainty of the model as much as possible (Jiang et al. 2016; Hu and Mahadevan 2018).

5 Application to airfoil optimization

The proposed MF-PCK method is applied to a benchmark aerodynamic design problem involving inviscid and viscous transonic flow past airfoil shapes, developed by the AIAA aerodynamic design optimization discussion group (Farin 1993). It aims at maximizing the lift-to-drag ratio of the modified NACA 0012 airfoil section at a free-stream Mach number of Ma = 0.7 and an angle of attack α = 3, subject to a thickness constraint.

In this work, a B-spline curve (Guo et al. 2018) with 10 control points is employed for the shape parameterization of the airfoil, where the horizontal locations of the 10 control points are fixed at x = [0.1 0.3 0.5 0.7 0.9 0.9 0.7 0.5 0.3 0.1] and the vertical locations y are free to move, as shown in Fig. 17.

Fig. 17 B-spline parameterization for the airfoil

The deterministic optimization (DO) problem is formulated as:

$$ {\displaystyle \begin{array}{l}\underset{y}{\mathit{\max}}\ f=L(y)/D(y)\\ {}\mathrm{s}.\mathrm{t}.\kern1em {t}_{\mathrm{max}}(y)\ge 0.1043\end{array}} $$
(35)

where y is the vector of design variables; L(y) and D(y) are the lift and drag, respectively; and tmax is the maximum airfoil thickness, which equals 0.1043 for the original baseline airfoil.
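
To make the setup concrete, a possible way to pose (35) with an off-the-shelf optimizer is sketched below; lift_drag_ratio and max_thickness are placeholders standing in for the CFD analysis and the B-spline geometry routine, and are not part of the benchmark code.

```python
# Minimal sketch of the deterministic optimization (35) with a generic
# gradient-free optimizer; the analysis functions are placeholders.
import numpy as np
from scipy.optimize import NonlinearConstraint, differential_evolution

def lift_drag_ratio(y):      # placeholder for L(y)/D(y) from the CFD analysis
    raise NotImplementedError

def max_thickness(y):        # placeholder for t_max(y) from the airfoil geometry
    raise NotImplementedError

# thickness_con = NonlinearConstraint(max_thickness, 0.1043, np.inf)
# bounds = [(-0.1, 0.1)] * 10          # illustrative bounds on the control points
# result = differential_evolution(lambda y: -lift_drag_ratio(y), bounds,
#                                 constraints=(thickness_con,))
```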

Robust optimization (RO) is performed on this problem considering uncertainties in the flight condition, i.e., the Mach number Ma and the angle of attack α. Ma and α are assumed to follow uniform distributions with variations of ±0.1 and ±1 around their nominal values, respectively. The robust airfoil optimization is formulated as follows:

$$ {\displaystyle \begin{array}{l}\underset{y}{\mathit{\max}}\ F={\mu}_f-k{\sigma}_f\\ {}\mathrm{s}.\mathrm{t}.\kern1em {t}_{\mathrm{max}}(y)\ge 0.1043\end{array}} $$
(36)

where μf and σf are the mean and standard deviation of the lift-to-drag ratio of the airfoil, and the weighting factor k is set to k = 3.
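
A minimal sketch of evaluating this robust objective with the MF-PCK surrogate is given below; mfpck_predict is a placeholder for a vectorized surrogate of the lift-to-drag ratio, the uniform ranges of Ma and α follow the description above, and the sample size is an illustrative choice. An objective of this form could then replace the deterministic objective in the optimizer sketch after (35).

```python
# Minimal sketch of the robust objective F = mu_f - k*sigma_f in (36), estimated
# by MCS on an assumed MF-PCK surrogate of the lift-to-drag ratio.
import numpy as np

def robust_objective(y_design, mfpck_predict, k=3.0, n_mc=10**4, seed=0):
    rng = np.random.default_rng(seed)
    Ma = rng.uniform(0.7 - 0.1, 0.7 + 0.1, n_mc)       # Mach number uncertainty
    alpha = rng.uniform(3.0 - 1.0, 3.0 + 1.0, n_mc)    # angle-of-attack uncertainty
    f = mfpck_predict(y_design, Ma, alpha)             # lift-to-drag ratio samples
    return np.mean(f) - k * np.std(f)
```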

During the optimization, the computational fluid dynamics (CFD) flow field solver Fluent 17.0 is employed to obtain the aerodynamic data, using the two-equation k-omega turbulence model and the steady-state density-based solver with Roe-FDS. Two models with different fidelities are considered according to the number of grid nodes employed in the CFD analysis. Table 14 shows the convergence results with different grid densities for the NACA0012 airfoil, from which it is found that the mesh density significantly impacts both the simulation time and the accuracy of the CFD results. When the mesh density is sufficiently high, the lift-drag ratio remains essentially unchanged (see the last two rows of Table 14). Therefore, in this work, the CFD analysis with 150 nodes in the far field of the grid and 300 on the airfoil boundary is considered as the HF model, while the CFD analysis with 100 nodes in the far field and 200 on the airfoil boundary is employed as the LF model.

Table 14 Convergence condition with different grid density for the NACA0012 airfoil

For the MF-PCK model, 5 HF sample points and 10 LF sample points are generated using the Latin hypercube sampling technique to construct the MF-PCK model for uncertainty propagation, and the orders of the PC terms for the LF model and the correction term are both set to 2. To test the accuracy of the MF-PCK method for UP with these sample points, an MF-PC model based on 8 HF sample points and 10 LF sample points generated by Latin hypercube sampling is also constructed for comparison, with the PC order set to 3. The results of MCS are used as the reference values. Table 15 shows the mean and standard deviation of the lift-drag ratio produced by the proposed MF-PCK method, MF-PC, and MCS for the original NACA0012 airfoil considering uncertainties in Ma and α. It is observed that the results of MF-PCK are very close to those of MCS, demonstrating its accuracy and effectiveness. Moreover, MF-PCK is essentially as accurate as MF-PC, while the number of HF model calls is clearly reduced. Therefore, in the following robust optimization, the MF-PCK model constructed with 5 HF and 10 LF sample points is employed for UP at each optimization iteration.

Table 15 Uncertainty propagation results of the NACA0012 airfoil

Robust optimization (RO) with MF-PCK and with MF-PC, as well as deterministic optimization (DO) without considering any uncertainties, are conducted for comparison; the results are shown in Figs. 18 and 19 and Table 16. For simplicity, the results of RO using the proposed MF-PCK and the existing MF-PC are denoted as RO-P and RO-E, respectively.

Fig. 18 Comparison of three optimized and baseline airfoils

Fig. 19 Comparison of static pressure contours

Table 16 Optimal results of different methods

Figure 18 shows the airfoils obtained by the two ROs and DO, as well as the baseline. It is observed that the two ROs, using the proposed MF-PCK UP method and the MF-PC UP method, produce very similar airfoils; their leading edges almost overlap. Meanwhile, the leading-edge thickness of the optimized designs is clearly reduced compared with the original baseline airfoil, which increases the critical Mach number and weakens the shock-wave region, thus reducing the drag coefficient of the airfoil. In addition, it is noticed that the trailing edge produced by DO bends downward, exhibiting the evident characteristics of a supercritical airfoil, which increases the lift coefficient. Therefore, compared to the two ROs, the lift-drag ratio obtained by DO is larger, which is verified in the following analysis.

Figure 19 illustrates the static pressure contours obtained by the two ROs and DO, as well as the baseline. It is observed that there is a strong shock-wave region (i.e., the junction of the dark blue triangular area and the light blue area on its right) on the upper surface of the baseline airfoil. The intensity of the shock wave is proportional to the pressure difference across this junction: the stronger the shock, the larger the wave drag and the smaller the lift-drag ratio. Clearly, the pressure differences across the junction for the airfoils obtained by all the optimized designs are smaller than that of the baseline, especially for the ROs. Therefore, the drag coefficient is clearly reduced through optimization, and it is reduced more for RO than for DO. From the results of Figs. 18 and 19, it is concluded that the increase in the lift-drag ratio for DO is mainly caused by the increase in lift, while for RO it is mainly caused by the decrease in drag.

The optimal results and computational cost of the optimized airfoils obtained by the different methods are shown in Table 16, from which it is found that the lift-drag ratio is increased after optimization compared to that of the baseline airfoil. Compared to DO, RO clearly improves the robustness of the design (smaller σf), making it less sensitive to uncertainties, at the cost of some nominal performance (smaller μf). RO with the proposed MF-PCK method produces results very close to those of RO with the existing MF-PC approach, while clearly reducing the computational time (415.29 vs. 489.72). These results agree well with the observations above and demonstrate the effectiveness and advantages of the proposed multi-fidelity UP method.

6 Conclusions

In this paper, a multi-fidelity PC approach using the Gaussian process modeling theory is developed to make the PC method more efficient and applicable to practical problems. With the proposed approach, the classic multi-level co-kriging modeling framework is extended from the deterministic domain to the stochastic one for UP, and it can deal with analysis models with both hierarchical and non-hierarchical fidelities. In comparative studies on several numerical examples for UP with the same computational cost, it is noticed that, compared to the commonly used additive-correction-based multi-fidelity PC method, the proposed approach can consistently reduce the error to at least 5%. Compared to co-kriging, the error generally can be reduced to about 50 to 12%, and to 10 to 0.1% for problems with asymmetrically distributed random inputs or large output variation. Meanwhile, compared to both existing methods, the proposed approach evidently enhances the robustness of UP. These results demonstrate the effectiveness and advantage of the proposed multi-fidelity PCK method. The application of the proposed multi-fidelity PCK approach to an engineering robust aerodynamic optimization problem further verifies its effectiveness and applicability in dealing with practical problems.

7 Replication of results

The results shown in the manuscript can be reproduced. Considering the size limit of the uploaded supplementary material, the code for one of the mathematical examples (example 1) is uploaded as supplementary material. The remaining examples can easily be reproduced by changing the response functions and sample points in the provided code to obtain the results shown in the manuscript.