1 Introduction

Surrogate models have received increasing attention because they are inexpensive and fast to evaluate, especially in engineering optimization and design [1, 2]. Design of experiments (DOE) is a key step in constructing surrogate models and largely determines the quality of the resulting model. DOE methods for generating samples in the design space can be roughly divided into two categories: one-stage (or static) sampling [3] and adaptive (or sequential) sampling [4].

One-stage sampling refers to generating all sample points used to build a surrogate model at once. To date, several one-stage sampling methods have been widely used, such as Latin hypercube design [5], minimax and maximin design [6], fractional factorial design [7], and entropy design [8]. These methods are independent of the surrogate model, meaning that the sample size and locations must be determined ahead of time. In practice, little is known about the true model, which makes it difficult to determine the appropriate number and placement of samples. To address this limitation, adaptive sampling has been developed and has demonstrated its effectiveness in various applications [9,10,11,12]. More importantly, adaptive sampling has proven to be more efficient than one-stage sampling [4].

Adaptive sampling, also known as sequential sampling, is a model-dependent approach: it actively learns from the surrogate models built so far and then generates new samples. The prediction uncertainty metric is a central component of a typical adaptive sampling scheme [13]. In adaptive sampling, prediction uncertainty can be used to select samples as judiciously as possible, so as to improve the accuracy of the surrogate model to the greatest extent. Several approaches based on prediction uncertainty have been proposed, such as the probability of improvement [14], mean square error (MSE) [15], statistical lower bound [16], and expected improvement (EI) [17]. However, these approaches were developed for the Kriging surrogate, which provides a ready-made uncertainty metric, and are therefore difficult to apply to other surrogate models. In fact, other surrogate models often achieve higher accuracy than Kriging on some problems [18]. Thus, it is necessary to develop adaptive sampling methods for general surrogate models.

Currently, great efforts have been devoted to adaptive sampling with general surrogate models, and many results have been achieved. For adaptive sampling with general surrogate models, cross-validation (CV) is usually used to estimate uncertainty. By using CV, multiple prediction models can be obtained, and the uncertainty is then estimated from the distribution of their predictions. Jin et al. proposed an adaptive sampling method that combines the standard deviation of CV predictions with the maximin distance to estimate uncertainty [19]. Zhang et al. developed an adaptive sampling method (IQR) that uses CV and the interquartile range to determine new samples [13]. Xu et al. used CV predictions and the Voronoi diagram to explore global regions, thereby estimating the uncertainty at given points [20]. Lv et al. presented the Go-inspired hybrid infilling (Go-HI) strategy using CV and a tree-like structure [21]. Many further CV-based adaptive sampling methods can be found in Refs. [22,23,24,25].

Another type of adaptive sampling method, query by committee (QBC), was proposed in Ref. [26]. QBC is a variance-based adaptive sampling scheme that uses several surrogates to calculate prediction variances and thereby estimate the uncertainty at a candidate point. The point with the maximal prediction variance is chosen as the new sample. Because the prediction variances are calculated from the predictions of several surrogates, QBC can be used to identify regions with high uncertainty. Liu et al. comprehensively reviewed QBC-based adaptive sampling methods and divided them into homogeneous QBC and heterogeneous QBC [27]. Similarly, Fuhg et al. introduced the QBC method in detail and reviewed the QBC adaptive schemes proposed in the literature [28]. Many other references describe QBC-based adaptive sampling [29, 30]. In addition, it should be noted that CV is also often used in QBC-based adaptive sampling [31,32,33].

However, although CV-based predictions effectively characterize uncertainty, they introduce a new difficulty. In CV, the largest error at a training sample arises precisely because that point is dropped from the fitting process. As a result, most adaptive sampling methods based on standard deviations or variances assign non-zero uncertainty to the training samples. This non-zero uncertainty interferes with the precise characterization of uncertainty and may even lead to repeated samples [13]. Motivated by this issue, this paper develops an index, namely the average uncertainty, to characterize prediction uncertainty as accurately as possible. A novel general adaptive sampling method based on the average uncertainty (GAS-AU) is further proposed, and the contributions are as follows:

  1. This work constructs an average uncertainty index that improves the accuracy of the prediction uncertainty measure by avoiding non-zero uncertainty at the training points. Compared with the standard deviations or variances used in CV-based adaptive sampling, this index eliminates the impact of non-zero uncertainty at the training points on the uncertainty measure, so duplicate samples are avoided during adaptive sampling.

  2. Based on the average uncertainty, a novel adaptive sampling approach is proposed to improve the robustness of the model and reduce sampling cost. A comparison of GAS-AU with other adaptive sampling approaches shows that it outperforms them on a set of test problems. Moreover, an engineering case further illustrates the superiority of the proposed approach.

The remainder of this paper is structured as follows. Section 2 gives the workflow of the proposed GAS-AU approach in detail; a one-dimensional problem is also used as an example to illustrate the specific process of applying the GAS-AU approach. In Sect. 3, the GAS-AU approach is compared with other adaptive sampling approaches, i.e., MSE [15], EI [17], GO-HI [21], IQR [13], CVVor [20], and EIGF [25], on six test problems of different dimensions. Section 4 uses an engineering problem to evaluate the performance of the proposed method, and the superiority of the proposed GAS-AU approach is verified again by comparison with the other approaches. Finally, the main conclusions of this work are given in Sect. 5.

2 Average uncertainty-based general adaptive sampling approach

This work proposes a novel adaptive sampling strategy, namely GAS-AU. In this section, the construction process of the average uncertainty is illustrated in detail. In addition, to facilitate reproduction, this section uses a one-dimensional problem as an example to illustrate the specific process of applying the GAS-AU approach.

2.1 The average uncertainty

Based on CV predictions, this paper constructs an index to reflect the uncertainty of predictions. Firstly, the CV predictions are generated at a given point x in the design space,

$$CVP(x)=\left\{{\widehat{f}}_{-1}\left(x\right),{\widehat{f}}_{-2}\left(x\right),\cdots ,{\widehat{f}}_{-p}\left(x\right)\right\}$$
(1)

where \(CVP(x)\) represents the set of CV predictions. \({\widehat{f}}_{-1}\left(x\right),{\widehat{f}}_{-2}\left(x\right),\) and \({\widehat{f}}_{-p}\left(x\right)\) are the surrogate predictions at a given point x, where the subscripts −1, −2, and −p indicate that the surrogate model is trained without the first, second, and p-th training point, respectively. Note that p is the number of training points.
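As a sketch, the CV-prediction set of Eq. (1) can be generated with a generic leave-one-out loop. The `fit`/`predict` hooks and the cubic-polynomial stand-in surrogate below are illustrative assumptions, not the KRG/RBF models used later in the paper.

```python
import numpy as np

def cv_predictions(X, y, fit, predict, x):
    """Eq. (1): leave-one-out CV predictions at a point x.

    fit(X, y) -> model and predict(model, x) -> float are assumed
    surrogate-fitting hooks supplied by the user."""
    preds = []
    for j in range(len(X)):
        mask = np.arange(len(X)) != j          # drop the j-th training point
        model = fit(X[mask], y[mask])
        preds.append(predict(model, x))
    return np.array(preds)

# Example with a cubic polynomial as a stand-in surrogate on the
# 1-D test function used later in Sect. 2.3:
X = np.linspace(0.0, 1.0, 6)
y = (6*X - 2)**2 * np.sin(2*(6*X - 2))
fit = lambda X, y: np.polyfit(X, y, 3)
predict = lambda c, x: np.polyval(c, x)
cvp = cv_predictions(X, y, fit, predict, 0.76)
print(cvp)   # p = 6 predictions, one per dropped training point
```

The spread of the p values in `cvp` is what the later equations turn into an uncertainty measure.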

With different surrogate models, CV predictions are different. Therefore, this work employs multiple surrogate models to enrich the distribution of CV predictions. For multiple different types of surrogate models, CV predictions can be described as

$$MCVP(x)=\left\{{CVP}_{1}\left(x\right),{CVP}_{2}\left(x\right),\cdots ,{CVP}_{m}\left(x\right)\right\}$$
(2)

where \(MCVP\left(x\right)\) represents the pooled population of CV predictions under the different surrogate models, and m is the number of surrogate models. The median of this pooled population is taken as the predicted mean M, computed as follows,

$$SMCVP(x)=\left\{{S}_{1}\left(x\right),{S}_{2}\left(x\right),\cdots ,{S}_{n}\left(x\right)\right\}$$
(3)
$$M(x)=\left\{\begin{array}{ll}{S}_{(n+1)/2}\left(x\right) &n\, is\, odd\\ \frac{1}{2}\left({S}_{n/2}\left(x\right)+{S}_{(n/2+1)}\left(x\right)\right)& n\,is\,even\end{array}\right.$$
(4)

where \(SMCVP(x)\) denotes the elements of MCVP sorted in ascending order. M is the median, which serves as the predicted mean. n is the number of elements in MCVP, equal to \(p\times m\).
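The pooling of Eq. (2) and the median of Eqs. (3)-(4) reduce to a few lines. The CV-prediction values below are hypothetical; `np.median` performs the sorting and the odd/even-n branch of Eq. (4) internally.

```python
import numpy as np

# Hypothetical CV-prediction sets from m = 3 surrogate models at a point x:
cvp_krg  = np.array([15.1, 14.8, 15.6, 15.9, 15.2])   # Kriging
cvp_rbfm = np.array([15.4, 15.0, 16.1, 15.7, 15.3])   # RBF, multiquadric
cvp_rbft = np.array([14.9, 15.5, 15.8, 16.0, 15.1])   # RBF, thin plate

# Eq. (2): pool all n = p*m predictions; Eqs. (3)-(4): sort and take the median.
mcvp = np.concatenate([cvp_krg, cvp_rbfm, cvp_rbft])
M = np.median(mcvp)    # n = 15 is odd, so M is the 8th smallest element
print(M)               # 15.4
```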

Next, we need to calculate the predicted value at a given point x. For multiple surrogate models, the predicted value can be calculated by [34,35,36],

$$\widehat{y}\left(x\right)=\sum_{i=1}^{m}{\omega }_{i}{\widehat{y}}_{i}\left(x\right)$$
(5)

where

$$\sum\limits_{i=1}^{m}{\omega }_{i}=1 \quad \mathrm{and} \quad {\omega }_{i}\ge 0$$
(6)

\(\widehat{y}\left(x\right)\) is the predicted value at a given point x. \({\widehat{y}}_{i}\left(x\right)\) and \({\omega }_{i}\) are the predicted value and the weight of the i-th surrogate, respectively. To obtain the weights, we first use leave-one-out CV to calculate the global error [37],

$${CVE}_{i}=\frac{1}{p}\sum_{j=1}^{p}{\left[{y}_{i}\left({x}_{j}\right)-{\widehat{y}}_{i}^{-j}\left({x}_{j}\right)\right]}^{2}, \quad j=1,2,\cdots ,p$$
(7)

where \({CVE}_{i}\) is the global error of the i-th surrogate. \({y}_{i}\left({x}_{j}\right)\) is the true response at the training point \({x}_{j}\), and \({\widehat{y}}_{i}^{-j}\left({x}_{j}\right)\) is the prediction of the i-th surrogate trained without the training point \({x}_{j}\). A larger global error indicates lower prediction accuracy, so the weight of that surrogate should be set smaller; conversely, a surrogate with a small global error receives a larger weight. Thus, the weight can be calculated as follows,

$${\omega }_{i}={e}_{i}/e$$
(8)

where

$$e=\sum\limits_{i=1}^{m}{e}_{i} \quad \mathrm{and} \quad {e}_{i}=1/{CVE}_{i}.$$
(9)
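The inverse-error weighting of Eqs. (7)-(9) can be sketched as follows; the CVE values are hypothetical.

```python
import numpy as np

def weights_from_cv_errors(cve):
    """Eqs. (8)-(9): inverse-error weights that sum to 1."""
    e = 1.0 / np.asarray(cve, dtype=float)   # e_i = 1 / CVE_i
    return e / e.sum()                       # omega_i = e_i / e

# Hypothetical leave-one-out global errors CVE_i for three surrogates:
w = weights_from_cv_errors([0.5, 2.0, 1.0])
print(w)   # the most accurate surrogate (smallest CVE) gets the largest weight
```

By construction the weights are non-negative and sum to one, satisfying Eq. (6), so Eq. (5) is a convex combination of the individual surrogate predictions.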

Finally, based on Eqs. (4) and (5), the predicted mean \(M(x)\) and predicted value \(\widehat{y}\left(x\right)\) are obtained at a given point x. The average uncertainty is constructed by,

$$AU(x)=\left|\widehat{y}\left(x\right)-M(x)\right|$$
(10)

\(AU\) stands for the average uncertainty, which estimates the uncertainty of the surrogate predictions. According to Eq. (10), a new sample is determined by,

$${x}_{new}=\arg\underset{x}{\mathrm{max}}\, AU(x)$$
(11)
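Eqs. (10)-(11) can be sketched as below; the candidate points, ensemble predictions, and pooled CV predictions are hypothetical placeholders for values computed as in the preceding steps.

```python
import numpy as np

def average_uncertainty(y_hat, mcvp):
    """Eq. (10): |ensemble prediction - median of pooled CV predictions|."""
    return abs(y_hat - np.median(mcvp))

# Eq. (11): among candidate points, pick the one with the largest AU.
# y_hat_c and mcvp_c are assumed precomputed per candidate (hypothetical values):
candidates = np.array([0.2, 0.5, 0.76])
y_hat_c = [1.4, -0.3, 5.0]
mcvp_c  = [[1.3, 1.5, 1.4], [-0.2, -0.4, -0.1], [3.0, 4.0, 6.5]]

au = [average_uncertainty(yh, mc) for yh, mc in zip(y_hat_c, mcvp_c)]
x_new = candidates[int(np.argmax(au))]
print(x_new)   # 0.76, the candidate where prediction and median disagree most
```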

2.2 Modeling using the GAS-AU approach

The GAS-AU approach is a process of adaptive sampling that adds a new sample at each iteration to improve the accuracy of the surrogate model. The main procedure consists of (1) building multiple surrogate models based on the initial samples, (2) generating a new sample at the point with maximum AU value, (3) updating the model using initial and new samples, (4) continuing to generate new samples until the stopping criterion is met. The overall framework of the GAS-AU is depicted in Fig. 1.
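The four-step procedure above can be sketched on the 1-D function of Eq. (12). For brevity, a single polynomial surrogate family stands in for the KRG/RBF ensemble and the weighted prediction of Eq. (5), and a fixed infill budget replaces the accuracy-based stopping criterion; this is a minimal illustration, not the paper's implementation.

```python
import numpy as np

def f(x):                                  # 1-D test function, Eq. (12)
    return (6*x - 2)**2 * np.sin(2*(6*x - 2))

X = np.linspace(0.0, 1.0, 5)               # step (1): initial samples
grid = np.linspace(0.0, 1.0, 201)          # candidate points

for it in range(10):                       # fixed budget stands in for step (4)
    y = f(X)
    full = np.polyfit(X, y, min(len(X) - 1, 6))       # "ensemble" prediction
    loo = [np.polyfit(np.delete(X, j), np.delete(y, j), min(len(X) - 2, 6))
           for j in range(len(X))]                    # leave-one-out CV models
    loo_preds = np.array([np.polyval(c, grid) for c in loo])
    # step (2): AU = |prediction - median of CV predictions|, Eq. (10)
    au = np.abs(np.polyval(full, grid) - np.median(loo_preds, axis=0))
    near = np.abs(grid[:, None] - X[None, :]).min(axis=1) < 1e-9
    au[near] = 0.0                         # training points carry zero uncertainty
    # step (3): add the maximizer of AU and refit
    X = np.sort(np.append(X, grid[np.argmax(au)]))

print(len(X))                              # 15 samples after 10 infills
```

Zeroing AU at existing samples mirrors the paper's point that the median-based index avoids re-selecting training points.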

Fig. 1
figure 1

The overall framework of the GAS-AU

2.3 Illustrative example of the GAS-AU approach

To further understand the proposed approach, a test function as shown in Eq. (12) is used to illustrate the specific procedure,

$$y={\left(6x-2\right)}^{2}\mathrm{sin}\left[2\left(6x-2\right)\right], \quad 0\le x\le 1.$$
(12)

In this work, Kriging (KRG) and radial basis function (RBF) models are used to construct the prediction model. The multiquadric basis function (RBF-M) and the thin plate basis function (RBF-T) are selected for the RBF model. Figure 2 shows the CV prediction models under the different surrogates. Based on these models, nine predicted values can be obtained at a given point x. The distribution of these predicted values reflects, to some extent, the uncertainty at that point.

Fig. 2
figure 2

The CV prediction models under different surrogates

Figure 3 presents the CV predictions at the sampled point (x = 1) and an unsampled point (x = 0.76). At the sampled point x = 1, the calculated mean and median are 10.86 and 15.83, respectively, while the true response is 15.83. If the uncertainty is calculated from the variance or standard deviation, it is therefore non-zero at the sampled point x = 1, even though, in the absence of noise, the uncertainty at a sampled point should be 0. This interferes with accurate measurement of the prediction uncertainty. Conversely, with the average uncertainty constructed from the median, the uncertainty at the sampled point is 0, as shown in Fig. 4. This index thus effectively avoids non-zero uncertainty at the training points: the resulting interference is eliminated, and the possibility of repeated sampling during adaptive sampling is avoided.

Fig. 3
figure 3

The CV predictions at the sampled point x = 1 and the unsampled point x = 0.76

Fig. 4
figure 4

Illustration of average uncertainty and predictions

Based on the maximum AU value, new samples are generated and the prediction model is updated. The coefficient of determination R2 and the normalized root mean square error NRMSE are used to evaluate the accuracy of the model, and the results are shown in Tables 1 and 2. MSE is an adaptive sampling approach that uses the maximum mean square error to determine a new sample [15]. GO-HI, IQR, CVVor, and EIGF were proposed by Lv et al. (2020) [21], Zhang et al. (2020) [13], Xu et al. (2014) [20], and Lam (2008) [25], respectively. As can be seen from Table 1, the prediction performance of the initial surrogate model is poor. After four iterations, the accuracy of the surrogate model is close to 1, and GAS-AU exhibits higher accuracy than the other approaches. The same conclusion can be drawn from Table 2, indicating the superiority of the proposed method.

Table 1 Comparison of the results of R2
Table 2 Comparison of the results of NRMSE

3 Numerical examples

This section verifies the performance of the GAS-AU approach and compares it with six existing approaches: 1) maximum mean square error (MSE) [15], 2) maximum expected improvement (EI) [17], 3) GO-HI proposed by Lv et al. [21], 4) IQR proposed by Zhang et al. [13], 5) CVVor proposed by Xu et al. (2014) [20], and 6) EIGF proposed by Lam (2008) [25]. Note that, as in Sect. 2.3, this part uses KRG, RBF-M, and RBF-T as the multiple surrogates to construct the prediction model and the average uncertainty AU. In addition, these individual surrogate models are implemented using the MATLAB toolbox developed by Viana [38].

3.1 Test functions

The performance of the GAS-AU is tested on six numerical functions with different dimensions (D): the 2D functions 1 and 2, the 3D function 3, the 4D function 4, the 6D function 5, and the 10D function 6.

Function 1 (2D)

$$y={{x}_{1}}^{2}+2{{x}_{2}}^{2}-0.3\mathrm{cos}\left({3\pi x}_{1}\right)-0.4\mathrm{cos}\left({4\pi x}_{2}\right)+0.7, x\in {[-100, 100]}^{2}$$
(13)

Function 2 (2D)

$$y={[{x}_{2}-1.275{\left(\frac{{x}_{1}}{\pi }\right)}^{2}+5\frac{{x}_{1}}{\pi }-6]}^{2}+10\left(1-\frac{1}{8\pi }\right)\mathrm{cos}\left({x}_{1}\right)+10, x\in {[-5, 10]}^{2}$$
(14)

Function 3 (3D)

$$y=\sum\limits_{i=1}^{3}{x}_{i}^{2}+{\left(\sum \limits_{i=1}^{3}{0.5ix}_{i}\right)}^{2}+{\left(\sum\limits_{i=1}^{3}{0.5ix}_{i}\right)}^{4}, \,x\in {[-5, 10]}^{3}$$
(15)

Function 4 (4D)

$$y=100{({{x}_{1}}^{2}-{x}_{2})}^{2}+{({x}_{1}-1)}^{2}+{({x}_{3}-1)}^{2}+90{({{x}_{3}}^{2}-{x}_{4})}^{2}+10.1{({x}_{2}-1)}^{2}+{({x}_{4}-1)}^{2}+19.8\left({x}_{2}-1\right)\left({x}_{4}-1\right), x\in {[-10, 10]}^{4}$$
(16)

Function 5 (6D)

$$y=0.0204{x}_{1}{x}_{4}\left({{x}_{1}+{x}_{2}+x}_{3}\right)+0.0187{x}_{2}{x}_{3}\left({{x}_{1}+{1.57x}_{2}+x}_{4}\right)+0.0607{x}_{1}{x}_{4}{x}_{5}^{2}\left({{x}_{1}+{x}_{2}+x}_{3}\right)+0.0437{x}_{2}{x}_{3}{x}_{6}^{2}\left({{x}_{1}+{1.57x}_{2}+x}_{4}\right), x\in {[0, 100000]}^{6}$$
(17)

Function 6 (10D)

$$y=\sum\limits_{i=1}^{10}{x}_{i}^{2}+{\left(\sum\limits_{i=1}^{10}{0.5ix}_{i}\right)}^{2}+{(\sum\limits_{i=1}^{10}{0.5ix}_{i})}^{4}, \,x\in {[0, 50]}^{10}$$
(18)

3.2 Design of experiment and performance criteria

Design of experiments is a sampling approach that acquires samples from the design space before building the surrogate [39]. In this work, Latin hypercube sampling is used to produce the initial samples. Table 3 presents the sampling configurations for the six test functions. The MATLAB built-in function lhsdesign is used to randomly generate the initial samples and test samples. In addition, all results are averaged over 20 DOE sets to reduce the effect of randomness.
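Outside MATLAB, the role of lhsdesign can be played by SciPy's Latin hypercube sampler; the sketch below generates samples for the 2-D design space of test function 1, and the sample count of 20 is illustrative rather than the configuration of Table 3.

```python
from scipy.stats import qmc

# Latin hypercube samples for a 2-D design space, x in [-100, 100]^2
# (the bounds of test function 1); 20 points are used for illustration.
sampler = qmc.LatinHypercube(d=2, seed=0)
unit = sampler.random(n=20)                       # samples in [0, 1]^2
X = qmc.scale(unit, [-100, -100], [100, 100])     # map to the design space
print(X.shape)   # (20, 2)
```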

Table 3 Sampling configurations for the six functions

The coefficient of determination R2 and normalized root mean square error NRMSE are usually used to evaluate the global accuracy of the surrogate model. The mathematical expression can be described as,

$${R}^{2}=1-\frac{{\sum }_{j}{\left({\widehat{y}}_{j}-{y}_{j}\right)}^{2}}{{\sum }_{j}{\left({y}_{j}-\bar{y}\right)}^{2}}$$
(19)
$$NRMSE=\sqrt{\frac{{\sum }_{j}{\left({\widehat{y}}_{j}-{y}_{j}\right)}^{2}}{{\sum }_{j}{\left({y}_{j}\right)}^{2}}}$$
(20)

where \({\widehat{y}}_{j}\), \({y}_{j}\), and \(\bar{y}\) are the model prediction, the actual response, and the mean of \({y}_{j}\), respectively. R2 normally ranges from 0 to 1, but negative values appear in some cases. Thus, in this work, R2 is set to 0 if \({R}^{2}\le 0\). On the one hand, \({R}^{2}<0\) and \({R}^{2}=0\) both indicate that the surrogate model cannot capture the relationship between design variables and responses. On the other hand, setting \({R}^{2}=0\) whenever \({R}^{2}\le 0\) prevents large negative values from deteriorating the averaged results.
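Eqs. (19)-(20), including the clamping of negative R2 described above, can be sketched as:

```python
import numpy as np

def r2(y_hat, y):
    """Eq. (19), clamped at 0 as described in the text."""
    val = 1.0 - np.sum((y_hat - y)**2) / np.sum((y - y.mean())**2)
    return max(val, 0.0)

def nrmse(y_hat, y):
    """Eq. (20): RMSE normalized by the root sum of squared responses."""
    return np.sqrt(np.sum((y_hat - y)**2) / np.sum(y**2))

# Small illustrative check with hypothetical responses and predictions:
y = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.1, 1.9, 3.2, 3.8])
print(r2(y_hat, y), nrmse(y_hat, y))   # 0.98 and about 0.0577
```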

3.3 Performance analysis

For test function 1, Fig. 5 shows comparisons of the GAS-AU with the six approaches, i.e., MSE, EI, GO-HI, IQR, CVVor, and EIGF. Note that the means of R2 and NRMSE are averaged over 20 DOE sets, and their standard deviations (Std) over the 20 DOE sets describe the robustness of the surrogate model. In the top subplot of Fig. 5, we can see that adaptive sampling effectively improves the prediction accuracy. Compared with the other approaches, GAS-AU presents the best prediction accuracy at every infilled sample; in other words, the proposed approach achieves the largest improvement in each iteration. More importantly, compared with MSE, EI, GO-HI, IQR, CVVor, and EIGF, the GAS-AU approach exhibits the best robustness. The analysis based on the NRMSE also confirms these results.

Fig. 5
figure 5

Comparison of GAS-AU and six adaptive sampling approaches on test function 1

Figure 6 depicts the comparison of the GAS-AU with MSE, EI, GO-HI, IQR, CVVor, and EIGF on test function 2. The same results as in Fig. 5 are obtained: the GAS-AU presents the best prediction accuracy and robustness in each iteration. This shows that, compared with the other approaches, the proposed approach maximizes the improvement of the prediction model with each infilled sample.

Fig. 6
figure 6

Comparison of GAS-AU and six adaptive sampling approaches on test function 2

On test function 3, Fig. 7 compares GAS-AU with the other approaches. The results in Fig. 7 are similar to those in Figs. 5 and 6: compared with MSE, EI, GO-HI, IQR, CVVor, and EIGF, the GAS-AU presents the best prediction accuracy and robustness in each iteration. This means that the proposed method achieves the largest improvement with every infilled sample.

Fig. 7
figure 7

Comparison of GAS-AU and six adaptive sampling approaches on test function 3

The GAS-AU is compared with the other approaches on test function 4, as shown in Fig. 8. It is clear from the overall trend that the proposed method is superior, even though GAS-AU does not consistently display the best prediction accuracy and robustness. Similarly, for test functions 5 and 6, the results shown in Figs. 9 and 10 support the same finding. In summary, the GAS-AU provides the best prediction accuracy in each iteration when compared with MSE, EI, GO-HI, IQR, CVVor, and EIGF; although it may not always exhibit the best robustness, it still performs better than the other methods overall.

Fig. 8
figure 8

Comparison of GAS-AU and six adaptive sampling approaches on test function 4

Fig. 9
figure 9

Comparison of GAS-AU and six adaptive sampling approaches on test function 5

Fig. 10
figure 10

Comparison of GAS-AU and six adaptive sampling approaches on test function 6

As can be seen from Figs. 5, 6, 7, 8, 9, and 10, given enough iterations, any of the adaptive sampling methods can construct a surrogate model with sufficient prediction accuracy. The difference is that a good adaptive sampling method builds a high-accuracy model with fewer iterations. Thus, to analyze the performance of GAS-AU more precisely, it is necessary to set an iteration stopping criterion. As described in Ref. [40], R2 > 0.8 means that the surrogate model presents good prediction accuracy. Nevertheless, in this work, the stopping criterion is set to R2 > 0.9 to obtain better prediction accuracy: if the requirement is met, the iterative process stops; otherwise, the iteration continues.

Table 4 gives the detailed comparison of the GAS-AU with the other approaches in terms of the number of iterations. Note that the mean number of iterations is averaged over the six test functions, and the average savings represent the cost savings achieved by GAS-AU relative to each method. The GAS-AU requires the smallest number of iterations on each test function. Compared with MSE, EI, GO-HI, IQR, CVVor, and EIGF, the GAS-AU saves an average of 43%, 79%, 46%, 41%, 50%, and 62% of the computational cost, respectively. These results indicate that the proposed approach can build surrogate models with acceptable predictive capability at a lower cost than the other approaches, again demonstrating the superiority of GAS-AU.

Table 4 Comparison of GAS-AU with six adaptive sampling approaches on number of iterations

4 Engineering problem

This section uses the hoist sheave shown in Fig. 11 to further study the performance of the proposed GAS-AU approach. In Fig. 11, x1 to x8 represent the structural parameters, that is, the eight variables of the design space. To determine the design size of the hoist sheave, a maximum-stress model is built using the surrogate model. The eight variables are defined as follows:

Fig. 11
figure 11

Structural parameters of the hoist sheave

$$\mathrm{Fun}{:}\; \sigma =f\left({x}_{i}\right), \quad i=1, 2, \cdots , 8$$
$$\mathrm{s}.\mathrm{t}.\left\{\begin{array}{c}2100\le {x}_{1}\le 2300\\ 500\le {x}_{2}\le 800\\ 300\le {x}_{3}\le 500\\ 100\le {x}_{4}\le 300\\ 2300\le {x}_{5}\le 2500\\ 300\le {x}_{6}\le 500\\ 250\le {x}_{7}\le 500\\ 150\le {x}_{8}\le 300\end{array}\right.$$
(21)

To obtain the stress of the hoist sheave, finite element analysis is adopted for the computer simulations. The finite element model contains approximately 30 thousand elements, as shown in Fig. 12, and the resulting stress distribution is shown in Fig. 13. The material of the model is Q345-B, whose density, elastic modulus, Poisson's ratio, yield strength, and tensile strength are 7850 kg/m³, 2.06 × 10⁵ MPa, 0.28, 345 MPa, and 470 MPa, respectively.

Fig. 12
figure 12

Structural grids for the hoist sheave model

Fig. 13
figure 13

Stress distribution for the hoist sheave

In this section, Latin hypercube sampling is employed to generate 40 initial samples for constructing the initial surrogate model, and 100 samples are used as the test dataset to evaluate its performance. The evaluation indices are again R2 and NRMSE, as in Sect. 3. The adaptive sampling method is then used to generate new samples and improve the prediction accuracy of the surrogate model until it reaches sufficient accuracy (R2 > 0.9).

This section again compares the proposed GAS-AU approach with MSE, EI, GO-HI, IQR, CVVor, and EIGF; the results are shown in Fig. 14, which visually compares the prediction accuracy of the stress model during each iteration. The fluctuations of the curves in Fig. 14 suggest that infilling new samples may occasionally worsen the prediction accuracy. From the overall trend, however, the prediction accuracy of the surrogate model improves as the number of sample points increases. More importantly, although GAS-AU does not always show the best prediction accuracy, it still outperforms the other methods. The analysis based on NRMSE confirms these results. In short, for the same number of iterations, GAS-AU is superior to the other approaches.

Fig. 14
figure 14

Comparison of GAS-AU and six adaptive sampling approaches on an engineering problem

To obtain more reliable results, this section tests the performance of the proposed GAS-AU approach again under a stopping criterion (R2 > 0.9). GAS-AU is again compared with the other approaches, and the results are listed in Table 5. The proposed GAS-AU method satisfies the stopping criterion with only 23 iterations, the smallest number among all methods compared. This illustrates that the method can construct a surrogate model with acceptable accuracy at a lower cost than the other approaches; in other words, GAS-AU satisfies the stopping criterion at a faster rate. For this engineering problem, the GAS-AU saves 45%, 61%, 30%, 57%, 28%, and 56% of the computational cost compared with MSE, EI, GO-HI, IQR, CVVor, and EIGF, respectively. In summary, the proposed approach is superior to the other approaches.

Table 5 Comparison of GAS-AU with six adaptive sampling approaches on number of iterations

5 Conclusion

This work proposes a novel adaptive sampling approach, namely GAS-AU. The core of the proposed approach is the construction of the average uncertainty index, which is designed to describe prediction uncertainty as accurately as possible. First, the predicted mean is obtained from the CV predictions of multiple surrogate models; compared with a single surrogate model, multiple surrogates enrich the distribution of CV predictions and thereby improve the reliability of the predicted mean. Then, weight coefficients are calculated from the global errors of the surrogate models to construct the prediction model. Finally, the average uncertainty index is constructed from the predicted mean and the prediction model. A new sample is generated at the point with the largest average uncertainty to update the prediction model until it reaches acceptable accuracy.

Six test functions with different dimensions are used to test the performance of the GAS-AU approach. Compared with MSE, EI, GO-HI, IQR, CVVor, and EIGF, the GAS-AU approach presents better prediction accuracy and robustness, meaning that it achieves a larger improvement per added sample than the other approaches. More importantly, the same results are obtained on an engineering problem, indicating that GAS-AU is superior to the other approaches for the same number of added samples. To make the results more reliable, this paper also tests the proposed approach under a stopping criterion: GAS-AU satisfies the criterion with the minimum number of iterations among all compared methods. These results show that the proposed approach can construct surrogate models with acceptable accuracy at a lower cost than the other approaches. In summary, the GAS-AU approach is superior to the alternatives considered.

Although the proposed approach shows good performance and can save considerable computational cost, it relies on CV predictions, which makes GAS-AU slightly slower than MSE and EI in computation speed. Therefore, future work will aim to improve the computational speed of the GAS-AU approach.