Introduction

In the sheet metal forming process, expensive computer simulations are frequently used to optimize the process parameters. Compared with physical trial-and-error approaches, Finite Element (FE) simulation offers substantial reductions in time and cost, and its use in sheet metal forming has been widely studied. In the early stage of sheet metal forming optimization, classical optimization methods were integrated with FE simulation. Ohata et al. [1] integrated the sweeping simplex method and FE analysis to optimize the punch travel and forming stages for a uniform thickness distribution. Batoz and Guo [2] optimized the blank shape and the drawbead restraining forces by using the Sequential Quadratic Programming (SQP) method. Ghouati et al. [3, 4] used a gradient optimization method with static and dynamic implicit algorithms to control springback. Park et al. [5] suggested a method for determining the optimal blank dimensions of a square cup by combining the ideal forming theory with an FE deformation path iteration method. Guo et al. [6] also applied SQP to shape optimization of blank contours. Naceur et al. [7] used the Inverse Analysis or Approach (IA) to optimize the drawbead restraining forces to improve sheet metal formability in a deep drawing process. Kleiber et al. [58] used Adaptive Monte Carlo (AMC) as the system reliability assessment technique to perform sheet stamping optimization. To obtain the global optimal solution, heuristic methods, such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), have been applied to forming design. Azaouzi et al. [8] developed an automatic numerical procedure based on a commercial FEM code and Heuristic Optimization Algorithms (HOA) for the blank shape design of high-precision metallic parts. Wei et al. [59] used a Multi-Objective Genetic Algorithm (MOGA) to optimize the sheet metal forming process. Naceur et al. [9] developed an approach coupling the IA with an Evolutionary Algorithm (EA) to optimize the blank contour of a square cup. Although most heuristic methods can locate the global solution, they are difficult to employ in practice because of the large number of expensive evaluations they require. To improve the efficiency of classical optimization methods, Surrogate Assisted Optimization (SAO) has been widely used for forming optimization. Ohata et al. [10] used the Response Surface Method (RSM) to find annealing conditions suitable for a sheet metal forming process. Jansson et al. [11] applied Space Mapping (SM) with RSM to the optimization of the sheet metal forming process. Breitkopf et al. [12] used a successive RSM to optimize the initial blank of the Twingo dashpot cup. Jansson et al. [13] integrated the RSM and the SM for the optimization of an automotive sheet metal part; they also used a design hierarchy and RSM to avoid material failure while simultaneously reaching an acceptable through-thickness strain. Huang et al. [14] presented an efficient method combining FEM and RSM to optimize the intermediate tool surfaces and thereby minimize the thickness variation in the multi-step sheet metal stamping process. Bonte et al. [15] proposed a surrogate-based robust optimization and applied it to a deep drawing process. Wang et al. [16] used an Adaptive Response Surface Method (ARSM) based on an Intelligent Sampling (IS) method. Wang et al. [17] also used a surrogate-based optimization method integrating Support Vector Regression (SVR) and the IS strategy to optimize the sheet metal forming process. Kahhal et al. used MOGA integrated with RSM to minimize objective functions of fracture and wrinkling simultaneously. Wiebenga et al. [18] made use of regularization in the fitting procedure of Kriging, Radial Basis Function (RBF) and Artificial Neural Network (ANN) surrogates to alleviate the severe deteriorating effect of numerical noise on the approximation quality of surrogates in sheet metal forming optimization. Clearly, sheet metal forming optimization has made significant progress in the last 20 years, and surrogate modeling techniques have attracted much attention as a way to improve the efficiency of optimization for sheet metal forming problems.

In our opinion, most sheet metal forming problems should be analyzed considering the time dependence of process parameters. Unfortunately, few studies have focused on time-dependent sheet metal forming optimization. Jakumeit et al. [19] optimized the Variable Blank Holder Force (VBHF) of a deep-drawing process by using an iterative parallel Kriging algorithm; in their application, three time points are investigated to control the deep-drawing simulation. In Goel's [20] work, the deformation process is divided into multiple steps and an intermediate flange contour is designed for each stage. However, as the number of key time points and the load time increase, the number of design variables increases correspondingly. Therefore, most time-dependent problems are medium-scale problems, and the curse of dimensionality becomes a formidable issue in time-dependent sheet metal forming optimization. Recently, Surrogate Assisted Evolutionary Algorithms (SAEAs) have appeared as a promising approach for dealing with such computationally expensive optimization problems. Zhou et al. [21] used a GP model as a global surrogate model and Lamarckian evolution as a local surrogate model, coupled with an Evolutionary Algorithm (EA), for an aerodynamic problem with 24 parameters. Lim et al. [22] used an ensemble surrogate to assist an EA and obtained reasonably good results on benchmark problems with 30 design variables. Bo et al. [23] proposed a GP model assisted EA for a circuit design problem with 17 design variables. Therefore, SAEAs seem to be a potentially useful approach for such medium-scale and high-dimensional problems. Another important issue is the selection of a suitable surrogate for the EA. Theoretically, any surrogate can be integrated with the EA. Among them, Polynomial Regression (PR, also known as RSM), SVR, ANN, RBF and GP (also referred to as Kriging or Design and Analysis of Computer Experiments (DACE) models) are the most prominent and commonly used [24,25,26]. Using these surrogates, complex optimization problems with computationally expensive objective functions and constraints can be solved more efficiently, and the search time of the EA can be significantly reduced. In this work, GP modeling is used for the following reasons.

  1) GP modeling exhibits better modeling accuracy than many other surrogate modeling methods, especially when the number of sample points is small and the engineering optimization problem shows strongly nonlinear behavior, because it is a theoretically principled method for determining a much smaller number of free model parameters [27, 28].

  2) GP modeling is good at estimating the model uncertainty at each predicted point. Thus, the deteriorating effect of noisy data on constructing a high-performance model can be greatly reduced. Furthermore, the search algorithm can identify more promising areas based on the model uncertainty and improve the modeling accuracy efficiently [27, 28].

  3) GP modeling can be readily combined with effective prescreening methods in optimization. For example, GP modeling with Expected Improvement (EI) prescreening [29] has demonstrated its ability to balance local and global optimization [30].

The rest of this paper is organized as follows. The basic theories of the GP modeling technique and the EI-assisted prescreening strategy are introduced in the "Gaussian process modeling" and "Expected improvement-based prescreening (EIP) strategy" sections, respectively. The framework of the suggested method is then presented in the "Gaussian process assisted firefly algorithm (GPFA)" section. The experimental results of the suggested GPFA method on commonly used mathematical benchmark problems are described in the "Numerical benchmarks" section, and a real-world application to medium-scale expensive time-dependent sheet metal forming optimization is presented in the "Time dependent assisted variable blank holder force optimization" section. The "Conclusion" section gives the final conclusions.

Gaussian process modeling

Consider the problem of approximating an underlying function of interest y = f(x), x ∈ R^d. GP modeling supposes that f(x) at any sample point x is a Gaussian random variable whose mean and variance are μ and σ², respectively, so that f(x) can be expressed in closed form as

$$ f\sim N\left(\mu, {\sigma}^2\right) $$
(1)

Therefore, for any sample point x, a Gaussian model postulates a combination of a fixed constant μ and departures of the form:

$$ f(x)=\mu + Z(x) $$
(2)

where Z(x) is assumed to be a random function of x representing the 'localized' deviations and is distributed as N(0, σ²). For any two sample points x, x′ ∈ R^d, the correlation between Z(x) and Z(x′) can be described by the covariance function

$$ \mathrm{Cov}\left[ Z(x), Z\left({x}^{\prime}\right)\right]={\sigma}^2 R\left( x,{x}^{\prime },\boldsymbol{\uptheta} \right) $$
(3)

where R(x, x’, θ) is the correlation function between x and x’ and can be expressed as

$$ R\left( x,{x}^{\prime },\uptheta \right)= \exp \left[-\sum_{l=1}^d{\theta}_l{\left|{x}_l-{x}_l^{\prime}\right|}^{p_l}\right] $$
(4)

where θ = [θ_1, θ_2, …, θ_d]^T is the positive correlation parameter vector used to fit the model; its l-th component θ_l indicates the importance of x_l on f(x), and the parameter 1 ≤ p_l ≤ 2 is related to the smoothness of f(x) with respect to x_l. More details related to GP modeling can be found in [31].
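For illustration, the following is a minimal Python sketch of the correlation function in Eq. (4); the vectorized helper name `corr_matrix` is ours, and a fixed smoothness exponent p_l = 2 (a Gaussian kernel) is assumed for all dimensions.

```python
import numpy as np

def corr_matrix(X1, X2, theta, p=2.0):
    """R(x, x', theta) = exp(-sum_l theta_l |x_l - x'_l|^p_l), Eq. (4), for two point sets.

    X1: (n1, d) array, X2: (n2, d) array, theta: (d,) positive correlation parameters.
    """
    diff = np.abs(X1[:, None, :] - X2[None, :, :])      # pairwise |x_l - x'_l|, shape (n1, n2, d)
    return np.exp(-np.sum(theta * diff ** p, axis=2))   # correlation matrix, shape (n1, n2)
```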

Given K sample points x^1, x^2, …, x^K ∈ R^d and their corresponding function values y^1, y^2, …, y^K, suppose that the response at any sample point x can be approximated by a linear combination of y^1, y^2, …, y^K:

$$ \widehat{f}(x)={\mathbf{c}}^T\mathbf{y} $$
(5)

where c = [c_1, c_2, …, c_K]^T and y = [y^1, y^2, …, y^K]^T. The hyperparameters μ, σ² and the correlation parameter vector θ can then be estimated by maximizing the likelihood function of f(x) = y^i at x = x^i (i = 1, 2, …, K):

$$ \frac{1}{{\left(2{\pi \sigma}^2\right)}^{K/2}\sqrt{ \det \left(\mathbf{R}\right)}} \exp \left[-{\left(\mathbf{y}-\mathbf{e}\mu \right)}^{\boldsymbol{T}}{\mathbf{R}}^{-1}\left(\mathbf{y}-\mathbf{e}\mu \right)/2{\sigma}^2\right] $$
(6)

where R is a K × K matrix whose (i, j) element is R(x^i, x^j, θ), as given in Eq. (4), and e = [1, 1, …, 1]^T. In this study, Differential Evolution (DE) is used to optimize the likelihood function. As a result, estimates of the predictor and the standard deviation at x, denoted by \( \widehat{f}(x) \) and \( \widehat{s}(x) \) respectively, can be obtained by combining the condition of uniformly minimum variance unbiased estimation with the Lagrangian multiplier method:

$$ \widehat{f}(x)=\widehat{\mu}+{\mathbf{r}}^T{\mathbf{R}}^{-1}\left(\mathbf{y}-\mathbf{e}\widehat{\mu}\right) $$
(7)
$$ {\widehat{s}}^2(x)={\widehat{\sigma}}^2\left[1-{\mathbf{r}}^T{\mathbf{R}}^{-1}\mathbf{r}+{\left(1-{\mathbf{e}}^T{\mathbf{R}}^{-1}\mathbf{r}\right)}^2/{\mathbf{e}}^T{\mathbf{R}}^{-1}\mathbf{e}\right] $$
(8)

where

$$ \widehat{\mu}={\left({\mathbf{e}}^T{\mathbf{R}}^{-1}\mathbf{e}\right)}^{-1}{\mathbf{e}}^T{\mathbf{R}}^{-1}\mathbf{y} $$
(9)
$$ {\widehat{\sigma}}^2=\left[{\left(\mathbf{y}-\mathbf{e}\widehat{\mu}\right)}^T{\mathbf{R}}^{-1}\left(\mathbf{y}-\mathbf{e}\widehat{\mu}\right)\right]/ K $$
(10)
$$ \mathbf{r}={\left[ R\left( x,{x}^1,\boldsymbol{\uptheta} \right),\cdots, R\left( x,{x}^K,\boldsymbol{\uptheta} \right)\right]}^T $$
(11)

The details of estimating the hyperparameters μ, σ², θ_l and p_l for l = 1, 2, …, d, and of obtaining the estimates of the predictor \( \widehat{f}(x) \) and the standard deviation \( \widehat{s}(x) \), can be found in [32].
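Building on the `corr_matrix` helper sketched after Eq. (4), the following is a minimal sketch of the GP predictor of Eqs. (7)-(11), assuming the correlation parameters θ have already been estimated (the paper maximizes the likelihood of Eq. (6) with DE); the function name `gp_predict` and the small `nugget` regularization term are our own additions for illustration.

```python
import numpy as np

def gp_predict(x, X, y, theta, nugget=1e-10):
    """Return the GP predictor f_hat(x) and its MSE s2_hat(x), Eqs. (7)-(11)."""
    K = len(y)
    R = corr_matrix(X, X, theta) + nugget * np.eye(K)      # K x K correlation matrix
    Rinv = np.linalg.inv(R)
    e = np.ones(K)
    mu_hat = (e @ Rinv @ y) / (e @ Rinv @ e)               # Eq. (9)
    resid = y - e * mu_hat
    sigma2_hat = (resid @ Rinv @ resid) / K                # Eq. (10)
    r = corr_matrix(np.atleast_2d(x), X, theta).ravel()    # correlation vector, Eq. (11)
    f_hat = mu_hat + r @ Rinv @ resid                      # predictor, Eq. (7)
    s2_hat = sigma2_hat * (1.0 - r @ Rinv @ r
             + (1.0 - e @ Rinv @ r) ** 2 / (e @ Rinv @ e))  # MSE, Eq. (8)
    return f_hat, max(float(s2_hat), 0.0)
```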

Expected improvement-based prescreening (EIP) strategy

Suppose there are M untested sample points x^{K+1}, x^{K+2}, …, x^{K+M} ∈ R^d. A key issue is how to rank these M untested sample points without real function evaluations. A simple and direct way to solve this problem is to build a GP model based on the exactly evaluated data (x^i, y^i) (i = 1, 2, …, K), calculate an evaluation criterion for each untested sample point, and then rank the points according to that criterion. This process is called prescreening.

According to different evaluation criteria, several different prescreening strategies are available for the GP modeling in optimization, such as Lower Confidence Bound-based Prescreening (LCBP) [33], EI-based Prescreening (EIP) and Probability of Improvement-based Prescreening (PIP) [34]. In the suggested algorithm, the EIP is integrated to generate the most promising solution according to the following considerations.

  1. The EIP is better at balancing the local and the global search than the LCBP and PIP.

  2. The EIP is a theoretically principled prescreening criterion that requires fewer subjective parameters than the LCBP and PIP.

The concept of EI was suggested in the literature as early as 1978 [35]. Recently, it has been applied in several optimization algorithms, such as Efficient Global Optimization (EGO). In EGO, the EI function can be expressed as follows:

$$ \mathrm{EI}(x)=\left({f}_{\min }-\widehat{y}\right)\Phi \left(\frac{f_{\min }-\widehat{y}}{\widehat{s}}\right)+\widehat{s}\varphi \left(\frac{f_{\min }-\widehat{y}}{\widehat{s}}\right) $$
(12)

where f_min = min(y^1, y^2, …, y^K), y^i is the function value of x^i (i = 1, 2, …, K), K is the number of initial sample points, \( \widehat{y} \) and \( \widehat{s} \) represent the DACE predictor and its standard error at an unknown sample point x, respectively, and Φ(·) and φ(·) denote the standard normal Cumulative Distribution Function (CDF) and Probability Density Function (PDF), respectively. Differentiating the EI function in Eq. (12) with respect to \( \widehat{y} \) and \( \widehat{s} \) gives

$$ \frac{\partial \mathrm{EI}(x)}{\partial \widehat{y}}=\hbox{-} \Phi \left(\frac{f_{\min }-\widehat{y}}{\widehat{s}}\right)<0 $$
(13)
$$ \frac{\partial \mathrm{EI}(x)}{\partial \widehat{s}}=\varphi \left(\frac{f_{\min }-\widehat{y}}{\widehat{s}}\right)>0 $$
(14)

It is clear that the EI function is monotonic in \( \widehat{y} \) and in \( \widehat{s} \): the EI becomes larger as \( \widehat{y} \) decreases and as \( \widehat{s} \) increases. Generally, the first term of Eq. (12) reflects the exploitation ability of EGO and makes it focus on a relatively small region near the sample point with the current minimum f_min. The second term of Eq. (12) reflects the exploration behaviour of EGO and makes it focus on unexplored promising areas, where the standard error is higher. In short, EGO achieves a good balance between local exploitation and global exploration when using the EI criterion. In the suggested method, the M untested sample points are ranked according to the EI criterion, and a new promising sample point with the maximum EI value can be obtained from the untested sample points without exact function evaluations.

Suppose the training data are x^1, x^2, …, x^K ∈ R^d with responses y^1, y^2, …, y^K, and the sample points to be prescreened are x^{K+1}, x^{K+2}, …, x^{K+M} ∈ R^d. The estimated best promising solution among x^{K+1}, x^{K+2}, …, x^{K+M} can then be obtained, without exact function evaluations, by the EIP strategy as follows:

  • Step 1. Construct a GP model using the data x^i and y^i (i = 1, 2, …, K), and then use this model to calculate the EI value of each x^{K+j} (j = 1, 2, …, M).

  • Step 2. Find the sample point with the maximum EI function value among x K+1, x K+2,…, x K+M and regard it as the estimated best promising solution.
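For illustration, a minimal sketch of the EI criterion of Eq. (12) and the two prescreening steps above, built on the `gp_predict` helper sketched earlier; the function names are ours, and `scipy.stats.norm` supplies the standard normal CDF and PDF.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(f_hat, s_hat, f_min):
    """EI criterion of Eq. (12); returns 0 where the prediction has no uncertainty."""
    if s_hat <= 0.0:
        return 0.0
    u = (f_min - f_hat) / s_hat
    return (f_min - f_hat) * norm.cdf(u) + s_hat * norm.pdf(u)

def ei_prescreen(candidates, X, y, theta):
    """Steps 1-2: rank untested candidates by EI and return the most promising one."""
    f_min = np.min(y)
    ei_values = []
    for x in candidates:                                  # x^{K+1}, ..., x^{K+M}
        f_hat, s2_hat = gp_predict(x, X, y, theta)
        ei_values.append(expected_improvement(f_hat, np.sqrt(s2_hat), f_min))
    return candidates[int(np.argmax(ei_values))]
```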

Although the EI is a feasible prescreening criterion and has been widely integrated with several optimization methods, its major bottleneck is that the variance is often underestimated and the convergence rate is sensitive to the distribution of the initial sample points. In fact, once the initial distribution of sample points is fixed, the search trajectory is essentially deterministic, which means that EI-assisted optimization methods such as EGO may fall into local convergence. Recently, some methods have been proposed to maintain the diversity of the sequential samples, such as Multi-Surrogate EGO (MSEGO) [36]. However, medium-scale problems are still difficult to handle. Therefore, it is expected that the random character of an EA will enhance the diversity of sample points and help obtain the global optimum efficiently. Considering the performance of recently developed EAs, the FA is used in this study.

Gaussian process assisted firefly algorithm (GPFA)

Firefly algorithm (FA)

The FA is used as the search engine in the suggested GPFA method. The FA is one of the latest EAs, proposed by Yang [37], and was inspired by the flashing behavior of fireflies. Based on this biological tropism, new candidate solutions can be generated: each firefly moves towards the position of a brighter firefly, where the firefly brightness is proportional to the fitness value. Since a firefly's attractiveness is proportional to the light intensity seen by adjacent fireflies, the variation of the attractiveness β with the distance r is given by

$$ \beta ={\beta}_0{e}^{-\gamma {r}^2} $$
(15)

where β_0 is the attractiveness at r = 0, γ is an absorption coefficient which controls the decrease of the light intensity [38, 39], and r is the distance between any two fireflies i and j, located at x^i and x^j respectively. The distance r_ij is defined as the Cartesian distance in Eq. (16), where \( {x}_k^i \) is the kth component of the spatial coordinate x^i of the ith firefly and d is the number of dimensions [38, 39].

$$ {r}_{ij}=\left\Vert {x}^i-{x}^j\right\Vert =\sqrt{\sum_{k=1}^d{\left({x}_k^i-{x}_k^j\right)}^2} $$
(16)

The movement of a firefly i, which is attracted by another more attractive (i.e., brighter) firefly j, is determined by the following form:

$$ {x}_i^{t+1}={x}_i^t+{\beta}_0{e}^{-\gamma {r}_{i j}^2}\left({x}_j^t-{x}_i^t\right)+{\alpha}^t{\varepsilon}_i^t $$
(17)

where \( {x}_i^t \) is the position (solution) of firefly i at time t, the second term represents the attraction towards the brighter firefly j, and the third term \( {\alpha}^t{\varepsilon}_i^t \) is a randomization term, with α^t being the randomization parameter, which is determined by the problem of interest and set within [0,1], and \( {\varepsilon}_i^t \) a vector of random numbers drawn from a Gaussian or uniform distribution at time t. If β_0 = 0, the update becomes a simple random walk; if γ = 0, it reduces to a variant of PSO. Theoretically, \( {\varepsilon}_i^t \) can be extended to other distributions. In Eq. (17), α^t controls the randomness, and it can be decreased with the cycle counter t as

$$ {\alpha}^t={\alpha}_0{\delta}^t,\delta \in \left(0,1\right) $$
(18)

where α_0 is the initial randomness scaling factor, and δ is a cooling factor. For most applications, δ is suggested to be selected within the range [0.95, 0.97] to reduce the randomization [37].

For the initial α_0, experiments show that the FA is more efficient if α_0 is associated with the scaling of the design variables. Let L be the average scale of the problem; we can set α_0 = 0.01L initially. The factor 0.01 comes from the fact that random walks require a number of steps to reach the target while balancing local exploitation without jumping too far in a few steps [39]. The parameter β_0 controls the attractiveness, and parametric studies suggest that β_0 = 1 can be used for most applications [40]. The parameter γ should also be related to the scaling L; in general, γ = 1/L^0.5. For most applications, a population size of M = 15 to 100 is suggested, with M = 25 to 40 recommended [38]. The parameters that need to be set in the FA are summarized in Table 1.

Table 1 Parameters in the FA
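The following is a minimal sketch of one generation of the basic FA update of Eqs. (15)-(17) for minimization, using the default parameter values discussed above; the function name `firefly_step` is ours, and the uniform perturbation is an assumption (the paper allows Gaussian or uniform ε).

```python
import numpy as np

def firefly_step(X, fitness, beta0=1.0, gamma=1.0, alpha=0.01, rng=np.random):
    """One generation of the basic FA for minimization.

    X       : (M, d) array of firefly positions
    fitness : (M,) array of objective values (lower is brighter)
    """
    M, d = X.shape
    X_new = X.copy()
    for i in range(M):
        for j in range(M):
            if fitness[j] < fitness[i]:                   # firefly j is brighter
                r2 = np.sum((X_new[i] - X[j]) ** 2)        # squared distance, Eq. (16)
                beta = beta0 * np.exp(-gamma * r2)         # attractiveness, Eq. (15)
                eps = rng.uniform(-0.5, 0.5, d)            # random perturbation
                X_new[i] += beta * (X[j] - X_new[i]) + alpha * eps   # move, Eq. (17)
    return X_new

# the randomization parameter is typically cooled each cycle, Eq. (18):
# alpha_t = alpha_0 * delta**t, with delta in (0, 1), e.g. 0.97
```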

Compared with other popular nature-inspired optimization methods, such as the GA [41], Particle Swarm Optimization (PSO) [42], Differential Evolution (DE) [43] and the Artificial Bee Colony (ABC) algorithm [44], the FA has been shown to be more efficient in dealing with multimodal global optimization problems [45,46,47,48]. Two important characteristics of the FA should be noted.

  1) The population can be subdivided into subgroups adaptively, because the FA is based on the decrease of attraction and attractiveness with distance. This adaptive subdivision capability makes it suitable for highly nonlinear, multimodal problems [49].

  2) The FA can find all optima simultaneously if the population size is much larger than the number of modes. According to Yang's test on De Jong's function with 256 variables, the GA requires 25,412 evaluations (mean value), the PSO 17,040 (mean value) and the FA only 5567 (mean value) to reach the same accuracy level of the optimal solution, which demonstrates that the FA outperforms the others [49].

In this study, the FA is utilized as the search engine to generate potential sample points for prescreening. Similar to other SAEAs, a GP surrogate model is integrated with the FA to save computational cost; in SAEAs, most exact function evaluations are replaced by values predicted by the surrogate. However, compared with existing SAEAs, the distinctive characteristic here is the use of the EIP. Therefore, the most important issue is how to integrate the EIP and the FA, which is discussed below.

Sequentially, the current M best sample points are extracted from all candidates and denoted as P = {x^i | x^1, x^2, …, x^M}, and the rest of the sample points are denoted by P′ = {x^j | x^{M+1}, x^{M+2}, …, x^N}, where N is the total number of sample points in the present cycle.

In order to generate a child sample point set \( {\mathbf{x}}_c=\left\{{x}_c^1,{x}_c^2,\dots, {x}_c^M\right\} \)of current samples, the FA works as follows:

  • Step 1. If the number of sample points in P′ is zero, let u = x^i with x^i belonging to P, where u is a child sample point belonging to x_c; otherwise the procedure goes to Step 2. As shown in Fig. 1, P′ is obtained from the former step.

  • Step 2. Select a sample point x^j from P′ randomly and calculate its light intensity I_j, which is related to the exact function by I_j = f(x^j).

  • Step 3. If every light intensity I_i in P is larger than I_j, the procedure goes to Step 4; otherwise the sample points x^j and x^i are exchanged and the procedure returns to Step 2.

  • Step 4. Let \( {r}_{jl}=\left\Vert {x}^j-{x}^l\right\Vert =\sqrt{\sum_{m=1}^d{\left({x}_m^j-{x}_m^l\right)}^2} \), \( \beta =\left({\beta}_0-{\beta}_{\min}\right){e}^{-\gamma {r}_{jl}^2}+{\beta}_{\min } \) and \( {x}^j={x}^j\left(1-\beta \right)+\beta {x}^{best}+{\alpha}_0{\delta \varepsilon}_j^t \). Then the procedure returns to Step 2. The aim of this step is to adjust the value of the parameter β and generate more promising samples to speed up convergence (a minimal sketch of this move is given after this list). Details about the settings of the parameters β_0, γ, α_0, δ, \( {\varepsilon}_i^t \) can be found in "Parameters settings" section.
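A minimal sketch of the attraction move in Step 4 is given below, under the assumption that x_best is the current best sample in P and that the parameters take the values given in "Parameters settings" section; the function name `move_towards_best` is ours.

```python
import numpy as np

def move_towards_best(x_j, x_best, beta0=1.0, beta_min=0.04, gamma=1.0,
                      alpha0=0.01, delta=0.97, rng=np.random.default_rng()):
    """Step 4: pull sample x_j towards the current best point x_best."""
    r2 = np.sum((x_j - x_best) ** 2)                           # squared Cartesian distance
    beta = (beta0 - beta_min) * np.exp(-gamma * r2) + beta_min  # bounded attractiveness
    eps = rng.uniform(-0.5, 0.5, size=x_j.shape)                # random perturbation
    return x_j * (1.0 - beta) + beta * x_best + alpha0 * delta * eps
```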

Fig. 1 Flowchart of the GPFA

GPFA framework

For SAEAs and SAOs, the distribution of the initial sample points is important for the sequential optimization procedure. In this study, the most popular Design of Experiments (DoE) method, Latin Hypercube Design (LHD) [50], is used to generate the set of initial sample points. Compared with other popular DoEs, such as Full Factorial (FF), D-Optimal (D-Opt) and Central Composite Design (CCD), LHD achieves a more effective sampling with fewer samples [51] and has been widely used for initialization in optimization problems [52]. The GPFA then proceeds as follows.

  • Step 1. The LHD is used to generate K initial sample points within [x_Lower, x_Upper] for constructing an initial surrogate, and the corresponding preliminary design evaluations are performed at these sample points. The K initial sample points and their corresponding evaluations form the initial database.

  • Step 2. If the current stopping criterion is satisfied (the stopping criterion is the maximum number of iterations for the benchmark functions or, for the real engineering problems, that the average relative change rate of the objective function values over 15 consecutive iterations is less than 0.0001), the procedure terminates and the best solution in the database is returned; otherwise the procedure goes to Step 3.

  • Step 3. Select the M best sample points from the database to form a population P. Our pilot experiments with M = 20, 40, …, 180 showed that either a very large or a very small M can lead to slow convergence. The setting of M is based on these pilot experiments and on the following two considerations. Firstly, the M best sample points are used as the spatial coordinates of the initial fireflies in the FA to generate new candidates, and a range of M from 25 to 40 is suggested for most FA applications [38]. Secondly, Bo et al. [23] suggested that 30 ≤ M ≤ 60 works well. Therefore, the value of M depends on the problem size; according to our experiments on medium-scale cases, M = 60 is suitable for the problems in this study.

  • Step 4. Apply the FA to P to generate M corresponding child samples for prescreening.

  • Step 5. Build the GP model based on K best sample points with the minimum function values in the updated database and their corresponding function values.

  • Step 6. Use the GP model to evaluate EI values of M child samples generated by the FA in Step 4.

  • Step 7. Find the sample point with maximum EI value among the M child sample points and evaluate its real function value.

  • Step 8. Add the evaluated sample and its function value to the database, then procedure goes back to Step 2.

In order to present the suggested method clearly, the flow chart of the GPFA is illustrated in Fig. 1. Some remarks on the GPFA framework should be noted. The population P generated in Step 3 is formed by the M sample points with the minimum function values in the database, and the new sample point added in each iteration is also the best one among the prescreened children. Therefore, most of the child sample points generated in Step 4 lie in several relatively small promising subareas. Because the child sample points to be prescreened in Steps 6, 7 and 8 lie in relatively small promising subareas, the training data generated in Step 5 are not far from the solutions to be prescreened, which helps the GPFA keep the search robust and fast.
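To make the framework concrete, the following is a minimal, hedged sketch of the GPFA loop (Steps 1-8), reusing the `move_towards_best` and `ei_prescreen` helpers sketched earlier; plain uniform sampling stands in for the LHD initialization, a fixed θ stands in for the DE-based likelihood maximization, and a simple evaluation budget stands in for the stopping criteria of Step 2. All function names are ours.

```python
import numpy as np

def gpfa(objective, bounds, K=100, M=60, max_evals=1000, seed=0):
    """Sketch of the GPFA loop: initial design, FA-style child generation, EI prescreening."""
    rng = np.random.default_rng(seed)
    lower, upper = np.array(bounds, dtype=float).T
    d = len(lower)
    # Step 1: initial design (plain uniform sampling stands in for the LHD used in the paper)
    X = lower + rng.random((K, d)) * (upper - lower)
    y = np.array([objective(x) for x in X])

    while len(y) < max_evals:                              # Step 2: simple budget-based stop
        order = np.argsort(y)
        P = X[order[:M]]                                   # Step 3: M best points form P
        x_best = P[0]
        # Step 4: generate M children by pulling P towards the current best point
        children = np.array([move_towards_best(p, x_best, rng=rng) for p in P])
        children = np.clip(children, lower, upper)
        theta = np.full(d, 1.0)                            # Step 5: placeholder for DE-tuned theta
        X_train, y_train = X[order[:K]], y[order[:K]]      # K best points train the GP
        x_new = ei_prescreen(children, X_train, y_train, theta)  # Steps 6-7: maximum-EI child
        y_new = objective(x_new)                           # exact (expensive) evaluation
        X = np.vstack([X, x_new])                          # Step 8: update the database
        y = np.append(y, y_new)

    i = int(np.argmin(y))
    return X[i], y[i]
```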

Numerical benchmarks

Before applying the suggested method to time-dependent sheet metal forming, the performance of the GPFA should be tested. In this section, several well-known mathematical test functions are used.

Parameters settings

Before analyzing the performance of the GPFA framework, several control parameters (e.g., M and β_min) must be set based on the benchmark tests. Their settings and the underlying considerations are as follows.

  1) The size of the initial sample set, K. As a tradeoff between computational efficiency and modeling accuracy, we set K = 10d for the numerical test problems and K = 5d for the sheet metal forming optimization, where d denotes the number of design variables.

  2) The parameter p in the stopping criterion. The stopping criterion is the maximum number of iterations for the benchmark functions or, for the real engineering problems, that the average relative change rate of the objective function values over p consecutive iterations is less than 0.0001. In the former case p is set to 1000; otherwise p is set to 15.

  3) The number of best sample points M selected from the database to form the population. In this study, M is set to 60 for all benchmark tests; details have been discussed in "GPFA framework" section.

  4) The FA parameters n_g, β_0, β_min, δ, γ, α_0 and \( {\varepsilon}_i^t \). Values of the FA parameters used in this study are presented in Table 2. L is the average scale of the problem of interest, and n_r is a random number drawn uniformly from [0,1]. The parameters β_0, δ, γ and α_0 are set to their default values (as listed in Table 1), while β_min, the minimum value of the attractiveness β defined in Eq. (15), is set to 0.04. The parameter n_g, shown in Table 1, is the maximum value of t.

Table 2 Settings of the FA parameters in the GPFA

Numerical test for GPFA

In this section, the performance of the GPFA in numerical studies is compared against EGO, proposed by Jones et al. [29], and the FA, proposed by Yang [37]. The benchmark test functions used in Zhou et al. [21] are employed: a unimodal function (Sphere) and four multimodal functions (Ackley, Griewank, Rastrigin and Rosenbrock). All of them have 20 variables and a single global minimum of 0. The test functions used in our experiments are listed in Table 3, and more details can be found in the Appendix. The statistics of the best function values obtained by the GPFA with 1000 function evaluations over 20 independent simulation runs for F1-F5 are shown in Table 4.

Table 3 Test functions used in the numerical test
Table 4 Statistics of the best function values obtained by GPFA with 1000 exact function evaluations on 20 independent simulation runs for F1-F5

According to the results reported in Fig. 2, the GPFA obtains good solutions for F1-F5 within 1000 exact function evaluations. These solutions are very close to the corresponding global optima, particularly for F1, F3 and F5. Moreover, the GPFA outperforms the other methods on the F5 problem. F5 is very difficult to optimize because it is multimodal and its global optimum lies in a narrow valley. Zhou et al. [21] and Bo et al. [23] investigated the best function values obtained by their proposed methods with 1000 function evaluations over 20 independent runs for F5: Zhou et al. [21] found that the best function value of the SAGA-GLS algorithm is larger than 54.598, and Bo et al. [23] reported a best of 15.1491 and a worst of 75.8806 for their GPEME method. In contrast, as shown in Table 4, the best and worst values obtained by the GPFA for F5 are 3.8538 and 7.5409, respectively.

Fig. 2 The best convergence curves of the EGO, FA and GPFA framework for F1-F5 in 20 independent runs

The best convergence curves for the EGO, FA and GPFA are presented in Fig. 2. It should be noted that the results are averaged over 20 independent runs, each with 1000 function evaluations. According to Fig. 2, the GPFA obtains both the best function value and the fastest convergence rate for all test functions. It is worth noting that the GPFA converges significantly faster than the EGO and FA within 200 function evaluations for F1-F5. In the first 200 iterations, the GPFA reaches 0.064 for F1, 58.55 for F4 and 12.26 for F5, whereas the EGO reaches 234.69 for F1, 82.88 for F4 and 68.33 for F5, and the FA reaches 5.01 for F1, 146.81 for F4 and 42.26 for F5. These results indicate that the GPFA is a practical and efficient choice for time-dependent sheet metal forming optimization.

Time dependent assisted variable blank holder force optimization

In the deep drawing process, the quality of the formed part is affected by the amount of metal drawn into the die cavity. Excessive metal flow causes wrinkles in the part, while insufficient metal flow results in tears or splits. The blank holder plays a key role in regulating the metal flow by exerting a predefined blank holder force (BHF) profile. Usually, a constant BHF is applied during the entire punch stroke. However, during drawing the stress state in the deformed material changes significantly, and the processing conditions that reduce wrinkling and fracture change accordingly. To account for these changes, it is reasonable to adjust the BHF over the stroke to increase the formability of the drawn part. To further improve the formability, the VBHF technique for controlling deep drawing processes is analyzed.

In this section, the GPFA is applied to optimize the time-dependent VBHF in a sheet metal forming process. During the last 20 years, the optimization of sheet metal forming process parameters, especially of the VBHF, has attracted much attention [53, 54]. If the BHF is too low, wrinkling will probably occur at the start of the punch stroke; if it is too high, fracture will occur at the end of the stroke. These defects can be reduced or eliminated by applying a suitable VBHF at different punch strokes. Therefore, carefully selecting the sensitive stages at which the different failure modes occur and appropriately changing the VBHF at these stages are both very important for achieving a quality part. To optimize the VBHF, the entire time period of the forming process is usually decomposed into several stages given according to experience. However, these given time periods might not be the most suitable for improving the quality of the part, so the boundary of each time period is also considered in this work. Obviously, the number of design variables of the problem then increases significantly, which is why we apply the GPFA to this problem.

Problem descriptions

This case is derived from Benchmark 2 of NUMISHEET 2014 [55]. The first stage of Benchmark 2 is selected to improve the forming quality of a high-strength steel part. The geometry of the tools is shown in Fig. 3, and the FE model of the tools and blank is shown in Fig. 4. All tool parts are made of hardened steel and are modelled as rigid bodies in the software.

Fig. 3 Geometry of tools

Fig. 4 FE model of tools and blank

The material used to produce the blank is DP600 steel. The principal geometric and material properties of the blank are as follows: blank dimensions 300 × 250 mm, Young's modulus E = 207 GPa, Poisson's ratio ν = 0.3, initial thickness h_0 = 1.0 mm, averaged Lankford coefficient \( \overline{r}=1.02 \), material density ρ = 7850 kg/m³, friction coefficient μ = 0.10, hardening coefficient K = 1088, hardening exponent n = 0.1854. The hardening law is defined as

$$ \overline{\sigma}=1088{\left(0.0045+\overline{\varepsilon}\right)}^{0.1854} $$
(19)

where \( \overline{\sigma} \) and \( \overline{\varepsilon} \) are the stress and strain, respectively. Details about the tool and blank materials can be found in NUMISHEET 2014 Benchmark 2 [55]. A numerical model was built in the FE code DYNAFORM. The blank is initially meshed with 17,881 quadrilateral elements (18,204 nodes), corresponding to an overall element size of 1.0 mm, using Belytschko-Tsay shell elements. The stamping velocity is constant at 5 mm/s.
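As a small worked illustration of the hardening law in Eq. (19), a hedged sketch is given below; the function name is ours, and the stress unit is assumed to be MPa.

```python
def flow_stress(plastic_strain):
    """Hardening law of Eq. (19) for DP600: sigma = 1088*(0.0045 + eps)**0.1854 (MPa assumed)."""
    return 1088.0 * (0.0045 + plastic_strain) ** 0.1854

# e.g. flow_stress(0.0) is about 400 MPa (initial yield); flow_stress(0.1) is about 716 MPa
```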

Drawbeads are one of the most important means of controlling the material flow, and thus the part quality, in the sheet forming process. Insufficient restraining forces may lead to wrinkling, while excessive restraining forces prevent the sheet from drawing in and may cause necking. In this case, two equivalent drawbeads are applied to improve the forming quality of the rectangular box. The equivalent drawbead location lines are shown in Fig. 5; the two drawbeads are situated about 10 mm from the internal die contour. The drawbeads are replaced by equivalent restraining forces in this problem, and the restraining forces of the two drawbeads are the same due to the symmetry of the die.

Fig. 5 Draw-beads location

Design variables and objective function

Design variables

As mentioned before, the BHF (Blank Holder Force) plays an important role in the stamping process, since it is one of the most important parameters controlling the material flow into the die. The VBHF is applied to the sheet metal forming process to overcome the shortcomings of a constant BHF; the BHF is then variable during the forming process. Furthermore, besides the BHF in each time period, the location of each key time point is another kind of design variable, and determining the sensitive time points is a key issue.

A basic problem in the design of the VBHF is the choice of the sensitive time points that define the VBHF curve. HDMR (High Dimensional Model Representation) is used in this paper to obtain the sensitive times in the sheet metal forming process. The HDMR is used to analyze parameter sensitivity and to map the input/output relationships of high-dimensional complex systems [56, 57]. This study applies the HDMR to estimate the sensitivity of the wrinkle and crack defects of the rectangular box with respect to the candidate time increments. The purpose of the procedure is to obtain the sensitivity values of the different time increments and thus the sensitive time points during sheet metal forming. The sensitive time points are generated using the following steps:

  • Step 1. The aim of this step is to find the range of BHF that avoids both a wide region of severe wrinkling along the side wall and cracking at the corner of the rectangular box. When the constant BHF is 600 kN, the forming result is as shown in Fig. 6. If the BHF is larger than 600 kN, the element located at the corner of the part ruptures, which is not acceptable; therefore, the upper bound of the BHF is 600 kN. Even though this BHF is large, some wrinkled elements still exist at location A in Fig. 6 and the wrinkle defects cannot be eliminated. The typical extended BHF formability window shown in Fig. 7 illustrates this phenomenon: eliminating the wrinkled elements would require a constant BHF larger than 600 kN, which in turn would cause rupture at the corner of the part. Obviously, it is difficult to control the quality of the blank over the entire forming procedure with a constant BHF.

Fig. 6 The result of constant BHF (600 kN)

Fig. 7 Extended BHF formability windows

A smaller BHF should be used to ensure the safety of this part. The forming results for constant BHFs of 400 kN and 500 kN are shown in Fig. 8. If the BHF is smaller than 400 kN, the severely wrinkled region at location B in Fig. 8(a) extends greatly compared with the result in Fig. 8(b). To limit the influence of severely wrinkled elements on the forming quality, the lower bound is selected as 400 kN, so that the safe range of the BHF is [400, 600] kN. Within this range, neither a wide region of severe wrinkling nor a wide region of fracture occurs during the forming process. Furthermore, restricting the range of the constant BHF reduces the variable range of the VBHF and saves computational expense. Based on the simulation results, the reliable range is set to [400 kN, 600 kN]. The BHF and the restraining force bounds can be expressed as

$$ {F}_i^{BHF}\in \left[400,600\right]\ \mathrm{kN},\kern0.5em i=1,2,\dots, n $$
(20)
$$ {F}^{BD}\in \left[0,500\right] N $$
(21)
where i represents the i-th design variable of the BHF, F^BD is the restraining force, and n is the number of BHF design variables of the VBHF.

Fig. 8 The result of constant BHF

  • Step 2. In this case, 60 sample points of constant BHF are used to obtain the sensitive time points in the sheet forming process, and the simulation results are calculated with the DYNAFORM software. The sample points of constant BHF are generated by the LHS method and listed in Table 5; the range of constant BHF is that determined in Step 1. In every simulation, the entire procedure is divided into 14 time periods on the basis of the increment times in DYNAFORM, which are listed in Table 6, so 14 thickness increments corresponding to the 14 time periods exist for every simulation. Because the number of thickness increments for all 60 sample points is large, the top ten sample points are selected to illustrate the thickness increments in Table 7; their sequence numbers are one to ten.

  • Step 3. The total number of wrinkled and cracked elements is regarded as the response value, and the thickness increments during sheet metal forming are the design variables; these are the inputs of the HDMR. The flowchart of the sensitivity analysis is shown in Fig. 9. Ten sample points are selected to illustrate the thickness of the blank; the thickness increments of every time increment for these sample points are shown in Table 7, and the time increments used in Table 7 are the same as in Table 6. The sensitivity values of every time increment are shown in Fig. 10.

Table 5 The sample points of constant BHFs. (ID: the sequence number of 60 sample points)
Table 6 The time increments during sheet metal forming (ID: the number of the time increment)
Table 7 The thickness increments of every time increment of 10 sample points (ID: the sequence number of top-ten sample points)
Table 8 The design variables range
Table 9 Values of design variables before and after optimization
Fig. 9 Flowchart of the sensitivity analysis

Fig. 10 The sensitivity values of the 14 time increments

Based on the information provided by the sensitivity analysis, the sensitive time durations can be obtained. The seven time durations with the largest sensitivity values among the 14 are considered the sensitive time durations, and a sensitive time point is generated in each of them. Thus, 9 key time points are obtained, including the beginning and end time points. The drawbead restraining force and the BHFs corresponding to the 9 key time points are taken as the design variables, which are listed in Table 8.

Objective function

To evaluate the possibility of wrinkling and fracturing, the strains in the formed blank elements are analyzed and compared against the FLD shown in Fig. 11. As a result, the following function describing the possibility of wrinkling and fracturing is taken as the objective function.

$$ {f}_{\varepsilon}={\left(\sum_{e=1}^{s um}{f}_{\varepsilon}^e\right)}^{\frac{1}{m}} with\left\{\begin{array}{cc}\hfill {f}_{\varepsilon}^e={\left(\left|{\varepsilon}_1^e-{\theta}_s\left({\varepsilon}_2^e\right)\right|\right)}^m\hfill & \hfill for\left[{\varepsilon}_1^e>{\theta}_s\left({\varepsilon}_2^e\right)\right]\kern0.5em \hfill \\ {}\hfill {f}_{\varepsilon}^e={\left(\left|{\varepsilon}_1^e-{\theta}_w\left({\varepsilon}_2^e\right)\right|\right)}^m\hfill & \hfill for\left[{\varepsilon}_1^e<{\theta}_w\left({\varepsilon}_2^e\right)\right]\hfill \\ {}\hfill {f}_{\varepsilon}^e=0\hfill & \hfill otherwise\hfill \end{array}\right. $$
(22)
where sum represents the total number of formed blank elements; the parameter m is introduced to strengthen the nonlinearity of the objective function, m = 2, 4, 6 is commonly used [12], and in this study m = 4 is used; \( {\varepsilon}_1^e,{\varepsilon}_2^e \) are the major strain and minor strain of an element of the formed blank, respectively; and \( {\theta}_s\left({\varepsilon}_2^e\right),{\theta}_w\left({\varepsilon}_2^e\right) \) represent the upper and lower boundary curves of the safe domain in the FLD for that element, respectively. For an element with strains (ε_2, ε_1) selected from the formed blank, if ε_1 > θ_s(ε_2), the element is regarded as a fracturing element and lies in the crack domain shown in Fig. 11; if ε_1 < θ_w(ε_2), the element is called a wrinkling element and lies in the wrinkling domain.

Fig. 11 An illustration of the FLD and corresponding domains

The upper boundary curve \( {\theta}_s\left({\varepsilon}_2^e\right) \) is determined by the Forming Limit Curve (FLC) φ_s(ε_2) proposed by Hillman and Kubli [60] together with a safety tolerance, while the lower boundary curve \( {\theta}_w\left({\varepsilon}_2^e\right) \) is determined by the FLD provided by the DYNAFORM software. They can be expressed as:

$$ {\theta}_s\left({\varepsilon}_2\right)={\varphi}_s\left({\varepsilon}_2\right)- s $$
(23)
$$ {\theta}_w\left({\varepsilon}_2\right)=-{\varepsilon}_2 $$
(24)

where φ_s is the FLC that controls the cracking phenomenon and θ_w is the curve that controls the tendency to wrinkle; both depend on the material. The tolerance s is constant during the optimization process and defines the safety margin below the fracture limit curve, as is the default in the DYNAFORM software; its value is set to 0.1. The curve φ_s(ε_2) in this case is given as

$$ \begin{array}{l}{\varphi}_s\left({\varepsilon}_2\right)={FLD}_0-{\varepsilon}_2\kern12em {\varepsilon}_2<0\\ {}{\varphi}_s\left({\varepsilon}_2\right)={FLD}_0+{\varepsilon}_2\left(-0.008565{\varepsilon}_2+0.784854\right)\kern1.5em {\varepsilon}_2>0\end{array} $$
(25)

where FLD_0 is the lowest point on the forming limit curve; its value is 0.239 [55].
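For illustration, a minimal Python sketch of the objective function of Eq. (22) with the boundary curves of Eqs. (23)-(25), assuming the major/minor strains of each element have already been extracted from the DYNAFORM results; the function names are ours.

```python
import numpy as np

FLD0 = 0.239   # lowest point of the forming limit curve [55]
S_TOL = 0.1    # safety tolerance s

def phi_s(eps2):
    """Forming limit curve controlling cracking, Eq. (25)."""
    return np.where(eps2 < 0.0,
                    FLD0 - eps2,
                    FLD0 + eps2 * (-0.008565 * eps2 + 0.784854))

def theta_s(eps2):
    return phi_s(eps2) - S_TOL        # upper boundary of the safe domain, Eq. (23)

def theta_w(eps2):
    return -eps2                      # lower boundary of the safe domain, Eq. (24)

def forming_objective(eps1, eps2, m=4):
    """Objective f_eps of Eq. (22) from element major/minor strains."""
    eps1 = np.asarray(eps1, dtype=float)
    eps2 = np.asarray(eps2, dtype=float)
    upper, lower = theta_s(eps2), theta_w(eps2)
    fe = np.zeros_like(eps1)
    crack = eps1 > upper              # elements tending towards fracture
    wrinkle = eps1 < lower            # elements tending towards wrinkling
    fe[crack] = np.abs(eps1[crack] - upper[crack]) ** m
    fe[wrinkle] = np.abs(eps1[wrinkle] - lower[wrinkle]) ** m
    return fe.sum() ** (1.0 / m)
```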

Optimization result

Latin Hypercube Design (LHD) is used to generate the initial 60 sample points of the VBHF. The objective function values of the 60 sample points, sorted in descending order, are shown in Fig. 12. The entire optimization procedure requires 400 evaluations, performed in about 170 h. The convergence curve of the GPFA for the optimization of the time-dependent VBHF in the rectangular box forming process is presented in Fig. 13. The values of the design variables before and after optimization by the GPFA are listed in Table 9, and the corresponding curves are illustrated in Fig. 14. A comparison of the resulting FLDs and the blank forming quality of the initial and optimal designs is shown in Fig. 15. The best result among the 60 sample points is shown in Fig. 15(a). Compared with this initial design, the optimal BHF curve, whose result is shown in Fig. 15(b), improves the formability.

Fig. 12 The objective function values of 60 sample points of VBHF

Fig. 13 Convergence curve of the BHF optimization procedure (NFEs: Number of Evaluations)

Fig. 14 Optimal curve of VBHF

Compared with the result in Fig. 15(a), which is the best solution among the 60 sample points, the forming quality in Fig. 15(b) improves in two respects. Firstly, the side wall in Fig. 15(a) is severely wrinkled, which degrades the forming quality; as shown in Fig. 15(b), this defect is clearly reduced after optimization with the suggested method. Secondly, the bottom of the blank in Fig. 15(a) is insufficiently formed; insufficient forming may lead to defects such as surface distortion due to the unbalanced stress gradient. After optimization, the same region in Fig. 15(b) is sufficiently formed, and the green color indicates that it is safe. Therefore, the forming quality after optimization, as shown in Fig. 15(b), is much better than that in Fig. 15(a).

In NUMISHEET 2014 Benchmark 2, a flat blank is positioned in the die, which is then raised into contact with the blank holder under a constant force of BHF = 595 kN for DP600 steel. The stamping depth is 47.5 mm at a constant velocity of 5 mm/s. The forming result is shown in Fig. 16 and demonstrates that the forming quality is not satisfactory under the conditions given in NUMISHEET 2014 Benchmark 2: the corner of the rectangular box is a crack-tendency region (shown in yellow), some ruptured elements exist in the blank, and severely wrinkled regions in the side wall of the box adversely affect the stamping process. Although crack-tendency elements at the corner of the part still exist in Fig. 15(b), they are fewer than in Fig. 16, and the wrinkled elements in the side wall are eliminated, as shown in Fig. 15(b). The comparison of the objective function values for the three cases shown in Figs. 15 and 16 also illustrates the effectiveness of the optimization: the objective function value before optimization is 0.59610, larger than the value after optimization, 0.57822, while the value for the constant BHF of 595 kN is 0.58023. Although the differences between the three cases are small, they still have a considerable influence on the forming quality of the part. Therefore, the forming quality has been improved and the effectiveness of the GPFA has been verified. In addition, owing to the remaining crack-tendency elements, the acceptability of this part in an actual project should be judged according to the applicable safety factors.

Fig. 15 Illustrations of the FLD and the forming quality of a blank

Fig. 16 The forming result of a constant BHF (595 kN)

Conclusion

In our opinion, most sheet metal forming problems involve time dependence of the process parameters. However, it is difficult to consider time-dependent design variables in an optimization problem due to the curse of dimensionality. To overcome this bottleneck, a novel SAO method, called GPFA, is suggested. The main idea of the GPFA is to construct a surrogate model-aware search mechanism by using the FA: the FA generates new sample points and the GP prescreens them using the EI criterion. In this way, the accuracy and efficiency of the GPFA are well balanced.

To evaluate the performance of the GPFA, several medium-scale nonlinear problems are tested. Compared with recently published methods, the performance of the GPFA is clearly improved. Finally, the GPFA is applied to a BHF design with time-dependent variables. The HDMR is also applied in this problem to analyze the sensitivity of the forming quality to the different time increments. The results demonstrate that the GPFA is capable of solving such time-dependent problems.