1 Introduction

Deep drawing is one of the most prominent metal flow processes in sheet metal forming, in which a sheet metal blank is subjected to large plastic deformation into a forming die by the advancement of a punch. Advanced high-strength steels (AHSS) have positioned themselves as a suitable choice to meet the strong market demand for lighter, safer, and cheaper formed products especially in the automotive and aerospace industries [1], while the use of this type of steel is constrained by manufacturing difficulties such as formability due to the low allowable deformation compared to standard steels used for this application (mild steel) [2]. The limits of sheet metal formability are limited by the occurrence of process failures, such as rupture, springback, wrinkling, and thinning [3]. An excessive metallic flow creates wrinkles in the deep drawn part while insufficient metallic flow creates tears or splits. The zones of high compressive stress are prone to wrinkling while the zones of high tensile stress are prone to the problem of necking. The wrinkling in the flange area is possible due to minor compressive stresses in the deep drawn part. At the same time, unsupported regions or regions in contact with a single tool are also susceptible to wrinkling. Generally, the thinning is produced in the blank wall compressed between the punch and the die.

The effect of the different factors that lead to the appearance of these defects is as follows: process parameters such as the blank holder force and the friction coefficients between the tools and the blank, geometric parameters of the tools such as die and punch section radius, and the mechanical properties of the material [4, 5]. Therefore, the ability to accurately predict failures, such as fracture, wrinkling, and thinning, becomes a critical requirement in the design of sheet metal forming. The formability of sheet metal is usually limited by the onset of localized necking and ductile fracture. To evaluate the formability of a sheet material, forming limit diagram (FLD) is a technique used. FLD, which was introduced by Keeler and Backhofen [6] and Goodwin [7], describes the maximum in-plane strains that can be applied to sheet metal before necking initiates [8]. Experimental methods to obtain FLDs of alloys are time-consuming and costly [10]. Therefore, many theoretical and numerical methods have been proposed and developed to predict the formability of sheet metals [9]. In an industrial environment, a quick and simple fracture indicator is the maximum thinning (Shivpuri and Zhang 2009). Moreover, thinning distribution allows also the control of wrinkling (which could be evidenced by eventual thickening) if it is necessary.

The integration of surrogate models into process design has become essential to achieve more optimal solutions and to reduce time-to-market. The combination of numerical simulations and these meta-modeling tools makes it possible to set up decision support tools that are faster and efficient than traditional numerical simulations [10]. Based on the literature, several substitution models are widely used, such as radial basis function (RBF), response surface method (RSM), Kriging, and artificial neural network (ANN) [11]. In this way, kinds of research have been established using surrogate models to predict the output response variables. Park et al. [12] used a RSM model with two-factor interactions to predict the curvature produced as a function of sheet compression ratio, the radius of curvature in the transverse direction, and initial blank width in flexibly reconfigurable roll forming. Although the radial basis function (RBF) has demonstrated great success in prediction problems, very little report has been made on the deep drawing process. Sun et al. [13] used the RBF technique to predict fracture and wrinkling failures in sheet metal forming. The authors concluded that RBF provides high accuracy. Liu et al. [14] proposed a multi-objective optimization approach in which the RBF technique is used to predict the objective and constraint functions. To ensure the accuracy of this technique, the authors proposed the reverse shape parameter analysis method to achieve the best approximation models. Kiani and Yildiz [15] developed an optimization framework based on the RBF technique to optimize the structural design of a vehicle model. In this paper, all structural responses associated with the crash (FFI, OFI, and lateral) and vibration were predicted by this surrogate model. In this study, LHS is used to generate 46 training points, where FE patterns (crash and vibration) are analyzed to extract output responses.

Artificial neural networks (ANNs) constitute another extensively applied prediction technique; Miranda et al. [16] used it to predict punch displacement in press brake air bending. Based on the comparison of results obtained with experimental tests, the authors concluded that ANN is able to accurately predict the final geometries. Greve et al. [17] developed an ANN model to replace a computationally expensive model based on geometric imperfection. They proposed this model to predict the occurrence of a localized necking based on virtual test data of an extended MK model. After numerical and experimental validation tests, the neural network model presents a high accuracy. Bonatti and Mohr [8] proposed an ANN model to provide a computationally efficient method to predict the effect of the loading path on the FLD of DP780 steel. It is demonstrated that this model can predict the FLD for a given proportional pre-drive history. Najm and Panit [18] proposed an artificial neural network (ANN) to predict the formability and geometric accuracy of truncated frustums processed using SPIF. In this paper, the input parameters selected are tool materials, end corner radius, tool shape, and the surface roughness (Ra/Rz) of the tool. They showed that the structure of a one-output solution gave better results than a two-output network. Thus, it is demonstrated that Trainscg and Trainbfg as the training function and Radbas and Logsig as the transfer function provided the best prediction regarding accuracy and formability.

You et al. [19] developed a kriging model to predict the temperature response values at different sampling points of an injection mechanism in squeeze casting. The results of this study show that the error values between the predicted and simulated data at the beginning of the injection phase are significant because of the uncertainty in the temperature distribution of the injection mechanism in the injection process. Instead, the mean–variance and standard deviation derived from the calibrated model decrease significantly compared to those derived from the original kriging model, indicating that the calibration provides a more precise prediction. Dang et al. [20] combined the reduction order model and the kriging method to develop the spatial field meta-model to replace the high-fidelity model. The blank holder force and the die radius are regarded as input variables. To establish the relationship between the blank holder force, the friction coefficient and the yield strength of the blank material, and the forming quality indices, Palmieri et al. [21] used the numerical results to develop the kriging model.

To predict the results of the FRRF process, Park et al. [22] conducted a comparative study between the ANN and the RSM techniques. They concluded that in the case of the saddle-shaped surface, the ANN model showed mainly high predictability, but in the case of the convex-shaped surface, the regression model performed better. Nouioua et al. [23] applied response surface methodology (RSM) and artificial neural network (ANN) techniques to search for a precise prediction of surface roughness and cutting force in dry, wet, and MQL machining. Depth of cut, tool nose, feed rate, lubrication, and cutting speed were considered as inputs to the model. In this case, the ANN model provides a more accurate prediction compared to that obtained by the RSM method. Huang et al. [24] compared the quadratic response surface (QRS) and Kriging techniques in terms of prediction accuracy. The surrogate models are validated and compared by cross-validation, root mean square error (RMSE), and maximum absolute error (MAE). The results of this study have indicated that the Kriging technique is more appropriate and accurate for modeling the sheet metal process. Rajbongshi and Sarma [25] used the artificial neural network and the response surface methodology to predict surface roughness and flank wear in a turning operation. The input parameters taken in this paper are feed rate, cutting speed, and depth. As measured by the regression coefficient (R2), the ANN model is better than the RSM model.

From the above literature, it is observed that the ability to accurately predict failures, such as fracture, wrinkling, and thinning, becomes a critical requirement in sheet metal forming design, especially when using AHSS materials. On the other hand, it can also be concluded that surrogate models are a very useful tool for predicting and modeling critical failures in engineering problems. The main advantages of using these models before starting new experiments are to reduce the time needed to prepare experiments, minimize errors, and increase efficiency. The points detailed above motivated the authors to examine the ability of surrogate models to predict the formability of stamped parts from DP600 high-strength steel thin blank. For this purpose, the four most common surrogate models, namely RSM, RBF, kriging, and ANN, are developed and compared with each other to predict thinning and rupture for the square cup deep drawing problem. Figure 1 shows the flowchart of the proposed strategy for developing and comparing those surrogate models. In the present work, the six most critical parameters are considered namely, blank holder force \(\left(BHF\right)\), punch section radius \(\left({R}_{Sp}\right)\), die section radius \(\left({R}_{Sd}\right)\), die fillet radius \(\left({R}_{Fd}\right)\), blank thickness \(\left({E}_{p}\right)\), and the friction coefficient between blank and die \(\left({\mu }_{d}\right)\).

Fig. 1
figure 1

Flowchart of the proposed strategy for process failure prediction

2 Predicting techniques

2.1 Response surface methodology (RSM) technique

The response surface methodology (RSM) technique, which is the simplest surrogate model, consists of finding a polynomial approximation of the process output response in terms of the input variables. Thus, the RSM can be defined as:

$$z\left(x\right)=\widehat{z}\left(x\right)+\varepsilon$$
(1)

where x is the vector of design variables, \(\widehat{z}\left(x\right)\) is the polynomial approximation, and ε is the random error. The approximation RSM predictor of \(\widehat{z}\left(x\right)\) can be expressed as follows:

$$\widehat{z} \left(x\right)= {\alpha }_{0}+\sum_{i=1}^{n}{\alpha }_{i}{x}_{i}+\sum_{i=1}^{n}\sum_{j>i}^{n}{\alpha }_{ij}{x}_{i}{x}_{j}+\sum_{i=1}^{n}{\alpha }_{ii}{x}_{i}^{2}$$
(2)

where \({\alpha }_{0}\), \({\alpha }_{i}\), \({\alpha }_{ij}\), and \({\alpha }_{ii}\) are regression coefficients. The matrix form of Eq. (2) can be written as follows:

$${\widehat{z}}_{s}=F\alpha$$
(3)

The regression coefficients are estimated using the least square technique as shown in the equation below:

$$\widehat{\alpha }={\left({F}^{T}F\right)}^{-1}{F}^{T}{z}_{s}$$
(4)

Once the coefficients \(\alpha\) are estimated, the approximate output \(\widehat{z} \left(x\right)\) at any untested input can be predicted efficiently by Eq. (2).

2.2 Kriging technique

Kriging is a surrogate model which consists in constructing the approximation of the output from deterministic data in the geostatistics field and in characterizing the output as the realization of a stochastic (Gaussian) process [26]. Its formulation can be described as a combination of a regression model and a stochastic process:

$$\widehat{z} \left(x\right)=\varphi {\left(x\right)}^{T}\gamma +y\left(x\right)$$
(5)

where \(\varphi {\left(x\right)}^{T}\gamma\) is a “global” linear regression model and \(y\left(x\right)\) is a Gaussian Process which allows to bring “local” corrections to the “global” model. This meta-model is designed under the assumption that the Gaussian process has zero mean as well as a spatial covariance for design sites x and \(v\):

$$E\left[y\left(x\right)\right]=0\qquad Var\left[y\left(x\right)\right]={\sigma }^{2} \qquad cov\left(y\left(x\right),y(v)\right)={\sigma }^{2}k\left(x,v\right)$$
(6)

The term \(k\left(x,v\right)\) represents a correlation function between any sampled data points \(x\) and \(v\) specified by the designer. The most commonly used correlation functions are summarized in Table 1. The construction of the kriging model boils down to the estimation of the model parameters \(\gamma\), the process variance \({\sigma }^{2}k\), and the parameters of the correlation function \(\varnothing\). Once all the training information is collected, the actual responses can be compared to the model predictions for the n experimental points D = {x1, …, xn} used.

Table 1 Correlation functions

2.3 Radial basis function (RBF) technique

The radial basis function is another technique also used to approximate a complex simulation model or black-box function. The idea behind radial basis function (RBF) prediction is to use a linear combination of symmetric basis functions centered on each sample point [27, 28]. The output can be defined by this technique as follows:

$$f\left(x\right)=\sum_{i=1}^{n}{\varphi }_{i}\delta \left(\Vert x-{x}_{i}\Vert \right)+\sum_{s=1}^{S}{d}_{s}{c}_{s}\left(x\right)$$
(7)

where n is the number of sample points, \(\Vert x-{x}_{i}\Vert\) is the Euclidean norm of the design variable vectors x and the \({i}_{th}\) sample point \({x}_{s}\), \(\delta\) is a basis function, \({\varphi }_{i}\) is the unknown weighting factor positioned at the \({i}_{th}\) sample point of the basis function, \({c}_{s}\left(x\right)\) are polynomial terms, S is the number of polynomial terms, usually S < n, and \({d}_{s}\) (S = 1, 2, …, S) is the coefficient of \({c}_{s}\left(x\right)\).

As shown in the equation above, therefore, this type of meta-model is a linear combination of S radial basis functions and K polynomial terms with weighted coefficients. The most commonly used basis functions are summarized in Table 2. From Fig. 2, it is clear that the properties of these basic functions are quite distinct from each other. Therefore, the choice of a basis function strongly influences the expected accuracy of fit.

Table 2 Commonly used radial basis function [13]
Fig. 2
figure 2

Plot of commonly used radial basis function [13]

2.4 Artificial neural network (ANN) model

According to the literature, there exist some neural network models for machine learning, such as convolutional neural network (CNN), recurrent neural network (RNN), and artificial neural network (ANN) [22]. For a defined problem, it is necessary to identify an adequate model in terms of performance and time reduction. Artificial neural networks have been used in this paper due to their functional nature and learning capabilities; they represent a much simpler and more efficient computational model. ANNs are meta-models inspired by the behavior of the human brain, mainly in the parallel, distributed, and multiprocessor aspects of the underlying structure [29]. These are usually several nodes, which implement a mathematical function (activation function) as shown in Fig. 3a, and these are interconnected according to a well-defined structure (Fig. 3b). This structure consists of input layers, hidden layers, and output layers, and for the hidden layer, it is possible to use one or more layers. Each link connects one node to another and is assigned a weight.

Fig. 3
figure 3

a Functional of the neural network node, b structure of the ANN model

The activation function influences the output value of the node, and this output value is finally used for the regression analysis. It is used in the hidden layer and the output layer, while no activation function is used for the input layer because no computation is involved in the input layer. The step function which has a linear form and the sigmoid function which has a nonlinear form is the most common activation functions. To identify the appropriate weight values of the ANN model, the backpropagation technique was applied. Several algorithms are employed in the backpropagation technique, namely the Levenberg–Marquardt method, scaled conjugate gradient, and Bayesian regularization.

The graduated conjugate gradient technique is the most basic algorithm in the three and only a small amount of memory is required for this algorithm. On the other hand, the Levenberg–Marquardt algorithm requires a large amount of memory, but the advantage is that the computation time is shorter. The Bayesian regularization algorithm requires more memory and time, but it can provide good generalization for difficult and noisy datasets.

3 Design of experiment and numerical simulation

3.1 Data sets used for the prediction techniques development

To accurately predict deep drawing process failures, the development of meta-models requires training and test data, and these samples can be made by the design of experiments \(\left(DOE\right)\), such as Latin hypercube, Box-Behnken design (BBD), Taguchi methods, and central composite design \(\left(CCD\right)\) [30]. In this study, the construction of training and test data is carried out by the LHD method. This choice is justified by the fact that LHD allows to take more levels and more combinations than other techniques and it also allows total freedom in the selection of the number of plans to execute. The idea behind this tool is to evenly partition the design space for each factor. Then, these levels are randomly combined to define m points composing the design matrix (each level of a factor is studied only once). In this study, six parameters are considered as input variables, namely blank holder force \(\left(BHF\right)\), punch section radius \(\left({R}_{Sp}\right)\), die section radius \(\left({R}_{Sd}\right)\), die fillet radius \(\left({R}_{Fd}\right)\), blank thickness \(\left({E}_{p}\right)\), and the friction coefficient between blank and die \(\left({\mu }_{d}\right)\). According to the literature and another study that will be published soon, these parameters are the most influential parameters for rupture and thinning failures. The ranges and levels of these variables are selected based on the range of values used in the industry and are listed in Table 3. The minimum number of experiments (m) can be determined as a function of the number of factors according to the equation: \(m=\left(n+1\right)*\frac{\left(n+2\right)}{2}=\left(6+1\right)*\frac{\left(6+2\right)}{2}=28\). For more accuracy, 40 samples are taken for training and 12 samples for testing and validating the prediction techniques. The data sets used for training are given in Table 14 in the Appendix.

Table 3 The utilized design variable levels
Table 4 The mechanical properties of the sheet used

3.2 Problem description and finite element model

Finite element simulation is an efficient technique to facilitate the analysis of the deep drawing process and to study the interaction between process parameters and output responses. It gives very useful information for the prediction and optimization of process failures. For this reason, finite element simulation is used to generate the training and test data. The square cup deep drawing problem, proposed by the Numisheet’93 conference as an international benchmark for FE codes and reported by Danckert [31], has been considered as a case study in this article. The main reason for this choice is that this type of deep drawing process is very complex due to the many parameters that determine this process and therefore are associated with many failures such as fracture, wrinkling, and thinning. The square cup benchmark was realized according to the schematic drawing presented in Fig. 4. A DP600 high-strength steel blank of 0.78-mm thickness is initially clamped between the die and the blank holder under a blank holder force of 19.6 kN. The friction coefficient between the tools and the blank is fixed at 0.144 during the stamping process.

Fig. 4
figure 4

Tool geometry of square cup deep drawing

The simulation of deep drawing was a nonlinear problem of nonlinear behavior of solid materials and contacts, which was applied to an elastic–plastic material employed with conditions of contact behavior. In this study, the commercial finite element analysis software Abaqus was employed to develop a 3D FE model for the deep drawing benchmark. Due to the symmetry of the tools and sheet geometry as well as the symmetry in the boundary conditions, only the fourth section of the model was modeled to minimize the computational time required. The dynamic explicit (DE) technique with a virtual punch speed of 10 m/s was used to simulate the forming process. The blank is discretized by the S4R shell element with eleven integration points and the stamping tools are discretized by the R3D4 discrete rigid element. The surface-to-surface contact technique was adopted, in which Coulomb’s law of friction was supposed. Figure 5 shows a view of the finite element analysis model of a square cup. An elastoplastic constitutive model was used to simulate the response of the sheet with isotropic strain hardening according to Swift’s rule to determine the true plastic stress plastic strain curve. The Hill anisotropy constants were included in the FE code to consider the sheet anisotropy. The mechanical data for DP 600 are shown in Table 4.

Fig. 5
figure 5

a Finite element analysis model of a square cup; b the deformed shape of the blank

To validate the FE model, a comparison between experimental and simulation results is performed. This evaluation consists of examining the deformed shape of the square cup after the punch travel. In this paper, the final geometry of the square cup, described by the average of the drawn length DX, DY, and DD of the flange from the initial position, is selected for this evaluation. They are determined using the procedure given in NUMISHEET’93 (1993) and are measured according to Fig. 6 [31]. Table 5 shows the draw-in values for a punch stroke of 15 and 40 mm. It can be noted that the draw-in values from the simulation are in good agreement with the experimental data.

Fig. 6
figure 6

Definition of draw-in direction

Table 5 Draw-in predictions of FE simulation and experimental data

3.3 Characterization of deep drawing failures

In this study, rupture and thinning are chosen as the output response variables. This choice is justified by the fact that these failures limit the use of advanced high-strength steel mainly in deep drawing processes due to the low deformation allowed compared to mild steel. The rupture was defined by using the forming limit diagram (FLD). As shown in Fig. 7, this diagram, which is introduced by Keeler and Backofen in 1961 [6], represents major deformation versus minor deformation and it constitutes two curves fitted to the deformation points called “failure curves.” When the major deformations of some elements are above FLC \(\varnothing\)(ε2) or the marginal safety line, yellow line, fracture may be present in this region of the part, and a greater distance denotes a greater chance of fracture [9]. The experimental FLD of DP600 is derived from the article by Regueras and López [2] and is displayed in Fig. 8. Thinning is another critical failure in sheet metal forming. It is defined as the thickness variation of the blank during the deep drawing process. There are limit values of thinning commonly set by the industrial requirement, for example, in the automotive industry, 20% thinning in the thickness of the sheet is the maximum tolerable value. Therefore, it is very essential to predict which zones are susceptible to excessive thinning as it is more probable that failures will be produced there. In this study, rupture and thinning were characterized by Eqs. (8) and (9) respectively.

$$rupture= \sum_{i=1}^{n}\frac{{\varepsilon }_{Major}}{{\varepsilon }_{Major}^{FLD}})$$
(8)
$$Thinning = \sqrt{\sum_{i=1}^{n}{\left({t}_{0}-{t}_{i}\right)}^{2}}$$
(9)

where n denotes the number of elements to be measured.; t0 denotes the initial sheet thickness; and ti denotes the final sheet thickness.

Fig. 7
figure 7

FLD to evaluate forming quality of an element

Fig. 8
figure 8

Forming limits diagram for the studied material

4 Results and comparison

4.1 Surrogate model validation

To validate and compare the surrogate models, a goodness-of-fit test was performed. There are several measures and methods for evaluating the accuracy of a surrogate model and comparing it to others. Since the goodness-of-fit obtained from the training data cannot be sufficient to assess the accuracy of the newly predicted points, in this paper, the meta-models are compared for the training and test data in order to get a more complete picture of the surrogate model performance. For training data, the meta-models are compared in terms of their coefficient of determination R-squared (R2), mean absolute percentage error (MAPE), and normalized root mean square error (NRMSE). For test data, the meta-models are compared in terms of their maximum relative error \(\left({e}_{max}\right)\), the mean relative error \(\left({e}_{avg}\right)\), and NRMSE. The R2 value is a criterion for evaluating how well the variability of the output is captured by the regression variables in a model. This value is between 0 and 1, and if R2 is close to 1, this normally indicates that the model has a good fit for the sampled data. The MAPE (mean absolute percentage error) is a statistical index that evaluates the size of the error in terms of percentage. The root mean square error (RMSE) is an evaluation criterion that measures the relative global error of the substitution model. The normalized root mean square error (NRMSE) is an evaluation criterion that assesses the relative overall error of the substitution model. A smaller value of MAPE or NRMSE indicates a more accurate surrogate model. The terms R2, MAPE, NRMSE, \({e}_{max}\), and \({e}_{avg}\) are calculated using formulas 10, 11, 12, 13, and 14 respectively.

$${\mathrm{R}}^{2}=1-\frac{{\sum }_{\mathrm{i}=1}^{\mathrm{m}}{\left({\mathrm{z}}_{\mathrm{i}}-{\widehat{\mathrm{z}}}_{\mathrm{i}}\right)}^{2}}{{\sum }_{\mathrm{i}=1}^{\mathrm{m}}{\left({\mathrm{z}}_{\mathrm{i}}-\overline{\mathrm{z} }\right)}^{2}}$$
(10)
$$\mathrm{MAPE}=\left(\frac{1}{\mathrm{M}}\sum_{\mathrm{i}=1}^{\mathrm{M}}\frac{\left|{\mathrm{z}}_{\mathrm{i}}-{\widehat{\mathrm{z}}}_{\mathrm{i}}\right|}{\left|{\widehat{\mathrm{z}}}_{\mathrm{i}}\right|} \right)*100$$
(11)
$$\mathrm{NRMSE}=\sqrt{\frac{1}{\mathrm{M}}{\sum }_{\mathrm{i}=1}^{\mathrm{M}}{\left({\mathrm{z}}_{\mathrm{i}}-{\widehat{\mathrm{z}}}_{\mathrm{i}}\right)}^{2}}*\frac{1}{{\mathrm{z}}_{\mathrm{max}}-{\mathrm{z}}_{\mathrm{min}}}$$
(12)
$${e}_{max}=\underset{i\in N}{\mathrm{max}}\left(\frac{\left|{z}_{i}-{\widehat{z}}_{i}\right|}{\left|{z}_{i}\right|}\right)$$
(13)
$${\mathrm{e}}_{\mathrm{avg}}=\frac{1}{\mathrm{N}}\sum_{\mathrm{i}=1}^{\mathrm{N}}\frac{\left|{\mathrm{z}}_{\mathrm{i}}-{\widehat{\mathrm{z}}}_{\mathrm{i}}\right|}{\left|{\mathrm{z}}_{\mathrm{i}}\right|}$$
(14)

where M is the number of training data, N is the number of test data, \({\widehat{z}}_{\mathrm{i}}\) is the corresponding predicted value for the real value \({z}_{\mathrm{i}}\), and \(\overline{z }\) is the mean of the real values.

4.2 Meta-model configuration

The prediction equations, to obtain the relation between the different responses output (rupture and thinning) and the input parameters (blank holder force, punch section radius, die fillet radius, die fillet radius, blank thickness, friction coefficient between die blank) were found using MATLAB codes. Kriging models are constructed and tested using a MATLAB toolbox called design and analysis of computer experiments (DACE). The accuracy of prediction techniques depends on the choice of the appropriate configuration, which in turn depends on the nature of the data and the expected accuracy of fit. Based on the results of several tests and comparisons, the appropriate configurations of RSM, kriging, and RBF are listed in Table 6. The appropriate configuration of the kriging model is the first-order polynomial function hybridized with a Gaussian correlation function for rupture case and a second-order polynomial function hybridized with a spherical correlation function for thinning case. For the RSM model, the reduced cubic polynomial function is most appropriate for both outputs. The appropriate configuration of the RBF model is the thin-plate spline basis function for rupture and the Gaussian basis function for thinning.

Table 6 Configuration of RSM, RBF, and kriging techniques

In this work, a multilayer perceptron, where every layer is formed by some neurons, has been implemented. The Bayesian regularization algorithm was utilized for obtaining high-quality prediction in this research. Employing this algorithm, many different combinations of hidden elements and layers were tested in multiple, iterated runs to identify the NN structure that best fits the available data. Since the prediction model for both outputs has a nonlinear relation, the sigmoid function was used as the activation function in this study. A MATLAB code was implemented for the ANN building and the data sets were divided into training (80%) and validation (20%) data. The condition to finish the learning, i.e., the adjustment of the weights between the elements of the ANN, is based on the value of the performance function (MSE). Table 7 summarizes the design features of the ANN architecture for both output responses. Figures 9 and 10 illustrate the graphical diagram of the ANN system for rupture and thinning respectively.

Table 7 Characteristics of the neural network architecture
Fig. 9
figure 9

Graphical diagram of the ANN system for rupture

Fig. 10
figure 10

Graphical diagram of the ANN system for thinning

4.3 Comparing prediction techniques

The RSM, RBF, kriging, and ANN models are developed with the same sampled data set, and the validation strategy proposed above is performed. The FEM and surrogate model results for the training data are presented in Table 8. As mentioned above, in the first step, the ANN and RSM models are evaluated based on training data by using performance criteria. Tables 9 and 10 provide the evaluation results of the accuracy of the surrogate models based on the selected statistical measures. Figure 11a and b display the graphs of the predicted values for the formation samples versus the observed curvature using the ANN and RSM models.

Table 8 Estimation data of each prediction model for the training data
Table 9 Goodness-of-fit tests results of the thinning response for the training sample
Table 10 Goodness-of-fit tests results of the rupture response for the training sample
Fig. 11
figure 11

Graphs of the predicted values vs the observed values for training data

For thinning case, the R-squared value of the ANN model is equal to 1 and its NRMSE and MAPE values were close to zero, whereas the R-squared value of the RSM model is equal to 0.87 and its NRMSE and MAPE values were a little far from zero. This result reflects that ANN exactly reproduces the training data but the RSM model fails to accurately reproduce the training data.

For the rupture case, the R-squared value of the ANN model is equal to 1 and its NRMSE and MAPE values were too close to zero, whereas the R-squared values of the RSM model are equal to 0.96 and its NRMSE and MAPE values were close to zero. This result shows that the ANN model accurately reproduces the training data, and the RSM model can reproduce the data with tolerable accuracy. According to the results for both cases, the ANN model guarantees a higher accuracy than the RSM model in the training stage.

As mentioned above, in the second step, the four surrogate models are evaluated based on test data by using performance criteria. The FEM and surrogate model results for the test data are given in Table 11. Tables 12 and 13 provide the evaluation results of the accuracy of the surrogate models based on the selected statistical measures. Figure 12a and b show a comparison of the errors of both response variables for the test data.

Table 11 Estimation data for each prediction model of the random sample condition
Table 12 Goodness-of-fit tests results of the thinning response for test data
Table 13 Goodness-of-fit tests results of the rupture response for test data
Fig. 12
figure 12

Comparison between surrogate models for test data

For thinning case, the maximum absolute percentage error is about 37.31%, 38.64%, 28.17%, and 18.84% in the case of the RSM, RBF, kriging, and ANN model respectively. According to this criterion, kriging and ANN models give better results than RSM and RBF models. The mean relative error \(\left({e}_{avg}\right)\) is about 19.83%, 19.92%, 17.39%, and 9.78% in the case of the RSM, RBF, kriging, and ANN model respectively. The root mean square error (RMSE) is about 0.36, 0.39, 0.30, and 0.19 in the case of the RSM, RBF, kriging, and ANN model respectively. According to two of these criteria, it can be observed that ANN is better than all other techniques. Therefore, it can be concluded that the ANN model seems to indicate extremely high predictability compared to other models.

For the rupture case, the maximum absolute percentage error is about 18.13%, 16.81%, 17.73%, and 16.99% in the case of the RSM, RBF, kriging, and ANN model respectively. The mean relative error \(\left({e}_{avg}\right)\) is about 6.14%, 6.50%, 7.15%, and 4.31% in the case of the RSM, RBF, kriging, and ANN model respectively. The root mean square error (RMSE) is about 0.17, 0.19, 0.19, and 0.15 in the case of the RSM, RBF, kriging, and ANN model respectively. From these summaries, all four models have good prediction and have approximately the same results with a small advantage for the ANN model.

5 Conclusions

This study presents a comparative analysis of prediction techniques using RSM, RBF, kriging, and ANN for failure prediction in square deep drawing of DP600 high-strength steel. Blank holder force, punch section radius, die section radius, die fillet radius, blank thickness, and friction coefficient between die and blank were chosen as input variables, and rupture and thinning as output variables. The LHD for six parameters including 40 training samples and 12 test samples is used to produce the database needed for the development of surrogate models. In this paper, the meta-models are compared for the training and test data in order to get a more complete picture of the surrogate model performance. Accordingly, the main conclusions from this investigation are:

  • The appropriate configuration of the kriging model is the first-order polynomial function hybridized with a Gaussian correlation function for rupture case and a second-order polynomial function hybridized with a spherical correlation function for thinning case. For the RSM model, the reduced cubic polynomial function is most appropriate for both outputs.

  • The appropriate configuration of the RBF model is the thin-plate spline basis function for rupture and the Gaussian basis function for thinning. For the ANN model, a feedforward neural network associated with the Bayesian regularization algorithm and a 3-layer structure, 6–8-6 for the rupture case and 6–8-9 for the thinning case, is the most appropriate configuration.

  • For both output variables, the ANN model can accurately reproduce the training data, whereas the RSM model is not able to accurately reproduce the training data in the rupture case and it can to reproduce the data with tolerable accuracy in the rupture case. It can be concluded that the ANN model guarantees a higher accuracy than the RSM model in the training stage.

  • Concerning the results of the test data, the ANN model showed high predictability compared to the other models for the thinning output, while for the rupture output, all four models have good prediction and have approximately the same results with a small advantage for the ANN model.

From these findings, the four techniques have demonstrated an ability to predict with acceptable accuracy the output variables, but the ANN model has demonstrated high performance and accuracy than the other models. For this reason, the ANN models developed can be used as a guideline for process failure optimization and product design.