1 Introduction

Forming significantly impacts the product properties in manufacturing processes. Thus, finite element (FE) simulations with predictive capabilities are crucial in order to exploit the full lightweight potential and to optimize the forming process. Beyond properties such as hardening and residual stresses, ductile damage influences the product performance, i.e., damage decreases stiffness, lifetime, and crash safety [1]. Therefore, it is important to quantitatively predict ductile damage to improve the design of forming processes. This requires suitable damage models and their precise calibration.

To this end, we augment the open-source parameter identification tool ADAPT (A Diversely Applicable Parameter identification Tool) [2] with an extended version of the machine learning-based start value prediction developed in [3]. For demonstration purposes, the enhanced computational framework is applied to an explicit simulation model of a boundary value problem (BVP) of a round tensile test specimen in bulk metal forming. Integral force data, full-field data such as displacement fields, and micro-scale void data are taken into account to calibrate two different damage models. Future applications could extend to implicit models and include additional shear- or compression-dominated calibration tests.

In bulk metal forming, the material undergoes large deformations. At the same time, ductile damage occurs, which is defined as the nucleation, growth, and coalescence of voids. Damage models that depict these mechanisms can be subdivided into phenomenological and micromechanical damage models. Phenomenological models describe the evolution of damage in terms of macroscopic parameters. Inspired by Kachanov’s work [4], Lemaitre introduced a model based on the strain energy concept [5]. Damage evolves with equivalent plastic strain and is linked to mechanical properties, as in the concept of effective stresses.

On the other hand, micromechanical models involve explicit representations of voids and inclusions in the material. They describe the void-related mechanisms on the microscale. For instance, a criterion for void nucleation based on the distribution of inclusions and the stress state has been proposed by McClintock [6]. Rice and Tracey developed a model for the growth of spherical voids in an infinite matrix [7]. Gurson later presented a so-called porous plasticity model, which describes the yield surface as a function of hydrostatic stress and void volume fraction [8]. This framework was extended by Tvergaard and Needleman by adding terms to account for void shape effects and void coalescence [9]. An extensive overview of ductile damage and fracture modeling can be found in [10].

A well-known limitation of damage models is their sensitivity to mesh size, particularly when strain and damage localize. To address this issue, regularization methods have been proposed for damage models [11]. For instance, Wcislo et al. [12] and, more recently, Brepols et al. [13] and Sprave and Menzel [14] have defined non-local damage variables, incorporating their gradients into the model formulation. Non-local models have also been efficiently implemented in commercial software [15, 16], enabling the simulation of complex forming processes. Due to the complexity of these models, suitable experimental data are still indispensable for model calibration to provide meaningful predictions.

In this contribution, however, we apply two locally formulated damage models: a Gurson–Tvergaard–Needleman (GTN) model and a Lemaitre model. To circumvent their mesh dependency, we follow the standard practice of maintaining a consistent element length across the simulation models [17].

In parameter identification, specific model parameters, such as Young’s modulus or yield stress, can be identified directly during homogeneous deformation. However, the identification of parameters associated with plasticity, damage, or other strain rate- and temperature-dependent mechanisms is not straightforward. A direct identification is impossible since these parameters may not have an evident physical meaning. Therefore, parameter identification approaches are taken which are based on the solution of inverse problems. An extensive description of the fundamentals of such optimization-based parameter identification schemes for constitutive models can be found in [18, 19] and [20]. In these approaches, the model parameters are updated iteratively until the simulation results fit the experimental data within a defined tolerance range. Besides load–displacement curves, full-field data, such as displacement fields [21, 22] and temperature fields [23], have already been employed to depict the local material behavior under inhomogeneous deformation. Classic optimization methods have also been applied to viscoelastic material models coupled to gradient-enhanced damage [24].

The calibration of damage models requires experimental data containing information about the damage evolution. Sprave and Menzel [14] and Shamshiri et al. [25] exploited the decrease in stiffness observed in tensile tests with elastic unloading and (re)-loading as a measure of damage. Seyyedi et al. [26] considered the onset of necking in tensile tests as a damage initiation of a modified Mohr–Coulomb criterion, while Roux and Bouchard [27] and Springmann and Kuna [28] incorporated strain fields to address the strain localization caused by damage-induced softening. These methods rely on macroscopic data rather than void measurements on the microscale. Consequently, the damage variable does not solely represent the void fraction. Instead, it represents a combination of several mechanisms contributing to the change in stiffness. However, recent works on sheet bending [29] and cold forging [30] demonstrated that the void fraction could directly be linked to product performance. Therefore, predicting the void fraction directly rather than the damage-induced softening on the macroscale is more critical. The work of Suárez et al. [31] provides an exception, as a Gurson-type model was calibrated using computer tomography (CT) measurements. This approach leads to a qualitatively good prediction of the void volume evolution in pearlitic steel. However, the quantitative error remains significant.

This indicates that even a micromechanical damage model will only provide accurate damage predictions of void fractions if the calibration is based on experimental data depicting the microscale void mechanism.

A review of different optimization-based parameter identification schemes considering different quantities shows that all approaches have one aspect in common: the determination of suitable start values for the optimization scheme is challenging. This applies to many parameter identification problems and not only to damage modeling. The selection of start values requires a profound understanding of the interaction between the model parameters, the evolution equations, and, ultimately, the final response of the constitutive model. This is especially relevant for complex models with a large number of parameters, which increase the nonlinearity of the objective function. Hence, optimization algorithms may run into local minima that do not always provide a sufficient result. To overcome this challenge, the machine learning approach of Schulte et al. [3] is adopted to pre-determine suitable start values for a subsequent classic optimization-based parameter identification. This strategy has the further advantage that a neural network only has to be trained once for a given material model and can subsequently be used for parameter prediction in various material applications. Unlike Schulte et al. [3], we directly consider the underlying BVP instead of relying on approximated homogeneous loading states, aiming to enhance the prediction of the start values.

Over the years, various approaches have been employed for start value prediction. Heuristic methods, for instance, rely on empirical rules or simplified analytical solutions to provide initial estimates based on fundamental physical principles [32]. On the other hand, sensitivity analysis examines how parameter variations affect the model output, thereby providing guesses on appropriate start values [33]. One of the first contributions based on machine learning-assisted parameter identification was made by Huber and Tsakmakis [34]. Using a neural network, they established an explicit relation between the loading state and the material parameters. The neural network could identify meaningful sets of material parameters, even when complex loading histories were considered. Later, Aguir et al. [35] developed a hybrid method of multi-objective optimization to identify the anisotropy parameters of the orthotropic criterion of Hill’s 1948 model. They substituted the finite element simulations with a neural network, thereby reducing the computational time. Obrzud et al. [36] employed a neural network to estimate model parameters for the correction step of a Gauss–Newton algorithm. More recently, Guo et al. [37] introduced a model that combines a convolutional neural network for denoising and strain feature extraction with a long short-term memory neural network to identify path-dependent constitutive model parameters. Wei et al. [38] employed physics-informed neural networks (PINNs) to enhance parameter identification based on full-field measurements.

The current paper addresses the aforementioned shortcomings, i.e., the necessity for manually defining start values and the lack of considering suitable experimental data, by proposing a combined optimization approach (Fig. 1). This approach integrates machine learning-assisted start value prediction with multi-objective parameter identification. Machine learning-based start value prediction based on BVP training data is demonstrated for the first time. The subsequent classic parameter identification uses experimental data sets of different scales, i.e., integral data, such as load–displacement curves, field data, in particular displacement fields, and microscopic measurements of void fractions, aiming to predict ductile damage in the sense of void area fraction in bulk metal forming.

Fig. 1

Overview of the proposed optimization framework incorporating a machine learning approach for start value prediction with subsequent multi-objective parameter identification. Within the multi-objective optimization, input data of different scales is used, i.e., integral data such as forces, field data such as displacement fields, and void measurements at the micro-scale. Here, the parameters of the Lemaitre model that are to be identified are marked in blue. (Color figure online)

The paper is structured as follows: Section 2 briefly describes the methods used, particularly the machine learning framework devised for accurate start value prediction and the multi-objective parameter identification tool ADAPT. Section 3 applies the combined methods to a boundary value problem specific to a round-notched tensile test of the case hardening steel 16MnCrS5. In Section 4, the calibrated damage models are applied to forward rod extrusion and validated based on void measurements. A conclusion and an outlook are given in Section 5.

2 Methods

2.1 Machine learning-based prediction of parameter sets

In classic parameter identification procedures, generating appropriate starting values is challenging. A proper starting value generation requires knowledge of the underlying material model and the influence of the constitutive material parameters on the simulated material response to estimate appropriate ranges for the parameters. The difficulty increases with the complexity of the model (i.e., covering material effects such as damage, plasticity, or temperature) and the number of model-related parameters. Machine learning techniques have been incorporated into parameter identification to improve the overall procedure and efficiency. In this work, the machine learning-based framework developed in [3] is applied in order to efficiently obtain a high-quality starting value for a subsequent classic optimization-based parameter identification. In the first step, this approach identifies the ranges of the different parameters. Subsequently, a sufficiently large set of varying parameter combinations is generated, and the corresponding simulations are performed to obtain a training data set for an artificial neural network (ANN). For each combination, this data set contains the simulated material response or loading path sequence as input and the corresponding parameter set as output. A feedforward neural network is chosen due to its straightforward implementation and the flexibility to adapt the neural network by increasing the number of neurons and hidden layers to capture more complex relations within the underlying dataset. This architecture has already been successfully validated for homogeneous states of deformation in Schulte et al. [3] and will be applied to inhomogeneous states of deformation in the present paper. The computational time can be reduced by employing simple loading paths, by restricting the considered boundary value problem to homogeneous states of deformation, or even by reducing it to the material point level. After successful training, the experimentally measured material response can be handed to the trained ANN, and the corresponding material parameters can be predicted immediately.

Considering the sampling of the specified parameter ranges, an even distribution can be obtained with the Latin Hypercube Sampling (LHS) method, which is readily available in many programming languages. LHS subdivides each parameter interval into as many subintervals as there are sample points. A random number is generated in each subinterval, and the resulting values of each parameter are randomly distributed over the sample array to achieve an even coverage. Thus, a valid representation of the actual variability is ensured.
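The sampling step described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation used in [3]; the function name `latin_hypercube` and the three parameter ranges are hypothetical.

```python
import random


def latin_hypercube(bounds, n_samples, seed=0):
    """Draw n_samples parameter sets by Latin Hypercube Sampling.

    bounds: list of (lower, upper) tuples, one per parameter.
    Returns a list of n_samples rows, each holding one value per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for lower, upper in bounds:
        width = (upper - lower) / n_samples
        # One random point inside each of the n_samples subintervals.
        points = [lower + (k + rng.random()) * width for k in range(n_samples)]
        # Shuffle so that the subintervals of different parameters
        # are combined randomly.
        rng.shuffle(points)
        columns.append(points)
    # Transpose: one row per sample, one column per parameter.
    return [list(row) for row in zip(*columns)]


# Hypothetical ranges for three material parameters (illustrative only).
bounds = [(500.0, 1500.0), (0.05, 0.40), (0.0, 0.1)]
samples = latin_hypercube(bounds, n_samples=10)
```

Each row of `samples` would then define one simulation of the training data set. By construction, every subinterval of every parameter is hit exactly once.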

In this contribution, force–displacement curves are used for training. Since the displacements are prescribed on the Dirichlet boundary in the simulation and are therefore already known, it is sufficient to only consider the forces as input data for the ANN. In addition, 20% of the total data set is separated from the training set and used to test the network. In the testing process, the performance of the trained ANN can be investigated since the test data set was not considered in the training process. Thus, the deviations between predicted parameters, as based on the loading path sequences of the test data set, and original parameters can be employed for validation purposes. Furthermore, the original curve of the loading path can be compared to the simulated curve based on the predicted parameter set. Hence, even if the parameters deviate significantly, the prediction quality of the network can be sufficient if the deviation of the curves is only marginal.
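As an illustration of the data handling described above, the following sketch separates 20% of a data set for testing and evaluates the mean relative deviation per parameter. It is a simplified stand-in under stated assumptions: the helper names, the synthetic force curves, and the parameter values are hypothetical, and the ANN itself is omitted.

```python
import random


def train_test_split(inputs, targets, test_fraction=0.2, seed=0):
    """Randomly separate a fraction of the samples for testing."""
    indices = list(range(len(inputs)))
    random.Random(seed).shuffle(indices)
    n_test = round(test_fraction * len(indices))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [inputs[i] for i in train_idx]
    y_train = [targets[i] for i in train_idx]
    x_test = [inputs[i] for i in test_idx]
    y_test = [targets[i] for i in test_idx]
    return x_train, y_train, x_test, y_test


def mean_relative_deviation(predicted, original):
    """Mean relative deviation of each parameter over all test samples."""
    n_params = len(original[0])
    n_samples = len(original)
    return [
        sum(abs(p[j] - o[j]) / abs(o[j]) for p, o in zip(predicted, original))
        / n_samples
        for j in range(n_params)
    ]


# Synthetic placeholder data: 50 force curves (inputs) with 5 points each
# and 50 parameter sets (targets) with 2 parameters each.
forces = [[0.1 * i * k for k in range(5)] for i in range(50)]
params = [[100.0 + i, 0.1 + 0.01 * i] for i in range(50)]
x_train, y_train, x_test, y_test = train_test_split(forces, params)
```

In practice, `predicted` would be the network output for `x_test`. A second check, as noted in the text, would compare the curves simulated with the predicted parameters against the original curves.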

If the network performance is insufficient, the performance can be improved by optimizing the hyperparameters of the network, particularly the number of layers and neurons per layer. This can be done automatically, and the best-performing network may ultimately be selected.

In addition, if the parameter predictions of the network deviate from the original parameters but the simulated material responses nearly match the original curves, the parameter deviations provide sensitivity-like information: either the parameters are interdependent or the underlying loading path sequences are not sufficient to uniquely identify all material parameters. Consequently, the machine learning-based framework provides start values as well as an indication of parameter identifiability.

2.2 Optimization-based parameter identification

The free and open-source tool ADAPT [2] is employed for the subsequent optimization-based parameter identification of the constitutive models. Various types of input data, specifically integral data such as forces and full-field data, i.e., displacement and strain fields, can be considered in the multi-objective optimization. Within a classic optimization-based parameter identification, the optimization problem

$$\begin{aligned} \min (f(\varvec{\kappa })), \forall \varvec{\kappa } \in \textbf{K},\,\, \text {with} \, \,\textbf{K} = \{ \varvec{\kappa } \, | \, \textbf{h}(\varvec{\kappa }) = \textbf{0}, \textbf{g}(\varvec{\kappa }) \le \textbf{0} \} \end{aligned}$$
(1)

is solved for the parameter set \(\varvec{\kappa }\), where \(f(\varvec{\kappa })\) represents the scalar objective function quantifying the error between simulation and experiment. The equality and inequality constraints are denoted by \(\textbf{h}(\varvec{\kappa }) = \textbf{0}\) and \(\textbf{g}(\varvec{\kappa }) \le \textbf{0}\), respectively. To measure the error between simulation data \(\bullet _i^{\textrm{sim}}\) and experimental data \(\bullet _i^{\exp }\) for each data point i, the objective function is defined as the weighted root mean square of the residuals

$$\begin{aligned} f(\varvec{\kappa }) = \sqrt{\frac{1}{n_{\textrm{dp}}} \, \sum _{i=1}^{n_{\textrm{dp}}} w_i\big |\bullet _i^{\textrm{sim}}(\varvec{\kappa }) - \bullet _i^{\exp } \big |^2} . \end{aligned}$$
(2)

Here, \(n_{\textrm{dp}}\) defines the number of data points in space and time, and \(w_i\) represents weighting coefficients. The symbol \(\bullet \) can refer to any quantity of interest.
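A minimal sketch of such an objective function is given here for illustration. It assumes squared residuals, as the wording "squares" in the text suggests, and it is not taken from ADAPT; the function name and the force values are hypothetical.

```python
import math


def objective(sim, exp, weights=None):
    """Weighted root mean square error between simulation and experiment.

    sim, exp: sequences of equal length (one entry per data point).
    weights:  optional weighting coefficients w_i (default: all 1).
    """
    n_dp = len(exp)
    if weights is None:
        weights = [1.0] * n_dp
    total = sum(w * (s - e) ** 2 for w, s, e in zip(weights, sim, exp))
    return math.sqrt(total / n_dp)


# Hypothetical force values (e.g. in kN) at four displacement increments.
force_exp = [0.0, 10.0, 18.0, 22.0]
force_sim = [0.0, 11.0, 17.0, 24.0]
f_force = objective(force_sim, force_exp)
```

For these values, the residuals are 0, 1, -1, and 2, so that `f_force` equals the square root of 6/4, i.e., approximately 1.22.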

In the current contribution, multiple objectives are considered instead of focusing on a single objective, notably forces, displacements, and void area fractions. Detailed information about the experimental data is provided in Section 3.1. The optimization of multiple objectives requires the formulation of multiple objective functions that need to be optimized simultaneously. Alternatively, the set of objective functions

$$\begin{aligned} \textbf{f}(\varvec{\kappa }) = \left[ f_1(\varvec{\kappa }),f_2(\varvec{\kappa }), \ldots , f_n(\varvec{\kappa })\right] ^{\textrm{T}} \end{aligned}$$
(3)

can be reduced to a single objective function

$$\begin{aligned} f(\varvec{\kappa }) = \sum _{i=1}^{n} \frac{f_i(\varvec{\kappa })}{\hat{f}_i} , \end{aligned}$$
(4)

where the single objective functions \(f_i(\varvec{\kappa })\) are normalized by using a scalar value \(\hat{f}_i\) and where n is the number of individual objectives. It is worth noting that solutions to multi-objective optimization problems are typically not unique since the individual objective functions often conflict with one another. In other words, improving one objective can make another worse. To find the best compromise, a Pareto optimum [39] can be used instead by introducing a modified global objective function of the form

$$\begin{aligned} f(\varvec{\kappa }) = \sqrt{\sum _{i=1}^{n} \left[ \frac{f_i^*- f_i(\varvec{\kappa })}{f_i^*} \right] ^2} . \end{aligned}$$
(5)

Here, \(f_i^*\) defines the minimum error for each single objective function, i.e., the value obtained when the respective objective is optimized individually. A parameter set is then considered Pareto-optimal if it is impossible to improve one of the single objective functions without increasing the error of the others.
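The two scalarizations in eqs. (4) and (5) can be illustrated with a short sketch. The function names and all numerical values are hypothetical and serve only to show how the individual errors enter the global objective.

```python
import math


def weighted_sum(f_values, f_hat):
    """Eq. (4): sum of the individual objectives, each normalized by f_hat_i."""
    return sum(f / fh for f, fh in zip(f_values, f_hat))


def pareto_distance(f_values, f_star):
    """Eq. (5): relative distance to the individual minima f_i^*."""
    return math.sqrt(
        sum(((fs - f) / fs) ** 2 for f, fs in zip(f_values, f_star))
    )


# Hypothetical errors for force, displacement, and void area fraction.
f_star = [0.10, 0.20, 0.04]    # minima from individual optimizations
f_current = [0.12, 0.30, 0.05]  # errors of one compromise parameter set
f_global = pareto_distance(f_current, f_star)
```

With these numbers, the relative deviations are 20%, 50%, and 25%, so the displacement objective dominates `f_global`. A parameter set that reached all individual minima at once would give a value of zero.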

In ADAPT, evolutionary, gradient-based, and gradient-free algorithms are available for optimization. In this study, the gradient-free Nelder–Mead simplex algorithm is employed [40]. This method does not require differentiability of the objective function and exhibits robust convergence behavior in complex optimization problems. However, its convergence rate is slower than that of gradient-based algorithms, which emphasizes the need to choose suitable start values.
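To illustrate the principle of the simplex search, a strongly simplified Nelder–Mead variant is sketched here. It is not the ADAPT implementation: it uses only reflection, expansion, inside contraction, and shrinkage with the usual textbook coefficients, and the quadratic `toy_objective` is a hypothetical stand-in for the simulation-based objective function.

```python
def nelder_mead(f, x0, step=0.1, tol=1e-12, max_iter=1000):
    """Simplified Nelder-Mead search; returns (best point, best value)."""
    n = len(x0)
    # Initial simplex: start point plus one perturbed vertex per dimension.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    values = [f(v) for v in simplex]

    for _ in range(max_iter):
        # Sort vertices from best (lowest value) to worst.
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if abs(values[-1] - values[0]) < tol:
            break
        worst = simplex[-1]
        # Centroid of all vertices except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through worst vertex and centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_r = f(reflected)
        if f_r < values[0]:
            # Reflection is the new best point: try to expand further.
            expanded = along(2.0)
            f_e = f(expanded)
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            simplex[-1], values[-1] = reflected, f_r
        else:
            # Inside contraction toward the centroid.
            contracted = along(-0.5)
            f_c = f(contracted)
            if f_c < values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                # Shrink all vertices toward the best one.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, vert)]
                    for vert in simplex[1:]
                ]
                values = [values[0]] + [f(v) for v in simplex[1:]]

    best_index = min(range(n + 1), key=lambda k: values[k])
    return simplex[best_index], values[best_index]


def toy_objective(kappa):
    # Hypothetical smooth objective with its minimum at (1, -2).
    return (kappa[0] - 1.0) ** 2 + (kappa[1] + 2.0) ** 2


kappa_best, f_best = nelder_mead(toy_objective, [0.0, 0.0])
```

Each iteration requires only function evaluations, which in the present setting correspond to finite element simulations. This is why a start value close to the minimum directly reduces the computational effort.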

3 Parameter identification of 16MnCrS5

3.1 Experiment

Rotationally symmetric tensile specimens with a notch are used for optimization (Fig. 2a). The global strain rate was chosen as 0.00025 1/s (related to linear, non-logarithmic strains) to represent quasi-static conditions. Global displacement was measured by using a tactile extensometer with an initial length of 20 mm. The extensometer was positioned outside of the notch where the strains are nearly homogeneous. The load–displacement curve (Fig. 2b) has a pronounced force drop before fracture due to the ductile material behavior.

Fig. 2

Experimental setup: a Geometry of the considered round tensile specimen of the material 16MnCrS5. b Load–displacement curve during tensile loading

Since the specimen is rotationally symmetric, the field data obtained by digital image correlation (DIC) contains a lot of redundant information and can be reduced to data along a vertical line on the surface by exploiting the symmetry (Fig. 3a). The radial strain of the specimen corresponds to the strain in orthogonal direction \(\varvec{e}_x\) for a path along the longitudinal axis centered on the specimen. This depth strain is calculated by assuming volume constancy based on the principal strains on the surface. Whereas this approach is reasonable for small strains in the thickness direction, as is the case for flat tensile specimens, the error would be too large for notched round tensile specimens. Displacement fields must therefore be used. By eliminating rigid body motions through the use of a moving coordinate system centered at the notch as the reference system, the radial displacement \(u_r(z)\) can be referred to as the displacement in depth-direction \(u_x (z, \varphi =0^\circ )\) (displacement orthogonal to the measured surface). The specimen thickness in terms of radius changes during loading, and at the same time, the specimen expands in the longitudinal direction (Fig. 3b). This behavior is mapped by the contour within the notch region. Such contour measurements are used for parameter identification.

Fig. 3

a Displacement field in thickness direction \(u_x\) immediately before fracture. The displacement in the thickness direction is equal to the radial displacement of the outer contour along the vertical line (\(u_r(z) = u_x (z, \varphi =0^\circ )\)). b Contour evolution in radial direction w.r.t. extensometer displacement within the notch of the round tensile specimen

The tensile tests were performed at the Institute of Forming Technology and Lightweight Components (IUL) at TU Dortmund University. The Central Facility for Electron Microscopy (GFE) at RWTH Aachen University provided the experimental void measurements. The void measurements were made using scanning electron microscopy (SEM). Specimens loaded to failure were separated, embedded, and ground with grits of 320, 400, 600, 800, 1000, 1200, and 2400. The specimens were then polished to 6 \(\mu \)m, 3 \(\mu \)m, 1 \(\mu \)m and 0.25 \(\mu \)m. This was followed by an OPS polish (oxide polish suspension based on SiO\(_{2}\)) and a polish with water. The void fractions were detected along the central axis in loading direction. The area measured by SEM is 9 mm \(\times \) 1 mm (length \(\times \) width, Fig. 4a) and is located as close as possible to the fracture surface. The smallest detectable void has an area of 0.037 \(\mu {\hbox {m}}^2\). Based on the provided void measurements, the void area fraction is calculated by summing the measured void areas within a reference area, the so-called window, and dividing the sum by the area of this window. The choice of the window size greatly influences the results (Fig. 4b). If the window is too small, the data is not statistically representative and significant outliers are observed. On the other hand, a window that is too large leads to insufficient resolution of the data, as differences along the length are not mapped with sufficient accuracy. In the following, the data with a corresponding window of 0.3 mm \(\times \) 0.3 mm will be used since data with this window size represent the local trends well. In addition, the window size is similar to the element size of the FE mesh in the deformed configuration.
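The window-based evaluation described above can be sketched as follows. This is a simplified one-dimensional illustration that assumes each void is given by its distance to the fracture surface and its area; the function name and the void data are hypothetical and do not reproduce the measured values.

```python
def void_area_fraction(voids, window_length, window_width, total_length):
    """Void area fraction per window along the distance to the fracture surface.

    voids: list of (z, area) tuples with z in mm and area in mm^2.
    Returns one fraction per window, ordered by increasing distance z.
    """
    n_windows = int(round(total_length / window_length))
    window_area = window_length * window_width
    area_sums = [0.0] * n_windows
    for z, area in voids:
        # Index of the window that contains the void (last window is closed).
        k = min(int(z / window_length), n_windows - 1)
        area_sums[k] += area
    return [s / window_area for s in area_sums]


# Hypothetical voids: (distance to fracture surface in mm, area in mm^2).
voids = [(0.05, 2.0e-6), (0.10, 1.0e-6), (0.40, 1.0e-6)]
fractions = void_area_fraction(
    voids, window_length=0.3, window_width=0.3, total_length=0.9
)
```

Repeating the evaluation with different window sizes would reproduce the trade-off discussed in the text: small windows give scattered values, whereas large windows smear out the gradient toward the fracture surface.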

Fig. 4

a SEM panoramic image of the void area distribution along the axial plane (\(\varvec{e}_y\) - \(\varvec{e}_z\) plane, cf. Fig. 3a) of the round tensile specimen starting at the fracture surface (top) as well as the representation of the void area of single voids along the distance to the fracture surface (bottom). GFE has provided the void measurements. b Void area fraction along the \(\varvec{e}_z\)-direction to the fracture surface. The void area fraction is calculated by summing the measured void areas and dividing them by a corresponding total area. Area sizes of 0.01–0.15 mm\(^2\) with different lengths and widths are analyzed to identify a suitable corresponding area

3.2 Finite element model

The notched round tensile specimen of 16MnCrS5 was modeled as a rotationally symmetric body exploiting symmetry in the longitudinal direction (Fig. 5). Two-dimensional axisymmetric elements with reduced integration and enhanced hourglass control (CAX4R) were used in the finite element analysis tool Abaqus. The simulations were performed with explicit time integration since the material models used are available in the form of VUMAT subroutines. According to a convergence study (see Appendix A), the element length was set to \(l_\text {e} = 0.25\) mm and a mass scaling of 10\(^{4}\) was chosen. The selected element length and mass scaling were verified by comparing an explicit elasto-plastic simulation with a static implicit elasto-plastic simulation. The experimentally measured displacements are applied to the top edge of the finite element model in \(\varvec{e}_z\)-direction. The top edge corresponds to the position of the extensometer in the experiment in order to represent the real load condition as accurately as possible. The radial displacements at the top edge are not constrained, taking into account that the experimental clamping is sufficiently far from the extensometer position. The process is assumed to be isothermal.

Fig. 5

Finite element model of the round tensile test with corresponding boundary conditions

3.3 Material models

The current study employs two models accounting for the elasto-plastic behavior and ductile damage in metals. The models are formulated for finite deformations and rely on a multiplicative decomposition of the deformation gradient \(\textbf{F}=\textbf{F}_{\text {e}}\cdot \textbf{F}_{\text {p}}\) into an elastic part \(\textbf{F}_{\text {e}}\) and a plastic part \(\textbf{F}_{\text {p}}\) as proposed by Lee [41]. The presentation of the model framework follows a previous comparison of the models, which focused on their predictive performance for the forming of thick sheets [42]. The kinematic framework approximates the logarithmic elastic strain \(\ln (\textbf{U}_\text {e})\) within the context of the polar decomposition \(\textbf{F}_{\text {e}}=\textbf{R}_{\text {e}}\cdot \textbf{U}_{\text {e}}\). Here, \(\textbf{R}_{\text {e}}\) represents the elastic rotation tensor. The approximation

$$\begin{aligned} \dot{\overline{\ln {(\textbf{U}_\text {e})}}} = \textbf{R}_\text {e}^{\text {T}} \cdot \textbf{D} \cdot \textbf{R}_\text {e} - \textbf{D}_\text {p} \end{aligned}$$
(6)

of the rate of change of the logarithmic elastic strain holds for small elastic strains \(|\ln (\textbf{U}_\text {e})| \ll 1\) and is derived in more detail in, e.g., [43]. The evolution relation depends on the total rate of deformation \(\textbf{D}=\text {sym} (\textbf{L})=\text {sym}(\dot{\textbf{F}}\cdot \textbf{F}^{-1})\). The rate of plastic deformation and rotation are defined as \(\textbf{D}_{\text {p}}=\text {sym}(\dot{\textbf{F}}_{\text {p}}\cdot \textbf{F}_{\text {p}}^{-1})\) and \(\textbf{W}_{\text {p}}=\text {skw}(\dot{\textbf{F}}_{\text {p}}\cdot \textbf{F}_{\text {p}}^{-1})\), respectively. It is assumed that the material axes rotate with the continuum spin, which implies that \(\textbf{W}_{\text {p}}\approx {\textbf {0}}\). Consequently, \(\textbf{R}_\text {e}=\textbf{R}\) results for the elastic rotation with \(\textbf{R}=\textbf{F}\cdot \textbf{U}^{-1}\). The class of investigated metals is considered to be elastically and plastically isotropic. The back-rotated Cauchy stress

$$\begin{aligned} \textbf{M} = \textbf{R}_\text {e}^{\text {T}} \cdot \varvec{\sigma } \cdot \textbf{R}_\text {e} \end{aligned}$$
(7)

is chosen as the stress measure in the model formulation. The evolution relation

$$\begin{aligned} \textbf{D}_\text {p} = \dot{\gamma } \,\, \frac{\partial \Phi }{\partial \textbf{M}} \end{aligned}$$
(8)

for the rate of plastic deformation, where \(\dot{\gamma }\) denotes the plastic multiplier, is formally identical in the two models considered in this work. The total potential

$$\begin{aligned} \Phi (\textbf{M}, \sigma _\text {f}, \omega , Y) = \Phi ^{\text {p}}(\textbf{M}, \sigma _\text {f} , {\omega }) + \Phi ^{\text {d}}(Y, \omega ) \end{aligned}$$
(9)

is split into a plastic part \(\Phi ^{\text {p}}\) and a ductile damage part \(\Phi ^{\text {d}}\), where the internal variable \(\omega \) accounts for the evolution of voids. The specific choice of \(\Phi ^{\text {p}}\) determines the rate of plastic deformation in eq. (8). Similarly, the influence of damage on the plastic and elastic behavior depends on the specific choices for \(\Phi ^{\text {p}}\) and \(\Phi ^{\text {d}}\). In particular, the driving force Y for damage governs the influence of damage on elasticity.

Isotropic hardening according to Swift

$$\begin{aligned} \sigma _\text {f} = K(\varepsilon _0 + \varepsilon ^\text {p}_\text {eq})^{n} , \end{aligned}$$
(10)

in terms of the equivalent plastic strain \(\varepsilon ^\text {p}_\text {eq}=\!\int \! \sqrt{{2}/{3}}\,|\textbf{D}_{\text {p}}|\,\textrm{d}t\) and the material parameters K, \(\varepsilon _0\) as well as n is considered. The elastic stress update in the constitutive model is

$$\begin{aligned} \dot{\textbf{M}}=\lambda \,\text {tr}\left( \textbf{R}_{\text {e}}^{\text {T}}\cdot \textbf{D}\cdot \textbf{R}_{\text {e}}^{\text {}}-\textbf{D}_{\text {p}}\right) \,\textbf{I}+2\,\mu \,\left( \textbf{R}_{\text {e}}^{\text {T}}\cdot \textbf{D}\cdot \textbf{R}_{\text {e}}^{\text {}}-\textbf{D}_{\text {p}}\right) , \end{aligned}$$
(11)

where \(\textbf{I}\) denotes the second order identity tensor. Parameters \(\lambda =E\nu /\left[ \left( 1+\nu \right) \left( 1-2\nu \right) \right] \) and \(\mu = E / \left[ 2 \left( 1+\nu \right) \right] \) are the Lamé constant and the shear modulus in terms of the Young’s modulus E and the Poisson’s ratio \(\nu \), respectively.
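For illustration, the Swift hardening law in eq. (10) and the elastic constants of eq. (11) can be evaluated as follows. The parameter values are assumptions chosen as typical orders of magnitude for steel and are not the identified parameters of 16MnCrS5.

```python
def swift_flow_stress(eps_p_eq, K, eps_0, n):
    """Eq. (10): flow stress as a function of the equivalent plastic strain."""
    return K * (eps_0 + eps_p_eq) ** n


def lame_constants(E, nu):
    """Lame constant lambda and shear modulus mu from E and nu."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu


# Assumed illustrative values (MPa for stresses, dimensionless otherwise).
K, eps_0, n = 1000.0, 0.01, 0.2
E, nu = 210000.0, 0.3

sigma_f_initial = swift_flow_stress(0.0, K, eps_0, n)  # at zero plastic strain
sigma_f_half = swift_flow_stress(0.5, K, eps_0, n)     # at 50 % plastic strain
lam, mu = lame_constants(E, nu)
```

With these assumed values, the flow stress at zero plastic strain is approximately 398 MPa, and the elastic constants are approximately 121154 MPa for \(\lambda \) and 80769 MPa for \(\mu \).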

3.3.1 Porous plasticity

The motivation of porous plasticity models is the coupling of the decrease in yield stress \(\sigma _{\text {y}}\) to the void evolution. Voids in the as-received material increase their volume due to the hydrostatic stress state. The choices of \(\omega =D_{\text {GTN}}\) and \(\Phi ^{\text {p}, \,\text {GTN}}\), with \(\Phi ^{\text {d}, \,\text {GTN}}=0\), lead to the plastic potential

$$\begin{aligned} \Phi ^{\text {p,GTN}} = \, \bigg (\frac{\sigma _\text {eq}}{\sigma _\text {f}}\bigg )^2 + 2 \, q_1 D_\text {GTN}\cosh {\bigg [ \frac{3}{2}\, \eta \,q_2\bigg ]} - (1 + q_3 D_{\text {GTN}}) , \end{aligned}$$
(12)

which is a function of the void volume fraction \(D_\text {GTN}\) and which depends on the flow stress \(\sigma _\text {f}\) according to eq. (10) and the equivalent stress \(\sigma _\text {eq} = \sqrt{3/2}\,|\text {dev}(\textbf{M})|\) for isotropic plasticity, where \(\text {dev}({\textbf {M}})\) defines the deviatoric part of the stress tensor. In addition, the triaxiality

$$\begin{aligned} \eta = \frac{\sigma _\text {m}}{\sigma _\text {eq}} \quad \text {with} \quad \sigma _\text {m} = \frac{1}{3}\text {tr}(\varvec{\sigma }) \end{aligned}$$
(13)

represents the hydrostatic stress state. The parameters \(q_1\), \(q_2\), and \(q_3\) control the influence of the void volume fraction on softening.
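The effect of the void volume fraction on the plastic potential can be made tangible by evaluating eq. (12) as printed above. This sketch is not the VUMAT implementation; the stress values are hypothetical, and the values of \(q_1\), \(q_2\), and \(q_3\) are assumed for illustration only.

```python
import math


def gtn_potential(sigma_eq, sigma_m, sigma_f, D, q1, q2, q3):
    """Plastic potential of eq. (12) with the triaxiality of eq. (13)."""
    eta = sigma_m / sigma_eq
    return (
        (sigma_eq / sigma_f) ** 2
        + 2.0 * q1 * D * math.cosh(1.5 * eta * q2)
        - (1.0 + q3 * D)
    )


# Assumed illustrative parameters and a hypothetical stress state in MPa.
q1, q2, q3 = 1.5, 1.0, 2.25
sigma_eq, sigma_m, sigma_f = 400.0, 100.0, 400.0

phi_undamaged = gtn_potential(sigma_eq, sigma_m, sigma_f, 0.0, q1, q2, q3)
phi_damaged = gtn_potential(sigma_eq, sigma_m, sigma_f, 0.01, q1, q2, q3)
```

For a vanishing void volume fraction, the potential reduces to the von Mises form, so that `phi_undamaged` is zero for \(\sigma _\text {eq} = \sigma _\text {f}\). With a void volume fraction of 1%, the same stress state yields a positive value for the assumed parameters, i.e., it lies outside the yield surface, which reflects the softening caused by voids.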

The current model considers void evolution by growth and nucleation following the fundamental work of Gurson [8]. In particular, the void nucleation term is adopted according to the extension by Chu and Needleman [44]. The evolution of the void volume fraction

$$\begin{aligned} \dot{D}_\text {GTN} = \dot{D}_\text {GTN}^\text {gr,\, hyd} + \dot{D}_\text {GTN}^\text {gr,\, shr} + \dot{D}_\text {GTN}^\text {nuc} \end{aligned}$$
(14)

consists of the parts for void nucleation \(\dot{D}_\text {GTN}^\text {nuc}\) (eq. (18)) and parts due to hydrostatic void growth \(\dot{D}_\text {GTN}^\text {gr,\, hyd}\) as well as void growth related to shear stresses \(\dot{D}_\text {GTN}^\text {gr,\, shr}\). The hydrostatic void growth

$$\begin{aligned} \dot{D}_\text {GTN}^\text {gr,\, hyd} = (1-D_\text {GTN}) \, \text {tr}(\textbf{D}_{\text {p}}) , \end{aligned}$$
(15)

depends on the current void volume fraction \(D_\text {GTN}\) and the trace of the rate of plastic deformation \(\textbf{D}_{\text {p}}\). The void growth due to shear according to the extension by Nahshon and Hutchinson [45]

$$\begin{aligned} \dot{D}_{\textrm{GTN}}^{\text {gr, shr}} = k_w\,D_{\textrm{GTN}} \frac{f({\text {dev}}({\textbf{M}}))}{\sigma _{\textrm{eq}}}{\textrm{dev}}(\textbf{M}):{\textbf{D}}_{\textrm{p}} \end{aligned}$$
(16)

depends on a material parameter \(k_w\) and a stress-dependent function

$$\begin{aligned} f(\text {dev}(\textbf{M})) = 1-\left( \frac{27\,J_3}{2\,\sigma _\text {eq}^3}\right) ^2 , \end{aligned}$$
(17)

where \(J_3 = \text {det}(\text {dev}(\textbf{M}))\) defines the third invariant of the deviatoric stress tensor. The rate of void nucleation is defined as

$$\begin{aligned} \dot{D}_\text {GTN}^\text {nuc} = \frac{f_\text {n}}{S_\text {\!n} \, \sqrt{2 \, \pi }} \, \exp \left( - \frac{1}{2} \, \left[ \frac{\varepsilon ^\text {p}_\text {eq} - \varepsilon _\text {n}}{S_\text {\!n}} \right] ^2 \right) \, \dot{\varepsilon }^\text {p}_\text {eq} , \end{aligned}$$
(18)

where \(f_\text {n}\), \(S_\text {\!n}\) and \(\varepsilon _\text {n}\) denote additional material parameters. The model is numerically implemented in Abaqus/Explicit via the VUMAT interface by Soyarslan et al. [46].
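The three contributions of eq. (14) can be sketched in a few lines of plain Python. This is not the VUMAT implementation of [46] but a minimal illustration under stated assumptions: tensors are 3x3 nested lists, all function names and numerical values are hypothetical, and the shear weight is written with the squared invariant ratio of the Nahshon and Hutchinson formulation, so that it vanishes for axisymmetric stress states and equals one for pure shear.

```python
import math


def deviator(M):
    """Deviatoric part of a 3x3 tensor given as nested lists."""
    mean = (M[0][0] + M[1][1] + M[2][2]) / 3.0
    return [[M[i][j] - (mean if i == j else 0.0) for j in range(3)]
            for i in range(3)]


def det3(A):
    """Determinant of a 3x3 tensor."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))


def ddot(A, B):
    """Double contraction A:B."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))


def shear_weight(M):
    """Stress-dependent weight of the shear growth term, cf. eq. (17)."""
    s = deviator(M)
    sigma_eq = math.sqrt(1.5 * ddot(s, s))
    return 1.0 - (27.0 * det3(s) / (2.0 * sigma_eq ** 3)) ** 2


def void_rate(M, Dp, D, eps_p, eps_p_rate, k_w, f_n, S_n, eps_n):
    """Total rate of the void volume fraction, cf. eq. (14)."""
    s = deviator(M)
    sigma_eq = math.sqrt(1.5 * ddot(s, s))
    growth_hyd = (1.0 - D) * (Dp[0][0] + Dp[1][1] + Dp[2][2])        # eq. (15)
    growth_shr = k_w * D * shear_weight(M) / sigma_eq * ddot(s, Dp)  # eq. (16)
    nucleation = (f_n / (S_n * math.sqrt(2.0 * math.pi))
                  * math.exp(-0.5 * ((eps_p - eps_n) / S_n) ** 2)
                  * eps_p_rate)                                      # eq. (18)
    return growth_hyd + growth_shr + nucleation


# Hypothetical stress states in MPa
uniaxial = [[400.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pure_shear = [[0.0, 100.0, 0.0], [100.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
zero_rate = [[0.0] * 3 for _ in range(3)]

w_uniaxial = shear_weight(uniaxial)
w_shear = shear_weight(pure_shear)
# Nucleation only: no plastic flow tensor, strain exactly at eps_n (assumed values)
rate_nuc = void_rate(uniaxial, zero_rate, 0.01, 0.3, 1.0,
                     k_w=1.0, f_n=0.04, S_n=0.1, eps_n=0.3)
print(w_uniaxial, w_shear, rate_nuc)
```

The two limiting stress states show why the shear term leaves the tensile-dominated response nearly unaffected while contributing fully under shear.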

3.3.2 Lemaitre model

The version of the Lemaitre model used here extends the influence of voids to the elastic behavior via the specific potential

$$\begin{aligned} \Phi (\textbf{M}, \sigma _\text {f}, D_{\text {Lem}}, Y) = \Phi ^{\text {p,\,Lem}}(\textbf{M}, \sigma _\text {f} , D_{\text {Lem}}) + \Phi ^{\text {d,\,Lem}}(Y, D_{\text {Lem}})\,. \end{aligned}$$
(19)

The specific choices are then \(\omega =D_{\text {Lem}}\), cf. eq. (9) and

$$\begin{aligned} \Phi ^{\text {p,Lem}} = \tilde{\sigma }_\text {eq} - \sigma _{\text {f}} \end{aligned}$$
(20)

for the plastic potential with \(\tilde{\sigma }_\text {eq} = \sqrt{3/2}\,|\text {dev}(\tilde{\textbf{M}})|\), cf. eq. (24). The damage potential is defined as

$$\begin{aligned} \Phi ^{\text {d, Lem}} = \frac{S}{1+\delta } \, \dot{\gamma } \, \bigg \langle \frac{Y - Y_0}{S} \bigg \rangle ^{1 + \delta } \, \frac{1}{(1-D_\text {Lem})^{\beta }} \end{aligned}$$
(21)

and depends on the driving force Y, and an associated threshold \(Y_0\), whereas S, \(\delta \) and \(\beta \) define material parameters related to damage. The driving force

$$\begin{aligned} Y = \frac{1 + \nu }{2\,E} \sum _i^3 \langle \tilde{M}_i \rangle ^2 - \frac{\nu }{2\,E} \langle \tilde{M}_\text {h} \rangle ^2 , \end{aligned}$$
(22)

is a function of effective principal stresses \(\tilde{M}_i\) and the hydrostatic effective stress \(\tilde{M}_\text {h}\). With the damage potential, the damage evolution

$$\begin{aligned} \dot{D}_\text {Lem} = \dot{\gamma } \, \bigg \langle \frac{Y - Y_0}{S} \bigg \rangle ^{\delta } \frac{1}{(1-D_\text {Lem})^{\beta }} \end{aligned}$$
(23)

is derived and the coupling of damage to the mechanical properties is realized through effective stresses. The effective stress

$$\begin{aligned} \tilde{\varvec{M}} = \frac{\varvec{M}}{1-D_\text {Lem}} \end{aligned}$$
(24)

describes the stresses acting on the void-free cross section \(A = A_0 - A_\text {v}\) based on the initial cross section \(A_0\) and the void area \(A_\text {v}\). Thus, the damage variable

$$\begin{aligned} D_\text {Lem} = \frac{A_\text {v}}{A_0} \end{aligned}$$
(25)

represents the void area fraction rather than the void volume fraction used in the GTN model. The current study uses the numerical implementation of the described Lemaitre model in Abaqus/Explicit via the VUMAT interface by Soyarslan and Tekkaya [47].
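The interplay of eqs. (23) to (25) can be illustrated with a minimal scalar sketch. It is not the implementation of [47]; the function names and all numerical values are assumptions, the driving force Y is passed in directly instead of being evaluated from eq. (22), and \(\delta > 0\) is assumed.

```python
def macaulay(x):
    """Macaulay bracket <x> = max(x, 0)."""
    return x if x > 0.0 else 0.0


def effective_stress(M, D):
    """Effective stress of eq. (24) for a scalar stress measure."""
    return M / (1.0 - D)


def damage_rate(gamma_dot, Y, Y0, S, delta, beta, D):
    """Damage evolution of eq. (23); zero below the threshold Y0 for delta > 0."""
    return gamma_dot * macaulay((Y - Y0) / S) ** delta / (1.0 - D) ** beta


def void_area_fraction(A_v, A_0):
    """Damage variable of eq. (25)."""
    return A_v / A_0


# Hypothetical values for illustration only
D = void_area_fraction(0.2, 1.0)
sigma_eff = effective_stress(100.0, D)
rate_below = damage_rate(1.0, Y=0.5, Y0=1.0, S=2.0, delta=1.0, beta=1.0, D=0.0)
rate_above = damage_rate(1.0, Y=3.0, Y0=1.0, S=2.0, delta=2.0, beta=1.0, D=0.5)
print(sigma_eff, rate_below, rate_above)
```

The example makes the direct coupling visible: a void area fraction of 20% raises the effective stress by 25%, which is why damage values in this model are of the same order as the macroscopic softening.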

3.3.3 Material parameters

The material parameters for the GTN and the Lemaitre model are provided in Tables 1 and 2, respectively. Parameters that are pre-defined and not subject to calibration are highlighted in blue. The Young’s modulus E and the Poisson’s ratio \(\nu \) are directly obtained from experimental observations, while it is assumed that the other pre-defined parameters do not significantly influence the material behavior within the considered BVP. For the parameters to be identified, specific parameter ranges based on the material 16MnCrS5 have been defined. These ranges have been used in order to generate training data to predict start values.

Table 1 Overview of the defined material parameters of the GTN model
Table 2 Overview of the defined material parameters of the Lemaitre model

3.4 Prediction of initial parameter set

We employ the machine learning-based framework outlined in Section 2.1 to efficiently obtain high-quality start values for the subsequent optimization-based parameter identification. To this end, a data set has to be generated for each constitutive model, of which 20% is set aside as the test data set. For the Lemaitre model, 10,000 parameter combinations were sampled within the pre-defined parameter ranges (cf. Table 2) by using the LHS method and simulated with the FE model of the BVP; 9315 of the corresponding simulations were completed successfully. Following the work in [3], a feedforward neural network is generated by using a multilayer perceptron regressor (MLPregressor). The hyperbolic tangent \(\tanh (x)\) is chosen as the activation function because it proved very successful for this application case. The stochastic gradient-based optimizer from [48] is selected as the solver. The shuffling of the samples in each iteration is deactivated, and the initial learning rate is set to 0.0001. In addition, the seed of the random number generator, which governs the initialization of weights and biases, is fixed to 1. The training of the neural network was stopped once the mean squared error on a validation data set (data not included in the training process) no longer decreased. While the previously described settings influence the ANN’s performance, the performance also strongly depends on the corresponding hyperparameters. Therefore, a hyperparameter optimization via a random walk algorithm is performed following [3] to obtain the trained ANN with the best performance. For the Lemaitre damage model, this resulted in the following hidden layer structure: [171, 171, 171, 171, 171, 171, 194, 194, 194, 194, 194, 194, 221, 252, 287, 287, 287, 287, 287, 287, 326, 326, 326, 326, 326, 326]. Each entry in this vector represents one hidden layer and gives the number of neurons in that layer.
Since only two material models are to be calibrated for one specific material in this application case, namely the case-hardened steel 16MnCrS5, the boundary value problem (BVP) that is used in the optimization process to simulate the experimental loading behavior (see Fig. 5) is directly employed for the generation of the training data set. Thus, the underlying BVP involves inhomogeneous states of deformation; however, the ANN’s input data only comprise the overall reaction force of the load–displacement curve. Since the displacement is prescribed for the BVP, it is unnecessary to consider the displacements in the training process. Different measures can quantify the performance of the trained ANN in the testing process. First, the average deviation of the predicted parameters from the original material parameters (cf. Table 3 for the Lemaitre parameters) can be employed to investigate whether the ANN provides an accurate prediction.
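The data generation and network settings described above can be summarized in a short sketch. It is not the code of ADAPT or of [3]: the sampling routine is a generic Latin hypercube implementation, the parameter ranges are hypothetical, the split simply holds out the last 20% of the samples, and the settings dictionary lists keyword arguments as accepted by scikit-learn's MLPRegressor, where `solver="adam"` is our reading of "the optimizer from [48]" and `early_stopping=True` is the closest built-in counterpart to the stopping criterion.

```python
import random


def latin_hypercube(bounds, n_samples, seed=0):
    """Split every parameter range into n_samples equal strata and
    place exactly one sample in each stratum (random pairing)."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)
        width = (high - low) / n_samples
        columns.append([low + (k + rng.random()) * width for k in strata])
    return [list(point) for point in zip(*columns)]


def split_train_test(samples, test_fraction=0.2):
    """Hold out the last test_fraction of the samples as test data."""
    n_test = round(len(samples) * test_fraction)
    return samples[:-n_test], samples[-n_test:]


# Hypothetical ranges for three parameters (not the values of Table 2)
bounds = [(0.5, 5.0), (0.1, 2.0), (0.0, 10.0)]
samples = latin_hypercube(bounds, n_samples=10_000, seed=1)

# Assume that 9315 simulations finished, as reported for the Lemaitre model
completed = samples[:9315]
train, test = split_train_test(completed)
print(len(train), len(test))

# Settings from the text as keyword arguments for scikit-learn's MLPRegressor
mlp_settings = dict(
    hidden_layer_sizes=[171] * 6 + [194] * 6 + [221, 252] + [287] * 6 + [326] * 6,
    activation="tanh",
    solver="adam",            # assumption, see lead-in
    shuffle=False,
    learning_rate_init=1e-4,
    random_state=1,
    early_stopping=True,      # assumption, see lead-in
)
```

With these numbers the split yields 1863 test data sets, which matches the count reported for the Lemaitre model.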

Table 3 Average deviation between original and predicted parameters of the complete test data set for the optimal hyperparameter set in the case of the Lemaitre damage model

However, a very high average deviation for a parameter does not necessarily imply a poor performance of the ANN. If the simulated curve based on the predicted material parameter set is nevertheless close to the original input curve, different local minima exist, and some parameters are not uniquely identifiable for the underlying material model based on the considered loading path. The curve deviation of each test data set j was calculated in [3] via the absolute error

$$\begin{aligned} \displaystyle \delta _\textrm{AE}^j = \left. \sum \limits _{i\!=\!1}^{n_s}\dfrac{\displaystyle \int \bigg |\bullet ^{\text {inp}}_i - \bullet ^{\text {pred}}_i\bigg |\text {d}t}{\textrm{max}\left\{ |\bullet ^{\text {train}}_i|\right\} }\right| ^j\,, \end{aligned}$$
(26)

where \(n_s\) denotes the number of loading path sequences, in this case \(n_s = 1\). The deviation is normalized by the maximum value of the total data set. Hence, both measures can be used to investigate whether the loading path is sufficient for a proper calibration or whether the parameters are independent. In Fig. 6, all load–displacement curves of the test data set are shown, and the corresponding median curve is highlighted in red. The pronounced decrease in several load–displacement curves before the onset of fracture can be attributed to localization due to damage. The local damage distribution results in material softening, consequently causing a sharp decrease in force. The heatmap in Fig. 7 demonstrates the overall network performance in the testing process. The deviation of the loading curve of each test data set from the median curve is plotted in the horizontal direction. This deviation can demonstrate whether the network’s performance is similar over the total range or, for example, whether the ANN is less accurate in the outer region. The deviation \(\delta _\textrm{AE}\) between the predicted and the original curve is shown in the vertical direction. The colorbar gives the frequency. The clustering of the data sets at the bottom of the figure demonstrates that the trained ANN for the Lemaitre model is very accurate over the whole data range.
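For a single loading sequence, eq. (26) reduces to one normalized integral, which the following sketch evaluates. The trapezoidal rule, the function names, and the sample curves are assumptions made for illustration; the integration scheme used in [3] is not specified here.

```python
def trapezoid(y, t):
    """Trapezoidal rule for samples y at (possibly non-uniform) points t."""
    return sum(0.5 * (y[k] + y[k + 1]) * (t[k + 1] - t[k])
               for k in range(len(t) - 1))


def curve_deviation(t, curve_inp, curve_pred, max_train):
    """Absolute error of eq. (26) for one loading sequence (n_s = 1)."""
    abs_diff = [abs(a - b) for a, b in zip(curve_inp, curve_pred)]
    return trapezoid(abs_diff, t) / max_train


# Hypothetical force curves sampled at the same pseudo-time points
t = [0.0, 1.0, 2.0]
force_inp = [0.0, 10.0, 20.0]
force_pred = [0.0, 12.0, 18.0]
delta_ae = curve_deviation(t, force_inp, force_pred, max_train=20.0)
print(delta_ae)
```

Because the normalization uses the maximum of the whole training data set, values of \(\delta_\textrm{AE}\) remain comparable across test data sets with very different force levels.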

Fig. 6 Array of the load–displacement curves of the complete test data set for the GTN model (left) and the Lemaitre model (right). While all the curves are plotted in transparent light gray scales, the median curve is highlighted in red

Fig. 7 Heatmaps of the network performance for the optimal hyperparameters for the GTN model (left) and the Lemaitre model (right). The frequency of the 1801 (GTN model) and 1863 (Lemaitre model) test data sets (20% of the total data set) regarding their curve deviation from the median curve and the curve deviation of the predicted and original curves are shown. The clustering at the bottom of the figure demonstrates a good overall network performance

Next, the experimentally measured load–displacement curve from Section 3.1 is fed into the trained and tested ANN in order to obtain a prediction of the material parameters of the Lemaitre model for the case-hardened steel 16MnCrS5. The predicted parameters are shown in Table 4, and the corresponding comparison of the load–displacement curves of the experiment and the simulation with the predicted parameter set is shown in the right plot of Fig. 8. The results demonstrate that the predicted load–displacement curve is already close to the experimentally measured curve. After the subsequent parameter identification using only the load–displacement curve, referred to as F-strategy, the optimal solution and the experimental curve show an excellent match. Furthermore, the parameters predicted by the ANN are already very close to the final optimum obtained from the subsequent parameter identification scheme.

Table 4 Predicted material parameters of the 16MnCrS5 for the Lemaitre damage model using the optimal set of hyperparameters

In addition to the Lemaitre damage model, the previously mentioned GTN model shall be calibrated for the case-hardened steel. The general structure of the machine learning-based framework, the experimental data, and the BVP remain the same. However, a separate ANN has to be trained with a data set based on 10,000 parameter combinations within the defined parameter ranges (cf. Table 1) for the GTN model. In this case, 9005 simulations were completed successfully, and the optimal structure of the hidden layers was obtained as [120, 120, 120, 120, 120, 120, 120, 120, 120, 140, 140, 140, 140, 140, 140, 140, 140, 140, 180, 180, 180, 180, 180, 180, 180, 240, 240, 240, 240, 240, 240, 240, 320, 320, 320, 320, 320, 320, 320, 320, 320, 420, 420, 420, 420, 420, 420, 420, 420, 420].

Fig. 8 Prediction of the load–displacement curve for the GTN model (left) and the Lemaitre model (right). The dashed blue line represents the prediction based on the ANN-determined parameter set, while the solid line represents the prediction of the subsequent parameter identification based on the load–displacement curve only, the so-called F-strategy

The corresponding average parameter deviation of the test data set is shown in Table 5. It can be seen that the last two parameters reveal a relatively large deviation between the predicted and original parameters. Hence, it is crucial to evaluate the deviation of the predicted and original curve \(\delta _\textrm{AE}\) as well as to analyze whether the trained ANN provides a bad performance regarding these parameters or whether these parameters cannot be uniquely identified based on the considered loading path. As shown in Fig. 7, the ANN’s overall network performance is quite accurate; thus, the loading path sequence cannot uniquely identify all parameters. However, since only a suitable start set shall be generated, the network performance is more than sufficient.

In Table 6, the predicted parameters for the GTN model are compared to the finally obtained parameters based on the subsequent multi-objective parameter identification. The predicted parameters deviate only slightly from the finally obtained parameters. The corresponding load–displacement curves are shown on the left in Fig. 8. It is shown that the simulated load–displacement curve based on the predicted parameter set nearly perfectly matches the experimentally measured material behavior. Hence, even the prediction quality of the ANN with experimental data sets is very accurate. Nevertheless, the subsequent optimization using only the load–displacement curve (F-strategy) can slightly improve the overall fit. Thus, the machine learning-based framework is well suited for identifying proper starting values for optimization schemes of different constitutive models.

Table 5 Average deviation of the original and predicted parameters of the complete test data set for the optimal hyperparameter set in the case of the GTN damage model
Table 6 Predicted material parameters of the 16MnCrS5 for the GTN damage model based on the optimal set of hyperparameters

In terms of enhancing the efficiency of parameter identification, it is crucial to not only consider the predictive quality of the ANN but also how the machine learning-based start value prediction compares to the manual determination of initial start values. Generating training data by using FE simulations and ultimately training the ANN requires significant computational effort. With a computation time of approximately 1 minute per simulation on the Linux HPC cluster of TU Dortmund (LiDO3) using 2 cores, the data generation for the GTN model (9005 simulations) and the Lemaitre model (9315 simulations) each took almost one week provided that multiple simulations are not carried out in parallel. The training process itself takes about 20 min and depends on the respective hyperparameters. The prediction of the initial values takes only 1–2 s. However, the ANN only needs to be trained once for a material model and can then be used for all material classes that can be described by the material model. To assess whether the time invested in model creation is worthwhile in terms of efficiency enhancement, parameter identification using AI-predicted start values is compared to parameter identification using start values generated through full-factor and random sampling (Table 13). The efficiency is evaluated based on the number of iterations required for reaching convergence of classic optimization and the magnitude of the remaining error.
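The stated effort can be reproduced with a short back-of-the-envelope calculation. The sketch below uses only the approximate figures given above (1 minute per simulation, 20 min of training) and counts only the successfully completed simulations, so aborted runs would add to the total; the function and variable names are illustrative.

```python
MINUTES_PER_SIMULATION = 1.0   # approximate value from the text
TRAINING_MINUTES = 20.0        # approximate value from the text


def sequential_days(n_simulations, minutes_each=MINUTES_PER_SIMULATION):
    """Wall-clock days if all simulations run one after another."""
    return n_simulations * minutes_each / 60.0 / 24.0


for model, n_sim in {"GTN": 9005, "Lemaitre": 9315}.items():
    days = sequential_days(n_sim)
    total_hours = days * 24.0 + TRAINING_MINUTES / 60.0
    print(f"{model}: {days:.2f} days of simulation, "
          f"{total_hours:.1f} h including training")
```

Both data sets require roughly 6.3 to 6.5 days of sequential simulation time, so the training itself is negligible compared with the data generation.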

The convergence behavior highly depends on the choice of initial start values (Fig. 9a). The machine learning-based predicted start value leads to the fastest convergence with the smallest remaining error of \(f_F =\) 183.38. In contrast, 67% of the generated start values cause the termination of the FE simulation, either immediately or during optimization, whereas the remaining start values show slower convergence behavior with generally larger remaining errors (Fig. 9b). Although similar accuracy was achieved with a few parameter sets, more iterations were required. The machine learning-based start value prediction not only identifies the smallest minimum depending on pre-defined parameter ranges, but also directly provides start values that, depending on the quality of training, are already close to the minimum, thereby reducing the number of iterations compared to the use of "manually" determined start values. Furthermore, the ANN can directly predict appropriate start values for new materials, provided that the material behavior is represented by the material model considered in the training.

Fig. 9 Comparison of the convergence behavior during optimization of the Lemaitre model using a machine learning-based start value set and randomly selected start value sets. 46 initial parameter sets were selected based on full-factor and random sampling, see Table 13. Convergence was reached in 15 of 46 parameter sets, corresponding to a convergence rate of \(\approx \) 33%. (a) Convergence behavior of some selected start value sets. The start value sets are based on full-factor and random sampling. (b) Probability density \(\delta ^*\) of the final objective function value. The density of the ith bin is calculated using \(\delta ^*_i = f^*_i/(w^* \cdot n^*)\) with the frequency \(f^*_i\) as well as the width \(w^*\) of each bin and the total number of converged start sets \(n^* = 15\). The bin corresponding to the AI start value set is marked green. The solid line represents the normal distribution based on the mean value of 598.44 and the standard deviation of 192.70. (c) Efficiency assessment of each converged start value set using the final number of iterations and the corresponding objective function value. The bin corresponding to the AI start value set is marked green. (Color figure online)

Given the 46 optimizations conducted with different start value sets, it is evident that it is extremely challenging to find a set of initial values that converges and performs as well as the AI-determined start value set. The assessment of performance depends on the user-specific weighting of the two criteria "number of iterations" and "remaining error" (Fig. 9c). While, in general, small overall bars, meaning a low converged loss value reached with a low number of iterations, are desired, specifics can be case-dependent. If a high quality of the found solution, i.e., a small upper bar, is desired, this might require more time in the case that a better minimum is more difficult to find. By contrast, if fast convergence, i.e., a small lower bar, is desired, it might come at the cost of a mediocre local minimum, which is quickly reached.
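The convergence rate and the bin-wise probability density used in Fig. 9b can be computed as sketched below. Equal bin widths are assumed, the function name is illustrative, and the list of final objective values is purely hypothetical and does not reproduce the data of the study.

```python
def histogram_density(values, n_bins):
    """Density per bin, delta_i = f_i / (w * n), for equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        k = min(int((v - lo) / width), n_bins - 1)  # maximum goes into the last bin
        counts[k] += 1
    n = len(values)
    return [c / (width * n) for c in counts], width


n_total, n_converged = 46, 15
convergence_rate = n_converged / n_total
print(f"convergence rate = {convergence_rate:.1%}")

# Hypothetical final objective values of the 15 converged optimizations
final_errors = [200.0, 300.0, 400.0, 450.0, 500.0, 550.0, 600.0, 620.0,
                650.0, 680.0, 700.0, 750.0, 800.0, 850.0, 900.0]
density, width = histogram_density(final_errors, n_bins=5)
print(density)
```

Dividing by the bin width and the number of converged sets makes the bars integrate to one, which allows the direct comparison with the fitted normal distribution.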

3.5 Analysis of different parameter identification strategies

In the previous section, accurate starting values were obtained for a subsequent single-objective parameter identification based on load–displacement curves, referred to as F-strategy. In the following, multi-objective parameter identification strategies are performed with the pre-determined parameter values at hand, and the influence of different experimental data sets is analyzed. For this purpose, besides the forces, the contour data are used. The calibration based on the individual data sets is compared with the Pareto strategy, i.e., a strategy with several objective functions. The strategy that considers only the contour measurements in the objective function is called the r-strategy. The contour data are compared to the simulation at each time step. The evaluations of the different strategies are based on the dimensionless mean square error since this quantity represents the error measure in the parameter identification scheme.

Table 7 Predicted material parameters of the 16MnCrS5 for the GTN model based on the different parameter identification (PI) strategies
Table 8 Dimensionless error of different parameter identification strategies for the predicted material parameters of the 16MnCrS5 for the GTN model
Table 9 Predicted material parameters of the 16MnCrS5 for the Lemaitre model based on the different parameter identification (PI) strategies
Table 10 Dimensionless error of different parameter identification strategies for the predicted material parameters of the 16MnCrS5 for the Lemaitre model

The identified parameters and the corresponding errors are shown in Tables 7 and 8 for the GTN model and in Tables 9 and 10 for the Lemaitre model. The corresponding load–displacement curves for both material models are given in Fig. 10. At first, the predictions made by using the GTN model are analyzed. The r-strategy exhibits a significant error in predicting force (\(f_F = 285\times 10^4\)), whereas the F-strategy shows a worse performance in contour prediction compared to the other two strategies (\(f_r = 2.85\times 10^{-5}\)). For both experimental data sets, the Pareto strategy demonstrated the lowest overall errors (\(f_\text {Pareto}\) = 0.72). The maximum equivalent plastic strain \(\varepsilon _\text {eq}^\text {p}\) = 1.19 is largest for the F-strategy (Fig. 11). The other maxima deviate by approximately 15%. The predictions of the F- and Pareto strategies show a significantly more pronounced localization behavior than that of the r-strategy, where the area of maximum deformation extends across the entire width of the sample. This behavior is also reflected in the damage distribution (Fig. 11). The level of damage is highest in the prediction using the r-strategy and localization is greatest with the other two strategies.

Similar results can be observed for the Lemaitre model. The force prediction shows no significant differences to the experiments for the F- and Pareto strategy (\(f_F = 183.38\) and \(f_F = 194.10\)). In contrast, the results of the r-strategy differ significantly (\(f_F = 456\times 10^4\)). While the r-strategy predicts the contour best (\(f_r = 1.48 \times 10^{-5}\)), the F-strategy shows the largest error (\(f_r = 2.78 \times 10^{-5}\)). The Pareto strategy shows the lowest error overall and thus the lowest distance to the ideal solution (\(f_{\text {Pareto}} = 0.69\)). All simulations show similar localization behavior regarding the distribution of the equivalent plastic strain (Fig. 12). The region of maximum deformation spans almost the entire width of the specimen in all simulations. The levels of equivalent plastic strain show differences of approximately 10%. Despite the relatively homogeneous strain distribution, damage localization is very pronounced (Fig. 12) and shows significant differences of up to 71% for the maximum damage.

Fig. 10 Prediction of the load–displacement curve based on different parameter identification strategies for the GTN model (left) and the Lemaitre model (right)

Fig. 11 Prediction of field quantities for different parameter identification strategies using the GTN model

Fig. 12 Prediction of field quantities for different parameter identification strategies using the Lemaitre model

The aim is to predict the void evolution, as the damage affects the product performance. With a known distribution of damage, the safety reserves can be reduced and the component can be designed lighter. Therefore, the different parameter identification strategies are additionally analyzed concerning their prediction accuracy of the experimental void measurements during the tensile tests.

The monotonic decrease of damage with increasing distance from the fracture surface is correctly captured (Fig. 13). However, the relative error between simulation and experiment lies in the range of 300.0% to 2714.1%, so that the void area fraction is not reproduced satisfactorily in quantitative terms (Table 11). This is because the damage models directly link the void fraction to the softening of the mechanical properties. The predicted softening in the macroscopic load ranges from 9.0% to 12.6% for the F-strategy. In contrast, the measured void area fractions in the corresponding experiments are much lower (< 0.7%).

Fig. 13 Prediction of the void area fraction based on different parameter identification strategies for the GTN model (left) and Lemaitre model (right)

Table 11 Relative errors in predicting the void area fraction in the tensile test

3.6 Parameter identification based on void measurements

So far, it has been demonstrated that the models perform most accurately if they predict quantities used in the calibration process. This has been shown for integral data, such as forces and derived field quantities (contour measurements). Since the aim is to predict void fractions, a parameter identification using experimentally determined void area fractions, referred to as D-strategy, is performed in the following. To consider the void measurements of the tensile specimens within the model calibration, the void area fractions are evaluated along a path on the symmetry axis in the longitudinal direction. Simulation and experiment can be compared by using interpolation to the same grid points. The parameters of the nucleation term of the damage evolution of the GTN model can be linked to microstructural properties and directly determined from the experimental void measurements without the need for classic optimization-based parameter identification. Since the GTN model considers the void volume fraction as the damage quantity, it is assumed that the magnitude of the experimentally determined void area fraction is identical to that of the void volume fraction. After the damage-related parameters \(f_\text {n}\), \(S_\text {n}\) and \(\varepsilon _\text {n}\) have been directly determined from the nucleated voids, the parameters associated with plasticity, K, \(\varepsilon _0\) and n, can be identified by using the Pareto strategy.
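The comparison on common grid points can be sketched as follows. The simulated void fraction along the symmetry axis is linearly interpolated to the experimental measurement positions, and a mean relative error is evaluated; this error definition, the function names, and all numbers are assumptions for illustration and are not taken from ADAPT.

```python
def interp_linear(x, xp, fp):
    """Piecewise-linear interpolation; xp must be strictly increasing.
    Values outside the range are clamped to the end values."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    for k in range(len(xp) - 1):
        if xp[k] <= x <= xp[k + 1]:
            w = (x - xp[k]) / (xp[k + 1] - xp[k])
            return (1.0 - w) * fp[k] + w * fp[k + 1]


def mean_relative_error(z_exp, d_exp, z_sim, d_sim):
    """Mean of |sim - exp| / |exp| at the experimental positions."""
    d_sim_at_exp = [interp_linear(z, z_sim, d_sim) for z in z_exp]
    return sum(abs(s - e) / abs(e)
               for s, e in zip(d_sim_at_exp, d_exp)) / len(d_exp)


# Hypothetical void fractions in percent along the axis (positions in mm)
z_sim = [0.0, 1.0, 2.0, 3.0, 4.0]
d_sim = [0.5, 0.4, 0.3, 0.2, 0.1]
z_exp = [0.5, 2.5]
d_exp = [0.45, 0.2]
error = mean_relative_error(z_exp, d_exp, z_sim, d_sim)
print(f"mean relative error = {error:.1%}")
```

Interpolating the simulation onto the experimental positions, rather than the reverse, avoids inventing experimental values between sparse measurement points.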

In contrast to the GTN model, the Lemaitre model links the void fraction directly to the softening by the concept of effective stresses. Therefore, these two quantities have similar value ranges. However, since the measured void fractions (Fig. 4b) are much smaller than the void fractions predicted in the previous simulations (cf. Fig. 13), a simultaneous identification of the damage parameters based on load–displacement curves and void measurements would not be useful. Instead, a two-step calibration approach is chosen. In the first step, damage was not activated, and the plasticity parameters (K, \(\varepsilon _0\), and n) were calibrated based on homogeneous states of deformation (up to the onset of force drop). In the second step, these parameters were fixed, and the damage-related parameters (S, \(\delta \), and \(\beta \)) were calibrated based on inhomogeneous states of deformation. This method allowed for the successful identification of all parameters. For better comparison, the model parameters identified by using the D-strategy as well as the other strategies can be found in Tables 7 and 9 for the GTN and Lemaitre model, respectively.

The D-strategy predicts the experimental void measurements of the case-hardened steel well for both constitutive models (Fig. 14). The relative errors for the GTN and Lemaitre models are 15.95% and 37.38%, respectively (Table 11). It should be noted that the measured void area fractions are subject to fluctuations due to heterogeneities on the microscale. In an ideal data set, the deviations would be even smaller. The Pareto strategy reproduces the load–displacement curve better than the D-strategy with the Lemaitre model (Fig. 15). However, the difference is not significant. Similar results can be observed for the GTN model. Due to the low void area fraction of the D-strategy, the damage-induced softening is negligible. Accordingly, the possibilities are limited with respect to the representation of macroscopic data.

Fig. 14 Prediction of the void area fraction based on the parameter identification scheme using void area fraction within the optimization process (D-strategy) for the GTN model (left) and the Lemaitre model (right)

Fig. 15 Prediction of the load–displacement curve based on the parameter identification scheme using void area fraction within the optimization process (D-strategy) for the GTN model (left) and the Lemaitre model (right)

4 Prediction of void area fraction in forward rod extrusion

This section validates the proposed parameter identification strategies by means of void measurements on forward extruded components. First, the experimental setup, execution, and numerical modeling are briefly described. Afterward, the prediction of the void area fractions is presented.

4.1 Experimental setup

In forward rod extrusion, the workpiece made of case-hardened 16MnCrS5 steel is pressed through a die (Fig. 16a). The initial diameter of the workpiece and its length are \(d_0\) = 30 mm and \(l_0\) = 71 mm, respectively. The process was stopped at a shaft length of 80 mm, and the part was ejected from the die. An extrusion strain of \(\varepsilon \) = \(2\,\text {ln}(d_1/d_0)\) = \(-0.5\) was selected for the tests, resulting in a final diameter of \(d_1\) = 23.4 mm. The shoulder opening angle is 2\(\alpha \) = 90\(^\circ \), and the transition radii in the shoulder are set to \(r_m\) = 3 mm. A punch speed of 10 mm/s was used. The cylindrical workpieces were lubricated with Beruforge 191 (coating lubricant containing MoS\(_2\)) and pressed at room temperature. Experimental void measurements were carried out in the steady-state area of the extruded shaft along the radius. For this purpose, five measurement points were evenly distributed along the radius. The area of the measured voids was summed and divided by the nominal area of 1.28 \(\text {mm}^2\) to calculate the void area fraction (Fig. 16b).
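The relation between extrusion strain and final diameter, as well as the evaluation of the void area fraction, can be checked with a few lines of Python. The function names and the individual void areas are hypothetical; the diameter, the extrusion strain, and the nominal area are the values stated above.

```python
import math


def final_diameter(d0, extrusion_strain):
    """Invert eps = 2 ln(d1/d0) for the shaft diameter d1."""
    return d0 * math.exp(extrusion_strain / 2.0)


def void_area_fraction(void_areas_mm2, nominal_area_mm2=1.28):
    """Summed void area divided by the nominal measurement area."""
    return sum(void_areas_mm2) / nominal_area_mm2


d1 = final_diameter(30.0, -0.5)
area_reduction = 1.0 - (d1 / 30.0) ** 2
print(f"d1 = {d1:.1f} mm, area reduction = {area_reduction:.1%}")

# Hypothetical void areas in mm^2 found at one measurement point
fraction = void_area_fraction([1.0e-4, 2.0e-4])
print(f"void area fraction = {fraction:.4%}")
```

The computed diameter of about 23.4 mm agrees with the value given for the tests.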

Fig. 16 (a) Sketch of forward rod extrusion. (b) Void area fraction along the radius of the extruded shaft after forward rod extrusion

The methods described in Section 3 were used for all measurements. The largest void evolution can be observed on the central axis. The corresponding void area fraction is approximately 0.0275%. As the radius increases, the void area fraction decreases and falls below the initial value of \(D_0\) = 0.0066%, implying void closure. At the outer edge of the component, the void area fraction then slightly increases again to 0.0083%. This observation aligns with the pattern of local stress triaxiality, which reaches its maximum at the center and decreases toward the edge.

4.2 Finite element model

The forward rod extrusion process is simulated with an axisymmetric 2D model with four-node quadrilateral elements, reduced integration, and extended hourglass control (CAX4R). The workpiece is homogeneously discretized with 20000 elements of an element edge length of 0.2 mm in the initial state. The die is modeled as an elastic body with the same Young’s modulus as the workpiece of E = 210 GPa. By comparing the punch forces in the experiment and simulation, the coefficient of friction was determined to be m = 0.04 [30]. The process is assumed to be isothermal. A void-free material in the initial state was assumed for the following numerical investigations.

4.3 Void evolution in forward rod extrusion

For all strategies, the void area fraction predicted by the GTN model lies below the initial void area fraction (Fig. 17). In the GTN model, void growth is defined through the trace of the plastic deformation (cf. eq. (15)) and thus depends on the hydrostatic stress. Because of the high hydrostatic pressures during extrusion, this term leads to void closure instead of void growth across the extruded cross section, contrary to the experimental observations. The relative errors compared to the experiment range from 99.5% to 110.4% (Table 12). However, these values are not meaningful, because the simulations rest on an incorrect mechanism and predict only void closure, not growth.
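
The sign dependence of the growth term can be illustrated with a small numerical sketch. It assumes that eq. (15) takes the standard GTN form, in which the growth rate of the void volume fraction \(f\) equals \((1-f)\) times the trace of the plastic strain rate. The explicit update, the clipping to the interval [0, 1], and the strain increments are illustrative assumptions and do not reproduce the FE implementation.

```python
def integrate_void_growth(f0, vol_plastic_strain_increments):
    """Explicit update f <- f + (1 - f) * d_eps_vol for each increment.

    d_eps_vol is the increment of the trace of the plastic strain
    (assumed standard GTN growth law). The result is clipped to [0, 1].
    """
    f = f0
    history = [f]
    for d_eps_vol in vol_plastic_strain_increments:
        f = f + (1.0 - f) * d_eps_vol
        f = min(max(f, 0.0), 1.0)  # a void fraction cannot leave [0, 1]
        history.append(f)
    return history


# Initial value 0.0066 % from the measurements, expressed as a fraction
f_initial = 6.6e-5

# Hypothetical increments: plastic compaction (negative trace) vs. dilatation
closure = integrate_void_growth(f_initial, [-1.0e-5] * 5)
growth = integrate_void_growth(f_initial, [1.0e-5] * 5)
```

Under compressive volumetric plastic strain, the void fraction in `closure` decreases monotonically, whereas the same increments with positive sign in `growth` increase it. This mirrors the void closure predicted under the high hydrostatic pressures of extrusion.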

The Lemaitre model cannot capture void closure. Therefore, experimentally determined values below the initial void area fraction of the as-received material are set to the initial void area fraction. The macroscopic strategies for the Lemaitre model predict the void evolution qualitatively (Fig. 18). The void fraction is largest on the central axis of the extruded billet and decreases in the radial direction. It is then zero over a certain range and increases again toward the outer surface. Although the qualitative distribution of damage is predicted well, the magnitudes differ strongly, leading to errors ranging from 5420.5% to 12854.6% (Table 12).

The prediction of the Lemaitre model with the D-strategy agrees well with the experiment in qualitative terms (Fig. 19). Significant void growth on the central axis and a flattening with increasing distance from the axis can be observed. Void growth toward the outer surface is moderately overestimated, whereas void growth near the central axis is underestimated in magnitude. The error is 84.0%, which is significantly lower (2413.0% to 15817.4%) than for the macroscopic strategies.

The remaining error of the D-strategy can be attributed to errors in the underlying void measurements, i.e., deviations in sample preparation and measurement position, and to differences between the stress states in the notched tensile test and in forward rod extrusion. On the center line of the tensile specimen, the averaged triaxiality lies approximately between \(\eta \) = 0.2 and \(\eta \) = 1.0, whereas in forward rod extrusion it lies between \(\eta \) = 0.0 and \(\eta \) = 1.5. A possible improvement would be to introduce calibration tests with shear- and compression-dominated loading states.
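
The treatment of measurements below the initial void area fraction, followed by an error evaluation, can be sketched as follows. The error measure used here (mean absolute relative deviation in percent) is an assumption for illustration. The definition behind Table 12 is not restated in this section and may differ. Of the sample values, only the center value, the edge value, and \(D_0\) are taken from the text. The three interior values are placeholders.

```python
def clamp_to_initial(measured_percent, d0_percent):
    """Replace measurements below the initial void area fraction by D_0."""
    return [max(value, d0_percent) for value in measured_percent]


def mean_relative_error_percent(simulated, reference):
    """Mean absolute relative deviation in percent (assumed error measure)."""
    if len(simulated) != len(reference):
        raise ValueError("simulated and reference data must have equal length")
    deviations = [abs(s - r) / r for s, r in zip(simulated, reference)]
    return 100.0 * sum(deviations) / len(deviations)


D0 = 0.0066  # initial void area fraction in percent (from the text)

# Center (0.0275) and edge (0.0083) values are from the text;
# the three interior values are placeholders, not measured data.
measured = [0.0275, 0.0100, 0.0050, 0.0060, 0.0083]
reference = clamp_to_initial(measured, D0)

# Simple check of the error measure with round numbers
check_error = mean_relative_error_percent([2.0, 2.0], [1.0, 4.0])
```

After clamping, no reference value lies below \(D_0\), so a model that cannot represent void closure is not penalized for the closure observed in the experiment.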

Table 12 Relative errors in predicting void area fraction during forward rod extrusion
Fig. 17 Prediction of the void area fraction along the radius of the forward extruded shaft based on different parameter identification strategies for the GTN model

Fig. 18 Prediction of the void area fraction along the radius of the forward extruded shaft based on different parameter identification strategies for the Lemaitre model

Fig. 19 Prediction of the void area fraction along the radius of the forward extruded shaft based on the D-strategy for the Lemaitre model

5 Conclusion and outlook

In this study, we developed an optimization framework that combines a machine learning approach for start value prediction with subsequent multi-objective parameter identification in order to calibrate damage models efficiently and to predict damage accurately.

As model complexity increases, the number of model parameters to be identified also rises, which makes the determination of appropriate start values for optimization-based parameter identification challenging. An ANN was therefore employed to determine suitable start values. While the ANN training in the previous contribution of Schulte et al. [3] was limited to data from homogeneous loading states, our approach directly uses the boundary value problem of a notched round tensile test of the case-hardening steel 16MnCrS5 to generate training data. Although this requires a higher computational effort, the effort is acceptable because the ANN has to be trained only once per constitutive model to provide start values for different materials. The ANN predicted accurate start values, which already yield a close approximation of the experimental load–displacement curve of the BVP. The key benefit of this approach is that the user does not need a profound understanding of the model formulation to determine suitable start values.

With the subsequent parameter identification, the prediction of the load–displacement curve was further improved. However, the sole use of integral data in the optimization is insufficient for the prediction of damage [2]. Nevertheless, most optimization approaches in the literature are still limited to integral data such as forces and to field data such as displacement and strain fields. Furthermore, field data are only considered when flat specimens are involved. Hence, the deformation of bulk specimens and micromechanical mechanisms cannot be captured sufficiently, and damage in the sense of void fraction cannot be predicted. To overcome these shortcomings, we incorporated contour data derived from field quantities and high-resolution void measurements into the multi-objective parameter identification.

Different types of input data, including load–displacement curves, contour data, and void measurements, were evaluated. It was shown that the quantities used for calibration were, on average, predicted most accurately. Relying solely on load–displacement curves resulted in inaccurate predictions of the contour data derived from displacement fields. Conversely, the exclusive use of contour data yields unrealistically high forces. The Pareto strategy permits the simultaneous use of different data sets without subjective weighting by the user. With this strategy, the different macroscopic data sets are predicted better on average than with the other approaches. However, calibration based purely on macroscopic data sets leads to a substantial overestimation or underestimation of damage in terms of void area fraction. To address this, a novel method was developed that incorporates experimental void measurements into the model calibration by using the locally determined void area fractions for the parameter identification. This required an initial calibration of the plasticity parameters, followed by an adjustment of the damage evolution parameters. This step-by-step approach is crucial, as the plasticity parameters influence the flow curve and inherently control the void evolution via the yield stress. The strategy based on void measurements (D-strategy) reproduced the experimental void data considerably better than the strategies based on macroscopic data.

The calibrated models were validated against void measurements on forward rod extruded parts. The GTN model was not able to predict the void evolution during forward rod extrusion, either qualitatively or quantitatively. The high hydrostatic pressures during the process caused universal void closure in the simulation, whereas the experimental data primarily showed void growth. In contrast to the GTN model, the Lemaitre model reasonably reproduced the trends in void evolution during forward rod extrusion. While the parameter identification strategies based on macroscopic data considerably overestimate the void area fractions, the D-strategy provides more reasonable results with a relative error of 84.0% compared to the experiment. Despite the substantial improvement in damage prediction with the D-strategy, modeling assumptions, such as the relationship between the void area fraction and the mechanical softening given by the concept of effective stresses, lead to unavoidable deviations between simulation and experiment. Moreover, the load conditions of forward rod extrusion are not fully represented by a calibration based on the notched round tensile test. This suggests that a more process-related calibration test would enhance the ductile damage prediction.