1 Introduction

Material laws describe the mechanical response relating stress and stretch. They play a key role in solving boundary-value problems in mechanics. The material law determines whether the solution of a boundary-value problem can match or explain experimental observations, and whether it can be applied under complex loading modes or in extreme environments. The prevailing approach in the past has been to calibrate the material law against experimentally observed data. In this paper, we replace this phenomenological component of the boundary-value problem by a data-driven model.

Recently, Kirchdoerfer and Ortiz [1,2,3], Conti et al. [4], Leygue et al. [5] and Chinesta et al. [6] have proposed a new paradigm that bypasses the empirical fitting of the material law and formulates the calculation directly from experimental and/or computational data, for elastic and viscoelastic materials and for loading ranging from quasi-static to dynamic. In this strategy, the discrepancy between the experimental data and the predicted response is minimized through optimization under the constraint in the phase space of both stress and stretch. For multi-dimensional problems involving rotation of the material, the convergence is usually slow because the rotation of the material plays an important role. Another issue is that many materials have microstructures spanning several orders of magnitude. For example, two-dimensional materials such as graphene are nowadays often mixed into soft materials such as silicone rubber to increase both toughness and strength. The data obtained from uniaxial tension and/or compression are not enough to capture the material behavior in general stress states.

Another approach, parallel to that of Ortiz and collaborators, is to use supervised learning to train material laws. This approach dates back to the 1990s and can be categorized into two types. In the first type, the material model is known a priori but the parameters involved in the model need to be identified. This parameter identification can be carried out by solving an optimization or constrained optimization problem that minimizes an objective function, which is usually defined as a metric measuring the discrepancy between the benchmark (usually experimental data) and the predictions (usually numerical simulations) [7,8,9]. Machine learning can be used to accelerate the identification process and is widely employed; we name only a few examples here. Al-Haik et al. [10] developed a model based on an artificial neural network (ANN) to predict the stress relaxation of polymer matrix composites. Zopf and Kaliske [11] coupled the neural network with a microsphere model, which can take the microstructure of polymer chains into account. The purely elastic response and the inelastic material behavior are obtained via a Recurrent Neural Network (RNN). In the second type, nothing is known about the material model a priori. Ghaboussi and Sidarta [12] first employed a feed-forward artificial neural network to train the material model on their experimental data. The stress increment is trained by Nested Artificial Neural Networks (NANN), with the strain increment and the state variables of the previous steps as input. Their work is limited to the small deformation regime. Training for nonlinear elastic–plastic materials has also been proposed, with the tangent modulus derived at each time step [13]. The authors claim that the derived tangent modulus is independent of the specific material response. However, for finite deformation, the objectivity of the material laws and the loading–unloading of the material are not considered in their training.

The complexity of microstructured solids has inspired data-driven models that link information across multiple scales via offline training [14,15,16]. A training approach incorporating microstructural data and direct numerical simulations (DNS) with the representative volume element (RVE) has been proposed, aiming at a unified data-driven framework for the design and modeling of materials and structures. A self-consistent clustering analysis (SCA) method has also been proposed to reduce computational costs and avoid the curse of dimensionality in the offline training [17,18,19,20,21,22]. SCA is an efficient tool for concurrent analysis of materials with multiscale structures. Wang and Sun [23] have extended the ANN training strategy from the single-physics solid mechanics problem to the hydro-mechanical coupling problem of geological materials. These data-driven approaches can accelerate the engineering design process. Recently, such approaches have been further developed for real-time topology optimization [24].

In practice, it is hard to generate stress-stretch data under arbitrary deformation modes/paths to construct material laws. However, it is relatively easy to obtain experimental or numerically generated data for materials under principal deformation modes (uniaxial tension, biaxial loading, triaxial loading, etc.). Experiments in the principal space are consistent with the spirit of principal component analysis (PCA) in data analysis. PCA is a widely used technique that converts a set of possibly correlated data into a set of linearly uncorrelated data, called principal components [25]. Further, the generated data are usually trained by a neural network as a black box, so that building a material law becomes a pure data-to-data process. It is then questionable whether the requirements for material laws established in continuum mechanics, such as objectivity, are preserved.

Although it is possible to generate the data through physical experiments in principal space, we demonstrate data generation by numerical experiments with the help of the representative volume element (RVE) approach. The principal stretch/stress is imposed on an RVE and the principal stress/stretch response is generated. In this paper, we resort to established mechanics theory to build the material law for finite-deformation, nonlinear, isotropic elastic materials based on the data generated by the RVE. The paper is organized as follows. In Sect. 2, two widely used hyperelastic material models, the neo-Hookean model [26] and the Arruda–Boyce model [27], are employed to generate data in the principal stretch-stress space through the RVE, based on homogenization theory. In Sect. 3, we describe the mechanics-theory-based supervised machine learning techniques used to train the material law. The details of the training procedure, the selection of the machine learning method and related techniques are given in “Appendix B”. Section 4 gives the computational results obtained when the trained material laws are used to simulate three-dimensional structures under different loading modes, with and without microstructure. Finally, a short conclusion is drawn in Sect. 5.

2 Data generation

Heterogeneous materials usually consist of different phases. An RVE is often used to build the material law for such heterogeneous materials. That is, the RVE is used to define the relationship between the imposed homogeneous deformation gradient \(\bar{\mathbf {F}}\) and the homogenized second Piola-Kirchhoff (PK) stress \(\bar{\mathbf {S}}\). The data generated by RVE computation can then be used to train the material law. Because the RVE is composed of different phases, a constitutive response must first be given for each phase in the RVE computations. Readers who are familiar with data generation with RVEs can skip this section.

2.1 Homogenization of RVE in principal space

Fig. 1 a A material point X, with or without microstructure, in the deformable body moves from the initial configuration to the current configuration. An RVE (Representative Volume Element) can be thought of as being attached to each material point at the macroscale. b Two RVEs, one without microstructure and one with a void, used to generate the stress-strain data in principal space for training the material laws

An RVE can be thought of as being attached to each material point (X, Y, Z) at the macroscale (Fig. 1a) in the reference configuration. Let \(\Omega _{0}\) be the region occupied by an RVE, consisting of single or multiple phases, in an unstressed reference configuration with bounding surface \(\partial \Omega _{0}\) (Fig. 1b). The RVE is associated with a Cartesian coordinate system with orthogonal frame \(\left\{ \mathbf {e}_{1},\mathbf {e}_{2},\mathbf {e}_{3}\right\} \), whose members are the base vectors in the x, y and z directions respectively. Only the widely used cuboid RVE is considered in this work, with lengths \(L_{x}\), \(L_{y}\) and \(L_{z}\) in the x, y and z directions respectively. The detailed derivation of homogenization on the RVE is given in “Appendix A”.

In terms of the displacement vector \(\varvec{u}\), the boundary condition on the RVE can be rewritten as

$$\begin{aligned} \varvec{u}=\left( \lambda _{1}-1\right) X\mathbf {e}_{1}+\left( \lambda _{2}-1\right) Y\mathbf {e}_{2}+\left( \lambda _{3}-1\right) Z\mathbf {e}_{3} \end{aligned}$$
(1)

It can be seen that \(\left( \lambda _{1},\lambda _{2},\lambda _{3}\right) \) are the principal stretches which can be computed based on the imposed displacement on RVE. Correspondingly, the homogenized second PK stress can be computed by

$$\begin{aligned} \bar{\mathbf {S}}=S_{1}\mathbf {e}_{1}\otimes \mathbf {e}_{1}+S_{2}\mathbf {e} _{2}\otimes \mathbf {e}_{2}+S_{3}\mathbf {e}_{3}\otimes \mathbf {e}_{3} \end{aligned}$$
(2)

with

$$\begin{aligned} S_{1}= & {} \frac{1}{\lambda _{1}L_{y}L_{z}}\int _{0}^{L_y}\int _{0}^{L_z}t_{1}dYdZ, \qquad \end{aligned}$$
(3)
$$\begin{aligned} S_{2}= & {} \frac{1}{\lambda _{2}L_{x}L_{z}}\int _{0}^{L_x}\int _{0}^{L_z}t_{2}dXdZ, \end{aligned}$$
(4)
$$\begin{aligned} S_{3}= & {} \frac{1}{\lambda _{3}L_{x}L_{y}}\int _{0}^{L_x}\int _{0}^{L_y}t_{3}dXdY \end{aligned}$$
(5)

Here \(t_{1}\), \(t_{2}\) and \(t_{3}\) are the traction forces on the outer boundary of the RVE, which can be computed numerically from the nodal reaction forces. Many studies have shown that a proportional ratio of Cauchy stresses can be realized even under displacement-controlled loading [28,29,30,31].
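As an illustration of Eqs. (1) and (3)–(5), the following minimal Python sketch evaluates the imposed boundary displacement and the homogenized principal second PK stresses. It assumes that each traction integral over a reference face has already been reduced to a single summed reaction force; the function names and the input `face_forces` are hypothetical and are not taken from the original implementation.

```python
def boundary_displacement(point, stretches):
    """Eq. (1): displacement imposed at a boundary point (X, Y, Z)."""
    return tuple((lam - 1.0) * coord for lam, coord in zip(stretches, point))


def homogenized_pk2(face_forces, stretches, lengths):
    """Eqs. (3)-(5), with each traction integral replaced by a summed force.

    face_forces: assumed sums of nodal reaction forces (F1, F2, F3) on the
                 faces normal to x, y and z (hypothetical input).
    stretches:   principal stretches (lambda_1, lambda_2, lambda_3).
    lengths:     reference edge lengths (L_x, L_y, L_z) of the cuboid RVE.
    """
    f1, f2, f3 = face_forces
    lam1, lam2, lam3 = stretches
    lx, ly, lz = lengths
    return (f1 / (lam1 * ly * lz),
            f2 / (lam2 * lx * lz),
            f3 / (lam3 * lx * ly))
```

For a unit cube stretched only in the x direction, for example, `homogenized_pk2((2.0, 0.0, 0.0), (2.0, 1.0, 1.0), (1.0, 1.0, 1.0))` divides the face force by the stretch and the reference face area.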

2.2 Data of stress-strain in principal space by RVE

Based on the homogenization theory, stress-strain data in principal space can be generated. Two RVEs are used in the present work (Fig. 1b): one without any microstructure and the other with a void at the center. The RVE without microstructure makes it easy to compare with the existing models and thereby verify the accuracy and effectiveness of the proposed method. The material law for the RVE with the void is not known a priori, so it helps to further test the predictive capability of the proposed method.

Table 1 14 and 7 sets of loading paths with constant ratios between Cauchy stress components to generate the data of principal components of the second PK stress and principal stretch components

The reference material models adopted in this work are the neo-Hookean model for nonlinear elastic material, with \(\mu = 2\) and \(D_{m}=0.1\), and the Arruda–Boyce model, with parameters \(\mu =2\), \(D_{m}=0.1\) and \(\lambda _m = 7\). The elastic deformation energies of the neo-Hookean model and the Arruda–Boyce model are

$$\begin{aligned} W=\frac{\mu }{2}(\bar{I}_{1}-3)+\frac{1}{D_{m}}(J-1)^{2} \end{aligned}$$

and

$$\begin{aligned} W= & {} \mu \left\{ \frac{1}{2}(\bar{I}_{1}-3)+\frac{1}{20\lambda _{m}^{2}}(\bar{I} _{1}^{2}-9)+\frac{11}{1050\lambda _{m}^{4}}(\bar{I}_{1}^{3}-27) \right. \\&\left. +\frac{19}{7000\lambda _{m}^{6}}(\bar{I}_{1}^{4}-81)+\frac{519}{ 673750\lambda _{m}^{8}}(\bar{I}_{1}^{5}-243)\right\} \\&+\frac{1}{D_{m}}\left( \frac{J^{2}-1}{2}-\ln J\right) \end{aligned}$$

respectively, where \(\bar{I}_{1}=J^{-\frac{2}{3}}(\lambda _{1}^2+\lambda _{2}^2+\lambda _{3}^2)\), \(J=\det (\mathbf {F})\) and \(\mathbf {F}\) is the deformation gradient. The stress can be derived from the deformation energy based on classical continuum mechanics [32].
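The two energy functions can be sketched in Python as follows, written in terms of \(a_{i}=\lambda _{i}^{2}\) and using the parameter values quoted above. The principal second PK stresses are obtained here from the standard hyperelastic relation \(S_{i}=2\,\partial W/\partial a_{i}\), evaluated by central finite differences; this numerical differentiation and the function names are assumptions made for illustration only, not the procedure used with the RVE.

```python
import math

MU, D_M, LAMBDA_M = 2.0, 0.1, 7.0  # parameter values quoted in the text


def invariants(a):
    """Return (I1_bar, J) for a = (a1, a2, a3) with a_i = lambda_i**2."""
    J = math.sqrt(a[0] * a[1] * a[2])
    return J ** (-2.0 / 3.0) * sum(a), J


def w_neo_hookean(a, mu=MU, d_m=D_M):
    """Neo-Hookean deformation energy."""
    i1, J = invariants(a)
    return 0.5 * mu * (i1 - 3.0) + (J - 1.0) ** 2 / d_m


def w_arruda_boyce(a, mu=MU, d_m=D_M, lam_m=LAMBDA_M):
    """Arruda-Boyce deformation energy (five-term series given in the text)."""
    i1, J = invariants(a)
    coeffs = (1 / 2, 1 / 20, 11 / 1050, 19 / 7000, 519 / 673750)
    chain = sum(c / lam_m ** (2 * n) * (i1 ** (n + 1) - 3.0 ** (n + 1))
                for n, c in enumerate(coeffs))
    return mu * chain + ((J * J - 1.0) / 2.0 - math.log(J)) / d_m


def principal_pk2(w, a, h=1e-6):
    """S_i = 2 dW/da_i by central differences (illustrative assumption)."""
    stress = []
    for i in range(3):
        plus, minus = list(a), list(a)
        plus[i] += h
        minus[i] -= h
        stress.append(2.0 * (w(plus) - w(minus)) / (2.0 * h))
    return stress
```

Both energies vanish in the undeformed state \(a_{1}=a_{2}=a_{3}=1\), and so do the stresses returned by `principal_pk2`, which gives a quick consistency check of the expressions.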

Different proportional loading paths are designed to generate the data by RVE (see Table 1). A loading path is defined by the controlling parameter R and the ratios \(\bar{\Sigma }_{i}/R, \left( i=1\cdots 3\right) \), where \(\bar{\Sigma }_{i}\) are the principal Cauchy stresses (16). The history data of the principal components of the second PK stress \(\left( S_{1},S_{2},S_{3}\right) \) and of the principal stretch components \(\left( a_{1},a_{2},a_{3}\right) \) on each loading path are generated by increasing the controlling parameter R from 0 to \(1.5\mu \), where \(a_{i}=\lambda _{i}^{2}\left( i=1\cdots 3\right) \) and \(\mu \) is the shear modulus. \(N_{L}\) loading paths are taken in the principal stress space, and each loading history is divided evenly into \(N_{s}\) time steps. \(N_{s}\) is set to 500 and \(N_{L}\) is set to 7 or 14, see Table 1. The data \(\{a_{1},a_{2},a_{3}\}^{[\alpha ,\beta ]}\) and \(\{S_{1},S_{2},S_{3}\}^{[\alpha ,\beta ]}\) are generated through Eqs. (1) and (3)–(5) respectively, where \(\alpha \) denotes the loading path and \(\beta \) the loading step. It should be noted that the loading paths should be distributed as evenly as possible in the stress space. The 14 loading cases in Table 1 may not be optimal, but in our numerical experience they can train an isotropic material model effectively.

Fig. 2 Visualization of the data generation of \(\{a_{1},a_{2},a_{3}\}\) and \(\{S_{1},S_{2},S_{3}\}\) through RVE for a loading path ID 3; b loading path ID 7; c loading path ID 14 in Table 1. Yellow represents the original configuration of the RVE and red represents the configuration of the RVE at the time step when the controlling loading parameter \(R=1\). (Color figure online)

Fig. 3 The generated data of \(\{a_{1},a_{2},a_{3}\}\) and \(\{S_{1},S_{2},S_{3}\}\) versus the controlling loading parameter R through RVE for a loading path ID 3; b loading path ID 7; c loading path ID 14 in Table 1. 500 time steps are adopted for data sampling. For display purposes, only the sampled data at a few time steps are shown

Figure 2 visualizes how the data \(\left( a_{1},a_{2},a_{3}\right) \) and \(\left( S_{1},S_{2},S_{3}\right) \) for loading paths ID 3, 7 and 14 in Table 1 are obtained at different time steps through RVE computation according to Eqs. (1) and (3)–(5). Figure 3 shows the generated data of \(\left( a_{1},a_{2},a_{3}\right) \) and \(\left( S_{1},S_{2},S_{3}\right) \) versus the controlling loading parameter R for loading paths ID 3, 7 and 14 respectively.

3 Data training and on-line computation

3.1 Data training in principal space by neural networks

Through RVE computation, the data of \(\{a_{1},a_{2},a_{3}\}^{[\alpha ,\beta ]}\) and \(\{S_{1},S_{2},S_{3}\}^{[\alpha ,\beta ]}\) in the principal space can be generated. Here \(\alpha \) represents the loading path (\(\alpha =1\cdots N_{L}\)) and \(\beta \) the time step of loading history (\(\beta =1\cdots N_{s}\)). The data of \(\{a_{1},a_{2},a_{3}\}^{[\alpha ,\beta ]}\) and \(\{S_{1},S_{2},S_{3}\}^{[\alpha ,\beta ]}\) can be stored in three-dimensional arrays \(a[\alpha ,\beta ,k]\) and \(S[\alpha ,\beta ,k]\). The dimension of both arrays is \(N_{L}\times N_{s}\times 3\).

These data are used to train a data-driven model between \(\{a_{1},a_{2},a_{3}\}\) and \(\{S_{1},S_{2},S_{3}\}\) by a neural network. It should be noted that the generated data on each loading path are highly correlated, whereas data on different loading paths are uncorrelated. Because the 14 loading paths are evenly distributed in the stress space, the data can meet the i.i.d. assumption for Artificial Neural Network (ANN) training.

Various types of neural networks have been proposed in the past years. We adopt a standard multi-layer ANN to train the data, which is shown in Fig. 4. This neural network includes an input layer (\(c^{1}\)), three hidden layers (\(c^{2}\), \(c^{3}\) and \(c^{4}\)) and an output layer (\(c^{5}\)). The input layer, each of the three hidden layers and the output layer have \(N_{input}\), \(N_{hidden}\) and \(N_{output}\) neurons respectively. For the neural network shown in Fig. 4, \(N_{input}\), \(N_{hidden}\) and \(N_{output}\) are 3, 6 and 3 respectively. \(W_{ij}^{n+1}\) are the weights for the link between the \(i\hbox {th}\) neuron on layer \(c^{n}\) and the \(j\hbox {th}\) neuron on layer \(c^{n+1}\). \(b_{j}^{n+1}\) are the biases on the \(j\hbox {th}\) neuron on layer \(c^{n+1}\). Here, the superscript n represents the layer number (\(n=1\cdots N\)). N is the total number of layers of the ANN (the input layer is excluded in this definition and \(N=4\) in the present work).

Fig. 4 Artificial neural network used to train the stress-stretch data in principal space, with 3 inputs, 3 hidden layers with the same number of neurons, and 3 outputs. The weights \(\mathbf {W^{n}}\) and biases \(\mathbf {b^{n}}\) and their components \(W_{ij}^{n}\) and \(b_{j}^n\) (\(n=2\cdots 5\)) are marked. After training, the principal stress can be obtained for an arbitrary input principal stretch

Let \(S^{N}_{\alpha ,\beta ,k}\left( k=1\cdots 3\right) \) denote the output principal stress of the ANN for the input principal stretch \(a[\alpha ,\beta ,k]\). If the ANN has three layers (one input layer \(c^{1}\), one hidden layer \(c^{2}\) and one output layer \(c^{5}\)), the principal stress predicted by the ANN can be written as:

$$\begin{aligned} S^{N=2}_{\alpha ,\beta ,k}=\tanh \left( a[\alpha ,\beta ,m]W_{mi}^{2}+b_{i}^{2}\right) W_{ik}^{5}+b_{k}^{5} \end{aligned}$$
(6)

in which \(m=1\cdots N_{input}\), \(i=1\cdots N_{hidden}\) and \(k=1\cdots N_{output}\), and summation over the dummy indices is implied. \(\tanh \) is the hyperbolic tangent function. Similarly, for the ANN with five layers (\(c^{1}\cdots c^{5}\)) used in the present work, the principal stress predicted by the ANN can be written as:

$$\begin{aligned} S^{N=4}_{\alpha ,\beta ,k}= & {} \tanh \left( \tanh \left( \tanh \left( a[\alpha ,\beta ,m]W_{mo}^{2}+b_{o}^{2}\right) W_{oi}^{3}\right. \right. \nonumber \\&\left. \left. +\,b_{i}^{3}\right) W_{ip}^{4}+b_{p}^{4}\right) W_{pk}^{5}+b_{k}^{5} \end{aligned}$$
(7)

where \(m=1\cdots N_{input}\), \(o,i,p=1\cdots N_{hidden}\) and \(k=1\cdots N_{output}\), and summation over the dummy indices is again implied.
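The nested form of Eqs. (6) and (7) can be read as a forward pass through the network. The following Python sketch is a minimal illustration under the assumption that each weight matrix is stored as a nested list `W[m][i]` (input index first); the function names are hypothetical, and the code covers any number of hidden layers so that Eq. (6) and Eq. (7) are both special cases.

```python
import math


def dense(x, W, b):
    """Affine map x_m W_mi + b_i, with W stored as W[m][i]."""
    return [sum(x[m] * W[m][i] for m in range(len(x))) + b[i]
            for i in range(len(b))]


def ann_stress(a, weights, biases):
    """Principal stresses predicted for principal stretches a = (a1, a2, a3).

    weights, biases: lists ordered from the first hidden layer to the output
    layer, e.g. [W2, W3, W4, W5] and [b2, b3, b4, b5] for Eq. (7).
    Hidden layers use tanh; the output layer is linear.
    """
    h = list(a)
    for W, b in zip(weights[:-1], biases[:-1]):
        h = [math.tanh(v) for v in dense(h, W, b)]
    return dense(h, weights[-1], biases[-1])
```

With a single hidden layer, i.e. `weights = [W2, W5]` and `biases = [b2, b5]`, the function reduces to Eq. (6).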

The training of the data tries to minimize the distance between the predicted points \(S^{N=4}_{\alpha ,\beta ,k}\) and the generated data points \(S[\alpha ,\beta ,k]\) on all the loading paths and history, which can be written as

$$\begin{aligned} {\mathop {\hbox {argmin}}\limits _{W_{ij}^{2},b_{j}^{2},\cdots ,W_{ij}^{5},b_{j}^{5}}} \sum _{k=1}^{3}\sum _{\alpha =1}^{N_{L}}\sum _{\beta =1}^{N_{s}}\left( S^{N=4}_{\alpha ,\beta ,k}- S[\alpha ,\beta ,k]\right) ^{2} \end{aligned}$$
(8)

by optimizing the weights \(W_{ij}^{n}\) and biases \(b_{j}^{n}\) (\(n=2 \cdots 5\)).
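For illustration, the objective of Eq. (8) can be evaluated as in the following Python sketch. It assumes that the data are stored as nested sequences indexed by loading path, time step and component, and that `predict` is any callable mapping \((a_{1},a_{2},a_{3})\) to three predicted stresses; both names are hypothetical. The minimization itself (carried out with MATLAB's nftool in this work) is not reproduced here.

```python
def sum_squared_error(predict, a_data, s_data):
    """Objective of Eq. (8) for data stored as [alpha][beta][k]."""
    total = 0.0
    for a_path, s_path in zip(a_data, s_data):      # loading paths, alpha
        for a, s in zip(a_path, s_path):            # time steps, beta
            pred = predict(a)
            total += sum((pred[k] - s[k]) ** 2 for k in range(3))
    return total
```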

The whole algorithm for training the material law is shown in Table 2. The Neural Fitting Toolbox (nftool) of MATLAB is used for training. The derivation details about the general ANN are given in “Appendix B”.

Table 2 The algorithm for training the material law (Offline learning)

3.2 On-line computation with the trained material law

At any material point, the right Cauchy-Green tensor can be expressed in the principal space:

$$\begin{aligned} \bar{\mathbf {C}}=\text { }a_{i}\mathbf {N}_{i}\otimes \mathbf {N}_{i} \end{aligned}$$
(9)

Here \(a_{i}\) \(\left( i=1\cdots 3\right) \) are the eigenvalues of \( \bar{\mathbf {C}}\) and \(\mathbf {N}_{i}\) \(\left( i=1\cdots 3\right) \) are the corresponding eigenvectors. The second PK stress can be described in a similar way:

$$\begin{aligned} \bar{\mathbf {S}}=S_{i}\mathbf {N}_{i}\otimes \mathbf {N}_{i} \end{aligned}$$
(10)

where \(S_{i}\) \(\left( i=1\cdots 3\right) \) are the eigenvalues. Defining the second order tensor \(\mathbf {A}_{i}=\mathbf {N}_{i}\otimes \mathbf {N}_{i}\) with no summation on i, the tangent modulus is given by the following:

$$\begin{aligned} \mathbf {C}^{M}= & {} \mathbf {2}\frac{\partial \bar{\mathbf {S}}}{\partial \bar{\mathbf {C}}}=\underset{i=1}{\overset{3}{\sum }}\underset{j=1}{\overset{3}{ \sum }}2\frac{\partial S_{i}}{\partial a_{j}}\frac{\partial a_{j}}{\partial \bar{\mathbf {C}}}\mathbf {N}_{i}\otimes \mathbf {N}_{i}+\underset{i=1}{\overset{3}{ \sum }}2S_{i}\frac{\partial \mathbf {N}_{i}\otimes \mathbf {N}_{i}}{\partial \bar{\mathbf {C}}} \nonumber \\= & {} \underset{i=1}{\overset{3}{\sum }}\underset{j=1}{\overset{3}{\sum }}2 \frac{\partial S_{i}}{\partial a_{j}}\mathbf {A}_{i}\otimes \mathbf {A}_{j} \nonumber \\&+2\sum _{i\ne j,i\ne k}^{3}S_{i}\left( \frac{\mathbf {A}_{i} \mathbf {A}_{j}^T+\mathbf {A}_{j} \mathbf {A}_{i}^T}{a_{i}-a_{j}}+\frac{ \mathbf {A}_{i} \mathbf {A}_{k}^T+\mathbf {A}_{k} \mathbf {A}_{i}^T }{a_{i}-a_{k}}\right) \nonumber \\ \end{aligned}$$
(11)

where \(\otimes \) is the dyadic product of vectors. For the derivation of \(\left( \frac{ \partial \mathbf {N}_{i}\otimes \mathbf {N}_{i}}{\partial \bar{\mathbf {C}}}\right) \), we refer to Rosati and Valoroso [33] and Tang et al. [34]. When \(a_i\) approaches \(a_j\), Eq. (11) appears to become singular; in that case Eq. (11) should be evaluated in the sense of a limit. The tangent modulus is composed of two terms: one corresponds to the derivatives of the principal stresses with respect to the principal stretches, and the other corresponds to the spin of the principal axes.
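The spectral representations of Eqs. (9) and (10) can be illustrated with a small Python sketch that assembles a symmetric tensor from its principal values and principal directions. The function names are hypothetical, and the principal directions are assumed to be given as orthonormal vectors.

```python
def outer(u, v):
    """Dyadic product u (x) v as a 3x3 nested list."""
    return [[ui * vj for vj in v] for ui in u]


def spectral_tensor(values, directions):
    """Sum of values[i] * N_i (x) N_i, as in Eqs. (9) and (10)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for val, n in zip(values, directions):
        A = outer(n, n)                      # A_i = N_i (x) N_i
        for r in range(3):
            for c in range(3):
                tensor[r][c] += val * A[r][c]
    return tensor
```

Because the trained network only supplies the principal values \(S_{i}\), a rotation of the principal directions \(\mathbf {N}_{i}\) rotates the assembled stress tensor without changing the \(S_{i}\). For example, rotating the frame by 90 degrees about the z axis simply exchanges the first two diagonal entries.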

After the data training by ANN, \(\left( S_{m},m=1\cdots 3\right) \) are given by the implicit function:

$$\begin{aligned} \left[ S_1,S_2,S_3\right] =f(a_1,a_2,a_3;{\mathbf{W,b}}) \end{aligned}$$
(12)

which are used to update the stress in on-line computations (ref. Eq. 23). Here it should be noted that \((a_1,a_2,a_3)\) are arbitrary in on-line computation and (W,b) are known from the ANN training of the data. In practice, swapping the stretch components \(a_{i}\) and \(a_{j}\) may not lead to the corresponding swap of the stress components \(S_{i}\) and \(S_{j}\) computed by the neural network (a detailed discussion is given in “Appendix B”). Therefore, the stress \(\left( S_{i},S_{j},S_{k}\right) \) is computed for every permutation of \(\left( a_{i},a_{j},a_{k}\right) \); that is, Eq. (12) is evaluated six times, once for each permutation of \((a_1,a_2,a_3)\).

$$\begin{aligned} \left[ S_1^1,S_2^1,S_3^1\right]= & {} f(a_1,a_2,a_3;{\mathbf{W,b}})\\ \left[ S_2^2,S_3^2,S_1^2\right]= & {} f(a_2,a_3,a_1;{\mathbf{W,b}})\\ \left[ S_3^3,S_1^3,S_2^3\right]= & {} f(a_3,a_1,a_2;{\mathbf{W,b}})\\ \left[ S_1^4,S_3^4,S_2^4\right]= & {} f(a_1,a_3,a_2;{\mathbf{W,b}})\\ \left[ S_3^5,S_2^5,S_1^5\right]= & {} f(a_3,a_2,a_1;{\mathbf{W,b}})\\ \left[ S_2^6,S_1^6,S_3^6\right]= & {} f(a_2,a_1,a_3;{\mathbf{W,b}}) \end{aligned}$$

Then the results of the six computations are summed and averaged to obtain the final \(S_{i}\) and \(\frac{\partial S_i}{\partial a_j}\) (ref. Eq. 24):

$$\begin{aligned}&S_i=\frac{1}{6}\sum \limits _{m=1}^6 S_i^m \nonumber \\&\frac{\partial S_i}{\partial a_j}=\frac{1}{6}\sum \limits _{m=1}^6\left( \frac{\partial S_i}{\partial a_j}\right) ^m \end{aligned}$$
(13)
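The permutation averaging of the stress in Eq. (13) can be sketched in Python as follows. Here `f` stands for any callable that plays the role of the trained network in Eq. (12); the function name and calling convention are assumptions for illustration. The same averaging would apply to the derivatives \(\partial S_i/\partial a_j\), which are omitted here for brevity.

```python
from itertools import permutations


def symmetrized_stress(f, a):
    """Average f over the six permutations of a = (a1, a2, a3), Eq. (13).

    For a permutation perm, f receives (a[perm[0]], a[perm[1]], a[perm[2]]);
    its output in slot s is the stress conjugate to a[perm[s]], so it is
    accumulated into component perm[s] of the result.
    """
    stress = [0.0, 0.0, 0.0]
    for perm in permutations(range(3)):
        out = f([a[p] for p in perm])
        for slot, p in enumerate(perm):
            stress[p] += out[slot] / 6.0
    return stress
```

By construction, exchanging two stretch components in the input exchanges the corresponding components of the averaged stress, even if `f` itself does not have this property.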

Then \(\mathbf {S}\) and \(\mathbf {C^{M}}\) can be computed based on the above results. The whole algorithm for applying the trained material law to simulate the deformation of structures is shown in Table 3.

Remarks

  1. In the above, we only show how to derive the tangent modulus \(\mathbf {C^{M}}\). According to Belytschko et al. [32], the push-forward of \(\mathbf {C^{M}}\) gives the tangent modulus based on the Truesdell rate:

    $$\begin{aligned} C_{ijkl}^{T}=\frac{1}{J}C_{mnpq}^{M}F_{im}F_{jn}F_{kp}F_{lq} \end{aligned}$$

    The tangent modulus based on the Jaumann rate can be obtained through a further transformation:

    $$\begin{aligned} C_{ijkl}^{J}=C_{ijkl}^{T}+\frac{1}{2}\left( \delta _{ik}\sigma _{jl}+\delta _{il}\sigma _{jk}+\delta _{jk}\sigma _{il}+\delta _{jl}\sigma _{ik}\right) -\delta _{ij}\sigma _{kl} \end{aligned}$$
  2. The proposed approach can greatly reduce the computational cost of data training. In previous works such as Hashash et al. [13], the histories of all the components of stretch are used in the training. Wang and Sun [23] train their model with the history of the principal strain and the history of the incremental rotation; the incremental rotation is introduced to resolve the issue of objectivity of the material laws. It can be seen that our derivation remains within the classical framework: only the mathematical form of the material law in continuum mechanics theory is replaced by the trained data. The approach can therefore preserve the objectivity of the material law approximately. This will be discussed next in the section on numerical examples.
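The push-forward in the first remark, \(C_{ijkl}^{T}=\frac{1}{J}C_{mnpq}^{M}F_{im}F_{jn}F_{kp}F_{lq}\), can be written out directly as in the following Python sketch. The fourth-order tensors are assumed to be stored as nested lists indexed `[i][j][k][l]`, and the function names are hypothetical; the explicit loops favor readability over speed.

```python
from itertools import product


def det3(F):
    """Determinant of a 3x3 matrix stored as nested lists."""
    return (F[0][0] * (F[1][1] * F[2][2] - F[1][2] * F[2][1])
            - F[0][1] * (F[1][0] * F[2][2] - F[1][2] * F[2][0])
            + F[0][2] * (F[1][0] * F[2][1] - F[1][1] * F[2][0]))


def push_forward(C_M, F):
    """Truesdell-rate modulus C^T from the material modulus C^M."""
    idx = range(3)
    J = det3(F)
    C_T = [[[[0.0 for _ in idx] for _ in idx] for _ in idx] for _ in idx]
    for i, j, k, l in product(idx, repeat=4):
        total = 0.0
        for m, n, p, q in product(idx, repeat=4):
            total += C_M[m][n][p][q] * F[i][m] * F[j][n] * F[k][p] * F[l][q]
        C_T[i][j][k][l] = total / J
    return C_T
```

For \(\mathbf {F}=\mathbf {I}\) the two moduli coincide, which provides a simple check of the index ordering.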

Table 3 The algorithm for applying the trained material law to simulate the deformation of structures (Online computation)

4 Numerical examples

The numerical algorithms for data training and for online finite element computation are shown in Tables 2 and 3 respectively. The predictions of the ANN trained model are compared with those of the reference neo-Hookean or Arruda–Boyce model. In all our examples, we use consistent units of measurement: the unit of length is mm; force is N; bending moment is \(\hbox {N}\cdot \hbox {mm}\); stress, pressure and modulus are MPa.

Fig. 5 The finite element analysis of a rectangular plate with a circular hole. a The geometric model and boundary conditions of the voided plate. b The load-displacement curves for the voided plate, predicted by the ANN trained model and the neo-Hookean model. The contour plots of effective stress by c the ANN trained model and d the reference neo-Hookean model at the same imposed displacement of 10. The FEM model includes 2348 nodes and 1093 elements

4.1 The material law by RVE without microstructure

Fig. 6 Comparison of the differences between the results obtained from the ANN trained model and the reference neo-Hookean model. Frequency histograms of the mechanical states versus the relative difference between the two models on a the effective stress and b the maximum logarithmic strain. The mean (E) and variance (\(\sigma ^2\)) of the difference are marked in the figure

We first show the results for the material law trained by the RVE without microstructure. Because the RVE has no microstructure, its material response should be the same as that of the material model chosen for the RVE analysis.

A rectangular plate with a circular hole of radius 10 at the center is first investigated under tensile loading and plane stress conditions. The geometric setup is shown in Fig. 5a. The mesh is refined around the hole. A displacement of 10 is applied on the right edge and the left edge is fixed in the x direction. Both the neo-Hookean model and the ANN trained model are employed (the ANN model is trained with the data generated from 14 loading paths and 500 load steps). Figure 5b shows the load-displacement curves for both models. Very good agreement between the predictions of the two models is observed. Figure 5c, d plots the contours of effective stress predicted by the two models; the difference between them is hard to distinguish. Figure 6a, b shows statistical data comparing the predictions of both models. Figure 6a counts the frequency of the states versus the relative difference between the two models on the effective stress. The average relative difference between the two models is around \(0.76\%\). Figure 6b counts the frequency of the states versus the relative difference between the two models on the maximum logarithmic principal strain \(E_{N}^{\max }\). The average relative difference between the two models is around \(0.20\%\).
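Statistics of the kind reported in Fig. 6 could be computed as in the following Python sketch. The text does not state the exact definition of the relative difference, so the pointwise form \(|p-r|/|r|\) (with r the reference value and p the ANN prediction) and the population variance used here are assumptions, as are the function and argument names.

```python
def relative_difference_stats(reference, predicted):
    """Pointwise relative differences with their mean and variance.

    Assumed definition: |predicted - reference| / |reference| at each
    sampled state; the variance is the population variance.
    """
    rel = [abs(p - r) / abs(r) for r, p in zip(reference, predicted)]
    mean = sum(rel) / len(rel)
    variance = sum((x - mean) ** 2 for x in rel) / len(rel)
    return rel, mean, variance
```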

Fig. 7 The finite element analysis of a cuboid bar under imposed torsion. a The geometric model and boundary conditions. b The contour plots of effective stress for the ANN trained model and the reference Arruda–Boyce model. c Frequency histogram of the mechanical states versus the relative difference between the two models on the effective stress and the maximum logarithmic principal strain. The mean (E) and variance (\(\sigma ^2\)) of the difference are marked in the figure. The FEM model includes 4352 nodes and 3283 elements

We then use the ANN trained model to predict the mechanical response in a three-dimensional problem in which a torque (\(M=500\)) is imposed on one end surface of a cuboid beam and the beam is fixed at the other end. The torque is applied at a reference point, which is coupled with the end surface. The geometric setup is given in Fig. 7a. Figure 7b plots the contours of effective stress for the ANN trained model and the reference Arruda–Boyce model. As in the two-dimensional problem, there is only a tiny difference between the two models. This tiny difference can be identified through the statistical analysis of the relative difference in effective stress and maximum logarithmic principal strain. The average errors in effective stress and maximum logarithmic strain are around \(0.12\%\) and \(1.27\%\) respectively. It should be emphasized that large rotation of the body occurs in this example. The agreement between the ANN trained model and the reference Arruda–Boyce model implies the objectivity of the ANN trained model.

Fig. 8 The finite element analysis of a cuboid bar with voids under combined torsion and bending. a The geometric model and boundary conditions. b The load-displacement curves predicted by the ANN trained model and the neo-Hookean model. c The contour plots of effective stress predicted by the ANN trained model (14 and 7 data-sets) and Arruda–Boyce model. The FEM model includes 4880 nodes and 3711 elements

Fig. 9 Frequency histogram of the mechanical states versus the relative difference on the effective stress and the maximum logarithmic strain with the ANN trained model by a 14 data-sets and b 7 data-sets. The mean (E) and variance (\(\sigma ^2\)) of the difference are marked in the figure

A more complicated example is studied next. A combined force (\(F=4.5\)) and torque (\(M=500\)) is imposed on one end surface of a cuboid beam, which is fixed at the other end. The geometrical setup is shown in Fig. 8a. A through-hole with radius 5 is introduced at the center of the specimen. The material law is trained by the ANN using 7 or 14 data-sets generated with the neo-Hookean model. Figure 8c plots the contours of effective stress for the reference neo-Hookean model and the ANN trained model with 7 or 14 data-sets. It can be seen from Fig. 8b that there is still no distinct difference between the ANN trained model and the reference neo-Hookean model, even under the combined complicated loading conditions (see the statistical data in Fig. 9). It should be noted that the ANN model trained with 7 data-sets gives almost the same results as that trained with 14 data-sets. It appears that the material law trained with 7 data-sets can cover the full range of stress states, which can greatly reduce the offline training costs.

Fig. 10 Finite element buckling analysis of a cuboid beam with holes. a The geometry and boundary conditions. b The relative difference in buckling force and buckling mode (morphology) between the reference neo-Hookean model and the ANN trained model, versus the order of the buckling mode. The FEM model includes 4880 nodes and 3711 elements

Fig. 11 The buckling modes (morphology) of the cuboid beam through buckling analysis predicted by the neo-Hookean model and the ANN trained model. The first 5 modes are shown for comparison

Buckling of soft solids has recently attracted a lot of research attention [34,35,36,37]. It has created many new opportunities to design materials with complex microstructures that realize specific functions. Here we show an example of buckling analysis of a structure with microstructure using the ANN trained model. The geometrical setup of the problem is shown in Fig. 10a. The specimen is a cuboid with a through-hole of radius 5 at the center, under compressive loading. As predicted by stability theory (Timoshenko [38]), the cuboid loses stability when the applied compressive load exceeds the critical load. Buckling analysis is used to predict the critical load and the mode (morphology). To measure the difference between two modes (morphologies), an L2-norm is defined through \(u_i^j\), the displacement of node i in degree of freedom j:

$$\begin{aligned} Norm=\sqrt{\sum _{i=1}^{N_{node}}\sum _{j=1}^{3} (u_{i}^{j})^2} \end{aligned}$$

where \(N_{node}\) is the total number of nodes in the finite element model. The relative difference between the ANN trained model and the reference neo-Hookean model for different buckling modes is shown in Fig. 10b. The relative difference is less than \(1\%\) for the first four modes and increases with the mode number. However, high-order modes are rarely considered in engineering applications. The first to fifth buckling modes are shown in Fig. 11, where the agreement between the ANN trained model and the reference neo-Hookean model is clearly visible.
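The norm above can be turned into a relative mode difference along the following lines. This is only a sketch under stated assumptions: the text does not spell out how the two modes are normalized before comparison, so here both modes are scaled to unit norm and sign-aligned (an eigenvector is defined only up to a nonzero factor), and the difference is the norm of their mismatch. The function names are hypothetical.

```python
import numpy as np

def mode_norm(u):
    """L2-norm over all nodes i and the three degrees of freedom j."""
    u = np.asarray(u, dtype=float)  # shape (N_node, 3)
    return np.sqrt(np.sum(u ** 2))

def relative_mode_difference(u_ann, u_ref):
    """Relative difference of two mode shapes (assumed definition).

    Both modes are scaled to unit norm and sign-aligned before the
    norm of their difference is taken.
    """
    a = np.asarray(u_ann, dtype=float)
    r = np.asarray(u_ref, dtype=float)
    a = a / mode_norm(a)
    r = r / mode_norm(r)
    if np.sum(a * r) < 0.0:  # flip the sign if the modes point in opposite directions
        a = -a
    return mode_norm(a - r)  # r has unit norm, so no further division is needed

# Two-node example: the same shape with a different scale and sign.
u_ref = np.array([[1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0]])
u_scaled = -3.0 * u_ref
```

With this definition, `relative_mode_difference(u_scaled, u_ref)` vanishes up to rounding because scaling and sign do not change the mode shape, while two mutually orthogonal modes give \(\sqrt{2}\).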

4.2 The material law by RVE with void

Fig. 12

The finite element analysis on a cuboid beam under the imposed force. The beam contains 40 spherical voids, which are evenly distributed. a The geometric model and boundary conditions. b The FEM model for direct numerical simulations (DNS) with 50,702 nodes and 35,621 elements. c The FEM model with the voids smeared out (the material is described by the ANN trained material law). It involves 40 elements

In the previous examples, we showed the capability of the proposed method with the material law trained on an RVE without microstructure. One can expect that a material model trained on such an RVE should be almost the same as the adopted neo-Hookean or Arruda–Boyce model. However, the material model for a neo-Hookean solid containing a void is not known a priori. We therefore use the ANN trained material law based on data generated by the RVE with a void to show the predictive capability.

Let us consider a three-dimensional problem in which one end surface of a cuboid beam is tied to a reference point through a coupling constraint and the other end is fixed. A force (\(F=0.3\)) is imposed on the reference point. The geometric setup is given in Fig. 12a. The beam contains 40 evenly distributed spherical voids, each of the same size as the void in the RVE. We first solve the problem by direct numerical simulation (DNS) with the neo-Hookean model. Then the spherical voids are smeared out and the material is described by the ANN trained model. The DNS involves 50,702 nodes and 35,621 elements to resolve all the voids in the beam (Fig. 12b). The FEM mesh with the voids smeared out has only 40 elements, as shown in Fig. 12c.

Fig. 13

The deformed configuration predicted by the DNS and the FEM model with voids smeared out in which the ANN trained material law is used to describe the material behavior

Fig. 14

The force-displacement curve predicted by the DNS and the FEM model with voids smeared out in which the ANN trained material law is employed to describe the material behavior

Figure 13 plots the deformed configurations of the DNS and the ANN trained model in the same coordinates at the final step of the imposed loading. The displacement scale is 1. It can be seen clearly that the deformed shape predicted by the DNS is the same as that predicted by the FEM model with the ANN trained isotropic material law, even under large bending deformation. Figure 14 shows the force versus displacement curve of the reference point for both the DNS and the FEM model with the ANN trained material law. The results predicted by both models are almost the same; the largest difference is less than \(1\%\). We also compare the residual norms of both models during the iteration process for time step 1, as shown in Table 4. As expected, the classical neo-Hookean model converges faster than the ANN trained model. However, the ANN trained model reaches the same accuracy with two more iterations. Because the consistent tangent modulus is derived, second-order convergence with the ANN trained material law is implied. Because far fewer elements are used (40 versus 35,621 in the DNS), the ANN trained model shows some advantages. It should be noted that this example, in which the RVE contains a single centered void, is slightly orthotropic. However, the isotropic assumption is widely used to approximate slightly anisotropic behavior. The multiple voids and their even distribution in the beam make the porous material nearly isotropic, which is why the model trained for isotropic materials compares very well with the DNS. The spirit of the proposed method can be extended to anisotropic nonlinear elastic solids, but further work is needed.
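Whether a sequence of residual norms is consistent with second-order convergence can be checked with a simple estimate. The sketch below assumes the standard asymptotic model \(r_{k+1}\approx C\,r_k^{p}\), from which the order \(p\) follows from three consecutive residuals. The function name is hypothetical, and the residual values are synthetic, chosen only to illustrate the estimate; they are not taken from Table 4.

```python
import math

def convergence_orders(residuals):
    """Estimate the order p in r_{k+1} ~ C * r_k**p from consecutive residual norms."""
    orders = []
    for k in range(1, len(residuals) - 1):
        num = math.log(residuals[k + 1] / residuals[k])
        den = math.log(residuals[k] / residuals[k - 1])
        orders.append(num / den)
    return orders

# Synthetic residuals (NOT from Table 4): each one is the square of the previous one.
quadratic = [1e-1, 1e-2, 1e-4, 1e-8]
# Synthetic residuals that are halved in every iteration (first-order convergence).
linear = [1.0, 0.5, 0.25, 0.125]
```

For the synthetic quadratic sequence every estimated order is close to 2, and for the linear one close to 1. Applying the same estimate to measured residual norms of a Newton iteration indicates whether the tangent modulus is consistent with the stress update.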

Table 4 Comparison of residual norm for the ANN trained model and the reference neo-Hookean model at time step 1 (time increment \(5\%\) strain) for the voided beam problem shown in Fig. 12

Finally, the mesh information for all the above examples is summarized in Fig. 15.

Fig. 15

Summary of the mesh information for all the examples shown in the present paper. The panels correspond to a Fig. 5; b Fig. 7; c Figs. 8 and 10; d Fig. 12 (cross-section view)

5 Conclusions

With the advent of big data science and machine learning, it is possible to obtain the material law through a data-driven approach. In this work, we have presented an efficient data-driven computational framework to build the material law for nonlinear isotropic elastic materials based on the principal component expansion. A carefully designed RVE based on principal stretches is used to generate the stress-stretch data, which greatly reduces the training cost of the material law and yields a high-quality model. Our framework approximately satisfies requirements on the material law such as objectivity. With the consistent tangent modulus derived from the data in principal space, second-order convergence is implied. The proposed approach can be used naturally within the multiscale computational homogenization framework. It provides a way to obtain the material laws involved at different scales effectively from data alone.