1 Introduction

Image-based modeling and simulation of polycrystalline microstructures, using crystal plasticity finite element or CPFE models, are effective methods for determining microstructure-property relationships. CPFE models capture details of microstructural features, e.g. crystallographic orientations, misorientations, grain morphology and their distributions, and provide a platform for understanding various deformation and failure mechanisms such as the nucleation and propagation of twins and micro-cracks [1–7]. Image-based CPFE modeling commonly involves computer generation of virtual polycrystalline microstructures from experimental data, followed by discretization into finite element meshes. The polycrystalline microstructures of many metals and alloys are quite complex, with sharp and tortuous grain boundaries and multiple grain junctions. Discretization of these domains is best accomplished using three-dimensional four-node tetrahedral or TET4 elements, which conform to the complex geometries [8]. However, it has been commonly observed, e.g. in [9–13], that TET4 elements suffer from severe volumetric locking when simulating deformation of incompressible or nearly incompressible materials. A metric used to understand element performance for incompressible or nearly incompressible deformations is the incompressibility constraint ratio. It is defined as the ratio of the number of available degrees of freedom (DOF) to the number of incompressibility constraints in a finite element mesh. The low incompressibility constraint ratio associated with TET4 elements can lead to large spurious hydrostatic stresses in models of plastically deforming metallic materials. This volumetric locking phenomenon is commonly ignored by CPFE modelers, who have focused largely on the development of constitutive laws. This paper aims at developing stable, locking-free TET4 element formulations for efficient and accurate crystal plasticity finite element modeling and simulations.

A variety of methods has been proposed for the stabilization and control of volumetric locking in TET4 elements. A major idea in these methods is to associate nodal points with patches corresponding to an assembly of surrounding sub-elements, and subsequently to integrate the weak form over these patches, thus reducing the incompressibility constraint ratio. An average nodal pressure technique has been proposed for dynamic explicit formulations in [14], where the volumetric strain energy is integrated over the patch for each node. In [10], a node-based uniform strain (NUS) formulation is introduced for four-node tetrahedral elements associated with linear elasticity problems. The volumetric and deviatoric strain energy components are integrated over nodal patches in this formulation. Spurious zero energy modes were identified with this approach in [15], and consequently an additional stabilization term with a modified constitutive law was added to the potential energy functional. This approach was further extended in [16] into a locally integrated weighted strain formulation, where numerical integration is done at local Gauss points instead of nodes. In [11], the fact that instability is linked only to the isochoric strain energy contribution was exploited through a stress splitting operation, to stabilize the formulation in [10]. A generalized node-based, smoothed finite element method (NS-FEM) has been proposed in [17] that adopts an arbitrary polygonal element domain discretization. This method provides an upper-bound solution for the strain energy and is shown to reduce to the formulation in [10] for the special case of linear tetrahedral elements. The strain smoothing operation in NS-FEM is later extended to edge-based smoothed finite element method (ES-FEM) [18, 19] and face-based smoothed finite element method (FS-FEM) [20]. 
The above methods are, however, not suitable for anisotropic crystal plasticity finite element formulations, since the stress or the elasto-plastic tangent stiffness tensor cannot be split into volumetric and deviatoric components. An element formulation with an F-bar patch method has been introduced in [12, 13] to overcome volumetric locking in TET4 elements for finite deformation problems. The original F-bar formulation in [21] was developed for four-node quadrilateral and eight-node hexahedral elements. This simple and effective model can be used for any constitutive law and is easily implemented in any standard displacement-based finite element code. Other competing strategies for developing locking-free linear tetrahedral elements include stabilizing the NUS formulation with an additional higher order support function [22], and mixed enhanced elements [9], in which additional augmentation strain fields are used in conjunction with a linearly interpolated pressure field to treat the incompressibility constraints. Volume and area bubble functions have been added to mixed tetrahedral elements in [23, 24] to stabilize the displacement and strain fields.

The present paper develops and examines three locking-free stabilized finite element formulations in the context of crystal plasticity finite element or CPFE analysis. They include a node-based uniform strain (NUS) element, a locally integrated B-bar (LIB) based element and an F-bar patch (FP) based element. The locally integrated B-bar element is based on splitting of the gradient operator matrix \(\mathbf {B}\) for TET4 elements. It selectively reduces the volumetric strain over a nodal patch and keeps the deviatoric strain unchanged in each TET4 element. The paper compares results obtained with the different methods and provides a guideline for conducting reliable CPFE analysis. The CPFE formulation in an updated Lagrangian framework is briefly reviewed in Sect. 2. In Sect. 3, the three locking-free finite element formulations are described, while their implementation for large deformation CPFE problems is detailed in Sect. 4. Results from patch tests, elastic bending problems, and bicrystal and polycrystal CPFE simulations are compared in Sect. 5. The computational efficiency of these formulations is compared in Sect. 6 and concluding remarks are made in Sect. 7.

2 Finite deformation crystal plasticity FE formulation

The finite element weak form of the equilibrium equations for a body undergoing finite deformation is obtained by taking the product of the governing equations with a weighting function and integrating over the volume in the current or reference configuration. In an incremental formulation and solution process, where a typical time step spans the discrete temporal points t and \(t+\triangle t\), the principle of virtual work for a body undergoing a quasi-static process and occupying the domain \(\varOmega ^{t+\triangle t}\subset \mathcal {R}^{3}\) at time \(t+\triangle t\) is written as [25]:

$$\begin{aligned}&\int _{\varOmega ^{t+\triangle t}}(\nabla \delta \mathbf {u}^{t+\triangle t})\!:\!\varvec{\sigma } \, d\varOmega ^{t+\triangle t}\!=\!\int _{\varOmega ^{t+\triangle t}}\delta \mathbf {u}^{t+\triangle t}\cdot \mathbf {b}\, d\varOmega ^{t+\triangle t} \nonumber \\&\quad +\int _{\varGamma ^{t+\triangle t}_{\sigma }}\delta \mathbf {u}^{t+\triangle t}\cdot \bar{\mathbf {t}}\, d\varGamma ^{t+\triangle t}_{\sigma } \qquad \forall \delta \mathbf u ^{t+\triangle t}\in \varvec{\mathscr {U}} \end{aligned}$$
(1)

where \(\varvec{\sigma }\) is the Cauchy stress tensor, \(\mathbf u\) is the displacement field and \(\mathbf {b}\) is the body force per unit volume. The test function \(\delta \mathbf{u}=\delta u_{i}\mathbf {e}_{i}\) is defined in the space \(\mathscr {U}\) of virtual displacements, i.e.

$$\begin{aligned} \mathscr {U}=\left\{ \delta u^{t+\triangle t}_{i}\mathbf {e}_{i}\in H^{1}({\varOmega }),\;\delta \mathbf {u}^{t+\triangle t}=\mathbf {0}\;\text {on}\;{\varGamma }_{u}\right\} \end{aligned}$$
(2)

where \(\mathbf e _i, \; i=1,2,3\) are the orthogonal unit basis vectors. The time dependent boundary conditions are:

$$\begin{aligned} \varvec{\sigma } \cdot \mathbf n =\bar{\mathbf{t }} ~~~\mathrm{on}~~ \varGamma _{\sigma } ~~~~~~ \text {and} ~~~~~~ \mathbf u =\bar{\mathbf{u }} ~~~\mathrm{on}~~ \varGamma _{u} \end{aligned}$$
(3)

Here \(\bar{\mathbf{t }}\) and \(\bar{\mathbf{u }}\) are time-dependent prescribed quantities on the traction boundary \(\varGamma _{\sigma }\) and displacement boundary \(\varGamma _{u}\) respectively, where \(\varGamma =\varGamma _{\sigma } \bigcup \varGamma _{u}\), and \(\mathbf n\) represents the outward unit vector normal to \(\varGamma _{\sigma }\). An updated Lagrangian formulation is developed in this work [25], where the reference configuration for integrating the weak form corresponds to that at the beginning of the time step, i.e. at time t. In this formulation, the weak form in equation (1) reduces to:

$$\begin{aligned} \int _{\varOmega ^t} \delta \mathbf {E}^{t+\triangle t}_{t}: \mathbf {S}^{t+\triangle t}_{t} \, d\,\varOmega ^{t}=\, R^{\mathrm{ext}\,^{t+\triangle t}} \end{aligned}$$
(4)

where

$$\begin{aligned}&\mathbf {u}^{t+\triangle t}=\mathbf {u}^{t}+ \triangle \mathbf {u} \nonumber \\&\quad \text {Incremental displacement} \end{aligned}$$
(5a)
$$\begin{aligned}&\mathbf{E}_t^{t+\triangle t}=\frac{1}{2}\left( \frac{\partial \triangle \mathbf {u}}{\partial \mathbf{x}^t}+\left( \frac{\partial \triangle \mathbf {u}}{\partial \mathbf{x}^t} \right) ^T+\left( \frac{\partial \triangle \mathbf {u}}{\partial \mathbf{x}^t}\right) ^T \frac{\partial \triangle \mathbf {u}}{\partial \mathbf{x}^t}\right) \nonumber \\&\quad \text {Green-Lagrange strain tensor} \end{aligned}$$
(5b)
$$\begin{aligned}&\mathbf{S}_t^{t+\triangle t} = J_t^{t+\triangle t} \left( \mathbf{F}_t^{t+\triangle t}\right) ^{-1} \varvec{\sigma }^{t+\triangle t} \left( \mathbf{F}_t^{t+\triangle t}\right) ^{-T}\nonumber \\&\quad \text {Second Piola--Kirchhoff stress} \end{aligned}$$
(5c)
$$\begin{aligned}&R^{\mathrm{ext}\,^{t+\triangle t}}=\int _{\varOmega ^{t+\triangle t}} \delta \mathbf{u}^{t+\triangle t} \cdot \mathbf{b}~ d \varOmega ^{t+\triangle t}\nonumber \\&\quad +\int _{\varGamma ^{t+\triangle t}_{\sigma }} \delta \mathbf{u}^{t+\triangle t} \cdot \bar{\mathbf{t}}~d\varGamma ^{t+\triangle t}_{\sigma }\nonumber \\&\quad \text {External virtual work} \end{aligned}$$
(5d)

\(\mathbf{F}\) corresponds to the deformation gradient tensor and J is its determinant or Jacobian. All quantities in equations (5) are at time \(t+\triangle t\) and referred to the configuration at time t. Equation (4) may be written in an incremental form as:

$$\begin{aligned}&\int _{\varOmega ^t} \delta \triangle \mathbf {E}\,:\triangle \mathbf {S}d\varOmega ^{t}+\int _{\varOmega ^{t}}\delta \varvec{\eta }\,:\varvec{\sigma }^{t} d\, \varOmega ^{t}=R^{\mathrm{ext}\,^{t+\triangle t}}\nonumber \\&\quad -\int _{\varOmega ^{t}}\delta \mathbf {e}: \varvec{\sigma }^{t} d\,\varOmega ^{t} \end{aligned}$$
(6)

In the above equation, \(\triangle \mathbf {S}=\mathbf {S}_{t}^{t+\triangle t}- \varvec{\sigma }^{t}\) is the increment of second Piola-Kirchhoff stress, \(\triangle \mathbf {E}= \mathbf {E}_{t}^{t+\triangle t}- \mathbf {E}_{t}^{t}\) is the increment of Green-Lagrange strain. Furthermore, \(\mathbf{e}\) and \({\varvec{\eta }}\) are respectively the linear and non-linear parts of \(\triangle \mathbf {E}\), expressed as:

$$\begin{aligned} \mathbf{e}\!=\!\frac{1}{2}\left[ \left( \frac{\partial \triangle \mathbf{u}}{\partial \mathbf{x}^t}\right) ^T\!+\! \frac{\partial \triangle \mathbf{u}}{\partial \mathbf{x}^t} \right] ~~~\text{ and }~~~ {\varvec{\eta }}\!=\!\frac{1}{2} \left( \frac{\partial \triangle \mathbf{u}}{\partial \mathbf{x}^t}\right) ^T \frac{\partial \triangle \mathbf{u}}{\partial \mathbf{x}^t} \nonumber \\ \end{aligned}$$
(7)
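As a minimal illustration of the split in equation (7), the sketch below evaluates \(\mathbf{e}\), \(\varvec{\eta }\) and their sum \(\triangle \mathbf {E}\) for a given incremental displacement gradient. The function names and the simple-shear test value are assumptions made for this example only and are not part of the formulation.

```python
def transpose(a):
    """Transpose of a 3x3 matrix stored as nested lists."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def strain_increment_parts(grad_du):
    """Return (e, eta, dE) for a 3x3 incremental displacement gradient.

    e   = 0.5 * (H^T + H)   linear part, equation (7)
    eta = 0.5 * H^T H       non-linear part, equation (7)
    dE  = e + eta           increment of Green-Lagrange strain
    """
    h_t = transpose(grad_du)
    hth = matmul(h_t, grad_du)
    e = [[0.5 * (grad_du[i][j] + h_t[i][j]) for j in range(3)] for i in range(3)]
    eta = [[0.5 * hth[i][j] for j in range(3)] for i in range(3)]
    d_e = [[e[i][j] + eta[i][j] for j in range(3)] for i in range(3)]
    return e, eta, d_e


# Hypothetical simple-shear increment: d(du_x)/dy = 0.1, all other terms zero.
grad_du = [[0.0, 0.1, 0.0],
           [0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0]]
e, eta, d_e = strain_increment_parts(grad_du)
```

For this increment the linear part carries only the shear terms, while the quadratic part contributes a small normal strain in the y direction, which is the term dropped by the approximation \(\delta \triangle \mathbf {E}=\delta \mathbf {e}\).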

The nonlinear equation (6) is solved using an iterative method such as the Newton-Raphson solver. A linearized form of equation (6) is required to set up the tangent matrix. Employing an incremental constitutive law of the form \(\triangle \mathbf {S} = \mathbf {C}^t: \mathbf {e}\) and using approximation \(\delta \triangle \mathbf {E}=\delta \mathbf {e}\), the linearized equation to be solved becomes

$$\begin{aligned}&\int _{\varOmega ^{t}}\,\delta \mathbf {e} : \mathbf {C}^t : \mathbf {e} d\varOmega ^{t}\!+\!\int _{\varOmega ^{t}}\,\delta \varvec{\eta }:\varvec{\sigma }^{t} d\varOmega ^{t}\!=\!R^{\mathrm{ext}\,^{t+\triangle t}}\nonumber \\&\quad -\int _{\varOmega ^{t}} \delta \mathbf {e}:\varvec{\sigma }^{t}\, d\varOmega ^{t} \end{aligned}$$
(8)

where \(\mathbf {C}^t\) is the history-dependent fourth-order tangent stiffness tensor at time t, which should be obtained for the specific constitutive model.

2.1 Crystal plasticity constitutive model

Polycrystalline microstructures of metals and alloys are modeled using CPFE models that describe micro-mechanisms of crystallographic plastic deformation in individual grains and polycrystalline aggregates. Deformation mechanisms and texture in CPFE models have been used to model creep and deformation response of metals and alloys in [26–29] using a power law description [30], and the thermally activated theory of plastic flow [31]. The author’s group has developed crystal plasticity FE models and codes for simulating deformation and failure in a variety of metallic materials. These studies include creep and fatigue simulations for Ti alloys [1–3, 32], dwell fatigue simulations in Ti alloys in [4, 6], cyclic deformation in HSLA steels [33], deformation twin modeling in Mg alloys in [34] and hierarchical models of Ni-based superalloys in [7]. The locking-free element formulations proposed in this paper are not limited to any specific crystal plasticity model and apply to a wide class of elastic-plastic constitutive laws. However, a candidate crystal plasticity constitutive model for Mg alloys is chosen for its capability to capture the strong anisotropy in plastic deformation and twin-induced material failure [34]. This constitutive model illustrates the significant effect of element locking in predicting material failure.

2.1.1 Kinematic relations and flow rule

The deformation gradient \(\mathbf F _0^t=\frac{\partial \mathbf x ^t}{\partial \mathbf x ^0}\) at time t with respect to the initial reference configuration at \(t=0\), is multiplicatively decomposed into elastic and plastic components as:

$$\begin{aligned} \mathbf {F}^t_0=\mathbf {F}^{e}\mathbf {F}^{p} \end{aligned}$$
(9)

The component \(\mathbf {F}^e\) describes elastic stretching and rigid-body rotation of the crystal lattice, whereas the component \(\mathbf {F}^p\) corresponds to the incompressible plastic flow due to dislocation slip on different slip systems. The second Piola-Kirchhoff (PK2) stress \(\mathbf{S}\) is expressed in terms of the elastic Green-Lagrange strain tensor \(\mathbf {E}^e\) as:

$$\begin{aligned} \mathbf {S}=\mathbf {C}^{e}:\mathbf {E}^{e} \end{aligned}$$
(10)

where \(\mathbf {C}^{e}\) is a fourth-order anisotropic elasticity tensor. The evolution of plastic deformation is expressed in terms of plastic velocity gradient \(\mathbf {L}^p\) as:

$$\begin{aligned} {\mathbf L}^p = \dot{\mathbf{F}}^p {\mathbf {F}^p}^{-1} = \sum \limits _{\alpha =1} ^{N_\mathrm{slip}} {\dot{\gamma }^{\alpha } \mathbf{s}_0^{\alpha } } \end{aligned}$$
(11)

where \(\dot{\gamma }^{\alpha }\) is the slip rate on a slip system \(\alpha \) and \(N_{slip}\) is the total number of slip systems. The Schmid tensor associated with \({\alpha }\)-th slip system \(\mathbf{s}_0^\alpha \) is expressed in terms of the slip direction \(\mathbf {m}_{0}^{\alpha }\) and slip plane normal \(\mathbf {n}_{0}^{\alpha }\) in the reference configuration, i.e. \(\mathbf {s}_0^{\alpha } = \mathbf {m}_{0}^\alpha \otimes \mathbf {n}_{0}^{\alpha }\).
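The flow rule in equation (11) can be sketched as follows. The two slip systems and their slip rates are hypothetical Cartesian values chosen for illustration, not Mg slip systems. Because each slip direction is orthogonal to its slip plane normal, every Schmid tensor is traceless, so the resulting \(\mathbf {L}^p\) is traceless as well, which is the isochoric plastic flow referred to above.

```python
def outer(a, b):
    """Dyadic product a (x) b of two 3-vectors."""
    return [[a[i] * b[j] for j in range(3)] for i in range(3)]


def plastic_velocity_gradient(slip_rates, slip_dirs, plane_normals):
    """L^p = sum_alpha gamma_dot^alpha * (m_0^alpha (x) n_0^alpha), equation (11)."""
    lp = [[0.0] * 3 for _ in range(3)]
    for rate, m, n in zip(slip_rates, slip_dirs, plane_normals):
        schmid = outer(m, n)
        for i in range(3):
            for j in range(3):
                lp[i][j] += rate * schmid[i][j]
    return lp


# Two hypothetical slip systems sharing the plane normal e_3 (illustration only).
slip_dirs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
plane_normals = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
slip_rates = [1.0e-3, -2.0e-3]

lp = plastic_velocity_gradient(slip_rates, slip_dirs, plane_normals)
trace_lp = lp[0][0] + lp[1][1] + lp[2][2]
```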

For Mg alloys, 12 active slip systems (of 30 possible systems in HCP materials) are distributed among three different families, viz. the \(\langle a\rangle \)-basal, \(\langle a\rangle \)-prismatic and \(\langle c+a\rangle \)-pyramidal slip system families as shown in Fig. 1. A power law model in [34] is used for the slip rate on slip system \(\alpha \), given as:

$$\begin{aligned} \dot{\gamma }^{\alpha }=\dot{\gamma }_{0}^{\alpha }\left| \frac{\tau ^{\alpha }-s_{a}^{\alpha }}{s_{*}^{\alpha }}\right| ^{\frac{1}{m}}\mathrm{sign}(\tau ^{\alpha }-s_{a}^{\alpha }) \end{aligned}$$
(12)

where \(\dot{\gamma }_{0}^{\alpha }\) is a reference slip-rate for slip system \(\alpha \) and m is the power law exponent representing strain-rate sensitivity. The resolved shear stress on slip system \(\alpha \) is \(\tau ^{\alpha }=\mathbf {F}^{eT}\mathbf {F}^{e}\mathbf {S} : \mathbf {s}_0^{\alpha }\). Here \(s^{\alpha }_{a}\) is the athermal resistance arising from the long-range internal stress field between parallel dislocation lines or from grain boundaries, and \(s^{\alpha }_{*}\) is the thermal shear resistance due to local obstacles such as dislocation jogs and forest dislocations.
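A direct transcription of equation (12) is sketched below. The numerical values in the usage lines are placeholders chosen to show the strong rate sensitivity for a small exponent m; they are not calibrated material parameters.

```python
import math


def slip_rate(tau, s_a, s_star, gamma_dot_0, m):
    """Power-law slip rate of equation (12).

    tau         resolved shear stress on the slip system
    s_a         athermal shear resistance
    s_star      thermal shear resistance
    gamma_dot_0 reference slip rate
    m           rate-sensitivity exponent
    """
    driving = tau - s_a
    magnitude = abs(driving / s_star) ** (1.0 / m)
    return gamma_dot_0 * magnitude * math.copysign(1.0, driving)


# Placeholder values (assumed for illustration only).
rate_at_threshold = slip_rate(30.0, 10.0, 20.0, 1.0e-3, 0.05)   # |tau - s_a| = s_*
rate_reversed = slip_rate(-10.0, 10.0, 20.0, 1.0e-3, 0.05)      # opposite sign
rate_below = slip_rate(20.0, 10.0, 20.0, 1.0e-3, 0.05)          # |tau - s_a| = s_*/2
```

When \(|\tau ^{\alpha }-s_{a}^{\alpha }|\) equals \(s_{*}^{\alpha }\), the slip rate equals the reference rate; halving the effective stress with m = 0.05 reduces the rate by a factor of \(2^{20}\).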

Fig. 1 Schematic showing active slip systems in Mg and Mg alloys

2.1.2 Evolution of slip system resistances

The evolution of athermal (\(s_{a}^\alpha \)) and thermal (\(s_{*}^\alpha \)) shear resistances is controlled by two types of dislocations, viz. statistically stored dislocations (SSDs) and geometrically necessary dislocations (GNDs) [6, 35, 36]. SSDs are responsible for homogeneous plastic deformation and are characterized by vanishing net Burgers vector. On the other hand, GNDs correspond to the storage of polarized dislocation densities, necessary for accommodating crystal lattice curvatures in single crystal bending or near polycrystalline grain boundaries. Accordingly, the total athermal and thermal shear resistances are composed of three components viz. the initial shear resistance, and contributions from the evolution of SSDs and GNDs respectively. At time t, these are [34]:

$$\begin{aligned} s_{a}^\alpha =&\,\hat{s}_{a,0}^{\alpha } + \int _{t' = 0}^{t' = t} {\sum \limits _{\beta = 1}^{N_\mathrm{slip}} {h^{\alpha \beta } } \left| {\dot{\gamma }_P^{\alpha }\sin ({\varvec{n}_{0}^{\alpha } },{\varvec{t}_{0}^{\beta } }) } \right| } \;\mathrm{d}t' \nonumber \\&+ {c_1}G b^\alpha \sqrt{\rho _\mathrm{GND, P}^\alpha } \end{aligned}$$
(13a)
$$\begin{aligned} s_{*}^\alpha =&\,\hat{s}_{*,0}^{\alpha } + \int _{t' = 0}^{t' = t} {\sum \limits _{\beta = 1}^{N_\mathrm{slip}} {h^{\alpha \beta } } \left| {\dot{\gamma }_F^{\alpha }\cos ({\varvec{n}_{0}^{\alpha } },{\varvec{t}_{0}^{\beta } }) } \right| } \;\mathrm{d}t'\nonumber \\&+\frac{{{Q_\mathrm{slip}^{\alpha }}}}{{{c_2}{c_3}{b^{\alpha ^{2}}}}}\sqrt{\rho _\mathrm{GND, F}^\alpha } \end{aligned}$$
(13b)

where \(\hat{s}^{\alpha }_{*,0}\) and \(\hat{s}^{\alpha }_{a,0}\) are grain size-dependent initial thermal and athermal shear resistances, given by a Hall-Petch type relation [32, 37, 38]. G is the shear modulus, \(Q_\mathrm{slip}^{\alpha }\) is the effective activation energy for dislocation slip, and \(c_1,~ c_2, ~ c_3\) are constants representing the passing stress, jump width, and obstacle width respectively. The hardening of thermal slip resistance on slip system \(\alpha \) is caused by the portion of forest SSDs on slip system \(\beta \) whose line direction \(\varvec{t}_{0}^{\beta }=\varvec{m}_{0}^{\beta } \times \varvec{n}_{0}^{\beta }\) is parallel to the slip plane normal \(\varvec{n}_{0}^{\alpha }\). Therefore, the hardening rate for thermal resistance in equation (13b) is projected with \(\cos ({\varvec{n}_{0}^{\alpha } },{\varvec{t}_{0}^{\beta } })\). On the other hand, the hardening of athermal slip resistance is caused by the interaction of the dislocations on slip system \(\alpha \) with the portion of SSDs on slip system \(\beta \) whose line direction lies in the slip plane \(\alpha \) and is consequently perpendicular to \(\varvec{n}_{0}^{\alpha }\). Therefore, the hardening rate for athermal resistance in equation (13a) is projected with \(\sin ({\varvec{n}_{0}^{\alpha } },{\varvec{t}_{0}^{\beta } })\) [7]. The hardening rate of slip system \(\alpha \) is defined in terms of a hardening matrix \(h^{\alpha \beta }\) as:

$$\begin{aligned}&h^{\alpha \beta }=q^{\alpha \beta } h_\mathrm{ref}^{\beta }\left| 1-\frac{s_{*}^{\beta }}{s_\mathrm{sat}^{\beta }} \right| ^{r}\;\mathrm{sign}\left( 1-\frac{s^{\beta }}{s_\mathrm{sat}^{\beta }} \right) ~~(\text { no sum on}~\beta )\nonumber \\&\quad \text {where} ~~~ s_\mathrm{sat}^{\beta }=\tilde{s}^{\beta } \left( \frac{\dot{\gamma }^{\beta }}{\dot{\gamma }_{0} ^{\beta }} \right) ^{n} \end{aligned}$$
(14)
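The geometric projection factors \(\sin ({\varvec{n}_{0}^{\alpha } },{\varvec{t}_{0}^{\beta } })\) and \(\cos ({\varvec{n}_{0}^{\alpha } },{\varvec{t}_{0}^{\beta } })\) appearing in equations (13a) and (13b) depend only on the reference slip geometry. The sketch below evaluates them for hypothetical Cartesian slip systems; the helper names are assumptions introduced for this example.

```python
import math


def cross(a, b):
    """Vector product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def projection_factors(n_alpha, m_beta, n_beta):
    """Return (|sin|, |cos|) of the angle between n_0^alpha and t_0^beta,
    where t_0^beta = m_0^beta x n_0^beta is the dislocation line direction."""
    t_beta = cross(m_beta, n_beta)
    cos_angle = dot(n_alpha, t_beta) / math.sqrt(dot(n_alpha, n_alpha) * dot(t_beta, t_beta))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against round-off
    sin_angle = math.sqrt(1.0 - cos_angle * cos_angle)
    return abs(sin_angle), abs(cos_angle)


# Hypothetical geometry: slip plane normal of system alpha along e_3.
n_alpha = [0.0, 0.0, 1.0]
# System beta with line direction lying in the slip plane of alpha.
sin_in_plane, cos_in_plane = projection_factors(n_alpha, [1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
# System beta with line direction parallel to n_alpha (forest-type).
sin_forest, cos_forest = projection_factors(n_alpha, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```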

The introduction of non-local GND models in CPFE analysis is necessary for accurate representation of stress concentrations near grain boundaries that are responsible for crack and twin nucleation. Accumulation of GNDs occurs with the incompatibility of the plastic strain field, especially at points of discontinuous plastic flow such as grain boundaries. The contribution of GNDs to slip system hardening comes from two sources: the dislocation components \(\rho _\mathrm{GND,P}^\alpha \) parallel to slip plane \(\alpha \) and the forest dislocation components \(\rho _\mathrm{GND,F}^\alpha \) normal to slip plane \(\alpha \). \(\rho _\mathrm{GND,P}^\alpha \) contributes to the athermal shear resistance \(s_{a}^\alpha \) by providing a long-range stress, and \(\rho _\mathrm{GND,F}^\alpha \) increases the thermal shear resistance \(s_{*}^\alpha \) by hindering the slip of mobile dislocations. These are given in equations (13a) and (13b) respectively. Nye’s dislocation density tensor is used to quantify GND densities on different slip systems. It is expressed in terms of the curl of the plastic deformation gradient as [39]:

$$\begin{aligned} \varvec{\varLambda } = - {\left( \varvec{\nabla }_\mathbf{x ^{0}} \times {{\mathbf {F}}^{{p^T}}} \right) ^T} \end{aligned}$$
(15)

where \(\varvec{\nabla }_\mathbf{x ^{0}}\) is the gradient operator with respect to the reference configuration at time \(t=0\). The Nye dislocation tensor quantifies the closure failure of a circuit in the intermediate configuration due to the presence of GNDs and can be alternatively derived in terms of GND densities as [40]:

$$\begin{aligned} \varvec{\varLambda }=\sum _{\alpha =1}^{N_\mathrm{slip}} \left( \rho _\mathrm{GNDs}^{\alpha }\,\mathbf {b}^{\alpha }_{0}\otimes \mathbf {m}^{\alpha }_{0}+\rho _\mathrm{GNDet}^{\alpha }\,\mathbf {b}^{\alpha }_{0}\otimes \mathbf {t}^{\alpha }_{0}+\rho _\mathrm{GNDen}^{\alpha }\,\mathbf {b}^{\alpha }_{0}\otimes \mathbf {n}^{\alpha }_{0} \right) \end{aligned}$$
(16)

where \(\mathbf b _{0}^{\alpha }\) is the Burgers vector for a slip system \(\alpha \) in the reference configuration. GNDs on the slip system are decomposed into three components, viz. a screw component \(\rho _\mathrm{GNDs}^\alpha \) with dislocation line parallel to \(\mathbf b ^\alpha _{0}\), and two edge components \(\rho _\mathrm{GNDen}^\alpha \) and \(\rho _\mathrm{GNDet}^\alpha \) with dislocation lines parallel to \(\mathbf n ^\alpha _{0}\) and \(\mathbf t ^\alpha _{0}\) respectively. There are in general \(3 \times N_\mathrm{slip}\) unknown GND densities, which correspond to 90 for HCP crystals. Of these, only 63 are independent for HCP crystals, corresponding to 9 \(\rho _\mathrm{GNDs}^{\alpha }\)s, 24 \(\rho _\mathrm{GNDet}^{\alpha }\)s and 30 independent \(\rho _\mathrm{GNDen}^{\alpha }\)s. Equations (15) and (16) constitute an under-determined system of equations that is expressed in matrix form as:

$$\begin{aligned} \left\{ {\varvec{\varLambda }}\right\} =\left[ \mathbf A \right] \left\{ \varvec{\rho }_\mathrm{GND}\right\} \end{aligned}$$
(17)

\(\left\{ {\varvec{\varLambda }}\right\} \) is the \(9 \times 1\) vector representation of the Nye tensor \(\varvec{\varLambda }\), \(\left[ \mathbf A \right] \) is a \(9 \times 63\) matrix containing the basis tensors \({{\mathbf {b}}^\alpha _{0}} \otimes {{\mathbf {m}}^\alpha _{0}}\), \({\mathbf {b}}^\alpha _{0} \otimes {{\mathbf {t}}^\alpha _{0}}\) and \({{\mathbf {b}}^\alpha _{0}} \otimes {{\mathbf {n}}^\alpha _{0}}\), and \(\left\{ \varvec{\rho }_\mathrm{GND}\right\} \) is a \(63 \times 1\) vector containing the independent GND densities. Following discussions in [40], the geometric constraints in equation (16) allow only certain dislocations to exist on the slip planes. This constraint is taken into account through a vector of Lagrange multipliers in the functional to be minimized, which is:

$$\begin{aligned} \mathcal {F} \left( \left\{ \varvec{\rho }_\mathrm{GND}\right\} ,\left\{ \varvec{\lambda }\right\} \right)= & {} \left\{ \varvec{\rho }_\mathrm{GND}\right\} ^{T} \left\{ \varvec{\rho }_\mathrm{GND}\right\} \nonumber \\&+ \left\{ \varvec{\lambda }\right\} ^{T} \left( \left[ \mathbf A \right] \left\{ \varvec{\rho }_\mathrm{GND}\right\} - \left\{ {\varvec{\varLambda }}\right\} \right) \end{aligned}$$
(18)

where \(\left\{ \varvec{\lambda }\right\} \) is a \(9 \times 1\) vector of Lagrange multipliers. Equation (18) implies that GNDs correspond to the minimum amount of polarized dislocation densities necessary to recover lattice compatibility. Minimizing equation (18), the GND densities are obtained as:

$$\begin{aligned} \left\{ \varvec{\rho }_\mathrm{GND}\right\} = \left[ \mathbf A \right] ^{T} \left( \left[ \mathbf A \right] \left[ \mathbf A \right] ^{T} \right) ^{-1} \left\{ {\varvec{\varLambda }}\right\} \end{aligned}$$
(19)
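Equation (19) is the minimum-norm solution of the under-determined system (17). The sketch below evaluates it for a deliberately small toy system (a \(2 \times 3\) matrix rather than the \(9 \times 63\) matrix of the HCP problem), using a plain Gaussian elimination so that no external library is needed. The matrix entries, function names and solver choice are assumptions made for this illustration, and \(\left[ \mathbf A \right] \left[ \mathbf A \right] ^{T}\) is assumed to be non-singular.

```python
def solve_linear(mat, rhs):
    """Solve mat * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def min_norm_solution(a, lam):
    """rho = A^T (A A^T)^{-1} Lambda, equation (19)."""
    rows, cols = len(a), len(a[0])
    aat = [[sum(a[i][k] * a[j][k] for k in range(cols)) for j in range(rows)]
           for i in range(rows)]
    y = solve_linear(aat, lam)           # y = (A A^T)^{-1} Lambda
    return [sum(a[i][k] * y[i] for i in range(rows)) for k in range(cols)]


# Toy under-determined system: 2 equations, 3 unknowns (illustration only).
a = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
lam = [1.0, 1.0]
rho = min_norm_solution(a, lam)
residual = [sum(a[i][k] * rho[k] for k in range(3)) - lam[i] for i in range(2)]
norm_sq = sum(r * r for r in rho)
```

For this toy system the solution is (1/3, 2/3, 1/3). Another admissible solution, (1, 0, 1), satisfies the same constraints but has a larger norm, which is the sense in which equation (18) selects the minimum density.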

Using dislocation line projection [36], the parallel and forest GND components in equation (13) are obtained as:

$$\begin{aligned} \rho _{\mathrm{GND},P}^{\alpha } =&\sum \limits _{\beta = 1}^{N_\mathrm{slip}} {\chi ^{\alpha \beta }}\left[ \left| \rho _\mathrm{GNDs}^\beta \sin (\mathbf {n}^\alpha _{0},\mathbf {m}^\beta _{0}) \right| + \left| \rho _\mathrm{GNDet}^\beta \sin (\mathbf {n}^\alpha _{0},\mathbf {t}^\beta _{0}) \right| \right. \nonumber \\&\left. +\left| \rho _\mathrm{GNDen}^\beta \sin (\mathbf {n}^\alpha _{0},\mathbf {n}^\beta _{0}) \right| \right] \end{aligned}$$
(20a)
$$\begin{aligned} \rho _{\mathrm{GND},F}^\alpha =&\sum \limits _{\beta = 1}^{N_\mathrm{slip}}{\chi ^{\alpha \beta }} \left[ \left| \rho _\mathrm{GNDs}^\beta \cos (\mathbf {n}^\alpha _{0},\mathbf {m}^\beta _{0}) \right| + \left| \rho _\mathrm{GNDet}^\beta \cos (\mathbf {n}^\alpha _{0},\mathbf {t}^\beta _{0}) \right| \right. \nonumber \\&\left. + \left| \rho _\mathrm{GNDen}^\beta \cos (\mathbf {n}^\alpha _{0},\mathbf {n}^\beta _{0}) \right| \right] \end{aligned}$$
(20b)

The coefficient \({\chi ^{\alpha \beta }}\) describes the strengthening effect due to the interaction between slip systems \(\alpha \) and \(\beta \), e.g. in the formation of dislocation locks. For HCP crystals, \({\chi ^{\alpha \beta }}\) is taken to be 1 in this work.

2.2 TET4 elements in CPFE analysis and associated volumetric locking

For any element in the CPFE model, the displacement increment \(\triangle \mathbf {u}\), increment of displacement gradient \(\frac{\partial \triangle \mathbf {u}}{\partial \mathbf {x}^t}\) and the linearized strain increment \(\mathbf {e}\) in equation (7) are respectively written as:

$$\begin{aligned} \triangle \mathbf {u}=\mathbf {N}\triangle \mathbf {q}, ~~~~~~\frac{\partial \triangle \mathbf {u}}{\partial \mathbf {x}^t}=\frac{\partial \mathbf {N}}{\partial \mathbf {x}^t}\triangle \mathbf {q}=\mathbf {G}\triangle \mathbf {q},~~~~\text{ and }~~~~ \mathbf {e} =\mathbf {B}\triangle \mathbf {q} \end{aligned}$$
(21)

Fig. 2 Mesh of TET4 elements subject to nodal displacements for illustrating volumetric locking

Table 1 Nodal coordinates and element connectivity for the FE model in Fig. 2

For the four-node constant strain tetrahedral or TET4 element, the shape function matrix \(\mathbf {N}\) is a \(3\times 12\) matrix, \(\mathbf {G}\) is a \(9\times 12\) gradient operator matrix and \(\mathbf {B}\) is the \(6\times 12\) strain-displacement matrix. Explicit forms of \(\mathbf {N}\), \(\mathbf {G}\) and \(\mathbf {B}\) for the TET4 element are given in [41]. Stress and strain tensors are represented using the reduced order Voigt vector notation. Substituting equations (21) into equation (8) and integrating using the one-point Gaussian quadrature rule yields the discrete form of the finite element equations as:

$$\begin{aligned}&\sum _{i=1}^{N_e}\mathbf {B}^{t^{T}}\mathbf {C}^{t}\mathbf {B}^{t} \varOmega ^{i,t}\triangle \mathbf {q}+\sum _{i=1}^{N_e}\mathbf {G}^{t^{T}}\underset{\sim }{{\varvec{\sigma }}}^{t}\mathbf {G}^{t}\varOmega ^{i,t} \triangle \mathbf {q}\nonumber \\&\quad ={\mathbf {f}^\mathrm{ext}}^{t+\triangle t}-\sum _{i=1}^{N_e}\mathbf {B}^{t^{T}}\varvec{\sigma }^{t}\varOmega ^{i,t} \end{aligned}$$
(22)

where \(\varOmega ^{i,t}\) is the volume of element i at time t. The matrix \(\underset{\sim }{{\varvec{\sigma }}}^{t}\) is explicitly written as:

$$\begin{aligned} \underset{\sim }{{\varvec{\sigma }}}^{t}=\left[ \begin{array}{ccc} {\varvec{\sigma }}^{t} &{} \quad {\mathbf {0}} &{} \quad {\mathbf {0}} \\ {\mathbf {0}} &{} \quad {\varvec{\sigma }}^{t} &{} \quad {\mathbf {0}} \\ {\mathbf {0}} &{} \quad {\mathbf {0}} &{} \quad {\varvec{\sigma }}^{t} \end{array}\right] \end{aligned}$$
(23)

where \({\varvec{\sigma }}^{t}\) is the \(3 \times 3\) stress matrix, \({\mathbf {0}}\) is a \(3 \times 3\) matrix of zeros and \({\mathbf {f}^\mathrm{ext}}^{t+\triangle t}\) is the external force vector corresponding to \(R^{\mathrm{ext}^{t+\triangle t}} ={\mathbf {f}^\mathrm{ext}}^{t+\triangle t}\cdot \delta \triangle \mathbf {q}\). The system of equations to be solved is:

$$\begin{aligned} \mathbf {K}^{t}\triangle \mathbf {q}={\mathbf {f}^\mathrm{ext}}^{t+\triangle t}-{\mathbf {f}^\mathrm{int}}^{t} \end{aligned}$$
(24)

\(\mathbf {K}^{t}\) and \({\mathbf {f}^\mathrm{int}}^{t} \) are the global stiffness matrix and internal force vector respectively. The material tangent stiffness tensor \(\mathbf {C}^{t}\) is needed for the evaluation of \(\mathbf {K}^{t}\). The formulation of \(\mathbf {C}^{t}\), which is consistent with the proposed crystal plasticity constitutive model, is given in Appendix A.
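The \(9 \times 9\) block-diagonal matrix of equation (23), which enters the geometric stiffness term of equation (22), can be assembled from the \(3 \times 3\) Cauchy stress as sketched below. The function name and the stress values are illustrative assumptions; the ordering of the blocks must match the row ordering adopted for \(\mathbf {G}\), which is given in [41] and is not reproduced here.

```python
def block_diagonal_stress(sigma):
    """Assemble the 9x9 matrix of equation (23): three copies of the
    3x3 stress matrix on the diagonal, zeros elsewhere."""
    big = [[0.0] * 9 for _ in range(9)]
    for block in range(3):
        for i in range(3):
            for j in range(3):
                big[3 * block + i][3 * block + j] = sigma[i][j]
    return big


# Hypothetical symmetric Cauchy stress at time t (arbitrary units).
sigma_t = [[100.0, 20.0, 0.0],
           [20.0, -50.0, 5.0],
           [0.0, 5.0, 10.0]]
sigma_tilde = block_diagonal_stress(sigma_t)
```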

2.2.1 Volumetric locking in TET4 elements

TET4 elements are known to exhibit volumetric locking for incompressible or nearly incompressible materials. A simple example illustrates the occurrence of volumetric locking emanating from numerical interpolation of strains in the TET4 element. Consider a nearly-incompressible elastic bar of dimensions \(4 \times 2 \times 2 \) units, with Young’s modulus \(E=1\;\mathrm{GPa}\) and Poisson’s ratio \(\nu =0.4999\). The bar is discretized into 6 TET4 elements as shown in Fig. 2. The nodal coordinates and element connectivity list are tabulated in Table 1. All 8 nodes are subjected to prescribed values corresponding to the displacement field

$$\begin{aligned} u_x=\frac{xy}{2}, \qquad u_y=\frac{x^{2}}{4}-\frac{\nu }{4(1-\nu )}y^{2}, \qquad u_z=0 \end{aligned}$$
(25)

The normal components of linear strain, corresponding to the prescribed displacement field, are analytically obtained as:

$$\begin{aligned} e_{xx}=\frac{y}{2},\quad e_{yy}=-\frac{\nu y}{2(1-\nu )},\quad e_{zz}=0 \end{aligned}$$
(26)

The corresponding volumetric strain is given as \(e_{xx}+e_{yy}+e_{zz}=\frac{1-2\nu }{2(1-\nu )}y\). This is clearly dependent on the Poisson’s ratio \(\nu \) and the location y. For the given geometry \(y \in [-1,1]\) and Poisson’s ratio \(\nu =0.4999\) the volumetric strain is nearly zero. However, for TET4 elements, the volumetric strain due to the use of linear shape functions is clearly non-zero as listed in Table 2. The large volumetric strains induce high spurious dilatational energy that results in element locking and high stresses.
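The mismatch between the analytical and interpolated volumetric strains can be reproduced for a single element with the sketch below. The node coordinates are an assumption (one corner tetrahedron of a bar occupying \(x\in [0,4]\), \(y\in [-1,1]\), \(z\in [0,2]\)) and do not reproduce the mesh of Table 1. The constant displacement gradient of the element is recovered from the nodal values of equation (25), and its trace is compared with the analytical volumetric strain at the element centroid.

```python
def inverse_3x3(a):
    """Inverse of a 3x3 matrix via the adjugate (cyclic-index cofactors)."""
    det = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
           - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
           + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    return [[(a[(j + 1) % 3][(i + 1) % 3] * a[(j + 2) % 3][(i + 2) % 3]
              - a[(j + 1) % 3][(i + 2) % 3] * a[(j + 2) % 3][(i + 1) % 3]) / det
             for j in range(3)] for i in range(3)]


def displacement(x, y, z, nu):
    """Prescribed displacement field of equation (25)."""
    c = nu / (4.0 * (1.0 - nu))
    return (0.5 * x * y, 0.25 * x * x - c * y * y, 0.0)


def tet4_volumetric_strain(nodes, nu):
    """Trace of the constant displacement gradient of a linear tetrahedron."""
    u = [displacement(*p, nu) for p in nodes]
    # Columns hold edge vectors (X) and displacement differences (D) from node 0.
    x_mat = [[nodes[k + 1][i] - nodes[0][i] for k in range(3)] for i in range(3)]
    d_mat = [[u[k + 1][i] - u[0][i] for k in range(3)] for i in range(3)]
    x_inv = inverse_3x3(x_mat)
    # grad(u) = D X^{-1}; only its trace is needed here.
    return sum(d_mat[i][k] * x_inv[k][i] for i in range(3) for k in range(3))


nu = 0.4999
# Assumed corner tetrahedron (not taken from Table 1).
nodes = [(0.0, -1.0, 0.0), (4.0, -1.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 2.0)]

vol_tet4 = tet4_volumetric_strain(nodes, nu)
y_centroid = sum(p[1] for p in nodes) / 4.0
vol_exact = (1.0 - 2.0 * nu) / (2.0 * (1.0 - nu)) * y_centroid
```

For this assumed element the interpolated volumetric strain is -0.5, whereas the analytical value at the centroid is of the order of \(10^{-4}\), which illustrates the spurious dilatation discussed above.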

Table 2 Strain components for each TET4 element for the problem in Fig. 2

Crystal plasticity constitutive models exhibit isochoric plastic flow, i.e. \(\mathrm{det}~ \mathbf{F}^p=1\). Since plastic strains are significantly larger than elastic strains, the use of TET4 element in CPFE simulations may result in volumetric locking under different deformation modes, e.g. bending.

Fig. 3 (a) 2D patch construction for node s; (b) 3D volume partitioning \(\varOmega ^{i,t}_{s}\) for node s, in the NUS method

3 Locking-free formulations for TET4 elements

Stabilization of TET4 elements through the node-based uniform strain (NUS) formulation was introduced in [10]. While the NUS method has been successful in avoiding volumetric locking, spurious zero energy modes were reported in [11]. Alternatively, the F-bar patch (FP) formulation [12, 13] has been shown to alleviate volumetric locking without the reintroduction of spurious zero energy modes. As an extension to the NUS formulation, a locally integrated B-bar (LIB) element is developed in this paper to stabilize TET4 elements. Elements based on these methods are termed locking-free stabilized or LFS-TET4 elements in this paper. These formulations are summarized in the context of CPFE formulation in this section.

3.1 Node-based uniform strain (NUS) element formulation

In the NUS formulation, a patch of sub-elements is assigned to each nodal point in the finite element mesh. Consider \(\hat{\varOmega }^{s,t}\) to denote such a patch assigned to a node s at time t that is defined as:

$$\begin{aligned} \hat{\varOmega }^{s,t}=\sum _{i=1}^{N^{s}} \varOmega _{s}^{i,t}=\sum _{i=1}^{N^{s}} \alpha _{s}^{i} \,\varOmega ^{i,t} \end{aligned}$$
(27)

\(N^{s}\) is the number of TET4 elements attached to the node s, \(\varOmega ^{i,t}_{s}\) is the volume contribution of the i-th TET4 element to the patch \(\hat{\varOmega }^{s,t}\) and \(\alpha _{s}^{i}\) is a scalar weighting factor. For 3D meshes, \(\alpha _{s}^{i}=\frac{1}{4}\). Figure 3a illustrates a 2D patch construction method for a node s, while Fig. 3b shows the partitioning of a 3D TET4 element to generate its contribution \(\varOmega ^{i,t}_{s}\) to the patch.

Within each patch, the linear strain increment \(\hat{\mathbf{e }}^{s}\) is taken to be uniform and obtained as the average value from surrounding elements, i.e.:

$$\begin{aligned} \hat{\mathbf{e }}^{s} = \sum _{i=1}^{N^{s}}w^{i}\mathbf {e}^{i,t}= \sum _{i=1}^{N^{s}}w^{i}\mathbf {B}^{i,t}\triangle \mathbf {q}^{i} = \hat{\mathbf{B }}^{s,t} \, \triangle \hat{\mathbf{q }}^{s} \end{aligned}$$
(28)

where \(w^i\) is a relative volume-based weight for element i and \(\hat{\mathbf{B }}^{s,t}\) is the gradient matrix associated with the patch s that is obtained by assembling \(\mathbf {B}^{i,t}\) from surrounding elements with weight \(w^i\). From equation (27), \(w^i=\frac{1}{4} \frac{{\varOmega }^{i,t}}{\hat{\varOmega }^{s,t}}\). \(\triangle \hat{\mathbf{q }} ^{s} \) is the displacement increment vector associated with the patch s, obtained by assembling \(\triangle \mathbf {q}^{i}\) from surrounding elements. Nodal averaging of the gradient of the displacement increment \(\frac{\partial \triangle \mathbf {u}}{\partial \mathbf {x}^t}\) is obtained in the same way as:

$$\begin{aligned}&\left[ \begin{array}{ccccccc} \frac{\partial \triangle {u}_{1}}{\partial {x}^{t}_{1}}&\frac{\partial \triangle {u}_{1}}{\partial {x}^{t}_{2}}&\frac{\partial \triangle {u}_{1}}{\partial {x}^{t}_{3}}&\cdots&\frac{\partial \triangle {u}_{3}}{\partial {x}^{t}_{1}}&\frac{\partial \triangle {u}_{3}}{\partial {x}^{t}_{2}}&\frac{\partial \triangle {u}_{3}}{\partial {x}^{t}_{3}} \end{array} \right] ^{T}_{\mathrm{patch}\;s}\nonumber \\&\quad =\sum _{i=1}^{N^{s}}w^{i} \mathbf G ^{i,t}\ \triangle \mathbf q ^{i} = \hat{\mathbf{G }}^{s,t} \, \triangle \hat{\mathbf{q }} ^{s} \end{aligned}$$
(29)

\(\hat{\mathbf{G }}^{s,t}\) is the gradient matrix associated with the patch s created by assembling \(\mathbf {G}^{i,t}\) from surrounding elements with weight \(w^i\). From equations (7), (28) and (29) it is seen that the strain increment \(\triangle \mathbf {E}\) over the patch is uniform, which makes the cumulative strain uniform as well.
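The patch volumes of equation (27) and the weights \(w^i\) used in equations (28) and (29) can be sketched as follows. This is an illustrative sketch under the assumption \(\alpha _{s}^{i}=\frac{1}{4}\); the function names and the two-element mesh are hypothetical and not taken from the paper.

```python
import numpy as np

def tet_volume(X):
    """Volume of a tetrahedron with nodal coordinates X (4x3)."""
    return abs(np.linalg.det(X[1:] - X[0])) / 6.0

def nus_patches(coords, tets):
    """Nodal patch volumes (Eq. 27) and weights w^i = (1/4) * vol_i / patch_vol_s."""
    vol = np.array([tet_volume(coords[t]) for t in tets])
    patch_vol = np.zeros(len(coords))
    for e, t in enumerate(tets):
        for s in t:
            patch_vol[s] += 0.25 * vol[e]
    weights = {s: {} for s in range(len(coords))}
    for e, t in enumerate(tets):
        for s in t:
            weights[s][e] = 0.25 * vol[e] / patch_vol[s]
    return vol, patch_vol, weights

# Hypothetical mesh: two tetrahedra sharing the face (1, 2, 3)
coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]])
tets = [[0, 1, 2, 3], [1, 2, 3, 4]]
vol, patch_vol, weights = nus_patches(coords, tets)
```

By construction the patch volumes add up to the mesh volume and the weights of each patch add up to one, so the patch strain in equation (28) is a volume-weighted average of the element strains.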

The linearized weak form (8) with constant strain patches is represented for the discrete model as:

$$\begin{aligned}&\sum _{s=1}^{N_\mathrm{nodes}}\int _{\hat{\varOmega }^{s,t}} {\mathbf {C}}^{s,t}:{\hat{\mathbf{e}}}^{s}\delta {\hat{\mathbf{e}}}^{s}\, d\hat{\varOmega }^{s,t} + \sum _{s=1}^{N_\mathrm{nodes}}\int _{\hat{\varOmega }^{s,t}} {\varvec{\sigma }}^{s,t}:\delta {\hat{\varvec{\eta }}}^{s} d \hat{\varOmega }^{s,t}\nonumber \\&\quad = R^{{ext}^{t+\triangle t}}-\sum _{s=1}^{N_\mathrm{nodes}}\int _{\hat{\varOmega }^{s,t}} {\varvec{\sigma }}^{s,t}:\delta {\hat{\mathbf{e}}}^{s}\, d\hat{\varOmega }^{s,t} \end{aligned}$$
(30)

where \(\varvec{\sigma }^{s,t}\) is the Cauchy stress, obtained from the constitutive model, and \(\mathbf {C}^{s,t}\) is the crystal plasticity tangent stiffness matrix in node-based patch s. The NUS formulation assumes that \({\mathbf {C}}^{s,t}\) and \(\varvec{\sigma }^{s,t}\) are also uniform over the patch s. Thus one-point numerical integration may be used for each patch, and the crystal plasticity constitutive updates are performed at the node of the patch. This removes volumetric locking through a reduction in the number of incompressibility constraints: there is one constraint per node rather than one per element, so the incompressibility constraint ratio approaches the optimal value of 3. Substituting equations (28) and (29) into equation (30), the tangent stiffness matrix and internal nodal force vector are derived as:

$$\begin{aligned}&\mathbf {K}^{t}=\sum _{s=1}^{N_\mathrm{nodes}}\,\hat{\mathbf {B}}^{s,tT}\,{\mathbf {C}}^{s,t}\,{{\hat{\mathbf {B}}}^{s,t}} \hat{\varOmega }^{s,t}+\sum _{s=1}^{N_\mathrm{nodes}}\,\hat{\mathbf {G}}^{s,tT}\,\underset{\sim }{{\varvec{\sigma }}}^{s,t}\,{\hat{\mathbf {G}}^{s,t}} \hat{\varOmega }^{s,t} \end{aligned}$$
(31a)
$$\begin{aligned}&{\mathbf {f}^\mathrm{int}}^{t} =\sum _{s=1}^{N_\mathrm{nodes}} \hat{\mathbf {B}}^{s,tT}\,{\varvec{\sigma }}^{s,t} \hat{\varOmega }^{s,t} \end{aligned}$$
(31b)

This node-based uniform strain (NUS) element has however been reported to exhibit spurious zero or low energy modes in [15]. Such spurious energy modes can cause large distortion of the TET4 element and eventually lead to a negative determinant of the Jacobian matrix.

3.2 Locally integrated B-bar (LIB) element

Several stabilization methods have been developed to overcome the zero-energy modes in the original NUS formulation [11, 15]. These methods are based on splitting the stress or the tangent stiffness matrix \(\mathbf {C}\). Such a decomposition is not, however, possible in CPFE analysis with an anisotropic elasto-plastic stiffness matrix \(\mathbf {C}\). To overcome this issue, a locally integrated B-bar (LIB) based element is proposed in this paper. Following the procedures in [42], the linear strain increment is decomposed into volumetric and deviatoric parts by splitting the gradient matrix \(\mathbf {B}\), i.e.

$$\begin{aligned} \mathbf {e}=\mathbf {e}^\mathrm{vol}+\mathbf {e}^\mathrm{dev}=\mathbf {B}^\mathrm{vol}\triangle \mathbf {q} + \mathbf {B}^\mathrm{dev}\triangle \mathbf {q} \end{aligned}$$
(32)

Only the volumetric part of the linear strain increment \(\mathbf {e}^{vol}\) is assumed to be uniform inside the patch for each node to reduce constraints. For node s, the uniform volumetric strain increment \(\hat{\mathbf {e}}^{s,vol}\) is obtained as:

$$\begin{aligned} \hat{\mathbf {e}}^{s,\mathrm{vol}} = \sum _{i=1}^{N^{s}} w^{i} \mathbf {e}^{i,\mathrm{vol}} = \sum _{i=1}^{N^{s}} w^{i} \mathbf {B}^{i,\mathrm{vol}} \triangle \mathbf {q}^{i} = \bar{\mathbf {B}}^{s,vol}\triangle \hat{\mathbf {q}}^{s} \end{aligned}$$
(33)

\(\bar{\mathbf{B }}^{s,\mathrm{vol}}\) is the volumetric part of the gradient matrix associated with patch s that is assembled from surrounding element \(\mathbf {B}^{i,vol}\)’s with weights \(w^i\). The deviatoric part of the strain increment \(\mathbf {e}^\mathrm{dev}\) is constant over each TET4 element. This leads to two separate distributions of the volumetric and deviatoric strain increment over the domain, as illustrated in Fig. 4.
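The split in equation (32) can be illustrated for a single element. The sketch below assumes the standard B-bar projection \(\mathbf {B}^\mathrm{vol}=\frac{1}{3}\mathbf {m}\mathbf {m}^{T}\mathbf {B}\) with \(\mathbf {m}=[1,1,1,0,0,0]^{T}\) and a Voigt ordering (xx, yy, zz, xy, yz, zx); both the ordering and the function names are assumptions made for illustration, and the patch averaging of equation (33) is not included.

```python
import numpy as np

def tet4_shape_gradients(X):
    """Gradients dN_a/dx_j (4x3) of the linear shape functions for nodes X (4x3)."""
    A = np.hstack([np.ones((4, 1)), X])
    return np.linalg.inv(A)[1:, :].T

def b_matrix(dN):
    """6x12 strain-displacement matrix with engineering shear strains."""
    B = np.zeros((6, 12))
    for a, (gx, gy, gz) in enumerate(dN):
        c = 3 * a
        B[0, c], B[1, c + 1], B[2, c + 2] = gx, gy, gz
        B[3, c], B[3, c + 1] = gy, gx
        B[4, c + 1], B[4, c + 2] = gz, gy
        B[5, c], B[5, c + 2] = gz, gx
    return B

def split_b(B):
    """Volumetric and deviatoric parts of B (assumed standard B-bar split)."""
    m = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
    B_vol = np.outer(m, m) @ B / 3.0
    return B_vol, B - B_vol

X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
B = b_matrix(tet4_shape_gradients(X))
B_vol, B_dev = split_b(B)

dq = np.random.default_rng(0).normal(size=12)   # arbitrary nodal increment
e, e_vol, e_dev = B @ dq, B_vol @ dq, B_dev @ dq
```

With this split the volumetric part carries the whole trace of the strain increment and the deviatoric part is traceless, which is what allows the two parts to be interpolated over different domains.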

Fig. 4 Strain distributions in the patch, tetrahedron and sub-domain of tetrahedron in the LIB method

Each TET4 element is divided into 4 equal sub-domains. Within each sub-domain, the volumetric and deviatoric parts of the strain increment are constant. The strain increment in a sub-domain \(\varOmega ^{i,s}\) is thus represented as:

$$\begin{aligned} \bar{\mathbf {e}}^{i,s}=\bar{\mathbf {e}}^{s,\mathrm{vol}} + \mathbf {e}^{i,\mathrm{dev}}=\bar{\mathbf {B}}^{s,vol}\triangle \hat{\mathbf {q}}^{s} + {\mathbf {B}}^{i,dev}\triangle \mathbf {q}^{i} =\bar{\mathbf {B}}^{i,s} \triangle \hat{\mathbf {q}}^{s} \end{aligned}$$
(34)

\(\bar{\mathbf {B}}^{i,s} \) is the modified gradient matrix associated with sub-domain \(\varOmega ^{i,s}\). Note that \(\triangle \mathbf {q}^{i}\) is contained in \(\triangle \hat{\mathbf {q}}^{s}\) as shown in equation (28), which allows the additive decomposition \(\bar{\mathbf {B}}^{i,s}= \bar{\mathbf {B}}^{s,\mathrm{vol}}+{\mathbf {B}}^{i,\mathrm{dev}}\). Analogous to \(\mathbf {B}\), the \(9 \times 12\) gradient matrix \(\mathbf {G}\) can be split into volumetric and deviatoric parts, i.e. \(\mathbf {G}=\mathbf {G}^\mathrm{vol}+\mathbf {G}^\mathrm{dev}\). The explicit form of \(\mathbf {G}^\mathrm{vol}\) is:

$$\begin{aligned} \mathbf {G}^\mathrm{vol}=\left[ \begin{array}{cccc} \mathbf {G}^\mathrm{vol}_{n1}, \mathbf {G}^\mathrm{vol}_{n2}, \mathbf {G}^{vol}_{n3}, \mathbf {G}^{vol}_{n4} \end{array} \right] \end{aligned}$$
(35)

where \(n1,\ldots ,n4\) are the local node indices in the element. For any \(a \in [n1,\ldots ,n4]\), \(\mathbf {G}^\mathrm{vol}_{a}\) is explicitly written as:

$$\begin{aligned} \mathbf {G}^\mathrm{vol}_{a}=\left[ \begin{array}{ccc} G_1&{} \quad G_2&{} \quad G_3\\ 0 &{} \quad 0 &{}\quad 0\\ 0 &{} \quad 0 &{}\quad 0\\ 0 &{} \quad 0 &{}\quad 0\\ G_1&{} \quad G_2&{} \quad G_3 \\ 0 &{} \quad 0 &{}\quad 0\\ 0 &{} \quad 0 &{}\quad 0\\ 0 &{} \quad 0 &{}\quad 0\\ G_1&{} \quad G_2&{} \quad G_3 \end{array}\right] \end{aligned}$$
(36)

where \(G_i=\frac{\partial N_a}{\partial x_i}\) and \(N_a\) is the shape function associated with node a. Substituting \(\mathbf {G}^{vol}\) and \(\mathbf {G}^{dev}\), the \(9 \times 1\) displacement gradient vector in the sub-domain \(\varOmega ^{i,s}\) is expressed as:

$$\begin{aligned} \left[ \begin{array}{ccccccc} \frac{\partial \triangle {u}_{1}}{\partial {x}^{t}_{1}}&\frac{\partial \triangle {u}_{1}}{\partial {x}^{t}_{2}}&\frac{\partial \triangle {u}_{1}}{\partial {x}^{t}_{3}}&\cdots&\frac{\partial \triangle {u}_{3}}{\partial {x}^{t}_{1}}&\frac{\partial \triangle {u}_{3}}{\partial {x}^{t}_{2}}&\frac{\partial \triangle {u}_{3}}{\partial {x}^{t}_{3}} \end{array} \right] ^{T}_{\varOmega ^{i}_{s}}\nonumber \\ \quad = \bar{\mathbf{G }}^{s,vol}\ \triangle \hat{\mathbf{q }}^{s} +\mathbf G ^{i,dev}\ \triangle \mathbf q ^{i} = \bar{\mathbf{G }}^{i,s} \, \triangle \hat{\mathbf{q }} ^{s} \end{aligned}$$
(37)

Correspondingly, the linearized weak form (8) with constant strain sub-domains reduces to:

$$\begin{aligned}&\sum _{i=1}^{N_{sub-tet}}\int _{{\varOmega }^{i,t}} {\mathbf {C}}^{i,t}:{\bar{\mathbf{e}}}^{i}\delta {\bar{\mathbf{e}}}^{i}\, d{\varOmega }^{i,t} + \sum _{i=1}^{N_\mathrm{sub-tet}}\int _{{\varOmega }^{i,t}} {\varvec{\sigma }}^{i,t}:\delta {\bar{\varvec{\eta }}}^{i} d {\varOmega }^{i,t}\nonumber \\&\quad =R^{\mathrm{ext}^{t+\triangle t}}-\sum _{i=1}^{N_{sub-tet}}\int _{{\varOmega }^{i,t}} {\varvec{\sigma }}^{i,t}:\delta {\bar{\mathbf{e}}}^{i}\, d\varOmega ^{i,t} \end{aligned}$$
(38)

where \(N_\mathrm{sub-tet}~(=4 \times N_{e})\) is the total number of sub-domains and \({\varOmega }^{i,t}\) is the sub-domain volume at time t. \({\varvec{\sigma }}^{i,t}\) and \({\mathbf {C}}^{i,t}\) are updated using the crystal plasticity constitutive models. Again it is assumed they are uniform and constant over the sub-domain i and one-point numerical integration can be used.

\({\mathbf {C}}^{i,t+\triangle t}\) and \({\varvec{\sigma }}^{i,t+\triangle t}\) depend on the deformation gradient \(\mathbf {F}^{t+\triangle t}_{0}\), as well as other history-dependent state variables. In the LIB element formulation, the evaluation of \(\mathbf {F}^{t+\triangle t}_{0}\) in each sub-domain must be consistent with the interpolation of strain with \(\bar{\mathbf {B}}\). This is achieved using the following relation:

$$\begin{aligned} {\mathbf {F}}_0^{t+\triangle t}&= {\mathbf {F}}_t^{t+\triangle t}{\mathbf {F}}_0^{t} =\left( \mathbf {I}+ \frac{\partial \triangle \mathbf {u}}{\partial \mathbf {x}^t}\right) \frac{\partial \mathbf {x}^{t}}{\partial \mathbf {x}^0} \nonumber \\&= \left( \mathbf {I}+ \bar{\mathbf {G}}^{t}\triangle \mathbf {q} \right) \frac{\partial \mathbf {x}^{t}}{\partial \mathbf {x}^0} \end{aligned}$$
(39)
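The multiplicative update of equation (39) can be sketched as below. For illustration the plain element gradient matrix \(\mathbf {G}\) stands in for \(\bar{\mathbf {G}}\), and the 9 components are assumed to be ordered row by row as in equation (29); all function names are hypothetical.

```python
import numpy as np

def tet4_shape_gradients(X):
    """Gradients dN_a/dx_j (4x3) of the linear shape functions for nodes X (4x3)."""
    A = np.hstack([np.ones((4, 1)), X])
    return np.linalg.inv(A)[1:, :].T

def g_matrix(dN):
    """9x12 matrix mapping nodal increments to the gradient vector of Eq. (29)."""
    G = np.zeros((9, 12))
    for a in range(4):
        for i in range(3):
            G[3 * i:3 * i + 3, 3 * a + i] = dN[a]
    return G

def update_deformation_gradient(F_old, G_bar, dq):
    """F_0^{t+dt} = (I + d(du)/dx^t) F_0^t, with the gradient taken from G_bar @ dq."""
    H = (G_bar @ dq).reshape(3, 3)   # H[i, j] = d(du_i)/dx_j
    return (np.eye(3) + H) @ F_old

# Homogeneous increment du = H0 x applied to a unit tetrahedron (test data only)
X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
H0 = np.array([[0.010, 0.002, 0.000],
               [0.000, -0.005, 0.003],
               [0.001, 0.000, -0.004]])
dq = (X @ H0.T).ravel()
F_old = np.diag([1.02, 0.99, 1.00])
F_new = update_deformation_gradient(F_old, g_matrix(tet4_shape_gradients(X)), dq)
```

In the LIB element the same update would be applied per sub-domain with the modified matrix, so that the deformation gradient passed to the constitutive model matches the modified strain interpolation.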

Substituting equations (34) and (37) into (38), the tangent stiffness matrix and internal nodal force vector in the LIB element formulation are expressed as:

$$\begin{aligned}&\mathbf {K}^t=\sum _{i=1}^{N_\mathrm{sub-tet}}\,\bar{\mathbf {B}}^{i,t\;T}\,{\mathbf {C}}^{i,t}\,{{\bar{\mathbf {B}}}^{i,t}} {\varOmega }^{i,t}+\sum _{i=1}^{N_\mathrm{sub-tet}}\,\bar{\mathbf {G}}^{i,t\;T}\,\underset{\sim }{{\varvec{\sigma }}}^{i,t}\,{{\bar{\mathbf {G}}}^{i,t}} {\varOmega }^{i,t} \end{aligned}$$
(40a)
$$\begin{aligned}&{\mathbf {f}^{int}}^t=\sum _{i=1}^{N_\mathrm{sub-tet}} \bar{\mathbf {B}}^{i,t\;T}\,{\varvec{\sigma }}^{i,t}{\varOmega }^{i,t} \end{aligned}$$
(40b)

The LIB element selectively reduces the volumetric strain components over the node-based patch and keeps the deviatoric strain components unchanged in each tetrahedral element. This stabilization method effectively alleviates volumetric locking without introducing spurious zero-energy modes.

3.3 F-bar patch-based (FP) element

The F-bar patch (FP) based stabilization method has been proposed in [12] for relieving volumetric locking in lower order tetrahedral elements. The F-bar patch method modifies the deformation gradient for stress tensor calculations such that incompressibility is enforced in the element in a weak sense, rather than a point-wise enforcement.

The Cauchy stress at the end of a time interval \(\left[ t,t+\triangle t \right] \) may be computed in terms of the deformation gradient and state variables \(\alpha ^{t}\) at time t as:

$$\begin{aligned} \varvec{\sigma }^{t+\triangle t}=\varvec{\sigma } \left( \alpha ^{t}, \mathbf {F}^{t+\triangle t} \right) \end{aligned}$$
(41)

The deformation gradient is decomposed into isochoric and volumetric components as:

$$\begin{aligned} \mathbf {F}=\mathbf {F}_\mathrm{iso} \mathbf {F}_{vol}~~~~\text{ where }~~ \mathbf {F}_\mathrm{iso}=\left( \mathrm{det} \mathbf {F} \right) ^ {-\frac{1}{3}} \mathbf {F} \quad \text{ and } \quad \mathbf {F}_\mathrm{vol}=\left( \mathrm{det} \mathbf {F} \right) ^{\frac{1}{3}} \mathbf {I} \end{aligned}$$
(42)

In the original F-bar formulation for four-node quadrilateral and eight-node hexahedral elements in [21], \(\mathbf {F}\) is first calculated at all Gauss quadrature points, along with \(\mathbf {F}_{0}\) at the element centroid. Subsequently, the stabilized deformation gradient \(\bar{\mathbf {F}}\) at each Gauss point is obtained by replacing the volumetric component with its value at the centroid, i.e.

$$\begin{aligned} \bar{\mathbf {F}}=\mathbf {F}_\mathrm{iso} \left( \mathbf {F}_{0}\right) _\mathrm{vol} = \left( \frac{\mathrm{det} \mathbf {F}_{0}}{\mathrm{det} \mathbf {F}}\right) ^{\frac{1}{3}} \mathbf {F} \end{aligned}$$
(43)

This implies that the determinant of \(\bar{\mathbf {F}}\) within the element is equal to the determinant of \(\mathbf {F}_{0}\). Thus, incompressibility in the constitutive model is enforced only at the centroid of the element, rather than at all Gauss points. The constitutive model is then solved at Gauss points using \(\bar{\mathbf {F}}\), i.e.

$$\begin{aligned} \varvec{\sigma }^{t+\triangle t}=\varvec{\sigma } \left( \alpha ^{t}, \bar{\mathbf {F}}^{t+\triangle t} \right) \end{aligned}$$
(44)
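Equations (42) and (43) are easy to verify numerically. The sketch below uses arbitrary test matrices for \(\mathbf {F}\) and \(\mathbf {F}_{0}\); the function names are illustrative.

```python
import numpy as np

def iso_vol_split(F):
    """Isochoric and volumetric factors of F, as in Eq. (42)."""
    J = np.linalg.det(F)
    return J ** (-1.0 / 3.0) * F, J ** (1.0 / 3.0) * np.eye(3)

def f_bar(F, F0):
    """Stabilized deformation gradient of Eq. (43)."""
    return (np.linalg.det(F0) / np.linalg.det(F)) ** (1.0 / 3.0) * F

# Arbitrary test values: F at a Gauss point, F0 at the element centroid
F = np.array([[1.10, 0.05, 0.00],
              [0.02, 0.95, 0.01],
              [0.00, 0.03, 1.02]])
F0 = np.diag([1.01, 0.99, 1.00])

F_iso, F_vol = iso_vol_split(F)
F_stab = f_bar(F, F0)
```

The isochoric factor has unit determinant, and the stabilized gradient inherits the determinant of \(\mathbf {F}_{0}\) while keeping the isochoric part of \(\mathbf {F}\).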

This methodology has been effective in overcoming volumetric locking for bilinear quadrilateral and trilinear hexahedral elements in [21]. However it is not directly applicable to linear tetrahedral elements as they have only one Gauss point located at the element centroid. Additionally the deformation gradient is constant in the element and hence, \(\bar{\mathbf {F}}\) in equation (43) becomes:

$$\begin{aligned} \bar{\mathbf {F}} = \mathbf {F}_{0} = \mathbf {F} \end{aligned}$$
(45)

Fig. 5 Patch of elements in the F-bar-patch method

Clearly, this relation will not help in overcoming volumetric locking in TET4 elements in the incompressibility limit. A modified formulation has been proposed in [12] where constitutive incompressibility is enforced over a patch of elements, rather than in each element. This requires that elements in the mesh be assigned to non-overlapping patches as illustrated in Fig. 5 in 2D. Let \(\mathscr {P}\) denote a set of elements forming a patch. The modified deformation gradient for element \(K \in \mathscr {P}\), is defined as

$$\begin{aligned} \bar{\mathbf {F}}_{K}=\left[ \frac{\varOmega ^{t+\triangle t}_\mathrm{patch}}{\varOmega ^{0}_\mathrm{patch} \; \mathrm{det}\, \mathbf {F}_K} \right] ^{\frac{1}{3}} \mathbf {F}_{K} \end{aligned}$$
(46)

where \(\varOmega ^{t+\triangle t}_\mathrm{patch}\) and \(\varOmega ^{0}_\mathrm{patch}\) are respectively the current and undeformed volumes of the patch \(\mathscr {P}\), calculated as:

$$\begin{aligned} \varOmega ^{0}_\mathrm{patch}=\sum _{K \in \mathscr {P}} {\varOmega ^{K,0}} \quad ,\quad \varOmega ^{t+\triangle t}_\mathrm{patch}=\sum _{K \in \mathscr {P}} {\varOmega ^{K,t+\triangle t}} \end{aligned}$$
(47)

It is noteworthy that the F-bar patch method reduces to the conventional tetrahedral element formulation if each element is identified with a patch. Adding more elements to the patch relaxes the incompressibility constraint ratio and helps relieve volumetric locking. However, the presence of too many elements in a patch may result in spurious energy modes. Through numerical experiments, it was inferred in [13] that 8 elements per patch is adequate for 3D problems without spurious mechanisms.
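The patch-level modification of equations (46) and (47) can be sketched as follows. The sketch assumes that the current volume of each TET4 element equals \(\mathrm{det}\,\mathbf {F}_K\) times its undeformed volume, which holds because \(\mathbf {F}_K\) is constant in the element; the function name and the test data are hypothetical.

```python
import numpy as np

def f_bar_patch(F_elems, vol0_elems):
    """Modified deformation gradients (Eq. 46) for all elements of one patch."""
    F_elems = np.asarray(F_elems, dtype=float)
    vol0 = np.asarray(vol0_elems, dtype=float)
    det_f = np.linalg.det(F_elems)
    vol_t = det_f * vol0                     # current element volumes
    ratio = vol_t.sum() / vol0.sum()         # patch volume ratio, Eq. (47)
    scale = (ratio / det_f) ** (1.0 / 3.0)
    return scale[:, None, None] * F_elems, ratio

# Hypothetical patch of four elements with different volume changes
F_elems = [np.diag([1.02, 1.00, 1.00]), np.diag([0.97, 1.00, 1.00]),
           np.diag([1.00, 1.01, 0.99]), np.diag([1.00, 1.00, 1.03])]
vol0_elems = [1.0, 2.0, 1.5, 0.5]
F_mod, ratio = f_bar_patch(F_elems, vol0_elems)

# A patch made of a single element leaves F unchanged
F_single, _ = f_bar_patch(F_elems[:1], vol0_elems[:1])
```

Every element of the patch ends up with the same determinant, equal to the patch volume ratio, so the volumetric constraint is imposed once per patch instead of once per element.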

The internal force vector in the F-bar patch method is evaluated using the modified deformation gradient in equation (46) as:

$$\begin{aligned} {\mathbf {f}^\mathrm{int}}^K=\varOmega ^{K,t} {\mathbf {B}^{K,t}}^T \varvec{\sigma }^{K,t} \end{aligned}$$
(48)

The tangent stiffness matrix has a non-conventional structure in the sense that it not only depends on the degrees of freedom of the element, but also on the degrees of freedom of other elements in the patch. The tangent stiffness matrix for element K is derived as:

$$\begin{aligned} \mathbf {K}^{KK}=&\,\varOmega ^{K,t} \; {\mathbf {G}^{K,t}}^T \; \mathbf {a} \; \mathbf {G}^{K,t} \nonumber \\ {}&+ \left( \frac{\varOmega ^{K,t}}{\varOmega ^{t}_\mathrm{patch}} -1\right) \; \varOmega ^{K,t} \; {\mathbf {G}^{K,t}}^T \; \mathfrak {I}\; \mathbf {G}^{K,t} \end{aligned}$$
(49a)
$$\begin{aligned} \mathbf {K}^{KJ}=&\,\frac{{\varOmega ^{K,t}}^2}{\varOmega ^{t}_\mathrm{patch}} \; {\mathbf {G}^{K,t}}^T \; \mathfrak {I}\; \mathbf {G}^{J,t}, \quad J \in \mathscr {P}; \quad J \ne K \end{aligned}$$
(49b)

Here \(\mathbf {K}^{KK}\) corresponds to stiffness components whose rows and columns are associated with the degrees of freedom of element K, whereas \(\mathbf {K}^{KJ}\) corresponds to components whose rows and columns are respectively associated with the degrees of freedom of elements K and J in the patch, with \(J \ne K\). The fourth-order spatial elasticity tensor \(\mathbf {a}\) is evaluated at \(\mathbf {F}=\bar{\mathbf {F}}\) [13], as

$$\begin{aligned} a_{ijkl}=\frac{1}{\mathrm{det} \mathbf {F}} A_{imkn}F_{jm}F_{ln} \end{aligned}$$
(50)

where \(\mathbf {A}\) denotes the elasticity tensor derived from the first Piola–Kirchhoff stress \(\mathbf {P}\) as \(A_{imkn}=\frac{\partial P_{im}}{\partial F_{kn}}\). \(\mathfrak {I}\) in equation (49) corresponds to the fourth-order tensor \(\mathfrak {I}=\frac{1}{3}\mathbf {a} : \left( \mathbf {I} \otimes \mathbf {I}\right) - \frac{2}{3} \left( \varvec{\sigma } \otimes \mathbf {I}\right) \).

The FP method can be used with a variety of material constitutive models. Its implementation in any standard displacement-based FE code is quite straightforward, as the DOFs are merely nodal displacements and constitutive updates are performed at the element quadrature points. While the calculation of the internal force vector is similar to that for TET4 elements, evaluation and assembly of the tangent stiffness matrix in (49) requires more attention because of the coupling between elements of the same patch.

4 LIB and FP stabilization methods in polycrystalline CPFE models

This section examines the application of LIB and FP induced LFS-TET4 elements to finite element models of polycrystalline microstructures, incorporating non-local rate-dependent crystal plasticity constitutive models. A special feature of these models is that they must account for discrete polycrystalline grain boundaries in the construction of the FE mesh and associated sub-structures. The constitutive update algorithms for the time increment between t and \(t+\triangle t\) use implicit time integration methods [34] to evaluate the Cauchy stress \(\hat{\varvec{\sigma }}^{s,t+\triangle t}\), slip rates and all deformation state variables, as well as the fourth-order tangent moduli tensor \(\mathbf {C}^{s,t+\triangle t}\). Important steps in the implementation are discussed next.

4.1 Creating stabilization patches conforming to grain boundaries

For polycrystalline microstructures, the node-based patches needed with the LIB and the FP methods must conform to the grain structures. Consider a node s located on the grain boundary of a 2D model as shown in Fig. 6. Slip systems are not continuous across the boundary of grains with crystallographic misorientation, which leads to discontinuities in the plastic strains. With this consideration, the patch assigned to the node s should not cross grain boundaries. It is not logical to construct and smooth over a single patch for a grain boundary node that connects multiple grains. Correspondingly sub-patches that are exclusive to a single grain are created with representation:

$$\begin{aligned}&\hat{\varOmega }^{s,t}=\hat{\varOmega }^{s,t}_{\mathrm{grain}_1}+\hat{\varOmega }^{s,t}_{\mathrm{grain}_2} \end{aligned}$$
(51a)
$$\begin{aligned}&\text{ where } \quad \hat{\varOmega }^{s,t}_{\mathrm{grain}_1}= \sum _{K=1}^{N^{s}_\mathrm{grain_1}} \frac{1}{4} ~\varOmega ^{K,t}~~~~ \text{ and }~~~~\nonumber \\&\hat{\varOmega }^{s,t}_\mathrm{grain_2}=\sum _{J=1}^{N^{s}_\mathrm{grain_2}}\frac{1}{4}~ \varOmega ^{J,t} \end{aligned}$$
(51b)

Here \(N^{s}_\mathrm{grain_1}\) and \(N^{s}_\mathrm{grain_2}\) correspond to the number of TET4 elements attached to a node s that is common to grains 1 and 2. This procedure can be generalized for nodes at triple and quadruple points. The smoothing process, and evaluation of constitutive variables, tangent stiffness matrix and internal forces, are carried out separately for each sub-patch.
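One way to organize the sub-patches of equation (51) is to key them by node and grain. The sketch below is a hypothetical data layout, not the implementation used in this work; `elem_grain` and `elem_volume` are assumed per-element arrays of grain labels and current volumes.

```python
import math
from collections import defaultdict

def grain_subpatches(tets, elem_grain, elem_volume):
    """Group the elements attached to each node by grain, as in Eq. (51)."""
    sub = defaultdict(lambda: {"elements": [], "volume": 0.0})
    for e, nodes in enumerate(tets):
        for s in nodes:
            key = (s, elem_grain[e])          # one sub-patch per (node, grain)
            sub[key]["elements"].append(e)
            sub[key]["volume"] += 0.25 * elem_volume[e]
    return dict(sub)

# Two tetrahedra sharing the face (1, 2, 3) and belonging to different grains
tets = [[0, 1, 2, 3], [1, 2, 3, 4]]
elem_grain = [0, 1]
elem_volume = [1.0 / 6.0, 1.0 / 3.0]
sub = grain_subpatches(tets, elem_grain, elem_volume)
```

Nodes 1, 2 and 3 lie on the grain boundary here and each receives two sub-patches, while the interior nodes 0 and 4 keep a single one. Nodes at triple or quadruple points are handled by the same keying without any change.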

Fig. 6 Constructing sub-patches for nodes on grain boundary in polycrystalline microstructures

4.2 Evaluating GNDs using a super-convergent patch recovery method

Numerical evaluation of \(\left\{ {\varvec{\varLambda }}\right\} \) requires computing the spatial derivatives of the plastic deformation gradient, which renders the model non-local. Using the linear TET4 shape functions, the Nye tensor is evaluated at any given point inside an element as:

$$\begin{aligned} {\varLambda _{ij}}=\varepsilon _{jrs}\frac{\partial \hat{F}^p_{ir}}{\partial x^{0}_s} =\varepsilon _{jrs}\sum _{\alpha =1}^{4} \left( {\hat{F}}^p_{ir} \right) ^{\alpha } \frac{\partial N^{\alpha }}{\partial x^{0}_s} \end{aligned}$$
(52)

where \(\varepsilon _{jrs}\) is the permutation tensor component, \(\left( {\hat{F}}^p_{ij} \right) ^{\alpha }\) are components of the nodal plastic deformation gradient tensor and \(N^{\alpha }\) are shape functions.
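Equation (52) maps directly onto two tensor contractions. The following sketch assumes that the nodal values of the plastic deformation gradient are already available; the function names and the linear test field are illustrative only.

```python
import numpy as np

def tet4_shape_gradients(X0):
    """Gradients dN_a/dx0_s (4x3) of the linear shape functions for nodes X0 (4x3)."""
    A = np.hstack([np.ones((4, 1)), X0])
    return np.linalg.inv(A)[1:, :].T

def levi_civita():
    """Permutation tensor eps[j, r, s]."""
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    return eps

def nye_tensor(Fp_nodes, dN):
    """Lambda_ij = eps_jrs * sum_a (Fp_ir)^a dN^a/dx0_s, as in Eq. (52)."""
    grad = np.einsum('air,as->irs', Fp_nodes, dN)     # dFp_ir/dx0_s
    return np.einsum('jrs,irs->ij', levi_civita(), grad)

# Test field: Fp = I except Fp_12 = k * x3 (1-based indices), linear in space
X0 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
k = 0.1
Fp_nodes = np.tile(np.eye(3), (4, 1, 1))
Fp_nodes[:, 0, 1] += k * X0[:, 2]

Lam = nye_tensor(Fp_nodes, tet4_shape_gradients(X0))
Lam_uniform = nye_tensor(np.tile(np.eye(3), (4, 1, 1)), tet4_shape_gradients(X0))
```

A spatially uniform plastic deformation gradient gives a zero Nye tensor, and the linear test field gives a single non-zero component \(\varLambda _{11}=k\), which is the exact curl of that field.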

An appropriate interpolation technique should be adopted for calculating the nodal plastic deformation gradient tensor to avoid numerical error in evaluating GND densities resulting in spurious high local stresses. The super-convergent patch recovery (SPR) method in [43] is deemed to be the most appropriate method for this purpose. The SPR method evaluates nodal values inside a super-convergent patch \(\varOmega _{p}\) by interpolating the variables using a higher order polynomial expansion as:

$$\begin{aligned} {F}^{p}_{ij}(\mathbf{x})= \left[ \mathbf P(x) \right] \mathbf {\left\{ \mathbf{a}\right\} }^{ij} \end{aligned}$$
(53)

where \({F}^{p}_{ij}(\mathbf{x})\) represents a higher order representation of components of plastic deformation gradient at a point \(\mathbf{x}\) in the patch. \(\left[ \mathbf P(x) \right] \) is the interpolation matrix containing polynomial basis functions as:

$$\begin{aligned} \left[ \mathbf P(x) \right] =\left[ 1,x,y,z,x^2,y^2,z^2,xy,yz,zx,\dots \right] \end{aligned}$$
(54)

\(\mathbf {\left\{ \mathbf{a}\right\} }^{ij}\) is the coefficient vector that is obtained by least squares minimization of the difference between the function in equation (53) and known values of \({F}^{p}_{ij}\) at the quadrature points of the elements in the patch. The functional to be minimized with respect to \(\mathbf {\left\{ \mathbf{a}\right\} }^{ij}\) is

$$\begin{aligned} f(\mathbf {\left\{ \mathbf{a}\right\} }^{ij})=\sum _{I=1}^{N_{P}}\left( {F}^{p}_{ij}(x,y,z)- \left[ \mathbf {P}(x,y,z) \right] \mathbf {\left\{ \mathbf{a}\right\} }^{ij}\right) ^2_{I} \end{aligned}$$
(55)

\(N_{P}\) is the number of elements in the super-convergent patch. The solution to this minimization problem is given in [43] as:

$$\begin{aligned} \mathbf {\left\{ \mathbf{a}\right\} }^{ij}=\left[ \mathbf {X}\right] ^{-1} \mathbf {\left\{ y\right\} }^{ij} \end{aligned}$$
(56)

where

$$\begin{aligned}&\left[ \mathbf {X}\right] =\sum _{I=1}^{N_{P}}\left[ \mathbf {P}(x,y,z)\right] ^T_{I} \left[ \mathbf {P}(x,y,z)\right] _{I} \end{aligned}$$
(57a)
$$\begin{aligned}&\mathbf {\left\{ y\right\} }^{ij}=\sum _{I=1}^{N_{P}} \left[ \mathbf {P}(x,y,z)\right] ^T_{I} F^{p}_{ij}(x,y,z)_{I} \end{aligned}$$
(57b)

Nodal values of \(F^{p}_{ij}\) are evaluated using equation (53). The super-convergent patches can be defined separately for each node by selecting the appropriate surrounding elements. The selection of this patch is important to avoid the ill-conditioning of \(\mathbf {X}\). Typically normalized coordinates are used in the construction of \(\left[ \mathbf {P}(x,y,z)\right] \) [43] as:

$$\begin{aligned} \bar{x}= & {} -1+2\frac{x-x_\mathrm{min}}{x_\mathrm{max}-x_\mathrm{min}}, ~~\bar{y}=-1+2\frac{y-y_\mathrm{min}}{y_\mathrm{max}-y_\mathrm{min}},\nonumber \\ \bar{z}= & {} -1+2\frac{z-z_\mathrm{min}}{z_\mathrm{max}-z_\mathrm{min}} \end{aligned}$$
(58)

Subscripts \(\mathrm{max}\) and \(\mathrm{min}\) correspond to the maximum and minimum coordinates in the patch. The normalized coordinates lie within the bounds \(-1\le \bar{x} \le 1\), \(-1\le \bar{y} \le 1\), and \(-1\le \bar{z} \le 1\). A weighted least-squares method, which can be used with large patches without discrimination, is adopted in this work. In this method, a weighting function is used that decays exponentially with the distance of the node from the element centroids in the patch. The functional in (55) is correspondingly modified as:

$$\begin{aligned} f(\mathbf {\left\{ \mathbf{a}\right\} }^{ij})=\sum _{I=1}^{N_{P}} w_{I} \left( F^{p}_{ij}(x,y,z)- \left[ \mathbf {P}(x,y,z) \right] \mathbf {\left\{ \mathbf{a}\right\} }^{ij}\right) ^2_{I} \end{aligned}$$
(59)

The exponentially decaying weight is chosen as \(w_{I}=\mathrm{exp}(- d_{I}/ \alpha )\), where \(d_{I}\) is the distance from the centroid of the I-th element in the patch to the node in question and \(\alpha \) is the decay length. The decay length is selected such that enough elements lie within it to yield a recovery matrix \(\left[ \mathbf {X}\right] \) with a good condition number.

5 Numerical examples and discussions

The performance of the locally integrated B-bar (LIB) and F-bar patch (FP) based TET4 elements in CPFE analysis of polycrystalline materials is studied in this section. A standard patch test is performed in the first example. Subsequently an elastic bending problem and several crystal plasticity examples, including a bicrystal compression test, polycrystalline beam bending and constant strain rate deformation of a polycrystalline aggregate, are solved. The results are compared with those of the standard TET4 element. When possible, the 8-noded hexahedral element with B-bar stabilization method is used to generate reference solutions for comparison. The computational costs for different element formulations are then compared for a crystal plasticity problem. For crystal plasticity problems, two low-symmetry HCP metallic alloys, viz. a magnesium alloy AZ31 and titanium alloy Ti6Al, are chosen for numerical simulations. For Mg alloy simulations, the constitutive model presented in section 2.1 is used, while a constitutive model described in [2] is used for Ti alloys.

5.1 Element patch test

The patch test is a necessary condition that should be satisfied for all elements in a finite element ensemble. This test is performed on a \(20 \times 20 \times 40\) cube discretized into 48 TET4 elements with 8 nodes on the outer surfaces and 13 nodes inside the cube. The material is assumed to be isotropic, linear elastic. Nodal displacements on the outer surfaces are prescribed using linear functions as:

$$\begin{aligned} \triangle u_1= & {} \frac{z+20}{200}-\frac{x-10}{100}, \qquad \triangle u_2=\frac{z+20}{200}-\frac{y-10}{100}, \nonumber \\ \triangle u_3= & {} 0 \end{aligned}$$
(60)

Nodal displacements inside the cube are calculated for NUS, LIB and FP based TET4 elements. A norm of the displacement error is defined as:

$$\begin{aligned} \overline{\mathrm{err}}_\mathrm{dis}= \frac{\sum _{\alpha =1}^{13} \sum _{i=1}^{3}\left( \mathbf {u}_{\alpha ~i}^\mathrm{exact} - \mathbf {u}_{\alpha ~i}^{FE}\right) ^2}{\sum _{\alpha =1}^{13} \sum _{i=1}^{3} \left( \mathbf {u}_{\alpha ~i}^\mathrm{exact}\right) ^2} \end{aligned}$$

For both LIB and FP elements \(\overline{\mathrm{err}}_\mathrm{dis} \le 2.22 \times 10^{-15}\) and hence they pass the standard patch test. The NUS element, however, does not pass the patch test since the determinant of the Jacobian of some elements becomes negative and the elements undergo non-physical distortion. This corresponds to the presence of spurious modes in NUS element.
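The error measure above can be evaluated as in the following sketch. The interior node positions and the perturbed solution are placeholders, since the actual mesh coordinates are not listed here; only the prescribed field of equation (60) and the error definition are taken from the text.

```python
import numpy as np

def exact_increment(p):
    """Prescribed linear displacement field of Eq. (60) at p = (x, y, z)."""
    x, y, z = p
    return np.array([(z + 20.0) / 200.0 - (x - 10.0) / 100.0,
                     (z + 20.0) / 200.0 - (y - 10.0) / 100.0,
                     0.0])

def displacement_error(u_exact, u_fe):
    """Relative squared displacement error summed over nodes and components."""
    return np.sum((u_exact - u_fe) ** 2) / np.sum(u_exact ** 2)

# Placeholder data: 13 interior points in an assumed 20 x 20 x 40 box
rng = np.random.default_rng(0)
interior = rng.uniform([0.0, 0.0, 0.0], [20.0, 20.0, 40.0], size=(13, 3))
u_exact = np.array([exact_increment(p) for p in interior])
u_fe = u_exact + 1e-9 * rng.normal(size=u_exact.shape)   # stand-in FE solution

err_same = displacement_error(u_exact, u_exact)
err_perturbed = displacement_error(u_exact, u_fe)
```

An element formulation passes the test when the computed interior displacements reproduce the linear field to machine precision, i.e. when this measure is of the order of round-off.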

5.2 Bending of an elastic beam

In this example, a nearly incompressible elastic beam subjected to a bending moment is solved using the LIB, FP and standard TET4 elements. The material is isotropic, linear elastic with Young’s modulus \(E=300\) MPa and Poisson’s ratio \(\nu =0.4999\). The beam has dimensions 4 m \(\times \) 1 m \(\times \) 1 m and is discretized into 31758 elements with 6513 nodes. The bending moment is applied by imposing a linearly distributed normal stress \(\sigma _{xx}\) on the surface \(x=4.0\) m. Displacement boundary conditions are applied on the surface \(x=0.0\) m to constrain rigid body motion, as shown in Fig. 7.

Fig. 7 Mesh and boundary condition for the elastic beam bending problem

Table 3 Maximum tip deflection of the bending dominated elastic beam at the limits of incompressibility using different element formulations

Fig. 8 a Convergence of the tip deflection for different element formulations. The dashed line corresponds to the reference solution predicted by 8-noded hexahedral element with B-bar stabilization. b Zoomed-in view of a showing the difference between LIB, FP4 and FP8 elements

TET4 elements exhibit volumetric locking for nearly incompressible materials due to too many incompressibility constraints. To generate a locking-free reference solution, the beam is solved using 4961 nodes and 4000 8-noded hexahedral elements with B-bar stabilization [42]. The maximum tip deflections are tabulated in Table 3. The results clearly show that the standard TET4 element suffers from severe volumetric locking, resulting in very stiff behavior. The LIB and FP elements, on the other hand, provide satisfactory results in comparison with the reference solution. Convergence of the LIB and FP elements is examined by solving the beam problem with 5 different meshes consisting of 343, 845, 1246, 2929 and 6513 nodes respectively. The tip deflections predicted by the LIB element, the FP element with a patch size of 4 tetrahedra (FP4) and the FP element with a patch size of 8 tetrahedra (FP8) are plotted in Fig. 8. The reference solution using the 8-noded hexahedral element with B-bar stabilization is plotted with the dashed line. The FP8 element shows the softest response and its solution is closest to the reference solution. This is because the larger patch in the FP8 element further reduces the number of incompressibility constraints. The patch construction for the LIB element formulation does not involve a flexible patch size, and it is observed that this element solution lies between those of FP4 and FP8 for all the meshes.
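A rough count of the incompressibility constraint ratio for the finest mesh helps to explain this ordering. The sketch below ignores boundary conditions and assumes one constraint per element for standard TET4, one per node for the node-based patch of the LIB element, and one per patch of n elements for the FP element; these counting rules are simplifying assumptions rather than results from this study.

```python
def constraint_ratio(n_nodes, n_constraints, dim=3):
    """Ratio of displacement DOFs to incompressibility constraints."""
    return dim * n_nodes / n_constraints

N_NODES, N_ELEMS = 6513, 31758          # finest beam mesh of Sect. 5.2

ratios = {
    "TET4": constraint_ratio(N_NODES, N_ELEMS),       # one constraint per element
    "LIB": constraint_ratio(N_NODES, N_NODES),        # one constraint per node
    "FP4": constraint_ratio(N_NODES, N_ELEMS / 4),    # one constraint per 4-element patch
    "FP8": constraint_ratio(N_NODES, N_ELEMS / 8),    # one constraint per 8-element patch
}
```

Under these assumptions the standard TET4 mesh has a ratio of about 0.62, far below the optimal value of 3, while the node-based count gives exactly 3 and lies between the FP4 and FP8 counts.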

5.3 Bicrystal compression test

A bicrystal under uniaxial compression is simulated in this example to understand the effect of volumetric locking in crystal plasticity FE analyses. The loading is applied along the Z direction and simulations are conducted using standard TET4, LIB and FP elements. Grain boundaries are important in crystal plasticity analysis as they produce dislocation pile-ups and stress concentrations, and often trigger failure initiation. A flat simple-tilt grain boundary in the bicrystal is chosen in this example. The grain boundary is characterized by the crystal orientations, which have Euler angles \(\left[ 0^\circ , 0^\circ , 0^\circ \right] \) and \(\left[ 0^\circ , 90^\circ , 0^\circ \right] \) defined in the \(Z -X- Z\) convention [44] for crystals 1 and 2 respectively. Each crystal has dimensions of \(10 \mu m \times 10 \mu m \times 10 \mu m\), as shown in Fig. 9a. Displacement boundary conditions are applied on the top surface, and minimal displacement boundary conditions are imposed on the other surfaces to remove the rigid body modes. The material constitutive models are those of the magnesium alloy AZ31 developed in [34].
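The two orientations can be visualized by converting the \(Z-X-Z\) Euler angles to rotation matrices and rotating the crystal c-axis. The sketch below assumes the composition \(R = R_z(\varphi _1) R_x(\varPhi ) R_z(\varphi _2)\) applied as an active rotation; the function names are hypothetical, and the opposite (passive) convention would use the transpose. For these two orientations the conclusion does not depend on that choice.

```python
import numpy as np


def rot_z(angle):
    """Rotation matrix about the Z axis (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_x(angle):
    """Rotation matrix about the X axis (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def euler_zxz_to_matrix(phi1_deg, Phi_deg, phi2_deg):
    """Assumed Z-X-Z composition: R = Rz(phi1) @ Rx(Phi) @ Rz(phi2)."""
    phi1, Phi, phi2 = np.radians([phi1_deg, Phi_deg, phi2_deg])
    return rot_z(phi1) @ rot_x(Phi) @ rot_z(phi2)


c_axis = np.array([0.0, 0.0, 1.0])  # crystal c-axis in the crystal frame
c_axis_1 = euler_zxz_to_matrix(0.0, 0.0, 0.0) @ c_axis
c_axis_2 = euler_zxz_to_matrix(0.0, 90.0, 0.0) @ c_axis
print(np.round(c_axis_1, 6), np.round(c_axis_2, 6))
```

Under this assumption the c-axis of crystal 1 is aligned with the loading direction Z, while the c-axis of crystal 2 lies in the plane perpendicular to it.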

Fig. 9 a Illustration of the boundary conditions and the crystallographic orientations for the constant strain rate compression test on a magnesium AZ31 alloy bicrystal; distribution of loading direction stress \(\sigma _{zz}\) in the deformed configuration at \(5~\%\) strain using simulation results of: b 8-noded hexahedral element using B-bar method with a mesh of 18,081 nodes, c standard TET4 element with a mesh of 11,862 nodes, d LIB element with a mesh of 11,862 nodes, and e FP element with a mesh of 11,862 nodes

Fig. 10 a Error plot of \(\left\| e\right\| _{L2}\) with increasing degrees of freedom (DOF). b Zoomed-in view of a to compare the error between FP8 element and LIB element

Fig. 11 Evolution of the maximum local hydrostatic stress with strain for different element formulations

Fig. 12 a Schematic of a 327-grain Ti6Al polycrystalline beam showing misorientation distribution; b distribution of effective plastic strain after 324 s for TET4, FP8 and LIB elements (from left to right)

This example shows that volumetric locking is observed in crystal plasticity FE analysis even for uniaxial loading. Here the locking is induced by the lattice mismatch between the grains rather than by the external loading. From Schmid factor analysis, plastic deformation is expected to occur primarily on the \(\langle c+a\rangle \) pyramidal plane in crystal 1 and the \(\langle a\rangle \) prismatic plane in crystal 2. However, dislocation glide may also occur on other slip systems close to the grain boundary, as the local stress state deviates from the average uniaxial stress state due to the lattice mismatch and plastic strain incompatibilities. Seven meshes of increasing density, consisting of 766, 1106, 1583, 2742, 4400, 6421 and 11862 nodes, are simulated using the standard TET4, LIB and FP elements. The reference solution, shown in Fig. 9b, is obtained by solving a mesh of 18,081 nodes using the 8-noded hexahedral element with the B-bar stabilization method. The distributions of the loading direction stress \(\sigma _{zz}\) using the standard TET4, LIB and FP elements are shown in Fig. 9c–e respectively. A very high stress concentration is observed at the grain boundary with the standard TET4 element compared to the solutions of the stabilized elements. Additionally, the TET4 result shows a non-smooth distribution of the local stress, which is not seen in the solutions with the LIB and FP elements. The error in the stress is evaluated as the L2 norm of the difference with the reference solution, normalized by the L2 norm of the computed stress, expressed as:

$$\begin{aligned} \left\| e\right\| _{L2}=\frac{\left[ \int _{\varOmega }\left( {\sigma }_{ij}-{\sigma }_{ij}^\mathrm{ref}\right) \left( {\sigma }_{ij}-{\sigma }_{ij}^\mathrm{ref}\right) d\varOmega \right] ^{\frac{1}{2}}}{\left( \int _{\varOmega }{\sigma }_{ij} {\sigma }_{ij} d\varOmega \right) ^{\frac{1}{2}}} \end{aligned}$$
(61)

where \(\varvec{\sigma }\) and \(\varvec{\sigma }^\mathrm{ref}\) are the computed and reference Cauchy stresses respectively. The corresponding error plots for the different elements with increasing mesh density are shown in Fig. 10. The average convergence rate for the LIB and FP elements is 0.75. For CPFE analysis, the LIB and FP elements exhibit similar results, with much smaller errors than the standard TET4 element.
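A minimal numerical sketch of the error measure in equation (61) is given below. It assumes that the computed and reference stresses have already been sampled at a common set of integration points with associated volume weights; how the reference solution on the hexahedral mesh is transferred to those points is not specified here and is left outside the sketch. The second function estimates a convergence rate as the negative slope of log(error) against log(DOF), which is one common definition and is an assumption about how the quoted rate is measured. All function names are hypothetical.

```python
import numpy as np


def relative_l2_stress_error(sigma, sigma_ref, weights):
    """Discrete form of equation (61).

    sigma, sigma_ref : arrays of shape (n_points, 3, 3)
    weights          : array of shape (n_points,), volume associated
                       with each integration point (assumed given)
    """
    diff = sigma - sigma_ref
    num = np.sum(weights * np.einsum("pij,pij->p", diff, diff))
    den = np.sum(weights * np.einsum("pij,pij->p", sigma, sigma))
    return np.sqrt(num / den)


def convergence_rate(n_dof, errors):
    """Negative least-squares slope of log(error) versus log(DOF)."""
    slope, _intercept = np.polyfit(np.log(n_dof), np.log(errors), 1)
    return -slope


# Synthetic check: the computed stress is 10 % larger than the reference
sigma_ref = np.tile(100.0 * np.eye(3), (2, 1, 1))
sigma = 1.1 * sigma_ref
err = relative_l2_stress_error(sigma, sigma_ref, np.array([0.5, 0.5]))

# Synthetic errors that decay as DOF**(-0.75)
dofs = np.array([1.0e3, 1.0e4, 1.0e5])
rate = convergence_rate(dofs, dofs ** -0.75)
print(err, rate)
```

Note that, following equation (61), the normalization uses the computed stress rather than the reference stress.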

For further stability analysis, the hydrostatic stress at the grain boundary is plotted in Fig. 11. Unrealistically large hydrostatic stresses are observed with conventional TET4 elements. With plastic incompressibility, the non-zero volumetric strain at each integration point gives rise to a large strain energy that results in a large hydrostatic stress. The LIB and FP elements significantly alleviate this problem and exhibit a saturation of the hydrostatic stress, which is consistent with the results of the stabilized 8-noded hexahedral element. In contrast, all element formulations yield nearly the same values of the von Mises stress, as the deviatoric strain energy is nearly unaffected by volumetric locking for this bicrystal problem. In real polycrystalline microstructures, however, the shear stress components are also affected by volumetric locking due to the existence of complex grain boundary patterns.
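The distinction between the two stress measures can be made concrete with the standard definitions of the hydrostatic stress (mean normal stress) and the von Mises stress. The sketch below uses an idealized uniaxial compressive stress of 200 MPa as a purely illustrative value; it shows that superposing a spurious pressure changes the hydrostatic stress but leaves the von Mises stress unchanged.

```python
import numpy as np


def hydrostatic_stress(sigma):
    """Mean normal stress, tr(sigma) / 3."""
    return np.trace(sigma) / 3.0


def von_mises_stress(sigma):
    """sqrt(3/2 s:s), where s is the deviatoric part of sigma."""
    s = sigma - hydrostatic_stress(sigma) * np.eye(3)
    return np.sqrt(1.5 * np.sum(s * s))


# Idealized uniaxial compression along Z (illustrative value, MPa)
sigma_uniaxial = np.diag([0.0, 0.0, -200.0])
# Same state with a spurious hydrostatic contribution superposed
sigma_locked = sigma_uniaxial - 150.0 * np.eye(3)

for name, sig in [("uniaxial", sigma_uniaxial), ("with spurious pressure", sigma_locked)]:
    print(name, hydrostatic_stress(sig), von_mises_stress(sig))
```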

5.4 Bending of a polycrystalline cantilever beam

The effect of volumetric locking on bending of a polycrystalline Ti6Al cantilever beam is investigated in this example. The beam is \(2000\,\mu \mathrm{m}\) long with a square cross-section of \(300 \times 300\,\mu \mathrm{m}^2\), as shown in Fig. 12a. It consists of 327 grains that are collectively discretized into 276544 TET4 elements. All 3 translational degrees of freedom are fixed on the left end. A linearly increasing shear traction is imposed in the Y direction on the right end to bend the beam mainly about the Z direction.

Fig. 13 Distribution of hydrostatic stress on the XY face of the beam after 324 s using different element formulations

Fig. 14 A 540-grain polycrystalline microstructure of Ti6Al alloy discretized into 583432 TET4 elements

Distributions of the effective plastic strain predicted by the different element formulations are plotted in Fig. 12b. At early stages of deformation, the response is primarily elastic and all element formulations predict nearly the same tip deflection with almost no locking. With increasing deformation, the material starts to deform plastically near the fixed end, as seen in Fig. 12b. This leads to formation of a plastic hinge near the clamped end, where the maximum bending moment occurs. As loading continues, plasticity accumulates near the fixed end, which further enhances the plastic hinge mechanism and facilitates overall rotation. Volumetric locking causes the TET4 element to predict less plastic strain than the other two elements, leading to significant under-prediction of the tip deflection.

Figure 13 shows the distribution of hydrostatic stress on the XY face of the beam. A checkerboard-type pattern is observed in the analysis with TET4 elements near the fixed end of the beam, where plastic strain is significant. These fluctuations are nearly eliminated in the results from the FP and LIB elements.

5.5 Constant strain-rate deformation of a polycrystalline microstructure

CPFE analysis using different element formulations is performed to investigate the effects of volumetric locking on the response of a polycrystalline microstructure under a constant rate of deformation. The \(680 \times 680 \times 680 \mu \mathrm{m}^3\) Ti6Al polycrystalline microstructure consists of 540 grains, as shown in Fig. 14 with the corresponding pole figures. A constant rate of deformation \(\dot{\epsilon }=9 \times 10^{-5}\mathrm{s}^{-1}\) is applied in the [001] direction. Figure 15a shows the results of simulations using different element formulations. In the elastic regime, all formulations predict the same macroscopic response since the material is elastically compressible. With increasing plasticity, the response obtained from the TET4 element suffers from volumetric locking and is stiffer, with a higher rate of hardening, than the response predicted by the FP8 and LIB elements. The FP8 and LIB elements predict almost the same response. The distribution of hydrostatic stress after 800 s, corresponding to nearly \(7~\%\) strain, in Fig. 15b clearly shows that the TET4 element tends to over-predict hydrostatic stresses. The FP8 and LIB elements perform equally well and their results have the same distributions.

Fig. 15 Comparison of a loading-direction true stress-strain response of polycrystalline Ti-6Al alloy under uniaxial tension in the [001] direction, and b distribution of hydrostatic stress in the polycrystalline microstructure after 800 s, by the different methods

5.6 Micro-twin nucleation in polycrystalline magnesium

Micro-twin nucleation in magnesium alloys is of significant interest for a wide range of applications [45–48]. The effect of volumetric locking in CPFE analysis with respect to micro-twin nucleation is illustrated in this example. A detailed CPFE framework has been established in [34] with the capability of capturing microstructural twin nucleation. The model considers non-planar dissociation of a sessile pyramidal \(\langle c+a\rangle \) dislocation into a stable twin nucleus, which can propagate under applied in-plane shear stress and leave behind a sessile stair-rod dislocation to conserve the Burgers vector. Energetic analysis of the dissociation process using the 3D elastic theory of dislocations suggests that a stable twin nucleus will form if the following energy-based criteria are satisfied:

$$\begin{aligned} E_\mathrm{ini}\ge & {} E_{tw}(d=0)+E_{r}\quad \text{ and } \nonumber \\ E_\mathrm{ini}> & {} E_{F}(d_{s},\tau _{tw}) \quad \forall ~ d_{s}>2r_{0} \end{aligned}$$
(62)

where \(E_\mathrm{ini}\) is the initial energy of the system, given by the self-energy of the sessile \(\langle c+a\rangle \) dislocation before dissociation. After dissociation occurs, \(E_{tw}\) is the self-energy of the twinning dislocation loop, \(E_r\) is the self-energy of the stair-rod dislocation, and \(d\) is the separation distance between the front segment of the twinning dislocation loop and the stair-rod dislocation. \(E_{F}\) is the energy of the system after dissociation. The first criterion states that the formation of the two partial dislocations is energetically favorable only if the initial energy exceeds the energy of the two partials before any further separation. The second criterion states that the equilibrium separation is energetically favorable and the process is irreversible if the final energy at a saddle point is less than the initial energy, and the saddle point separation must exceed a threshold distance (\(d_{s}>2r_{0}\)). Critical twin nucleation parameters in equation (62) have been calibrated from experiments in [34].
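The logic of the two criteria in equation (62) can be summarized as a simple check. The sketch below treats the energies as already evaluated scalars; the function name, argument names and numerical values are hypothetical, and the expressions for the energies themselves (given in [34]) are not reproduced here.

```python
def stable_twin_nucleus(E_ini, E_tw_0, E_r, E_F_saddle, d_s, r_0):
    """Evaluate the two energy-based criteria of equation (62).

    E_ini      : self-energy of the sessile <c+a> dislocation
    E_tw_0     : E_tw evaluated at zero separation, d = 0
    E_r        : self-energy of the stair-rod dislocation
    E_F_saddle : E_F evaluated at the saddle-point separation d_s
    d_s, r_0   : saddle-point separation and the length r_0 of eq. (62)
    """
    # Criterion 1: dissociation into two partials is energetically favorable
    favorable_dissociation = E_ini >= E_tw_0 + E_r
    # Criterion 2: final energy below initial energy at a saddle point
    # whose separation exceeds the threshold 2 * r_0
    irreversible_separation = (d_s > 2.0 * r_0) and (E_ini > E_F_saddle)
    return favorable_dissociation and irreversible_separation


# Hypothetical, dimensionless values for illustration only
print(stable_twin_nucleus(E_ini=10.0, E_tw_0=6.0, E_r=3.0,
                          E_F_saddle=8.0, d_s=5.0, r_0=1.0))
```

Because \(E_F\) depends on the resolved shear stress \(\tau _{tw}\), an over-predicted stress state changes the second criterion directly, which is how volumetric locking can affect the predicted nucleation time.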

The CPFE simulation of micro-twin nucleation is conducted on a statistically-equivalent \(40 \mu m \times 40 \mu m \times 40 \mu m\) virtual microstructure of the polycrystalline Mg alloy AZ31, as shown in Fig. 17. The virtual microstructure is constructed using the DREAM.3D software [49] following methods described in [50, 51]. It contains 103 grains with an average grain size of \(\sim 10 \mu m\), and matches morphological and crystallographic statistics with electron back-scattered diffraction (EBSD) data obtained from experiments in [52, 53]. The microstructure is discretized into 113425 tetrahedral elements with 21463 nodes. Displacement boundary conditions at a rate of \(0.004 \mu m/s\) are applied on the two surfaces in the Y-direction, which tend to bend the microstructure about the X-axis in the Y-Z plane (Fig. 16).

Fig. 16 Schematic of the applied bending boundary condition to polycrystalline Mg alloy AZ31, and the \(\left\{ 0001 \right\} \) and \(\left\{ 10\bar{1}0 \right\} \) pole figures showing the texture of the polycrystalline microstructure

Fig. 17 a Stable micro-twin dissociation distance as a function of loading time, and b loading direction stress at a material point in the center with loading time

Fig. 18 GND density distribution at the middle section after 500 s using: a TET4 elements, b FP8 elements, and c LIB elements

CPFE simulations using the TET4 element (simulation A), the F-bar patch element with a patch size of 8 tetrahedrons (FP8) (simulation B) and the LIB element (simulation C) reveal the difference when locking is removed. The predicted twin nucleation is plotted in Fig. 17a. The simulation using TET4 elements shows a much earlier twin nucleation time (97 s) than those using the LIB element (160 s) and the FP8 element (180 s). This difference is due to the difference in the stress states predicted by the element formulations. For the same level of displacement on the two surfaces, the conventional TET4 element shows a much stiffer response with much higher stresses. This is shown in Fig. 17b, where the loading direction stresses at the material point from which the twin nucleates are plotted. The non-physical stresses lead to an unrealistic external work in the TET4 elements to separate twin partials from stair-rod dislocations, and result in an early twin nucleation prediction. The TET4 element thus inaccurately predicts material failure in polycrystalline microstructures. This is clearly remedied with the locking-free elements. The FP8 element shows a slightly higher level of locking removal than the LIB element, due to a better constraint ratio. Comparison of the GND densities in Fig. 18 reveals that the simulations using LIB and FP8 elements show the highest GND concentrations close to grain boundaries. Volumetric locking in TET4 elements leads to a stiffer response to bending and constrains lattice distortion, which results in lower GND density.

Table 4 Comparison of CPU time for different element formulations

6 Computational efficiency with different element formulations

Alleviating volumetric locking with either the LIB or FP element incurs additional computational cost in comparison with the conventional TET4 element. The CPU times spent on LU factorization of the tangent stiffness matrix and on element-level calculations, including computation of the residual force and tangent stiffness matrix, are compared to assess the efficiency of the three formulations. Only one processor is used in all the simulations to rule out any effect of the parallel computing algorithms on the reported CPU times. LU factorization is carried out using the SuperLU package [54].

A small Ti6Al microstructure with 14 grains, discretized into 2141 elements, is considered for the efficiency study. The CPU times expended for LU factorization and element-level calculations for one iteration in the Newton-Raphson solution scheme are compared in Table 4. The CPU time spent on LU factorization for the locking-free elements is more than that for TET4 elements, because more degrees of freedom are connected to one another in the FP and LIB formulations. This makes the bandwidth of the tangent stiffness matrix larger, so more vector operations are needed for LU factorization. The factorization for FP elements takes less time than for LIB elements since the bandwidth of the FP tangent stiffness matrix is generally smaller. In regard to element-level calculations, the FP elements take slightly longer than TET4 elements. This is mostly due to calculating the modified deformation gradient in the patch and to calculating and assembling the cross-stiffness matrix \(K^{KJ}\) in equation (49). The LIB elements take significantly longer to perform element-level calculations since each element is divided into 4 sub-tetrahedrons in which the constitutive law is solved. The larger number of constitutive updates and assembly processes increases the CPU times for the LIB element. From this study, FP elements are preferred over LIB elements in CPFE simulations from an efficiency point of view.
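The growth in matrix connectivity can be illustrated with a small counting exercise. The sketch below assumes that, in a standard TET4 mesh, two nodes are coupled only if they share an element, whereas in a patch-based formulation all nodes belonging to the same patch are coupled. The four-element connectivity and the grouping into two patches are hypothetical and serve only to show the trend; the count is at node level, and each node carries 3 DOF.

```python
from itertools import product


def coupled_pairs(node_groups):
    """Set of ordered node pairs (i, j) that share at least one group."""
    pairs = set()
    for nodes in node_groups:
        pairs.update(product(nodes, repeat=2))
    return pairs


def half_bandwidth(pairs):
    """Largest node-index difference among coupled pairs."""
    return max(abs(i - j) for i, j in pairs)


# Hypothetical connectivity: 4 tetrahedra, nodes numbered 0..6
elements = [(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5), (3, 4, 5, 6)]
# Hypothetical patches, given as groups of element indices
patches = [(0, 1), (2, 3)]
patch_nodes = [
    sorted({n for e in patch for n in elements[e]}) for patch in patches
]

standard = coupled_pairs(elements)
patched = coupled_pairs(patch_nodes)
print(len(standard), half_bandwidth(standard))
print(len(patched), half_bandwidth(patched))
```

Even in this tiny example the patch-based coupling produces more non-zero node blocks and a larger half-bandwidth than the element-based coupling, which is consistent with the longer factorization times reported for the locking-free elements.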

7 Summary and conclusions

This paper examines three methods to overcome volumetric locking in 3D constant strain tetrahedral (TET4) elements and augments them for crystal plasticity finite element (CPFE) analysis of polycrystalline metals and alloys. The three methods are the node-based uniform strain (NUS) element, the locally integrated B-bar (LIB) element and the F-bar patch (FP) based element, which incorporate stabilization patches for selectively integrating parts of the constitutive relations. The LIB element splits the gradient operator matrix into isochoric and volumetric components and then reduces the incompressibility constraint by smoothing the volumetric component within the patch. The FP element modifies the deformation gradient tensor at each integration point using a volume-averaged value within each patch. Both the LIB and FP elements provide stabilized solutions without introducing the spurious low-energy modes seen with the NUS element. These elements also do not require isotropy in the material tangent stiffness tensor and can be applied to any constitutive law. Both of these elements pass the element patch test.
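As a compact reminder of the patch-averaging idea behind the FP element, the sketch below applies the commonly used F-bar-patch scaling: the volumetric part of each element's deformation gradient is replaced by the patch-averaged volume ratio, while the isochoric part is kept. This form is assumed here for illustration; the precise definition used in this paper is given in its formulation sections and may differ in detail. The function name and the numerical values are hypothetical.

```python
import numpy as np


def fbar_patch(F_elems, V_elems):
    """Scale each element F so that det(F_bar) equals the patch volume ratio.

    F_elems : array (n_elems, 3, 3), deformation gradient per element
    V_elems : array (n_elems,), reference volume per element
    """
    F_elems = np.asarray(F_elems, dtype=float)
    V_elems = np.asarray(V_elems, dtype=float)
    J = np.linalg.det(F_elems)                        # element volume ratios
    J_patch = np.sum(J * V_elems) / np.sum(V_elems)   # deformed / reference patch volume
    scale = (J_patch / J) ** (1.0 / 3.0)
    return scale[:, None, None] * F_elems, J_patch


# Two-element patch with slightly different volume changes (illustrative)
F = np.array([np.diag([1.10, 1.0, 1.0]), np.diag([0.95, 1.0, 1.0])])
V = np.array([1.0, 1.0])
F_bar, J_patch = fbar_patch(F, V)
print(J_patch, np.linalg.det(F_bar))
```

After the modification every element in the patch carries the same volume ratio, so the patch contributes a single volumetric constraint instead of one per element.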

Various finite deformation CPFE simulations are conducted to investigate the performance of LIB and FP elements in eliminating volumetric locking. Bending simulations of a nearly incompressible elastic bar show that both the LIB and FP elements provide satisfactory results and converge to the reference solution. The FP element is capable of providing slightly better results than the LIB element for an optimal patch size. CPFE simulations of polycrystalline magnesium and titanium alloys under various loading modes reveal that these elements are able to relieve the volumetric locking present with linear TET4 elements. The effects of locking are dominant near grain boundaries and cause locally high hydrostatic stresses and low plastic strains. The LIB and FP elements stabilize the displacement, local stresses, plastic strains and GND distributions in CPFE analyses. Linear convergence rates are seen in bicrystal compression tests. In modeling micro-twin induced material failure in polycrystalline microstructures of AZ31 alloy, a significantly premature micro-twin nucleation time is predicted by linear TET4 elements. This can be overcome by using the LIB or FP elements in CPFE analyses. Finally, when computational efficiency is considered, the FP element outperforms the LIB element with a considerably lower simulation time. Because the LIB element performs a constitutive update for each sub-tetrahedron, the number of Gauss points increases and the simulations slow down. From both accuracy and efficiency considerations, the FP element is deemed more suitable for stabilized CPFE analysis.