1 Introduction

A car structure generally consists of hundreds of fabricated metal panels and frames joined together by a combination of joining techniques such as spot welding, riveting, clinching, hinging and screwing. These joints are often considered the weakest points with regard to structural strength. Therefore, it is very important to understand the manufacturing process of joints and their mechanical behavior during the car design phase. In the vehicle’s virtual development process, extensive computer simulations are conducted using commercial finite element codes such as LS-DYNA® and Abaqus to study the performance of joints before a body-in-white structure is ready for production. Thus, it is critical for the automotive industry to use advanced computer-aided engineering (CAE) software that enables the modeling of various joints and the prediction of their failure in car structure and safety design.

In the meantime, lightweight and high-performance materials such as aluminum and galvanized or pre-painted steel are increasingly used in energy-saving vehicles. As resistance spot welding of those materials is difficult or even impossible, the automotive industry is moving toward new joining techniques such as flow drill screw (FDS) [1] and self-piercing rivet (SPR) [2] connections. These new joining methods often involve very complicated forming processes that are difficult to model by Lagrangian finite element methods. For example, modeling the extreme thread forming procedure in the FDS driving process involves extensive plastic deformation and material separation, which is very challenging for traditional finite element methods. Specifically, the C0-continuity assumption in most finite element methods is unable to describe the kinematic discontinuity of displacement fields in material separation simulation [3]. While the element deletion technique can be applied to reduce the excessive straining and mesh tangling problems caused by the C0-continuity assumption, it introduces another source of numerical instability associated with the loss of conservation of mass and linear momentum. Furthermore, due to material erosion, the desired deformed shape such as the threaded boss might not be formed at all. As a consequence, the numerical result could become unreliable and parameter sensitive in the FDS driving simulation.

Although the Eulerian finite element method can be readily applied to solid mechanics applications to overcome the mesh distortion problem, the Eulerian representation of material flow presents other numerical difficulties in tracking the material points and free surfaces in the FDS driving simulation. Unlike the Eulerian finite element method, the arbitrary Lagrangian–Eulerian (ALE) method moves the computational mesh inside the domain arbitrarily while keeping the mesh on the boundary moving along with the material flow. Despite this advantage of the ALE method in handling the free surface problem, its drawback is the appearance of numerical oscillations when the convective effect is dominant in the governing equations. This numerical instability often arises when the velocity difference between mesh movement and material flow becomes pronounced. As a generalization of the Eulerian approach, the ALE method also has difficulty modeling the formation of new surfaces in the course of material separation processes [4]. Consequently, the finite element literature [5, 6] on modeling the FDS thread forming process and its failure characteristics is very limited. The current finite element modeling strategy for FDS thread forming simulation is therefore insufficient, which could greatly impede the overall prediction of load-bearing structures in the vehicle’s virtual development process.

Alternatively, Lagrangian particle or meshfree methods [7] have grown in popularity as a practical numerical tool in industrial design and applications. Despite the generic name “meshfree”, not all meshfree methods are truly meshless. Some meshfree methods, particularly those based on the Galerkin method, actually require an auxiliary cell structure called “integration cells” or “background meshes” for the domain integration of weak forms. Although those methods are generally more accurate and stable than true particle methods, it is difficult for them to deal with the cell distortion that arises in severe deformation and material separation applications. Indeed, Lagrangian particle methods based on background meshes pose significant challenges in both mathematical formulation and programming when dealing with severe deformation problems. Therefore, it is important to develop a stable and accurate particle method that avoids the inherent limitation of cell-based meshfree methods for FDS thread forming simulations. However, it is widely known that most Lagrangian particle methods are susceptible to several numerical deficiencies in solid mechanics applications. While issues of approximation consistency [8,9,10], essential boundary conditions [11,12,13], spurious energy modes [14, 15] and tension instability [11, 16, 17] have been gradually resolved for particle methods, the numerical issues in severe deformation and material separation applications remain to be addressed.

The smoothed particle Galerkin (SPG) method developed by Wu et al. [3, 4, 18, 19] is one of the new stabilized Lagrangian particle methods that aim to bypass the need for a background mesh in ductile material failure simulation. The early development of the SPG method [3, 4, 18, 19] introduces first-order strain gradients by means of velocity (or displacement) smoothing to achieve the stabilization effect. This SPG stabilization formulation requires two distinct but coinciding integration points [19] per particle for integrating the weak form. Based on Chen’s implicit gradient expansion [20] and Liu’s strain gradient stabilization technique [21], Hillman and Chen [22] proposed another stabilized Lagrangian particle method for high-speed impact applications. A similar strain gradient stabilization technique was also considered by Wu et al. [23] to study the friction drilling application. Those particle methods share a common feature: they augment the standard quadratic energy functional by several stabilization terms containing first-order strain gradients to stabilize the results in severe deformation simulation. Because the stabilization in those particle methods is accomplished without the use of the momentum equation residual, they belong to the non-residual stabilization methods. In order to integrate those non-residual stabilization terms, multiple coinciding integration points are needed at each particle. Although those particle methods can sufficiently control spurious energy modes in nonlinear analysis, they are not computationally efficient. Additionally, since the strain gradient stabilization techniques used in those non-residual stabilization methods were initially derived based on the linear elasticity assumption [3], the evaluation of the stabilization stress in nonlinear material models is not straightforward. Following the stabilization work of Belytschko and Bindeman [24] in nonlinear finite element methods, a modified shear modulus [3, 4, 19, 22, 23] is usually specified to replace the elastic modulus in those non-residual stabilization terms for the stress update.

In order to improve the computational efficiency and to avoid the fundamental complication in the stabilization stress calculation, a new particle stabilization method [25] was recently proposed by the present authors. In this new stabilization method, a special velocity smoothing algorithm was introduced to replace the direct velocity smoothing in the original SPG method [3, 4, 18, 19]. This leads to a new stabilization formulation without residual or non-residual stabilization terms. In other words, the new method only requires one integration point per particle. As a result, the specification of a modified shear modulus and the evaluation of the stabilization stress can be completely avoided, thus cutting the computational cost. Because the new method based on the smoothed velocity field consistently fulfills the conservation of linear and angular momentum, we call it the momentum-consistent smoothed particle Galerkin (MC-SPG) method in this paper.

The objective of this study is to apply the MC-SPG formulation to the extreme thread forming procedure, which involves large deformation and material separation, in the FDS driving process. The remainder of the paper is organized as follows: The velocity smoothing algorithm in MC-SPG formulations is described in Sect. 2. In Sect. 3, additional numerical treatments to simulate the severe deformation and material separation problems are provided. Several numerical benchmarks including one thread forming simulation are given in Sect. 4. Conclusions are drawn in Sect. 5.

2 The velocity smoothing algorithm and MC-SPG formulations

Let’s first recall the weak form of the dynamic equation of motion as follows:

$$ \int_{\varOmega } \rho \varvec{\ddot{u}}^{h} \cdot \delta \varvec{u}^{h} \,{\text{d}}\varOmega + \int_{\varOmega } \varvec{\sigma} \cdot \nabla \delta \varvec{u}^{h} \,{\text{d}}\varOmega - \int_{\varOmega } \varvec{b} \cdot \delta \varvec{u}^{h} \,{\text{d}}\varOmega - \int_{\partial \varOmega^{h} } \varvec{h} \cdot \delta \varvec{u}^{h} \,{\text{d}}\varGamma = 0 $$
(1)

with initial conditions

$$ \varvec{u}^{h} \left( {\varvec{X},0} \right) = \varvec{u}_{0} \left( \varvec{X} \right) $$
(2)
$$ \dot{\varvec{u}}^{h} \left( {\varvec{X},0} \right) = \dot{\varvec{u}}_{0} \left( \varvec{X} \right) $$
(3)

where \( \rho \) is the current mass density, \( \varvec{\sigma} \) is the Cauchy stress, \( \varvec{b} \) is the body force density measured in the current configuration and \( \varvec{h} \) is the prescribed traction on the current boundary \( \partial \varOmega^{h} \). \( \varvec{u}^{h} \) is the approximation of the displacement field \( \varvec{u} \), and \( \varvec{\ddot{u}}^{h} \) is the corresponding acceleration. For a particle distribution denoted by an index set \( Z_{I} = \left\{ {\varvec{X}_{I} } \right\}_{I = 1}^{NP} \), approximating the displacement field \( \varvec{u} \) at time t using the first-order meshfree approximations [11, 12, 26] gives

$$ \varvec{u}^{h} \left( {\varvec{X},t} \right) = \sum\limits_{{I \in Z_{I} }}^{{}} {\phi_{I}^{a} \left( \varvec{X} \right)} \varvec{u}_{I} \left( {\varvec{X},t} \right){ = }\sum\limits_{{I \in Z_{I} }}^{{}} {\phi_{I}^{a} \left( \varvec{X} \right)} \varvec{u}_{I} $$
(4)

where NP is the total number of particles in the discretization. \( \phi_{I}^{a} \left( \varvec{X} \right),\quad I = 1, \ldots ,{\text{NP}} \) can be regarded as the Lagrangian shape functions for displacement field \( \varvec{u}^{h} \) where the superscript “a” denotes the support size of particle I. \( \dot{\varvec{u}}_{I} \) is the velocity at particle I. The introduction of Eq. (4) into Eq. (1) using the direct nodal integration (DNI) scheme gives the following standard semi-discrete equation to be solved numerically in Lagrangian particle methods.

$$ \varvec{M}^{\text{lump}} \varvec{\ddot{U}} = \varvec{F}^{\text{ext}} - \varvec{F}^{\text{int}} $$
(5)

where \( \varvec{M}^{\text{lump}} \) is the lumped mass matrix in diagonal form. The construction of the mass matrix and the mass lumping scheme follow our previous work in [3, 4, 23]. \( \varvec{\ddot{U}} \) is a vector containing all particle accelerations, and \( \varvec{F}^{\text{ext}} \) is the standard external force vector. \( \varvec{F}^{\text{int}} \) is the regular internal force term given by

$$ \varvec{F}_{I}^{\text{int}} = \sum\limits_{{J \in Z_{I} }}^{{}} {\varvec{B}_{I}^{T} } \left( {\varvec{X}_{J} } \right)\varvec{\varXi}\left( {\varvec{X}_{J} } \right)V_{J} $$
(6)

where \( \varvec{B}_{I} \) is the standard displacement gradient matrix, \( \varvec{\varXi}\left( {\varvec{X}_{J} } \right) = \left[ {\begin{array}{*{20}c} {\sigma_{xx} } & {\sigma_{yy} } & {\sigma_{zz} } & {\sigma_{yz} } & {\sigma_{xz} } & {\sigma_{xy} } \\ \end{array} } \right]_{{X_{J} }}^{T} \) is the component-wise stress vector of particle J and \( V_{J} \) is the volume of particle J.
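To make the nodal sum in Eq. (6) concrete, the following one-dimensional sketch may help. It is an illustration under stated assumptions, not the implementation used in this work: the shape functions are Shepard functions built from a Gaussian weight (only zeroth-order consistent, in place of the first-order meshfree approximation of Eq. (4)), and the names `shepard_and_gradient` and `internal_force` as well as all particle data are hypothetical. Because the shape functions form a partition of unity, their gradients sum to zero at every integration point, so the assembled internal forces sum to zero.

```python
import numpy as np

def shepard_and_gradient(x, a):
    """Return phi[I, J] = phi_I(X_J) and dphi[I, J] = d(phi_I)/dx at X_J.

    Assumption of this sketch: 1D Shepard functions with a Gaussian weight
    of width a (not the first-order approximation used in the paper).
    """
    r = (x[None, :] - x[:, None]) / a   # r[I, J] = (X_J - X_I) / a
    w = np.exp(-r**2)                   # weight of particle I evaluated at X_J
    dw = -2.0 * r / a * w               # derivative w.r.t. the evaluation point
    s = w.sum(axis=0)                   # sum over I at each X_J
    ds = dw.sum(axis=0)
    phi = w / s
    dphi = (dw * s - w * ds) / s**2     # quotient rule
    return phi, dphi

def internal_force(dphi, stress, volume):
    """Eq. (6) in 1D: F_I = sum_J B_I(X_J) * sigma(X_J) * V_J."""
    return dphi @ (stress * volume)

# Hypothetical particle data, chosen only for illustration.
x = np.array([0.0, 0.8, 2.1, 3.0, 4.3, 5.0])
volume = np.array([0.4, 1.05, 1.1, 1.1, 1.0, 0.35])
stress = np.array([1.0, -0.5, 2.0, 0.3, -1.2, 0.7])

phi, dphi = shepard_and_gradient(x, a=1.5)
f_int = internal_force(dphi, stress, volume)
print("internal forces:", f_int)
print("sum of internal forces:", f_int.sum())
```

In three dimensions the scalar gradient would be replaced by the matrix \( \varvec{B}_{I} \) and the scalar stress by the six-component vector \( \varvec{\varXi} \); the structure of the sum over integration points J is unchanged.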

It is known that the solution of Eq. (5) exhibits spurious energy modes in the displacement field. It is also intuitively evident that the oscillating solution can possibly be stabilized by a smoothing algorithm. These observations prompt the application of velocity smoothing in Lagrangian particle methods as a way to eliminate the spurious energy modes. In the original SPG method [3, 4, 19], this idea is realized by introducing a direct velocity smoothing algorithm for the particle velocity \( \dot{\varvec{u}}_{I} \) as described by

$$ \dot{\varvec{u}}_{I} = \sum\limits_{{J \in Z_{I} }}^{{}} {\phi_{J}^{a} \left( {\varvec{X}_{I} } \right)\varvec{\hat{\dot{u}}}_{J} } $$
(7)

where \( \varvec{\hat{\dot{u}}}_{J} \) is the unsmoothed (oscillating) velocity at particle J and \( \phi_{J}^{a} \left( {\varvec{X}_{I} } \right) \) is the smoothing function, which is the same as that in Eq. (4). A direct use of Eq. (7) to stabilize the solution of Eq. (5) as a post-processing procedure may cause a loss of conservation of linear and angular momentum. For this reason, a non-residual stabilization term was derived using Eq. (7) to reach the stabilization effect in the original SPG method [3, 4, 19]. This leads to the following new system of equations to solve in the original SPG method

$$ \varvec{M}^{\text{lump}} \varvec{\ddot{U}} = \varvec{F}^{\text{ext}} - \varvec{F}^{\text{int}} - \varvec{F}^{\text{stab}} $$
(8)

where

$$ \varvec{F}_{I}^{\text{stab}} = \sum\limits_{{J \in Z_{I} }}^{{}} {\tilde{\varvec{B}}_{I}^{T} \left( {\varvec{X}_{J} } \right)\tilde{\varvec{\sigma }}\left( {\varvec{X}_{J} } \right)V_{J} } $$
(9)

\( \tilde{\varvec{B}}_{I} \) is the displacement gradient matrix which contains the first-order strain gradient (see [3, 4, 19]) for computation of the stabilization force \( \varvec{F}^{\text{stab}} \). The stabilization stress \( \tilde{\varvec{\sigma }} \) in small strain analysis is formulated using a material response tensor (elasto-plastic tangent modulus) \( \varvec{C}^{\sigma } \) as given by [3, 4, 19]

$$ \tilde{\varvec{\sigma }} = \varvec{C}^{\sigma } :\left( {\nabla\varvec{\varepsilon}\left( {\hat{\varvec{u}}} \right) \cdot\varvec{\lambda}^{b} } \right) $$
(10)

where \( \varvec{\lambda}^{b} \) is the coefficient matrix [3, 4, 19] for stabilization. It should be noted that the coefficient matrix does not contain any artificial stabilization control parameter in the original SPG method. The stabilization term \( \varvec{F}^{\text{stab}} \) in Eq. (8) contains the first-order strain gradient [3, 4, 19], which can be regarded as a suitable gradient jump condition to stabilize the oscillating solution. In fact, the idea of adding a term that penalizes the jump of the gradient to stabilize the solution is not new in the computational fluid dynamics (CFD) community. Pioneering approaches that adopt this idea in solid mechanics applications have been demonstrated by Beissel and Belytschko [27] and Onate et al. [28] for elasticity analysis.

In practice, however, Eq. (10) cannot be directly applied to materially nonlinear analysis. This is because the numerical evaluation of the stabilization stress using Eq. (10) involves the elasto-plastic tangent modulus and is computationally infeasible in explicit dynamics analysis. For this reason, a modified shear modulus \( \tilde{G} \) is often used to replace the elasto-plastic tangent modulus, following the suggestion of Belytschko and Bindeman [24] for the finite element stabilization method, which is given by

$$ 2\tilde{G} = \sqrt {\frac{{H_{\Delta \tau } }}{{H_{\Delta e} }}} $$
(11)
$$ H_{\Delta \tau } = \frac{1}{2}\sum\limits_{i = 1}^{3} {\sum\limits_{j = 1}^{3} {\Delta \tau_{ij} \Delta \tau_{ij} } } ,\quad H_{\Delta e} = \frac{1}{2}\sum\limits_{i = 1}^{3} {\sum\limits_{j = 1}^{3} {\Delta e_{ij} \Delta e_{ij} } } $$
(12)

where \( \Delta \tau_{ij} \) and \( \Delta e_{ij} \) are the components of the deviatoric part of the stress and strain increments, respectively.
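As a small worked example of Eqs. (11) and (12), the sketch below evaluates the modified shear modulus from one stress increment and one strain increment. The function names and the `fallback` value returned for a vanishing deviatoric strain increment are assumptions of this sketch and are not specified in the text. For a linear elastic isotropic increment, where \( \Delta \tau_{ij} = 2G\Delta e_{ij} \), the formula should recover the elastic shear modulus G.

```python
import numpy as np

def deviator(t):
    """Return the deviatoric part of a 3x3 tensor."""
    return t - np.trace(t) / 3.0 * np.eye(3)

def modified_shear_modulus(d_stress, d_strain, fallback=0.0, tol=1e-30):
    """Modified shear modulus G~ following Eqs. (11)-(12).

    d_stress, d_strain: 3x3 stress and strain increments over one time step.
    fallback: value returned when the deviatoric strain increment vanishes
              (hypothetical choice made for this sketch only).
    """
    d_tau = deviator(d_stress)
    d_e = deviator(d_strain)
    h_tau = 0.5 * np.sum(d_tau * d_tau)   # H_{Delta tau} in Eq. (12)
    h_e = 0.5 * np.sum(d_e * d_e)         # H_{Delta e} in Eq. (12)
    if h_e < tol:
        return fallback
    return 0.5 * np.sqrt(h_tau / h_e)     # Eq. (11): 2 G~ = sqrt(H_tau / H_e)

# Linear elastic check with hypothetical shear modulus G and bulk modulus K.
G, K = 26.0, 70.0
d_eps = np.array([[1.0e-3, 2.0e-4, 0.0],
                  [2.0e-4, -5.0e-4, 1.0e-4],
                  [0.0, 1.0e-4, 3.0e-4]])
d_sig = 2.0 * G * deviator(d_eps) + K * np.trace(d_eps) * np.eye(3)
G_tilde = modified_shear_modulus(d_sig, d_eps)
print("recovered shear modulus:", G_tilde)
```

During plastic loading the deviatoric stress increment is smaller than the elastic prediction for the same strain increment, so the value returned by Eq. (11) drops below G.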

In contrast to the reconstruction of high-order derivatives in previous gradient stabilization methods, the second type of stabilization methods is the reformulation of the divergence form of governing equations as a first-order system. This, of course, leads to more unknowns and larger problems to solve. Representative CFD approaches in this type of stabilization methods for solid mechanics applications include Taylor–Galerkin stabilization algorithm [29], Jameson–Schmidt–Turkel algorithm [30] and streamline upwind Petrov–Galerkin algorithm [31]. The major disadvantage of these CFD stabilization algorithms is the contradictory demands on an artificial stabilization control parameter placed by the accuracy requirement. It is worth mentioning that improper selection of the stabilization control parameter may lead to the loss of stability properties particularly in the nonlinear material analysis. Moreover, because most of those stabilization methods are formulated in a total Lagrangian framework, their extension to severe deformation and material separation analysis still needs to be developed.

Although the original SPG method [3, 4, 19] eliminates the need for an artificial stabilization control parameter, the computation of the stabilization stress for materially nonlinear problems is not straightforward, and its performance in severe deformation and material failure analysis remains to be investigated. Furthermore, because the stabilization term in Eq. (8) needs to be integrated independently to achieve the stabilization effect, the method also incurs additional computational cost. To resolve those issues in the SPG method, a third type of particle stabilization method, the MC-SPG method [25], was developed. As its name indicates, the particle velocity \( \dot{\varvec{u}}_{I} \) in the MC-SPG method is smoothed via a special algorithm associated with the linear momentum, defined as follows

$$ \begin{aligned} \dot{\varvec{u}}_{I} & = \dot{\varvec{u}}\left( {\varvec{X}_{I} } \right):{ = }\frac{{\varvec{P}_{I} }}{{m_{I} }} = \frac{{\sum\nolimits_{{J \in Z_{I} }}^{{}} {\hat{m}_{J} \phi_{I}^{a} \left( {\varvec{X}_{J} } \right)\varvec{\hat{\dot{u}}}_{J} } }}{{\sum\nolimits_{{J \in Z_{I} }}^{{}} {\hat{m}_{J} \phi_{I}^{a} \left( {\varvec{X}_{J} } \right)} }}{ = }\sum\limits_{{J \in Z_{I} }}^{{}} {\left( {\frac{{\hat{m}_{J} \phi_{I}^{a} \left( {\varvec{X}_{J} } \right)}}{{\sum\nolimits_{{K \in Z_{I} }}^{{}} {\hat{m}_{K} \phi_{I}^{a} \left( {\varvec{X}_{K} } \right)} }}} \right)\varvec{\hat{\dot{u}}}_{J} } \\ & = \sum\limits_{{J \in Z_{I} }}^{{}} {\psi_{I}^{a} \left( {\varvec{X}_{J} } \right)\varvec{\hat{\dot{u}}}_{J} } , \, \varvec{X}_{I} \in Z_{I} \\ \end{aligned} $$
(13)

where \( \varvec{P}_{I} \) is the smoothed linear momentum of particle I and will be defined in Eq. (19) and

$$ \psi_{I}^{a} \left( {\varvec{X}_{J} } \right) = \frac{{\hat{m}_{J} \phi_{I}^{a} \left( {\varvec{X}_{J} } \right)}}{{\sum\nolimits_{{K \in Z_{I} }}^{{}} {\hat{m}_{K} \phi_{I}^{a} \left( {\varvec{X}_{K} } \right)} }} $$
(14)

and \( \psi_{I}^{a} \left( {\varvec{X}_{J} } \right) \) can be viewed as a modified smoothing function in contrast to the smoothing function \( \phi_{J}^{a} \left( {\varvec{X}_{I} } \right) \) in Eq. (7) in the original SPG method [3, 4, 19]. Equation (13) also defines a standard linear mapping between two particle systems \( \dot{\varvec{U}} \) and \( \varvec{\hat{\dot{U}}} \). Letting \( \varTheta_{h} :L^{2} \mapsto L^{2} \) be a discrete operator for the velocity smoothing, we have

$$ \dot{\varvec{U}}\left( {\varvec{X}_{I} } \right) = \varTheta_{h} \left( {\varvec{\hat{\dot{U}}}\left( {\varvec{X}_{I} } \right)} \right) \, \varvec{X}_{I} \in Z_{I} $$
(15)

or in matrix form

$$ \dot{\varvec{U}} = \varvec{T\hat{\dot{U}}}\text{,} \quad \varvec{T}_{IJ} = \psi_{I}^{a} \left( {\varvec{X}_{J} } \right)\varvec{I} $$
(16)
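The gather-type smoothing of Eqs. (13), (14) and (16) can be illustrated with a minimal one-dimensional sketch. It assumes Shepard functions with a cubic spline weight as a stand-in for the first-order meshfree shape functions, so in one dimension the operator \( \varvec{T} \) reduces to a plain matrix with entries \( \psi_{I}^{a} \left( {\varvec{X}_{J} } \right) \). The helper names and the particle data are hypothetical.

```python
import numpy as np

def shepard_matrix(x, a):
    """Phi[I, J] = phi_I(X_J): 1D Shepard functions, cubic spline weight of support a."""
    r = np.abs(x[None, :] - x[:, None]) / a
    w = np.where(r <= 0.5, 2.0 / 3.0 - 4.0 * r**2 + 4.0 * r**3,
                 np.where(r <= 1.0,
                          4.0 / 3.0 - 4.0 * r + 4.0 * r**2 - (4.0 / 3.0) * r**3,
                          0.0))
    return w / w.sum(axis=0, keepdims=True)   # partition of unity at each X_J

def smoothing_operator(phi, m_hat):
    """T[I, J] = psi_I(X_J) of Eq. (14), built from the unsmoothed masses m_hat."""
    weighted = phi * m_hat[None, :]            # m_hat_J * phi_I(X_J)
    return weighted / weighted.sum(axis=1, keepdims=True)

# Hypothetical particle positions, lumped masses and an oscillating velocity.
x = np.array([0.0, 0.9, 2.1, 3.0, 4.2, 5.0])
m_hat = np.array([1.0, 1.2, 0.8, 1.1, 0.9, 1.0])
v_hat = 1.0 + 0.1 * (-1.0) ** np.arange(x.size)

phi = shepard_matrix(x, a=2.5)
T = smoothing_operator(phi, m_hat)
v = T @ v_hat                                  # Eq. (16): smoothed velocity
print("unsmoothed:", v_hat)
print("smoothed:  ", v)
```

Each row of `T` sums to one, so a uniform velocity field is left unchanged, while the node-to-node oscillation in `v_hat` is damped because every smoothed value is a mass-weighted average over the neighbors of particle I.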

Equation (15) describes the computation of a global \( L^{2} \) projection. Subsequently, Eq. (16) gives

$$ \varvec{\hat{\dot{U}}} = \varvec{T}^{ - 1} \dot{\varvec{U}}\text{,}\quad \varvec{\hat{\dot{u}}}_{I} = \sum\limits_{{J \in Z_{I} }} {\varvec{T}_{IJ}^{ - T} \dot{\varvec{U}}_{J} } $$
(17)

which theoretically should be used for the velocity update. The two particle systems \( \dot{\varvec{U}} \) and \( \varvec{\hat{\dot{U}}} \) are collections of particle values \( \dot{\varvec{u}}_{I} \) and \( \varvec{\hat{\dot{u}}}_{I} \), respectively. It is clear that Eq. (17) involves a global matrix inversion, which is computationally expensive when the modified smoothing function \( \psi_{I}^{a} \left( {\varvec{X}_{J} } \right) \) is updated through time during the large deformation analysis. In what follows, we replace Eq. (17) by the following approximation for the velocity update.

$$ \varvec{\hat{\dot{u}}}_{I} \approx \sum\limits_{{J \in Z_{I} }}^{{}} {\phi_{J}^{a} \left( {\varvec{X}_{I} } \right)\dot{\varvec{u}}_{J} } $$
(18)

It is important to note that using this approximation in the velocity update still conserves linear and angular momentum, as will be shown later.

Finally, the smoothed linear momentum \( \varvec{P}_{I} \) and smoothed mass \( m_{I} \) of particle I in Eq. (13) are defined by

$$ \varvec{P}_{I} : = \sum\limits_{{J \in Z_{I} }}^{{}} {\hat{m}_{J} \phi_{I}^{a} \left( {\varvec{X}_{J} } \right)\varvec{\hat{\dot{u}}}_{J} } $$
(19)
$$ m_{I} : = \sum\limits_{{J \in Z_{I} }}^{{}} {\hat{m}_{J} \phi_{I}^{a} \left( {\varvec{X}_{J} } \right)} $$
(20)

where \( \hat{m}_{J} \) in Eqs. (19) and (20) is the unsmoothed lumped mass of particle J. The value of \( \hat{m}_{J} \) is computed directly using the direct nodal or Gauss integration schemes at t = 0 as described in [23] and does not change with time.

It is essential that Eq. (13) and Eqs. (19)–(20) employ a “gather” type [32] of smoothing function \( \phi_{I} \left( {\varvec{X}_{J} } \right) \) instead of the “scatter” type of smoothing function \( \phi_{J} \left( {\varvec{X}_{I} } \right) \) used in the original SPG method [see Eq. (7)]. In general, \( \phi_{J} \left( {\varvec{X}_{I} } \right) \) and \( \phi_{I} \left( {\varvec{X}_{J} } \right) \) are not equal. The use of the gather type of smoothing function yields the following equality, which is critical to the preservation of linear and angular momentum in the MC-SPG method.

$$ \sum\limits_{{I \in Z_{I} }}^{{}} {m_{I} = \sum\limits_{{I \in Z_{I} }}^{{}} {\sum\limits_{{J \in Z_{I} }}^{{}} {\hat{m}_{J} \phi_{I}^{a} \left( {\varvec{X}_{J} } \right) = \sum\limits_{{J \in Z_{I} }}^{{}} {\hat{m}_{J} \left( {\sum\limits_{{I \in Z_{I} }}^{{}} {\phi_{I}^{a} \left( {\varvec{X}_{J} } \right)} } \right)} } } } = \sum\limits_{{J \in Z_{I} }}^{{}} {\hat{m}_{J} } $$
(21)

Equation (21) indicates that the global mass is preserved under the gather type of mass smoothing. It also can be verified that the scatter type of mass smoothing algorithm does not necessarily ensure the conservation of global mass. Now let’s recall the momentum of particle I before velocity smoothing to be

$$ \hat{\varvec{P}}_{I} = \hat{m}_{I} \varvec{\hat{\dot{u}}}_{I} $$
(22)

Using Eqs. (13), (18)–(22), it is straightforward to prove that the global linear momentum \( \varvec{P} \) is conserved, that is

$$ \varvec{P} = \sum\limits_{{I \in Z_{I} }}^{{}} {\varvec{P}_{I} = } \sum\limits_{{I \in Z_{I} }}^{{}} {\hat{\varvec{P}}_{I} = \hat{\varvec{P}}} $$
(23)
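The conservation statements in Eqs. (21) and (23) can be checked numerically with a short sketch. It assumes one-dimensional Shepard functions with a simple compact weight (a hypothetical choice that only provides the partition of unity used in Eq. (21)); all particle data are illustrative.

```python
import numpy as np

def shepard_matrix(x, a):
    """Phi[I, J] = phi_I(X_J) for 1D Shepard functions (hypothetical compact weight)."""
    r = np.abs(x[None, :] - x[:, None]) / a
    w = np.maximum(1.0 - r, 0.0) ** 2
    return w / w.sum(axis=0, keepdims=True)    # sum over I equals 1 at every X_J

# Hypothetical non-uniform particles, lumped masses and unsmoothed velocities.
x = np.array([0.0, 0.7, 1.9, 3.1, 3.8, 5.0])
m_hat = np.array([0.5, 1.3, 0.9, 1.4, 0.6, 1.0])
v_hat = np.array([1.0, -0.4, 0.8, 0.1, -1.1, 0.5])

phi = shepard_matrix(x, a=2.0)
m = phi @ m_hat                    # Eq. (20): gather-type smoothed mass
P = phi @ (m_hat * v_hat)          # Eq. (19): gather-type smoothed momentum
v = P / m                          # Eq. (13): smoothed velocity

mass_gather = m.sum()
mass_scatter = (phi.T @ m_hat).sum()   # scatter type, generally not conserved
print("total mass (unsmoothed, gather, scatter):",
      m_hat.sum(), mass_gather, mass_scatter)
print("total momentum (unsmoothed, smoothed):",
      (m_hat * v_hat).sum(), (m * v).sum())
```

The gather-type totals agree with the unsmoothed ones to round-off because the sum over I of \( \phi_{I}^{a} \left( {\varvec{X}_{J} } \right) \) equals one at every particle J, whereas the scatter-type total depends on the sum over J of \( \phi_{J}^{a} \left( {\varvec{X}_{I} } \right) \), which need not equal one.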

Subsequently, it is also not difficult to show (see [25] for proof) that the following global conservation of angular momentum is valid

$$ \sum\limits_{{I \in Z_{I} }}^{{}} {\varvec{x}\left( {\varvec{X}_{I} } \right) \times m_{I} \dot{\varvec{u}}_{I} = } \sum\limits_{{I \in Z_{I} }}^{{}} {\varvec{x}\left( {\varvec{X}_{I} } \right) \times \hat{m}_{I} \varvec{\hat{\dot{u}}}_{I} } $$
(24)

When the central difference integration scheme is used for the temporal discretization, Eq. (13) becomes

$$ \dot{\varvec{u}}_{I}^{n + 1/2} = \sum\limits_{{J \in Z_{I} }}^{{}} {\psi_{I}^{a} \left( {\varvec{X}_{J} } \right)\varvec{\hat{\dot{u}}}_{J}^{n + 1/2} } $$
(25)

Using Eq. (25), the particle density can be updated by

$$ \rho_{I}^{n + 1} = \rho_{I}^{n} \left( {1 - \Delta t_{n + 1/2} \nabla \cdot \dot{\varvec{u}}_{I}^{n + 1/2} } \right) $$
(26)

where \( \Delta t_{n + 1/2} = t_{n + 1} - t_{n} \). With the updated particle density in Eq. (26), the new particle volume can also be obtained by

$$ V_{I}^{n + 1} = \frac{{\hat{m}_{I} }}{{\rho_{I}^{n + 1} }} $$
(27)

Consequently, the particle displacements can be updated using Eq. (25) by

$$ \varvec{u}_{I}^{n + 1} = \varvec{u}_{I}^{n} + \Delta t_{n + 1/2}^{{}} \dot{\varvec{u}}_{I}^{n + 1/2} $$
(28)

The particle strain rate \( \dot{\varvec{\varepsilon }}_{I} = \nabla^{s} \left( {\dot{\varvec{u}}_{I}^{{}} } \right) \) also can be computed accordingly. The concept of gather type of velocity smoothing in the MC-SPG method is illustrated in Fig. 1.

Fig. 1 Illustration of a gather type of velocity smoothing in the MC-SPG method
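The kinematic updates in Eqs. (26) to (28) amount to a few array operations per time step. The sketch below assumes that the smoothed velocity and its divergence at the half step are already available; the function name `advance_particles` and the sample values are hypothetical.

```python
import numpy as np

def advance_particles(rho, u, m_hat, v_half, div_v_half, dt):
    """One explicit update of density, volume and displacement.

    rho, m_hat, div_v_half: arrays of shape (N,)
    u, v_half: arrays of shape (N, dim); v_half is the smoothed velocity of Eq. (25)
    dt: time step t_{n+1} - t_n
    """
    rho_new = rho * (1.0 - dt * div_v_half)   # Eq. (26)
    vol_new = m_hat / rho_new                 # Eq. (27)
    u_new = u + dt * v_half                   # Eq. (28)
    return rho_new, vol_new, u_new

# Two hypothetical particles in one dimension.
rho = np.array([2.0, 2.0])
m_hat = np.array([3.8, 4.2])
u = np.array([[1.0], [0.0]])
v_half = np.array([[2.0], [-1.0]])
div_v_half = np.array([0.5, -0.5])

rho_new, vol_new, u_new = advance_particles(rho, u, m_hat, v_half, div_v_half, dt=0.1)
print(rho_new, vol_new, u_new.ravel())
```

Note that the volume is obtained from the unsmoothed lumped mass \( \hat{m}_{I} \), which does not change with time, so only the density carries the volumetric change.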

Now it is necessary to update the unsmoothed particle velocity in Eq. (25) during the time stepping. Let’s first compute the increment of unsmoothed particle velocity using the modified unsmoothed particle velocity in Eq. (18) and the (smoothed) acceleration from Eq. (5) to yield

$$ \Delta \varvec{\hat{\dot{u}}}_{I}^{n} \approx \Delta t_{n} \sum\limits_{{J \in Z_{I} }}^{{}} {\phi_{J}^{a} \left( {\varvec{X}_{I} } \right)\varvec{\ddot{u}}_{J}^{n} } $$
(29)

where \( \Delta t_{n} = t_{n + 1/2} - t_{n - 1/2} \). The update of unsmoothed particle velocity is accomplished by

$$ \begin{aligned} \varvec{\hat{\dot{u}}}_{I}^{n + 1/2} & = \varvec{\hat{\dot{u}}}_{I}^{n - 1/2} + \Delta \varvec{\hat{\dot{u}}}_{I}^{n} \\ & = \varvec{\hat{\dot{u}}}_{I}^{n - 1/2} + \Delta t_{n} \sum\limits_{{J \in Z_{I} }}^{{}} {T_{IJ}^{ - T} } \left( {\varvec{X}_{I} } \right)\varvec{\ddot{U}}_{J}^{n} \\ & \approx \varvec{\hat{\dot{u}}}_{I}^{n - 1/2} + \Delta t_{n} \sum\limits_{{J \in Z_{I} }}^{{}} {\phi_{J}^{a} } \left( {\varvec{X}_{I} } \right)\varvec{\ddot{u}}_{J}^{n} \\ \end{aligned} $$
(30)

Using Eqs. (19), (20) and (30), the global linear momentum becomes

$$ \begin{aligned} \hat{\varvec{P}}^{n + 1/2} & = \sum\limits_{{I \in Z_{I} }}^{{}} {\hat{\varvec{P}}_{I}^{n + 1/2} = } \sum\limits_{{I \in Z_{I} }}^{{}} {\hat{m}_{I} \varvec{\hat{\dot{u}}}_{I}^{n + 1/2} } \\ & = \sum\limits_{{I \in Z_{I} }}^{{}} {\hat{m}_{I} \varvec{\hat{\dot{u}}}_{I}^{n - 1/2} + \Delta t_{n} \sum\limits_{{I \in Z_{I} }}^{{}} {\sum\limits_{{J \in Z_{I} }}^{{}} {\hat{m}_{I} \phi_{J}^{a} \left( {\varvec{X}_{I} } \right)\varvec{\ddot{u}}_{J}^{n} } } } \\ & = \varvec{P}^{n - 1/2} + \Delta t_{n} \sum\limits_{{J \in Z_{I} }}^{{}} {\left( {\sum\limits_{{I \in Z_{I} }}^{{}} {\hat{m}_{I} \phi_{J}^{a} \left( {\varvec{X}_{I} } \right)} } \right)} \varvec{\ddot{u}}_{J}^{n} \\ & = \varvec{P}^{n - 1/2} { + }\Delta t_{n} \sum\limits_{{J \in Z_{I} }}^{{}} {m_{J} \varvec{\ddot{u}}_{J}^{n} } \, \\ & = \varvec{P}^{n - 1/2} + \Delta \varvec{p}^{n} \\ & = \varvec{P}^{n + 1/2} \\ \end{aligned} $$
(31)

which verifies that the present velocity smoothing algorithm is consistently preserving the global linear momentum in the explicit time integration scheme. The preservation of global angular momentum also can be shown analogously (see [25] for proof). This smoothed velocity update algorithm is illustrated in Fig. 2.

Fig. 2 Illustration of the scatter type of velocity update in the MC-SPG method
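The momentum balance in Eq. (31) can be verified numerically for the approximate update of Eq. (30). The sketch below is one-dimensional and assumes Shepard functions with a hypothetical compact weight; the accelerations are arbitrary sample values rather than the solution of Eq. (5).

```python
import numpy as np

def shepard_matrix(x, a):
    """Phi[I, J] = phi_I(X_J) for 1D Shepard functions (hypothetical compact weight)."""
    r = np.abs(x[None, :] - x[:, None]) / a
    w = np.maximum(1.0 - r, 0.0) ** 2
    return w / w.sum(axis=0, keepdims=True)

def update_unsmoothed_velocity(v_hat_old, acc, phi, dt):
    """Eq. (30), last line: scatter-type update of the unsmoothed velocity.

    phi.T[I, J] = phi_J(X_I), so (phi.T @ acc)[I] = sum_J phi_J(X_I) * acc_J.
    """
    return v_hat_old + dt * (phi.T @ acc)

# Hypothetical particle data.
x = np.array([0.0, 0.7, 1.9, 3.1, 3.8, 5.0])
m_hat = np.array([0.5, 1.3, 0.9, 1.4, 0.6, 1.0])
v_hat_old = np.array([1.0, -0.4, 0.8, 0.1, -1.1, 0.5])
acc = np.array([0.3, -0.2, 0.6, -0.9, 0.4, 0.1])
dt = 1.0e-2

phi = shepard_matrix(x, a=2.0)
m = phi @ m_hat                                   # Eq. (20): smoothed mass
v_hat_new = update_unsmoothed_velocity(v_hat_old, acc, phi, dt)

momentum_old = (m_hat * v_hat_old).sum()
momentum_new = (m_hat * v_hat_new).sum()
impulse = dt * (m * acc).sum()                    # Delta t * sum_J m_J * a_J
print("momentum change:", momentum_new - momentum_old, "impulse:", impulse)
```

The change in global momentum equals the impulse computed with the smoothed masses, which mirrors the step in Eq. (31) where the scatter-type sum over \( \hat{m}_{I} \phi_{J}^{a} \left( {\varvec{X}_{I} } \right) \) is recognized as the gather-type mass \( m_{J} \) of Eq. (20).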

The stabilization scheme in the MC-SPG method does not involve first- or higher-order strain gradient terms as needed in the residual and non-residual types of stabilization methods [3, 4, 19]. Because of that, the MC-SPG method only requires a single integration point at each particle, and therefore, the method is computationally more efficient than the original SPG method. As the stress update can be performed in the usual way at each particle using the standard material constitutive law, no special elastic modulus needs to be specified for a stabilization stress calculation, unlike in the non-residual type of stabilization methods. The major differences between the SPG and MC-SPG methods are summarized in Table 1.

Table 1 The difference between the SPG and MC-SPG methods

3 The numerical treatments for severe deformation and material separation analysis

In this section, the stabilization formulation described in the previous section is revised to cover the application in simulating the extensive plastic deformation and material separation problems during the severe FDS thread forming process. This is done by introducing two additional numerical features, the adaptive anisotropic Lagrangian kernel and the bond-based failure criterion originally proposed in the SPG method [3, 4, 19], to the MC-SPG method.

3.1 The adaptive anisotropic Lagrangian kernel for severe deformation analysis

It is recalled that Lagrangian shape functions, \( \phi_{I}^{a} \left( \varvec{X} \right),I = 1, \ldots ,{\text{NP}} \), constructed in Eq. (4) utilize Lagrangian kernels to remove the tension instability [11] in the nonlinear structural analysis. However, particle methods based on Lagrangian shape functions experience the excessive straining problem when the strict use of the Lagrangian kernel is no longer applicable. Specifically, the excessive straining during the severe plastic deformation simulation in the FDS thread forming process inevitably causes a numerical breakdown when the deformation gradient computed at the particle ceases to be invertible. This is the same issue as the mesh distortion problem in the finite element method.

In order to handle the excessive straining problem in the MC-SPG method, an adaptive anisotropic Lagrangian kernel is considered [3, 4, 19]. Using the chain rule, the calculation for the deformation gradient at the particle can be rewritten as

$$ \varvec{F}_{{}}^{n + m} = \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{F} }_{{}}^{n + m} \varvec{F}^{n} $$
(32)

where \( \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{F} }^{n + m} \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{x} }} \right) \) is the decomposed deformation gradient, from \( t = t_{n} \) to \( t_{n + m} \), computed based on the new reference configuration and is given by

$$ \begin{aligned} \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{F}_{ij}^{n + m} \left( {\varvec{X}_{J} } \right) & = \frac{{\partial \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{x}_{i} }}{{\partial \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X}_{j} }} = \sum\limits_{{I \in Z_{I} }}^{{}} {\frac{{\partial \phi_{I}^{a} \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J} } \right)}}{{\partial \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X}_{j} }}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{x}_{iI} \left( {\varvec{X},t_{n + m} } \right)} \\ & = \sum\limits_{{I \in Z_{I} }}^{{}} {\frac{{\partial \phi_{I}^{a} \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J} } \right)}}{{\partial \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X}_{j} }}\left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X}_{iI} + \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{u}_{iI} \left( {\varvec{X},t_{n + m} } \right)} \right)} \\ & = \delta_{ij} + \sum\limits_{{I \in Z_{I} }}^{{}} {\frac{{\partial \phi_{I}^{a} \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J} } \right)}}{{\partial \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X}_{j} }}\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{u}_{iI} \left( {\varvec{X},t_{n + m} } \right)} \\ \end{aligned} $$
(33)

Here, \( \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{x} } = \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} } + \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{u} }\left( {\varvec{X},t_{n + m} } \right) \) is a position vector defined with respect to the new reference configuration, i.e., \( \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} } = \varvec{x}\left( {\varvec{X},t_{n} } \right) \). For each particle I, a local \( \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }^{I} \)-coordinate system is defined whose axes are parallel to the global Cartesian axes and whose origin is located at \( \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{I} \). In each new reference configuration, an ellipsoidal nodal support is defined for the neighbor particle search. The three-dimensional ellipsoidal cubic spline kernel function is defined in another local \( \varvec{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} } }^{I} \)-coordinate system by

$$ \varphi_{I}^{a} \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J} } \right) = \varphi_{1} \left( {\frac{{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J}^{I} }}{{h_{1}^{n} }}} \right)\varphi_{1} \left( {\frac{{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Y} }_{J}^{I} }}{{h_{2}^{n} }}} \right)\varphi_{1} \left( {\frac{{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{J}^{I} }}{{h_{3}^{n} }}} \right) $$
(34)

where \( \varphi_{1} \) is a standard one-dimensional cubic spline kernel function, and \( h_{1}^{n} \), \( h_{2}^{n} \) and \( h_{3}^{n} \) are the current semi-axes of the ellipsoid. The lengths of the semi-axes can be regarded as the support sizes of the kernel and are updated according to the deformation [19]. \( \overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J}^{I} \), \( \overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Y} }_{J}^{I} \) and \( \overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{J}^{I} \) are the projections of the relative position vector \( \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J} - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{I} \) onto the axes of the local \( \varvec{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} } }^{I} \)-coordinate system, respectively. The adaptive anisotropic Lagrangian kernel is updated at regular intervals during the simulation. Between two consecutive adaptive Lagrangian kernel steps, the initially spherical support domain of the cubic spline kernel function deforms and rotates according to the Lagrangian motion. We refer the reader to Ref. [19] for a comprehensive description of the approach. For computational efficiency in explicit time integration, the material derivatives of the meshfree shape functions are computed and stored at each new reference configuration and reused during the subsequent time steps.
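The tensor-product structure of Eq. (34) can be illustrated with a minimal Python sketch. The specific cubic B-spline window, its normalization to a unit support, and the function names `cubic_spline_1d` and `ellipsoidal_kernel` are assumptions made for illustration only; they are not taken from the LS-DYNA implementation.

```python
def cubic_spline_1d(z):
    """One-dimensional cubic B-spline window with support |z| <= 1.

    This particular form and normalization is an assumption; the paper
    only states that a standard cubic spline kernel is used.
    """
    r = abs(z)
    if r <= 0.5:
        return 2.0 / 3.0 - 4.0 * r**2 + 4.0 * r**3
    if r <= 1.0:
        return 4.0 / 3.0 - 4.0 * r + 4.0 * r**2 - (4.0 / 3.0) * r**3
    return 0.0


def ellipsoidal_kernel(X_I, X_J, axes, h):
    """Evaluate the product kernel of Eq. (34) for the particle pair (I, J).

    X_I, X_J: particle positions in the new reference configuration
    axes:     three orthonormal vectors spanning the local kernel frame of I
    h:        support sizes (h1, h2, h3) along those three axes
    """
    rel = [xj - xi for xi, xj in zip(X_I, X_J)]
    value = 1.0
    for axis, h_k in zip(axes, h):
        # Projection of the relative position onto one local axis.
        proj = sum(a * r for a, r in zip(axis, rel))
        value *= cubic_spline_1d(proj / h_k)
    return value


# Hypothetical usage: a neighbor at distance 1.5 along the first axis lies
# outside a unit support but inside a support stretched to h1 = 2.
GLOBAL_AXES = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
outside = ellipsoidal_kernel((0, 0, 0), (1.5, 0, 0), GLOBAL_AXES, (1.0, 1.0, 1.0))
inside = ellipsoidal_kernel((0, 0, 0), (1.5, 0, 0), GLOBAL_AXES, (2.0, 1.0, 1.0))
```

In this sketch, stretching one support size is what keeps a neighbor connected as the material elongates in that direction, which is the purpose of adapting \( h_{1}^{n} \), \( h_{2}^{n} \) and \( h_{3}^{n} \) to the deformation.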

Since the adaptive anisotropic Lagrangian kernel operates at the particle level and does not involve remeshing, stress-recovery techniques and remapping procedures are not necessary. This property of the present approach leads to a relatively simple mathematical formulation for simulating severe plastic deformation problems.
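As a small illustration of Eqs. (32) and (33), the following Python sketch builds the incremental deformation gradient from shape function derivatives and nodal displacements measured from the new reference configuration, and then composes it with the stored gradient. The function names and the two-neighbor data are hypothetical and chosen only to keep the arithmetic transparent.

```python
def incremental_deformation_gradient(shape_grads, displacements):
    """Eq. (33): F_hat[i][j] = delta_ij + sum_I dphi_I/dX_j * u_iI.

    shape_grads:   per-neighbor gradient of the shape function (3 components)
    displacements: per-neighbor displacement since the last kernel update
    """
    F_hat = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for grad, u in zip(shape_grads, displacements):
        for i in range(3):
            for j in range(3):
                F_hat[i][j] += grad[j] * u[i]
    return F_hat


def compose(F_hat, F_n):
    """Eq. (32): total deformation gradient F = F_hat * F_n."""
    return [
        [sum(F_hat[i][k] * F_n[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


# Hypothetical data: two neighbors, a 10% axial stretch since the last update,
# applied on top of a stored stretch of 2 in the same direction.
grads = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
disps = [(0.05, 0.0, 0.0), (-0.05, 0.0, 0.0)]
F_n = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
F_total = compose(incremental_deformation_gradient(grads, disps), F_n)
```

Because only the increment is computed with the current kernel, the shape function derivatives never have to span the full deformation history, which is how the update limits the excessive straining discussed above.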

3.2 The bond-based failure criterion for material separation analysis

Excessive straining also appears in the FDS thread forming process when the material starts to fail at the thread forming stage. Specifically, the C1-continuity assumption in Lagrangian particle methods is inadequate to describe the kinematic discontinuity of the displacement field in a continuous setting for failure analysis [3]. This makes it very challenging for conventional Lagrangian particle methods to simulate the extreme thread forming during the FDS driving process.

To further avoid the excessive straining problem caused by the assumption of a continuous displacement field in the thread forming simulation, a bond-based failure criterion [3, 23] is incorporated into the present stabilization formulation. The origins of this approach can be traced back to bond-based peridynamics [33, 34], in which material failure is modeled through bond breakage. In Lagrangian particle methods, a bond represents the connection between two particles. Given the bond length \( \left\| {\varvec{X}_{J} - \varvec{X}_{I} } \right\| \) of a particle pair I and J in the initial configuration, the stretch ratio \( e_{IJ} \) of the bond is defined by

$$ e_{IJ} = \frac{{\left\| {\varvec{x}_{J} - \varvec{x}_{I} } \right\|}}{{\left\| {\varvec{X}_{J} - \varvec{X}_{I} } \right\|}} $$
(35)

For the thread forming simulation, we restrict our attention to material failure in metals. In this bond-based failure criterion, two neighboring particles are considered disconnected during the neighbor particle search whenever their averaged effective plastic strain and their stretch ratio both exceed their respective critical values. Accordingly, the three-dimensional ellipsoidal cubic spline kernel function in Eq. (34) for a pair of particles I and J can be modified as:

$$ \varphi_{I}^{a} \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J} } \right) = \left\{ {\begin{array}{*{20}l} {0,} \hfill & {{\text{if}}\; \, \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J} \; \notin \;{\text{supp}}\left( {\varphi_{I}^{a} } \right)\;{\text{or}}\;\left( {\bar{\varepsilon }_{IJ}^{P} > \bar{\varepsilon }_{\text{crit}}^{P} {\text{ and }}e_{IJ} > e_{\text{crit}} } \right)} \hfill \\ {\varphi_{1} \left( {\frac{{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J}^{I} }}{{h_{1}^{n} }}} \right)\varphi_{1} \left( {\frac{{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Y} }_{J}^{I} }}{{h_{2}^{n} }}} \right)\varphi_{1} \left( {\frac{{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{\overset{\lower0.1em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{J}^{I} }}{{h_{3}^{n} }}} \right) \, ,} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right. $$
(36)

where \( \bar{\varepsilon }_{IJ}^{P} = \left( {\bar{\varepsilon }^{P} \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{I} } \right) + \bar{\varepsilon }^{P} \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{X} }_{J} } \right)} \right)/2 \) is the averaged effective plastic strain in the bond of length \( \left\| {\varvec{X}_{J} - \varvec{X}_{I} } \right\| \), and \( \bar{\varepsilon }^{P} \) is the effective plastic strain. \( \bar{\varepsilon }_{\text{crit}}^{P} \) is the critical effective plastic strain for bond failure, and \( e_{\text{crit}} \) denotes the critical stretch ratio. We consider \( e_{\text{crit}} \ge 1.0 \) in our numerical analysis, which implies that bond failure does not occur under compression. This assumption is valid for most metal failure processes.

Because the effective plastic strain at each particle increases monotonically during the course of deformation, the kinematic disconnection of a particle pair is permanent and irreversible. This is an essential characteristic of the present bond-based failure mechanism in metal failure analyses, since it also completely excludes the non-physical material self-healing that can result from a generic neighbor searching algorithm.
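The bond check of Eqs. (35) and (36) can be sketched in Python as follows. The class `BondTable`, its method names, and the explicit set of broken bonds are hypothetical bookkeeping introduced for illustration, and the particle data in the usage lines are invented; only the two-condition test itself follows the text.

```python
import math


def stretch_ratio(x_I, x_J, X_I, X_J):
    """Eq. (35): current bond length divided by initial bond length."""
    return math.dist(x_I, x_J) / math.dist(X_I, X_J)


class BondTable:
    """Tracks which particle pairs have been disconnected (illustrative only)."""

    def __init__(self, eps_crit, e_crit):
        self.eps_crit = eps_crit  # critical effective plastic strain
        self.e_crit = e_crit      # critical stretch ratio (>= 1.0)
        self.broken = set()       # assumed storage that keeps failure permanent

    def is_connected(self, I, J, x_I, x_J, X_I, X_J, eps_I, eps_J):
        """Return True if bond (I, J) may contribute to the kernel support."""
        key = (min(I, J), max(I, J))
        if key in self.broken:
            return False
        eps_IJ = 0.5 * (eps_I + eps_J)  # averaged effective plastic strain
        e_IJ = stretch_ratio(x_I, x_J, X_I, X_J)
        # Both conditions must hold, as in the first branch of Eq. (36).
        if eps_IJ > self.eps_crit and e_IJ > self.e_crit:
            self.broken.add(key)
            return False
        return True


# Hypothetical usage with the critical values quoted for the FDS example.
bonds = BondTable(eps_crit=0.4, e_crit=1.15)
origin = (0.0, 0.0, 0.0)
stretched = bonds.is_connected(0, 1, origin, (1.2, 0, 0), origin, (1.0, 0, 0), 0.5, 0.5)
compressed = bonds.is_connected(0, 2, origin, (0, 0.8, 0), origin, (0, 1.0, 0), 0.9, 0.9)
```

In this sketch the stretched pair is disconnected while the heavily strained but compressed pair stays connected, reflecting the choice \( e_{\text{crit}} \ge 1.0 \).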

4 Numerical examples

The present MC-SPG formulation was recently implemented in the commercial software LS-DYNA® [35] by the authors. To demonstrate the accuracy and efficiency of the present method, three examples are studied in this section using the LS-DYNA code. The first example is a small deformation wave propagation problem, which is compared with the analytical solution. The second example is a large deformation Taylor bar impact simulation, which is compared with experimental data. The last example demonstrates the numerical capability in the extreme thread forming simulation, including material failure and separation. A convergence study is conducted for all examples to inspect the stability of the present method in nonlinear analysis.

4.1 Elastic wave propagation problem

We now test how well the present method captures wave propagation in a dynamic, small deformation event. To assess stability, the convergence of a dimensionless elastic wave propagation problem is considered. The test problem can be stated as:

$$ \rho \ddot{u} = Eu,_{xx} \quad {\text{on}}\;]0,L[\text{ } \times \text{ }]0,T[ $$
(37)

with boundary and initial conditions given as:

$$ \begin{aligned} & u(0,t) = 0,\quad t \in ]0,T[ \\ & u,_{x} (L,t) = 0 \\ & u(X,0) = 0,\text{ }\quad X \in ]0,L[ \\ & \dot{u}(X,0) = - 0.01 \\ \end{aligned} $$
(38)

The analytical solution of the displacement and stress fields can be derived and given by:

$$ \begin{aligned} & u(X,t) = \sum\limits_{n = 1}^{\infty } {\left\{ {A_{n} \sin (\omega_{n} t)\sin \left[ {(2n - 1)\frac{\pi X}{2L}} \right]} \right\}} \\ & \sigma \left( {X,t} \right) = Eu_{,x} (X,t) = E\sum\limits_{n = 1}^{\infty } {\left\{ {(2n - 1)\frac{\pi }{2L}A_{n} \sin (\omega_{n} t)\cos \left[ {(2n - 1)\frac{\pi X}{2L}} \right]} \right\}} \\ \end{aligned} $$
(39)

with

$$ \omega_{n} = \frac{(2n - 1)\pi }{2L}\sqrt {\frac{E}{\rho }} \quad A_{n} = \frac{{8L\dot{u}(X,0)}}{{(2n - 1)^{2} \pi^{2} }}\sqrt {\frac{\rho }{E}} $$
(40)

To solve the problem numerically, a three-dimensional rod with a square cross-section of 1.2 × 1.2 is used, with E = 100.0, ν = 0.0, ρ = 100.0 and L = 10.0. To analyze the convergence behavior of the proposed formulation, three discretizations are used with nodal distances of 0.2 (coarse), 0.1 (medium) and 0.05 (fine), respectively.
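For reference, the series solution of Eqs. (39) and (40) can be evaluated with a short Python sketch, with the stress taken as \( E u_{,x} \). The truncation at 2000 terms and the function names are assumptions made for illustration; the material and geometric values are those listed above, and the initial velocity is the one prescribed in Eq. (38).

```python
import math

E, RHO, L, V0 = 100.0, 100.0, 10.0, -0.01  # modulus, density, length, initial velocity


def _mode(n):
    """Wave number, frequency and amplitude of mode n, following Eq. (40)."""
    k = (2 * n - 1) * math.pi / (2.0 * L)
    omega = k * math.sqrt(E / RHO)
    amplitude = 8.0 * L * V0 / ((2 * n - 1) ** 2 * math.pi**2) * math.sqrt(RHO / E)
    return k, omega, amplitude


def displacement(X, t, n_terms=2000):
    """Truncated series for u(X, t) in Eq. (39)."""
    total = 0.0
    for n in range(1, n_terms + 1):
        k, omega, amplitude = _mode(n)
        total += amplitude * math.sin(omega * t) * math.sin(k * X)
    return total


def stress(X, t, n_terms=2000):
    """Truncated series for sigma(X, t) = E * du/dX."""
    total = 0.0
    for n in range(1, n_terms + 1):
        k, omega, amplitude = _mode(n)
        total += k * amplitude * math.sin(omega * t) * math.cos(k * X)
    return E * total


# Quantities corresponding to the two reported curves.
u_free_end = displacement(L, 5.0)
sigma_mid = stress(L / 2.0, 10.0)
```

With these values the wave speed is \( \sqrt{E/\rho} = 1 \), so the free end simply moves at the initial velocity until the wave released at the fixed end arrives at t = 10. The stress series converges slowly near the wave front, so more terms may be needed there.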

Figures 3 and 4 show the axial displacement at the free end and the axial stress at the mid-length of the rod, respectively. Both displacement and stress match the analytical solution closely, with no apparent amplitude or phase errors. The numerical solution also converges to the analytical solution as the discretization is refined.

Fig. 3 Elastic wave problem: displacement at free end

Fig. 4 Elastic wave problem: stress at mid-length

4.2 Taylor bar impact problem

In this example, we report the stability performance of the present method in a large deformation problem. In this problem [36], a 6061-T6 aluminum alloy cylinder with a diameter of 7.82 mm and a length of 23.46 mm impacts a rigid surface perpendicularly at an initial velocity of 373 m/s. In the test, the cylinder is shortened to a final length of about 16.46 mm. To study the convergence performance of the proposed method, the cylinder is discretized into three models with nodal distances of approximately 0.49 mm (coarse, 11,809 nodes in total), 0.33 mm (medium, 38,325 nodes in total) and 0.25 mm (fine, 86,621 nodes in total), respectively. The cross-sectional discretization is shown in Fig. 5 (for clarity, connectivity is shown, but it is not used in the calculation). The Johnson–Cook material law [37] is employed to model the plastic flow of the material, with parameters taken from Ref. [38].

Fig. 5 Taylor impact: discretizations

Figure 6 shows the displacement history at the free end. The numerical solution converges and is close to the experimental result. In fact, the numerical solution shows little sensitivity to the discretization, which demonstrates the accuracy of the present method in large deformation analysis.

Fig. 6 Taylor impact: longitudinal displacement

Figure 7 shows the plastic strain distribution at termination. The contour plots of the three discretization models are in good agreement with each other. Figure 8 shows the final deformed shapes at the contact interface. A background mesh is attached to the deformation plot for visualization purposes only. No spurious energy mode is observed in the present solution for any of the three models. Overall, qualitatively consistent results are obtained.

Fig. 7 Taylor impact: plastic strain at termination

Fig. 8 Taylor impact: deformed shapes at the bottom

4.3 FDS thread forming simulation

In this example, we simulate the extreme thread forming in the FDS driving process. A complete FDS joining process is a six-step manufacturing operation characterized by large plastic flow and thermo-mechanical responses. Generally, the FDS joining process includes heating, penetration, extrusion forming, thread forming, screw-driving and tightening steps [39]. As the thermal response is not of interest in the current study, the heating step and thermal effects are ignored in the simulation. The tightening step is not considered either, since only one workpiece layer is used in this thread forming simulation, which serves as an illustration rather than an actual joining of two or more workpieces. Our focus in this simulation is the extreme thread forming on the workpiece after the fastener penetrates and drives through it.

As shown in Fig. 9a, b, a nominal M5x20 fastener is used as the screw in this study. It rotates at \( \omega_{0} = 5000 \) rpm and plunges at \( v_{0} = 2.0 \) m/s (Fig. 9c) during the penetration step and, thereafter, rotates at \( \omega_{0} = 20{,}000 \) rpm and plunges at \( v_{0} = 0.2 \) m/s until the end of the screw-driving step. The pitch distance of the fastener is 0.7 mm. The fastener is modeled by a finite element formulation (Fig. 9d) with a rigid material.

Fig. 9 Flow drill screw model: geometry and discretization

The workpiece is a 20-mm-diameter and 1.5-mm-thick 6061-T6 aluminum alloy plate. The perimeter of the plate is clamped. To reduce the computational cost, only the central ∅7.0 mm region, where large plastic flow, material failure and separation are most likely to occur, is modeled by the proposed MC-SPG formulation. The remainder of the plate is modeled by the traditional finite element method, since failure is unlikely to occur in that region. The coupling between the MC-SPG and finite element regions is handled simply by common nodes; the feasibility of this coupling scheme was studied in [25]. To study the convergence behavior of the MC-SPG formulation, four discretizations are used, with nodal distances (in the central region) of 0.15 mm (21,131 nodes and 36,000 elements in total), 0.125 mm (35,269 nodes and 63,360 elements in total), 0.10 mm (67,344 nodes and 127,680 elements in total) and 0.075 mm (157,773 nodes and 276,480 elements in total), respectively. The cross-sectional nodal distribution in the central region is shown in Fig. 10.

Fig. 10 Flow drill screw model: discretization of central ∅7.0 mm region

The plate is modeled constitutively by the Johnson–Cook material law, and the same parameters as in the previous example are used. Material failure and separation are handled by the bond failure mechanism described in Sect. 3.2. The critical effective plastic strain and the critical stretch ratio for bond failure are 0.4 and 1.15, respectively. The interaction between the fastener and the metal plate is handled by the classic pin-ball contact algorithm. Frictional contact with a coefficient of friction of 0.3 is considered.

Figure 11 shows the force applied on the fastener during the FDS driving process. The abscissa is the longitudinal distance traveled by the fastener, and the legend indicates the nodal distance of the MC-SPG discretization. The difference between the MC-SPG solutions obtained with different discretizations is marginal, which indicates that the MC-SPG formulation converges in this simulation. In other words, the MC-SPG solution is not highly sensitive to the discretization, in sharp contrast to the traditional FEM with the element erosion technique for failure analysis. The MC-SPG solution also matches the original SPG solution in Fig. 11 for most of the forming stages. There is a difference of about 15% at the screw-driving stage, which is still acceptable for this type of extreme deformation problem with material failure. The higher SPG force response at the final screw-driving stage may be due to the modification of the shear modulus in the stabilized stress computation of the original SPG method. On the other hand, if the element erosion technique is used in the finite element method to mimic material failure, the force is dramatically underestimated, which is unphysical. This is clearly demonstrated in Fig. 11 (green curve, obtained by finite elements with element erosion at an effective plastic strain of 0.4).

Fig. 11 Flow drill screw-driving simulation: plunge force

While the accuracy of the MC-SPG formulation is close to that of the SPG formulation, its efficiency is higher. In this case, for the discretization with a nodal distance of 0.125 mm, the CPU time of the SPG formulation is about 2.4 times that of the MC-SPG formulation (refer to Table 2 for the CPU time comparison). The force response shows that the compressive thrust builds up until the tip of the fastener penetrates through the plate. Then, as more and more bonds fail, the thrust can no longer build up. Once extrusion forming starts, the force reverses direction because material is extruded upward (along the thread of the fastener), and the compressive force can eventually turn into a tensile one. This is consistent with the physics described in [39]. In fact, if the FDS rotational and translational speeds are such that the plunge distance per revolution is exactly one pitch, the applied force during the thread forming and screw-driving stages might be zero, because no material is extruded up or down along the fastener threads.
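The pitch-matching condition mentioned above is simple arithmetic and can be checked with a few lines of Python. The function names are hypothetical; the inputs are the screw-driving speeds and the pitch quoted for this example.

```python
def feed_per_revolution_mm(plunge_speed_m_per_s, spindle_speed_rpm):
    """Axial travel of the fastener during one full revolution, in mm."""
    revolutions_per_second = spindle_speed_rpm / 60.0
    return plunge_speed_m_per_s / revolutions_per_second * 1000.0


def matched_plunge_speed_m_per_s(pitch_mm, spindle_speed_rpm):
    """Plunge speed at which the fastener advances exactly one pitch per revolution."""
    return pitch_mm * 1.0e-3 * spindle_speed_rpm / 60.0


# Screw-driving step of this example: 0.2 m/s at 20,000 rpm, pitch 0.7 mm.
feed = feed_per_revolution_mm(0.2, 20000.0)             # about 0.6 mm per revolution
matched = matched_plunge_speed_m_per_s(0.7, 20000.0)    # about 0.233 m/s
```

Under these assumptions the feed per revolution is slightly smaller than the pitch, so the threads and the axial motion are not synchronized, which is consistent with a nonzero axial force in the screw-driving stage.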

Table 2 Normalized CPU time

Figure 12 shows the newly formed threads on the workpiece at termination for the different discretizations obtained using the MC-SPG formulation. Cleaner and clearer threads are observed as the discretization is refined, which indicates that the critical physics of the FDS thread forming process is well captured by the MC-SPG algorithm. This observation implies that, to form clear threads, the discretization needs to be fine enough that there are enough nodes within the pitch distance (0.15 mm: 5 nodes, not clear; 0.125 mm: 6 nodes, not clear; 0.10 mm: 8 nodes, clear; 0.075 mm: 10 nodes, best). This once again confirms that the MC-SPG formulation converges in this simulation. Had this process been analyzed by FEM with the element erosion technique, no thread would be observed on the workpiece because of erosion upon failure, as verified in Fig. 13b.

Fig. 12 Flow drill screw-driving simulation: threads at termination by MC-SPG

Fig. 13 Comparison of thread forming at termination by MC-SPG versus FEM

5 Conclusions

Flow drill screw (FDS) joining is a new mechanical fastening technique widely used to connect metal parts in modern lightweight car structures. Numerical simulations of FDS joining that represent the physical behavior well are important for the automotive industry. In particular, the ability to model the extensive plastic deformation and material separation taking place in the extreme thread forming process is vital to the success of the FDS joining simulation. Thanks to their discretization flexibility and customizable approximations, particle or meshfree methods have attracted significant attention from scientists and engineers over the last two decades for modeling challenging scientific and engineering problems. This paper presents a new particle method, the MC-SPG method, for simulating the extreme thread forming in the FDS driving process.

The starting point for the extreme thread forming simulation using this new particle method is the stabilization of the formulation by a novel velocity smoothing algorithm. We show that this velocity smoothing algorithm consistently fulfills the conservation of momentum. The stabilized formulation is further supplemented with the adaptive anisotropic Lagrangian kernel and the bond-based failure criterion to handle the severe deformation and material failure in the analysis. The present method differs from existing residual and non-residual stabilization methods in several important aspects. Most notably, the method does not require additional stabilization terms; thus, neither a modification of the elastic modulus for the stabilization stress nor stabilization control parameters are necessary. Furthermore, the present method is efficient and simple. Its implementation relies on velocity smoothing operators whose action is evaluated at the particle level using standard meshfree approximations. As a result, an existing meshfree code can be modified easily to handle the new stabilization procedure.

Numerical results illustrate the performance of the MC-SPG method. Convergence studies in small, large and severe deformation problems suggest that the present method is stable and applicable to nonlinear analysis. In particular, the present method has been shown to offer a unique numerical capability for modeling the extreme thread forming operation. To the best of the authors’ knowledge, this is the first time in the literature that a numerical method has been successfully applied to model the forming of threads during the FDS driving process. In the future, the application of the MC-SPG method to the simulation of other advanced joining techniques, such as SPR connections, will be studied.