1 Introduction

Thick plates are used in steel structures such as nuclear plants, chemical plants, and large ships, and these structures are built by multi-pass welding. In welded structures, residual stress arises near the weld metal and can cause stress corrosion cracking [1, 2] and fatigue cracking [3, 4]. It is, however, very difficult to evaluate the residual stress of complex welded joints such as multi-pass welded joints. It is therefore desirable to investigate the residual stress of welded joints in advance by numerical simulation.

Currently, nonlinear FEM based on thermal elastic-plastic theory is generally used to predict welding residual stress [5–7]. Nonlinear FEM must simulate the welding deformation and stress consecutively, from the beginning of heating to complete cooling. The amount of computation is therefore enormous, and the computing time becomes a problem.

To carry out large-scale analyses of welding residual stress in practical problems, the authors developed the idealized explicit finite element method (IEFEM) [8]. IEFEM is based on dynamic explicit FEM [9]. Its accuracy is almost the same as that of the existing method (static implicit FEM), while it requires a shorter computing time and less memory [8, 10].

In this research, parallelization using a graphics processing unit (GPU), which is now gaining attention as a high-performance parallel processor [11], is applied to IEFEM to achieve even faster computation. Parallelized IEFEM is then applied to the multi-pass welding of a V-groove pipe joint. The analysis considers a three-dimensional moving heat source in order to show the applicability of IEFEM to large-scale problems.

2 Theory of IEFEM

In IEFEM, the analysis progresses by the following procedures 1, 2, and 3 based on dynamic explicit FEM. The procedures are also shown in Fig. 1.

Fig. 1 Concept of idealized explicit FEM

  1. A load increment is given and the load is held.

  2. The displacement is calculated based on the basic equation of dynamic explicit FEM shown in Eq. (1) until the static equilibrium state is obtained.

  3. Once the static equilibrium state is obtained, the analysis returns to procedure 1 and a new load increment is given.

$$ \left[M\right]{\left\{\ddot{U}\right\}}_t+\left[C\right]{\left\{\dot{U}\right\}}_t+{\displaystyle \sum_{e=1}^{N_e}{\displaystyle {\int}_{V^e}{\left[B\right]}^T\left\{\sigma \right\}dV}}={\left\{F\right\}}_t $$
(1)

where [M], [C], [B], and {σ} are the mass matrix, damping matrix, strain-displacement matrix, and stress vector, respectively, and \( {\left\{\ddot{U}\right\}}_t \), \( {\left\{\dot{U}\right\}}_t \), \( {\left\{U\right\}}_t \), and \( {\left\{F\right\}}_t \) are the acceleration vector, velocity vector, displacement vector, and load vector at time t, respectively. \( V^e \) is the volume of element e and \( N_e \) is the number of elements.

In dynamic explicit FEM, the displacement is calculated by Eq. (2), which is obtained by applying the central difference approximation to the acceleration and velocity terms in Eq. (1).

$$ \left(\frac{1}{\varDelta {t}^2}\left[M\right]+\frac{1}{2\varDelta t}\left[C\right]\right){\left\{U\right\}}_{t+\varDelta t}={\left\{F\right\}}_t+\frac{2}{\varDelta {t}^2}\left[M\right]{\left\{U\right\}}_t-\left(\frac{1}{\varDelta {t}^2}\left[M\right]-\frac{1}{2\varDelta t}\left[C\right]\right){\left\{U\right\}}_{t-\varDelta t}-{\displaystyle \sum_{e=1}^{N_e}{\displaystyle {\int}_{V^e}{\left[B\right]}^T\left\{\sigma \right\}dV}} $$
(2)

where Δt is the time increment.
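As an illustration of Eq. (2), the following minimal sketch performs one time step for the case in which [M] and [C] are diagonal, so that each DOF is updated independently. The function name, the argument names, and the use of plain Python lists are assumptions made for this example and do not come from the authors' code. The internal force `f_int` stands for the assembled term \( \sum_e \int_{V^e} [B]^T\{\sigma\}dV \).

```python
def central_difference_update(u, u_prev, f_ext, f_int, m, c, dt):
    """Return the displacement at t + dt from Eq. (2) for diagonal [M] and [C].

    u, u_prev : displacement at t and t - dt (one entry per DOF)
    f_ext     : external load vector {F}_t
    f_int     : assembled internal force (sum of the element integrals)
    m, c      : diagonal terms of the mass and damping matrices (assumed diagonal)
    dt        : time increment
    """
    u_next = []
    for i in range(len(u)):
        a = m[i] / dt**2        # coefficient from the acceleration term
        b = c[i] / (2.0 * dt)   # coefficient from the velocity term
        rhs = f_ext[i] - f_int[i] + 2.0 * a * u[i] - (a - b) * u_prev[i]
        u_next.append(rhs / (a + b))  # no simultaneous equations to solve
    return u_next
```

Because every entry of `u_next` depends only on quantities of the same DOF, the loop body can be executed for all DOFs at once.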

Here, if the mass matrix [M] and the damping matrix [C] are assumed to be diagonal, Eq. (2) is no longer a system of simultaneous equations, and the computing time and memory consumption per time step become very small. However, when the static equilibrium state is sought with dynamic explicit FEM in procedures 2 and 3, a large number of time steps is required for the ordinary, physically based mass and damping matrices because of the Courant condition shown in Eq. (3).

$$ \begin{array}{cc}\hfill v=\sqrt{E/\rho}\hfill & \hfill \varDelta t<\varDelta {l}_{\min }/v\hfill \end{array} $$
(3)

where v, E, ρ, and \( \varDelta {l}_{\min } \) are the propagation velocity of stress, Young’s modulus, density, and minimum size of an element, respectively.
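To give a feeling for how restrictive Eq. (3) is, the sketch below evaluates it for one set of numbers. The material constants (E = 200 GPa, ρ = 7800 kg/m³) and the 1 mm element size are typical room-temperature values for steel chosen only for illustration; they are not taken from the analysis in this paper.

```python
import math

def stress_wave_speed(young_modulus, density):
    """Propagation velocity of stress, v = sqrt(E / rho), in m/s."""
    return math.sqrt(young_modulus / density)

def courant_time_limit(young_modulus, density, min_element_size):
    """Upper bound on the time increment from Eq. (3), in s."""
    return min_element_size / stress_wave_speed(young_modulus, density)

# Illustrative values (assumptions, not data from the paper)
E_STEEL = 200.0e9   # Pa
RHO_STEEL = 7800.0  # kg/m^3
L_MIN = 1.0e-3      # m

v = stress_wave_speed(E_STEEL, RHO_STEEL)
dt_limit = courant_time_limit(E_STEEL, RHO_STEEL, L_MIN)
print(f"v = {v:.1f} m/s, dt < {dt_limit:.3e} s")
```

With these assumed values the wave speed is about 5 × 10³ m/s and the admissible time increment is about 2 × 10⁻⁷ s, which is why a physically based mass matrix needs so many steps to reach a static state.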

In IEFEM, the number of time steps to obtain the static equilibrium state can be reduced by modifying mass matrix [M] and damping matrix [C] based on Eq. (4), which is also derived from the Courant condition.

$$ {\rho}_i=\varDelta {t}_{cr}^2E/\varDelta {l_i}^2 $$
(4)

where \( {\rho}_i \), \( \varDelta {l}_i \), and \( \varDelta {t}_{cr} \) are the modified density for each direction i, the element length in each direction i, and the critical time increment, respectively. Equation (4) means that the modified density is determined so that the Courant condition is satisfied for a given critical time increment \( \varDelta {t}_{cr} \). The time increment Δt used to calculate Eq. (2) must be less than \( \varDelta {t}_{cr} \) to stabilize the calculation. In this research, \( \varDelta {t}_{cr} \) and Δt are set to 1.0 and 0.7, respectively. The mass matrix of an element \( \left[{M}_e\right] \) is determined by Eq. (5).

$$ \left[{M}_e\right]={\displaystyle {\int}_{\kern0.5em {V}^e}\left\{{\rho}_i\right\}{\left[N\right]}^T\left[N\right]dV} $$
(5)

where [N] is the shape function of an element. Thus, by using the modified density for each element, the critical time increment limited by the Courant condition becomes uniform for all elements, regardless of the element size and material constants. It is therefore possible to reduce the number of time steps needed to obtain the static equilibrium state.
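The effect of Eq. (4) can be checked numerically. The following sketch computes the modified density for elements of very different lengths and then evaluates the Courant limit of Eq. (3) with that density. The element lengths and the Young's modulus are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def modified_density(young_modulus, dt_cr, element_length):
    """Modified density of Eq. (4): rho_i = dt_cr^2 * E / dl_i^2."""
    return dt_cr**2 * young_modulus / element_length**2

def courant_limit(young_modulus, density, element_length):
    """Time-increment limit of Eq. (3) for one element."""
    return element_length / math.sqrt(young_modulus / density)

YOUNG_MODULUS = 200.0e9  # Pa, illustrative value
DT_CR = 1.0              # critical time increment used in this research

limits = []
for length in (0.5e-3, 3.0e-3, 20.0e-3):  # illustrative element sizes in m
    rho_i = modified_density(YOUNG_MODULUS, DT_CR, length)
    limits.append(courant_limit(YOUNG_MODULUS, rho_i, length))
    print(f"dl = {length:.1e} m, rho_i = {rho_i:.3e}, limit = {limits[-1]:.6f}")
```

Whatever the element length, the limit returned is the prescribed critical time increment, so the smallest element no longer dictates the step size of the whole model.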

As shown in Eq. (6), the diagonal term of the damping matrix [C] is determined from the diagonal term of the element stiffness matrix \( {k}_{ii} \) and that of the element mass matrix \( {m}_{ii} \), based on the critical damping condition of one-dimensional oscillation theory.

$$ {c}_{ii}=2\sqrt{m_{ii}{k}_{ii}} $$
(6)
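A single spring with one DOF is enough to see why critical damping is a natural choice. The sketch below applies Eq. (6) and then repeats the update of Eq. (2) under a held load. The fictitious mass m = Δt_cr² k is an assumption of this example (it plays the role of the modified density for a single DOF), and the stiffness and load are arbitrary.

```python
import math

def critical_damping(m_ii, k_ii):
    """Diagonal damping term of Eq. (6): c_ii = 2 * sqrt(m_ii * k_ii)."""
    return 2.0 * math.sqrt(m_ii * k_ii)

def relax_single_dof(k, load, dt_cr=1.0, dt=0.7, n_steps=60):
    """Repeat Eq. (2) for one DOF under a held load; return the displacement history."""
    m = dt_cr**2 * k            # assumed fictitious mass for this example
    c = critical_damping(m, k)
    a = m / dt**2
    b = c / (2.0 * dt)
    u_prev, u = 0.0, 0.0
    history = [u]
    for _ in range(n_steps):
        f_int = k * u           # internal force of the linear spring
        u_next = (load - f_int + 2.0 * a * u - (a - b) * u_prev) / (a + b)
        u_prev, u = u, u_next
        history.append(u)
    return history

history = relax_single_dof(k=2.0e6, load=1.0e3)
u_static = 1.0e3 / 2.0e6        # static solution F / k
print(history[-1], u_static)
```

In this setting the displacement approaches the static solution F/k from below without overshooting, which is the behavior sought in procedure 2.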

As described above, an IEFEM analysis advances the time steps of dynamic explicit FEM until the static equilibrium state is obtained, and the resulting solution is almost the same as that of static implicit FEM. An accurate solution can therefore be achieved under the same analysis conditions as those used in static implicit FEM, while the computing time and memory consumption are dramatically reduced. In addition, an IEFEM analysis consists only of degree of freedom (DOF)-by-DOF and element-by-element calculations. Consequently, IEFEM is highly suitable for parallelization.
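Procedures 1 to 3 can be put together in a small example. The sketch below analyzes a one-dimensional bar that is fixed at one end and pulled at the other, using the modified density of Eq. (4), a lumped mass, the damping of Eq. (6), and the update of Eq. (2). It is linear elastic and one-dimensional for brevity, whereas the analyses in this paper are three-dimensional and thermal elastic-plastic. All names, the lumping of the mass, the convergence test on the residual force, and the numerical values are assumptions of this example.

```python
import math

def solve_bar(lengths, young_modulus, area, load, n_increments=4,
              dt_cr=1.0, dt=0.7, tol=1e-10, max_steps=100000):
    """Idealized-explicit style solution of a fixed-free bar under a tip load."""
    n_el = len(lengths)
    n_dof = n_el                      # node 0 is fixed; DOF i belongs to node i + 1
    k_el = [young_modulus * area / l for l in lengths]

    # Lumped mass from the modified density (Eq. 4) and diagonal stiffness terms
    m = [0.0] * n_dof
    k_diag = [0.0] * n_dof
    for e in range(n_el):
        rho_e = dt_cr**2 * young_modulus / lengths[e]**2
        half_mass = 0.5 * rho_e * area * lengths[e]
        for node in (e, e + 1):
            if node > 0:
                m[node - 1] += half_mass
                k_diag[node - 1] += k_el[e]
    c = [2.0 * math.sqrt(mi * ki) for mi, ki in zip(m, k_diag)]  # Eq. (6)

    u = [0.0] * n_dof
    u_prev = [0.0] * n_dof
    total_steps = 0
    for inc in range(1, n_increments + 1):
        # Procedure 1: give a load increment and hold the load
        f_ext = [0.0] * n_dof
        f_ext[-1] = load * inc / n_increments
        for _ in range(max_steps):
            # Element-by-element internal force
            f_int = [0.0] * n_dof
            for e in range(n_el):
                u_left = u[e - 1] if e > 0 else 0.0
                axial_force = k_el[e] * (u[e] - u_left)
                if e > 0:
                    f_int[e - 1] -= axial_force
                f_int[e] += axial_force
            # Procedure 3: stop when static equilibrium is reached
            residual = max(abs(fe - fi) for fe, fi in zip(f_ext, f_int))
            if residual <= tol * abs(load):
                break
            # Procedure 2: DOF-by-DOF update by Eq. (2)
            u_next = []
            for i in range(n_dof):
                a = m[i] / dt**2
                b = c[i] / (2.0 * dt)
                rhs = f_ext[i] - f_int[i] + 2.0 * a * u[i] - (a - b) * u_prev[i]
                u_next.append(rhs / (a + b))
            u_prev, u = u, u_next
            total_steps += 1
        else:
            raise RuntimeError("static equilibrium was not reached")
    return u, total_steps

# Illustrative data (assumptions): non-uniform mesh, steel-like modulus
LENGTHS = [0.01, 0.02, 0.005, 0.04, 0.01]   # m
u, steps = solve_bar(LENGTHS, young_modulus=200.0e9, area=1.0e-4, load=1.0e3)
u_tip_static = 1.0e3 * sum(LENGTHS) / (200.0e9 * 1.0e-4)
print(u[-1], u_tip_static, steps)
```

The tip displacement obtained by time stepping can be compared with the static solution F Σl/(EA); no stiffness matrix is assembled or factorized at any point.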

3 Parallelization of IEFEM using a GPU

In this research, parallelization using a GPU is introduced to IEFEM to achieve fast computation. GPUs are now gaining attention as high-performance parallel processors because they contain hundreds or thousands of computing units and have a much higher capability for numerical calculation than ordinary CPUs. The parallelization is implemented in the parallel programming environment Compute Unified Device Architecture (CUDA) provided by NVIDIA Corporation.

To apply GPU parallelization to IEFEM, as shown in Fig. 2, the stress of an element is calculated on a single computing unit on the GPU, and the stress is then integrated on the same computing unit to determine the equivalent nodal force of the element. The calculated equivalent nodal forces of all elements are superposed on the global equivalent nodal force on the CPU. For the nodal displacements, a single DOF is processed on one computing unit on the GPU. Figure 3 shows the flow of IEFEM with GPU parallelization. In Fig. 3, the DOF-by-DOF and element-by-element calculations are parallelized. Because these calculations account for most of the computing time in IEFEM, faster computation is expected.
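The division of work described above can be mimicked on a CPU. In the sketch below, a thread pool from the Python standard library stands in for the GPU computing units, which is purely an illustrative substitution and not the CUDA implementation used in this research. Each call of the hypothetical `element_kernel` handles one element of a one-dimensional bar, and the superposition onto the global vector is done serially afterwards.

```python
from concurrent.futures import ThreadPoolExecutor

def element_kernel(e, u, k_el):
    """Work of one computing unit: equivalent nodal force of element e.

    The bar is fixed at node 0; u[i] is the displacement of node i + 1.
    Returns the force contributions to the left and right node of the element.
    """
    u_left = u[e - 1] if e > 0 else 0.0
    axial_force = k_el[e] * (u[e] - u_left)
    return (-axial_force, axial_force)

def assemble_internal_force(u, k_el, workers=4):
    """Evaluate all elements independently, then superpose the nodal forces."""
    n_el = len(k_el)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        element_forces = list(pool.map(lambda e: element_kernel(e, u, k_el),
                                       range(n_el)))
    # Serial superposition onto the global equivalent nodal force
    f_int = [0.0] * n_el
    for e, (f_left, f_right) in enumerate(element_forces):
        if e > 0:
            f_int[e - 1] += f_left
        f_int[e] += f_right
    return f_int

# Illustrative check: displacements of a bar in equilibrium under a tip load of 10
k_el = [100.0, 50.0, 200.0]
u = [0.1, 0.3, 0.35]
f_int = assemble_internal_force(u, k_el)
print(f_int)
```

Keeping the element calculation free of writes to shared data is what allows one element per computing unit; only the superposition step touches entries shared by neighboring elements.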

Fig. 2 Schematic illustration of GPU parallelization

Fig. 3 Computing flow of parallelized IEFEM

With these procedures, IEFEM using a GPU achieves efficient computation. For example, for a fundamental welding problem with almost 300,000 degrees of freedom, it was demonstrated that the computing time and memory consumption are reduced by a factor of more than 100 compared with static implicit FEM, which is generally employed in nonlinear structural analyses [10].

4 Three-dimensional residual stress analysis of multi-pass welding of V-groove pipe joint

4.1 Analysis model and conditions

Parallelized IEFEM is applied to the residual stress analysis of the multi-pass welding of the V-groove pipe joint shown in Fig. 4a. The pipe joint is 305 mm in outer diameter, 255 mm in inner diameter, and 500 mm in length. The numbers of nodes, elements, and degrees of freedom are 1,029,240; 1,005,210; and 3,087,714, respectively. Figure 4b shows the mesh divisions near the welding line together with the welding pass sequence. The base metal and the weld metal are assumed to be SUS304 and Y308L, respectively, and their temperature-dependent material properties are shown in Figs. 5 and 6 [12]. The welding conditions are shown in Table 1, and all passes are welded in the same direction. A moving rectangular parallelepiped volumetric heat source is employed in the heat conduction analysis, and the elements of each pass inside the rectangle are heated uniformly by this volumetric heat source. The size of the rectangle in the welding direction is set to 20 mm, which corresponds to approximately seven elements in the welding direction. The elements of the weld metal are initially deactivated, and the elements of each pass are activated at the beginning of that welding pass. At high temperature, the dislocations accumulated in a metal by plastic deformation are thought to disappear even if the metal does not melt. This effect can be modeled by setting the equivalent plastic strain of the elements to zero [13]. In this research, the equivalent plastic strain and the back stress of kinematic hardening are set to zero when the temperature of an element exceeds the annealing temperature of 900 °C [12]. The initial temperature, room temperature, and interpass temperature are assumed to be 20 °C.
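The annealing rule stated above is simple enough to be written out. The following sketch resets the hardening variables of one element when its temperature exceeds 900 °C. The class, its field names, and the use of a scalar back stress in place of the back-stress tensor are simplifications assumed for this example.

```python
from dataclasses import dataclass

ANNEALING_TEMPERATURE_C = 900.0  # annealing temperature stated in the text

@dataclass
class ElementState:
    temperature_c: float      # current element temperature in degrees Celsius
    eq_plastic_strain: float  # equivalent plastic strain
    back_stress: float        # scalar stand-in for the back-stress tensor (assumption)

def apply_annealing(state):
    """Reset the hardening variables when the annealing temperature is exceeded."""
    if state.temperature_c > ANNEALING_TEMPERATURE_C:
        state.eq_plastic_strain = 0.0
        state.back_stress = 0.0
    return state

hot = apply_annealing(ElementState(950.0, 0.02, 35.0))
cold = apply_annealing(ElementState(600.0, 0.02, 35.0))
print(hot, cold)
```

An element that is reheated above the annealing temperature by a later pass therefore loses the hardening history accumulated in earlier passes, while cooler elements keep theirs.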

Fig. 4 Analysis model of V-grooved multi-pass welding. a Overall model and b zoomed view of weld joint and welding sequence

Fig. 5 Temperature-dependent material properties of SUS304 [12]

Fig. 6 Temperature-dependent material properties of Y308L [12]

Table 1 Welding conditions of single V-groove model [12]

The heat conduction analysis is carried out under the conditions described above and yields 20,842 temperature steps. The thermal elastic-plastic analysis is then carried out consecutively over these temperature steps using isotropic hardening, kinematic hardening, and combined hardening. The combined hardening model uses equal ratios of isotropic and kinematic hardening in order to account for both the expansion of the yield surface and the Bauschinger effect. The relation between equivalent stress and equivalent plastic strain is assumed to be bilinear.

4.2 Analysis results

Figure 7 shows the residual stress in the axial direction \( {\sigma}_z \) on the cross section 90° from the start point of welding. Figure 7a–c shows the results of isotropic hardening, combined hardening, and kinematic hardening, respectively, after complete cooling of the eighth pass (fourth layer). In the same figure, (d)–(f) show the results for the 18th pass (seventh layer) and (g)–(i) show the results for the 33rd pass (final layer). Figure 7 shows that the axial stress after the eighth pass, (a)–(c), is not large. After the 18th pass, (d)–(f), high tensile stress occurs near the weld metal around the position of the current pass. After all passes have been welded, (g)–(i), compressive stress occurs inside the weld metal and tensile stress occurs near the inner and outer surfaces of the pipe joint.

Fig. 7 Distribution of residual stress in axial direction \( {\sigma}_z \) on the cross section 90° from the start point of welding (a, d, g isotropic 8, 18, and 33 passes, respectively; b, e, h combined 8, 18, and 33 passes, respectively; c, f, i kinematic 8, 18, and 33 passes, respectively)

Regarding the hardening rule, the same tendency of the residual stress distribution is obtained for isotropic hardening, kinematic hardening, and combined hardening. The magnitude of the stress, however, decreases in the order of isotropic hardening, combined hardening, and kinematic hardening. During welding, the heated region reaches a high temperature and expands. The surrounding region constrains this expansion, so compressive plastic strain develops in the high-temperature region, and tensile stress arises in the welded region after cooling. In multi-pass welding, this process is repeated for each welding pass. With kinematic hardening, relatively small stress is obtained because of the Bauschinger effect, whereas with isotropic hardening, large stress is obtained because the yield surface continues to expand even though tensile and compressive stresses act alternately. This is considered to be the reason for the difference in stress distribution between isotropic hardening and kinematic hardening. With combined hardening, the residual stress lies almost midway between the results of isotropic hardening and kinematic hardening, because combined hardening considers isotropic hardening and kinematic hardening in the same ratio in this analysis.
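The ordering of the three hardening rules under repeated tension and compression can be reproduced with a one-dimensional model. The sketch below uses a standard return-mapping update for bilinear hardening in which a factor `beta` splits the plastic modulus into an isotropic part and a kinematic part (1 for isotropic, 0 for kinematic, 0.5 for the equal-ratio combination). The strain-controlled cycling is a strong simplification of repeated welding thermal cycles, and the material constants are illustrative assumptions rather than the SUS304 or Y308L data of this paper.

```python
def cyclic_peak_stress(beta, n_cycles=3, young_modulus=200.0e3, yield_stress=250.0,
                       plastic_modulus=2000.0, strain_amplitude=0.005,
                       steps_per_leg=50):
    """Stress (MPa) at the final tensile peak of a strain-controlled cyclic test.

    beta = 1.0 : isotropic hardening, beta = 0.0 : kinematic hardening,
    beta = 0.5 : combined hardening with equal ratios.
    """
    k_iso = beta * plastic_modulus           # isotropic part of the plastic modulus
    h_kin = (1.0 - beta) * plastic_modulus   # kinematic part of the plastic modulus
    strain = 0.0
    plastic_strain = 0.0
    eq_plastic_strain = 0.0
    back_stress = 0.0
    stress = 0.0

    # Strain path: +amp, -amp repeated n_cycles times, ending at +amp
    targets = [strain_amplitude, -strain_amplitude] * n_cycles + [strain_amplitude]
    for target in targets:
        d_strain = (target - strain) / steps_per_leg
        for _ in range(steps_per_leg):
            strain += d_strain
            trial_stress = young_modulus * (strain - plastic_strain)
            xi = trial_stress - back_stress
            f_trial = abs(xi) - (yield_stress + k_iso * eq_plastic_strain)
            if f_trial > 0.0:
                # Plastic step: return to the (moved or enlarged) yield surface
                d_gamma = f_trial / (young_modulus + k_iso + h_kin)
                sign = 1.0 if xi > 0.0 else -1.0
                plastic_strain += d_gamma * sign
                back_stress += d_gamma * h_kin * sign
                eq_plastic_strain += d_gamma
                stress = trial_stress - young_modulus * d_gamma * sign
            else:
                stress = trial_stress        # elastic step
    return stress

sigma_iso = cyclic_peak_stress(1.0)
sigma_comb = cyclic_peak_stress(0.5)
sigma_kin = cyclic_peak_stress(0.0)
print(sigma_iso, sigma_comb, sigma_kin)
```

Under these assumed constants the kinematic peak stress stops changing after the first cycle, while the isotropic peak keeps growing with every reversal, and the combined model lies between the two. This is the same ordering as that observed in the pipe joint analysis.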

In the same way, the residual stress in the hoop direction \( {\sigma}_{\theta} \) is shown in Fig. 8. Figure 8 shows that tensile stress occurs over the weld metal after the 8th and 18th passes. After the final pass, high tensile stress occurs in or near the weld metal, while compressive stress is distributed near the inner surface of the pipe. As with the axial stress \( {\sigma}_z \), the tendency of the residual stress distribution is almost the same among isotropic hardening, combined hardening, and kinematic hardening, and the absolute values decrease in the order of isotropic hardening, combined hardening, and kinematic hardening.

Fig. 8 Distribution of residual stress in hoop direction \( {\sigma}_{\theta} \) on the cross section 90° from the start point of welding (a, d, g isotropic 8, 18, and 33 passes, respectively; b, e, h combined 8, 18, and 33 passes, respectively; c, f, i kinematic 8, 18, and 33 passes, respectively)

Figure 9 compares the distribution of residual stress in the axial direction \( {\sigma}_z \) obtained by the analysis with the experimental measurements [12] along line A-A’ shown in Fig. 4b. The experimental results were obtained by the deep hole drilling (DHD) method [14] and the incremental DHD (iDHD) method [15]. Figure 9 shows that the analysis results for isotropic hardening are higher than the experimental measurements, but that the analysis results and the experimental measurements exhibit similar tendencies.

Fig. 9 Comparison of distribution of residual stress \( {\sigma}_z \) along line A-A’

In the same manner as Fig. 9, Fig. 10 compares the distribution of residual stress in the hoop direction \( {\sigma}_{\theta} \) obtained by the analysis with the experimental measurements. In Fig. 10, the influence of the hardening rule appears to be small. However, as shown in Fig. 8, the stress in the hoop direction decreases in the order of isotropic hardening, combined hardening, and kinematic hardening. In addition, Fig. 10 shows that both the values and the tendency of the residual stress in the hoop direction \( {\sigma}_{\theta} \) obtained by the analysis agree very well with the experimental measurements.

Fig. 10 Comparison of distribution of residual stress \( {\sigma}_{\theta} \) along line A-A’

In this analysis, the computing time is approximately 70 h for each of the isotropic, combined, and kinematic hardening cases. These results demonstrate that IEFEM can analyze large-scale multi-pass welding with more than 1 million elements and a moving heat source, and that it achieves high accuracy within a realistic computing time. The distribution of welding residual stress is important for predicting stress corrosion cracking and fatigue cracking precisely. In this context, IEFEM is expected to be a powerful tool for obtaining accurate residual stress distributions.

5 Conclusions

In this research, parallelization using a graphics processing unit was used to improve the computational speed of IEFEM, which was developed to achieve faster computation and lower memory consumption in welding residual stress analyses. Parallelized IEFEM was applied to the residual stress analysis of the multi-pass welding of a V-groove pipe joint with consideration of a three-dimensional moving heat source, and the influence of the hardening rule was investigated. The following results were obtained:

  1. IEFEM can analyze the multi-pass welding of a V-groove pipe joint that has 3,087,714 degrees of freedom, 13 layers, and 33 passes. The computing time is approximately 70 h.

  2. The residual stress of the multi-pass welding of the V-groove pipe joint was analyzed using isotropic hardening, kinematic hardening, and combined hardening with equal ratios of isotropic and kinematic hardening. The same tendency of the residual stress distribution is obtained for all hardening rules. The residual stress decreases in the order of isotropic hardening, combined hardening, and kinematic hardening.

  3. The residual stress obtained by the analysis of the multi-pass welding of the V-groove pipe joint was compared with the experimental measurements. The residual stress obtained with isotropic hardening is higher than the experimental measurements for the axial stress \( {\sigma}_z \), but the values and the tendency of the residual stress distribution obtained with kinematic hardening and combined hardening agree very well with the experimental results in both the axial and hoop directions.