1 Introduction

A difficulty with fatigue life data is the characterization of its variability, which can span several orders of magnitude [1], especially for loading near operating conditions. The variability is attributable to experimental error, as well as material microstructure or processing. Thus, estimation and prediction of fatigue life are challenging. A key concern is the characterization of the cumulative distribution function (cdf) for fatigue life, given an applied load, which may be either the stress range Δσ or the strain range Δε. The lower tail of the cdf, which corresponds to high reliability, is especially critical; however, that is where scatter is more pronounced and sample sizes are smaller. Empirical modeling of fatigue life has been considered numerous times. A simple internet search for statistical fatigue life modeling yields in excess of 10 million citations. A relatively recent work is [2], in which the authors incorporate statistical analysis with traditional stress-life, strain-life, or crack propagation models. While empiricism is used, the thrust is to incorporate as much physically motivated modeling as possible. More frequently, investigators attempt to fit a stress-life (S-N) curve through the data, especially the medians for given Δσ or Δε. A nice review of such practices is [3]. Another example of statistical modeling of fatigue data is contained in [4], in which over two chapters are devoted to statistical methodologies. Other examples of statistical stress-life analysis may be found in [5, 6]. While there are many more papers, books, and conference publications on statistical fatigue modeling, these are representative. In spite of all these references, statistical fatigue analysis is still an open area of investigation.

Herein, an empirically based approach for accurately estimating the fatigue life cdf, given Δσ or Δε, is proposed. The methodology merges fatigue life data using a statistical transformation for the estimation. The statistical technique increases the sample size by merging fatigue data for more precise assessment. This is necessitated because there is often large variability in S-N data, and the sample sizes are small. Validation of the modeling is essential, especially for prediction of life outside of the range of experimental observations. The validity of the methodology is evaluated by considering percentile bounds estimated for the S-N data. The development of the methodology and its subsequent validation are illustrated using three different fatigue life datasets.

A fundamental issue in fatigue life estimation is the choice of an underlying cdf. The cdf used in the ensuing analyses is a three-parameter Weibull cdf. A fairly recent example of a traditional statistical S-N methodology using the three-parameter Weibull cdf is [7], in which fatigue in structural and rolling contact problems is considered. There seems to be a need for more experimental data to enhance modeling in almost all fatigue analyses. This is addressed in [8] by normalizing the fatigue life data so that all the data can be merged. The normalized data are then modeled with a three-parameter Weibull cdf. Even though the intention in [8] is similar to the emphasis herein, the methodology is somewhat different.

2 Fatigue Life Data

Fatigue life data are most often presented on an S-N plot which shows the fatigue data for a given load. The load is typically stress or strain. Thus S-N can represent stress-life or strain-life. An additional way in which the fatigue data are presented is on a probability plot. Both of these representations will be used subsequently. Three different sets of fatigue life data are considered for the proposed method.

The first set considered is one of the special cases taken from [9]. Fatigue testing was conducted on 2024-T4 aluminum alloy specimens. The fatigue tests were performed on rectangular specimens 110 mm long, 52 mm wide, and 1 mm thick, with a center-cut circular hole of radius 5 mm. Holes were cut using standard procedures with a lathe, and burrs were removed by polishing techniques. Testing was conducted in a laboratory environment where temperatures of 295–297 K (approximately 22–24 °C) and relative humidities of 50–56% were observed. Constant amplitude tests were performed at a frequency of 30 Hz on a single machine with a single operator in order to minimize experimental error. Because these data are so extensive, they have been used in a variety of analyses; e.g., [10,11,12]. These data are summarized in Table 1. A total of 222 specimens were tested using eight different values for Δσ. The specimens were tested to fracture. The sample coefficients of variation (cv) are nearly the same, approximately 9%, for the larger values of Δσ. When Δσ is 137 MPa, the cv is about double, and for the two smaller values of Δσ the scatter increases significantly. The fatigue life data are plotted on an S-N graph in Fig. 1 in the traditional linear versus logarithm S-N format. As Δσ is reduced, the increase in the scatter in life is apparent. Modeling the increasing variability for decreasing Δσ is the challenge for accurate fatigue life prediction.

Table 1 Statistical summary of fatigue life data for 2024-T4 specimens [9]
Fig. 1 Fatigue life data for 2024-T4 specimens [9]

The second set of fatigue data to be considered was collected at room temperature for ASTM A969 hot-dipped galvanized sheet steel with a gauge thickness of 1.78 mm [13]. ASTM A969 is a cold-rolled, low-carbon, extra deep drawing steel. This steel is very ductile and soft, and it is age resistant. The automotive industry uses it in applications where severe forming is required, e.g., inner door components, dash panels, body side components, and floor pans with spare tire tubs. Fatigue tests for the ASTM A969 specimens were conducted using a triangular waveform at 25 Hz. The fatigue tests were terminated, i.e., designated as a failure, when the tensile load dropped by 50% of the maximum load. A total of 69 specimens were tested to failure. The data are summarized in Table 2. The cv given Δε is more scattered than those for the 2024-T4 data. The smallest value of Δε has the largest scatter, but the scatter for the second largest Δε is rather large as well. This behavior is seen graphically on the S-N plot in Fig. 2. Consequently, these data are not as statistically well behaved as the 2024-T4 data. These data have been used to investigate other types of fatigue modeling [14, 15].

Table 2 Statistical summary of fatigue life data for ASTM A969 specimens [13]
Fig. 2 Fatigue life data for ASTM A969 specimens [13]

The third set to assess is data for 9Cr-1Mo steel, which were collated from a round-robin test program and were reported in [16]. This steel is creep strengthened, and it is frequently used in thermal power plants to improve the energy efficiency of the power plant by increasing operating temperatures and pressures. Specifically, 9Cr-1Mo is often used for steam generator components of both fossil fired and nuclear power plants. The material from which the data were generated was a single cast, rolled plate with a nominal tensile strength of 623 MPa [16]. A total of 130 specimens were tested to failure. The data are summarized in Table 3. The cv given Δε is even more scattered than for the above datasets. In fact, there does not seem to be any discernible pattern. The data are shown on Fig. 3. The reason for the unusual statistical behavior would require more in-depth analysis than is provided in [16]. Usually round-robin testing has considerably more variability in results because testing conditions and methodologies are not consistent. Nevertheless, the data will serve as an excellent case for the proposed modeling approach.

Table 3 Statistical summary of fatigue life data for 9Cr-1Mo specimens [16]
Fig. 3 Fatigue life data for 9Cr-1Mo specimens [16]

3 Data Fusion for Fatigue Life Analysis

The following is a purely empirical method to improve fatigue life modeling. Because fatigue data are usually limited in number for relatively few different loading conditions, modeling is crude. A methodology that has been developed to account for uncertainty for static properties [17] is adapted for fatigue life data. The basis of the approach is a linear transformation of a collection of experimental observations \(\{ y_{j} :1 \le j \le n\}\) into another set of values \(\{ z_{j} :1 \le j \le n\}\) with a prescribed average and sample standard deviation, so that different collections transformed in this way share the same average and sample standard deviation. Let

$$z_{j} = ay_{j} + b.$$
(1)

The choices of a and b in Eq. (1) are easily determined by simple algebra to be the following:

$$a = \frac{{s_{A} }}{{s_{y} }}{\text{ and }}b = N_{A} - \frac{{s_{A} }}{{s_{y} }}\bar{y},$$
(2)

where \(\bar{y}\) is the average and \(s_{y}\) is the sample standard deviation for \(\{ y_{j} :1 \le j \le n\}\), and \(N_{A}\) and \(s_{A}\) are arbitrary values chosen for normalization.
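As a brief check of Eq. (2), using only the definitions above and the fact that a is positive, the average \(\bar{z}\) and sample standard deviation \(s_{z}\) of the transformed values (symbols introduced here for illustration) work out to the normalization constants:

$$\bar{z} = a\bar{y} + b = \frac{{s_{A} }}{{s_{y} }}\bar{y} + N_{A} - \frac{{s_{A} }}{{s_{y} }}\bar{y} = N_{A} ,\quad s_{z} = a\,s_{y} = \frac{{s_{A} }}{{s_{y} }}s_{y} = s_{A} .$$

In other words, whatever the average and scatter of the original observations, the transformed values are centered at \(N_{A}\) with sample standard deviation \(s_{A}\).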

For fatigue data, the life times are distributed over several orders of magnitude, so the procedure in Eqs. (1) and (2) is applied to the natural logarithm of the life times. Let m be the number of different values of applied stress or strain, i.e., \(\{ \Delta \sigma_{k} :1 \le k \le m\}\) or \(\{ \Delta \varepsilon_{k} :1 \le k \le m\}\). Given \(\Delta \sigma_{k}\) or \(\Delta \varepsilon_{k}\), the associated life times are \(\{ N_{k,j} :1 \le j \le n_{k} \}\), where \(n_{k}\) is the sample size. Let

$$y_{k,j} = \ln (N_{k,j} )$$
(3)

be the transformed life times. Substituting Eq. (2) into Eq. (1) leads to the following:

$$z_{k,j} = \frac{{s_{A} }}{{s_{y,k} }}(y_{k,j} - \bar{y}_{k} ) + N_{A} .$$
(4)

Thus, for every k the transformed set \(\{ z_{k,j} :1 \le j \le n_{k} \}\) has average \(N_{A}\) and sample standard deviation \(s_{A}\), so all m transformed sets share the same average and sample standard deviation. The next step is to merge all the transformed \(z_{k,j}\) values from Eq. (4) for \(1 \le j \le n_{k}\) and \(1 \le k \le m\). The purpose in using the merged values is to have a more extensive dataset for estimation of the cdf. This is especially critical for estimating the extremes of the cdf more accurately. Subsequently, an appropriate cdf \(F_{Z}(z)\) is found that characterizes the merged data. It is assumed that this cdf also characterizes the subsets \(\{ z_{k,j} :1 \le j \le n_{k} \}\) of the merged set. With this assumption and the linear transformation in Eq. (4), the cdfs \(F_{y,k}(y)\) for \(\{ y_{k,j} :1 \le j \le n_{k} \}\) can be derived from \(F_{Z}(z)\) as follows:

$$F_{y,k} (y) = F_{Z} (\frac{{s_{A} }}{{s_{y,k} }}(y - \bar{y}_{k} ) + N_{A} ).$$
(5)

The approach is designated as the Fatigue Life Transformation (FLT). Recall that the above methodology is applied to natural logarithms. In order to make observations on the actual fatigue lives, the values must be changed back to actual cycles.
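The transformation and merging steps in Eqs. (3) and (4) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `flt_transform`, the dictionary layout, and the example lives are assumptions made here, and the numbers are made up rather than taken from any of the datasets.

```python
import math
import statistics


def flt_transform(lives_by_load, n_a=26.0, s_a=1.0):
    """Apply Eqs. (3)-(4): log-transform each group, rescale it, and merge.

    lives_by_load maps a load level (stress or strain range) to its list of
    cycles to failure. Returns the merged z values and, per load, the average
    and sample standard deviation of ln(N) needed later for Eq. (5).
    """
    merged = []
    group_stats = {}
    for load, lives in lives_by_load.items():
        y = [math.log(n) for n in lives]      # Eq. (3)
        y_bar = statistics.mean(y)
        s_y = statistics.stdev(y)             # sample standard deviation (n - 1)
        group_stats[load] = (y_bar, s_y)
        merged.extend(s_a / s_y * (y_j - y_bar) + n_a for y_j in y)  # Eq. (4)
    return merged, group_stats


# Made-up illustrative lives in cycles, NOT data from any cited source.
example = {
    255.0: [1.8e4, 2.0e4, 2.1e4, 2.3e4],
    137.0: [2.5e5, 4.0e5, 6.5e5, 9.0e5, 1.4e6],
}
z_merged, group_stats = flt_transform(example)
```

Each group contributes values centered at `n_a` with unit-scaled scatter `s_a`, which is what makes pooling them into one sample reasonable. Every group needs at least two observations so that a sample standard deviation exists.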

4 FLT Analysis for 2024-T4 Fatigue Life Data

To evaluate the effectiveness of the proposed FLT methodology, the fatigue life data summarized in Table 1 and shown on Fig. 1 are considered. Recall that the FLT is applied to the natural logarithm of the fatigue data; see Eq. (3). The arbitrarily chosen values for \(N_{A}\) and \(s_{A}\) are 26 and 1, respectively. The rather large value for \(N_{A}\) was chosen to assure that \(z_{k,j}\) in Eq. (4) is positive. The 222 FLT data are shown on Fig. 4, where the axes are labeled to be easily read. Each set of data for a given \(\Delta \sigma_{k}\) is transformed using FLT. These transformed data are well grouped so that it is reasonable to merge them. Figure 5 shows all 222 FLT values merged into a common sample space. The FLT merged data contain approximately 7–10 times more data than those for each given \(\Delta \sigma_{k}\). Thus, estimation for the cdf is necessarily more accurate, which results in a better characterization of its lower tail. Also, notice that the cycles are transformed using the FLT procedure; they are not actual cycles to failure, i.e., they are not equivalent to the data shown on Fig. 1. The solid line is the maximum likelihood estimation (MLE) for a three-parameter Weibull cdf W(α, β, γ), where α is the shape parameter, β is the scale parameter, and γ is the location parameter. The form of W(α, β, γ) is

Fig. 4 FLT fatigue life data for 2024-T4 specimens given Δσ [9]

Fig. 5 Merged FLT fatigue life data for 2024-T4 specimens [9]; Weibull MLE

$$F(x) = 1 - \exp \{ - [(x - \gamma )/\beta ]^{\alpha } \} ,\;x \ge \gamma .$$
(6)

Graphically, the fit is excellent. The Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) goodness of fit test statistics are 0.043 and 0.292, respectively. Both indicate that the MLE is acceptable for any significance level \(\alpha_{s}\) less than 0.3. Consequently, the MLE W(α, β, γ) cdf is an excellent representation of the merged FLT data. The MLE estimated parameters are \(\hat{\alpha } = 2.894;\;\hat{\beta } = 2.986;\;{\text{and}}\;\hat{\gamma } = 23.327\).
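For readers who wish to experiment, the cdf in Eq. (6) and the KS distance between a sample and a fitted cdf can be sketched as follows. The function names and the short list `z_demo` are assumptions for illustration only; `z_demo` is a placeholder and not the 222 merged observations, so the resulting distance is not the reported 0.043. Only the three parameter values are taken from the text.

```python
import math


def weibull3_cdf(x, alpha, beta, gamma):
    """Three-parameter Weibull cdf, Eq. (6); zero at or below the location gamma."""
    if x <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((x - gamma) / beta) ** alpha))


def ks_statistic(sample, cdf):
    """Largest gap between the empirical cdf of `sample` and the fitted cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare the fitted cdf with the empirical step just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# MLE parameters reported in the text for the merged 2024-T4 FLT data.
ALPHA, BETA, GAMMA = 2.894, 2.986, 23.327


def fitted(x):
    return weibull3_cdf(x, ALPHA, BETA, GAMMA)


# Placeholder z values for illustration only.
z_demo = [24.6, 25.1, 25.5, 25.8, 26.0, 26.3, 26.7, 27.2, 27.9]
d_demo = ks_statistic(z_demo, fitted)
```

The sketch computes only the test statistic. Converting it to a significance level when the parameters are estimated from the same data requires critical values appropriate to that situation, which are not reproduced here.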

The three-parameter Weibull cdf W(α, β, γ) shown in Eq. (6) was selected for consideration because it has become a very popular cdf to represent fatigue data since its namesake used it for that purpose [18]. Two popular resources for the Weibull cdf, which advocate its use and contain examples of its applications, are [19, 20]. The primary reason for its choice, however, is that there is a minimum value represented by γ. Typically, any collection of fatigue life data is spread over two or three orders of magnitude. Consequently, a nonzero minimum value is required to appropriately represent the fatigue data.

Now, it is assumed that the MLE estimated parameters for the FLT merged data are acceptable for each of its subsets \(\{ z_{k,j} :1 \le j \le n_{k} \}\). The cdf \(F_{y,k}(y)\) for each given \(\Delta \sigma_{k}\) can be determined using Eq. (5) as follows:

$$F_{y,k} (y) = 1 - \exp \{ - [(\frac{{s_{A} }}{{s_{y,k} }}(y - \bar{y}_{k} ) + N_{A} - \hat{\gamma })/\hat{\beta }]^{{\hat{\alpha }}} \} .$$
(7)

Recall that the arbitrary constants \(N_{A}\) and \(s_{A}\) are 26 and 1, respectively. Equation (7) can be rewritten to put it into the standard Weibull cdf form W(α, β, γ):

$$F_{y,k} (y) = 1 - \exp \left\{ - \left[ \frac{{y - [\bar{y}_{k} + (\hat{\gamma } - N_{A} )(\frac{{s_{y,k} }}{{s_{A} }})]}}{{\hat{\beta }(\frac{{s_{y,k} }}{{s_{A} }})}} \right]^{{\hat{\alpha }}} \right\} ,$$
(8)

where the shape parameter \(\hat{\alpha }\) is the same for each individual cdf, but the scale parameter \(\hat{\beta }_{k}\) and location parameter \(\hat{\gamma }_{k}\) are

$$\hat{\beta }_{k} = \hat{\beta }(\frac{{s_{y,k} }}{{s_{A} }});\;{\text{and}}\;\hat{\gamma }_{k} = \bar{y}_{k} + (\hat{\gamma } - N_{A} )(\frac{{s_{y,k} }}{{s_{A} }}),$$
(9)

which are explicitly dependent on the sample parameters for the fatigue life data given \(\Delta \sigma_{k}\) and the arbitrary constants \(N_{A}\) and \(s_{A}\). Recall that the FLT is for \(\ln (N_{f})\); consequently, the range of values is considerably smaller than that for \(N_{f}\). Table 4 contains the FLT cdf \(W(\alpha ,\beta_{k} ,\gamma_{k} )\) parameters for each given \(\Delta \sigma_{k}\). Figure 6 shows the fatigue life data plotted on two-parameter Weibull probability paper with the corresponding FLT cdfs \(W(\alpha ,\beta_{k} ,\gamma_{k} )\). Graphically, these cdfs appear to fit the data well. Indeed, the KS goodness of fit test indicates that all these cdfs are acceptable for any \(\alpha_{s}\) less than 0.25. The AD test, which focuses on the quality of the fit in the tails, is more discriminating. The cdfs when \(\Delta \sigma_{k}\) equals 127, 177, 206, 235, or 255 MPa are acceptable for any \(\alpha_{s}\) less than 0.25. When \(\Delta \sigma_{k}\) equals 123 or 157 MPa, however, the cdfs are acceptable for any \(\alpha_{s}\) less than 0.05. When \(\Delta \sigma_{k}\) equals 137 MPa, the AD test implies that the cdf is not acceptable. Although it is not obvious on Fig. 6 because the cycles scale is so large, both the upper and lower tails of the FLT cdf differ noticeably from the life data. Even so, the overall deduction is that the FLT transformation is acceptable for these fatigue life data. Again, the KS test supports that conclusion, and there is only one value for \(\Delta \sigma_{k}\), 137 MPa, for which the AD test indicates otherwise.
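The mapping in Eq. (9) from the merged-data parameters to the parameters for one load level can be sketched as below, together with a numerical check that Eq. (8) agrees with evaluating the merged cdf at the transformed point, Eqs. (5) and (7). The function names are assumptions, and the values `Y_BAR` and `S_Y` are hypothetical stand-ins for one row of sample statistics, not entries from Table 1 or Table 4.

```python
import math


def flt_group_parameters(alpha, beta, gamma, y_bar, s_y, n_a=26.0, s_a=1.0):
    """Eq. (9): Weibull parameters for ln(N) at one load; the shape is unchanged."""
    ratio = s_y / s_a
    return alpha, beta * ratio, y_bar + (gamma - n_a) * ratio


def cdf_via_eq5(y, alpha, beta, gamma, y_bar, s_y, n_a=26.0, s_a=1.0):
    """Eqs. (5) and (7): evaluate the merged cdf at the transformed point z."""
    z = s_a / s_y * (y - y_bar) + n_a
    if z <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((z - gamma) / beta) ** alpha))


def cdf_via_eq8(y, alpha, beta_k, gamma_k):
    """Eq. (8): the same cdf in standard three-parameter Weibull form."""
    if y <= gamma_k:
        return 0.0
    return 1.0 - math.exp(-(((y - gamma_k) / beta_k) ** alpha))


# Merged-data MLE parameters reported in the text for 2024-T4.
ALPHA, BETA, GAMMA = 2.894, 2.986, 23.327
# Hypothetical average and sample standard deviation of ln(N) for one load level.
Y_BAR, S_Y = 12.0, 0.5

alpha_k, beta_k, gamma_k = flt_group_parameters(ALPHA, BETA, GAMMA, Y_BAR, S_Y)
```

With these hypothetical inputs the scale shrinks in proportion to the group scatter and the location shifts below the group average, which is how a single merged fit yields a different cdf at each load.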

Table 4 FLT Weibull parameters for natural logarithm of fatigue life data for 2024-T4 specimens [9]
Fig. 6 FLT Weibull cdfs for fatigue life data for 2024-T4 specimens [9]

Another way to assess the quality of the proposed FLT methodology is to consider percentiles p of the estimated cdfs. The percentiles are given by

$$y_{p} = \hat{\gamma }_{k} + \hat{\beta }_{k} [ - \ln (1 - p)]^{{1/\hat{\alpha }}} ,$$
(10)

which are computed from Eqs. (8) and (9). One of the most commonly considered percentiles is the median \(y_{1/2}\), i.e., p is 0.5. Figure 7 is an S-N graph, identical to Fig. 1, where the solid line is the FLT estimated median, and the dashed lines are the FLT estimated 99% percentile bounds. The 99% bounds are very tight while encapsulating the entire set of data for each \(\Delta \sigma_{k}\). They also follow the trend in the S-N data in that they are very narrow when the life data have very little variability, but they are broader when the life data have larger variability. This lends credence to the FLT approach for the 2024-T4 fatigue data.
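A short sketch of Eq. (10), followed by the conversion from \(\ln (N)\) back to cycles, is given below. Two points are assumptions made here and not stated in the text: the 99% bounds are taken as the two-sided 0.5% and 99.5% percentiles, and the per-load parameters `BETA_K` and `GAMMA_K` are hypothetical values rather than entries from Table 4.

```python
import math


def flt_percentile(p, alpha, beta_k, gamma_k):
    """Eq. (10): p-th percentile of ln(N) for one load level."""
    return gamma_k + beta_k * (-math.log(1.0 - p)) ** (1.0 / alpha)


def flt_bounds_in_cycles(alpha, beta_k, gamma_k, coverage=0.99):
    """Lower bound, median, and upper bound in cycles.

    Assumes equal-tailed bounds, e.g. p = 0.005 and p = 0.995 for 99% coverage.
    """
    tail = (1.0 - coverage) / 2.0
    return tuple(
        math.exp(flt_percentile(p, alpha, beta_k, gamma_k))
        for p in (tail, 0.5, 1.0 - tail)
    )


ALPHA = 2.894                      # shape reported in the text for 2024-T4
BETA_K, GAMMA_K = 1.493, 10.6635   # hypothetical per-load scale and location

lower, median, upper = flt_bounds_in_cycles(ALPHA, BETA_K, GAMMA_K)
```

Because the exponential is monotone, percentiles of \(\ln (N)\) map directly to percentiles of N, so no further correction is needed when converting the bounds to cycles.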

Fig. 7 Fatigue life data for 2024-T4 specimens [9] with FLT percentile bounds

5 FLT Analysis for ASTM A969 Fatigue Life Data

The second application of the FLT method is for the ASTM A969 fatigue data. Again, the arbitrarily chosen values for \(N_{A}\) and \(s_{A}\) are 26 and 1, respectively. Figure 8 shows all 69 FLT values merged into a common sample space. The solid line is the MLE W(α, β, γ), Eq. (6). The KS and AD goodness of fit test statistics are 0.047 and 0.524, respectively. The MLE is acceptable according to the KS test for any significance level \(\alpha_{s}\) less than 0.3. For the AD test, however, it is acceptable only for \(\alpha_{s}\) less than 0.2 because there is some deviation between the data and the MLE in the lower tail. Even so, the FLT data are well represented by the MLE W(α, β, γ) cdf. The MLE estimated parameters are \(\hat{\alpha } = 3.057;\;\hat{\beta } = 2.964;\;{\text{and}}\;\hat{\gamma } = 23.289\).

Fig. 8 Merged FLT fatigue life data for ASTM A969 specimens [13]; Weibull MLE

Using Eqs. (7)–(9), Fig. 9 shows the ASTM A969 data with the FLT cdfs \(W(\alpha ,\beta_{k} ,\gamma_{k} )\). Graphically, the FLT cdfs appear to characterize the data well. In fact, the KS test indicates that all these cdfs are acceptable for any \(\alpha_{s}\) less than 0.3. The AD test, however, implies that the FLT cdfs are marginal, at best. Clearly, the FLT cdfs are not as accurate in the tails. No doubt, larger samples for each \(\Delta \varepsilon_{k}\) would help with characterization of the extremes. Using Eq. (10), Fig. 10 is an S-N graph, identical to Fig. 2, with the addition of the FLT estimated median and the FLT estimated 99% percentile bounds. The 99% bounds are very tight, and all the data are within the bounds for each \(\Delta \varepsilon_{k}\). They are somewhat jagged because they follow the pattern of the S-N data. The analysis is not as crisp as that for the 2024-T4 data; however, there seems to be merit in using the FLT approach for the ASTM A969 fatigue data.

Fig. 9 FLT Weibull cdfs for fatigue life data for ASTM A969 specimens [13]

Fig. 10 Fatigue life data for ASTM A969 specimens [13] with FLT percentile bounds

6 FLT Analysis for 9Cr-1Mo Fatigue Life Data

The third application of the FLT method, to the 9Cr-1Mo fatigue data, does not perform very well. Figure 3 shows unusually large scatter for the higher values of \(\Delta \varepsilon_{k}\), which provides a good test for the FLT methodology. The arbitrary scaling factors are the same; \(N_{A}\) and \(s_{A}\) are 26 and 1, respectively. Figure 11 shows the 130 FLT merged values with the MLE W(α, β, γ). The KS and AD goodness of fit test statistics are 0.028 and 0.300, respectively. The MLE is acceptable according to the KS and AD tests for any significance level \(\alpha_{s}\) less than 0.3. The merged data are very well characterized by this cdf. The corresponding MLE estimated parameters are \(\hat{\alpha } = 2.497;\;\hat{\beta } = 2.475;\;{\text{and}}\;\hat{\gamma } = 23.769\).

Fig. 11 Merged FLT fatigue life data for 9Cr-1Mo specimens [16]; Weibull MLE

The FLT cdfs \(W(\alpha ,\beta_{k} ,\gamma_{k} )\), Eqs. (7)–(9), for 9Cr-1Mo are shown on Fig. 12. Graphically, the FLT cdfs appear to be acceptable, at least for the cases with more data. Clearly, when \(\Delta \varepsilon_{k}\) is 0.019, the fit is borderline. The KS test indicates that all these cdfs are acceptable for any \(\alpha_{s}\) less than 0.3. On the other hand, the AD test indicates that none of the FLT cdfs are acceptable. The compressed graphical scale for the cycles to failure masks the poor fit of the FLT cdfs to the tails of the data. Figure 13 is the S-N graph, Fig. 3, with the FLT estimated median and the FLT estimated 99% percentile bounds. Because the FLT cdfs are not very representative of the fatigue data, the 99% bounds are erratic. All the data are within the bounds for each \(\Delta \varepsilon_{k}\), which is somewhat positive. The bounds are quite jagged because they follow the S-N data pattern. One of the difficulties in modeling these data is that there are three values for \(\Delta \varepsilon_{k}\) that have several replicates, but the other values have only a few. Also, the scatter in the data when \(\Delta \varepsilon_{k}\) is 0.020 appears to be larger than expected. Since there is no explanation for this in [16], the reason would be purely speculation. The analysis is marginal for the 9Cr-1Mo data. There may be merit in using the FLT approach for some insight, but conclusions would need to be made very cautiously.

Fig. 12 FLT Weibull cdfs for fatigue life data for 9Cr-1Mo specimens [16]

Fig. 13 Fatigue life data for 9Cr-1Mo specimens [16] with FLT percentile bounds

7 Observations and Conclusions

Three sets of fatigue life data were considered for the proposed FLT method. These datasets were selected because they have replicate data for several applied stress or strain ranges, \(\Delta \sigma_{k}\) or \(\Delta \varepsilon_{k}\), respectively. The primary reason for the proposed FLT method is to accurately model the statistical nature of fatigue life data. Specifically, the estimation of underlying cdfs is crucial. It is well known that fatigue life data have rather large amounts of variability, particularly for applied loads similar to typical operating conditions. Modeling these data is very challenging. A related difficulty is that many fatigue life datasets have relatively few choices for \(\Delta \sigma_{k}\) or \(\Delta \varepsilon_{k}\), and each choice has limited observations.

The FLT introduced in this paper attempts to help improve fatigue life modeling by using a statistically based transformation to merge data, thereby increasing the effective sample size. The FLT approach transforms the fatigue life data for each given \(\Delta \sigma_{k}\) or \(\Delta \varepsilon_{k}\) so that the averages and standard deviations are the same. Subsequently, the data are merged, and a suitable cdf is statistically estimated for the entire collection. The cdf for each given \(\Delta \sigma_{k}\) or \(\Delta \varepsilon_{k}\) is obtained by standard change-of-variables methods using the cdf that characterizes the entire transformed and merged data.

Using the 2024-T4 fatigue life data in [9], the FLT is very promising, partly because there are eight different values for \(\Delta \sigma_{k}\) and a total of 222 data being considered. A three-parameter Weibull cdf W(α, β, γ) is an excellent representation of the merged FLT data, and a W(α, β, γ) is also an appropriate characterization for the underlying cdfs given \(\Delta \sigma_{k}\). Consequently, there is assurance that the lower tail behavior is adequately modeled because of the methodology. Additionally, the validation for the approach is the computed FLT percentiles for the S-N data. The computed FLT 99% percentiles fully encompass all the S-N data, but more importantly, the bounds are quite tight. The conclusion from this analysis is that the FLT methodology is warranted.

To corroborate this conclusion, two other sets of fatigue life data were considered: ASTM A969 [13] and 9Cr-1Mo [16]. In both cases, the merged FLT data are well characterized by a W(α, β, γ). The KS test indicates that the transformed W(α, β, γ) is also suitable for the underlying cdfs given \(\Delta \varepsilon_{k}\), but the AD test is more discriminating. The tail behavior of the underlying cdfs given \(\Delta \varepsilon_{k}\) is marginally acceptable at best, if at all. The 99% percentiles for the ASTM A969 S-N data are quite good. They encompass the data, and they are tight. For the 9Cr-1Mo S-N data, however, the percentile lines are not very regular. They do encompass the data, but the data have so much variability that little is gained by the analysis.

Based on these three examples, the FLT approach should be employed for fatigue life data analysis when an empirical method is desired. The FLT method is excellent for one of the cases, 2024-T4, acceptable for another case, ASTM A969, and marginal for the other case, 9Cr-1Mo. As with all empirical analyses, caution must be exercised when it is implemented. Limited applied loads with limited replicate data for each load hinder accurate modeling for any method, including the FLT. As with all empirical methods, the more data there are, the better the accuracy will be. The example which was the worst, 9Cr-1Mo, seems to be poor because there is overly large scatter in the data for the higher loading conditions coupled with applied loads with only a few replicates. Again, the FLT methodology should be implemented with care.

Many sets of experimental fatigue life data contain censoring. This will be investigated in the future. In that case the cdf estimation is more involved, especially for a three-parameter cdf. In principle the FLT methodology should be similar except for the adjustment for censoring. All things considered, the proposed FLT approach has sufficient promise that further investigation and analysis are certainly warranted. The overarching observation is that the FLT approach is useful if the fatigue data are reasonably well behaved.