1 Introduction

Temper embrittlement of Cr-Mo steels and their weld metals has been investigated since the 1960s [1]. Temper embrittlement can be accelerated by a step-cooling heat treatment over the temperature range of 595 °C down to 385 °C (1,100 °F down to 725 °F), which is commonly used to determine the upward shift in the Charpy V-notch temperature corresponding to a specified energy level, usually 54 J (40 ft-lb). These investigations have concluded that composition factors can be used to predict the resistance to temper embrittlement. The Bruscato factor was proposed in 1970 and is still used today to accept or reject weld metals [1]. Bruscato further proposed that it was important to minimize Mn and Si for optimum temper embrittlement resistance, but did not propose a factor including those elements. The Watanabe factor was later proposed to take Mn and Si into account, along with P and Sn [2].

The Bruscato factor is given as:

$$ \mathrm{Bruscato\ factor}=\frac{10\mathrm{P}+5\mathrm{Sb}+4\mathrm{Sn}+\mathrm{As}}{100} $$

The Watanabe factor is given as:

$$ \mathrm{J}=\left(\mathrm{Si}+\mathrm{Mn}\right)\times \left(\mathrm{P}+\mathrm{Sn}\right)\times {10}^4 $$

Reheat cracking has occurred in 2 1/4 Cr–1 Mo–V weld metals during postweld heat treatment. Chauvy and Pillot proposed a reheat cracking factor taking into account the roles of Pb, Bi, and Sb [3]. The reheat cracking composition factor is given as:

$$ \mathrm{Kf}=\mathrm{Pb}+\mathrm{Bi}+0.03\mathrm{Sb} $$

In the Bruscato and reheat cracking factors, the elements listed are expressed as parts per million (ppm). The Watanabe factor is conventionally evaluated with its elements expressed in weight percent, which is the basis on which a limit such as 150 is meaningful. Bruscato did not propose a limit for the Bruscato factor, but his report indicates best temper embrittlement resistance when this factor is 10 or less. When the Watanabe factor is less than 150, best temper embrittlement resistance is anticipated. Chauvy and Pillot proposed that the reheat cracking factor be limited to 1.5. Each of these three factors has been used in materials procurement as an accept/reject requirement, and each requires several chemical elements to be measured. In a procurement situation, reproducibility of measurement is a very important consideration. In many cases, the customer for the supplied materials does not trust the supplier, or trusts but attempts to verify, so two (or more) sets of chemical analyses may be used to evaluate the accept/reject criterion with regard to a particular lot of material. Because the analytical results are unlikely to be exactly the same among two or more laboratories, the potential for dispute between customer and supplier exists.
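As a minimal sketch of how the three factors and the limits quoted above might be evaluated for one set of analyses, the following Python functions can be used. The function names and the example composition are hypothetical and are not values from this round robin. The Watanabe function assumes that Si, Mn, P, and Sn are supplied in weight percent, while the other two functions take ppm.

```python
def bruscato_factor(p_ppm, sb_ppm, sn_ppm, as_ppm):
    """Bruscato factor with all elements in ppm."""
    return (10 * p_ppm + 5 * sb_ppm + 4 * sn_ppm + as_ppm) / 100


def watanabe_factor(si_wt, mn_wt, p_wt, sn_wt):
    """Watanabe (J) factor; weight percent inputs are an assumption of this sketch."""
    return (si_wt + mn_wt) * (p_wt + sn_wt) * 1e4


def reheat_cracking_factor(pb_ppm, bi_ppm, sb_ppm):
    """Reheat cracking factor Kf with all elements in ppm."""
    return pb_ppm + bi_ppm + 0.03 * sb_ppm


# Hypothetical composition, for illustration only.
x = bruscato_factor(p_ppm=40, sb_ppm=10, sn_ppm=50, as_ppm=20)
j = watanabe_factor(si_wt=0.10, mn_wt=0.50, p_wt=0.004, sn_wt=0.005)
kf = reheat_cracking_factor(pb_ppm=1.0, bi_ppm=0.4, sb_ppm=10)

# Limits as quoted in the text: 10 or less, less than 150, limited to 1.5.
print(f"Bruscato factor {x:.1f}, meets limit: {x <= 10}")
print(f"Watanabe factor {j:.0f}, meets limit: {j < 150}")
print(f"Reheat cracking factor {kf:.2f}, meets limit: {kf <= 1.5}")
```

Keeping each factor in its own function makes the unit convention for each input explicit, which matters when two laboratories report the same lot in different units.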

The IIW has a history of undertaking round robin testing, including for chemical analysis, to evaluate reproducibility of measurement [4]. Accordingly, the present round robin was undertaken to examine the reproducibility of measurement of trace elements P, As, Sb, Sn, Pb, and Bi in Cr–Mo steel, and to examine the effects of reproducibility of measurement on the reproducibility of determination of the Bruscato, Watanabe, and reheat cracking factors.

2 Round robin procedure

2.1 Sample preparation

Three small casts (about 90 kg each) of 2 1/4 Cr–1 Mo–V steel were poured by Voestalpine Stahl Linz GmbH., Austria, starting from a single melt divided into three parts. One part received no additions. To each of the other two parts, varying amounts of P, Sb, Sn, Pb, and Bi were added, followed by a mixing time of 20 min in the induction furnace, to arrive at low, medium, and high levels, respectively, of these trace elements. Due to safety regulations, As could not be deliberately added, so the resulting As content of the three casts is essentially the same, and is purely a random result.

Each cast was poured into a mould and cooled for 48 h. After cooling, each cast was trimmed to remove expected segregation effects and then machined to a 140 mm by 140 mm by 300 mm block. Before rolling, each block was annealed at 1,200 °C for 2.5 h. Hot rolling began at about 1,100 °C and finished at about 900 °C, at which point each cast was about 160 mm wide, 30 mm thick and 1,200 mm long. Each cast was then sliced transversely to the length into 10 mm strips, and each strip was stamped with the cast number and the slice number within that cast. Figure 1 shows a sliced cast.

Fig. 1 Sliced cast

The slices were then distributed to the round robin participants.

2.2 Round robin participants and data reporting

At least one sample from each cast was sent to all of the participating laboratories, except that two laboratories (coded P and V) received only a sample from the low trace elements cast. When the results were reported to the author, a code was assigned in a random fashion to that laboratory’s results. Some laboratories analyzed samples by more than one analytical technique. In those cases, the same random code was used for both sets of results on a given sample. However, if a laboratory received more than one sample of each cast, then a different code was assigned to the second or third sets of results. A total of seventeen laboratories reported results from the round robin. However, Laboratories P and V analyzed only As, Sb, Sn, Pb, and Bi on a single sample from the low trace element cast, so the bulk of the round robin consists of data from fifteen laboratories.

In addition to the analytical results, each laboratory reported the method of analysis. Some, but not all, also reported instruments used by make and model, in-house standard deviation of measurement, and calibration standards employed.

Some laboratories reported trace element results in weight percent to three decimal places. Others reported trace element results in ppm, sometimes as whole numbers and sometimes with figures after the decimal point. All of the trace element results were converted to ppm for this report, so the trace results in this report include some data with figures after the decimal point and some without. Where a weight percent value was converted to ppm, a non-significant zero appears at the end of some data.
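The conversion described above is a simple scaling, since 1 weight percent equals 10,000 ppm by mass. A minimal sketch follows; the function name and the example values are hypothetical.

```python
def wt_pct_to_ppm(wt_pct):
    """Convert weight percent to ppm by mass (1 wt% = 10,000 ppm)."""
    # Rounding suppresses floating-point noise from the multiplication.
    return round(wt_pct * 10_000, 6)


# A result reported as 0.004 wt% has one significant figure,
# so the trailing zero of the converted value is not significant.
print(wt_pct_to_ppm(0.004))
print(wt_pct_to_ppm(0.012))
```

A value reported as 0.004 wt% converts to 40 ppm, which illustrates how the non-significant trailing zero arises.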

2.3 Analytical methods

A total of six analytical methods were used by participants in the round robin to measure trace elements P, As, Sb, Sn, Pb, and Bi. These are as follows:

  • Optical emission spectroscopy (OES)

  • Colorimetric (wet)

  • Inductively coupled plasma with optical emission spectroscopy (ICP-OES)

  • Inductively coupled plasma with mass spectrometry (ICP-MS)

  • Atomic absorption (AA)

  • Glow discharge mass spectrometry (GDMS)

In addition, one laboratory used a gravimetric (wet) method to measure Si, and several used wet gravimetric methods to measure Cr.

3 Round robin results

Tables 1, 2 and 3 summarize the round robin results for the three casts of steel. The tables are color coded to indicate the analytical method used for each individual data point. It can be seen in these tables that several laboratories used more than one analytical method for several elements in a given sample. Below the individual results, the interlaboratory average and standard deviation are given, followed by the ratio of the interlaboratory average to the standard deviation. This last calculated value is of particular interest because it provides an indication of the interlaboratory reproducibility of results for a given element. A ratio greater than 2.0 indicates a high degree of reproducibility. Then the last three columns of each table consist of the calculated Bruscato factor, the calculated Watanabe factor, and the calculated reheat cracking factor for each sample. Finally, the average, standard deviation and ratio of the average to the standard deviation of each of these three factors are included in the tables.

Table 1 Chemical analysis results for the low trace element cast
Table 2 Chemical analysis results for the medium trace element cast
Table 3 Chemical analysis results for the high trace element cast

Some laboratories did not report any value for one or more elements. In most cases, this seems due to the laboratory’s instruments not being calibrated for the particular element not reported. There were also a number of cases where a laboratory reported only that a given element was present at a level less than some specific value, presumably the detectability limit of the instrument used or the lowest level of calibration in which the laboratory had confidence. In Tables 1, 2, and 3, such data are excluded from calculation of averages and standard deviations.
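The summary statistics described for Tables 1, 2, and 3 can be sketched as follows. This is an illustration under stated assumptions, not the calculation actually used for the tables: the function name and the input values are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed because the text does not state which form was used. Unreported values and "less than" values are excluded, as described above.

```python
from statistics import mean, stdev


def interlab_summary(values):
    """Return (average, standard deviation, average / standard deviation).

    Entries given as None (not reported) or as strings such as "<50"
    (reported only as below a stated limit) are excluded.
    """
    numeric = [v for v in values if isinstance(v, (int, float))]
    avg = mean(numeric)
    sd = stdev(numeric)  # sample standard deviation; an assumption of this sketch
    return avg, sd, avg / sd


# Hypothetical results in ppm for one element in one cast.
reported = [40, 42, None, "<50", 44]
avg, sd, ratio = interlab_summary(reported)
print(f"average {avg:.1f} ppm, SD {sd:.1f} ppm, ratio {ratio:.1f}")
print("ratio exceeds 2.0:", ratio > 2.0)
```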

3.1 Analysis of major elements and elements outside of the factors of interest

It is worth noting that the interlaboratory reproducibility of measurement of each of the main alloying elements C, Mn, Si, Cr, Mo, and V is quite high: the ratio of the interlaboratory average to the standard deviation for each element is 9.6 or higher. Furthermore, the reproducibility of analysis of other elements including S, Ni, Nb, Cu, and Ti is quite good, with a ratio of the interlaboratory average to the standard deviation greater than 2.0. Only W in all three casts and Al in the low trace element cast exhibit ratios below 2.0. In the case of W, this is no doubt due to the fact that there are extremely low levels present, and W was not deliberately varied. In the case of Al in the low trace element cast, a single result from Laboratory Z causes the rather low reproducibility; omission of this result changes the ratio to 2.26.

3.2 Analysis of P, As, Sb, Sn, Pb, and Bi

3.2.1 Phosphorus results

In the case of the low trace element cast (Table 1), the ratio of the interlaboratory average value to the standard deviation is barely greater than 2.0. However, examination of the values indicates that the P result reported by Laboratory F seems unreasonably high, and that reported by Laboratory Y seems unreasonably low. If these two values are excluded from the calculations, the ratio of interlaboratory average to standard deviation rises to 4.38, while the interlaboratory average value barely changes (to 42.1 ppm from 44.6 ppm). In the cases of the medium trace element cast (Table 2) and the high trace element cast (Table 3), the ratio of the interlaboratory average to standard deviation is well above 2.0 even when the results of Laboratory F and Laboratory Y are included.

It is noteworthy that two main methods of analysis were used by the participants—OES and colorimetric. Figure 2 presents the phosphorus results as bar charts color coded so that the various analytical methods used can be readily seen. The results of just the OES and colorimetric methods can be compared in Table 4, both including and excluding the OES result of Laboratory F. From Table 4, it can be seen that the results are quite similar, with the average of each method within one standard deviation of the other and the standard deviations at each level of similar magnitude, when exceptional results of Laboratory F are excluded. So it seems that the OES and colorimetric methods of analysis produce similar results for P.
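One way to read the comparison criterion used above (each method's average lying within one standard deviation of the other's) is sketched below. The function name and the data are hypothetical, and the sample standard deviation is an assumption of this sketch.

```python
from statistics import mean, stdev


def within_one_sd(group_a, group_b):
    """True if the gap between the two averages is no larger than
    either group's standard deviation."""
    gap = abs(mean(group_a) - mean(group_b))
    return gap <= stdev(group_a) and gap <= stdev(group_b)


# Hypothetical P results in ppm for the two methods on one cast.
oes = [41, 44, 47]
colorimetric = [40, 43, 46]
print(within_one_sd(oes, colorimetric))
```

Requiring the gap to be within both standard deviations is the stricter of the possible readings; a single very scattered method could otherwise mask a real offset.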

Fig. 2 Phosphorus results color coded by analytical method

Table 4 Comparison of OES and colorimetric results for P

3.2.2 Arsenic results

Arsenic was not deliberately varied, so essentially the same arsenic level can be expected in all three casts. OES was the most used method, followed by ICP-MS. Figure 3 presents the arsenic results as bar charts color coded so that the various analytical methods used can be readily seen. Table 5 compares the arsenic results for OES with those of all methods except OES and with those of ICP-MS only. It can be seen in Table 5 that the ICP-MS only results are easily the most consistent. OES produces results that are similar to all other analytical methods combined, but not as consistent as ICP-MS alone. If all of the ICP-MS results for the three casts are lumped together, the overall average is 19.7 ppm, the standard deviation is 3.3, and the ratio of the average to the standard deviation is 5.96, which is similar to the results for any one cast.

Fig. 3 Arsenic results color coded by analytical method

Table 5 Comparison of OES, all other results for arsenic and ICP-MS only

The ICP-OES results for Laboratory Z are consistently much higher than the bulk of the results, and the OES result for Laboratory B on the high trace elements cast seems similarly high. It should be noted that Laboratory B did not report a measurable value for arsenic for the other two casts.

3.2.3 Antimony results

For all three casts, Laboratory Z results seem unreasonable as compared with the others; see Tables 1, 2 and 3. ICP-MS seems to produce the most consistent results. Table 6 compares all results for each cast, all results without Laboratory Z, and results only for ICP-MS. Without Laboratory Z results, the reproducibility of all methods seems sufficient to allow differentiation between levels, but the ICP-MS results provide for the best discrimination among levels. Figure 4 presents the antimony results as bar charts color coded so that the various analytical methods used are easily seen and visually compared. It should be noted in examining Fig. 4 that when the colored bar for a particular laboratory reaches the top of the chart, it reaches considerably beyond that, but the scale of the chart was set to better illustrate the remaining results.

Table 6 Comparison of results for antimony
Fig. 4 Antimony results color coded by analytical method

3.2.4 Tin results

There seems to be one unusual result in the Sn analyses: that of Laboratory X using GDMS for the high trace element cast (Table 3). This seems especially strange because the same laboratory, using the same method for the low trace element cast and the medium trace element cast, reported results consistent with those of the other laboratories. Results are compared in Table 7.

Table 7 Comparison of results for tin

Table 7 compares the results obtained by OES only, which are the most numerous for each cast, with those obtained by ICP-OES and with those obtained by ICP-MS. The results for the various methods are similar. Figure 5 presents the tin results as bar charts color coded so that the various analytical methods used can be seen and visually compared. The tin results appear considerably more consistent in the medium and high trace element casts than in the low trace element cast, and this is confirmed statistically in Table 7.

Fig. 5 Tin results color coded by analytical method

3.2.5 Lead results

The results of Pb analysis are very widely scattered, from 1.1 to 200 ppm for the low trace element cast (Table 1), from 3 to 200 ppm for the medium trace element cast (Table 2) and from 8.49 to 200 ppm for the high trace element cast (Table 3). If the extremely high values reported by Laboratories B and Z are excluded, the ranges of analysis for each cast are still very broad. But when the analysis method is limited to ICP-MS (five laboratories for the low trace elements cast, four for the medium and high trace elements casts), the reproducibility is excellent. Table 8 compares all results, all results except those of Laboratories B and Z, and the results obtained only by ICP-MS. Figure 6 presents the lead results as bar charts color coded so the various analytical methods can be easily seen and results with the various methods can be compared visually. It should be noted that, in examining Fig. 6, in the cases where the color bar for a particular laboratory reaches the top of the chart, it extends considerably beyond that level, but the scale of the chart was chosen to better illustrate the remaining results.
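The kind of breakdown by analytical method described above can be sketched as a simple grouping step. The function name, laboratory labels, and values below are hypothetical and are not data from Table 8; the sample standard deviation is again an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_method(results):
    """results: iterable of (lab_label, method, value_ppm) tuples.

    Returns {method: (average, SD, average / SD)} for every method
    with at least two results.
    """
    groups = defaultdict(list)
    for _lab, method, value in results:
        groups[method].append(value)

    summary = {}
    for method, values in groups.items():
        if len(values) < 2:
            continue  # a standard deviation needs at least two results
        avg, sd = mean(values), stdev(values)
        summary[method] = (avg, sd, avg / sd if sd else float("inf"))
    return summary


# Hypothetical Pb results in ppm.
pb_results = [
    ("L1", "ICP-MS", 9.0), ("L2", "ICP-MS", 10.0), ("L3", "ICP-MS", 11.0),
    ("L4", "OES", 5.0), ("L5", "OES", 60.0), ("L6", "AA", 12.0),
]
for method, (avg, sd, ratio) in summarize_by_method(pb_results).items():
    print(f"{method}: average {avg:.1f}, SD {sd:.1f}, ratio {ratio:.2f}")
```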

Table 8 Comparison of results for lead
Fig. 6 Lead results color coded by analytical method

3.2.6 Bismuth results

Many of the participating laboratories were unable to determine Bi (see Tables 1, 2, and 3). Ten sets of results were reported for the low trace element cast, and eleven reported results for the medium and high trace element casts. Laboratory X is the only laboratory that reported results by two analytical methods—OES and GDMS. For all three casts, the OES result of Laboratory X was considerably greater than the GDMS result of Laboratory X. The overall results reveal a great deal of scatter, but the ICP-MS results (4 laboratories analyzing each cast) are remarkably consistent (see Table 9). It should be noted that Laboratory R attempted to analyze the low trace element cast but could only report that the result was less than 1.0 ppm, while Laboratory V only analyzed the low trace element cast. Figure 7 presents the bismuth results as bar charts color coded so the various analytical methods can be easily seen and the various methods can be compared visually. It should be noted that, in examining Fig. 7, in cases where the color bar for a particular laboratory reaches the top of the chart, it extends considerably beyond that level, but the scale of the chart was chosen to better illustrate the remaining results.

Table 9 Comparison of results for bismuth
Fig. 7 Bismuth results color coded by analytical method

3.3 Calculated factors

Some laboratories were unable to analyze all of the trace elements of interest in this round robin (P, As, Sb, Sn, Pb, and/or Bi). As a result, calculated Bruscato, Watanabe, and/or reheat cracking factor(s) are omitted for these laboratories in Tables 1, 2, and/or 3. Some laboratories analyzed trace elements by more than one method, which allowed more than one calculation of one or more of the factors. An example is Laboratory G in Table 1, which analyzed Si, Mn, and P only by colorimetric methods, but analyzed As, Sb, Sn, Pb, and Bi by both ICP-MS and AA. The AA results for Pb and Bi were below the detectability limit of AA for Laboratory G, but the values obtained by colorimetric and ICP-MS could be used with the AA results for As, Sb, and Sn to calculate a second value for each of the three factors, as can be seen in Table 1.

Consideration was given to calculating one or more of the factors by using the detectability limit when no single value was reported for one of the trace elements, but this led to some very unrealistic results, and that effort was abandoned.

3.3.1 Bruscato factor

The elements P, Sn, Sb, and As are required to calculate the Bruscato factor. For the low trace element cast, data were provided that permitted eleven calculations of the Bruscato factor, as can be seen in Table 1. Fourteen calculations could be made for the medium trace element and high trace element casts, as shown in Tables 2 and 3. It is to be noted that the high P results reported by Laboratory F would have produced a high Bruscato factor for all three casts, but the lack of results for Sb, Sn and As from that laboratory precluded calculation. Likewise, Laboratory Z reported unusually high Sb results, but the lack of a P result prevented calculation of the Bruscato factor for that Laboratory.

Table 10 provides a summary of the Bruscato factor calculations for the laboratories providing sufficient data to make the calculation. In addition to the interlaboratory average, standard deviation and ratio of average to standard deviation for all of the laboratories, the data is subdivided into that from laboratories whose analysis for P, Sn, Sb, and As was entirely by OES, and those whose analysis for at least one of these elements was by a method other than OES. It can be seen in Table 10 that there is no statistically significant difference between the Bruscato factor calculated from data obtained entirely by OES and that obtained by other methods. Figure 8 presents the results as bar charts color coded to allow separation of results obtained entirely by OES from results which included at least one method other than OES.

Table 10 Bruscato factor calculations
Fig. 8 Bruscato factor color coded by analytical method

3.3.2 Watanabe factor

The elements Mn, Si, P, and Sn are required to calculate the Watanabe factor. For the low trace element cast, data were provided that permitted 20 calculations of the Watanabe factor (Table 1); for the medium trace element cast, 22 calculations (Table 2); and for the high trace element cast, 23 calculations (Table 3). It should be noted that the high P results of Laboratory F are not included in the calculations because Laboratory F did not report Sn results. Likewise, Laboratory Z did not report P results, so that Laboratory is excluded from the calculations.

Table 11 provides a summary of the Watanabe factor calculations for the laboratories providing sufficient data to make the calculations. Table 11 contains the same type of calculated results as Table 10. As with the Bruscato factor, there is no statistically significant difference between the Watanabe factors calculated from data obtained by OES and those obtained by other methods. Figure 9 presents the results as bar charts color coded to allow visual separation of results obtained entirely by OES from those obtained using at least one method other than OES to analyze at least one element.

Table 11 Watanabe factor calculations
Fig. 9 Watanabe factor color coded by analytical method

3.3.3 Reheat cracking factor

The elements Pb, Bi, and Sb are required to calculate the reheat cracking factor. For the low trace element cast, data were provided that permitted 10 calculations of the reheat cracking factor (Table 1); for the medium trace elements cast, 11 calculations (Table 2); and for the high trace elements cast, 12 calculations (Table 3). Due to the small coefficient for Sb (0.03) in the reheat cracking factor, Sb could easily be ignored, but the laboratories for which Sb data was lacking also did not provide results for either or both Pb and Bi.

Table 12 provides a summary of the reheat cracking factor calculations for the laboratories providing sufficient data to make the calculations. There is very little OES data, so separating that data from the others provides no useful statistical information. There is a great deal of scatter evident in the calculated reheat cracking factor data, but the ICP-MS data has very little scatter among the results for the four laboratories that provided ICP-MS results for Pb, Bi, and Sb. There is a statistically significant difference between the ICP-MS calculations and the remaining calculations. It should be noted, however, that a few of the other results are similar to those of ICP-MS, but there are not enough of these other results to make a statement about whether or not they are statistically significant.

Table 12 Reheat cracking factor calculations

Figure 10 presents the reheat cracking factor results as bar charts color coded to emphasize the analytical method used. In all cases, except the second set of results from Laboratory G, the elements Pb, Bi, and Sb for a given laboratory were all obtained by the same analytical method. It should be noted that, in examining Fig. 10, in cases where the color bar for a particular laboratory reaches the top of the chart, it extends considerably beyond that level, but the scale of the chart was chosen to better illustrate the remaining results. The consistency of the results obtained by the ICP-MS method is obvious.

Fig. 10 Reheat cracking factor color coded by analytical method

4 Discussion of results

The present round robin was undertaken to examine the reproducibility of measurement of trace elements P, As, Sb, Sn, Pb, and Bi in Cr–Mo–V steel, and to examine the effects of reproducibility of measurement on the reproducibility of determination of the Bruscato, Watanabe, and reheat cracking factors.

Bruscato [1] had recommended that chemical analysis by OES not be used to calculate the Bruscato factor because he considered that the OES capabilities of the 1960s were suspect. Bruscato preferred wet methods such as the colorimetric determination of P. In Bruscato’s day, the other instrumental methods used in the present round robin did not exist. In the present round robin, there was extensive use of OES, as well as extensive use of other instrumental methods of analysis. Only Mn, Si, Cr, and P were analyzed by wet methods (colorimetric or gravimetric) by at least one laboratory.

It should be recognized that some instrumental methods such as OES require instrument calibration with very well-characterized standards for optimum results. The availability of such standards is considerably more extensive than it was in Bruscato’s day. This in turn could be expected to improve the reproducibility of the instrumental methods. For analysis of the elements comprising the Bruscato factor and the Watanabe factor, this appears to be the case, but it should be kept in mind that the Bruscato factor is dominated by phosphorus and tin. The coefficient for P (10) is larger than any of the other coefficients in the Bruscato factor, and only Sn is present in amounts as great as, or greater than, P. Sb and As levels are lower. For the Watanabe factor, As and Sb do not enter into the calculation. Table 4 indicates that the reproducibility of P analysis is quite good, and Table 7 indicates that the reproducibility of Sn is also quite good. Table 6 indicates that the reproducibility of Sb analysis was good except for the low trace elements cast. Table 5 indicates that the reproducibility of As analysis was mostly acceptable also.
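The dominance of P and Sn in the Bruscato factor can be made concrete by splitting the numerator into its four terms. The sketch below uses a hypothetical composition in which Sn is slightly higher than P and in which Sb and As are lower; neither the function name nor the values come from this round robin.

```python
def bruscato_contributions(p_ppm, sb_ppm, sn_ppm, as_ppm):
    """Fraction of the Bruscato factor numerator contributed by each element."""
    terms = {
        "P": 10 * p_ppm,
        "Sb": 5 * sb_ppm,
        "Sn": 4 * sn_ppm,
        "As": as_ppm,
    }
    total = sum(terms.values())
    return {element: term / total for element, term in terms.items()}


# Hypothetical composition in ppm.
shares = bruscato_contributions(p_ppm=40, sb_ppm=10, sn_ppm=50, as_ppm=20)
for element, share in shares.items():
    print(f"{element}: {share:.1%}")
```

For a composition of this kind, errors in Sb or As move the factor far less than a comparable relative error in P or Sn.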

When the Bruscato factor is calculated for the various laboratories providing sufficient data, the reproducibility is high for all three casts (Table 10). Furthermore, there is no statistically significant difference among the Bruscato factors calculated from all of the data, the Bruscato factors calculated from OES data only, and the Bruscato factors calculated from all data except OES results at any of the three trace element levels. This is a very satisfactory result. However, it must be kept in mind that the data from Laboratory F, which provided exceptionally high P analysis results, are excluded from the Bruscato factor calculations because that laboratory did not provide an arsenic, antimony or tin analysis, and the data from Laboratory Z, which provided exceptionally high Sb analysis results, are excluded from the Bruscato factor calculations because that laboratory did not provide P analysis results. With these limitations in mind, it is clear that the low trace elements cast would satisfy a Bruscato factor limit of 10, but the medium trace elements cast and the high trace elements cast would not.

When the Watanabe factor is calculated for the various laboratories providing sufficient data, the reproducibility is again high, and there is no statistically significant difference among the Watanabe factors calculated from all of the data, from OES data only, and from all data except OES data at any of the three trace element levels. It should be noted that the results from Laboratory F, which provided exceptionally high P analysis results, are excluded from the Watanabe factor calculations because that laboratory did not provide Sn results. And the results from Laboratory Z are excluded from the Watanabe factor calculations because that laboratory did not provide P analysis results. With these limitations in mind, it is clear that the low trace elements cast would satisfy a Watanabe factor limit of 150, but the medium trace elements cast would not do so reproducibly, and the high trace elements cast would be very unlikely to do so.
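In a procurement setting, the practical question is how many laboratories would reach the same accept/reject decision for a given lot. A minimal sketch of that check follows; the function name and the per-laboratory factor values are hypothetical and are not taken from Tables 10 or 11.

```python
def fraction_meeting_limit(lab_factors, limit, inclusive=True):
    """Fraction of laboratories whose calculated factor satisfies the limit."""
    if inclusive:
        passed = [f for f in lab_factors if f <= limit]
    else:
        passed = [f for f in lab_factors if f < limit]
    return len(passed) / len(lab_factors)


# Hypothetical Watanabe factors calculated by four laboratories for one lot,
# checked against the "less than 150" criterion.
watanabe_by_lab = [140.0, 155.0, 148.0, 162.0]
share = fraction_meeting_limit(watanabe_by_lab, limit=150, inclusive=False)
print(f"{share:.0%} of laboratories would accept this lot")
```

A fraction well away from both 0 and 1 corresponds to the situation described above for the medium trace elements cast, where the limit would not be satisfied reproducibly.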

The situation with regard to the reheat cracking factor is considerably less favourable than the situations with regard to the Bruscato factor and the Watanabe factor. When all of the data for Pb are included, the reproducibility is very poor (Table 8). Only when the analysis is restricted to ICP-MS can statistically significant results be found. The same is true of the bismuth analyses (Table 9). As a result, calculation of reheat cracking factors using all available data leads consistently to values for this factor that are not statistically significant. Only when the data used is restricted to ICP-MS results can statistically significant reheat cracking factors be obtained (Table 12).

It should be noted that there are a few results from other analytical methods that seem consistent with the ICP-MS results, but there are not enough of these to evaluate whether or not other methods can provide consistent and reproducible results. The AA results of Laboratory T seem to be mostly consistent with the ICP-MS results for all three trace element casts (Figs. 6, 7, and 10), but no other laboratory provided AA results for Pb, Bi, and Sb.

It should also be noted that, according to Chauvy and Pillot [3], the reheat cracking factor needs to be held at less than 1.5 for optimum resistance to reheat cracking, but the reheat cracking factor determined by ICP-MS for the low trace element cast averages 2.0, with no values at or below 1.5, and the average value is much higher for the other two casts. So it cannot be said that statistically significant differences can be determined for calculated reheat cracking factors below 1.5. This suggests that another round robin, with cleaner material than that of the low trace elements cast of this round robin, is needed to establish reproducibility at that level.

5 Conclusions

Based upon the results of the recently completed round robin on trace elements, the following conclusions seem justified:

  1. The reproducibility of chemical analysis of the elements P, As, Sb, and Sn has been shown to be sufficient to allow statistically significant accept/reject decisions to be made for materials based on Bruscato factor calculations. In particular, analysis of these elements by OES, AA, ICP-MS, ICP-OES, GDMS, and wet methods, after proper calibration, produces statistically equivalent results.

  2. The reproducibility of chemical analysis of the elements Mn, Si, P, and Sn has been shown to be sufficient to allow statistically significant accept/reject decisions to be made for materials based on Watanabe factor calculations. Again, analysis of these elements by OES, AA, ICP-MS, ICP-OES, GDMS, and wet methods, after proper calibration, produces statistically equivalent results.

  3. The reproducibility of chemical analysis of the elements Pb, Bi, and Sb has not been shown to be sufficient to allow statistically significant accept/reject decisions to be made for materials based on reheat cracking factor calculations. In part this is due to the lowest level of trace elements in this round robin being higher than the accept/reject criterion of 1.5 maximum proposed by Chauvy and Pillot [3]. The data suggest that ICP-MS may be capable of allowing such decisions, but that has not been established.