Introduction

The main objective of interlaboratory exercises, from which they derive their increasing utility and importance, is to help laboratories demonstrate their quality and competence in performing tests by allowing the analytical capacity of each laboratory's method to be compared with that of other, similar laboratories. These exercises are an important and useful tool for comparing results and verifying a laboratory’s technical proficiency [1, 2].

According to the ISO/IEC 17043 standard, interlaboratory exercises are widely used for several purposes, and their use is increasing. Typical purposes for interlaboratory comparisons include “(1) to evaluate the performance of laboratories for specific tests or measurements; (2) to identify problems in laboratories and initiate actions for improvement; (3) to establish the effectiveness and comparability of test or measurement methods; (4) to provide additional confidence to laboratory customers; (5) to identify interlaboratory differences; and (6) to educate participating laboratories based on the outcomes of such comparisons” [3].

Testing and calibration laboratories accredited under the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) 17025:2017 standard [1] are required to participate regularly in intercalibration tests because the information generated can be used to improve the analytical process. For this reason, exercises of this type have been organized worldwide, although most have focused on the determination of various pesticides in different foods, such as cereals [4], mango pulp [5], Chinese cabbage [6], brown rice [7, 8], olive oil [9], and husked wheat [10].

Few of these studies have been conducted in Mexico, and most have (1) focused on pesticides in food, for the reasons discussed above, and (2) been organized by the National Metrology Center (CENAM, its acronym in Spanish). Examples include the studies organized by CENAM in 2013 to compare the determination of pesticides in freeze-dried cabbage [11] and freeze-dried avocado [12] and the study conducted in 2017 on pesticides in coffee [13]. However, there are no reports of this type of test in drinking water.

Strengthening the analytical capacity in Mexico is important because it is necessary to have information that supports the quality of the laboratories that analyze pesticides in different matrices and to have a national database that clearly identifies the laboratories that perform this type of analysis. Therefore, the objective of this study was to evaluate the measurement capacities of nine laboratories to determine the levels of five organochlorine pesticides (OCPs) in drinking water and perform an intercomparison of the results.

Materials and methods

Invitation to the exercise

An invitation to participate in the exercise was sent by e-mail to several laboratories that routinely perform OCP analyses in different matrices. Nine accepted the invitation and stated that they had the analytical capacity necessary for the analysis. These laboratories were distributed as follows: one each in Colima, Guerrero, Nayarit, and Sinaloa, two in Mexico City, and three in Sonora. Once the laboratories had agreed to participate, a confirmation was sent along with an identification code (from 1601 to 1609) to maintain the reliability and confidentiality of their participation in the interlaboratory exercise and of the results.

Preparation and distribution of test samples

The mixture of OCPs used to fortify the test material (drinking water) was prepared from certified reference materials of aldrin, β-endosulfan, heptachlor, lindane, and p,p′-DDE with > 98 % purity obtained from Chem Service. For each compound, a stock standard solution was prepared in acetone at a concentration between 2.24 and 3.71 mg/mL, and intermediate (10 mL) and working (50 mL) solutions were prepared from these stock solutions (Table 1). Aliquots of 1.5 mL were placed in amber glass vials, sealed, and labeled as a mixture of OCPs in acetone. The vials were stored under refrigeration at 4 °C until they were sent to the participating laboratories.

Table 1 Concentrations of the stock standard, intermediate, and working solutions used in the proficiency testing of OCPs in water

On September 21, 2016, two 505-mL bottles containing drinking water (obtained from the organizing laboratory’s purification plant and previously verified as pesticide-free) and a vial with 1.5 mL of the fortifying mixture were distributed to each laboratory for analysis. The samples were sent to each participating laboratory by messenger service under cold-chain conditions, and the laboratories were instructed to handle them according to the protocol established by the organizing laboratory. The protocol consisted of placing 900 mL of the supplied matrix (drinking water) in a 1000-mL volumetric flask, adding 1 mL of the fortifying solution, which had previously been equilibrated at room temperature, and then making the volume up to 1000 mL; the mixture was left to stand overnight under refrigeration at 4 °C.
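The fortification step amounts to a 1:1000 dilution of the working solution. The following is a minimal sketch of that arithmetic; the working-solution concentration used in the example is a hypothetical value chosen for illustration, since the actual concentrations are given only in Table 1.

```python
def fortified_concentration_ug_per_l(working_conc_ug_per_ml,
                                     spike_volume_ml=1.0,
                                     final_volume_ml=1000.0):
    """Concentration (ug/L) in the fortified water after dilution to the final volume."""
    # Mass of analyte added (ug) divided by the final volume (mL) gives ug/mL.
    conc_ug_per_ml = working_conc_ug_per_ml * spike_volume_ml / final_volume_ml
    # Convert ug/mL to ug/L.
    return conc_ug_per_ml * 1000.0


# Hypothetical working solution at 2.0 ug/mL (illustrative, not a value from Table 1).
example_conc = fortified_concentration_ug_per_l(2.0)
print(f"{example_conc:.3f} ug/L")
```

With a 1-mL spike made up to 1000 mL, the concentration in μg/L is numerically equal to the working-solution concentration in μg/mL.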

A cover sheet with the corresponding participation code number was attached to each sample, and the laboratories were requested to provide relevant technical information about the analysis, such as the extraction method, measuring instrument, extract cleaning process, quantification method, and type of chromatographic column.

Homogeneity study

To evaluate the homogeneity of the sample, nine 1-L samples of drinking water that had previously been verified as free of the analytes of interest were randomly selected and fortified with 1 mL of the organochlorine pesticide mixture. The concentration was established based on the reference values for drinking water of the US Environmental Protection Agency (EPA) [14]. The samples were subsequently analyzed, under repeatability conditions, by the organizing laboratory using gas chromatography with a micro-electron capture detector (GC-μECD) as indicated in EPA Method 508 [15]. One-way analysis of variance (ANOVA), as recommended in ISO Guide 35, was used to evaluate the homogeneity of the water samples with respect to the concentration of pesticides [16].
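The one-way ANOVA used for the homogeneity check can be sketched as follows. The number of replicates per sample is not stated in the text, so duplicate measurements are assumed here, and the data are hypothetical values for illustration only. The critical value is entered by hand from an F table for the assumed design.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-sample and within-sample sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical duplicate results (ug/L) for nine fortified samples; not data from this study.
samples = [
    [2.01, 1.98], [2.03, 2.00], [1.97, 2.02],
    [2.00, 2.04], [1.99, 1.96], [2.02, 2.01],
    [1.98, 2.03], [2.01, 1.99], [2.00, 1.97],
]
f_value, df_b, df_w = one_way_anova_f(samples)
F_CRITICAL = 3.23  # tabulated F(0.05; 8, 9) for this assumed design
sufficiently_homogeneous = f_value < F_CRITICAL
print(f"F = {f_value:.2f}, df = ({df_b}, {df_w}), homogeneous: {sufficiently_homogeneous}")
```

If the calculated F is below the tabulated value, the between-sample variation is not statistically distinguishable from the within-sample variation at the 95 % confidence level.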

Stability study

The stability study was performed using two randomly chosen test samples that were analyzed in duplicate on two occasions: before shipment and shortly after the deadline for reporting the results. The stability tests were considered acceptable if the difference in concentration for each pesticide was less than 10 % [9].
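The acceptance criterion can be expressed as a simple check. This sketch assumes that the difference is taken relative to the concentration measured before shipment, which the text does not state explicitly; the numbers are hypothetical.

```python
def percent_difference(before, after):
    """Absolute difference between two concentrations, as a percentage of the first."""
    return abs(after - before) / before * 100.0


def is_stable(before, after, limit_percent=10.0):
    """True if the change in concentration is below the acceptance limit."""
    return percent_difference(before, after) < limit_percent


# Hypothetical mean concentrations (ug/L) before shipment and after the deadline.
print(is_stable(2.00, 1.90))  # change of 5 %
print(is_stable(2.00, 1.70))  # change of 15 %
```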

Report requirements

The participating laboratories were provided with a list of 19 pesticides that were possibly present in the sample and were informed that only five would actually be present in the sample. The instructions suggested that they make three injections of the extract, calculate the mean and standard deviation of the concentration of OCPs in the drinking water, and report the results in μg/L.

Receipt of results

A record form was sent with the samples to enter the results, which each laboratory completed and sent back to the organizing laboratory by electronic mail. The laboratories were given 21 working days to deliver the results. The deadline for this activity was counted from the date of receipt of the sample by each participating laboratory. The record of the results contained the following information: identification of the participating laboratory, name of the contact and/or responsible person at the laboratory, conditions of the sample on receipt, date and person in charge of the receipt, date of the performance of the test, and background of the method or methods used.

Evaluation of the results

The results were evaluated by means of the z-score [3], which was calculated as

$$Z = \frac{{x - x_{\text{a}} }}{{\sigma_{\text{p}} }},$$
(1)

where $x$ is the mean value obtained by the participating laboratory, $x_{\text{a}}$ is the assigned value, and $\sigma_{\text{p}}$ is the standard deviation for proficiency.
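As an illustration of Eq. (1), a minimal sketch of the calculation is given below. The function name and the numerical values are hypothetical and are not taken from Table 2.

```python
def z_score(x, assigned_value, sigma_p):
    """z-score of a laboratory mean x against the assigned value (Eq. 1)."""
    if sigma_p <= 0:
        raise ValueError("sigma_p must be positive")
    return (x - assigned_value) / sigma_p


# Hypothetical example: laboratory mean 1.5 ug/L, assigned value 1.0 ug/L,
# standard deviation for proficiency 0.25 ug/L.
print(z_score(1.5, 1.0, 0.25))
```

A positive z-score indicates a result above the assigned value, and a negative one a result below it.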

Calculation of the assigned mean value ($x_{\text{a}}$)

The assigned mean value was calculated from reference values determined by the organizer. Six drinking water samples were fortified according to the procedure described above, the analytes were extracted according to EPA Method 508 (liquid–liquid partition), and quantification was performed by gas chromatography (GC) with a μECD detector [15]. The uncertainty of these values was calculated according to the guide to the estimation of measurement uncertainty [17] (see Electronic Supplementary Material).
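The mean of the six reference determinations and the repeatability component of its uncertainty can be sketched as follows. This covers only a Type A evaluation (standard deviation of the mean); the full uncertainty budget described in the Electronic Supplementary Material would combine further components that are not reproduced here. The data are hypothetical.

```python
import math
import statistics


def assigned_value_with_type_a_uncertainty(results):
    """Return (mean, standard uncertainty of the mean) for replicate results."""
    mean = statistics.mean(results)
    s = statistics.stdev(results)      # sample standard deviation (n - 1)
    u = s / math.sqrt(len(results))    # standard uncertainty of the mean
    return mean, u


# Hypothetical results (ug/L) for six fortified samples; not data from this study.
reference_results = [1.02, 0.98, 1.05, 0.97, 1.01, 0.99]
x_a, u_a = assigned_value_with_type_a_uncertainty(reference_results)
print(f"x_a = {x_a:.3f} ug/L, u(x_a) = {u_a:.3f} ug/L")
```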

Standard deviation for proficiency ($\sigma_{\text{p}}$)

The appropriate standard deviations for proficiency, per analyte, were obtained from the Horwitz equation (see Electronic Supplementary Material).
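The Horwitz equation itself is given only in the Electronic Supplementary Material. In its commonly cited form, the predicted reproducibility relative standard deviation is RSD (%) = 2^(1 − 0.5 log10 C), where C is the concentration expressed as a dimensionless mass fraction. The sketch below assumes that this classical form was applied without modification and that 1 L of water has a mass of 1 kg, so that 1 μg/L corresponds to a mass fraction of 10^-9; both are assumptions, not statements from the study.

```python
import math


def horwitz_rsd_percent(mass_fraction):
    """Predicted reproducibility RSD (%) from the classical Horwitz equation."""
    return 2 ** (1 - 0.5 * math.log10(mass_fraction))


def horwitz_sigma_ug_per_l(conc_ug_per_l, density_kg_per_l=1.0):
    """Standard deviation for proficiency (ug/L) at a given concentration in water."""
    # 1 ug per kg of sample is a mass fraction of 1e-9.
    mass_fraction = conc_ug_per_l * 1e-9 / density_kg_per_l
    return conc_ug_per_l * horwitz_rsd_percent(mass_fraction) / 100.0


# Hypothetical assigned value of 1 ug/L (illustrative, not from Table 2).
print(f"{horwitz_sigma_ug_per_l(1.0):.3f} ug/L")
```

At a mass fraction of 10^-9 the classical equation predicts an RSD of about 45 %, which illustrates how wide the acceptance interval becomes at trace concentrations.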

Table 2 shows the assigned mean values and standard deviation for each pesticide, which were obtained according to the procedure described above and by which the participating laboratories were compared.

Table 2 Assigned values for the repeatability of samples fortified with the analytes of interest

To rate the results of the laboratories, a z-score was calculated for each evaluated pesticide. The z-score expresses the difference between the result of the participating laboratory and the assigned value established by the organizing laboratory, scaled by the standard deviation for proficiency. The z-scores were interpreted as shown below:

  • Satisfactory if |z| ≤ 2

  • Questionable if 2 < |z| < 3

  • Unsatisfactory if |z| ≥ 3
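The interpretation criteria above can be written as a small classification function. This is an illustrative sketch; the function name is hypothetical.

```python
def classify_z(z):
    """Rate a z-score using the criteria listed above."""
    magnitude = abs(z)
    if magnitude <= 2:
        return "satisfactory"
    if magnitude < 3:
        return "questionable"
    return "unsatisfactory"


for value in (-1.2, 2.0, 2.5, -3.0, 4.7):
    print(value, classify_z(value))
```

The sign of the z-score is ignored in the rating, so results that are too low are treated in the same way as results that are too high.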

Results and discussion

Homogeneity study

Despite the efforts made to ensure the homogeneity of the material used in the laboratory intercomparison test, these materials are almost always slightly heterogeneous. When the material is divided into portions (aliquots) and distributed to the laboratories, there is some variation in the composition of the samples. In this study, ANOVA was used to determine whether the variation in the composition of the distributed samples was sufficiently small for the exercise.

The F values calculated by one-way ANOVA were compared with the tabulated F value with a 95 % confidence level. There were no statistically significant differences in the pesticide concentrations, which indicated that the analysis materials were sufficiently homogeneous for the proposed objective (see Electronic Supplementary Material).

Stability study

The concentrations of pesticides in the drinking water samples analyzed before shipment and shortly after the results were received did not vary by more than 10 %; therefore, according to Generali [9], the concentrations of the pesticides remained stable during the study period. This was to be expected, considering that the analytes are persistent organic contaminants, and there was no reason to assume that degradation or losses would occur during the transport of the analytical standards.

Analytical methods used by the participating laboratories

The analytical methods used to determine the pesticides included extraction, concentration, elimination of interferences, and analyte detection. These steps can be time-consuming, costly, and risky due to their use of toxic reagents [18,19,20].

Table 3 summarizes the extraction and determination methods used by the participating laboratories. The extraction methods reported by the participants included liquid–liquid extraction (LLE) (5), solid-phase extraction (SPE) (2), microwave-assisted extraction (MAE) (1), and solid-phase microextraction (SPME) (1).

Table 3 Description of the methods used by participating laboratories in the intercomparison exercise for the analysis of OCPs in water

LLE is a common method for the determination of OCPs in different types of water, including river water [21,22,23], lake water [24], canal water [25], groundwater [21, 26], and drinking water [27]. Despite its frequent use, this method has more disadvantages than advantages; the disadvantages include the use of large volumes of toxic organic solvents, the formation of emulsions, low sensitivity, and a long and tedious procedure [18].

SPE has also been used in the determination of OCPs in river water [28, 29], lagoon water [30], groundwater [28, 29, 31, 32], and drinking water [33,34,35]. The advantages of SPE include its low consumption of organic solvents, speed, sensitivity, robustness, selectivity, and easy automation [36]. Its disadvantages include its inability to work directly with solid samples and the occasional need for an additional step in which the solvent of the final extract is evaporated and the extract is reconstituted in a minimal volume to increase the analyte detection sensitivity [19, 20]. In addition, several variables can affect the extraction process, such as the type of adsorbent material in the cartridge, the sample volume, and the solvent used to elute the pesticides [37].

In recent years, the LLE and SPE methods for the determination of pesticides in food matrices have been revised and improved through miniaturization and simplification, which has led to the development of SPME and various liquid-phase microextraction techniques [38] that are faster and more versatile. Laboratory 1609 reported the use of SPME during sample processing (Table 3), which shows that it uses an up-to-date extraction method.

Solvent selection is important to achieve an efficient extraction of the pesticides. The solvent should be highly selective so that it extracts as much analyte as possible with the smallest amount of sample matrix. Another important parameter is the compatibility of the solvent with the next step of the analytical process, which is generally the cleanup of the extract, to avoid tedious evaporation and/or solvent-exchange steps [39]. In some cases, mixtures of a low-polarity solvent and a water-miscible solvent are used to facilitate the extraction of pesticides with a wide range of polarities [19]. The participants in this exercise reported the use of four different solvents (dichloromethane, methanol, hexane, and ethyl ether), which have different polarities, in the extraction–reconcentration process of the test samples (data not shown). When an appropriate choice of extraction solvent and extract cleanup is combined with good analytical specificity, the method is considered to be suitable for the proposed purpose. However, most of the participants did not use a cleanup step; only laboratory 1604 (Florisil® adsorbent) and laboratory 1608 (alumina) did so (Table 3). The purification step is often necessary to reduce matrix interferences, which results in improved selectivity.

Pesticides are analyzed and detected after extraction from the matrix. The most commonly used instrumental techniques in the analysis of pesticides are gas and liquid chromatography [20]. The use of gas chromatography–electron capture detection (GC-ECD) for detecting OCPs is common due to its high resolution and good sensitivity in the picogram range. Gas chromatography–mass spectrometry detection (GC–MS) is also widely used for the determination of OCPs in complex matrices [40]. In this study, only two quantification methods were reported: GC-ECD (seven laboratories) and GC–MS (two laboratories) (Table 3).

Although the advantages of GC-ECD described above make it a commonly used method, it should be noted that it does not provide unequivocal identification of the analytes of interest and may require a combination of columns of different polarities for the determination of OCPs in the samples [41]. Moreover, because of its high sensitivity, it also responds to a wide range of compounds that are coextracted with the analytes of interest, such as plasticizers (phthalate esters), which can prevent a correct interpretation of the results [40].

GC–MS allows the nearly unequivocal identification of the compounds, as long as it is used in tandem mass mode. In this study, two participating laboratories reported the use of this method (laboratories 1607 and 1609), although selected ion monitoring (SIM) was used for compound identification, which led to matrix interferences being mistaken for several of the analytes of interest.

In all cases, the quantification was performed using the external standard method with three- to six-point calibration curves (except laboratory 1606, which used a one-point calibration) and using reference materials of different brands, including Fluka Analytical (one laboratory), Sigma-Aldrich (one laboratory), Chem Service (one laboratory), Ultra Scientific (five laboratories), and Restek (one laboratory), which were dissolved in different solvents (hexane, methanol, isooctane, and a mixture of hexane and toluene) (data not shown). However, isotope dilution mass spectrometry (IDMS) has recently been found to provide greater accuracy than external standard methods, because a critical weakness in the analysis of organic contaminants is the lack of an appropriate internal standard to monitor and control losses throughout the analytical procedure [10].
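External standard quantification with a multi-point calibration curve can be sketched as an ordinary least-squares fit of detector response against standard concentration, followed by inversion of the fitted line. The participants' actual calculation procedures were not reported, so this is only an illustration; the standards and responses are hypothetical.

```python
def fit_calibration(concentrations, responses):
    """Ordinary least-squares line: response = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_r = sum(responses) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (r - mean_r) for c, r in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_c
    return slope, intercept


def quantify(response, slope, intercept):
    """Concentration corresponding to a measured response."""
    return (response - intercept) / slope


# Hypothetical five-point calibration (ug/L) with an ideal linear response.
standards = [0.5, 1.0, 2.0, 4.0, 8.0]
areas = [2.0 * c + 1.0 for c in standards]
slope, intercept = fit_calibration(standards, areas)
sample_conc = quantify(7.0, slope, intercept)
print(f"{sample_conc:.2f} ug/L")
```

A line cannot be fitted this way from a single standard, because the slope is undefined with only one point; a one-point calibration instead has to assume that the response passes through the origin.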

Another critical factor in analytical determination is the choice of the chromatographic column because, depending on its composition, the analytes of interest can be separated easily and with few coextracted interferences [42]. In this regard, the laboratories reported the use of chromatographic columns with different compositions: six laboratories used a column with 5 % diphenyl-95 % dimethylpolysiloxane of different commercial brands, two reported the use of a column with 35 % phenyl-65 % dimethyl arylene siloxane, and one used a column with 50 % diphenyl-50 % dimethylpolysiloxane (Table 3).

The laboratories thus used columns with very different stationary phases. The choice of phase takes into account the polarity of the solutes to be separated and how their retention time in the phase changes as their polarity increases. The choice of the stationary phase will depend not only on the polarity of the solutes but also on an overall view of the complex mixture to be separated. Because the degree of separation of two substances depends on their respective partition coefficients in the stationary phase, each particular mixture must have, at least theoretically, one phase that performs the separation better than the others. However, none of the laboratories reported problems with the separation of the mixture of OCP standards during the determination; problems were reported only at the time of quantification. Laboratory 1603 indicated the presence of three coelutions: β-HCH–lindane, p,p′-DDE–dieldrin, and endosulfan sulfate–p,p′-DDT.

Pesticides reported by the participating laboratories

Table 4 shows the mean pesticide concentrations that were reported. Only laboratory 1603 identified all five analytes that were present in the sample, whereas the other participants reported four (laboratories 1605, 1608, and 1609), three (laboratory 1602), or two (laboratories 1601, 1604, 1606, and 1607) analytes.

Table 4 Concentrations of the analytes reported by the participating laboratories

A result was considered a false positive when a pesticide included in the list of possible analytes was reported even though (1) it was not used in the preparation (fortification) of the test material and (2) it was not detected by the organizing laboratory (even after repeated analyses with lower detection limits). False negatives are results reported by the participating laboratories as “undetected” for analytes that were part of the fortifying mixture and that the laboratory had the capacity to analyze [42,43,44]. All laboratories except laboratory 1603 reported false negatives (between one and three compounds) (Table 4), whereas false positives varying from two to four compounds were reported by five participants (laboratories 1601, 1602, 1603, 1607, and 1609; data not shown).
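In simplified form, the screening of a laboratory report for false positives and false negatives is a comparison of two sets. The sketch below omits the additional conditions stated above (confirmation by the organizing laboratory and the laboratory's declared analytical scope), and the example report is hypothetical rather than taken from Table 4.

```python
# The five analytes used to fortify the test material.
FORTIFIED = {"aldrin", "beta-endosulfan", "heptachlor", "lindane", "p,p'-DDE"}


def screen_report(reported, fortified=FORTIFIED):
    """Return the false positives and false negatives in a list of reported analytes."""
    reported = set(reported)
    return {
        "false_positives": sorted(reported - fortified),
        "false_negatives": sorted(fortified - reported),
    }


# Hypothetical report from one laboratory.
result = screen_report(["aldrin", "lindane", "dieldrin"])
print(result["false_positives"])
print(result["false_negatives"])
```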

False positives are common in this type of study [42,43,44] and may be due to multiple factors, such as a matrix effect [45]. None of the participants reported having considered the matrix effect in their measurements. Poole [45] suggested that the use of a fortified matrix for calibration is effective at nullifying matrix effects.

The extensive use of ECD (reported by seven of the nine participants) could be related to the presence of false positives because this detector has high resolution and good sensitivity (in the picogram range); as a result, coextracted compounds or, on some occasions, baseline noise could be mistaken for one of the analytes of interest.

The number of false positives has decreased dramatically since the introduction of gas chromatography–tandem MS (GC–MS/MS), which gives greater confidence in identification [45]. Other factors should also be taken into account, such as the performance of the methods by qualified personnel and/or the application of a quality management system (including routine participation in proficiency tests) that allows this type of problem to be identified.

On the other hand, based on the results observed in this intercomparison exercise, false negatives (independent of the detector used for the quantification) were observed mainly due to two factors: (1) calculation errors (the results were at a higher or lower order of magnitude) and (2) the use of standard solutions in an inappropriate range; that is, the range of concentrations used by the participants was higher or lower than the concentration of pesticides in the sample. The clear reduction in the number of false negatives in this type of study is directly related to the implementation of modern technology (GC–MS/MS and LC–MS/MS), which allows the nearly unequivocal identification of trace contaminants and the improvement in the technical skills of the personnel involved in this type of analysis.

z-scores

Table 5 summarizes the z-scores for each pesticide by number and percentage (satisfactory, questionable, and unsatisfactory). Figure 1 graphically shows the z-score of each participating laboratory for aldrin (1a), β-endosulfan (1b), heptachlor (1c), lindane (1d), and p,p′-DDE (1e). The percentages of z-scores with satisfactory results were 20, 20, 20, 40, and 16.67 % for aldrin, β-endosulfan, heptachlor, lindane, and p,p′-DDE, respectively. We also observed z-scores with questionable results for aldrin (20 %), β-endosulfan (20 %), and p,p′-DDE (33.33 %). Finally, the percentages of z-scores with unsatisfactory results ranged from 50 (p,p′-DDE) to 80 % (heptachlor).
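A summary of this kind can be tallied from the per-laboratory ratings as sketched below. The example uses a hypothetical set of six ratings split 1/2/3, which reproduces the percentages reported above for p,p′-DDE; the individual ratings are an assumption made for illustration.

```python
from collections import Counter

CATEGORIES = ("satisfactory", "questionable", "unsatisfactory")


def summarize_ratings(ratings):
    """Return {category: (count, percentage)} for a list of z-score ratings."""
    counts = Counter(ratings)
    total = len(ratings)
    return {c: (counts[c], round(100.0 * counts[c] / total, 2)) for c in CATEGORIES}


# Hypothetical ratings for one pesticide reported by six laboratories.
ratings = ["satisfactory"] + ["questionable"] * 2 + ["unsatisfactory"] * 3
for category, (count, percent) in summarize_ratings(ratings).items():
    print(f"{category}: {count} ({percent} %)")
```

Because the percentages are calculated over the laboratories that reported each pesticide, the denominator differs between analytes.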

Table 5 Summary of z-scores obtained for each of the pesticides evaluated in the proficiency test
Fig. 1 z-scores obtained by each participant for (a) aldrin, (b) β-endosulfan, (c) p,p′-DDE, (d) lindane, and (e) heptachlor

It is well known that observations tend to be biased toward values that are lower than expected. In the case of organic analyses, the bias is generally due to an incorrect validation of the analytical method, which eventually leads to low recoveries of the analytes of interest during the sample preparation processes (extraction and cleaning) or to an inadequate recovery correction when performing the calculations [46].

Large variations in the determination of pesticides have also been observed in other intercomparison studies between laboratories [44, 47, 48]. The participating laboratories with questionable and unsatisfactory results, based on the results observed in this intercomparison exercise, had (1) an error in unit conversion or (2) inappropriate standard calibration solutions. Some participating laboratories used standard calibration solution intervals that were higher or lower than the concentrations of pesticides present in the test sample, which meant that the calibration curve did not cover the target concentration range. Improvements are expected to be made to solve these problems.
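Both problems lend themselves to simple automated checks before results are reported. The sketch below is illustrative: the function names, calibration levels, and sample value are hypothetical.

```python
def within_calibration_range(concentration, standards):
    """True if the measured concentration is bracketed by the calibration standards."""
    return min(standards) <= concentration <= max(standards)


def mg_per_l_to_ug_per_l(value_mg_per_l):
    """Convert mg/L to ug/L (factor of 1000)."""
    return value_mg_per_l * 1000.0


# Hypothetical calibration levels (ug/L) and a sample result initially expressed in mg/L.
standards_ug_per_l = [10.0, 25.0, 50.0, 100.0]
sample_ug_per_l = mg_per_l_to_ug_per_l(0.002)
if not within_calibration_range(sample_ug_per_l, standards_ug_per_l):
    print("Sample lies outside the calibrated range; adjust the standards or dilute.")
```

Reporting a value in mg/L where μg/L was requested shifts the result by three orders of magnitude, which is consistent with the kind of calculation error described above.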

The results suggest that laboratories should implement better strategies to meet and maintain technical requirements, which are related to ensuring the traceability of measurements, such as method validation, setting up appropriate systems for estimating measurement uncertainties, and other pertinent requirements. This type of exercise allows laboratories to control and improve their analytical performance.

Finally, due to the large number of variables, it was not possible to identify a significant correlation between the analytical methods used and the results obtained by each laboratory. Nevertheless, each laboratory should evaluate its analytical methods and procedures step by step to find possible areas of improvement for the determination of OCPs in drinking water.

Conclusions

This study is the first in Mexico to compare analytical results from different laboratories for the determination of OCPs in drinking water. The results obtained in this exercise indicated that there is a need to harmonize the methods for the analysis of OCPs in drinking water by following a national or international standard, or to validate the methods in use, because too many variables affect the results and cause significant variations between laboratories. The results of the exercise showed that all of the participating laboratories have the capacity to perform analyses of OCPs in drinking water. However, only one laboratory (1603) could identify and quantify all five pesticides that were present in the test sample; the others identified and quantified between two and four pesticides. Based on the results observed in this intercomparison exercise, the factors that may have prevented the participants from detecting all of the analytes present in the samples include not using standardized (validated) methods, even though standard methods should be used; using calibration curves that do not cover the range of concentrations of the test sample; and making calculation errors.

For laboratories that obtained unsatisfactory and/or questionable results, performing corrective actions and continuing their participation in similar laboratory proficiency tests are excellent tools to measure the improvement in the analytical processes implemented in their laboratories.