Introduction

The total amount of active pesticide ingredients used worldwide was estimated to be approximately 2.4 million tons in both 2006 and 2007 [1]. In France, the main European consumer of pesticides, total consumption of these active ingredients was estimated to be approximately 62 thousand tons in the period 2010–2011 [2]. Use of such amounts of industrial chemical pesticides raises the question of their possible effects on health, not only for workers who apply the chemicals but also for residents of the areas in which they are used and, more broadly, for the general population. Pesticides can have adverse health effects on humans, including endocrine and neurological disorders [3–6], reproductive and developmental toxicity [7–10], and cancers [11–14]. However, assessment of human exposure to pesticides still limits the quality of the evidence that observational studies can provide and the risk-assessment process. More refined techniques are needed to provide solid evidence of exposure.

In the general population, levels of exposure to pesticides in the diet are usually estimated by combining analysis of the shopping basket with data on the concentration of pesticides in food (obtained via monitoring programs). However, for most consumers, sources of exposure to pesticides are not restricted to contaminated food or drink. The population may come into contact with pesticides through their occupation, through domestic use, or by living close to rural areas where pesticides are applied. A complete appraisal of exposure of the general population to different sources of pesticide residues in France was published by the French Agency for Food, Environmental, and Occupational Health and Safety (ANSES) in 2010 [15]. Because of the different routes by which exposure may occur, measurement of the parent compound and/or its metabolites in an accessible biological sample, for example urine, is the method of choice for evaluation of total exposure [16]. During the last decade, biological monitoring in general [17], and of pesticides in particular, has been developed on the basis of analysis of blood (or serum, or plasma), milk, saliva, and urine [18–22]. Among these samples, urine has frequently been used for the determination of pesticides because it is collected non-invasively and generally contains non-persistent pesticides or their corresponding metabolites in higher concentrations than in other biological samples [23–25]. In addition, monitoring of pesticide metabolites in urine has been shown to be appropriate for farm workers [25]. Nevertheless, most published data obtained from analysis of urine specimens have dealt with organophosphorus, carbamate, or pyrethroid insecticides and triazine and chloroacetamide herbicides [15, 26].

Several analytical approaches are available for multi-residue analysis. Methods for biomonitoring of pesticides have recently been reviewed [27–29]; more recently, methods for the screening of veterinary drugs have also been reviewed [30]. Such methods can be split into two main strategies: targeted or untargeted analysis. The most sensitive way to detect a compound is to design the most specific analytical method possible, from sample preparation to detection. This targeted approach is well adapted to the biomonitoring of known and hazardous compounds, for which accurate measurements of biological concentrations are required, to place results in, e.g., a public health risk context. In this way, exposure to chlorpyrifos has been characterized by monitoring specific metabolites by use of mass spectrometry [31], exposure to permethrin by enzyme-linked immunosorbent assay [32], and exposure to imidacloprid by fluorescence polarization immunoassay [33]. The targeted approach could be extended, by use of mass spectrometry, to monitoring of several members of the same class of contaminants, for example pesticides [34]. Mass spectrometry coupled to liquid chromatography has become the method of choice for screening and quantification of molecules and their metabolites in biological matrices [35], particularly use of triple-quadrupole mass spectrometers operating in multiple reaction monitoring (MRM) mode. Similarly, information-dependent acquisition (IDA) enables screening for up to 700 compounds by use of Q-trap-type instruments [36]. Survey scans are conducted by MRM for the 700 compounds, which trigger a product ion scan as the dependent scan. Compound identification is achieved by use of a library search on product-ion spectra.

Nevertheless, these targeted approaches can be applied only to known compounds for which standard compounds are available to enable setup of the decomposition reactions to be monitored by triple-quadrupole mass spectrometry. In 2008 the database of known pharmacologically and toxicologically relevant compounds (PTRC) included approximately 50,500 compounds [37]. Although this database is extensive, it is still not exhaustive (in particular for emerging contaminants, novel metabolites, or illegal substances), and only a few reference compounds are commercially available. To meet the challenge of multi-exposure to complex contaminant mixtures, alternative untargeted or semi-targeted analytical approaches are required. When dealing with urine, these approaches must ensure detection of, mostly, polar metabolites, and electrospray ionization is usually the method of choice for ionization. To achieve detection of the maximum number of compounds with a variety of physicochemical properties, use of both positive and negative ionization modes is also usually required. Replacing MRM survey scans of the targeted approach by a full scan performed by high-resolution mass spectrometry (HRMS) is a promising alternative enabling detection of substances whose number is limited only by the size of the database used [38–40]. The method is simple to set up and raw HRMS data are generated independently of the list of compounds screened, which enables retrospective study of the data without the need to re-analyze samples. For these reasons, although less sensitive than MRM mode, use of high-resolution mass spectrometry with time of flight (TOF) or Orbitrap mass analyzers for screening and biomonitoring applications [41] is becoming more popular.

In this context, this work was performed to develop an analytical method enabling linking of (possibly unknown) end-products of pesticide metabolism with exposure to complex contaminant groups, for example pesticides, which could be used for screening and biomonitoring. The “exposomics” procedure proposed herein combines an untargeted analytical approach using HRMS with an “omics” type of data mining and in vivo generation of metabolites for structure confirmation. As a first step, urine samples were analyzed by high-resolution mass spectrometry combined with ultra-high-performance liquid chromatography to generate a dataset which can be queried retrospectively. Post-analysis raw data mining was then conducted to extract signals fitting a list of metabolites of interest. These signals were then analyzed by multivariate statistical methods. For differently exposed populations the statistical model generated from identified compounds enabled discovery of metabolites (and therefore pesticides) enabling discrimination of the samples. This approach was developed and applied as proof of concept to study environmental exposure to pesticides and was based on the analysis of urine samples from pregnant women from the PELAGIE cohort recruited in an agricultural region of Brittany in France.

Materials and methods

Chemicals

Methanol and acetic acid were analytical grade and were purchased from Scharlau (Barcelona, Spain). Ultrapure water from a Milli-Q system (Millipore, Saint Quentin en Yvelines, France) was used for mobile phases. All standard pesticides (fenitrothion, fenpropimorph, azoxystrobin, procymidone, quizalofop-p-ethyl, chlorpropham, carbofuran, ethephon, trifluralin, dichlorvos, malathion, and cymoxanil) were purchased from Sigma–Aldrich Chemie (Saint-Quentin Fallavier, France) and were Pestanal grade.

Population and biological samples

Samples were obtained from members of the PELAGIE cohort of 3,421 women in early pregnancy from Brittany (France) enrolled, by gynecologists, from the general population between 2002 and 2006. The objective of the PELAGIE study is to assess the consequences of environmental exposure (solvents, persistent organic pollutants, pesticides …) on pregnancy, birth outcome, and infant development. This cohort is described in detail elsewhere [42]. Each woman in the cohort had to collect the first urine sample of the morning, transfer this to vials containing nitric acid to avoid bacterial degradation, and mail the samples to the study laboratory in a pre-stamped package at ambient temperature, with routine delivery taking, on average, from 1 to 3 days. On receipt, the 10-mL samples were stored at −20 °C until analysis. They were available for 95 % of the cohort. At the same time, data on social and demographic features, diet, and lifestyle were retrieved by questionnaire.

This exploratory study was restricted to women with an available urinary sample collected in 2004, to achieve similar likelihood of pesticide exposure on the basis of agricultural use and to avoid potential variability as a result of storage of biological samples for different times. Ten samples were randomly selected from among urban participants (>200,000 inhabitants), and ten samples from within each of three other groups discriminated by land surface (%) devoted to cereal crops in the municipality of residence. Thus, four groups of 10 samples were available for this study. The proportion of land devoted to cereal crops within each municipality was estimated from the French National Agricultural Census [43]. Exposure groups were defined as “urban”, “weak exposure” (less than 17 % of land devoted to cultivation of cereals), “medium exposure” (between 17 and 25 % of land devoted to cultivation of cereals), and “high exposure” (more than 25 % of land devoted to cultivation of cereals).
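The group definitions above can be written as a small rule. The following is a minimal sketch only: the function name `exposure_group` is hypothetical, and assigning the boundary values 17 % and 25 % to the “medium exposure” group is an assumption, because the text does not state how boundary values were handled.

```python
def exposure_group(cereal_land_percent, urban=False):
    """Assign an exposure group from the share of municipal land under cereals.

    Assumption: the boundaries 17 % and 25 % are counted as "medium exposure".
    """
    if urban:
        # Urban participants (>200,000 inhabitants) form their own group.
        return "urban"
    if cereal_land_percent < 17:
        return "weak exposure"
    if cereal_land_percent <= 25:
        return "medium exposure"
    return "high exposure"


if __name__ == "__main__":
    for share in (5, 17, 21, 25, 40):
        print(share, exposure_group(share))
```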

Animal experiment

To corroborate identification of detected metabolites, a mixture of 200 μg of each of the seven pesticides (fenitrothion, fenpropimorph, azoxystrobin, procymidone, quizalofop-p-ethyl, chlorpropham, and carbofuran) dissolved in corn oil was administered by gavage to a female Wistar rat (Charles River) weighing 250 g. Immediately after treatment, the animal was placed in a metabolic cage and allowed ad libitum access to water and to a standard commercial diet (SAFE, Augy, France). Urine was collected 24, 48, and 72 h after exposure and stored at −20 °C until analysis. Finally, urine samples were analyzed under the same conditions as for human urine samples.

Liquid chromatography–mass spectrometry

Analysis was by ultra-high-performance liquid chromatography combined with high-resolution mass spectrometry (UHPLC–HRMS). UHPLC was performed with a RSLC3000 (Dionex–Thermo Scientific, Les Ulis, France). The chromatographic pump was operated at a flow rate of 0.25 mL min⁻¹. The mobile phase was a gradient prepared from 95:5:0.1 (v/v) H2O–CH3OH–CH3CO2H (component A) and 100:0.1 (v/v) CH3OH–CH3CO2H (component B). The gradient program was: 0 min, 0 % B; from 15 to 20 min, 100 % B; from 21 to 31 min, 0 % B. Chromatographic separation was achieved on a Hypersil Gold C18 column (100 mm × 2.1 mm, 1.9-μm particles) (Thermo Scientific, Les Ulis, France) maintained at 35 °C.
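For illustration, the gradient program can be expressed as a piecewise-linear function of time. This sketch assumes linear ramps between the stated time points (0 to 15 min and 20 to 21 min), which the text does not specify; the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % B) breakpoints taken from the gradient program in the text.
GRADIENT = [(0.0, 0.0), (15.0, 100.0), (20.0, 100.0), (21.0, 0.0), (31.0, 0.0)]


def percent_b(t_min):
    """Return % B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-31 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 7.5, 15, 18, 20.5, 25):
        print(f"{t:5.1f} min: {percent_b(t):5.1f} % B")
```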

Urine samples were diluted with an equal volume of mobile phase component A before analysis. The injection volume was 5 μL. Eluted compounds were detected by use of an LTQ-Orbitrap XL mass spectrometer (Thermo Scientific) equipped with an electrospray ionization source. Ionization settings used in negative mode were: spray potential −3 kV, sheath gas (N2) flow rate 35 arbitrary units (au), auxiliary gas (N2) flow rate 0 au, and capillary temperature 320 °C. In positive ionization mode the settings were: spray potential 3.6 kV, sheath gas flow rate 53 au, auxiliary gas flow rate 27 au, and capillary temperature 320 °C. High-resolution mass spectra were acquired at a resolution of 30,000 from m/z 60 to 800. Identification was performed by tandem mass spectrometry (MSn) using the ion-trap mass analyzer of the LTQ-Orbitrap mass spectrometer.

Data processing and statistical treatment

Signals corresponding to ions of potential pesticides and their metabolites were extracted from raw data by use of MetWorks 1.3 (Thermo Scientific), on the basis of the HRMS signal of the exact mass of each [M + H]+ or [M − H]− ion, with a mass measurement tolerance of ±5 ppm. After their detection, structural characterization of potential metabolites was achieved by processing MSn spectra with Xcalibur QualBrowser (Thermo Scientific). UHPLC–HRMS signals of identified metabolites were also measured with Xcalibur QuanBrowser, on the basis of chromatographic peak area normalized to the signal from an internal standard. Normalized signals were then analyzed by multivariate methods using SIMCA-P software (V12; Umetrics, Umea, Sweden). Principal-components analysis (PCA) was first used to detect intrinsic clusters or outliers. Partial least squares–discriminant analysis (PLS–DA) was then used to model the relationship between exposure conditions (“urban”, “weak exposure”, “medium exposure”, and “high exposure”) and UHPLC–HRMS data. PLS–DA is a supervised method which maximizes separation between groups. Data were preprocessed by use of orthogonal signal correction (OSC), with exposure as a correction factor. OSC filtering [44] was used to remove confounding variation (variation not linked to exposure, e.g., physiological, experimental, or instrumental variation). Proportion of explained variance (R2) and predictive ability (Q2) were used to assess the robustness of PLS-DA models. Cross-validation (sevenfold) was used to determine the number of latent variables to include in the PLS-DA model and to calculate the Q2 value. A permutation test (200 iterations) was used to evaluate the validity of models.
The variable importance in the projection (VIP), a global measure of the importance of each metabolite in the PLS-DA model, and the Kruskal–Wallis test (threshold of significance = 0.05), a non-parametric version of analysis of variance, were finally used to confirm the source of metabolites.
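The Kruskal–Wallis test mentioned above can be illustrated with a short, dependency-free sketch. It is not the implementation used in this work (the software used for the test is not stated); the function names are hypothetical, the signal values are invented for illustration, and the p-value uses the chi-square approximation in its closed form for three groups (two degrees of freedom), which is an assumption.

```python
import math


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the mean of the ranks i+1 .. j.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom (3 groups only)."""
    return math.exp(-h / 2)


if __name__ == "__main__":
    # Invented normalized signals for three exposure groups.
    signals = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    h = kruskal_wallis_h(signals)
    print(h, p_value_three_groups(h) < 0.05)
```

With larger groups or a different number of groups, an established statistics package would be the safer choice.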

Results and discussion

Set-up of the untargeted data-acquisition method

The procedure used in the “exposomics” approach developed in this work is shown schematically in Fig. 1. First, samples were analyzed by use of an untargeted mass spectrometric technique. In a second step, a list of compounds to be sought was edited, and screening of raw data for these was performed on the basis of exact mass matching. Detected compounds were then identified by MS–MS and/or MSn experiments, and the corresponding chromatographic peaks (which are representative of their concentration) were integrated. These resulting data were statistically analyzed to classify samples in accordance with collected exposure data. If necessary, the list of monitored compounds can be upgraded, and the “exposomics” procedure can be performed again, without the need for re-injection of samples.

Fig. 1 The “exposomics” procedure

For untargeted data acquisition, a typical metabolomics UHPLC–HRMS method [45] was applied. Liquid chromatography was used for separation, because it is highly suitable for analysis of metabolites in urine. The efficiency of the chromatographic separation was improved by use of ultra-high-performance liquid chromatography, and by favoring chromatographic resolution rather than high-throughput analysis. By using a C18-based stationary phase and water–methanol gradient elution, this UHPLC method enabled the separation and detection of a large number of metabolites with different physicochemical properties, for example metabolites of different classes of pesticide. To achieve maximum metabolite coverage, this method of separation was combined with high-resolution mass spectrometry with electrospray ionization in the positive and negative modes. Classical electrospray ionization conditions were used, and mass spectra were acquired at high resolution to obtain accurate mass measurements and therefore gain access to a restricted set of proposed chemical formulas for the ions detected.

The purpose of this data acquisition was to compare different samples on the basis of signals representative of the concentration of the corresponding metabolites in urine, by using statistical analysis of the generated datasets. Taking into account that differences between samples were expected to be very small, the repeatability of data acquisition had to be carefully checked. First, six blank samples (mobile phase component A) were analyzed within an analytical sequence of 40 urine samples, to confirm the absence of carry-over effects (results not shown). Second, internal standards were added to samples to normalize the signals of detected metabolites, reducing the analytical variability. Pesticides not detected in any of the urine samples from preliminary UHPLC–HRMS analysis were chosen as internal standards. On this basis, ethephon was chosen as internal standard for analysis performed in negative ESI mode and trifluralin, dichlorvos, and malathion for analysis in positive ESI mode. Finally, a quality-control (QC) sample, prepared by combining all the urine samples, was analyzed in replicate (n = 6) during the analytical sequence. From the extracted ion chromatogram of a randomly selected ion in the QC results, no modification of chromatographic performance (shifts and variability of retention times (RT)) and no significant variation of detector response (decrease and variability of the absolute signal) were observed during analytical sequences performed in both negative and positive-ionization modes. In the negative ionization mode, the m/z 283.0825 ion (within the m/z range of expected pesticide metabolites) observed at RT = 8.4 min (almost in the middle of the chromatogram) was used for checking chromatographic and mass spectrometric performance stability. No changes of the RT were observed during the analytical sequence; the relative standard deviation (RSD) of RT was 0.3 %.
The RSD of the corresponding chromatographic peak area was 9.9 %, with no significant downward trend of the signal along the sequence. The same results were obtained in positive-ionization mode on the basis of the m/z 257.1492 ion eluted at RT 9.1 min (RSD of RT = 0.1 % and RSD of peak area = 2.9 %).
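The two quantities used in these checks, the RSD of replicate QC measurements and the peak area normalized to an internal standard, can be sketched as follows. The function names are hypothetical, the input values are invented for illustration, and the use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not state which estimator was used.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation in %, using the sample standard deviation."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


def normalized_signal(peak_area, internal_standard_area):
    """Peak area of a metabolite relative to that of the internal standard."""
    return peak_area / internal_standard_area


if __name__ == "__main__":
    # Invented retention times (min) for six QC injections.
    qc_rt = [8.40, 8.42, 8.38, 8.41, 8.39, 8.40]
    print(f"RSD of RT: {rsd_percent(qc_rt):.2f} %")
    print(f"Normalized signal: {normalized_signal(2.0e5, 8.0e5):.2f}")
```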

Generation of the pesticide metabolite list

The objective of this work was to develop an “exposomics” approach able to discriminate individuals according to their living environment, focusing in particular on their environmental exposure to pesticides. Thus, the pesticides and metabolites monitored corresponded to pesticides used in local agricultural practice at the time the urine samples were collected (ca 2004). Pesticides were selected by using data from the results of surveys of Brittany, France, between 2001 and 2006, for crops which included corn, wheat, colza, peas, potatoes, and fresh vegetables (plus strawberries and melons) [46]; and recommendations of the French agricultural chamber and phytosanitary index published by ACTA [47].

From all these data, an initial list of 47 pesticides most likely to have been used in this area during the period of urine collection was produced (Table 1).

Table 1 The pesticides monitored and the corresponding number of metabolites screened

Data available on the metabolism of these 47 pesticides were then collected from the International Program on Chemical Safety (IPCS) database [48], the European Food Safety Authority report on pesticide residues in food [49], the Pesticide Properties Database from the University of Hertfordshire [50], and from “Metabolic Pathways of Agrochemicals” [51, 52]. Further information from the published scientific literature was added when available. Supplementary theoretical phase II metabolites of the already listed phase I metabolites were also added to the resulting metabolite list. This resulted in several metabolites considered to be relevant to mining of raw data for one given pesticide, as indicated in Table 1. Because of the structural similarity of some pesticides, some of their metabolites could be identical. For example, the organophosphorus compounds dichlorvos, malathion, and chlorpyrifos are all metabolized into (among others) dimethyldithiophosphate or dimethylthiophosphate. Thus, these metabolites may be regarded as biomarkers of exposure to organophosphorus insecticides in general but not to one specific compound [53]. In most cases evidence of the presence of pesticides was based on specific metabolites, e.g. dichlorvos glucuronide or desmethyl-dichlorvos for dichlorvos. However, the situation is more complex for some pesticides. For example, of 18 known metabolites of azinphos methyl, no specific metabolite could be used to monitor this pesticide (Table 1). Similarly, the same metabolites were monitored for such compounds as propachlor and dimethachlor. Therefore, for these compounds, only detection of the parent molecule would enable their specific identification. Despite these problems, all metabolites, including those with several possible origins, were monitored to assess exposure to at least one of the parent molecules listed. Finally, a list of 425 compounds (i.e. 378 metabolites and 47 parent molecules) was used when searching raw HRMS data.
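The addition of theoretical phase II metabolites to the list can be illustrated by simple formula arithmetic. This is a sketch under stated assumptions: only glucuronide (+C6H8O6) and sulfate (+SO3) conjugation are shown, the names `CONJUGATION`, `phase_two_formulas`, and `hill` are hypothetical, and the text does not say how the list was actually generated. The example uses methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate (C11H12O4), a phase I metabolite of azoxystrobin discussed in the following sections.

```python
from collections import Counter

# Net elemental change for each conjugation (assumed set, for illustration).
CONJUGATION = {
    "glucuronide": Counter({"C": 6, "H": 8, "O": 6}),
    "sulfate": Counter({"S": 1, "O": 3}),
}


def phase_two_formulas(phase_one):
    """Return the elemental composition of each theoretical conjugate."""
    base = Counter(phase_one)
    return {name: dict(base + delta) for name, delta in CONJUGATION.items()}


def hill(formula):
    """Format an element-count mapping in Hill order (C, H, then alphabetical)."""
    order = [e for e in ("C", "H") if e in formula]
    order += sorted(e for e in formula if e not in ("C", "H"))
    return "".join(f"{e}{formula[e] if formula[e] > 1 else ''}" for e in order)


if __name__ == "__main__":
    conjugates = phase_two_formulas({"C": 11, "H": 12, "O": 4})
    for name, formula in conjugates.items():
        print(name, hill(formula))
```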

Detection and identification of urinary pesticide metabolites

From the raw data acquired in an untargeted manner, compounds were screened on the basis of the theoretical mass of their quasi-molecular [M + H]+ or [M − H]− ions, with the mass measurement tolerance set at ±5 ppm. The theoretical exact masses of screened pesticides and metabolites were automatically calculated from their raw formula by using MetWorks. Signal detection and treatment were also performed by using MetWorks to study all 47 parent molecules. No signal-to-noise threshold was set, because, owing to the signal cut-off of the Fourier transform used to process the transient signal, the Orbitrap mass analyzer does not generate a noise level. To overcome this problem, an absolute signal threshold was set at 5 × 10³ during at least five consecutive HRMS scans for ion selection. Thus, from the previously described list, 33 signals were detected in data acquired in the positive-ionization mode (128 signals in the negative-ionization mode) which fitted the exact masses of the listed pesticide metabolites (i.e. detection of an ion with an accurate mass in agreement with the theoretical mass of one of the screened metabolites). Several factors can affect efficiency of electrospray ionization, including the physicochemical properties of the analytes (e.g. pKa). Consequently, detection of all the metabolites listed (if present) cannot be guaranteed in such an untargeted approach. At this stage, use of this method for quantitative purposes is precluded, because we deal mostly with compounds for which no reference standard is available. At best, only semi-quantitative data can be provided. Nevertheless, seeking a large set of metabolites increases the likelihood of detecting one of them under the experimental conditions used in this work.
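The two selection criteria described above (a mass window of ±5 ppm and an absolute signal of at least 5 × 10³ in at least five consecutive scans) can be sketched as below. This is not the MetWorks algorithm, whose internals are not described here; the function names and the data layout (a list of scans, each a list of (m/z, intensity) pairs) are assumptions made for illustration, as is the use of "greater than or equal to" for the threshold.

```python
def within_ppm(measured_mz, theoretical_mz, tol_ppm=5.0):
    """True if the measured m/z lies within tol_ppm of the theoretical m/z."""
    return abs(measured_mz - theoretical_mz) / theoretical_mz * 1e6 <= tol_ppm


def is_detected(scans, theoretical_mz, tol_ppm=5.0,
                min_intensity=5e3, min_consecutive=5):
    """Check for a matching ion above threshold in enough consecutive scans.

    scans: list of scans in acquisition order, each a list of (mz, intensity).
    """
    run = 0
    for peaks in scans:
        hit = any(
            intensity >= min_intensity and within_ppm(mz, theoretical_mz, tol_ppm)
            for mz, intensity in peaks
        )
        run = run + 1 if hit else 0  # a miss resets the consecutive count
        if run >= min_consecutive:
            return True
    return False


if __name__ == "__main__":
    # Invented scans containing an ion close to m/z 287.0231.
    scans = [[(287.0232, 1.2e4), (283.0825, 4.0e5)]] * 5
    print(is_detected(scans, 287.0231))
```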

When a potential metabolite was detected, structure was confirmed by a three-step process. The first step consisted in analyzing the isotopic pattern of the detected ion. This enabled confirmation (or not) of the possible identity of metabolites, in particular for sulfated, chlorinated, or brominated compounds, which have characteristic isotopic patterns. An example of this process is illustrated in Fig. 2, which depicts structural characterization of an azoxystrobin metabolite detected in the negative ion dataset at a retention time of 6.6 min (Fig. 2a). The m/z ratio measured for the suspected metabolite of azoxystrobin (m/z 287.0232, Fig. 2b) corresponded to the chemical formula C11H11O7S with a measurement error of 0.4 ppm (theoretically calculated m/z ratio = 287.0231; Fig. 2c). This chemical formula was in agreement with the deprotonated form of methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate sulfate, one of the 26 screened metabolites of azoxystrobin (Table 1). Moreover, the isotopic pattern of this ion was in good agreement with the chemical formula C11H11O7S, with an [M + 2] ion corresponding to the 34S isotope (Fig. 2) measured at m/z 289.0186 (theoretical m/z 289.0189, Δ = −1.0 ppm). It should also be noted that the resolving power (R = 30,000) used in this work enabled separation of the 12C11 1H11 16O7 34S and 13C2 12C9 1H11 16O7 32S ions at [M + 2].
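The mass values quoted in this example can be reproduced with a few lines of arithmetic. The sketch below uses standard isotope masses rounded to six decimals; the function names are hypothetical, and taking m/Δm as the resolving power needed to separate two ions is a rough criterion assumed here for illustration, not a statement about how the instrument defines resolution.

```python
# Isotope masses in unified atomic mass units, rounded to six decimals.
MASS = {
    "12C": 12.000000,
    "13C": 13.003355,
    "1H": 1.007825,
    "16O": 15.994915,
    "32S": 31.972071,
    "34S": 33.967867,
}
ELECTRON_MASS = 0.000549


def anion_mz(isotope_counts):
    """m/z of a singly charged anion: sum of isotope masses plus one electron."""
    return sum(MASS[iso] * n for iso, n in isotope_counts.items()) + ELECTRON_MASS


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass measurement error in ppm."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# [M - H]- ion of the sulfate conjugate, composition C11H11O7S.
mono = anion_mz({"12C": 11, "1H": 11, "16O": 7, "32S": 1})
# The two [M + 2] isotopologues discussed in the text.
s34 = anion_mz({"12C": 11, "1H": 11, "16O": 7, "34S": 1})
c13x2 = anion_mz({"12C": 9, "13C": 2, "1H": 11, "16O": 7, "32S": 1})
# Rough resolving power (m / delta m) needed to separate them.
required_resolving_power = s34 / (c13x2 - s34)

if __name__ == "__main__":
    print(f"monoisotopic m/z: {mono:.4f}")
    print(f"34S isotopologue m/z: {s34:.4f}")
    print(f"error of m/z 289.0186: {ppm_error(289.0186, s34):.1f} ppm")
    print(f"required resolving power: {required_resolving_power:.0f}")
```

Under this criterion the required resolving power is about 26,500, which is consistent with the separation reported at R = 30,000.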

Fig. 2 Extracted ion chromatogram of the suspected methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate sulfate ion (azoxystrobin) in urine samples (a), accurate mass measurement and isotopic pattern observed at RT = 6.6 min (b), and theoretical isotopic pattern of [C11H12O7S − H]− (c)

When the exact mass and isotopic pattern were in agreement with the chemical formula of the hypothesized metabolite, the second step of the identification process consisted in conducting UHPLC–HRMSn experiments on the targeted metabolite, using the linear ion trap of the LTQ-Orbitrap hybrid mass spectrometer. QC samples were used for this procedure. Phase II conjugated metabolites (e.g. glucuronide, sulfate, thiol, and glutathione conjugates) could be easily identified by observation of characteristic MS2 fragmentation patterns. In this way, several glucuronide and sulfate conjugates could be identified on the basis of the characteristic losses of 176 u (C6H8O6) and 80 u (SO3), respectively. The structures of phase II metabolites were also confirmed by MSn experiments. As for metabolites produced by phase I enzymes, identifications were either invalidated when observed fragment ions were inconsistent with the hypothesized structure (e.g. observation of loss of NH3 with a hypothesized structure bearing no nitrogen atom) or were confirmed when observed fragment ions were in agreement with the putative metabolite structure. As an example, the MS–MS spectrum of the suspected metabolite of azoxystrobin detected at RT = 6.6 min is reported in Fig. 3a. It contains a major fragment ion corresponding to the characteristic loss of SO3 at m/z 207.0663 (Δ = 0.04 ppm), consistent with the proposed sulfate conjugate structure of this metabolite. The MS3 spectrum obtained from the m/z 207 fragment ion, reported in Fig. 3b, reveals the formation of two fragment ions, also in agreement with the structure of this metabolite (loss of CO2, m/z 163.0767, Δ = 1.4 ppm, and loss of C4H5O2, m/z 122.0376, Δ = 2.1 ppm). In accordance with the Metabolomics Standards Initiative (MSI) [54], metabolites identified at this stage were labeled as putatively characterized (level 3).
After these UHPLC–HRMSn experiments, 24 metabolites were putatively characterized among the 128 compounds detected in the raw data generated in the negative-ionization mode. Only five compounds were putatively characterized under positive-ionization conditions.
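The use of characteristic neutral losses to recognize phase II conjugates can be sketched as follows. The exact loss values are computed from the formulas C6H8O6 and SO3; the function name `conjugate_type` and the 0.01 u tolerance are assumptions made for illustration, and only the two losses named in the text are included.

```python
# Exact masses of the neutral losses named in the text (u).
NEUTRAL_LOSSES = {
    "glucuronide": 176.0321,  # C6H8O6
    "sulfate": 79.9568,       # SO3
}


def conjugate_type(precursor_mz, fragment_mzs, tol=0.01):
    """Return the conjugate type whose neutral loss matches a fragment, if any.

    tol is an assumed absolute tolerance in u on the precursor - fragment gap.
    """
    for name, loss in NEUTRAL_LOSSES.items():
        if any(abs((precursor_mz - f) - loss) <= tol for f in fragment_mzs):
            return name
    return None


if __name__ == "__main__":
    # Values from the azoxystrobin metabolite example in the text.
    print(conjugate_type(287.0232, [207.0663]))
```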

Fig. 3 MS–MS spectra of the suspected methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate sulfate (azoxystrobin) ion obtained at RT = 6.6 min in (a) human urine samples and (c) a rat urine sample, and MS3 spectra of m/z 207 obtained at RT = 6.6 min in (b) human urine samples and (d) a rat urine sample

The final step of the identification process consisted in validating putative characterizations by comparison with spectral data obtained from authentic standards. Unfortunately, metabolites of xenobiotics are rarely commercially available, and none of the metabolites putatively characterized in this study was available. Moreover, when parent molecules were available, they were not necessarily ionized with the same ionization mode as their metabolites, thus preventing validation of structures by using similarities of substructure spectra. The only exception was fenitrothion which was fragmented during the negative ionization process into 3-methyl-4-nitrophenol. This latter underwent loss of NO (−30 u) whereas the putative 3-methyl-4-nitrophenol ion generated from the suspected metabolite 3-methyl-4-nitrophenol sulfate underwent loss of 44 u. This information invalidated our hypothesized structure for this metabolite detected in the negative mode. For unchanged pesticides detected in samples, for example cymoxanil and carbofuran, the availability of the standard reference compounds enabled validation or invalidation of hypothetical identities. The suspected detection of cymoxanil in the positive mode (Table 2) was invalidated, because the standard compound did not elute at the same retention time (RTStd = 8.6 min, RTSample = 2.4 min). In contrast, identification of carbofuran in the positive mode could be validated (level 1 [54]) by analysis of the authentic standard, which resulted in the same retention time (10.1 min), exact mass, isotopic pattern, and MS2 spectrum as those of the detected urinary metabolite (results not shown).

Table 2 Compounds identified in human urine

Confirmation of metabolite structure by use of in-vivo-generated metabolites

In the absence of commercially available compounds, a small animal experiment was conducted to produce the hypothesized metabolites in vivo by administration of the corresponding parent compounds. After pesticide administration, analysis of rat urine under the same conditions as for human urine led to the detection of several metabolites with the same RT and m/z values, the same isotopic pattern, and the same MS2 and MS3 spectra as those detected for human metabolites. Returning to the example of methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate sulfate detected in human urine as a putative metabolite of azoxystrobin, the MS2 and MS3 spectra generated from the human urinary metabolite are displayed in Fig. 3a and b, respectively, whereas the MS2 and MS3 spectra of the rat urinary metabolite detected at the same RT are presented in Fig. 3c and d. The MS and MSn spectra of these two metabolites could be superimposed, enabling identification (level 1) of this metabolite in human urine.

Of the 24 metabolites putatively characterized by negative ESI analysis of human urine, 20 were successfully identified (level 1) by comparison with rat urine metabolites (Table 2). One metabolite was invalidated (3-methyl-4-nitrophenol sulfate), and three metabolites remained as putatively characterized (level 3, [54]) because they were not detected in rat urine (possibly because of different metabolism in humans and rodents). By use of positive ionization, only five compounds were putatively characterized (Table 2). As previously stated, the analysis of the authentic standard invalidated the presence of cymoxanil, whereas carbofuran was confirmed (level 1). 1,2-Dimethylcyclopropane-1,2-dicarboxylic acid (Table 2) was detected at the same RT as in the negative ionization mode experiment, enabling its validation. Hippuric acid and 2,6-dimethylaniline were detected and identified in rat urine and can be attributed to metabolism of pencycuron and metazachlor, respectively. However, these compounds can also originate from exogenous compounds other than pesticides or even from endogenous substances, and thus were not validated as specific pesticide metabolites. Finally, only two compounds (i.e. carbofuran and 1,2-dimethylcyclopropane-1,2-dicarboxylic acid) were identified in the positive ionization dataset. Because both corresponded to compounds also detected (with better sensitivity) by use of negative ESI, relative quantification of metabolites was performed considering only the negative ionization dataset, on the basis of 23 metabolites reported in Table 2, representing 23 variables for statistical analysis.

Statistical data analysis

Semi-quantitative data (normalized peak areas for each of the 23 variables measured in urine samples from the 40 individuals) were first analyzed by use of PCA. This analysis revealed neither intrinsic clusters nor outlier individuals (results not shown). PLS-DA was then used to model the relationship between exposure conditions and spectral data. In a first attempt, the constructed model was unable to separate the “urban” and “low exposure” groups. These two groups were therefore merged and a new model was constructed from three defined groups: “urban + low exposure”, “medium exposure”, and “high exposure”. The resulting model had two components explaining 51.9 % of the variation (R²) and a predictive capacity (Q²) of 0.359. The model was validated by use of a permutation test (results not shown). The score plot obtained from the PLS-DA, presented in Fig. 4, shows separation of the “urban + low exposure” group from the “medium exposure” and “high exposure” groups along the first axis, and of the “medium exposure” group from the “high exposure” group along the second axis. Discriminating metabolites are reported in Table 3, with their variation trends between groups. Eight metabolites differed significantly among groups, with a VIP value >1.0 and a Kruskal–Wallis test p-value <0.05 (Table 3). Among these metabolites, 2-(4-hydroxyphenoxy)propanoic acid sulfate contributed to separation of the three groups (p-value <0.05) but did not enable significant pairwise separation of the groups. For the seven other metabolites, exposure conditions resulted in a significant increase (+) of their signal and, therefore, of their urinary concentration (Table 3).
For methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate sulfate, the E and Z isomers of the glucuronide metabolites of methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate, and 2-methyl-2-phenylpropanoic acid, the p-value was <10⁻⁴, indicating that these four metabolites made strong contributions to discrimination between the groups. In contrast, the p-values of the four other metabolites were >3 × 10⁻³. The concentration variations of the four most significant metabolites according to exposure group are presented in Fig. 5, which reveals significant increases in concentration with increasing exposure. Therefore, according to the statistical model generated by untargeted biomonitoring of 40 subjects, exposure could be characterized on the basis of four metabolites of two fungicides, azoxystrobin and fenpropimorph, commonly used for treatment of cereal crops.
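The selection rule described above (VIP >1.0 together with a Kruskal–Wallis p-value <0.05 across three exposure groups) can be illustrated with a minimal Python sketch. It is not the statistical software used in this work and all function names are hypothetical. The sketch assumes exactly three groups, uses the usual chi-square approximation of the H statistic (for two degrees of freedom the upper-tail probability is exp(−H/2)), and treats the VIP value as an input obtained elsewhere from the PLS-DA model.

```python
import math
from itertools import chain


def average_ranks(values):
    """Rank values from 1 to N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, p) for exactly three groups, with tie correction."""
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups (df = 2)")
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    h /= correction
    p = math.exp(-h / 2)  # chi-square upper tail for 2 degrees of freedom
    return h, p


def is_discriminating(vip, p_value, vip_min=1.0, alpha=0.05):
    """Apply the VIP > 1.0 and p < 0.05 thresholds stated in the text."""
    return vip > vip_min and p_value < alpha
```

For groups as small as those in this study, the chi-square approximation is only approximate, so the p-values from such a sketch should be read as indicative rather than exact.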

Fig. 4 Two-dimensional PLS-DA score plot of human urine samples for “urban + low exposure” (n = 20), “medium exposure” (n = 10), and “high exposure” (n = 10) (R²Y = 51.9 % and Q² = 0.359)

Table 3 Variation of signals as a result of exposure
Fig. 5 Variation of filtered normalized signals of (a) methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate sulfate, (b) 2-methyl-2-phenylpropanoic acid, (c) methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate glucuronide (1; E or Z), and (d) methyl-2-(2-hydroxyphenyl)-3-methoxyacrylate glucuronide (2; E or Z) in human urine samples

Conclusion

Exposure of a population to contaminants may be estimated either by monitoring biological samples collected from the population or by measurement of chemical concentrations in the environment, including food [55, 56]. However, most biomonitoring considers substances individually, and current approaches do not address co-exposure. This question is of growing concern and must be considered in future analytical strategies. Doing so requires techniques and concepts that enable identification, without a priori consideration, of a variety of (potentially unknown or putative) metabolites representative of a pattern of exposure.

The purpose of the “exposomics” method developed in this work was to obtain untargeted evidence of urinary metabolites resulting from exposure to pesticides. The method is based on LC–HRMS measurements in both positive and negative modes of electrospray ionization, coupled to in vivo generation of metabolites used as standards for confirmation of metabolite structure, as a preliminary step enabling establishment of a link between urinary biomarkers and exposure to several pesticides. In our study, among the 47 pesticides initially selected for data mining, only two (azoxystrobin and fenpropimorph) were found to discriminate among the sub-populations according to their levels of environmental exposure. Although designed for assessment of feasibility and performed on a limited scale (40 subjects), this study showed that metabolites of two pesticides used for cereal crop treatment were present at higher levels in the urine of women living close to areas where cereal crops were grown. This study also confirmed the relevance of the exposure indicators used in this kind of epidemiological approach (agricultural practice, distance from agricultural fields). Our results may lead to more comprehensive studies for discovery of other metabolites suitable for screening in further biomonitoring studies. This approach may also be useful for refining group definition in epidemiological studies by use of predictive statistical models. Extension of this approach to a greater number of urine samples is in progress, and could be part of a validation process.

Work is also in progress to improve this approach, to ensure wider xeno-metabolome coverage, and to extend its application to the identification of other metabolites as markers of exposure to other types of contaminant, for example mycotoxins or PAHs, and thereby to help meet the challenge of multiple exposure assessment.