1 Introduction

The development of metabolomics has undoubtedly been a turning point in ongoing scientific research: it has added a new dimension to multidimensional biology by enabling the in-depth study of global metabolic networks (Krishnan et al. 2005; Allwood et al. 2008; Dunn 2008; Spratlin et al. 2009; Vinayavekhin et al. 2010). Metabolomics has been defined as the comprehensive qualitative and quantitative profiling of a large number of metabolites of a biological system (Trethewey et al. 1999; Fiehn et al. 2000). Its major advantage is the simultaneous monitoring of metabolic networks in a way that enables the association of changes in such networks with biotic and/or abiotic causal agents and the detection of corresponding biomarkers. Thus, metabolomics serves as a link between genotypes and phenotypes. Recent advances in analytical instrumentation, the development of dedicated software for the analysis of vast amounts of data, and the availability of on-line databases have contributed to the standardization of metabolomics.

Originally, metabolomics was established using plants and microbes as the model organisms (Fiehn et al. 2000; Ratcliffe and Shachar-Hill 2001; Ott et al. 2003; Smedsgaard and Nielsen 2005). To date, metabolomics approaches have been successfully developed in almost every field of science (Ward et al. 2007; Allwood et al. 2008; Dunn 2008; Hall et al. 2008; Carraro et al. 2009; Cevallos-Cevallos et al. 2009; Hunter 2009; Kaddurah-Daouk and Krishnan 2009; Spratlin et al. 2009; Woo et al. 2009; Vinayavekhin et al. 2010).

Focusing on crop protection, agrochemical industries and academic institutions have devoted great effort over the last two decades to the development of new crop protection agents that are safer for the environment and consumers and more efficient than the existing ones. Within this framework, the experimental data required for the registration of pesticides by international organizations such as the European Union (EU) [Regulation (EC) No 1107/2009 of the European Parliament http://eur-lex.europa.eu/Result.do?T1=V2&T2=2009&T3=1107&RechType=RECH_naturel&Submit] have increased significantly, making the process extremely time consuming and expensive.

The present review gives an outlook on applications of metabolomics in pesticide research and development, such as the discovery of the modes-of-action (MOA) of bioactive compounds, the assessment of their eco-toxicological and toxicological risk, the discovery of new bioactive compounds, and the evaluation of the risk of genetically modified crops. Since the instrumentation that can be applied in metabolomics studies has been extensively reviewed recently (Krishnan et al. 2005; Villas-Bôas et al. 2005; Allwood et al. 2008; Dunn 2008; Lindon and Nicholson 2008a, b; Wishart 2008), the analytical platforms commonly used in metabolomics and the principles of data mining will only be briefly discussed here.

2 Analytical platforms employed in metabolomics analyses

Metabolomics is a powerful tool for the comprehensive study of systems biology that encompasses not only chemical analyses and data mining but also experimental design and execution, data pre-processing, and biological interpretation. The different steps of a typical metabolomics experiment are presented in Fig. 1. The central step in metabolomics is the selection of the analytical platform that will be used for the analyses of samples. The choice depends on several factors such as the physicochemical properties and chemical composition of samples, the aim of the study, and the available instrumentation. Although numerous analytical instruments suitable for metabolomics have been developed, nuclear magnetic resonance spectroscopy (NMR) and mass spectrometry (MS) analyzers are those that have been mainly employed in metabolomics studies (Krishnan et al. 2005; Villas-Bôas et al. 2005; Allwood et al. 2008; Dunn 2008; Lindon and Nicholson 2008a, b; Wishart 2008; Ward et al. 2010). Both have great potential for metabolomics, and their strengths and weaknesses have been recently reviewed by Lindon and Nicholson (2008a, b).

Fig. 1

Representative steps of a typical metabolomics experiment. The progressive loss of information on the metabolite composition of the analyzed biological material could be the result of sample handling and analyses and/or data processing and biological interpretation. FT-ICR/MS Fourier transform-ion cyclotron resonance–mass spectrometry, GC gas chromatography, HCA hierarchical cluster analysis, LC liquid chromatography, MS mass spectrometry, NMR nuclear magnetic resonance spectroscopy, OPLS-DA orthogonal partial least squares-discriminant analysis, PCA principal components analysis, TOF time-of-flight

2.1 Nuclear magnetic resonance spectroscopy

NMR spectroscopy provides structural information on pure compounds and multi-complex mixtures. Nonetheless, in the latter case the identification of compounds becomes extremely challenging, especially when analyzing crude samples without prior separation in one-dimensional (1D) proton NMR (1H NMR) experiments, which represent the majority of NMR metabolomics experimentation (Krishnan et al. 2005). In cases where additional information on chemical composition of samples is required, two-dimensional (2D) analyses such as 1H–1H 2D J-resolved experiments, correlation spectroscopy (COSY), and total correlation spectroscopy (TOCSY) should be performed (Viant et al. 2008).

Although NMR spectroscopy lacks the sensitivity of MS, it has been successfully used in metabolomics mainly because of its high reproducibility (Viant et al. 2009) and minimal sample preparation requirements. The relatively recent introduction of higher field magnets (900 MHz), cryogenically cooled probes, small-volume microprobes, LC-NMR and LC-NMR/MS instruments, and NMR databases has opened new horizons in the application of NMR spectroscopy in metabolomics (Lindon et al. 2000; Lindon 2003; Markley 2007; Lindon and Nicholson 2008a, b). Additionally, 1H magic angle spinning (MAS) NMR spectroscopy has facilitated analyses of minute intact tissues without prior treatment, and it is a technique with high, still largely unexploited potential for metabolomics (Wind et al. 2005). The biggest obstacles to the use of NMR spectroscopy in metabolomics are its relative insensitivity (limit of detection 1–5 μM), the cost of instrument purchase and maintenance, especially for instruments operating at 1H frequencies higher than 500 MHz, and the lack of automated metabolite identification through relevant databases.

2.2 Mass spectrometry

The potential of various MS analytical platforms for metabolomics studies has been thoroughly reviewed (Villas-Bôas et al. 2005; Dunn 2008). The hyphenation of MS detectors with chromatographic separation, in combination with existing mass spectral libraries, is among the most significant advantages of MS for metabolomics. Additionally, direct infusion MS (DIMS) using powerful mass analyzers provides a high-throughput alternative approach for metabolomics (Aharoni et al. 2002; Villas-Bôas et al. 2005; Dunn 2008).

In comparison to NMR spectroscopy, MS detectors provide lower limits of detection but lack the reproducibility of the former (Lindon and Nicholson 2008a, b). In addition, the cost of purchase of a simple MS system such as a quadrupole gas chromatography/mass spectrometry (GC/MS) analyzer is affordable for many laboratories. Due to its reliability, range of applications, and relatively low cost of purchase, operation, and maintenance, GC/MS was the MS platform initially used for the establishment and development of MS metabolomics (Fiehn et al. 2000), and it remains the predominant MS platform in use. Nonetheless, the development and introduction of time-of-flight (TOF) analyzers in combination with GC have significantly reduced the time required for analyses of complex mixtures and improved mass accuracy (Villas-Bôas et al. 2005; Dunn 2008). Furthermore, initiatives have focused on the development of standardised GC-TOF/MS protocols for metabolomics based on inter-laboratory experiments (Allwood et al. 2009). The performance of GC can be further improved by performing two-dimensional GC separation (GC × GC), which provides greater peak capacity than a single GC separation.

Compared to GC/MS, high performance liquid chromatography/MS (HPLC/MS) is a more recent development in analytical instrumentation. A great advantage of HPLC/MS and ultra high performance liquid chromatography/MS (UHPLC/MS) over GC/MS is that derivatization of samples is not a priori required. The potential of HPLC/MS, UHPLC/MS, and hydrophilic interaction chromatography–MS (HILIC/MS) for metabolomics studies has been recently highlighted (Allwood and Goodacre 2010; Guillarme et al. 2010). In addition to quadrupole MS and TOF/MS, the hybrid quadrupole-time-of-flight (Q-TOF) analyzer has great potential in metabolomics studies in combination with LC or DIMS (Fardet et al. 2008; Pandher et al. 2009).

Capillary electrophoresis/MS (CE/MS), in which components of the analyzed samples are separated according to mass-to-charge ratios, is an alternative to GC/MS and LC/MS and has a great potential for metabolomics (Soga et al. 2003; Ramautar et al. 2006; Monton and Soga 2007; Kralya et al. 2009).

The introduction of two powerful MS detectors in metabolomics applications, namely Fourier transform-ion cyclotron resonance/MS (FT-ICR/MS) and Orbitrap MS, has opened new horizons in the analysis of global metabolic networks. FT-ICR/MS is a superior MS detector providing high mass accuracy (<1 ppm) and resolving power. Such features have been successfully employed in high-throughput metabolomics studies (Aharoni et al. 2002; Oikawa et al. 2006; Takahashi et al. 2008; Aliferis and Jabaji 2009; Taylor et al. 2009). Orbitrap MS, which has been recently introduced (Hu et al. 2005), is a detector with performance comparable to that of FT-ICR/MS. Its great potential in metabolomics has been revealed in recent studies (Werner et al. 2008; Chong et al. 2009; Koulman et al. 2009; Okada et al. 2009; Lv et al. 2010). In spite of the great resolving power and mass accuracy provided by those two platforms, the assignment of detected ions to corresponding compounds remains challenging. The establishment of guidelines for minimizing candidate molecular formulas (Kind and Fiehn 2007), in combination with target metabolite libraries and MSn analyses, contributes to partially surmounting that obstacle.
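The mass accuracy figure quoted above is a relative error expressed in parts per million. As a minimal sketch (the m/z values below are hypothetical and serve only to illustrate the arithmetic), the error of a measured ion against a candidate theoretical mass could be computed as follows:

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error of a measured m/z value in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float = 1.0) -> bool:
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


# Hypothetical values, not taken from any cited study.
theoretical = 400.0000
close_hit = 400.0002   # roughly 0.5 ppm away from the theoretical mass
far_hit = 400.0010     # roughly 2.5 ppm away from the theoretical mass

error_close = ppm_error(close_hit, theoretical)
accepted_close = within_tolerance(close_hit, theoretical)
accepted_far = within_tolerance(far_hit, theoretical)
```

A tolerance of 1 ppm narrows the list of candidate formulas considerably, but, as noted above, it rarely yields a unique assignment on its own.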

2.3 Integration of different analytical platforms

It is apparent that the developments in analytical instrumentation and software have greatly increased our capacity to comprehensively analyze multi-complex samples. However, the presence of a vast number of metabolites belonging to various chemical groups with different physicochemical properties makes the chemical analysis of complex samples challenging. Thus, since one analytical platform alone cannot provide adequate information across a wide range of compounds, the strategic combination of different analytical platforms could greatly facilitate the overall analysis. The complementarity between NMR and MS, as well as that between different MS platforms, has been exploited in recent metabolomics studies (Atherton et al. 2006; Moco et al. 2008; Biais et al. 2009; Leon et al. 2009; McKelvie et al. 2009; Aliferis and Jabaji 2010).

3 Data pre-processing, analyses, and visualization

3.1 1H nuclear magnetic resonance spectroscopy data pre-processing

Several factors which could affect the robustness of NMR metabolomics analyses have been reviewed by Defernez and Colquhoun (2003). Following acquisition, the phases and baselines of spectra are automatically corrected and the spectra are aligned to specific chemical shifts. Usually, offsets of chemical shifts are corrected using the reference signal of trimethylsilyl propionate (TSP, 0.00 ppm). In cases where water suppression is imperfect, the spectral region close to the water signal should be excluded from further analyses. Finally, spectra are normalized and combined in a single data matrix, which is subjected to multivariate and/or univariate analyses for the detection of trends and biomarkers.
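The pre-processing steps described above (referencing to TSP, exclusion of the water region, and normalization) can be sketched for a single spectrum as follows. This is only an illustration under stated assumptions: the spectrum is a toy list of points, the excluded window of 4.7 to 5.0 ppm is a hypothetical choice, and normalization to total intensity is one of several possible schemes.

```python
def reference_to_tsp(ppm, intensity, search_window=0.2):
    """Shift the ppm axis so that the tallest peak near 0 ppm (assumed to be TSP) sits at 0.00."""
    candidates = [i for i, p in enumerate(ppm) if abs(p) <= search_window]
    if not candidates:
        raise ValueError("no data points near 0 ppm to use as reference")
    ref_index = max(candidates, key=lambda i: intensity[i])
    offset = ppm[ref_index]
    return [p - offset for p in ppm]


def exclude_region(ppm, intensity, low, high):
    """Drop all points whose chemical shift lies in [low, high]."""
    kept = [(p, y) for p, y in zip(ppm, intensity) if not (low <= p <= high)]
    return [p for p, _ in kept], [y for _, y in kept]


def normalize_total(intensity):
    """Scale intensities so that they sum to 1 (total intensity normalization)."""
    total = sum(intensity)
    return [y / total for y in intensity]


def preprocess(ppm, intensity, water=(4.7, 5.0)):
    """Reference, remove the (assumed) water window, then normalize."""
    ppm = reference_to_tsp(ppm, intensity)
    ppm, intensity = exclude_region(ppm, intensity, *water)
    return ppm, normalize_total(intensity)


# Toy spectrum with a 0.05 ppm offset and a residual water signal (hypothetical values).
raw_ppm = [0.05, 1.05, 3.05, 4.85, 7.05]
raw_intensity = [10.0, 2.0, 6.0, 50.0, 2.0]
clean_ppm, clean_intensity = preprocess(raw_ppm, raw_intensity)
```

For simplicity the reference signal itself is kept in the normalized spectrum; in practice it is often excluded as well before normalization.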

3.2 Mass spectrometry data pre-processing

Although various combinations of mass detectors and chromatographic separations can be applied in metabolomics studies, in this section some basic principles of MS data pre-processing for metabolomics will be discussed.

Initially, signals of compounds unrelated to the analyzed biological material have to be detected and excluded from analyses. Such signals usually result from instrument and/or sample contamination, solvent impurities, column bleeding, and/or reagents used for the derivatization of samples (e.g. in GC/MS). Failure to detect these signals can seriously mislead the biological interpretation of results. Analysis of blank samples prepared following the same protocols as the biological samples is extremely useful for detecting those signals. Subsequently, spectra are aligned based on retention times or mass/charge (m/z) ratios. Additionally, multiple signals resulting from a single compound should be detected and summarized. Such signals could be the result of imperfect derivatization (e.g. in GC/MS) or of different ionization modes (e.g. in FT-ICR/MS, Orbitrap MS). Finally, total ion chromatograms (TIC) or MS spectra are normalized and subjected to multivariate and/or univariate analyses for the detection of trends and biomarkers.
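Two of the steps above, the removal of signals that also occur in blank samples and the normalization to total signal, could be sketched as follows. The feature names, intensities, and the blank ratio threshold of 0.3 are hypothetical choices made for illustration; alignment and the merging of multiple signals per compound are not covered.

```python
def remove_blank_features(samples, blank, max_blank_ratio=0.3):
    """Keep only features whose blank signal is below max_blank_ratio of the mean sample signal.

    samples: list of dicts mapping feature name -> intensity
    blank:   dict mapping feature name -> intensity in the blank sample
    """
    kept = []
    for feature in samples[0]:
        mean_signal = sum(s[feature] for s in samples) / len(samples)
        if mean_signal > 0 and blank.get(feature, 0.0) / mean_signal < max_blank_ratio:
            kept.append(feature)
    return [{f: s[f] for f in kept} for s in samples]


def normalize_to_total(sample):
    """Divide every intensity by the summed intensity of the sample."""
    total = sum(sample.values())
    return {f: v / total for f, v in sample.items()}


# Hypothetical feature table; "siloxane" stands in for a column-bleed artefact.
samples = [
    {"alanine": 40.0, "sucrose": 60.0, "siloxane": 100.0},
    {"alanine": 20.0, "sucrose": 80.0, "siloxane": 90.0},
]
blank = {"alanine": 0.0, "sucrose": 1.0, "siloxane": 95.0}

filtered = remove_blank_features(samples, blank)
normalized = [normalize_to_total(s) for s in filtered]
```

Here the artefact is removed before normalization, so that it does not distort the relative intensities of the biologically relevant features.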

3.3 Multivariate data analyses

Multivariate analysis is the central statistical approach to analyzing the vast amount of data derived from metabolomics experiments. Principal components analysis (PCA) is a valuable tool for the initial overview of data sets and the detection of trends and of outliers that could exert leverage on the analyses. PCA detects the directions in multivariate space that represent the largest sources of variation (principal components, PCs). However, according to Eriksson et al. (2001), the PCs discovered by PCA do not necessarily represent the maximum variation among classes. Partial least squares-discriminant analysis (PLS-DA), on the other hand, accomplishes a rotation of the projection to unravel hidden variables that contribute to class separation. Thus, the detection of the most influential variables (biomarkers) for the observed separations should be based on PLS-DA. Orthogonal PLS-DA (OPLS-DA) is an alternative to PLS-DA that has been designed to handle the variation in X that is orthogonal to Y. It can likewise be applied for the discrimination between treatments and the detection of biomarkers in metabolomics experiments. The partitioning of the X-data by OPLS-DA provides improved model transparency and interpretability with no effect on predictive power (Trygg and Wold 2002).
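To make the notion of a principal component concrete, the following sketch extracts the first PC of a small mean-centered data matrix by power iteration on its covariance matrix. This is a didactic simplification, not the algorithm of any particular software package, and the data matrix is hypothetical. Note that no class labels enter the calculation, which is why PCs need not coincide with directions of class separation.

```python
import math


def mean_center(X):
    """Subtract the column mean from every column of X (rows = samples)."""
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in X]


def covariance(Xc):
    """Sample covariance matrix of an already mean-centered matrix."""
    n, m = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(m)]
            for i in range(m)]


def first_component(C, iterations=200):
    """Leading eigenvector and eigenvalue of C by power iteration."""
    m = len(C)
    v = [1.0 / math.sqrt(m)] * m  # arbitrary start vector of unit length
    for _ in range(iterations):
        w = [sum(C[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(C[i][j] * v[j] for j in range(m)) for i in range(m))
    return v, eigenvalue


# Hypothetical matrix: 4 samples x 3 variables; the first two variables co-vary strongly.
X = [[1.0, 2.0, 0.5],
     [2.0, 4.1, 0.4],
     [3.0, 5.9, 0.6],
     [4.0, 8.2, 0.5]]

Xc = mean_center(X)
C = covariance(Xc)
pc1, eigenvalue = first_component(C)
explained = eigenvalue / sum(C[i][i] for i in range(len(C)))  # fraction of total variance
scores = [sum(x * w for x, w in zip(row, pc1)) for row in Xc]  # sample scores on PC1
```

In this toy example nearly all of the variance lies along a single direction dominated by the first two variables, so one PC summarizes the data set.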

Additionally, the application of multiblock multivariate analyses to combined data obtained by different analytical platforms appears to be a powerful statistical approach that provides valuable complementary information for the high-throughput identification of compounds in complex samples. Pre-processed data matrices obtained by different analytical platforms are combined into a single matrix, which is subsequently subjected to multiblock multivariate analyses such as multiblock PCA or multiblock PLS-DA. Such an approach was presented by Biais et al. (2009), who performed multiblock PCA on combined NMR and GC-TOF/MS data.
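The combination of data matrices from different platforms can be sketched as a column-wise concatenation. The block scaling used here (division by the square root of the number of variables in each block, so that a block with many variables does not dominate) is a common convention assumed for illustration; it is not necessarily the scaling applied by Biais et al. (2009), and the values are hypothetical.

```python
import math


def block_scale(block):
    """Divide every value by sqrt(number of variables in the block)."""
    k = math.sqrt(len(block[0]))
    return [[v / k for v in row] for row in block]


def combine_blocks(*blocks):
    """Concatenate block-scaled matrices column-wise; rows must be the same samples."""
    n = len(blocks[0])
    if any(len(b) != n for b in blocks):
        raise ValueError("all blocks must contain the same number of samples")
    scaled = [block_scale(b) for b in blocks]
    return [sum((b[i] for b in scaled), []) for i in range(n)]


# Hypothetical pre-processed blocks for the same two samples.
nmr_block = [[1.0, 2.0, 3.0, 4.0],
             [2.0, 2.0, 2.0, 2.0]]   # 4 variables
ms_block = [[9.0],
            [3.0]]                   # 1 variable

combined = combine_blocks(nmr_block, ms_block)
```

The combined matrix can then be passed to a multivariate method; the blocks are assumed to have been centered and scaled per variable beforehand.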

3.4 Hierarchical clustering and heat maps

Although PCA and PLS-DA are the main statistical methods used in metabolomics for the detection of trends and biomarkers, hierarchical cluster analysis (HCA) provides valuable information regarding the grouping of treatments in large datasets and can be used as a complement to PCA and PLS-DA. The notion of HCA is to group treatments that are close to each other into clusters. HCA starts with a number of clusters equal to the number of observations, and the two closest observations or clusters are progressively merged until only one cluster remains. The calculation of clustering distances is based on algorithms such as single, complete, average, and Ward’s linkages. Results of HCA are presented in dendrograms, where observations are plotted versus cluster distances. Commonly, unsupervised HCA is performed. However, the concept of supervised HCA has been developed by Umetrics AB (Umeå, Sweden) under the terms PLS-HCA and OPLS-HCA.
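The agglomerative procedure described above can be written down in a few lines. The sketch below uses Euclidean distances and average linkage, one of the linkage options mentioned; the profiles are hypothetical, and the naive search over all cluster pairs is meant for clarity rather than efficiency.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, data):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(data, n_clusters=1):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns the final clusters (lists of row indices) and the merge history
    as (cluster_a, cluster_b, distance) tuples, i.e. the dendrogram content.
    """
    clusters = [[i] for i in range(len(data))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], data)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges


# Hypothetical metabolic profiles of four treatments (two variables each).
profiles = [[1.0, 1.0], [1.2, 0.9], [5.0, 5.1], [5.2, 4.9]]

two_clusters, _ = agglomerate(profiles, n_clusters=2)
_, full_history = agglomerate(profiles, n_clusters=1)
```

The merge distances in the history correspond to the heights at which branches join in the dendrogram.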

Other powerful tools for the analysis and visualization of metabolomics data sets are the so-called heat maps, which are usually combined with two-dimensional (2D)-HCA. In metabolomics experiments, heat maps illustrate fluctuations (fold changes) in the concentration of metabolites between treatments, encoded using a colour scale (Fig. 2). The combination of 2D-HCA with heat maps reveals trends within treatments and, additionally, variables (metabolites) whose concentrations change with a similar pattern among treatments.
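The values encoded in such a heat map are typically fold changes relative to a control, often on a log2 scale so that increases and decreases are symmetric around zero. A minimal sketch follows; the concentrations are hypothetical, and the three-level text encoding merely stands in for a colour scale.

```python
import math


def log2_fold_change(treated, control):
    """log2(treated / control) for every metabolite present in the control."""
    return {m: math.log2(treated[m] / control[m]) for m in control}


def encode(value, threshold=0.5):
    """Crude stand-in for a colour scale: '+' increased, '-' decreased, '0' unchanged."""
    if value > threshold:
        return "+"
    if value < -threshold:
        return "-"
    return "0"


# Hypothetical concentrations in arbitrary units.
control = {"lactate": 2.0, "ATP": 8.0, "alanine": 4.0}
treated = {"lactate": 8.0, "ATP": 2.0, "alanine": 4.0}

fold = log2_fold_change(treated, control)
cells = {m: encode(v) for m, v in fold.items()}
```

A fourfold increase and a fourfold decrease then appear as +2 and -2, respectively, which makes the colour scale easier to read.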

Fig. 2

Heat map illustrating fluctuations in the concentration of metabolites between treatments encoded using a colour scale. Two-dimensional hierarchical cluster analysis combined with heat maps reveals trends within treatments and variables (metabolites)

4 Web resources for metabolomics and biological interpretation of results

The development during the last few years of a large number of freely available web-based databases and software for metabolomics data pre-processing, analyses, visualization, and biological interpretation has opened new horizons in metabolomics. An extensive list of such resources is displayed in Tables 1 and 2.

Table 1 Freely available on-line software for metabolomics data analyses and visualization
Table 2 Freely available on-line metabolic databases

Metabolomics analyses result in a vast amount of data whose biological interpretation is demanding and time-consuming, and requires high-throughput tools for its correlation with the physiological processes and metabolic pathways of the biological systems under study. In contrast to software for data pre-processing and analyses, only a few software packages that facilitate the robust incorporation of metabolomics data into metabolic networks are available. Among those, the software Cytoscape (Shannon et al. 2003) enables several high-throughput metabolomics applications through plug-ins such as MetaNetter (Breitling et al. 2006) and Metscape (Gao et al. 2010). An output of metabolic pathways using Cytoscape v2.6.3 is presented in Fig. 3. The development of advanced, sophisticated software integrating several species-specific databases is expected to boost metabolomics research and contribute to its development as a robust global bioanalytical tool for the study of biological systems.

Fig. 3

Partial view of the potato metabolic network using the software Cytoscape v2.6.3 and the database of BioCyc (http://www.gramene.org/pathway/potatocyc.html). The enlarged section depicts part of the glycoalkaloid pathway which leads to the production of α-solanine and α-chaconine, two of the most abundant bioactive potato glycoalkaloids implicated in responses of the plant to biotic and abiotic stimuli. Nodes represent metabolites/enzymes/chemical reactions

5 Applications of metabolomics in pesticide research and development

5.1 Elucidation of the modes-of-action (MOA) of bioactive compounds

The discovery of bioactive compounds with modes-of-action (MOA) different from those of the commercially used pesticides is of practical importance not only for predicting effects on non-target organisms but also for combating pest resistance (Ward and Bernasconi 1999). It is worth noting that approximately 85 different MOA of pesticides have been reported (http://www.frac.info, http://www.irac-online.org/, http://www.hracglobal.com), indicating that only a small number of potential biochemical targets has been elucidated so far by using classical research protocols.

Nature is a valuable and inexhaustible source of bioactive compounds with unique structures and MOA. A number of such compounds and/or their chemical analogues have been successfully developed as crop protection agents (Lange and Lopez 1996; Thompson et al. 2000; Balba 2007; Copping and Duke 2007; Jeschke and Nauen 2008; Dayan et al. 2009; Huang et al. 2009). The overall procedure for the evaluation of compounds to be developed as crop protection agents is schematically presented in Fig. 4. Compounds which share the same subcellular target with existing pesticides or non-selectively affect metabolic processes existing in both target and non-target organisms are less likely to be selected for further evaluation. Taking into account the vast number of candidate molecules and the fact that the discovery of their MOA is laborious, time-consuming, and costly, the development of simplified, robust, and high-throughput screening techniques is required. Because of its unique features and capacities, metabolomics is well suited to fulfilling these requirements (Fig. 4).

Fig. 4

The sequential steps in the evaluation of bioactive compounds as crop protection agents. Steps for which metabolomics approaches have been extensively used are marked with an asterisk (*)

In a first step, metabolomics enables the detection of alterations in the metabolomes of organisms as a result of their exposure to bioactive compounds with known MOA. In a second step those alterations are indirectly associated with the MOA of the applied compounds. Thus, upon the development of the metabolomics model, the rapid screening of the MOA is facilitated, and further research is focused on one or few MOA for further experimentation and elucidation of the MOA of the bioactive compounds under study. However, the development of robust metabolomics models for the study of the MOA of bioactive compounds is a complex procedure. Several factors have to be taken into consideration such as the selection of the appropriate organism or tissue that will be analyzed, and the combination between the applied doses of the bioactive compounds, duration of exposure, and environmental conditions, which determine the severity of the toxic effects. The application of bioactive compounds at sub-lethal doses under the experimental conditions set is preferable for metabolomics. The development of metabolomics models for several biological systems for the investigation of the MOA of bioactive compounds will assist and accelerate pesticide research and development. Such methodologies could be either limited to a high-throughput classification and discrimination between the metabolic profiles of the organism exposed to bioactive compounds or extended to the detection of toxicity biomarkers. The latter is of high importance not only for ecotoxicological studies but for integrated pest management strategies as well. Nonetheless, until now, metabolomics approaches have been mainly developed for the investigation of the MOA of phytotoxic, antifungal and antibiotic compounds, and to a lesser extent for that of insecticides.
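The two-step logic described above (a reference model built from compounds with known MOA, followed by the screening of a compound under study) can be illustrated with a deliberately simple nearest-centroid classifier. This is a sketch under stated assumptions and not the method of the studies discussed in the following sections, which used approaches such as artificial neural networks, PLS-DA, and PCA; the MOA labels, profiles, and novelty threshold are hypothetical.

```python
import math


def centroid(profiles):
    """Mean metabolic profile of a group of treatments."""
    n = len(profiles)
    return [sum(p[j] for p in profiles) / n for j in range(len(profiles[0]))]


def distance(a, b):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_model(reference):
    """Step 1: summarize the metabolic response to each known MOA by its centroid."""
    return {moa: centroid(profiles) for moa, profiles in reference.items()}


def classify(model, profile, novelty_threshold):
    """Step 2: assign the closest known MOA, or 'unassigned' if none is close enough."""
    moa, d = min(((m, distance(c, profile)) for m, c in model.items()),
                 key=lambda pair: pair[1])
    return (moa if d <= novelty_threshold else "unassigned"), d


# Hypothetical reference profiles (two variables) for two known MOA.
reference = {
    "MOA_A": [[1.0, 0.1], [1.2, 0.0]],
    "MOA_B": [[0.0, 1.0], [0.1, 1.1]],
}
model = build_model(reference)

label_known, dist_known = classify(model, [1.0, 0.0], novelty_threshold=0.5)
label_novel, dist_novel = classify(model, [3.0, 3.0], novelty_threshold=0.5)
```

A compound left unassigned by such a screen would be a candidate for a MOA not represented in the reference set, which is the case of interest for further experimentation.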

5.1.1 Phytotoxic compounds

Regarding phytotoxic compounds, 27 different MOA have been discovered according to data of the Herbicide Resistance Action Committee (HRAC) (http://www.hracglobal.com/Publications/ClassificationofHerbicideModeofAction/tabid/222/Default.aspx). Plants contain a large number of metabolites with different physicochemical properties (http://www.genome.jp/kegg/pathway/map/map01060.html), which makes their analysis a challenging task. The number of metabolites present in the plant kingdom is estimated to exceed 100,000 (Dixon 2001). The application of metabolomics to the automated discovery of the MOA of bioactive compounds was first reported by Aranibar et al. (2001). In that study, artificial neural networks for the classification of herbicides according to their MOA were developed. A very satisfactory classification was achieved for the four different MOA represented by the applied herbicides, using maize (Zea mays L.) as the model plant and 1H NMR spectroscopy for analyses. Soon after, the model was improved by the same research team using herbicides representing nineteen MOA (Ott et al. 2003). Analyses were again performed using 1H NMR spectroscopy and maize as the model organism. Aliferis and Chrysayi-Tokousbalides (2006) developed a 1H NMR fingerprinting methodology combined with multivariate analyses for the discovery of the MOA of the natural phytotoxin pyrenophorol. The metabolite was isolated from cultures of a pathotype of the fungus Drechslera avenae (Kastanias and Chrysayi-Tokousbalides 2000). Changes in the metabolic 1H NMR fingerprints of wild oat (Avena sterilis L.) after treatment with the phytotoxin were compared, by PLS-DA, to those caused by herbicides representing six of the most common phytotoxic MOA. Analyses showed that pyrenophorol has a MOA different from those of the herbicides tested, which makes the metabolite an interesting molecule for further consideration in the context of crop protection. The same year, Oikawa et al. (2006) developed an FT-ICR/MS metabolomics model for the automated detection of the MOA of herbicides. The developed PCA model could discriminate and classify herbicides representing four different MOA using Arabidopsis (Arabidopsis thaliana L.) as the model plant.

5.1.2 Antifungal and antibiotic compounds

Because of their growth habits and culture morphology, fungi and bacteria are biological materials for which the standardization of chemical analyses for metabolomics is a challenging task. Although significant progress has been made in microbial metabolomics (Raamsdonk et al. 2001; Allen et al. 2003; Castrillo et al. 2003; Demyttenaere et al. 2003; Smedsgaard and Nielsen 2005; Forgue et al. 2006; Sorrell et al. 2006; Börner et al. 2007; Mas et al. 2007; Mohler et al. 2007; Aliferis and Jabaji 2010), the application of metabolomics approaches in the discovery of the MOA of antifungal and antibiotic compounds is still in its infancy.

The metabolic footprints of prokaryotic or eukaryotic microorganisms reflect their nutritional needs as well as the metabolites excreted into their nutritional substrate, the so-called exo-metabolome (Kaderbhai et al. 2003; Kell et al. 2005). A comparative study between direct infusion MS (DIMS) and GC/MS, the two most common analytical platforms used in microbe metabolic footprinting was published by Mas et al. (2007). Applying DIMS, Allen et al. (2004) developed a metabolic footprinting approach for the study of the MOA of fungicides representing four different MOA using Saccharomyces cerevisiae as the model organism. Pattern recognition analyses gave satisfactory results correlating changes caused in the exo-metabolome of S. cerevisiae to the MOA of the applied fungicides.

In 2007, Yu et al. developed a methodology for the discovery of the MOA of natural antibiotics based on HPLC-ESI/MS metabolic profiling. In the study, Staphylococcus aureus, which is a pathogen of humans and animals, was used. Metabolic profiles of bacteria after treatments with antibiotics with known MOA were compared to those obtained from treatments with the antibiotic compounds under testing. Analyses revealed that the antibacterial MOA of rhizome extracts of the plant Tinospora capillipes Gagnep. resembles that of rifampicin and norfloxacin, which act on nucleic acids. In a similar study, using the same model organism and instrumentation, Yi et al. (2007) applying PCA found that berberine causes changes in the metabolic profiles of the bacterium similar to those caused by rifampicin (inhibition of DNA-dependent RNA polymerase) and norfloxacin (inhibition of DNA gyrase and topoisomerase IV). Thus, it was concluded that the MOA of berberine may be similar to that of rifampicin or norfloxacin. Using also S. aureus as the model organism, Liu et al. (2010) performing GC/MS metabolomics in combination with multivariate pattern recognition analyses concluded that the MOA of the four synthesized antibiotics under study resemble that of clindamycin which inhibits protein synthesis by reversibly binding to the 50S subunit of ribosomes. In addition, metabolites such as d-glucose, proline, and phosphoric and propanoic acids were identified as biomarkers for the observed toxicity.

5.1.3 Insecticides

Research on the development of metabolomics models for MOA studies of compounds with insecticidal/acaricidal activity is still in its first steps. However, the sequencing of insect genomes such as those of Drosophila melanogaster (Adams et al. 2000), Drosophila pseudoobscura (Richards et al. 2005), and the mosquito Anopheles gambiae (Holt et al. 2002) has been a turning point in the enrichment of our knowledge on the MOA of insecticides.

In spite of the extensive use of organisms with a nervous system such as earthworms (Eisenia spp. and Lumbricus spp.) in metabolomics (Warne et al. 2000; Bundy et al. 2002; McKelvie et al. 2009), and the potential of D. melanogaster as a model organism in metabolomics (Malmendal et al. 2006; Kamleh et al. 2008; Pedersen et al. 2008) and in studies on the MOA of insecticidal compounds (Schneider 2000), very few metabolomics studies have focused on the development of models for the discovery of the MOA of insecticides. Using the earthworm Lumbricus rubellus as the model organism, Guo et al. (2009) investigated whether bioactive compounds with different MOA can be discriminated according to changes in the NMR metabolic profiles of L. rubellus after treatments with sub-lethal concentrations. Application of PCA resulted in a modest discrimination between treatments and the detection of corresponding biomarkers for CdCl2, atrazine, and fluoranthene toxicity. The results are indicative of the potential of metabolomics models to discriminate and classify alterations in the metabolome of earthworms caused by insecticidal compounds according to their MOA. Another potential application of metabolomics, namely the in vitro study of neurotoxicity and the development of robust models using mammalian cell cultures, was highlighted in the study of van Vliet et al. (2008). Using rat brain cell cultures, the toxicity of mercury chloride and caffeine was studied and corresponding biomarkers were detected by applying PCA. Analyses were conducted using FT-ICR/MS and Orbitrap MS.

Although metabolomics seems to be of limited application in studying metabolic deviations in arthropods exposed to neurotoxic insecticides applied at lethal concentrations, it would be a valuable tool for studying sub-lethal effects on pests and their parasitoids or predators (Desneux et al. 2007). Such findings would be very important for developing integrated pest management programs based not only on direct mortality. On the other hand, since a large number of new highly selective insecticides act by inhibiting biosynthetic pathways, energy production, insect behavior or by regulating insect development (Nauen and Bretschneider 2002; http://www.irac-online.org/), metabolomics approaches could facilitate not only research on the biochemical basis of their selectivity but also the development of new selective insecticides. Furthermore, metabolomics could assist in the elucidation of the biochemical basis of the acquired insect resistance which is not attributed to modification of subcellular targets but strictly to metabolic processes.

5.2 Ecotoxicological–toxicological risk assessment

The evaluation of the effects of bioactive compounds on non-target organisms is a key element that determines their potential to be commercialized as pesticides. States such as those of the EU have developed regulations regarding the evaluation and registration of chemicals (REACH 2006). Additionally, organizations such as the Organisation for Economic Co-operation and Development (OECD) have published comprehensive guidelines for the testing of bioactive compounds (http://www.oecd.org/document/22/0,3343,en_2649_34377_1916054_1_1_1_1,00.html). Although metabolomics has a wide range of applications in toxicology (Robertson 2005; Lindon et al. 2007), its application in ecotoxicological studies is still in its infancy. Environmental metabolomics is the newly emerged branch of metabolomics for the study of responses of a wide range of organisms mainly to abiotic stimuli (Miller 2007; Bundy et al. 2009).

Earthworms (Eisenia spp. and Lumbricus spp.) are among the most extensively used model organisms in environmental metabolomics mainly by applying NMR spectroscopy. They represent a significant fraction of soil fauna biomass and therefore the study of the effects of candidate pesticides on such biomass is of great importance. NMR metabolomics models have been successfully developed for the assessment of toxicity of bioactive compounds and pesticides such as 3-trifluoromethyl-aniline (Warne et al. 2000), 4-fluoroaniline, 3,5-difluoroaniline, and 2-fluoro-4-methylaniline (Bundy et al. 2002), and DDT and endosulfan (McKelvie et al. 2009) on earthworms. Such studies have also led to the identification of corresponding biomarkers for the observed toxicity.

In addition to earthworms, several fish species have been used for risk assessment of pesticides in aquatic environments applying metabolomics. Applying 1H NMR spectroscopy, Viant et al. (2006b) developed a metabolomics approach for the study of dinoseb, diazinon, and esfenvalerate toxicity in eyed eggs and alevins of Chinook salmon (Oncorhynchus tshawytscha). PCA revealed significant changes in the metabolic profiles of the fish in response to exposure to the applied pesticides. Additionally, application of 31P NMR, HPLC–UV and 1H NMR metabolomics in medaka (Oryzias latipes) embryos for the study of dinoseb toxicity revealed significant changes in their metabolic profiles after exposure to the pesticide (Viant et al. 2006a). The observed toxicity was correlated to increased concentrations of lactate and decreased concentrations of ATP, phosphocreatine (PCr), alanine, and tyrosine. Interestingly, Kenneke et al. (2010), applying NMR metabolomics, showed that exposure of rainbow trout (Oncorhynchus mykiss) to the enantiomers of the fungicide triadimefon resulted in different metabolic profiles of the tissues that were analyzed. Taking into account that a large number of crop protection compounds, especially those of natural origin, exist as two or more stereoisomers which might exhibit variable bioactivity (Williams 1997), such findings are indicative of the potential of metabolomics for studying various issues of chirality of pesticides.

Rats and mice represent excellent model mammals for the risk assessment of xenobiotics because they have more similarities with humans than other model organisms do. Initiatives such as the consortium for metabonomic toxicology (COMET) have been introduced (Lindon et al. 2005) aiming to standardize the application of metabolomics (and metabonomics) in risk assessment studies of xenobiotics. Although metabolomics has been extensively applied in such studies (Robertson et al. 2000; Lee et al. 2007; Kim et al. 2009; Ohta et al. 2009; Sun et al. 2009), its application in pesticide research and development is limited. In 2006, Ekman et al. developed 1H NMR metabolomics for the study of the toxicity of the triazole fungicides myclobutanil and triadimefon to Sprague-Dawley rats. Application of PCA and PLS-DA revealed biomarkers of exposure and/or effect. In another study, van Ravenzwaay et al. (2007) studied the toxicity of herbicides that inhibit p-hydroxyphenylpyruvate dioxygenase (HPPD) using Wistar rats. Application of GC/MS and HPLC/MS metabolomics revealed that tyrosine could serve as a biomarker for the observed toxicity. Additionally, the pattern of change in several other metabolites was elucidated.

Aquatic plants are also valuable organisms in assessing environmental health (Mohan and Hosetti 1999; OECD 2002a). Among those, duckweeds (Lemna spp.) are probably the most extensively used plants in such applications (Frankart et al. 2002; International Organization for Standardization 2003; Michel et al. 2004; Aliferis et al. 2009) due to their size, morphology, growth habits, and sensitivity to xenobiotics. In a recent study (Aliferis et al. 2009) the potential of Lemna minor L. as a model organism for the ecotoxicological risk assessment of phytotoxins in aquatic environments was studied applying 1H NMR fingerprinting. A robust 1H NMR metabolomics model was developed for the discrimination and classification of herbicides representing four different MOA and the phytotoxic fungal metabolite pyrenophorol. Results of multivariate analyses showed that the 1H NMR profiles of L. minor could be used as a reliable indicator of phytotoxicity of xenobiotics applied at sub-lethal concentrations and for the discovery of biomarkers valuable in studies on pollution of rural and urban environments.

Additionally, Taylor et al. (2009) highlighted the potential of water flea (Daphnia magna) as a model organism for the ecotoxicological risk assessment of copper applying FT-ICR/MS metabolomics. Results showed that D. magna is an organism suitable for the development of robust metabolomics models for the ecotoxicological risk assessment of bioactive compounds.
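
Several of the studies above rely on PCA to separate the metabolic profiles of treated and untreated organisms. The following is a minimal sketch of that idea, not the procedure of any cited study: it extracts only the first principal component by power iteration, using the Python standard library. The intensity values, the four spectral bins, and the function names are hypothetical and were invented purely for illustration; real spectra would first require binning, normalization, and scaling.

```python
from math import sqrt

# Hypothetical binned spectral intensities (rows = samples, columns = bins).
# All values are invented for illustration; they are not data from any study.
control = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.0],
    [0.9, 1.9, 3.1, 4.0],
]
treated = [
    [3.0, 1.0, 1.5, 4.0],
    [3.1, 1.1, 1.4, 4.0],
    [2.9, 0.9, 1.6, 4.0],
]


def mean_center(rows):
    """Subtract the column mean from every value."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=100):
    """Return (loadings, scores) of PC1 via power iteration on X^T X."""
    x = mean_center(rows)
    p = len(x[0])
    # Non-uniform start vector, so it is unlikely to be orthogonal to PC1.
    v = [float(j + 1) for j in range(p)]
    norm = sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iterations):
        projected = [sum(a * b for a, b in zip(row, v)) for row in x]  # X v
        w = [sum(s * row[j] for s, row in zip(projected, x))
             for j in range(p)]  # X^T (X v)
        norm = sqrt(sum(c * c for c in w))
        if norm == 0.0:  # no variance to explain
            break
        v = [c / norm for c in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    return v, scores


loadings, scores = first_principal_component(control + treated)
control_scores, treated_scores = scores[:3], scores[3:]
print("PC1 loadings:", [round(c, 3) for c in loadings])
print("control scores:", [round(s, 3) for s in control_scores])
print("treated scores:", [round(s, 3) for s in treated_scores])
```

In this toy data set the two groups fall on opposite sides of zero along PC1, and the bin that does not change between groups receives a loading of zero. Bins with large absolute loadings are the ones that would be examined as candidate biomarkers.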

5.3 Exploitation of the in-depth investigation of changes in plant metabolic networks as response to biotic stimuli

The in-depth study of plant metabolic networks and their changes in response to biotic stimuli is an emerging field of metabolomics which could provide new insights into crop protection and plant breeding. In spite of the development of biological control agents and pathogen-resistant crops, synthetic pesticides remain the major means of crop protection. Furthermore, overuse of pesticides has resulted in increasing levels of residues in food products and development of resistant pests and pathogens (Jutsum et al. 1998; Ward and Bernasconi 1999; Dixon 2001; Ma and Michailides 2005; Owen and Zelaya 2005; Li et al. 2007). It is worth mentioning that major key inventions of modern crop protection agents have been based on the in-depth knowledge of the physiology of host interactions with pests and pathogens.

Several plant pathogens are capable of inducing disease symptoms on their respective hosts by virtue of the phytotoxins that they produce, a number of which have been exploited in the context of crop protection (Copping and Duke 2007; Dayan et al. 2009). Plant-pathogen interactions are regulated by complex biochemical mechanisms leading to compatible or incompatible combinations. Phytopathogenic fungi employ mechanical or chemical means to penetrate the plant’s external barriers, overcome its defences, and finally establish themselves in plant tissues, absorbing nutrients and water. On the other hand, plants possess defence mechanisms against the invading pathogens, mainly the production of secondary bioactive metabolites (e.g. phytoalexins) derived from phenylpropanoid, isoprenoid, alkaloid, or fatty acid/polyketide pathways (Dixon 2001). Discovery of crop protection products with antipenetrant and/or plant-defence eliciting properties is a major concern in modern fungicide research. In order to develop such products, metabolic in planta studies are required. To date, only a few metabolomics studies have been conducted for the investigation of plant-pathogen interactions.

Applying NMR spectroscopy and pattern recognition analyses, Choi et al. (2004b) studied changes in the metabolome of Catharanthus roseus leaves after infection by mycoplasma-like organisms (MLOs). Results of PCA showed an increase in the concentration of plant metabolites such as phenylpropanoids and terpenoid indole alkaloids after infection. NMR metabolomics was also applied for the study of changes in the tomato (Solanum lycopersicum L.) metabolome during interaction with citrus exocortis viroid (CEVd) and the bacterium Pseudomonas syringae pv. tomato (López-Gresa et al. 2010). Elevated concentrations of amino and organic acids, phenylpropanoids, and rutin were detected in tomato plants after bacterial infection. On the other hand, infection of plants by CEVd resulted in upregulation of the biosynthesis of metabolites such as glucose, malic acid, and gentisic acid glycoside. An application of NMR metabolomics in the study of the tobacco (Nicotiana tabacum L.)-tobacco mosaic virus (TMV) interaction was developed by Choi et al. (2006). PCA revealed increased levels of metabolites such as 5-caffeoylquinic acid, sesquiterpenoids, and diterpenoids in infected plant tissues.

The potential of FT-ICR/MS for the study of plant-pathogen pathosystems was highlighted in the study of potato sprouts (Solanum tuberosum L.)–Rhizoctonia solani interactions by Aliferis and Jabaji (2009). Applying FT-ICR/MS metabolomics, the de novo synthesis of phytoalexins and glycoalkaloids, as well as profound changes in several other glycoalkaloids and metabolites, were detected as a response of potato sprouts to R. solani invasion. In another study, Bednarek et al. (2005) investigated changes in aromatic metabolic profiles in A. thaliana roots after infection by the root-pathogenic oomycete Pythium sylvaticum applying NMR spectroscopy. Analyses of infected root tissues showed an increase in indolics, whereas their phenylpropanoid content decreased.

In addition to the study of plant-pathogen pathosystems, the study of plant interactions with herbivorous insects and nematodes is also important for crop protection. Such systems are characterized by complex chemical interactions which determine the impacts on plant physiology and, consequently, yield. Jansen et al. (2009) studied the system cabbage (Brassica oleracea L.)-small cabbage white caterpillar (Pieris rapae Lepidoptera: Pieridae) applying UPLC-TOF/MS metabolomics. Results showed significant changes in the coumaroylquinic acid content of plant tissues in response to P. rapae infestation. Interestingly, feeding of caterpillars on cabbage sprouts challenged by jasmonic acid (JA) did not cause significant changes in their metabolome compared to that of caterpillars fed on non-challenged sprouts. Using the same host, Widarto et al. (2006) applied NMR metabolomics for the study of changes in the plant’s metabolome following attack by pronymphae of Plutella xylostella and Spodoptera exigua. Metabolites such as amino acids, carbohydrates, feruloyl malate, sinapoyl malate, and gluconapin were found to be mainly responsible for the discrimination between control and infested plant tissues. In another study, van Dam and Raaijmakers (2006), using B. oleracea and B. nigra as hosts, detected profound changes in indole glucosinolates of roots after root fly (Delia radicum) infestation. Analyses were performed by HPLC/UV.

Metabolomics studies have also been performed for the study of plants’ responses to thrips (Order Thysanoptera), another major group of insects posing a serious threat to crops. Leiss et al. (2009), applying NMR, concluded that several metabolites such as pyrrolizidine alkaloids, jacobine, jaconine, and kaempferol glucoside were responsible for the resistance of Senecio jacobaea and S. aquaticus against Frankliniella occidentalis. NMR metabolomics has also revealed the implication of acylsugars in the resistance of wild and cultivated varieties of tomato against F. occidentalis (Mirnezhad et al. 2010).

On the other hand, the detailed mechanisms underlying plant-nematode interactions are largely unknown. Metabolomics is an approach that could unravel those mechanisms, and the need for its development in such studies has been highlighted (Bezemer and van Dam 2005).

The abovementioned studies are indicative of the potential of metabolomics in unravelling interactions of metabolic nature in complex biological systems. The elucidation of the underlying mechanisms in plant interactions with other organisms is expected to provide alternative or new molecular targets for crop protection agents and new bioactive plant-derived compounds with potential use in crop protection or as templates for chemical syntheses. Furthermore, such studies could dictate new crop breeding strategies for the production of crop varieties with specific metabolic composition for improved resistance against pests and pathogens.

5.4 Risk assessment of genetically modified crops

Progress in genetic engineering enables the recombination of genomes of organisms belonging to distantly related species. The introduction into the plant genome of genes that encode the biosynthesis of metabolites which enhance plant tolerance to abiotic or biotic stresses has brought an evolution in crop protection (Kos et al. 2009). Such approaches have been incorporated into crop protection strategies such as integrated pest management (IPM), and represent a significant part of pesticide research and development.

Incorporation into plant genomes of genes which help plants to overcome the lethal effects of herbicides led to the production of herbicide-resistant crops (Duke 2005; Tan et al. 2006). Since their first introduction in the mid-1990s, a number of GM crop varieties resistant to herbicides such as bromoxynil, glyphosate, glufosinate, and triazine have been developed (Duke 2005; Tan et al. 2006). The example of glufosinate-tolerant crops is briefly presented here. Two sources of genes have been used in transformations, Streptomyces hygroscopicus (bar gene) and S. viridochromogenes (pat gene). Both genes encode phosphinothricin N-acetyltransferase (PAT), which converts l-glufosinate to the non-phytotoxic metabolite N-acetyl-l-glufosinate (OECD 2002b; Ruhland et al. 2004). On the other hand, plants genetically engineered to express insecticidal metabolites have been introduced in order to minimize yield losses caused by insects (Kos et al. 2009). Plants modified to express insecticidal toxins from Bacillus thuringiensis (Bt plants) are the most extensively used in agricultural practice (Betz et al. 2000).

The increasing use of GM crops has raised concerns over their safety. The main question that needs to be answered is whether alterations in a GM plant’s genome cause changes in its metabolome which are potentially harmful to humans, animals, and non-target organisms. The complexity of the plant metabolome and the interactions between several of its components make an unambiguous answer to this question a very challenging task for metabolomics.

Substantial equivalence was introduced by OECD (1993) and FAO (1996) as a reliable indicator for GM food safety assessment. Relevant established guidelines (FAO/WHO 2001; EFSA 2005), EU legislation, and US Acts developed regarding the safety assessment of GM food and food products have been recently reviewed by Dona and Arvanitogiannis (2009). Also, EU initiatives such as The European Thematic Network on the Safety Assessment of Genetically Modified Foods (ENTRANSFOOD, www.entransfood.com) have been introduced for the safety assessment of GM food crops. Nonetheless, based on the potential of metabolomics, FAO and WHO (2000) included metabolic profiling as a complementary methodology to the already existing ones for the safety assessment of GM crops and food products and the estimation of possible unintended effects. Several research groups have successfully performed metabolomics for the comparison between metabolic profiles of GM crops and their isogenic lines by using various analytical platforms. Leon et al. (2009) developed a metabolomics methodology by combining FT-ICR/MS and CE-TOF/MS for the comparison between the metabolic composition of kernels of GM maize and that of the corresponding isogenic lines. The GM maize carried the B. thuringiensis Cry1Ab gene. Analyses revealed substantial differences between GM and wild kernels, especially for metabolites implicated in amino acid biosynthesis. Similarly, Manetti et al. (2006), applying NMR metabolomics, detected metabolic differences between wild and transgenic maize (Cry1Ab gene), mainly in their content of osmolytes and branched amino acids. The abovementioned results are in accordance with those of Piccioni et al. (2009), who applied NMR metabolomics for the discrimination between wild and transgenic maize.

Application of metabolomics for the comparative study between wild and transgenic rice carrying the genes cryIAc and sck for resistance against insects revealed substantial differences in their metabolomes (Zhou et al. 2009). The main differences were detected in their content of lipid acids, carboxylic acids, and carbohydrates. In another study, Le Gall et al. (2003), applying 1H NMR metabolomics, detected metabolic differences between non-transgenic and transgenic tomatoes carrying the maize transcription factors LC and C1. Similarly, application of NMR metabolomics and multivariate analysis (PCA) showed substantial metabolic differences between wild tobacco plants and plants engineered to overexpress salicylate biosynthetic genes (Choi et al. 2004a). The major differences were detected in the chlorogenic acid, malic acid, glucose, and sucrose content of the plants. On the other hand, based on CE-TOF/MS metabolomics, Garcia-Villalba et al. (2008) found very few differences between the conventional and transgenic varieties of soybean included in the study. Similarly, combining GC-TOF/MS and flow injection electrospray-MS (FIE-MS), Catchpole et al. (2005) concluded that there are no significant differences between conventional and GM potatoes (high levels of inulin-type fructans).
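
The comparisons above ultimately ask, metabolite by metabolite, whether a GM line differs from its isogenic counterpart. As a rough sketch of one way such a comparison could be summarized (an assumption for illustration, not the statistical procedure used in the cited studies), the code below computes a log2 fold change and Welch's t statistic per metabolite. The metabolite names and intensity values are hypothetical and invented; a real analysis would also need replicate design, normalization, and correction for multiple testing.

```python
from math import log2, sqrt
from statistics import mean, variance

# Hypothetical peak intensities per metabolite (four replicates per line).
# Names and numbers are invented for illustration only.
gm = {
    "metabolite_A": [8.0, 8.4, 7.6, 8.0],
    "metabolite_B": [5.0, 5.2, 4.8, 5.0],
}
isogenic = {
    "metabolite_A": [4.0, 4.2, 3.8, 4.0],
    "metabolite_B": [5.1, 4.9, 5.0, 5.0],
}


def log2_fold_change(a, b):
    """log2 of the ratio of group means (a relative to b)."""
    return log2(mean(a) / mean(b))


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


summary = {
    name: (
        log2_fold_change(gm[name], isogenic[name]),
        welch_t(gm[name], isogenic[name]),
    )
    for name in gm
}

for name, (lfc, t) in summary.items():
    print(f"{name}: log2 fold change = {lfc:.2f}, Welch t = {t:.2f}")
```

In this toy example only metabolite_A shows a large fold change and t statistic. Whether such a difference has any safety relevance is a toxicological question that the statistic alone cannot answer.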

6 Conclusions and future prospects

Currently, there is increasing concern globally regarding food production to sustain the exponentially increasing human population in a constantly changing environment. However, the demand for increased food production should by no means lead to compromises in food quality. Although metabolomics is a relatively recent tool for the study of systems biology, it could assist the agrochemical industry’s research and development in overcoming the abovementioned major challenges. Indicative of the potential of metabolomics in pesticide research and development is the rapid increase over the last decade in the number of publications retrieved from the ISI Web of Knowledge when searching for the terms “herbicide(s) or fungicide(s) or insecticide(s)” and “metabolomics” or “metabonomics” or “metabolic fingerprinting” or “metabolic profiling” (Fig. 5).

Fig. 5

Diagram indicating the number of publications retrieved for the years 2000–2009 by searching the ISI Web of Knowledge for the terms “herbicide(s) or fungicide(s) or insecticide(s)” and “metabolomics” or “metabonomics” or “metabolic fingerprinting” or “metabolic profiling”

The development of robust metabolomics models for automating the discovery of the MOA of pesticides using appropriate model organisms reduces, in a cost-effective way, the time required for screening the vast number of candidate molecules according to their MOA. This approach has shown its potential, with the exception of insecticides, for which research is in its infancy. Furthermore, metabolomics approaches enable the toxicological–ecotoxicological risk assessment of qualified bioactive compounds using model organisms representative of different levels of organization. Although no solid conclusions have been drawn until now regarding the safety of GM crops applying metabolomics, it is expected that advanced metabolomics studies in combination with toxicological studies for the major findings (i.e. toxicological assessment of biomarkers) will provide an in-depth insight into this issue.

On the other hand, metabolomics has provided a new dimension in the study of systems biology, enabling the in-depth understanding of interactions in biological systems such as the interactions of plants with pests and pathogens. This can reveal novel molecular targets and bioactive metabolites that could potentially influence such interactions to our benefit.

Nonetheless, the incomplete knowledge of plant metabolic pathways and the lack of uniform analytical protocols for metabolomics are shortcomings that need to be overcome for the standardization of metabolomics (Sansone et al. 2007; Sumner et al. 2007; Taylor et al. 2008). Furthermore, the integration of different analytical instruments for the analysis of highly complex samples and the integration of metabolomics with other “omics” approaches in the context of a high-dimensional biological approach are expected to provide new insights into the function and regulation of metabolic networks.