1 Introduction

Molecular biology and biochemistry have undergone profound changes in approach and technique since the sequencing of the human genome in the 1990s. Automated microarray methods for detecting changes in gene expression, and the ability to assay and identify proteins by mass spectrometry, led respectively to two new disciplines, transcriptomics and proteomics, which are useful for understanding the complex interactions between genetic make-up and environmental factors. Importantly, the small molecules involved in biochemical processes provide a wide range of information on the status of living systems when studying changes in gene expression caused by variations in life conditions and external perturbations. The process of monitoring and evaluating such changes is termed “metabonomics”, or “metabolomics” when mostly model organisms or plant systems are studied (Lindon et al. 2006).

The use of NMR spectroscopy combined with multivariate statistical analysis of the complex data sets led Jeremy Nicholson and colleagues (1999) to define metabonomics as “the quantitative measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification”.

The metabolites, or small molecules, within a cell, tissue, organ, biological fluid, or entire organism constitute the metabolome (Miller 2007). Metabolomic analyses can be categorized as either nontargeted or targeted (Issaq et al. 2008; Verpoorte et al. 2008). Nontargeted metabolomics is a nonbiased quantitative analysis of all—or a large number of—metabolites found in a biological sample (Issaq et al. 2008). By contrast, targeted metabolomics analyzes a specific group of metabolites (Issaq et al. 2008; Verpoorte et al. 2008).

The complementary role of metabolomics with respect to other omic techniques may provide a solution to many of the weak points encountered with other omic methods (Griffin and Bollard 2004; Bilello 2005; van Ravenzwaay et al. 2007). Indeed, despite the development of methods to detect changes in genomic, transcriptomic, and proteomic profiles, the key information needed to interpret these data meaningfully is usually not readily available (Ankley et al. 2006). Changes in gene expression and protein synthesis caused by exposure of an organism to an external stressor usually activate homeostatic controls and feedback mechanisms; these changes could be amplified at the metabolome level (Nicholson et al. 2008; Ankley et al. 2006; van Ravenzwaay et al. 2007). As a result, metabolomics could be considered a more sensitive and reliable indicator of external stress than other omic technologies (Nicholson et al. 2008; Ankley et al. 2006; van Ravenzwaay et al. 2007).

The popularity of metabolomics in many fields of scientific research, such as nutrition, medicine, clinical pharmacology, and toxicology, has grown considerably thanks to its ability to detect subtle molecular changes and to the comprehensive nature of metabolite measurements (Lin et al. 2006; Chen et al. 2008; Fialho et al. 2008). As a result, metabolomics is now considered a rapid and sensitive technique able to clarify relationships between metabolite levels and an external stressor, be it contaminant exposure, nutritional deficit, or disease (Miller 2007). For example, the metabolic profiles of cancer cells have been used to monitor and understand tumor progression (Griffiths et al. 2002; Griffiths and Stubbs 2003; Griffin and Shockcor 2004). Metabolomic investigations in non-model organisms will be particularly valuable for characterizing organism metabolic responses to anthropogenic (or manmade) stressors, such as pollutants and climate change, highlighting another important application: environmental metabolomics. This approach would use “sentinel” (or representative) organisms of a particular ecosystem to reveal the condition of the environment (Viant 2008b).

In recent decades, in order to understand the toxic effects of several kinds of xenobiotic residues (pharmaceuticals, pesticides, nanoparticles, heavy metals) in the environment and the related biological changes, various nontarget organisms of the aquatic food chain have been selected as bioindicators for their suitable characteristics. Invertebrates such as the bivalve molluscs Mytilus edulis and Mytilus galloprovincialis (Tuffnail et al. 2009; Fasulo et al. 2012; Bonnefille et al. 2018) and the crustaceans Daphnia magna and Gammarus pulex (Taylor et al. 2010; Gómez-Canela et al. 2016; Kovacevic et al. 2016; Nagato et al. 2016; Wagner et al. 2017), and vertebrates such as the fish Danio rerio and Oncorhynchus mykiss (Samuelsson et al. 2006; De Sotto et al. 2017), are valid tools for ecotoxicogenomic and metabolomic studies. These bioindicators are chosen because they are important components of the aquatic ecosystem, are easy to recognize, are sensitive to a wide range of stressors, are abundant and widely distributed, and are suitable for laboratory experiments. In addition, these aquatic nontarget organisms have been used by various environmental researchers (Brezovšek et al. 2014; Parrella et al. 2014, 2015; Isidori et al. 2016a; Kundi et al. 2016; Russo et al. 2018; Kovács et al. 2015, 2016; Gačić et al. 2014) in specific tests following international standard guidelines, not only to highlight physiological alterations (mortality, offspring reduction, inhibition of growth, malformations) but also to evaluate genotoxic, mutagenic, and teratogenic damage. Since there is no clear relationship between the biochemical mode of action of xenobiotics and defined endpoints such as mortality and reproduction, determining the metabolic profile of aquatic sentinel organisms may be of great scientific interest.
In fact, significant variations in the concentrations of amino acids, glucose, and other metabolites in cells, tissues, or biofluids (Kovacevic et al. 2016; Wagner et al. 2017) may occur in organisms after sub-lethal exposure to hazardous and pseudo-persistent contaminants such as pharmaceuticals. Indeed, among all contaminants, pharmaceuticals are frequently detected in aquatic ecosystems because of their high consumption and their poor removal by wastewater treatment plants (Negreira et al. 2014; Lenz et al. 2007; Isidori et al. 2016b).

Among pharmaceuticals, antineoplastic drugs are suspected to be hazardous for aquatic nontarget species (Parrella et al. 2014, 2015; Isidori et al. 2016a; Kundi et al. 2016; Russo et al. 2018). To the best of our knowledge, no studies in the literature address the overall metabolomic responses of different nontarget organisms exposed to antineoplastic drugs, and only a few studies (Table 18.1) have used metabolomics as a robust tool to evaluate the biological responses of aquatic sentinel species exposed to other pharmaceutical classes. On the other hand, in recent studies (Laith et al. 2017; Mumtaz et al. 2017; Ruiz-Torres et al. 2017), the metabolomic approach has been considered a good strategy to identify, select, and provide secondary metabolites from promising natural sources, such as plants and marine invertebrates and vertebrates, as new drugs for cancer therapy, rather than as a means of studying environmental biological effects.

Table 18.1 Summary of metabolomic responses to pharmaceuticals in aquatic organisms

In this chapter, we provide an overview of the experimental design, analytical techniques, and statistical methods used in environmental metabolomics as well as an overview of recent studies using aquatic nontarget organisms in metabolomics to demonstrate the potential of this technique to detect and understand the mechanisms of exposure to some pharmaceuticals.

2 Experimental Design and Analytical Methods Used in Environmental Metabolomics

The basic procedure used in an environmental metabolomics study is outlined in Fig. 18.1. A detailed outline of the entire metabolomic experimental scheme, from experimental design to data mining and biological interpretation, is given by Wolfender et al. (2013).

Fig. 18.1 Environmental metabolomics experiment workflow (from Simpson and McKelvie 2009; Lankadurai et al. 2013). NMR, nuclear magnetic resonance spectroscopy; MS, mass spectrometry; PCA, principal component analysis; PLS-regression, partial least-squares regression analysis. In some cases, a further step may be present between the design of the experiment and the sampling, namely the performance of the experiment itself, when it has to be carried out under controlled laboratory conditions

The first step in any metabolomics experiment is the experimental design, which, in the case of environmental metabolomics, involves the selection of a model organism or microorganism, a type of external stressor (e.g., exposure to contaminants, heat/cold, starvation, or disease), and the mode/route of exposure. A good experimental design mainly depends on the starting point and goals of the research. Focusing on the biological question underpinning the research is the most crucial step, as metabolomics usually takes into account a large number of samples (Brown et al. 2008). It is also essential to consider biological variability and to choose a suitable number of replicates from an early stage of the research, for organisms grown in either natural or controlled conditions.

When planning and carrying out the experiment, it is extremely important to consider that metabolism is highly dynamic and that changes occur at different timescales, depending on the organism and on the metabolic pathway considered. For example, some changes follow the developmental stage or phenology (Scognamiglio et al. 2014), while others follow the circadian clock (Eckel-Mahan et al. 2012); as a result, the former can only be observed on a longer timescale, while the latter produce cyclic changes in metabolite concentrations over the light/dark cycle. There are many other physiological reasons for metabolic alterations, which also depend on the organism (and on the biological medium) under consideration. This dynamic behavior therefore has to be borne in mind when planning the timing of treatment, sampling, and so on, so that the changes caused by the external perturbation can be detected and distinguished from those due to physiological changes in metabolism. Considering the two parameters mentioned above (developmental stage and circadian clock), for example, it is crucial to collect samples at the same time of day and at the same growth stage of the organism. Monitoring environmental variations (e.g., photoperiod, relative humidity, temperature) throughout the experiment is important as well.

Once the organism has been exposed to the external stressor, the biological medium to be studied is harvested; it may include blood, urine, or other biological fluids, and/or tissue/organ extracts (Simpson and McKelvie 2009). Quenching metabolism immediately after exposure is essential to minimize the influence of confounding variables in the analysis of the metabolic response (Lin et al. 2006). This is commonly done by flash freezing with liquid nitrogen and storing at −80 °C. However, to improve the precipitation and inactivation of soluble enzymes, acid treatment or extraction with cold mixtures of organic solvents such as methanol, ethanol, acetone, or acetonitrile might be used (Lin et al. 2006). When suitable, freeze drying of the samples is also good practice (Kim and Verpoorte 2010).

The next step is the choice and development of a suitable extraction method. Given the high variability and complexity of the matrices to be tested, sample extraction and preparation methods vary considerably depending on the matrix to be analyzed and on the analytical platform used (Kim and Verpoorte 2010). As no single extraction method can isolate every metabolite within a sample, a proper extraction procedure needs to be selected and tested, based on the goals of the experiment. In general, an aqueous buffer extraction is sufficient to obtain a polar metabolite profile, but a more rigorous extraction involving a mixture of polar and nonpolar solvents is required to extract both polar and nonpolar metabolites (Wu et al. 2008). Various extraction methods involving mixtures of organic solvents (methanol, chloroform, and acetonitrile) and water in varying ratios have been examined by Lin et al. (2006). The methanol/chloroform/water extraction method (final solvent ratio of 2/2/1.8, respectively), which was first described by Bligh and Dyer (1959), has been considered one of the most reproducible, with the highest recovery of both polar and nonpolar metabolites. Wu et al. (2008) then went a step further and examined three different strategies for adding the methanol, chloroform, and water to the tissue samples for extraction: (i) stepwise addition, the original Bligh and Dyer (1959) method of adding each solvent one by one; (ii) two-step addition, in which methanol and water are added in step one and chloroform and water are added in step two; (iii) all-in-one addition, in which all three solvents are added together. They concluded that the two-step addition was the best of the three, based on metabolite yield, extraction reproducibility, and sample throughput.
More recently, Liebeke and Bundy (2012) compared four different solvent systems for the extraction of metabolites from the tissue of the earthworm Lumbricus rubellus: (i) chloroform:methanol:water, 2:1:1 (CMW); (ii) 75% hot ethanol (hEt); (iii) acetonitrile:methanol:water, 2:2:1 (AMW); (iv) isopropanol:methanol:water, 2:5:2 (IMW). Extracts were analyzed using NMR, gas chromatography (GC)–MS, and Fourier transform ion cyclotron resonance (FT-ICR) MS. They determined that the AMW extraction gave the best results in terms of reproducibility and metabolite yield (Liebeke and Bundy 2012). It has furthermore been shown that, for plant material, a mixture of phosphate buffer and MeOH (1:1) is usually able to extract a large part of the metabolome (Kim and Verpoorte 2010).
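The solvent ratios quoted above translate directly into pipetting volumes. The following minimal Python sketch splits a chosen total extraction volume according to a ratio; the function name, the dictionary names, and the 5.8 mL total are illustrative assumptions, and the sketch ignores the water already present in the tissue, which in practice contributes to the water fraction.

```python
def solvent_volumes(total_ml, ratio):
    """Split a total extraction volume (mL) according to a solvent ratio."""
    parts = sum(ratio.values())
    return {name: total_ml * part / parts for name, part in ratio.items()}

# Final solvent ratios quoted in the text
BLIGH_DYER = {"methanol": 2.0, "chloroform": 2.0, "water": 1.8}
AMW = {"acetonitrile": 2.0, "methanol": 2.0, "water": 1.0}

if __name__ == "__main__":
    # Hypothetical total volume of 5.8 mL, chosen so the parts are round numbers
    for name, volume in solvent_volumes(5.8, BLIGH_DYER).items():
        print(f"{name}: {volume:.2f} mL")
```

The same function can be reused for any of the solvent systems compared by Liebeke and Bundy (2012) by passing the corresponding ratio.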

In addition, the presence in aqueous buffer extracts of proteins that bind the commonly used NMR internal standards 2,2-dimethyl-2-silapentane-5-sulfonate sodium salt (DSS) and sodium 3-trimethylsilyltetradeuteriopropionate (TSP) has led to large differences in the quantification of metabolite concentrations (Nowick et al. 2003). Consequently, the development of extraction methods that not only capture both polar and nonpolar metabolites but also include the precipitation of proteins is essential. Nevertheless, one of the main problems of metabolomics is still the lack of a standardized and reproducible extraction protocol, so a great effort should be made in this direction.

Once the extractions are completed, the samples need to be prepared for the analytical platform of choice. The available techniques, and the viewpoints on their advantages and disadvantages in metabolomics applications, have been extensively discussed in the literature (Table 18.2) (Scognamiglio et al. 2015). It is important to emphasize that the choice of the analytical platform and strategy heavily depends on the object of the research, and it is commonly acknowledged that the best results are achieved by combining different extractions and analytical measurements (Allwood et al. 2008; Kim and Verpoorte 2010). However, nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are widely considered the most powerful tools and are the ones described in more detail here. The use of NMR and MS in metabolomics is so widespread that in the past there was a common misconception that “metabonomics” dealt with NMR-derived metabolic profiling studies, while “metabolomics” dealt with MS-derived metabolic profiling studies (Robertson 2005).

Table 18.2 Reviews and recent environmental metabolomics papers with corresponding analytical techniques (highlighted box indicates the reviewed/used methodology)

In MS, the molecules are first ionized and fragmented, generating ions that are then separated according to their mass-to-charge ratio and finally registered by a detector. A number of ion sources and analyzers are available (Xiao et al. 2012). Direct injection MS enables the injection of a crude extract directly into an electrospray mass spectrometer, resulting in one spectrum per sample, but this method is not particularly quantitative and is not often used (Lin et al. 2006).

In metabolomics, MS is usually coupled to liquid chromatography (LC; Wu et al. 2005; Bajad and Shulaev 2007; Hughes et al. 2009), gas chromatography (GC; McKelvie et al. 2009; Flores-Valverde et al. 2010; Warren et al. 2012), or capillary electrophoresis (CE; Sato et al. 2004; Ramautar et al. 2012; Yamamotoya et al. 2012). These separation techniques resolve the complex sample mixtures so that they can be analyzed by MS, but this can make the overall analysis time consuming (Robertson 2005; Pan and Raftery 2007). Also, GC methods usually entail elaborate derivatization steps that are very lengthy and thus inconvenient for high-throughput analysis (Lin et al. 2006), besides introducing bias into the analysis because of the chemical reactions involved. However, when pure certified chemical standards are available, the combination of MS with chromatography is a useful approach for identification and quantification. Indeed, retention time can be considered an additional indication of metabolite identity, while the chemical standards are also used to set up proper external calibration curves for compound quantification (Allwood et al. 2008).
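As an illustration of the external calibration mentioned above, the following Python sketch fits a straight line to the responses of a series of standards by ordinary least squares and inverts it to estimate the concentration of a sample. It is a minimal sketch under the assumption of a linear detector response; the function names and the standard concentrations and peak areas are hypothetical values invented for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantify(peak_area, slope, intercept):
    """Invert the calibration line to obtain a concentration."""
    return (peak_area - intercept) / slope

# Hypothetical standards: concentration (µM) versus peak area (arbitrary units)
standard_conc = [1.0, 2.0, 5.0, 10.0]
standard_area = [105.0, 205.0, 505.0, 1005.0]

slope, intercept = fit_line(standard_conc, standard_area)
sample_conc = quantify(405.0, slope, intercept)
print(f"estimated concentration: {sample_conc:.2f} µM")
```

Samples whose response falls outside the range covered by the standards should not be quantified by extrapolating the line.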

Mass fragment databases are usually employed for a quick preliminary identification of compounds (Noctor et al. 2007; Pan and Raftery 2007). Only tandem MS (MS-MS), preferably on HR (MS)n instruments, allows compound identities to be defined structurally. This usually excludes the countless metabolites that remain unknown to date, which are reported as “unknowns” or as putatively identified metabolites. In this case, isolation and characterization by NMR is fundamental for definitive structural elucidation (Sumner et al. 2007).

The newest LC-MS/MS approaches are a useful tool for metabolite identification and quantification (Xiao et al. 2012), although it is always recommended to follow published guidelines and to refer to minimum reporting standards for the level of confidence of chemical structural elucidation (Fernie et al. 2011; Goodacre et al. 2007; Sumner et al. 2007).

The biggest advantage of MS-based techniques is sensitivity (typically picogram level), making the technique very suitable in studies that are targeting novel biomarkers (Robertson 2005; Pan and Raftery 2007) but, on the other hand, the identification of unknown metabolites is problematic. FT-ICR MS in particular provides high resolution and mass accuracy, but the instrument is costly and is thus not widely used (Pan and Raftery 2007).

The principal downside of MS-based approaches is their difficult standardization, as a consequence of a number of combinations of chromatographic systems (the separation step also includes bias in the analysis), ion sources, and different analyzers that highly impact the analysis output.

Beyond the potentially long and not always reproducible chromatographic separations, the difficult standardization, and the structural elucidation power limitation, other drawbacks of MS include: the presence of matrix effects, the destructive nature, its selectivity for certain analytes, ion suppression causing extensive variations of signal intensities, and the lack of more robust methods for chromatographic separations (Robertson 2005; Pan and Raftery 2007).

Often, due to its selectivity, MS has been used in targeted metabolomics studies (Edwards et al. 2006; Lutz et al. 2006; Issaq et al. 2008; Southam et al. 2011; García-Cañaveras et al. 2012) or in the elucidation or confirmation of metabolites first observed by NMR (Bundy et al. 2002a; Crockford et al. 2006).

On the other hand, NMR is nondestructive and nonselective, offers cross-laboratory reproducibility, and lacks sample bias (Robertson 2005; Pan and Raftery 2007; Viant et al. 2009). As a result, NMR has been used extensively in nontargeted or comprehensive studies of all or most of the metabolites in a biological sample (Verpoorte et al. 2008; Simpson and McKelvie 2009; Whitfield Åslund et al. 2011a, b; Li et al. 2012; Ritota et al. 2012).

NMR is an instrumental analytical technique that provides detailed information on the structure of molecules by observing the behavior of atomic nuclei in a magnetic field. The frequency at which each nucleus resonates depends on its chemical environment, so each compound has a highly specific, information-rich spectrum. All of the compounds present in a mixture can be identified rapidly by combining one-dimensional (1D) and two-dimensional (2D) NMR techniques. Recent advances in the identification of unknown compounds in mixtures allow important structural information to be obtained by NMR without further purification (Forseth and Schroeder 2011). Indeed, one of the main strengths of this technique is the unique set of structural information it provides, which in most cases guarantees the definitive structural elucidation of the compounds, sometimes including stereochemistry.

Besides its power in structural elucidation, 1H NMR is commonly used in metabolomics thanks to several other advantages: easy sample preparation, ease of standardization, and high reproducibility, with the solvent used and the magnetic field strength being the only variables (Verpoorte et al. 2007). Indeed, in NMR-based metabolomics the only bias is introduced by the choice of solvent (Allwood et al. 2008); reproducibility is very good, allowing the comprehensive identification and quantification of a large number of compounds within short analysis times (including the extraction procedures). Given the need, mentioned above, for a standardized metabolome extraction procedure, the minimal sample preparation required makes NMR-based metabolomics particularly convenient. In fact, extraction can be carried out directly in deuterated solvents, often with a mixture of phosphate buffer in D2O and MeOD (1:1). Furthermore, NMR is fully quantitative (Kim et al. 2011; Wishart 2008). The precision and accuracy of a 500 MHz NMR spectrometer in a quantitative 1H NMR analysis of external standards have been shown to be around 1% (Burton et al. 2005).
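The quantitative character of 1H NMR rests on the proportionality between a peak integral and the number of protons that generate it. A minimal Python sketch of quantification against an internal standard such as TSP is given below; the relation used is the commonly applied one and is not taken from the works cited here, the function name and the numerical values are hypothetical, and fully relaxed, non-overlapping signals are assumed.

```python
def qnmr_concentration(integral_analyte, protons_analyte,
                       integral_standard, protons_standard,
                       conc_standard):
    """Analyte concentration from integrals relative to an internal standard.

    Each integral is divided by the number of protons giving rise to the
    peak, so the ratio of the normalized integrals equals the molar ratio.
    """
    per_proton_analyte = integral_analyte / protons_analyte
    per_proton_standard = integral_standard / protons_standard
    return conc_standard * per_proton_analyte / per_proton_standard

# Hypothetical example: TSP singlet (9 equivalent methyl protons) at 0.5 mM,
# and an analyte methyl signal (3 protons) with two-thirds of the TSP integral.
conc = qnmr_concentration(6.0, 3, 9.0, 9, 0.5)
print(f"analyte concentration: {conc:.2f} mM")
```

If the internal standard binds to proteins in the extract, as discussed earlier for DSS and TSP, its integral is underestimated and the computed concentrations are biased accordingly.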

The main disadvantage of NMR compared to MS is its low sensitivity. This is an issue in the analysis of endogenous metabolites and in particular in the detection of novel biomarkers, since these metabolites are usually present at very low levels that cannot be reached with NMR. The sensitivity of 1H NMR also depends on the number of protons in the molecule and on the structure and size of the molecule. Nowadays, the problem of low sensitivity can be mitigated using ultrahigh magnetic field strength NMR spectrometers and probes that are cryogenically cooled to 4.5 K; these probes may give a fourfold increase in sensitivity (Logan et al. 1999; Griffin 2003; Pan and Raftery 2007; Grimes and O’Connell 2011), reaching the stage at which structures can be solved using very small quantities (in the microgram range) (Harvey et al. 2015). Microcoil probes offer the additional advantage of lower sample mass requirements, which is a big benefit for small organisms (Lacey et al. 1999; Pan and Raftery 2007; Grimes and O’Connell 2011; Poynton et al. 2011). The only disadvantage of these methods is the cost of such high-end instrumentation, and although sensitivity has been drastically increased, NMR remains less sensitive than mass spectrometry (Forseth and Schroeder 2011; Kim et al. 2011). Nonetheless, unlike that of MS, the sensitivity of NMR is independent of metabolite pKa and hydrophobicity, making it a very adaptable choice for representative analyses (Pan and Raftery 2007). An additional downside of NMR is that some classes of lipids can only be identified as total groups and not as individual compounds by means of 1D NMR.

A large part of environmental metabolomics studies still use NMR because of the comprehensive nature of nontargeted metabolomics and the ability to generate hypotheses involving complex environmental stressors for which there are no known modes of action (Bundy et al. 2001; Tjeerdema 2008; Viant 2008a; Viant et al. 2009; Brown et al. 2009; Simpson and McKelvie 2009). For all these reasons, NMR is an ideal environmental metabolomics discovery tool. It should be noted that once metabolites of interest are discovered using NMR, targeted MS-based methods can be subsequently developed for the routine monitoring of these metabolites.

The majority of NMR-based metabolomics studies still use one-dimensional (1D) 1H NMR experiments (Bundy et al. 2002b, c; Samuelsson et al. 2006; Brown et al. 2010; McKelvie et al. 2010, 2011). 1D 1H NMR experiments are advantageous for metabolomic studies, which usually have hundreds of samples, because of their short acquisition times (10–15 min per sample), allowing for high-throughput analyses and a high number of sample replicates (Pan and Raftery 2007; Yuk et al. 2010).

Analyzing aqueous samples using 1H NMR requires the application of water suppression techniques (Nicholson and Wilson 1989; McKay 2009). The water concentration in samples (about 50 mol/L) is far higher than the millimolar metabolite concentrations, so the H2O signal saturates the NMR receiver and suppresses the signal intensities of the peaks of other compounds (Bothwell and Griffin 2011). Although deuterated solvents are mostly used (deuterium resonates at a different frequency than 1H), there is always residual H2O present in samples. The best and most widely used water suppression methods for metabolomics are presaturation, nuclear Overhauser effect spectroscopy (NOESY) presaturation, and presaturation utilizing relaxation gradients and echoes (PURGE; Bundy et al. 2002a; Viant et al. 2003; Simpson and Brown 2005; Wishart 2008; McKelvie et al. 2010; Poynton et al. 2011). Several studies have used PURGE water suppression for NMR-based metabolomic analyses (McKelvie et al. 2011, 2013; Yuk et al. 2011, 2013). Among the water suppression techniques, PURGE provided superior water suppression with the least parameter optimization and the fewest spectral regions that need to be excluded because of variations in the suppression of the solvent peak. Furthermore, in comparisons of various 1D and 2D NMR techniques, PURGE 1H NMR has proved to be the most rapid, informative, and economical method for analyzing aqueous metabolomics samples (McKay 2009; Yuk et al. 2010).

The complexity of any biological sample, due to the large number of molecules it contains, results in a large number of peaks within the narrow chemical shift range of a 1H NMR spectrum (0–14 ppm). This makes it difficult to identify compounds present at low concentrations, because peaks from different metabolites frequently overlap; typically, some peaks are masked by larger peaks from compounds present at higher concentrations (Pan and Raftery 2007). Some of the best techniques to alleviate spectral overlap and improve resolution between peaks are Carr–Purcell–Meiboom–Gill (CPMG), J-resolved spectroscopy (J-RES), and various other 2D NMR techniques.

Carr–Purcell–Meiboom–Gill (CPMG) is used to remove broad resonances associated with molecules of high molecular mass or molecules with constrained motion and thereby offers better resolution of low molecular mass metabolites (Weckwerth 2007; Wishart 2008).

J-resolved spectroscopy (J-RES) projection may improve spectral resolution. J-RES is a two-dimensional (2D) NMR technique in which the chemical shift information lies on one axis and the spin–spin coupling information on the other. Projecting the spectrum onto the chemical shift axis yields the equivalent of a 1D proton-decoupled spectrum, which has less spectral overlap and enables better detection of specific metabolites (Viant et al. 2003; Pan and Raftery 2007; Yuk et al. 2010), although it also results in the loss of some information.
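The projection step can be pictured with a small Python sketch. It assumes a J-RES matrix that has already been processed so that the components of each multiplet are aligned at a single chemical shift (the tilting step of standard J-RES processing), and it uses a made-up 5 x 6 intensity matrix; the function names are illustrative. Two common choices are shown, a maximum ("skyline") projection and a sum projection.

```python
def skyline_projection(spectrum_2d):
    """Project a 2D J-RES matrix onto the chemical shift axis.

    spectrum_2d[j][k] is the intensity at J-coupling row j and chemical
    shift column k. For every chemical shift, the maximum intensity found
    along the J axis is kept.
    """
    return [max(column) for column in zip(*spectrum_2d)]

def sum_projection(spectrum_2d):
    """Alternative projection: sum the intensities along the J axis."""
    return [sum(column) for column in zip(*spectrum_2d)]

# Hypothetical, already tilted J-RES matrix: 5 J rows x 6 chemical shift columns.
# A doublet sits in column 2 (rows 1 and 3) and a singlet in column 4 (row 2).
jres = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 5, 0, 0, 0],
    [0, 0, 0, 0, 8, 0],
    [0, 0, 5, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(skyline_projection(jres))  # the doublet collapses to a single peak
print(sum_projection(jres))
```

In both projections the doublet appears as one peak at its chemical shift, which is why the projected spectrum resembles a proton-decoupled 1D spectrum; the coupling information carried by the J axis is the information that is lost.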

Other 2D NMR techniques have also been used to increase spectral resolution because they have an additional dimension into which the signals can be dispersed. Besides J-RES, some of the other common 2D NMR techniques in metabolomics are 1H correlation spectroscopy (COSY) and 1H-13C heteronuclear single quantum coherence (HSQC) spectroscopy (Xi et al. 2008; Ludwig and Viant 2010; Yuk et al. 2010; Flores-Sanchez et al. 2012). The main benefit of using 2D NMR techniques, such as HSQC, is that the 13C axis has a large chemical shift range (200 ppm), which allows for greater spectral dispersion and enhanced resolution (Xi et al. 2008; Chylla et al. 2011; Hu et al. 2011a, b). However, a drawback of most 2D NMR techniques is the lower sensitivity, which results in longer acquisition times—sometimes many times more than 1D experiments (Jacobsen 2007). In fact, 2D experiments such as HSQC may require long acquisition times for adequate signal-to-noise (S/N) ratios. For this reason, the use of 2D NMR techniques in metabolomic studies is limited to complementing compound identification from 1D 1H NMR experiments (Yuk et al. 2010).

After the samples are analyzed, the data are processed and statistical analyses are performed using multivariate and univariate analyses. These are done in conjunction with the quantification and identification of the metabolites.

The final step then involves biological interpretation of the data to make a connection between the external stressor and the metabolic response of the organism.

3 Statistical Methods Used in Metabolomics

The interest in metabolomics is due to its ability to generate large volumes of data in a high-throughput way, so one of the biggest challenges is to find a way to visually analyze all of the collected data (i.e., NMR or MS data) and identify differences between samples in a reasonable time frame (Robertson 2005). Both multivariate statistical and pattern recognition methods are employed to facilitate the analysis of metabolomics data sets and to obtain meaningful relationships between the external stressor and the metabolic response (Trygg et al. 2007; Coen et al. 2008). Pattern recognition methods are able to reduce the dimensionality of metabolomics data from hundreds of variables to two or three components that are orthogonal to each other (Trygg et al. 2007).

Principal component analysis (PCA) is an unsupervised method, meaning that the model is not provided with any prior information concerning the identity of the samples (Holmes and Antti 2002), and is the most widely used multivariate statistical approach in metabolomics (Bundy et al. 2002a, 2004; Trygg et al. 2007; Wishart 2008; Simpson and McKelvie 2009). The grouping of the samples in a PCA scores plot is based on the similarities in their metabolic profiles. PCA summarizes the overall variability in a data set by means of a set of uncorrelated variables called principal components (PCs), which are linear combinations of the original variables (Trygg et al. 2007). The first PC (PC1) explains the largest share of the variation in the data; PC2, which is orthogonal to PC1, explains the second largest share, and so on. PCA allows for dimensional reduction of the data into a low-dimensional plane, such as PC1 versus PC2. The scores plot (e.g., PC1 vs. PC2) allows for a visual examination of the relationship between the samples based on their metabolic profiles.
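The construction of the principal components can be sketched in a few lines of Python. The example below uses only the standard library so that it is self-contained (in practice a numerical library would normally be used): it mean-centers a small, invented matrix of metabolite intensities, extracts the leading eigenvectors of the covariance matrix by power iteration with deflation, and projects the samples onto them to obtain the scores. All function names and data values are assumptions made for illustration.

```python
import math
import random

def mean_center(X):
    """Subtract the column means from every row."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]

def covariance(Xc):
    """Sample covariance matrix of mean-centered data."""
    n, p = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_eigenpair(C, iterations=1000, seed=0):
    """Largest eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    p = len(C)
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * C[i][j] * v[j] for i in range(p) for j in range(p))
    return eigenvalue, v

def pca(X, n_components=2):
    """Return scores, loadings, and explained variance ratios."""
    Xc = mean_center(X)
    C = covariance(Xc)
    p = len(C)
    total_variance = sum(C[i][i] for i in range(p))
    loadings, explained = [], []
    for _ in range(n_components):
        eigenvalue, v = leading_eigenpair(C)
        loadings.append(v)
        explained.append(eigenvalue / total_variance)
        # Deflation: remove the component just found before extracting the next
        C = [[C[i][j] - eigenvalue * v[i] * v[j] for j in range(p)]
             for i in range(p)]
    scores = [[sum(x * l for x, l in zip(row, load)) for load in loadings]
              for row in Xc]
    return scores, loadings, explained

# Hypothetical intensities of three metabolites: rows 0-2 controls, rows 3-5 exposed
X = [
    [1.0, 2.0, 0.5], [1.1, 2.1, 0.4], [0.9, 1.9, 0.6],
    [2.0, 1.0, 0.5], [2.1, 0.9, 0.6], [1.9, 1.1, 0.4],
]

scores, loadings, explained = pca(X)
for label, (pc1, pc2) in zip(["ctrl"] * 3 + ["exposed"] * 3, scores):
    print(f"{label:8s} PC1={pc1:+.3f} PC2={pc2:+.3f}")
```

Plotting the two score columns against each other gives the scores plot described above; in this invented data set the two groups separate along PC1 without the model having been told which sample belongs to which group.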

Partial least squares (PLS) regression analysis and PLS discriminant analysis (PLS-DA) are also often used as multivariate statistical tools in metabolomics (Barker and Rayens 2003; van Ravenzwaay et al. 2007; Ekman et al. 2008; Jones et al. 2008; Whitfield Åslund et al. 2011b). PLS regression and PLS-DA are supervised methods: the classification of the samples as either control or experimental is known to the model. Predictive models are built by adding predefined variables that maximize the separation between the sample classes. In PLS regression, these variables are measurable quantities such as the contaminant exposure concentration. In PLS-DA, they are dummy variables: for example, all the controls can be assigned a value of zero and the experimental group a value of one to distinguish the sample classes (Trygg et al. 2007).
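The dummy-variable coding used in PLS-DA can be sketched as follows. This is an illustrative example only: it assumes NumPy, synthetic data, a single response fitted with the NIPALS PLS1 algorithm (one common way to fit PLS, not necessarily the one used in the cited studies), and a 0.5 cutoff for class assignment. The function names are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components=2):
    """Fit a single-response PLS model with the NIPALS algorithm."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean                  # centered residual matrices
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)                       # weight vector
        t = Xr @ w                                   # scores for this component
        tt = t @ t
        p = Xr.T @ t / tt                            # X loadings
        c = (yr @ t) / tt                            # y loading
        Xr = Xr - np.outer(t, p)                     # deflate X
        yr = yr - c * t                              # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)              # regression coefficients
    return x_mean, y_mean, B

def pls1_predict(model, X):
    x_mean, y_mean, B = model
    return (X - x_mean) @ B + y_mean

# Synthetic example (assumption): 6 controls coded 0 and 6 exposed samples coded 1
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 50))
X[6:, :5] += 3.0
y = np.repeat([0.0, 1.0], 6)                         # dummy class variable

model = pls1_fit(X, y)
y_hat = pls1_predict(model, X)
predicted_class = (y_hat > 0.5).astype(float)        # 0.5 cutoff is an assumption
print("Training accuracy:", np.mean(predicted_class == y))
```

Replacing the 0/1 dummy vector with a measured quantity such as the exposure concentration turns the same fit into a PLS regression. Accuracy on the training samples says little about predictive ability, which is why cross-validation is needed.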

To reduce the number of model components and make models easier to interpret and more relevant, supervised methods such as orthogonal projections to latent structures (OPLS) and OPLS-DA have been increasingly used in metabolomics studies (Trygg and Wold 2002). These are essentially extensions of PLS and PLS-DA in which the variation orthogonal to the predefined variables is removed from the model (Trygg and Wold 2002; Bylesjö et al. 2006); this orthogonal variation can also be analyzed in its own right to identify the sources of the uncorrelated variation.

Cross-validation methods such as leave-one-out cross-validation (LOOCV) are required to evaluate the robustness of supervised methods such as PLS regression, PLS-DA, OPLS, and OPLS-DA (Westerhuis et al. 2008; Varmuza and Filzmoser 2009; Whitfield Åslund et al. 2011b). LOOCV is performed by first removing one sample from the original data set (the test set); the model (PLS/PLS-DA or OPLS/OPLS-DA) is then built on the remaining samples (the training set). This process is repeated until every sample has been left out of the model once. Each training-set model is then used to predict its left-out test sample. The Q2Y, which is known as the goodness of prediction (Westerhuis et al. 2008), represents the ability of the model to predict the test set. This value can be used to evaluate the robustness of a model: typically, a Q2Y > 0.4 is considered to indicate a strong model (Jones et al. 2008; Westerhuis et al. 2008). The significance of PLS/OPLS models also needs to be evaluated, and this can be done using permutation testing (Eriksson et al. 2006; Alam et al. 2010; Whitfield Åslund et al. 2011b). Permutation testing consists of keeping the data set constant while randomly permuting the order of the predefined variables a set number of times. For each permutation, a new PLS/OPLS model is fitted and its Q2Y is calculated, providing a reference distribution of the Q2Y statistic. The significance of the original PLS/OPLS model, and the confidence in its validity, is increased if its Q2Y value is higher than the values obtained for all the PLS/OPLS models built during the permutation tests (Eriksson et al. 2006).
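The LOOCV and permutation procedures can be sketched together. The example below is a simplified illustration under stated assumptions: NumPy, synthetic data, a NIPALS PLS1 fit, Q2Y computed as 1 − PRESS/SS (a common definition, where PRESS is the sum of squared prediction errors for the left-out samples), and an empirical p-value with a +1 correction. The function names and the number of permutations are hypothetical choices, not values from the cited studies.

```python
import numpy as np

def pls1_fit(X, y, n_components=2):
    """Single-response PLS (NIPALS); returns means and regression coefficients."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        c = (yr @ t) / tt
        Xr = Xr - np.outer(t, p)
        yr = yr - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return x_mean, y_mean, W @ np.linalg.solve(P.T @ W, q)

def pls1_predict(model, X):
    x_mean, y_mean, B = model
    return (X - x_mean) @ B + y_mean

def q2y_loocv(X, y, n_components=2):
    """Q2Y = 1 - PRESS / SS, leaving each sample out exactly once."""
    press = 0.0
    for i in range(len(y)):
        train = np.arange(len(y)) != i               # all samples except sample i
        model = pls1_fit(X[train], y[train], n_components)
        y_hat = pls1_predict(model, X[i:i + 1])[0]   # predict the left-out sample
        press += (y[i] - y_hat) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def permutation_test(X, y, n_permutations=100, n_components=2, seed=0):
    """Compare the observed Q2Y with Q2Y values from shuffled class labels."""
    rng = np.random.default_rng(seed)
    q2_obs = q2y_loocv(X, y, n_components)
    q2_perm = np.array([q2y_loocv(X, rng.permutation(y), n_components)
                        for _ in range(n_permutations)])
    p_value = (np.sum(q2_perm >= q2_obs) + 1) / (n_permutations + 1)
    return q2_obs, q2_perm, p_value

# Synthetic example (assumption): 6 controls (0) and 6 exposed samples (1)
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 50))
X[6:, :5] += 3.0
y = np.repeat([0.0, 1.0], 6)

q2_obs, q2_perm, p_value = permutation_test(X, y)
print("Observed Q2Y:", round(q2_obs, 3))
print("Largest permuted Q2Y:", round(q2_perm.max(), 3))
print("Empirical p-value:", round(p_value, 3))
```

Only the class labels are shuffled; the data matrix stays fixed, so the permuted Q2Y values show what prediction quality can arise by chance for this particular data set.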

Although metabolomics studies mostly use multivariate statistics, complementary univariate statistical analyses are also performed to further increase the amount of information obtained from the research. t tests are commonly used to assess the significance of the separation between the controls and stressed organisms in PCA and PLS-DA scores plots, and to determine which metabolites in the 1H NMR spectra of the treatment class increased or decreased significantly relative to the controls. A t test filtered difference 1H NMR spectrum can also be created by subtracting the buckets of the average control spectrum from those of each average exposure class spectrum. Bucket values (metabolite peaks) that are not statistically significant at the 0.05 level can be replaced with zero, resulting in a t test filtered 1H NMR difference spectrum (Ekman et al. 2008, 2009). t test filtered difference 1H NMR spectra and loadings plots can be used together to determine which metabolites are potential biomarkers of exposure to a particular contaminant.
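The construction of a t test filtered difference spectrum can be sketched as below. This is a minimal illustration assuming NumPy and SciPy, synthetic bucket intensities, and Welch's unequal-variance t test applied bucket by bucket; the source does not specify which t test variant was used, and the function name is hypothetical.

```python
import numpy as np
from scipy import stats

def ttest_filtered_difference(controls, exposed, alpha=0.05):
    """Difference spectrum (exposed mean - control mean) with non-significant buckets zeroed."""
    diff = exposed.mean(axis=0) - controls.mean(axis=0)
    # Welch's t test per bucket (assumption: unequal variances between classes)
    _, p = stats.ttest_ind(exposed, controls, axis=0, equal_var=False)
    filtered = np.where(p < alpha, diff, 0.0)        # keep only significant buckets
    return filtered, p

# Synthetic example (assumption): 8 spectra per class, 20 buckets each
rng = np.random.default_rng(1)
controls = rng.normal(size=(8, 20))
exposed = rng.normal(size=(8, 20))
exposed[:, :3] += 5.0                                # three buckets raised by exposure

filtered, p = ttest_filtered_difference(controls, exposed)
print("Buckets retained:", np.flatnonzero(filtered))
```

Positive values in `filtered` correspond to buckets that increased in the exposure class and negative values to buckets that decreased. With many buckets tested at once, some retained buckets may be false positives, so a multiple-testing correction may be worth considering.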

4 Metabolomic Responses Observed in Aquatic Nontarget Organisms Exposed to Pharmaceuticals

Despite the immense potential of metabolomic research for assessing environmental pollutants, only a small number of studies have been conducted to date to evaluate the metabolomic responses of aquatic nontarget organisms exposed to pharmaceuticals. Bonefille et al. (2018) evaluated the effects of the nonselective, nonsteroidal anti-inflammatory drug diclofenac on the marine mussel Mytilus galloprovincialis, chosen for its ease of handling, its capacity to accumulate toxins, and its sedentary nature. In this mussel, the authors studied the metabolomic perturbations caused by 100 μg/L diclofenac, a concentration that does not affect the organisms' viability. The metabolomic analysis was performed by liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS) on extracts of the digestive gland, and alterations in tyrosine and tryptophan metabolism were observed at a concentration only a few orders of magnitude higher than those found in seawater (1 μg/L; Gaw et al. 2014). In particular, after a 7-day exposure, tyrosine pathways were down-modulated, while steroid hormone biosynthesis and tryptophan pathways were positively modulated.

In addition to mussels, other nontarget invertebrates such as crustaceans have proved to be suitable tools for metabolomic analysis. Taylor et al. (2010), Kovacevic et al. (2016), and Wagner et al. (2017) studied metabolomic responses in the cladoceran crustacean Daphnia magna after exposure to various pharmaceuticals. In particular, Taylor et al. (2010) explored the metabolic changes in D. magna after a 24 h exposure to 1.4 mg/L of the nonselective β-adrenergic receptor blocker propranolol by direct infusion Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS). High-quality metabolite profiles were detected both in hemolymph and in whole-organism extracts from 14-day-old daphnids, and metabolic perturbations were found in multiple fatty acid and oxylipid metabolites. Wagner et al. (2017) performed a similar study, testing 0.67 mg/L of propranolol in both neonates (<24 h old) and adult daphnids (18 days old) by proton nuclear magnetic resonance spectroscopy (1H-NMR), and observed in both populations an increase in amino acid metabolites and a reduction in glucose levels compared to the controls. Furthermore, Kovacevic et al. (2016) studied the metabolic profile of the same freshwater crustacean after a 48 h exposure to sublethal concentrations of three different pharmaceuticals: triclosan (μg/L), carbamazepine, and ibuprofen (mg/L). Triclosan impedes bacterial growth by inhibiting enzymes involved in fatty acid synthesis; carbamazepine is a sodium channel blocker used for the treatment of epilepsy and, for its effects on serotonin systems, neuropathic pain; and ibuprofen is a nonsteroidal anti-inflammatory drug that inhibits the cyclooxygenase enzymes involved in prostaglandin synthesis.
Kovacevic and coauthors analyzed adult daphnid metabolites by 1H-NMR and observed alterations in amino acid content and in glucose levels. The daphnids' metabolic responses showed a concentration-dependent relationship with the drug exposure concentrations, reflecting changes at the organ, organismal, and population levels. In light of the foregoing, the freshwater consumer D. magna seems to be a very sensitive aquatic bioindicator for evaluating metabolomic responses to many environmental pollutants, particularly given the advances in analytical metabolomic techniques.

Other nontarget sentinel species, belonging to a higher level of the aquatic food chain, are marine and freshwater fish. These were among the earliest organisms used in environmental metabolomics because their biochemical responses to pharmaceuticals are similar to those of humans. Samuelsson et al. (2006), using 1H-NMR, studied the effects of the synthetic contraceptive estrogen ethinylestradiol on the rainbow trout Oncorhynchus mykiss (11 months old) and observed alterations in the plasma metabolite profile at 10 ng/L, with a strong induction of the synthesis of the lipoprotein vitellogenin. Furthermore, Mills et al. (2016) explored physiological responses to the endocrine-active pharmaceutical tamoxifen in adult Tautogolabrus adspersus using gas chromatography coupled to time-of-flight mass spectrometry (GC/ToF-MS). Mills et al. observed high levels of proline, threonine, alanine, lysine, tyrosine, and tryptophan, and found down-regulated metabolites involved in amino acid synthesis and metabolism, phospholipid synthesis, glucuronidation, and glycolysis, indicating that T. adspersus could be a sensitive nontarget organism for studying metabolomic perturbations after exposure to pharmaceuticals. As reported in the scientific literature, fish have been used not only to observe metabolomic responses to estrogen-like molecules, but also to study metabolomic alterations caused by exposure to antibiotics. De Sotto et al. (2017) studied the environmental effects of 0.1 mg/L of clarithromycin, florfenicol, and sulfamethazine, individually and in mixtures, on adult Danio rerio after a 72 h exposure, using high-performance liquid chromatography with quadrupole time-of-flight (QTOF) mass spectrometry.
When clarithromycin and florfenicol were tested individually, they yielded more metabolites than sulfamethazine did, and the most affected pathway was purine metabolism, especially that of guanosine, which is involved in protecting neurons against excitotoxic damage. The similarity between clarithromycin and florfenicol could be explained by their shared mode of action: both antibiotics inhibit protein biosynthesis by interacting with the 50S ribosomal subunit in nontarget organisms. Surprisingly, when De Sotto et al. (2017) tested the antibiotics in mixtures, only a small number of metabolites was observed, probably because of antagonistic interactions. In line with the scientific literature considered here, Danio rerio is a good model in environmental metabolomics for identifying the effects of pharmaceuticals, owing to the similarity of its metabolism to that of humans and its ready absorption of small molecules through the skin and gills.

In conclusion, scientific interest is constantly increasing in the wide field of environmental metabolomics, a very useful approach for understanding the impact of various environmental xenobiotics on nontarget organisms through different analytical platforms. In recent years, it has been applied to evaluate metabolic changes in different aquatic organisms of the trophic chain after pharmaceutical exposure, although only a few scientific papers are available to date; to the best of our knowledge, no studies on anticancer drugs exist at the moment. Since the consumption and administration of this class of pseudo-persistent pharmaceuticals are increasing, as is their occurrence in aquatic systems, it would be advisable to use metabolomic strategies to understand the environmental toxic effects of anticancer drugs.