1 Introduction

Focusing on the systematic analysis of cellular behaviour at the molecular level, metabolomics has emerged as a powerful tool capable of screening a large number of metabolites in biological samples and providing valuable physiological information on numerous biological systems (Baker 2011). Since metabolite levels depend on the properties and activity of enzymes, metabolomic data reflect the regulation of transcription and translation processes, the regulation of protein–protein interactions, and the allosteric regulation of enzymes and their interactions with metabolites (Villas-Bôas et al. 2005b). Metabolomics data are therefore considered the real molecular phenotype of a cell. Consequently, metabolomics tools are gaining tremendous popularity in different fields of the life sciences, from the discovery of biomarkers (Liu et al. 2010) and the diagnosis and biochemical characterization of diseases (Nishiumi et al. 2010; Asiago et al. 2010) to the phenotyping of microbial (Mas et al. 2007; Villas-Bôas et al. 2008) and plant (Hirai et al. 2010) mutants and the characterization of enzyme activity (Baker 2011; Goldstone et al. 2009).

However, there is consensus among researchers today that no single platform or method can analyze the whole metabolome of a cell, mainly because of the wide dynamic range of metabolites in biological samples coupled with their large chemical diversity. Whilst one analytical platform will usually suffice for the analysis of proteins (proteome) and mRNAs (transcriptome), the metabolomics community relies on a suite of sample preparation and detection techniques (Baker 2011). In recent years, we have observed remarkable advances in detection technologies for the analysis of metabolites. Several hyphenated technologies coupling separation (e.g. chromatography or electrophoresis) with powerful detectors (e.g. mass spectrometry and nuclear magnetic resonance) have evolved that allow us to detect and often identify dozens to hundreds of compounds in a single analysis (Dunn et al. 2011). In addition, novel approaches using mass tags or isotope-labelling techniques also permit us to quantify a large number of the metabolites present in a biological sample (Bennett et al. 2008; Dunn et al. 2011).

Unfortunately, improvements in analytical techniques have not been matched by similar improvements in methods for sample preparation. Instead, different research groups use different methods for sampling, quenching of metabolism and extraction of metabolites. The choice of methods seems to be guided by the classes of metabolites targeted by the study as well as by the biological material under study. Nonetheless, some important works have addressed the optimization of sampling and quenching (Villas-Bôas et al. 2005a; Faijes et al. 2007; Villas-Bôas and Bruheim 2007; Winder et al. 2008) as well as the extraction of intracellular metabolites (Villas-Bôas et al. 2005a; Faijes et al. 2007; Winder et al. 2008; Canelas et al. 2009; van Gulik 2010; Shin et al. 2010; Dietmair et al. 2010; Gromova and Roby 2010; El Rammouz et al. 2010). Quenching of cell metabolism during sampling for metabolome analysis appears to be widely practised nowadays, which was not the case in the early days of metabolomics, and several works have reported the different degrees of efficiency of different methods for extraction of intracellular metabolites (Maharjan and Ferenci 2003; Villas-Bôas et al. 2005a; Faijes et al. 2007; Rabinowitz and Kimball 2007; Winder et al. 2008; Canelas et al. 2009; Shin et al. 2010; Dietmair et al. 2010; Gromova and Roby 2010; El Rammouz et al. 2010, among others). However, these studies mainly focused on a single cell type and aimed at identifying which protocol extracts the largest number of metabolites with minimal degradation products.

The extraction of intracellular metabolites is a fundamental step in most metabolomics studies involving cells and tissues (except for the analysis of the exometabolome (Kell et al. 2005)). To achieve an efficient extraction of intracellular metabolites, the cell wall (when present) and membrane need to be permeabilised, and the metabolites are then extracted by an extracting agent (Villas-Bôas 2007). The extracting agent is usually one or more organic solvents or a combination of organic solvent with water or buffer (Villas-Bôas 2007). Pure boiling water has also been proposed as an extracting agent for intracellular metabolites (Canelas et al. 2009). Polar solvents such as methanol, water, methanol–water mixtures, or ethanol extract mostly polar metabolites, and non-polar solvents such as chloroform, ethyl acetate, or hexane extract lipophilic components (Villas-Bôas 2007). Boiling solvents and acid and alkaline solutions are also used to extract intracellular metabolites from cells (Canelas et al. 2009; Villas-Bôas 2007). Some sophisticated methods even involve microwave heating, ultrasonic vibration or supercritical fluids (Villas-Bôas 2007). Nonetheless, due to the high chemical diversity and the wide dynamic range of intracellular metabolites, there is no single extraction method capable of completely extracting the whole cell metabolome. Therefore, a good extraction method should be able to extract the maximum number of metabolites in their original state (i.e. avoiding chemical degradation), quantity (i.e. avoiding losses) and ratio (i.e. preserving the relative proportion of each metabolite in relation to the others).

While the observation that different extraction methods yield different numbers of metabolites is important, it raises the question of what the real levels of metabolites inside the cells are. Are the methods that yield the maximum number of detected analytes therefore the most appropriate to use? Or should we also consider that some extraction methods could produce metabolites through the degradation of biological polymers such as proteins, nucleic acids and polysaccharides? It is well known that sonication (Ershov 1998; Clark and Christopher 2000; Wu et al. 2008), heat (Ershov 1998; Clark and Christopher 2000) and alkali/acid (Han et al. 1983; Marcus 1985; Oliyai and Borchardt 1993; Ershov 1998; Clark and Christopher 2000) are capable of hydrolysing proteins and polysaccharides. Above all, one question must be answered in order to make metabolomic findings reliable: is it possible to reach the same biological interpretation by using different protocols for extraction of intracellular metabolites? If the answer is yes, then different protocols for extraction of intracellular metabolites can be used, as long as the same protocol is employed in all studies being compared. But if the answer is no, then the sample preparation procedure in metabolomics has to be rethought.

To answer this major question, we compared the efficiency of four different methods for extraction of intracellular metabolites: (i) boiling ethanol, (ii) cold methanol–water solution coupled to freeze–thaw cycles (referred to as ‘freeze–thaw’), (iii) pure cold methanol, and (iv) pure cold methanol coupled to sonication. The methods were compared using identical replicate samples of four different microbial cell types (Escherichia coli as a Gram-negative bacterium, Enterococcus faecalis as a Gram-positive bacterium, Saccharomyces cerevisiae as a yeast, and Aspergillus sp. as a filamentous fungus). The physiological state of each organism was then characterized using the pathway activity profiling (PAPi) tool based on the metabolite profile generated by each method (Aggio et al. 2010).

2 Materials and methods

2.1 Chemicals

Methanol, ethanol, HEPES, sodium hydroxide, chloroform and sodium sulphate used for metabolite extractions and chemical derivatization were all of analytical grade and purchased from different suppliers. The derivatization reagent methyl chloroformate (MCF), pyridine and the isotope-labelled internal standard l-alanine-2,3,3,3-d4 were obtained from Sigma–Aldrich (St. Louis, MO, USA).

2.2 Microbial strains

Escherichia coli W3110 and Enterococcus faecalis AR96/0360 were maintained on nutrient agar plates (peptone, glucose, agar) at 37°C. Saccharomyces cerevisiae VIN13 and an unidentified Aspergillus species (Aspergillus sp.) were maintained on YPD (yeast extract, peptone, dextrose) agar plates at 30 and 25°C, respectively.

2.3 Microbial cultures

Escherichia coli and E. faecalis were cultured in two 500 ml shake-flasks containing 250 ml nutrient broth (5 g l−1 peptone and 1 g l−1 d-glucose) incubated at 37°C. S. cerevisiae and Aspergillus sp. were cultured in 1.5 l shake-flasks containing 500 ml YPD broth (6 g l−1 yeast extract, 5 g l−1 peptone and 10 g l−1 d-glucose) incubated at 30°C (S. cerevisiae) and 25°C (Aspergillus). All microbial cells were grown aerobically using a rotary shaker at 200 rpm.

2.4 Quenching of microbial cultures

The four microbial cultures were quenched at late exponential growth phase using the cold glycerol-saline method described by Smart et al. (2010). Six 50 ml aliquots of each microbial culture were rapidly (~1 s) transferred by pouring into pre-cooled 250 ml graduated centrifuge flasks containing 200 ml of cold quenching solution held in a refrigerated ethanol bath (Grant Instruments) at −23°C. The quenched samples were quickly mixed by hand and then acclimatized for 5 min in the cold bath. The samples were centrifuged at 36,086 × g for 35 min at −20°C using a refrigerated centrifuge (Sorvall Evolution RC, Thermo Fisher Scientific). The supernatants were then removed, and the cell pellets were resuspended in 12 ml of cold washing solution (−23°C), pooled together and vortexed until completely homogenized. The pooled cell suspension of each microorganism was subdivided into 20 × 2 ml replicate samples, followed by a second centrifugation at 36,086 × g for 30 min at −20°C. The supernatants were discarded and the cell pellets (identical replicates) were subjected to the different intracellular metabolite extraction methods.

2.5 Extraction of intracellular metabolites

Four different protocols for extraction of intracellular metabolites were tested using five identical replicate samples of each microbial species (all taken from the same pool of quenched samples). Before each extraction procedure, every sample was spiked with 20 μl of internal standard solution (10 mM l-alanine-2,3,3,3-d4).
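The amount of internal standard added per sample follows directly from the spike volume and concentration given above. A minimal Python sketch of that arithmetic is shown here; the function name `spiked_amount_nmol` is hypothetical and introduced only for illustration.

```python
def spiked_amount_nmol(volume_ul: float, concentration_mm: float) -> float:
    """Amount of substance (nmol) contained in a spike of the given volume.

    1 mM equals 1 nmol per microlitre, so microlitres x millimolar gives nmol.
    """
    if volume_ul < 0 or concentration_mm < 0:
        raise ValueError("volume and concentration must be non-negative")
    return volume_ul * concentration_mm


# Values taken from the text: 20 microlitres of a 10 mM solution.
print(spiked_amount_nmol(20, 10))  # 200 (nmol per sample)
```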

2.5.1 Method 1 (M1): boiling ethanol

We followed the procedure described by Gonzalez et al. (1997). A boiling 75% (v/v) ethanol–water solution containing 0.25 M HEPES at pH 7.5 was poured directly onto the cell pellets. The mixture was incubated for 3 min at 80°C. After cooling the cell suspension in an ice bath for 3 min, the volume was reduced by evaporation at 45°C using a speed vacuum apparatus (Speed Vac® Plus, Savant Instruments, Inc., Holbrook, NY, USA). The residue was resuspended to a final volume of 3 ml using bidistilled water and centrifuged for 10 min at 15,543 × g (4°C) to remove insoluble particles. The supernatant was stored at −80°C for further analysis.

2.5.2 Method 2 (M2): freeze–thaw cycles

We followed the procedure described in Smart et al. (2010). The cell pellets were resuspended in 2.5 ml of cold methanol–water solution (50% v/v) at −30°C. Each sample was mixed vigorously using a vortex mixer for 1 min. The mixed samples were then frozen at −80°C and subjected to three freeze–thaw cycles, with 1 min of vigorous shaking using a vortex mixer between cycles. After the third freeze–thaw cycle, the samples were centrifuged at 36,086 × g for 20 min at −20°C and the supernatant was collected and stored at −80°C. Another 2.5 ml of cold methanol–water solution (50% v/v) at −30°C was added to the cell pellets and each sample was resuspended using a vortex mixer for 30 s, followed by another centrifugation at 36,086 × g for 20 min at −20°C. The supernatant was collected, pooled with the first one and kept at −80°C for further analysis.

2.5.3 Method 3 (M3): pure methanol

We slightly modified a procedure described in Villas-Bôas et al. (2005a). The cell pellets were resuspended in 2.5 ml of cold pure methanol at −30°C, followed by vigorous shaking using a vortex mixer for 1 min. The samples were then kept in a cold bath at −30°C for 20 min, followed by another vigorous shaking using a vortex mixer for 1 min. The samples were then centrifuged at 36,086 × g for 20 min at −20°C. The supernatant was collected and stored at −80°C. The extracted pellet was then resuspended in another 2.5 ml of pure cold methanol (−30°C). The resuspended samples were centrifuged at 36,086 × g for 20 min at −20°C, and the supernatant was collected and pooled with the first one. The samples were stored at −80°C until further analysis.

2.5.4 Method 4 (M4): pure methanol coupled to sonication

We followed the same protocol described for method 3, but instead of keeping the samples in a cold bath at −30°C for 20 min we subjected them to sonication for 1 min in an ice bath using an ultrasonic liquid processor (Sonicator 3000, Misonix Inc, Newtown, CT, USA) operating at 20 kHz.

2.6 Metabolite analysis

The cell extracts containing intracellular metabolites were freeze-dried using a 12 l Labconco Freeze Dryer (Labconco Corporation). Extracts containing a large volume of organic solvent were diluted with bidistilled water to keep the concentration of organic solvent below 20%. The freeze-dried solids were resuspended in 200 μl of sodium hydroxide solution (1 M) and derivatized according to our standard laboratory procedure (Smart et al. 2010). The derivatized samples were analyzed using a GC–MS system (GC7890 coupled to a MSD5975, Agilent Technologies) with a quadrupole mass selective detector (EI) operated at 70 eV. The column used for all analyses was a ZB-1701 (Phenomenex), 30 m × 0.25 mm (internal diameter) × 0.15 μm (film thickness), with a 5 m guard column. The MS was operated in scan mode (start after 6 min; mass range 38–650 a.m.u. at 1.47 scans/s).
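The dilution needed to bring an extract to the 20% organic solvent limit can be estimated with simple arithmetic. The following Python sketch is illustrative only: the helper name `water_to_add_ml` is hypothetical, and it assumes that volumes are additive on mixing, which is only an approximation for alcohol–water mixtures. Because the function returns the volume at which the limit is just reached, slightly more water would be needed to stay strictly below 20%.

```python
def water_to_add_ml(extract_ml: float, solvent_fraction: float,
                    max_fraction: float = 0.20) -> float:
    """Water volume (ml) at which the solvent fraction equals max_fraction.

    Assumes ideal, additive volumes (an approximation, labelled as such).
    Returns 0.0 when the extract is already at or below the limit.
    """
    if extract_ml < 0:
        raise ValueError("extract volume must be non-negative")
    if not 0 <= solvent_fraction <= 1 or not 0 < max_fraction <= 1:
        raise ValueError("fractions must lie between 0 and 1")
    solvent_ml = extract_ml * solvent_fraction
    total_needed_ml = solvent_ml / max_fraction
    return max(0.0, total_needed_ml - extract_ml)


# Illustrative cases: 5 ml of pooled pure methanol extract (two 2.5 ml
# extractions, as in method 3) and 5 ml of pooled 50% methanol-water extract.
print(round(water_to_add_ml(5.0, 1.0), 3))
print(round(water_to_add_ml(5.0, 0.5), 3))
```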

2.7 Data analysis and statistical validation of the data

The main objective of this study was to compare the efficiency of different methods for extraction of intracellular metabolites and to determine their impact on the biological interpretation of the metabolite profiles generated. Therefore, we used the GC-peak heights of the MCF derivatives to determine the relative concentration of each detected metabolite in the samples. Absolute quantification of each metabolite is unnecessary for a strictly comparative study. All comparisons of metabolite levels were based on data from samples quenched from the same culture flask at the same growth phase and homogenised before extraction. Therefore, all sample replicates were identical in terms of metabolite composition. All data were first normalized by the amount of biomass (dry weight) of each sample as well as by the peak height of the internal standard (l-alanine-d4). We used five technical replicates (identical samples individually extracted) to test each extraction method, which corresponded to a total of 20 samples from each microorganism.
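The two-step normalization described above can be sketched as follows. This is a minimal illustration, not the in-house script used in the study: the function name, the dictionary layout and the numerical values are all assumptions made for the example.

```python
def normalise_sample(peak_heights: dict, internal_standard_height: float,
                     dry_weight_mg: float) -> dict:
    """Divide each GC peak height by the internal standard peak height and by
    the biomass dry weight of the same sample (relative, unitless levels)."""
    if internal_standard_height <= 0 or dry_weight_mg <= 0:
        raise ValueError("internal standard height and dry weight must be positive")
    scale = internal_standard_height * dry_weight_mg
    return {name: height / scale for name, height in peak_heights.items()}


# Hypothetical sample: peak heights in arbitrary detector units.
sample = {"glycine": 12000.0, "lactic acid": 3000.0}
print(normalise_sample(sample, internal_standard_height=6000.0, dry_weight_mg=2.0))
```

Dividing by the internal standard in this way assumes that its recovery tracks that of the metabolites of interest in each sample.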

Hierarchical clustering analysis of all normalized fragment masses generated by GC–MS (heat-maps) was performed using the software Genespring MS1.2 (Agilent Technologies). Statistically significant fragment masses above the signal-to-noise ratio and identified metabolites were determined by univariate analysis of variance (ANOVA) using either Genespring MS1.2 (Agilent Technologies) or R 2.9.0 (http://www.r-project.org). The data were log-transformed to better approximate a normal distribution. Hierarchical clustering analysis and principal component analysis were performed in Genespring MS1.2 (Agilent Technologies) using only the statistically significant GC–MS mass fragments.
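For a single fragment mass or metabolite, the univariate test described above amounts to a one-way ANOVA across the four extraction methods on log-transformed intensities. The Python sketch below computes only the F statistic and its degrees of freedom using the standard library; it is not the Genespring or R procedure used in the study, the intensity values are hypothetical, and converting F to a P value would additionally require the F distribution (available, for example, in a statistics package).

```python
import math


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and replicated observations")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n_total - k
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


# Hypothetical normalized intensities: four methods, five technical replicates.
raw = {
    "M1": [1.2, 1.1, 1.3, 1.2, 1.0],
    "M2": [2.1, 2.4, 2.2, 2.0, 2.3],
    "M3": [3.9, 4.2, 4.0, 4.1, 3.8],
    "M4": [4.0, 4.3, 3.9, 4.4, 4.1],
}
logged = [[math.log(v) for v in values] for values in raw.values()]
f_stat, df_b, df_w = one_way_anova_f(logged)
print(f"F = {f_stat:.1f} on {df_b} and {df_w} degrees of freedom")
```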

To identify metabolites in our samples, we used the Automated Mass Spectral Deconvolution and Identification System (AMDIS) coupled to an in-house MS library of metabolite standards. Identification was based on the retention time and mass spectrum of the analytes (>90% MS similarity). AMDIS is software freely distributed by the National Institute of Standards and Technology and has been widely applied to metabolomics. Although the AMDIS algorithm is considered powerful in deconvoluting and identifying chromatographic peaks, it produces some inaccuracies in compound quantification. Therefore, we used an in-house R (www.r-project.org) script that recalculates the intensity of each compound previously identified by AMDIS. A detailed description of this process and our in-house R script can be found in Smart et al. (2010).

The profile of identified metabolites generated by each extraction method was normalized as described above and log-transformed prior to pathway activity profiling (PAPi) analysis according to Aggio et al. (2010). Only pathways with a statistically significant activity score (ANOVA) were used for comparison. The cut-off value was 0.005.

3 Results

3.1 Raw data: total mass fragment ions obtained by GC–MS analysis

To visualize possible differences in the chemical composition of each set of samples extracted by the different methods, we built heat-maps based on the total collection of mass fragment ions generated by the GC–MS analysis of each sample (Fig. 1). For each sample, the corresponding mass fragment profile was built from the chromatographic retention time and mass-to-charge ratio (fragment size) of each fragment mass. Considering only masses above the noise threshold level (S/N 50:1), we observed that the total number of detected fragment masses increased with the complexity of the cell type. We detected a maximum of 402 masses in E. coli samples, 639 in E. faecalis, 932 in S. cerevisiae and 1141 in Aspergillus sp. samples. Within each organism, mass profiles were compared and regrouped according to their similarities (retention time and size) using hierarchical clustering analysis, generating the heat-maps (Fig. 1).

Fig. 1

Heat-maps of mass fragment ions generated by GC–MS analysis of intracellular metabolites of four different organisms. The heat-maps were generated by hierarchical clustering analysis (HCA) of all significant mass fragment ions generated by GC–MS analysis of metabolites extracted by four different methods and derivatized by methyl chloroformate (MCF). Each column represents an individual sample and each row a mass fragment ion generated by GC–MS. In the rows, dark colour indicates high abundance of a specific mass fragment ion and light colour the absence of a specific mass fragment ion. Note that samples from the same data class (extraction method) clustered together. a Escherichia coli; b Enterococcus faecalis; c Saccharomyces cerevisiae; and d Aspergillus sp. M1 boiling ethanol method, M2 freeze–thaw method, M3 pure methanol method, and M4 pure methanol coupled to sonication method

Firstly, we observed that, independently of the extraction method and cell type, we obtained a reproducible profile of metabolites across the five technical replicates used to test each extraction method (Fig. 1). This reproducibility is further confirmed by principal component analysis based on the statistically significant mass fragments (Fig. 2). Despite being reproducible, each method clearly yielded a distinct profile of masses (Fig. 1). Large groups of masses were detected in samples extracted by one method and were totally absent in samples extracted by another (or vice versa). Based on the heat-maps (Fig. 1), it appears that pure methanol and pure methanol coupled to sonication shared most extracted masses, even though we could still identify some mismatches, mainly in E. faecalis samples (Fig. 1b). Both methods employed methanol as the extracting solvent, which may explain why their mass profiles were similar. However, sonication had a more pronounced effect on the extraction of intracellular metabolites from E. faecalis samples.

Fig. 2

Two-dimensional projections of principal component analysis (PCA) of mass fragment ions generated by GC–MS analysis of intracellular metabolites of four different organisms. Each colour represents a data class (extraction method). The majority of technical replicates (dots) clustered close to each other and clearly distinguished the different data classes (extraction methods). a Escherichia coli, b Enterococcus faecalis, c Saccharomyces cerevisiae; and d Aspergillus sp. M1 boiling ethanol method, M2 freeze–thaw method, M3 pure methanol method, and M4 pure methanol coupled to sonication method

The differences in mass profile generated by the different extraction methods were further confirmed by principal component analysis (Fig. 2). Replicate samples coming from the same organism and the same cultivation flasks clustered completely apart according to the method used to extract intracellular metabolites. The differences in the profile of fragment masses explained by the first principal component ranged from 37.2% (in E. faecalis samples) to 56.4% (in E. coli samples) (Fig. 2). Principal component analysis also confirmed that extraction with pure methanol and with pure methanol coupled to sonication produced the most similar metabolite profiles (except for E. faecalis), as stated above (Fig. 2).

This method-specific clustering of samples is mostly explained by the quantitative compositional differences between the sample groups. In fact, a non-negligible proportion of fragment masses appears to be method-specific. In E. coli samples, from a total of 113 masses detected at statistically different levels (ANOVA) across all samples, nineteen were method-specific: nine for boiling ethanol, four for freeze–thaw, one for pure methanol and five for pure methanol coupled to sonication. However, when samples extracted with pure methanol were combined with samples extracted with pure methanol coupled to sonication, a total of 53 masses were detected only in samples extracted by those two methods. For E. faecalis samples, in contrast, pure methanol coupled to sonication seems to have extracted a significantly larger number of compounds than pure methanol alone (Fig. 1b).
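Counting method-specific masses, as done above, is a set comparison: a mass is specific to one method (or to a group of methods) if it is detected there and nowhere else. The Python sketch below illustrates the idea with hypothetical features identified by (retention time, m/z) pairs; the function names and the data are assumptions for the example and do not reproduce the study's data.

```python
def method_specific(detected: dict) -> dict:
    """For each method, return the features detected by that method only."""
    specific = {}
    for method, features in detected.items():
        others = set().union(*(f for m, f in detected.items() if m != method))
        specific[method] = features - others
    return specific


def exclusive_to(detected: dict, group) -> set:
    """Features detected in at least one method of `group` and in no other method."""
    inside = set().union(*(detected[m] for m in group))
    outside = set().union(*(f for m, f in detected.items() if m not in group))
    return inside - outside


# Hypothetical features: (retention time in min, m/z).
detected = {
    "M1": {(7.1, 88), (9.4, 130), (12.0, 174)},
    "M2": {(12.0, 174), (15.2, 102)},
    "M3": {(12.0, 174), (18.3, 74), (21.7, 87)},
    "M4": {(12.0, 174), (18.3, 74), (21.7, 87), (25.5, 143)},
}
print({m: len(f) for m, f in method_specific(detected).items()})
print(len(exclusive_to(detected, ["M3", "M4"])))
```

In this toy data set, the two methanol-based methods share features that neither has on its own, which is why the combined count exceeds the sum of the individual method-specific counts, as observed for E. coli.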

3.2 Profile of identified metabolites

Although we detected hundreds of metabolites using our GC–MS platform for metabolite analysis (Smart et al. 2010), we could accurately identify only some of them using our in-house mass spectra library: 26 metabolites in E. coli samples, 43 in E. faecalis, 35 in S. cerevisiae and 46 in Aspergillus sp. samples (Fig. 3). As with the profile of all fragment masses generated by GC–MS, a significant proportion of the identified metabolites appeared to be method-specific (Fig. 3). However, the method-specificity of these metabolites also appears to depend on the cell type. For instance, the amino acid tyrosine was detected in E. faecalis samples only when they were extracted by boiling ethanol, whilst it was detected in all S. cerevisiae samples (independently of the extraction method used) as well as in E. coli samples extracted by pure methanol and pure methanol coupled to sonication (Fig. 3). The nucleotide NADH, on the other hand, was detected only in E. coli and Aspergillus sp. samples extracted by freeze–thaw cycles, which may suggest that this is a better method for extracting this metabolite.

Fig. 3

Venn visualization and bar graphs of relative abundance of the different identified metabolites comparing four different extraction methods and four organisms. a Escherichia coli, b Enterococcus faecalis, c Saccharomyces cerevisiae, and d Aspergillus sp. Red boiling ethanol method, blue freeze–thaw method, green pure methanol method, and orange pure methanol coupled to sonication method. Bar graphs show only statistically significant data (P < 0.05)

Nonetheless, considering the number of metabolites extracted, pure methanol coupled to sonication appears to be the most efficient extraction method of those tested, closely followed by pure methanol. The freeze–thaw method appears to be as efficient as pure methanol coupled to sonication for S. cerevisiae samples (Fig. 3c), but less efficient than pure methanol for the other cell types. Boiling ethanol appears to be the least efficient of the four methods tested with regard to the number of metabolites extracted. Therefore, at least for the cell types tested, the intracellular metabolite profile seems to depend more on the solvent used to dissolve the metabolites during extraction than on the physical forces applied to enhance cell disruption (heat/sonication).

On the other hand, it is important to emphasize that several metabolites were extracted by all extraction methods (Fig. 3), but their relative concentrations varied considerably depending on the extraction method. Among the metabolites extracted by all four methods, more than 90% were detected at significantly different levels when comparing extraction methods (P value < 0.05). Similarly, the recovery of the internal standard 2,3,3,3-d4-alanine varied considerably depending on the extraction method (Fig. 4), and it also varied depending on the cell type being extracted. For instance, the freeze–thaw method gave the best recovery of the internal standard in E. coli, E. faecalis and S. cerevisiae samples, while the boiling ethanol and freeze–thaw methods showed similar recovery of the internal standard in Aspergillus samples. Interestingly, except for the freeze–thaw method, the recovery of internal standard from bacterial samples was much poorer than from fungal biomass. Also, sonication coupled to pure cold methanol extracted significantly less internal standard than pure cold methanol alone, which strongly suggests degradation of the internal standard during sonication.

Fig. 4

GC–MS peak intensities of the 2H-labelled internal standard 2,3,3,3-d4-alanine spiked in different microbial cell biomasses and extracted by different extraction methods. An identical amount of internal standard was spiked in each microbial sample before extraction and identical sample replicates were subjected to four different methods for extraction of intracellular metabolites (n = 5). a Escherichia coli, b Enterococcus faecalis, c Saccharomyces cerevisiae, and d Aspergillus sp. M1 boiling ethanol method, M2 freeze–thaw method, M3 pure methanol method, and M4 pure methanol coupled to sonication method

3.3 Biological interpretation of the different metabolite profiles

Since the metabolite profile generated in this work consisted of relative metabolite levels (not absolute quantification) normalised by internal standard and biomass concentration in each sample, the only means of biological interpretation is comparative analysis. There are different ways to interpret metabolite profile data, which depend on the biological question being asked. To test the impact of extraction methods on the biological interpretation of metabolite profile data, we decided to artificially compare the physiological states of the different microbial species by selecting two species at a time for a pair-wise comparison. Although those comparisons carry no biological relevance, they allow us to determine whether we can reach similar conclusions with different extraction methods.

Based on the hypothesis-generating algorithm PAPi, which uses metabolite profile data and the KEGG database to predict and compare metabolic pathway activities among different experimental conditions (Aggio et al. 2010), it is apparent that different extraction methods indeed lead to different biological observations regarding the metabolic state of the cells being analyzed (Fig. 5).

Fig. 5

Comparative metabolic pathway activities of different organisms based on intracellular metabolite profiles generated by different extraction methods. The activity scores for each pathway were calculated using our Pathway Activity Profiling (PAPi) algorithm (Aggio et al. 2010). For each metabolic pathway listed in the KEGG database, PAPi calculates an activity score based on the number of metabolites identified from that pathway and their relative abundances. As a result, the activity score represents the likelihood that a metabolic pathway is active inside the cell and, consequently, allows the comparison of metabolic pathway activities using metabolite profile data. Only pathways with contradictory results between extraction methods are shown in this figure. M1 boiling ethanol method, M2 freeze–thaw method, M3 pure methanol method, and M4 pure methanol coupled to sonication method

We compared the pathway activity profiles of the following two groups: (i) E. coli versus E. faecalis and (ii) S. cerevisiae versus Aspergillus sp. Although several observations (metabolic pathway activities) were common to the data generated by all four extraction methods, many were biologically contradictory when comparing two organisms (Fig. 5). From a total of 27 metabolic pathways predicted to be active using data obtained from both organisms of group (i) and 39 of group (ii), 9 and 8, respectively, showed contradictory results (P value < 0.05) (Fig. 5). For instance, the aminoacyl-tRNA biosynthesis pathway was predicted to be up-regulated in E. coli when compared to E. faecalis using data obtained from boiling ethanol and freeze–thaw extractions (Fig. 5a). However, when data obtained from pure methanol and pure methanol coupled to sonication extractions are used, this pathway appears to be down-regulated in E. coli when compared to E. faecalis (Fig. 5a). Another contradictory example concerns pyruvate metabolism in group (ii) (Fig. 5b). This pathway was predicted to be down-regulated in S. cerevisiae when compared to Aspergillus sp. when data obtained from pure methanol coupled to sonication and boiling ethanol extractions were used (Fig. 5b). The same pathway appeared up-regulated in S. cerevisiae compared to Aspergillus sp. when using data obtained from pure methanol and freeze–thaw extractions (Fig. 5b). Other pathways with contradictory comparative activities are shown in Fig. 5.
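The notion of a contradictory result used above can be made explicit: a pathway is contradictory when the direction of the difference between the two organisms changes sign depending on the extraction method. The Python sketch below illustrates this check. It is not part of PAPi; the function name is hypothetical, the magnitudes are placeholders, and only the directions for the two named pathways follow the text (positive meaning higher activity in the first organism of the pair). The third pathway is invented as a consistent control.

```python
def contradictory_pathways(differences: dict) -> list:
    """Return pathways whose organism-to-organism difference changes sign
    between extraction methods (zero differences are ignored)."""
    flagged = []
    for pathway, by_method in differences.items():
        signs = {1 if d > 0 else -1 for d in by_method.values() if d != 0}
        if len(signs) > 1:
            flagged.append(pathway)
    return sorted(flagged)


# Placeholder magnitudes; signs for the first two entries follow the text.
differences = {
    "aminoacyl-tRNA biosynthesis (E. coli vs E. faecalis)":
        {"M1": 1.0, "M2": 1.0, "M3": -1.0, "M4": -1.0},
    "pyruvate metabolism (S. cerevisiae vs Aspergillus sp.)":
        {"M1": -1.0, "M2": 1.0, "M3": 1.0, "M4": -1.0},
    "hypothetical consistent pathway":
        {"M1": 0.5, "M2": 0.8, "M3": 0.2, "M4": 0.4},
}
for name in contradictory_pathways(differences):
    print(name)
```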

The activity of some metabolic pathways could be predicted only when using data generated by specific extraction methods (Fig. 6). This was expected, considering that some metabolites were detected only with specific methods. In E. coli, for instance, the pathways for fatty acid and glycerolipid metabolism were both predicted to be active only when using data from samples extracted with boiling ethanol and pure methanol coupled to sonication (Fig. 6). Another example is the pathway for thiamine metabolism, which was predicted to be active in E. faecalis when using samples extracted by all methods except pure methanol coupled to sonication. Similarly, the calcium signalling pathway was predicted to be active in Aspergillus sp. only when using data obtained from the freeze–thaw extraction method.

Fig. 6

Metabolic pathways whose activity was predicted only when using intracellular metabolite profiles generated by specific extraction methods. M1 boiling ethanol method, M2 freeze–thaw method, M3 pure methanol method, and M4 pure methanol coupled to sonication method

4 Discussion

The cell metabolome is a direct reflection of its metabolic state (Villas-Bôas et al. 2005b). However, our results seriously question our current ability to obtain an accurate profile of cell metabolites that truly reflects the original state of the small molecules inside the cells, mainly regarding their concentration (relative or absolute) and ratio. It is acceptable that we cannot detect and/or identify every single metabolite in a cell. However, we should at least be able to accurately measure the levels and ratios of those metabolites we can detect and identify in our samples. Our results strongly suggest that this may not be the case with the methodology currently in use.

Our study clearly shows that each extraction method generates a specific and unique profile of intracellular metabolites, regardless of the biological material used (Figs. 1, 2 and 3). The polarity of the solvent used to dissolve the metabolites seems to be the major factor influencing the selectivity of each method. This is evidenced by the similarities in the metabolite profiles of most samples extracted by both pure methanol and pure methanol coupled to sonication (Figs. 1, 2 and 4). However, the influence of the nature of the biological matrix being extracted cannot be disregarded, as evidenced by the differential recovery of the internal standard by different methods and from different cell types (Fig. 4). Overall, most less-polar metabolites (e.g. oleic acid, stearic acid, myristic acid, caprinic acid) appear to have been better extracted by the pure methanol and pure methanol coupled to sonication methods. Boiling ethanol, on the other hand, seems to extract more polar compounds better (e.g. γ-aminobutyric acid, aspartic acid, glycine, lactic acid). Although boiling ethanol extraction uses a 70% ethanol solution in water as the extracting solvent, the extracted metabolites are re-suspended in pure water after solvent evaporation, which explains why more polar compounds were favoured.

Moreover, when combining physical forces with chemical extraction, it is possible that labile metabolites and large molecules could be degraded (Han et al. 1983; Marcus 1985; Oliyai and Borchardt 1993; Ershov 1998; Clark and Christopher 2000; Wu et al. 2008). While labile metabolites would have their concentration reduced, the degradation of large molecules would increase the level of some metabolites, such as amino acids and peptides originating from protein degradation, or mono- and disaccharides from polysaccharide degradation. Thus, when comparing the efficiency and the scope of the pure methanol method with those of the pure methanol coupled to sonication method, for instance, it is unwise to assume that all compounds extracted only by sonication are genuinely intracellular metabolites and not false positives generated by ultrasonic degradation of large molecules (Ershov 1998; Clark and Christopher 2000; Wu et al. 2008). Indeed, our heat-maps generated with GC–MS raw data suggest that a number of masses were detected only in samples extracted by either boiling ethanol or pure methanol coupled to sonication (Fig. 1), although method-specific masses were also seen for the other two methods. Whilst this does not confirm that degradation of macromolecules is occurring when physical forces are employed, we cannot eliminate this possibility.

Considering the chemical principles behind solvent extraction of intracellular molecules, it is not surprising that each extraction method generates a different profile of metabolites. All solvent extraction methods distribute the analytes between two phases according to the distribution constant of each analyte being extracted, which depends on its solubility, the temperature of extraction and the relative volume of both phases (Villas-Bôas 2007). The extraction rate, in turn, is based on the kinetics of molecule migration, which mainly depends on the temperature, the diffusion rates between the two phases, and solvent access to the intracellular medium (Villas-Bôas 2007). Therefore, each solvent system will have a distinct distribution constant for each metabolite owing to the different solubilities of the different molecules as well as the different extraction temperatures. In order to preserve the chemical integrity of labile metabolites and avoid further biochemical reactions, solvent extractions of metabolites are usually performed at very low temperatures (below 0°C), which also affects the solubility of many metabolites in the extracting solvent. Considering the distribution constant of each analyte in the given solvent and the extraction rate principles discussed above, the differential composition of metabolites in the different extracts suggests that the extractions were not carried out to completion. Therefore, one way to improve extraction efficiency would be to increase the number of washing steps or to combine different solvents in sequential steps, pooling all the extracts together at the end of the extraction. For instance, the polarity of the solvents could either increase or decrease over a sequence of 2, 3 or 4 steps, while keeping the temperature low to preserve the chemical integrity of the metabolites. In addition, physical forces such as heating and sonication should be avoided.
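
The effect of repeated extraction steps can be made concrete with a simplified equilibrium model, borrowed from textbook liquid–liquid extraction. This is a sketch under stated assumptions, not an analysis performed in this study: each step is assumed to reach equilibrium with fresh solvent, and the symbols K (distribution constant), V_s (solvent volume), V_c (volume of the cell phase), q and E_n are introduced here for illustration only.

```latex
% Simplified equilibrium model (an assumption, not the authors' analysis):
% K   = C_solvent / C_cell at equilibrium (distribution constant)
% V_s = solvent volume, V_c = volume of the cell phase
\[
q = \frac{V_c}{V_c + K V_s} = \frac{1}{1 + K V_s / V_c}
\qquad \text{(fraction left in the cells after one step)}
\]
\[
E_n = 1 - q^{\,n} = 1 - \left(1 + \frac{K V_s}{V_c}\right)^{-n}
\qquad \text{(fraction recovered after $n$ steps with fresh solvent)}
\]
```

With illustrative values, a metabolite with K V_s / V_c = 1 is recovered at E_1 = 0.5 after one step and E_3 = 0.875 after three, whereas a metabolite with K V_s / V_c = 9 is recovered at E_1 = 0.9 and E_3 = 0.999. Under this model the apparent ratio of the two metabolites in the extract is therefore about 0.56 of the true ratio after a single step and about 0.88 after three, which is consistent with the idea that incomplete extraction distorts metabolite ratios and that additional steps reduce the distortion.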

5 Conclusion

In this comparative study we have shown that distinct intracellular metabolite profiles are obtained depending on the method used to extract intracellular compounds. The polarity of the extraction solvent seems to be the major factor behind this differential composition of metabolites in the resulting extracts, but the influence of the biological matrix cannot be ruled out. The different metabolite profiles obtained by different extraction methods have a negative impact on the biological interpretation of the results. Contradictory observations can be made based on metabolite profiles coming from the same samples but extracted by different protocols.

Although metabolomics is a hypothesis-generating tool in which analytical accuracy is often compromised in order to gain throughput and scope (Villas-Bôas et al. 2007), it is important that every step in the analysis meets a minimally acceptable analytical standard. While we cannot avoid false negatives in our methodology, reliable recovery of the detected metabolites is a sine qua non in metabolomics. Therefore, the metabolomics community must seriously pursue more efficient alternatives for extracting intracellular metabolites and, most importantly, standardize sample preparation protocols for metabolomics. This study strongly suggests that a more powerful extraction method would involve sequential extraction of a sample with solvents of different polarity while keeping the extraction temperature low to avoid chemical degradation of metabolites. However, this hypothesis has yet to be validated.