1 Introduction

The field of metabolomics is highly interdisciplinary. It systematically investigates metabolite profiles in a biological system or systems (i.e., cell, tissue, organ, biological fluid or organism) (Brown et al. 2005). When properly carried out, metabolomics experiments and their associated data can reveal the state and condition of an organism at a specific point in time. This provides valuable information on a biological response to internal and external perturbations, such as growth, genetic modifications, disease, environmental effects etc. (Dettmer et al. 2007; Vinaixa et al. 2016). In combination with other omics-based data (e.g. genomics, transcriptomics and proteomics) as part of a systems biology based approach, metabolomics can improve our understanding of complex cellular pathways and biological mechanisms (Beale et al. 2016). Consequently, it can assist in achieving better management outcomes relating to: human health and disease (Beale et al. 2017; Kumarasingha et al. 2016; Robinson et al. 2016; Vemuri et al. 2018), food quality and crop production (Beleggia et al. 2011), and assessing/managing the impacts of anthropogenic pollution (Bundy et al. 2009; Beale et al. 2018a; Gyawali et al. 2016) amongst many other uses.

Metabolomics analyses comprise one or more of several analytical techniques, combined with bioinformatics, for the qualitative or (semi)quantitative identification of metabolites, providing a detailed, biologically relevant interpretation of the results. The choice of technique often depends on the experimental objective and the sample type/matrix being investigated. Previously, it was common for two or more independent or hyphenated techniques to be used in order to achieve a wide-ranging profile of metabolites. Gas or liquid chromatography hyphenated to mass spectrometry (GC–MS or LC–MS) and nuclear magnetic resonance (NMR) spectroscopy are the most commonly employed analytical platforms for metabolic profiling in the scientific literature. In recent years, many groups have specialized in just one technique in an effort to extend the capability of the metabolomics community. It should be noted that metabolomics is not limited to GC–MS, LC–MS and NMR; any technique that can provide comprehensive coverage of the ‘metabolome’ can be applied. A pertinent example is capillary electrophoresis (CE), which, combined with mass spectrometry (CE–MS), has matured into a promising tool, particularly for the study of polar ionogenic metabolites (Zhang et al. 2017). However, metabolite libraries for this method are currently lacking compared to some of the more common techniques (e.g., GC–MS). As such, the focus herein is to review the recent advancements of GC–MS applications related to metabolomics-based research, as GC–MS is one of the most efficient and well used analytical platforms in the field to date.

GC–MS is generally considered a versatile analytical platform (Tsugawa et al. 2011). This is due to its robustness, excellent separation capability, selectivity, sensitivity and reproducibility (Villas-Bôas et al. 2005; Koek et al. 2011; Mastrangelo et al. 2015). The two main forms of ionization used in GC–MS are electron ionization (EI) and chemical ionization (CI). To date, most GC–MS methods in metabolomics use EI. The availability of several mass spectral databases/libraries corresponding to EI-based GC–MS helps make it the method of choice for many analysts. Other advantages include ease of use (in terms of analysis time and operating costs) and its capability to provide insight into compound identification. The latter can be achieved via a low-end GC system coupled with a single quadrupole MS detector, something that is lacking with an equivalent entry-level, single quadrupole LC–MS system. GC–MS also largely avoids problems common to LC–MS, such as matrix effects and ion suppression by co-eluting compounds, and achieves greater chromatographic resolution (Gowda and Djukovic 2014; Mastrangelo et al. 2015). However, one inherent limitation of GC–MS is that it can only be used to separate and identify low molecular weight (ca. 50–600 Da), volatile compounds. For the detection of polar, thermolabile, non-volatile metabolites, chemical derivatization is required prior to analysis. This improves volatility, thermal stability, sensitivity, and detector response (Poojary and Passamonti 2016). Such an approach does bring its own problems, in that one is measuring the derivative as a proxy for the target compound. Care must therefore be taken to add enough derivatizing agent to the sample so that all of the target compounds are transformed to their respective derivatives, without adding so much that the reagent peak dominates/obscures others in the total ion chromatogram (TIC).

As alluded to already, GC–MS has been widely applied in metabolomics studies, specifically where emphasis is placed on metabolite profiling and quantification. In the last decade a number of review articles have been published on GC–MS based research, relating to plant science (Hong et al. 2016), medicine (Fearnley and Inouye 2016; Stringer et al. 2016), food science (Ibáñez et al. 2013; Scalbert et al. 2014), environmental science (Lankadurai et al. 2013), natural products chemistry and drug discovery (Cuperlovic-Culf and Culf 2016; Wishart 2016), and biotechnology (Mozzi et al. 2013; Simó et al. 2014). Koek et al. (2011) provided a comprehensive review of literature published prior to 2010, focusing on the status and perspectives of GC–MS based metabolomics. The purpose of this review is not to provide a comprehensive commentary covering solely the applications of GC–MS based metabolomics per se. Instead, the focus is to review recent developments in GC–MS techniques broadly applied in metabolomics, with specific emphasis on key steps within the GC–MS workflow, so that researchers new to the field can obtain a broad overview and perspective of recent advancements and common pitfalls. Numerous metabolomics workflows have been published that try to capture similar themes. For example, Mastrangelo et al. (2015) presented a tutorial for GC–MS based untargeted metabolomics that included a workflow for sample preparation, analysis, data processing and biological interpretation of metabolic data. A number of other papers are also available describing similar strategies and protocols (Garcia and Barbas 2011; Datta et al. 2012).
It is evident in all these processes that experimental and technical parameters such as sample preparation and extraction, sample extract derivatization, sample analysis and chromatography settings (columns, detectors and hyphenated systems) all play crucial roles in obtaining reliable metabolic data. In addition, bioinformatics analysis of any acquired data is equally important and, depending on the sample matrix, both the data acquisition and analysis approach will often vary. It is the view of the authors that experimental design, sample selection and sample preparation are topics that have been extensively covered elsewhere (Villas-Bôas et al. 2005; Álvarez-Sánchez et al. 2010; Gu et al. 2011; Dunn et al. 2012). As such, these topics are not covered in any great detail in this review. However, it is noted that sample selection has a significant impact on selecting the most appropriate sample preparation strategy for specific sample types. Which sample or samples to choose is highly dependent on the aim and objective of a particular study. It is therefore not possible to formulate all-purpose rules beyond general ‘good science’. It is, however, the authors’ strong recommendation that bio-statisticians/bioinformaticians be consulted early in the design stage, before any samples are taken and analyzed. This not only ensures sound experimental design but also works towards safeguarding data quality and reliability. As a guide within the context of a typical GC–MS workflow, Fig. 1 illustrates the basis for the structure of this review, with specific focus provided on sample extract derivatization approaches for GC–MS analysis, different chromatographic strategies and data analysis/bioinformatics approaches.

Fig. 1 Typical GC–MS based metabolomics workflow that provides the basis for structuring this review

2 Preparing samples for GC–MS analysis

The quality of any acquired metabolomics dataset is highly affected by the sample preparation approach undertaken (Villas-Bôas et al. 2005; Gu et al. 2011). It should be noted that no single sample preparation strategy is capable of encompassing all metabolites in any sample type (Álvarez-Sánchez et al. 2010). This is because the metabolome is very complex and contains thousands of metabolites that are highly variable in terms of chemical diversity, polarity and molecular weight (Dunn and Hankemeier 2013). Metabolite concentrations also vary widely within a cell, from mg/mL to less than pg/mL. Further compounding the analytical challenge, in vivo metabolite turnover rates can also vary significantly. Since only very sensitive detectors are able to detect metabolites at very low concentrations (Wishart 2011; Dias et al. 2016), a pre-concentration step is often necessary. Pre-concentration steps can include solid phase extraction (SPE), solid phase microextraction (SPME), liquid–liquid extraction or the incorporation of an evaporation/reconstitution step.

In metabolomics, the first step in any workflow is halting the metabolism of active cells and tissues by quenching biological samples (Dunn et al. 2011; Hernández Bort et al. 2014). Regardless of approach, all quenching steps aim to stop metabolism faster than the turnover of metabolites while minimising metabolite losses. Depending on the type of sample, the subsequent steps include the extraction of metabolites using appropriate solvents (Pinu et al. 2017). While this description is rather simplistic, the process is by no means ‘simple’. The basic philosophy does, however, hold true: metabolites and metabolism are in constant flux, and metabolites are known to vary significantly in terms of their stability and transport (Villas-Bôas et al. 2005; Koek et al. 2011). Many secondary metabolites are unstable in the presence of oxygen and/or light, when removed from the cell, when stored for long periods of time or under certain analytical conditions. This presents a significant hurdle for their analysis. Similarly, there is no universal pre-analytical treatment of samples (e.g. quenching and extraction of metabolites) that allows thousands of metabolites to be determined simultaneously.

Just as there is no single sample preparation strategy capable of extracting all metabolites, there is also no single analytical technique that is capable of detecting, identifying and quantifying all the possible metabolites that may be present (Wishart 2011, 2016). This is due to the extensive chemical diversity and variable nature of metabolites from the various types of biological material that can be sampled in any one study. For example, for a targeted metabolomics approach that focusses only on specific groups of metabolites, the use of a single instrumental platform and/or a specific extraction protocol is often sufficient. However, for an untargeted metabolomics approach, comprehensive metabolome datasets cannot be obtained using the same strategy. Therefore, a combination of sample preparation protocols and multiple analytical instruments is suggested for an untargeted metabolome analysis (Duportet et al. 2012; Pinu et al. 2014).

Once sampling and quenching have been carried out, the next step is to extract the metabolites from both intra- and extracellular environments. This step can be quite complicated, especially for plant and bacterial cells or species with tough outer cuticles. Ideally, an extraction method should be reliable and reproducible, with suitable controls that account for the permeability of the cell envelope while allowing the release of intracellular metabolites (Villas-Bôas et al. 2005). This should be conducted without any unwarranted chemical and biochemical degradation. It is strongly recommended that, in order to assess metabolite degradation (and technical variability), multiple internal standards covering a range of chemical classes are spiked into the biological samples prior to extraction. Losses during the extraction process can then be corrected using isotopically labelled internal standard normalization techniques (Villas-Bôas et al. 2007). This involves using at least one isotopically labelled internal standard from each metabolite class present in the sample. It is noteworthy that studies with a larger number of samples often make use of pooled biological controls as a measure of quality assurance (QA) and quality control (QC). Furthermore, multiple biological and technical replicates are analysed in random order throughout a sample batch/sequence to ensure the reproducibility of the extraction method(s) used (Pinu et al. 2017; Broadhurst et al. 2018).
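The class-wise internal standard correction described above can be illustrated with a minimal sketch. It assumes that each metabolite has already been assigned to a chemical class and that one labelled internal standard was measured per class; the function name, dictionary layout and peak areas are hypothetical and are not taken from any of the cited protocols.

```python
from typing import Dict


def normalize_to_internal_standards(
    peak_areas: Dict[str, float],
    metabolite_class: Dict[str, str],
    istd_areas: Dict[str, float],
) -> Dict[str, float]:
    """Divide each metabolite peak area by the peak area of the labelled
    internal standard assigned to the same chemical class."""
    normalized = {}
    for name, area in peak_areas.items():
        cls = metabolite_class[name]
        istd_area = istd_areas[cls]
        if istd_area <= 0:
            # A missing standard cannot be used for correction.
            raise ValueError(f"internal standard for class {cls!r} not detected")
        normalized[name] = area / istd_area
    return normalized


# Hypothetical peak areas for one sample (arbitrary units).
sample_areas = {"alanine": 4.0e6, "citrate": 9.0e5}
classes = {"alanine": "amino acid", "citrate": "organic acid"}
istd = {"amino acid": 2.0e6, "organic acid": 3.0e5}

ratios = normalize_to_internal_standards(sample_areas, classes, istd)
```

The resulting response ratios, rather than raw peak areas, would then be carried forward to statistical analysis; how standards are assigned to classes remains a study-specific decision.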

Disruption of the cell wall and envelope can be achieved by using a mechanical/physical process, including homogenisation, sonication, heat (usually at physiological temperature) and pressure, or by using chemical agents (Villas-Bôas et al. 2007). Mechanical disruption of cells (e.g. grinding) is beneficial for plant materials; however, an extraction process using appropriate solvents must additionally be carried out to determine metabolite levels. On the other hand, for most microbial cells, mechanical disruption of the cell wall is not preferred, because the process can rupture the cells and allow cross contamination between intracellular and extracellular metabolites (Villas-Bôas et al. 2007).

The application of chemical agents (solvents) to lyse cells in order to extract intracellular metabolites is one of the most popular approaches in metabolomics studies (Duportet et al. 2012; Raterink et al. 2014). Depending on the partitioning coefficients, solubility, and solvent temperature, metabolites are dispersed into two distinct phases. A suitable solvent should have an excellent extraction rate and concentrate metabolites into a single phase. Extraction rates can be expedited by altering the extraction solvent temperature, therefore increasing the diffusion rate of a solvent so that it can penetrate the cell to extract endogenous metabolites (Villas-Bôas et al. 2007). Table 1 provides a summary of a range of solvents and extraction methods typically used. While selecting solvent and extraction conditions, one should consider both the type of cells analysed and the metabolite class of interest. There are many chemical extraction protocols which only aim to extract specific classes of metabolites such as amino acids, sugars, organic acids, volatile metabolites and fatty acids (Calingacion et al. 2012; Zhao et al. 2014; Wang et al. 2015). Although the main aim of ‘global’ metabolomics analyses is to extract and analyse as many metabolites as possible, most of the published extraction methods are still unable to cover all metabolite classes. Therefore, multiple extraction steps using both physical and chemical processes are recommended that allow the analysis of global metabolite profiles (Duportet et al. 2012).

Table 1 Comparison of extraction methods for non-targeted profiling of polar metabolites (Villas-Bôas et al. 2005; Hyotylainen 2013; Tulipani et al. 2013)

Many different extraction methods have been published in the recent literature for the analysis of the metabolome. For instance, acidic solutions (e.g. perchloric acid, hydrochloric acid and trichloroacetic acid) have been used at low temperature (4 °C) along with freeze–thaw cycling for extracting metabolites, and this method is suitable for the analysis of polar and acid-stable compounds (Faijes et al. 2007; Park et al. 2012). However, the hydrolysis of proteins and polymers also occurs during this process, thus the observed metabolite profile is not entirely accurate. Similarly, alkalis are used for the extraction of metabolites, mainly from yeast and filamentous fungi. The subsequent recovery of metabolites is poor and saponification of lipids occurs under alkaline conditions, thus requiring neutralisation and salt removal steps (Villas-Bôas et al. 2007). The use of extremely cold solvents (e.g. cold methanol and glycerol solutions) is common for the extraction of metabolites from different types of cells. This is predominantly due to a decrease in metabolite degradation potential at subzero temperatures. This method is suitable for thermolabile metabolites (Canelas et al. 2009; de Jonge et al. 2012). For instance, cold glycerol solutions (< − 20 °C) are used for intracellular metabolite extraction from different bacteria and yeasts (Granucci et al. 2015; Jäpelt et al. 2015). This protocol has shown good recovery and reproducibility for amino acids, organic acids, amines and some fatty acids. However, it is very difficult to remove the remaining glycerol from the metabolite extracts, which might pose considerable problems for silylation derivatization in GC–MS based applications. Methyl chloroformate (MCF) based derivatization can be used as an alternative (Jäpelt et al. 2015).
Cold methanol solution (< − 20 °C) coupled with freeze–thaw cycles, on the other hand, is a very effective extraction solvent that makes use of only a single organic solvent, thus making the removal of solvent from the sample simple via an evaporation step (Hajjaj et al. 1998; de Jonge et al. 2012). This method has been applied to a wide range of microorganisms and is suitable for the extraction of polar and mid-polar metabolites, but recovery of non-polar metabolites is poor (Villas-Bôas et al. 2005).

Although buffered chloroform-methanol-water was first used for the extraction of total lipids from animal cells, Dekoning and Vandam (1992) later adapted it to extract metabolites from yeast cells by using the mixture at low temperature (− 40 to − 20 °C) and shaking it vigorously (~ 300 g for 45 min). Unlike the two methods described above, this method is highly useful for the extraction of both polar and non-polar metabolites. However, it makes use of chloroform, which is toxic. The buffered chloroform method is also time consuming and laborious, and some buffers are known to cause problems with different analytical instruments (Villas-Bôas et al. 2007).

Although not suitable for thermolabile metabolites, heated solvents are sometimes used for the extraction of polar metabolites. Buffered, boiling ethanol (75% v/v, 80 °C) is usually added to the quenched cells and held for several minutes to deactivate enzymes and proteins. This process increases the chance of cell disruption and allows the extraction of polar metabolites (Gonzalez et al. 1997). Although retention time reproducibility and extraction efficiency were quite good with this method, the overall recovery of metabolites for several classes of compounds, including nucleotides, phosphorylated metabolites and tricarboxylic acids, was poor (Prasad Maharjan and Ferenci 2003). Deionised hot (95 °C) water has also been used as an extraction solvent for the analysis of polar metabolites, with some success (Hiller et al. 2007).

Metabolites extracted from intracellular and extracellular compartments are mostly present in very small quantities. Therefore, a concentration step using an appropriate approach improves the instrument’s limit of detection (Smart et al. 2010). Once concentrated, samples are either chemically derivatized or analyzed directly, depending on the chemical nature of the analytes/solvent and the analytical platform being used. For a detailed discussion on the sample concentration of metabolites, the interested reader is directed to the work by Pinu and Villas-Boas (2017b). In summary, lyophilization (freeze drying), solvent evaporation and vacuum drying methods are commonly used in metabolomics. While freeze drying is the method of choice for many, predominantly to remove the water content from samples, non-aqueous samples are commonly concentrated using solvent evaporation techniques (e.g., under a stream of nitrogen or in a vacuum centrifuge) (Pinu and Villas-Boas 2017b).

2.1 Volatile metabolites

Like any other group of metabolites, volatile metabolites are extremely diverse in nature and ubiquitously present in different types of samples, including plants, animals and microorganisms (Rowan 2011). For instance, many plants release different types of volatile metabolites during their growth and development that work as a defense mechanism against pests and also as an attractant for pollinators (Qualley and Dudareva 2014). While volatiles are also an important constituent of the flavor and aroma of different food and beverage products (Pinu and Villas-Boas 2017a), they also have the potential to be characterized in human breath as candidate biomarkers of many diseases (e.g. lung cancer, amongst others) (Beale et al. 2017, 2018b; Lubes and Goodarzi 2017). Therefore, a substantial amount of research has already been undertaken to determine volatile metabolites, either in a targeted or an untargeted manner. Although analysis of volatile metabolites in a targeted manner has been performed since the development of GC instruments, untargeted approaches are becoming increasingly popular and widely applied in different areas of metabolomics research. This untargeted approach to volatiles has resulted in the term ‘volatome’ or ‘volatilome’ being coined, which is defined as the comprehensive analysis of volatile compounds in any type of sample (Phillips et al. 2013; Das et al. 2014). As Rowan (2011) and Lubes and Goodarzi (2018) have both published detailed reviews on recent developments relating to the analysis of volatile metabolites and volatile biomarker identification using GC–MS, this review focuses on advancements in non-volatile metabolite analysis. It should be noted that one recent advancement in volatile analysis that is applicable to metabolomics research, and that post-dates the aforementioned review papers, is the introduction of SPME Arrow (Piri-Moghadam et al. 2017; Soria et al. 2017; de Souza et al. 2018).
The SPME Arrow contains a larger volume of sorbent compared to a standard SPME fiber, which provides improved robustness and extraction efficiency compared to conventional static headspace, dynamic headspace and standard SPME (Helin et al. 2015; Kremser et al. 2016). As such, SPME Arrow is considered an advancement/upgrade to conventional SPME, and is one that requires additional investment in order to ensure the correct autosampler CTC modules and injection tools, and suitable inlet septa are available to facilitate its use. Like SPME, SPME Arrow fibers are available for a range of sample matrices and compound classes.

2.2 Non-volatile metabolites (derivatization methods)

A prerequisite for GC–MS-based metabolomics is the derivatization of polar compounds to reduce analyte polarity and increase thermal stability and volatility. Active hydrogens in the functional groups of molecules containing carboxylic acids (–COOH), alcohols (–OH), amines (–NH2), and thiols (–SH) can be derivatized by alkylation, acylation or silylation (Dettmer et al. 2007). Table 2 lists the derivatization reagents commonly used in metabolomics-based studies. Trimethylsilylation (TMS) reagents are the most commonly used because of their comprehensive coverage of compound classes and their relative ease of use (Kanani and Klapa 2007), with N-methyl-N-(trimethylsilyl)trifluoroacetamide (MSTFA) and N,O-bis(trimethylsilyl)acetamide (BSA) being widely reported (Xu et al. 2010; Koek et al. 2011). It is important to note that MSTFA and BSA are both considered general purpose reagents with a wide application range and have a comparable silylation strength (Koek et al. 2011). However, N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA) is becoming increasingly popular amongst metabolomics researchers and is known to yield fewer ‘artifacts’ when compared to other commonly used silylation reagents, such as BSA, MSA and MSTFA (Little 1999).

Table 2 Common derivatization reagents for metabolomics-based GC–MS analysis (after Lai and Fiehn 2018; Farajzadeh et al. 2014)

In some cases, trimethylchlorosilane (TMCS) is added to the silylation reaction as a catalyst at a concentration of 1% or 10% (Orata 2012). In this context, TMCS is used to increase the reactivity of MSTFA and BSTFA (i.e., increase the TMS donor potential) and assists in the derivatization of secondary alcohols and amines (Orata 2012). Furthermore, pyridine, as an anhydrous solvent, is often used to prepare metabolite extracts for GC analysis, as it acts as an acid scavenger and accelerates the derivatization reaction without the need for prolonged elevated derivatization temperatures (Hyotylainen 2013). Several other reagents or mixes of reagents are also available that are far more selective than BSTFA and MSTFA, e.g., trimethylsilylimidazole (TMSI), a mix of hexamethyldisilazane (HMDS) with TMCS, or a mix of TMSI/BSA/TMCS. Such reagents or combinations of reagents have been developed to derivatize sterically hindered hydroxyl groups (Koek et al. 2011). Trimethylsilyl cyanide (TMSCN) has also been found to outperform MSTFA-based methods in terms of silylation reaction speed, sensitivity, and repeatability (Khakimov et al. 2013); however, it is not as widely used because, as a derivatization reagent, it is more volatile than MSTFA.

The stability of derivatized metabolites can be improved via the use of a larger silyl group, as introduced by N-methyl-N-(tert-butyldimethylsilyl)trifluoroacetamide (MTBSTFA). However, some researchers have reported difficulties in derivatizing sugars and some amino acids using this reagent (Dettmer et al. 2007). The sugars concerned include mono-, di- and tri-saccharides, sugar phosphates and sugar alcohols, which can be measured using GC–MS. However, there are several underlying issues with measuring sugars by GC–MS. The first issue, which generally applies to all polar, high boiling primary metabolites, is that they need to be made volatile so that they are suitable for GC–MS. Secondly, sugars (mono-, di- and tri-saccharides) fragment similarly at 70 eV; this means that they are difficult to distinguish based on mass spectra alone and can only be identified by comparing their retention times and retention indices (RIs) to those of authentic standards. Finally, care must be taken when attempting to identify sugars, even with open-source deconvolution programs such as AMDIS. This software is generally unable to differentiate hexose sugars in biological standards. It is therefore advised that users create a curated in-house retention time locked (RTL) library to aid in the absolute identification of sugars, or utilise commercially available libraries with retention indices (Kind et al. 2009; Gika et al. 2018).
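Because sugar identification relies on retention indices rather than spectra alone, it may help to show how an RI is obtained. The sketch below uses the linear (van den Dool and Kratz) retention index for temperature-programmed GC, which interpolates an analyte's retention time between the two n-alkanes that bracket it. The function name and the alkane retention times are hypothetical, and the choice of n-alkanes as the index series is an assumption; some libraries are instead built on other reference series such as fatty acid methyl esters.

```python
from bisect import bisect_right
from typing import Sequence


def linear_retention_index(
    rt: float,
    alkane_rts: Sequence[float],
    alkane_carbons: Sequence[int],
) -> float:
    """Linear retention index of a peak eluting at `rt`.

    `alkane_rts` must be sorted ascending and paired with the carbon
    numbers of the n-alkanes in `alkane_carbons`.
    """
    if len(alkane_rts) != len(alkane_carbons) or len(alkane_rts) < 2:
        raise ValueError("need at least two alkanes with matching carbon numbers")
    if not alkane_rts[0] <= rt <= alkane_rts[-1]:
        raise ValueError("retention time lies outside the alkane ladder")
    # Index of the alkane eluting at or just before the analyte.
    i = min(bisect_right(alkane_rts, rt) - 1, len(alkane_rts) - 2)
    t_n, t_next = alkane_rts[i], alkane_rts[i + 1]
    n, n_next = alkane_carbons[i], alkane_carbons[i + 1]
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))


# Hypothetical alkane ladder: C16, C17 and C18 eluting at 10, 12 and 14 min.
ri = linear_retention_index(13.0, [10.0, 12.0, 14.0], [16, 17, 18])
```

A candidate sugar derivative would then be accepted only if its measured RI falls within a chosen tolerance of the RI recorded for the authentic standard; the appropriate tolerance depends on the column and method and is not specified here.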

One approach to overcome this challenge that provides improved mass fragmentation and chromatography (in terms of separation and less complex chromatograms) is the application of a two-step derivatization protocol that involves oximation with methoxyamine hydrochloride (MOX) (Ruiz-Matute et al. 2011). While there are other reported one-step and two-step approaches that provide similar outcomes, MOX is commonly used in untargeted metabolomics studies, as it protects carbonyl moieties of keto acids and sugars prior to silylation (Hyotylainen 2013). Such an approach improves the analytical response by eliminating multiple reaction products being formed by inhibiting ring formation of reducing sugars during silylation (Yi et al. 2014). Interestingly, Koek et al. (2006) compared several derivatization reagents for the analysis of alcohols, aldehydes, amino acids, amines, fatty acids, (phospho-) organic acids, sugars, sugar acids, (acyl-) sugar amines, sugar phosphates, purines, pyrimidines, and aromatic compounds in microbial metabolomics. They found that MSTFA produced the best results overall, and the addition of TMCS as a catalyst did not improve the performance significantly (Koek et al. 2006). Abbiss et al. (2015) compared BSTFA and MSTFA directly in the derivatization of rat urine samples and found that BSTFA showed a significantly greater TIC intensity compared to the MSTFA reagent.

In general, it is important to optimize the sample extraction and derivatization protocols in metabolomics-based studies by investigating multiple derivatization reagents, using both authentic standards from multiple metabolite compound classes and spiked sample matrices. Such an approach ensures that a maximum number of analytes are detected, and that the recovery and derivatization of each class being analyzed can be assessed and reported (Schummer et al. 2009; Villas-Bôas et al. 2011; Elie et al. 2012). This is particularly important in complex matrices such as urine, where urease is used as a treatment step to reduce and remove high levels of urea. This then enables the simultaneous GC–MS analysis of urinary organic acids, amino acids and sugars after BSTFA and TMCS derivatization (Shoemaker and Elliott 1991; Kuhara 2001). If left untreated, the urea component in urine would result in the minor constituents being less accessible to the derivatization agent, and their subsequent analysis would be impacted. Furthermore, when silylation reagents are used, it is imperative that extracts are dry and free of residual water (Dettmer et al. 2007). For example, 1 µL of water in an extract will use up approximately 20 µL of silylation reagent (e.g., MSTFA) (Koek et al. 2006). Derivatized extracts should also be kept free of water to avoid any unwanted hydrolysis of derivatized metabolites prior to analysis. Silylation can also result in unwanted conversion reactions; for example, Halket et al. (2005) observed the conversion of arginine into ornithine with the use of BSTFA and MSTFA. Such conversions and artifacts make data interpretation convoluted and can impact the end biological inference.
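The practical consequence of residual water can be made concrete with a short calculation. This sketch simply applies the approximate ratio quoted above (about 20 µL of silylation reagent consumed per 1 µL of water) as a linear rule; the linearity, the function name and the example volumes are assumptions made for illustration only.

```python
# Approximate figure quoted in the text (Koek et al. 2006); treated here
# as a simple linear ratio, which is an assumption of this sketch.
REAGENT_UL_PER_UL_WATER = 20.0


def reagent_left_after_water(
    reagent_added_ul: float,
    residual_water_ul: float,
    ul_reagent_per_ul_water: float = REAGENT_UL_PER_UL_WATER,
) -> float:
    """Estimate the silylation reagent (µL) still available to derivatize
    metabolites once residual water has consumed its share."""
    if reagent_added_ul < 0 or residual_water_ul < 0:
        raise ValueError("volumes must be non-negative")
    consumed = residual_water_ul * ul_reagent_per_ul_water
    # Reagent cannot go below zero; zero means it is fully quenched.
    return max(reagent_added_ul - consumed, 0.0)


# Hypothetical case: 50 µL of reagent added to an extract holding 2 µL of water.
remaining_ul = reagent_left_after_water(50.0, 2.0)
```

Under these assumed volumes most of the reagent is lost to water before any metabolite is derivatized, which is why thorough drying of extracts is emphasised.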

To overcome such challenges, an alternative class of reagent that can be used for aqueous samples is the chloroformates [i.e., methyl chloroformate (MCF), ethyl chloroformate (ECF) and propyl chloroformate (PCF) (Matysik et al. 2016; Primec et al. 2017; Yang et al. 2017; Muguruma et al. 2018)]. Chloroformate derivatization of amino and non-amino organic acids, phosphorylated organic acids and fatty acid intermediates takes place at room temperature, is less prone to matrix effects and is increasingly being used (Smart et al. 2010; Pinu et al. 2014; Casu et al. 2018). Alkylation using chloroformate reagents works by replacing the active hydrogen of a molecule with an alkyl group; the target compounds are mostly primary and secondary amines, amides, sulphonamides, thiols, phenols, carboxylic acids and alcohols (Söderholm et al. 2010). Other benefits of using MCF derivatization in particular are that the reagents are inexpensive, reactions occur within 2–3 min and the derivatives can be easily separated from the reaction mixture (Villas-Bôas et al. 2011). In addition to MCF, diazomethanes are also used for alkylation derivatization and have been used for the methylation of carboxylic acids, acidic herbicides and fatty acids (Vijaya Saradhi et al. 2007; Patnaik et al. 2008; Ranz et al. 2008). This reaction is rapid and simple and produces minimal by-products. However, these reagents are highly toxic and have a limited storage time (Wells 1999). Other derivatization methods, some of them microwave assisted, include acylation. This method adds an acyl (–COR) group to the molecule by replacing a reactive hydrogen. The acylated derivatives are hydrolytically stable, but the reagents are hazardous (Söderholm et al. 2010).

Lastly, in order to complete the chemical derivatization process, physical processes are required to facilitate the reactions. As such, researchers have investigated many strategies for undertaking sample derivatization in GC–MS based metabolomics research (Moros et al. 2017). These range from batch sample derivatization using conventional heating blocks and agitators through to more advanced high-throughput techniques that rely on microwave energy to drive reactions, and automated instrumentation that enables in-time sample derivatization. Figure 2 illustrates an overview of these different approaches and the following section provides some detail on their use in metabolomics research.

Fig. 2 Overview of common derivatization protocols used in GC–MS based metabolomics

2.2.1 Offline derivatization

Offline derivatization is by far the most commonly reported approach for sample derivatization in GC–MS based metabolomics research (Hyotylainen 2013). Typically, offline derivatization refers to the batch preparation of dried extracts for GC–MS analysis, where dried extracts (of known aliquot) are reconstituted in the derivatizing reagent (either neat or mixed with a solvent such as pyridine, prepared using a two-step approach comprising MOX followed by silylation or silylation only). The reconstituted extracts are then incubated for a prescribed time, with or without agitation, either at room temperature (Gordon 1990), in an oven (Gordon 1990) or using a compact dry heating block with agitation functionality (i.e., thermomixer) (Karpe et al. 2015) or shaking incubator (Warren et al. 2012). Once derivatized, the samples are transferred to clean vials prior to analysis by GC–MS. It is important to note that an offline derivatization approach is typically limited to smaller batches (of ca. 40–50 samples), which are analyzed in random order to account for analytical variations that arise from samples sitting in the autosampler rack for prolonged periods post derivatization (Villas-Bôas et al. 2011; Zarate et al. 2016). As illustrated by Villas-Bôas et al. (2011), the reproducibility of metabolites [in terms of relative standard deviation (RSD)] can vary significantly over a 72 h period. However, it was noted by Kind et al. (2009) that there was no evidence of different trimethylsilylation ratios depending on post reaction times in samples derivatized using an offline batch approach (e.g., while waiting for an injection on autosampler racks). Instead, it is proposed that the actual ratio of synthesis and degradation of trimethylsilyl derivatives may rather depend on the presence and activity of catalytic sites in the GC–MS injector (Kind et al. 2009), which can be controlled by replacing inlet liners and septa regularly (i.e., after 50 injections) and analyzing sample blanks periodically (Sumner et al. 2007). As a general recommendation, derivatized samples should be analyzed within 24–48 h of preparation.
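Two of the practices described above, analyzing a batch in random order and tracking reproducibility as %RSD, are simple to script. The sketch below is one possible way to do this in Python using only the standard library; the function names, the sample labels and the peak areas are hypothetical and are not taken from the cited studies.

```python
import random
import statistics


def randomized_run_order(sample_ids, seed=None):
    """Return a shuffled copy of the sample list (the input is not modified).

    A fixed seed makes the run order reproducible for record keeping.
    """
    order = list(sample_ids)
    random.Random(seed).shuffle(order)
    return order


def percent_rsd(values):
    """Relative standard deviation (%) = 100 * sample SD / mean."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; %RSD is undefined")
    return 100.0 * statistics.stdev(values) / mean


if __name__ == "__main__":
    batch = [f"S{i:02d}" for i in range(1, 41)]  # hypothetical batch of 40
    print(randomized_run_order(batch, seed=1)[:5])

    # Hypothetical peak areas of one metabolite in repeated QC injections
    qc_areas = [98.0, 100.0, 102.0]
    print(round(percent_rsd(qc_areas), 2))
```

Computing %RSD for pooled QC injections placed at intervals through the sequence is one way to check whether derivatives are degrading while samples wait on the autosampler rack.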

2.2.2 Microwave-assisted derivatization (MAD)

The use of microwave heating to improve the efficiency of GC-based derivatization protocols in reactions comprising silylation, acylation, and alkylation has been well documented (Söderholm et al. 2010). To date, microwave-assisted derivatization (MAD) has primarily been applied for the preparation of clinical (Kouremenos et al. 2010), forensic (Söderholm et al. 2010), food (Xu et al. 2011), industrial (Karpe et al. 2016) and environmental samples (Beale et al. 2013).

MAD provides researchers with the ability to dramatically reduce the time needed to derivatize samples compared to conventional offline approaches, where the derivatization time decreases from ‘hours’ to ‘minutes’ as a result of the increased pressure within the sample vial and the use of microwave energy to drive derivatization. For example, Silva (2006) developed a MAD protocol using a domestic microwave that substantially reduced the time needed to prepare samples for the GC–MS analysis of the monosaccharides glucose and galactose in human plasma. In this study, aldonitrile penta-acetate derivatization was completed in 8 min rather than 2.5 h. Similarly, Kouremenos et al. (2010) used a commercial scientific microwave instrument for the optimization of authentic metabolite standards, where the derivatization reaction was completed within 90 s. Furthermore, with the advancement of specialized scientific microwave instruments, high throughput rotors capable of housing GC-ready vials in thermally uniform silicon carbide plates enable researchers to derivatize upwards of 80 samples per reaction in large scale omics research (Beale et al. 2016). Such approaches have also been applied to prepare batches of > 40 samples in environmental and industrial metabolomics applications (Beale et al. 2012; Karpe et al. 2015). However, the uptake of MAD by the scientific community thus far has been limited (Söderholm et al. 2010). Until recently, domestic microwave ovens have been used without a clear scientific rationale and in many cases lacking high-quality reproducible procedures (Söderholm et al. 2010). Compounding the issue further, specialized instrumentation for MAD requires a significant initial capital outlay and specialized rotors. 
This may be an expense that cannot be justified where the same capital outlay could be better used to invest in in-time derivatization capability for GC assets already in use in metabolomics research. Autosampler vendors acknowledge this need and have been developing modules that incorporate such offline workflow components (i.e., MAD) onto CTC autosampler racks. For example, Gerstel (http://www.gerstelus.com) has developed CTC compatible modules that perform vacuum concentration, centrifugation and FAME derivatization via miniaturized microwave instruments. Likewise, organizations like Anatune in the UK (http://www.anatune.co.uk) provide deployable bespoke GC–MS solutions (MultiFlex GC-MSD and GC-Q-TOF) that utilize these technologies for specific workflows, including a metabolomics protocol that incorporates sample centrifugation and drying with in-time derivatization. Such principles are discussed in more detail in the following sections.

2.2.3 In-time (in-line) derivatization

With advances in robotics and automated systems, CTC autosamplers used in GC–MS analysis have increased in popularity. This is primarily because a CTC autosampler affords the analyst greater flexibility in terms of the type of samples that can be analyzed (e.g. liquid, headspace, SPME) without reconfiguring the GC system. A few authors have compared the use of CTC autosamplers for undertaking derivatization in metabolomics research (Ewald et al. 2009; Zarate et al. 2016). For example, Abbiss et al. (2015) compared offline batch, automated online batch and in-time (i.e., a sample ready for injection as required) TMS derivatization methods for the analysis of rat urine using BSTFA and MSTFA. For the offline protocols, metabolites were derivatized using a two-step protocol in which 20 µL of MOX (20 mg/mL in pyridine) was added and the samples were agitated for 90 min at 1200 rpm and 30 °C in an Eppendorf thermomixer comfort. Following this, 40 µL of BSTFA or MSTFA was added and the samples were further incubated at 37 °C for 30 min at 300 rpm. For the automated online protocols (batch and in-time), reagents (MOX, BSTFA or MSTFA) were added using a CTC CombiPAL autosampler fitted with an agitator. The online method was kept the same as the control with the exception of agitation, which was limited to 500 rpm (the maximum setting of the CombiPAL). It is noteworthy that the automated batch protocol prepared a batch of samples at once (n = 4), whereas the automated in-time procedure used a staggered approach resulting in a sample being ready for immediate injection every 70 min (Abbiss et al. 2015). It was concluded that the offline and online protocols were statistically comparable, with the in-time method displaying significantly fewer unresolved compounds.
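The staggered in-time approach described above can be sketched as a simple timetable. The Python example below is illustrative only: it takes the incubation times (90 min MOX, 30 min silylation) and the 70 min interval from the text, and assumes that reagent addition takes no time and that preparations may overlap freely on the autosampler. The function name and the timetable layout are hypothetical and do not represent CombiPAL control software.

```python
from typing import List, Tuple

# Incubation times taken from the protocol described in the text:
# 90 min methoximation (MOX) followed by 30 min silylation.
MOX_MIN = 90
SILYLATION_MIN = 30


def in_time_schedule(
    n_samples: int,
    interval_min: int = 70,
    prep_min: int = MOX_MIN + SILYLATION_MIN,
) -> List[Tuple[int, int]]:
    """Return (prep_start, ready_for_injection) times in minutes per sample.

    Simplifying assumptions: reagent addition takes no time, and sample
    preparations may overlap freely on the autosampler.
    """
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    return [(i * interval_min, i * interval_min + prep_min) for i in range(n_samples)]


if __name__ == "__main__":
    for start, ready in in_time_schedule(3):
        print(f"start {start:4d} min -> ready {ready:4d} min")
```

Under these assumptions the preparation of the second sample begins while the first is still incubating, so each sample has the same derivatization-to-injection delay regardless of its position in the sequence.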

It should be noted that the greatest benefit of moving to an in-time derivatization protocol is that it reduces the time and frees up the labor resources that would otherwise be needed to prepare samples following a batch derivatization protocol. An in-time derivatization protocol also helps to eliminate human error that may occur while manually transferring samples and reagents. This type of protocol also allows a GC–MS instrument to be operated continuously (24/7) with the samples being overlapped, thus increasing the throughput (Zarate et al. 2016). The only limitations are the vial holding capacity of the autosampler, the frequency with which liners, septa and syringes need to be changed and the ongoing performance of the analytical column and MS detector. As such, a strong QA/QC regime throughout a sample sequence is needed to ensure optimal, ongoing analytical performance. Furthermore, when investing in high-throughput capability for metabolomics research, it appears more advantageous to invest in a CTC capability for in-time derivatization than to acquire a microwave instrument for MAD, although, as alluded to above, CTC autosamplers have modules that enable microwave capability. While this technology is currently limited to larger vials (> 4 mL) focused on food FAMEs analysis, with further development, it may be possible to include microwave approaches within a metabolomics CTC workflow.

2.2.4 In-liner derivatization

The derivatization of metabolite extracts can potentially require long incubation times before injection, and as such, a potential drawback is that the derivatization reagent and derivatives (such as TMS compounds) may undergo hydrolysis (Koek et al. 2006). The advancement of MAD and in-line derivatization methods has addressed these issues to an extent (provided proper protocols are followed). However, the issue remains where such technology is not accessible, and time and labor resources are constrained. In such cases, in-liner derivatization can be applied, where the near-instantaneous (within a few seconds) derivatization of the sample is performed inside the GC inlet. To achieve in-liner derivatization, the sample and derivatization reagent are drawn into the GC syringe using a multi-layer “sandwich” injection, with an air gap in between solutions, prior to injection into a hot inlet (Ferreira et al. 2013; Marsol-Vall et al. 2016). Such an approach enables the derivatization of the sample to be completed within seconds during the injection cycle, eliminating the time needed in offline sample preparation methods without adding any time to the sample sequence or GC cycle (Docherty and Ziemann 2001). Khakimov et al. (2013) applied in-liner derivatization with MTBSTFA containing 1% tert-butyldimethylchlorosilane (TBCS) as the derivatization reagent to analyze blueberry extracts. Docherty and Ziemann (2001) used BSTFA in-liner derivatization for the detection of mono- and dicarboxylic acids in a smog chamber. The use of in-liner derivatization is yet to be fully assessed for all chemical classes and for two-step derivatization protocols (i.e., MOX and BSTFA). However, its application to fatty acids, pentachlorophenol and opiates has been reported (Docherty and Ziemann 2001). An analogous approach is on-column derivatization using micro solid-phase extraction (μ-SPE) cartridges. 
Pandohee and Jones (2016) demonstrated its application for the analysis of short-chain fatty acids in olive oil and proposed its suitability more broadly in metabolomics and lipidomics studies. In-liner derivatization may be advantageous to researchers who do not have access to MAD or in-time derivatization apparatus and have a targeted set of metabolites to analyze.

Undoubtedly, in-liner derivatization is advantageous as it reduces the number of steps in the derivatization process (e.g. time, mixing and heating). Similar to in-time derivatization methods, in-liner approaches have the potential to allow continuous instrument operation. However, such an approach places a stronger requirement on maintaining a clean inlet and ensuring the liner is regularly changed. To accommodate this, there are CTC attachments that enable automated liner replacement to be included as part of the sequence method. It is also important to mention that the inlet liner (if not appropriately selected) can lead to a number of side reactions, resulting in artefacts or in overestimation of the relative response ratio or calculated concentration of metabolites. For example, glutamine and/or glutamate can readily be converted to pyroglutamate (via pyrolysis) in the inlet. Similarly, arginine can readily be converted to ornithine (and becomes somewhat chromatographically unresolved). In terms of sugars (e.g. glucose), insufficient MOX and TMS reagent results in unconverted glucose in the form of pyranose and furanose, scattered throughout the TIC. These are just a few examples of the drawbacks of chemical derivatization and the artefacts that may result from a poorly executed in-liner approach.

3 Sample analysis by gas chromatography coupled with mass spectrometry (GC–MS)

3.1 Gas chromatography methods of ionization

A variety of ionization sources are available for the different analytical instruments used in metabolomics. However, electron ionization (EI) is the most common ionization method used in GC–MS based metabolomics research. EI is considered a ‘hard’ ionization method that leads to the reproducible fragmentation of molecules into well characterized mass spectral fingerprints. By convention, EI is performed at 70 eV, and mass spectra are typically considered reproducible between instruments from different vendors and across instruments with different mass analyzers (e.g. quadrupole, time of flight, etc.). This standardization of EI allows the use of mass spectral libraries, such as the Agilent G1676AA Fiehn GC/MS Metabolomics Retention Time Locked (RTL) Library (Heinz et al. 2001; Kind et al. 2009), or the publicly available Golm Metabolome Database (GMD) (Kopka et al. 2005; Toepfl et al. 2005), National Institute of Standards and Technology (NIST) Mass Spectral Database (Daniel et al. 2010), and the Wiley Registry™ (Hoboken, USA) of Mass Spectral Data (now part of the Wiley Spectra Laboratory), 11th Edition (Jamdar et al. 2010). The highly reproducible retention indices (RIs) from GC can also be used for orthogonal confirmation of isobaric compounds that often produce similar mass spectra but separate distinctly in the chromatographic domain. RIs are typically determined via the use of an alkane or FAME standards mix, from which a retention time calibration file can be created and used to annotate compounds with RI information (Strehmel et al. 2008). The use of both RIs and RTL methods enables retention time drift to be minimized and results in an increase in metabolite identification confidence in untargeted workflows. 
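As an illustration of how RI information can be assigned from an alkane standards mix, the sketch below computes a linear retention index by interpolating between the two n-alkanes that bracket a peak. The text does not state which RI formula the cited workflows use; the linear (temperature-programmed) form shown here is a common choice and is assumed for illustration. The function name and the example retention times are hypothetical.

```python
from bisect import bisect_right
from typing import Dict


def linear_retention_index(rt: float, alkane_rts: Dict[int, float]) -> float:
    """Linear retention index of a peak from bracketing n-alkane standards.

    alkane_rts maps alkane carbon number -> retention time (same units as rt).
    RI = 100 * [n + (N - n) * (rt - t_n) / (t_N - t_n)], where n and N are the
    carbon numbers of the alkanes eluting just before and just after the peak.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if len(carbons) < 2:
        raise ValueError("at least two alkane standards are required")
    if any(t2 <= t1 for t1, t2 in zip(times, times[1:])):
        raise ValueError("alkane retention times must increase with carbon number")
    if not times[0] <= rt <= times[-1]:
        raise ValueError("rt lies outside the calibrated alkane range")

    # Index of the alkane eluting at or before rt (clamped so that a peak
    # co-eluting with the last alkane still has an upper bracket).
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, big_n = carbons[i], carbons[i + 1]
    t_n, t_big_n = times[i], times[i + 1]
    return 100.0 * (n + (big_n - n) * (rt - t_n) / (t_big_n - t_n))


if __name__ == "__main__":
    # Hypothetical alkane ladder: C10, C11 and C12 retention times in minutes
    ladder = {10: 5.0, 11: 6.0, 12: 7.5}
    print(linear_retention_index(5.5, ladder))
```

Because the index is expressed relative to co-analyzed standards rather than as an absolute time, it is less sensitive to column trimming or flow changes than a raw retention time.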
Alternative ionization methods such as chemical ionization (CI) are ‘softer’ approaches that do not ionize the analyte by direct electron impact and therefore result in less fragmentation and a greater occurrence of the parent ion. CI can be traced back to the early 1950s, when Talrose and Lyubimova (1952) used the technique and generated a world-wide interest in ion–molecule reactions. While not new, CI is considered state-of-the-art in terms of analytical technology that can be applied in metabolomics-based analyses, and warrants inclusion herein. CI enables a greater linear range of metabolite concentrations to be analyzed across a range of metabolite classes (Lisec et al. 2016); linear range is a limitation of current EI–MS detectors. In terms of principles, CI forms new ionized species when molecules in the gas phase interact with ions, which can involve transfers of electrons, protons or other charged species. When positive ions result from a CI experiment, the phenomenon is described as ‘positive chemical ionization’ (PCI), and when negative ions are formed, the term ‘negative chemical ionization’ (NCI) is used.

In metabolomics, however, CI has not been applied as extensively as EI (Jaeger et al. 2016). This may be because CI is a soft ionization technique that yields fewer fragment ions, which aids identification less than the rich fragmentation of EI. However, the CI technique does produce abundant molecular ions, and thus is useful in targeted methodologies, and its popularity is increasing (Warren 2013). Turner et al. (2011) used CI and EI methods to compare the metabolomic profiles of five exhaled breath samples, and noted distinct compounds which established differences in selectivity and sensitivity between CI and EI (Pacchiarotta et al. 2010; Turner et al. 2011; Warren 2013; Ruttkies et al. 2015). More recently, Kloehn et al. (2015) used PCI and NCI for parasite extracts, analyzing perfluorotritrimethylsilyl (PFtriTMS) derivatives of deoxyribose and ribose with methane as reagent gas. Analogous to CI, high resolution, accurate mass GC–MS instruments fitted with low energy EI sources (15 eV) allow CI-like spectra to be obtained via softer ionization and greater preservation of molecular ions. Such technologies are gaining broader uptake within untargeted metabolomics studies (Dunn et al. 2011).

Other soft ionization approaches include atmospheric pressure chemical ionization (APCI) and atmospheric pressure photo ionization (APPI) sources. APCI sources were first used in GC applications in the 1970s. Today, APCI sources are typically coupled to high mass resolution systems including time-of-flight (TOF) and Fourier transform (FT) orbitrap and ion cyclotron resonance (ICR) systems. APCI sources use a softer ionization approach than EI, generating the molecular ion. Ionization occurs by passing the chromatographic gas stream through a corona discharge, where nitrogen gas radicals are formed first. Ionization in both positive and negative modes is possible via several different mechanisms. APCI sources can also be used with LC and GC interchangeably and can be swapped from one chromatographic system to another within minutes on the same mass spectrometer. APCI has been used for metabolomics applications including environmental pollutants (Nácher-Mestre et al. 2014; Portolés et al. 2014), cell cultures (Wachsmuth et al. 2015), analysis of cerebrospinal fluid (Carrasco-Pancorbo et al. 2009) and more recently for analysis of Spanish olive oils (Sales et al. 2017). In hybrid GC–LC APCI sources, continuous infusion of water enhanced the ionization of methyl chloroformate derivatives of pancreatic cancer cell metabolites (Wachsmuth et al. 2014).

An APPI source for GC was first introduced in 2007 by Waters (McEwen 2007), soon followed by sources from Luosujarvi et al. (2008) and Bruker (Carrasco-Pancorbo et al. 2009) for metabolite profiling and environmental pollutants (Bin et al. 2014). Several other ionization sources are also available, including a GC–APPI source developed and released on the Thermo Orbitrap (Huba and Gardinali 2016; Kersten et al. 2016) and a novel hybrid GC–APCI/LTP source that utilizes low temperature plasma (LTP) to induce ionization within a standard APCI source; the latter has been used to assess a variety of volatile organic compounds (VOCs) (Norgaard et al. 2013).

3.2 Mass analyzers used in gas chromatography

The mass analyzer is an essential part of a mass spectrometer; it measures the mass-to-charge ratio (m/z) of molecules present as charged ions. To distinguish one mass peak from another, the mass analyzer must be able to resolve individual ions of similar mass. Both mass resolution and resolving power (RP) describe the extent to which any mass analyzer can achieve resolution of individual ions. Mass resolution is defined as the degree of separation between two adjacent ions observed in the mass spectrum (Δm), measured at the full width at half maximum (FWHM) of the peak. RP is defined as the nominal mass (m) divided by the difference in masses (Δm). Higher mass RP is essential for high mass accuracy, as a higher RP allows identification of the center of a peak and determination of mass error. Mass error is the difference between the observed mass and theoretical mass of a given ion; a lower mass error allows higher confidence assignment of molecular formula, aiding tentative identification. The combination of high mass resolution and high mass accuracy enhances identification of contributing molecules by accurately determining their mass and resolving ions very close in mass.
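The definitions above translate directly into arithmetic. The following Python sketch computes RP as m/Δm and the mass error as observed minus theoretical mass, as defined in the text. Expressing the error in parts per million (ppm) is a widely used normalization that the text does not itself spell out, so it is included here as an assumption; the function names and example values are hypothetical.

```python
def resolving_power(mass: float, delta_m: float) -> float:
    """RP = m / Δm, with Δm taken as the peak width at FWHM (same units as m)."""
    if delta_m <= 0:
        raise ValueError("delta_m must be positive")
    return mass / delta_m


def mass_error_da(observed: float, theoretical: float) -> float:
    """Mass error as defined in the text: observed minus theoretical mass."""
    return observed - theoretical


def mass_error_ppm(observed: float, theoretical: float) -> float:
    """Mass error relative to the theoretical mass, in parts per million."""
    if theoretical <= 0:
        raise ValueError("theoretical mass must be positive")
    return (observed - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    # Hypothetical peak at m/z 200 with a FWHM of 0.01
    print(round(resolving_power(200.0, 0.01)))
    # Hypothetical ion observed at 200.0002 against a theoretical 200.0000
    print(round(mass_error_ppm(200.0002, 200.0), 2))
```

For these illustrative values the resolving power is about 20,000 and the mass error about 1 ppm.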

A variety of different mass analyzers have been coupled to GC with differing mass resolutions and mass accuracies (Table 3). Within the high throughput metabolomics context, where analysis of large numbers of complex samples is desired, one option is to use robust library spectra with suitable standards (as described above) coupled to highly sensitive detectors for confident assignment of identity. The alternative is to use a high mass resolution detector with low mass error that has the ability to measure the molecular ion, providing confident assignment of molecular formula.

Table 3 List of common mass analyzers and instrument configurations detailing: mass resolution, approximate mass range, tandem MS/MS capabilities and acquisition speed

The most common mass analyzers used for GC–MS applications are low mass resolution quadrupole mass filters, including GC single quadrupole (GC–Q–MS) and triple quadrupole instruments (GC–QqQ–MS). Quadrupole based systems have several advantages including high sensitivity and good dynamic range but can be limited by slower scan rates and lower mass accuracy relative to high mass resolution based systems. Generally, GC–Q–MS based systems are capable of resolving ca. 500–800 compounds in an individual run, of which a few hundred (100–350) can be identified against spectral libraries. This is of course dependent on the sample being analyzed and the extraction protocol followed. Further increases in sensitivity and the use of quantification strategies can be achieved by utilizing GC–QqQ–MS systems for targeted metabolite assays. The selection of appropriate ion(s) for quantification must be stressed. MS/MS should be performed on authentic standards for metabolites of interest, and sensitive yet unique ions should be selected for QqQ experiments (minimally one qualifier and one quantifier ion, i.e. at least two MRM transitions). Ion-trap technology may also be used in this context for low resolution GC–MS and increased sensitivity for targeted GC–MS/MS applications (Arrebola et al. 2013; Brandt et al. 2014; Visentin and Pietrogrande 2014). The disadvantage of low-resolution mass detectors is the difficulty of elucidating the structure of a detected but unknown molecule. This further supports the benefits of high resolution accurate mass data over unit mass resolution data obtained on Q–MS and QqQ–MS instruments. Comparison of acquired spectra to spectral libraries of authentic standards (as above) aids in identification of metabolites but is limited in deducing the structure of unknown metabolites. Whilst GC–MS databases were predominantly generated with low resolution MS, it is possible to search high-resolution data against these spectral libraries. 
The reverse (searching low-resolution spectra against a high-resolution database) is not possible. Due to the historical use of GC–Q–MS based systems, most spectral libraries do not contain any accurate mass information, which would significantly aid the identification process. It is also important to be aware that GC–MS derived peaks are not necessarily made up of individual metabolites. In fact, each peak could potentially represent a mixture of co-eluting metabolites. Spectra of putative metabolites can be obtained from overlapping peaks by applying spectral deconvolution methods (Fancy and Rumpel 2008). These methods come standard in vendor supplied data acquisition/analysis software tools or can be applied via freely available programs such as AMDIS (http://www.nist.gov). Furthermore, metabolite identification has changed remarkably in recent years. Comprehensive commercial databases such as NIST and METLIN (Guijas et al. 2018) have been complemented by extensive open-access MS/MS databases containing hundreds of thousands of spectra, including: mzCloud, MassBank, the Global Natural Product Social Molecular Networking site and the Human Metabolome Database (Gardinassi et al. 2017; Wishart et al. 2018). In addition, a range of in silico predictive tools have emerged recently to assist with the interpretation of high resolution MS/MS data (Ma et al. 2015; Vinaixa et al. 2016).
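To illustrate why high-resolution data can be searched against unit-mass libraries, the sketch below bins accurate m/z values to nominal mass and scores the match with a plain cosine similarity. This is a generic similarity measure chosen for illustration; it is not the scoring algorithm of NIST, AMDIS or any other tool named above, and simple rounding to the nearest integer is a simplification that may misassign ions with a large mass defect. The function names and the example spectra are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Mapping


def to_unit_mass(spectrum: Mapping[float, float]) -> Dict[int, float]:
    """Sum intensities into nominal (integer) m/z bins."""
    binned: Dict[int, float] = defaultdict(float)
    for mz, intensity in spectrum.items():
        binned[round(mz)] += intensity
    return dict(binned)


def cosine_match(query: Mapping[float, float], library: Mapping[float, float]) -> float:
    """Cosine similarity (0 to 1) between two spectra after unit-mass binning."""
    q, lib = to_unit_mass(query), to_unit_mass(library)
    dot = sum(q[mz] * lib[mz] for mz in q.keys() & lib.keys())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in lib.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Illustrative accurate-mass query against a unit-mass library entry
    high_res_query = {73.047: 100.0, 147.066: 50.0}
    unit_mass_entry = {73: 100.0, 147: 50.0}
    print(round(cosine_match(high_res_query, unit_mass_entry), 3))
```

The binning step discards the accurate mass information, which is why the comparison only works in one direction: a unit-mass spectrum cannot be expanded back into accurate masses.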

Uptake of modern accurate mass analyzers used in high resolution mass spectrometry (HRMS) will significantly increase depth of coverage, through their ability to resolve more ions, and widen the impact of GC–MS metabolomics applications. The types of HRMS systems include time-of-flight (TOF), double-focusing systems (DFS) employing dual electrostatic and magnetic sectors, and Fourier transform (FT) based instruments, both orbitrap and ion cyclotron resonance (ICR) (Dunn and Ellis 2005; Dettmer et al. 2007; Lei et al. 2011). Of these, TOF technology is the most utilized and offers high mass resolution, high mass accuracy, and very fast scan speeds, which can be very useful for deconvolution of overlapping GC peaks. This is particularly evident when resolving the narrow and sharp peaks generated using fast GC methods or GC × GC separation prior to detection. TOF technology has recently been used to develop a high mass resolution deconvolution algorithm and library, BinBase and vocBinBase (Skogerson et al. 2011). The use of automated software and high mass accuracy promises to significantly enhance metabolite identification.

Higher mass resolution is available from new DFS electrostatic and magnetic sector analyzers, which offer high mass resolution with variable scan speeds and high sensitivity. GC–HRMS with traditional FT based systems is attained at the sacrifice of scan speed, which can limit application to high speed GC. GC–FT–ICR–MS applications have been seen in ‘petroleomics’, characterizing crude oil and bio-oil products (Barrow et al. 2014; Schwemer et al. 2015; Zuber et al. 2016). Recent GC–orbitrap applications include pharmacological research and the analysis of environmental contaminants (Peterson et al. 2014; Baldwin et al. 2016; Postigo et al. 2016).

The use of GC–HRMS is particularly advantageous for stable isotope labelling (SIL) approaches incorporating 13C or 15N into analytes. Two advanced approaches have been reported: molecular-ion directed acquisition (MIDA) (Peterson et al. 2014) for discovery metabolomics and isotopic ratio outlier analysis (IROA) (Qiu et al. 2016) for metabolite identification. Hyphenated systems and tandem mass spectrometry further extend the applicability of GC–MS, for example by coupling low resolution quadrupole filters and collision induced dissociation to a high mass resolution detector (TOF, orbitrap or FT–ICR). Additionally, those who are new to, or undertaking, analytical chemistry experiments need to consider what is involved in method validation, not only for GC–MS but also for complementary analytical platforms such as LC–MS and NMR. Validation involves many steps that should be undertaken thoroughly, including assessment of: specificity/selectivity, accuracy/precision, repeatability/reproducibility, trueness/recovery, linearity of authentic standards in solvent and in sample matrix, limits of detection/quantification, stability experiments etc. (FDA 2018).

3.3 Common metabolite classes analyzed by GC–MS

The following section details metabolite class-specific considerations for common metabolomics-based metabolites analyzed by GC–MS, namely: amino acids (AAs), mono-, di- and tri-saccharides (sugars), and fatty acids (FAs). The focus of this review has been limited to the three main derivatized chemical classes analyzed by GC–MS (see the reviews by Dettmer et al. (2007) and Zhou et al. (2017) for VOC metabolomics). A summary of selected GC columns and their application for AAs, organic acids, and FAs in specific biological matrices is also presented in Table 4. Note that the most commonly used phase is 5% phenyl, 95% methyl siloxane, which provides the most generic selectivity for untargeted metabolomic applications (Fancy and Rumpel 2008).

Table 4 Summary of some selected GC columns for GC–MS based separation and identification of amino acids, sugars, organic acids and fatty acids from various biological and non-biological matrixes

3.3.1 Amino acids (AAs)

Amino acids (AAs) form an important class of cellular metabolites, fundamental to numerous biochemical processes (Otter 2012; Krumpochova et al. 2015). The chromatographic analysis of AAs has recently been reviewed (Dołowy and Pyka-Pająk 2014). GC coupled with an FID or ECD has been widely described; however, GC–MS-based approaches are becoming more widely applied (Krumpochova et al. 2015).

Krumpochova et al. (2015) compared LC-based (e.g. HILIC–MS and RPLC–MS) and GC–MS platforms in terms of AA analysis and reproducibility. It was found that while HILIC–MS was more advantageous in untargeted analysis (in combination with a broader set of other metabolites), and RPLC–MS was able to identify all AAs investigated, a GC–MS-based approach was faster and more reproducible for most targeted AAs. Figure 3 illustrates a GC–MS chromatogram of AAs; the total run time was 7 min and 18 AAs were resolved. Arginine could not be analyzed due to the thermal instability of its derivative (Krumpochova et al. 2015).

Fig. 3 Analysis of propyl chloroformate derivatized amino acids by GC–MS in SIM mode (Krumpochova et al. 2015). Identified AAs include: alanine (Ala) [RT = 1.1 min], glycine (Gly) [RT = 1.2 min], valine (Val) [RT = 1.4 min], leucine (Leu) [RT = 1.6 min], isoleucine (Ile) [RT = 1.7 min], serine (Ser) [RT = 1.9 min], threonine (Thr) [RT = 1.9 min], proline (Pro) [RT = 2 min], asparagine (Asn) [RT = 2.1 min], aspartic acid (Asp) [RT = 2.7 min], methionine (Met) [RT = 2.7 min], glutamic acid (Glu) [RT = 3 min], phenylalanine (Phe) [RT = 3.1 min], glutamine (Gln) [RT = 3.7 min], lysine (Lys) [RT = 4.4 min], histidine (His) [RT = 4.6 min], tyrosine (Tyr) [RT = 4.9 min], and tryptophan (Trp) [RT = 5.1 min]

GC–MS has been widely used for the analysis of amino acids present in different types of biological samples. Jiménez-Martín et al. (2012) investigated the suitability of a one-step derivatization procedure using N-methyl-N-(tert-butyldimethylsilyl)trifluoroacetamide for the simultaneous analysis of 22 free amino acids in a variety of food sources by GC–MS. All 22 free amino acid derivatives were correctly detected and resolved, with reported %RSD in the range of 1.9–12.2%. Similarly, Hope et al. (2005) analyzed amino acids and organic acids by comprehensive two-dimensional (2D) gas chromatography coupled to time-of-flight mass spectrometry (GC × GC–TOFMS). Dettmer et al. (2012) investigated the quantitative analysis of 22 free amino acids using propyl chloroformate/propanol derivatization (carried out directly in aqueous samples) on a single quadrupole GC–MS. Villas-Bôas et al. (2003) presented an MCF derivatization protocol for amino acids and di- and tricarboxylic acids in fungi.

3.3.2 Organic acids

Similar to AAs, organic acids are also an important group of primary metabolites. GC–MS metabolic profiling of organic acids has been used since the 1970s in the detection of inborn errors of metabolism (IEMs) (Pauling et al. 1971; Chalmers and Lawson 1982; Hoffmann et al. 1989; Duez et al. 1996). IEMs result from genetic mutations that affect an enzyme involved in intermediary metabolism. Organic acids are involved in many areas of intermediary metabolism (e.g. amino and fatty acid metabolism) and there is a correspondingly large number of IEMs in which organic acids accumulate in vivo as a result of a deficient enzyme. These IEMs can be diagnosed based on the detection in urine of abnormally elevated organic acids associated with each disorder.

GC–MS is crucial for both qualitative and quantitative analyses of urinary metabolites, and the specific elevated metabolites arising from many IEMs, including isovaleric acidemia, propionic acidemia, pyroglutamic acidemia and 3-methylcrotonylglycinemia, have been discovered using this technique (Eldjarn et al. 1970; Jellum et al. 1970). By 1980, Tanaka et al. had developed a method in which 155 metabolites were putatively identified, which helped demonstrate the importance of GC–MS in diagnostic medicine (Tanaka et al. 1980). Urinary organic acids are mostly extracted using a liquid–liquid extraction procedure similar to the following: organic acids are extracted with diethyl ether and/or ethyl acetate under specific acidic conditions, with or without the addition of sodium chloride, dehydrated with sodium sulfate, and finally evaporated to dryness and derivatized to increase their volatility, so as to be compatible with GC–MS analysis. As noted earlier, steps are needed to remove the high levels of urea in urine for organic acid analyses (Shoemaker and Elliott 1991). Currently, these types of analyses are routinely performed in hospitals using GC–MS with ‘general purpose’ fused silica (5%-phenyl)-methylpolysiloxane columns (Pitt et al. 2015). In addition, organic acid analysis by GC–MS has been applied to other areas of interest in metabolomics (Mamer et al. 2013; Khakimov et al. 2014; Irwin et al. 2018).

3.3.3 Fatty acids (FAs)

The analysis of FAs by GC–MS is complicated by their polarity and inadequate volatility. Therefore, prior to fatty acid analysis, it is necessary to convert the polar carboxyl groups into suitable volatile non-polar derivatives, such as methyl, ethyl or isopropyl esters, which are obtained by esterification (Dołowy and Pyka 2015). Fatty acid methyl esters (FAMEs) can be formed by transesterification using a wide range of alkylation reagents (Quehenberger et al. 2011), such as methanolic hydrochloric acid, methanolic boron trifluoride (Zhang et al. 2013; Sertoglu et al. 2014; Takahashi and Yoshida 2014), sulfuric acid (Han et al. 2011; Wang et al. 2012) and acetyl chloride (Ecker et al. 2012; Kopf and Schmitz 2013). The commercially available Meth Prep II reagent (methanolic m-trifluoromethylphenyltrimethylammonium hydroxide) has the advantage of being capable of a one-step transesterification of lipids such as sphingolipids, glycerophospholipids and glycerolipids, as well as FAME derivatization of fatty acids (Goetz et al. 1984). Precautions must be taken to minimize uneven esterification, which could significantly affect the resolution of minor fatty acid isomers; non-saponifiable constituents or impurities could likewise affect chromatographic resolution and the sensitivity of the MS detector (Dołowy and Pyka 2015). Of the different GC detectors, FID and MS are most commonly used in fatty acid analysis (Chuang et al. 2013; Cruz-Hernandez et al. 2013; Jurczyszyn et al. 2014; Nishi et al. 2014); a summary of selected studies that use GC–MS is given in Table 3.

The combination of GC with mass spectrometry (MS) is one of the most powerful tools for identifying and characterizing fatty acids, and is the most popular technique used today. It provides a solution to the ‘unknown peak’ issues faced with the FID detector (Casal and Oliveira 2010). Here fatty acids are characterized based on m/z, with a resolution and sensitivity sufficient to distinguish two different masses, and the results can be compared with a mass spectral database or library such as Wiley or NIST (Aini et al. 2009; Masic and Yeomans 2014; Nakagawa et al. 2014). The carrier gases commonly used for the separation of fatty acids are nitrogen, helium or hydrogen. A large variety of columns (stationary phases) with different properties (such as polar and non-polar columns) have been used in previous research, and most authors choose the column for fatty acid separation according to their application.

4 2D GC–MS in metabolomics

Overcrowded NMR chemical shift spectra and chromatograms (or in some cases electropherograms) are a fairly common feature of metabolomic studies, and this overcrowding can hinder subsequent analysis and interpretation of the underlying data (Pandohee et al. 2015). While there are many computational tools available for deconvoluting overlapping chromatograms, including empirical methods, comparison with library spectra, eigenvalue analysis, regression and others (Colby 1992; Kind et al. 2009), these are computationally intensive, require specialized knowledge and can introduce errors. In addition, there is no universally standardized and accepted ‘best’ procedure. Analytical methods that increase the available separation space in the primary analyses, and thus avoid the need for spectral deconvolution, therefore provide significant benefits in metabolomics. Indeed, developing methods to describe the metabolome in much greater resolution is arguably essential if the community is to provide a complementary dataset to that of genomics and proteomics. Such information is needed to construct computer network models to accurately describe cellular functions and advance our understanding of biological systems.

Two-dimensional gas chromatography (2D-GC) offers a potential method to increase the peak capacity and separation power of GC (Marriott and Shellie 2002). The method has been around for over twenty years but is still generally considered relatively novel and is far from being fully established due to the complexity (or perception of complexity) of such systems. While the finer details of 2D-GC are outside the scope of this review, the intent here is to highlight its potential application for further use in metabolomics. As such, the interested reader is directed to the comprehensive review by Mondello et al. (2008) on two-dimensional gas chromatography-mass spectrometry, which discusses at length the principles and challenges (in terms of techniques, types of modulation and their difficulties, issues with respect to chromatogram wrap-around, etc.). However, before discussing 2DGC applications, it is first necessary to define some of the terminology; this is discussed in more detail in the following sections, and Fig. 4 below provides a graphical overview of a typical GC × GC–MS system as a point of reference.

Fig. 4

Graphical schematic of a typical GC × GC–MS instrument. Note: Inj refers to the GC injection port; LMCS refers to the longitudinally modulated cryogenic system for sample focusing onto the second column; M refers to the modulator, which facilitates the cryogenic focusing; 1D refers to the first dimension column; 2D refers to the second dimension column; and TOFMS refers to the Time of Flight Mass Spectrometer detector. Chromatogram A represents the chromatographic output on the first dimension; Chromatogram B represents the chromatographic output on the second dimension

4.1 Multidimensional gas chromatography (MDGC) and comprehensive two-dimensional gas chromatography (GC × GC)

Multidimensional chromatography (either gas or liquid) involves coupling two columns with uncorrelated retention mechanisms (e.g. polar and non-polar) and running the sample on both (Pandohee et al. 2015). Multidimensional gas chromatography (MDGC) is the most commonly used 2DGC method and utilizes ‘heart-cutting’, in which only selected portions of eluate from the first column (dimension) are transferred to the second. In contrast, ‘comprehensive’ gas chromatography (GC × GC) involves transferring every portion of the eluate coming from the primary column to the secondary one, where it undergoes a further separation step before reaching a detector. The resulting data from each method can then be plotted in either a 2D or 3D space. The total peak capacity of the system is theoretically (though usually not quite) the product of the peak capacities of each dimension, and the resulting separation space usually far exceeds that of standard 1D systems, with the added benefit of increasing the instrument’s dynamic range (Mitrevski et al. 2009). Multidimensional chromatography has been growing in popularity for a variety of applications over the last 20 years as the technology and software needed to make this process accessible and easier have been established. The usefulness of this technique has been demonstrated across several fields including biomedicine, environmental science and plant biochemistry; a brief discussion of 2DGC in each of these areas is given here.
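
The product rule for peak capacity can be illustrated with a short Python sketch. The function names, the example capacities and the fraction_used correction factor are all assumptions introduced here for illustration; the factor is simply one way to express that the practical value usually falls short of the theoretical product.

```python
def theoretical_peak_capacity(n_first, n_second):
    """Upper bound on 2D peak capacity: product of the two dimensions."""
    if n_first <= 0 or n_second <= 0:
        raise ValueError("peak capacities must be positive")
    return n_first * n_second


def practical_peak_capacity(n_first, n_second, fraction_used):
    """Scale the theoretical value by the fraction of separation space used.

    fraction_used is a hypothetical correction factor in (0, 1]; it is not
    a quantity defined in the text.
    """
    if not 0.0 < fraction_used <= 1.0:
        raise ValueError("fraction_used must be in (0, 1]")
    return theoretical_peak_capacity(n_first, n_second) * fraction_used


# Illustrative values only: first and second dimension peak capacities
n1, n2 = 400, 15
print(theoretical_peak_capacity(n1, n2))
print(practical_peak_capacity(n1, n2, 0.6))
```

Even with a conservative correction, the estimate remains several times larger than the capacity of the first dimension alone, which is the basis of the gain in separation space described above.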

One of the first demonstrations of the power of GC × GC was work by Welthagen et al. (2005), who demonstrated its use in the analysis of complex metabolite profiles from mouse spleen tissue. The resulting two-dimensional chromatograms showed that mass spectral quality and sensitivity were largely improved by the enhanced resolution power of GC × GC. The improved capacity also allowed for the detection of peaks that could not be separated with one-dimensional GC analysis. The GC × GC analysis identified more than twice as many metabolites as 1DGC (1200 compared to 500 compounds). The potential for the technique in biomarker identification was also clearly demonstrated via the analysis and discrimination of spleens from New Zealand Obese (NZO) mice and lean C57BL/6 control strains via their metabolic profiles.

The use of GC × GC has also been applied to investigate biomarkers for diabetes mellitus (Li et al. 2009). This study identified five potential biomarkers including glucose (which might be somewhat expected) as well as 2-hydroxyisobutyric acid, linoleic acid, palmitic acid and phosphate. The work also showed that elevated free fatty acids were pathophysiological factors in diabetes. This result may have been overlooked if a standard 1DGC approach had been used, as such analyses tend to use either a polar or a non-polar column (not both) and focus only on aqueous phase metabolites.

Kouremenos et al. (2010) moved the biomedical applications of GC × GC forward by investigating the best derivatisation methods for samples and the potential of GC × GC for metabolomic analyses using different column sets. They also applied GC × GC coupled with time-of-flight mass spectrometry (TOF-MS) to the analysis of urinary organic acids from patients with inborn errors of metabolism. Although the sample size was limited, methylmalonic acidemia and deficiencies of 3-methylcrotonyl-CoA carboxylase and medium chain acyl-CoA dehydrogenase gave diagnostic profiles, while patients with deficiencies of very long chain acyl-CoA dehydrogenase and mitochondrial 3-hydroxy-3-methylglutaryl CoA synthase showed significant increases in urinary excretion of dicarboxylic acids. The advantage of GC × GC in this study was again the superior resolving power, which in this case enabled the separation of isomeric organic acids that were not resolved using 1DGC.

Interestingly, Kouremenos and co-workers later demonstrated an LC-GC × GC system (Kouremenos et al. 2016), although they did not use it for biomedical work but instead described how to set up the system. The approach was found to lead to a higher selectivity and peak capacity, with little sample preparation needed, but with the trade-off of a longer sample run time of approximately 40–60 min in some cases. The reduction in the overlap of different compound classes achieved by the LC step was found to simplify the mixture introduced to the GC × GC, which facilitated compound identification. The system therefore has great potential for the targeted and untargeted analysis of very complex sample types of the kind commonly seen in biomedical studies.

The use of MDGC has also been reported for breath gas analysis in the clinical environment by Mieth et al. (2010). This study included 11 patients undergoing cardiac surgery, in which propofol, 1,2-dichloroethane and 2,2,4,6,6-pentamethylheptane were shown to be present at elevated levels, although the compounds 1,2-dichloroethane and 2-propanol could have been present due to environmental contamination. Potential biomarkers could be determined in breath even in the presence of very high concentrations of the anesthetic sevoflurane. The authors were also able to profile intravenous drugs and clinical contaminants as well as metabolites. There is clearly great potential for GC × GC/TOF-MS to be used as a screening tool for the detection of new biomarkers in clinical breath analysis and for serious diseases such as cancer, where early, fast diagnosis could save lives (Beale et al. 2017). GC × GC based testing has, however, yet to make it into routine clinical practice.

GC–MS has a long history in the study of plant extracts and products, notably through the work of Fiehn, Kind and co-workers at the Max Planck Institute of Molecular Plant Physiology (Germany) and later at the University of California (USA). Their work on plant metabolism using GC and GC × GC is some of the best-known research in metabolomics. These applications range from unravelling plant gene functions in physiological contexts (so-called silent phenotypes) (Weckwerth et al. 2004), to the unbiased detection of unexpected metabolic responses under environmental stress conditions (Hirai et al. 2004), to extending and enhancing plant functional genomics studies (Fiehn et al. 2000) and the co-regulation of biochemical pathways that had previously been mapped separately (Fiehn 2003). Metabolomics has also been used in the study of plant disease (Allwood et al. 2008; Jones et al. 2011). GC × GC has been particularly applied to the analysis of essential oils and related substances (Shellie et al. 2002; Shellie and Marriott 2003).

One of the more applied studies was that of Beckner Whitener et al. (2016), who used solid phase micro extraction (SPME) to capture and analyse the untargeted volatile compound profile of Sauvignon blanc wine inoculated with different types of yeasts. The study took the novel approach of combining the SPME and GC × GC–TOF-MS analysis with sensory data. The work showed that each wine had a distinct profile in both metabolomic/chemical and sensory terms. This in itself is perhaps not so surprising, but the power of the GC × GC analysis allowed 300 unique features to be identified as significantly different across the study. The data not only give a more detailed profile of these yeasts’ contribution to Sauvignon blanc wine than previously reported, but also help increase our understanding of the contributions of non-Saccharomyces yeast to winemaking.

The study by Beckner Whitener et al. (2016) illustrates the potential of the technique for food product analysis, particularly for food fraud cases in terms of point of origin or route of manufacture testing. Such methods have been demonstrated in honey, for example, where the use of GC × GC can significantly increase sample throughput and reduce the risk of erroneous identification (Cajka et al. 2007). GC × GC based testing could also be used for quality control of plant samples in medicinal/herbal products as well as food, and some discussion has already taken place in the literature on this subject (Belliardo et al. 2006).

5 Data analysis and bioinformatics

The raw data files generated by GC–MS platforms have a complex three-dimensional format consisting of retention time, m/z values and intensity (abundance) axes; for GC × GC derived data, there is an additional dimension resulting from the second column. This comprehensive information needs to be processed before any statistical techniques are used to analyse the data. Typically, data processing involves a series of steps that translate the instrument-generated raw data into a two-dimensional data matrix suitable for statistical analyses. This can typically be undertaken using proprietary software that is packaged with the GC–MS instrument used to collect the data. Alternatively, raw data can be converted into an open standard format such as mzML (Martens et al. 2011). Open source programs such as msConvert (Adusumilli and Mallick 2017), which is part of ProteoWizard, can be used for the conversion.

After conversion, a range of software tools can be used to further process the mzML files (O’Callaghan et al. 2012; Kuich et al. 2014; Wehrens et al. 2014). However, two issues to note when processing GC–MS based metabolomics data are retention time drift, which impacts peak alignment, and metabolite identification. Due to the sensitive nature of the analytical instruments, a shift in retention time is often observed over the course of a run, especially for large batches that can last days. The first step in establishing and identifying metabolites is to find molecular features that occur as peaks in the processed data. Defining a peak involves steps such as correcting the baseline over the course of the run and deconvoluting any peaks that closely co-elute with each other (O’Callaghan et al. 2012).
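
The baseline correction and peak finding steps described above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the algorithm used by any of the cited tools: the baseline is estimated with a rolling minimum, peaks are taken as local maxima above a threshold, and deconvolution of co-eluting peaks is not attempted. The function names, window size, threshold and synthetic trace are all assumptions.

```python
def subtract_baseline(intensities, window=5):
    """Estimate the baseline as a rolling minimum and subtract it."""
    n = len(intensities)
    corrected = []
    for i in range(n):
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        corrected.append(intensities[i] - min(intensities[lo:hi]))
    return corrected


def find_peaks(times, intensities, min_height):
    """Return (retention_time, height) for local maxima above min_height."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        if y >= min_height and y > intensities[i - 1] and y >= intensities[i + 1]:
            peaks.append((times[i], y))
    return peaks


# Synthetic trace: a rising baseline with two peaks superimposed on it
times = [i / 10 for i in range(20)]  # retention time in minutes
bumps = {4: 40, 5: 100, 6: 40, 13: 25, 14: 60, 15: 25}
raw = [10 + i + bumps.get(i, 0) for i in range(20)]

corrected = subtract_baseline(raw, window=5)
for rt, height in find_peaks(times, corrected, min_height=20):
    print(f"peak at {rt:.1f} min, height {height}")
```

The rolling minimum removes the slow upward drift so that a single height threshold can be applied across the whole run; the window must be wider than the widest peak, otherwise the peak itself is treated as baseline.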

Once unique peaks have been separated out, peak alignment is undertaken to correct for retention time drift, so that the same peaks are aligned across all samples. Lastly, the resolved peaks are identified either using authentic standards or by querying the peak’s mass spectrum against a library of mass spectra such as those described previously in Sect. 3, e.g. NIST (Noble 2009), HMDB (Ren et al. 2015), GMD (Kopka et al. 2005) and FiehnLib (Kind et al. 2009) (see also https://chemdata.nist.gov/mass-spc/amdis/explanation.html). Outside of proprietary tools, there are still only a few tools that can automatically produce a list of possible metabolites from the m/z signals at a particular retention time (Moco et al. 2007), and there remains a limited connection between experimental MS data and available chemical databases (Wishart et al. 2007; Hummel et al. 2010).
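
Querying a peak’s mass spectrum against a library can be illustrated with a simple cosine similarity score. The sketch below is an assumption-laden simplification: established library search tools use their own scoring schemes (often with intensity weighting and retention index filters), and the compound names and spectra here are invented for illustration rather than taken from NIST, GMD or any other library.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_library_match(query, library):
    """Return (name, score) of the library entry most similar to the query."""
    if not library:
        raise ValueError("library is empty")
    scored = [(name, cosine_similarity(query, spec)) for name, spec in library.items()]
    return max(scored, key=lambda item: item[1])


# Hypothetical library entries keyed by nominal m/z (not real reference spectra)
library = {
    "compound_A": {73: 100, 147: 40, 218: 60},
    "compound_B": {73: 100, 116: 80, 158: 30},
}
query = {73: 95, 147: 35, 218: 65}

name, score = best_library_match(query, library)
print(f"best match: {name} (score {score:.3f})")
```

In practice a score threshold and a retention time or retention index check would normally be applied before accepting a match, since structurally related compounds can give very similar spectra.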

Once the GC–MS data is processed and the peaks identified, the data is subjected to a series of pre-treatment steps before meaningful statistical analyses can be performed. These include the imputation of any missing values and the normalization and transformation of the processed data. Missing values occur in the processed data matrix either due to the heterogeneous nature of the biological samples or due to the limit of detection of the analytical instruments. There are various techniques that are used to impute missing values, as detailed in Armitage et al. (2015). After imputation, the resulting data matrix contains no zero or missing values, and the data is then normalized to either an external numerical measure (weight, number of cells), an internal measure (median of the metabolites of the sample) or to added quality metrics such as one or more internal standards. In addition, due to the heteroscedastic nature of metabolomics data, the normalized data is usually log transformed before applying any subsequent statistical methods. Two broad types of statistical methods are used to investigate the behaviour and differences of metabolites: univariate methods, which examine each metabolite individually, and multivariate methods, which attempt to analyze the behaviour of the overall system by considering the measurements of all the metabolites being studied simultaneously.
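
The pre-treatment sequence described above (imputation, normalization, log transformation) can be sketched as follows. The choices made here are assumptions for illustration only: missing values are replaced with half the minimum observed value of each feature, which is just one of several imputation strategies, and each sample is normalized to its own median. The data matrix (rows are samples, columns are metabolite features) is invented.

```python
import math
from statistics import median


def impute_half_minimum(matrix):
    """Replace missing values (None) with half the feature's minimum observed value."""
    n_features = len(matrix[0])
    imputed = [list(row) for row in matrix]
    for j in range(n_features):
        observed = [row[j] for row in matrix if row[j] is not None]
        if not observed:
            raise ValueError(f"feature {j} has no observed values")
        fill = min(observed) / 2.0
        for row in imputed:
            if row[j] is None:
                row[j] = fill
    return imputed


def normalize_to_median(matrix):
    """Divide each sample (row) by the median of its own feature values."""
    normalized = []
    for row in matrix:
        m = median(row)
        normalized.append([v / m for v in row])
    return normalized


def log2_transform(matrix):
    """Log2-transform every value; all values must be strictly positive."""
    return [[math.log2(v) for v in row] for row in matrix]


# Hypothetical peak areas: 3 samples x 3 features, None marks a missing value
raw = [
    [200.0, 50.0, None],
    [400.0, 100.0, 800.0],
    [100.0, None, 400.0],
]

processed = log2_transform(normalize_to_median(impute_half_minimum(raw)))
for row in processed:
    print(row)
```

The order of the steps matters: imputing first guarantees that every value is positive, so that both the median normalization and the log transformation are well defined.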

Due to the nature of metabolomics studies, where hundreds if not thousands of features are investigated, it is important that the appropriate statistical tests are applied. A detailed review of the different types of statistical methods used in metabolomics can be found in Bartel et al. (2013) and Ren et al. (2015). Tools such as SIMCA (Tsugawa et al. 2011) and the web-based MetaboAnalyst (Xia et al. 2009) can be used to perform these statistical analyses. The interested reader is directed to recent review articles that discuss the various software tools for processing metabolomics data (Spicer et al. 2017). Once statistical analysis has been performed and the metabolites that differ significantly between groups have been identified, the next step is to understand their role in the biological system, and one way to study this is through metabolic pathways. Tools such as PathWhiz (Pon et al. 2015), VANTED (Rohn et al. 2012), Metscape (Gao et al. 2010; Karnovsky et al. 2012), MetPA (Xia and Wishart 2010), MetaboAnalyst (Xia et al. 2009) and others allow a user to highlight the metabolites of interest in different metabolic pathways and thereby study their role and impact on the underlying biological system.
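
One reason the choice of statistical test matters when thousands of features are examined is that many univariate tests are run in parallel, so some features will appear significant by chance. The text does not prescribe a particular correction; as an illustration only, the sketch below implements the widely used Benjamini-Hochberg adjustment of p-values. The function name and the example p-values are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values from univariate tests on four metabolite features
p_values = [0.01, 0.04, 0.03, 0.20]
for p, q in zip(p_values, benjamini_hochberg(p_values)):
    print(f"p = {p:.2f}, adjusted = {q:.4f}")
```

Features would then be reported as significant only if their adjusted value falls below the chosen false discovery rate, rather than on the basis of the raw p-value.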

6 Final remarks

Metabolomics has potential applications in a multitude of fields of research, such as agriculture, biotechnology, health and biomedical sciences, in terms of diagnostic and prognostic value. Furthermore, GC–MS based applications have a rich and extensive history in the study of small molecules derived from biological processes. Our review provides updated methodologies and specific applications using GC–MS. It outlines a comprehensive GC–MS metabolomics workflow which involves sample preparation (i.e., quenching, selection of solvents etc.) and other techniques, as well as different types of chemical derivatization methods. Mass analyzers used in GC–MS metabolite profiling are discussed, as well as the GC columns best suited to specific applications. Multidimensional GC techniques and their emerging applications in the field of metabolomics, which exploit the resolving power of different types of stationary phase chemistries, are also addressed in this review. However, although the methodology and applications have been discussed comprehensively, one does need to take into account a number of factors which could bias results. As mentioned, the metabolites that are identified (amino acids, organic acids, sugars and sugar phosphates) and the selection of chemical derivatization agent are critical in obtaining not only the best sensitivity but also selectivity for the targeted analyte(s). The generation of metabolite artifacts due to chemical derivatization or pyrolysis in the inlet needs to be identified so that overestimation in quantitative data is avoided. Finally, bioinformatic tools and approaches are essential in interrogating metabolomic data to understand the biological system, and potentially the role it plays in unison with genomics, transcriptomics and proteomics to answer biological questions in a complete systems biology approach.
The focus of this review is to highlight recent key publications in the area of GC–MS metabolomics, which may be of benefit to new and existing researchers in the field and the flourishing metabolomics community.