1 Introduction

The discovery, development and quantification of novel biomarkers is a top priority in biomedical research. Metabolites are one important target for biomarker identification. The metabolome is commonly defined as the sum of all low-molecular-weight metabolites, between 50 and 1500 daltons (Da), found in a biological sample. Metabolomics is the comprehensive analysis of the metabolome in cells, biofluids, tissues or organisms (Vlaanderen et al. 2017). Variations at the metabolite level are likely to be indicative of many diseases, and such changes often appear in biological fluids before the first sign of disease. For this reason, the discovery of new diagnostic biomarkers based on metabolomics can enhance the understanding of human diseases and eventually improve the quality of life of patients (Ren et al. 2015). Metabolomic investigations generally employ analytical techniques such as NMR (Wishart 2019), LC–MS (Dervilly-pinel and Bizec 2015), and GC–MS (Moreno et al. 2015). Recently, LC-NMR (Hammerl et al. 2019) and LC-NMR/MS (Bhatia et al. 2019) have also been used for identifying metabolites in complex mixtures.

GC–MS is a routine and useful method for analyzing biofluids. However, a single chromatographic peak recorded by GC–MS may contain several components, and the resulting mass spectrum is therefore difficult to interpret. Gas chromatography-tandem mass spectrometry (GC–MS/MS) can address this problem by allowing each compound in such a peak to be analyzed separately (Wu and Colby 2016). GC–MS and GC–MS/MS have been extensively applied in metabolomic studies owing to several advantageous features, such as reproducibility, sensitivity and the availability of mass spectral databases. However, the main prerequisite for GC–MS analysis is that the compound should be volatile and thermally stable. As most metabolites in physiological fluids are polar and non-volatile, they cannot be analyzed directly by GC–MS without sample preparation. Therefore, metabolic profiling via GC–MS usually requires chemical derivatization of the polar functional groups of a molecule to reduce the polarity and increase the thermal stability and volatility of the analytes (Moros et al. 2017a). Various biological samples can be analyzed by GC–MS, of which urine has several advantages for metabolic research. It is easy to obtain in large volumes, is largely free from interfering proteins, lipids and other complex molecules, and can be sampled at home. Further, urine contains an extremely diverse set of metabolites, including microbial as well as mammalian metabolites (De Paepe et al. 2018). A typical untargeted urinary metabolomic analysis workflow is summarized in Fig. 1. In this review, we describe different aspects of the sample collection and preparation techniques adopted in the untargeted metabolomics of urine matrices using GC–MS. In addition, GC–MS instrumentation for urine analysis is discussed, along with the most commonly used approaches for metabolite identification and data normalization. Statistical analysis and biomarker identification are not discussed in this paper; the interested reader is referred to O’Shea and Misra (2020).

Fig. 1 Workflow of GC–MS-based untargeted metabolomic analysis of urine

2 Collection and storage of urine samples

One of the most important points in metabolomics is that a proper collection and storage method is essential to avoid misleading conclusions. It is essential to distinguish artifactual and analytical changes from the biological and physiological changes of interest. The most common sources of error in metabolic studies are the following: (a) sample collection and storage; (b) sample preparation, and (c) instrumental variation and calibration. Several studies have described the effect of sampling and storage conditions on the stability of metabolites in urine (Bernini et al. 2011). Metabolites differ in chemical stability, depending on the conditions they experience during sample preparation. At higher temperatures, spontaneous reactions such as decarboxylation of α-keto acids, hydrolytic reactions, and the formation of lactones or other by-products may occur (Maher et al. 2007). Hence sample handling is a serious challenge in the metabolomics laboratory. Bacterial contamination and metabolism may cause both the appearance and the degradation of metabolites. Furthermore, biological samples may have various degrees of biochemical activity, e.g. esterase, transaminase, deaminase, and peptidase activity, which may vary noticeably in disease states. Therefore, enzymatic reactions during sample collection and storage must be eliminated or minimized to prevent any possible variation in the metabolic profile, both qualitatively and quantitatively (Chetwynd et al. 2017). The effects of storage conditions on urine metabolites have previously been investigated, and it is suggested that, to avoid metabolite alteration, urine samples should be stored at − 80 °C and that storage at room temperature, even for a short time, ought to be avoided because it can lead to bacterial growth and metabolite degradation (Aurelie Roux et al. 2014). Another study showed that urine samples can be kept at room temperature for 4 h, and that samples remained stable for 36 h at 4 °C and for 3 months at − 80 °C (Khamis et al. 2019). Urine metabolome stability at different storage temperatures (+ 4 °C, − 20 °C and − 80 °C) was also investigated (Laparre et al. 2017); storage at − 20 °C and − 80 °C preserved metabolite stability over a long period.

Urine samples that were frozen at − 80 °C and thawed up to three times showed no significant change in their metabolite profiles (Lee et al. 2012). In another study, more than two freeze–thaw cycles changed metabolite concentrations in the urine samples; in particular, the concentrations of hexose and four acylcarnitines (C3, C4, C8:1, C16:1-OH) were altered after repeated freeze–thaw cycles (Rotter et al. 2017). It is therefore suggested that freeze–thaw cycles should be avoided as much as possible to prevent metabolite variation. To avoid them altogether, it is advisable to divide the sample into small aliquots prior to freezing for long-term storage, so that any sample remnant can be discarded after each experiment.

To prevent sample decomposition during transport and storage, the addition of preservatives has been considered. Although a number of preservatives have been used, which preservative should be selected is still controversial. Preservatives usually inhibit metabolic changes driven by bacterial growth in urine samples. If samples are frozen at or below − 20 °C immediately after collection, no preservative is necessary (Dunn et al. 2008). If sample transport or storage at room temperature exceeds 120 min, the addition of preservatives for quantitative and qualitative analysis should be considered (Eisinger et al. 2013). Table 1 lists a number of prevalent preservatives and their limitations. Boric acid is cheap, but its use in urine is limited by the potential formation of chemical complexes in the samples (Castro and Ferna 2012) and by changes in the initial pH of the sample (Zheng et al. 2019). The use of hydrochloric acid (HCl) for urine preservation has been reported. HCl is a strong acid that is toxic and corrosive, and it alters the initial concentration of some metabolites (Feres et al. 2011). Thymol prevents the conversion of creatinine to creatine during storage, decreases the risk of spontaneous crystallization, maintains citrate concentration and inhibits bacterial contamination and growth (Wang et al. 2019). When the urine sample container is opened frequently, toluene is a good choice for preservation because it prevents contact with air by forming a thin layer on the surface of the sample (Kim et al. 2009). The most commonly used preservative, especially in metabolomic studies, is sodium azide, which is effective against microorganisms. In particular, it effectively inhibits the activity of E. coli, the main pathogen in urinary infections. Unlike boric acid, sodium azide does not form chemical complexes in urine samples. High toxicity is the main limitation of sodium azide and makes storage and handling of samples difficult (Wu et al. 2015). However, preservatives have negative impacts on the chemical and physical properties of urine samples. Although sodium azide is the most widely used preservative for urine, storing samples at − 20 °C after urine collection without adding a preservative is the best approach.

Table 1 Common urine preservatives and their limitations

3 Sample preparation

3.1 Urease treatment

Urea, the final product of amino acid nitrogen metabolism in humans, is a highly abundant metabolite in urine. With four N–H hydrogens, it can undergo derivatization and generate a very large peak, or often multiple peaks, in GC. Some metabolites have the same retention time as urea and can therefore be masked by this large peak. Furthermore, high amounts of urea can cause other problems, including excess consumption of derivatization reagent, column overloading, peak distortions, matrix effects, and shortened column and filament life in GC–MS analysis (Palmas et al. 2018). To overcome these problems, the usual approach is to treat urine samples with urease before analysis. Although urease can remove urea from the urine sample, its adverse effects on urinary metabolomic measurements are still unclear. Urea removal gives higher repeatability and reproducibility of the resulting data, as well as more identified metabolites, compared to untreated samples (Liesenfeld et al. 2015). Varying amounts of urease have been used in published papers, and an optimal amount has not yet been reported. Experimental conditions such as urine volume have been investigated, and it is suggested that urease pretreatment is particularly useful when the urine sample volume is 25 µL or higher (Webb-Robertson et al. 2014). In addition, commercially available urease usually contains contaminants, and urease pretreatment can therefore lead to erroneous results due to the presence of these contaminants in urine. Jáčová et al. (2019) used a new method of urea removal for the GC–MS analysis of urine, based on immobilization of urease on magnetic nanoparticles. This approach decreased the number of artifacts introduced by urease and the alterations in the urine metabolite profiles obtained by GC–MS. Figure 2 shows the considerable effect of urease treatment on the urine chromatogram (Kind et al. 2007). Some metabolites can only be detected after removal of urea. On the other hand, peak overlaps between urea and other metabolites have been reduced by means of their specific mass spectra and advanced chromatographic techniques (Kim et al. 2020). In addition, the presence of urea showed no negative effects on derivatization efficiency (Ahn et al. 2017). However, there are no reports of untargeted urinary metabolomics on a GC-quadrupole platform without urease pretreatment. Overall, urinary metabolomic analysis without urease pretreatment does not worsen the identification of metabolites, whereas urease pretreatment introduces artifacts and misinterpretations into urinary metabolomic analysis. Therefore, we recommend that the urease pretreatment step be eliminated from the standard protocol of urine sample preparation for GC–MS metabolomics.

Fig. 2 GC–TOFMS partial chromatograms of the same urine sample, untreated (red) and urease pretreated (blue) (Kind et al. 2007), reprinted with permission (Color figure online)

3.2 Derivatization of urine samples

The selection of the analytical method depends on the aim of the metabolomic study. If the purpose is to apply a targeted method, sample preparation should concentrate on the metabolites of interest. Commonly, metabolomics is performed by non-selective methods to allow identification of all metabolites in the matrix. GC–MS is the most standardized technique for untargeted analysis (Fiehn 2016). Volatile, low-molecular-weight metabolites can be sampled and analyzed directly without derivatization. However, the majority of metabolites in urine are nonvolatile and unstable at high temperatures, and derivatization prior to GC analysis is therefore required. Various chemical derivatization approaches, such as alkylation, acylation and silylation, have been reported with different reagents. A two-step derivatization procedure, oximation followed by silylation, is the most frequently used protocol for GC–MS metabolomics (Chun et al. 2011). Oximation serves two purposes: it simplifies the chromatogram by reducing the number of carbohydrate peaks, and it protects α-keto acids and aldehydes. Because monosaccharides exist in both cyclic and open-chain forms, silylation alone usually leads to multiple derivatized peaks for each carbohydrate. Since different types of carbohydrates are present in urine, the generation of multiple derivatives per compound increases the complexity of the resulting chromatogram and of metabolite identification (Megías-Pérez et al. 2019). Furthermore, retention time variation increases the chance of misalignment of sugar peaks. The oximation reaction also protects α-keto acids and aldehydes against decarboxylation and keto–enol tautomerism (Moros et al. 2017b). Both hydroxylamine and alkoxyamine reagents are used for oximation (Fiehn 2016). Although hydroxylamines are more reactive, alkoxyamines, especially methoxyamine, are strongly preferred because of their simple chemistry and the ease of identification. The choice of oximation reagent also depends on the mass shift needed for identification purposes. For example, when ethoximated and methoximated derivatives of plant extracts were compared, the presence of carbonyl moieties in unknown peaks was clearly indicated by a shift to longer retention times (Lisec et al. 2006). The optimization of experimental conditions may require some attention. For instance, Liebeke and Puskás reported that drying samples between methoximation and silylation leads to a two- to tenfold increase in metabolite signals in untargeted GC–MS metabolomic analysis of yeast cells, plant and animal tissue, and human urine (Liebeke and Puskás 2019).

The next step, silylation, is carried out with a silylating reagent. Silylation is the replacement of active hydrogens in a molecule by a silyl group. Active hydrogens are present on acids, alcohols, thiols, amines, amides, enolizable ketones, and aldehydes (Kim and Verpoorte 2010). It has been noted that the reactivity of an active hydrogen towards silylation follows approximately the order shown in Table 2 (Moldoveanu and David 2018). Silylation reduces the polarity of the compound and decreases hydrogen bonding, so the derivative becomes more volatile. Different silylation agents have been used for metabolite analysis by GC–MS. The most commonly used are MSTFA (N-methyl-N-(trimethylsilyl)trifluoroacetamide), MTBSTFA (N-tert-butyldimethylsilyl-N-methyltrifluoroacetamide) and BSTFA (N,O-bis(trimethylsilyl)trifluoroacetamide) (Supplementary Table S1). Each of these reagents has known advantages and disadvantages. BSTFA is less volatile than MSTFA but achieves a higher derivatization yield for carboxylic acids (Zhu et al. 2017). Carbohydrates derivatized with BSTFA are unstable and produce many fragments in the mass spectra, while samples derivatized with MTBSTFA are less sensitive to humidity (Schummer et al. 2009). The yield of silylation reactions is affected by the solvent and the catalyst (Zhao et al. 2017). Catalysts such as trimethylchlorosilane (TMCS) (Yang et al. 2017) and tert-butyldimethylchlorosilane (TBDMCS) (Christou et al. 2014) are added to improve the hydrolytic stability of the reaction mixture and to increase reactivity. Trifluoroacetic acid (TFA), which has been used as a catalyst, enables silylation of even sterically hindered hydroxyl groups (Poole 2013). Pyridine is the most common solvent and also acts as a catalyst for reactions using TMCS. Silylation reactions generally do not proceed to completion at room temperature, and it is essential to optimize the reaction temperature when developing derivatization methods. Silylation reagents and derivatives are moisture- and air-sensitive and must be stored sealed to prevent degradation (Moros and Helen 2017). Generally, silylation reactions are performed under anhydrous conditions, and an excess of derivatizing reagent is used to consume any moisture present in the reaction vessel and to ensure complete derivatization. However, the sample injected into the GC column then contains derivatized and non-derivatized metabolites as well as the excess derivatizing reagent, which may damage the GC column, particularly columns whose stationary phases contain active hydrogens (Caban and Stepnowski 2018).
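
The mass shifts introduced by the oximation and silylation steps described above can be illustrated with a short calculation. The sketch below is not taken from any of the cited protocols; it only assumes that each trimethylsilyl (TMS) group replaces one active hydrogen (net addition of C3H8Si) and that each methoximation converts one carbonyl group into a methoxime (net addition of CH3N). The function name `derivative_mass` and the glucose example (one methoxime and five TMS groups) are illustrative choices, and the constants are rounded monoisotopic values.

```python
# Monoisotopic mass added per derivatization event, in Da (rounded values).
TMS_SHIFT = 72.03953   # Si(CH3)3 replaces one active H: net gain of C3H8Si
MEOX_SHIFT = 29.02655  # C=O becomes C=N-OCH3: net gain of CH3N


def derivative_mass(neutral_mass, n_tms, n_meox=0):
    """Return the monoisotopic mass of a methoximated/silylated derivative.

    neutral_mass: monoisotopic mass of the underivatized metabolite (Da)
    n_tms:        number of active hydrogens replaced by TMS groups
    n_meox:       number of carbonyl groups converted to methoximes
    """
    if n_tms < 0 or n_meox < 0:
        raise ValueError("group counts must be non-negative")
    return neutral_mass + n_tms * TMS_SHIFT + n_meox * MEOX_SHIFT


# Illustrative example: glucose (C6H12O6) as its methoxime, penta-TMS derivative.
glucose = 180.06339
glucose_meox_5tms = derivative_mass(glucose, n_tms=5, n_meox=1)
print(f"{glucose_meox_5tms:.4f}")
```

A calculation of this kind is useful mainly as a plausibility check when a candidate structure and its expected number of derivatized groups are compared against an observed molecular or fragment ion.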

Table 2 Derivatization preference and the order of silylation reactivity for different functional groups

Although carbohydrates and alcohols are derivatized efficiently by silylation, some metabolites such as amino acids and some organic acids produce relatively unstable silylated derivatives (Villas-Bôas et al. 2011). An alternative derivatization method that has been used in GC–MS metabolomic studies is alkyl chloroformate derivatization using methyl-, ethyl-, and propyl-chloroformate (Primec et al. 2017). Alkyl chloroformate derivatization has a number of advantages (Villas-Bôas et al. 2011): (a) the reaction is instantaneous and can take place directly in aqueous media without sample heating, (b) reagent costs are lower, (c) the resulting derivatives are separated from the reaction mixture simply by extraction with an organic solvent, which reduces damage to the GC column and chemical contamination, (d) the procedure is amenable to automation using commercially available analytical robotics, and (e) a relatively low-molecular-weight group is introduced into the metabolites, which may be advantageous for the study of certain higher-molecular-weight metabolites (Tumanov et al. 2016). With methyl chloroformate (MCF) derivatization, the amino group of a metabolite is converted to a carbamate and the carboxylic group to a methyl ester. Other alkyl chloroformates (butyl, isobutyl, hexyl and octyl) have also been used, but none is considered a general derivatization reagent for GC–MS-based untargeted metabolomics of urine, because alkyl chloroformates derivatize only the limited set of metabolites that have amine and carboxylic acid functional groups (Dettmer et al. 2019).

The formation of acyl derivatives is another type of derivatization, in which an active hydrogen of a metabolite is replaced by an acyl moiety. Acylation with small acyl groups or fluorinated acyl groups improves volatility, although not as markedly as silylation or methylation. Derivatization with fluorinated acyl groups is often used to improve detectability in GC with an electron capture detector or with negative ion chemical ionization MS detection (Shin et al. 2017). Most of the GC methods reviewed rely on oximation followed by silylation. Silylation reagents are the most common derivatization reagents and are the most suitable for comprehensive GC–MS analysis. Only a few authors used an alternative derivatization, e.g. chloroformates, or no derivatization at all. Therefore, we focus on GC–MS methods using oximation followed by silylation.

3.3 Internal standard and QC correction

Since the main constituent of urine is water, sample preparation before GC–MS analysis is necessary. During sample treatment, part of the sample may be lost through incomplete extraction, incomplete derivatization or decomposition during storage. In addition, when a compound is introduced into the ion source, only a small proportion of its molecules become ionized, and mass analyzers can only detect ions. Furthermore, the MS detector response may vary from day to day as a result of uncontrolled parameters such as temperature and pressure variation and the general condition of the ion source. To minimize these errors and response variations in quantitative analysis, internal standards are used. An internal standard is a compound that differs from the analyte but has similar chemical properties, is not present in the sample, and is added to every sample prior to analysis. Internal standards are commonly applied in GC–MS metabolomics, and either structurally related compounds or stable isotopically labeled (SIL) compounds are used (Liu and Locasale 2017).
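
The way an internal standard compensates for sample loss and response drift can be sketched as a simple response-ratio calculation. The example below is a minimal illustration under stated assumptions, not a published procedure: the function name `response_ratios`, the metabolite names and all peak areas are hypothetical, and a single internal standard is assumed to experience the same proportional signal loss as the analytes.

```python
def response_ratios(peak_areas, is_area):
    """Divide each analyte peak area by the internal standard (IS) peak area.

    peak_areas: dict mapping metabolite name -> integrated peak area
    is_area:    integrated peak area of the internal standard in the same run
    """
    if is_area <= 0:
        raise ValueError("internal standard area must be positive")
    return {name: area / is_area for name, area in peak_areas.items()}


# Hypothetical areas for the same urine sample measured in two runs.
# In run B every signal, including the IS, is 25% lower (e.g. sample loss
# or a less sensitive ion source), so the raw areas disagree between runs.
run_a = {"citrate": 12000.0, "hippurate": 30000.0}
run_b = {"citrate": 9000.0, "hippurate": 22500.0}

ratios_a = response_ratios(run_a, is_area=6000.0)
ratios_b = response_ratios(run_b, is_area=4500.0)

# After IS normalization the two runs agree again.
print(ratios_a)
print(ratios_b)
```

The correction only works to the extent that the internal standard and the analyte really are affected in the same proportion, which is why chemically similar or isotopically labeled standards are chosen.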

SIL internal standards are compounds in which one or more atoms within the molecule are replaced by their stable isotopes, such as 2H, 13C, 15N, or 17O. Because of the structural similarity between an analyte and its SIL analog, co-elution may be observed. Therefore, for partially resolved pairs, the mass difference should be at least 3 mass units to avoid interferences, considering the resolution of a typical quadrupole mass analyzer. In addition, the label of a SIL internal standard should not be exchangeable with atoms of the matrix or the solvent. Many researchers have shown that SILs are the preferred internal standards for minimizing systematic errors, matrix effects, and detector response variation in MS-based metabolomics, because they are chemically similar to the analyte (Freund and Hegeman 2017).

Quantitative analytical methods are commonly based on the use of a SIL for each analyte measured. However, this is not practical for untargeted metabolomics because a large number of metabolites are usually present in the sample. These metabolites are chemically too diverse, and adding an appropriate SIL internal standard might not be possible for every metabolite, since many are not commercially available or cost-effective. An alternative approach is to prepare stable isotopically labelled internal standards using isotopically labelled derivatization (ILD) reagents. This method is based on the assumption that the derivatization yield and the stability of the resulting derivatives are similar for the unlabeled derivatization reagent and its deuterated analog. Lien et al. used d9-MSTFA as a reagent to prepare internal standards for the quantification of metabolite groups such as sugars, amino acids and non-amino organic acids (Lien et al. 2012). An internal standard for each metabolite was obtained by derivatizing a mixture of standards with d9-MSTFA and then spiking this solution into MSTFA-derivatized samples before GC–MS/MS analysis. They found that this method improves data precision more than normalizing metabolite responses group by group using one SIL metabolite per group. In addition, d9-BSTFA was used to verify the number of silyl groups per metabolite and to generate a labelled internal standard for every metabolite (Qiu et al. 2016). In another study, d3-MCF was used to develop a GC–MS method for the quantitation of almost seventy amino acid and non-amino organic acid metabolites in rat liver, serum and urine (Kvitvang et al. 2011). In this report, individual standards were derivatized using d3-MCF in d4-methanol and spiked into MCF-derivatized sample solutions. Alkyl chloroformates can produce stable derivatives in aqueous samples, and amino and carboxyl functional groups are labelled differently. Carboxyl groups react with the alcohol used as solvent (usually methanol), while the amino group reacts with the chloroformate derivatization reagent. Since deuterated alkyl chloroformates are expensive to buy, and the reagents needed to synthesize them are also expensive and in some cases not commercially available, a less expensive method has been suggested (Bruheim et al. 2013). In this method, other chloroformates and alcohols are used to synthesize the internal standards, provided that the retention times of the analyte and its internal standard differ. In this way there is an internal standard for each metabolite in the chromatogram, but the presence of a large number of peaks in the chromatogram can cause erroneous results and misalignment.

In spite of the extensive use of SILs, some researchers have reported disadvantages of SILs as internal standards (Moreno et al. 2015). Deuterium-labeled internal standards may show unpredictable behavior, which can have negative impacts on the accuracy and precision of quantitative metabolomics. This is thought to be due to the substitution of deuterium for carbon-bound hydrogen, which causes a subtle difference in lipophilicity between the analyte and its SIL analog (Iyer et al. 2004). Therefore, the retention time of the SIL compound may shift during the chromatographic separation. Moreover, it has been observed that an analyte and its deuterated internal standard may have different recoveries during sample preparation. Wieling demonstrated a 35% difference between the recovery of unlabeled and deuterated haloperidol (Wieling 2002). Deuterated internal standards are unstable in water because of hydrogen–deuterium exchange, which has limited the use of these compounds as internal standards. It should be noted that deuterium has twice the mass of hydrogen, whereas 12C and 13C differ in mass by only about 8%. Thus, 13C-, 15N- and 18O-labelled internal standards are more similar to their corresponding unlabeled analytes than 2H-labelled internal standards, and are therefore expected to behave more similarly in chromatographic separations. The main problem with 13C-, 15N- and 18O-labelled internal standards is that they are more expensive than their deuterated analogues. Furthermore, identification of labelled metabolites requires special software such as isoMETLIN (Cho et al. 2014) and DExSI (Dagley and McConville 2018), because general metabolite databases do not include mass spectra of isotopically labeled metabolites.
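
The mass comparison between deuterium and 13C labelling made above can be checked with a short calculation. The isotope masses used here are rounded reference values and are not taken from the cited papers; the point is only the order of magnitude of the relative mass change per substituted atom.

```latex
\[
\frac{m(^{2}\mathrm{H}) - m(^{1}\mathrm{H})}{m(^{1}\mathrm{H})}
  = \frac{2.0141 - 1.0078}{1.0078} \approx 1.00 \quad (\approx 100\%),
\qquad
\frac{m(^{13}\mathrm{C}) - m(^{12}\mathrm{C})}{m(^{12}\mathrm{C})}
  = \frac{13.0034 - 12.0000}{12.0000} \approx 0.084 \quad (\approx 8.4\%).
\]
```

The much larger relative change for hydrogen is consistent with the observation that deuterated standards are more prone to retention time shifts than 13C-labelled ones.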

Adjustments of the GC–MS operating conditions between batches of samples can cause analytical variation that cannot be corrected using internal standard calibration alone. Therefore, it is recommended to use QC samples to monitor the performance of the method and to increase the reliability of the data obtained in large-scale sample analysis (Broadhurst et al. 2018). A QC sample in the context of metabolomics contains all the metabolites characteristic of the study; it can be obtained by pooling all samples in the study, or by using additional control groups and pooling the samples derived from them. A QC sample analyzed at regular intervals throughout a study can help determine the variation of all processes involved in data acquisition and data pre-processing (Dudzik et al. 2018). It should be noted that in studies involving a large number of samples analyzed in multiple batches, a quality check of each batch requires the same QC sample, pooled from samples in all batches. In addition, blank samples, which are analyte-free and prepared exactly as the test samples, reveal peaks for compounds derived from derivatization reagents and carryover. Another element of the experimental design is randomization of the sample analysis sequence, which can minimize the bias introduced when replicate samples are prepared and analyzed together (Dunn et al. 2012).
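
The design elements described above (a blank, pooled QC injections at regular intervals, and a randomized sample sequence) can be combined into an injection list. The following is a minimal sketch, not a recommended protocol: the function name `build_run_order`, the QC interval of five samples, the three initial QC injections and the sample labels are all assumptions chosen for illustration.

```python
import random


def build_run_order(sample_ids, qc_every=5, n_initial_qc=3, seed=0):
    """Build an injection sequence with a blank, pooled QCs and random order.

    sample_ids:   identifiers of the study samples
    qc_every:     insert one pooled QC after this many study samples (assumed)
    n_initial_qc: pooled QC injections placed before the first sample (assumed)
    seed:         seed for the random generator, so the order is reproducible
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)  # randomize the analysis sequence

    order = ["blank"] + ["QC"] * n_initial_qc
    for position, sample_id in enumerate(shuffled, start=1):
        order.append(sample_id)
        if position % qc_every == 0:
            order.append("QC")
    if order[-1] != "QC":
        order.append("QC")  # always close the batch with a pooled QC
    return order


samples = [f"U{i:02d}" for i in range(1, 13)]  # 12 hypothetical urine samples
run_order = build_run_order(samples)
print(run_order)
```

Fixing the seed keeps the randomized sequence reproducible, so the same injection list can be regenerated when the batch record is reviewed.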

In general, using pooled QC samples is the best approach to equilibrate the GC–MS, measure intra-study reproducibility and correct systematic errors, while internal standard correction is the method for rapid assessment of data quality for each test sample, independent of any pooled QC sample. The advantage of the internal standard approach over the pooled QC method is that data collection and assessment for an individual sample can be performed independently of any other sample in the batch. In addition, internal standard correction can be performed manually for any sample immediately after the raw data are obtained, while the pooled QC method is a post-acquisition quality control process.
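
As an illustration of the intra-study reproducibility measurement mentioned above, the relative standard deviation (RSD) of each feature across the pooled QC injections can be computed and used to flag unstable features. This is a sketch under stated assumptions: the 30% cut-off is an assumed, commonly quoted value rather than one given in this review, and the names `qc_rsd_percent`, `stable_features` and the intensity values are hypothetical.

```python
from statistics import mean, stdev


def qc_rsd_percent(values):
    """Relative standard deviation (%) of one feature across QC injections."""
    return 100.0 * stdev(values) / mean(values)


def stable_features(qc_table, max_rsd=30.0):
    """Return the features whose RSD in the pooled QC is at most max_rsd.

    qc_table: dict mapping feature name -> list of intensities, one per QC run
    max_rsd:  acceptance threshold in percent (assumed value)
    """
    return [name for name, values in qc_table.items()
            if qc_rsd_percent(values) <= max_rsd]


# Hypothetical intensities of two features in four pooled QC injections.
qc_table = {
    "feature_1": [100.0, 104.0, 98.0, 102.0],  # reproducible
    "feature_2": [100.0, 160.0, 60.0, 120.0],  # poorly reproducible
}

kept = stable_features(qc_table)
print(kept)
```

Because this filter needs the QC injections from the whole batch, it can only be applied after acquisition, in line with the post-acquisition nature of the pooled QC method noted above.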

4 GC–MS in urine metabolomics

4.1 GC inlet

The GC inlet liner is where the urine sample is introduced into the instrument. The liner’s main purpose is to prevent the sample from contacting the metal walls of the GC inlet. Choosing the proper GC liner is crucial for achieving correct results without negatively affecting the analytes. In most GC–MS-based urine metabolomic studies, an aliquot of up to 2 µL of the derivatized urine sample is injected into a heated injector. If silylating reagents are used, stainless steel injection ports lead to poor precision and accuracy (Pasikanti et al. 2008). Although there are some reports of poor reproducibility with glass liners in urine metabolomics, they are the most common liners used in GC–MS-based urine metabolomics (Kaspar et al. 2008). Different inlet liners are commercially available for GC; they differ in geometric design, volume, chemical properties and the presence or absence of packing. Biofluids contain nonvolatile compounds that accumulate in liners, causing a progressive increase in noise and metabolite degradation. Previous reports showed that GC inlet liners generally required exchange after 20 sample injections (Koek et al. 2006), while some authors recommend changing the liner after 50 injections for urine samples (Schroeder et al. 2008). To avoid manual exchange of liners, an automated liner exchange (ALEX) system was used to limit the matrix effects of biological samples (Kind et al. 2007). Fiehn recommends that every liner be replaced with a new one after 10 injections and that silylation reagent be injected into the GC to deactivate the surface of the new liner (Fiehn 2016). A deactivated baffled glass liner, a glass wool packed liner, a quartz wool packed liner and the chemically inert SILTEK® liner were compared for the determination of free amino acids in biofluids. Based on the results, glass wool packing caused sample loss and biased results because of catalytically active sites, while the SILTEK® liner provided more reproducible and more accurate results than the others (Kaspar et al. 2008). Dirty liners cause sample deterioration, discrimination and loss of sensitivity. In the authors’ laboratory, new and re-packed liners are conditioned at 300 °C for several hours before use, and the liner and septum are changed after every 50 injections. Meanwhile, liner performance is checked on a regular basis. Thus, it is advisable to use the same type of liner throughout a study and to change it on a regular basis. A literature search retrieved 17 papers dealing with untargeted GC–MS metabolomics of human urine with clinical applications published during 2018 and 2019 (Supplementary Table S1). We found that information on the type of liner was lacking in the majority of papers, which implies the need to study the effects of the inlet liner on a wider range of urine metabolites.

4.2 Ion source

Almost all GC–MS-based metabolomics applications use electron ionization (EI) as the ion source, because the full-scan response in EI mode is approximately proportional to the amount of compound injected and all compounds suitable for GC analysis can be detected non-discriminatively. In addition, assigning the identity of peaks via a database of mass spectra is straightforward, owing to the extensive and reproducible fragmentation patterns obtained in full-scan mode. The fragmentation pattern can be used to identify or classify unknown metabolites (Allen et al. 2016). Because of this extensive fragmentation, EI spectra of trimethylsilyl derivatives are commonly dominated by trimethylsilyl fragments (e.g. m/z 73 and 147), and the molecular ion is seldom present in a spectrum (Kubinec et al. 2017). Consequently, different metabolites can have similar EI spectra, which hampers the identification of “unknowns” and the identification and deconvolution of overlapping peaks. Chemical ionization (CI) can overcome this obstacle because it is a soft ionization technique that preserves the precursor ions. However, the results obtained by CI depend strongly on the reagent gas and its pressure (Raro et al. 2016). Consequently, CI spectra are less suitable for database searching, and the use of CI in metabolomic studies remains limited to targeted metabolomics (Kubinec et al. 2017).

4.3 Column

Both split and splitless modes have been used for urine analysis. In metabolomic studies, split mode is usually preferred because metabolites are present in urine over a wide range of concentrations and sample overloading may occur in splitless mode (Pasikanti et al. 2008).

Fused silica capillary GC columns can operate at very high temperatures but, because of their small internal diameter, have limited sample capacity. For example, the sample capacity of a typical column is limited to about 50–100 ng per analyte. Columns of varying polarity (non-polar to polar), stationary phase chemical composition, and length (10 to 60 m) have been utilized in metabolomic analysis. However, a slightly polar stationary phase of 5% diphenyl cross-linked with 95% dimethylpolysiloxane, with dimensions of 30 m × 0.25 mm × 0.25 µm (length × internal diameter × film thickness), has been used in most studies (Supplementary Table S1). Smaller-bore columns provide higher resolution but have lower sample capacity, which may compromise sensitivity.

4.4 Mass analyzers

Quadrupole (Q), time-of-flight (TOF) and Fourier transform ion cyclotron resonance (FTMS) analyzers are the most common mass analyzers in GC–MS based metabolomics studies. Recently, ion trap and Orbitrap mass analyzers have also been utilized. GC-FTMS systems have not been routinely used for untargeted metabolomics and metabolic profiling because they are not cost-effective. Recent reports show that GC-FTMS has been used for targeted metabolomics, such as the analysis of methasterone (Magalhães et al. 2019) and the study of the metabolism of 19-nortestosterone (Piper et al. 2016).

The high mass resolution and rapid scanning rate of TOF technology can be very advantageous for the deconvolution of overlapping GC peaks. This is particularly useful for resolving the narrow and sharp peaks generated by two-dimensional gas chromatography (GC × GC) (Keppler et al. 2018). GC × GC enhances the detection and identification of metabolites in urine samples and allows new biomarkers of disease to be found. Comprehensive GC × GC-TOFMS was applied to female reproductive health monitoring and 935 unique metabolites were detected. Four metabolites (2-pentanone, 3-penten-2-one, carbon disulfide, acetone) were identified as putative biomarkers of fertility (Eshima et al. 2019). A TOF analyzer has also been used to find urinary biomarkers associated with interstitial cystitis. A total of 490 metabolites were detected, of which 290 were unknown compounds (Kind et al. 2016). GC-TOFMS, which provides faster mass domain acquisition rates and less spectral distortion than GC-QMS, has also been applied to obtain unique urinary metabolic signatures for both healthy males and females (Fan et al. 2018).

Orbitrap-based instrumentation seems to provide some advantages in metabolomics studies, including high resolution, increased mass accuracy and improved limits of detection. GC-Orbitrap-MS was introduced in 2015 and is currently marketed by one manufacturer (Q-Exactive™ Orbitrap GC™, Thermo Scientific). Utilization of high-resolution GC–MS technology in metabolomics research is still rare. A possible reason for the lack of interest in GC-Orbitrap-MS in metabolomics is that making use of high-resolution GC–MS data is not straightforward. Orbitrap-MS spectra are not always comparable to those obtained by unit-resolution quadrupole MS. Almost all available GC–MS spectral libraries contain only unit-mass data, making high-resolution data unnecessary for their use. In addition, computational support for high-resolution GC–MS data is still underdeveloped. In theory, the identification of unknowns is supported, but the absolute gain of information when employing high-resolution GC–MS in comparison to established unit-mass analyzers has not been fully explored (Stettin et al. 2020).

Quadrupole MS is presently the most common type of mass analyzer and is available as single quadrupole (GC-QMS) and tandem (GC-QqQMS) instruments. Quadrupoles have high sensitivity and dynamic range but suffer from slower scan rates and lower mass accuracy compared with TOF analyzers. Both GC-QMS and NMR were used to discover new biomarkers for bipolar disorder (BD) (Chen et al. 2019). In total, 13 biomarkers were identified which could be helpful in developing an objective diagnostic method for young and middle-aged BD patients during a depressive episode. In another study, urine metabolomics was performed using GC-QMS in order to find metabolic variations driven by breast cancer (BC) as well as their potential to diagnose BC. A total of 227 metabolites were identified, and both dimethylheptanoylcarnitine and succinic acid were suggested as potential urinary biomarkers for BC (Cala et al. 2018). GC–MS/MS provides higher sensitivity and precision for quantitative analysis of low-level metabolites and is mostly used for targeted metabolite assays. Quantitative determination must be based on the selection of appropriate ion(s), and MS/MS parameters should be adjusted for the metabolites of interest (Newman et al. 2019). GC triple quadrupole instruments are capable of multiple reaction monitoring (MRM) of analytes, which improves confidence in the identification and quantification of co-eluting analytes in complex matrices (Struck-Lewicka et al. 2015). Overall, each of these mass analyzers has its own advantages and limitations. Selection of a mass analyzer for metabolomic studies depends on the aim of the metabolomics project, throughput, and instrumental costs. GC-QMS and GC-TOFMS are the most common platforms for GC–MS based untargeted metabolomics.

4.5 Metabolite identification

There are many public and commercial MS databases that contain standard EI fragmentation spectra of compounds (Koo et al. 2014). The NIST and Wiley databases are the most widely used and comprehensive libraries (Vinaixa et al. 2015); the Wiley 11th Edition and NIST 2017 are their latest versions. In addition, the Urine Metabolome Database is a freely available electronic database (https://www.hmdb.ca) containing detailed information on more than 3000 small molecule metabolites found in human urine, along with their concentration values (Bouatra et al. 2013). MzCloud, MassBank, PRIMe, GNPS, Golm Metabolome Database (GMD), MoNA and Metlin are other open-access databases which have been used in metabolomics. These databases contain mainly spectra of TMS-derivatized metabolites, but most of these data overlap with those contained in the NIST library (Vinaixa et al. 2015).

Metabolite identification in the analysis of GC–MS data is achieved by mass spectrum matching. However, unambiguous metabolite identification often requires two independent parameters: retention time/index and mass spectral matching. The retention time of a chemical substance depends on the chromatographic conditions, while the retention index (RI) is more reproducible and less dependent on experimental parameters. Different kinds of RI have been applied in GC–MS studies, but the Kovats index (KI) and the linear retention index (LRI) are the most used in the literature. The KI was defined for isothermal experimental conditions (Kovàts 1958), while the linear retention index was defined for programmed temperature conditions (Vanden and Kratz 1963). To obtain the RI, a mixture of n-alkanes (typically C10-C30) is usually employed as the retention index indicator (Dunn et al. 2011). Alternatively, the Fiehn group used fatty acid methyl esters (FAMEs) as retention index markers in metabolomics. FAMEs produce more pronounced mass spectra that may be superior for automated detection of the RI markers in complex samples (Fiehn 2016). Since most laboratories use n-alkanes for RI calculation, complex mathematical equations have to be used to convert FAME-based indices back to the more common KIs. However, RI values are limited in databases and just over 11% of the molecules in the NIST library have RI information (Zhang et al. 2017). In untargeted metabolomics, retention indices may be theoretically estimated for those metabolites that do not have experimental RI values. Due to the large number of experimental parameters, predicted RIs may not be accurate enough for unambiguous identification but can help to achieve higher-confidence identification (He et al. 2018).
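The two retention indices discussed above can be illustrated with a minimal Python sketch. It assumes the standard textbook forms of the isothermal Kovats index (logarithmic interpolation of adjusted retention times) and of the linear retention index for temperature-programmed runs (linear interpolation of retention times between bracketing n-alkanes). The function names, argument layout and example retention times are hypothetical and are not taken from any of the cited studies.

```python
import math
from bisect import bisect_right


def kovats_index(t_x, t_n, t_n1, n):
    """Isothermal Kovats index (assumed textbook form).

    t_x, t_n, t_n1 are ADJUSTED retention times (dead time subtracted)
    of the analyte and of the n-alkanes with n and n + 1 carbons.
    """
    if not (0 < t_n <= t_x <= t_n1) or t_n == t_n1:
        raise ValueError("analyte must elute between the two n-alkanes")
    return 100.0 * (n + math.log(t_x / t_n) / math.log(t_n1 / t_n))


def linear_retention_index(rt, alkane_rts):
    """Linear retention index for temperature-programmed GC.

    alkane_rts maps carbon number -> retention time of that n-alkane,
    measured under the same conditions as the analyte (hypothetical input).
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if len(times) < 2 or any(a >= b for a, b in zip(times, times[1:])):
        raise ValueError("need at least two alkanes with increasing times")
    if not (times[0] <= rt <= times[-1]):
        raise ValueError("analyte elutes outside the alkane ladder")
    # Index of the last alkane eluting at or before the analyte,
    # clamped so that a following alkane always exists.
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    frac = (rt - times[i]) / (times[i + 1] - times[i])
    return 100.0 * (n + (n_next - n) * frac)


if __name__ == "__main__":
    ladder = {10: 5.0, 11: 6.0, 12: 7.5}  # made-up retention times (min)
    print(linear_retention_index(6.75, ladder))
    print(kovats_index(4.0, 2.0, 8.0, 10))
```

In practice, the experimental index would then be compared with a library value within a tolerance window chosen by the laboratory; that window is a method-specific choice and is not prescribed here.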

5 Data normalization

The main aim of quantitative determination of metabolites is to find the differences in each metabolite across a large number of samples. Water consumption and other physiological and dietary factors can significantly affect the concentrations of endogenous metabolites in urine (Chen et al. 2013). Variations in concentration as high as 15-fold have been reported (Warrack et al. 2009), which can lead to misleading results and wrong conclusions without proper sample or data normalization. As shown in Table 1, information on normalization methods was lacking in approximately 40% of human urine metabolomics studies. Different normalization methods have been reported that compensate for variation in urine concentration caused by external factors. They can be classified into three main groups: (i) pre-acquisition, by dilution or reconstitution before analysis (Edmands et al. 2014); (ii) post-acquisition, by data correction (Karpievitch et al. 2014); and (iii) pre- and post-acquisition, by a combination of techniques (Gagnebin et al. 2017). The most common normalization method uses creatinine: metabolite concentrations are expressed as ratios relative to creatinine to correct for dilution, assuming a constant rate of creatinine excretion for each urine sample (Sawant et al. 2018). Creatinine can be determined from its mass spectrometry signal, provided that the signal is not outside the linear range of the mass analyzer. Another way is to use spectrophotometric methods and a commercial assay kit (Zhang and Watson 2015). However, this normalization method has been shown to be largely unsuitable, because creatinine excretion varies with factors such as diet, gender, physical activity and kidney impairment (Huestis et al. 2019). Zamora-Ros et al. showed that quantitative results expressed relative to 24 h urine volume are more reliable than creatinine-corrected results for assessing metabolites (Zamora-Ros et al. 2011). However, this method is tedious and time-consuming for both patients and scientists. Osmolality is another factor that has been used to normalize urine measurements, based on the assumption that osmolality is a direct measure of total solute concentration (Khamis et al. 2018). Khamis and co-workers demonstrated that data normalization by osmolality and by creatinine provides similar results (Khamis et al. 2018). Other conventional urine normalization methods, including specific gravity and conductivity (González-Domínguez et al. 2014), have also been reported, but these techniques are used less often and will not be discussed further here. Recently, post-acquisition normalization methods have been used in metabolomics studies. Normalization to all MS signals is a post-acquisition approach in which the peak area of each ion is divided by the summed peak area of all ions in the sample (Mizuno et al. 2017). The presence of exogenous and unwanted compounds can cause erroneous results with this approach. To overcome this problem, normalization to MSTUS (total useful MS signals) was applied (Mattarucchi et al. 2012); in this method, the peak area of each ion is divided by the summed peak area of all ions common to all samples. Normalization in urine metabolomics is a challenging subject and a gold standard does not seem to have been reported yet. Further study and newer normalization methods are needed to achieve more reliable results in metabolomics.
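The post-acquisition corrections described above reduce to simple ratios, as the following minimal Python sketch illustrates. The data layout (one dictionary of feature name to peak area per sample), the function names and the rule that a feature counts as "common" when its area is greater than zero in every sample are assumptions made for illustration; published implementations may define the useful signal differently.

```python
def normalize_to_creatinine(peak_areas, creatinine):
    """Express each peak area as a ratio to the creatinine value of the sample."""
    if creatinine <= 0:
        raise ValueError("creatinine value must be positive")
    return {feature: area / creatinine for feature, area in peak_areas.items()}


def normalize_to_total_signal(peak_areas):
    """Divide each peak area by the summed area of all features in the sample."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("sample has no signal")
    return {feature: area / total for feature, area in peak_areas.items()}


def normalize_to_mstus(samples):
    """Divide each peak area by the summed area of features common to all samples.

    samples maps sample id -> {feature: peak area}. A feature is treated as
    common when its area is > 0 in every sample (an assumption of this sketch).
    """
    detected = [
        {f for f, area in features.items() if area > 0}
        for features in samples.values()
    ]
    common = set.intersection(*detected) if detected else set()
    if not common:
        raise ValueError("no feature is shared by all samples")
    normalized = {}
    for sample_id, features in samples.items():
        mstus = sum(features[f] for f in common)
        normalized[sample_id] = {f: area / mstus for f, area in features.items()}
    return normalized


if __name__ == "__main__":
    # Made-up peak areas: "x" stands for an exogenous compound seen in one sample only.
    data = {
        "s1": {"a": 10.0, "b": 30.0, "x": 60.0},
        "s2": {"a": 20.0, "b": 60.0},
    }
    print(normalize_to_total_signal(data["s1"]))
    print(normalize_to_mstus(data))
```

In this made-up example the exogenous feature "x" inflates the total signal of sample s1, so total-signal normalization assigns metabolite "a" different relative values in the two samples, whereas the MSTUS denominator ignores "x" and the values agree.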

Chromatographic data processing and metabolite characterization are challenging processes in metabolomics because of the chemical diversity of metabolites, which are present over a wide range of concentrations. Various tools have been established for data interpretation in metabolomic studies; they rely on complex algorithms and MS spectral databases. Because of the wide spectrum of subjects and methods, a comprehensive survey of the different approaches to data analysis requires a separate review and is not discussed here.

6 Conclusion

Urine metabolomics has potential applications in medical and biomedical fields in terms of diagnostic and prognostic value. Furthermore, urinary metabolomic analysis based on GC–MS is an established procedure for the study of small molecules derived from biological processes. To avoid metabolite degradation, it is advisable to divide the urine sample into small aliquots prior to freezing at −80 °C for long-term storage. Although some metabolites can only be detected after removal of urea by urease, urease pre-treatment should be avoided to obtain more accurate results. Peak overlaps between urea and other metabolites can be diminished by deconvolution methods and advanced techniques such as TOF analyzers and GC × GC. Various reagents have been proposed for urine analysis by GC–MS, among which silylation with MSTFA + TMCS gives more reliable results than the other proposed reagents. The use of isotope-labeled internal standards and QC samples is the best approach for correcting recorded data as well as minimizing instrumental errors. Although this review has attempted to describe the significant factors which could affect results, there are other parameters which have not yet been studied comprehensively. Every laboratory is advised to fine-tune all steps of the procedure and pay attention to detail to achieve reproducible results free from artifacts.

However, the aim of identifying the complete urinary metabolome in a single run remains far from being accomplished, owing to the complexity of urine and the lack of reproducibility when comparing data from different laboratories. Some variables in urine sample preparation and analysis remain to be studied in the near future in order to obtain a standard protocol for untargeted urine metabolomics with wide coverage of the urinary metabolome. The full advantage of untargeted metabolomics can only be achieved by comparing the data obtained with normal-range data from healthy populations. Thus, it is expected that the same validated protocol will be employed in studies aimed at generating reliable normal-range data which can be used in different laboratories for biomarker discovery and disease diagnosis. Regarding instrumentation, it is expected that a new generation of more affordable GC-high resolution mass spectrometers will gain a presence in laboratories.