1 Introduction

Metabolomics is defined as the global measurement of dynamic metabolic changes caused by genetic, pathological, or environmental perturbations. This approach facilitates the screening and early detection of any metabolic disturbances related to the origins of health and diseases (de Sousa et al. 2017; Pouralijan Amiri et al. 2019). However, there are several challenges that must be taken into account while planning metabolomics-based studies. These issues are of great importance and are closely related to experimental strategies chosen as well as sample handling and sample preprocessing methods.

Blood and urine are both regarded as a pool of the metabolome (Bouatra et al. 2013; Psychogios et al. 2011). Urine contains numerous metabolites and end products originating from metabolized nutrients and drugs, while most metabolites in blood reflect the metabolism of endogenous substances (Yin et al. 2015). Hence, metabolomics analysis of blood and urine could provide complementary data reflecting the state of the whole system at a particular point in time. Blood and urine are preferred study materials because their collection is simple and minimally invasive.

Even slight variations in collection, preprocessing, and storage of samples can significantly affect metabolite stability, influence analytical results, and decrease the credibility of the research (Kirwan et al. 2018; Lee and Kim 2017). These factors can cause validation studies that use samples of different origin to fail, and they should be carefully evaluated. Hence, it is important that protocols for sample collection and handling are fit for downstream applications, such as metabolomics studies (Nishiumi et al. 2018).

The reliability of metabolomic studies depends to a large extent on the stability of the samples; storage in biobanks, which serve as professional repositories of biological samples, is therefore the gold standard. The primary objective of biobanks is not merely archiving, but also distributing conserved and documented biological samples for research. The quality of biological samples is crucial for the outcome of subsequent studies. Sample quality is related to pre-analytical variation, the impact of which depends directly on the intended use. Certain molecules in the metabolome are more sensitive to handling and storage procedures than others; additionally, changes of metabolites due to residual enzymatic activity in biofluid samples can be extremely fast (Salvagno et al. 2017). Biobanks therefore aim to ensure that the analyzed sample is representative of the metabolome as it was before collection, meaning that its quality is close to that of a freshly collected sample (Kirwan et al. 2018).

Herein we review the main factors that can introduce undesirable variation and affect subsequent metabolomics analysis. Important aspects such as collection, pre-processing, storage, and stability, which are known to bias final results, are discussed, and final recommendations for ensuring reliable downstream data acquisition for plasma, serum, and urine samples are compiled.

2 Preparation before sample collection

2.1 Collection of clinical information

One of the most important factors for sample grouping and data analysis in a metabolomics study is clinical information, including but not limited to the type and stage of a disease and its medication. Selecting samples with differing clinical information may help us better understand disease mechanisms, progression, and drug actions or identify and exclude the impacts of these factors, depending on the purpose of study. For example, differences observed in a lipidomics study may simply reflect a mismatch between study groups in the proportion of subjects taking statins (Yin et al. 2015). Studies have shown that certain metabolites were altered under different disease conditions. Wen et al. reported that bilirubin level in serum consistently showed a statistically significant trend with increasing non-small cell lung cancer (NSCLC) stage (Wen et al. 2015). Fan et al. profiled plasma samples from various molecular subtypes of breast cancer and found multiple metabolic differences between human epidermal growth factor receptor 2 (HER2)-negative and -positive patients as well as estrogen receptor (ER)-negative and -positive patients (Fan et al. 2016). Metabolomics analysis of tissues from various types of thyroid malignancies identified 28 metabolites, including lipids, carboxylic acids, and saccharides, whose abundances differed significantly among the tumor types (Wojakowska et al. 2015). These studies indicated that both pathological and molecular subtypes can affect the results of metabolomics analysis. Thus, comprehensive clinical information should be recorded while samples are collected in order to facilitate the data analysis after the experiment.

Besides disease status, a variety of physiological conditions and exogenous factors may lead to dynamic changes, although the metabolome in blood is a tightly controlled homeostatic system (Salvagno et al. 2017). These intrinsic and extrinsic factors, which can affect the composition of metabolic profiles, include age and gender (Ishikawa et al. 2014), body mass index (BMI) (Morris et al. 2012), circadian and physiological rhythm (Minami et al. 2009), diet (Gibney et al. 2005), exercise (Weigert et al. 2014), drugs (Griffin and Bollard 2004), and others (Ackermann et al. 2019). For example, it was found that the levels of some blood lipid species of healthy adults showed gender- and age-associated differences (Ishikawa et al. 2014). Lawton et al. measured 300 compounds in plasma and found that significant changes in the relative concentration of more than 100 metabolites were associated with age. Fewer differences were associated with sex and race (Lawton et al. 2008). Several other studies have shown that alterations in the amino acid profiles, especially of branched-chain amino acids, are associated with obesity (Morris et al. 2012). Therefore, special attention should be given to the accuracy and comprehensiveness of this information during the collection of biological samples to avoid errors or bias caused by these factors. Matching of study groups for age, sex, and BMI is recommended, and well-considered preparation of the study subjects is needed before sample collection for metabolomics studies (Townsend et al. 2016).
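The matching recommendation above can be illustrated with a simple greedy pairing of cases and controls. This is only a minimal sketch and not a procedure taken from the cited studies: the `Subject` fields, the function name `match_controls`, and the tolerances of 5 years and 2 kg/m² are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    subject_id: str
    age: float   # years
    sex: str     # e.g. "F" or "M"
    bmi: float   # kg/m^2


def match_controls(cases, controls, max_age_diff=5.0, max_bmi_diff=2.0):
    """Greedily pair each case with one unused control of the same sex
    whose age and BMI lie within the given (assumed) tolerances."""
    unused = list(controls)
    pairs = {}
    for case in cases:
        candidates = [
            c for c in unused
            if c.sex == case.sex
            and abs(c.age - case.age) <= max_age_diff
            and abs(c.bmi - case.bmi) <= max_bmi_diff
        ]
        if not candidates:
            continue  # this case remains unmatched
        # Prefer the closest age, then the closest BMI.
        best = min(
            candidates,
            key=lambda c: (abs(c.age - case.age), abs(c.bmi - case.bmi)),
        )
        pairs[case.subject_id] = best.subject_id
        unused.remove(best)
    return pairs
```

Cases left unmatched by such a procedure would have to be excluded or handled statistically; a greedy pairing is also not guaranteed to find the largest possible set of pairs.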

2.2 Identification of collecting vessels

The development of modern analytical mass spectrometry (MS) makes the analysis of metabolites highly sensitive, but simultaneously the intensity of chemical noise and interferences also increases. Multiple factors may contribute to matrix effect and chemical noise in samples, such as plastic consumables (Yao et al. 2016), anticoagulants (Yin et al. 2013), and extraction solvent (van der Sar et al. 2015). In the detection process, these substances may strongly influence the ionization process resulting in significant interferences (Ellervik and Vaught 2015). For example, certain plasticizers have been shown to affect data acquisition by ion suppression (Keller et al. 2008).

Several studies have shown that interference from the collection tubes may be manufacturer dependent (Dunn et al. 2011; Deprez et al. 2002; Bando et al. 2010). Yin et al. found that polyethylene glycol and plastic particles in blood-collecting vessels were additional sources of noise in liquid chromatography–mass spectrometry (LC–MS) analysis (Yin et al. 2013). Researchers have detected strong chemical noise in lithium-heparin plastic polymer sample collection tubes, but glass collection tubes can eliminate this problem (Yin et al. 2015). Similarly, Yao et al. reported that the use of plastic consumables introduces a variable competing background signal of palmitate, while the best results were obtained when glassware was used (Yao et al. 2016). A list of widespread plasticizer contaminants is available on the website Contaminants observed in LC–MS background (https://www.lc-ms.nl/contaminants.htm). The spectrum of polymers from plastic tubes can be observed in the literature (Yin et al. 2015). Even when a specific type of blood-collecting vessel is selected, the type of plastic and the composition and purity of additives may vary between laboratories. Before sample collection, the consumables should therefore be tested to avoid unexpected chemical contaminants, and glassware may be the better choice in some cases.

3 Collection and pre-processing of blood sample

Sample collection is the first and most important step in metabolomics research. The quality of biological samples determines the quality of subsequent research to a certain extent. In the following sections we discuss the selection of blood sample type and collection tubes as well as the effects of pretreatment and hemolysis.

3.1 Sampling: plasma or serum

Both biofluids, serum and plasma, can be successfully used in metabolomics as long as sample collection is standardized and the same sample matrix is used throughout the study. Although both originate from blood, plasma and serum can be regarded as two different biological liquids. Plasma is collected by adding anticoagulants, centrifuging right after collection, aliquoting, and storing, with most of the coagulation factors retained. Its advantage is that it can be placed on ice immediately, and the adverse effects of sample exposure at room temperature can be avoided. Serum is produced by a natural coagulation process, which requires about 30 min of clotting at room temperature, followed by centrifuging, aliquoting, and storing. As shown in Table 1, the preparation process of plasma and serum is similar except for collection and coagulation. The coagulation process has been shown to affect the composition of serum metabolites. Higher lactate and lower glucose in serum were identified previously by gas chromatography–mass spectrometry (GC–MS) and attributed to continued glycolysis and metabolism in the serum samples during clot formation (Teahan et al. 2006; Dettmer et al. 2010). In addition to the need for serum tubes to clot at room temperature, another important factor contributing to the difference between serum and plasma metabolite profiles is that activated platelets release a variety of compounds such as lipids, proteases, and phospholipases during coagulation (Barri and Dragsted 2013; Liu et al. 2018). Given the relevant effects of platelets on the metabolite pattern in serum samples, it is also important to consider the platelet number, which can vary around 50-fold between patients (Lehmann 2015).

Table 1 Preparation process and relevant factors of plasma and serum samples

There are many differences between serum and plasma metabolites, as shown in Table 2. Denery et al. detected more ion features in serum, while the signal of phosphatidylinositol in plasma increased significantly. The study also showed that the content of protein fragments in serum was higher than that in plasma (Denery et al. 2011). Other substances found at higher levels in serum included more than ten kinds of amino acids, exogenous dipeptide, xanthine, hypoxanthine, lysophosphatidylcholine (LPC), and thromboxane B2 (Barri and Dragsted 2013; Liu et al. 2018; Nishiumi et al. 2018; Paglia et al. 2018; Yu et al. 2011; Kaluarachchi et al. 2018). The residual enzymes in serum (e.g. proteases, phospholipases) may still be active. This activity can influence the pattern and amount of metabolites and may, for example, account for the increase in LPC (Liu et al. 2018). These studies collectively suggest that serum may provide higher sensitivity than plasma when focusing on these metabolites. Note that if serum is selected, all samples in the study must have the same coagulation time, and attention should be paid during statistical analysis to interference from compounds released by activated platelets.

Table 2 Differences between serum and plasma metabolites

At present, both serum and plasma are used in metabolomics; however, it is not clear which matrix is more suitable. The respective material needs to be selected carefully depending on experimental design and technical conditions. Yu et al. found that plasma provided more reproducible metabolomics profiles (Yu et al. 2011), while Breier et al. reported that serum results had higher reliability (Breier et al. 2014). Importantly, for metabolomics research, it is strongly recommended that the sample type and collection process be kept consistent within a single study, which helps to control error and ensure the quality of metabolomics analysis. In multicenter studies, the processing of serum samples (sample tube, clotting time, and temperature) and of plasma samples (anticoagulant tube and centrifugation conditions) should be carefully checked (Liu et al. 2018). Serum and plasma can both be successfully used in metabolomics as long as sample collection is standardized and the same sample matrix is used throughout the study (Kamlage et al. 2014; Hirayama et al. 2015). In conclusion, the choice of serum or plasma depends on the metabolites of interest and the purpose of study.
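The consistency requirement above lends itself to a simple automated check of sample metadata before analysis. The following is a minimal sketch under stated assumptions: the record layout (a list of dictionaries) and the field names `matrix`, `anticoagulant`, and `clotting_min` are hypothetical and would need to be adapted to the sample sheet actually used.

```python
def inconsistent_fields(samples, fields=("matrix", "anticoagulant", "clotting_min")):
    """Return each metadata field that takes more than one value across
    the study, together with the distinct values that were found."""
    mixed = {}
    for field in fields:
        values = {sample.get(field) for sample in samples}
        if len(values) > 1:
            # Sort by string form so that None and numbers can be mixed.
            mixed[field] = sorted(values, key=str)
    return mixed


# Hypothetical sample sheet for a two-center plasma study.
study = [
    {"sample_id": "A01", "matrix": "plasma", "anticoagulant": "EDTA", "clotting_min": None},
    {"sample_id": "B01", "matrix": "plasma", "anticoagulant": "heparin", "clotting_min": None},
]
print(inconsistent_fields(study))
```

An empty result indicates that all samples share the same value for every checked field; any non-empty result points to a pre-analytical difference that should be resolved or documented.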

3.2 Selection of the anticoagulant tube

The main difference between serum and plasma is the coagulation status. Serum is coagulated, plasma is not. To prevent coagulation of blood in the collection tube, it is treated with an anticoagulant. Ethylenediaminetetraacetic acid (EDTA), heparin, and citrate are the most commonly used anticoagulants in a clinical setting. Heparin is an antithrombin activator, while EDTA and citric acid chelate calcium ions. The characteristics of these anticoagulants are shown in Table 3. The selection of anticoagulants is an important factor to be considered because anticoagulants can lead to matrix effect, increase chemical noise, and affect the quality of observed metabolic profiles (Yin et al. 2013; Nicholson et al. 1983).

Table 3 The characteristics of heparin, EDTA and citric acid anticoagulants

Heparin is a glycosaminoglycan. In lithium-heparin tubes, the presence of Li+ may improve the ionization efficiency of many metabolites such as phospholipids and triglycerides (Mei et al. 2003), but it can also increase the signal of plastic polymers and produce significant matrix effects (Yin et al. 2015). Similarly, it has been reported that heparinized blood collection tubes lead to chemical noise in the mass spectra (Yin et al. 2013). However, some authors argue that the antithrombin-activating property of heparin may have a positive impact on the reliable measurement of certain metabolites (Di Gregorio et al. 2017). Among the anticoagulants, heparin is recommended by Paglia et al. for plasma samples used for LC–MS-based metabolomics of hydrophilic compounds because no plasma interferences or matrix effects were noticed in this polarity range (Paglia et al. 2018). The Human Serum Metabolome (HUSERMET) Consortium recommends the use of heparin for metabolomics research (Dunn et al. 2011).

EDTA not only inhibits blood coagulation, but also inhibits Mg2+-dependent enzymes in red blood cells, such as the glycolytic enzyme hexokinase, making it more suitable for metabolomics research (Fobker 2014). However, in nuclear magnetic resonance (NMR)-based analysis, EDTA tubes are not recommended because of their strong noise signals (Nicholson et al. 1983). In contrast to NMR, K-EDTA plasma is considered the optimal matrix for MS-based profiling (Amberg et al. 2017). Moreover, in a quantitative targeted metabolomics study with LC–MS, Paglia et al. showed that blank EDTA vacuum tubes contain a large amount of sarcosine (Paglia et al. 2018), which should also be fully considered in the experimental design.

Citrate is a small molecule, which can impact the detection of metabolites in non-targeted metabolomics research (Kirwan et al. 2018). The pH value and ionic strength of citrate are not suitable for classical lipid extraction and LC–MS analysis. It can change the pH value of samples and thereby alter the subsequent extraction conditions (Gonzalez-Covarrubias et al. 2013). In addition, citrate itself is an endogenous metabolite, the concentration of which cannot be measured if it is used as anticoagulant. Moreover, heparin and EDTA are supplied in solid form, while sodium citrate is used as an aqueous solution. If the vacuum tube is not filled with blood to the intended proportion, the final citrate concentration in plasma will change, which will affect the function of platelets and add to the variability of metabolites (Gonzalez-Covarrubias et al. 2013).
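The effect of underfilling a citrate tube can be made concrete with a simple dilution calculation. The sketch below is illustrative only: it assumes a tube containing 0.3 mL of 3.2% sodium citrate solution (taken here as roughly 109 mmol/L) intended for 2.7 mL of blood, and it ignores hematocrit and endogenous citrate; the function name `citrate_in_mixture` is hypothetical.

```python
def citrate_in_mixture(citrate_mmol_per_l, citrate_ml, blood_ml):
    """Citrate concentration (mmol/L) in the blood-additive mixture,
    obtained by simple dilution of the additive solution."""
    if citrate_ml <= 0 or blood_ml <= 0:
        raise ValueError("volumes must be positive")
    return citrate_mmol_per_l * citrate_ml / (citrate_ml + blood_ml)


# Assumed tube: 0.3 mL additive at 109 mmol/L, nominal draw of 2.7 mL.
nominal = citrate_in_mixture(109.0, 0.3, 2.7)       # tube filled as intended
underfilled = citrate_in_mixture(109.0, 0.3, 1.35)  # only half the blood drawn
print(f"nominal: {nominal:.1f} mmol/L, underfilled: {underfilled:.1f} mmol/L")
```

Under these assumptions, drawing only half the intended blood volume nearly doubles the citrate concentration in the mixture, which illustrates why the fill volume of citrate tubes needs to be controlled.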

The choice of anticoagulant in metabolomics is therefore still under discussion, and there is no clear recommendation. Hebels et al. reported that although good quality data were obtained for all anticoagulants used, the metabolomic profiles were strongly influenced by the anticoagulant employed (Hebels et al. 2013). In contrast, Denery et al. analyzed and compared the performance of three commonly used anticoagulants (lithium heparin, sodium citrate and potassium EDTA) in metabolite coverage by LC–MS. The results showed that there were only slight differences among the anticoagulants (Denery et al. 2011). Barri and Dragsted also reported that only subtle metabolite differences between the different plasma preparations were noticed using an LC–MS platform, which were primarily related to ion suppression or enhancement caused by citrate and EDTA anticoagulants (Barri and Dragsted 2013). Similarly, an NMR study found that the metabolomic profile was unaffected by whether the anticoagulant was heparin or EDTA (Pinto et al. 2014). It is therefore suggested that all drawbacks of the selected anticoagulant be taken into account; any anticoagulant is acceptable provided it is used consistently throughout the study.

3.3 Selection of the serum collection tube

Unlike anticoagulant tubes, serum collection tubes do not contain anticoagulants. Both gel-free tubes and tubes containing a polymeric gel (serum separator tubes) have been used in serum collection. The polymers can activate coagulation in vitro and promote serum separation, but they also increase the possibility of sample contamination (Yin et al. 2015). Lopez-Bascon et al. compared the serum metabolites of gel tubes and gel-free tubes and found that the gel affected the metabolism of alanine, proline, serine, and glycerolipids, and affected two main metabolites, aconitine and lactic acid (Lopez-Bascon et al. 2016). Besides serum tubes without additives, there are rapid serum tubes containing clot activators. Inorganic silicates, ellagic acid, and thrombin are frequently used clot activators, which can also be a source of analytical errors (Cuhadar et al. 2012). One study showed that phenylalanine dipeptide levels in silicate-containing blood-collecting vessels were significantly higher than those in tubes with thrombin or no additives (Liu et al. 2018). Another study reported that methionine sulfoxide levels were higher in serum from gel-barrier tubes compared to tubes with clot activators (Breier et al. 2014). Therefore, it may be risky to collect serum samples using different clotting procedures (e.g. initiated by thrombin or silicate) within one study. An additive-free standard collection tube made of plastic or glass is recommended.

3.4 Effect of time delay in blood preprocessing

For NMR and LC–MS methods, an aliquot of 200 μL of serum or plasma is usually sufficient, which means that 500 μL of whole blood has to be drawn (Amberg et al. 2017). After drawing, prolonged exposure (more than 2 h) of whole blood to room temperature is a major pre-analytical risk to the stability of some molecules, especially because of blood cell metabolism and the gradual release of intracellular compounds. A number of studies have tested the impact of processing delays on the metabolome, with delays ranging from 2 to 48 h at 4 °C or room temperature, and the results showed changes in different metabolites (Kamlage et al. 2014; Bervoets et al. 2015; Brunius et al. 2017; Wang et al. 2018; Nishiumi et al. 2018). In addition, the results showed that when the processing delay occurred at room temperature, metabolite levels and metabolomic characteristics changed faster than at low temperatures. Lowered ambient temperature minimizes the metabolic activity of cells and enzymes and keeps the metabolite pattern almost stable. After a delay of only 15 min at room temperature, some metabolites of plasma samples changed (Nishiumi et al. 2018), but in another study the levels of almost all metabolites of plasma samples were stable for up to 6 h in iced water (Kamlage et al. 2014). Therefore, timely separation of serum or plasma from blood cells is vital for metabolomics testing, and processing delays at room temperature should be avoided.

It was clearly demonstrated that blood processing time (i.e., the time from blood collection through centrifugation to freezing of aliquots, and the time from thawing samples to analysis) should be reduced to the minimum, preferably to less than 2 h (Kamlage et al. 2014). From a practical perspective, in most clinical studies it is feasible to separate blood cells from plasma within 2 h. Moreover, the plasma supernatant should be carefully removed after centrifugation, without direct contact with the buffy coat layer, so as to avoid contamination with blood cells. It should also be emphasized that the evaporation of organic solvents used for metabolite extraction should be performed promptly and consistently.
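The 2 h guideline can be checked automatically when draw and freezing times are recorded for each sample. The following minimal sketch assumes that both timestamps are available as `datetime` values; the function names and the choice to flag a delay of exactly 2 h are illustrative assumptions, not part of any cited protocol.

```python
from datetime import datetime, timedelta

# Assumed limit, taken from the "less than 2 h" recommendation in the text.
MAX_PROCESSING = timedelta(hours=2)


def processing_delay(drawn_at, frozen_at):
    """Time elapsed between blood draw and freezing of the aliquots."""
    if frozen_at < drawn_at:
        raise ValueError("freezing time precedes draw time")
    return frozen_at - drawn_at


def exceeds_limit(drawn_at, frozen_at, limit=MAX_PROCESSING):
    """True when the delay is not strictly below the limit."""
    return processing_delay(drawn_at, frozen_at) >= limit


drawn = datetime(2024, 1, 15, 8, 30)
print(exceeds_limit(drawn, datetime(2024, 1, 15, 10, 0)))   # delay of 1 h 30 min
print(exceeds_limit(drawn, datetime(2024, 1, 15, 11, 15)))  # delay of 2 h 45 min
```

Recording the delay itself, rather than only a pass/fail flag, also allows processing time to be used as a covariate during data analysis.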

3.5 Effect of centrifugation speed and temperature

Helmholtz Zentrum München (HMGU) suggested centrifugation at 2750 × g for 10 min at room temperature, and the United Kingdom Biobank suggested centrifugation at 2500 × g for 10 min at room temperature with intermediate brake deceleration (Kirwan et al. 2018). Centrifugation is generally performed at room temperature, but many recent protocols separate blood at 4 °C (Lesche et al. 2016). Several studies have focused on the impact of centrifugation conditions on metabolic profiles. It was reported that centrifugation at room temperature resulted in a higher yield of microparticles and free DNA in serum, but fewer microparticles and less hemolysis in plasma (Ammerlaan et al. 2014). Lesche et al. found that different centrifugation speeds (1500 × g or 3000 × g) caused significant differences in the NMR-derived metabolomic profiles of plasma samples (Lesche et al. 2016). However, Jobard et al. reported that only the time delay and storage temperature between blood draw and centrifugation had a significant impact on the blood metabolome, while centrifugation parameters (temperature, time and rotational speed) did not alter the observed profiles (Jobard et al. 2016). Despite these divergent results, standardization of centrifugation conditions is recommended to ensure comparability of samples, especially in multi-center studies.
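Because the protocols above are specified in multiples of g, whereas many centrifuges are set in revolutions per minute, participating centers need to convert between the two for their own rotor. The sketch below uses the standard relation RCF = 1.118 × 10⁻⁵ × r × N², with r the rotor radius in cm and N the speed in rpm; the 10 cm radius in the example is an assumed value and must be replaced by that of the rotor actually used.

```python
import math

# Constant of the standard RCF relation for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) at a given speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a target relative centrifugal force."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be non-negative and radius positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Assumed rotor radius of 10 cm; target of 2500 x g as in the text.
print(round(rpm_for_rcf(2500, 10.0)))
```

Two centers running "the same" speed in rpm on rotors of different radius apply different forces, which is why protocols are better harmonized in × g.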

3.6 Effect of hemolysis

Hemolysis is characterized by the release of hemoglobin and other intracellular components, including structural proteins, enzymes, and metabolites, from red blood cells after damage or rupture of cell membranes, which can significantly alter the spectrum of metabolites in blood samples (Lippi et al. 2008). The lysis of red blood cells increases the concentration of metabolites (such as tryptophan) and lipids (such as phospholipids) released from the cells. About 18% of ionic mass signals are affected by hemolysis (Kamlage et al. 2014). Hemolytic and non-hemolytic samples can be easily distinguished by color because free hemoglobin changes the color of serum or plasma from light yellow to pink or bright red. Slight hemolysis, however, must be confirmed by laboratory measurement of free hemoglobin. Colorimetry and the hemolysis index can be applied to determine the level of hemolysis in samples.

The interference of hemolysis is approximately linearly dependent on the final concentration of free hemoglobin in the sample, which can directly or indirectly cause many changes in the metabolic spectrum (Kamlage et al. 2014). Several studies have shown the effect of hemolysis on metabolite composition in blood samples. Using non-targeted LC–MS metabolomics, Yin et al. found that 69 metabolites in hemolytic samples changed significantly compared with control plasma. LPC C16:0 and C18:0 were significantly increased in hemolytic samples and were strongly correlated with free hemoglobin (Yin et al. 2013). Another study by Kamlage et al. described changes in amino acids and carbohydrates, suggesting significant changes in 47 plasma metabolites in grade I hemolysis and 81 plasma metabolites in grade II hemolysis (Kamlage et al. 2014). Denihan et al. used quantitative metabolomics to detect the effect of hemolysis on serum metabolites in cord blood and found that 43 metabolites changed compared with normal serum. The contents of acyl-carnitine, L-acetyl-carnitine, hexanoyl-carnitine, phenylalanine, and guanosine in hemolytic samples increased, while the concentrations of lipids (phosphatidylcholines (PCs), LPCs and sphingolipids) decreased (Denihan et al. 2015). However, the results of lipid changes here are contrary to previous findings. This may be related to experimental design and the determination methods of hemolytic samples. Thus, hemolytic samples should not be analyzed by metabolomics, especially for non-targeted metabolomics studies.
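The approximately linear dependence on free hemoglobin described above can be examined for any individual metabolite with an ordinary least-squares fit. The sketch below is a generic illustration and not the analysis used in the cited studies; the function name `fit_line` is hypothetical, and the numbers are synthetic values invented for the example rather than measurements.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic illustration only: invented values, not measurements.
free_hb_g_per_l = [0.0, 0.5, 1.0, 2.0]
signal = [100.0, 110.0, 120.0, 140.0]
slope, intercept = fit_line(free_hb_g_per_l, signal)
```

A slope clearly different from zero for a given metabolite would indicate that its signal tracks the degree of hemolysis, which supports excluding or flagging hemolytic samples.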

Hemolysis is one of the major risks in blood collection, but it can be avoided by careful blood collection and treatment. In order to avoid hemolysis: (a) draw blood gently; (b) avoid vigorous shaking of test tubes and transport by pneumatic tube systems; (c) avoid high-speed or prolonged centrifugation; (d) avoid storage of whole blood at 4 °C for more than 4 h.

4 Storage and preservation of blood sample

After collection, another important concern is sample storage, which should have minimal to no effect on the metabolome of the sample being studied. All samples from the same laboratory should be stored under the same conditions. Dry ice should be used to preserve biological samples during transportation. Several studies utilizing LC–MS, GC–MS, and NMR have investigated the effect of different storage temperatures and number of freeze thaw cycles, as discussed below (Palmas et al. 2018; Anton et al. 2015; Hirayama et al. 2015; Rotter et al. 2017).

4.1 Effect of storage temperature

After cell separation, the stability of metabolites is still affected by the presence of enzymes and many other proteins in serum and plasma. Anton et al. investigated the changes of metabolites in serum samples under different storage conditions. The results showed that samples exposed to room temperature for 12 h had undergone significant degradation, notably an increase of LPCs and a decrease of PCs. In contrast, samples placed on wet ice were more stable: only 11% (30/262) of metabolites had changed slightly after 16 h at 4 °C (Anton et al. 2015). Therefore, it is suggested that sample handling after centrifugation should be carried out on ice as much as possible.

A considerable number of hospitals lack −80 °C freezers, and samples are usually stored temporarily at −20 °C. Studies based on NMR metabolomics showed that storage at −20 °C can lead to significant changes in some metabolites, including glucose and proline (La Frano et al. 2018). These changes may be related to proteins, especially albumin in serum or plasma, which can also absorb and release many small compounds in frozen samples, thus affecting the concentration of metabolites (Hernandes et al. 2017). It was reported that storage for 30 months at −80 °C had no significant effect on the untargeted metabolomic profiles derived using NMR (Pinto et al. 2014). However, in a study covering a longer period, Haid et al. found with a targeted MS-based platform that the levels of half of the detected metabolites were altered after storage for 5 years (Haid et al. 2018). According to these studies, storage at −80 °C or lower for less than 30 months is considered the preferred condition.
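The storage recommendation above can be turned into a simple screening rule for biobank records. The sketch below is only an illustration: the thresholds of −80 °C and 30 months are taken from the text, whereas the function name `storage_warnings` and the wording of the messages are hypothetical.

```python
def storage_warnings(temp_c, months_stored, max_temp_c=-80.0, max_months=30.0):
    """Return a list of reasons why a sample falls outside the preferred
    storage condition (at or below max_temp_c for less than max_months)."""
    warnings = []
    if temp_c > max_temp_c:
        warnings.append(f"stored at {temp_c} C, warmer than {max_temp_c} C")
    if months_stored >= max_months:
        warnings.append(f"stored for {months_stored} months, limit is {max_months}")
    return warnings


print(storage_warnings(-80.0, 12))  # within the preferred condition
print(storage_warnings(-20.0, 36))  # too warm and stored too long
```

Samples that trigger a warning need not be discarded, but the deviation should be documented and, where possible, balanced across study groups.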

4.2 Effect of freeze–thaw

It is suggested that all samples should be aliquoted into volumes sufficient for a single analysis to avoid repeated freeze–thaw cycles (Yin et al. 2015). Repeated freeze–thaw may affect the results of metabolomics, but the number of freeze–thaw cycles that can be considered safe is still controversial (Pinto et al. 2014; Breier et al. 2014; Anton et al. 2015). Breier et al. reported that two freeze–thaw cycles affected the concentration of methionine sulfoxide, amino acids, PCs, and acetylornithine (Breier et al. 2014). Anton et al. found that four repeated freeze–thaw cycles led to a slight increase in the concentration of phenylalanine and other amino acids (glycine, methionine, tryptophan, and tyrosine), possibly due to some protein degradation during thawing and re-freezing. This may be related to slight persistent metabolism (Anton et al. 2015). Zivkovic et al. also reported a slight change in serum lipid composition after repeated freeze–thaw. However, the overall metabolites seem to be stable after four freeze–thaw cycles (Zivkovic et al. 2009). The results of Yin et al. showed that no more than 0.5% of ions changed after four repeated freeze–thaw cycles (Yin et al. 2013).

Different metabolites respond differently to freeze–thaw. The concentrations of carnitine, lipids, alanine, glucose, and acetone changed after 4–5 freeze–thaw cycles at room temperature (Pinto et al. 2014; Fliniaux et al. 2011). However, Comstock et al. reported that cholesterol, micronutrients, and hormones in human plasma did not change significantly after repeated freeze–thaw at room temperature (Comstock et al. 2008). Therefore, repeated freeze–thaw cycles should be avoided as much as possible to ensure the stability of metabolites.
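Single-use aliquoting, as recommended above, can be planned with simple integer arithmetic before samples are frozen. The sketch below is a minimal illustration: the function name `plan_aliquots` is hypothetical, and the 200 μL aliquot volume in the example is an assumption borrowed from the volume quoted earlier for NMR and LC–MS analysis.

```python
def plan_aliquots(total_ul, aliquot_ul):
    """Split a sample into single-use aliquots.

    Volumes are whole microlitres. Returns (number of full aliquots,
    leftover volume) so that each aliquot is thawed only once.
    """
    if total_ul < 0 or aliquot_ul <= 0:
        raise ValueError("total must be non-negative and aliquot volume positive")
    return divmod(total_ul, aliquot_ul)


count, leftover = plan_aliquots(900, 200)
print(count, leftover)
```

Planning one aliquot per intended assay, plus a reserve, keeps every analysis at a single freeze–thaw cycle.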

5 Collection and pre-processing of urine sample

Urine is one of the most widely studied matrices in metabolomics. However, there are several challenges that must be taken into account while planning metabolomics-based studies. One of the most important is the need for well-standardized procedures for collection, sample handling, pre-analytic processing, and storage. Herein, we discuss the pre-analytic variables affecting urine-based metabolomics and provide some recommendations concerning standardized protocols.

Although urine sample collection is easy, urine analysis is quite susceptible to pre-analytic issues since patients often collect urine specimens by themselves. The relevant information about the practical aspects of urine collection should be explained to the patients to obtain optimal preparation. If necessary, illustrated instructions for sampling may also be provided. The information may include instructions to wash the outer genitals with water, confounding factors (e.g. drug or dietary intake, physical exercise, etc.), and time of collection (Delanghe and Speeckaert 2016; Zhang et al. 2012).

5.1 Selection of collection time point

When collecting urine samples, the timing of collection can make an apparent qualitative and quantitative difference in the urinary metabolome (Giskeodegard et al. 2015). Typically, four types of urine samples can be collected: first morning void, random urine, spot urine, and 24 h urine (Fernández-Peralbo and Luque de Castro 2012; Soldi et al. 2018).

Generally, first morning voids are the preferred sample type because they follow an overnight fast of several hours, which reduces the effect of the last meal or medication (Chang et al. 2011). Random urine may be collected at any time of day, but differences in collection times and fasting status result in variability in urine metabolites (Kim et al. 2014). Spot urine samples are taken at a certain time of day, and are particularly common after some form of intervention, such as diet or medication (Fernández-Peralbo and Luque de Castro 2012). A typical example of spot urine sampling is a nutritional study based on targeted analysis of olive oil phenols in urine (Garcia-Villalba et al. 2010). It is worth noting that a number of metabolites are known to be excreted in a diurnal rhythm or cosine rhythm (Giskeodegard et al. 2015). By contrast, a 24 h urine collection comprises a pooled sample of all voids within a 24 h period, which can reduce the impact of any circadian variation in the sample and represent a complete 24 h circadian cycle. When a 24 h sample collection is not feasible, spot urine samples taken at specific time points that are consistent across all subjects are recommended (Fernández-Peralbo and Luque de Castro 2012).

Therefore, specific urine samples should be selected according to the design and requirements of the test, and variability from collection time and fasting status should be minimized to ensure the comparability of samples.

5.2 Collection methods and containers

Collection of urine samples is relatively easy in the general population and does not require highly trained staff. Subjects are asked to collect a urine sample in a container and return it. They can conveniently collect 24 h urine samples at home and transport them to a clinic, or provide spot urine samples at the clinic. For immobile patients, urine samples can be collected with a catheter, and for babies, urine can be collected using absorbent pads in diapers. A minimum of 500 μL of urine should be available for each of NMR and LC–MS analysis; for LC–MS, this volume allows different extraction procedures as well as repeat analyses (Amberg et al. 2017).

Metabolomics analysis requires a container that does not degrade the compounds of interest. Unlike serum or plasma containers, those for urine are usually bare polypropylene containers without special characteristics or reagents. One issue that needs attention is the loss of metabolites through non-specific adsorption to container surfaces while urine samples are collected, stored, or processed (Fernández-Peralbo and Luque de Castro 2012). In this case, it is essential to add additives to the urine samples to increase metabolite solubility or minimize interaction with container surfaces. The most common additives are surfactants. Silvester and Zang observed average losses of 35% for lipophilic compounds such as quaternary amines in urine without additives, and these losses can be avoided by adding surfactants (Silvester and Zang 2012). However, surfactant addition may lead to ionization suppression in the subsequent MS analysis; this side effect can be minimized by using an isotopically labeled internal standard (IS) (Li et al. 2016).

5.3 Strategies of contamination reduction

A further concern in urine collection is bacterial contamination, because bacterial metabolism can contribute to and alter the urinary metabolite profile. Collecting a mid-stream urine sample is recommended to minimize contamination (Emwas et al. 2016). Furthermore, antibacterial additives such as sodium azide and sodium fluoride can reduce the metabolic variation that results from bacterial contamination and metabolism. However, preservatives may affect some chemical properties and alter the appearance of particles (Scalbert et al. 2009). Bernini et al. reported that filtration, alone or in combination with mild pre-centrifugation, could completely eliminate bacteria from urine, which may be a better choice (Bernini et al. 2011). Storing urine samples at − 80 °C can also prevent contaminating bacteria from metabolizing urinary metabolites without adding any chemicals to the sample itself, which can be beneficial for GC–MS, LC–MS and NMR spectroscopy studies (Palmas et al. 2018).

5.4 Normalization strategies

Unlike plasma and serum, which are physiologically controlled, urine volume is affected by water intake, physiological factors, and the external environment, resulting in differences in metabolite concentration between individual samples. Urine volume within a single experiment can vary up to 15-fold (Warrack et al. 2009). Therefore, in metabolomics studies based on urine samples, normalization is necessary to minimize the error caused by differences in urine output between individuals. Several strategies have been established to estimate urine concentration for normalization. The most popular include expressing concentrations relative to a reference compound such as creatinine (Alberice et al. 2013), measuring the total solute concentration (osmolality) (Chetwynd et al. 2016), measuring the urine/pure water density ratio (specific gravity) (Miller et al. 2004), and using the 24-h urine volume (Zamora-Ros et al. 2011). Under normal conditions, urinary creatinine output is relatively constant, but it can be altered by disease or abnormal physiological conditions such as kidney impairment (Waikar et al. 2010). In such cases, the variability of creatinine excretion makes normalization to creatinine concentration unreliable. Osmolality is an alternative that allows a more comprehensive evaluation of sample concentration (Chadha et al. 2001). Warrack et al. reported that normalization based on osmolality achieved better results than normalization to creatinine or 24-h urine volume (Warrack et al. 2009). However, Gagnebin et al. reported that the relation between creatinine or osmolality and sample concentration is modified in kidney failure, and that relying on a single normalization based on these measurements could be detrimental (Gagnebin et al. 2017). In addition to pre-acquisition sample normalization, post-acquisition data treatments have also been applied, such as MS total useful signal (MSTUS) (Warrack et al. 2009) and probabilistic quotient normalization (PQN) (Filzmoser and Walczak 2014). Combining pre-acquisition sample normalization with post-acquisition data normalization could decrease unwanted variability and enhance data quality in kidney failure studies (Gagnebin et al. 2017). Therefore, the application of two different normalization techniques is recommended when working with urine samples in studies of kidney disease.
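To make the difference between these strategies concrete, the following minimal Python sketch illustrates creatinine-relative scaling, an MSTUS-style total-signal scaling, and PQN. It is an illustration under stated assumptions, not the implementation used in any of the cited studies: the function names are hypothetical, the MSTUS variant assumes the feature list has already been restricted to signals shared by all samples, and the PQN variant uses the per-feature median across samples as the reference and omits the preliminary total-area scaling that is often applied first.

```python
from statistics import median


def creatinine_normalize(concentrations, creatinine):
    """Express each metabolite concentration relative to creatinine."""
    if creatinine <= 0:
        raise ValueError("creatinine concentration must be positive")
    return [c / creatinine for c in concentrations]


def mstus_normalize(sample):
    """Scale one sample by its total signal.

    Assumption: `sample` already contains only the 'useful' features,
    i.e. those detected in every sample of the study.
    """
    total = sum(sample)
    if total <= 0:
        raise ValueError("total signal must be positive")
    return [x / total for x in sample]


def pqn_normalize(samples):
    """Probabilistic quotient normalization.

    Assumption: the reference spectrum is the per-feature median
    across all samples.
    """
    n_features = len(samples[0])
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalized = []
    for s in samples:
        # Quotient of each feature against the reference; zeros are skipped.
        quotients = [
            s[j] / reference[j]
            for j in range(n_features)
            if reference[j] > 0 and s[j] > 0
        ]
        if not quotients:
            raise ValueError("sample has no positive features to compare")
        dilution = median(quotients)  # most probable dilution factor
        normalized.append([x / dilution for x in s])
    return normalized


# Three hypothetical samples that differ only in dilution.
dilution_series = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [4.0, 8.0, 12.0]]
pqn_result = pqn_normalize(dilution_series)
```

In this toy dilution series, PQN maps all three samples onto the same profile, which is the behavior expected when the only difference between samples is urine concentration.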

6 Storage and preservation of urine sample

6.1 Time window

One concern in metabolomics is the time needed to collect all samples, extract them, and then acquire a metabolic profile for each. The time between sampling and analysis is critical for the reliability of urine results. The concentration of urine constituents can change over time, making the measured result useless (Kirwan et al. 2018). Most parameters critically depend on the time window between sampling and analysis. In particular, the importance of adhering to early time points (within 90 min) has been stressed in automated urine analysis. In large-scale clinical metabolomics, analytical runs can last from hours to days, during which urine samples are kept in the auto-samplers of LC–MS, GC–MS and NMR analyzers at 4 °C (Saude and Sykes 2007). The stability of urine samples during this delay has been examined in several studies. Budde et al. reported that NMR signals were altered with increasing time and temperature but were fairly stable for 24 h at 10 °C, the temperature of the NMR cooling rack (Budde et al. 2016). By comparison, Rotter et al. reported that the concentrations of some amino acids (arginine, valine, leucine, and isoleucine), as measured by targeted MS, decreased significantly by 40% when samples were stored at ~ 9 °C for 24 h (Rotter et al. 2017). Barton et al. and Dunn et al. found that a 24 h delay at 4 °C had no significant effect on metabolomic profiles acquired on NMR and MS-based platforms, respectively (Barton et al. 2008; Dunn et al. 2008). Other studies suggested that 48 h at 4 °C did not significantly alter the urinary metabolome (Gika et al. 2007, 2008). As such, it is recommended that no more than 48 h worth of samples be held in an auto-sampler at any one time.
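The 48 h auto-sampler recommendation translates directly into a maximum batch size once the run time per sample is known. The sketch below shows one way to plan batches. The function names and the 20 min run time in the example are hypothetical, and the calculation assumes that samples are injected back to back with a fixed run time and that the whole batch must finish within the limit.

```python
def max_batch_size(run_minutes_per_sample, max_hours=48.0):
    """Largest number of samples whose total run time fits within max_hours."""
    if run_minutes_per_sample <= 0:
        raise ValueError("run time per sample must be positive")
    return int(max_hours * 60 // run_minutes_per_sample)


def split_into_batches(sample_ids, run_minutes_per_sample, max_hours=48.0):
    """Split sample IDs into consecutive batches that respect the time limit."""
    size = max_batch_size(run_minutes_per_sample, max_hours)
    if size < 1:
        raise ValueError("a single run already exceeds the time limit")
    return [sample_ids[i:i + size] for i in range(0, len(sample_ids), size)]


# Hypothetical study: 300 samples, 20 min per LC-MS run.
batches = split_into_batches(list(range(300)), run_minutes_per_sample=20)
batch_lengths = [len(b) for b in batches]
```

With these illustrative numbers, at most 144 samples would be loaded at once, so 300 samples would be spread over three auto-sampler loads.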

6.2 Storage temperatures

Several studies utilizing LC–MS, GC–MS and NMR have investigated the effect of different storage temperatures on urine samples. Urine samples have been shown to degrade rapidly when stored at room temperature, even for short periods, with glycolytic metabolites showing signs of degradation or metabolism (Saude and Sykes 2007). In samples stored at 4 °C for one week, acetate appeared and the intensity of the citrate resonance decreased, presumably due to microbial contamination (Lauridsen et al. 2007). For long-term storage, 6 months at − 20 °C or − 80 °C has been shown to have no effect on the urinary metabolome in an LC–MS analysis (Gika et al. 2008). Similarly, Lauridsen et al. found that storage for up to 26 weeks at − 80 °C did not alter the metabolomic profile measured by NMR (Lauridsen et al. 2007). Thus, it is optimal to freeze urine samples as soon as possible after collection to minimize the time spent at room temperature, and − 80 °C or lower is recommended for long-term storage. These results do not mean that all components of the sample are stable under these conditions; it is quite possible that individual metabolites decompose over time without grossly affecting the principal component analysis (PCA) result (Fernández-Peralbo and Luque de Castro 2012). When a research project moves from non-targeted to targeted analysis of specific metabolites, it would clearly be prudent to perform more rigorous studies on the stability of the analytes (Saude and Sykes 2007; Gika et al. 2008).

The European Consensus Expert Group Report has formulated a number of recommendations for urine biobanking: (i) combined use of mild pre-centrifugation (1000–3000×g at 4 °C) and filtration to remove cells and particulate matter; (ii) storage at a temperature of − 80 °C or lower; (iii) experimentally defined processing time limits; (iv) specimen storage without additives, unless specified for a particular downstream analysis (Yuille et al. 2010).
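These four recommendations lend themselves to an automated check of sample records at intake. The following sketch is one possible way to encode them. The record layout and field names are hypothetical, and the processing time limit is supplied by the user because the recommendation only states that it should be experimentally defined.

```python
def check_urine_biobank_record(record):
    """Return a list of deviations from the four recommendations (empty if none)."""
    issues = []
    # (i) mild pre-centrifugation (1000-3000 x g at 4 C) combined with filtration
    if not 1000 <= record["centrifuge_g"] <= 3000:
        issues.append("centrifugation outside 1000-3000 x g")
    if record["centrifuge_temp_c"] != 4:
        issues.append("centrifugation not at 4 C")
    if not record["filtered"]:
        issues.append("sample not filtered")
    # (ii) storage at -80 C or lower
    if record["storage_temp_c"] > -80:
        issues.append("storage warmer than -80 C")
    # (iii) processing time within the experimentally defined limit
    if record["processing_minutes"] > record["processing_limit_minutes"]:
        issues.append("processing time limit exceeded")
    # (iv) no additives unless a downstream analysis requires them
    if record["additives"] and not record["additive_required_downstream"]:
        issues.append("additive without downstream justification")
    return issues


# Hypothetical record that deviates on storage temperature and additives.
example_record = {
    "centrifuge_g": 2000,
    "centrifuge_temp_c": 4,
    "filtered": True,
    "storage_temp_c": -20,
    "processing_minutes": 60,
    "processing_limit_minutes": 120,
    "additives": ["sodium azide"],
    "additive_required_downstream": False,
}
example_issues = check_urine_biobank_record(example_record)
```

Returning a list of deviations, rather than a single pass/fail flag, keeps the reason for each exclusion available when sample quality is reviewed later.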

6.3 Effect of freeze–thaw

During the metabolomics workflow from sample collection to analysis, samples are likely to be frozen and thawed a number of times because of the time taken to collect and prepare them. The impact of multiple freeze–thaw cycles has been investigated using LC–MS, and up to nine freeze–thaw cycles were reported to have no significant impact on the urinary metabolome (Gika et al. 2008). However, it was also reported that three freeze–thaw cycles affected acylcarnitines and hexose and that two freeze–thaw cycles affected urea (Saude and Sykes 2007; Rotter et al. 2017). Therefore, it is prudent to limit the number of freeze–thaw cycles to as few as reasonably possible.
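In practice, limiting freeze–thaw cycles requires recording them per aliquot. The sketch below shows a minimal way to do so. The class, the function, and the cycle limit used in the example are hypothetical; the studies cited above do not agree on a single threshold, so the limit is left as a study-specific parameter.

```python
from dataclasses import dataclass


@dataclass
class Aliquot:
    """One stored urine aliquot and its freeze-thaw history."""
    sample_id: str
    freeze_thaw_cycles: int = 0

    def thaw(self):
        """Record one freeze-thaw cycle and return the new count."""
        self.freeze_thaw_cycles += 1
        return self.freeze_thaw_cycles


def flag_over_limit(aliquots, max_cycles):
    """Return IDs of aliquots thawed more often than the study-defined limit."""
    return [a.sample_id for a in aliquots if a.freeze_thaw_cycles > max_cycles]


# Hypothetical usage: U001 is thawed twice, U002 once; the limit is one cycle.
u1, u2 = Aliquot("U001"), Aliquot("U002")
u1.thaw()
u1.thaw()
u2.thaw()
flagged = flag_over_limit([u1, u2], max_cycles=1)
```

Storing several small aliquots per sample, each thawed only once, is a common way to keep every count at or below such a limit.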

7 Conclusions

Clinical samples such as blood and urine are widely used in metabolomics studies. Many components in these samples are unstable and may be oxidized, aggregated or degraded. Metabolomics analysis is sensitive to variability in sample processing and preparation, and inconsistency in sample pretreatment may be the main cause of conflicting results (Siskos et al. 2017). Multiple factors can affect the results of testing and analysis, including mismatched clinical information, the choice of collection container, and the time and conditions of processing and storage. To reduce systematic bias in metabolomics, it is important to identify all factors that may lead to unnecessary and uncontrollable pre-analytical variation and to follow standardized procedures for collecting and preprocessing samples. A summary of advice for preprocessing blood and urine samples is presented in Tables 4 and 5.

Table 4 Advice for blood samples preprocessing
Table 5 Advice for urine samples preprocessing

Therefore, every metabolomics-based study must describe the collection and processing of samples in detail and formulate and implement standard operating procedures. This strategy is essential for controlling and reducing experimental variations and ensuring the reliability of metabolomic results. Best practices and standard operating procedures should be part of every multicenter research or bioinformatics process to ensure the validity, feasibility, and comparability of metabolomics research. We are aware that it is often difficult to manage these issues for the institutions (e.g. hospitals, medical centers) in which the samples are collected. To minimize these constraints, the collaborating researchers need to obtain case and control sample information, including that of collection process, storage, and clinical history, which is very useful for explaining the abnormal values of experimental results.