Introduction

Metabolomics consists of the untargeted analysis of low molecular weight metabolites (less than 1 kDa) in biological samples [1–4]. The principal aim of this omic science is to provide insight into the metabolic status of complex living systems. Comparison of the metabolic profiles from different phenotypes can be used to identify specific metabolic changes, leading to the understanding of physiology, toxicology, and disease progression [5, 6].

The most widespread analytical platforms in metabolomics are nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) coupled with liquid chromatography (LC) or gas chromatography (GC) [7]. Although NMR spectroscopy is more robust and repeatable and requires simpler sample treatment, which makes it ideal for high-throughput analysis, it has an important drawback that MS overcomes: its low sensitivity. Historically, most metabolomic studies were done using GC coupled with MS, but the limitations on the molecular weight and type of metabolites that can be analyzed, together with the extensive sample preparation required, led to the emergence of LC coupled with MS as a metabolite profiling tool. When MS is coupled with ultra-high performance LC (UHPLC), thousands of metabolites can be analyzed simultaneously from a small sample volume. Nonetheless, MS does not provide as much structural information as NMR spectroscopy, which makes the identification of biomarkers more difficult. As these techniques have different strengths and drawbacks, the ideal case would be to analyze the same sample set with a combination of them [7, 8].

In all fields of analytical chemistry it is vital to use a repeatable sample preparation protocol in order to minimize the differences between samples due to the analytical process. This is especially important when human samples are analyzed, because diet, environmental effects, and genetic factors already cause high interindividual variability. In a metabolomic study, two different approaches can be used to compare sample treatment protocols: univariate and multivariate analysis. With a traditional (univariate) statistical approach, the median and mean coefficient of variation (CV) of the intensities of all detected features [mass-to-charge ratio (m/z) and retention time (RT) pairs] can be calculated among the replicates of each protocol. It is also possible to calculate the percentage of features that have a CV lower than a certain value among the replicates (for example, 0.3, which is considered the maximum acceptable value for the total error in targeted LC–MS analysis [9]) and the percentage of features present in all replicates or in a certain number of replicates. Comparing these values between protocols gives an overview of the differences in repeatability. However, metabolomic data sets are complex matrices composed of thousands of features, so they should also be analyzed with a multivariate approach that combines multiple features and strengthens the statistical comparison of methods. Principal component analysis (PCA) is the most suitable analysis for this purpose. One way of assessing repeatability in PCA is to calculate the intrareplicate distance within the model, the Euclidean distance (ED), a value that allows the relative repeatability among the replicates to be compared.

As the general aim in metabolomics is to study as many metabolites as possible, the best sample preparation would be the one in which the sample is modified as little as possible (e.g., a simple dilution is the usual protocol for urine metabolomic studies [10]). Previous works have reported the study of sample treatment of matrices such as liver [11, 12], plasma [13–18], cells or microbes [19–24], and feces [25] by different analytical techniques. In the case of plasma samples, direct injection is not possible without rapid degradation of the columns and loss of sensitivity caused by the gradual buildup of nonvolatile compounds in the cone of the ionization source. For these reasons, and because of the substantial ion suppression caused by endogenous compounds in LC–MS, it is crucial to apply a cleanup step before the analysis. This is usually achieved by protein precipitation (PPT) using an organic solvent, most often methanol (MeOH) or acetonitrile (ACN). An alternative approach that does not necessarily involve PPT, and thus allows the study of only those metabolites that are not strongly bound to proteins, is solid-phase extraction (SPE). A combination of organic solvent PPT and SPE can be used with the aims of releasing metabolites bound to proteins and prolonging column life. These SPE approaches remove substances that are prone to being strongly retained on the column if the same stationary phase is used. Because one of the major challenges in metabolomics is the analysis of a great number of plasma samples, the removal of these substances could be very helpful in keeping the ionization source clean and lengthening the column life. However, to our knowledge, SPE has been applied in only one metabolomic study of plasma samples analyzed by LC–MS [17]. Finally, another cleanup option is offered by phospholipid-removal cartridges such as HybridSPE cartridges. The packed-bed filter/frit assembly of these cartridges acts as a depth filter for the concurrent physical removal of precipitated proteins and chemical removal of phospholipids, whereas small molecules pass through unretained, giving a cleaner sample in which ion suppression caused by phospholipids is eliminated [26].

It is important to bear in mind the final purpose of a study. This means that the protocol that provides the highest number of metabolites is not always the most suitable. For instance, if the interest resides in nonpolar compounds that are not phospholipids, the use of HybridSPE cartridges could help to reduce the matrix effect in that chromatographic area. In contrast, if we are interested in the lipid region, the use of a solvent such as 2-propanol or dichloromethane during the extraction would be a better option.

In this study, eight human plasma preparation protocols were evaluated for their suitability for UHPLC–electrospray ionization (ESI) time-of-flight (TOF) MS based untargeted metabolomics: four based on organic solvent PPT with either ACN or MeOH in 2:1 and 3:1 (v/v) ratios with plasma; two in which SPE C18 cartridges were used, with and without a previous PPT step; one using HybridSPE cartridges; and one using microextraction by packed sorbent (MEPS). To our knowledge, this is the first time that the effectiveness and repeatability of HybridSPE and MEPS protocols have been evaluated in human plasma metabolomics. Other plasma sample preparation protocols, such as lowering the pH with acid or denaturing proteins with heat, were excluded from the outset because, according to Want et al. [15], the number of extracted features obtained with these protocols is considerably lower. The aim of this work was, therefore, to study these different human plasma preparation protocols by means of univariate and multivariate analyses in order to select a reliable and repeatable sample preparation protocol for the untargeted metabolic analysis of human plasma that offers a high number of extracted features. Other parameters, such as the number of extracted features, the reproducibility of a series of selected features, and the amount of protein remaining in each reconstituted plasma extract, were also studied.

Materials and methods

Reagents and materials

The solvents for LC–MS and sample preparation (MeOH and ACN) were OPTIMA® LC–MS grade and were obtained from Fisher Scientific (Los Angeles, CA, USA). Formic acid (FA) and sodium formate, both LC–MS grade, leucine enkephalin (high-performance LC grade), bovine serum albumin solution, and Bradford reagent were from Sigma-Aldrich (St Louis, MO, USA). Purified water from a Millipore (Milford, MA, USA) Milli-Q Element A10 water system was used. For UHPLC–ESI-TOF-MS analysis, 96-well plates were from Waters (Milford, MA, USA), well plate cap mats were purchased from VWR International (Leicestershire, UK), HybridSPE precipitation cartridges were purchased from Sigma-Aldrich (St Louis, MO, USA), and Strata-X SPE polymeric cartridges were from Phenomenex (Torrance, CA, USA). For MEPS, an eVol® XR handheld automated analytical syringe coupled to a 500 μL MEPS syringe that contains an MEPS C18 cartridge, all purchased from SGE Analytical Science (Melbourne, Australia), was used.

Plasma samples

Blood samples from ten healthy volunteers were collected in EDTA tubes. Written informed consent was obtained from all the volunteers. Plasma samples were prepared by centrifugation in a 5804 R centrifuge (Eppendorf, Hamburg, Germany) at 1,200g at 4 °C for 10 min. Pooled samples were prepared by mixing equal amounts of the ten plasma samples in one vessel. This plasma pool was divided into 10 mL fractions, which were used to evaluate the different sample preparation protocols. All the samples were stored at -20 °C until analysis.

Plasma sample preparation

Frozen plasma was thawed on ice. Eight sample preparation protocols were compared (Fig. 1), each using six preparation replicates and three instrumental replicates. All the sample extracts at the end of the treatment contained the same proportion of plasma to reconstitution solvent (50:50 v/v MeOH:H2O).

Fig. 1 Sample treatment protocols evaluated. ACN acetonitrile, MeOH methanol

Organic solvent PPT

This type of sample preparation consisted of PPT with either cold MeOH or cold ACN. The starting volume of plasma was 250 μL. Two different ratios of organic solvent to plasma were tested: 2:1 and 3:1 (v/v). After brief vortexing, plasma samples were centrifuged at 13,000g for 5 min. Then 600 μL and 800 μL of the supernatant were taken for the 2:1 and 3:1 (v/v) ratios, respectively, and they were dried under a stream of N2 at 40 °C in a TurboVap evaporator (Zymark, Barcelona, Spain). Finally, the dried extracts were reconstituted in 200 μL of 50:50 (v/v) MeOH:H2O.

SPE procedures

Two SPE approaches were studied: one included a PPT step prior to the extraction and the other included a simple dilution of the plasma sample with H2O. The PPT was done with MeOH in a 3:1 (v/v) ratio with plasma, as detailed in the previous section, but instead of being evaporated to dryness, the supernatant was evaporated until its volume was less than 200 μL, and 800 μL of H2O was then added. In the case of the dilution with H2O, 800 μL of H2O was added to 200 μL of plasma. SPE cartridges were first activated with 1 mL of MeOH and then conditioned with 1 mL of H2O. Diluted samples were loaded into the cartridge and washed with 1 mL of H2O. Subsequently, 1 mL of MeOH was used as the elution solvent. Finally, the eluate was dried under a N2 stream at 40 °C and reconstituted in 200 μL of 50:50 (v/v) MeOH:H2O.

Microextraction by packed sorbent

First, an MEPS C18 cartridge was activated with 100 μL of MeOH and then further conditioned with 1 mL of H2O. Then, 100 μL of plasma diluted with 400 μL of H2O was loaded into the cartridge. After the sample had been cleaned with 100 μL of H2O, 500 μL of MeOH was used for elution. The guide provided by the manufacturer (SGE Analytical Science, Melbourne, Australia) was followed for this protocol. Finally, the eluate was dried under a N2 stream at 40 °C and reconstituted in 100 μL of 50:50 (v/v) MeOH:H2O.

HybridSPE procedure

The protocol performed was the one recommended by the manufacturer (Sigma-Aldrich, St. Louis, MO, USA). Firstly, a PPT step was performed by the addition of 750 μL of cold 1 % FA in ACN to 250 μL of plasma. After brief vortexing, the mixture was centrifuged at 1,600g for 3 min. Then, 800 μL of the supernatant was loaded into the HybridSPE cartridge and a vacuum was applied. At this point the eluate is ready for immediate LC–MS/MS analysis, but in order to make this method comparable with the other protocols, it was dried under a N2 stream at 40 °C and reconstituted in 200 μL of 50:50 (v/v) MeOH:H2O.
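The statement that all extracts end with the same proportion of plasma to reconstitution solvent can be checked with simple volume bookkeeping. The following minimal sketch assumes that volumes are additive on mixing, that each mixture is homogeneous, and that nothing is lost during loading, elution, or drying; the function name and the dictionary layout are illustrative and are not part of the published protocols. The SPE plus PPT protocol is left out because the supernatant volume after partial evaporation is not specified.

```python
def plasma_equivalent_ul(plasma_ul, solvent_ul, aliquot_ul):
    """Plasma volume represented by an aliquot of a plasma/solvent mixture.

    Assumes additive volumes and a homogeneous mixture (illustrative only).
    """
    total_ul = plasma_ul + solvent_ul
    return plasma_ul * aliquot_ul / total_ul


# (plasma, added solvent or water, aliquot processed, reconstitution volume),
# all in microlitres, taken from the protocol descriptions above.
protocols = {
    "PPT 2:1": (250, 500, 600, 200),
    "PPT 3:1": (250, 750, 800, 200),
    "HybridSPE": (250, 750, 800, 200),
    "Direct SPE": (200, 800, 1000, 200),
    "MEPS": (100, 400, 500, 100),
}

ratios = {}
for name, (plasma, solvent, aliquot, recon) in protocols.items():
    # Plasma equivalents per microlitre of reconstituted extract
    ratios[name] = plasma_equivalent_ul(plasma, solvent, aliquot) / recon
    print(f"{name}: {ratios[name]:.2f} uL plasma per uL extract")
```

Under these assumptions every listed protocol gives one volume of plasma per volume of reconstituted extract, which is why 600 μL and 800 μL of supernatant are taken for the 2:1 and 3:1 ratios, respectively.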

UHPLC–ESI-TOF-MS analysis

Analysis was performed using an Acquity UPLC system coupled to an ESI quadrupole TOF Synapt-G2 mass spectrometer (Waters, Milford, MA, USA) operated in positive ESI mode. Chromatography was done at 40 °C on a Waters Acquity BEH C18 column (1.7 μm, 2.1 mm × 100 mm) with the following solvent system: solvent A 0.1 % FA in H2O, and solvent B 0.1 % FA in MeOH. A linear gradient was used at a flow rate of 0.5 mL/min from 100 % solvent A to 100 % solvent B in 21 min. Then 100 % solvent B was held for 2 min, after this time the initial conditions were reached, and finally, a 3-min reequilibration step was included. The injection volume was 5 μL. The ESI conditions were a source temperature of 120 °C, a desolvation temperature of 450 °C, a cone gas flow rate of 10 L/h, a desolvation gas flow rate of 900 L/h, a capillary voltage of 0.7 kV, and a cone voltage of 30 V. The TOF detector worked in resolution mode (approximately 2,000 full width at half maximum), and all mass spectral data were acquired in centroid mode by scanning an m/z range of 50–1,200 with a scan time of 0.1 s and an interscan delay time of 0.02 s. The quadrupole was not used in this study. Leucine enkephalin ([M + H]+ = 556.2771 m/z) (2 ng/μL in 0.1 % FA in 50:50 v/v ACN–H2O) was used as a lock mass, and was introduced straight into the mass spectrometer at a flow rate of 10 μL/min. The instrument was calibrated before the analysis using a 0.5 mM sodium formate solution (calibration error less than 1 ppm). All UHPLC–ESI-TOF-MS operations were run under the control of MassLynx 4.1 (Waters, Milford, MA, USA).

Design of the analysis

A blank, consisting of 50:50 (v/v) MeOH:H2O solution, was included at the beginning and at the end of the run in order to test for any possible contamination or carryover effect. A quality control (QC) sample, prepared by combining equal aliquots of the replicates from each method, was injected after every eight injections throughout the run to monitor the sensitivity and the stability of the UHPLC–ESI-TOF-MS platform. This QC sample was also used to condition the system at the beginning of the analysis. A previous test showed that at least 15 injections of a sample containing the matrix studied were necessary to stabilize the system, and that the sensitivity fell even after the analysis of a low number of plasma samples (data not shown). For this reason, and because the purpose of the study was to evaluate the repeatability of each sample preparation protocol, the injection order was deliberately not randomized. Grouping the injections by protocol minimizes the effect of instrumental drift, arising from column degradation or contamination of the MS source, on the evaluation of repeatability within each protocol. The analysis order is shown in Table 1. The preparation replicates of each protocol were injected one after the other. Instrumental replicates of one of the preparation replicates were injected between them (replicates 1_1, 1_2, and 1_3). Although this injection order is the most convenient for studying repeatability, it is not optimal for determining the number of extracted features, given the drop in sensitivity throughout the run. This problem was overcome by analyzing three replicates of each protocol in random order in a separate batch (data not shown).
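One injection sequence consistent with this design, and with the injection counts given in the Results (90 injections: 64 plasma samples, 24 QC samples, and two blanks), can be sketched as follows. The order of the protocols, the positions of the instrumental replicates within each block, and the split of the QC injections into 15 conditioning injections plus nine bracketing injections are assumptions made for illustration; the actual order is the one given in Table 1.

```python
# Hypothetical protocol order; the real order is given in Table 1.
PROTOCOLS = ["ACN 2:1", "ACN 3:1", "MeOH 2:1", "MeOH 3:1",
             "HybridSPE", "MEPS", "SPE + PPT", "Direct SPE"]


def protocol_block(name):
    # Six preparation replicates; replicate 1 is injected three times
    # (1_1, 1_2, 1_3) at assumed positions between the other replicates.
    labels = ["1_1", "2", "3", "1_2", "4", "5", "6", "1_3"]
    return [f"{name} rep {label}" for label in labels]


def build_sequence(protocols, n_conditioning=15):
    # Blank, conditioning QC injections, then one QC before the first block
    sequence = ["blank"] + ["QC conditioning"] * n_conditioning + ["QC"]
    for name in protocols:
        sequence += protocol_block(name)
        sequence.append("QC")  # one QC after every eight sample injections
    sequence.append("blank")
    return sequence


sequence = build_sequence(PROTOCOLS)
n_qc = sum(s.startswith("QC") for s in sequence)
n_blank = sequence.count("blank")
n_plasma = len(sequence) - n_qc - n_blank
print(len(sequence), n_plasma, n_qc, n_blank)  # prints: 90 64 24 2
```

Keeping each protocol in one contiguous block of eight injections means that the drift experienced between the first and last replicate is the same for every protocol.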

Table 1 Order of sample injection

Data processing

The raw spectrometric data acquired were processed using XCMS [27] (version 1.30.3) in order to convert the three-dimensional LC–MS raw data (RT, m/z, intensity) into a table of time-aligned detected features, with their RT, m/z, and intensity in each sample. XCMS is written in the R statistical programming language and is freely available under an open-source license. The version of R used was 2.14.1. The samples were grouped according to the different sample preparation protocols. Blanks and QC samples were treated as separate groups. The centWave algorithm [28] was used for peak picking with a peak width window of 3–20 s (peakwidth = c(3,20)) and a maximum tolerated m/z deviation in consecutive scans for a peak of 15 ppm (ppm = 15). The m/z width for the grouping was set to 0.015 Da (mzwid = 0.015), and the bandwidth parameter chosen was 5 s (bw = 5) for the first grouping and was then determined from the time deviation profile after RT correction. These values are commonly used for UHPLC systems coupled to a high-resolution mass spectrometer, and after studying the raw data and the results obtained, we decided that they were appropriate for the analysis performed. The rest of the parameters were set to the default values. Finally, the peaks missed by the peak picking algorithm were integrated automatically with the fillPeaks function. This step is essential for reducing the observed CV values, as many zero values are replaced with real peak intensities. To avoid LC–MS artifacts (peaks that do not represent molecular ions of metabolites, such as isotopes, common adducts and fragments, and multiple charge states), the R package CAMERA was used to filter the detected peaks. The peak marker tables (comprising m/z–RT pairs and their corresponding intensities for each sample) generated were exported into Microsoft Office Excel 2007 (Microsoft, Redmond, WA, USA) for univariate analysis and into SIMCA-P+ 11.5 (Umetrics, Umeå, Sweden) for multivariate analysis. Data outside the interval 0.5–21 min (that is, before the elution peak and from the point at which the column washing step of the analysis started) were discarded.
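Two of the numerical settings above can be made concrete with a short calculation: the absolute mass window implied by a 15 ppm tolerance, and the retention time filter applied to the exported table. The sketch below is in Python rather than R and does not call XCMS; the feature values are hypothetical, and treating the limits of the 0.5–21 min interval as inclusive is an assumption.

```python
def ppm_to_da(mz, ppm):
    """Absolute m/z tolerance (Da) corresponding to a relative tolerance in ppm."""
    return mz * ppm / 1e6


def within_rt_window(rt_min, start=0.5, end=21.0):
    """True if a retention time (min) lies inside the retained interval."""
    return start <= rt_min <= end


# Hypothetical features (m/z, retention time in minutes)
features = [
    {"mz": 104.1070, "rt_min": 0.42},   # elutes before 0.5 min, discarded
    {"mz": 496.3398, "rt_min": 16.80},  # inside the retained interval
    {"mz": 758.5694, "rt_min": 21.60},  # in the column washing step, discarded
]

kept = [f for f in features if within_rt_window(f["rt_min"])]

# Tolerance at the lock mass m/z (leucine enkephalin, 556.2771)
tolerance_da = ppm_to_da(556.2771, 15)
print(f"15 ppm at m/z 556.2771 = {tolerance_da:.4f} Da; {len(kept)} feature kept")
```

At the lock mass, 15 ppm corresponds to roughly 0.008 Da, which is of the same order as the 0.015 Da m/z width used for grouping.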

Evaluation of method repeatability

To obtain the most in-depth information on sample preparation repeatability, the data were analyzed using two different approaches: univariate and multivariate statistics. Furthermore, other parameters calculated directly from the table produced by XCMS, such as the total number of features and the CV of the area of some selected features for each protocol, can be used to study the repeatability of each protocol.

Multivariate statistics

PCA was performed on all data after logarithmic transformation, log10(peak area), and mean centering. Scores plots were examined to assess the degree of similarity between the different protocols and to identify outliers or trends in the data. The ED within the PCA data was calculated for the intrareplicate distance within the model [29] for each protocol taking into account the first three principal components. These EDs give a comparison of relative repeatability among the replicates, which can be difficult to observe in the three-dimensional PCA scores plots obtained. The lower the value, the better the repeatability it indicates.
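As an illustration of the intrareplicate distance, the sketch below takes the scores on the first three principal components as given (for example, exported from the PCA software) and computes the mean pairwise ED among the replicates of each protocol. The score values are hypothetical, and the use of the mean of all pairwise distances is an assumption about how the distance is summarized, not a restatement of the method in [29].

```python
from itertools import combinations
from math import dist
from statistics import mean


def mean_intrareplicate_distance(scores):
    """Mean pairwise Euclidean distance between replicate score vectors."""
    return mean(dist(a, b) for a, b in combinations(scores, 2))


# Hypothetical (t1, t2, t3) scores for the replicates of two protocols
scores_by_protocol = {
    "protocol A": [(1.0, 0.2, 0.1), (1.1, 0.1, 0.0), (0.9, 0.3, 0.1)],
    "protocol B": [(-2.0, 1.0, 0.5), (-1.0, 0.0, 0.0), (-3.0, 2.0, 1.0)],
}

ed = {name: mean_intrareplicate_distance(scores)
      for name, scores in scores_by_protocol.items()}

for name, value in ed.items():
    print(f"{name}: mean ED = {value:.3f}")
```

In this toy example the replicates of protocol A cluster more tightly than those of protocol B, so protocol A has the lower mean ED and would be considered the more repeatable of the two.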

Univariate statistics

Univariate statistics were used to compare the CV distribution of feature intensity among the replicates for each protocol. Another way of comparing the CV distribution is to calculate the percentage (or number) of features that have a CV lower than a certain value. In this work, the numbers of features with a CV lower than 0.15, 0.20, and 0.30 were calculated. The first two values are considered to represent an acceptable degree of repeatability according to the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use [30] for bioanalytical methods for targeted analysis: 0.15 at any concentration except the limit of quantitation, for which the value is 0.20. The third value, 0.30, is the acceptable CV in biomarker analysis [17].
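The univariate summary can be sketched as follows for one protocol. The feature names and intensities are hypothetical, and the CV is computed here with the sample standard deviation (n − 1 denominator), which is an assumption; a feature with a mean intensity of zero would need to be excluded before this calculation.

```python
from statistics import mean, median, stdev


def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def summarize(feature_table, thresholds=(0.15, 0.20, 0.30)):
    """Median and mean CV, and the number of features below each CV threshold."""
    cvs = [cv(intensities) for intensities in feature_table.values()]
    summary = {"median_cv": median(cvs), "mean_cv": mean(cvs)}
    for t in thresholds:
        summary[f"n_cv_below_{t:.2f}"] = sum(c < t for c in cvs)
    return summary


# Hypothetical intensities of four features in six preparation replicates
feature_table = {
    "M104T0.8": [100, 102, 98, 101, 99, 100],
    "M496T16.8": [100, 120, 80, 110, 90, 100],
    "M524T17.5": [100, 125, 75, 110, 90, 100],
    "M758T18.2": [100, 150, 50, 130, 70, 100],
}

summary = summarize(feature_table)
for key, value in summary.items():
    print(key, round(value, 3))
```

Dividing each count by the total number of features gives the percentages that are compared between protocols.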

Other parameters

To select the most suitable sample preparation protocol for metabolic profiling, there are some other parameters that should also be considered: the total number of features obtained and the area deviation of some selected features for each protocol. As the behavior of each detected feature depends on its own chemical structure and the RT at which it is eluted from the chromatographic column, no internal labeled standards are commonly used in metabolomics. For this reason the area deviation offered by each protocol was assessed by studying six features with different RTs and m/z present in all the replicates of all the protocols.

Protein concentration estimation

The Bradford assay was used to estimate the amount of protein remaining in three replicates of each reconstituted plasma preparation protocol. Samples were first diluted 1:2 (v/v) with distilled H2O and then mixed with Bradford reagent in a 1:20 (v/v) ratio of sample to Bradford reagent. A calibration curve was generated using known concentrations (from 0.78 to 100 mg/L) of a standard protein (bovine serum albumin).
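The conversion from absorbance to protein concentration can be illustrated with a least-squares calibration line. In the sketch below the standard concentrations span the stated range, but the individual levels and all absorbance values are synthetic; a linear response over the whole range and a twofold dilution factor (one reading of "1:2") are assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_mg_per_l(absorbance, slope, intercept, dilution_factor=2.0):
    """Protein concentration in the undiluted extract (mg/L)."""
    return (absorbance - intercept) / slope * dilution_factor


# Hypothetical bovine serum albumin standards (mg/L) and synthetic absorbances
standards = [0.78, 1.56, 3.125, 6.25, 12.5, 25.0, 50.0, 100.0]
absorbances = [0.05 + 0.004 * c for c in standards]

slope, intercept = fit_line(standards, absorbances)

# A diluted sample reading of 0.15 corresponds to 25 mg/L in the cuvette,
# that is, 50 mg/L in the extract before the assumed twofold dilution.
sample_concentration = protein_mg_per_l(0.15, slope, intercept)
print(f"{sample_concentration:.1f} mg/L")
```

With real Bradford data the response may deviate from linearity at the upper end of the range, in which case a narrower calibration range or a nonlinear fit would be more appropriate.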

Results

The eight sample preparation protocols (each performed on six preparation replicates and three instrumental replicates) were compared, resulting in a set of 90 injections, 64 plasma samples plus 24 QC samples (QC and QC conditioning), and two blanks. Different criteria were used to evaluate the protocols: repeatability by multivariate and univariate analyses, number of extracted features, repeatability of a series of selected features, and the amount of protein remaining in each reconstituted plasma extract.

Feature distribution

The base peak intensity chromatograms obtained with all the sample preparation protocols show the feature distribution with chromatographic time (Fig. 2). In general, until 16 min, all of them except those for the HybridSPE and MEPS protocols had similar profiles, differing in intensity and in the number of visible features, both of which were highest for an ACN-to-plasma ratio of 2:1 and an MeOH-to-plasma ratio of 2:1. From this time on, in the cases of the HybridSPE, MEPS, direct SPE, and SPE plus PPT protocols, fewer chromatographic peaks were detected in comparison with the other protocols, and again an ACN-to-plasma ratio of 2:1 and an MeOH-to-plasma ratio of 2:1 provided the highest peak intensities and the greatest number of features. Taking into account that the last few minutes correspond to the lipophilic region, we conclude that solvent precipitation extracts contained more lipophilic material than those of the other protocols and that a large proportion of compounds of this kind was removed whenever a cartridge was used for sample preparation.

Fig. 2 Raw base peak intensity ultra-high-performance liquid chromatography (UHPLC)–electrospray ionization-time-of-flight mass spectrometry chromatograms of a pooled plasma sample treated with the eight preparation protocols: ACN in a 2:1 (v/v) ratio with plasma (ACN 2:1), ACN in a 3:1 (v/v) ratio with plasma (ACN 3:1), MeOH in a 2:1 (v/v) ratio with plasma (MeOH 2:1), MeOH in a 3:1 (v/v) ratio with plasma (MeOH 3:1), HybridSPE, microextraction by packed sorbent (MEPS), solid phase extraction (SPE) and protein precipitation (PPT), and direct SPE. Samples were injected through a BEH C18 UHPLC column (2.1 mm × 100 mm)

Evaluation of method repeatability using multivariate statistics

The repeatability of the chromatographic method used was assessed in the PCA. A clear trend in the QC samples was observed as the instrument sensitivity fell (a 44 % drop in signal from the first QC sample to the last one, data not shown). The explained variation (R²X) was 70.5, 79.9, and 86.7 for the first, second, and third principal components, respectively, and the predicted variation (Q²X) for these components was 70.2, 79.5, and 86.1, respectively. When the model was built without including the QC samples in the data processing step (Fig. 3), in order to avoid their effect in the PCA, a large increase in R²X and Q²X was observed for the first three components (91.6, 96.0, and 97.9 for R²X and 88.5, 92.6, and 95.8 for Q²X). All the protocols were well separated from each other, and the results agreed with the differences observed in the base peak intensity chromatograms. This can be observed in the distribution of the sample treatment protocols along the first principal component.

Fig. 3 Three-dimensional principal component analysis scores plot for all the plasma preparation protocols after removal of liquid chromatography–mass spectrometry artifacts and quality control samples

To give a number to the relative repeatability of the protocols shown by the PCA scores plot, the EDs generated from the first three components were calculated. In Fig. 4, the EDs represent the mean of the distances among the replicates of a particular protocol.

Fig. 4 Calculated Euclidean distances (ED), expressed as mean ± confidence interval (95 % confidence level)

The protocol involving an ACN-to-plasma ratio of 2:1 yielded the lowest mean ED. When this value was statistically compared (95 % confidence level) with the EDs of the other sample preparation protocols, significant differences were obtained in all cases, except for SPE plus PPT.

Evaluation of method repeatability using univariate statistics

Box plots were used to visualize the distribution of feature intensity CV values among the replicates (Fig. 5). Each box shows the degree of dispersion of CV values of features for one condition by displaying the 25th percentile (bottom of the box), the median, the mean, and the 75th percentile (top of the box). According to this plot, the protocol based on solvent precipitation with ACN in a 2:1 ratio with plasma, showing a median and a mean CV of 0.12 and 0.15, respectively, was the most repeatable protocol. An analysis of variance confirmed that the observed differences between this protocol and the other protocols were significant.

Fig. 5 Box plots comparing the distribution of feature intensity coefficients of variation (CV) among replicates for the different sample preparation protocols. For each box, the bottom corresponds to the 25th percentile, the middle band corresponds to the median (numerical value indicated), the diamond corresponds to the mean, and the top corresponds to the 75th percentile

Another way of visualizing the distribution of feature intensity CV values is to calculate the number of extracted features for each method with an intensity CV lower than 0.15, 0.20, and 0.30. These data are shown in Fig. 6. In terms of the number of consistently detected features, the protocol with a 2:1 ACN-to-plasma ratio again appeared to be the most repeatable, with 93 % of the features having a CV below 0.30. It is important to bear in mind that the total numbers of features indicated in this section do not reflect the true values, because they are affected by the fall in sensitivity during the analysis.

Fig. 6 Distribution of repeatable features for each protocol. The numbers above the columns express the percentage of features with a CV < 0.30 of the total number of detected features

Number of extracted features

The number of extracted features from XCMS for each protocol was also considered in the selection of the optimal conditions. As mentioned in “Design of the analysis,” this value was calculated by injecting only three replicates from each protocol in random order in a separate batch, so that the drop in sensitivity during a long run did not lead to an underestimation of the number of features in the samples analyzed at the end of the batch (data not shown). The order obtained, in terms of the number of extracted features, did not differ from the one observed in the designed run. As Table 2 shows, the maximum number of detected features was achieved with PPT with MeOH in a 2:1 ratio with plasma, closely followed by PPT with ACN in a 2:1 ratio with plasma. MeOH was likewise reported to give the best feature extraction efficiency in serum samples by Want et al. [15] and in plasma samples by Bruce et al. [16]. A large decrease in the number of extracted features was observed for the HybridSPE, MEPS, and direct SPE protocols. This is to be expected: the HybridSPE protocol includes a mechanism for the removal of phospholipids, and protocols based on C18 cartridges without a previous PPT step do not recover protein-bound compounds, whereas protocols including a PPT step drastically alter the three-dimensional structure of proteins and thereby release the metabolites bound to them.

Table 2 Pairwise comparison showing the number of common features and the percentage, taking as 100 % the total detected features in the protocol, and the total detected features in each case

The number of common repeatable features between methods was calculated. Pairwise comparisons of the different plasma preparation protocols showing the number of common features and the percentage, taking as 100 % the total detected features in the protocol, and the total detected features in each case are indicated in Table 2. A high percentage of common features (more than 76 %) can be seen for the different proportions of organic solvent to plasma used, with the percentage being highest in the case of ACN in a 3:1 ratio with plasma (87 %). On the other hand, it is observed that with use of SPE protocols it is possible to detect features that are not observed in PPT protocols, meaning that both types of protocol are complementary.
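A pairwise comparison of this kind can be sketched with set operations. The example below assumes that features are identified by shared identifiers across protocols, as they would be if all samples were processed together in a single peak table, and that the percentage is expressed relative to the total number of features of the first protocol in each pair; the feature identifiers are hypothetical.

```python
def pairwise_overlap(features_by_protocol):
    """Number and percentage of common features for every ordered protocol pair.

    The percentage takes the total number of features detected with the
    first protocol of the pair as 100 %.
    """
    table = {}
    for name_a, feats_a in features_by_protocol.items():
        for name_b, feats_b in features_by_protocol.items():
            if name_a == name_b:
                continue
            common = feats_a & feats_b
            table[(name_a, name_b)] = (len(common),
                                       100.0 * len(common) / len(feats_a))
    return table


# Hypothetical feature identifiers detected with two protocols
features_by_protocol = {
    "PPT": {"F1", "F2", "F3", "F4"},
    "SPE": {"F3", "F4", "F5"},
}

overlap = pairwise_overlap(features_by_protocol)
for (a, b), (n_common, pct) in overlap.items():
    print(f"{a} vs {b}: {n_common} common features ({pct:.1f} % of {a})")
```

Because the denominator changes with the reference protocol, the table is not symmetric: in this toy example the two common features represent 50 % of the PPT features but about 67 % of the SPE features.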

Repeatability of selected metabolites

Analytical and sample preparation repeatability was further investigated using selected features. For this purpose six features with different RTs and m/z values present in all the replicates of all the protocols were chosen. Table 3 lists these features and their measured m/z value, RT, and intensity CV among the replicates for each protocol.

Table 3 Repeatability of six features with different m/z and retention times (RT) present in all the replicates of each sample preparation protocol

All the protocols provided a mean CV below 0.2, with the SPE plus PPT protocol yielding the best results in terms of repeatability, offering a mean CV of 0.12 among preparation replicates and 0.04 when only instrumental replicates are taken into account.

Protein concentration

The concentration of residual protein was estimated for each plasma preparation protocol using the Bradford assay. According to the results obtained by Bruce et al. [16], ACN has better protein removal efficiency than MeOH in plasma samples. As can be observed in Table 4, our results agree with these observations, with the two protocols based on PPT with MeOH resulting in the largest amounts of residual protein. The efficiency of SPE cartridges in terms of protein removal was not reported in any of the aforementioned studies, but these results indicate that, as expected, the use of these cartridges gives a cleaner extract than a simple PPT with MeOH. It is worth remarking that in the SPE plus PPT protocol MeOH was used as the organic solvent for PPT; if ACN had been used instead, this combination might have resulted in the lowest remaining protein concentration.

Table 4 Absorbance measurements and concentrations of residual protein (expressed as mean ± confidence interval; 95 % confidence level) estimated for each plasma preparation protocol using the Bradford assay

Discussion

Many reported approaches for plasma sample preparation have focused on a small subset of metabolites [14, 16–18]. As our aim was to perform untargeted metabolic profiling of plasma samples encompassing a wide range of chemical structures, it was also essential to study the repeatability of the different plasma preparation protocols used for all the detected features, as reported by Want et al. [15]. For this reason the efficiency and repeatability of the selected plasma preparation protocols were investigated using univariate and multivariate analyses, and other parameters, such as feature distribution, number of extracted features, the repeatability of some selected features, and the amount of protein remaining in the reconstituted samples, were studied. All of these criteria should be taken into account when selecting a sample preparation protocol for metabolic profiling studies.

To our knowledge, the use of SPE cartridges in metabolomics has been evaluated in only one work [17], but the authors did not study sample preparation repeatability. Our work is the first in which the efficiency of the SPE plus PPT, MEPS, and HybridSPE protocols in metabolomics has been evaluated.

Before discussing the different plasma preparation protocols studied, we point out that an appropriate study design is essential to fulfill the desired aim. The LC–MS method applied shows a drop in sensitivity of 44 % over a single run, and this needs to be taken into consideration; otherwise, the results would not be reliable. The design proposed to solve this problem is to inject the samples grouped by protocol rather than in random order. In this way, the largest drop in sensitivity within a protocol is only 7 % (calculated from the instrumental replicates), and the distance from the first replicate to the last one is the same for all the protocols. The drop in sensitivity also affects the number of detected features. To overcome this, a separate analysis was performed in which only three preparation replicates of each protocol were analyzed in random order.
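
The effect of the injection order can be illustrated with a small sketch. It assumes, purely for illustration, that sensitivity decays linearly over the run (the real decay need not be linear, as the observed 7 % within-protocol drop suggests); the protocol labels, the number of replicates, and the function names are hypothetical.

```python
import random

def blocked_order(protocols, n_replicates):
    """Injection list in which all replicates of a protocol are consecutive."""
    return [(p, r) for p in protocols for r in range(1, n_replicates + 1)]

def randomized_order(protocols, n_replicates, seed=0):
    """The same injections, shuffled across the whole run."""
    order = blocked_order(protocols, n_replicates)
    random.Random(seed).shuffle(order)
    return order

def within_protocol_drop(order, total_drop=0.44):
    """Sensitivity drop between the first and last replicate of each protocol.

    Assumes a linear decay that reaches total_drop at the last injection.
    """
    positions = {}
    for index, (protocol, _) in enumerate(order):
        positions.setdefault(protocol, []).append(index)
    last = len(order) - 1
    return {p: total_drop * (max(ix) - min(ix)) / last
            for p, ix in positions.items()}

protocols = [f"P{i}" for i in range(1, 9)]  # eight protocols, hypothetical labels
blocked = within_protocol_drop(blocked_order(protocols, 6))
shuffled = within_protocol_drop(randomized_order(protocols, 6))
print(max(blocked.values()), max(shuffled.values()))
```

In the blocked design every protocol spans the same number of injections, so the drift contributes equally to each protocol; in the shuffled design the span, and therefore the drift contribution to the CV, differs from protocol to protocol.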

After the evaluation of the different criteria applied, we found that PPT with ACN in a 2:1 (v/v) ratio with plasma produced the best results for most of the parameters studied. This protocol offered one of the lowest EDs between replicates in multivariate analysis, the lowest CV values in univariate analysis, good repeatability of the selected features, and one of the lowest concentrations of remaining protein in the final extract. Furthermore, although PPT with ACN in a 2:1 ratio with plasma offered 9.5 % fewer total detected features than PPT with MeOH in a 2:1 ratio with plasma, the difference in terms of repeatable features (CV < 0.30) was only 1.2 %. The satisfactory results in terms of repeatability could be explained by the combination of the simplicity of the sample preparation protocol itself (the same for all the PPT protocols) and the high efficiency of ACN as a PPT agent (the HybridSPE protocol also included a PPT with ACN step). The better repeatability achieved by PPT with ACN compared with MeOH could be explained by the fact that the extracts obtained with ACN as the organic solvent exhibited lower ionization suppression than the ones obtained with MeOH [31]. When PPT with ACN is compared with the HybridSPE protocol, it must be taken into account that the HybridSPE protocol includes an additional step that could be the cause of the observed differences in repeatability. The lower number of features extracted by the protocols that include retention mechanisms was probably due, in the case of direct SPE and MEPS, to the fact that metabolites bound to proteins are not released and, in the case of the HybridSPE cartridges, to the removal of phospholipids.
Finally, according to the Bradford assay, the extracts obtained using ACN as an agent for PPT were cleaner than the ones obtained with MeOH, which were the extracts that contained the highest concentration of remaining proteins. This agrees with the results reported by Bruce et al. [16], and thus, MeOH should not be an option when a long column life is desired.
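
The multivariate criterion mentioned above, the ED between replicates in the PCA model, can be sketched as follows. This section does not spell out the exact formula, so the sketch assumes the mean pairwise Euclidean distance between the score coordinates of the replicates of one protocol; the scores shown are hypothetical and are assumed to come from an already fitted PCA model.

```python
from itertools import combinations
from math import dist
from statistics import mean

def mean_intrareplicate_distance(scores):
    """Mean pairwise Euclidean distance between replicate PCA scores.

    scores is a list of coordinate tuples, one per replicate, e.g. (t1, t2).
    """
    return mean(dist(a, b) for a, b in combinations(scores, 2))

# Hypothetical (t1, t2) scores for the replicates of two protocols.
tight_protocol = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.8)]
loose_protocol = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

print(mean_intrareplicate_distance(tight_protocol))
print(mean_intrareplicate_distance(loose_protocol))
```

A smaller value means that the replicates cluster more tightly in the score plot. Because the scores depend on data scaling and on the number of components retained, such distances are comparable only within a single model.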

Although Michopoulos et al. [17] observed that SPE offers good repeatability, our results do not fully agree when SPE is compared with PPT protocols. A reason for this difference could be that they did not include preparation replicates and that they focused on only a small subset of metabolites. Another difference between their study and ours is that all our protocols were performed manually, whereas they used an automated SPE station, which could significantly improve repeatability. Want et al. [15] stated that ACN was the organic solvent that provided lower repeatability in terms of univariate analysis in serum samples. The reason for the difference between their results and ours cannot be easily established, but it could be that serum and plasma, although similar, are different matrices. Another explanation could be that they injected the sample preparation replicates randomly, and therefore sample treatments whose replicates were further from each other in the run show a higher CV owing to the decay in sensitivity.

The MEPS and HybridSPE protocols did not result in the expected good repeatability, but automation of the processes and/or optimization of the parameters that affect the extractions might significantly improve the results.

In terms of complementarity it is important to mention that pairwise comparison showed that many of the detected features in protocols that included an SPE mechanism were not detected when only PPT protocols were used. These findings mean that if both types of protocols are combined, a higher coverage of the metabolome could be obtained.
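
A pairwise comparison of this kind can be sketched as follows, assuming that the features of two protocols are matched by m/z and RT within fixed tolerances. The tolerances, the function names, and the feature lists are hypothetical; the actual matching in this work may have been done by the feature extraction software.

```python
def is_same_feature(a, b, ppm_tol=10.0, rt_tol=0.1):
    """True if two (m/z, RT in min) pairs agree within both tolerances."""
    mz_a, rt_a = a
    mz_b, rt_b = b
    ppm_error = abs(mz_a - mz_b) / mz_a * 1e6
    return ppm_error <= ppm_tol and abs(rt_a - rt_b) <= rt_tol

def unique_features(features_a, features_b, ppm_tol=10.0, rt_tol=0.1):
    """Features of protocol A that have no match in protocol B."""
    return [f for f in features_a
            if not any(is_same_feature(f, g, ppm_tol, rt_tol)
                       for g in features_b)]

# Hypothetical feature lists for an SPE-based and a PPT-based protocol.
spe_features = [(180.0634, 1.20), (496.3398, 7.85), (265.1183, 3.40)]
ppt_features = [(180.0635, 1.22), (265.1200, 3.41)]

only_in_spe = unique_features(spe_features, ppt_features)
print(only_in_spe)
```

The features returned are those that would extend the coverage of the metabolome if the two types of protocols were combined.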

Finally, this work shows an approach for comparing the results obtained from different sample preparation protocols for untargeted metabolomics that takes into consideration many different parameters that directly affect the results. This strategy could be applied to any other matrix and/or sample preparation protocol.

Conclusions

We have reported an appropriate study design for the evaluation of different human plasma sample preparation protocols for use in a metabolomic study. After the evaluation of the eight plasma preparation protocols studied (based on organic solvent precipitation, on three different cartridges, or on a combination of both), we can conclude that a simple PPT with ACN in a 2:1 (v/v) ratio with plasma is the protocol that best satisfies, overall, all the requirements established: the best results in terms of univariate and multivariate repeatability, a high number of extracted features, and good removal of proteins from the sample, which is very relevant when a large number of samples is to be analyzed. On the other hand, PPT-based and cartridge-based extraction protocols offer different coverage of the metabolome.