Introduction

Lipophilic compounds such as tocopherols, phytosterols, and policosanols are becoming more popular because of their benefits to human health. Dietary tocopherols, especially α-tocopherol, can prevent oxidative stress in vivo and are beneficial for overall health (Haffner 2000). Phytosterols have also been shown to reduce total serum cholesterol levels, reduce low-density lipoprotein (LDL) cholesterol levels, and increase high-density lipoprotein (HDL) cholesterol levels in the blood (Arruzazabala et al. 2002). In addition, consuming 5–20 mg/day policosanols has been shown to lower total cholesterol by 17–21 %, and to lower LDL by 21–29 % and increase HDL by 8–15 % (Arruzazabala et al. 1993; Gouni-Berthold and Berthold 2002). Previously, we determined the phytochemical diversity among varieties of perilla (Kim et al. 2012d). However, to date, few studies have reported the contents of lipophilic compounds in leafy vegetables.

Most vegetables contain low-to-moderate levels of vitamin E. Due to their abundance in our diets, vegetables are an important and consistent source of vitamin E. Vegetable oils are known to be the best natural sources of dietary plant sterols (Piironen et al. 2000). Validated food composition data are needed to further investigate the effects of natural plant sterol levels. The policosanol composition of various materials, such as beeswax, rice bran, wheat germ, sugar cane, sorghum grain, perilla seed, and wheat grain, has been analyzed (Hwang et al. 2004; Lin et al. 2004; Irmak and Dunford 2005; Adhikari et al. 2006; Kim et al. 2012d). However, to the best of our knowledge, there is little published data on the variation in policosanol composition of leafy vegetables.

Plant surfaces exposed to the atmosphere usually have a layer of wax that preserves plant water balance, minimizes mechanical damage to cells, and resists fungal and insect attacks (Irmak and Dunford 2005). Sterols are integral membrane components that serve to regulate the fluidity and permeability of membranes, and affect various membrane functions such as simple diffusion and active transport across the membrane. Sterols also modulate the activities of membrane-associated proteins, including enzymes, receptors, and signal transduction components (Piironen et al. 2000). Correlations between the concentrations of various metabolites can be computed to understand metabolic associations (Park et al. 2014). These correlations are the net result of direct enzymatic conversions and indirect cellular regulation of transcriptional or biochemical processes. Thus, additional information can be obtained from functional genomic studies (Steuer et al. 2003).

Plant identification and differentiation at the species and individual genotype levels are of major importance for plant scientists (Goodacre et al. 2007). Because plant metabolite composition is influenced by species and origin, differences in either may result in different metabolic phenotypes. Metabolic profiling facilitates classification of samples of diverse biological status, origin, or quality, using chemometrics such as principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) (Kim et al. 2013). Previously, multivariate analysis of metabolome data was used to classify herb plants (Kim et al. 2012a), to predict the flower colors of Catharanthus roseus (Pan et al. 2012) and the quality of green tea (Pongsuwan et al. 2007), and to identify several unknown vegetable oils (Rusak et al. 2003). However, no metabolic profiling studies have been conducted on leafy vegetables. Thus, we used PCA and PLS-DA to discriminate leafy vegetables commonly consumed in Korea according to species and family, and to compare their quality.

The most commonly used technique for determining tocopherol content is normal-phase HPLC with ultraviolet or fluorescence detection. Gas chromatography (GC) is the most frequently used method of analyzing sterols and policosanols (Adhikari et al. 2006). Furthermore, GC–mass spectrometry (GC–MS) can be used to simultaneously quantify lipophilic compounds such as fatty acids, tocopherols, phytosterols, and policosanols (Lechner et al. 1999; Shepherd et al. 2007). In this study, we evaluated the composition of lipophilic compounds (policosanols, sterols, and tocopherols) in nine leafy vegetables and sought to identify vegetable species by lipophilic profiling using GC–MS and multivariate analysis.

Materials and methods

Samples and chemicals

Mustard, oak, Chinese mallow, chicon, kale, lettuce, tatsoi, amaranth, and chard were purchased at local supermarkets. Each sample was collected in triplicate. The samples were freeze-dried at −70 °C for at least 72 h and pulverized into a fine powder using a mortar and pestle. The milled powders were kept at −70 °C prior to extraction. Eicosanol (c20), heneicosanol (c21), tricosanol (c23), tetracosanol (c24), hexacosanol (c26), heptacosanol (c27), octacosanol (c28), triacontanol (c30), campesterol, stigmasterol, β-sitosterol, 5α-cholestane, pyridine, ascorbic acid, and N-methyl-N-trimethylsilyltrifluoroacetamide (MSTFA) were obtained from Sigma (USA). Docosanol (c22) was obtained from Aldrich. Tocopherols were purchased from Merck (Germany). All other chemicals used in this study were reagent grade, unless stated otherwise.

Preparation of stock solution and standard solutions

Standard stock solutions of α-tocopherol, β-tocopherol, γ-tocopherol, α-tocotrienol, β-tocotrienol, γ-tocotrienol, δ-tocotrienol, cholesterol, brassicasterol, campesterol, stigmasterol, β-sitosterol, and the internal standard (IS) used in this study were prepared at 100 μg/mL in n-hexane. Standard stock solutions of c20, c21, c22, c23, c24, c26, c27, c28, c30, β-amyrin, and α-amyrin were prepared at 100 μg/mL in chloroform. Calibration samples, ranging from 0.25 to 5.00 μg, were prepared by mixing appropriate aliquots of each stock solution. All standard solutions were stored at −20 °C, with the exception of stock solutions, which were stored at −70 °C.

Extraction and analysis of samples

Extraction of tocopherols, phytosterols, and policosanols was performed according to the method developed by our group, with a slight modification (Kim et al. 2012d). Lipophilic metabolites were extracted from the powdered samples (0.05 g) by adding 3 mL of ethanol containing 0.1 % ascorbic acid (w/v) and 0.05 mL of 5α-cholestane (10 µg/mL) as an IS; the mixture was vortexed for 20 s and placed in a water bath at 85 °C for 5 min. After removal from the water bath, 120 µL of potassium hydroxide (80 %, w/v) was added, and the samples were vortexed for 20 s and returned to the water bath for 10 min. After saponification, the samples were immediately placed on ice, and deionized water (1.5 mL) was added. Hexane (1.5 mL) was then added to each sample, followed by vortex-mixing for 20 s and centrifugation (1200×g, 5 min). The upper layer was pipetted into a separate tube, and the pellet was re-extracted using hexane. The hexane fraction was dried in a centrifugal concentrator (CC-105, TOMY, Japan). For derivatization, 30 µL of MSTFA and 30 µL of pyridine were added, and the mixture was incubated at 60 °C for 30 min at a mixing frequency of 1200 rpm using a Thermomixer Comfort (model 5355; Eppendorf AG, Germany).

Samples (1.0 µL each) were injected at a split ratio of 10:1, and gas chromatography was performed using an Rtx-5MS column (30-m length, 0.25-mm diameter, and 0.25-µm film thickness). A GCMS-QP2010 Ultra system with an AOC-20i autosampler (Shimadzu, Japan) was used to separate the lipophilic compounds. The injection, interface, and ion source temperatures were set at 290, 280, and 230 °C, respectively. Helium was used as the carrier gas at a column flow rate of 1.0 mL/min. The oven temperature program consisted of 2 min of isothermal heating at 150 °C, followed by an increase at a rate of 15 °C/min up to 320 °C, with a hold time of 10 min. Mass spectra were recorded from 2.00 to 23.33 min of runtime. The chromatograms and mass spectra were evaluated using the LabSolutions GCMSsolution software version 4.11 (Shimadzu).
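The recording window ends at 23.33 min, which matches the total length of the oven program described above. The following minimal sketch reproduces that arithmetic; the function name is illustrative and is not part of the instrument software.

```python
def oven_program_minutes(initial_hold_min, start_temp_c, ramp_c_per_min,
                         final_temp_c, final_hold_min):
    """Return the total run time (min) of a single-ramp GC oven program."""
    ramp_min = (final_temp_c - start_temp_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values taken from the oven program described in the text:
# 2 min at 150 C, 15 C/min up to 320 C, 10 min hold.
total_min = oven_program_minutes(2.0, 150.0, 15.0, 320.0, 10.0)
print(f"Total run time: {total_min:.2f} min")  # Total run time: 23.33 min
```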

Statistical analysis

All analyses were performed at least in triplicate. Data acquired from GC–MS were subjected to PCA and PLS-DA (SIMCA-P version 12.0; Umetrics, Sweden) to evaluate the relationships among groups. The PCA and PLS-DA output consisted of score plots to visualize the contrast among samples and loading plots to explain the cluster separation. Pearson’s correlation analysis was performed using the SAS 9.2 software package (SAS Institute, USA). Hierarchical clustering analysis (HCA) and heat map visualization of the correlation coefficients were performed using the software Multi-Experiment Viewer version 4.9.0 (http://www.tm4.org/mev/).

Results and discussion

Lipophilic metabolite profiling of samples

In this study, lipophilic compounds such as policosanols, phytosterols, and tocopherols in vegetables were measured by GC–MS (Fig. 1). Previously, we analyzed 14 lipophilic metabolites using GC–MS (Kim et al. 2012b). Under the GC–MS conditions described here, the peaks of 22 lipophilic compounds were detected within 17 min, allowing high-throughput analysis of almost 75 samples per 24 h. Quantification was performed using selected ions, as described in Table 1. Precursor and product ions in the mass spectra of tocopherols and phytosterols were in agreement with those in the literature (Van Pelt et al. 1998; Schummer et al. 2009). In the mass spectra of policosanol TMS derivatives, the [M-15]+ ion was generally dominant. TMS derivatives of sterols showed mainly the fragments [M-90]+, [M-129]+, and m/z 129.
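To make the fragment notation concrete, the sketch below computes nominal m/z values for the ions named above. It assumes a single hydroxyl group per molecule and integer atomic masses. The molecular formulas are standard ones entered by hand, so the output is illustrative and is not a reproduction of the selected ions listed in Table 1.

```python
# Nominal (integer) atomic masses.
C, H, O, SI = 12, 1, 16, 28

# Silylation replaces the hydroxyl H with Si(CH3)3: net gain of SiC3H8.
TMS_GAIN = SI + 3 * C + 8 * H  # 72


def tms_ions(n_carbon, n_hydrogen, n_oxygen=1):
    """Nominal m/z of M+ and common fragments for a mono-TMS derivative."""
    molecular = n_carbon * C + n_hydrogen * H + n_oxygen * O + TMS_GAIN
    return {
        "M": molecular,
        "M-15": molecular - 15,    # loss of a methyl group
        "M-90": molecular - 90,    # loss of trimethylsilanol
        "M-129": molecular - 129,
    }


# (carbon, hydrogen) counts of the underivatized compounds (assumed formulas).
compounds = {
    "octacosanol (c28)": (28, 58),
    "beta-sitosterol": (29, 50),
    "cholesterol": (27, 46),
}
for name, (n_c, n_h) in compounds.items():
    print(name, tms_ions(n_c, n_h))
```

For octacosanol this gives a nominal [M-15]+ of m/z 467, and for β-sitosterol it gives [M-90]+ and [M-129]+ at m/z 396 and 357.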

Fig. 1

Representative total ion chromatogram of a standard mixture obtained in scanning mode as TMS derivatives separated on Rtx-5MS column. The upper traces were recorded in SIM mode. Peak 1 c20 (eicosanol); 2 c21 (heneicosanol); 3 c22 (docosanol); 4 c23 (tricosanol); 5 c24 (tetracosanol); 6 5α-cholestane (internal standard); 7 c26 (hexacosanol); 8 β-tocopherol; 9 γ-tocopherol; 10 c27 (heptacosanol); 11 δ-tocotrienol; 12 c28 (octacosanol); 13 γ-tocotrienol; 14 α-tocopherol; 15 cholesterol; 16 brassicasterol; 17 α-tocotrienol; 18 campesterol; 19 c30 (triacontanol); 20 stigmasterol; 21 β-sitosterol; 22 β-amyrin; 23 α-amyrin

Table 1 Gas chromatographic and mass spectrometric data of trimethylsilyl derivatives of lipophilic compounds

Quantification was performed by means of three-point calibration curves, for which the amounts of each compound in a mixture of 22 authentic standards ranged from 0.25 to 5.0 µg, whereas the amount of the IS was constant at 0.5 µg. The calibration curves of the 22 lipophilic compounds measured under optimized saponification conditions were linear (r² = 0.9912–0.9997), and the limits of detection (LODs) varied within the range of 0.005–0.020 µg (Table 2). These results confirmed the suitability of the present method for measurement of tocopherols, tocotrienols, policosanols, and sterols in unknown samples.
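The internal-standard calibration can be sketched as follows. The 0.5 µg IS amount follows from the extraction protocol (0.05 mL of a 10 µg/mL solution), and the 0.05 g sample mass is also from the text. The middle calibration level and all peak-area ratios are hypothetical values chosen for illustration, and the helper names are assumptions rather than the authors' software.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus r^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy ** 2 / (sxx * syy)


# Internal standard added per sample: 0.05 mL of a 10 ug/mL solution.
is_amount_ug = 0.05 * 10.0  # 0.5 ug

# Hypothetical three-point calibration (analyte amount vs. analyte/IS area ratio).
amounts_ug = [0.25, 1.0, 5.0]
area_ratios = [0.52, 2.05, 9.90]
slope, intercept, r2 = fit_line(amounts_ug, area_ratios)


def quantify(area_ratio, sample_mass_g=0.05):
    """Convert an analyte/IS area ratio to content in ug per g of dry powder."""
    amount_ug = (area_ratio - intercept) / slope
    return amount_ug / sample_mass_g


print(f"IS amount: {is_amount_ug} ug, r^2 = {r2:.4f}")
print(f"Content for area ratio 1.2: {quantify(1.2):.1f} ug/g")
```

Because the IS amount is constant across standards and samples, the area ratio corrects for losses during extraction and for injection variability.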

Table 2 Linearity and limit of detection for the assay of 22 lipophilic compounds as TMS derivatives

Tocopherols are lipophilic antioxidants with important functions in plants and humans (Haffner 2000). Phytosterols are bioactive lipophilic compounds capable of reducing serum cholesterol (Agren et al. 2001). Policosanols have various beneficial physiological effects, including reducing platelet aggregation, endothelial damage, and foam cell formation, and improving the exercise performance of coronary heart disease patients (Irmak and Dunford 2005). Thus, we analyzed the contents of these metabolites in nine vegetables: mustard leaf, oak leaf, Chinese mallow, chicon, kale, red lettuce, tatsoi, amaranth, and chard. In the present study, the alkaline saponification method was used to measure lipophilic components. The quantitative results include free sterols and their esters, but not steryl glycosides. In total, 18 metabolites, including three tocopherols, four phytosterols, nine policosanols, β-amyrin, and α-amyrin, were detected in the samples.

Classification of vegetable species by principal component analysis and correlation between metabolites in vegetables by hierarchical clustering analysis and Pearson’s correlation analysis

Food component analysis traditionally involves classifying food constituents into very broad categories, such as proteins, fats, carbohydrates, fiber, vitamins, trace elements, solids, and ash. However, the advent of metabolomics has allowed metabolite profiling combined with chemometrics to direct quality control strategies for improving and optimizing the specific balance of the components of fresh food (Kim et al. 2012c). PCA is an important tool, especially in the preliminary steps of a multivariate analysis, for obtaining an overview of complex experimental data and finding patterns in them. The data obtained for the lipophilic metabolites were subjected to PCA to outline the differences in lipophilic profile among the vegetable samples. PCA transforms the original variables, using an orthogonal linear transformation, into a new set of uncorrelated variables known as principal components (PCs). Representation of the PC scores and loadings in a two-dimensional plot can provide an overview of the data, revealing patterns hidden in the data set and identifying correlations among variables. Each point in the score plots represents an individual sample, and samples exhibiting similar variances are clustered together.
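The transformation described above can be illustrated with a small, dependency-free sketch that extracts only the first PC by power iteration on the covariance matrix. This is not the SIMCA-P implementation used in the study, no variable scaling is applied, and the data matrix is hypothetical (rows are samples, columns are metabolites).

```python
import math


def center(rows):
    """Subtract the column mean from every value."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of mean-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def pca_first_component(rows, iterations=500):
    """Return (loadings, scores, explained variance ratio) for PC1."""
    centered = center(rows)
    cov = covariance(centered)
    p = len(cov)
    v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, scores, eigenvalue / total_variance


# Hypothetical metabolite contents for four samples and three metabolites.
data = [
    [0.10, 0.40, 0.05],
    [0.12, 0.42, 0.06],
    [0.55, 0.10, 0.30],
    [0.60, 0.12, 0.28],
]
loadings, scores, explained = pca_first_component(data)
print("PC1 loadings:", [round(x, 3) for x in loadings])
print("PC1 scores:", [round(x, 3) for x in scores])
print(f"PC1 explains {explained:.1%} of the total variance")
```

The loadings show how strongly each metabolite contributes to the component, and the scores place each sample along it, which is what the loading and score plots display for the first two PCs.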

The PCA results showed that the variance among samples of the same species was small (Fig. 2). Furthermore, three of the four families were clearly separated, the exception being Asteraceae. The first PC, accounting for 31.9 % of the total variance, separated the metabolite profiles of Amaranthaceae from those of other families such as Brassicaceae and Malvaceae. The corresponding loading plots identify the metabolites responsible for the separation on the score plots. The significant metabolites for PC1 were policosanols such as c22, c21, c26, and c23. In addition, the PC1 loadings were negative for c28 and c30, suggesting that the c28 and c30 contents of chard were higher than those of other vegetable species.

Fig. 2

PCA score plots (A) and loading plots (B) derived from lipophilic metabolites of nine vegetable species. PC1 and PC2 accounted for >54.2 % of the total variance. The ellipse represents the Hotelling’s T² with 95 % confidence in the score plot. The loading plots represent the original variables in the space of the PCs. They reveal the magnitude and direction of correlation of the original variables with the first two PCs. c20, eicosanol; c21, heneicosanol; c22, docosanol; c23, tricosanol; c24, tetracosanol; c26, hexacosanol; c27, heptacosanol; c28, octacosanol; c30, triacontanol

To examine detailed relationships among the concentrations of the 18 metabolites in the vegetables, we performed Pearson’s correlation analysis and HCA on the data for all samples (Fig. 3). There were strong correlations between metabolites that participate in closely related pathways, demonstrating the robustness of the present experimental system. In this study, a significant positive relationship was observed between β-amyrin and α-amyrin (r = 0.7740, P < 0.0001). Likewise, there were significant correlations among sterols such as cholesterol, campesterol, and β-sitosterol. These results suggest that the reasonable score range of the components could be used for sample selection according to the correlation between the variables and these two components.
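Each cell of the correlation matrix is a Pearson coefficient between two metabolites across all samples. A minimal sketch of that calculation follows. The contents are hypothetical, and the study itself used SAS 9.2 rather than this code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict of name -> list of values."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical contents (ug/g) of three metabolites in five samples.
contents = {
    "alpha-amyrin": [12.0, 15.0, 3.0, 2.5, 14.0],
    "beta-amyrin": [10.0, 13.5, 2.0, 3.0, 11.0],
    "campesterol": [40.0, 35.0, 90.0, 85.0, 38.0],
}
matrix = correlation_matrix(contents)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

Hierarchical clustering then groups metabolites whose rows in this matrix are similar, which is why compounds from related pathways tend to appear next to each other in the heat map.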

Fig. 3

Correlation matrix and cluster analysis of results of the 18 lipophilic metabolites from nine vegetable species. Each square indicates a Pearson’s correlation coefficient for a pair of compounds. The value for the correlation coefficient is represented by the intensity of the blue or red color, as indicated on the color scale. Hierarchical clusters are presented as a cluster tree. c20, eicosanol; c21, heneicosanol; c22, docosanol; c23, tricosanol; c24, tetracosanol; c26, hexacosanol; c27, heptacosanol; c28, octacosanol; c30, triacontanol

Classification of the vegetable species by PLS-DA

To optimize the separation among Amaranthaceae, Asteraceae, Brassicaceae, and Malvaceae, we then used PLS-DA to visualize the metabolite differences. PLS-DA is a supervised projection method that separates groups of observations by rotating the PCA components to obtain the maximum separation among classes. In this study, the classes were the plant families. Good separation among the families was obtained in the score plot of the first two PLS-DA components (Fig. 4). We used an internal validation method for model validation. The quality of the model was described by its R² and Q² values. R² is the proportion of variance in the data explained by the model and indicates the goodness of fit, whereas Q² is the proportion of variance in the data predictable by the model and indicates the predictability. Usually, Q² > 0.9 indicates strong predictive ability, Q² > 0.5 indicates general predictive ability, and Q² < 0.5 indicates poor predictive ability (Eriksson et al. 2001). The predictive model had an R² of 0.614 and a Q² of 0.547; because Q² exceeded 0.5, the model was considered a good predictive model (Eriksson et al. 2001). In the first component of the PLS-DA, the loading plot indicated that c30 levels were higher in amaranth and chard than in other species. The loading plot also indicated that high levels of campesterol were present in the Brassicaceae samples, and high levels of β-amyrin and stigmasterol were present in the Asteraceae samples. The contribution of each variable to the projection can be explained using variable importance in the projection (VIP) values (Fig. 5). Variables with VIP > 1 have the greatest influence on the model. Campesterol was found to be the most significant component in the PLS-DA model for the vegetable family.
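The two quality measures share one formula and differ only in which predictions are used: fitted values give R², and cross-validated predictions give Q². The sketch below assumes this common sum-of-squares form and applies the thresholds quoted above. It does not reproduce SIMCA-P's cross-validation procedure, and the class-membership values are hypothetical.

```python
def explained_fraction(observed, predicted):
    """1 - (residual sum of squares / total sum of squares about the mean)."""
    mean = sum(observed) / len(observed)
    total_ss = sum((o - mean) ** 2 for o in observed)
    residual_ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - residual_ss / total_ss


def interpret_q2(q2):
    """Label predictive ability using the thresholds quoted in the text."""
    if q2 > 0.9:
        return "strong"
    if q2 > 0.5:
        return "general"
    return "poor"


# Hypothetical dummy response for one class (1 = member, 0 = non-member).
y = [1, 1, 1, 0, 0, 0]
fitted = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]         # predictions from the full model
cross_validated = [0.8, 0.6, 0.5, 0.3, 0.2, 0.5]  # predictions for held-out samples

r2 = explained_fraction(y, fitted)
q2 = explained_fraction(y, cross_validated)
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f} ({interpret_q2(q2)})")

# The value reported for the family model in this study:
print(interpret_q2(0.547))  # general
```

Q² is normally lower than R² because held-out samples are harder to predict than samples the model was fitted to. A large gap between the two would suggest overfitting.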

Fig. 4

PLS-DA score plots (A) and loading plots (B) derived from lipophilic metabolites of nine vegetable species. The first two PLS components accounted for >50.7 % of the total variance. The ellipse represents the Hotelling’s T² with 95 % confidence in the score plot. The loading plot defines the orientation of the PLS planes with the original variables (metabolites and family variables). It reveals the magnitude and direction of correlations of the original variables with the first two PLS components. Family variables: $M8.DA(1), Amaranthaceae; $M8.DA(2), Brassicaceae; $M8.DA(3), Malvaceae; $M8.DA(4), Asteraceae. c20, eicosanol; c21, heneicosanol; c22, docosanol; c23, tricosanol; c24, tetracosanol; c26, hexacosanol; c27, heptacosanol; c28, octacosanol; c30, triacontanol

Fig. 5

The influence of variables used to create a family predictor for a vegetable sample. VIP indicates the relative influence of each metabolite to the grouping. Metabolites with higher VIP values are more influential. c20, eicosanol; c21, heneicosanol; c22, docosanol; c23, tricosanol; c24, tetracosanol; c26, hexacosanol; c27, heptacosanol; c28, octacosanol; c30, triacontanol

Lipophilic metabolite contents in leafy vegetables

To date, few studies have reported comprehensive measurements of lipophilic compounds in vegetables. The contents of policosanols, sterols, and terpenoids in leafy vegetables commonly consumed in Korea are shown in Table 3. In general, c26 and c28 were the most abundant policosanols in the vegetable samples. The c28 level was higher in chard (0.68 mg/g) than in the other vegetables. Octacosanol (c28) is one of the most abundant alcohols in policosanols and, owing to its cytoprotective effects, is consumed as an alternative to aspirin by patients suffering from gastric irritation (Taylor et al. 2003). One study of tail-suspended rats showed that c28 could counteract some effects of simulated weightlessness. Chard might thus benefit astronauts during space travel (Bai 1997). In this study, the three vegetables belonging to the Asteraceae family contained higher amounts of α-amyrin and β-amyrin than the other vegetables. The relative α-amyrin and β-amyrin contents could potentially serve as markers to distinguish Asteraceae from other families.

Table 3 Composition and content (µg/g) of lipophilic compounds in nine vegetables

In summary, this study demonstrated the diversity of lipophilic compounds in nine leafy vegetables and identified correlations among their contents. PCA and PLS-DA were performed to identify differences among the plant species and families. PCA facilitated visualization of complex data and identification of significant differences between species. PLS-DA enabled discrimination of vegetable samples from different families. Metabolites such as campesterol, β-sitosterol, β-amyrin, stigmasterol, cholesterol, c28, α-amyrin, and c26 are useful for identifying specific plant families. The HCA results indicated strong correlations among metabolites in closely related pathways.

One means of enhancing the intake of phytochemicals would be to increase their content in fresh vegetables through crop production practices such as the selection of species and cultivars. These vegetables could serve as fresh market products or as raw materials for functional foods and supplements. Reliable identification of discriminating biomarkers in vegetables is essential for both consumers and producers. This study indicates that chard contains relatively high levels of c30. Furthermore, the Asteraceae species, including chicon, lettuce, and oak leaf, contain relatively high levels of α-amyrin, β-amyrin, and stigmasterol. Therefore, these metabolites could serve as chemotaxonomic markers to discriminate the Asteraceae species from other leafy vegetables. The potential of metabolite profiling combined with chemometrics as a powerful tool for assessing food quality was demonstrated here.