1 Introduction

Maize (Zea mays subsp. mays), an important organism for genetic and molecular studies, is one of the most important crops in the world (Serna-Saldivar et al. 1994; Wang et al. 2012). Cultivated maize was domesticated from the grass teosinte in Central America around 10,000 years ago (Jiao et al. 2012). Nowadays, the crop displays remarkable morphological and genetic diversity, which has been considered to be the product of multiple independent domestications (Matsuoka et al. 2002; Liu et al. 2003).

During the past decades, various modern genetic and molecular methodologies have been employed to understand the genetic and molecular mechanisms underlying maize domestication (Lai et al. 2010; Li et al. 2012a, b). For example, Jiao et al. (2012) investigated 278 temperate maize inbred lines from different stages of breeding history using genome-wide association, and indicated that artificial selection significantly reduces nucleotide diversity and increases the proportion of rare alleles, while modern breeding brings rapid and dynamic genetic changes into the maize genome. Fu et al. (2010) assessed the genes and gene networks in young seedlings of 98 maize hybrids in multi-environment field trials by transcription profiling with microarrays, and revealed that genes involved in sucrose degradation, glycolysis, cell expansion and endocycle are associated with grain yield. Using a 2-D electrophoresis proteomic method, Fu et al. (2011) unraveled the genetic basis of heterosis based on differentially accumulated proteins in embryo samples of five hybrids compared with their parental lines, namely, Nongda 108 (Xu 178 × Huang C), Yuyu 22 (Zong 3 × 87-1), Zhengdan 958 (Zheng 58 × Chang 7-2), Xundan 20 (Xun 9058 × Xun 928), and Xundan 18 (Xun 248 × Xun 926). Using gas chromatography–mass spectrometry (GC/MS), Riedelsheimer et al. (2012a) detected 118 leaf metabolites in a set of 289 diverse maize inbred lines with 56,110 SNPs, suggesting that metabolites represent promising connecting links for narrowing the genotype–phenotype gap of complex agronomic traits in maize. Combining genomic data (SNPs) with leaf metabolomic data (130 metabolites), Riedelsheimer et al. (2012b) showed that metabolic profiles of diverse maize inbred lines allow prediction of their testcross performance in multi-location field trials, which provides a reliable screening of large collections of diverse inbred lines for their potential to create superior hybrids. Likewise, Lisec et al. (2011) conducted a comparative analysis of the root metabolome of 6 parental maize inbred lines and their 14 corresponding hybrids showing fresh weight heterosis, and concluded that maize hybrids display lower metabolite variability and complex metabolite inheritance patterns. In addition, using GC coupled to time-of-flight MS (GC–TOF–MS), Skogerson et al. (2010) identified 119 metabolites in kernels of a range of hybrid maize lines grown at three locations, and observed that metabolic variation of the small molecule metabolite pool is highly associated with genotypic variation and that levels of certain metabolite classes have an inverse genotypic relationship to each other.

Maize kernel quality is of huge agronomic importance and is ultimately determined by chemical composition. Metabolites in the maize kernel function as energy components for the kernel, as nutrients for humans and livestock, and as a resource for bioenergy production. Despite this significance, information about the metabolome of the maize kernel, particularly in inbred lines, is limited. Metabolomic approaches have been increasingly applied in crop breeding (Fernie and Schauer 2009; Kusano et al. 2011) to assist the selection of elite germplasm. An understanding of the kernel metabolome in maize, particularly the metabolic network, is therefore helpful for the proper manipulation of maize kernel metabolic pathways to increase productivity and nutritional quality.

In this study, using GC/MS and ultra performance liquid chromatography-MS/MS (UPLC/MS/MS), we conducted metabolic profiling of mature maize kernels of 13 representative inbred lines and one hybrid line. The results revealed remarkable metabolic variation and notable variability of the metabolite–metabolite correlation network in mature maize kernels. Additionally, based on the combined data from transcriptomic, proteomic and metabolomic analyses, we constructed, for the first time, a metabolic map of the mature maize kernel, and provided a comprehensive and unique metabolic network that could serve as a useful tool for future maize breeding aimed at improving productivity and nutritional quality.

2 Materials and methods

2.1 Plant materials

Fourteen representative maize lines including 13 inbred lines and 1 popular hybrid line in China (Supplementary Table S1), kindly provided by Prof. Jinsheng Lai from China Agricultural University, were selected for their agronomic importance and genetic relationships. All maize lines were planted at a farm in Minhang, Shanghai, China. The experimental design was a randomized complete block design including three repeats for each inbred line. All primary ears were self-pollinated. All primary ears from each plot were harvested at physiological maturity, and air dried under sunlight for 72 h, frozen with liquid nitrogen and kept at −80 °C until analysis.

2.2 Metabolite profiling

Metabolic profiling of maize kernels was performed using a global unbiased platform, a combination of three independent analytical platforms: UPLC/MS/MS optimized for basic species, UPLC/MS/MS optimized for acidic species, and GC/MS (Oliver et al. 2011; Xu et al. 2012). The detailed descriptions of these platforms, including instrumentation, data acquisition and processing, and compound identification and quantitation, were published previously (Evans et al. 2009; Ohta et al. 2009). For sample extraction, kernels of each line were ground in liquid nitrogen into fine powder using a SPEX 6870 Freezer/Mill (SPEX SamplePrep, New Jersey, USA), and 40 mg of lyophilized powder per sample was extracted at room temperature using 400 μL of methanol containing recovery standards, and then analyzed with the three platforms. For the two LC platforms, chromatographic separation followed by full scan mass spectra was carried out to record retention time, molecular weight (m/z) and MS/MS2 of all detectable ions present in the samples. For the GC platform, the samples were derivatized using bistrimethyl-silyl-trifluoroacetamide (BSTFA) prior to injection. The retention time and molecular weight (m/z) for all detectable ions were measured. The metabolites were identified by automated comparison to Metabolon’s reference library entries. For each platform, the reference library was created using approximately 1,500 authentic standards that were analyzed in multiple concentrations and under the same conditions as the experimental samples. Each library includes retention time, molecular weight (m/z), preferred adducts, and in-source fragments as well as their associated MS/MS2 spectra. The combination of chromatographic retention index and mass spectral signatures gave an indication of a match to the specific metabolite.

2.3 Metabolomic data analysis

Data normalization was performed as described previously (Lawton et al. 2008). The variation resulting from instrument inter-day tuning differences was block corrected for studies spanning multiple days. For each compound, raw area counts were divided by its median value, setting the medians equal for each day’s run. Missing values (if any) were assumed to be below the limits of detection and were imputed with the observed minimum after the normalization step. MeV (MultiExperiment Viewer) 4.8 was used to draw a heat map and perform one-way ANOVA with standard Bonferroni correction. Principal component analysis (PCA) was done with SIMCA-P 12.0 software, and significantly changed metabolites were further determined in a PLS-DA model, followed by an independent t test (SPSS 17.0 software) as described previously (Chang et al. 2012). Metabolic pathways and the graphical presentation of metabolite–metabolite correlations were composed with Cytoscape version 2.8.3. Before correlation analysis, metabolites whose contents had the same value in more than seven lines were filtered out, because the significance of their correlations would be highly suspect. The metabolite–metabolite correlation was calculated using Pearson’s product-moment correlation (Pearson’s r), as provided by the R statistical software. The corresponding p-values were also calculated using the cor.test function. p-Values were then adjusted by the multiple testing procedure described by Benjamini and Yekutieli (2001), which controls the false discovery rate (FDR).
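The median scaling and minimum imputation described in this section can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the normalization follows Lawton et al. 2008 and was not published as code here); the function names, the use of None for undetected values, and the toy numbers are assumptions made for illustration only.

```python
from statistics import median


def median_scale_by_day(raw, run_day):
    """Divide one compound's raw area counts by the median of its run day.

    raw     -- raw area counts, one per sample (None = not detected)
    run_day -- run-day label for each sample, same length as raw
    After scaling, the median of the detected values is 1.0 on every day.
    """
    scaled = [None] * len(raw)
    for day in set(run_day):
        # Indices of samples measured on this day with a detected value.
        idx = [i for i, d in enumerate(run_day) if d == day and raw[i] is not None]
        if not idx:
            continue
        day_median = median(raw[i] for i in idx)
        for i in idx:
            scaled[i] = raw[i] / day_median
    return scaled


def impute_minimum(scaled):
    """Replace missing values with the smallest observed scaled value."""
    observed = [v for v in scaled if v is not None]
    if not observed:
        return list(scaled)
    floor = min(observed)
    return [floor if v is None else v for v in scaled]


# Toy values (hypothetical, not data from this study).
raw_counts = [100.0, 200.0, 300.0, 10.0, None, 30.0]
days = ["d1", "d1", "d1", "d2", "d2", "d2"]
normalized = impute_minimum(median_scale_by_day(raw_counts, days))
print(normalized)
```

Imputation is applied after scaling, as stated in the text, so the imputed value is the minimum on the normalized scale rather than the minimum raw count.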

2.4 MS-based shotgun-proteome profiling

Proteins were extracted from mature maize kernel powders with extraction buffer [100 mM Tris–HCl (pH 8.5), 5 mM DTT, 1 mM EDTA, 1 % (v/v) plant proteinase inhibition cocktail (Sigma, St. Louis, MO, USA)]. After incubation for 30 min at room temperature, the tissue homogenates were centrifuged at 14,000×g for 15 min, and the protein concentration in the supernatant was then measured with a Synergy 2 Multi-Mode Microplate Reader. A 10–20 μL aliquot of supernatant containing 100 ng of total soluble protein was mixed with 2× sample loading buffer [100 mM Tris–HCl (pH 6.8), 200 mM dithiothreitol (DTT), 4 % SDS, 20 % glycerol, and 0.2 % bromophenol blue], and then boiled for 5 min prior to running on a 12 % 1D sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS–PAGE) gel at 80/120 V. After the protein front had run approximately 5 cm into the resolving gel, the gel containing total proteins was cut for later MS identification as previously reported (Reumann et al. 2009). The reference maize proteome database was downloaded from maize sequence (http://www.maizesequence.org/index.html).

3 Results

3.1 Metabolic profiling of mature maize kernel

To comprehensively understand the metabolome of the mature maize kernel, we performed metabolic profiling of kernels from a set of 14 maize lines (Supplementary Table S1) using a well-established global metabolic profiling approach that combines GC/MS with UPLC/MS/MS (Oliver et al. 2011; Xu et al. 2012). A total of 210 metabolites were identified in the 14 tested maize lines (Supplementary Table S2), which represents the broadest maize kernel metabolome reported so far compared with previous studies (Harrigan et al. 2007; Skogerson et al. 2010; Frank et al. 2012; Yang et al. 2013).

The 210 identified metabolites included 199 primary metabolites, 9 secondary metabolites and 2 phytohormones (Supplementary Table S3), which could be mapped to 8 super pathways and a further 42 sub-pathways, according to the databases of the Plant Metabolic Network (PMN) and the Kyoto Encyclopedia of Genes and Genomes (KEGG). These 210 metabolites covered most of the central metabolism pathways and some secondary pathways, corroborating the power of this non-targeted metabolomic profiling platform in uncovering the crop seed metabolome. The 199 primary metabolites reflected the most common and abundant metabolites in the maize kernel, including 64 amino acids, 55 carbohydrates, 37 lipids, 21 nucleotides, 18 CPGECs (cofactors, prosthetic groups, electron carriers) and 4 peptides.

3.2 Metabolic variation of mature maize kernel

To obtain an overview of the maize kernel metabolome among the tested lines, K-Medians clustering analysis was performed, resulting in a plot of all 210 biochemicals versus the 14 maize lines that divided the metabolites into 10 classes (Fig. 1a; Supplementary Table S2). Except for 7 metabolites in class 10 that were of low abundance in all the lines, the remaining 203 identified metabolites displayed remarkable diversity in their abundances across the 14 tested lines (Fig. 1a). Notably, the enrichment of certain metabolites seemed to be line-specific. For instance, 34 of 43 metabolites in class 1 were abundant only in 153-77, 16 of 22 metabolites in class 3 were abundant only in Dan340, while 14 of 17 metabolites in class 6 were enriched only in Ye478. Conversely, most of the 210 metabolites appeared to be very low in the inbred line 84-106-2-2-3. In addition, as compared with all inbred lines including its parent and grandparent lines, the abundances of most of these 210 metabolites in the hybrid Zhengdan958 were also very low, lacking the obvious metabolic variation observed in the inbred lines. Furthermore, some metabolites were specifically enriched in one or more lines. For example, the level of dihydrokaempferol (Dih) was the highest in Dan340, while that of costunolide (Cos) was higher in 153-77 and Dan340 compared to the other 12 lines. The trehalose level was elevated in 153-77, BC109-2, BT1, H109, Huangzao4 and Chang7-2, but was low in other lines. Moreover, the content of sorbitol was the highest in 153-77 but the lowest in CML103, while raffinose content was higher over other lines.

Fig. 1

Clustering analysis and PCA of the metabolomic data from all maize lines. a Heat map representation of 210 metabolites across 14 maize lines, showing 10 classes by clustering analysis. Each line in the heat map represents a metabolite. The deeper the red color, the higher its content in the 14 maize lines; similarly, the deeper the green color, the lower its content in the 14 maize lines. Three metabolites mentioned in the text are shown in greater detail. The error bars represent the SDs (n = 3). b PCA scores plot generated from all 210 metabolites across all 14 maize lines (Color figure online)

To better evaluate the natural metabolic variation and the variability of each metabolite, one-way ANOVA was performed, which revealed that the levels of 75 metabolites were significantly different (p ≤ 0.05) among the 14 tested maize lines (Supplementary Table S3), including 21 amino acids, 18 lipids, 12 nucleotides, and 10 carbohydrates. The top five highly variable biochemicals were Dih, nicotinate ribonucleoside (NR), phosphoethanolamine (PE), stigmasterol, and trans-4-hydroxyproline (4-HP). Conversely, the remaining 135 relatively non-variable metabolites included all 8 glycolysis carbohydrates plus sucrose (Suc) and fructose (Fru).

To provide a snapshot of the information hidden in the metabolomic data, PCA was performed on all 14 maize lines. Although PCA could not distinguish lines with different maturities, grain texture or origin, it could, to some extent, separate lines with different genetic relationships (Fig. 1b). For example, the SiPingTou (SPT) heterotic group (lines Huangzao4 and Chang7-2) could be clearly separated from other heterotic groups such as Tuxpeno (line CML103), Suwan (line BT1) and Reid (line Ye478), and the elite hybrid line Zhengdan958 could be clearly separated from its grandparent line Huangzao4, its parent line Chang7-2, and to a lesser extent, its maternal grandparent line Ye478. Thus, the metabolic data of these maize lines reflect, at least to some extent, their genetic backgrounds.

The supervised statistical method PLS-DA was then employed to identify metabolites associated with the identified metabolic separations (Supplementary Tables S4–S9, S11–S13; Supplementary Fig. S1). Taking the four elite inbred lines (Huangzao4, Dan340, Chang7-2 and Ye478) as an example, there were 40, 40, and 17 metabolites that could separate Dan340 from Huangzao4, Chang7-2, and Ye478, respectively (Supplementary Tables S4–S6), and the numbers of metabolites separating Huangzao4 from Ye478 and Chang7-2 were 26 and 24, respectively (Supplementary Tables S7–S8), while 20 metabolites could separate Ye478 from Chang7-2 (Supplementary Table S9). Collectively, 12 metabolites including four amino acids, five lipids, two secondary metabolites and one CPGEC could separate one line from the other three lines (Supplementary Table S10). In the case of Zhengdan958 and its parent (Chang7-2) and grandparent lines (Huangzao4 and Ye478) (Fig. 2a), 13, 33, and 30 metabolites could separate Zhengdan958 from its maternal grandparent line Ye478 (Supplementary Table S11), its paternal grandparent line Huangzao4 (Supplementary Table S12), and its paternal parent line Chang7-2 (Supplementary Table S13), respectively. Notably, eight metabolites separated both Zhengdan958 from Chang7-2 and Chang7-2 from Huangzao4; for each of these eight metabolites, however, the Zhengdan958/Chang7-2 and Chang7-2/Huangzao4 ratios changed in opposite directions (Fig. 2b), strongly suggesting a dominant inheritance pattern for these eight metabolites. Likewise, eight metabolites were responsible for separating each of these four lines from the others (Fig. 2c). Among them, the level of gamma-glutamylcysteine in Zhengdan958 was significantly higher than those in the other lines.

Fig. 2

Metabolic changes among Zhengdan958 and its parent and grandparent lines. a Pedigree of Zhengdan958. The name of one parent of Zheng58 was not recorded and was termed ‘Inbred X’. Lines used in this study are in red. b The ratios of the eight metabolites responsible for the separation of both Chang7-2 & Huangzao4 and Zhengdan958 & Chang7-2 showed reverse changes. c List of the eight metabolites that were responsible for distinguishing between Zhengdan958, Chang7-2, Huangzao4 and Ye478. The error bars represent the SDs (n = 3) (Color figure online)

3.3 Metabolite-metabolite correlation analysis

To reveal the regulatory metabolic network in maize kernels, network-based analysis (Toubiana et al. 2012) was used to analyze the correlations among identified metabolites. We calculated the values of Pearson pair-wise correlation across these lines for the set of 210 identified metabolites (Supplementary Table S14). The results were visualized as a heat-map as shown in Fig. 3, which showed in total 18,528 correlations, ranging from −0.92 for erythronate and myo-inositol hexakisphosphate (IP6) to 0.99 for glucose (Glc) and Fru.
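As a quick arithmetic check on the number of pairwise correlations, n metabolites give n(n − 1)/2 unordered pairs. The short Python sketch below (helper names are ours) shows that 18,528 pairs correspond to n = 193, whereas 210 metabolites would give 21,945 pairs. One possible reading, which is an inference from the arithmetic and not something stated in the text, is that the filtering step described in Section 2.3 removed 17 metabolites before the correlations were computed.

```python
from math import comb, isqrt


def n_pairs(n_metabolites):
    """Number of unordered metabolite pairs: n * (n - 1) / 2."""
    return comb(n_metabolites, 2)


def n_from_pairs(pairs):
    """Invert n * (n - 1) / 2 = pairs; return None if no integer n fits."""
    n = (1 + isqrt(1 + 8 * pairs)) // 2
    return n if comb(n, 2) == pairs else None


print(n_pairs(210))         # 21945
print(n_from_pairs(18528))  # 193
```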

Fig. 3

Metabolites–metabolites correlation/significance. X and Y-axes were categorized into metabolites, grouped by pathway information. Both r and p values of the correlations were displayed in distinct colors (Color figure online)

Further screening found that there were 448 significant correlations with r² ≥ 0.49 (r ≥ 0.7 or r ≤ −0.7) and FDR ≤ 0.05 (Fig. 4). Among them, 421 were positive correlations while only 27 were negative ones. Notably, amino acids dominated the significant metabolite–metabolite correlations (213 correlations). There were 42 amino acids that were highly correlated with each other (68 correlations) or with other non-amino acid metabolites (145 correlations), and no correlations between amino acids and phytohormones were found. All the 20 standard amino acids except glutamate (Glu), asparagine (Asn) and histidine (His) had in total 78 significant correlations, 97 % of which were positive. Notably, five amino acids, methionine (Met), threonine (Thr), isoleucine (Ile), leucine (Leu), and valine (Val), correlated significantly with each other. In particular, Val correlated positively with the other four amino acids (r ≥ 0.90). With the exception of negative correlations for alanine (Ala) with IP6 and proline (Pro) with glucarate, all the remaining 12 standard amino acids were highly and positively associated (66 correlations with an average r value of 0.84). Most (83 of 145) significant external correlations for amino acids were with carbohydrates including Glc and Fru but not Suc and alpha-ketoglutarate (2-KG). In addition, 37 carbohydrates had 182 significant correlations: 45 within carbohydrates and 137 with non-carbohydrate metabolites. Similarly, 29 lipids had 100 significant correlations: 40 within lipids and 60 with non-lipid metabolites. For nucleotides, most of the significant correlations (123 out of 144) were between nucleotides and other metabolites. Notably, the only correlation for the auxin indoleacetate (IAA) was with myo-inositol (r value of 0.79). 
On the other hand, 22 out of 27 significantly negative correlations were related to myo-inositol phosphates; in particular, 17 out of 27 negative correlations were related to IP6, an anti-nutritional factor present in most cereal and legume-based foods and feeds (Chen et al. 2008).
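The screening criteria used here (|r| ≥ 0.7 together with a Benjamini–Yekutieli FDR ≤ 0.05) can be sketched in a few lines of Python. The analysis in this study was done in R; the version below is a stand-in written with the standard library only, the function names are ours, and the metabolite labels and p-values in the example are hypothetical placeholders, not data from this study. The adjustment multiplies each p-value by m·c(m)/rank, with c(m) = 1 + 1/2 + … + 1/m, and then enforces monotonicity from the largest p-value downwards.

```python
def by_adjust(pvals):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    m = len(pvals)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    # Visit p-values from largest to smallest, tracking a running minimum
    # so that adjusted values never increase as the raw p-value decreases.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    for position, i in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvals[i] * m * c_m / rank)
        adjusted[i] = running_min
    return adjusted


def screen_correlations(pairs, r_min=0.7, fdr_max=0.05):
    """Keep pairs with |r| >= r_min and BY-adjusted p <= fdr_max.

    pairs -- list of (metabolite_a, metabolite_b, r, p) tuples
    """
    q_values = by_adjust([pair[3] for pair in pairs])
    return [
        (a, b, r, q)
        for (a, b, r, _p), q in zip(pairs, q_values)
        if abs(r) >= r_min and q <= fdr_max
    ]


# Hypothetical example values for illustration only.
example = [
    ("m1", "m2", 0.99, 1e-10),
    ("m3", "m4", 0.50, 1e-6),   # significant but weak: dropped by |r| cutoff
    ("m5", "m6", -0.92, 1e-5),  # strong negative correlation: kept
    ("m7", "m8", 0.80, 0.2),    # strong but not significant: dropped
]
for a, b, r, q in screen_correlations(example):
    print(a, b, r, q)
```

Applying the cutoff to |r| rather than to r keeps strong negative correlations, such as those involving IP6, in the network.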

Fig. 4

Kernel metabolite network based on significant correlations (r² ≥ 0.49; FDR ≤ 0.05). Metabolites are represented as nodes, and their relations as edges. The positive correlations are shown in blue and the negative correlations are shown in red. Different node colors indicate metabolites in different pathways (Color figure online)

3.4 Comprehensive metabolic pathways in mature maize kernel

To fully understand the regulatory aspects of the metabolic pathways that exist in the mature maize kernel, we further carried out transcriptomic and proteomic profiling of the mature kernel of the Hi-II maize line, and revealed the expression of 342 proteins (Supplementary Table S15) and 17,607 genes (Rao et al. 2013). Expression of 311 genes (Supplementary Table S15) coincided with expression of their encoded proteins in the mature maize kernel. Pathway enrichment analysis showed that 1,302 out of the 17,607 transcripts were significantly enriched in 30 pathways (Supplementary Table S16). Likewise, MAPMAN analysis categorized the identified proteins into 35 groups (Supplementary Table S17), with the first 26 groups (except stress and metal handling) containing 155 proteins that were involved in various metabolic pathways.

Using PMN and KEGG, a comprehensive metabolic map of the mature kernel of Hi-II maize was constructed from the 208 identified metabolites (two isobars out of the 210 identified metabolites could not be mapped, see Supplementary Table S2) and the transcriptomic and proteomic data. The resulting metabolic map incorporated 400 identified transcripts and 68 identified proteins participating in 118 reactions covering 87 identified metabolites, which spanned 7 super pathways (all except secondary metabolism) and 23 sub-pathways (Fig. 5; Supplementary Table S18). Importantly, 353 out of the 400 transcripts were involved in the metabolism of 34 amino acids and 26 carbohydrates. These 34 amino acids were distributed among 57 mapped reactions. We observed conversions between Glu and seven other amino acids, such as Asn and Asp, with the involvement of 2-KG. There were 43 and 47 transcripts associated with the metabolism of SAM and glutathione (reduced, GSH), respectively, indicating an important and active role of these two non-standard amino acids in maize kernel metabolism. Additionally, the 26 carbohydrates were distributed among 40 mapped reactions including glycolysis, the TCA cycle and starch metabolism, which play vital roles in kernel development. Moreover, 52 of the 400 transcripts were associated with the metabolism of 12 nucleotides and other classes of metabolites, including lipids, peptides, CPGECs, and IAA, distributed among 25 mapped reactions.

Fig. 5

The comprehensive metabolic map of the mature maize kernel with 208 metabolites. In the metabolic map, a total of 400 transcripts and 68 proteins are mapped to 118 reactions covering the 87 metabolites detected in the maize kernel, which are shown on the edges as Rx(y/z). For example, R23(5/0) from IP3 to IP4 indicates that 5 transcripts and 0 proteins are mapped to reaction 23. Of the 208 mapped metabolites, 75 are significantly different among the 14 tested lines and are labeled from pink to red according to the value of –lg p from 1.35 to 20 (20 was defined for Dih as its p value in one-way ANOVA was 0), while the remaining 133 stable metabolites are labeled in green (Color figure online)

The 68 identified proteins were associated mainly with the metabolism of seven standard amino acids and several carbohydrates. These seven standard amino acids were Glu, Met, Cys, serine (Ser), Gly, Asn and Pro, while the carbohydrates were sorbitol, Glc, Fru, Suc, glucose 1-phosphate (G1P), glucose-6-phosphate (G6P), fructose-6-phosphate (F6P), citrate, malate, succinate, 2-KG, and isocitrate. These results indicated important roles for the pathways involved in the metabolism of those carbohydrates and amino acids in the maize kernel.

4 Discussion

A maize kernel’s nutritional and physiological traits are functionally related to its metabolome (Nambara and Nonogaki 2012). Although metabolic analyses of kernel development have been done in maize (Harrigan et al. 2007; Skogerson et al. 2010; Frank et al. 2012; Yang et al. 2013), questions regarding maize kernel metabolic variability remain open due to the lack of integrative studies on different inbred lines. By subjecting each sample to three analytical platforms that combined GC–MS and UPLC-MS/MS, we compared the mature kernel metabolomes of 14 maize lines and revealed significant variation among them not only in metabolite abundances but also in metabolite–metabolite correlations. In combination with other omics data, we constructed a metabolic map. Thus, this study provides not only a much broader metabolomic profile of the mature maize kernel as compared with previous studies (Harrigan et al. 2007; Skogerson et al. 2010; Frank et al. 2012; Yang et al. 2013), but also a full picture of the metabolic pathways functioning in the mature maize kernel. Our work extends the understanding of maize kernel metabolism with regard to the processes of maturation, desiccation and perhaps germination, and could be useful for metabolic engineering and/or molecular breeding to improve maize kernel quality and yield.

Among the 210 metabolites identified in mature maize kernels, 32 (about 15 %) were basic and essential macronutrients listed in the USDA National Nutrient Database for Standard Reference (Release 25), including 17 standard amino acids, five carbohydrates, three CPGECs, and seven lipids (Supplementary Table S2). Small molecules such as vitamin E, stigmasterol, campesterol, and beta-sitosterol that are essential for normal function of the human body were also found. An additional 17 compounds, comprising 15 phospholipids and two natural antioxidants, were also detected.

Anti-nutrient compounds and excessive levels of nutrients are two important nutritional issues for any particular crop. The common anti-nutrient IP6 was highly abundant in the mature maize kernel. To reduce the IP6 content in the maize kernel, breeders either screen for mutants with lower IP6 levels or transfer recombinant fungal phytase to generate transgenic maize. Although several low phytic acid mutants (such as lpa1-1, lpa2-1 and lpa241) have been generated (Raboy et al. 2000, 2001), their poor agronomic properties prevented their agricultural use. The first transgenic maize with altered phytic acid in China was developed in 2008 by overexpressing the Aspergillus niger phyA2 gene (Chen et al. 2008). However, due to regulatory approval requirements, commercial use of this transgenic event is still pending. The negative association of IP6 with specific metabolites [such as alanine, ribose and S-adenosylhomocysteine (SAH)] uncovered in this study may provide useful information for the breeding of low-IP6 maize. Additionally, high amounts of sorbitol were also found in the maize kernel, which could cause adverse effects on health. Sorbitol in food is usually incompletely absorbed in the normal small intestine; this malabsorption is associated with abdominal complaints and diarrhea, and may act as a dietary trigger for clinical symptoms of irritable bowel syndrome (Islam and Sakaguchi 2006; Fernández-Bañares et al. 2009). As a result, it is critical to make sure that the content of such substances in maize and maize-derived foods is moderate. Although several qualitative and quantitative methods have been developed to measure food content, and although a food composition database for biodiversity is already maintained by the FAO, it remains difficult to judge food safety from food composition data alone, because currently available data are limited to certain targeted metabolites measured with particular technologies. 
Therefore, the non-targeted technology used here could be a promising method to analyze compositional components for food safety, as a great number of both primary and secondary metabolites can be qualitatively and/or semi-quantitatively determined in a high-throughput manner. It could thus provide more comprehensive and more accurate compositional and nutritional information for any given food than targeted methods, and thus a better safety assessment.

Fait et al. (2006) performed a combined analysis of the metabolic processes occurring in Arabidopsis seeds at different stages of seed development, and found that seed development and germination in Arabidopsis are associated with temporally distinct metabolic switches. Primary metabolites such as amino acids, sugar alcohols and oligosaccharides played important roles in the processes of seed vernalization, germination, maturation, and desiccation (Fait et al. 2006; Angelovici et al. 2010). Seed development depends highly on the metabolic utilization of carbohydrates. Suc is stored at the end of seed development concomitantly with the accumulation of stachyose and raffinose. Glc and Fru are present in the early developing seed, but disappear as the seed reaches maturation (Handley et al. 1983; Weber et al. 1998). Previous studies also found that a coordinated cellular C/N balance is important for seed development, and that metabolites such as Glc, Suc, 2-KG, Glu, Gln, Asp and Asn are important for the monitoring of C/N balance in plants (Zheng 2009). In this study, the amounts of over 82 % of the detected carbohydrates, including Glc, Suc, and 2-KG, were not significantly variable among the 14 tested maize lines, which implied that the C status in carbohydrates was relatively stable in the mature maize kernel. In contrast, levels of the four C/N monitoring amino acids (Glu, Gln, Asp, and Asn) were significantly variable among the tested lines, indicating a high variability of N status in the mature maize kernel. This variability is likely genetically dependent, because a previous study indicated that there was a genetic and organ-specific control of the main steps of N/C metabolism in reproductive sink organs during the grain-filling period (Cañas et al. 2011). Notably, the highest variability was observed in lipids, which corroborated a previous study (Voelker and Kinney 2001). 
Since lipids and amino acids are two important forms of carbon storage in many angiosperm seeds, our results suggested, at the metabolomic level, that the C status in lipids and amino acids may play an important role in balancing C/N in the mature maize kernel. The high variability of raffinose detected among the 14 tested maize lines indicated that different maize lines vary in their energy storage and response to stresses. Raffinose, together with stachyose, is reported to be a non-essential source of energy during soybean seed development (Dierking and Bilyeu 2009) and to be responsive to stresses (Chen and Burris 1990).

Natural variation in the composition and levels of metabolites in plants is often observed (Keurentjes et al. 2006), although the underlying genetic mechanism and its physiological significance remain unclear. In this study, among the significantly variable metabolites, lipids were more variable than amino acids, organic acids, sugars and sugar alcohols. A recent study on the genome-wide association patterns of 508 diverse inbred maize lines revealed abundant variation in oil-related genes (Li et al. 2012a). Furthermore, expression of an oil level associated gene, GRMZM2G439195 (responsible for the conversion of SAM to 5-methylthioadenosine, MTA), was indeed detected in the transcriptome of the mature maize kernel. MTA correlated positively with linolenate, and both metabolites were significantly different among the 14 maize lines, suggesting that this genomic association contributes directly to metabolic variation. The natural variability of the maize metabolome in terms of amino acids, lipids, and secondary metabolites, such as Dih and Cos, reflects the natural variation of the nutritional and physiological properties of mature maize kernels. Dih is an antioxidant that could reduce lipid peroxidation, and Cos has been detected in many medicinal plants that possess anti-carcinogenic, anti-viral, anti-fungal, and immunosuppressive activities (Redzynia et al. 2009; Liu et al. 2011). Trehalose is involved in plant stress tolerance and the production of seed oil (Ali et al. 2012), and among the top five highly variable biochemicals, four metabolites (NR, PE, stigmasterol, and 4-HP) are precursors of vitamins or nutritious substances that are useful to humans as food and feed and helpful for kernel development in maize itself (Jacobson et al. 1979; Kametani and Furuyama 1987; Schmid and Ohlrogge 2002).

When focusing on four inbred lines (Dan340, Chang7-2, Huangzao4 and Ye478) that have distinct genetic origins, 12 metabolites were found to distinguish any one line from the other three (Supplementary Table S10). This implies significant functional alterations in the metabolic pathways of these 12 compounds among the four lines. Take G1P as an example. G1P was the most variable metabolite identified between Chang7-2 and Ye478, with a Ye478 to Chang7-2 ratio of 9.61. Similar to AT3G02100 in Arabidopsis, the GRMZM2G030821 gene in maize encodes a UDP-glycosyltransferase superfamily protein that converts G1P into glucuronate, as displayed in the KEGG database. As the glucuronate level was much lower in Ye478, where expression of GRMZM2G030821 was absent (Lai et al. 2010), the high level of G1P accumulated in Ye478 may result from the lack of GRMZM2G030821 expression. These results indicate a clear alteration of this metabolic pathway between Ye478 and Chang7-2. Further investigation into the genes and/or enzymes involved in the metabolism of these 12 metabolites could test these hypotheses, which may be useful for identifying metabolomic biomarkers for breeding.
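The between-line comparison described above can be sketched in a few lines of Python. This is only an illustration, assuming that each metabolite has one relative abundance per line and that "most variable" means the largest absolute log2 ratio; the metabolite names, the abundance values, and the helper `most_variable` are hypothetical and are not data or methods from this study.

```python
import math

# Hypothetical relative abundances (arbitrary units); NOT data from this study.
abundance = {
    "metabolite_A": {"Ye478": 8.0, "Chang7-2": 2.0},
    "metabolite_B": {"Ye478": 3.0, "Chang7-2": 6.0},
    "metabolite_C": {"Ye478": 5.0, "Chang7-2": 5.0},
}


def most_variable(abundance, numerator, denominator):
    """Return (metabolite, ratio) with the largest absolute log2 ratio.

    The ratio is numerator-line abundance divided by denominator-line
    abundance; using |log2 ratio| treats a 4-fold increase and a 4-fold
    decrease as equally large changes.
    """
    best_name, best_ratio = None, None
    for name, levels in abundance.items():
        ratio = levels[numerator] / levels[denominator]
        if best_ratio is None or abs(math.log2(ratio)) > abs(math.log2(best_ratio)):
            best_name, best_ratio = name, ratio
    return best_name, best_ratio


name, ratio = most_variable(abundance, "Ye478", "Chang7-2")
print(name, ratio)
```

Ranking on the log scale is one reasonable choice among several; a real analysis would also need replicate measurements and a significance test before a metabolite is called variable.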

The highly positive associations of amino acids with each other and with non-amino acid metabolites detected in maize kernels in this study were in line with a previous observation in tomato seeds (Toubiana et al. 2012), which may suggest a conserved role of amino acids in seed metabolism. However, the generality of this conclusion merits further investigation in the seeds of other plants. Additional highly positively associated metabolites, such as the three maltooligosaccharides (maltohexaose, maltopentaose, and maltotetraose), represent metabolites that can be interconverted by enzymatic reactions (Takaha and Smith 1999; van der Maarel et al. 2002). We identified only one correlation for IAA, with myo-inositol, in agreement with the literature. As early as 1969, it was discovered that about one-half of IAA in maize kernels is present as high molecular weight esters and the remaining one-half as esters of myo-inositol (Ueda and Bandurski 1969). Later, myo-inositol esters of indole-3-acetic acid were indicated as seed auxin precursors in maize kernels (Nowacki and Bandurski 1980). Hence, we propose that myo-inositol and IAA are tightly associated in mature maize kernels, which could play a vital role in seed development because free IAA is in equilibrium with its conjugate. Thus, correlation analysis proved helpful for dissecting putative key regulatory elements or pathways in the regulation of metabolism.

Correlation analysis of metabolomics data might also help in discovering novel pathways (Weckwerth and Fiehn 2002). A previous study with Zymomonas mobilis showed that Glc and Fru could be converted to sorbitol and sodium gluconate by two enzymes, gluconolactonase and glucose–fructose oxidoreductase (Zachariou and Scopes 1986). There was no information on these two enzymes in Zea mays in the KEGG database. According to MaizeGDB (http://www.maizegdb.org/), gluconolactonase catalyzes the conversion of D-glucono-1,5-lactone to D-gluconic acid (gluconate) in the absence of glucose–fructose oxidoreductase. Recently, transcriptome analysis by digital gene expression (DGE) revealed that GRMZM2G018082, encoding glucose–fructose oxidoreductase, is expressed in mature maize kernels. Meanwhile, the Arabidopsis ortholog of GRMZM2G018082, AT4G09670, has been reported to be expressed at a normal level in Arabidopsis seeds (expression value around 400–1,000, similar to that in other tissues) (Schmid et al. 2005). Together with the highly positive associations (r ≥ 0.8) between Glc and Fru and between gluconate and sorbitol, we hypothesize that there may be a pathway in the maize kernel in which Glc and Fru are converted to sorbitol and sodium gluconate by gluconolactonase and glucose–fructose oxidoreductase, similar to that in Zymomonas mobilis. However, this pathway was not demonstrated in this study and merits further investigation. Altogether, the networks reconstructed in this study from metabolomics data provide a formal framework for investigating maize kernel metabolism. Supplemented by biochemical, genetic, and molecular data, they would help to identify regulatory mechanisms to aid metabolic engineering.
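As an illustration of how metabolite pairs with highly positive associations (r ≥ 0.8) can be screened, the following Python sketch computes pairwise Pearson correlations across lines and keeps the pairs at or above the threshold. It assumes Pearson correlation on one abundance value per line; the metabolite names, the values, and the helpers `pearson` and `strong_pairs` are hypothetical and do not reproduce the data or the exact procedure of this study.

```python
import math
from itertools import combinations

# Hypothetical abundances across five lines; NOT data from this study.
profiles = {
    "metabolite_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "metabolite_B": [1.1, 2.1, 2.9, 4.2, 5.0],
    "metabolite_C": [5.0, 3.0, 4.0, 1.0, 2.0],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def strong_pairs(profiles, threshold=0.8):
    """Return (name_1, name_2, r) for every pair with r >= threshold."""
    pairs = []
    for (name_1, v1), (name_2, v2) in combinations(profiles.items(), 2):
        r = pearson(v1, v2)
        if r >= threshold:
            pairs.append((name_1, name_2, r))
    return pairs


for name_1, name_2, r in strong_pairs(profiles):
    print(f"{name_1} ~ {name_2}: r = {r:.3f}")
```

A high correlation of this kind only nominates a pair for follow-up; as noted above, enzymatic or genetic evidence is still required to establish a pathway.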

The comprehensive metabolic map constructed in this study might be useful in maize breeding. Wang et al. (2012) built a transcriptional roadmap for maize kernel development and concluded that it not only provides information to better understand the transcriptional network for kernel development, but might also provide strategies to modify plant chemistry to improve kernel quality and vigor by genetic engineering. Moussaieff et al. (2013) recently provided a high-resolution metabolic map of Arabidopsis roots with 52 metabolites and the relative mRNA expression of regulators or enzymes in the corresponding pathways, offering a novel framework for metabolomics analyses. It is noteworthy that the metabolic map constructed in our study includes 208 metabolites, 400 transcripts and 68 proteins and, for the first time, provides relatively comprehensive profiling across x-omes (metabolome, transcriptome, and proteome). Still, there remains an opportunity to improve our metabolic map with additional x-omic data.

5 Concluding remarks

In summary, our metabolic profiling data revealed remarkable metabolic variation among the selected maize lines in both metabolite abundances and metabolite–metabolite associations. In addition, a comprehensive metabolic map was constructed by integrating the transcriptomic, proteomic and metabolomic data. Our effort provides a unique tool for maize breeding towards improved kernel quality and yield.