Introduction

The last decade has seen rapid advances in functional genomic research globally. Most of the efforts involve construction of technological and resource platforms for high-throughput DNA sequencing, gene identification, and physical and genetic mapping; functional analysis of genomes for agronomic traits and biological processes; and identification and isolation of functional genes. The overall goal of functional genomic research is to understand how the genome functions at the whole-genome level. Similarly, proteomics aims at the systematic analysis of the protein population in a tissue, cell, or subcellular compartment. It enables correlations to be drawn between the range of proteins produced by a cell or tissue and the initiation or progression of stress or normal metabolism. Metabolites are the end products of cellular processes, and they show the response of biological systems to environmental changes. The current trend in metabolomic studies is to define the cellular status at a particular time point of development or physiological status. Metabolomic techniques complement other techniques such as transcriptomics and proteomics and give a precise picture of the whole cellular process. The growing number of sequenced plant genomes has opened up immense opportunities to study biological processes related to physiology, growth and development, and tolerance to biotic and abiotic stresses at the cellular and whole plant level using a novel systems-level approach. The “omics” approach integrates genome, proteome, transcriptome, and metabolome data into a single data set and can lead to the identification of unknown genes and their regulatory networks involved in metabolic pathways of interest. This will also help in understanding the genotype–phenotype relationship.

Arabidopsis thaliana

Genomics

Arabidopsis thaliana is an excellent model organism for the analysis of complex biological processes in plants using molecular and biotechnological techniques. The frontiers of plant science, like other branches of the life sciences, have been dominated by genomics over the past 25 years. Many research laboratories are currently making intensive efforts to isolate Arabidopsis genes of biological importance using a map-based cloning strategy. Although genetic linkage (Koornneef et al. 1983) and recombinant inbred line (RIL) maps (Lister and Dean 1993) have been reported, accurate physical maps of the chromosomes are highly advantageous not only for sequencing the entire genome but also for map-based cloning of the genetic loci of interest (Ward and Jen 1990). Hence, yeast artificial chromosome (YAC)-based physical maps of chromosomes 2 (Zachgo et al. 1996) and 4 (Schmidt et al. 1995) of A. thaliana have been constructed. Based on the sizes of the YACs and their coverage of the chromosome, the length of chromosome 2 was estimated to be at least 18 Mb. Sato et al. (1998) presented a physical map of the entire chromosome 3, which was constructed by ordering the clones from YAC, P1, TAC, and BAC libraries using the information from the sequences of various DNA markers and the terminal sequences of the clones. The sizes of the centromeric regions of Arabidopsis thaliana chromosomes 1, 2, and 3 were determined by construction of their physical maps on the basis of restriction analysis (Hosouchi et al. 2002). The sizes of the genetically defined centromeric regions were deduced to be 9 megabases (Mb), 4.2 Mb, and 4.1 Mb, respectively (chromosome 1, from markers T22C23-t7 to T3P8-sp6; chromosome 2, from F5J15-sp6 to T15D9; chromosome 3, from T9G9-sp6 to T15M14) (Copenhaver et al. 1998). 
Mitochondrial genomes in higher plants are characterized by their high flexibility and variation in size and structure. The mitochondrial genome of A. thaliana was physically mapped using cosmid and YAC clones and was found to be 372 kb in size, which is relatively large (Klein et al. 1994). The presence of this comparatively large mitochondrial genome in a plant with one of the smallest nuclear genomes showed that different size constraints act upon the different genomes in plant cells. A. thaliana is known to contain approximately 1,000 copies of 5S rDNA per haploid genome, and they occur in tandem arrays (Campell et al. 1992). The 5S ribosomal RNA genes were mapped to mitotic chromosomes of Arabidopsis thaliana by fluorescence in situ hybridization (FISH) by Murata et al. (1997).

Arabidopsis thaliana is widely used as a model for the study of many aspects of plant biology. Because of its small genome size (125 Mb), it was chosen as the subject of the first plant genome sequencing project, an effort that was completed in 2000. Arabidopsis thaliana was the first plant, and the third multicellular organism after Caenorhabditis elegans (The C. elegans Sequencing Consortium 1998) and Drosophila melanogaster (Adams et al. 2000), to be completely sequenced (The Arabidopsis Genome Initiative 2000). Since systematic sequencing was completed in late 2000, the genome sequence has undergone several rounds of reassembly, hole patching, and extension into unsequenced regions. One of the major features of the Arabidopsis genome revealed by the genome sequence was the extent of gene duplication and segmental duplications, which was surprising given the expectation of a functionally compact genome. Approximately 60 % of the genome was thought to be derived from a single duplication event, possibly of the entire genome (The Arabidopsis Genome Initiative 2000). The extensive work carried out based on the Arabidopsis genome sequence also supports interpretations of the evolution of the vertebrate lineage that propose a central role for genome duplications (Wolfe 2001). Comparison of Arabidopsis sequences with genomic sequence from the closely related Brassica oleracea (cabbage) identified regions of high similarity that either identified putative new genes or extended existing gene models. About 30 % of these new genes encoded a transcript. About 25 % of the originally predicted genes had no supporting evidence such as an EST match or reasonable similarity of their putative peptide sequence to any other protein. The decreasing cost along with rapid progress in next-generation sequencing and related bioinformatics computing resources has facilitated large-scale discovery of single-nucleotide polymorphisms (SNPs) in Arabidopsis species. 
Large numbers and genome-wide availability of SNPs make them the marker of choice in partially or completely sequenced genomes. The complete nucleotide sequence of the chloroplast genome of Arabidopsis thaliana has been determined (Sato et al. 1999). The genome is a circular DNA molecule of 154,478 bp containing a pair of inverted repeats of 26,264 bp, which are separated by small and large single copy regions of 17,780 bp and 84,170 bp, respectively. Cao et al. (2011) presented the first phase of the 1001 Genomes project, based on population-scale sequencing of 80 strains of A. thaliana drawn from eight regions throughout the species’ native range. They found common small-scale polymorphisms as well as many larger insertions and deletions in the A. thaliana pan-genome.
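The chloroplast genome figures quoted above can be cross-checked with simple arithmetic: a circular genome with two inverted repeats, one small single copy region, and one large single copy region should have a total length equal to twice the repeat length plus the two single copy regions. The short Python sketch below performs this check; the function and variable names are illustrative only.

```python
# Region lengths in base pairs, as quoted in the text (Sato et al. 1999).
INVERTED_REPEAT = 26_264      # this region occurs twice in the genome
SMALL_SINGLE_COPY = 17_780
LARGE_SINGLE_COPY = 84_170
REPORTED_TOTAL = 154_478


def chloroplast_genome_length(inverted_repeat, small_single_copy, large_single_copy):
    """Total length of a circular genome with two copies of the inverted repeat."""
    return 2 * inverted_repeat + small_single_copy + large_single_copy


total = chloroplast_genome_length(INVERTED_REPEAT, SMALL_SINGLE_COPY, LARGE_SINGLE_COPY)
is_consistent = total == REPORTED_TOTAL
print(total, is_consistent)
```

The four region sizes add up exactly to the reported genome length, so the quoted numbers are internally consistent.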

A major goal in evolutionary biology is to identify the genetic basis of adaptive trait variation. In the model plant species Arabidopsis thaliana, studies are now being performed exploiting natural variation as a powerful alternative to classical mutant genetics (Koornneef et al. 2004), in particular to identify genes underlying important quantitative trait variation. Benjamin et al. (2010) combined a genome-wide association (GWA) study with traditional linkage mapping to detect the genetic bases underlying natural variation in flowering time under ecologically realistic conditions in Arabidopsis thaliana. The study involved phenotyping nearly 20,000 plants over 2 winters under field conditions in a temperate climate. Simon et al. (2008) studied the phenotyping of nearly 20,000 plants over 2 winters under field conditions, including 184 worldwide natural accessions genotyped for 216,509 SNPs and 4,366 RILs derived from 13 independent crosses chosen to maximize genetic and phenotypic diversity. The results showed that combined linkage and association mapping clearly outperforms each method alone when it comes to identifying true associations. Kuittinen et al. (1997) described a quantitative trait locus (QTL) mapping experiment for flowering time in Arabidopsis. Five to seven QTLs affecting flowering time were found in a BC1 population derived from the Finnish Naantali genotype and the German strain Li-5. In a different population, consisting of 165 RILs, Alonso-Blanco et al. (1997) found four QTLs affecting flowering time. Several loci exhibiting variation in complex traits (quantitative trait loci or QTLs) have been cloned. Examples include the use of linkage disequilibrium (LD) to fine map the FRI and FLC loci controlling flowering time (Hagenblad et al. 2004). Natural variation in hypocotyl responses to light was shown to be due to polymorphisms in phytochrome light receptors. 
Affymetrix expression arrays have also been used for genotyping; total genomic DNA from recombinant inbred lines (RILs) made from a cross of Col and Ler was hybridized to the ATH1 Affymetrix array, and recombination events were identified. Marker and QTL information obtained from a segregating population can be used for the design of efficient breeding strategies. Marker-assisted selection (MAS) has been advocated as a useful tool for rapid genetic advance in the case of quantitative traits (Lande and Thompson 1990; Knapp 1994, 1998). Berloo and Stam (1999) described an experiment using RILs of A. thaliana with the objective of comparing an MAS breeding strategy, using molecular marker and QTL information, with conventional breeding methods based on phenotype only. Selection based on marker and QTL information gave approximately the same result as selection based on phenotype. The relatively high heritability of flowering time in Arabidopsis facilitated successful phenotypic selection. The difference in selection result that was anticipated to be in favor of the marker-assisted approach was therefore not observed.
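The link between heritability and the success of phenotypic selection can be illustrated with the standard breeder's equation, R = h² × S, where h² is the narrow-sense heritability, S the selection differential, and R the expected response. The Python sketch below is only an illustration of that general relationship; it is not taken from Berloo and Stam (1999), and the heritability values and the selection differential are hypothetical.

```python
def response_to_selection(heritability, selection_differential):
    """Breeder's equation: expected response R = h^2 * S."""
    if not 0.0 <= heritability <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    return heritability * selection_differential


# Hypothetical example: selected parents flower 10 days earlier than the
# population mean (S = 10). Both heritability values are made up.
SELECTION_DIFFERENTIAL_DAYS = 10.0
high_h2_response = response_to_selection(0.9, SELECTION_DIFFERENTIAL_DAYS)
low_h2_response = response_to_selection(0.2, SELECTION_DIFFERENTIAL_DAYS)
print(high_h2_response, low_h2_response)
```

With a high heritability most of the selection differential is passed on to the next generation, which leaves little room for marker information to improve on selection by phenotype alone.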

Functional Genomics

With the availability of complete genome sequences of several organisms, the focus has shifted from structural genomics to functional genomics, specifically in plants where the complete genomic sequences are becoming available (Arabidopsis and rice). A variety of approaches are used to clone and gather information about the function(s) of gene(s). Among these, insertional mutagenesis has been extensively used for cloning genes, promoters, enhancers, and other regulatory sequences from Arabidopsis. Strategies used for cloning and characterization depend upon the information available about the gene or its product. Expressed sequence tags (ESTs) and microarray-based techniques are some of the powerful approaches in this direction.

A comprehensive molecular-marker-based linkage map exists for Arabidopsis, and the map-based cloning of genes conferring specific phenotypes will become even easier with the availability of genomic sequence information. Jun et al. (2011) conducted whole-genome sequencing of A. thaliana populations. They described the majority of common small-scale polymorphisms as well as many larger insertions and deletions in the A. thaliana pan-genome, their effects on gene function, and the patterns of local and global linkage among these variants. The plant hormone auxin, typified by indole-3-acetic acid (IAA), regulates a variety of physiological processes, including apical dominance, tropic responses, lateral root formation, vascular differentiation, embryo patterning, and shoot elongation. Okushima et al. (2005) employed a functional genomic strategy that involves the identification of T-DNA insertions in the ARF gene family members to elucidate some of the biological functions of the ARF transcription factors. Most of the single ARF T-DNA insertion mutants fail to show an obvious growth phenotype. However, double mutants, such as arf7 arf19, showed a strong auxin phenotype, the absence of lateral root formation, that neither the arf7 nor the arf19 single mutant shows. The completion of the whole-genome sequence of Arabidopsis thaliana has made it possible to explore phytochemical genomics in this species by determining gene-to-metabolite correlations through the comprehensive analysis of metabolite accumulation and gene expression. Takayuki et al. (2007) proposed a strategy that involves the integrated analysis of metabolic profiling, transcriptome coexpression analysis, and the sequence similarity analysis of genes, followed by reverse genetics and biochemical analysis of the narrowed-down candidate genes for a particular reaction or regulation. 
TILLING (Targeting Induced Local Lesions IN Genomes) is a reverse genetic method that can be employed to generate allelic series of induced mutations in targeted genes for functional analyses. To date, TILLING resources in Arabidopsis thaliana are only available in accessions Columbia and Landsberg erecta.

A permanent collection of 3,509 independent EMS mutagenized M2 lines was developed in A. thaliana accession C24 by Lai et al. (2012). Using the TILLING method, they identified a total of 73 mutations, comprising 69.6 % missense, 29.0 % sense, and 1.4 % nonsense mutations. Duplication of chromatin following DNA replication requires spatial reorganization of chromatin domains assisted by chromatin assembly factor CAF-1. Nicole et al. (2006) tested the genomic consequences of CAF-1 loss and suggested that CAF-1 functions in heterochromatin formation. They also suggested that CAF-1 is required only for the complete compaction of heterochromatin but not to maintain transcriptional repression of heterochromatic genes. Extensive work in T-DNA tagging of Arabidopsis has become possible because of improvements in techniques for Agrobacterium-mediated transformation. In one large effort, 150,000 transformed plants carrying 225,000 independent T-DNA integration events (about 1.5 insertions per line) were generated, and precise locations were determined for >88,000 loci by the Salk Institute for Biological Studies (Alonso et al. 2003). Analysis of the insertion site sequences revealed that insertional mutations had been created in ~74 % of the annotated Arabidopsis genes.
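The T-DNA figures quoted above are related by simple arithmetic, which the following Python sketch makes explicit. It uses only the numbers given in the text (Alonso et al. 2003); because the text reports more than 88,000 mapped loci, the computed fraction is a lower bound, and the variable names are illustrative.

```python
# Figures quoted in the text (Alonso et al. 2003).
TRANSFORMED_LINES = 150_000
INSERTIONS_PER_LINE = 1.5
MAPPED_LOCI_LOWER_BOUND = 88_000   # the text reports ">88,000"

# Expected number of independent integration events across the collection.
total_insertions = TRANSFORMED_LINES * INSERTIONS_PER_LINE

# Minimum share of integration events with a precisely determined location.
mapped_fraction = MAPPED_LOCI_LOWER_BOUND / total_insertions

print(int(total_insertions), round(100 * mapped_fraction, 1))
```

The product reproduces the 225,000 integration events stated in the text, and at least about 39 % of them were placed on the genome sequence.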

Proteomics

Sequencing of complete genomes has advanced our understanding of biological systems, mostly by enabling a broad range of technologies for the analysis of gene functions and by providing information about the theoretical protein-coding capacity of organisms. Proteomics has been defined as “the systematic analysis of the protein population in a tissue, cell, or subcellular compartment” and is often associated with two-dimensional electrophoresis (2-DE). The concept of the “proteome” (for PROTEin complement expressed by a genOME, Wilkins et al. 1996) emerged as a consequence of questions raised by several genome and post-genome projects. The first large-scale plant proteomic work was published in Arabidopsis. The Arabidopsis proteome map provides a detailed map of 14,867 organ-specific proteotypic peptides, which accounts for the diverse composition of protein samples and confers higher sensitivity to proteotypic peptide selection for targeted and quantitative proteomics.

In terms of proteomics in plants, Arabidopsis is currently a unique system. Tom et al. (2006) mapped the Arabidopsis organelle proteome. In their study, the density gradient distributions of 689 proteins from Arabidopsis thaliana were determined, enabling confident and simultaneous localization of 527 proteins to the endoplasmic reticulum, Golgi apparatus, vacuolar membrane, plasma membrane, mitochondria, and plastids. In plants, progress has been made in determining the proteomes of organelles such as chloroplasts, mitochondria, and peroxisomes. Alison et al. (2005) described the first proteomic analysis of plant (Arabidopsis thaliana) nucleoli, in which they identified 217 proteins. This allows a direct comparison of the proteomes of an important nuclear structure between two widely divergent species: human and Arabidopsis. The comparison identified many common proteins, plant-specific proteins, proteins of unknown function found in both of the proteomes and proteins that were nucleolar in plants but non-nucleolar in human. In two proteomic studies in Arabidopsis, one examining programmed cell death in cell cultures (Swidzinski et al. 2004) and one comparing basal and R gene-mediated defense in leaves (Jones et al. 2004), differentially accumulating proteins were represented by only a small number of relatively abundant proteins, many of which were also transcriptionally regulated during the responses. In 2004, three proteomic studies of the Arabidopsis vacuole were published, two using suspension-cultured cells (Shimaoka et al. 2004; Szponarski et al. 2004) and one using mature plants (Carter et al. 2004).

The A. thaliana–P. xylostella interaction is a model system used to investigate insect resistance in plants, in particular the analysis of inducible defense mechanisms. Richard et al. (2010) investigated the physiological factors affecting feeding behavior by larvae of the insect Plutella xylostella on herbivore-susceptible and herbivore-resistant Arabidopsis thaliana. The leaves of 162 recombinant inbred lines (RILs) were screened to detect genotypes upon which Plutella larvae fed least (P. xylostella resistant) or most (P. xylostella susceptible). The combined results suggest that enhanced production of ROS may be a major pre-existing mechanism of Plutella resistance in Arabidopsis. To assemble a high-density Arabidopsis proteome map, Katja et al. (2008) performed 1,354 LTQ (linear trap quadrupole) ion-trap mass spectrometry runs with protein extracts from six different organs. The resulting data files were analyzed with two search algorithms, PeptideProphet and PepSplice. They identified 13,029 proteins with 86,456 unique peptides originating from 790,181 tandem mass spectrometry (MS/MS) spectrum assignments at a false-discovery rate below 1 %.
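To give a feel for the depth of the proteome map described above, the reported totals can be turned into average ratios. The Python sketch below uses only the counts quoted in the text (Katja et al. 2008); the ratios are plain averages over the whole data set and say nothing about how evidence is distributed across individual proteins.

```python
# Totals quoted in the text (Katja et al. 2008).
MS_RUNS = 1_354
PROTEINS = 13_029
UNIQUE_PEPTIDES = 86_456
SPECTRUM_ASSIGNMENTS = 790_181

# Average number of distinct peptides supporting each identified protein.
peptides_per_protein = UNIQUE_PEPTIDES / PROTEINS

# Average number of MS/MS spectra assigned to each unique peptide.
spectra_per_peptide = SPECTRUM_ASSIGNMENTS / UNIQUE_PEPTIDES

# Average number of assigned spectra contributed by each mass spectrometry run.
spectra_per_run = SPECTRUM_ASSIGNMENTS / MS_RUNS

print(round(peptides_per_protein, 2),
      round(spectra_per_peptide, 2),
      round(spectra_per_run, 1))
```

On average each protein is supported by between six and seven unique peptides, and each peptide by roughly nine spectra.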

Through tandem mass spectrometry, Kleffmann et al. (2004) identified 690 different proteins from purified Arabidopsis chloroplasts. Most proteins could be assigned to known protein complexes and metabolic pathways, but more than 30 % of the proteins have unknown functions, and many are not predicted to localize to the chloroplast. The combined shotgun proteomics and RNA profiling approach is of high potential value to predict metabolic pathway prevalence and to define regulatory levels of gene expression on a pathway scale. Proteomic analysis of the glutathione S-transferases (GSTs) of A. thaliana identified 20 GSTs at the protein level with a combination of GST antibody detection, LC-MS/MS analysis of 23–30 kDa proteins, and glutathione affinity chromatography (Pia et al. 2004). Peltier et al. (2004) presented a simple, fast, and scalable off-line procedure based on three-phase partitioning with butanol to fractionate membrane proteomes in combination with both in-gel and in-solution digestions and mass spectrometry. The procedure should help to further accelerate the field of membrane proteomics, and it revealed new functions of the thylakoid membrane proteome of Arabidopsis thaliana.

Metabolomics

The “omics” approach integrates transcriptome and metabolome data into a single data set and can lead to the identification of unknown genes and their regulatory networks involved in metabolic pathways of interest. Masami et al. (2007) reported the discovery of two R2R3-Myb transcription factors that positively control the biosynthesis of glucosinolates (GSLs) in Arabidopsis thaliana by an integrated omics approach. Combined transcriptome coexpression analysis of publicly available, condition-independent data and condition-specific (i.e., sulfur deficiency) data identified Myb28 and Myb29 as candidate transcription factor genes specifically involved in the regulation of aliphatic GSL production (Fig. 1).

Fig. 1

Coexpression analysis of aliphatic GSL biosynthetic genes and transcription factors. Yellow and red points indicate genes encoding enzymes and transcription factors, respectively. Transcripts from AtIMD1 and AtIMD3 and those from CYP79F1 and CYP79F2 cross-hybridized to the same probe sets on the ATH1 microarray used in AtGenExpress and hence are indistinguishable. The lengths of the lines carry no meaning in these displays (Masami et al. 2007)
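The idea behind a coexpression analysis of this kind can be sketched in a few lines of Python: expression profiles of genes across a set of conditions are compared pairwise, and pairs whose correlation exceeds a threshold are linked in a network. The sketch below is a generic illustration, not the procedure used by Masami et al. (2007). The gene names, expression values, use of the Pearson coefficient, and the 0.9 threshold are all hypothetical choices made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with correlation >= threshold."""
    edges = []
    for gene1, gene2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene1], profiles[gene2])
        if r >= threshold:
            edges.append((gene1, gene2, round(r, 3)))
    return edges


# Invented toy data: expression of four genes across five conditions.
toy_profiles = {
    "tf_candidate": [1.0, 2.0, 3.0, 4.0, 5.0],
    "enzyme_a": [2.1, 3.9, 6.2, 8.0, 9.9],
    "enzyme_b": [1.2, 2.1, 2.9, 4.2, 5.1],
    "unrelated": [5.0, 1.0, 4.0, 2.0, 3.0],
}

edges = coexpression_edges(toy_profiles)
for edge in edges:
    print(edge)
```

In this toy data set the candidate transcription factor and the two enzyme genes form a tightly connected group, while the unrelated gene is left out. A transcription factor that clusters with the biosynthetic genes of a pathway in this way becomes a candidate regulator of that pathway.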

The circadian clock is an endogenous timer that anticipates and synchronizes biological processes to the environment. Traditional genetic approaches identified the underlying principles and genetic components, but new discoveries have been greatly impeded by the embedded redundancies that confer necessary robustness to the clock architecture. Brenda and Steve (2013) described how knowledge of the regulation of clock genes, and of their mechanistic connectivity with each other and with output processes, has substantially broadened through genomic, transcriptomic, proteomic, and metabolomic approaches. Although much evidence supports the regulation of metabolites by the clock (Farre and Weise 2012), few studies have undertaken a global analysis. The metabolomes of the two arrhythmic clock mutants d975 (the triple mutant for PRR9, PRR7, and PRR5) and CCA1-OX (overexpressor of CCA1) have been compared. While both mutants have similar general morphology, their underlying metabolite profiles are quite distinct. Prieurianin (Pri) and prieurianin acetate (Pri-Ac) were identified as two related compounds that caused period shortening, dampening, and reduced expression of GI::LUC and CCA1::LUC. In addition to affecting circadian expression, both Pri and Pri-Ac caused polarity alterations in different cell types that suggested actin cytoskeletal defects.

Metabolite fingerprinting of crude extracts of A. thaliana using 1H nuclear magnetic resonance (NMR) spectroscopy and multivariate statistics has been tested for the rapid analysis of unfractionated polar plant extracts, enabling the creation of reproducible metabolite fingerprints (Ward et al. 2003). Salt stress is one of the most important factors limiting plant cultivation. Many investigations of plant response to high salinity have been performed using conventional transcriptomic and/or proteomic approaches. However, the functions of the complex stress response pathways are yet to be determined, especially at the metabolic level. Kim et al. (2007) analyzed primary metabolites, especially small polar metabolites such as amino acids, sugars, sugar alcohols, organic acids, and amines, by GC-MS and LC-MS at 0.5, 1, 2, 4, 12, 24, 48, and 72 h after a salt-stress treatment at a final concentration of 100 mM NaCl. The results suggest the co-induction of glycolysis and sucrose metabolism as well as co-reduction of the methylation cycle as long-term responses to salt stress. Metabolic profiling analyses were performed to determine metabolite temporal dynamics associated with the induction of acquired thermotolerance in response to heat shock and acquired freezing tolerance in response to cold shock (Fatma et al. 2004). This investigation provides new insight into the mechanisms of plant adaptation to thermal stress at the metabolite level, reveals relationships between heat- and cold-shock responses, and highlights the roles of known signaling molecules and protectants. Ralf et al. (2009) studied the putative A. thaliana FLS gene family using a combination of genetic and metabolic analysis approaches. They presented evidence that flavonol glycosides found in the fls1-2 mutant are synthesized in plants by the FLS-like side activity of the LDOX enzyme. 
Obtaining structural information for the metabolites isolated from the plant of interest is a prerequisite for metabolomic studies. Ryo et al. (2009) isolated metabolites of Arabidopsis in a relatively nontargeted way, aiming at the construction of metabolite standards and chemotaxonomic comparison. In addition, 35 known compounds, including six anthocyanins, eight flavonols, one nucleoside, one indole glucosinolate, and four phenylpropanoids and a derivative, together with three indoles, etc., were also isolated and identified from their spectroscopic data. Nuclear magnetic resonance (NMR) has become a key technology in plant metabolomics with the use of stable isotope labeling and advanced heteronuclear NMR methodologies. Jun et al. (2004) performed multidimensional heteronuclear NMR analysis of the metabolic movement of carbon and nitrogen nuclei in Arabidopsis thaliana. Masami et al. (2004) investigated gene-to-metabolite networks regulating sulfur and nitrogen nutrition and secondary metabolism in Arabidopsis by integrating metabolomics and transcriptomics. They carried out the transcriptome and metabolome analyses with DNA microarrays and several chemical analytical methods, respectively, including ultrahigh-resolution Fourier transform-ion cyclotron MS.

Rice

Rice is a model crop plant for genomic study because:

  1. It has one of the smallest genomes in crop plants, i.e., 390 Mb;

  2. It has huge economic importance;

  3. It is the first genome of crop plants sequenced with high precision;

  4. Large amounts of genetic stocks are available to the research community;

  5. Rice is rich in germplasm resources including the cultivated forms and its wild relatives which are maintained in germplasm banks;

  6. Rice genome shares collinearity with other members of the grass family like corn and wheat;

  7. The information available on genetic studies is huge;

  8. High-efficiency transformation technology is available and has gained widespread utility;

  9. Almost every rice-producing country has very intensive breeding programs; and

  10. Large-scale ESTs and insertional mutants are available in rice.

Genomics and Synteny

Genetic and Physical Mapping

Brondani et al. (2001) constructed a genetic map with 162 PCR-based markers (155 microsatellite and 7 sequence-tagged site markers) using a backcross population derived from the cross O. glumaepatula, accession RS-16 from the Brazilian Amazon Region, x O. sativa BG-90-2, an elite rice inbred line. The map included 47 new simple sequence repeat (SSR) markers developed from an O. glumaepatula genomic library enriched for AG/TC sequences. All SSR markers were able to amplify the O. sativa genome, indicating a high degree of SSR flanking region conservation between O. glumaepatula and O. sativa. The map covered 1,500.4 cM, with an average of one marker every 10 cM. Despite some chromosomes being more densely mapped, the overall coverage was similar to the other maps developed for rice. The advantage of an SSR-based map is that it combines the speed of the PCR reaction with the codominant nature of SSR markers, facilitating quantitative trait loci analysis and marker-assisted selection in rice breeding programs. Temnykh et al. (2001) developed a set of 200 Class I SSR markers and integrated them into the existing microsatellite map of rice, providing immediate links between the genetic, physical, and sequence-based maps. This contribution brings the number of microsatellite markers that have been rigorously evaluated for amplification, map position, and allelic diversity in Oryza spp. to a total of 500. Yamamoto et al. (2001) developed a BC1F3 population and constructed an RFLP linkage map. The map was constructed using data from 187 plants and 116 RFLP markers. They compared the chromosomal locations of putative QTLs in their study with those in other studies. They found that several QTLs for these traits, showing a correlation with phenotypic values, were mapped in the same chromosomal regions. Kennard et al. (2000) presented the first genetic map of wild rice (Zizania palustris, 2n = 2x = 30), a native aquatic grain of North America. 
The map is composed principally of previously mapped RFLP genetic markers from rice (Oryza sativa, 2n = 2x = 24). The map is important as a foundation for genetic and crop improvement studies, as well as a reference for genome organization comparisons among species in Gramineae. A comparative mapping approach with rice is especially useful because wild rice is grouped in the same subfamily, Oryzoideae, and no other mapping comparison has yet been made within the subfamily. As rice is the reference point for mapping and gene cloning in cereals, establishing a consensus map within the subfamily identifies conserved and unique regions. The genomes of wild rice and cultivated rice differ in total DNA content (wild rice has twice that of rice) and chromosome pairs (wild rice = 15 versus rice = 12). Rice genome sequence is available and is the first genome of crop plants sequenced with high precision.
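The marker density of a genetic map follows directly from its length and marker count. As a small worked example, the Python sketch below uses the figures quoted above for the map of Brondani et al. (2001); the function name is illustrative, and the result is a genome-wide average that ignores the uneven marker distribution across chromosomes mentioned in the text.

```python
def average_marker_spacing(map_length_cm, marker_count):
    """Average map length per marker, in centimorgans."""
    if marker_count <= 0:
        raise ValueError("marker_count must be positive")
    return map_length_cm / marker_count


# Figures quoted in the text (Brondani et al. 2001).
MAP_LENGTH_CM = 1500.4
MARKER_COUNT = 162

spacing = average_marker_spacing(MAP_LENGTH_CM, MARKER_COUNT)
print(round(spacing, 2))
```

The computed average is a little over 9 cM per marker, in line with the rounded figure of one marker every 10 cM.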

BAC Library

A comparison of expressed sequence tag (EST) databases from different plants can reveal the diversity in coding sequences between crop species, both closely related and distantly related. Bacterial artificial chromosomes (BACs) are very useful for physical mapping, gene location, cloning, sequencing, and gene discovery in crop plants. Tao et al. (2001) developed a genome-wide BAC-based map of the rice genome. The map consists of 298 BAC contigs and covers 419 Mb of the 430-Mb rice genome. They found that the contigs constituting the map are accurate and reliable. Particularly important to the efficiency of the approach were (1) a high-resolution, high-throughput DNA sequencing gel-based electrophoretic method for BAC fingerprinting, (2) the use of several complementary large-insert BAC libraries, and (3) computer-aided contig assembly. It has been demonstrated that the fingerprinting method is not significantly influenced by repeated sequences, genome size, and genome complexity. Use of several complementary libraries developed with different restriction enzymes minimized the “gaps” in the physical map. They observed that a clonal coverage of 6.0–8.0 genome equivalents seems to be sufficient for the development of a genome-wide physical map of 95 % genome coverage. Their study indicates that genome-wide BAC-based physical maps can be developed quickly and economically for a variety of plant and animal species by restriction fingerprint analysis via DNA sequencing gel-based electrophoresis.
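The coverage figures in this paragraph can be made concrete with a short calculation. The Python sketch below computes the fraction of the genome covered by the map from the numbers in the text, and then estimates how many BAC clones a given clonal coverage corresponds to. The average insert size of 130 kb is a hypothetical value chosen for illustration; it is not taken from Tao et al. (2001).

```python
from math import ceil

# Figures quoted in the text (Tao et al. 2001).
GENOME_SIZE_MB = 430
MAP_COVERAGE_MB = 419

# Hypothetical average BAC insert size; not given in the text.
ASSUMED_INSERT_SIZE_KB = 130


def clones_for_coverage(genome_size_mb, insert_size_kb, genome_equivalents):
    """Number of clones whose inserts sum to the requested genome equivalents."""
    genome_size_kb = genome_size_mb * 1000
    return ceil(genome_equivalents * genome_size_kb / insert_size_kb)


covered_fraction = MAP_COVERAGE_MB / GENOME_SIZE_MB
clones_6x = clones_for_coverage(GENOME_SIZE_MB, ASSUMED_INSERT_SIZE_KB, 6.0)
clones_8x = clones_for_coverage(GENOME_SIZE_MB, ASSUMED_INSERT_SIZE_KB, 8.0)

print(round(100 * covered_fraction, 1), clones_6x, clones_8x)
```

The contigs cover about 97 % of the genome size quoted in the text. Under the assumed insert size, 6.0 to 8.0 genome equivalents correspond to roughly 20,000 to 26,500 clones.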

Synteny

Genetic and linkage maps developed for many crop species show positional similarity of markers and genes, so information from one crop species can lead to a better understanding of crop evolution and gene function in other crop species. This “synteny” allows advances made in one species to have spillover impact in another. Snape et al. (2001) illustrated how genes for vernalization response and cold tolerance on chromosomes 5A and 5D of wheat have been identified and located. They proposed that the relationships of these genes to genes in other species, such as barley and rice, could be characterized through comparative mapping approaches, leading to strategies for their isolation using rice genomic tools. Smilde et al. (2001) used a set of 88 rice expressed sequence tags previously mapped on rice chromosome 1 in the cross “Nipponbare” x “Kasalath” for comparative mapping in a cross of the barley cultivars “Igri” and “Franka.” About one-third of the markers were polymorphic between “Igri” and “Franka.” These polymorphisms were mapped, and most of them (56 %) confirmed that rice chromosome 1 and barley chromosome 3H are syntenous. The markers that did not fit the collinear order were distributed randomly across the barley genome. The comparative maps of barley chromosome 3H and rice chromosome 1 comprise in total 26 common markers covering more than 95 % of the genetic length of both chromosomes. A 30-fold reduction of recombination is seen around the barley centromere, and synteny may be interrupted in this region. In their study, the good overall synteny on a mesoscale (1–10 cM) justifies the use of rice as a platform for map-based cloning in barley.
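Collinearity between two maps can be quantified in several ways. One simple option, sketched below in Python, is to sort the shared markers by their position on one chromosome and find the longest run of markers that appear in the same (or fully reversed) order on the other chromosome. This is a generic illustration, not the method of Smilde et al. (2001); the marker positions are invented, and the sketch assumes all markers map to a single chromosome pair.

```python
from bisect import bisect_left


def longest_increasing_run(values):
    """Length of the longest strictly increasing subsequence."""
    tails = []  # tails[k] = smallest tail of an increasing subsequence of length k + 1
    for value in values:
        index = bisect_left(tails, value)
        if index == len(tails):
            tails.append(value)
        else:
            tails[index] = value
    return len(tails)


def collinear_fraction(markers):
    """Share of markers in one consistent order on both maps.

    markers: list of (position_on_map_a, position_on_map_b) tuples, in cM.
    Both orientations are tried because maps may be numbered in opposite
    directions.
    """
    positions_b = [b for _, b in sorted(markers)]
    forward = longest_increasing_run(positions_b)
    reverse = longest_increasing_run([-b for b in positions_b])
    return max(forward, reverse) / len(markers)


# Invented positions (cM) of six shared markers on two chromosomes.
toy_markers = [(5, 2), (20, 15), (35, 60), (50, 33), (70, 48), (90, 71)]

fraction = collinear_fraction(toy_markers)
print(round(fraction, 3))
```

In the toy data five of the six markers keep their relative order, so one marker breaks collinearity. Markers that map to a different chromosome altogether, as reported for some of the rice ESTs in barley, would need to be counted separately.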

Dubcovsky et al. (2001) demonstrated collinearity of a large region from barley (Hordeum vulgare) chromosome 5H and rice (Oryza sativa) chromosome 3 by mapping of several common restriction fragment length polymorphism clones on both regions. One of these clones, WG644, was hybridized to rice and barley bacterial artificial chromosome (BAC) libraries to select homologous clones. A comparison of the rice and barley DNA sequences revealed the presence of four conserved regions, containing four predicted genes. The four genes are in the same orientation in rice, but the second gene is in inverted orientation in barley. The fourth gene is duplicated in tandem in barley but not in rice. Comparison of the homeologous barley and rice sequences assisted the gene identification process and helped determine individual gene structures. General gene structure (exon number, size, and location) was largely conserved between rice and barley and to a lesser extent with homologous genes in Arabidopsis thaliana. Similarly, a major rust (Puccinia melanocephala) resistance gene identified in a self-progeny of the sugarcane cultivar R570 was known to be linked to a marker revealed by the sugarcane probe CDSR29. Asnaghi et al. (2000) used synteny relationships between sugarcane and three other grasses in an attempt to saturate the region around this rust resistance gene. Comparison of sugarcane, sorghum, corn, and rice genetic maps led to the identification of homeologous chromosome segments at the extremity of sorghum linkage group D, rice linkage group 2, maize linkage group 4, and in the centromeric region of maize linkage group 5.

Gene Expression

The rice blast (Magnaporthe grisea) resistance gene Pib is a member of the nucleotide-binding site (NBS) and leucine-rich repeat (LRR) class of plant disease resistance genes and belongs to a small gene family. Wang et al. (2001) isolated and characterized a Pib homologue (PibH8) from rice cultivars Tohoku IL9, Nipponbare, Kasalath, Koshihikari, and IR24 and extensively investigated the expression of the Pib gene family (Pib, PibH8, HPibH8-1, and HPibH8-2) under various environmental and chemical treatments. RNA gel blot analysis revealed that their expression was regulated dramatically by environmental signals, such as temperature, light, and water availability. Their expression was also induced by chemical treatments, such as jasmonic acid, salicylic acid, ethylene, and probenazole. Their findings suggest that expression of the Pib gene family is upregulated by environmental conditions that would favor pathogen infection.

Gene Expression for Salt Tolerance

Kawasaki et al. (2001) investigated transcript regulation in response to high salinity for salt-tolerant rice (var. Pokkali) with microarrays including 1,728 cDNAs from libraries of salt-stressed roots. NaCl at 150 mM reduced photosynthesis to one-tenth of the prestress value within minutes. Hybridizations of RNA to microarray slides probed for changes in transcripts from 15 min to 1 week after salt shock. Beginning 15 min after the shock, Pokkali showed upregulation of transcripts. Approximately 10 % of the transcripts in Pokkali were significantly upregulated or downregulated within 1 h of salt stress. The initial differences between control and stressed plants continued for hours but became less pronounced as the plants adapted over time.

QTL Mapping and Tagging of Useful Genes

Yield and Yield Components

Moncada et al. (2001) used an advanced backcross breeding population to identify quantitative trait loci (QTLs) associated with eight agronomic traits in a BC2F2 population derived from an interspecific cross between Caiapo, an upland Oryza sativa subsp. japonica rice variety from Brazil, and an accession of Oryza rufipogon from Malaysia. They concluded that the advanced backcross QTL analysis offers a useful germplasm enhancement strategy for the genetic improvement of cultivars adapted to stress-prone environments. They observed that 56 % of the trait-enhancing QTLs identified in this study were derived from O. rufipogon. This figure is similar to the 51 % of favorable QTLs derived from the same parent in crosses with a high-yielding hybrid rice cultivar evaluated under irrigated conditions in a previous study. Similarly, Li et al. (2001) analyzed the quantitative trait loci (QTLs) responsible for the area, perimeter, length, width, and length-width ratio of the flag leaf and the second and third upside-down leaves of a doubled-haploid population of 117 lines derived from a cross between Oryza sativa var. indica (Zhaiyeqing 8) and var. japonica (Jingxi 17) cultivars based on the genetic linkage map of 243 molecular markers. All the traits were continuously distributed with transgressive segregation in the population. For the flag leaf, 12 QTLs were mapped in 8 intervals of 4 chromosomes including 3 QTLs for leaf area, 2 each for leaf perimeter and length, and 5 for ear length-width ratio. The QTLs accounted for 9–27.8 % of phenotypic variation. For the second upside-down leaf, 18 QTLs were mapped for 5 leaf morphological traits in 11 intervals of 5 chromosomes including 3 QTLs for leaf area, 4 each for leaf perimeter and length, 2 for leaf width, and 5 for length-width ratio. The mapped QTLs accounted for 9.1–24.7 % of phenotypic variation. Heading time (HT, days from sowing to heading) is an important agronomic trait in rice. 
Physiologically, HT can be divided into two stages: vegetative growth time (VGT) and reproductive growth time (RGT). Zhou et al. (2001) in a study mapped HT-related QTLs based on the performance of HT, VGT, and RGT. An indica (cv. Gui 630)/japonica (cv. Taiwanjing) doubled-haploid population and a corresponding RFLP map were constructed for the study. They used methods of composite interval mapping and multiple-trait composite interval mapping to map QTLs. A total of 19 QTLs were mapped on all 12 rice chromosomes with the exception of chromosomes 1 and 4. Their results showed that (1) more QTLs could be detected by partitioning HT into VGT and RGT; (2) the genetic variation of HT was largely attributed to VGT; and (3) the two component stages were relatively independent in terms of QTL effects, suggesting that the ratio between VGT and RGT could be genetically adjusted without apparently altering HT.

Mapping for Abiotic Stresses

Abiotic stresses such as drought, cold, mineral deficiency, mineral toxicity, and submergence are among the major constraints in rice.

Drought Tolerance

Drought is one of the main abiotic constraints in rice. A deep root system helps maintain the water status of the crop through a stress period. Ali et al. (2000) undertook an experiment to map QTLs associated with five root traits using RFLP and AFLP markers in an RIL population developed from two indica parents, IR58821-23-B-1-2-1 and IR52561-UBN-1-1-2. A genetic linkage map of 2,022 cM was constructed comprising 303 AFLP and 96 RFLP markers with an average marker spacing of 5.0 cM. QTL analysis via interval mapping detected 28 QTLs for these five root traits, which were located on chromosomes 1, 2, 3, 4, 6, 7, 10, and 11. Individual QTLs accounted for between 6 and 27 % of the phenotypic variation. Similarly, Zhang et al. (2001) constructed a genetic linkage map consisting of 315 DNA markers and identified a total of 41 QTLs for osmotic adjustment and root traits, which individually explained 8–38 % of the phenotypic variance. A region on chromosome 4 harbored major QTLs for several root traits. In another study, Shen et al. (2001) identified QTLs affecting root parameters in a doubled-haploid population of rice derived from the cross IR64/Azucena.
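The average marker spacing reported by Ali et al. (2000) can be checked from the map length and marker counts given above. The sketch below simply divides map length by the number of markers; some authors divide by the number of marker intervals instead, so the function is an illustration rather than the authors' own calculation.

```python
def average_marker_spacing(map_length_cm: float, n_markers: int) -> float:
    """Mean distance between markers, taken as map length / marker count."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return map_length_cm / n_markers


# Figures from the text: 2,022 cM, 303 AFLP + 96 RFLP markers.
n_markers = 303 + 96
spacing = average_marker_spacing(2022.0, n_markers)
print(f"{n_markers} markers, {spacing:.2f} cM apart on average")
# prints "399 markers, 5.07 cM apart on average"
```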

Cold Tolerance

Norin-PL8 is a cold-tolerant variety of rice (Oryza sativa) that was developed by introgressing chromosomal segments from a cold-tolerant javanica variety, Silewah. QTLs for cold tolerance of Norin-PL8 were detected earlier on chromosomes 3 and 4. To fine map those genes, Saito et al. (2001) developed a set of near-isogenic lines (NILs) from recombinants in the segregating population. From their study, they concluded that there are at least two QTLs for cold tolerance, tentatively designated Ctb-1 and Ctb-2, in the introgression on chromosome 4. The map distance between Ctb-1 and Ctb-2 was estimated to be 4.7–17.2 cM. Similarly, Takeuchi et al. (2001) performed QTL analysis with doubled-haploid lines (DHLs) to identify the chromosomal regions controlling cold tolerance (CT) at the booting stage of rice (Oryza sativa var. japonica). Three QTLs controlling CT (qCT-1, qCT-7, and qCT-11) were mapped to chromosomes 1, 7, and 11, respectively. The QTL with the largest effect, qCT-7, explained 22.1 % of the total phenotypic variation, while qCT-1 and qCT-11 explained approximately 5 % each. For all of the QTLs, alleles from Koshihikari increased the degree of CT.
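Map distances in centimorgans, such as the 4.7–17.2 cM estimated between Ctb-1 and Ctb-2, are derived from observed recombination fractions through a mapping function. The text does not say which function the cited studies used, so the sketch below shows the two standard ones, Haldane and Kosambi, purely as an illustration.

```python
import math


def _check(r: float) -> None:
    # Recombination fractions lie in [0, 0.5); 0.5 means unlinked loci.
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")


def haldane_cm(r: float) -> float:
    """Haldane mapping function (assumes no crossover interference)."""
    _check(r)
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Kosambi mapping function (allows for crossover interference)."""
    _check(r)
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


for r in (0.01, 0.05, 0.10, 0.20):
    print(f"r = {r:.2f}: Haldane {haldane_cm(r):.2f} cM, Kosambi {kosambi_cm(r):.2f} cM")
```

For tightly linked loci both functions give roughly 1 cM per 1 % recombination; they diverge as the recombination fraction grows.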

Flood and Submergence Tolerance

The inheritance and expression of traits associated with submergence stress tolerance at the seedling stage are genetically complex. Using AFLP markers, Sripongpangkul et al. (2000) could identify several genes/QTLs that control plant elongation and submergence tolerance in a recombinant inbred rice population. The most important gene was QIne1 mapped near sd-1 on chromosome 1. The Jalmagna (the deepwater parent) allele at this locus had a very large effect on internode elongation and contributed significantly to submergence tolerance under flooding. The second locus was a major gene, sub1 (t), mapped to chromosome 9, which contributed to submergence tolerance only. The third one was a QTL, QIne4, mapped to chromosome 4. Toojinda et al. (2003) exploited naturally occurring differences between certain rice lines in their tolerance to submergence and used QTL mapping to improve understanding of the genetic and physiological basis of submergence tolerance. Three rice populations, each derived from a single cross between two cultivars differing in their response to submergence, were used to identify QTL associated with plant survival and various linked traits. Several major QTLs determining plant survival, plant height, stimulation of shoot elongation, visual tolerance score, and leaf senescence are each mapped to the same locus on chromosome 9. These QTLs were detected consistently in experiments across all years and in the genetic backgrounds of all three mapping populations. Secondary QTLs influencing tolerance were also identified and located on chromosomes 1, 2, 5, 7, 10, and 11. These QTLs were specific to particular traits, environments, or genetic backgrounds. All identified QTLs contributed to increased submergence tolerance through their effects on decreased underwater shoot elongation or increased maintenance of chlorophyll levels or on both. Septiningsih et al. (2009) developed mega varieties of rice with Sub1 introgression that are submergence tolerant. 
They observed that all mega varieties with Sub1 introgression had a significantly higher survival rate than the original parents. An intolerant Sub1C allele combined with the tolerant Sub1A-1 allele did not significantly reduce the level of tolerance, and Sub1C-1 expression appeared to be independent of the Sub1A allele; however, even when Sub1C-1 expression was completely turned off in the presence of Sub1A-2, plants remained intolerant. They concluded that Sub1 provided a substantial enhancement in the level of submergence tolerance of all the sensitive mega varieties.

Salt Tolerance

Prasad et al. (2000) mapped QTLs controlling various rice seedling traits conferring salt tolerance by using a doubled-haploid population derived from the cross between IR64 and Azucena. Seven QTLs were identified (threshold LOD ≥ 2.00) for seedling traits under salt stress: two for seed germination (%), one for seedling root length (cm), three for seedling dry matter (mg), and one for seedling vigor. Among the seven QTLs, four were located on chromosome 6. A QTL for root length on chromosome 6, flanked by the RFLP markers RG162 and RG653, explained a high phenotypic variance of 18.9 % with a peak LOD score of 2.852.
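A LOD score is the base-10 logarithm of the ratio between the likelihood of the data with a QTL at a position and the likelihood without one. The sketch below shows that definition and how a threshold is applied, assuming the threshold of 2.00 cited above is a minimum value. The function names are illustrative and do not come from any particular QTL package.

```python
import math


def lod_score(log_lik_qtl: float, log_lik_null: float) -> float:
    """LOD from natural-log likelihoods of the QTL and no-QTL models."""
    return (log_lik_qtl - log_lik_null) / math.log(10.0)


def likelihood_ratio(lod: float) -> float:
    """Odds in favour of a QTL implied by a LOD score."""
    return 10.0 ** lod


def passes_threshold(lod: float, threshold: float = 2.0) -> bool:
    """True when the peak LOD reaches the declared threshold."""
    return lod >= threshold


# Peak LOD reported in the text for the root-length QTL on chromosome 6.
peak = 2.852
print(passes_threshold(peak))         # True
print(round(likelihood_ratio(peak)))  # 711
```

A peak LOD of 2.852 therefore means the data are roughly 700 times more likely under the QTL model than under the null model.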

Mapping for Biotic Stresses

Bacterial blight, blast, and sheath blight are among the major diseases of rice. Genes conferring resistance to these diseases have been extensively studied, mapped, and used in breeding programs. Some early examples are given here. Ammiraju et al. (2000) identified molecular markers linked to a bacterial blight resistance gene from a rice cultivar (Ajaya), which is nonallelic to other known recessive genes conferring resistance to Indian pathotype I of X. oryzae pv. oryzae. Similarly, Che et al. (2003) used an F2 rice population from a cross between “4011” and “Xiangzaoxian19” and identified five molecular markers linked to a sheath blight resistance gene, including three RFLP markers converted from RAPD and AFLP markers and two SSR markers. This dominant resistance gene was named Rsb 1 and mapped on rice chromosome 5. Ahn et al. (2000) reported the chromosomal localization and molecular mapping of a blast resistance gene, designated Pi-18, which confers resistance to Korean isolate KI-313 of the blast pathogen. RFLP analysis showed that Pi-18 was located near the end of chromosome 11, linked to a single-copy clone RZ536 at a distance of 5.4 cM, and that this gene was different from Pi-1 (t). An allelism test revealed that this gene was also different from Pi-k. A study by Fukuoka and Okuno (2001) detected two QTLs for blast resistance on chromosome 4 and one QTL on each of chromosomes 9 and 12. The phenotypic variation explained by each QTL ranged from 7.9 to 45.7 %, and the four QTLs together explained 66.3 % of the total phenotypic variation. The resistance gene, designated pi21, was mapped on chromosome 4 as a single recessive gene between RFLP marker loci G271 and G317 at distances of 5.0 cM and 8.5 cM, respectively. Similarly, Fujii et al. (2000) mapped the Pb1 locus for panicle blast resistance on rice chromosome 11 using RFLP markers. The Pb1 locus was mapped in the middle part of the long arm of chromosome 11, 1.2 cM from three RFLP markers: S723, CDO226, and C189. Although linkage analysis placed S723, CDO226, and C189 at the same locus, graphical genotyping analysis using many progeny cultivars revealed that S723 was the closest of the three markers to Pb1.

The major insect pests of rice include brown planthopper (BPH), whitebacked planthopper (WBPH), stem borer, and gall midge. Some early examples of mapping resistance to insect pests, and to the disease stem rot, are given here. Yamasaki et al. (2000), working on brown planthopper, phenotyped a set of 71 rice recombinant inbred lines (F11) derived from a cross between a japonica cultivar Asominori with ovicidal response and an indica cultivar IR24. In composite interval mapping for GWL and EM with 293 RFLP marker loci, two QTLs, one on the long arm of chromosome 1 (1L) and one on the short arm of chromosome 6 (6S), were detected for both GWL and EM. The 6S QTL explained 72.1 and 85.1 % of the phenotypic variations for GWL and EM, respectively. The QTL on 1L explained 19.8 and 17.8 % of the phenotypic variations for GWL and EM, respectively. Murai et al. (2001) constructed a high-resolution linkage map as a foundation for map-based cloning of the bph2 locus. An advanced mapping population derived from a cross of “Tsukushibare” (susceptible) with “Norin-PL4” (an authentic bph2-introgression line) was used. Through bulked segregant analysis and linkage analysis, bph2 was located within a 3.2-cM region containing eight AFLP markers. One marker (KAM4) showed complete co-segregation with bph2, and bph2 was mapped within a 1.0-cM region delimited by two flanking markers, KAM3 and KAM5. Huang et al. (2001) conducted a molecular marker-based genetic analysis of the BPH resistance of B5, a highly resistant line that derived its resistance genes from the wild rice Oryza officinalis. Their study revealed two genomic regions, on chromosomes 3 and 4, that contained genes for BPH resistance. Quantitative trait locus analysis further resolved these two loci to a 14.3-cM interval on chromosome 3 and a 0.4-cM interval on chromosome 4. Similarly, working on WBPH, Ma et al. (2002) studied a rice population consisting of 90 F3 lines from TN1/Guiyigu to analyze the linkage between DNA markers and a new gene, Wbph 6(t), conferring resistance to the whitebacked planthopper. They mapped the Wbph 6(t) gene onto the short arm of rice chromosome 11, at a distance of 21.2 cM from the SSLP marker RM 167.

The Chinese rice cultivar Duokang #1 carries a single dominant gene Gm-6 (t) that confers resistance to the four biotypes of Asian rice gall midge (Orseolia oryzae) known in China. Katiyar et al. (2001) performed bulked segregant analysis on progeny of a cross between Duokang #1 and the gall midge-susceptible cultivar Feng Yin Zhan using RAPD method. The RAPD marker OPM06 (1400) amplified a locus linked to Gm-6 (t). The locus was subsequently mapped to rice chromosome 4 in a region flanked by cloned RFLP markers RG214 and RG163. Fine mapping of Gm-6(t) revealed that markers RG214 and RG476 flanked the gene at distances of 1.0 and 2.3 cM, respectively. Another gall midge resistance gene, Gm-2, mapped previously to chromosome 4, is located about 16 cM from Gm-6 (t), to judge by data from a segregating population derived from a cross between Duokang #1 and the Indian cultivar Phalguna that carries Gm-2. Sardesai et al. (2002) identified an AFLP marker SA598 that is linked to Gm7, a gene conferring resistance to biotypes 1, 2, and 4 of the gall midge. A set of PCR primers specific to an RFLP marker, previously identified to be linked to another gall midge resistance gene Gm2, also amplified a 1.5-kb (F8LB) fragment that is linked to Gm7. Gm7 is a dominant gene and nonallelic to Gm2. Hybridization experiments with clones from a YAC library of Nipponbare revealed that Gm7 is tightly linked to Gm2 and is located on chromosome 4 of rice. The germplasm line 87-Y-550 (PI566666) inherited its resistance to stem rot from the wild species Oryza rufipogon. Ni et al. (2001) made four crosses of 87-Y-550 (resistant donor) with susceptible lines. One AFLP marker showed significant association with stem rot resistance and accounted for 45.0 % of the phenotypic variation in 59 progenies. This marker was mapped on rice chromosome 2 between the RFLP markers RZ166 and RG139 by using F2 reference population information. 
With the strategy of selective genotyping combined with a parental survey, two microsatellite markers, RM232 and RM251, on chromosome 3 were found to be associated with stem rot resistance and accounted for 41.1 % and 37.9 % of the phenotypic variation, respectively.

Meta-analysis of QTLs combines the results of several QTL detection studies and provides narrow confidence intervals for meta-QTLs, permitting easier identification of positional candidate genes. It is usually applied to multiple mapping populations but can also be applied to a single one. Khowaja et al. (2009) reported a meta-analysis of drought-related QTLs in the Bala × Azucena mapping population that compiled data from 13 experiments and 25 independent screens, providing 1,650 individual QTLs separated into 5 trait categories: drought avoidance, plant height, plant biomass, leaf morphology, and root traits. A heat map of the overlapping 1-LOD confidence intervals provides an overview of the distribution of QTLs and graphically illustrates the genetic complexity of drought-related traits in rice. Formal meta-analysis on chromosome 1, where clusters of QTLs for all trait categories lie close together, established that the sd1 semidwarfing gene coincided with a plant height meta-QTL, that the drought avoidance meta-QTL was not likely to be associated with this gene, and that this meta-QTL was not pleiotropic with nearby meta-QTLs for leaf morphology and root traits. On chromosome 5, evidence suggests that a drought avoidance meta-QTL was pleiotropic with leaf morphology and plant biomass meta-QTLs but not with meta-QTLs for root traits and plant height 10 cM lower down. A region of dense root QTL activity graphically visible on chromosome 9 was dissected into three meta-QTLs within a space of 35 cM. The confidence intervals obtained for meta-QTLs ranged from 5.1 to 14.5 cM with an average of 9.4 cM, which corresponds to approximately 180 genes in rice.

Marker-Assisted Selection

In order to pyramid the useful genes into a single genotype useful for Indian condition, Singh et al. (2001) pyramided three BB resistance genes, xa5, xa13, and Xa21, into cv. PR106, which is widely grown in Punjab, India, using marker-assisted selection. Lines of PR106 with pyramided genes were evaluated after inoculation with 17 isolates of the pathogen from Punjab and six races of Xoo from the Philippines. Genes in combinations were found to provide high levels of resistance to the predominant Xoo isolates from the Punjab and six races from the Philippines. They also evaluated the pyramided lines under natural conditions at 31 sites (including Ludhiana, Jalandhar, Ferozepur, and Sangrur) in commercial fields during 1999. The combination of genes provided a wider spectrum of resistance to the pathogen population prevalent in the region. Similarly, Sanchez et al. (2000) transferred three bacterial blight (BB) resistance genes, xa5, xa13, and Xa21, to IR65598-112 and two other sister lines IR65600-42 and IR65600-96 (new plant type, NPT) rice lines via marker-aided backcrossing. The BC3F3 NILs having more than one BB resistance gene showed a wider resistance spectrum and manifested increased levels of resistance to the Xoo races, compared with those having a single BB resistance gene. Results for two F2 populations and the progeny testing of their F3 lines showed that MAS reached an accuracy of 95 and 96 % of identifying homozygous resistant plants for xa5 and xa13, respectively. Their results demonstrated the usefulness of MAS in gene pyramiding for BB resistance, particularly for recessive genes, such as xa5 and xa13, that are difficult to select through conventional breeding in the presence of a dominant gene such as Xa21. Similarly, Siangliw et al. (2003) crossed three submergence-tolerant cultivars, FR13A, IR67819F2-CA-61, and IR49830-7-1-2-2, with KDML105. 
Transfer of the major QTL for submergence tolerance was achieved by four backcrosses to the recipient KDML105. Molecular markers tightly linked to the gene(s) involved were developed to facilitate molecular genotyping. They demonstrated that individuals of a BC4F3 line that retained a critical region on chromosome 9 transferred from tolerant lines were also tolerant of complete submergence while retaining all the agronomically desirable traits of KDML105. A few successful commercial cultivars released in India using marker-assisted backcrossing are described below.

Commercial Release of MAS-Derived Varieties in India

Improved Samba Mahsuri

Sundaram et al. (2008) introgressed three major bacterial blight resistance genes (Xa21, xa13, and xa5) into Samba Mahsuri (BPT5204), a medium slender grain indica rice variety popular with farmers and consumers in India. Samba Mahsuri is susceptible to several diseases and pests, including bacterial blight (BB). They used PCR-based molecular markers in a backcross breeding program to introgress the genes from a donor line (SS1113) in which all three genes were present in a homozygous condition. A selected BC4F1 plant was selfed to generate homozygous BC4F2 plants with different combinations of BB resistance genes. Under conditions of BB infection, the three-gene pyramid lines exhibited a significant yield advantage over Samba Mahsuri; the improved variety was later released for commercial cultivation.
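The arithmetic behind a backcross pyramiding scheme like this one can be sketched with standard Mendelian expectations. The sketch assumes that the three genes segregate independently and that the selfed BC4F1 plant is heterozygous at all three loci. These are textbook assumptions for illustration, not figures reported by Sundaram et al. (2008), and the function names are hypothetical.

```python
import math


def homozygote_fraction(n_genes: int) -> float:
    """Expected share of selfed progeny homozygous for the donor allele
    at every one of n independently segregating loci (1/4 per locus)."""
    return 0.25 ** n_genes


def min_population_size(fraction: float, confidence: float = 0.95) -> int:
    """Plants to screen so that at least one target genotype is
    recovered with the given probability."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction))


def recurrent_parent_fraction(n_backcrosses: int) -> float:
    """Average recurrent-parent genome share after n backcrosses,
    without marker-based background selection."""
    return 1.0 - 0.5 ** (n_backcrosses + 1)


print(homozygote_fraction(3))             # 0.015625, i.e. 1 in 64
print(min_population_size(1 / 64, 0.95))  # 191
print(recurrent_parent_fraction(4))       # 0.96875
```

Because only about 1 plant in 64 carries all three genes in the homozygous state, markers that identify those plants directly save considerable phenotyping effort, especially for the recessive genes xa5 and xa13.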

Improved Swarna

Swarna is one of the most popular rice varieties in India. It produces a high yield, has good grain eating quality, and, as widely claimed by farmers, requires 25 % less nitrogen. Released in Andhra Pradesh in 1982, it spread across the subcontinent and into Bangladesh, where it was never officially released. It is now the number two variety during the wet season in India. Research on submergence tolerance began in the 1970s, when flood-resistant varieties of rice were identified in India. Researchers then crossed and improved the tolerant types to produce higher-yielding rice varieties that can withstand up to 17 days of complete submergence. Using MAS, this single trait of interest was transferred into the commercially valuable rice variety Swarna without losing useful characteristics such as high yield, good grain quality, or pest and disease resistance. The first variety developed, Swarna-Sub1, showed higher survival under submerged conditions than the original variety Swarna and gave yield advantages of one to three tons per hectare over Swarna when submerged.

Next-Generation Sequencing

NGS technology is based on massively parallel sequencing, as opposed to Sanger sequencing technology. NGS platforms rely on the amplification of single isolated DNA molecules and their analysis in a massively parallel way. Hundreds of thousands or even tens of millions of single-stranded DNA molecules are immobilized on a solid surface such as glass slides or on beads, depending on the platform used. The commercially available NGS platforms used for plant genome sequencing include the Roche/454 FLX, the Illumina/Solexa Genome Analyzer, and the Applied Biosystems SOLiD System.

Next-generation sequencing (NGS) is a powerful tool for the discovery of domestication genes in crop plants and their wild relatives. Re-sequencing of domesticated and commercialized genotypes can identify regions of low diversity associated with domestication and adaptation. The sequence data can be used to design species-specific PCR primers. Novel allelic variations in close or distant relatives can be characterized by NGS, and populations with large numbers of individuals can be screened rapidly. NGS also supports the efficient identification and capture of novel genetic variation from related species. NGS allows whole-genome analysis to determine the genetic basis of phenotypic differences. This facilitates the recombination of agronomically important traits found in both cultivated species and their wild relatives. NGS provides very large amounts of DNA sequence data, normally as short reads. As the technology continues to develop, the amount of data and the length of the reads are increasing (Varshney et al. 2009). NGS gives an opportunity to explore genetic variation among plants and their wild relatives more thoroughly than earlier technologies. It also allows the most complex plant genomes to be studied and understood.

Rice was the first crop to have a reference genome sequence and has served as a model for the use of NGS. Although much is known about the origin and evolution of rice, its origin is complex and involves movement of genes between species and between groups such as the indica and japonica types (Sang and Ge 2007). Whole-genome re-sequencing of rice genotypes has allowed analysis of domestication genes identified in areas of low diversity in the genome that result from human selection. This approach has shown that japonica and indica rice share common areas of low diversity, possibly due to introgression from one population to the other following selection. NGS of wild rice populations has recently been used to establish the distinctness of Asian and Australian wild rice populations. NGS also provides new tools for the evaluation of grain quality (Henry 2011).

Using existing genetic and genomic resources and tools (mutants, transgenics, cytogenetic stocks, and genomic resources), rice researchers were able to rapidly integrate and apply genome sequence information to understand rice genome structure and evolution as well as to discover and mine genes, including those underlying complex traits of agricultural importance. Members of the large gene families [viz., transcription factors, peptide transporters, kinases, nucleotide-binding leucine-rich repeats (NB-LRRs), microRNAs, and germins] have been discovered through genome-wide surveys, enabling their cellular functions to be dissected and their roles in plant growth and development to be elucidated. In another example, genome sequence-enabled identification and positional cloning of genes responsible for traits selected during domestication, including the seed-shattering trait, led to the identification of molecular changes selected during domestication (Feuillet et al. 2010; Izawa et al. 2009). Rice is now rich in tools for mapping and breeding, including high-density SSRs, comprehensive SNPs, insertion–deletion polymorphisms (IDPs), and custom-designed (candidate gene) markers for marker-assisted breeding. One outcome of applying NGS to rice genome sequencing has been the development of a comprehensive collection of SNPs. Fukuoka et al. (2009) used sequence-based markers from the region of the blast disease resistance gene pi21 to identify recombinants between pi21 and another gene, located 37 kb away, that confers poor eating quality. With the information available from NGS, durable resistance against blast and good eating quality could be combined in the same rice varieties.

Expressed Sequence Tags (ESTs) and Flanking Sequence Tags (FSTs)

Kim et al. (2013a, b) while working with rice observed that transcript levels of the flowering repressor Ghd7 were decreased while those of its downstream genes, Ehd1, Hd3a, and RFT1, were increased. Sequencing the known flowering-regulator genes revealed mutations in Ghd7 and OsPRR37 that cause early translation termination and amino acid substitutions, respectively. Genetic analysis of F2 progeny from a cross between cv. Kitaake and cv. Dongjin indicated that those mutations additively contribute to the early-flowering phenotype in cv. Kitaake. They generated 10,000 T-DNA tagging lines and deduced 6,758 flanking sequence tags (FSTs), in which 3,122 were genic and 3,636 were intergenic. Among the genic lines, 367 (11.8 %) were inserted into new genes that were not previously tagged. Because the lines were generated by T-DNA that contained the promoterless GUS reporter gene, which had an intron with triple splicing donors/acceptors in the right border region, a high efficiency of GUS expression was shown in various organs. Sequencing of the GUS-positive lines demonstrated that the third splicing donor and the first splicing acceptor of the vector were extensively used. Similarly, Priya and Jain (2013) constructed a database, RiceSRTFDB, which provided comprehensive expression information for rice TFs during drought and salinity stress conditions and various stages of development. This information will be useful to identify the target TF(s) involved in stress response at a particular stage of development.

Functional Genomics

The last decade has seen rapid advances in rice functional genomic research globally. Most of the efforts can be summarized into three fronts: (1) construction of technological and resource platforms for high-throughput gene identification, (2) functional genomic analysis of agronomic traits and biological processes, and (3) identification and isolation of functional genes (Jiang et al. 2011). The overall goal of rice functional genomic research is to understand how the genome functions by deciphering the information conserved in the sequences, including genes and regulatory elements, at the whole-genome level. DNA microarrays and other expression profiling data are useful in many ways for functional genomic studies. The temporal and spatial patterns of gene expression can be studied from the transcript abundance of a gene in multiple tissues, which indicates whether expression is constitutive or tissue specific. The expression pattern of a gene provides a clue to its function.
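One common way to put a number on whether a gene is constitutive or tissue specific is the tau index, which compares expression in each tissue with the maximum across tissues. The chapter does not name a specific measure, so the sketch below is an illustrative choice, and the expression values are made up.

```python
def tau_index(expression: list[float]) -> float:
    """Tissue-specificity index: 0 for uniform (constitutive) expression,
    1 for expression confined to a single tissue."""
    n = len(expression)
    if n < 2:
        raise ValueError("need expression values for at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        raise ValueError("gene is not expressed in any tissue")
    return sum(1.0 - x / x_max for x in expression) / (n - 1)


# Hypothetical transcript abundances in root, leaf, panicle, and seed.
constitutive = [5.0, 5.0, 5.0, 5.0]
panicle_specific = [0.0, 0.0, 10.0, 0.0]

print(tau_index(constitutive))      # 0.0
print(tau_index(panicle_specific))  # 1.0
```

Genes with intermediate values are expressed broadly but unevenly, and the tissue with the highest abundance is a natural first place to look for a phenotype.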

Zhang et al. (2007) proposed Green Super Rice as a new goal for rice breeding. On the premise of continued yield increase and quality improvement, Green Super Rice should possess resistances to multiple insects and diseases, high nutrient efficiency, and drought resistance, promising to greatly reduce the consumption of pesticides, chemical fertilizers, and water. With the further advancement of functional genomics in rice, it is anticipated that more genes for these traits will be available for the development of Green Super Rice. The ideal situation of varietal development is breeding according to designed blueprints, so that cultivars meet the diverse needs of global rice production for high yield, superior quality, multiple resistances, and high nutrient use efficiency. However, compared to other crops with major commercial breeding programs, especially corn, breeding application of genomic technology has been rather limited in rice, even though more advances have been made in sequencing and functional genomic research in rice than in any other crop. At present, high-throughput and low-cost technologies based on the massive sequence information should be developed for breeding applications, in the form of multiple sets of oligonucleotide chips that meet the needs of different rice breeding programs, such as those for indica and japonica. With the tremendous efforts and rapid progress in functional genomic research, rice can be expected to be the first crop in which breeding by design is practiced.

Transgenic for Genomic Studies

Insertion mutants are valuable resources, especially when their insertion positions are annotated on the chromosomes. This allows one to identify the knockout mutants for a given gene. Systematic phenotyping of the insertion mutants can be used to determine the phenotypic alteration that co-segregates with the insertion element. Mutants can also be used to verify the function of a gene isolated by map-based cloning. Transposon tagging and cloning using the Ac/Ds system and other transposable elements is a novel way to understand gene function and to clone new genes in crop plants. T-DNA insertional mutagenesis is the most successful way of developing populations for genomic studies.
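As a rough illustration of why annotated insertion positions matter, the sketch below looks up which insertion lines fall within the coordinates of a given gene and splits a set of insertions into genic and intergenic. All gene names, line identifiers, and coordinates are hypothetical, and the data layout is an assumption rather than the format of any real mutant database.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


class Insertion(NamedTuple):
    line_id: str
    chrom: str
    pos: int


def lines_hitting(gene, insertions):
    """Return IDs of lines whose insertion lies inside the gene interval."""
    return [
        ins.line_id
        for ins in insertions
        if ins.chrom == gene.chrom and gene.start <= ins.pos <= gene.end
    ]


def split_genic(insertions, genes):
    """Partition insertion line IDs into (genic, intergenic) lists."""
    genic, intergenic = [], []
    for ins in insertions:
        hit = any(g.chrom == ins.chrom and g.start <= ins.pos <= g.end for g in genes)
        (genic if hit else intergenic).append(ins.line_id)
    return genic, intergenic


# Hypothetical annotation and flanking-sequence positions.
genes = [Gene("geneA", "chr1", 1000, 3000), Gene("geneB", "chr2", 500, 900)]
insertions = [
    Insertion("line01", "chr1", 1500),
    Insertion("line02", "chr1", 5000),
    Insertion("line03", "chr2", 900),
]

print(lines_hitting(genes[0], insertions))
print(split_genic(insertions, genes))
```

A linear scan is adequate for a toy example; a real population with tens of thousands of lines would more likely use a sorted index or an interval tree.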

T-DNA Insertional Mutagenesis for Functional Genomics in Rice

Considerable efforts have been made to discover the functions of plant genes. Most of the strategies have been based on genetic approaches such as mutant identification and map-based gene isolation. A large number of sequenced rice genes of unknown function have been revealed by large-scale analysis of expressed sequence tags (Yamamoto and Sasaki 1997). Thus, the development of a systematic method for discovering the biological functions of these genes has become extremely important. Insertion of transposons has been employed for functional studies in several species. The insertion of T-DNA is a random process. Insertional mutagenesis is the most suitable method for the systematic functional analysis of a large number of genes in the context of the whole plant. This system allows the production of many mutant lines at one time, and the induced mutations can be easily detected by PCR. In Arabidopsis, whose genome has been completely sequenced, several insertional mutagens such as T-DNA, Ac/Ds, and En/Spm have been employed. The inserted genes are stable over generations (Azpiroz-Leehan and Feldmann 1997). The mutant populations induced by these mutagens are being used for forward and reverse genetics. Rice plants are larger than Arabidopsis plants; hence, it is not easy to handle thousands of transgenic rice plants in a greenhouse. For rice, therefore, the use of endogenous insertional mutagens is desirable. Comprehensive studies on bacterial artificial chromosome (BAC) end sequences have revealed that the rice genome contains many kinds of transposable elements. Most of them, however, may not be active, because none of the spontaneous waxy mutations were caused by the insertion of transposable elements. Hirochika and his group have shown that a rice retrotransposon, Tos17, is highly active during tissue culture and that the activation of Tos17 is responsible for tissue culture-induced mutations (Hirochika et al. 1996).
The unique features of Tos17 suggest that it can be used for forward and reverse genetic studies. At present, many newer methodologies are available for screening T-DNA or transposon insertions within known genes and recovering sequences flanking the insertions (Cooley et al. 1996; Frey et al. 1998; Okushima et al. 2005). Reporter genes have been utilized as insertional elements to aid in the identification of insertions within functional genes. GUS is the most frequently used reporter gene because its gene product can be detected accurately.

Jeon et al. (2000) produced 11,090 primary transgenic rice plants carrying a T-DNA insertion, which yielded 18,358 fertile lines. They observed that approximately 65 % of the population contained more than one copy of the inserted T-DNA. The transgenic plants contained an average of 1.4 loci of T-DNA inserts. They carried out the histochemical GUS assay on leaves and roots from 5,353 lines and observed that 1.6–2.1 % of the tested organs were GUS positive. They also observed that the GUS expression patterns were organ or tissue specific or ubiquitous in all parts of the plant. Kohli et al. (2001) developed a population of transgenic indica rice lines (Oryza sativa var. indica cultivars Bengal and Pusa Basmati) containing the autonomous Activator transposon (Ac) from maize. A transposon excision assay, which uses the green fluorescent protein (GFP) reporter gene driven by the ubiquitin promoter, was employed to monitor Ac excision in various tissues. In their experiment, 20 % of transformed calluses displayed uniform GFP activity, indicating very early Ac excision, while later excision in another 40 % of calluses was revealed by mosaic GFP activity. They observed that 12 % of all amplified tags displayed homology to genes or ESTs, five times more than would be expected on the basis of random transpositional insertion. They advocated that the “Tagged Transcriptome Display” (TTD) technique, using a population of Ac insertion-tagged lines, represents a convenient approach for the identification of tagged genes involved in specific processes, as revealed by their expression patterns.
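Taking the figures quoted above at face value, two simple quantities follow from them: the expected number of independent T-DNA loci in the population (lines multiplied by mean loci per line) and the share of calluses showing any Ac excision (early plus later). Neither product is stated in the source; both are inferences made here for illustration, and the second assumes the two callus classes do not overlap, as the word "another" suggests.

```python
# Figures quoted in the paragraph above (Jeon et al. 2000; Kohli et al. 2001).
fertile_lines = 18_358
mean_loci_per_line = 1.4


def estimated_loci(lines, loci_per_line):
    """Expected number of independent T-DNA loci across the population."""
    return lines * loci_per_line


total_loci = estimated_loci(fertile_lines, mean_loci_per_line)

# Ac excision in transformed calluses: early (uniform GFP) plus later (mosaic GFP).
early_excision = 0.20
late_excision = 0.40
any_excision = early_excision + late_excision

print(f"estimated tagged loci: about {total_loci:,.0f}")
print(f"calluses with any Ac excision: {any_excision:.0%}")
```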

Proteomics

Proteomics is the scientific discipline which studies proteins and searches for proteins that are associated with traits like disease by means of their altered levels of expression and/or posttranslational modification between control and disease states. It enables correlations to be drawn between the range of proteins produced by a cell or tissue and the initiation or progression of a disease state and the effect of therapy (Isabel et al. 2013).

Abiotic Stresses

In the recent past, numerous studies in crop proteomics have been conducted using different genotypes and tissues subjected to different stresses, which complicates comparative analyses. To overcome this problem, several repositories where researchers can find proteomics information for functional analysis have been built, such as the International Proteomics Research in Arabidopsis, the Maize Proteome DB, the Soybean Proteome DB, and the Organellome, among many other databases (reviewed in Jorrín-Novo et al. 2009). In plant abiotic stress studies, it is common to analyze proteomes by contrasting stressed plants against control ones, attempting to correlate changes in protein accumulation with the plant phenotypic response. Additionally, comparisons between genotypes with different sensitivity toward stress (sensitive vs. tolerant) are crucial to understand the putative influence of differentially abundant proteins in tolerant genotypes. As a general tendency, tolerant genotypes not only increase oxidative stress metabolism and the scavenging capacity of reactive oxygen species (ROS) but also show enhanced carbohydrate metabolism and more efficient photosynthesis (Hajheidari et al. 2007; Ma et al. 2012; Salekdeh et al. 2002; Witzel et al. 2010). Furthermore, the embryo proteome of six different rice genotypes (with contrasting responses to drought, salt, and cold) showed differences in the posttranslational status of the rice LEA protein Rab21 (Farinha et al. 2011). This stress protein was found to be more strongly phosphorylated in the embryos of sensitive genotypes than in tolerant ones. Another proteome comparison between rice (two contrasting genotypes) and Porteresia coarctata (a halophytic type of wild rice) allowed the identification in P. coarctata of 16 proteins involved in osmolyte synthesis, photosystem functioning, RuBisCO activation, cell wall synthesis, and chaperone activity (Sengupta and Majumder 2009).
It is interesting to note that two Porteresia genes (PcINO1 and PcIMT1, absent in Oryza sativa) conferred salt tolerance when overexpressed in tobacco (Das-Chatterjee et al. 2006). However, the products of these genes could not be detected in the Porteresia proteome in the study conducted by Sengupta and Majumder, which used two-dimensional gel electrophoresis (2-DE) followed by MALDI-TOF. Liu et al. (2013b) carried out a comparative proteomic analysis of rice shoots in combination with morphological and biochemical investigations. They observed that arsenate suppressed the growth of rice seedlings, destroyed the cellular ultrastructure, and changed the homeostasis of reactive oxygen species. A total of 38 differentially displayed proteins, which were mainly involved in metabolism, redox, and protein metabolism, were identified. Arsenic was found to inhibit rice growth by negatively affecting chloroplast structure and photosynthesis.

Biotic Stresses

In the case of Xanthomonas oryzae pv. oryzae (Xoo), infection of transgenic rice suspension-cultured cells overexpressing the disease resistance gene Xa21 with compatible and incompatible races revealed the modulation of 11 proteins in compatible interactions (Chen et al. 2007). This was comparable to the modulation of 20 proteins in rice leaf blades inoculated with compatible and incompatible Xoo races (Mahmood et al. 2006). Proteomic studies of the rice–fungus pathosystem have taken advantage of the natural resistance and/or compatibility existing in the rice germplasm and the fungal races. For example, two near-isogenic rice lines, CO39 and C101A51, displaying compatible and incompatible interactions, respectively, with Magnaporthe oryzae have been used to decipher proteome alteration during fungal infection (Liao et al. 2009). Proteomic study of the apoplastic secretome, where pathogenic determinants potentially accumulate, can provide additional insights into the plant–fungus interaction. The majority of proteins identified in the apoplastic secretome of the resistant rice–Magnaporthe oryzae interaction were rice defense-related proteins. In contrast, most of the differentially regulated proteins detected with the 2-D gel approach in the susceptible interaction originated from Magnaporthe oryzae (Kim et al. 2013a, b). Investigations of proteome modulation during viral infections in model species have been scarce and mostly restricted to the rice–rice yellow mottle virus (RYMV) pathosystem (Ventelon-Debout et al. 2004; Brizard et al. 2006).

Metabolomics

Metabolites are the end products of cellular process, and they show the response of biological systems to environmental changes. The current trend in metabolomic studies is to define the cellular status at a particular time point of development or physiological status. These techniques complement other techniques such as transcriptomics and proteomics and depict precise pictures of the whole cellular process. The growing number of sequenced plant genomes has opened up immense opportunities to study biological processes related to physiology, growth and development, and tolerance to biotic and abiotic stresses at the cellular and whole plant level using a novel systems-level approach. Rice continues to be the flagship plant model and shares a significant fraction of reported literature on gene functions and phenotypes associated with plant development and metabolism (Hunter et al. 2012). Gene ontology assignments are often enriched by the addition of annotations from Arabidopsis and rice based on sequence homology. Depending on the biological question, these annotations are further evaluated to model the metabolic (Dal’Molin et al. 2010; Zhang et al. 2010), regulatory (Yun et al. 2010) and coexpressed networks (Childs et al. 2011) leading to novel discoveries of genes and enzymes regulating important agronomic traits. In order to respond to and survive environmental challenges, plants, as sessile organisms, have developed a multitude of anatomical, morphological, growth habit, and developmental adaptations that are based on underlying genetic variation. Genome sequence, gene structure, and functional annotation provide the basis for understanding a genome. However, in order to understand the physiology, development, and adaptation of a plant and its interaction with the environment, its metabolic network needs to be understood. This network represents a (bio)chemical manifestation of downstream changes in shape, form, and growth and development.

Studies on rice metabolomics have so far focused on the quality of metabolites, such as the types of metabolites that can promote seed germination (Shu et al. 2008), the metabolite variation between mutant and wild-type plants (Wakasa et al. 2006), the profiling of metabolome at different developmental stages (Tarpley et al. 2005), and the observation of natural metabolite variation between rice varieties (Kusano et al. 2007). However, information available on biotic and abiotic stresses is limited. Liu et al. (2013a) for the first time gave a reconstructed and curated genome-scale metabolic model of rice, including gene regulatory network, microRNA target information, and protein–protein interactions. The genome-scale multilevel network provides a detailed reference for rice molecular regulatory analysis and genotype–phenotype mapping. Eventually, a comprehensive molecular regulation database of rice has been developed to systematically store, analyze, and visualize the rice genome-scale multilevel network. Takahara et al. (2010) studied the effects of NADK2 expression in rice by developing transgenic rice plants that constitutively expressed the Arabidopsis chloroplastic NADK gene (NK2 lines). NK2 lines showed enhanced activity of NADK and accumulation of the NADP(H) pool, while intermediates of NAD derivatives were unchanged. Comprehensive analysis of the primary metabolites in leaves using capillary electrophoresis mass spectrometry revealed elevated levels of amino acids and several sugar phosphates including ribose-1,5-bisphosphate but no significant change in the levels of the other metabolites. The results suggest that NADP content plays a critical role in determining the photosynthetic electron transport rate in rice and that its enhancement leads to stimulation of photosynthesis metabolism and tolerance of oxidative damages.

Recently, several studies have proposed strategies for the genotype–phenotype mapping. It is clear that approaches of integrating genetics and omics would be a valuable strategy for investigating the regulation of the relationship between plant metabolism and physiology. The potential of metabolomics as a functional genomic tool in addition to transcriptomics and proteomics is well recognized (Carreno-Quintero et al. 2012). Therefore, the integration of different levels of regulatory information (genome, proteome, and metabolome) could probably be a new approach for mapping genotypes to phenotypes. The determination of rice genome sequence (and its annotation), of proteome interactions, and of transcriptome regulatory information has led to the accumulation of sufficient public data to construct systems-level models. These models could increase the understanding of genotype–phenotype relationship and consequently help to improve the quality and productivity of rice.

Maize

Genomics

Maize is the third most important food crop after rice and wheat, both in terms of area and production. In Africa and some of the Asian countries, almost 90 % of the maize grown is for human consumption and may account for 80–90 % of the energy intake. Genetic studies in Zea mays started with Edward East’s 1908 report of inbreeding depression. The earlier genetic studies focused primarily on hybrid vigor in the 1940s and on cytogenetic breakthroughs, e.g., the discovery of transposable elements by Barbara McClintock. During the last decade, the hegemony of SSRs, declared in the near past as “markers of choice,” was broken by single nucleotide polymorphism (SNP) markers (Mammadov et al. 2012). More than 130,000 gene-based SNPs have been identified (Mammadov et al. 2012), and many of these were used for the construction of genetic maps (Shi et al. 2012; Sa et al. 2012). More than 1,800 molecular maps developed using different mapping populations in maize are documented in the Maize Genetics and Genomics Database (MaizeGDB, http://www.maizegdb.org). Molecular marker-facilitated QTL mapping for yield-related traits in maize was first reported by Stuber et al. (1987). Since then, a large number of studies have identified associated major genes through gene tagging and quantitative trait locus (QTL) mapping. Many studies have been conducted to identify QTLs associated with oil content in maize kernels (Wassom et al. 2008) and with provitamin A content (Zhou et al. 2012). Markers closely linked to these QTLs are being used in marker-assisted selection (MAS) for introgressing the targeted genes into locally well-adapted germplasm. Successful MAS applications have been reported for introgression breeding in maize, including introgressions of transgenes (Ragot et al. 1995) and conversions involving simple (Morris et al. 2003) or complex traits (Willcox et al. 2002). Gupta et al. (2009) successfully converted a promising maize hybrid, Vivek Maize Hybrid 9, into a quality protein maize (QPM) version using MAS by introgression of the opaque2 gene.

MAS for Development of QPM Maize

Quality protein maize (QPM) is the result of the opaque2 gene along with associated modifiers. The protein quality of QPM is about 90 % that of milk protein. Since DNA markers within the exons of this gene were available, it was a viable proposition to use MAS for the introgression of the gene along with the necessary modifiers for the development of QPM cultivars. Using this method, many normal maize inbreds and hybrids have been converted into QPM versions and released for commercial cultivation (Gupta et al. 2009). Vivek QPM 9 was released commercially in 2008, and Vivek QPM 21 in 2012.

Vivek QPM 21

Vivek QPM 21 (the QPM version of Vivek Maize Hybrid 21) shows more than 70 % enhancement in tryptophan over the original hybrid: the tryptophan content of Vivek QPM 21 is 0.85, whereas it is 0.49 for Vivek Maize Hybrid 21. Vivek QPM 21 was also tested in the All India Coordinated Trial of Kharif 2007, 2008, and 2009, in which its grain yield and other agronomic traits were on par with those of the non-QPM national check, Vivek Maize Hybrid 17, and of Vivek Maize Hybrid 21. Vivek Maize Hybrid 21 was released for commercial cultivation in zones I, II, and IV in 2006. The parents of this hybrid were converted into QPM versions using DNA markers, and the hybrid was reconstituted by crossing VQL 1 and VQL 17. In the state trials of Uttarakhand under organic conditions, this hybrid gave more than 2.4 % higher yield over Vivek Maize Hybrid 21, with an average grain yield of 56.31 quintals per hectare. Vivek QPM 21 was released for commercial cultivation in the hill conditions of the state of Uttarakhand, India, in 2012 by the State Varietal Release Committee, Uttarakhand (Fig. 2).

Fig. 2 Vivek QPM 21, the QPM version of Vivek Maize Hybrid 21 developed using MAS. (a) Cobs of Vivek QPM 21. (b) Grains of Vivek QPM 21
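The figures quoted for Vivek QPM 21 can be checked with a short calculation. The sketch below recomputes the relative tryptophan gain from the two reported contents and converts the reported yield to tonnes per hectare, taking the metric quintal as 100 kg. The source does not state the units of the tryptophan values, so they are treated here as being on a common, unspecified scale.

```python
def percent_gain(new, old):
    """Relative increase of new over old, in percent."""
    return 100.0 * (new - old) / old


tryptophan_qpm = 0.85      # Vivek QPM 21 (units not given in the source)
tryptophan_normal = 0.49   # Vivek Maize Hybrid 21
gain = percent_gain(tryptophan_qpm, tryptophan_normal)

KG_PER_QUINTAL = 100       # metric quintal
yield_q_per_ha = 56.31     # average grain yield reported for the state trials
yield_t_per_ha = yield_q_per_ha * KG_PER_QUINTAL / 1000

print(f"tryptophan gain: {gain:.1f} %")
print(f"grain yield: {yield_t_per_ha:.3f} t/ha")
```

The computed gain is consistent with the "more than 70 %" enhancement stated in the text.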

Functional Genomics

Maize is an excellent model plant species for genetics, developmental biology, and physiology, and especially for studying the evolutionary dynamics of transposable elements within and between species in large, complex, repetitive genomes. Maize genomic research has been accelerated by the introduction of next-generation sequencing (NGS) technology, and a number of databases have been developed for functional genomics, viz., the maize sequencing database (http://www.maizesequence.org), MaizeGDB (http://www.maizegdb.org/), PlantGDB (http://www.plantgdb.org/ZmGDB), the TIGR Maize Database (http://maize.jcvi.org), Maize Assembled Genomic Island (http://magi.plantgenomics.iastate.edu/), the Plant Proteome Database (http://ppdb.tc.cornell.edu), and the integrated database OPTIMAS-DW (http://www.optimas-bioenergy.org/optimas_dw).

To study global gene expression, high-density microarrays, ESTs, and full-length cDNAs serve as the central resources (Schadt et al. 2003). There are 364,385 maize EST sequences available in the NCBI database for use in the functional analysis of the maize genome. Sekhon et al. (2011) studied gene expression at different developmental stages, especially the organ- and paralog-specific expression patterns of lignin biosynthetic pathway genes in vegetative organs during the life cycle of a maize plant. Functional annotation of full-length cDNAs for gene prediction has sharply improved knowledge of the maize transcriptome (Soderlund et al. 2009). Zinselmeier et al. (2002) characterized transcriptomes for trait- and organ-specific gene expression, such as in water-stressed maize tissues and female reproductive tissues. Other transcriptomic studies have examined gene expression in the placenta and endosperm of developing maize kernels and in the developing immature ear and tassel (Yu and Setter 2003; Andjelkovic and Thompson 2006; Zhuang et al. 2007), in leaves and roots (Bassani et al. 2004; Zheng et al. 2004; Jia et al. 2006; Poroyko et al. 2007; Spollen et al. 2008; Li et al. 2010; Lu et al. 2011), in salt-stressed tissues (Wang et al. 2003), and in aluminum toxicity-stressed tissues such as roots (Maron et al. 2008). Jia et al. (2006) constructed a full-length cDNA library of maize line Han 21 to study gene regulation in osmotically stressed maize seedlings. Alexandrov et al. (2009) generated 36,565 full-length cDNAs using different tissues and treatments from diverse hybrids of maize, of which 10,084 were determined to be high-quality unique clones. Soderlund et al. (2009) generated 27,455 full-length cDNAs from maize inbred B73 (www.maizecdna.org).

RNA interference (RNAi) is a powerful tool for functional genomics; it reduces the steady-state RNA levels not only of the target gene but also of another closely related gene (McGinnis et al. 2007). Maize is susceptible to Aspergillus flavus infection, and when RNAi was targeted at the pathogenesis-related protein 10 (PR10) gene, the expression of PR10 was reduced by 65 % to more than 99 % in transgenic callus lines (Chen et al. 2010). Virus-induced gene silencing (VIGS) has recently been used for functional genomic studies. Linde et al. (2011) used a brome mosaic virus (BMV)-based VIGS system to identify maize genes that are functionally involved in the interaction with Ustilago maydis, the causal agent of corn smut. Shi et al. (2012) used a similar VIGS approach to silence the expression of ZmTrm2, a gene encoding thioredoxin m, during sugarcane mosaic virus (SCMV) infection, which resulted in a significant enhancement of systemic SCMV infection.

Proteomics

Significant progress has been made during the last two decades in the analysis of the maize proteome. Protein data sets have been generated from different tissues at different physiological and environmental stages. Pechanova et al. (2013) developed a proteomics database from different tissues in maize. Proteomic analysis has been employed to study the important environmental stress factors limiting maize plant growth and productivity. Hu et al. (2010) studied the alterations in protein expression in response to drought and ABA in leaves of maize seedlings. In the ABA-dependent pathway, an anionic peroxidase and two putative uncharacterized proteins were upregulated by drought, while in ABA-independent pathways, a glycine-rich RNA-binding protein, a pathogenesis-related protein, an enolase, a serine/threonine protein kinase receptor, and a cytosolic ascorbate peroxidase were upregulated by drought. Late embryogenesis abundant (LEA) proteins constitute a complex set of proteins that participate in several plant stress responses. On the basis of the unusual heat stability and acid solubility characteristic of unfolded proteins, Amara et al. (2012) identified 20 unfolded maize embryo proteins, 13 of which belong to the LEA family. Kunpeng et al. (2011) studied extensively the proteome profile of maize leaf tissue at the flowering stage after long-term infection with rice black-streaked dwarf virus, which causes maize rough dwarf disease (MRDD). They identified 91 differentially accumulated proteins that belong to multiple metabolic/biochemical pathways. Further analysis of these proteins showed that MRDD resulted in dramatic changes in glycolysis and starch metabolism and eventually in significant differences in morphology and development between virus-infected and normal plants. Mechin et al. (2004) established a proteome reference map for maize endosperm by 2-D gel electrophoresis and protein identification with LC-MS/MS analysis.
Among the 632 protein spots processed, 496 were identified by matching against the NCBInr and ZMtuc-tus databases (using the SEQUEST software).

Metabolomics

Metabolomics is the term coined for essentially comprehensive, nonbiased, high-throughput analyses of complex metabolite mixtures typical of plant extracts. The first International Congress on Plant Metabolomics was held in Wageningen, Netherlands, in April 2002, with the primary goal of bringing together those players who are already active in this field and those who soon plan to be. Metabolomics is driven primarily by recent advances in mass spectrometry (MS) technology and by the goals of functional genomic efforts. The combination of NMR spectroscopy, chemometric methods, and PCA is a useful tool for the discrimination of maize silks with respect to their chemical composition, including rapid authentication of raw material of current pharmacological interest (Fiehn 2002; Halket et al. 2005; Shulaev 2006). Marcelo et al. (2012) determined the metabolic fingerprint and pattern recognition of silk extracts from seven maize landraces cultivated in southern Brazil by NMR spectroscopy and chemometric methods. Metabolomics is also useful for predicting complex heterotic traits in maize, as supported by some studies. Christian et al. (2012) crossed 285 diverse Dent inbred lines from worldwide sources with two testers and predicted their combining abilities for seven biomass- and bioenergy-related traits using 130 metabolites. Under drought, maize ovule abortion appears to be related to the flux of carbohydrates to the young ear around flowering, and concurrent photosynthesis is required to maintain this flux above threshold levels (Zinselmeier et al. 1995). Claire et al. (2011) applied an NMR-based metabolic profiling approach to the investigation of saline-induced stress in maize plants. The maize seedlings were grown in 0, 50, or 150 mM saline solution. Plants were harvested after 2, 4, and 6 days (n = 5 per class and time point), and ¹H NMR spectroscopy was performed separately on shoot and root extracts.
Metabolomic, transcriptomic, and, to a lesser extent, proteomic studies have been conducted for the high-throughput phenotyping necessary for large-scale physiological, molecular, and quantitative genetic studies aimed at identifying the function of a particular gene or set of genes involved in the control of complex physiological traits such as nitrogen use efficiency (NUE) (Meyer et al. 2007; Lisec et al. 2008; Kusano et al. 2011). Pavlík et al. (2010) analyzed the metabolome of maize plants grown under different nitrogen nutrition conditions; fresh biomass was sequentially extracted, and the isolated fractions were characterized and evaluated using IR spectra. The results showed an increased induction of oxalic acid in plants after the application of 4 g nitrogen.
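The NMR-plus-PCA fingerprinting described above can be sketched in a few lines. The example below is a minimal pure-Python calculation of the first principal component by power iteration; it is not the software used in any of the cited studies, and the binned spectral intensities, the number of bins, and the two sample classes are hypothetical values chosen only to show how samples separate along the leading axis.

```python
import math


def center(rows):
    """Subtract the column mean from every spectrum (row)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_component(rows, iterations=200):
    """Leading principal axis and sample scores of mean-centred data."""
    data = center(rows)
    n_features = len(data[0])
    axis = [1.0 / math.sqrt(n_features)] * n_features
    for _ in range(iterations):
        # Multiply by X^T X without forming the covariance matrix explicitly.
        scores = [sum(x * a for x, a in zip(row, axis)) for row in data]
        new_axis = [
            sum(s * row[j] for s, row in zip(scores, data))
            for j in range(n_features)
        ]
        norm = math.sqrt(sum(v * v for v in new_axis))
        if norm == 0.0:
            raise ValueError("data have no variance along the starting axis")
        axis = [v / norm for v in new_axis]
    scores = [sum(x * a for x, a in zip(row, axis)) for row in data]
    return axis, scores


# Hypothetical binned intensities: two control and two salt-stressed extracts.
spectra = [
    [10.0, 5.0, 1.0, 2.0],   # control 1
    [11.0, 5.0, 1.0, 2.0],   # control 2
    [10.0, 5.0, 6.0, 8.0],   # stressed 1
    [11.0, 5.0, 7.0, 8.0],   # stressed 2
]

axis, scores = first_component(spectra)
print([round(s, 2) for s in scores])
```

In this toy data set the two classes fall on opposite sides of zero on the first component, which is the kind of separation a score plot is meant to reveal; real spectra would normally be scaled and binned before this step.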

Conclusions

Approaches that integrate genetics and omics would be a valuable strategy for investigating the regulation of the relationship between plant metabolism and physiology. The potential of transcriptomics, proteomics, and metabolomics as tools for functional genomics is well recognized. Therefore, the integration of different levels of regulatory information (genome, proteome, and metabolome) could be a novel approach for mapping genotypes to phenotypes. The determination of the genome sequence (and its annotation), of proteome interactions, and of transcriptome regulatory information of a crop plant will help in generating sufficient data for developing systems-level models. These models will help in understanding the genotype–phenotype relationship and consequently help to improve the quality and productivity of crop plants for the food and nutritional security of millions of people.