Introduction

Classical genetics dominated plant breeding for most of the twentieth century, and its application led to the development of numerous high-yielding cultivars in various crops. During this period, the concept of “one gene, one enzyme” (Beadle and Tatum 1941) and the discovery of the double-helical structure of DNA (Watson and Crick 1953) provided early insights into the functional and structural aspects of genomes and accelerated the progress of molecular biology. During the last decade of the twentieth century, the genomes of various organisms (from virus to human) were completely sequenced with the help of Sanger’s sequencing technology (Sanger and Coulson 1975) and, later on, next-generation sequencing (NGS) technology (Schuster 2008). Subsequently, the genomes of many more organisms have been, and are being, sequenced. This rapid development led to the birth of a new field of molecular biology termed “genomics”. In fact, the term was coined as early as 1986 by the mouse geneticist Thomas Roderick (McKusick 1997) to refer to “mapping, sequencing and characterizing genomes”, based on the term “genome” introduced by Winkler (1920). Hence, the recent developments in molecular biology have simply expanded the scope of this term.

Subsequently, whole genome sequencing of the model plant Arabidopsis (The Arabidopsis Genome Initiative 2000) heralded a new era of plant genomics. To date, the genome sequences of more than 50 plants belonging to 49 different plant species have been decoded (Michael and Jackson 2013). However, the function of most of the sequences and genes identified by genome projects remains elusive. The generation of enormous amounts of sequence information, and the research efforts to understand its role in character expression, led to the formation of two subfields within genomics, viz. functional genomics and structural genomics. The application of genomics tools in plants has yielded important insights into key biological processes and a wealth of knowledge about development. Agriculture can now take its share of the benefits from genomics.

Genomics-led studies in crop improvement can assist plant breeders in identifying genes/QTLs that could best be utilised to improve crop productivity and quality through genetic engineering and plant breeding. In the past, the success of plant breeding relied largely on forward genetics, based on screening natural and induced mutants by phenotypic selection. The introduction of genomics-led high-throughput techniques such as marker-assisted selection (Collard and Mackill 2008; Xu and Crouch 2008), genomic selection (Goddard and Hayes 2007; Jannink et al. 2010) and genotyping by sequencing (GBS) (Huang et al. 2009; Elshire et al. 2011) has offered new opportunities to plant breeders, enriching the arsenal of classical breeding tools, facilitating the precise mapping of desired traits and thus aiding the tailoring of desirable genotypes. Therefore, genomics in combination with functional genomics has opened a new avenue for plant breeders to develop high-throughput markers and high-density linkage maps and to identify, fine map and clone genes/QTLs and analyse their functions in less time. In this chapter we assemble the functional genomics-driven technologies, their potential and their limitations, followed by the latest relevant applications in plant science.

Functional Genomics

Functional genomics can be defined as “development and application of global (genome-wide or system-wide) experimental approaches to assess gene function by making use of the information and reagents provided by structural genomics” (Hieter and Boguski 1997). Gibson and Muse (2009) defined it as “approaches under development to ascertain the biochemical, cellular, and/or physiological properties of each and every gene product”. Pevsner (2009), however, included non-genic regions and defined it as “the genome-wide study of the function of DNA (including genes and non-genic elements), as well as the nucleic acid and protein products encoded by DNA”. Functional genomics uses the approaches of both forward genetics (phenotype to gene sequence) and reverse genetics (gene sequence to function). Forward genetics aids in gene discovery, while reverse genetics helps in deciphering gene function. The main goal of functional genomics is to understand the relationship between the genome and the resultant phenotype of an organism. This can be achieved by studying gene expression and regulation and by genome-wide mutagenesis using reverse genetics tools (Alonso and Ecker 2006). Functional genomics seeks to understand the dynamic properties of an organism at the cellular and/or organism level using genomics and proteomics. This provides better insights into how the information encoded in an organism’s genome is transformed into biological function, including how a particular mutation leads to a given phenotype. It therefore has implications for dissecting the genetics of complex traits and of traits of economic importance.

Functional Genomics Tools

The field of functional genomics is the result of efforts to understand the significance of the genome sequence information that has been generated. It is necessary to understand the biochemical and physiological function of every gene product and its complexes. The function of a gene is manifested at different levels, including the RNA, protein and metabolite levels. Hence, critical analyses at these levels make it possible to understand the likely function of a particular gene or gene product and its interaction with other genes or gene products leading to a certain phenotypic expression. The tools involved therefore have to address the challenges at each level efficiently. In this chapter, however, we do not discuss functional proteomics (functional genomics of proteins), as there is a separate chapter dedicated to proteomics in this book.

Here, we discuss the tools and techniques used in functional genomics: the microarray technique; SAGE; transgenic and gene silencing approaches such as gene knockout, RNAi and VIGS; insertional mutagenesis; chemical mutagenesis, including TILLING and EcoTILLING; and NGS. Each technique and its applications are summarised below.

Expression Profiling

In the early 1990s, scientists took up the task of unravelling gene expression and the transcript profile of the genome. This laid the foundation of functional genomics. Once whole genome sequence information for various species was available, molecular biologists estimated the number of genes encoded in those genomes, but this approach provided little information about gene function. One approach to understanding the function of genes is to establish the identity and abundance of the different mRNA transcripts expressed in a tissue or cell. This is termed “expression profiling”. The approach assumes that genes with similar expression patterns are functionally related and that changes in gene expression reflect changes in physiological conditions according to the requirements of the cell. The first effective genomics technique used for gene expression analysis was the sequencing of expressed sequence tags (ESTs). The high-throughput technologies used for gene expression or transcriptome analysis include EST sequencing (Adams et al. 1995), serial analysis of gene expression (Velculescu et al. 1995), massively parallel signature sequencing (Brenner et al. 2000) and microarray technology (Schena et al. 1995). Here we discuss the important technologies widely used for transcript profiling and gene expression analysis across the genomes of different plant species.
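The assumption that similarly expressed genes are functionally related can be illustrated with a minimal sketch. The gene names and abundance values below are invented for illustration, and the use of Pearson correlation with a fixed threshold is an assumption made here, not a method taken from the studies cited above.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical transcript abundances measured in four tissues or conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

def coexpressed_pairs(profiles, threshold=0.9):
    """Return gene pairs whose profiles correlate at or above the threshold."""
    names = sorted(profiles)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if pearson(profiles[first], profiles[second]) >= threshold:
                pairs.append((first, second))
    return pairs
```

In this toy data set only geneA and geneB rise and fall together, so they would be flagged as candidates for a shared function; such a correlation suggests a relationship but does not demonstrate it.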

Microarray Technology

Microarray is a versatile high-throughput tool and one of the fastest-growing technologies in genetic research. It has several synonyms, including DNA chips, gene chips, biochips, DNA arrays and gene arrays. The technique was first used in 1995 by Patrick Brown and his group for expression analysis in Arabidopsis leaf and root tissues (Schena et al. 1995). DNA microarrays are collections of sequences from thousands of different genes that are immobilised on a solid support at fixed and known locations (spots) in an ordered arrangement. The supports are usually microscope slides but can also be silicon chips or nylon membranes. The DNA is printed, spotted or synthesised directly onto the support. As many as 30,000 cDNAs can be spotted on a microscope slide, with each spot corresponding to a unique cDNA. Based on the kind of sequences spotted, microarrays can be grouped into cDNA microarrays (cDNAs are immobilised) and oligonucleotide arrays (synthesised oligonucleotides are attached to the chip). Since the location of each sequence/gene is known, a signal at a given spot identifies a particular gene sequence. cDNA microarrays can be used for investigating gene expression in both animal and plant systems. The core principle behind the technique is hybridisation between two DNA strands (the target sample DNA and a large set of immobilised DNA sequences), which pair specifically with each other by forming hydrogen bonds between complementary nucleotides. Based on this, the procedure of microarray analysis is as follows: (1) cDNA fragments are amplified by PCR. (2) The amplified cDNAs are anchored on a glass slide at known positions. (3) Contrasting (positive trait and negative trait) cDNA probes are produced by reverse transcription. (4) The probes are hybridised to the DNAs positioned on the glass slide. (5) The array is scanned with a confocal microscope, and the array image is analysed by a computer program (for details see Xiang and Chen 2000; Aharoni and Vorst 2001). (6) Statistical methods are applied to infer whether differences in gene expression between cell populations or experimental conditions are true or due to chance. (7) The microarray data are shared. Sharing microarray data and making them publicly available is important, as highlighted by Pavlidis and Noble (2001), and it also permits validation of the results.
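The statistical step of the procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the intensities are invented, each spot is summarised as a log2 ratio of the two samples, and a simple one-sample t statistic stands in for the more elaborate methods (normalisation, multiple-testing correction) that real analyses require.

```python
from math import log2, sqrt
from statistics import mean, variance

def log2_ratios(treated, control):
    """Per-replicate log2(treated / control) ratios for a single spot."""
    return [log2(t / c) for t, c in zip(treated, control)]

def one_sample_t(values):
    """t statistic for the hypothesis that the mean log-ratio is zero."""
    n = len(values)
    return mean(values) / sqrt(variance(values) / n)

# Hypothetical background-corrected intensities for one gene, three replicates.
treated = [400.0, 800.0, 1600.0]
control = [100.0, 100.0, 100.0]

ratios = log2_ratios(treated, control)
t_stat = one_sample_t(ratios)
```

A mean log2 ratio far from zero combined with a large absolute t statistic suggests that the difference between the two samples is unlikely to be due to chance alone; with only a few replicates, as here, such a conclusion remains tentative.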

Databases for microarray data, both public and private, are plentiful (Becker 2001). Two public databases serve as data repositories: ArrayExpress at the European Bioinformatics Institute (EBI) (http://www.ebi.ac.uk/arrayexpress) and the Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI/NIH) (http://www.ncbi.nlm.nih.gov/geo).

Application: The microarray technique is a high-throughput technology for studying the expression profile of genes on a whole-genome scale; it can play a significant role in dissecting the simultaneous expression of many genes, in terms of transcription and translation, and the regulatory networks of an organism. Its application ranges from the model plant Arabidopsis to almost all cultivated crop species, such as rice, maize, wheat, brassica, potato, cabbage, grape, peanut and soybean (Table 1). Initially, it was applied largely in Arabidopsis for studying different expression profiles, such as organ development (Ruan et al. 1998), phytochrome A-related expression (Spiegelman et al. 2000), systemic acquired resistance (SAR) expression (Maleck et al. 2000), circadian clock regulation (Harmer et al. 2000) and cold and drought response (Seki et al. 2001). Later, it has been applied to understand the following aspects of plants: (1) Gene expression during plant metabolism: carbon metabolism and starch deposition in the tuber have been studied under drought conditions in potato (Watkinson et al. 2008). (2) Gene expression during growth and development: microarray expression analysis has assisted in analysing expression during seed, leaf, stem, root and gynophore development in peanut (Bi et al. 2010) and in analysing the expression profile of genes at the time of fertilisation in brassica, revealing the role of B3 family transcription factor genes and the XB3 gene in growth and development (Peng and Weselake 2013; Yuan et al. 2013). (3) Expression pattern under abiotic and biotic stresses: the technique has been used in soybean (Yuan et al. 2013) and to study the abiotic stress expression pattern of eight BrDREB1 genes (Lee et al. 2012). Similarly, the salinity tolerance expression profile of 33 genes has been studied (Srivastava et al. 2010). Additionally, the expression of the ZmALDH9, ZmALDH13 and ZmALDH17 genes of maize during drought stress, acid tolerance and pathogen infection (Zhou et al. 2012) and the differential expression of 60 genes involved in the response to powdery mildew in wheat (Xin et al. 2012) have been investigated. (4) Expression profile of quality traits: expression analysis of 562 differentially expressed genes associated with the regulation of the high and low oleic acid trait has been documented (Guan et al. 2012a). In maize, 74 MAPKKK genes involved in signalling pathways and organ development have been identified by this technique (Kong et al. 2013).

Table 1 Application of microarray techniques used in functional genomics

Advantages: This technique provides advantages over the expression analysis techniques used earlier: (1) prior information on the cDNAs of the genome is not required in microarray analysis (Aharoni et al. 2000; Harmer and Kay 2000), (2) it can cover even a small portion of the genome under investigation (Aharoni et al. 2000) and (3) the custom chip used in this technique can easily be fabricated in the laboratory (Harmer and Kay 2000).

Disadvantages: The technique suffers from some drawbacks: (1) closely related gene sequences and duplicated genomic regions may be difficult to distinguish because of cross-hybridisation (Duggan et al. 1999; Ishii et al. 2000), leading to misinterpretation of the expression of any single gene (Meyers et al. 2004); (2) it investigates randomly, arbitrarily chosen genes (Lee and Lee 2003) and (3) there is no standard set of protocols for microarray experiments, and hence comparisons among different experimental datasets are difficult. To overcome this difficulty, the Microarray Gene Expression Data (MGED) group (http://www.mged.org), an organisation of users of microarray data, presented a proposal for the minimum information about a microarray experiment (MIAME; Brazma et al. 2001) that should be included in any public database.

Serial Analysis of Gene Expression (SAGE)

Serial analysis of gene expression is an innovative approach for expression profiling of cellular transcripts, gene discovery and analysis of metabolic pathways (Gowda et al. 2004). It measures the quantitative profile of expressed genes by sequencing. The technique was first developed by Velculescu et al. (1995). In this method, cDNA is produced by reverse transcription of mRNA from a specific cell, tissue or organ of interest. The cDNA is digested with the enzyme NlaIII, and the poly-A-containing 3′ fragments are ligated to a linker fragment carrying the sequence 5′-GGGAC-3′, which is the recognition site for the type IIS restriction enzyme BsmFI. A short tag of 14–15 bp from a defined position near the 3′ end of the transcript is released by digestion with BsmFI (Velculescu et al. 1995; Zhang et al. 1997). Tags from different expressed sequences are ligated together and cloned into a plasmid vector for sequence analysis. After thousands of tags have been sequenced, the gene corresponding to each tag is identified by using the 14–15 bp tag sequence as a query to search the cDNA databases of the organism with BLAST (Altschul et al. 1990). Finally, the tag annotations are combined to generate a gene expression profile. Many modifications have been made to increase the efficiency of this technique, such as (1) LongSAGE – uses longer tags of 21 bp in comparison to the 14–15 bp tags of conventional SAGE (Saha et al. 2002; Chen et al. 2002); (2) SuperSAGE – combines the benefits of high-throughput SAGE and the microarray technique (Matsumura et al. 2006), generates 26 bp tags and broadens the use of the method to non-model organisms (Matsumura et al. 2003) and (3) Virtual SAGE – uses the principles of both EST and SAGE analysis for gene expression (Poroyko et al. 2004).
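The tag-counting logic of conventional SAGE can be sketched in silico as follows. This is an illustration under simplifying assumptions: the sequences are invented, each cDNA is given as its sense strand in the 5′ to 3′ direction, the tag is taken as the 10 bases immediately downstream of the 3′-most CATG (NlaIII) site, and linker ligation, concatenation and sequencing errors are ignored.

```python
from collections import Counter

ANCHOR = "CATG"  # NlaIII recognition site
TAG_LEN = 10     # assumed bases kept downstream of the anchor (14 bp with CATG)

def extract_tag(cdna):
    """Return the tag next to the 3'-most CATG site, or None if unavailable."""
    pos = cdna.rfind(ANCHOR)
    if pos == -1:
        return None  # transcript has no NlaIII site and yields no tag
    start = pos + len(ANCHOR)
    tag = cdna[start:start + TAG_LEN]
    return tag if len(tag) == TAG_LEN else None

def tag_counts(cdnas):
    """Count how often each tag occurs in a collection of cDNA sequences."""
    tags = (extract_tag(seq) for seq in cdnas)
    return Counter(tag for tag in tags if tag is not None)

# Hypothetical cDNA population: two copies of one transcript, one of another.
library = [
    "AAACATGTTTCATGACGTACGTACAAAA",
    "AAACATGTTTCATGACGTACGTACAAAA",
    "GGCATGTTTTTTTTTTGG",
]
counts = tag_counts(library)
```

The resulting counts are the digital expression profile: the more abundant a transcript, the more often its tag is observed. Transcripts lacking a CATG site produce no tag at all, which is one reason a tag-based profile can miss genes.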

Application: SAGE has been used for discovering new genes, transcript profiling and analysis of cellular metabolic pathways. In plants, it has been applied to analyse expression responses during host–pathogen interaction in rice (Matsumura et al. 2003) and under abiotic stress in chickpea (Molina et al. 2008). SAGE has also been applied to investigate the molecular basis of heterosis and gene regulation pathways in rice by generating 465,679 tags (Bao et al. 2005). By applying SAGE to an elite Chinese super-hybrid rice (LYP9), 10,268 quality tags were generated, which helped in assigning candidate genes responsible for the heterosis mechanism in rice (Song et al. 2007). Similarly, SAGE assisted in identifying 1,183 differentially expressed genes (DGs) conferring heterosis in the F1 super-hybrid rice Liangyou-2186 (Song et al. 2010).

Further, SAGE has been used for the elucidation of plant–pathogen interactions (Gowda et al. 2007; Matsumura et al. 2003). For instance, in rice, RL-SAGE has been used to understand the molecular basis of the host resistance response to Rhizoctonia solani, which contributed to fine mapping of the QTLs governing this disease (Venu et al. 2007). LongSAGE revealed the role of ethylene in the induced resistance response against cucumber mosaic virus in tomato (Irian et al. 2007) and has been used for gene regulation and expression analysis during grain development in wheat (Poole et al. 2008). Likewise, transcript profiling of cyst nematode-infected tomato roots has been performed using SAGE (Uehara et al. 2007). This powerful tool has been used for investigating caryopsis development in wheat, producing 29,231 unique tags of value to wheat breeding (McIntosh et al. 2007). The technique has also been used successfully for comparative analysis in eggplant (Fukuoka et al. 2010).

Advantages: It can be used to evaluate thousands of genes and obtain quantitative and qualitative information on them (Velculescu et al. 1995; Lee and Lee 2003). In addition, the digital nature of the data generated is a unique feature of this technique (Gowda et al. 2004).

Disadvantages: Rare transcripts may be missed by this technique (Harmer and Kay 2000). The technique also demands a large set of expressed sequence tags (Harmer and Kay 2000). Errors in the expression profile arise from unreliable annotation of tags in complex polyploid genomes such as that of wheat (Poole et al. 2008).

Gene Silencing Approaches

Gene Knockout

Gene knockout relies on the reciprocal exchange of DNA sequence between two chromosomes harbouring the same genetic loci (Lewin 2004; Tierney and Lamour 2005), which starts with a double-strand break and is followed by repair of the break. It is the most widely accepted model for gene targeting (Iida and Terada 2004). Site-specific recombination-based gene knockout systems (such as Cre-Lox) can potentially be applied in rice and other plants (Iida and Terada 2004). So far, however, the utility of site-specific recombination-based gene knockout has been very limited; it has been used only in the moss Physcomitrella patens (Schaefer and Zryd 1997). Therefore, disruption-based gene knockout using insertion elements has been used to knock out genes and decode their function (Wisman et al. 1998; Campisi et al. 1999; Tissier et al. 1999). The first knockout of an actin gene in Arabidopsis was reported by McKinney et al. (1995). A deletion-based gene knockout system has also been used in Arabidopsis (Li et al. 2002).

Applications

This technique has been applied intensively for unravelling gene function in different plant species. In Arabidopsis, a number of genes have been knocked out to reveal the genetic basis of important morphological and physiological traits (Bouché and Bouchez 2001). The tool has also been used to study plant metabolism, including lipid metabolism, gluconeogenesis and starch and sugar metabolism (Thorneycroft et al. 2001). In rice, the function of a gene responsible for plant height has been decoded by knocking out the gibberellin 2-oxidase (GA2ox) gene (Hsing et al. 2007). In maize, kernel size and yield have been studied by knocking out the Gln1-3 and Gln1-4 genes (Martin et al. 2006), and defence and root-knot nematode resistance by knocking out the LOX gene ZmLOX3 (Gao et al. 2008). Similarly, gene knockout has been used for elucidating the function of genes associated with aluminium tolerance (Chen et al. 2012), lutein biosynthesis (Lv et al. 2012), resistance to rice blast (Delteil et al. 2012) and root elongation and iron storage (Bashir et al. 2011) in rice, the oxidative pentose phosphate pathway (Settles et al. 2007) in maize, drought stress resistance (Malatrasi et al. 2006) in barley and salinity tolerance and ABA signalling (Park et al. 2009) in Arabidopsis. The plant genes whose function has been determined by gene knockout are listed in Table 2.

Table 2 Applications of gene knockout for deciphering gene function in plant species

This technique suffers from some drawbacks: it can create pleiotropic effects, and many knockout mutations may have no visible phenotype (Thorneycroft et al. 2001). Its full potential in functional genomics is yet to be realised.

RNAi

RNAi is the process by which expression of a target gene is inhibited by antisense and sense RNAs. RNAi was discovered when injection of double-stranded RNA (dsRNA) into worms led to specific degradation of the corresponding mRNA (Fire et al. 1998). The discovery of RNAi has supported a major paradigm shift from “one gene, one protein” to the concept that noncoding DNA can have profound effects in cells and organisms (Auer and Frederick 2009).

RNAi is a post-transcriptional gene silencing (PTGS) approach deployed in functional genomics studies to switch off the action of a particular gene by breaking down its mRNA and thereby preventing its translation. The mechanism proceeds as follows: (1) a dsRNA-specific endonuclease called DICER (Bernstein et al. 2001; Xie et al. 2004) cuts the dsRNA into pieces called siRNAs (small interfering RNAs) (Zamore et al. 2000); (2) these siRNAs associate with proteins to form the RNA-induced silencing complex (RISC) (Hammond et al. 2000); (3) the siRNA duplex is unwound, and a single strand remains bound to RISC and (4) finally the activated RISC breaks down the complementary target mRNA and inhibits translation (Hannon 2002; Schwarz et al. 2003; Kusaba 2004; Fuchs et al. 2004). Recently, different types of RNAs acting through PTGS and transcriptional gene silencing (TGS) have been used in plants, such as siRNA (Meister and Tuschl 2004), tasiRNA (Eamens et al. 2008), natsiRNA (Eamens et al. 2008), rasiRNA (Eamens et al. 2008), dsRNA (Fire et al. 1998), miRNA (Vaucheret 2008; Voinnet 2009) and other RNAs (Vazquez 2006; Vazquez et al. 2010).
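The sequence logic of the mechanism described above can be mimicked with a toy model. This is a sketch under loud simplifications, not a description of the biochemistry: every overlapping 21-nt window of the trigger is treated as a possible siRNA (Dicer does not cut this way), the guide strand is taken to be the antisense strand, and only perfect complementarity is considered. All sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]

def dice(sense_strand, size=21):
    """All overlapping windows of the given size (a simplification of dicing)."""
    return [sense_strand[i:i + size] for i in range(len(sense_strand) - size + 1)]

def guide_strands(sense_strand, size=21):
    """Antisense guide strands assumed to be loaded into RISC."""
    return [reverse_complement(window) for window in dice(sense_strand, size)]

def is_silenced(mrna, trigger_sense, size=21):
    """True if any guide strand is perfectly complementary to part of the mRNA."""
    return any(reverse_complement(guide) in mrna
               for guide in guide_strands(trigger_sense, size))

# Hypothetical 25-nt trigger and an mRNA that contains it.
trigger = "AUGGCUAGCUAGGCUAACGUAGCUA"
target_mrna = "GGGCC" + trigger + "UUUAA"
unrelated_mrna = "A" * 40
```

Applying `is_silenced` to every transcript of a species, rather than to the intended target alone, gives a crude first check for the off-target effects discussed later in this section.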

Application of RNAi in plants: Since its inception, RNAi has been used to alter the phenotype of cells and whole organisms, and its widespread application has heralded a new era in functional genomics. Biological science is quickly reaping the benefits of this powerful technique. It has been deployed for designing transgenics, for recombinant protein production in insect and mammalian systems, in metabolic engineering and in plant biotechnology for improving quality and nutritional value (Hebert et al. 2008). The applications of RNAi are highlighted in Table 3.

Table 3 Application of RNAi in decoding the functions of genes in plants

RNAi was described in plants by Waterhouse et al. (1998) in experiments that produced virus-resistant tobacco. Over the last decade, numerous findings have contributed to the view that RNAi is evolutionarily conserved in the plant kingdom and has many diverse functions (Dunoyer and Voinnet 2008; Eamens et al. 2008; Small 2007; Vaucheret 2006; Baulcombe 2004; Kusaba 2004). Our understanding of RNAi has emerged from two areas of plant science: experiments designing transgenic plants and research into virus resistance (Eamens et al. 2008; Vaucheret 2006; Baulcombe 2004; Kusaba 2004; Ghildiyal and Zamore 2009; Hebert et al. 2008). Today, researchers are tailoring crop varieties to produce small RNAs that silence essential genes in insects, nematodes and pathogens, through an approach called host-derived RNAi (HD-RNAi) (Auer and Frederick 2009). RNAi has been used extensively since its discovery; besides plants, it can be used for silencing genes in many other organisms, such as Caenorhabditis elegans, Drosophila and other animals (Palauqui and Vaucheret 1998; Kamath et al. 2001; Tabara et al. 1998; Matzke et al. 2001; Hannon 2002; Min et al. 2010). Some important RNAi databases are available for plants, for example, for Medicago truncatula (www.medicago.org/rnai/) and for Arabidopsis (http://2010.cshl.edu/scripts/main2.pl). RNAi is therefore one of the most powerful reverse genetics tools for functional genomics (Table 4).

Table 4 Application of VIGS

Advantages and disadvantages: RNAi technology has both advantages and limitations, which have been thoroughly discussed by Gilchrist and Haughn (2010). An advantage of RNAi is that a specific gene can be silenced if the target sequence is well chosen (Small 2007). This is also one of its limitations because, unlike insertional mutagenesis, it requires the exact sequence of the gene. The technique can also affect non-target (off-target) genes (Qiu et al. 2005) and cannot be applied to mutants producing a sterile phenotype (Gilchrist and Haughn 2005).

Non-Transgenic Approaches/Mutational Approach

Virus-Induced Gene Silencing

Virus-induced gene silencing (VIGS) has recently been developed as one of the important genomics tools for deciphering gene function. The term VIGS was first used by van Kammen (1997). VIGS relies on the PTGS mechanism to induce a gene knockdown effect (Baulcombe 1999, 2002; Robertson 2004). The basic principle is to introduce into the plant viral DNA or RNA harbouring the gene sequence of interest. The specific sequence is inserted into the viral genome without disturbing the infectivity of the virus (Watson et al. 2005). The introduced virus then multiplies and initiates silencing of the gene (Lu et al. 2003a; Burch-Smith et al. 2004). In this approach, plant viruses are used as vectors for the induction of sequence-specific VIGS (Baulcombe 1999; Lindbo et al. 2001). This novel approach has been used as an important tool for investigating gene function (Kumagai et al. 1995; Liu et al. 2002a, b; Peart et al. 2002b), as described below.

Applications of VIGS: This tool has been used successfully in many plants for studying an array of gene functions ranging from disease resistance to quality traits. The technique is mostly used in Nicotiana benthamiana, in which VIGS is more successful than in other host plants (Lu et al. 2003a). The silencing vectors used for VIGS include barley stripe mosaic virus (BSMV) (Holzberg et al. 2002; Van Eck et al. 2010; Wang et al. 2010; Ma et al. 2012; Hein et al. 2005; Zhang et al. 2009), tobacco rattle virus (TRV) (Li et al. 2011a; Qu et al. 2012; Gao and Shan 2013) and pea early browning virus (PEBV) (Constantin et al. 2004). The genes and functions disclosed with the help of VIGS, ranging from disease resistance to traits such as fruit ripening, shape and size in various plant species, are given in Table 4. Additionally, VIGS has been applied in tracking down the role of genes involved in drought tolerance (Manmathan et al. 2013) and other abiotic stress tolerance (Senthil Kumar et al. 2008). Recently, a modified VIGS technique called artificial miRNA VIGS (MIR-VIGS) has been used for gene silencing in Nicotiana benthamiana (Tang et al. 2010). In the near future, VIGS is likely to become indispensable as a functional genomics tool for studying gene function.

Table 5 Some important genes and their functions isolated by T-DNA in different plant species

This emerging tool has several advantages over other reverse genetic approaches: (1) VIGS can be used for the initial study of a large number of stress genes, and it can be used without complete gene sequence information (Lu et al. 2003a; Burch-Smith et al. 2004). (2) VIGS provides a better option for studying species that lack mutants and are recalcitrant to genetic transformation (Senthil Kumar et al. 2008), and it can be applied in functional genomics by creating gene knockout phenotypes (Scofield and Brandt 2012). (3) Targeted knockdown of the expression of any gene can be studied by VIGS (Burch-Smith et al. 2006). (4) It can be used for high-throughput functional genomics (Todd et al. 2010; Becker and Lange 2010).

The main drawbacks of this tool are the lack of adequate VIGS vectors and the fact that VIGS creates only transient silencing (Padmanabhan and Dinesh-Kumar 2009).

Insertional Mutagenesis

The most popular mutagenesis strategy in functional genomics is insertional mutagenesis, in which a piece of DNA is randomly inserted into the genome, causing loss of gene function. The inserted DNA may be a transposable element that can relocate from one genomic location to another (Hayes 2003; Tierney and Lamour 2005). Transposon tagging has been used for discovering the function of many important genes in different plant species (Table 5).

Applications of insertional mutagenesis as a reverse genetics tool: Insertional mutagenesis in plants can be achieved using Agrobacterium T-DNA (Azpiroz-Leehan and Feldmann 1997; Krysan et al. 1999) and plant transposons (Walbot 2000; Hamer et al. 2001; Ramachandran and Sundaresan 2001). T-DNA is used as the insertional mutagen in genome-wide mutagenesis programmes in Arabidopsis thaliana and rice (Walden 2002). Currently, over 130,000 T-DNA-tagged Arabidopsis lines are made available by the University of Wisconsin Arabidopsis Knockout Facility (Krysan et al. 1999). Modified T-DNA vectors have been used in Arabidopsis as activation tags (Weigel et al. 2000) as well as gene and promoter traps (Feldmann 1991; Lindsey et al. 1993; Babiychuk et al. 1997). T-DNA can also be used as a trap vector in genome-wide screens of rice (Jeon et al. 2000). In rice, Zheng and colleagues have created more than 30,000 T-DNA insertional lines (Wu et al. 2003), and T-DNA insertion mutagenesis has been used as a high-throughput tool for in silico reverse genetics. By this approach, 683 T-DNA insertions within genes coding for transcription factors (TFs) have been identified, and the approach has been successful in unravelling gene function in rice and other cereals (Sallaud et al. 2004). Transposon tagging can be applied as both a forward and a reverse genetics tool for discovering gene function in rice and Arabidopsis (Greco et al. 2001; Radhamony et al. 2005). Radhamony et al. (2005) listed the functions of nearly all important genes of Arabidopsis identified using insertional mutagenesis as a forward and reverse genetics tool. T-DNA-tagged genes in rice have been used in functional genomics analysis with MADS-box genes as a test case (Lee et al. 2003).

Some important web-based resources for genome-wide random mutagenesis are the Arabidopsis Knockout Facility at the University of Wisconsin (http://www.biotech.wisc.edu/NewServicesAndResearch/Arabidopsis/default.htm), the Maize Gene Discovery and Rescue Mu Project (http://www.zmdb.iastate.edu/zmdb/sitemap.html), the Maize Targeted Mutagenesis database (http://mtm.cshl.org) and the Arabidopsis Biological Resource Center (http://arabidopsis.org/abrc) (Primrose and Twyman 2006).

Transposons are widely used for insertional mutagenesis in plants and have helped in identifying new genes (Gierl and Saedler 1992). Transposon-based methods have been used in Arabidopsis, maize and other plants (Stemple 2004). The transposons used include Ac, suppressor-mutator (Spm) and mutator (Mu) from maize, Tam3 from Antirrhinum majus and Tph1 from petunia. Functional genomics programmes using Ac started in Arabidopsis (Ito et al. 1999) and in tomato (Meissner et al. 2000). The approach has also been used in Brassica napus (Bade et al. 2003), Medicago truncatula (Trieu et al. 2000) and poplar (Groover et al. 2004). More recently, Tos17, an endogenous retrotransposon of rice, and Tnt1 and Tto1, retrotransposons of tobacco, have been used for gene tagging in Arabidopsis and rice (Courtial et al. 2001; Okamoto and Hirochika 2000; Yamazaki et al. 2001; Iantcheva et al. 2009). The recently identified retroelements MERE in M. truncatula and LORE in Lotus japonicus can also be used as reverse genetics tools (Trichine et al. 2009). In rice, T-DNA insertion has aided the development of a total of 372,346 mutant lines relevant to many physiological traits (Chang et al. 2012).

Insertional mutagenesis has certain advantages over traditional forms of mutagenesis: the interrupted gene becomes tagged with the inserted element, a strategy known as signature tagged mutagenesis (STM). T-DNA insertional mutagenesis also offers the advantages of heritability and complete loss of function (Gilchrist and Haughn 2010).

Despite its potential as a reverse genetics tool, insertional (T-DNA or transposon-mediated) mutagenesis suffers from some disadvantages: (1) the low frequency of mutations means that a large population must be screened to find mutations in a given gene (Gilchrist and Haughn 2005); (2) insertion in essential genes usually causes lethality (Till et al. 2003); (3) the precise mechanism of T-DNA integration into the plant genome remains largely unknown (Iida and Terada 2004; Tierney and Lamour 2005); (4) insertional mutagenesis is limited by its host range (McCallum et al. 2000b) and (5) complex arrangements of T-DNA, multiple insertions, chromosomal duplications and rearrangements and insertion of vector backbone sequence sometimes occur (Jorgensen et al. 1987; Radhamony et al. 2005). Nevertheless, insertional mutagenesis has been used as a cost-effective reverse genetics tool for identifying and decoding gene function in the model legumes Lotus, soybean, Medicago and Pisum (Trichine et al. 2009) and in many other plant species.
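Disadvantage (1) can be made concrete with a simple calculation. The sketch below is a minimal illustration, assuming that insertions land uniformly at random across the genome, so that the probability of hitting a gene of length x in a genome of size G with n independent insertions is 1 - (1 - x/G)^n. The gene and genome sizes are illustrative values, not figures taken from the studies cited above, and the function names are hypothetical.

```python
import math

def hit_probability(gene_kb, genome_kb, n_insertions):
    """P(at least one insertion in the gene), assuming uniform random insertion."""
    return 1.0 - (1.0 - gene_kb / genome_kb) ** n_insertions

def insertions_needed(gene_kb, genome_kb, target_p):
    """Smallest number of independent insertions giving P >= target_p."""
    per_insertion_miss = 1.0 - gene_kb / genome_kb
    return math.ceil(math.log(1.0 - target_p) / math.log(per_insertion_miss))

# Illustrative values only: a 2 kb gene in a 125,000 kb (125 Mb) genome.
n = insertions_needed(2.0, 125_000.0, 0.95)
print("insertions needed for 95 % chance of a hit:", n)
print("achieved probability:", round(hit_probability(2.0, 125_000.0, n), 4))
```

Because real insertions are not perfectly random and lines often carry more than one insert, the result is best read as an order-of-magnitude guide to population size.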

Chemical Mutagenesis
TILLING and EcoTILLING

One of the groundbreaking discoveries in the history of genetics is that of induced mutations (Muller 1930). Induced mutagenesis has since been widely used for phenotypic screening, gene mapping and the investigation of gene function, by applying a mutagen that acts randomly across the genome to large populations of a plant species. Building on this, TILLING was designed as one of the most dynamic high-throughput reverse genetics tools. The technique combines the high density of point mutations obtained by traditional chemical mutagenesis with rapid mutational screening to find induced lesions (McCallum et al. 2000b). In other words, TILLING couples chemical mutagenesis with a sensitive mutation detection method (Koornneef et al. 1982).

The TILLING approach follows these steps: (1) generation of a mutagenised population, (2) isolation and pooling of DNA, (3) PCR amplification with labelled primers, (4) enzymatic assay and (5) detection of mutations (McCallum et al. 2000b; Till et al. 2007a, b).
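Steps (3) to (5) can be illustrated with a small consistency check. The sketch below assumes a setup in which the forward and reverse primers carry different labels and the enzyme cuts once at the mismatch, so that the two labelled fragments should add up to roughly the full amplicon length. The function names, the tolerance and the example sizes are hypothetical and serve only to show the reasoning.

```python
def expected_fragments(amplicon_bp, mismatch_pos):
    """Idealised fragment sizes (forward-labelled, reverse-labelled) for one cut.

    mismatch_pos is the distance in bp from the 5' end of the forward primer.
    """
    if not 0 < mismatch_pos < amplicon_bp:
        raise ValueError("mismatch must lie inside the amplicon")
    return mismatch_pos, amplicon_bp - mismatch_pos

def signals_agree(fwd_fragment_bp, rev_fragment_bp, amplicon_bp, tolerance_bp=5):
    """True if the two observed fragments sum to the amplicon size within a tolerance."""
    return abs(fwd_fragment_bp + rev_fragment_bp - amplicon_bp) <= tolerance_bp

# Hypothetical 1,000 bp amplicon with a mismatch 320 bp from the forward primer.
print(expected_fragments(1000, 320))
# Observed band sizes are approximate, so a small tolerance is allowed.
print(signals_agree(322, 679, 1000))
print(signals_agree(322, 500, 1000))
```

A pair of bands that fails this check is more likely to be background than a genuine induced lesion, which is why the two channels are usually read together.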

Application of TILLING: TILLING is an attractive reverse genetics tool that was first used in Arabidopsis thaliana (McCallum et al. 2000a, b); later, the Arabidopsis TILLING Project (ATP) was established (Wang et al. 2006). The ATP is now known as the Seattle TILLING Project (STP) (http://tilling.fhcrc.org:9366/), which has enabled researchers to analyse more than 100 genes harbouring 1,000 mutations (Till et al. 2003). A computational tool for primer design and gene modelling, Codons Optimized to Discover Deleterious Lesions (CODDLe) (http://www.proweb.org/coddle), has been developed (McCallum et al. 2000b). Similarly, the web-based program Sorting Intolerant From Tolerant (SIFT) (http://blocks.fhcrc.org/sift/SIFT.html) helps to distinguish neutral from deleterious amino acid changes (Ng and Henikoff 2003), and the PARSESNP program (http://www.proweb.org/parsesnp/) predicts missense mutations and point mutations that affect restriction endonuclease sites (Till et al. 2003; Taylor and Greene 2003). The technique has been used successfully in different crops to detect mutations in genes, with various mutagen treatments and varying mutation frequencies, as given in Table 6. TILLING has been applied successfully in wheat, whose polyploid nature allows it to tolerate high densities of mutation (Slade et al. 2005), and has been deployed to investigate an extensive allelic series of the waxy genes in both hexaploid bread wheat and tetraploid wheat (Uauy et al. 2009). In barley, TILLING helped in detecting mutations in the Hin-a and HvFor1 genes (Caldwell et al. 2004) and the dehydrin genes Dhn12 and Dhn13 (Lababidi et al. 2009); it has also been applied to the COMT gene in sorghum (Xin et al. 2008), to 35 genes in the Cameor population (Le Signor et al. 2009) and the tendril-less gene (Hofer et al. 2009) in pea, and to a beta-glucan biosynthesis gene in oat (Chawade et al. 2010).
Since its advent, TILLING has been applied extensively for functional genomics in different crop plants, resulting in the development of TILLING platforms in various plants, as given in Table 7.

Table 6 Detection of genes/alleles obtained by creating TILLING population in different plant species
Table 7 TILLING platform in different plant species

The main advantages of TILLING in comparison with other reverse genetics approaches are as follows: (1) It is an appropriate tool for genetic modification without introducing foreign DNA into the genome, irrespective of the genome size, reproductive system or ploidy level of the organism (Gilchrist and Haughn 2005). (2) The population size needed for detecting mutations is small in comparison with other reverse genetics approaches (McCallum et al. 2000b). (3) The chances of recovering a deletion mutation can be calculated in advance (McCallum et al. 2000b). (4) It is applicable to both small- and large-scale screening. (5) TILLING draws on rapidly advancing technologies such as DHPLC for high-throughput polymorphism detection (Gilchrist and Haughn 2005). (6) The strategy can be deployed as a high-throughput technique for detecting single base changes within the target gene (Tierney and Lamour 2005).
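The kind of advance calculation mentioned in advantage (3) can be sketched as follows. The example assumes that induced mutations are spread evenly and independently along the genome, so that the number of mutations found in a screened fragment follows a Poisson distribution. The mutation density, fragment length and population size are hypothetical, and the sketch covers any induced mutation in the fragment rather than a specific class of lesion.

```python
import math

def expected_mutations(n_lines, amplicon_bp, mutations_per_bp):
    """Expected number of induced mutations in one amplicon across the population."""
    return n_lines * amplicon_bp * mutations_per_bp

def p_at_least_one(n_lines, amplicon_bp, mutations_per_bp):
    """Poisson probability of recovering at least one mutation in the amplicon."""
    return 1.0 - math.exp(-expected_mutations(n_lines, amplicon_bp, mutations_per_bp))

# Hypothetical screen: 3,000 lines, a 1,000 bp amplicon and
# one mutation per 300 kb in each line.
density = 1.0 / 300_000
print("expected mutations:", expected_mutations(3000, 1000, density))
print("P(at least one):", p_at_least_one(3000, 1000, density))
```

Only a fraction of the recovered changes will be deleterious, so the population size required for a knockout allele is larger than the figure for any mutation.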

Despite this potential, the technique has some drawbacks. The load of mutations created must be balanced against the recovery of mutants (Till et al. 2003), and fertility must be maintained in the mutagenised organism in the first and subsequent generations (Perry et al. 2003).

EcoTILLING and its use: Henikoff and Comai (2003) coined the term “EcoTILLING”, which was first used with Arabidopsis ecotypes (hence EcoTILLING) to survey variation in five genes in 96 different Arabidopsis accessions (Comai et al. 2004). As in TILLING, enzymatic mismatch cleavage and fluorescence detection are applied (Colbert et al. 2001; Comai et al. 2004). Among crops, the technique was first applied in rice (Kadaru et al. 2006), followed by wheat (Wang et al. 2006, 2008) and barley (Mejlhede et al. 2006). EcoTILLING can be used for diversity analysis, germplasm screening and functional genomics (Till et al. 2010). In western cottonwood (Populus trichocarpa), EcoTILLING aided in detecting 63 new SNPs (Gilchrist et al. 2006). The technique was employed to identify single nucleotide polymorphisms (SNPs) and small insertions/deletions (INDELS) in a collection of Vigna radiata accessions (Barkley et al. 2008). Likewise, EcoTILLING has been used to find orthologous hypoallergenic isoforms of Arah2 in 30 different accessions of Arachis duranensis (Ramos et al. 2009; Riascos et al. 2010). It has also been applied to SNP haplotype diversity in switchgrass (Weil 2009). Additionally, it can be deployed for identifying genes related to biotic and abiotic stresses (Antollin-Llovere and Parniske 2007). More recently, EcoTILLING has been deployed in Musa species to identify nucleotide polymorphisms, and 800 novel alleles have been discovered from 80 accessions (Till et al. 2010).

De-TILLING (deletion TILLING): More recently, an alternative reverse genetics strategy called De-TILLING has been developed. It relies on physically induced genomic deletions, combining fast neutron mutagenesis with PCR-based detection (Rogers et al. 2009). The technique has been used in Medicago truncatula. Its advantages are that (1) it is independent of plant transformation, tissue culture and target size and (2) it recovers knockout mutants (Rogers et al. 2009). Its full range of applications in different plant species is yet to be harnessed.

Next-Generation Sequencing

Sanger’s DNA sequencing technology, introduced in 1975, dominated for almost two decades and was considered one of the most robust techniques for genome sequencing. However, its cost and labour requirements became limiting factors for sequencing complete genomes. With the progress of cutting-edge technologies, Sanger sequencing has largely been replaced by next-generation sequencing (Schuster 2008), also known as second-generation sequencing (Pérez-de-Castro et al. 2012), a powerful high-throughput technology that reduces the cost and time of sequencing and enhances its accuracy. NGS technologies include the Roche 454 system (the first successfully used NGS platform, in 2005), based on the sequencing-by-synthesis principle (Ronaghi 2001), the AB SOLiD system, the Illumina Golden Gate assay and the Compact PGM sequencer. NGS techniques are classified by read length into short-read (25–75 bp) and long-read (400–500 bp) systems (Shendure and Ji 2008). NGS uses three principles: sequencing by synthesis, sequencing by ligation and single-molecule sequencing (Ansorge 2009; Egan et al. 2012). The technique is based on massively parallel sequencing and imaging, generating hundreds of billions of bases per run (Shendure and Ji 2008; Deschamps et al. 2012a). The advantages and disadvantages of the different NGS systems, including their throughput, have been discussed thoroughly elsewhere (Metzker 2010; Liu et al. 2012; Pérez-de-Castro et al. 2012).

Applications of NGS in Functional Genomics

Initially, Sanger’s sequencing technology was deployed for decoding the genome sequences of the model plant Arabidopsis (TAGI 2000) and rice (IRGSP 2005). The arrival of NGS technology heralded a paradigm shift in both plant and animal science by allowing the genome sequences of a large number of economically important crops to be determined quickly and at lower cost. Importantly, NGS has enabled whole genome sequencing of more than a dozen crops, as given in Table 8. The applications of NGS include de novo genome sequencing (Velasco et al. 2007; He et al. 2011; Buckler et al. 2010; Lai et al. 2010); transcriptome sequencing including siRNA and miRNA sequencing (Axtell et al. 2006; Jacquier 2009); epigenetic analysis including (1) DNA methylation pattern or methylation profiling (Cokus et al. 2008; Costello et al. 2009), (2) histone modification (Impey et al. 2004; Mikkelsen et al. 2007) and (3) nucleosome pattern analysis (Johnson et al. 2006); genotyping by sequencing (GBS) (Huang et al. 2009); genome-wide association study (GWAS) (Elshire et al. 2011); and single nucleotide polymorphism (SNP) marker development (Davey et al. 2011; Bundock et al. 2009). Some important applications related directly or indirectly to functional genomics are summarised below.

Table 8 List of complete genome sequence of different plant species applying next-generation sequencing (NGS)

De Novo Whole Genome Sequencing

The arrival of NGS technology has brought a revolution in genome sequencing. It has been applied for de novo whole genome sequencing of plants with or without a reference genome sequence, as given in Table 8. For the first time, the whole 5A chromosome of wheat was sequenced using NGS (Vitulo et al. 2011). The application of NGS to decoding the complex genome sequences of allopolyploids such as wheat and oilseed rape has been discussed (Edwards et al. 2013). NGS is also deployed for re-sequencing plant genomes that already have a reference sequence, such as rice, maize and Arabidopsis (He et al. 2011; Xu et al. 2011b; Yu et al. 2011a; Huang et al. 2013a; Hufford et al. 2012; Cuperus et al. 2010), for marker development, understanding complex traits and QTLs, SNP discovery and allele mining, as given in Table 9.

Table 9 Some important applications of NGS in plant species

Whole Genome Re-sequencing

Whole genome re-sequencing aims at sequencing the genomes of individual members of a species to identify genomic variation relative to the reference genome of that species (Straton 2008). NGS has accelerated the re-sequencing of whole genomes of populations, leading to the discovery of markers, the development of high-resolution maps and QTL mapping (Gao et al. 2012). Re-sequencing of whole genomes by NGS has been reported in rice (He et al. 2011; Xu et al. 2011b; Yu et al. 2011a; Huang et al. 2013a), in maize (Buckler et al. 2010) and in soybean (Kim et al. 2010). Seventeen wild and fourteen cultivated soybean accessions have been re-sequenced, substantiating the presence of high allelic diversity in wild soybean (Lam et al. 2010). Re-sequencing of 446 diverse accessions of Oryza rufipogon and 1,083 cultivated indica and japonica varieties has given useful insights into the domestication and origin of rice (Huang et al. 2012c). Likewise, in maize, genome-wide re-sequencing of wild, landrace and improved lines has shed light on the domestication and evolution of maize (Hufford et al. 2012). In Arabidopsis, 58 RILs and both parents have been re-sequenced by NGS, and 6,159 and 701 SNPs have been identified (Maughan et al. 2010). The evolutionary factors underlying phenotypic variation in Arabidopsis and its close relatives have been analysed by sequencing 80 strains of Arabidopsis (Cao et al. 2011). Re-sequencing of the Arabidopsis genome led to the identification of the SNP responsible for the ebi-1 phenotype (Ashelford et al. 2011). Similarly, re-sequencing served to identify MIR390a precursor processing–defective mutants in Arabidopsis (Cuperus et al. 2010). Additionally, 180 Swedish Arabidopsis lines have been sequenced to capture overall genomic variation (Long et al. 2013).
Likewise, sequencing of 916 foxtail millet varieties led to the identification of 2.58 million SNPs, which were used to develop a haplotype map of the foxtail millet genome (Jia et al. 2013b). Recently, whole genome re-sequencing has been employed in rice in the MutMap (Abe et al. 2012) and MutMap+ (Fekih et al. 2013) methods for identifying mutations contributing to quantitative traits; likewise, QTL-seq, which is also based on whole genome re-sequencing, has been applied for rapid QTL mapping in rice (Takagi et al. 2013).
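Pooled re-sequencing methods of this kind commonly summarise each variant site by the fraction of reads that carry the non-reference allele, often called the SNP index, and then average it along the chromosome. The sketch below illustrates that idea only; it is not the published MutMap or QTL-seq pipeline, and the read counts, window size and function names are hypothetical.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads carrying the non-reference allele (None if no coverage)."""
    if total_reads == 0:
        return None
    return alt_reads / total_reads

def window_means(sites, window_bp, step_bp):
    """Average SNP index in windows along one chromosome.

    sites: list of (position, snp_index) tuples sorted by position.
    Returns (window_start, mean_index) for every window containing a site.
    """
    if not sites:
        return []
    means = []
    last_pos = sites[-1][0]
    start = 0
    while start <= last_pos:
        values = [idx for pos, idx in sites if start <= pos < start + window_bp]
        if values:
            means.append((start, sum(values) / len(values)))
        start += step_bp
    return means

# Hypothetical read counts: (position, reads with alternate allele, total reads).
counts = [(1000, 5, 10), (2500, 6, 12), (6000, 19, 20), (7000, 10, 10)]
sites = [(pos, snp_index(alt, total)) for pos, alt, total in counts]
for start, mean_idx in window_means(sites, window_bp=5000, step_bp=5000):
    print(start, round(mean_idx, 3))
```

In a pool of mutant-type segregants, windows whose average approaches 1 are candidates for the causal region, whereas unlinked regions are expected to stay near 0.5.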

Whole Transcriptome Sequencing

Transcriptome sequencing helps to reveal the functional aspects of the genes of an organism (Nagalakshmi et al. 2008; O’Neil and Emrich 2013). NGS technology facilitates sequencing of the whole transcriptome, which assists in unravelling the functions of the whole mRNA of an organism (Malonae and Oliver 2011; Chu and Corey 2012). In switchgrass, NGS enriched the information on 24,000 ESTs, and 90 % of the gene space was covered by transcriptome sequencing (Wang et al. 2012a). Transcriptome sequencing of radish with Illumina paired-end technology provided 61,554 unigenes and enriched the resources of EST-based SSRs (Wang et al. 2012b); likewise, EST-derived SSRs were obtained in rubber tree from transcriptome sequencing that generated 22,756 unigenes (Li et al. 2012b). In peanut, whole transcriptome sequencing identified 26,048 unigenes, and 8,817 unigenes were characterised (Wu et al. 2013); in another study, 8,252 unigenes were annotated (Zhang et al. 2012b). Whole transcriptome sequencing has facilitated the discovery of novel SNPs in wheat (Duan et al. 2012), melon (Blanca et al. 2012), black cottonwood (Geraldes et al. 2011), Brassica napus (Trick et al. 2009), Eucalyptus (Novaes et al. 2008) and lodgepole pine (Parchman et al. 2010). Similarly, 192 EST-SSR markers were identified in lentil from transcriptome sequencing (Kaur et al. 2011), as were EST-derived SSRs in field pea and fava bean (Kaur et al. 2012). In chickpea, 4,072 SSRs and 36,446 SNPs were identified from the transcriptome sequence of the wild chickpea C. reticulatum (PI489777) (Jhanwar et al. 2012). Likewise, transcriptome sequencing in kabuli chickpea offers a repertoire for the development of functional markers (Agarwal et al. 2012).

Molecular Marker Discovery

NGS technology has played a key role in the development of useful high-throughput markers such as SSRs, ESTs and SNPs, and some important examples are discussed here. NGS has assisted in the development of 246 SSRs in Prunus virginiana (Wang et al. 2012c) and 94 reproducible novel SSRs in fava bean (Yang et al. 2012a), the detection of chromosome arm–specific microsatellite markers in wheat (Nie et al. 2012), SSR markers for distinguishing aphids in soybean (Jun et al. 2011) and SSRs in peanut (Zhang et al. 2012b), as given in Table 9. The role of NGS in the identification of SNP markers is particularly worth mentioning. NGS-led re-sequencing of 8 wheat genotypes facilitated the development of exome-based, codominant SNP markers for differentiating homozygotes and heterozygotes in wheat (Allen et al. 2013). NGS has also led to the identification of SNPs in lentil (Sharpe et al. 2013), 20,000 SNPs in Brassica rapa (Bus et al. 2012), 8,207 SNP markers and five markers linked with anthracnose disease resistance in lupin (Yang et al. 2012a) and 1,022 and 4,543 SNPs, respectively, in chickpea (Gaur et al. 2012; Azam et al. 2012). In potato, 575,340 SNPs were identified from three cultivars, “Atlantic”, “Premier Russet” and “Snowden”, and 96 SNPs were used for measuring allelic diversity (Hamilton et al. 2011). In tomato, 8,784 SNPs derived from transcriptome sequences were detected and used for constructing high-density linkage maps for three interspecific F2 populations (Sim et al. 2012). Zou et al. (2013) identified 1,953 SNPs associated with QTLs contributing to 4-methylthio-3-butenyl glucosinolate content in roots of radish, Raphanus sativus L. By applying NGS, 2,659 SNPs were discovered in durum wheat (Trebbi et al. 2011) and 1,050 SNPs were identified in common bean (Hyten et al. 2010a); further, 1,790 SNPs disclosed by NGS in soybean were used for developing a high-resolution genetic map (Hyten et al. 2010b).
In Aegilops tauschii, a wild relative of wheat that lacks a reference genome sequence, 497,118 genome-wide SNPs have been discovered using NGS (You et al. 2011). In alfalfa, NGS identified 40,661 candidate SNPs that are useful for association mapping and high-resolution mapping (Han et al. 2011). The application of NGS for developing SNP markers and their uses have been thoroughly described by Kumar et al. (2012).

Genotyping by Sequencing (GBS)

Advances in NGS technology have paved the way for developing high-density linkage maps that cover all linkage groups and anchor thousands of high-throughput markers. These markers, mainly SNPs obtained by re-sequencing genomes on NGS platforms, are positioned on previously developed linkage maps and enhance their resolution in diploid and polyploid species (Oliver et al. 2011; Eckert et al. 2009). An SNP-based high-throughput linkage map has been developed for mapping recessive mutant loci in maize using RILs derived from B73 x Mo17 (Liu et al. 2010). In tomato, 8,784 SNPs were developed from transcriptome sequences generated by NGS. These SNPs enabled the development of high-density linkage maps for three interspecific F2 populations, EXPEN 2000, EXPEN 2012 and EXPIM 2012, with average marker bin intervals of 1.6 cM, 0.9 cM and 0.8 cM, respectively (Sim et al. 2012). SSR and SNP markers developed by NGS assisted in constructing a high-density linkage map of 1,227 markers positioned on 9 linkage groups and covering 1197.9 cM in brassica (Wang et al. 2012d). Moreover, NGS together with sequence-related amplified polymorphism (SRAP) markers aided in developing a highly saturated, ultradense genetic map of B. rapa containing 9,177 SRAP markers, 1,737 integrated unique Solexa paired-end sequences, 46 SSRs and 10,960 independent genetic loci (Li et al. 2011b). Similarly, a linkage map was developed from 114 doubled haploid lines, harbouring 415 INDELS and 92 SSR markers, covering 1234.2 cM and positioning 152 scaffolds on the chromosomes (Wang et al. 2011a). In chickpea, one of the highest-resolution maps has been developed using NGS, comprising 1,063 markers and covering a map length of 1808.7 cM (Gaur et al. 2012).
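The map statistics quoted above can be compared on a common scale by dividing map length by marker number. The short sketch below does this for the brassica and chickpea maps using the figures given in the text. It is a rough approximation, because it ignores the number of linkage groups and any co-segregating markers, and the function name is hypothetical.

```python
def mean_interval_cm(n_markers, map_length_cm):
    """Rough average spacing between markers (map length divided by marker count)."""
    return map_length_cm / n_markers

# Marker counts and map lengths as quoted in the text.
maps = {
    "brassica (Wang et al. 2012d)": (1227, 1197.9),
    "chickpea (Gaur et al. 2012)": (1063, 1808.7),
}
for name, (n_markers, length_cm) in maps.items():
    print(name, round(mean_interval_cm(n_markers, length_cm), 2), "cM per marker")
```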

Owing to the lower cost of sequencing with NGS technology, it is used to sequence entire populations for trait mapping while identifying markers across the genome, an approach called GBS (Elshire et al. 2011). This approach has been applied successfully in rice by re-sequencing 150 RILs derived from indica and japonica parents, leading to the discovery of 1,226,791 SNPs (Huang et al. 2009). A high-quality physical map harbouring the QTL responsible for the green revolution has been developed by re-sequencing the whole genomes of 128 chromosome segment substitution lines (CSSLs) of rice (Xu et al. 2010). Low-density GBS has been reported in barley (Chutimanitsakun et al. 2011). The GBS approach has been applied in maize and barley (Elshire et al. 2011), with 25,185 bi-allelic SNPs detected in maize, and similarly in barley and wheat (Poland et al. 2012). In Arabidopsis, sequencing of pooled mutants from segregating populations, called SHOREmap (Schneeberger et al. 2009), is a GBS strategy used for identifying causal mutations (Galvão et al. 2012). Likewise, next-generation mapping (NGM) (Austin et al. 2011), an extension of next-generation genomic sequencing, has been employed to map mutations directly from pooled F2 populations in Arabidopsis, and led to the detection of three genes associated with cell wall biology (Austin et al. 2011).

Epigenetic Analysis and Discovery of Small Noncoding/Regulatory RNAs

Epigenetics refers to changes in gene expression that occur without any alteration in the DNA sequence (Liang et al. 2009; Bird 2007). Epigenetic changes are triggered by small RNAs, which cause changes in DNA methylation and histone modification (acetylation, methylation, phosphorylation and ubiquitinylation) (Simon and Meyers 2011; Rival et al. 2010). Epigenetic mechanisms allow a plant to change its gene expression and produce a particular phenotype in response to environmental changes (Piferrer 2013). The development of NGS quickly attracted the attention of researchers working on epigenetic analysis, and its wide application in this area has been discussed recently (Meaburn and Schulz 2011). Given the role of cytosine methylation in epigenetic regulation (Laird 2010), NGS has been used for mapping cytosine methylation in Arabidopsis (Cokus et al. 2008; Lister et al. 2008). The tool has been of benefit in describing the plant epigenome (Simon and Meyers 2011), mapping the methylation pattern and the regulation of methylation throughout the genome (Lister and Ecker 2009) and generating epigenetic markers across the genome (Liang et al. 2009). In maize, NGS has helped to reveal the relationship between the epigenome and the transcriptome (Elling and Deng 2009; Eckardt 2009); similarly, it has assisted in the development of a methylome map in Arabidopsis (Zhang et al. 2006). Additionally, the technique is used for the discovery of small RNAs, including microRNA (miRNA), small interfering RNA (siRNA), transfer RNA (tRNA) and ribosomal RNA (rRNA), which play key roles in post-transcriptional gene expression (Xie et al. 2004; Lu et al. 2005; Filipowicz et al. 2008; Morozova and Marra 2008). NGS enabled the discovery of 14 novel and 22 conserved miRNA families in peanut that are responsible for growth and development in response to environmental stress (Zhao et al. 2010), as given in Table 9; similarly, NGS aided in discovering sRNAs involved in fruit development and ripening in tomato (Mohorianu et al. 2011). Using NGS, the miRNAs miR156, miR159, miR172, miR167, miR158 and miR166, which are associated with seed development and maturation, have been identified in Brassica sp. (Huang et al. 2013a). In sugarcane, sequencing of sRNAs by NGS detected 26 conserved miRNA families involved in regulating axillary bud outgrowth (Ortiz-Morea et al. 2013). Similarly, genome-wide sequencing of small RNAs in rice by NGS assisted in identifying 292 miRNAs specific to pollen developmental stages (Wei et al. 2011). In barley, the expression profiles of miRNAs and other noncoding RNAs were analysed in relation to phosphorus requirement, and 221 conserved and 12 novel miRNAs were detected by sequencing sRNAs with Illumina’s NGS (Hackenberg et al. 2013). Additionally, NGS analysis led to the detection of 66 miRNA genes contributing to leaf growth in response to drought in Brachypodium distachyon (Bertolini et al. 2013).

RAD Sequencing and Reduced Representation Sequencing

The application of NGS together with restriction enzymes has given rise to several new techniques, namely restriction site-associated DNA (RAD) sequencing, reduced representation sequencing based on reduced representation libraries (RRL) and complexity reduction of polymorphic sequences (CRoPS), which focus on sequencing targeted regions of the genome rather than the whole genome (Davey et al. 2011). RAD sequencing is an innovative method for SNP discovery and high-throughput genotyping (Miller et al. 2007; Baird et al. 2008; Davey and Blaxter 2011; Davey et al. 2012). RAD generates two types of markers: codominant markers in the targeted region close to the restriction endonuclease site and dominant markers within the restriction site itself (Deschamps et al. 2012b). RAD sequencing aided in the development of 347 de novo SNPs, which facilitated the construction of a 1,390 cM linkage map in eggplant from the cross “305E40” × “67/3” and enabled the identification of seven QTLs for anthocyanin accumulation (Barchi et al. 2012). A high-density linkage map has been developed from F1 populations in grape using 1,841 SNPs obtained through RAD sequencing (Wang et al. 2012e). RAD sequencing aided in finding 20,000 SNPs and 125 insertions and deletions in Brassica napus (Bus et al. 2012); nearly 10,000 SNPs, 1,000 INDELS and 2,000 putative SSRs in eggplant (Barchi et al. 2011); and 34,000 SNPs and nearly 800 INDELS in C. cardunculus (Scaglione et al. 2012). RAD sequencing has been used to develop markers tagging an anthracnose disease resistance gene in lupin (Yang et al. 2012a). In Arabidopsis, genome-wide genotyping has been carried out using 2b-RAD (Wang et al. 2012f). RAD sequencing has also permitted the mapping of QTLs involved in fatty acid synthesis in perennial ryegrass (Lolium perenne L.) (Hegarty et al. 2013).
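The logic of sampling only the sequence next to restriction sites can be illustrated with an in silico digest. The sketch below scans a toy sequence for the EcoRI recognition site (GAATTC) and reports the flanking sequence on either side of each site, which corresponds conceptually to the tags sequenced in RAD. The sequence, tag length and function names are hypothetical, and the sketch ignores the exact cut position within the site and the reverse strand.

```python
def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of a recognition site."""
    seq, site = seq.upper(), site.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)
    return positions

def rad_tags(seq, site, tag_len):
    """Return (position, upstream_tag, downstream_tag) for each recognition site."""
    seq = seq.upper()
    tags = []
    for p in find_sites(seq, site):
        upstream = seq[max(0, p - tag_len):p]
        downstream = seq[p + len(site):p + len(site) + tag_len]
        tags.append((p, upstream, downstream))
    return tags

# Toy sequence with two EcoRI sites; real tags are tens of bases long.
toy = "ACGTGAATTCTTAGGCCGAATTCAAAT"
for position, up, down in rad_tags(toy, "GAATTC", tag_len=4):
    print(position, up, down)
```

Counting sites in this way for different enzymes gives a first estimate of how many loci a given RAD design would sample from a genome.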

The RRL approach has been applied for discovering SNPs in maize (Gore et al. 2009; Deschamps et al. 2012a); in soybean (Hyten et al. 2010a), including 4,294 to 14,550 SNPs from four accessions (Varala et al. 2011); in jointed goat grass (You et al. 2011); in grape (Myles et al. 2010); in common bean (Hyten et al. 2010b) and in rice (Monson-Miller et al. 2012). A new approach based on reduced representation sequencing, Restriction Enzyme Site Comparative Analysis (RESCAN), has been used for SNP detection in rice (Kim and Tai 2013). Likewise, reduced representation sequencing helped in identifying SNPs residing in a major QTL region contributing to pod shattering resistance in rapeseed (Hu et al. 2012) and in mapping QTLs for flowering time and petiole length in Arabidopsis (Seymour et al. 2012). Sequencing of RRLs contributed to the detection of QTLs (qLpPg1, qLpPg2 and qLpPg3) for stem rust resistance in Lolium perenne (Pfender et al. 2011). RAD-based sequencing has facilitated the development of GBS libraries for maize and barley, providing opportunities for the breeding community to practise genomic selection in the future. Apart from the above, other applications of NGS in association mapping, evolutionary relationship studies, diversity analysis, alien introgression studies and organelle sequencing in plants have been elaborated (Varshney et al. 2009).

Association Mapping for Tagging Genes/Complex QTLs and Exploiting Plant Natural Variation

Traditionally, genetic maps were constructed and QTLs were mapped using biparental populations. The details of QTL mapping, from developing a mapping population from contrasting parents, through screening the parents and segregating progenies for polymorphism in the traits, to detecting polymorphic markers by genotyping, have been reviewed (Collard et al. 2005). An alternative method, association mapping (AM) or linkage disequilibrium mapping, has been developed to exploit the natural variation present in germplasm repositories and to facilitate the discovery of QTLs by analysing marker–trait associations (Zhu et al. 2008). Association studies offer high-resolution mapping and identification of the genes accounting for phenotypic variation, along with high marker coverage, in comparison with biparental linkage mapping (Thornsberry et al. 2001; Remington et al. 2001; Wang et al. 2005). This powerful method resolves complex QTLs by using historical and evolutionary recombination events at the population level (Nordborg and Tavare 2002; Risch and Merikangas 1996). AM is practised using two approaches: (1) the candidate gene-based approach and (2) the whole genome-based approach, or genome-wide association study (GWAS) (Zhu et al. 2008).
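Because association mapping depends on linkage disequilibrium between markers and causal variants, a worked calculation of the standard r² statistic may be helpful. The sketch below computes r² between two biallelic loci from phased haplotypes coded as 0 and 1. The haplotype data and the function name are hypothetical, and real analyses would also have to account for population structure.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic loci from (allele_A, allele_B) pairs coded 0/1.

    Returns None when either locus is monomorphic, since r^2 is then undefined.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom

# Hypothetical haplotypes: complete association versus independence.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(linked))
print(ld_r2(unlinked))
```

The faster r² decays with physical distance in a germplasm panel, the finer the mapping resolution, but the more markers are needed to cover the genome.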

Initial landmark study of AM was applied in oat (Beer et al. 1997), in rice (Virk et al. 1996) and subsequently in other plants (Thornsberry et al. 2001). The successful application of this technique in different plant species for various traits has been depicted (Zhu et al. 2008). AM offers the benefits of detection of marker–trait relationship ranging from complex qualitative to quantitative traits of interest in various crop plants. AM was implemented for analysing kernel size and milling quality using 36 SSR markers in wheat (Breseghello and Sorrells 2006). The stem rust resistance loci has been identified by association mapping deploying DArT and SSR markers in 276 spring wheat lines (Yu et al. 2011b), and Sr13gene/QTL confirmed in durum wheat earlier using biparental mapping (Letta et al. 2013). Similarly, candidate-based association mapping aided in identification of sclerotium head rot resistance QTLs in sunflower (Fusari et al. 2012), and linkage disequilibrium method enabled to correlate candidate gene marker and resistance to Verticillium dahliae QTL in tetraploid potato (Simko et al. 2004). Moreover, candidate gene AM has been applied for analysing drought tolerance in 192 diverse perennial ryegrass (Lolium perenne L.) (Yu et al. 2013). Thirteen QTLs accounting for spot blotch in wild barley has been identified using SSR and SNP markers by association analysis (Roy et al. 2010); similarly four QTLs diagnosed for Septoria speckled leaf blotch resistance using 3,840 lines of barley (Zhou and Steffenson 2013). In wheat this method has enabled to identify marker–trait association with yield under drought condition (Dodig et al. 2012). Association study led to detect 7QTLs account for preharvest sprouting in wheat using 1,166 DArT and SSR markers in 198 genotypes (Kulwal et al. 2012). In wheat, six candidate genes conferring to flowering time have been unfolded by AM (Rousset et al. 2011). 
Additionally, AM has aided the detection of chromosomal QTL regions underlying earliness (Le Gouis et al. 2012), main-effect QTLs for quality traits such as kernel weight and protein content (Reif et al. 2011), multiple loci conferring aluminium resistance (Raman et al. 2010) and the Russian wheat aphid resistance (RWA2) gene in wheat (Peng et al. 2009). In potato, AM has uncovered invertase and starch phosphorylase alleles contributing to tuber quality (Li et al. 2013) as well as QTLs for tuber sugar content and chip quality (Draffehn et al. 2010) and for tuber starch content and yield (Li et al. 2008). AM has therefore become an alternative to biparental mapping, the classical approach, for dissecting complex QTLs related to yield, as documented in barley (Kraakman et al. 2004). To combine the benefits of linkage mapping and AM, another robust tool, nested association mapping (NAM), was first applied in maize for the dissection of complex traits. Its advantages and mapping resolution have been discussed thoroughly (Yu et al. 2008). Complex QTLs associated with flowering time have been elucidated by association analysis of 5,000 RILs of the maize NAM population (Buckler et al. 2009) and in barley (Stracke et al. 2009). Further flowering-time loci have been identified: 40 QTLs in sorghum through backcross nested association mapping (BC-NAM) (Mace et al. 2013) and the Dwarf8 (d8) locus in maize (Larsson et al. 2013). NAM has also been applied to gain insight into the genetic basis of leaf architecture in maize (Tian et al. 2011). Thirty-two QTLs contributing to resistance against southern leaf blight (SLB) in maize were detected by applying NAM to 5,000 maize RILs (Kump et al. 2011), and similarly 32 QTLs associated with northern leaf blight in maize were identified using NAM (Poland et al. 2011). A NAM association study revealed variation at the bx1 gene affecting DIMBOA content in maize (Butrón et al. 2010).
Likewise, the use of NAM for improving kernel composition in maize has been well documented (Cook et al. 2012). However, AM suffers from a major drawback: the confounding effect of population structure, which gives rise to false positives (Yu et al. 2006; Zhao et al. 2007; Ingvarsson and Street 2011).
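
The confounding effect of population structure can be illustrated with a small sketch in Python. The data are invented for illustration: two hypothetical subpopulations differ in both allele frequency and trait mean, while the marker has no effect on the trait within either subpopulation. Centring genotype and phenotype within each subpopulation is used here only as the simplest possible correction. It is an assumption of this sketch and not the method of the studies cited above, which rely on more elaborate statistical models.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def centre_within_groups(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    totals = {}
    for v, g in zip(values, groups):
        s, n = totals.get(g, (0.0, 0))
        totals[g] = (s + v, n + 1)
    return [v - totals[g][0] / totals[g][1] for v, g in zip(values, groups)]


# Hypothetical panel of eight lines (assumption, for illustration only).
subpop = ["A", "A", "A", "A", "B", "B", "B", "B"]
genotype = [1, 1, 1, 0, 0, 0, 0, 1]         # allele 1 is common in A, rare in B
phenotype = [10, 11, 12, 11, 4, 5, 6, 5]    # A has the higher trait mean

# Naive test: the marker appears associated with the trait (r is about 0.49).
naive_r = pearson(genotype, phenotype)

# After removing subpopulation means, no association remains (r = 0.0).
adjusted_r = pearson(
    centre_within_groups(genotype, subpop),
    centre_within_groups(phenotype, subpop),
)
print(round(naive_r, 2), adjusted_r)
```

The naive correlation arises solely because allele frequency and trait mean both differ between the two groups, which is the mechanism behind the false positives noted above.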

More recently, GWAS has been developed to inspect marker–QTL associations across the whole genome (Risch and Merikangas 1996). GWAS is a population-based approach that offers the advantage of studying associations between SNPs and phenotypes across the genome in unrelated individuals (Mitchell-Olds 2010). It has been applied to study 107 phenotypes in 200 Arabidopsis inbred lines (Atwell et al. 2010). GWAS is also more efficient than the biparental mapping strategy at detecting QTLs in natural or pre-existing lines. This approach has been used to find QTLs for beta-glucan in oat (Newell et al. 2012) and for common bacterial blight resistance in Phaseolus vulgaris (Shi et al. 2011). GWAS has facilitated marker–trait association analysis for various agronomic and disease resistance traits in barley (Pasam et al. 2012; Berger et al. 2013) and for saccharification yield in sorghum (Wang et al. 2011a), the detection of allelic variation in natural rice germplasm (Han and Huang 2013) and the mapping of 15 traits with 1,536 SNPs in 500 barley lines (Cockram et al. 2010). Genome-wide association analysis in Brassica napus revealed the contribution of 61 loci to tocopherol content and composition (Wang et al. 2012g). Recently, GWAS identified 512 loci associated with 47 agronomic traits in foxtail millet (Jia et al. 2013b). With regard to abiotic stress, GWAS in 125 maize inbred lines found 43 SNPs associated with chilling tolerance (Huang et al. 2013c). AM can therefore be used for genome-wide diversity studies, shedding light on marker–trait associations and the discovery of genes/QTLs. It also has potential for QTL dissection and cloning (Salvio and Tuberosa 2007). To facilitate accurate detection of genes/QTLs together with their functions for overall crop improvement, an integrated approach combining all the tools of functional genomics is depicted in Fig. 1.
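
The genome-wide scan underlying GWAS can be sketched, in its simplest form, as a single-marker regression repeated over every marker. The following Python example uses hypothetical marker names, allele dosages (0, 1 or 2 copies) and phenotype values, and it ranks markers by the proportion of phenotypic variance explained (R²). It is a toy illustration under these assumptions; real studies test many thousands of markers and must additionally assess significance, correct for multiple testing and account for population structure.

```python
def single_marker_fit(dosage, phenotype):
    """Regress phenotype on allele dosage; return (slope, r_squared)."""
    n = len(dosage)
    mean_g = sum(dosage) / n
    mean_p = sum(phenotype) / n
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(dosage, phenotype))
    sxx = sum((g - mean_g) ** 2 for g in dosage)
    syy = sum((p - mean_p) ** 2 for p in phenotype)
    if sxx == 0 or syy == 0:
        raise ValueError("marker and phenotype must both vary")
    return sxy / sxx, sxy * sxy / (sxx * syy)


def scan(markers, phenotype):
    """Fit every marker in turn and sort the results by R², best first."""
    results = [
        (name, *single_marker_fit(dosage, phenotype))
        for name, dosage in markers.items()
    ]
    return sorted(results, key=lambda row: row[2], reverse=True)


# Hypothetical data for six lines and three markers (assumption).
markers = {
    "m1": [0, 2, 0, 2, 0, 2],
    "m2": [0, 0, 1, 1, 2, 2],   # dosage tracks the phenotype closely
    "m3": [2, 0, 1, 1, 0, 2],
}
phenotype = [5.0, 5.2, 6.1, 5.9, 7.0, 7.2]

ranked = scan(markers, phenotype)
best_name, best_slope, best_r2 = ranked[0]
for name, slope, r2 in ranked:
    print(name, round(slope, 3), round(r2, 3))
```

With these invented values, marker m2 ranks first, with an estimated effect of about one trait unit per allele copy and an R² of about 0.98.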

Fig. 1 Integrated approaches of functional genomics for developing improved crop cultivars

Perspectives and Future Direction

The advent of fast-evolving DNA sequencing technology has given a new direction to genomics by enabling whole-genome sequencing, the extraction of valuable genomic information from non-model crops and the re-sequencing of model crops quickly and at manageable cost. Falling sequencing costs have been accompanied by the development of next-next or third-generation sequencing technologies such as single-molecule real-time (SMRT™) sequencing, which can generate longer sequence reads (Thudi et al. 2012). Additionally, high-throughput technologies have led to the development of ultradense linkage maps, transcript maps, SHOREmap and MutMap for the plant species mentioned earlier, to capture the desired genes/QTLs of interest. Nevertheless, decoding the function of this sequence information still lags behind its rapid accumulation. To date, rice and Arabidopsis are the only plant species for which the functions of most genes are known. Progress in functional genomics research will thus benefit the plant science community by unlocking the function of all genes of most plant species. Ultimately, this will improve crop breeding programmes and thereby deliver cultivars with enhanced tolerance to biotic and abiotic stresses, mitigating the challenges of food security in the future.