4.1 Introduction

Finger millet [(Eleusine coracana (L.) Gaertn.); family: Poaceae; subfamily: Chloridoideae; allotetraploid (2n = 4x = 36, AABB); A genome donor: E. indica; B genome donor: unknown; assembled genome size: 1.2 Gb; genome size based on flow cytometry: 1.5 Gb; self-pollinated] was domesticated from its wild progenitor, E. coracana subsp. africana, in the Ethiopian highlands and western Uganda more than 5,000 years ago (Hilu 1988; Liu et al. 2011, 2014; Hittalmani et al. 2017; Hatakeyama et al. 2018). Finger millet, also known as ragi, is cultivated in semiarid, arid, tribal, and hilly regions of Africa and India. India is the secondary center of finger millet diversity, where the crop was introduced in 3000 BC in the Western Ghats. Finger millet is a crop of immense importance in view of its climate resilience and its rich nutritional and nutraceutical properties (high protein; essential amino acids particularly enriched in lysine and methionine; minerals, especially high calcium content; vitamins; dietary fiber; phytochemicals; glycoproteins; gluten free; low glycaemic index) (Saleh et al. 2013; Chandra et al. 2016; Kumar et al. 2016a). The crop is also a storehouse of vital genomic resources because of its excellent adaptability to harsh conditions. Large germplasm collections of finger millet are available in genebanks [ICRISAT genebank: ~6000 accessions (http://exploreit.icrisat.org/profile/Small%20millets/187); National Genebank, ICAR-NBPGR, New Delhi, India: ~11,352 accessions (http://www.nbpgr.ernet.in:8080/PGRPortal/(Shtfofsuczs3o44ns55eh2h45))/SimpleSearch.aspx)], which need to be comprehensively characterized in a high-throughput manner to determine the level of phenotypic, genotypic, and nutritional diversity, as well as adaptability under difficult conditions, in support of future food and nutritional security, agrarian sustainability, and biotic and abiotic stress tolerance. 
These large germplasm holdings of genebanks are the repositories of genetic variation and novel alleles/traits, which were earlier characterized through morphological markers to know the level of genetic diversity and develop core/minicore sets (Upadhyaya et al. 2006, 2010). These huge collections are now being characterized through molecular markers and next-generation sequencing (NGS) technologies to develop molecular cores, identify candidate genes and marker-trait associations. So, there is a paradigm shift in the way we characterize and utilize the available genetic diversity/allelic variation in a particular crop.

A number of reports from the genetics era are available wherein characterization of diversity was carried out and variation levels and genetic structure were assessed using different genetic markers, including classical markers such as morphological, cytological, and biochemical markers, and DNA/molecular markers including hybridization-based markers such as restriction fragment length polymorphisms (RFLPs) and polymerase chain reaction (PCR)-based markers such as random amplified polymorphic DNA (RAPDs); amplified fragment length polymorphisms (AFLPs); inter-simple sequence repeats (ISSRs); simple sequence repeats (SSRs); sequence-related amplified polymorphism (SRAP); start codon targeted (SCoT), etc.

Today, however, high-throughput generation of genetic and genomic data, together with bioinformatic analysis, is the backbone of characterization for enhanced utilization of genetic resources. In the era of genomics, information has been generated in finger millet in the form of whole-genome sequences (Hittalmani et al. 2017; Hatakeyama et al. 2018), genome-wide discovery of molecular markers and genes, genotyping-by-sequencing (GBS), and genome-wide association studies (GWAS), which together represent a paradigm shift from genetics to genomics. These genomics approaches need to be augmented with associative transcriptomics, genomic selection, pan-genome sequencing, etc. to dissect complex traits associated with agronomic performance, stress tolerance, and nutritional aspects and to promote haplotype-based breeding in finger millet.

4.2 Genetics to Genomics: Characterization of Diversity in Finger Millet

Characterization of diversity is the foundation of all crop improvement programs. The factors responsible for creating such variation/diversity are genetic drift and/or recombination, selection, migration, and mutation. Evaluation of the genetic diversity of plant genetic resources is crucial because of the burgeoning population, food insecurity, the risks associated with the narrowing genetic base of existing cultivars, and climate change. Finger millet is one crop that can fulfill the criteria of nutritional and future food security as well as climate resilience, and it needs to be thoroughly characterized for its enhanced utilization. A number of reports are available in which classical markers (morphological, cytological, and biochemical), DNA/molecular markers, and next-generation sequencing approaches have been used to characterize diversity in finger millet, showing a paradigm shift from genetics to genomics.

Morphological markers, though easily monitored, are limited in number and affected by the environment, which limits their usefulness. Similar restrictions apply to isozyme markers (Andersen and Lubberstedt 2003). Among morphological, isozyme (biochemical), and molecular markers, the latter give superior power of detection because they recognize genotypic differences, are abundant in the genome, show high levels of polymorphism, and are not influenced by environmental factors.

Molecular characterization based on DNA molecular markers can be carried out using random DNA markers (RDMs: RAPD, RFLP, AFLP, ISSR, SSR, etc.), gene-targeted markers (GTMs), and functional markers (FMs).

In the genomic era, NGS approaches including whole-genome sequencing (WGS) and resequencing, RNA sequencing, metagenomics, high-throughput genotyping, pan-genome sequencing, etc., together with bioinformatic analysis tools, are available to characterize germplasm at the sequence level and to best utilize it for crop improvement. The most common complexity-reduction high-throughput genotyping technologies are restriction-site-associated DNA (RAD) sequencing, GBS, and diversity array technology (DArT), all of which are frequently used in diversity analysis.

Characterization of all the germplasm holdings of a particular crop in a genebank using genotyping at the sequence level will answer many questions. Duplicates can be identified to mark the redundant germplasm and to decrease the load on genebanks in terms of usage of different resources (financial, manpower, etc.). Diversity level and population structure of the whole germplasm could be assessed and the molecular cores/subsets representing maximum diversity of the whole set could be designed. These cores can then be easily handled for their multilocation morpho-agronomic performance, biotic and abiotic stress resistance evaluation in the field as well as under artificial conditions. These can also be evaluated for quality traits for adding nutritional value to them. Once the whole germplasm set or molecular cores or subsets are characterized for different traits of economic importance, the genotyping along with phenotypic/biochemical data could be used for GWAS to identify SNPs, gene(s), or quantitative trait loci (QTLs) linked to these traits.
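The idea of a molecular core that represents maximum diversity can be illustrated with a simple greedy heuristic: repeatedly add the accession that contributes the most alleles not yet captured. The Python sketch below is only an illustration under stated assumptions; it is not the procedure used in any study cited in this chapter, and the accession names and allele labels are invented toy data.

```python
def greedy_core(alleles_by_accession, core_size):
    """Pick up to core_size accessions that together capture the most alleles.

    alleles_by_accession: dict mapping accession name -> set of allele labels.
    Returns (selected accessions, fraction of all alleles captured).
    """
    all_alleles = set().union(*alleles_by_accession.values())
    if not all_alleles:
        return [], 0.0
    captured = set()
    core = []
    remaining = dict(alleles_by_accession)
    while remaining and len(core) < core_size:
        # Accession adding the most new alleles; ties go to the first name.
        best = max(sorted(remaining), key=lambda a: len(remaining[a] - captured))
        if not remaining[best] - captured:
            break  # no remaining accession adds anything new
        captured |= remaining.pop(best)
        core.append(best)
    return core, len(captured) / len(all_alleles)


# Hypothetical toy data: labels are "locus:allele" strings.
toy = {
    "acc1": {"L1:a", "L2:a", "L3:a"},
    "acc2": {"L1:a", "L2:a", "L3:a"},  # same profile as acc1 (a duplicate)
    "acc3": {"L1:b", "L2:a", "L3:c"},
    "acc4": {"L1:a", "L2:b", "L3:a"},
}
core, coverage = greedy_core(toy, core_size=2)
print(core, round(coverage, 3))
```

In this toy case the duplicate profile is never selected, because it adds no new alleles once its twin is in the core. Real core-set construction usually also balances geographic origin, race, and phenotypic data, which this sketch ignores.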

One such report (Bharati 2011) is available, in which the global composite collection of finger millet was characterized molecularly as well as for morpho-agronomic traits, and a reference set of the 300 most genetically diverse accessions, capturing 89.2% of the alleles of the whole collection (959 accessions), was constituted using 20 SSR markers. A mean of 11.6 alleles per locus, with 121 common alleles and 110 rare alleles (at 1%), and a mean gene diversity of 0.560 were reported. Wild spontanea race accessions showed the maximum gene diversity of 0.611. A large number of alleles, ranging from 10 to 21, were observed at the UGEP81, UGEP10, UGEP102, UGEP26, and UGEP77 SSR loci. The UGEP3, UGEP5, UGEP31, and UGEP104 SSR loci detected large numbers of multiple alleles. High PIC values of more than 0.636 were reported for the UGEP15, UGEP5, UGEP18, UGEP102, UGEP12, and UGEP77 SSRs. Race- and region-specific unique alleles were also reported. The loci UGEP56 (LG6) and UGEP8 (LG3) showed strong marker-trait associations (MTAs) with days to 50% flowering. Accessions were identified for high grain yield, early flowering, more fingers, fodder yield, ear head length, basal tiller number, and high iron and zinc content for use in finger millet improvement programs. More such studies, involving entire genebank germplasm and NGS-based genotyping along with morpho-agronomic and biochemical characterization, are the need of the hour to characterize and utilize finger millet germplasm.
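Summary statistics such as gene diversity and PIC are computed per locus from allele frequencies. The sketch below assumes the standard definitions (Nei's gene diversity, 1 − Σp², and the PIC of Botstein and co-workers); the cited study does not state its formulas here, so this is an assumption, and the SSR fragment sizes are invented.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(allele_calls):
    """Relative frequency of each allele among the observed calls at one locus."""
    counts = Counter(allele_calls)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def gene_diversity(freqs):
    """Nei's gene diversity (expected heterozygosity): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    cross = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1.0 - homozygosity - cross


# Hypothetical SSR fragment sizes (bp) scored at a single locus.
calls = [150] * 5 + [154] * 3 + [158] * 2
freqs = allele_frequencies(calls)
print(round(gene_diversity(freqs), 4), round(pic(freqs), 4))
```

PIC is always at most equal to gene diversity, and the two converge as the number of alleles grows, which is why highly multi-allelic SSR loci tend to show high values of both.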

4.3 Morphological Markers

The establishment of genetic and genomic resources is a crucial step for crop improvement for desired traits (Ceasar et al. 2018). With the onset of studies on diversity analysis in crops, morphological markers have emerged as potential markers that are dependent on agromorphological characters. Morphological characteristics could facilitate the determination of the agronomic parameters as well as the taxonomic classification of plant species (Ortiz et al. 2008). These markers also play a key role in the maintenance and management of plant genetic resources (PGR), as well as in the Plant Breeders’ Rights (PBR) system (Babic et al. 2016).

A number of reports are available in which morphological markers have been used to characterize genetic diversity in finger millet using either a smaller or a larger set of germplasm. Upadhyaya et al. (2006) developed a core subset of 622 accessions representing the diversity of the entire collection of finger millet germplasm, based on geographical origin and data on 14 quantitative traits from the global collection of 5,940 accessions (Africa, Asia, America, Europe, and unknown origin) available at the genebank at ICRISAT, Patancheru, India. All five races, as well as breeding materials, improved cultivars, landraces, and wild types, were represented in the core subset. Upadhyaya et al. (2007) characterized 909 finger millet accessions introduced from the ICRISAT genebank at Bulawayo, Zimbabwe, and grown at ICRISAT, Patancheru. Variability was recorded for plant pigmentation, growth characters (time to 50% flowering, plant height), and inflorescence characters (inflorescence width, length, and exsertion). Green plant pigmentation, erect growth habit, and light brown grain color predominated. Dwarf accessions (plant height up to 75 cm) were mostly from Zimbabwe. Early flowering accessions were reported from Kenya and late flowering accessions from Tanzania and Zaire.

Umar and Kwon-Ndung (2014) characterized 10 finger millet accessions collected from diverse locations in northern Nigeria using morphological characters (plant height, leaf length and diameter, finger length and width, number of fingers, and 1000-seed weight). Significant genetic diversity was observed for the traits studied.

Variability for 19 agromorphological characters was reported between 60 exotic and 89 Indian accessions; flag leaf sheath length, peduncle length, panicle exsertion, ear head width, fingers per head, and 1000-grain weight were higher in the Indian accessions (Babu et al. 2017). Significant and positive correlations were observed between days to 50% flowering and days to maturity, as well as between peduncle length and panicle exsertion. The genotypes IE 7320, IE 4491, GE 1437, VHC 3911, and VHC 3898 were identified as better parents for high photosynthetic efficiency, and GE 1437, GE 5192, and IE 5367 as better parents for tryptophan content.

The germplasm from the Plant Genetic Resource Center, Gannoruwa, Sri Lanka, was characterized using morphological markers (Dasanayaka 2016; Kaluthanthri and Dasanayaka 2016; Kumari et al. 2018). Kaluthanthri and Dasanayaka (2016) characterized 20 finger millet accessions; in principal component analysis (PCA), days to flowering, finger number, and yield per plant emerged as important traits for variability among the studied genotypes. A set of 24 accessions was characterized using 14 quantitative characters (Dasanayaka 2016). Kumari et al. (2018) morphologically characterized 139 accessions (100 local collections from 15 districts of Nepal, 26 accessions from India, Zimbabwe, and unknown exotic origin, 9 from farmer’s fields, 2 standards, and 4 others) using 14 quantitative characters; the highest variability was observed in grain yield, panicle exsertion, weight of 20 mature ears, number of productive tillers, and length of the longest finger. A significant and positive correlation was observed between grain yield and the number of productive tillers, threshing ratio, weight of 20 mature ears, and panicle exsertion. The reported studies could be used for finger millet improvement in Sri Lanka.

Anuradha et al. (2017) evaluated genetic diversity among 25 finger millet genotypes through PCA and cluster analysis. The characteristics plant height, number of productive tillers, days to 50% flowering, fodder yield, and grain yield showed significant variability. Genotypes VR 1101, VR 1098, VR 1112, VR 1113, VR 1111, VR 1115, VR 1116, and VR 1117 were found to be better for different traits and could be further utilized in breeding programs. A set of 27 accessions was morphologically evaluated, and a high degree of similarity was observed between the genotypes IC49979A and IC49974B, whereas IC204141 and IC49985 showed a low level of similarity (Prabhu et al. 2018).
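Several of the studies above summarize trait variation by PCA. As a rough sketch of the computation, assuming standardized traits and a correlation-matrix PCA (the cited papers do not necessarily follow this exact procedure), the first principal component can be obtained by power iteration. The trait values below are invented for illustration.

```python
import math


def standardize(rows):
    """Return column-wise z-scores (mean 0, sample standard deviation 1)."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def correlation_matrix(z_rows):
    """Correlation matrix of already standardized data."""
    n, k = len(z_rows), len(z_rows[0])
    return [
        [sum(r[i] * r[j] for r in z_rows) / (n - 1) for j in range(k)]
        for i in range(k)
    ]


def first_component(matrix, iterations=500):
    """Leading eigenvector and eigenvalue of a symmetric matrix by power iteration.

    Assumes the uniform start vector is not orthogonal to the leading
    eigenvector, which holds for the positively correlated toy data below.
    """
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return v, eigenvalue


# Hypothetical data: plant height (cm), productive tillers, grain yield (g/plant).
traits = [
    [95, 4, 20],
    [100, 5, 24],
    [110, 5, 27],
    [120, 7, 33],
    [105, 6, 28],
    [90, 3, 18],
]
corr = correlation_matrix(standardize(traits))
loadings, eigenvalue = first_component(corr)
explained = eigenvalue / len(corr)  # share of total variance carried by PC1
print([round(x, 3) for x in loadings], round(explained, 3))
```

Traits with large absolute loadings on the first components are the ones usually reported as contributing most to variability among genotypes. In practice a numerical library would be used to obtain all components at once; power iteration is shown only to keep the sketch dependency-free.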

Mohan et al. (2018) evaluated 38 finger millet genotypes (18 released varieties and 20 landraces from India) using morphological markers (13 qualitative and 14 quantitative traits). Among the qualitative traits studied, diversity was observed for ear shape and size, whereas most of the quantitative traits showed significant differences among the genotypes.

Kumar et al. (2019) evaluated 92 accessions for 16 qualitative morphological descriptors at G.B. Pant University of Agriculture and Technology, Pantnagar, India. The predominant characters were erect growth, dark green glume, droopy ears, non-pigmented leaf juncture, nonculm stem branching, lodging susceptibility, non-pubescent leaf sheath, non-branched fingers with multiple whorls, branched fingers in thumb position, and seeds with enclosed glume cover, brown color, round shape, rough surface, and a non-persistent pericarp with a shattering nature.

Morphology, plant growth, and yield-contributing characteristics were evaluated to characterize 20 accessions of finger millet in the Palghar district of Maharashtra (Patil et al. 2019). Characters such as erect growth habit, light brown seed color, seeds partially enclosed by glumes, and semi-compact ears were dominant among the studied accessions. Productive tiller number, followed by ear head length and finger number, were the most variable traits, and finger width showed the lowest variation.

4.4 Cytological Markers

Cytological markers are based on variation in chromosomal morphology and have been used to identify the progenitors or parents of cultivated finger millet and to study relatedness between different species. They can also be used in physical mapping and in the identification of linkage groups; however, their direct use has been very limited.

Chennaveeraiah and Hiremath (1974) concluded, based on chromosome pairing data, that subspecies africana is the direct progenitor of finger millet and, on the basis of a lack of chromosome pairing, that E. indica made no contribution to the finger millet genome. This conclusion is contrary to recent reports in which E. indica is considered one of the genome donors of domesticated finger millet. The disparity may arise because the authors did not mention the number of crosses made, and extensive cytogenetic studies are required to conclude phylogenetic affinities (Dewey 1982). Furthermore, a lack of chromosome pairing does not always indicate a lack of genomic resemblance (De Wet and Harlan 1972).

Hiremath and Salimath (1992) reported, based on mean chromosome pairing, that E. floccifolia is not a genome donor to E. coracana and that E. multiflora is a distinct species with the genomic symbol “C”. The “B” genome donor to cultivated E. coracana is yet to be identified.

Bisht and Mukai (2000) mapped ribosomal DNA (rDNA) sites of four diploid and two tetraploid species of Eleusine by fluorescence in situ hybridization (FISH). The similarity of the rDNA sites and their chromosomal locations among the studied species suggested that the diploid species might be the genome donors to the tetraploid species. E. multiflora was differentiated from the rest of the species by the presence of 18S-5.8S-26S rDNA on the largest pair of chromosomes, 5S rDNA at four sites on two pairs of chromosomes, and 18S-5.8S-26S and 5S rDNA at the same location on one pair of chromosomes. The tetraploid species E. coracana and E. africana possessed the same number of 18S-5.8S-26S and 5S rDNA sites, located at similar positions on the chromosomes. The diploid species E. floccifolia, E. indica, and E. tristachya possessed the same 18S-5.8S-26S sites and locations, which resembled the two pairs of 18S-5.8S-26S rDNA locations in the tetraploid species E. africana and E. coracana. The 5S rDNA sites on the chromosomes of E. floccifolia and E. indica were comparable to those of E. africana and E. coracana.

Bisht and Mukai (2001) identified E. indica and E. floccifolia as genome donors/contributors to E. coracana (an allotetraploid species) on the basis of in situ hybridization of the E. coracana genome with the genomic DNA of different diploid species of the same genus. A close genomic relationship was observed between four diploid species, namely E. floccifolia, E. indica, E. intermedia, and E. tristachya, and the tetraploid species E. coracana. Based on the common genomic in situ hybridization (GISH) signals, E. indica and E. tristachya were found to share close similarities, and E. intermedia was found to be intermediate between E. indica and E. floccifolia.

Liu et al. (2014) reported E. indica as the primary A-genome parent and E. tristachya (or its extinct sister or ancestor) as the secondary A-genome donor to finger millet based on multicolor genomic in situ hybridization (McGISH).

4.5 Biochemical Markers

Biochemical markers generally involve the analysis of seed storage proteins and isozymes (enzymes differing in the sequence of amino acids but catalyzing the same biochemical reaction). These markers are based on enzymatic functions and allow the measuring of allele frequencies for specific genes. Studies involving isozymes and seed proteins for characterization of genetic diversity specifically for E. coracana are scanty.

Werth et al. (1994) employed 16 isozyme loci coding nine enzymes to analyze genetic variability among seven Eleusine species. Genetic variability differed considerably among members of diploid species (E. indica and E. jaegeri). Both the subspecies of the tetraploid E. coracana (subsp. coracana and subsp. africana) displayed fixed heterozygosity at several loci. Moreover, both the tetraploids also possessed E. indica marker alleles at all loci, confirming that they were derived from E. indica by hybridization with an unknown diploid.

Chong et al. (2011) studied genetic diversity within and among six glyphosate-resistant (R) and eight glyphosate-susceptible (S) E. indica populations from Peninsular Malaysia using isozyme markers encoding acid phosphatase (Acp), glutamate dehydrogenase (Gdh), glucose-6-phosphate isomerase (Pgi or Gpi), glycerate dehydrogenase (Gly), isocitrate dehydrogenase (Idh), malate dehydrogenase (Mdh), phosphoglucomutase (Pgm), and uridine diphosphogluconate pyrophosphate (Ugp). Genetic variation at 13 enzyme loci from the studied enzyme systems was evaluated in a set of 840 accessions. Low levels of isozyme diversity were observed in the R and S populations of E. indica, with a low percentage of polymorphism and few alleles per locus. The results suggested that the populations might have a history of severe or long-lasting population bottlenecks that have reduced their genetic diversity.

Kumar et al. (2012) studied the seed storage protein profiles of 52 finger millet genotypes from Uttarakhand. Clear and distinct polypeptide bands (15–25) with molecular weights in the range of 10–100 kDa were observed. Based on sodium dodecyl sulfate polyacrylamide gel electrophoresis, no major differences in banding pattern were reported among the 52 genotypes. However, an additional band of 32 kDa was detected in a few genotypes, which needs to be studied further from a nutritional point of view.

4.6 Molecular Markers

On the basis of the method of detection, molecular markers are classified into hybridization-based and PCR-based markers. The molecular markers, gene discovery, and advancement of genomic resources were also reviewed by Sood et al. (2016).

4.6.1 Hybridization-Based Markers

RFLP is the first marker system based on hybridization, which relies on polymorphism as a result of insertion/deletions (known as InDels), point mutations, translocations, duplications, and inversions (Nadeem et al. 2018).

Salimath et al. (1995) analyzed genome origins and genetic diversity in 22 accessions belonging to five species of Eleusine from Africa and Asia using an eight probes-three enzyme RFLP combination, revealing 14% polymorphism (low level of sequence variability) in 17 accessions of E. coracana from Asia and Africa. Along with RFLP, RAPD and ISSR patterns were also studied, based on which three species, E. coracana, E. indica, and E. tristachya, showed a close genetic assemblage within the genus, whereas E. floccifolia and E. compressa were found most divergent.

Muza et al. (1995) classified 26 finger millet lines belonging to Africa and India into cytotype groups on the basis of the Southern blot hybridization patterns obtained with maize and sorghum mitochondrial cloned gene probes. Five restriction endonuclease enzymes were used, giving a total of 20 enzyme/probe combinations. A low level of polymorphism was detected with RFLP banding patterns. However, the data based on mitochondrial DNA clone atp9 hybridization allowed the classification of the lines into three cytotype groups.

Parani et al. (2001) studied 119 accessions belonging to 7 small millet species using the chloroplast trnS-psbC gene regions to generate PCR–RFLP with 8 restriction enzymes individually as well as in combinations of two enzymes. A combination of two enzymes distinguished all the species. Species-specific differential banding patterns were observed.

4.6.2 PCR-Based Markers

PCR allows amplification of a DNA region delimited by sequences with high homology to the primers. PCR-based markers can be categorized as (i) arbitrary primer-based markers and (ii) sequence-based markers. As per the published reports, RAPD and SSRs have been the markers of choice for genetic diversity and population structure analyses in finger millet, and to some extent, ISSR and SNP markers have also been used for diversity studies. Very few reports are available on genomic SSR marker development in finger millet (Dida et al. 2007; Gimode et al. 2016; Hittalmani et al. 2017; Lee et al. 2017). Large-scale development of genomic SSR markers (18,514) based on the finger millet genome sequence was reported by Hittalmani et al. (2017), and 35 SSRs were validated in 26 E. coracana and 14 wild species accessions; the markers identified can be used for diversity and marker-trait association studies to hasten marker-assisted breeding programs in finger millet. Functional/gene-based markers such as expressed sequence tag-simple sequence repeats (EST-SSRs)/transcriptome sequencing-based markers, nucleotide-binding site-leucine rich repeat (NBS-LRR) based markers, cytochrome P450 gene-based markers, Aspartate kinase2 gene-based SSRs, calcium transporter and calmodulin-based anchored SSRs, drought stress-related genic SSRs, SRAP, and SCoT have also been developed and used for molecular characterization of finger millet (Arya et al. 2009; Panwar et al. 2010a, b; Panwar et al. 2011; Naga et al. 2012; Nirgude et al. 2014; Kumar et al. 2015a; Saha et al. 2017; Panda et al. 2020). The different markers employed to study genetic diversity and population structure in finger millet are detailed in Table 4.1.

Table 4.1 Genetic diversity and population structure analyses in finger millet using PCR-based markers

4.7 Prospects of Molecular Markers in the Post-genomics Era

4.7.1 Genetic Diversity and Population Structure Analysis

Genetic diversity and population structure studies based on RFLP, RAPD, ISSR, SSR, and other markers have been discussed above. Here we focus on the use of GBS for genotyping, followed by genetic diversity and population structure analysis. The technique, developed by Elshire et al. (2011), is highly cost-effective and robust. GBS is an NGS-based approach employed for discovering and genotyping SNPs in crop genomes and populations (He et al. 2014). SNP markers generated through GBS possess outstanding genetic attributes such as high reproducibility, wide genome coverage, codominant mode of inheritance, and chromosome-specific location, so that extensive genotyping followed by the selection of diverse parents/alleles for breeding programs is achievable. GBS has been employed in finger millet for diversity and population structure analysis, allowing extensive characterization at the sequence level.
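To make the path from SNP calls to a diversity analysis more concrete, the sketch below shows one common way such data are handled: sites are filtered on call rate and minor allele frequency, and a simple allele-sharing distance is then computed between samples. The diploid-style 0/1/2 coding, the thresholds, and the toy genotypes are assumptions made for illustration; they are not taken from the studies discussed in this section.

```python
def call_rate(site):
    """Fraction of samples with a non-missing genotype at this site."""
    return sum(g is not None for g in site) / len(site)


def minor_allele_frequency(site):
    """MAF from genotypes coded as alternate-allele counts (0, 1, 2); None = missing."""
    called = [g for g in site if g is not None]
    if not called:
        return 0.0
    alt = sum(called) / (2 * len(called))
    return min(alt, 1.0 - alt)


def filter_sites(sites, min_maf=0.05, min_call_rate=0.8):
    """Keep sites that are sufficiently polymorphic and sufficiently complete."""
    return [
        s for s in sites
        if call_rate(s) >= min_call_rate and minor_allele_frequency(s) >= min_maf
    ]


def pairwise_distance(sites, i, j):
    """Mean allele-dosage difference (scaled to 0..1) over sites called in both samples."""
    diffs = [
        abs(s[i] - s[j]) / 2
        for s in sites
        if s[i] is not None and s[j] is not None
    ]
    return sum(diffs) / len(diffs) if diffs else float("nan")


# Hypothetical genotype matrix: rows are SNP sites, columns are four samples.
sites = [
    [0, 0, 2, 2],        # informative
    [0, 0, 0, 0],        # monomorphic, removed by the MAF filter
    [0, None, None, 2],  # half missing, removed by the call-rate filter
    [0, 1, 2, 2],        # informative
]
kept = filter_sites(sites)
print(len(kept), pairwise_distance(kept, 0, 1), pairwise_distance(kept, 0, 3))
```

A distance matrix of this kind is the usual input to clustering or tree building, whereas model-based programs work directly on the filtered genotype matrix.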

Kumar et al. (2016b) studied 113 accessions from Africa, India, Nepal, Maldives, and Germany through GBS and generated 33 GB of data (160 million raw reads). A genome-wide set of 23,000 SNPs segregating across the entire collection and several thousand SNPs segregating within each accession were observed. Phylogenetic analysis and the model-based STRUCTURE program inferred three subpopulations that were consistent with geographical distribution, with some exceptions: Subpopulation 1 (southern Asia (India, Nepal, and the Maldives), followed by eastern Africa, Europe, and unknown origin); Subpopulation 2 (southern Asia, followed by eastern Africa and western Africa); and Subpopulation 3 (southern Asia, followed by eastern Africa and unknown origin). The results also confirmed the hypothesis of African domestication of finger millet followed by its introduction to India.

Nyongesa et al. (2018) genotyped 95 genotypes from Kenya, India, Uganda, Malawi, Zambia, Zimbabwe, Nigeria, Nepal, and Germany using GBS. Based on 117,542 SNPs, the genotypes were divided into three subpopulations (A, B, and C), all of which showed an admixture of alleles. Cluster B comprised genotypes showing high resistance to Striga, whereas clusters A and C contained the most susceptible genotypes. The existing genetic variation can be used in marker-assisted breeding for Striga resistance. The highly diverse nature of the composite collection was revealed based on racial and regional diversity. Structure analysis closely corresponded with the phylogenetic analysis.

In another study, Puranik et al. (2020) characterized 190 genotypes of Asian (India, Nepal, Pakistan, Sri Lanka, and the Maldives), east African (Burundi, Ethiopia, Kenya, Tanzania, and Uganda), southern African (Malawi, Zambia, and Zimbabwe), and European or American origin by GBS (169,365 SNPs, including 16,000 putative SNPs under the stringent and 73,419 putative SNPs under the relaxed parameter settings). The less diverse genetic background between the east and southern African accessions indicated their common evolutionary lineage and evolution from the same natural population. Three subpopulations were identified: Subpopulation 1 (east African origin), Subpopulation 2 (Asian origin: India, Nepal, Pakistan, Sri Lanka, and the Maldives), and Subpopulation 3 (southern African origin). Genotypes of European or American origin did not cluster with any particular group. Such clear geographic distinctions have also been reported based on random marker systems.

4.7.2 Phylogenetic Relationships

Chloroplast DNA sequence analysis is a more reliable tool than cytogenetic studies for inferring phylogenetic relationships in polyploid species. Hilu (1988) identified E. indica as the maternal genome donor of finger millet based on chloroplast DNA sequence analysis, because the chloroplast and its genome are predominantly maternally inherited.

Hilu (1995) used RAPD markers and reported the close genetic affinity of E. tristachya to the E. coracana-E. indica group and the distinctness of E. multiflora. A loose correlation between geographic distribution and pattern of genetic variability was observed. The allotetraploid nature of the finger millet was also confirmed.

In another study, Neves et al. (2005) suggested E. indica as the A-genome (maternal) donor of E. coracana based on nuclear ITS and plastid trnT-trnF sequences. They also reported that E. floccifolia is not the second genome donor and that the B genome donor is unidentified or extinct.

Liu et al. (2011) reported E. indica-E. tristachya clade as possible A-genome progenitors to E. coracana based on a biparentally inherited nuclear Pepc4 gene tree and a maternally inherited plastid 6-gene tree.

Liu et al. (2014) suggested a single allotetraploid origin for the E. africana-E. coracana subclade based on the low-copy nuclear gene (waxy). E. indica and E. tristachya were found as the A-genome donors, with a differential degree of relatedness to E. coracana.

Zhang et al. (2019) also concluded E. indica as the maternal parent of E. coracana and E. africana group, based on transcriptome analysis of E. multiflora, E. floccifolia, E. tristachya, E. intermedia, E. africana, E. coracana, and E. indica. The study also supported a close relationship between E. indica and E. tristachya. The close relationship between E. multiflora and E. floccifolia was unexpected since they have distinct nuclear genomes, CC and BB, respectively.

4.7.3 Generation of Linkage Maps

Genetic maps provide important information for genetic analysis and crop improvement. Large numbers of highly variable markers are required to generate maps useful for trait analysis and, eventually, plant breeding. The first linkage map of finger millet, based on 332 loci and an F2 population derived from a cross between Okhale-1 (a cultivated accession) and MD-20 (a wild accession), was generated by Dida et al. (2007). This group used RFLP, AFLP, EST, and SSR markers to generate the map of the tetraploid finger millet. The map spans 721 cM for the A genome and 787 cM for the B genome and covers almost all 18 finger millet chromosomes. The map was also used for a comparative study between the rice and finger millet genomes.
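Map lengths such as those above are expressed in centimorgans, which are obtained from observed recombination fractions through a mapping function. The source does not state which function was used for the finger millet maps; the sketch below simply shows the two standard choices, Haldane and Kosambi, and adds up the reported genome-wise map lengths.

```python
import math


def _check(r):
    """Recombination fractions are only meaningful in the range [0, 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")


def haldane_cm(r):
    """Haldane map distance in cM (assumes no crossover interference)."""
    _check(r)
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in cM (allows for crossover interference)."""
    _check(r)
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Example: 10% recombinants between two adjacent markers.
r = 0.10
print(round(haldane_cm(r), 2), round(kosambi_cm(r), 2))

# Total length of the map reported above: A genome plus B genome.
total_map_cm = 721 + 787
print(total_map_cm)
```

For small recombination fractions both functions give a distance close to 100 × r, and they diverge as r approaches 0.5, which is why dense maps are less sensitive to the choice of mapping function.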

The first high-density genetic map of finger millet (4,453 SNP markers on 18 linkage groups) was developed by Qi et al. (2018) using the same population as discussed above. Paired-end GBS reads (278,880,767) and a new pipeline (UGbS-Flex) were used to generate the map. This pipeline can be used in species with different breeding systems, ploidy, and polymorphism levels, even in the absence of a reference genome sequence.

4.7.4 Genetic Purity Testing of Hybrids

RAPD, ISSR, and SSR markers were used to find polymorphism between two parental lines, viz. PR 202 and IE 2606 of finger millet (Ajeesh Krishna et al. 2020). Twelve RAPD, 4 ISSR, and 21 SSR markers showed polymorphism and were used for genetic purity testing of the F1 hybrids. Molecular markers were found useful in the identification of true hybrids and comparing the efficacy of hybridization methods. Hot-water-based emasculation was found much better compared to the hand-emasculation method of hybridization in finger millet.
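The logic of marker-based purity testing can be sketched as follows: at a marker that is polymorphic between the parents, a true F1 plant should carry a band specific to the pollen parent, whereas a selfed plant should not. The marker names, band sizes, and the classification labels below are hypothetical and only illustrate the idea; they do not reproduce the scoring scheme of the cited study.

```python
def classify_progeny(progeny_bands, female_bands, male_bands):
    """Label each putative F1 plant from its banding pattern.

    Each argument maps marker name -> set of band sizes (progeny_bands maps
    plant name -> such a dict). Only markers where the male parent has a band
    absent from the female parent are informative.
    """
    results = {}
    for plant, markers in progeny_bands.items():
        informative = 0
        male_hits = 0
        for marker, bands in markers.items():
            male_specific = male_bands[marker] - female_bands[marker]
            if not male_specific:
                continue  # marker cannot separate a cross from a self
            informative += 1
            if bands & male_specific:
                male_hits += 1
        if informative == 0:
            results[plant] = "undetermined"
        elif male_hits == informative:
            results[plant] = "true hybrid"
        elif male_hits == 0:
            results[plant] = "selfed"
        else:
            results[plant] = "ambiguous"  # re-score before deciding
    return results


# Hypothetical parental and progeny profiles (band sizes in bp).
female = {"SSR_A": {200}, "SSR_B": {150}}
male = {"SSR_A": {210}, "SSR_B": {150, 160}}
progeny = {
    "plant1": {"SSR_A": {200, 210}, "SSR_B": {150, 160}},
    "plant2": {"SSR_A": {200}, "SSR_B": {150}},
    "plant3": {"SSR_A": {200, 210}, "SSR_B": {150}},
}
results = classify_progeny(progeny, female, male)
print(results)
```

Counting the share of plants labelled as true hybrids under each emasculation method gives a simple way to compare the efficacy of hybridization methods, as was done in the study above.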

4.7.5 Whole-Genome Sequencing and High-Throughput Genotyping by Resequencing

A paradigm shift from marker-based to genomics-based high-throughput sequencing approaches is increasing the volume of sequence data by several orders of magnitude, with the advantages of dense marker coverage, shorter turnaround time, and cost-effectiveness. The availability of a whole-genome sequence for a crop is a boon since it provides a reference for resequencing the genotypes that constitute core, minicore, and trait-specific reference sets, mapping populations, etc. This will deliver extensive, high-quality information at the sequence level for estimation of diversity, marker-trait associations, identification of candidate genes, taxonomic and evolutionary studies, high mapping accuracy and resolution, comparative mapping, etc. The data generated will further aid functional genomics, forward and reverse genetics, and proteomic studies.

Developments in NGS tools have advanced whole-genome sequencing (WGS) and transcriptome sequencing in several crop species (Ceasar et al. 2018). The trend and timeline of genome sequencing in millets are summarized in Fig. 4.1. At present, two reports of WGS are available for finger millet. The first, using ML 365 and the Illumina and SOLiD platforms, yielded a 1,196 Mb assembly (~82% of the total estimated genome size) and predicted 85,243 genes and 114,083 SSRs (Hittalmani et al. 2017). The second, by Hatakeyama et al. (2018), reported a 1.2 Gb assembly, against a genome size of 1.5 Gb estimated by flow cytometry, and 62,348 genes in PR 202, based on NGS combined with single-molecule sequencing and whole-genome optical mapping with the Bionano Irys® system.
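The coverage figures in the two reports can be cross-checked with simple arithmetic. The short sketch below uses only the numbers quoted above; the variable names are illustrative, and the results are approximate because the reported percentage is itself rounded.

```python
# Figures quoted in the text for the two finger millet assemblies.
ML365_ASSEMBLY_MB = 1196        # Hittalmani et al. (2017)
ML365_ASSEMBLED_FRACTION = 0.82 # approximate share of the estimated genome size

PR202_ASSEMBLY_GB = 1.2         # Hatakeyama et al. (2018)
FLOW_CYTOMETRY_GB = 1.5         # genome size estimated by flow cytometry

# Genome size implied by the ML 365 assembly and its reported coverage.
implied_genome_mb = ML365_ASSEMBLY_MB / ML365_ASSEMBLED_FRACTION

# Share of the flow cytometry estimate captured by the PR 202 assembly.
pr202_fraction = PR202_ASSEMBLY_GB / FLOW_CYTOMETRY_GB

print(f"Implied genome size (ML 365): {implied_genome_mb:.0f} Mb")
print(f"Assembled fraction (PR 202): {pr202_fraction:.0%}")
```

Both assemblies therefore capture roughly four fifths of a genome of about 1.5 Gb, which is consistent between the two reports.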

Fig. 4.1 Whole-genome sequencing trend and timeline in millets

These whole-genome sequence data for finger millet have opened the door to resequencing and to utilization of the huge variability available in genebanks. Resequencing data for 88 finger millet genotypes, generated by the GoK-funded Finger Millet Genomics Project, are available at NCBI. These data are being used to develop superior finger millet varieties for Karnataka using integrated genomics-assisted breeding approaches. However, the entire finger millet germplasm held in national genebanks still needs to be characterized by NGS-based genotyping, followed by the development of molecular cores/minicores. These cores can then be resequenced, along with phenotypic and biochemical characterization, for genome-wide marker-trait association studies and identification of candidate genes.

4.7.6 Trait Mapping

Finger millet germplasm collections held by national and international genebanks are a repertoire of useful variation important for crop improvement. Previous studies on finger millet germplasm/core sets have identified genotypes for desirable traits such as high grain mineral content (Fe, Zn, and Ca), protein content, and resistance to biotic and abiotic stresses (Upadhyaya et al. 2011; Babu et al. 2013; Krishnamurthy et al. 2014). However, the true potential of finger millet germplasm collections has not been realized due to the lack of information about the genes/alleles associated with these traits. In plants, linkage mapping has been widely used to identify QTLs/genes; however, this method has not been very successful in millets owing to the difficulty of generating mapping populations. Recent advances in genomics, particularly the availability of high-throughput genotyping technologies (GBS and resequencing), have made it possible to map diverse traits using a complementary gene mapping technique, association mapping (AM). The AM approach exploits linkage disequilibrium between adjacent loci in diverse sets of genotypes, mostly natural populations, to establish the correlation between genotype and phenotype (Gupta et al. 2014). This approach is highly efficient for genetic dissection of complex traits, has been used for mapping traits in a wide range of crops, and would also be useful in finger millet. Variants of association mapping include candidate gene association mapping, genome-wide association mapping, and associative transcriptomics. Potential applications of these approaches in finger millet are described below.
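Linkage disequilibrium between two loci is commonly summarized by the squared correlation r² between their alleles. The following is a minimal sketch, assuming biallelic loci coded 0/1 and largely homozygous lines (a reasonable simplification for a self-pollinated crop) so that each line contributes one haplotype; the function name and the toy panels are hypothetical.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic loci.

    haplotypes: list of (allele_at_locus_a, allele_at_locus_b), alleles coded 0/1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Toy panels: complete LD versus linkage equilibrium.
complete_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
equilibrium = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(complete_ld), ld_r2(equilibrium))
```

The distance over which r² decays in a panel determines how many markers are needed and how finely an association can be localized.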

4.7.6.1 Candidate Gene-Based Association Mapping

The candidate gene association approach involves sequencing candidate genes from diverse genotypes to discover variants, which are then tested for association with the target trait using statistical models. This approach has allowed the identification of allelic variants for many useful traits in many crops, including rice (Mishra et al. 2016; Abbai et al. 2019), wheat (Sukumaran et al. 2015; Ma et al. 2016), maize (Cook et al. 2012), and soybean (Ikram et al. 2020). The approach can readily be applied in finger millet to identify novel allelic variants of candidate genes associated with various desirable traits. Since some studies in finger millet have already identified putative candidate genes for economically important traits, including grain protein, micronutrient, and tryptophan content and blast resistance (Babu et al. 2014b, c; Puranik et al. 2020; Tiwari et al. 2020), candidate gene association mapping studies can be conducted for these genes using genebank collections/core sets. For nutritional traits, the important genes that can be targeted for association analysis are the gene encoding the no apical meristem-associated (NAM) protein, a member of the NAC family associated with iron content (Puranik et al. 2020); the gene encoding aspartyl protease, which shows a strong association with grain protein content (GPC) (Tiwari et al. 2020); and genes involved in lysine and tryptophan biosynthesis (Babu et al. 2014b). Similarly, candidate gene association analysis can also be attempted for orthologues of NBS-LRR family blast resistance genes of rice, such as PiKh and Pita, as the pathogen Magnaporthe grisea causes blast disease in both rice and finger millet (Babu et al. 2014d). Besides the traits described above, drought tolerance is another desirable trait in finger millet as it is mainly grown as a rainfed crop. 
Candidate gene association mapping can play an important role in the identification of novel alleles of abiotic stress tolerance genes. To date, a host of abiotic stress tolerance genes and transcription factors (DREB, MYB, NAC) have been identified in other cereals such as rice, wheat, and foxtail millet (Lata et al. 2011). Orthologues of these genes can be annotated in the finger millet genome and targeted for candidate gene-based association mapping. Importantly, more functionally validated candidate genes need to be identified in order to unravel superior alleles/haplotypes for various traits from germplasm collections using the candidate gene-based association mapping approach.
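The testing step of a candidate gene association analysis can be illustrated with a deliberately simple stand-in for the statistical models used in practice: a permutation test on the difference in trait means between accessions carrying two alleles of a candidate gene. This is a sketch under stated assumptions, not the method of any study cited above; it ignores population structure and kinship, which real analyses usually model, and the trait values are invented.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # break any link between allele and trait value
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point noise
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical grain protein values (%) for accessions carrying each allele.
reference_allele = [7.1, 7.3, 6.9, 7.0, 7.2]
alternate_allele = [8.4, 8.6, 8.3, 8.5, 8.7]
p_value = permutation_p_value(reference_allele, alternate_allele)
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value flags the variant for follow-up; it does not by itself show that the variant is causal.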

4.7.6.2 Genome-Wide Association Study

A genome-wide association study (GWAS) is a very powerful technique that does not require the generation of mapping populations, which is a laborious, time-consuming, and technically demanding exercise. Moreover, since GWAS uses a diverse set of lines that may capture many historical recombination events, linkage disequilibrium (LD) mapping can achieve a mapping resolution of 100–200 kb, compared with 10–20 cM in biparental mapping, and may sometimes even identify causative genes associated with the target trait. GWAS has been widely used in many plant species, including Arabidopsis (Togninalli et al. 2018), rice (Lekklar et al. 2019), wheat (Chaurasia et al. 2020; Kumar et al. 2020), foxtail millet (Jaiswal et al. 2019), soybean (Fang et al. 2017), maize (Mazaheri et al. 2019), and pearl millet (Srivastava et al. 2019), for identification of genomic regions/genes associated with economically important traits. With the recent availability of the finger millet genome sequence and high-throughput genotyping methods, it is now possible to conduct large-scale genome-wide association studies in this crop for diverse traits. To date, GWAS in finger millet has been reported for traits such as agro-morphological characteristics, blast resistance, nutritional traits, and low phosphorus tolerance (Babu et al. 2014a, b, c; Kumar et al. 2015b; Ramakrishnan et al. 2016b, 2017; Tiwari et al. 2020). Babu et al. (2014a) conducted AM on a panel of 190 diverse lines and identified markers associated with agromorphological traits including flag leaf width, plant height, basal tiller number, and 50% flowering. Genomic regions associated with grain protein and tryptophan content have also been identified using GWAS (Babu et al. 2014b; Tiwari et al. 2020). Moreover, integration of AM with comparative mapping has proved very effective in identifying genomic regions for tryptophan and grain protein content. Using this approach, Babu et al. 
(2014b) identified two QTLs for tryptophan and one QTL for grain protein content (GPC) in a global collection of finger millet. Of the two QTLs controlling tryptophan, one was a major QTL that explained 11% of the phenotypic variance and was associated with the SSR marker OM5, designed from the 27-kD γ-zein gene of OPM (opaque-2 modifiers). This was a significant finding as OPM influences tryptophan content to a large extent (Babu et al. 2014b). GWAS has also been conducted for mapping resistance to blast disease, which is caused by Magnaporthe grisea and is considered one of the major limiting factors in finger millet production. A total of four QTLs for finger blast and one QTL for neck blast have been mapped in a global finger millet collection using SSR markers (Babu et al. 2014c). The markers FMBLEST32 and RM262 explained 8% and 10% of the phenotypic variance, respectively, for blast resistance. UGEP81 and UGEP18 were associated with finger and neck blast and explained 7.5% and 11% of the phenotypic variance, respectively. The above-described studies indicate the great potential of GWAS in the genetic dissection of important traits in finger millet. GWAS was also performed on 95 finger millet genotypes, using field Striga resistance data and GBS, to find SNPs associated with Striga reaction, and markers TP 85,424 and TP 88,244 were identified for Striga resistance (Nyongesa et al. 2018). In another study, 14 agromorphological traits were mapped in a panel of 113 finger millet accessions using GBS-derived SNP markers and three different GWAS models, viz., SLST, MLMM, and MTMM. A total of 109 novel marker-trait associations (MTAs) were identified for the 14 agromorphological traits, of which 9 were common across the three models (Sharma et al. 2018). Recently, Tiwari et al. (2020) applied association mapping to 113 diverse finger millet lines to dissect the complex genetic regulation of GPC and uncovered five reliable genomic regions for GPC. 
Out of these five regions, one contained a gene encoding aspartyl protease, which was considered a promising major candidate gene contributing to variation in GPC. In a recent study (Puranik et al. 2020), a combination of GBS and GWAS applied to a panel of 190 finger millet genotypes revealed genomic regions harboring putative candidate genes associated with grain micronutrient content (iron, zinc, calcium, magnesium, potassium, and sodium). A total of 34 highly reliable MTAs were identified, of which 18 markers showed homology with candidate genes that are suggested to have putative functions in remobilization, binding, and transport of metals.
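Statements such as "the marker explained 11% of the phenotypic variance" refer to the R² of a model relating marker genotype to trait value. The sketch below computes R² from a single-marker linear regression and ranks markers by it. It is a simplified illustration only: the studies cited above used mixed or multi-locus models that also account for population structure, and the marker names, genotype coding (0/2 for the two homozygotes), and phenotypes here are hypothetical.

```python
def variance_explained(genotypes, phenotypes):
    """R^2 of a simple linear regression of phenotype on marker genotype."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((p - mean_p) ** 2 for p in phenotypes)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker or a constant trait explains nothing
    return sxy * sxy / (sxx * syy)


def scan_markers(marker_genotypes, phenotypes):
    """Rank markers by the share of phenotypic variance they explain."""
    scores = {
        name: variance_explained(genos, phenotypes)
        for name, genos in marker_genotypes.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical panel of six lines and three SNP markers.
phenotypes = [7.0, 7.4, 7.1, 8.9, 9.2, 9.0]
markers = {
    "snp_1": [0, 0, 0, 2, 2, 2],
    "snp_2": [0, 2, 0, 2, 0, 2],
    "snp_3": [2, 2, 2, 2, 2, 2],
}
ranked = scan_markers(markers, phenotypes)
for name, r2 in ranked:
    print(f"{name}: R^2 = {r2:.2f}")
```

In a real scan, each marker would also receive a p-value that is compared against a multiple-testing threshold before an association is reported.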

4.7.6.3 Associative Transcriptomics

Associative transcriptomics is a variant of GWAS that involves the analysis of transcripts across association panels to discover genomic regions controlling complex traits (Harper et al. 2012). This approach allows the identification of both transcript-level sequence variation (SNPs/InDels) and changes in expression (gene expression markers, GEMs) that can be associated with diverse traits. Initially established in Brassica, the approach has also been used in other species such as wheat (Miller et al. 2016; Wang et al. 2017), maize (Azodi et al. 2020), and chestnut (Kang et al. 2019) for mapping complex traits. Associative transcriptomics has thus opened a way to identify expression-level variation in genomic regions/genes critical for the development of important traits. In finger millet, this approach can be used to identify differentially expressed genes/genomic regions associated with variation in grain mineral nutrients such as Ca, Fe, and Zn, since for these traits changes in the expression of individual genes may play a larger role than sequence-level variation (Kumar et al. 2016a).
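The GEM idea can be sketched as a correlation between a gene's expression level and the trait value across the lines of a panel. The example below uses Spearman rank correlation, chosen here as an assumption because it is robust to the skewed distributions typical of expression data; published pipelines may use regression models instead. Gene names, expression values, and calcium contents are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


# Hypothetical grain calcium content and transcript abundance in five lines.
calcium = [210.0, 250.0, 300.0, 340.0, 410.0]
expression = {
    "gene_a": [1.2, 1.9, 2.4, 3.8, 5.1],   # rises with calcium content
    "gene_b": [4.0, 3.9, 4.1, 4.0, 3.8],   # no clear trend
}
gems = {gene: spearman(values, calcium) for gene, values in expression.items()}
print(gems)
```

Genes whose expression tracks the trait across the panel become candidate GEMs for validation.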

4.7.6.4 Genomic Selection

The main factors limiting the utilization of the large finger millet germplasm resources conserved in genebanks are the huge financial resources and technical expertise required to identify potential lines for various desirable traits. In recent years, with the availability of cost-effective high-throughput genotyping methods, genomic selection (GS) has shown great promise for the selection of superior germplasm lines from genebank collections (Muleta et al. 2017). The GS approach involves training a prediction model on a reference population using genome-wide markers and phenotypic data; the model is then used to compute genomic estimated breeding values (GEBVs) that predict the performance of related sets of genotypes based exclusively on their genomic data. Since the GS approach does away with the multilocation/multiyear evaluation of germplasm required for trait identification, it can facilitate early selection of useful germplasm from genebank collections and accelerate genetic gain in breeding programs. Initially, the GS approach was mainly considered for the selection of lines in breeding populations; however, a few recent studies have reported the potential of GS for identifying desirable traits in germplasm collections of crops such as soybean, wheat, and rice (Muleta et al. 2017; Kehel et al. 2020). In soybean, a GS model for white mold disease, developed using a reference population of diverse lines, could reliably identify white mold-resistant genotypes from the United States Department of Agriculture (USDA) soybean germplasm collection using their genotyping data. In finger millet, GS has not been reported so far because very limited genomic resources were available for this crop. 
However, with the availability of the finger millet genome sequence and advanced genotyping methods, this approach can be considered for selecting lines from genebank collections for yield-contributing traits, blast resistance, and high micronutrient content. Developing GS prediction models in finger millet would require assembling a panel of diverse germplasm lines with variation for the target traits. The panel can be densely genotyped and phenotyped for these traits across environments to train the model and calculate GEBVs. The GEBVs can then be used to select useful lines from genebanks based exclusively on genomic information.
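The train-then-predict workflow described above can be sketched with ridge regression on marker genotypes, a simple relative of the models commonly used for GS. Everything specific here is an assumption for illustration: the shrinkage parameter `lam`, the 0/2 genotype coding, and the toy marker and yield values. A pure-Python solver is used only to keep the example self-contained; real analyses involve thousands of markers and dedicated software.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ridge(markers, phenotypes, lam=1.0):
    """Estimate marker effects in the reference (training) population."""
    n, p = len(markers), len(markers[0])
    mean_y = sum(phenotypes) / n
    col_means = [sum(row[j] for row in markers) / n for j in range(p)]
    xc = [[row[j] - col_means[j] for j in range(p)] for row in markers]
    yc = [y - mean_y for y in phenotypes]
    # Normal equations with ridge penalty: (X'X + lam * I) beta = X'y
    xtx = [
        [sum(xc[i][a] * xc[i][b] for i in range(n)) + (lam if a == b else 0.0)
         for b in range(p)]
        for a in range(p)
    ]
    xty = [sum(xc[i][a] * yc[i] for i in range(n)) for a in range(p)]
    return mean_y, col_means, solve_linear_system(xtx, xty)


def predict_gebv(model, markers):
    """Predict GEBVs for genotyped lines that have no phenotype data."""
    mean_y, col_means, effects = model
    return [
        mean_y + sum((row[j] - col_means[j]) * effects[j] for j in range(len(effects)))
        for row in markers
    ]


# Hypothetical reference panel: six lines, three markers, one trait.
train_markers = [[2, 0, 2], [2, 2, 0], [2, 0, 0], [0, 2, 2], [0, 0, 2], [0, 2, 0]]
train_trait = [12.0, 11.5, 11.8, 8.1, 8.4, 7.9]
model = fit_ridge(train_markers, train_trait, lam=1.0)

# Genebank accessions with genotypes only.
candidates = [[2, 2, 2], [0, 0, 0]]
gebvs = predict_gebv(model, candidates)
print(gebvs)
```

Accessions are then ranked by GEBV, and only the top-ranked lines are taken forward for field evaluation.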

4.8 Concluding Remark and Future Prospects

4.8.1 Pan-Genome Sequencing: Constitution of Pan and Super Pan Genomes of Finger Millet and Its Wild Relatives

Accurate assessment of the extent and pattern of genetic diversity in finger millet germplasm collections is critical for their conservation and efficient utilization in breeding programs. So far, however, genetic diversity studies in finger millet have used traditional PCR-based markers and a limited number of genotypes (Vetriventhan et al. 2020), and have therefore failed to provide a comprehensive picture of the genomic diversity in the finger millet genepool at the level of total gene content. Recently, with the decrease in sequencing costs, NGS-based genotyping approaches such as GBS and resequencing have become popular for characterizing genetic diversity in crop genebank collections (Milner et al. 2019). However, both approaches have limitations. The GBS approach, which involves only partial sequencing of individual genomes, may not capture sequence variation across a large portion of the genome in the studied set of genotypes. Similarly, in the resequencing approach, SNP/InDel variants are identified by aligning the sequencing reads of each accession against a single reference genome, so genomic regions/genes that are present in one or more individuals but absent from the reference genome cannot be analyzed or compared. Therefore, the emphasis is currently on pan-genome sequencing, in which individuals of the targeted species are sequenced, de novo assembled, and compared to unravel the total set of genes present in the genepool of that species (Bayer et al. 2020). The pan-genome represents the total genomic diversity available in a species collection and comprises a core set of genes that are present in all individuals as well as variable genes that are absent from some individuals. The first pan-genome in plants was constructed in soybean using seven wild individuals and revealed variation in genes for seed composition, organ size and biomass, flowering and maturity time, etc. (Li et al. 2014). Thereafter, pan-genomes have been constructed for many plant species, including rice (Stein et al. 2018), sunflower (Hübner et al. 2019), and sesame (Yu et al. 2019). Likewise, a pan-genome of finger millet collections could be constructed, which would unravel novel genes/alleles for different traits and accelerate their utilization in breeding programs. Pan-genome sequencing can also enable comparison of the genomes of ancestral and cultivated species and tracking of gene frequencies during domestication and breeding.
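The core and variable components of a pan-genome can be illustrated with set operations on gene presence/absence data. In this minimal sketch each accession is represented by the set of genes found in its de novo assembly; the accession names and gene identifiers are hypothetical.

```python
def partition_pan_genome(gene_sets):
    """Split a pan-genome into core genes and variable genes.

    gene_sets: dict mapping accession name -> set of gene identifiers.
    """
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)          # every gene seen in any accession
    core = set.intersection(*all_sets)    # genes present in all accessions
    variable = pan - core                 # genes absent from at least one accession
    return pan, core, variable


# Hypothetical gene content of a reference genome and two other accessions.
gene_sets = {
    "reference": {"g1", "g2", "g3"},
    "accession_a": {"g1", "g2", "g4"},
    "accession_b": {"g1", "g3", "g4", "g5"},
}
pan, core, variable = partition_pan_genome(gene_sets)

# Genes that alignment to the single reference genome could never detect.
missed_by_reference = pan - gene_sets["reference"]
print(sorted(core), sorted(variable), sorted(missed_by_reference))
```

The last set makes the limitation of single-reference resequencing concrete: genes absent from the reference are invisible to read alignment, however many accessions are resequenced.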

4.8.2 Haplotype-Based Breeding

Haplotype-based breeding is emerging as a potential strategy for designing tailor-made crops. The approach involves mining superior haplotypes of genes controlling agronomically useful traits from germplasm collections and deploying them in the best combinations to develop high-yielding superior varieties (Sinha et al. 2020). Some studies in this direction have already been initiated in rice and pigeon pea. In the coming years, resequencing of finger millet germplasm resources/core sets is expected to unravel allelic diversity and thus aid in harnessing genetic diversity. This would facilitate the identification of genotypes with novel superior alleles for agronomic and yield traits that could in turn be used in finger millet improvement. Low productivity is one of the major factors limiting wide-scale cultivation of finger millet; haplotype-based breeding may enable mobilization of superior alleles and pave the way for the development of tailor-made finger millet varieties.
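Mining a superior haplotype can be sketched as grouping accessions by their haplotype at a gene and comparing the mean trait value of each group. The example below is a simplified illustration: the haplotype labels, yield values, and the `min_lines` cutoff (used to ignore haplotypes carried by too few lines to judge) are all assumptions, and a real analysis would add a statistical test of the differences between groups.

```python
from collections import defaultdict


def superior_haplotype(records, min_lines=2):
    """Return the haplotype with the highest mean trait value.

    records: iterable of (haplotype_label, trait_value) pairs, one per accession.
    min_lines: haplotypes carried by fewer accessions are ignored.
    """
    groups = defaultdict(list)
    for haplotype, value in records:
        groups[haplotype].append(value)
    means = {
        hap: sum(values) / len(values)
        for hap, values in groups.items()
        if len(values) >= min_lines
    }
    if not means:
        raise ValueError("no haplotype is carried by enough accessions")
    best = max(means, key=means.get)
    return best, means


# Hypothetical haplotypes at one gene and grain yield (t/ha) of each accession.
records = [
    ("H1", 2.1), ("H1", 2.3),
    ("H2", 3.0), ("H2", 3.2),
    ("H3", 4.0),  # carried by a single accession, so excluded
]
best, means = superior_haplotype(records)
print(best, means)
```

Accessions carrying the selected haplotype would then serve as donor parents for combining superior haplotypes of several genes.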

Developments such as GWAS, genome sequencing, transcriptome analysis, and trait mapping have begun in finger millet, but not to the extent required (Fig. 4.2). Enhanced use of genomic tools and approaches is the need of the hour to efficiently utilize the huge diversity available in genebanks for food and nutritional security, biotic and abiotic stress resilience, and overall sustainability. Further, genomic selection, associative transcriptomics, pan-genome sequencing, and haplotype-based breeding approaches should be employed to accelerate finger millet crop improvement.

Fig. 4.2 Overview of genetics and genomic developments in terms of characterization of diversity and prospects of molecular markers in finger millet