Introduction

Pigeonpea [Cajanus cajan (L.) Millspaugh] is a protein-rich food legume that caters to the dietary demand of more than a billion people in the developing world (Valenzuela 2011). Multiple uses of pigeonpea as food, livestock feed/fodder and domestic firewood make it a sustainable crop of small-holding farmers in the marginal and risk-prone rain-fed conditions (Saxena 2008). Globally, 6.8 m t of pigeonpea is harvested annually from 7 m ha area. Eighty-three per cent of global pigeonpea production is shared by Asia, while Africa and Americas produce 14.2% and 2.2%, respectively (FAOSTAT 2017). The top five producers of pigeonpea are India (4.8 m t), Myanmar (0.79 m t), Malawi (0.47 m t), Tanzania (0.27 m t) and Kenya (0.20 m t).

Efforts to improve pigeonpea with breeding tools have met with considerable success, and over 140 varieties have been developed over the last five decades for cultivation across diverse agroecological zones in India (Singh et al. 2017a). However, the actual yield (around 800 kg/ha) of pigeonpea remains well below its potential yield of over 2 t/ha (Varshney et al. 2012). Therefore, there is a need to improve the efficiency of current breeding activities to sustain yield gains in the face of increasing climate vagaries. In this context, large-scale genomics tools such as high-throughput DNA markers, saturated genome maps, comprehensive transcriptome assemblies, whole-genome assemblies and, importantly, DNA markers associated with breeding traits have been developed to support pigeonpea improvement (Table 1). Also, the cytoplasmic male sterility (CMS)-based hybrid system has emerged as a promising tool to boost pigeonpea yield. Recent adoption of modern breeding tools including genomic selection (GS) and speed breeding or rapid generation turnover (RGT) is likely to enhance genetic gains in pigeonpea. Delivering the gains to farmers’ fields, however, demands strengthening seed systems and agronomic practices. We begin this review by discussing the latest developments in pigeonpea genomics with an emphasis on next-generation resources. This is followed by sections that explain how pigeonpea improvement can be accelerated by employing new genomic designs in combination with efficient use of genetic diversity, heterosis and RGT. We also highlight the increasing role that improved seed systems could play in achieving productivity gains in farmers’ fields.

Table 1 Genomic resources in pigeonpea

Modern genomic resources for pigeonpea

The latest developments in pigeonpea genomics have generated a variety of modern genomic tools and technologies for application in crop improvement (Fig. 1). In this section, we present a brief account of the genomic resources currently available for the genetic enhancement of pigeonpea, with a focus on high-density assays.

Fig. 1

A genomics-based holistic approach to deliver gains in farmers’ fields. This approach demands a coordinated improvement in germplasm, breeding and seed systems. As recommended by Khoury et al. (2015), germplasm collection programs of pigeonpea should be better informed. Operational efficiency of pigeonpea breeding programs can be enhanced by deploying modern tools and technologies like marker-assisted selection (MAS), GS, RGT, etc. Concurrently, efficient seed systems will ensure rapid delivery of such improved cultivars to farmers’ fields followed by their timely replacement, reflected as a dramatic improvement in seed replacement rate (SRR) and varietal replacement rate (VRR)

Marker technologies

Molecular markers remain indispensable to genomics research and molecular breeding for genetic enhancement of any crop. Initially, in pigeonpea, various molecular markers such as restriction fragment length polymorphism (RFLP) (Nadimpalli et al. 1994; Sivaramakrishnan et al. 1997, 2002), random amplified polymorphic DNA (RAPD) (Ratnaparkhe et al. 1995) and amplified fragment length polymorphism (AFLP) (Panguluri et al. 2006) were developed and used in the assessment of genetic diversity and trait-specific molecular mapping. The development of simple sequence repeat (SSR) markers catalyzed pigeonpea molecular analyses. Initially, the procedures for developing SSR markers from genomic libraries (Burns et al. 2001; Saxena et al. 2010a) and by mining expressed sequence tags (ESTs) (generated using the Sanger platform) were time-consuming and expensive. To overcome this, a survey of BAC-end sequences (BESs) resulted in the development of the first large-scale set of 3072 SSR markers for genotyping applications in pigeonpea (Bohra et al. 2011). In different crops, SSRs were the markers of choice during 2000–2010 owing to their genome abundance, multiallelic and codominant nature, and ease of genotyping. However, in the case of cultivated pigeonpea, limited molecular (genetic) diversity hindered the application of SSRs in marker–trait association (MTA) studies. Therefore, the attention of pigeonpea researchers shifted toward high-throughput, automated and cost-efficient DNA marker systems. To this end, diversity arrays technology (DArT), a hybridization-based highly parallel genotyping protocol, generated thousands of polymorphic loci in pigeonpea that were used for genetic diversity analysis and linkage mapping (Yang et al. 2006, 2011). Further, next-generation sequencing (NGS) technologies were used for marker development in pigeonpea.
Single-nucleotide polymorphism (SNP) markers have now become the most preferred DNA markers given their higher genome abundance, amenability to automation and ubiquitous distribution throughout the genome. The concomitant development of genotyping platforms such as BeadXpress (Roorkiwal et al. 2013) and the GoldenGate assay (Kassa et al. 2012) in pigeonpea allowed medium- to high-throughput SNP genotyping. Saxena et al. (2011) employed the cost-effective KASP technology to genotype 1616 SNPs, referred to as pigeonpea KASP assay markers. Genome-wide SNP analysis of diverse pigeonpea accessions has helped discern the crop domestication and demographic history of pigeonpea (Saxena et al. 2014).

Owing to “ease in library preparation” and “greater multiplexing capacity,” genotyping methods like genotyping-by-sequencing (GBS) have offered a promising way to simultaneously discover and genotype thousands of SNPs (Saxena et al. 2017a). Other similar approaches like whole-genome resequencing (WGRS)/skim sequencing have also been greatly facilitated by the availability of the reference genome sequence of pigeonpea. However, the inherent problems with the GBS assay, such as a large proportion of missing data and “ascertainment bias,” encouraged researchers to establish array-based platforms for high-density genotyping in pigeonpea. To this end, resequencing of diverse germplasm and advanced breeding lines of pigeonpea has facilitated development of the Axiom Cajanus SNP array with 56,512 unique and informative sequence variations tiled on the array (Saxena et al. 2018a). Notably, the inclusion of 1554 SNPs and 385 insertion/deletion (InDel) polymorphisms potentially associated with some key agronomic traits makes the array more suitable for identification of new haplotypes for associated traits.

The first-generation HapMap of pigeonpea based on WGRS data of 20 Cajanus accessions revealed 5.5 million genome-wide variants including 4.6 million SNPs and 0.7 million InDels along with large structural variations (SVs) like copy number variation (CNV: 2598) and presence/absence variation (PAV: 970) (Kumar et al. 2016). More recently, Varshney et al. (2017) performed WGRS of 292 accessions including landraces, elite breeding lines and wild accessions, and the study revealed large SVs (≥ 1000 bp) in breeding lines (282 CNVs, 35 PAVs), landraces (228 CNVs, 37 PAVs) and wild species accessions (173 CNVs, 77 PAVs). Among larger SVs, CNVs and PAVs are gaining attention, with a growing body of research associating these SVs with agronomic traits including flowering time, stress response and evolutionary transitions.

High-density genetic maps

High genotyping costs coupled with a shortage of polymorphic loci have greatly hampered the development of high-density genetic maps in pigeonpea. In 2011, the first SSR-based genetic map of pigeonpea with 239 loci was developed for an interspecific F2 population [Cajanus cajan acc. ICP 28 × Cajanus scarabaeoides (L.) Thouars acc. ICPW 94]. This was accompanied by the development of a series of genetic maps for cultivated pigeonpea; however, the marker density of these maps remained extremely poor (Gnanesh et al. 2011; Bohra et al. 2012). The marker densities of the genetic maps in pigeonpea have now improved dramatically in concert with the advances in de novo discovery and high-density genotyping of sequence variants (Table 2). The first high-density genetic map of pigeonpea was developed using SNP markers for an interspecific F2 population (Saxena et al. 2012). This 910-loci map had an average intermarker distance of 1.11 cM. As shown in Table 2, other high-density genetic maps of pigeonpea include 1101 loci (Saxena et al. 2017a), 964 loci (Saxena et al. 2017b) and 787 loci (Saxena et al. 2017c). Apart from population-specific genetic maps, Arora et al. (2017) developed a 932-loci consensus genetic map that consolidates marker information from three F2 populations (Asha × UPAS 120, Pusa Dwarf × H2001-4 and Pusa Dwarf × HDM04-1). Earlier, a 1039-loci consensus genetic map was constructed for cultivated pigeonpea using six mapping populations (Bohra et al. 2012). To date, the most comprehensive genetic map of pigeonpea harbors 6818 SNP loci that span 974 cM of the genome (Yadav et al. 2019). Availability of such high-density genetic maps is key to understanding the genomic architecture of essential agronomic traits, fine mapping of gene(s)/QTL, facilitating comparative genomics and anchoring sequence scaffolds (Hu et al. 2018).
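
The marker densities discussed above come down to simple arithmetic: map length divided by the number of marker intervals. The short sketch below applies this to the 6818-locus, 974 cM map. The function name is hypothetical, and the resulting spacing is our own calculation, not a figure reported in the cited study; it also treats the map as one continuous stretch and ignores the split into linkage groups.

```python
def mean_marker_spacing(map_length_cm: float, n_loci: int) -> float:
    """Average distance between adjacent loci, in cM.

    Assumption: the map is treated as a single stretch with
    (n_loci - 1) intervals; linkage-group boundaries are ignored.
    """
    if n_loci < 2:
        raise ValueError("need at least two loci")
    return map_length_cm / (n_loci - 1)


# Values quoted in the text for the densest map (6818 SNP loci, 974 cM).
spacing = mean_marker_spacing(974.0, 6818)
print(f"{spacing:.3f} cM between adjacent loci on average")
```

On these numbers the average spacing is roughly 0.14 cM, about an eighth of the 1.11 cM reported for the 910-locus map.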

Table 2 Recent high-density genetic linkage maps developed in pigeonpea

Comprehensive transcriptomic resources

Development of transcriptomic resources in pigeonpea has yielded valuable insights into the spatiotemporal expression of important genes, their regulatory mechanisms and participation in a variety of biological processes. Also, a plethora of functional or genic molecular markers were generated for genetic research and breeding applications in pigeonpea. Based on Sanger sequencing of cDNA libraries, the first set of transcriptomic resources comprising 9468 high-quality ESTs was developed, which served for identification of candidate genes for fusarium wilt (FW: 19 genes) and sterility mosaic disease (SMD: 20 genes) and of a set of 3583 SSRs. The recent shift from conventional gene expression methods to digital platforms has expanded the repertoire of transcriptomic resources. Multiple transcriptome assemblies with 21,434 transcript assembly contigs (TACs) (Kudapa et al. 2012), 48,726 TACs (Dubey et al. 2011) and 43,324 TACs (Dutta et al. 2011) have been reported in pigeonpea with the help of Illumina and FLX/454 platforms. Recent gene expression studies in pigeonpea have shed light on the plant stress response and offered a set of stable reference genes to facilitate expression studies under drought, heat and salinity stress (Sinha et al. 2015a, b).

A global view of gene expression may help in bridging the gap between the whole-genome sequence information and the plant phenotypes. Moreover, the gene functions of genome assemblies are assigned according to homology and de novo gene prediction programs. To complement this gene information, the Cajanus cajan gene expression atlas (CcGEA) has been developed with RNA-Seq information covering the entire life cycle of pigeonpea (Pazhamala et al. 2017). The CcGEA catalogs a set of 28,793 genes expressed during different developmental stages (embryo to senescence), with a special emphasis on genes engaged in fertilization and seed formation. Importantly, the CcGEA suggests a role of posttranscriptional modifications or epitranscriptomics in seed and embryo development in pigeonpea. Co-expression network analysis allowed identification of 28 genes and three highly connected genes or hub genes for flower-related traits in pigeonpea. The authors advocated that focusing on such gene clusters or modules involved in a common biological function would be more rewarding than seeking the expression pattern of an individual candidate gene. Refinements in the genome assemblies in combination with growing pangenomic studies will fuel upgradation of the current CcGEA, thus increasing opportunities to accelerate basic and applied research (Pazhamala et al. 2017).
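
To make the idea of a hub gene concrete, the sketch below builds a toy co-expression network and ranks genes by connectivity. It is an illustration only: the text does not describe the network method used for the CcGEA, so the absolute Pearson correlation cutoff, the gene names and the expression values here are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # flat profile: correlation undefined, treat as 0
        return 0.0
    return sxy / sqrt(sxx * syy)


def hub_genes(expression, threshold=0.5, top=1):
    """Connect genes whose |r| >= threshold; return the `top` best-connected."""
    degree = {gene: 0 for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            degree[g1] += 1
            degree[g2] += 1
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return ranked[:top], degree


# Hypothetical expression values across four tissues (not real CcGEA data).
toy_expression = {
    "gene_hub": [8, 4, 4, 4],
    "gene_1": [6, 6, 4, 4],
    "gene_2": [6, 4, 6, 4],
    "gene_3": [6, 4, 4, 6],
}
hubs, degree = hub_genes(toy_expression, threshold=0.5, top=1)
print(hubs, degree)
```

In this toy set, `gene_hub` is linked to all three other genes, which are uncorrelated with one another, so it stands out as the single highly connected node.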

Draft genome sequence

Advances in sequencing technologies have facilitated the creation of draft genome assemblies in different legume crops (Bohra and Singh 2015), and pigeonpea became the first orphan crop and the second food legume after soybean to be sequenced using a de novo sequencing approach. With the help of Illumina technology and the Sanger-sequenced BESs, 605.78 Mb of the genome was assembled for genotype Asha (ICPL 87119), with a scaffold N50 of 516.06 kb (Varshney et al. 2012). The genome assembly contains a total of 48,680 genes with a mean transcript length of 2348.70 bp. Analysis of the genome assembly provided new insights into important traits such as drought response. Interestingly, the number of drought-responsive genes (111) in the pigeonpea genome is similar to that of soybean (109) notwithstanding the genome duplication event in soybean. Approximately half of the pigeonpea genome (51.6%) is composed of repeat elements (REs), with the majority of the REs belonging to long terminal repeat (LTR) retrotransposons. In another whole-genome sequencing attempt, a 510 Mb genome of the genotype Asha was assembled following analysis of sequence reads generated using the 454 GS-FLX platform. The authors found a total of 47,004 protein-coding genes in the genome assembly, with 1213 and 152 genes engaged, respectively, in disease response and abiotic stress (Singh et al. 2011).
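
For readers unfamiliar with the scaffold N50 statistic quoted above, a minimal sketch of how it is usually computed follows. The scaffold lengths are invented for illustration and do not come from the pigeonpea assembly.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half
    of the total assembled bases (the usual N50 definition)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequences given")


# Hypothetical scaffold lengths in kb; total 290, so half is 145.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))  # the two longest scaffolds (80 + 70) reach 150
```

A higher N50 means that more of the assembly sits in long scaffolds, which is why the statistic is reported alongside total assembly size.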

The genome sequence analysis provided new clues about the pigeonpea genes affected by domestication and breeding, thus revealing breeding targets for future improvement. Besides offering biological insights on genome evolution, the reference genome sequence provides a rich resource for genome-scale marker discovery and resequencing and other NGS-based applications. However, the single reference genome remains inadequate to understand the enormous genetic diversity contained in the wide crop germplasm collections including landraces and wild relatives. Toward this end, sequencing of multiple genomes has enabled deeper investigations on genetic diversity and allelic variation controlling important traits (Varshney et al. 2017; Tao et al. 2019). Also, accession-specific or unique molecular signatures as elucidated from resequenced genomes hold great relevance in terms of varietal identification and genotypes adapted to particular agroecologies (Kumar et al. 2016).

Next-generation mapping resources and high-resolution trait dissection

A variety of gene/QTL affecting important agricultural traits have been mapped in pigeonpea using conventional gene/QTL discovery methods (Varshney et al. 2013; Bohra et al. 2017a, 2019). These conventional methods relying on low-throughput marker systems face challenges in terms of time, cost and labor. Also, the low density of polymorphic markers limits the resolution of genetic mapping achieved by these methods (Schneeberger 2014). The following section covers innovative experimental designs and mapping techniques applied in pigeonpea that overcome the shortcomings of conventional procedures of gene discovery.

Gold standard experimental designs

Narrow diversity and limited recombination constrain the resolution of genetic mapping based on biparental populations. This problem can be addressed by adopting complex experimental designs that accommodate multiple founders and profuse genome reshuffling via inter-mating (Huang et al. 2015; Bazakos et al. 2017). These high-power and high-resolution mating schemes, termed “gold standard experimental designs” (Wallace et al. 2018), have emerged as long-term genetic resources to elucidate the genetic architecture of important agronomic traits in various crops. Two such designs, multiparent advanced generation intercross (MAGIC) and nested association mapping (NAM), have been used extensively in crop genetic studies. These multiparental populations remain particularly suitable for mapping adaptive loci, as their “balanced” nature separates population structure from adaptation. The pigeonpea MAGIC panel involves eight founder parents with seven funnels (see Huang et al. 2015).

NGS-based trait mapping

With the advent of NGS technologies and availability of the reference genome sequence, next-generation trait mapping approaches, particularly sequencing-based bulked segregant analysis (Seq-BSA), have been employed in pigeonpea for rapid gene discovery. Combining Seq-BSA with the nonsynonymous (ns) SNP substitution approach led to the discovery of eight nsSNPs in seven genes (Singh et al. 2016). Seq-BSA was applied to extreme bulks obtained from the recombinant inbred line (RIL) population (ICPL 20096 × ICPL 332), while the nsSNP substitution approach was based on the WGRS data of four pigeonpea genotypes (ICPL 20097, ICP 8863, ICPL 99050 and ICPB 2049). Four of the nsSNPs were associated with FW, while the remaining four showed associations with SMD. In silico protein analysis and gene expression profiling provided evidence that the genes C.cajan_01839 and C.cajan_03203 are causal for SMD and FW resistance, respectively. A similar InDel-Seq approach aimed at identifying genomic regions associated with SMD and FW in pigeonpea revealed a total of 16 InDels, of which five were further validated from analysis of resequencing data (Singh et al. 2017b). Three InDels responsible for FW resistance were located on linkage groups (LGs) 2, 7 and 8, while the two controlling SMD resistance were located on LGs 2 and 10. Recently, WGRS data of high seed protein content (SPC) lines (HPL 24, ICP 5529), a low SPC line (ICP 11605) and the draft genome of ICPL 87119 (low SPC) were subjected to common variant analysis for detection of candidate genes associated with SPC (Obala et al. 2019). This strategy in combination with gene function analysis showed 108 ns sequence variations in 57 candidate genes. Sanger sequencing confirmed sequence variations in 37 genes, and a subset of these variants (30) was converted to cleaved amplified polymorphic sequence (CAPS)/derived CAPS (dCAPS) markers. A possible association between four CAPS/dCAPS markers and the SPC trait was demonstrated in an F2 population.
In addition to WGRS, other high-density SNP genotyping approaches such as GBS and SNP array have also been employed in pigeonpea for enhanced trait dissection (Table 3). For example, high-density Axiom array data on a RIL population facilitated identification of QTL explaining up to 50.6% of variation for important traits such as cleistogamous flower, shriveled seed and seed size (Yadav et al. 2019).
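
The logic of comparing extreme bulks in Seq-BSA can be sketched with the ΔSNP-index, a statistic commonly used in bulked segregant sequencing. The text does not say which statistic the cited pigeonpea studies used, so the statistic, the 0.8 cutoff, and the positions and read counts below are all assumptions made for illustration.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_reads == 0:
        raise ValueError("site has no read coverage")
    return alt_reads / total_reads


def delta_snp_index(resistant_bulk, susceptible_bulk):
    """Difference in SNP-index between two bulks; each is (alt, total)."""
    return snp_index(*resistant_bulk) - snp_index(*susceptible_bulk)


def candidate_positions(sites, min_abs_delta=0.8):
    """Positions where allele frequencies diverge strongly between bulks.

    The 0.8 cutoff is arbitrary; real analyses typically use sliding
    windows and simulated confidence intervals rather than a fixed value.
    """
    return [
        pos
        for pos, resistant, susceptible in sites
        if abs(delta_snp_index(resistant, susceptible)) >= min_abs_delta
    ]


# Hypothetical sites: (position, (alt, total) resistant, (alt, total) susceptible)
toy_sites = [
    (1000, (19, 20), (1, 20)),   # alleles nearly fixed in opposite bulks
    (2000, (10, 20), (11, 20)),  # no divergence, likely unlinked
    (3000, (2, 25), (24, 25)),   # strong divergence, opposite direction
]
print(candidate_positions(toy_sites))
```

Sites unlinked to the trait should show a ΔSNP-index near zero, whereas sites close to a causal locus drift toward +1 or -1.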

Table 3 QTL for important traits in pigeonpea as revealed by next-generation sequencing (NGS)-based trait mapping

Association mapping/genome-wide association studies (GWAS)

Association analysis or genome-wide association study (GWAS) is another approach that addresses the low precision of conventional QTL mapping. GWAS, relying on historical recombination, is recommended to gain enhanced understanding of the genetic underpinning of complex traits (Ingvarsson and Street 2011; Korte and Farlow 2013; Huang and Han 2014; Liu and Yan 2019). GWAS enables dissection of complex phenotypes in a set of distantly related individuals with a higher number of alleles and greater mapping resolution. Association mapping of 94 pigeonpea genotypes with 6144 DArT and 768 SNP markers allowed the authors to detect significant MTAs involving 19 SNPs (R² = 8.05–8.58%) and six DArTs (R² = 9.79–14.53%) for the determinacy trait (Mir et al. 2012). Later, candidate gene-based analysis of 142 pigeonpea germplasm accessions coupled with biparental QTL (ICPA 2039 × ICPR 2447) and gene expression analyses suggested a strong association of CcTFL1 with the determinacy trait (Mir et al. 2014). Association mapping of 89 FW-responsive pigeonpea genotypes revealed association of six SSR markers (HASSRs 8, 18, 30, 121, 128, 174) with FW resistance (Patil et al. 2017a), and two of these were further validated through analysis of F2 individuals (Patil et al. 2017b).
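
The R² values quoted for individual MTAs express the share of phenotypic variance explained by a marker. As a rough illustration, the sketch below computes R² from a single-marker linear regression of phenotype on allele dosage. This is a simplification: the text does not specify the statistical model used in the cited studies, and practical GWAS models also correct for population structure and kinship. The genotype codes and phenotype values are invented.

```python
def marker_r2(genotypes, phenotypes):
    """R² of a simple linear regression of phenotype on allele dosage.

    For a single predictor this equals the squared Pearson correlation.
    No correction for population structure is applied (assumption).
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sgp = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    spp = sum((p - mean_p) ** 2 for p in phenotypes)
    if sgg == 0 or spp == 0:  # monomorphic marker or constant phenotype
        return 0.0
    return sgp ** 2 / (sgg * spp)


# Hypothetical data: allele dosage (0, 1, 2) and a quantitative trait.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_phenotypes = [10, 12, 13, 15, 16, 18]
r2 = marker_r2(toy_genotypes, toy_phenotypes)
print(f"R² = {100 * r2:.1f}%")
```

Values in the 8–15% range, as reported above, indicate that each marker accounts for only a modest part of the trait variation.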

A total of 241 MTAs were detected for various agronomic traits following GWAS of 286 resequenced pigeonpea accessions. Interestingly, the abundance of MTAs on LG 9 (such as 86 MTAs for days to 50% flowering) underscored the prominent role of this particular pseudomolecule and the associated SVs in pigeonpea domestication and breeding (Varshney et al. 2017). Notwithstanding the advantages of association genetics over biparental QTL analysis, only limited association mapping studies have been conducted so far in pigeonpea. Nevertheless, we anticipate that the increased access to high-throughput genotyping and wide-scale phenotyping will encourage pigeonpea researchers to perform high-resolution trait analysis in pigeonpea by using efficient genomic designs like GWAS.

Advances in hybrid breeding

Hybrid breeding is an important research arm of the yield enhancement program of pigeonpea. Hybrid breeding activities began with the discovery of a natural source of genetic male sterility (GMS) in a germplasm line (Reddy et al. 1978). Subsequently, the success in breeding cytoplasmic–nuclear male sterility (Saxena et al. 2005) gave wings to this program. In the following section, we briefly discuss the major breakthroughs that led to the development of a full-scale commercial hybrid breeding technology in pigeonpea.

Male sterility systems

Being an often cross-pollinated crop, pigeonpea offers exciting avenues to harness hybrid vigor (Bohra et al. 2017b). Discovery of male sterility systems in pigeonpea paved the way for hybrid development by circumventing the need for manual emasculation. The male sterility trait has emerged as a boon for hybrid seed industry. Since the application of GMS in commercial hybrid production was a herculean task in terms of hybrid seed production, this program was abandoned.

The cytoplasmic–nuclear male sterility (CMS) system, which is maternally inherited, allows hybrid seed to be produced economically. According to Horn et al. (2014), the CMS system has been reported in over 150 plant species, and among legumes it has been reported in soybean, common bean, pigeonpea and faba bean (Saxena and Hingane 2015; Bohra et al. 2016). Rigorous efforts by pigeonpea breeders have yielded diverse sources of CMS systems. These include A1 [Cajanus sericeus (Benth. ex Baker) Maesen], A2 (Cajanus scarabaeoides), A3 [Cajanus volubilis (Blanco) Blanco], A4 [Cajanus cajanifolius (Haines) Maesen], A5 (Cajanus cajan), A6 [Cajanus lineatus (Wight & Arn.) Maesen] and A7 [Cajanus platycarpus (Benth.) Maesen] (see Saxena et al. 2010b and references therein). Saxena (2013) later reported A8, a new CMS system obtained by crossing a wild species [Cajanus reticulatus (Aiton) F. Muell] with cultivated pigeonpea. Of these eight CMS sources, only A2 and A4 have been used for the development of pigeonpea hybrids in India (Table 4).

Table 4 CMS-based hybrids of pigeonpea developed for diverse agroecologies in India

Recently, a temperature-sensitive genic male sterility (TGMS) line was obtained from the cross between a cultivated line (IPCA 85010) and the wild relative C. sericeus (Saxena 2014). The line becomes male sterile at higher temperatures (≥ 25 °C) and remains male fertile at lower temperatures (≤ 24 °C). Such a two-line breeding system simplifies both line maintenance and hybrid seed production, since the same TGMS line serves as both male-sterile and male-fertile parent and any fertile line can act as pollen parent for hybrid development (Chen and Liu 2014).

Genomics interventions in hybrid breeding

The CMS system includes a male-sterile line or A-line, a maintainer line or B-line and a fertility restorer line or R-line. B-lines are crossed with cognate A-lines to maintain A-lines, whereas hybrids are produced by crossing A-lines with R-lines. Maintaining the genetic purity of parental lines and hybrids is thus essential to harness the benefit of high heterosis and is key to the success of hybrid technology in any economically important crop. During seed production, genetic purity is prone to contamination due to the presence of pollen shedders as well as physical admixtures during processing.

Traditionally, grow-out test (GoT) on representative samples of the seed lot is performed to assess the purity of hybrid seeds. The GoT involves growing plants to maturity and assessing several morphological and floral characteristics to determine the purity of the hybrid. Many companies/breeding programs undertake GoT in several crops for assessing the purity of hybrid seed before the sale. In the case of pigeonpea, however, as only one cycle of crop can be taken annually, it will take almost a full year to assess the purity of hybrid seeds by using conventional GoT. This, in turn, will result in major capital investment in storing the seeds for the next cropping season/year. Furthermore, GoT can be subjective, as several aspects of plant phenotype can be affected by environmental conditions. The duration and other limitations of conventional GoT can be managed effectively by applying molecular markers.

Molecular markers for CMS

Based on mitochondrial genome sequencing in pigeonpea, candidate chimeric open reading frames (ORFs) and genes were identified in an A-line (ICPA 2039) and its maintainer line (ICPB 2039) (Tuteja et al. 2013). Further, comparative analysis of 34 mitochondrial genes at sequence and expression level between ICPA 2039 and ICPB 2039 revealed a 10 bp deletion in nad7a in the CMS line. The 10 bp deletion has been used to develop a PCR-based InDel marker (nad7a_del), which has been validated for testing the genetic purity of A4-derived CMS lines (Sinha et al. 2015c).

Markers for restoration of fertility (Rf) in A4 hybrid system

Two markers, namely “CcLG08_RFQI1” and “CcLG08_RFQI4,” have been developed to differentiate restorer lines from nonrestorer lines in A4 hybrid system in pigeonpea. These markers were developed from the QTL regions identified in a segregating F2 population derived from male-sterile (ICPA 2039) line and fertility restorer (ICPL 87119) line (Saxena et al. 2018b).

The markers developed for CMS and Rf will reduce the phenotyping needed to identify CMS and restorer lines. Moreover, the suitability of these DNA markers for low-cost genotyping helps overcome the technical challenges faced with advanced genotyping methodologies.

Markers for hybrid seeds testing

To assess the purity level of hybrid seeds, SSR markers were initially developed for a few hybrids such as GTH 1, ICPH 2438, ICPH 2671, IPH 09-5 and IPH 15-03 (Saxena et al. 2010c; Patel et al. 2012; Bohra et al. 2012, 2015, 2017c). However, in practical hybrid breeding, purity assessment of large quantities of seed through SSRs could be expensive. Therefore, low-density SNP assays have been optimized for several hybrids to bring down the genotyping cost of marker-based hybrid purity testing (http://cegsb.icrisat.org/high-throughput-genotyping-project-htpg/). On this platform, data on ten SNPs could be generated at US $ 1.5–2 per sample including DNA extraction. These SNPs are being used in routine breeding programs (unpublished).
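
The principle behind marker-based purity testing can be sketched as follows: at a marker where the two parents are homozygous for different alleles, a true hybrid seed must be heterozygous. The scoring rule used on the platform mentioned above is not described in the text, so the rule, the function names and the genotype calls below are assumptions for illustration.

```python
def is_true_hybrid(seed_calls, female_calls, male_calls):
    """True if the seed is heterozygous at every informative marker.

    A marker is informative when both parents are homozygous for
    different alleles (e.g. "AA" x "GG"); other markers are skipped.
    """
    informative = 0
    for seed, female, male in zip(seed_calls, female_calls, male_calls):
        if len(set(female)) == 1 and len(set(male)) == 1 and female != male:
            informative += 1
            if set(seed) != {female[0], male[0]}:
                return False
    if informative == 0:
        raise ValueError("no informative markers between the parents")
    return True


def purity_percent(seed_lot, female_calls, male_calls):
    """Share of seeds in the sample scored as true hybrids."""
    hybrids = sum(is_true_hybrid(s, female_calls, male_calls) for s in seed_lot)
    return 100.0 * hybrids / len(seed_lot)


# Hypothetical calls at three SNPs; the third is not informative.
female = ["AA", "CC", "TT"]  # A-line
male = ["GG", "TT", "TT"]    # R-line
toy_lot = [
    ["AG", "CT", "TT"],  # hybrid
    ["AA", "CC", "TT"],  # matches the female parent: off-type
    ["AG", "TC", "TT"],  # hybrid (allele order does not matter)
    ["AG", "CT", "TT"],  # hybrid
]
print(purity_percent(toy_lot, female, male))
```

Because only a handful of informative SNPs is needed per hybrid, a low-density assay of the kind described above is sufficient for this purpose.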

Toward construction of heterotic pools and inferring heterotic patterns

In the long term, delivery of hybrids with greater heterosis for grain yield from any crop breeding program necessitates the establishment of heterotic groups and heterotic patterns (Longin et al. 2013). Melchinger and Gumber (1998) have defined heterotic group as “a set of genotypes which display a comparable hybrid performance when crossed with another genetically distinct set of genotypes” and heterotic pattern as “a specific pair of heterotic groups showing an optimum exploitation of heterosis” (Boeven et al. 2016). Earlier studies on defining heterotic groups in crops relied on estimates of combining ability and heterosis, and pedigree details. In pigeonpea, however, limited attention has been placed on this aspect with only one study reporting the development of heterotic groups (Saxena and Sawargaonkar 2014). The authors have constructed seven heterotic groups based on the performance (in terms of SCA effects) of 102 hybrids derived from crossing three CMS lines (ICPA 2043, ICPA 2047, and ICPA 2092) to 34 testers. Molecular marker technologies have emerged as great add-on to the traditional way of identifying heterotic groups.
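
Since the heterotic groups above were built on SCA effects, a brief sketch of the standard line × tester estimate may help: the SCA effect of a cross is its mean minus the line mean and the tester mean, plus the grand mean. The 2 × 2 yield table is invented and much smaller than the 3 × 34 design of the cited study, and the grouping step itself is not reproduced here.

```python
def sca_effects(yields):
    """SCA effect of each cross in a line x tester table.

    yields[i][j] is the mean performance of line i crossed with tester j.
    SCA_ij = y_ij - line_mean_i - tester_mean_j + grand_mean.
    """
    n_lines = len(yields)
    n_testers = len(yields[0])
    grand_mean = sum(sum(row) for row in yields) / (n_lines * n_testers)
    line_means = [sum(row) / n_testers for row in yields]
    tester_means = [
        sum(row[j] for row in yields) / n_lines for j in range(n_testers)
    ]
    return [
        [
            yields[i][j] - line_means[i] - tester_means[j] + grand_mean
            for j in range(n_testers)
        ]
        for i in range(n_lines)
    ]


# Hypothetical hybrid yields: two CMS lines (rows) x two testers (columns).
toy_yields = [
    [10, 14],
    [12, 12],
]
sca = sca_effects(toy_yields)
print(sca)
```

Crosses with large positive SCA perform better than the average effects of their parents would predict, which is the signal used to place parents in complementary groups.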

Unlocking genetic diversity of breeding relevance

Use of diverse genetic resources in breeding programs and understanding of the genomic diversity in the gene pools are crucial for accelerating crop improvement. Conventionally, morphological observations provide an idea about the genetic divergence among the genotypes available to breeders for executing breeding programs. Such information, however, is biased by genotype–environment interactions. Even multilocation data cannot eliminate this bias completely, and this may lead to some inefficiency in selection. Following the advent of DNA marker technologies, a shift was evident toward estimating diversity at the molecular level using genotypic data. Molecular marker data, however, are deemed unsuitable to provide information about the genetic diversity of key adaptive traits (Jackson et al. 2011).

In pigeonpea, genetic diversity has been estimated using gene-based functional markers and, more recently, WGRS of diverse accessions. As mentioned earlier, the recent discovery of genome-wide SNP markers in pigeonpea has offered novel insights about the germplasm diversity and relationships among breeding materials. Even prior to the whole-genome sequencing and resequencing of pigeonpea, the SNP-based analysis of diverse germplasm including wild species indicated a severe “domestication bottleneck” in pigeonpea (Kassa et al. 2012). Another analysis in pigeonpea with 1616 SNPs confirmed its origin in the central Indian province of Madhya Pradesh and the occurrence of domestication processes in Andhra Pradesh (Saxena et al. 2014).

Latest research using high-density SNP arrays and resequencing data has revealed trends in genetic diversity and breeding targets in legumes (Bohra and Singh 2015). For instance, WGRS data of 292 accessions including breeding lines, landraces and wild species (C. cajanifolius, C. scarabaeoides and C. platycarpus) suggested selection sweeps on LG 9. A less severe domestication bottleneck from landraces to modern varieties suggested a relatively less intensive breeding history. Earlier, Kumar et al. (2016), using the WGRS data of 20 Cajanus accessions, observed frequent occurrence of gene loss events in domesticated forms as compared to the wild species. The analysis of 63 released cultivars and 40 germplasm/landraces with the 50 K SNP array pointed to a near-constant diversity in pigeonpea breeding programs over the last decades (Saxena et al. 2018a), and this was congruent with the trend demonstrated earlier with SSR markers in pigeonpea (Bohra et al. 2017c). Enhancing pigeonpea yield with climate adaptation will require reintroduction of the diversity lost during the process of domestication and selection (Kumar et al. 2016). Also, the high-density genotyping/sequencing data on diverse germplasm panels may be combined with phenotypic data to uncover new useful alleles associated with breeding traits. The fast-track identification and introgression of these useful alleles into adapted germplasm can be accomplished with the help of breeding techniques like speed breeding, genomic selection and marker-assisted backcrossing (MABC) (Li et al. 2018a). Besides efficient utilization of existing genetic resources, breeding pigeonpea for adaptation traits also requires a targeted germplasm collection strategy based on a comprehensive analysis of current gene bank holdings. Khoury et al. (2015) highlight the urgency of systematic ex situ collection of the Cajanus species, which should be informed by ecological/geographic gaps and the presence of potential adaptive traits.
Nearly 80% of the Cajanus species studied by Khoury et al. belonged to “high-priority category” for targeted collection from “high-value” regions such as India and northern Australia.

Genomic selection for improving breeding efficiency

Molecular mapping of important traits in pigeonpea has revealed a variety of candidate gene(s)/QTL having substantial effects on the phenotype (Table 3). Deploying these QTL into elite backgrounds enables faster development of genotypes with considerable improvement in the transferred trait. Unlike in other major legume crops, molecular breeding in pigeonpea is in its infancy, and recent initiatives include the transfer of major QTLs associated with stress and defense response (see Varshney et al. 2019). The MABC technique is effective in transferring a limited number of major-effect gene(s)/QTL but is not adequate to address complex plant traits controlled by many gene(s)/QTL, each having a small effect on the phenotype. In this context, the ability of GS to account for minor-effect loci is extremely important for predicting the performance of agriculturally important traits, the majority of which follow polygenic inheritance. The concept of GS was first proposed by Meuwissen et al. (2001) in animals for estimating the breeding values of individuals with unobserved phenotypes based on genome-wide marker data. GS enables selection of superior individuals using genome-wide marker data and circumvents the need for phenotypic scores of those individuals. The genomic estimated breeding value (GEBV) that forms the basis of genotype selection can be predicted using a variety of GS models (Crossa et al. 2017). Unlike MAS, GS avoids overestimation of marker effects because it does not rely on a set of markers having significant associations (Duhnen et al. 2017).
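
To make the idea of a GEBV concrete, the following minimal sketch fits a ridge regression on all markers simultaneously and ranks unphenotyped candidates by their predicted values. It is an illustration only: the data are simulated, the function names (ridge_marker_effects, predict_gebv) and the shrinkage parameter lam are assumptions introduced here, and ridge regression is just one of the many GS models referred to above, not the model used in any of the cited studies.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Estimate effects of all markers jointly; lam shrinks every effect equally."""
    mu = y.mean()
    col_means = X.mean(axis=0)
    Xc = X - col_means                      # center marker dosages
    p = Xc.shape[1]
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - mu)
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y - mu))
    return mu, col_means, beta

def predict_gebv(X_new, mu, col_means, beta):
    """GEBV of new genotypes = mean + sum of estimated marker effects."""
    return mu + (X_new - col_means) @ beta

# Simulated example (hypothetical sizes): 200 phenotyped training lines,
# 50 candidates with marker data only, 500 SNPs coded as 0/1/2 allele dosage.
rng = np.random.default_rng(0)
n_train, n_cand, n_markers = 200, 50, 500
X = rng.integers(0, 3, size=(n_train + n_cand, n_markers)).astype(float)
true_effects = rng.normal(0.0, 0.1, n_markers)   # many small-effect loci
g = X @ true_effects
y = g + rng.normal(0.0, g.std(), size=g.shape)   # heritability near 0.5

# lam is set arbitrarily here; in practice it is estimated or tuned.
mu, col_means, beta = ridge_marker_effects(X[:n_train], y[:n_train], lam=n_markers)
gebv = predict_gebv(X[n_train:], mu, col_means, beta)   # no phenotypes needed
top_candidates = np.argsort(gebv)[::-1][:5]             # indices of the best five
```

Because every marker receives an estimated effect, no significance threshold is applied, which is the property that distinguishes this approach from selection on a few significant markers.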

Moderate to high accuracies were obtained in legumes with GS prediction for various traits including yield (0.59: Jarquín et al. 2014; 0.47–0.62: Duhnen et al. 2017) and protein content (0.50–0.55: Duhnen et al. 2017) in soybean, and ascochyta blight disease resistance (0.56: Carpenter et al. 2018) and seed weight, seeds per plant and flowering time (0.65–0.83: Tayeh et al. 2015) in pea. In pea, Annicchiarico et al. (2017) have established the superiority of GS over phenotypic selection (PS) with regard to the prediction of grain yield under drought conditions. Similarly, higher efficiency of GS as compared to PS was evident in soybean from prediction models trained with phenotypic and genotypic data of 324 genotypes (Matei et al. 2018). The study found no significant influence of population structure on the GS prediction accuracy. This observation is in agreement with a recent study in chickpea where GS was applied for yield and other traits in 320 lines (Roorkiwal et al. 2016). The authors reported prediction accuracies ranging from 0.13 (seed yield) to 0.91 (100 seed weight). These studies highlight the substantial influence of the size and composition of the training population on the GS prediction accuracy. On the other hand, changes in the statistical method and the marker density had limited impact on the prediction accuracy (Tayeh et al. 2015; Roorkiwal et al. 2016; Duhnen et al. 2017). Recent studies suggested that genomic prediction could be improved by using multienvironmental settings and by incorporating G × E interactions into GS models (multiplicative reaction norm model: Jarquín et al. 2014; Roorkiwal et al. 2018). Inclusion of epistasis in GS models is also reported to contribute to improved prediction accuracy (Duhnen et al. 2017). Though GS does not need a set of markers associated with the trait, incorporating information about the significant markers is shown to improve the accuracy of genomic prediction (Spindel et al. 2016). For instance, in chickpea, prediction accuracy improved twofold when using a subset of SNPs selected from GWAS instead of the entire set of WGRS-SNPs (Li et al. 2018b).
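
Prediction accuracies such as those quoted above are commonly reported as the correlation between predicted and observed values in lines withheld from model training. The sketch below shows one way this could be computed with k-fold cross-validation; the ridge model, the function name cv_prediction_accuracy, the parameter values and the simulated data are all assumptions made for illustration and do not reproduce the validation scheme of any cited study.

```python
import numpy as np

def cv_prediction_accuracy(X, y, lam, k=5, seed=0):
    """Mean Pearson correlation between predicted and observed phenotypes
    across k cross-validation folds, using a ridge regression model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    accuracies = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)          # all lines not in this fold
        mu = y[train].mean()
        means = X[train].mean(axis=0)
        Xc = X[train] - means
        beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]),
                               Xc.T @ (y[train] - mu))
        predicted = mu + (X[test] - means) @ beta
        accuracies.append(np.corrcoef(predicted, y[test])[0, 1])
    return float(np.mean(accuracies))

# Simulated panel (hypothetical): 300 lines, 50 SNPs, heritability near 0.5.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(300, 50)).astype(float)
effects = rng.normal(size=50)
g = X @ effects
y = g + rng.normal(0.0, g.std(), 300)

acc = cv_prediction_accuracy(X, y, lam=1.0)
print(f"cross-validated prediction accuracy: {acc:.2f}")
```

Rerunning such a function with different training-set sizes or compositions is a simple way to see why these factors influence the reported accuracy more than the choice of statistical method.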

In recent years, genomic prediction has been applied for the identification of high-yielding heterotic patterns in crops. In wheat, Zhao et al. (2015) phenotyped a training set comprising 135 parental lines and 1604 hybrids at 11 different locations. Parental lines were assayed with a 90 K SNP array, while the genotyping data of hybrids were “deduced” from parental SNP profiles. The model trained on this set was then used to predict the performance of a total of 9045 hybrids, including 7441 hybrids with unobserved phenotypes. Varying degrees of relatedness between the training and test sets yielded prediction accuracies from 0.32 to 0.89. This was followed by identification of heterotic groups and a high-yielding heterotic pattern from predicted hybrid performance using a simulated annealing algorithm. A similar approach was adopted recently in pigeonpea, where a training population of 396 resequenced lines and 435 hybrids was used to predict the performance of a total of 78,210 possible single-cross hybrids. Prediction accuracy among different test scenarios could reach up to 0.24. The simulated annealing algorithm identified two heterotic groups with 20 lines each, and the average yield of intergroup crosses was 25% more than the average of all test entries (unpublished data).
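
The hybrid totals quoted above equal n(n − 1)/2 for the respective numbers of parental lines, which is consistent with counting every pairwise single cross once and ignoring reciprocals. The short sketch below checks this arithmetic and illustrates one simple way in which a hybrid genotype could be deduced from parental SNP profiles. The averaging rule, the function name deduce_hybrid_genotype and the toy parental profiles are assumptions for illustration; the cited studies may have used a different procedure.

```python
from itertools import combinations
from math import comb

import numpy as np

# Number of distinct single crosses among n parents (reciprocals not counted).
wheat_hybrids = comb(135, 2)             # 9045
pigeonpea_hybrids = comb(396, 2)         # 78210
unphenotyped_wheat = wheat_hybrids - 1604   # 7441

def deduce_hybrid_genotype(parent_a, parent_b):
    """Expected F1 allele dosage at each SNP: the mean of the parental dosages.
    For fully homozygous parents coded 0 or 2, this gives 1 where they differ."""
    return (np.asarray(parent_a, dtype=float) + np.asarray(parent_b, dtype=float)) / 2.0

# Hypothetical inbred parents genotyped at four SNPs (0 or 2 copies of an allele).
parents = {
    "P1": [0, 2, 2, 0],
    "P2": [2, 2, 0, 0],
    "P3": [0, 0, 2, 2],
}
hybrids = {
    (a, b): deduce_hybrid_genotype(parents[a], parents[b])
    for a, b in combinations(parents, 2)
}
print(wheat_hybrids, unphenotyped_wheat, pigeonpea_hybrids)
print(hybrids[("P1", "P2")])
```

Deducing hybrid genotypes in this way means that only the parental lines need to be genotyped, however many cross combinations are to be predicted.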

Rapid generation turnover (RGT) technologies

The rate of genetic improvement in current crop breeding programs is not adequate to feed the burgeoning population worldwide. The long time required to develop a cultivar following crossing of the parents is primarily responsible for this slow progress of applied research programs and breeding (Ghosh et al. 2018). Following the breeder’s equation, the response to artificial selection on a trait per unit time is inversely related to the length of the breeding cycle. In other words, any technology that shortens the length of the breeding cycle can improve genetic gains accrued from crop breeding programs. Speed breeding (SB) is one such state-of-the-art breeding technique that improves genetic gains by accelerating generation turnover (Hickey et al. 2019). Under controlled temperature conditions, extending daily exposure to light coupled with harvesting of premature seeds has been reported to accelerate generation turnover. The SB protocols based on extended photoperiod are highly suitable for long-day/facultative and long-day/neutral plants. Among legume crops, the RGT technologies have been reported for chickpea, pigeonpea and groundnut (Varshney et al. 2019).
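
The breeder’s equation referred to above is commonly written in the following per-unit-time form. The symbols are standard quantitative-genetics notation and are introduced here for illustration, as they are not defined elsewhere in this review.

```latex
\Delta G = \frac{i \, r \, \sigma_A}{L}
```

Here ΔG is the genetic gain per unit time, i the selection intensity, r the selection accuracy, σ_A the additive genetic standard deviation and L the length of the breeding cycle. As a purely hypothetical example, if i, r and σ_A are held constant, shortening L from 8 to 4 years doubles ΔG.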

In pigeonpea, four generations of the early maturing genotypes ICPL 4, ICPL 151, ICPL 85024 and ICPL 87093 were obtained using RGT in 349, 367, 313 and 338 days, respectively (Saxena 1996; Saxena et al. 2017d). The generation time was reduced by harvesting immature seeds from a 35-day-old crop grown in a glasshouse supplied with natural light, with temperature and relative humidity maintained at 28–32 °C and 50–60%, respectively. The single pod descent method of breeding was applied for generation advancement. More recently, Saxena et al. (2019) presented a strategy to breed early-duration pigeonpea cultivars in a time span of 4 years using RGT. Notwithstanding this, the utility of SB for hastening breeding cycles in medium- and long-duration pigeonpea remains to be seen. Results of Carberry et al. (2001) established a qualitative response of pigeonpea to photoperiod, in which a certain photoperiod length is required for flowering. A 12-h day length has been established as the optimum photoperiod in pigeonpea (McPherson et al. 1985), beyond which flower onset is delayed, and the delay is most pronounced in long-duration types (Carberry et al. 2001). Below a 12-h period, temperature variation is shown to influence flowering in pigeonpea (Silim et al. 2007). For instance, rapid flowering was observed in the extra-short-duration, short-duration, medium-duration and long-duration genotypes at optimum temperatures of 24.7 °C, 23.1 °C, 22.2 °C and 18.3 °C, respectively. Such diverse responses of pigeonpea genotypes of different maturity groups call for a thorough investigation of photoperiod and temperature. For example, flowering was induced in amaranth (a short-day plant) in 14 days under short-day (8 h, 30 °C) conditions preceded by two weeks of long-day (16 h, 35 °C) conditions (Stetter et al. 2016). Besides light quantity, there is a need to explore the impact of light quality on flowering induction, as demonstrated in cool-season grain legumes by Croser et al. (2016). The study shows the importance of a low red/far-red ratio (below 3.5) in combination with a higher photon load in the far-red region (52.9 μmol m−2 s−1) for accelerating flowering in five grain legume species (pea, chickpea, lentil, faba bean and lupin). The authors also suggested combining in vitro seed germination techniques with in vivo conditions optimized to hasten flower onset.
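
The RGT results for the four early maturing genotypes can be expressed as an average generation time and as the number of generations achievable per year. The short calculation below uses only the day counts reported above; the assumption that the four generations were of equal length, and the variable names, are introduced here for illustration.

```python
# Days needed to complete four generations under RGT, as reported above.
days_for_four_generations = {
    "ICPL 4": 349,
    "ICPL 151": 367,
    "ICPL 85024": 313,
    "ICPL 87093": 338,
}

# Average days per generation, assuming four generations of equal length.
days_per_generation = {
    genotype: days / 4 for genotype, days in days_for_four_generations.items()
}

# Implied number of generations per calendar year at that pace.
generations_per_year = {
    genotype: 365 / days for genotype, days in days_per_generation.items()
}

for genotype in days_for_four_generations:
    print(f"{genotype}: {days_per_generation[genotype]:.2f} days per generation, "
          f"{generations_per_year[genotype]:.2f} generations per year")
```

On these figures, each generation took roughly 78 to 92 days, so about four generations per year were attained in all four early maturing genotypes.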

The SB protocols with single seed/pod descent are extremely useful for preserving genetic variability, and there is tremendous scope for combining SB recipes with MAS or GS (SpeedGS; Voss-Fels et al. 2018). However, optimization of an SB recipe that is genotype-independent is crucial to the broad applicability of the method.

Toward sequence-based breeding in pigeonpea

First-generation molecular breeding tools like MABC may not suffice to meet the rate of improvement that is currently expected from crop breeding. In view of this, we recently recommended a sequence-based breeding (SBB) strategy for enhancing genetic gains in breeding (Varshney et al. 2019). The SBB strategy is applicable not only to legumes but across a wide range of crop species. Unlike MAS/MABC, the SBB strategy seeks constant population improvement and reaps the benefits offered by current genomic technologies. It is proposed that founder parents/germplasm should be sequenced at considerable depth and that, based on the availability of phenotypic records for such a panel, methods like GWAS should be used to identify individuals carrying a higher number of favorable alleles. This should be accompanied by crossing among the superior genotypes to generate large segregating populations. Further selection of the genotypes from the breeding scheme should be based on the GEBVs obtained through GS models trained on data from founder parents/germplasm. The high-performing individuals thus identified can be directly released as varieties or can serve as raw material to commence the next breeding cycle. Earlier, Poland (2015) also called for greater investments in plant breeding programs in order to understand the genome dynamics under selection in breeding programs. Integrating genomic resources with practical breeding programs will greatly leverage functional genomics and help to bridge the genotype–phenotype gap (Fig. 1).

Rapid varietal replacement systems for climate adaptation

Once improved cultivars are developed through genomics-assisted breeding and RGT technologies, their efficient dissemination to small-holder and marginal farmers will be crucial for replacing obsolete varieties in farmers’ fields. This rapid varietal turnover will, in turn, contribute to climate adaptation (Atlin et al. 2017). Pigeonpea has witnessed a slow rate of varietal turnover. For example, in India a few years ago, nearly 70% of the indented breeder seed (BS) was of pigeonpea varieties that were released over 20–30 years ago (Chauhan et al. 2016). A low varietal replacement rate (VRR) is one of the key reasons for yield stagnation in pigeonpea. Though older pigeonpea varieties like NA 1, Asha and UPAS 120 still remain among the most sought-after in terms of BS indent in India, new government initiatives such as the withdrawal of seed subsidies for obsolete varieties have helped strengthen the national seed systems to ensure cultivation of new cultivars along with continuous replacement of older varieties. This rising trend is evident from the latest indent for breeder seed of pigeonpea (2019–2020), in which more than 50% of the indented seed quantity is accounted for by varieties released over the last 10 years (https://seednet.gov.in/). Accordingly, the seed replacement rate (SRR) and VRR of pigeonpea have also shown an increasing trend over the last decade (Chauhan et al. 2016). As underscored by Atlin et al. (2017), rapid adoption by farmers of crop cultivars bred in the current climate depends heavily on coordination between breeding programs and seed systems.

Conclusions

Highly nutritious and stress-tolerant crops like pigeonpea are important in view of the increasing need for sustainable food production systems in a changing climate (Khoury et al. 2015). Improvement in current breeding efficiency is needed to achieve a stable increase in pigeonpea productivity. Innovative breeding designs supported with appropriate genomic technologies will play a crucial role in the modernization of breeding programs. Hybrid pigeonpea breeding will also benefit greatly from current genomic advances that broaden our understanding of the male sterility and fertility restoration system. In a crop like pigeonpea, which shows a considerable maturity duration together with photoperiod sensitivity, reducing the length of the breeding cycle in combination with improved selection intensity holds the key to improving the genetic gains accrued from breeding programs. We opine that GS models and SB protocols will contribute significantly to this end. It will be equally important to invest in strengthening germplasm collection programs and seed delivery systems.