Introduction

Crop breeding is a continuous effort to ensure sustainable food production despite the challenges posed by a growing global population, a changing climate and emerging virulent strains of pathogens and pests. Historically, crop breeding relied on selection for visually important traits and has gone through several phases as knowledge about the genetic control of such traits evolved and as new technologies and innovations emerged. Knowledge-based breeding is the next phase, and it must deliver food at a time made especially challenging by population growth and extreme weather events. Wheat (Triticum aestivum L.) is among the three most cultivated food crops which provide calories to one-fifth of the global population (FAO 2018). Identifying the genes involved in productivity traits and accumulating them in future cultivars through improved selection methods will be central to crop breeding. Genomics and its associated technologies could play a pivotal role in achieving this goal (Bevan et al. 2017).

Wheat is both the easiest and the most difficult crop in which to develop the tools and resources required for knowledge-based breeding. For example, a huge array of wheat aneuploid stocks, which are not available for any other crop, was developed by Sears (1954). These aneuploids, developed in the genetic background of wheat cv. Chinese Spring, include possibly complete sets of nullisomics, trisomics, monosomics and tetrasomics, which were later supplemented by 41 of the 42 possible telocentrics (Sears and Sears 1978). Such extraordinary cytogenetic flexibility provided an elegance and precision in genetic studies currently attainable in few, if any, other higher organisms. This allowed precise and often predictable genetic control over diversity introduced into the species, and major factors controlling important adaptability traits were physically mapped. The subsequent series of experiments with wheat aneuploids led to the discovery of genes for several important traits including waxiness, maturity (Driscoll and Jensen 1964), endosperm proteins (Shepherd 1968), vernalization (Halloran and Boydell 1967) and many more (Law 1966). These outcomes were so encouraging that an effort to catalogue wheat genes, known as the Catalogue of Gene Symbols for Wheat, was initiated in 1968 by Prof. Robert McIntosh (McIntosh 1973). Such an early start is hard to reconcile with the slow progress of reference genome sequencing of wheat cultivar Chinese Spring. After a consortium was established in 2005, it took nearly 13 years to release a high-quality reference sequence in 2018 (Appels et al. 2018). This difficulty in sequencing the genome of one of the most important food crops, when competing food crops were sequenced quite early, was due to the very large genome size (~ 17 Gb; almost 6× larger than the human genome and 40× larger than the rice genome), polyploidy (6×) and highly repetitive genome content (~ 80%).

An important use of genomics is to understand the full spectrum of genes involved in the expression of a phenotype. Marker-assisted selection using major alleles of vernalization, photoperiod response and grain size genes has contributed substantially to the genetic improvement of crops (Würschum et al. 2015, 2017). There is convincing evidence that new genomics resources and approaches could identify the genetic basis of yield in target environments (Rasheed and Xia 2019). During the last few years, several high-throughput genotyping platforms have been established in crops as a result of technological innovations in genomics and allied disciplines (Li et al. 2018b). The use of these advances in genotyping technologies for crop breeding during the last two decades has hastened the discovery of novel alleles (Rasheed et al. 2017). Here, we review gene discovery in wheat and the technological innovations that have influenced gene discovery studies, with special emphasis on research conducted in Asian countries.

Population genomics resources with focus on Asian wheats

We have briefly described the major timelines and sequential activities in sequencing the bread wheat genome elsewhere (Rasheed and Xia 2019). The Asian countries contributed to the physical mapping and survey sequencing of chromosomes 2A (India), 6B (Japan) and 7D (China), and to the reference genome sequence of Aegilops tauschii (China) (Zhao et al. 2017). The reference genome sequence of bread wheat cv. Chinese Spring is now greatly facilitating further studies to identify the genes underpinning major breeding traits and to unravel the quantitative genetic framework of wheat adaptability to target environments. The genome-wide resequencing of 120 wheat cultivars and landraces from China revealed that the A and B subgenomes of modern Chinese cultivars were mainly derived from European landraces, while the D subgenome was mostly derived from Chinese landraces (Chen et al. 2019a, b). The genomic regions positively selected during modern wheat breeding in China included 48 high-confidence (HC) genes. The strongest signals were reported on chromosome 6B, which contains the genes TaNPF6.1-6B, a nitrate transporter gene; TaNAC24, associated with drought and heat stress adaptability; and TaRVE3, a circadian clock and flowering time regulator (Chen et al. 2019a, b). Apart from this genomic resource from China, several global efforts have successfully developed genomic resources that have implications for wheat breeding in Asia. A high-density SNP array analysis of 4506 cultivars and landraces originating from 105 countries provided high-resolution phylogeographic insight into wheat breeding in Asia (Balfourier et al. 2019). The Asian wheats clustered into two prominent groups: wheats from Southeast Asia (SEA), and wheats from Central Asia and Africa (CAA). The CAA cluster, consisting mostly of spring-type landraces, also had a prominent sub-cluster of Indian Peninsula (INP) wheats. The INP cluster consisted of spring-type landraces from Pakistan, India and Nepal. The wheat landraces from the geographic region between the Black Sea and the Caspian Sea were grouped under the Caucasus (CAU) cluster. On the other hand, the landraces from Japan and China were clustered in the SEA group. A significant proportion (71%) of the modern cultivars was derived from the East European and Mediterranean pool, while ~ 94% of the landraces were derived from the Asian pool (Balfourier et al. 2019). This indicated that breeding programs in Asia rely heavily on European germplasm, especially that from Italy.

Pont et al. (2019) sequenced a panel of 500 worldwide wheat accessions using exome capture and provided insight into the selection and adaptation over the last 10,000 years that shaped modern-day wheat. The key findings were that (1) the genetic diversity in wheat is consistent with human migration patterns out of the Fertile Crescent, including a coastal route from Egypt to the Maghreb (Northern Africa), (2) vernalization, historical groups and geographic origins (Europe, Asia, Oceania, Africa and America) were the three major driving forces partitioning genetic diversity, and (3) only chromosomal regions spanning 168 Mb and containing selective sweeps were identical between the Asian and European germplasm, which indicated that genetic improvement targets were independent in these two germplasm pools. In another study, He et al. (2019) re-sequenced 890 diverse hexaploid and tetraploid wheat accessions to identify the wild-relative introgressions favoring global wheat adaptation. They concluded that introgressions from wild emmer were more frequent in wheat accessions from Europe than in those from Eastern Asia.

Previously, a Chinese wheat landrace collection (Zhou et al. 2018b) and a mini core collection (Hao et al. 2017) were genotyped with DArTseq markers and a 9 K SNP array, respectively. Hao et al. (2017) traced several genetic blocks back to a famous century-old Strampelli cross and identified significantly higher diversity in the gene-coding regions of modern wheat cultivars. Zhou et al. (2018b) revealed that selective sweeps were distributed over 148 chromosomal regions in at least one geographic area. They postulated that wheat in China was adapted during its agricultural trajectory to increasingly mesic and warm climatic areas and spread from the northwestern Caspian Sea region to South China. Recently, selection signals were identified and compared between cultivars and landraces from Pakistan and China using a 90 K SNP array (Liu et al. 2019b). Of the 477 genome-wide selective sweeps, 109 were shared between the Pakistani and Chinese germplasm. It was further revealed that, although the two countries are neighbors, no admixture was observed between modern wheat germplasm from Pakistan and China.

The International Maize and Wheat Improvement Center (CIMMYT) has had a significant impact in developing improved wheat germplasm that is used in breeding or directly adopted for cultivation in several Asian countries. An extensive genomic analysis of 44,624 accessions from CIMMYT, combined with phenotyping of 50 traits evaluated in South Asia, Africa and the Americas, identified important loci for grain quality, disease resistance and adaptability in stressed environments (Juliana et al. 2019). Genetic diversity studies of germplasm from Afghanistan (Manickavelu et al. 2014) and Japan (Kobayashi et al. 2016) using GBS markers, and from Kazakhstan using a 90 K SNP array (Shavrukov et al. 2014), are also available. These studies not only have global breeding implications but also provide a strong genomics foundation for wheat breeding in Asia. This should help in designing strategies to efficiently manage economically important stresses such as the three rust diseases and wheat blast, and to improve resilience to climate change. All these studies provided insight into population structure, genetic diversity, linkage disequilibrium decay and the identification of chromosomal regions under artificial selection, which would provide relevant guidelines for improving wheat and facilitate the use of germplasm in breeding.

From QTL to gene discovery: strategies and inventory

Quantitative trait loci (QTL) mapping and genome-wide association studies (GWAS) from Asia have dominated wheat genomics research. These studies are routine activities for discovering the genetic basis of important phenotypes; they reveal the underlying allelic variation, marker-trait associations and the frequency of favorable alleles in the target germplasm, and they provide a foundation for wheat functional genomics (Rasheed and Xia 2019). Almost all economically important traits have been dissected using GWAS; however, the well-known limitations of GWAS still apply. These limitations include the confounding of population structure with low-frequency causal alleles, which leads to false-negative results, and other unaccounted factors including low-accuracy genotype calls at some loci (Browning and Yu 2009) and small population size (Finno et al. 2014). Therefore, further validation is necessary, although it is not usually practiced in GWAS experiments. The validation could be cross-population, where candidate loci are validated either in bi-parental populations or in independent germplasm collections, or it could be biological, for example through genetic transformation, gene silencing, gene knockout or gene editing (Liu and Yan 2019). In a GWAS experiment, TaRPP13L1 was associated with flour color in Chinese wheat cultivars, and the effect of the gene was validated in a doubled haploid (DH) mapping population and in two tetraploid wheat mutants carrying premature stop codons in the TaRPP13L1 gene (Chen et al. 2019b). Similarly, several marker-trait associations from GWAS experiments were validated in bi-parental populations for black point resistance (Liu et al. 2016, 2017), flour color (Zhai et al. 2016a, b), kernel number per spike (Shi et al. 2017) and thousand grain weight (Sun et al. 2017).

Another approach now widely used for gene discovery is bulked segregant analysis (BSA) or bulked segregant RNA-seq (BSR-seq), the latter being a combination of RNA sequencing (RNA-seq) and BSA (Zou et al. 2016). BSA, BSR-seq and their variants are efficient methods for rapidly mapping QTL or genes. The process involves pooling DNA samples from individuals of a population showing contrasting phenotypes; the pools are subsequently sequenced, which provides a simple and rapid way to identify regions that are consistently polymorphic between the contrasting pools. This strategy not only enables the discovery of SNPs linked to the target genes using NGS but also reveals genes that are differentially expressed between pools. This approach recently identified four candidate genes within a QTL interval, Qsm.hebau-4A, providing resistance against orange wheat blossom midge (OWBM), which is a serious threat to wheat production in China, Canada, Russia, the US and Japan (Hao et al. 2019). The discovery of candidate genes for OWBM resistance in Chinese wheat cv. Jimai 24 and the development of six diagnostic markers could be useful in breeding for midge resistance. The stripe rust resistance in Zhoumai 22, designated YrZH22, has been confined to a narrow genetic interval spanning a 4 Mb physical genomic region using the BSR-seq approach (Wang et al. 2017a). Several new genes have been discovered using BSR-seq, such as the powdery mildew resistance genes Pm4b (Wu et al. 2018a, b) and Pm61 (Hu et al. 2019), the leaf senescence gene els1 (Li et al. 2018a), the stripe rust resistance genes Yr26 (Wu et al. 2018a), YrMM58 and YrHY1 (Wang et al. 2018b), the dwarfing gene Rht12 (Sun et al. 2019) and the nitrogen-dependent lesion mimic gene Ndhrl1 (Li et al. 2016c). The diversity of genes underpinning important traits that have been discovered using the bulked sample approach indicates its success in the initial mapping of genes with high certainty. It is expected that BSA or BSR-seq approaches will become increasingly important in crop genetics and breeding and have the potential to replace the analysis of entire populations in several cases (Zou et al. 2016).
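The pooling logic described above can be illustrated with a minimal sketch. It assumes that allele read counts per site are already available for two bulks and compares the alternative-allele frequency between them, a statistic often called the delta SNP-index. This is a generic illustration and not the pipeline used in the studies cited above; the function names, the depth filter, the threshold and the toy read counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PoolCounts:
    ref: int  # reads supporting the reference allele in one bulk
    alt: int  # reads supporting the alternative allele in one bulk


def snp_index(counts: PoolCounts) -> float:
    """Fraction of reads carrying the alternative allele in one bulk."""
    depth = counts.ref + counts.alt
    if depth == 0:
        raise ValueError("site has no coverage")
    return counts.alt / depth


def delta_snp_index(bulk_r, bulk_s, min_depth=10):
    """Return {position: index(bulk_r) - index(bulk_s)} for sites covered in both bulks.

    Sites with fewer than min_depth reads in either bulk are skipped
    (min_depth is an illustrative value, not a recommended setting).
    """
    delta = {}
    for pos in sorted(bulk_r.keys() & bulk_s.keys()):
        r, s = bulk_r[pos], bulk_s[pos]
        if r.ref + r.alt < min_depth or s.ref + s.alt < min_depth:
            continue
        delta[pos] = snp_index(r) - snp_index(s)
    return delta


def candidate_sites(delta, threshold=0.6):
    """Positions whose allele frequency differs strongly between the bulks."""
    return [pos for pos, d in delta.items() if abs(d) >= threshold]


# Toy read counts (hypothetical): position -> counts in each bulk.
resistant_bulk = {100: PoolCounts(2, 38), 200: PoolCounts(20, 20), 300: PoolCounts(3, 2)}
susceptible_bulk = {100: PoolCounts(36, 4), 200: PoolCounts(19, 21), 300: PoolCounts(10, 10)}

delta = delta_snp_index(resistant_bulk, susceptible_bulk)
linked = candidate_sites(delta)
```

In this toy input only position 100 shows a large frequency difference between the bulks, position 200 segregates equally in both, and position 300 is dropped for low coverage. In practice the statistic is usually smoothed over sliding windows and compared against a null distribution rather than a fixed threshold.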

Although a huge array of genes has been cloned in wheat, very few were cloned by map-based approaches. Map-based cloning in bread wheat is still difficult due to the complexity of the wheat genome (Rasheed and Xia 2019). Alternatively, comparative genomics provides new opportunities for the identification of genes because there is high collinearity in genetic organization among grass genomes. Using homology-based cloning approaches, a series of genes underpinning grain weight, grain yield and flowering time have been isolated in wheat. This area of wheat genomics has dominated research in China. A good example of the resources available for such work is the cloning of more than 40 genes for grain morphology in rice using map-based cloning of quantitative trait loci (QTLs) and screening of T-DNA tagging libraries. The biological processes or functions assigned to these genes relate to the regulation of cell proliferation and cell elongation (Li et al. 2018b; Li and Yang 2017). To date, 33 grain size and weight-related genes have been isolated by homology-based cloning and used for developing diagnostic markers (Table 1). Several genes, including TaSus2-2A, TaTAR2.1-3A, TaCWI-4A, 6-SFT-A2, TaSus1-7A and TaTKW-7A, were closely (< 10 Mb) linked with yield-related loci identified in linkage and association mapping. Apart from the yield-related genes, other genes were related to flowering time, TaPRR73 (Zhang et al. 2016) and TaZIM-A1 (Liu et al. 2019a); agronomic traits, TaPPH-7A (Wang et al. 2019a); root growth and plant height, TaARF4 (Wang et al. 2018a); and drought tolerance, TaSnRK2.9-5A (Ur Rehman et al. 2019).

Table 1 A brief account of bread wheat genes cloned/characterized in Asia

Functional genomics and map-based cloning in wheat

The availability of new genomics resources such as a high-quality reference genome sequence, high-quality gene models, transcriptome databases and high-density SNP arrays, together with advancements in NGS, is facilitating fine mapping and making map-based cloning more feasible in wheat. Yao et al. (2019) identified a grain yield QTL underpinning head length and spikelet number which was fine-mapped to a 0.2 cM interval, and the candidate gene was designated Head Length 2 (HL2). Similarly, a gene for late heading was fine-mapped in an EMS mutant line and was designated TaHdm605 (Zhang et al. 2018). A large-scale transcriptome analysis of 90 wheat lines identified variations associated with spike architecture (Wang et al. 2017b). The authors identified three genes, TaTFL1-2D, TaHOX2-2B and TaAGLG1-5A, as regulators of spike architecture and validated them biologically through wheat transgenic assays. In another study, genes responding to abiotic stresses were selected from a yeast cDNA library constructed from a heat- and drought-tolerant wheat cv. Hanxuan 10 (Zhang et al. 2010). A gene within the pathogenesis-related (PR) protein family, TaPR-1-1, was selected and conferred tolerance to abiotic stresses when expressed as a transgene in Arabidopsis and yeast. Screening of a yeast cDNA library following abiotic stresses was found to be an efficient functional genomics tool for identifying stress-tolerance genes.

The map-based cloning approach was used for Male Sterile 2 (Ms2), a gene inducing male sterility in wheat (Ni et al. 2017; Xia et al. 2017). The biological validation of Ms2 using wheat transformation and induced mutations identified a terminal-repeat retrotransposon in miniature (TRIM) element in the promoter of Ms2. This TRIM element was responsible for the gene activation and consequent male sterility. The cloning of Ms2 provided substantial potential to exploit male sterility in wheat breeding for hybrid seed production. Similarly, a gene related to leaf angle, TaSPL8, was cloned using a map-based cloning approach in wheat. TaSPL8 is responsible for the activation of genes associated with the auxin and brassinosteroid pathways and with cell elongation, and knockout mutants of this gene had erect leaves due to the loss of the lamina joint, a compact architecture and increased spike number (Liu et al. 2019c). Previously, a powdery mildew resistance gene, Pm21, from Haynaldia villosa was transferred to bread wheat in the form of a 6VS.6AL translocation, and a serine/threonine kinase gene was putatively cloned as the functional gene for Pm21 (Cao et al. 2011). The Pm21 gene is important for wheat production in China, and Pm21-containing cultivars are currently cultivated on ~ 4 million ha. Two complementary studies have recently used a map-based cloning approach and identified that Pm21 encodes a typical CC–NBS–LRR protein which confers broad-spectrum resistance to wheat powdery mildew disease (He et al. 2018a; Xing et al. 2018). These studies were supported by biological validation using multiple means.

Fusarium head blight (FHB) is a fungal disease of cereals which has become a major disease in most wheat-producing areas. Despite intensive searches, no completely resistant germplasm has been identified, although some germplasm from China shows strong resistance, including landraces and Sumai 3, a Chinese wheat cultivar known to carry the Fhb1 gene. Sumai 3 is recognized as the best source of FHB resistance and has been widely used as a parent in many breeding programs. Rawat et al. (2016) reported that a pore-forming toxin-like (PFT) gene was the candidate for Fhb1 among 13 putative genes using positional cloning, mutation analysis, gene silencing and transgenic overexpression strategies, but PFT was subsequently found to be present in many FHB-susceptible cultivars, casting doubt on the role of the gene (He et al. 2018b; Zhu et al. 2018). In two recent studies, a histidine-rich calcium-binding (TaHRC or His) gene adjacent to PFT from Ning 7840 and Wangshuibai, respectively, was identified as a better Fhb1 candidate. Su et al. (2019) proposed that the Fhb1 product is a susceptibility factor and that FHB resistance results from loss of function. However, Li et al. (2019) concluded that Fhb1 is a gain-of-function gene and that the newly generated protein acts as a regulator of host immunity. Despite these discrepancies, all three studies agreed that the PFT gene was expressed at a low level after anthesis and at an even lower level after FHB inoculation.

Functional marker development and application

The term functional markers is used for diagnostic PCR-based markers designed from causal polymorphisms within functional genes. Ultimately, the allelic variations of a specific functional gene are identified using functional markers (Liu et al. 2012). We have recently reviewed all aspects of functional marker development and application (Rasheed and Xia 2019); therefore, this topic is discussed only briefly here. Functional markers are the markers of choice for practical breeding and can be used to pyramid genes or to introgress a specific gene. The use of functional markers in genomic selection could also improve selection accuracy. The availability of 72 functional markers in kompetitive allele-specific PCR (KASP) format enabled the high-throughput genotyping of large germplasm collections (Rasheed et al. 2016). Furthermore, this number has recently been increased to 157 KASP markers for various important phenotypic traits (Rasheed et al. unpublished data). Khalid et al. (2019) used KASP functional markers to genotype a diversity panel of synthetic-derived wheats characterized in optimal and water-limited conditions in Pakistan. They concluded that several functional markers of developmental genes like Vrn-A1, Rht-D1 and Ppd-B1 were associated with agronomic traits in both moisture conditions. The KASP assays of genes related to grain morphology were validated on CIMMYT elite lines, and the genes TaGS3-D1, TaTGW6 and TaSus1 showed significant effects on grain weight in historical CIMMYT germplasm (Sehgal et al. 2019). Zhao et al. (2019b) genotyped 1152 diverse global wheat accessions with KASP functional markers representing 47 genes controlling grain yield, quality, adaptation and stress tolerance. It was concluded that 22 alleles with positive effects and 14 alleles with negative effects had been selected in the global germplasm collection. This work comprehensively reported human-mediated selection on favorable alleles of important genes in germplasm from Asia, Europe, North America and CIMMYT.

Another important aspect of the application of KASP functional markers is their higher cost, because the KASP mastermix was available only from LGC™, a commercial proprietary supplier. This led several groups to develop open-source SNP genotyping methods. Recent examples are the semi-thermal asymmetric reverse PCR (STARP) (Long et al. 2016) and Amplifluor (Jatayev et al. 2017) methods, which can be used with any commercial mastermix. This would have a significant impact on the cost of SNP genotyping. Currently, our group at the Chinese Academy of Agricultural Sciences (CAAS) is converting and optimizing all the KASP assays to STARP assays to significantly reduce the cost and promote the use of high-throughput genotyping in wheat marker-assisted selection.

Gene discovery and marker application to wheat breeding: Japanese perspective

Common wheat should be harvested before the rainy season in Japan to avoid severe damage from pre-harvest sprouting and Fusarium head blight disease. Recent climate change has brought rainy weather at the wheat harvest season even to the Hokkaido area. In addition, Japan extends a long distance from north to south, and thus wheat cultivation must be adapted to the various environmental conditions of each area. Due to these climate characteristics peculiar to Japan, early heading and pre-harvest sprouting tolerance are two of the most important targets in Japanese wheat breeding.

The use of photoperiod-insensitive alleles has contributed to the breeding of early-flowering cultivars. Ppd-D1a, which is widely distributed in European and Japanese wheat cultivars, was originally derived from a Japanese landrace ‘Akakomugi’ (Worland 1996). Recently, a causal gene for photoperiod insensitivity of a Japanese wheat variety ‘Chogokuwase’ was identified as wheat PHYTOCLOCK1 (WPCL1) on 3AL (Mizuno et al. 2016). A 142-bp deletion in the MYB domain induces the extra-early flowering phenotype of ‘Chogokuwase’. WPCL1 is a homoeologue of the candidate gene for an early flowering mutant of einkorn wheat (Mizuno et al. 2012), and the Arabidopsis homologue LUX/PCL1 is a component of the “evening complex” that represses PRR9 expression. Control of heading time via Vrn-1-Ppd-1 allelic combinations could be important for further breeding of early heading wheat cultivars without frost injury in Japan.

Pre-harvest sprouting is caused by germination of wheat grains on the mother plant before harvest and greatly decreases the quality of the end products. Two important QTL for pre-harvest sprouting tolerance, QPhs.ocs-3A.1 on 3A and Phs1 on 4A, were recently cloned; one is MOTHER OF FT AND TFL1 (MFT) on chromosome 3A (Nakamura et al. 2011) and the other is TaMKK3-A on 4AL (Torada et al. 2016). QPhs.ocs-3A.1 and Phs1 have stable and large effects on seed dormancy (Mori et al. 2005; Torada et al. 2008). One of the Japanese wheat varieties, ‘Zenkojikomugi’, provides the highest dormancy, and ‘Zenkojikomugi’-derived populations have been used for breeding cultivars with strong grain dormancy (Chono et al. 2015). The frequency of the ‘Zenkojikomugi’-type SNP in MFT, which is located in the promoter region and increases the transcription level (Nakamura et al. 2011), is very high in the southern part of Japan but low in the northern part (Chono et al. 2015). ‘Zenkojikomugi’ carries seed dormancy alleles at two QTLs on chromosomes 1B and 4A in addition to the 3A QTL (Cao et al. 2016). Phs1 was identified as a QTL for seed dormancy in a mapping population derived from a dormant Canadian cultivar and a nondormant Japanese cultivar (Torada et al. 2008), and a single nonsynonymous amino acid substitution (N260T) in the kinase domain of TaMKK3-A was associated with the difference in dormancy (Torada et al. 2016). If the N260T substitution can be introduced into each of the TaMKK3 homoeologues, the development of hyper-dormant cultivars could be expected in future wheat breeding (Nakamura 2018). Qsd1, a major seed dormancy QTL related to barley domestication, encodes an alanine aminotransferase family member (Sato et al. 2016). Common wheat contains three homoeologues of the barley Qsd1, and the B-genome copy of Qsd1 in a semi-dormant winter wheat cultivar ‘Kitahonami’ carries three amino acid substitutions compared with those of nondormant spring cultivars (Onishi et al. 2017).

Other genes and QTLs can also be expected to improve pre-harvest sprouting tolerance in wheat. TaABA8′OH1, encoding an ABA 8′-hydroxylase, is highly expressed during seed development (Chono et al. 2013). A Japanese wheat cultivar ‘Tamaizumi’ contains an insertion mutation in TaABA8′OH1-D, and a gamma-ray irradiated mutant line TM1833 derived from ‘Tamaizumi’ lacks TaABA8′OH1-A. TM1833, a double mutant of TaABA8′OH1-A and TaABA8′OH1-D, exhibits higher ABA contents in seed embryos and lower germination than ‘Tamaizumi’, indicating that a reduction in ABA 8′-hydroxylase activity in developing grains could effectively improve pre-harvest sprouting tolerance in wheat (Chono et al. 2013). Several QTLs for ABA sensitivity in germinated seeds have been found in common wheat. Significantly higher responsiveness is observed in Japanese wheat cultivars, including some cultivars producing dormant seeds such as ‘Zenkojikomugi’ and ‘Fujimikomugi’ (Yokota et al. 2016). The QTLs for ABA responsiveness at the early seedling stage are assigned to chromosomes 1B, 2A, 3A, 5A, 6D and 7B and could contribute to the development of dehydration and pre-harvest sprouting tolerance (Kobayashi et al. 2010; Iehisa et al. 2014a). In further studies, near-isogenic lines for each ABA responsiveness-related QTL should be developed to validate the relationship between ABA responsiveness at the early seedling stage and pre-harvest sprouting tolerance.

Starch characteristics are important to end-product grain quality in common wheat. Two starch synthesis genes, encoding granule-bound starch synthase I (GBSSI) and starch synthase IIa (SSIIa), play important roles in determining the starch properties in developing wheat grains. Each of the two genes contains three homoeologues in the hexaploid genome, and effects of each copy on the starch properties are different (Shimbata et al. 2012; Inokuma et al. 2016). Combinations of the GBSSI and SSIIa alleles show diverse properties of the starch granules, and the double null mutant line has high levels of maltose and sucrose in the shrunken seeds (Nakamura et al. 2006; Inokuma et al. 2016). A soft winter wheat cultivar ‘Kitahonami’ shows a superior flour yield compared with other Japanese soft wheat cultivars, and QTLs on chromosomes 3B and 7A exhibit highly significant and consistent effects on the flour yield (Ishikawa et al. 2014). A newly developed hard wheat cultivar ‘Yumechikara’ has characteristics of extra-strong wheat (Tabiki et al. 2011). The dough properties and bread-making-quality-related characters have been further improved using different Glu-B3 alleles (Ito et al. 2015). Marker-assisted breeding makes it efficient to improve the end-product grain quality in common wheat.

New marker system to harness D-genome diversity

A diversity panel of wheat varieties from Japan is a useful resource for evaluating their genetic diversity and population structure. The genotyping-by-sequencing (GBS) approach clearly classified the Japanese wheat varieties into four clusters; (1) varieties from Hokkaido area, (2) modern cultivars from the northeast part of Japan, (3) modern cultivars from the southeast part of Japan and (4) classical varieties and landraces (Kobayashi et al. 2016). The SNP density on the D-genome chromosomes was remarkably lower than those on the A- or B-genome chromosomes. The low polymorphism on the D genome was found in RILs between a European wheat cultivar ‘Mironovskaya 808’ and ‘Chinese Spring’ using array-based marker and GBS approaches (Iehisa et al. 2014b; Kobayashi et al. 2016). To overcome the scarcity of the D-genome markers, a new marker system has been developed using genotyping by multiplexing amplicon sequencing and genome-tagged amplification (Ishikawa et al. 2018). The newly developed markers using genome-specific amplicon sequencing are uniformly distributed across the D-genome chromosomes.

The RNA-seq approach is cost-effective for genome-wide SNP discovery in wheat relatives. The members of the Triticeae (synonym Hordeae) are considered a single genetic system sharing homeologous chromosome structure and collinearity (Mayer et al. 2011; Wicker et al. 2011). The highly conserved synteny among homeologous chromosomes of the Triticeae species facilitates the genetic analysis of wild relatives for which genome sequences are not available. For example, through RNA-seq assembly of leaf transcripts of Aegilops tauschii, a large number of genome-wide polymorphisms were found among accessions, and the SNPs and indels were successfully anchored to the chromosomes of Ae. tauschii and the wheat D genome as well as those of barley (Iehisa et al. 2012; Nishijima et al. 2016). A similar RNA-seq approach was applicable to other diploid wheat relatives such as einkorn wheat, Ae. umbellulata and section Sitopsis species (Okada et al. 2018a; Miki et al. 2019; Michikawa et al. 2019). The genome-wide polymorphisms identified by this strategy could be converted to PCR-based markers, and the converted markers are useful for constructing linkage and high-density maps (Iehisa et al. 2014c; Nishijima et al. 2016; Michikawa et al. 2019). BSA combined with RNA-seq allowed the efficient development of molecular markers linked to a chromosomal region associated with the target phenotype in tetraploid and hexaploid wheats (Trick et al. 2012; Ramirez-Gonzalez et al. 2015; Wu et al. 2018a, b; Nishijima et al. 2018). In polyploid wheat species, RNA-seq-based BSA requires precise assignment of reads from homoeologues to each subgenome, and thus adequate reference sequences are important for successful assignment.
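Why subgenome assignment depends on adequate reference sequences can be shown with a toy sketch. It assumes that a read has already been aligned to the same window of the three homoeologous copies of a gene, and it assigns the read to the copy with the fewest mismatches, leaving it unassigned when two copies tie. Real analyses rely on read aligners and mapping-quality filters; the sequences, function names and mismatch limit below are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_subgenome(read, homoeologues, max_mismatches=1):
    """Return the subgenome label with a unique best match, or None.

    A read is left unassigned (None) when the best match is shared by
    several homoeologues or exceeds max_mismatches.
    """
    distances = {label: hamming(read, seq) for label, seq in homoeologues.items()}
    best = min(distances.values())
    winners = [label for label, d in distances.items() if d == best]
    if best > max_mismatches or len(winners) != 1:
        return None
    return winners[0]


# Hypothetical 10-bp window of the A, B and D copies of one gene.
# Each copy differs from the others at a homoeologue-specific site.
homoeologues = {
    "A": "ACGTACGTAC",
    "B": "ACGTTCGTAC",
    "D": "ACGAACGTAC",
}

reads = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC", "ACGTCCGTAC"]
calls = [assign_subgenome(r, homoeologues) for r in reads]
```

The first three reads match one copy exactly and are assigned to A, B and D, respectively. The fourth read is one mismatch away from both the A and the B copy, so it stays unassigned, which is the situation that becomes more frequent when the reference lacks subgenome-specific sequence.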

Development of synthetic wheat lines and use of wild relatives in breeding in Japan

The evolutionary process of polyploid wheat (bread and durum wheats) has been largely shaped by allopolyploid speciation. For a new allopolyploid species to arise, the interspecific hybrids must grow normally and be fertile. Interspecific hybridization between tetraploid wheat and Ae. tauschii results in normal triploid hybrids with the ABD genome, and unreduced gametes must be formed in the pollen and egg cells of the ABD hybrids. This evolutionary process can be repeated by artificial hybridization of the parental species, and the resulting allohexaploid plants are called synthetic hexaploid wheats (Matsuoka and Nasuda 2004).

Triploid hybrids between tetraploid wheat and Ae. tauschii sometimes show several types of abnormalities such as failure in germination, hybrid necrosis and hybrid sterility (Matsuoka et al. 2007). An array of synthetic hexaploid wheats developed between durum wheat cv. Langdon and Ae. tauschii accessions has been classified into the following four types based on abnormal growth phenotypes: type-II and type-III hybrid necrosis, hybrid chlorosis and severe growth abortion (Mizuno et al. 2010). These growth abnormalities in the triploid hybrids could act as postzygotic hybridization barriers that prevent the development of synthetic wheat. Based on cytological and transcriptome analyses, an autoimmune response-like reaction appears to be associated with necrotic cell death in the two types of hybrid necrosis and in hybrid chlorosis (Mizuno et al. 2010, 2011; Nakano et al. 2015). Causal genes for type-II and type-III necrosis in the D genome were assigned to the short arms of chromosomes 2D and 7D, respectively (Mizuno et al. 2010; Sakaguchi et al. 2016). The D-genome causative gene of hybrid chlorosis is located on 7DS (Hirao et al. 2015). However, the chromosomal locations of these hybrid necrosis and chlorosis genes in the AB genome remain unknown. A similar autoimmune response-like reaction may also underlie the hybrid necrosis observed in intraspecific crosses of common wheat and interspecific crosses of einkorn wheat (Takamatsu et al. 2015).

In the Japanese seedbank, the National BioResource Project KOMUGI (https://shigen.nig.ac.jp/wheat/komugi/top/top.jsp), more than 12,000 accessions of wheat and its relatives are stored. This collection contains more than 4000 accessions of wild wheat relatives and numerous cytogenetic experimental lines. Despite the presence of various hybrid incompatibilities, synthetic hexaploid wheat lines could be produced through interspecific crosses between ‘Langdon’ and Ae. tauschii accessions (Takumi et al. 2009; Kajimura et al. 2011). The wide variation in heading time among the parental Ae. tauschii accessions is maintained in the hexaploid background of the synthetic lines, although the ‘Langdon’ genome may enhance the effect of D-genome-derived variation in the derived synthetics. A QTL for flowering time was commonly found on chromosome 7D in both Ae. tauschii and synthetic hexaploid wheat (Nguyen et al. 2013; Koyama et al. 2018). Grain shape is remarkably differentiated between the two subspecies of Ae. tauschii, and a few QTLs mainly control this differentiation (Nishijima et al. 2017). The D-genome QTL underpinning grain morphology could be at least partly functional in hexaploid wheat (Okamoto et al. 2013). Grains of the synthetic hexaploid wheat lines with the AABBDD genome generally exhibit a soft texture, whereas the U-genome addition to durum wheat generates hard-textured grains in synthetic hexaploid lines with the AABBUU genome (Okada et al. 2018b). The diversity in grain morphology traits of Ae. umbellulata will be useful for broadening grain hardness diversity in hard-textured common wheat. Wild wheat accessions with useful genes and characters can therefore be evaluated in the allohexaploid background. Phenotypes of the synthetic hexaploid wheat are, however, distinct from those of modern cultivars of common wheat. To fill this gap, a multiple synthetic derivatives (MSD) population was recently developed (Gorafi et al. 2018). The MSD population harbors chromosomal fragments from Ae. tauschii in the background of a common wheat cultivar, which makes it an important resource for finding useful Ae. tauschii loci that function in the wheat background (Elbashir et al. 2017; Gorafi et al. 2018). Genetic and epigenetic changes in the synthetic lines and MSDs should be elucidated in detail in further studies to allow efficient use of the wild wheat relatives in breeding.

High-throughput phenotyping to support functional genomics

Adaptability of crops to fluctuating environments is a loosely defined concept; it is mainly influenced by the genetics underlying physiological and morphological attributes, which change continuously during the growth cycle. The future course of action in accelerating genetic gain will depend on repeatable high-throughput phenotyping to obtain precise phenotypic information from a broad range of environments (Araus and Cairns 2014). Intensive phenome data from multiple sites are important for a deep understanding of the genetic basis of important crop traits. Time-series phenotyping of large plant populations at multiple locations for important climate-smart traits could maximize selection accuracy (Rutkoski et al. 2016). It is difficult, even impractical, to evaluate genotypes multiple times using present phenotyping approaches and facilities, especially in Asia. There have been significant developments in the use of high-throughput phenotyping (HTPP) facilities outside of Asia, such as the phenotyping facility at the Maricopa Agricultural Center (USA), the Australian Phenotyping Facility in Canberra and integration networks between phenotyping communities for sharing facilities and data, such as the Australian, French, Italian, German and UK phenotyping networks under the umbrella of the International Phenotyping Network. China has also recently established the China Phenotyping Network, which is the only phenotyping-assistance body in Asia. The majority of Asia is lagging behind in developing and adopting modern phenotyping facilities.

In recent years, many computer vision-based tools and sensors have become vital for phenotyping various crop traits with high accuracy and throughput (Fiorani and Schurr 2013). Building on the concept of imaging-based quantification, several developments have been made for HTP phenotyping of crops under controlled and field conditions (Blais 2004; Zhao et al. 2019a). These advanced phenotyping methods cover a range of measurements, from the single-organ level (leaves and roots) to the whole plant using 2D/3D shoot imaging approaches in greenhouse facilities, and even whole-plant or canopy surveillance of large field trials (Schmundt et al. 1998; Walter and Schurr 2005). These measurements are mainly in the form of RGB pixels, multi-/hyper-spectral bands, wavelength, time and direction. After converting these measurements into useful biological knowledge (crop traits) using computer vision technology, thereby closing the phenome-to-genome knowledge gap, scientists can obtain dynamic information about disease, photosynthesis, chlorophyll status, transpiration rate and physicochemical properties of crops through spectral vegetation indices for stress detection and genetic studies (Chaerle et al. 2006; Genty and Meyer 1995; Jahnke et al. 2009) (Fig. 1). Recently, noninvasive technologies have been used for high-throughput data generation for many crops in greenhouse facilities with environmental simulation and under field conditions (Tardieu et al. 2017). Modern greenhouse phenotyping installations include magnetic resonance imaging, positron emission tomography and computed tomography, or a combination of these. In contrast, field-based phenotyping platforms are diverse and span many levels. They include all kinds of ground-based platforms, including fixed field phenotyping facilities such as the cable-suspended multisensor system (Kirchgessner et al. 2017) and the Field Scanalyzer (Virlet et al. 2017), and specially modified cart-based pheno-mobiles and pheno-carts as well as linearly moving pheno-poles, capable of carrying multiple sensors across the field (Deery et al. 2014; White and Conley 2013) (Fig. 1). The increase in image resolution and the reduction in sensor size and cost, together with the usefulness of unmanned aerial vehicles (UAVs), have also allowed researchers to monitor the performance of genotypes in breeding trials and for crop management. UAV platforms, which are flexible alternatives including multicopter, helicopter and fixed-wing configurations, are attracting attention due to their low cost and their ability to carry a range of payloads (Gago et al. 2015). In addition, nanosatellites and microsatellites with high-resolution hyper-spectral and RGB cameras are becoming an attractive alternative to low-altitude UAVs because they can cover multiple sites at the same time (Yang et al. 2017). Use of these modern phenotyping approaches will provide a deep network of data to help close the phenome-to-genome knowledge gap. Advances in plant phenotyping can also benefit from other disciplines such as remote sensing, robotics, computer vision and artificial intelligence.
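
As one concrete illustration of the spectral vegetation indices mentioned above, the normalized difference vegetation index (NDVI) is computed per pixel as (NIR - Red) / (NIR + Red) and then averaged over the pixels belonging to a field plot. The following is a minimal sketch in Python with NumPy, not the pipeline of any cited platform; the function names, the small epsilon guard and the 2 x 2 reflectance values are hypothetical and chosen only for illustration.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red).

    eps is a small constant (an assumption of this sketch) that avoids
    division by zero for pixels with no signal in either band.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)

def plot_mean_ndvi(nir, red, plot_mask):
    """Mean NDVI over the pixels of one field plot (boolean mask)."""
    values = ndvi(nir, red)
    return float(values[plot_mask].mean())

# Hypothetical 2 x 2 reflectance tiles for the near-infrared and red bands.
nir_band = np.array([[0.60, 0.55], [0.50, 0.20]])
red_band = np.array([[0.10, 0.15], [0.10, 0.18]])
# Hypothetical plot mask: the bottom-right pixel is soil and is excluded.
plot_mask = np.array([[True, True], [True, False]])

plot_ndvi = plot_mean_ndvi(nir_band, red_band, plot_mask)
print(f"plot-level NDVI: {plot_ndvi:.3f}")
```

In practice the mask would come from plot boundaries or a vegetation segmentation step, and one such plot-level value per genotype and time point is the kind of trait that can enter a genetic analysis.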

Fig. 1

High-throughput phenotyping network to complement deep genetic analysis. This is an example of big data generation on multiple phenotypes, from a single organ to the whole plant/canopy level, with high repeatability and cost efficiency, using modern high-throughput facilities both in greenhouse-based laboratories and under field conditions. Recent developments in phenotyping approaches, such as observing plants through 2D/3D images using simple RGB cameras and fluorescence or hyper-/multispectral sensors on fixed or portable ground and aerial platforms, have increased the efficiency of data collection. Computer vision-based approaches, in turn, help convert the raw information embedded in RGB and hyper-/multispectral images into useful biological knowledge and have increased phenome knowledge. Rapid and comprehensive phenome information on important traits such as seed morphology, seed physiology, spike number, plant height, biomass, chlorophyll content, photosynthesis rate, senescence rate, normalized difference vegetation index (NDVI), canopy temperature and disease scoring can increase the efficiency of genetic analysis for crop improvement.

In Asia, only China is rapidly developing central and institution-based HTP phenotyping facilities, such as NERCITA in Beijing (National Engineering Research Center for Information Technology in Agriculture) and the Plant Phenomics Research Center (PPRC) at Nanjing Agricultural University. Several initiatives have established strong international cooperation; the most important ones are with European countries, such as the China–UK, China–France and China–Germany Phenomics Centers at Nanjing Agricultural University for the development of joint phenotyping facilities. NERCITA has developed greenhouse and field-based remote sensing facilities using robotics to quantify important attributes of cereal and fruit crops in the context of precision agriculture (Wen et al. 2017; Yongjun et al. 2017; Yue et al. 2017; Zhang et al. 2015a, b; Zhou et al. 2018a). PPRC, in turn, is developing several phenotyping platforms ranging from growth chamber to field and is providing central phenotyping facilities for crop researchers and breeders. PPRC aims to develop and provide access to facilities that address multiscale plant phenotyping in different agro-climatic scenarios in China. In the context of practical use of modern HTPP, CAAS (Beijing) is generating aerial and pheno-cart-based ground HTP phenome data for genomic prediction modeling and for precise selection in its breeding programs. Recently, a genomic selection paper has been published using UAV-based plant height data (Hassan et al. 2019). There are also other reports from China on the use of HTP phenotyping platforms for the assessment of important traits in different crops and their application in quantitative genomic analysis and within-field selection (Guo et al. 2015; Hassan et al. 2018a, b; Wang et al. 2019b). In Japan, there have also been some developments in phenotyping technology through sensors and robotics, but their application in crop breeding is very limited. Recently, a report on the assessment of sorghum plant height using a UAV-based platform and its application in genomic modeling was published by the Institute for Sustainable Agro-ecosystem Services, The University of Tokyo (Watanabe et al. 2017).

Asia needs more projects and initiatives, which could be coordinated centrally by establishing an Asian Phenotyping Network. In this context, China could help to initiate cooperation with other Asian countries to introduce high-throughput technologies, to change the landscape of phenotyping infrastructure, to raise awareness of the importance of precision in phenome data for future research and to integrate the scientific community in Asia. This will create opportunities to coordinate data generation, analysis and integration across platforms. Finally, this will allow us to bridge the phenotype-to-genotype gap in order to characterize and improve a wide range of crops for food security in Asia.

Analytical approaches to utilize benefits of large-scale genomics

Advances in computer simulation and analytical genetics are crucial for improving selection accuracy for complex phenotypic traits and can assist wheat breeders in their important breeding decisions. Computer simulation is a product of integrated disciplines such as computer science, quantitative genetics, statistics, mathematics and molecular biology (Li et al. 2012). Computer simulation is a tool that allows breeders to integrate theoretical, empirical and practical knowledge into a dynamic in silico model (i.e., a genetic model). The genetic model contains all necessary inputs (e.g., trait, environment, linkage map, marker information, gene or QTL location, and QTL, pleiotropic and epistatic effects). This genetic model can be used to simulate the long-term effects of breeding decisions in the early phase of breeding programs (Wang et al. 2005; Wang et al. 2003). Information such as the choice of parents, crossing and breeding strategy, experimental population size, number of breeding cycles, number of replications and sampling method is required to meet a specific breeding objective (Wang et al. 2005). In the last two decades, computer simulation and modeling approaches have brought a new paradigm to wheat, rice and maize breeding. For example, through computer simulation, Wang et al. (2003) compared two practical CIMMYT wheat breeding strategies, modified pedigree/bulk selection (MODPED) and selected bulk selection (SELBLK), using the QU-GENE software. The genetic model accounted for pleiotropy, epistasis and G × E. The computer simulation revealed that the SELBLK method resulted in 3.9% higher genetic gain than MODPED. The simulation study also showed that the SELBLK method is more cost-effective because it retains fewer crosses from F1 to F8 and requires one-third less land than MODPED. In another study, Wang et al. (2009a) used computer simulation and modeling to compare the single backcross breeding strategy with other selection and crossing schemes for parental lines of differing genetic composition. The favorable traits in the simulation experiment were assumed to be polygenic and were selected by phenotypic selection. The simulation outcomes indicated that the single backcross breeding strategy was superior to two or more backcrosses and could transfer more than 60% of the favorable genes from the donor parent to improve the adaptation of the recurrent parent.
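
To make the idea of an in silico genetic model concrete, the following is a deliberately simplified sketch in Python with NumPy. It is not QU-GENE and does not reproduce any of the cited studies: it assumes purely additive, unlinked loci, fully inbred lines coded 0/1, no pleiotropy, epistasis or G × E, and phenotypic truncation selection. The function name and all parameter values are hypothetical.

```python
import numpy as np

def simulate_gain(n_cycles=5, n_lines=200, n_loci=50,
                  prop_selected=0.1, h2=0.5, seed=0):
    """Mean genetic value per cycle under phenotypic truncation selection.

    Assumptions of this sketch: additive unlinked loci, inbred lines
    coded 0/1 at every locus, heritability h2 re-imposed in each cycle.
    """
    rng = np.random.default_rng(seed)
    effects = rng.normal(0.0, 1.0, n_loci)           # effect of allele "1"
    pop = rng.integers(0, 2, size=(n_lines, n_loci))  # starting lines
    means = []
    for _ in range(n_cycles + 1):
        g = pop @ effects                             # true genetic values
        means.append(float(g.mean()))
        var_g = g.var()
        # Environmental variance chosen so var_g / (var_g + var_e) = h2.
        var_e = var_g * (1.0 - h2) / h2 if var_g > 0 else 1.0
        pheno = g + rng.normal(0.0, np.sqrt(var_e), n_lines)
        # Truncation selection: keep the best lines by phenotype.
        n_sel = max(2, int(prop_selected * n_lines))
        parents = pop[np.argsort(pheno)[-n_sel:]]
        # Random crosses; each locus is inherited from either parent.
        p1 = parents[rng.integers(0, n_sel, n_lines)]
        p2 = parents[rng.integers(0, n_sel, n_lines)]
        from_p1 = rng.integers(0, 2, size=(n_lines, n_loci)).astype(bool)
        pop = np.where(from_p1, p1, p2)
    return means

gain_curve = simulate_gain(n_cycles=5, seed=1)
print("genetic gain after 5 cycles:", round(gain_curve[-1] - gain_curve[0], 2))
```

Comparing breeding strategies then amounts to running such a model under alternative settings (for example, different selected proportions, population sizes or crossing schemes) with many random seeds and comparing the resulting gain curves; dedicated tools add linkage, dominance, epistasis and environment effects on top of this skeleton.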

Moreover, computer simulations were also used to predict cross-performance in wheat for two quality traits, maximum dough resistance (Rmax) and extensibility (Wang et al. 2005). The simulation study used eight Silverstar sister lines which were morphologically very similar but differed in allelic composition. For the simulation, the genetic model considered multiple alleles, linkage, pleiotropy and digenic epistasis. The simulation results showed that Silverstar 3 and 7 were suitable parents for crosses aimed at high Rmax, while the other lines did not have alleles b and d at Glu-A3 and Glu-D1, respectively. In another experiment, Wang et al. (2009b) used simulation modeling to pyramid favorable alleles or genes into one target genotype for long coleoptile in wheat through a design breeding approach.

Li et al. (2013) compared the use of doubled haploids (DH) with the conventional breeding strategy of the CIMMYT wheat breeding program. Simulation results revealed that using DH lines derived from F1 (F1-DH) and F3 (F3-DH) in wheat did not improve the genetic gain per cycle, per year or per dollar for adaptation traits under shuttle breeding as compared with conventional breeding.

Experience over the last decade indicates that MAS has failed to improve polygenic quantitative traits (Bernardo 2008; Xu and Crouch 2008). However, MAS has been effective in improving monogenic traits with large QTL effects tagged with a known marker (Zhong et al. 2006). The limitation of MAS comes from its two components: (1) finding a significant marker-trait association and (2) estimating QTL effects. To overcome this problem for polygenic traits, statistical methods were developed that include all markers simultaneously, without significance tests, and estimate the marker effects for the trait under study; this made genomic selection (GS) popular. Thus, GS emerged as an alternative solution for improving traits with complex genetic architecture and for overcoming the limitations of previous selection methods such as phenotypic selection and MAS. GS involves two populations: a training population (TP) and a breeding population (BP). The TP has been both phenotyped and genotyped and is used to train a prediction model, which then predicts the genetic merit of the BP (only genotyped) in the form of genomic estimated breeding values (GEBVs; also referred to as breeding values). The predicted genotypes can be used as new parents for a new breeding cycle or can enter multi-environment trials for variety development, but selection of genetically superior parents for making crosses is still a daunting task. More recently, Yao et al. (2018) demonstrated through computer simulations that GS can be used to predict cross-performance (i.e., to identify crosses with a high progeny mean and high genetic variance) based on parental selection criteria such as midparent GEBV and usefulness in bi-parental progenies. For the simulation experiment, a total of 57 parents, which included high-quality and high-yielding cultivars from China, USA and Australia, were used. For cross prediction, a selection index (incorporating yield, maximum resistance and stability time) was developed as a new trait. The simulation results showed that the best crosses or candidate parents can be selected by the usefulness rating, which resulted in higher genetic gain than selection on midparent GEBVs at smaller selected proportions. Although significant progress is being made in wheat breeding and simulation, practical implementation in crop improvement programs is still in its infancy.
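
The TP/BP workflow described above can be sketched in a few lines of Python with NumPy. This is an illustrative example only, not the model used in any cited study: it uses ridge regression on all markers as a stand-in for the GS prediction model, a fixed shrinkage parameter lam, and simulated 0/1 marker data. The usefulness function follows the textbook form (progeny mean plus selection intensity × accuracy × progeny genetic standard deviation), which is assumed here and may differ from the exact criterion of Yao et al. (2018). All names and numbers are hypothetical.

```python
import numpy as np

def fit_marker_effects(Z_train, y_train, lam=1.0):
    """Ridge regression of phenotype on all markers simultaneously."""
    Z = np.asarray(Z_train, dtype=float)
    y = np.asarray(y_train, dtype=float)
    mu = y.mean()
    col_means = Z.mean(axis=0)
    Zc = Z - col_means                                # centered markers
    lhs = Zc.T @ Zc + lam * np.eye(Z.shape[1])
    beta = np.linalg.solve(lhs, Zc.T @ (y - mu))      # shrunken effects
    return mu, col_means, beta

def predict_gebv(Z_new, mu, col_means, beta):
    """GEBV of genotyped-only lines from the trained marker effects."""
    return mu + (np.asarray(Z_new, dtype=float) - col_means) @ beta

def midparent_gebv(gebv, i, j):
    """Expected progeny mean of cross i x j under an additive model."""
    return 0.5 * (gebv[i] + gebv[j])

def usefulness(progeny_mean, progeny_sd, sel_intensity, accuracy=1.0):
    """Textbook usefulness: mean + intensity * accuracy * genetic SD."""
    return progeny_mean + sel_intensity * accuracy * progeny_sd

# Simulated example: 100 TP lines, 20 BP lines, 50 markers (hypothetical).
rng = np.random.default_rng(42)
n_tp, n_bp, n_markers = 100, 20, 50
true_effects = rng.normal(0.0, 1.0, n_markers)
Z_tp = rng.integers(0, 2, size=(n_tp, n_markers))
Z_bp = rng.integers(0, 2, size=(n_bp, n_markers))
y_tp = Z_tp @ true_effects + rng.normal(0.0, 1.0, n_tp)  # TP phenotypes

mu, col_means, beta = fit_marker_effects(Z_tp, y_tp, lam=1.0)
gebv_bp = predict_gebv(Z_bp, mu, col_means, beta)
pred_accuracy = np.corrcoef(gebv_bp, Z_bp @ true_effects)[0, 1]
print(f"prediction accuracy in BP: {pred_accuracy:.2f}")
print(f"midparent GEBV of cross 0 x 1: {midparent_gebv(gebv_bp, 0, 1):.2f}")
```

Two crosses with the same midparent GEBV can differ in usefulness if one is expected to generate more genetic variance among its progeny, which is why the two criteria can rank crosses differently when only a small proportion of progeny is selected. The progeny standard deviation passed to usefulness has to be estimated separately, for example by simulating progeny of each cross.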

Conclusion and future outlook

The extensive use of NGS and other advancements is making inroads for gene discovery in Asia, but progress is limited to a few countries such as China and Japan. In the wheat pan-genome project (10+ wheat genome project), only one cultivar is from Asia (Norin 61 from Japan), while genetic diversity from other Asian countries is not represented. Therefore, there is a strong need to develop genomic resources based on genetic diversity from Asia and to translate this knowledge into applied breeding. Similarly, comprehensive knowledge of diversity from Asia, including cultivars, landraces and wild relatives, is necessary to deepen our understanding of the evolution, domestication and adaptation of wheat in Asia. A more collaborative network on crop phenomics among researchers from Asia, like the European Plant Phenotyping Network (EPPN), is also needed so that such facilities become accessible to countries that are unable to afford high-tech facilities.