Introduction

Food production has advanced from gathering food in the wild, to the cultivation and selection of wild plants (landraces), and further to modern plant breeding of new varieties and cultivars suited to human requirements for taste, yield, and storage (Meyer and Purugganan 2013). Although currently grown crop cultivars provide stable and essential diets for humankind, they represent only a fraction of the available gene pool, and their genetic diversity has been narrowed by domestication and modern breeding (Glaszmann et al. 2010). Wild species are highly diverse and variable in agronomic and morphological traits. Crop wild relatives (CWRs), which include the progenitors of crops and closely related species, have provided plant breeders with a broad pool of potentially useful genetic resources (Tanksley and McCouch 1997). CWRs may provide new alleles for resistance to biotic stresses (insects, fungi, bacteria, viruses, or viroids) or abiotic stresses (extreme temperatures, drought/floods, salinity, etc.), novel flavors and nutrients, and genes that may affect yield and quality traits (Bohra et al. 2022). The utilization of CWRs for crop improvement has been particularly successful in wheat breeding. Modern bread wheat (Triticum aestivum L., 2n = 6x = 42, AABBDD) originated from two natural wide-hybridization events involving three grass species (International Wheat Genome Sequencing Consortium 2014). Closely related wild species that were not part of these crosses have been used to provide modern wheat with a boost of useful genetic diversity. Wheat wild relatives contain a large number of favorable genes for crop production and have gained prominence through their use in wheat improvement, especially for resistance to various fungal diseases (Pour-Aboughadareh et al. 2021). The most notable examples are the powdery mildew resistance genes Pm8 and Pm21, which were incorporated from rye and Dasypyrum villosum, respectively (Chen et al. 1995; Hurni et al. 2014), the Fusarium head blight resistance gene Fhb7 from Thinopyrum elongatum (Wang et al. 2020), and the rust resistance genes Yr17/Lr37/Sr38 from Aegilops ventricosa (McIntosh et al. 1995). Recently, widespread alien introgressions from wild relatives into modern cultivated wheat have been revealed by pan-genome sequencing (Walkowiak et al. 2020) and whole-genome resequencing (Cheng et al. 2019; Hao et al. 2020). These findings underline the importance of closely related wild species as a source of useful genetic diversity for modern wheat.

Wheat yellow mosaic virus (WYMV) is one of the major soil-borne bymoviruses that cause disease in autumn-sown varieties and threaten wheat production in many areas of China (Xu et al. 2018). The disease was first reported in Sichuan in the 1960s (Wang et al. 1980), spread gradually into the wheat-growing region of the middle and lower reaches of the Yangtze River (Sun et al. 2013), and more recently reached the Huang-Huai-Hai wheat-growing region (Yang et al. 2022). The virus is vectored by the obligate root-inhabiting plasmodiophorid Polymyxa graminis, which produces motile primary and secondary zoospores and thick-walled resting spores within colonized cereal roots (Chen 1993). Virus particles are transferred from the zoospores of P. graminis into seedling root cells and then transported to newly emerging young leaves (Liu et al. 2016). Infected seedlings display visible symptoms in early spring, characterized by a yellow-striped mosaic pattern on leaves and mildly stunted growth at the jointing stage (Xiao et al. 2016). This results in fewer tillers, shorter spikes, a low seed-setting rate and hollow kernels at harvest, and ultimately in severe yield losses (Han et al. 2000). The disease typically causes yield losses of 10–30%, which can reach 70% in severe epidemic years (Chen 2005).

The thick-walled resting spores of P. graminis can survive in soil for more than 10 years (Chen 1992). It is therefore extremely difficult to remove P. graminis carrying WYMV from infested soil. Pesticide control and agronomic methods also have a limited effect on eliminating P. graminis spores and WYMV in infested fields (Kanyuka et al. 2003). Breeding WYMV-resistant cultivars is considered a sustainable and economical means of virus control (Barbosa et al. 2001). Genetic variation for WYMV resistance in common wheat has been characterized and ranges from high susceptibility to complete resistance. To date, seven WYMV resistance genes/quantitative trait loci (QTLs) have been reported in wheat. They are inherited in a dominant manner and have been mapped to chromosomes 2DL, 2A, 3BS, 5AL, 7A, and 7BS of common wheat and to 4VS of D. villosum, a wheat wild relative whose resistance gene was designated Wss1 (Zhang et al. 2005). Two QTLs were detected in Xifeng wheat, located on 3BS (QYm.njau-3B.1) and 5AL (QYm.njau-5A.1), respectively (Zhu et al. 2012). In addition, two QTLs on 7A and 7BS of emmer wheat conferring resistance to wheat spindle streak mosaic virus (WSSMV) were named Qssm-mtpsa-7A and Qssm-mtpsa-7BS (Holtz et al. 2017). Most of the reported resistance genes/QTLs are on homoeologous group 2 chromosomes. YmNM from the wheat cultivar Ningmai9 was first mapped to chromosome 2A (Liu et al. 2005a). Five independent studies reported genes/QTLs on 2DL in different wheat cultivars: QYm.nau-2D from Chinese cultivar Yining Xiaomai (Xiao et al. 2016), YmYF from Chinese cultivar Yangfu9311 (Liu et al. 2005b), YmIb from European cultivar Ibis (Nishio et al. 2010), Qym1 from American cultivar Madsen (Suzuki et al. 2015), and Q.Ymym from Japanese cultivar Yumechikara (Kojima et al. 2015). Comparison of these 2DL resistance loci indicated that they are all located within the same genetic region. Thus, they could be controlled by the same dominant gene, which is widely distributed in common wheat of different origins (Kobayashi et al. 2020).

To facilitate the utilization, fine mapping and cloning of QYm.nau-2D, in the present study we performed linkage and association mapping by taking advantage of recently released wheat genome sequences (International Wheat Genome Sequencing Consortium et al. 2018; Sato et al. 2021; Walkowiak et al. 2020). We demonstrate that the introgression fragment on chromosome 2D harboring QYm.nau-2D has been selected and utilized worldwide in breeding programs for improving WYMV resistance. Eleven diagnostic markers for marker-assisted selection of QYm.nau-2D were developed and can be used in breeding for WYMV resistance.

Materials and methods

Plant materials

A biparental population of 11,265 F2 individuals derived from the cross between 2011I-78 (WYMV susceptible) and ‘Yining Xiaomai’ (WYMV resistant) was used for fine mapping of QYm.nau-2D. Line 2011I-78 is a Chinese Spring substitution line in which Chinese Spring chromosome 2D is substituted by Aegilops tauschii chromosome 2D; it was kindly provided by Dr. J. Dvorak (University of California, Davis, USA). A total of 53 F2:3 families derived from F2 recombinant plants were evaluated for WYMV resistance and used to validate the genotypes of the corresponding F2 plants.

A natural population consisting of 372 wheat varieties (Supplementary Table 1) was selected as an association panel for evaluation of WYMV resistance, genome-wide association study (GWAS) and haplotype analysis.

DNA extraction and polymerase chain reaction (PCR) analysis

Genomic DNA was extracted from young leaves using a modified cetyl trimethyl ammonium bromide (CTAB) method (Murray and Thompson 1980). PCR was performed on a programmable thermal cycler (PTC-200, BioRad, Hercules, CA, USA) in total volumes of 10 μL, containing 50 ng genomic DNA, 2 μM each of the primer pairs and 5 μL of 2 × Taq Master Mix (Vazyme, Nanjing, China). The amplification program began with 1 cycle of primary denaturation at 94 °C for 5 min, followed by 35 cycles of denaturing at 94 °C for 30 s, 51, 55 or 61 °C (varying with annealing temperatures of the primer pairs) for annealing and 72 °C for extension. One additional cycle was performed at 72 °C for 10 min for final elongation of the PCR products. PCR products were resolved in 8% non-denaturing polyacrylamide gels (Acr:Bis = 19:1 or 39:1) and were visualized by silver staining (Bassam and Gresshoff 2007).

RNA extraction and reverse transcription (RT)

Leaf samples were collected from the WYMV nursery. Virus-free leaf samples were collected from a greenhouse and used as controls. Total RNA was isolated using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) following the manufacturer’s instructions. The HiScript® II First Strand cDNA Synthesis Kit (Vazyme, Nanjing, China) was used to synthesize the first-strand cDNA using 1 µg total RNA. Each 20 µL reaction contained 10 µL 2 × RT Mix, 4 µL HiScript II Enzyme Mix, 1 µL oligo (dT)23VN (50 µM), and nuclease-free H2O up to 20 µL. The reverse transcription program was 25 °C for 5 min, 50 °C for 15 min, and 85 °C for 2 min.

Detection of WYMV by RT-PCR

A degenerate primer pair (WMVCP-F: gctgcggacacacaaacwgacg, WMVCP-R: ggttagctctggrtgtccatcag; “w” represents A or T, “r” represents A or G) was synthesized for detection of WYMV (Clover and Henry 1999; Lei et al. 1998) and used to test for the presence of WYMV in the nursery. The Tubulin gene (Tubulin-F: agaacactgttgtaaggctcaac, Tubulin-R: gagctttactgcctcgaacatgg) was used to normalize cDNA concentrations. PCR was performed using the cDNA samples, and the products were separated and analyzed by 1% agarose gel electrophoresis.
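As an illustration of what the degenerate positions in the WYMV primers encode, the following minimal Python sketch expands each primer into the concrete sequences it represents. The function name and the restricted IUPAC table are illustrative assumptions and are not part of the original protocol.

```python
from itertools import product

# Subset of IUPAC nucleotide codes needed for the two primers
# (w = A or T, r = A or G); extend the table for other degenerate bases.
IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t", "w": "at", "r": "ag"}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.lower()]
    return ["".join(combo) for combo in product(*choices)]


wmvcp_f = expand_degenerate("gctgcggacacacaaacwgacg")
wmvcp_r = expand_degenerate("ggttagctctggrtgtccatcag")
```

Each primer contains a single degenerate position, so each expands to two oligonucleotides.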

Development of molecular markers

To enrich the marker density of the QYm.nau-2D region, four kinds of molecular markers were developed. Primer sequences of the markers selected on the basis of our previous results are listed in Supplementary Table 2. The previously reported QYm.nau-2D flanking markers (Xiao et al. 2016), Xwmc41 (F: tccctcttccaagcgcggatag, R: ggaggaagatctcccggagcag) and 2SNP71 (Outer-F: ttctcaagcaacttgtaacgtaccggtgga, Outer-R: gcagttgcttcccgggaagcgattcgttc, Inner-F: caatctccacttccgattcggccttgaat, Inner-R: tgctcaactttcccgctagcctaatag), were used as queries in BLASTn (version 2.12.0) searches against the chromosome 2D sequences of Ae. tauschii (cv. AL8/78) reference v4.0 (Luo et al. 2017) and Chinese Spring reference v1.0 (International Wheat Genome Sequencing Consortium et al. 2018). The intervals delimited by the two markers cover ~ 48.8 Mb (from 576.1 Mb to 624.9 Mb) and ~ 49.0 Mb (from 577.4 Mb to 626.4 Mb), respectively.

Simple sequence repeat (SSR) marker

SSR sites were identified with MISA (MIcroSAtellite identification tool, http://pgrc.ipk-gatersleben.de/misa/), a Perl script-based program (Beier et al. 2017; Thiel et al. 2003), in the 2D genome sequences from 576.1 Mb to 624.9 Mb in AL8/78 and from 577.4 Mb to 626.4 Mb in Chinese Spring. The search criteria for repeat motifs were a minimum of ten repeat units for mononucleotides, six repeat units for dinucleotides, five repeat units for tri- and tetra-nucleotides, and four repeat units for penta- and hexa-nucleotides (Portis et al. 2007). Primer pairs for SSR markers were designed from the 200 bp sequences upstream and downstream of the SSR sites that met these criteria, using Primer3 software (version 2.4.0) (Rozen and Skaletsky 2000; Untergasser et al. 2012) with the following parameters: target amplicon size of 100–280 bp, optimal annealing temperature of 55–60 °C, GC content of 50–60% and primer length of 18–22 bp.
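The repeat-unit thresholds above can be illustrated with a short Python sketch that scans a sequence for perfect microsatellites. This is not MISA itself; the function name and the regular-expression approach are assumptions made for illustration only, and compound or imperfect repeats are not handled.

```python
import re

# Minimum number of repeat units per motif length, taken from the text:
# mono 10, di 6, tri and tetra 5, penta and hexa 4.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 4, 6: 4}


def find_ssrs(seq: str):
    """Return sorted (start, motif, n_units) tuples for perfect SSRs (0-based start)."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are copies of a shorter unit (e.g. "AA", "ATAT"),
            # because the shorter motif length already reports them.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


example = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "T"
ssrs = find_ssrs(example)
```

In the example sequence, the sketch reports one dinucleotide SSR and one mononucleotide SSR, and ignores runs shorter than the stated minimum.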

Insertion-deletion (InDel) marker

The chromosome 2D sequence of the Fielder genome (from 588.8 Mb to 637.9 Mb) was extracted based on the QYm.nau-2D flanking markers mentioned above (Sato et al. 2021). Sequence differences between Chinese Spring and Fielder in the QYm.nau-2D mapping region were identified with the diffseq program (version 3.6) (Rice et al. 2000) in the EMBOSS package, using the parameter -wordsize 10. The flanking sequences of InDel sites (length differences > 20 bp) between the two genomes were extracted to design InDel markers with Primer3 using the same parameters as above.

Single nucleotide polymorphism (SNP) marker

SNPs between Yining Xiaomai and 2011I-78 detected by the Wheat 55K SNP array were selected for conversion into tetra-primer markers (Chiapparino et al. 2004; Medrano and De Oliveira 2014). The flanking sequences of the SNPs (500 bp upstream and downstream) were identified by BLASTn searches against the Chinese Spring reference genome sequence (International Wheat Genome Sequencing Consortium et al. 2018). Tetra-primer markers were designed using the BatchPrimer3 program (https://probes.pw.usda.gov/batchprimer3/) (You et al. 2008).

Sequence-tagged site (STS) marker

STS markers were designed from intron length polymorphisms between AL8/78 and Fielder in homologous genes annotated in the QYm.nau-2D mapping region, according to the protocols described by Zhang et al. (2017).

Primer pair sequences of all four marker types were pre-filtered with e-PCR (Electronic PCR) software (version 2.3.12) (Schuler 1997, 1998) to ensure specificity to chromosome 2D of Chinese Spring reference v1.0, with the parameters D = 50–500, N = 2, G = 2, T = 3. The 2D-specific primer pairs were synthesized by TSINGKE (Beijing TsingKe Biotechnology Co., Ltd., Beijing, China) and tested for polymorphism between 2011I-78 and Yining Xiaomai by analysis of the PCR products.

Genotyping by Wheat 55K SNP array and genome-wide association study (GWAS)

Genotyping with the Wheat 55K SNP array, which contains 53,063 SNPs, was performed by CapitalBio Technology Co. Ltd. (Beijing, China). Genomic DNA of Yining Xiaomai, 2011I-78 and the 372 wheat varieties was extracted. DNA integrity was confirmed on agarose gels, and DNA quantity was measured spectrophotometrically before the samples were used for the Wheat 55K SNP array. The accuracy of Wheat 55K SNP array genotyping was initially assessed according to the Affymetrix Best Practices Workflow.

For GWAS, we first removed SNPs without location information according to the annotation of the Wheat 55K SNP array, and then removed SNPs with a minor allele frequency (MAF) < 0.05 or missing data > 20%; 44,215 SNPs were retained for further analysis. The population structure was determined using STRUCTURE software (version 2.3.4) (Hubisz et al. 2009). The optimal K value (K = 2) was determined using the admixture model with correlated allele frequencies, with a burn-in of 10,000 iterations followed by 10,000 Markov chain Monte Carlo iterations. The kinship matrix was calculated with GCTA software (version 1.94.1) (Yang et al. 2011). GWAS was performed using the mixed linear model (Q + K) in TASSEL software (version 5.0) (Bradbury et al. 2007). The K matrix was treated as a random effect, whereas the Q matrix was treated as a fixed effect. The significance threshold for the P value was calculated using a modified Bonferroni correction (Genetic type 1 Error Calculator, version 0.2) with the formula P = 0.01/n (where n is the effective number of independent SNPs) (Li et al. 2012). Manhattan plots were drawn using the CMplot package in R (Yin 2020).
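The SNP quality filter described above (MAF and missing-data thresholds) can be sketched as follows. The genotype encoding (0, 1, 2 copies of the alternate allele, None for a failed call) and the function name are assumptions for illustration; the actual filtering tool used in the study is not stated.

```python
def passes_qc(genotypes, maf_min=0.05, max_missing=0.20):
    """Return True if a SNP is kept.

    genotypes: one entry per variety, coded as 0/1/2 alternate-allele
    counts, with None for a failed call (an assumed encoding).
    """
    n_total = len(genotypes)
    called = [g for g in genotypes if g is not None]
    n_missing = n_total - len(called)
    # Reject SNPs with no calls or with more than max_missing failed calls.
    if not called or n_missing > max_missing * n_total:
        return False
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    # Reject SNPs whose minor allele frequency is below maf_min.
    return maf >= maf_min


kept = passes_qc([0] * 9 + [2])          # MAF = 0.10, no missing calls
monomorphic = passes_qc([0] * 10)        # MAF = 0
too_sparse = passes_qc([0, 2, None, None, None, 0, 2, 0, 2, 0])  # 30% missing
```

Applying such a filter SNP by SNP yields the retained marker set that is passed to the association analysis.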

Haplotype analysis

The 585 SNPs located in the QYm.nau-2D-associated region of chromosome 2D (570.4–619.2 Mb) were used for haplotype analysis. A phylogenetic tree was constructed with MEGA-X (Kumar et al. 2018) using the maximum likelihood method and visualized with Interactive Tree of Life (iTOL, https://itol.embl.de/) (Letunic and Bork 2021). Principal component analysis (PCA) was performed with the glPca function of ADEGENET (version 2.0.1) (Jombart and Ahmed 2011) in R. SNP variants were converted to numeric codes before the SNP patterns in the QYm.nau-2D linkage block were analyzed. Heterozygous sites were coded as ‘1’ and failed sites as ‘−1’. SNP variants identical to those of ‘Yining Xiaomai’ were coded as ‘0’, and SNP variants different from those of ‘Yining Xiaomai’ were coded as ‘2’. The result was plotted with the pheatmap package in R (Kolde 2019).
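The numeric coding used for the SNP pattern heatmap can be sketched in Python as below. The two-character genotype strings (for example "AA" or "AG") and the use of None or "NN" for failed calls are assumptions made for illustration; the array export format is not described in the text.

```python
def encode_call(call, reference_call):
    """Encode one SNP call relative to the reference variety (Yining Xiaomai).

    Returns -1 for a failed call, 1 for a heterozygous call,
    0 if identical to the reference call and 2 if different from it.
    """
    if call is None or call == "NN":   # failed genotype (assumed notation)
        return -1
    if call[0] != call[1]:             # heterozygous site
        return 1
    return 0 if call == reference_call else 2


def encode_sample(calls, reference_calls):
    """Encode all SNP calls of one variety against the reference calls."""
    return [encode_call(c, r) for c, r in zip(calls, reference_calls)]


codes = encode_sample(
    ["AA", "GG", "AG", None, "NN"],
    ["AA", "AA", "AA", "CC", "CC"],
)
```

Stacking the encoded vectors of all varieties gives a matrix with one row per SNP and one column per variety, which is the kind of input a heatmap function expects.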

Collinearity analysis

Flanking SNPs (AX-111678254, AX-110524750) from the Wheat 55K SNP array that overlapped the QYm.nau-2D region were used as landmarks to retrieve, by BLASTn, the genomic sequences from hexaploid wheat varieties with chromosome-level genome assemblies, including ‘Chinese Spring’ (560.9–623.2 Mb) (International Wheat Genome Sequencing Consortium et al. 2018), ‘Fielder’ (568.9–634.8 Mb) (Sato et al. 2021), and ‘Mace’ (558.0–620.3 Mb), ‘LongReach Lancer’ (558.0–619.5 Mb), ‘CDC Stanley’ (565.6–626.0 Mb), ‘CDC Landmark’ (564.2–626.5 Mb), ‘Norin61’ (559.7–620.8 Mb), ‘ArinaLrFor’ (562.3–629.3 Mb), ‘Jagger’ (579.0–646.4 Mb), ‘Julius’ (569.5–636.7 Mb), ‘Spelt (PI190962)’ (560.7–622.4 Mb), ‘SY Mattis’ (559.5–625.3 Mb) (Walkowiak et al. 2020). Structural variations (SVs) were detected using the SyRI (Synteny and Rearrangement Identifier, version 1.6.1) pipeline (Goel et al. 2019) with default parameters. For this, the fragment of the Chinese Spring RefSeq v1.0 genome was aligned to the fragments from the other 11 sequenced genomes using the nucmer utility from the MUMmer toolbox (version 4.0) (Marçais et al. 2018) with the parameter settings -mum -c 100 -l 40 -L 100. The delta file was filtered using the delta-filter utility from MUMmer with a minimum alignment length of 500 bp and a minimum alignment identity of 50%. The filtered delta file was converted to stdout format using the show-coords utility from MUMmer, and the alignment results were then used by SyRI to identify genomic SVs. SyRI identified syntenic, duplicated, translocated and inverted sequences, which were plotted with plotsr (version 0.5.4) (Goel and Schneeberger 2022).

Chromosome painting of 2D

Yining Xiaomai and Chinese Spring were used for chromosome painting with the chromosome 2D-specific probe library developed from the Chinese Spring 2D sequence. The preparation of chromosome spreads and chromosome 2D oligonucleotide painting probes, and fluorescence in situ hybridization (FISH), were performed according to the protocols described by Shi et al. (2022).

Mapping coverage analysis

Four Aegilops species, Ae. umbellulata (UU), Ae. markgrafii (CC), Ae. comosa (MM), and Ae. uniaristata (NN), were Illumina-sequenced (Shi et al. 2022). The clean data (100 million reads) of each of the four species were mapped separately against the Fielder reference genome using BWA-mem (version 0.7.17-r1188). The mean mapped read depth was calculated on chromosome 2D of Fielder in 10 kb genomic windows with SAMtools (version 0.1.19-44428cd). The depth of coverage was plotted in R.
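The windowed depth calculation can be illustrated with a minimal Python sketch that averages per-base depth in fixed 10 kb windows. The input format (pairs of 1-based position and depth for a single chromosome) and the treatment of unlisted positions as zero depth are assumptions; the exact SAMtools commands used in the study are not given.

```python
def window_mean_depth(depths, chrom_len, window=10_000):
    """Mean read depth per fixed-size window along one chromosome.

    depths: iterable of (position, depth) pairs with 1-based positions.
    Positions that are not listed are treated as depth 0 (an assumption).
    """
    n_windows = -(-chrom_len // window)  # ceiling division
    totals = [0] * n_windows
    for pos, depth in depths:
        totals[(pos - 1) // window] += depth
    means = []
    for i, total in enumerate(totals):
        # The last window may be shorter than the others.
        size = min(window, chrom_len - i * window)
        means.append(total / size)
    return means


# Toy example: a 15 bp "chromosome" with 10 bp windows.
toy = window_mean_depth([(1, 10), (10, 30), (11, 5)], chrom_len=15, window=10)
```

Plotting the resulting list against window start positions gives a coverage profile in which regions similar to the mapped species stand out as elevated depth.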

Evaluation of WYMV resistance

The F2 population and the two parents (2017–2018), and the F2:3 families derived from F2 recombinant plants together with the two parents (2018–2019), were grown in a WYMV nursery in Zhumadian, Henan, China. The 372 wheat varieties were grown in two WYMV nurseries: in Zhumadian, Henan, China in 2018–2019 and 2020–2021, and in Yandu, Jiangsu, China in 2020–2021 and 2021–2022. The experimental design was randomized complete blocks with two replications. Each plot comprised 1 m rows spaced 25 cm apart, with 15 seeds in each row. All the field trials were managed according to local practices. Control cultivars with known resistance or susceptibility were Yining Xiaomai (resistant to WYMV), Yangmai158 (susceptible to WYMV), and 2011I-78 (susceptible to WYMV).

The disease rates (DRs) for WYMV were recorded on a 0–5 disease scale using the grading standards described by Kojima et al. (2015), with 0 = no visible symptoms, 1 = slightly purple leaves, 2 = slightly yellowish leaves with a mild streak mosaic, 3 = yellowish leaves with a clear streak mosaic on less than half of the leaves, 4 = a distinct yellow and streak mosaic covering almost all of the leaves, and 5 = yellowish‐brown leaves with more than half of the plant dead. A plant was graded as R (resistant, DR 0–2) if no visible symptom was observed and as S (susceptible, DR 3–5) if obvious mosaic, streak spots and dwarf symptoms were observed (Nishio et al. 2010).
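The grading rule can be written as a small Python helper that maps a disease rate on the 0–5 scale to the R or S class. The function name is hypothetical and the sketch only restates the thresholds given above.

```python
def classify_dr(dr: int) -> str:
    """Map a disease rate (0-5) to 'R' (DR 0-2) or 'S' (DR 3-5)."""
    if not 0 <= dr <= 5:
        raise ValueError(f"disease rate must be between 0 and 5, got {dr}")
    return "R" if dr <= 2 else "S"


classes = [classify_dr(dr) for dr in range(6)]
```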

The WYMV DRs for the F2 recombinants were collected from a nursery in Zhumadian, Henan, on March 15 and March 27 in 2018, and DRs for F2:3 progenies were collected on February 24 and March 12 in 2019. For each F2 individual or F2:3 family, the DRs of all plants were scored. We used the data when the disease symptoms of the susceptible controls were the most severe.

The WYMV DRs for the natural population of 372 wheat varieties were collected in the nursery in Zhumadian, Henan, on February 24 and March 12 in 2019 and on January 21 and March 3 in 2021, and in the nursery in Yandu, Jiangsu, on February 22 and March 10 in 2021 and on February 23 and March 4 in 2022. For each variety, the DRs were collected from the ten individuals with the most severe disease symptoms. The mean WYMV DRs of each wheat variety in the four environments were used to calculate the best linear unbiased prediction (BLUP). Genotypes of the 372 wheat varieties and the four environments were treated as random effects in a linear mixed model to estimate BLUP values using the lme4 package (Bates et al. 2015) in R (version 4.2.0). The mean WYMV DRs for each variety in the four environments and the BLUP values were used to calculate correlation coefficients, which were plotted with the GGally package (Schloerke et al. 2018) in R.
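To show what the BLUP step does conceptually, the sketch below computes genotype BLUPs for a complete, balanced variety-by-environment table with both factors treated as random. It uses method-of-moments (ANOVA) variance components and a shrinkage factor; it is not the lme4 implementation used in the study, and for unbalanced or incomplete data a REML fit would be needed. The function name and the input layout are assumptions.

```python
def blup_balanced(table):
    """Genotype BLUPs for a complete table; table[i][j] = mean DR of
    variety i in environment j (model: variety + environment + residual)."""
    a, b = len(table), len(table[0])
    grand = sum(map(sum, table)) / (a * b)
    row_means = [sum(row) / b for row in table]
    col_means = [sum(table[i][j] for i in range(a)) / a for j in range(b)]
    # Mean squares for varieties and for the residual.
    ms_g = b * sum((m - grand) ** 2 for m in row_means) / (a - 1)
    ss_res = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(a)
        for j in range(b)
    )
    ms_res = ss_res / ((a - 1) * (b - 1))
    # Method-of-moments genotypic variance, truncated at zero.
    var_g = max((ms_g - ms_res) / b, 0.0)
    shrink = var_g / (var_g + ms_res / b) if var_g > 0 else 0.0
    # BLUP = shrunken deviation of each variety mean from the grand mean.
    return [shrink * (m - grand) for m in row_means]


additive = blup_balanced([[1, 2], [3, 4], [5, 6]])   # no residual noise
noisy = blup_balanced([[1, 3], [4, 2], [5, 6]])
```

With no residual variation the BLUPs equal the raw deviations from the grand mean; with noise they are pulled toward zero, which is why BLUP values are less sensitive to a single unusual environment than simple means.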

Agronomical trait evaluation and statistical analysis

The agronomic traits of the 372 wheat varieties were evaluated in three disease-free environments at Zhenjiang (Jiangsu) in 2017–2018, 2018–2019, and 2019–2020. At maturity, ten main spikes in the middle of each row were selected from each plot to measure plant height (PH, cm); seven spike-related traits, including spike length (SL, cm), spikelet density (SD), spike weight (SW, g), total spikelet number per spike (SPN), fertile spikelet number per spike (FSPN), top sterile spikelet number per spike (TSSN), and basal sterile spikelet number per spike (BSSN); and seven grain-related traits, including kernel number per spike (KNS), kernel length (KL, mm), kernel width (KW, mm), kernel length-to-width ratio (KLWR), thousand kernel weight (TKW, g), kernel area size (KAS, mm²), and kernel perimeter length (KPL, mm). For each variety, the mean value of each trait was calculated across the two replicates in each environment. The BLUP values of the target traits across the three environments were obtained using the lme4 package in R. For each trait, agronomic performance was compared among the seven haplotypes using the BLUP values and LSD (least significant difference) multiple comparisons with the agricolae package (De Mendiburu 2014) in R, setting α = 0.05. The results were plotted with the ggplot2 package (Wickham 2009) in R.

Results

Fine mapping of QYm.nau-2D

Our previous research mapped QYm.nau-2D to the distal region of chromosome arm 2DL (Xiao et al. 2016), flanked by markers WXE1339 (594.6 Mb) and 2EST784 (617.4 Mb) (Chinese Spring reference v1.0). We enriched the marker density within an expanded interval between Xwmc41 (577.4 Mb) and 2SNP71 (626.4 Mb) by developing new molecular markers. In total, we designed and identified 50 markers (19 SSR markers, 21 InDel markers, four SNP markers and six STS markers) showing polymorphism between the two parents of the mapping population, Yining Xiaomai and 2011I-78 (Supplementary Table 2). The specificity of these markers to 2D was validated by amplifying the polymorphic bands from DNA of the homoeologous group 2 nulli-tetrasomic lines of Chinese Spring.

Screening of the 11,265 individuals of the (2011I-78 × Yining Xiaomai) F2 population using markers Xwmc41 (577.4 Mb) and 2SNP71 (626.4 Mb) identified 53 recombinants. These recombinants represented 13 recombination types in the examined region. Evaluation of WYMV resistance in the F2 recombinants and F2:3 families indicated that five recombination types were resistant and eight were susceptible (Fig. 1, Supplementary Table 3). QYm.nau-2D fell into an interval flanked by InDel_M41 and InDel_M412, at genetic distances of 0.01 cM and 0.13 cM, respectively. The interval corresponds to a physical distance of 18.8 Mb, from 596.0 to 614.8 Mb on chromosome 2D of Chinese Spring.

Fig. 1

Mapping of the wheat yellow mosaic virus (WYMV) resistance QTL QYm.nau-2D using the F2 population derived from resistant Yining Xiaomai and susceptible 2011I-78. S: susceptible; R: resistant; WYMV+ indicates that WYMV was detected by RT-PCR, and WYMV− indicates that WYMV was not detected. A–M indicate the 13 recombinant haplotypes

GWAS for WYMV resistance

A panel of 372 wheat varieties was genotyped using the Wheat 55K SNP array. After removing SNPs with MAF < 0.05 or missing data > 20%, 44,215 SNPs were retained and used for GWAS. The panel was evaluated for WYMV resistance in four environments, and the DRs showed a bimodal frequency distribution in each environment and for the BLUP values across environments. According to the frequency distribution, the varieties could largely be classified into two groups, resistant and susceptible (Supplementary Figure 1A). Pearson correlation analysis showed high repeatability among the environments (Supplementary Figure 1B). GWAS for WYMV resistance was conducted separately using the WYMV phenotypic data from each of the four environments and the BLUP values. A total of 74 SNPs that were stably and significantly associated (P value < 0.01/44,215) with WYMV resistance in the four environments and the BLUP values were identified on chromosomes 1D, 2A, 2B, 2D, 4B, and 7D. Except for the previously reported QTLs on 2A and 2D, the other five were new QTLs detected in the present research. SNPs on chromosome 2D accounted for the largest proportion, 83.8% (62 of the 74 SNPs) (Fig. 2A; Supplementary Table 4). These SNPs were all aligned to the distal region of chromosome arm 2DL from 596.7 to 613.1 Mb (Fig. 2B), in which QYm.nau-2D was mapped. GWAS thus validated the presence of the WYMV resistance QTL QYm.nau-2D, which was mapped to a physical interval of 16.4 Mb.
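The Bonferroni-type thresholds can be reproduced with a few lines of Python. The sketch assumes that the number of tests equals the 44,215 retained SNPs, as in the threshold P value < 0.01/44,215; the variable names are illustrative.

```python
import math

n_snps = 44_215  # SNPs retained after filtering

# Significance thresholds on the -log10(P) scale for the 0.05 and 0.01 levels.
threshold_05 = -math.log10(0.05 / n_snps)
threshold_01 = -math.log10(0.01 / n_snps)

print(f"0.05 level: -log10(P) > {threshold_05:.3f}")
print(f"0.01 level: -log10(P) > {threshold_01:.3f}")
```

The two values, about 5.947 and 6.646, match the cut-offs given in the legend of Fig. 2.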

Fig. 2

Genome-wide association study of wheat yellow mosaic virus (WYMV) resistance. A Circles from the outermost to the innermost layer indicate the number of SNPs within a 1 Mb window and the five data sets (2019 and 2021 in Zhumadian, Henan (2019_ZMD and 2021_ZMD), 2021 and 2022 in Yandu, Jiangsu (2021_YD and 2022_YD), and the best linear unbiased prediction (BLUP) values across the nurseries); yellow dots indicate significance at the 0.05 Bonferroni-corrected level, with −log10(P value) > 5.947 and ≤ 6.646; red dots indicate significance at the 0.01 Bonferroni-corrected level, with −log10(P value) > 6.646; B Distribution of significant SNPs associated with WYMV resistance on chromosome 2D of Chinese Spring. The 45 highly associated SNPs (P value) on Chinese Spring 2D were used to represent the QYm.nau-2D linkage block for haplotype analysis of the panel of 372 wheat varieties and 12 sequenced wheat varieties (color figure online)

Haplotype analysis of QYm.nau-2D linkage block

Out of the 62 significant SNPs on chromosome 2D, 45 highly associated SNPs (P value < 10⁻¹², Supplementary Table 4) were selected to represent the QYm.nau-2D linkage block and used for haplotype analysis. Two haplotypes, designated Hap_a and Hap_b, were identified. Hap_a, comprising 106 varieties, was the same as that of the resistant parent Yining Xiaomai. Hap_b, comprising the remaining 266 varieties, was the same as that of the susceptible parent 2011I-78 (Fig. 2B). All the Hap_a varieties showed high resistance to WYMV (Supplementary Figure 1), implying that they harbor QYm.nau-2D. Most Hap_b varieties were susceptible; however, sixty-five Hap_b varieties showed stable resistance to WYMV in the four environments, indicating that they might carry resistance genes/QTLs other than QYm.nau-2D. We excluded the 106 Hap_a varieties and repeated the GWAS using the remaining 266 Hap_b varieties. No SNP on chromosome 2D was significantly associated with WYMV resistance, further supporting the presence of QYm.nau-2D in the 106 Hap_a varieties (Supplementary Figure 2). The haplotype analysis also showed a lack of recombination within the large associated region (16.4 Mb), implying that QYm.nau-2D is located in a recombination-suppressed linkage block. We found that the wheat variety Fielder was resistant to WYMV and belonged to Hap_a (Fig. 2B), indicating that Fielder also contains QYm.nau-2D.

The detailed genotyping results were analyzed using all the filtered SNPs in the enlarged associated region, corresponding to 560.0 Mb to 623.0 Mb of Chinese Spring 2D. The SNP alleles were classified into three types: the same as Yining Xiaomai, different from Yining Xiaomai, and failed genotypes. The Hap_a varieties harboring QYm.nau-2D could be further classified into six sub-haplotypes, I–VI, by phylogenetic tree and principal component analysis, containing 59, 14, 10, 1, 14, and 8 varieties, respectively (Fig. 3A, Supplementary Figure 3). Genotyping identified four obvious recombination events, at 579.5 Mb (Hap_a-VI), 596.7 Mb (Hap_a-III, Hap_a-V, and Hap_a-VI), 601.9 Mb (Hap_a-IV) and 613.1 Mb (Hap_a-II and Hap_a-III) (Fig. 3B, Supplementary Figure 5). We propose that all QYm.nau-2D sub-haplotypes originated from the same fragment, with Yining Xiaomai representing Hap_a-I, which carries the largest fragment. The other five sub-haplotypes may have resulted from different recombination events that occurred in the associated region during wheat breeding via hybridization (Supplementary Figure 5). Hap_a-III and Hap_a-VI may have originated from recombination events (Fig. 3B). Hap_a-IV was represented only by the European landrace BARBELA GROSSO, whose recombination event occurred in the 601.9 Mb region. This region had a high frequency of failed genotype calls, which may be due to high sequence divergence from the corresponding region of Chinese Spring (Supplementary Figure 5; Supplementary Table 1). Since BARBELA GROSSO is resistant to WYMV, QYm.nau-2D could be further narrowed to an 11.2 Mb interval of Chinese Spring 2D (601.9–613.1 Mb) (Fig. 3B).

Fig. 3

Haplotype analysis of a panel of 372 wheat varieties. A Phylogenetic tree based on the 585 SNPs from 570.4 to 619.2 Mb on Chinese Spring 2D classified the varieties into seven haplotypes: Hap_a-I ~ VI (106 varieties) and Hap_b (266 varieties); B The patterns of SNPs analysis in QYm.nau-2D linkage block region on Chinese Spring 2D. Each column indicated a wheat material, and each row was an SNP site

Putative origin of the alien introgression by chromosome painting, collinearity analysis and mapping coverage analysis

Shi et al. (2022) developed and successfully used a Chinese Spring 2D-specific oligonucleotide probe library to paint chromosome 2D by fluorescence in situ hybridization (FISH). We performed FISH using this library to compare chromosome 2D in Yining Xiaomai and Chinese Spring. The signals were evenly distributed on the pair of Chinese Spring 2D chromosomes, whereas no painting signals were observed near the distal end of the 2D long arm of Yining Xiaomai (Fig. 4), revealing that this region has diverged from the corresponding region in Chinese Spring. Our genotyping results indicated that, in the QYm.nau-2D region from 596.0 to 614.8 Mb, the proportion of failed genotypes in the 106 Hap_a varieties harboring QYm.nau-2D (> 29.0%) was much higher than in the remaining Hap_b varieties without QYm.nau-2D (Supplementary Figure 4; Supplementary Table 1). Combining the findings from genotyping and chromosome painting, we propose that QYm.nau-2D might have originated from an alien introgression, which may explain the low recombination within the linkage block harboring QYm.nau-2D.

Fig. 4

Oligonucleotide FISH-based karyotype of Chinese Spring and Yining Xiaomai using a set of probes specific for chromosome 2D. The arrow indicates the QYm.nau-2D region, which resembles a secondary constriction owing to the alien introgressed segment on chromosome 2D. Bar = 10 μm

To investigate whether any variety with an available genome sequence carries the same alien introgression as the Hap_a varieties, the 63 Mb (560.0–623.0 Mb) 2D sequence of Chinese Spring was aligned with the pseudomolecule-level genome assemblies of Fielder and of the 10+ wheat pan-genome project (Walkowiak et al. 2020). Dot plot analysis highlighted fragments of low collinearity between Chinese Spring and five varieties (Fielder, Jagger, ArinaLrFor, Julius, and SY Mattis): two fragments in Fielder (corresponding to Chinese Spring 2D regions of 10 Mb, 569.0–579.0 Mb, and 23 Mb, 596.0–619.0 Mb) and one fragment in the other four varieties (corresponding to a Chinese Spring 2D region of 44 Mb, 569.0–613.0 Mb) (Fig. 5, Supplementary Figure 6). In contrast, the five varieties showed strong homology with one another in these regions (Fig. 5). Jagger was reported to be resistant to WYMV, and we found that Fielder is also WYMV resistant; both carry QYm.nau-2D (Kobayashi et al. 2020). We therefore propose that the other three varieties are also WYMV resistant and contain QYm.nau-2D. Moreover, according to the sequence alignment and haplotype analysis, Fielder represents Hap_a-VI and the other four varieties belong to Hap_a-II. All of these genome sequences could serve as references for screening and cloning QYm.nau-2D candidate genes.

Fig. 5

Collinearity analysis of the QYm.nau-2D linkage block among 12 sequenced genomes. The collinear regions correspond to 560.0–623.0 Mb on chromosome 2D of Chinese Spring. The diagram on the left shows the collinearity among the three haplotypes, which comprise the 12 sequenced genomes shown on the right. The putative introgression fragments are indicated in red (color figure online)

Genomes U, M, C, and N are closely related to genome D (Shi et al. 2022). To predict the putative origin of the alien introgression, we took advantage of available Illumina sequences of four Aegilops species and mapped the short reads of Ae. umbellulata (UU), Ae. markgrafii (CC), Ae. comosa (MM), and Ae. uniaristata (NN) to chromosome 2D of Fielder, which contains the introgression harboring QYm.nau-2D resistance. For all four species, the proportion covered by mapped reads was increased in the putative alien introgression fragments (shown in gray shadow), 578.0–590.0 Mb and 606.0–631.0 Mb on chromosome 2D of Fielder, with genome NN having the highest read depth, followed by genomes CC, MM, and UU (Supplementary Figure 7). This indicates high sequence similarity between the two fragments and the reads from the four Aegilops species, with genome N of Ae. uniaristata being the most similar. We propose that the introgression originally came from an Aegilops species, with Ae. uniaristata being the species most closely related to the donor. However, more evidence is needed to determine whether the N genome is the actual donor of this introgression.
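
The quantity compared above, the proportion of a region covered by mapped reads, can be illustrated with a short sketch. It assumes that read alignments are already available as half-open (start, end) intervals on the reference; the function name and the toy coordinates are hypothetical, and the study's own mapping and coverage tools are not specified here.

```python
# Illustrative sketch only: toy alignments on an arbitrary coordinate scale.

def covered_fraction(read_intervals, window_start, window_end):
    """Fraction of [window_start, window_end) covered by at least one read.

    read_intervals holds half-open (start, end) alignment coordinates.
    """
    # Keep only reads that overlap the window and clip them to its limits.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in read_intervals
        if start < window_end and end > window_start
    )
    covered = 0
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # A gap precedes this read: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent read: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (window_end - window_start)


reads = [(0, 50), (40, 90), (120, 150), (190, 260)]
frac_first = covered_fraction(reads, 0, 100)     # reads merge into 0-90
frac_second = covered_fraction(reads, 100, 200)  # 120-150 plus 190-200
print(frac_first, frac_second)
```

Computing this fraction window by window along the chromosome for each donor species gives profiles that can be compared in the way described for the four Aegilops genomes.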

Potential utilization of QYm.nau-2D in wheat breeding by development of diagnostic markers

We evaluated the genetic effects of the alien introgression on agronomic traits. A total of 15 agronomic traits were compared among the seven haplotypes (Hap_a-I to Hap_a-VI and Hap_b). No significant differences were detected among the haplotypes for any of the investigated traits in any environment (Fig. 6). This indicates that the alien introgression harboring QYm.nau-2D had no additional effects on the tested yield-related traits, implying that it can be used directly in breeding for WYMV resistance.

Fig. 6

Comparison of agronomic performance among the seven haplotypes for plant height, spike-related and kernel-related traits using the best linear unbiased prediction (BLUP) values across the 3-year trials. H1–H7 are Hap_a-I, Hap_a-II, Hap_a-III, Hap_a-IV, Hap_a-V, Hap_a-VI, and Hap_b, respectively. Abbreviations: SL, spike length; PH, plant height; ESN, effective spike number; KNS, number of kernels per spike; TKW, thousand kernel weight; TSSN, top sterile spikelet number per spike; BSSN, basal sterile spikelet number per spike; FSPN, fertile spikelet number per spike; SPN, total spikelet number per spike; SD, spikelet density; KAS, kernel area size; KPL, kernel perimeter length; KLWR, kernel length-to-width ratio; KL, kernel length; KW, kernel width

Because recombination is putatively reduced within the QYm.nau-2D linkage block, molecular markers developed in this region should be tightly linked to the WYMV resistance conferred by QYm.nau-2D. The 596.0–614.8 Mb sequence of Chinese Spring chromosome 2D was compared with its counterpart in Fielder, and eleven primer pairs were designed to distinguish polymorphic insertion-deletion fragments (Supplementary Table 2). All primer pairs were polymorphic between Hap_a and Hap_b: they produced PCR products of identical fragment size in the 105 wheat varieties harboring QYm.nau-2D (Hap_a, except for BARBELA GROSSO) and products of a different fragment size in Hap_b varieties (Fig. 7). All of them can serve as codominant diagnostic markers to trace the WYMV resistance conferred by QYm.nau-2D.
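
How a codominant InDel marker separates the two haplotype groups by fragment size can be sketched as below. The fragment sizes, the size tolerance, and the function name are hypothetical and chosen only for illustration; the actual amplicon sizes of the eleven markers are given in Supplementary Table 2 and are not reproduced here.

```python
# Illustrative sketch only: fragment sizes and tolerance are made-up values.

def score_marker(observed_sizes_bp, hap_a_size_bp, hap_b_size_bp, tolerance_bp=2):
    """Call a genotype from the PCR fragment sizes observed in one lane."""
    has_a = any(abs(size - hap_a_size_bp) <= tolerance_bp for size in observed_sizes_bp)
    has_b = any(abs(size - hap_b_size_bp) <= tolerance_bp for size in observed_sizes_bp)
    if has_a and has_b:
        return "heterozygous"   # both alleles amplified (codominant marker)
    if has_a:
        return "Hap_a"
    if has_b:
        return "Hap_b"
    return "no call"            # no band near either expected size


# Hypothetical expected sizes for one marker.
HAP_A_BP = 180
HAP_B_BP = 205

calls = [
    score_marker([181], HAP_A_BP, HAP_B_BP),
    score_marker([204], HAP_A_BP, HAP_B_BP),
    score_marker([180, 205], HAP_A_BP, HAP_B_BP),
    score_marker([], HAP_A_BP, HAP_B_BP),
]
print(calls)
```

Because both alleles give a visible product, such a marker can also flag heterozygous plants in segregating breeding populations, which a dominant presence/absence marker cannot do.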

Fig. 7

Development of PCR-based codominant InDel markers within QYm.nau-2D linkage block for distinguishing Hap_a (I-VI) from Hap_b. 1–4: Hap_a-I representative varieties: RIETI, NANAH-LEB, Dazibai, Zhenmai 3; 5–8: Hap_a-II representative varieties: Zhongzhimai 4, Xinmai 26, Jining 13, Xuzhou 25; 9–12: Hap_a-III representative varieties: Ningchun 4, Zhenmai 366, Chuanyu 20, Shuangji 4; 13–16: Hap_a-V representative varieties: Yannong 19, Huaimai 33, Xumai 32, Zhongyun 1; 17–20: Hap_a-VI representative varieties: Fielder, Ningzimai 1, Fengyou 3, Qingchun 5; 21–24: Hap_b representative varieties: Mavrogan, Hindi 62, Baihuomai, Yangmai 158

Discussion

Introgressions of alien chromatin from wild species into modern cultivated wheat are widespread and are generally acknowledged as one of the factors affecting the genomic diversity of bread wheat (Cui et al. 2015; Hegde and Waines 2004; Pour-Aboughadareh et al. 2021). These introgressions probably contributed favorable alleles associated with excellent agronomic performance in wheat. The identification of alien introgressions and their possible donors has been facilitated by the rapid growth in whole-genome sequences of Triticeae species and by advances in sequencing technology. In the wheat variety LongReach Lancer, for example, chromosome 3D contains a region of approximately 60 Mb from a Th. ponticum introgression, and chromosome 2B is made up largely (427 Mb) of T. timopheevii chromosome 2G (Walkowiak et al. 2020). Such introgressions contribute beneficial alleles for resistance to various wheat diseases (Bariana et al. 2001; Chemayek et al. 2017; Wan et al. 2020). Introgression haplo-blocks on chromosome 2D were identified through a comparative sequence analysis (Cheng et al. 2019) and a whole-genome resequencing study (Walkowiak et al. 2020). One of these introgression segments was related to the WYMV resistance locus QYm.nau-2D on 2DL by comparative analysis of several short sequences of the linked markers (Kobayashi et al. 2020).

In the present study, we provide further evidence that the WYMV resistance locus QYm.nau-2D lies on a putative alien introgression incorporated into the distal region of chromosome arm 2DL of common wheat. The resistant parent Yining Xiaomai was genotyped as Hap_a-I, which has the largest introgression segment, from 570.4 Mb to 619.2 Mb. Although a considerable number of F2 individuals were screened, no recombination was found within the 596.0–614.8 Mb region, indicating that this haplo-block has been maintained by a low recombination rate due to high divergence between the resistant and susceptible parents. In the core collection used for GWAS, however, the resistant haplo-block had been broken by recombination in several varieties. By employing these natural recombinants, especially the European landrace BARBELA GROSSO with its recombination site at 601.9 Mb, we narrowed QYm.nau-2D to the 601.9–613.1 Mb region. This is the most divergent region, with the highest density of failure genotypes in the Hap_a group harboring QYm.nau-2D (Supplementary Figure 4). Markers developed in this block are therefore likely to be tightly linked to QYm.nau-2D, and the 11 markers developed in our study should be well suited to diagnosing WYMV resistance in breeding (Fig. 7).

Assessment of a pseudomolecule of chromosome 2D of the variety ‘CH Campala Lr22a’ showed that introgressions from wild relatives contributed variation to wheat chromosome 2D (Thind et al. 2018). Kobayashi et al. (2020) identified four haplo-blocks associated with alien introgressions on 2D, and the 48 Mb haplo-block c coincided with the WYMV resistance gene/QTL on 2D (designated WYM-2D). Short sequences cloned from Yumechikara within the WYM-2D region shared more than 99% identity with haplo-block c of the ‘CH Campala Lr22a’ 2D pseudomolecule (Kobayashi et al. 2020). These results suggest that WYM-2D, which contains haplo-block c, might be an alien introgression, although this inference was based on marker analysis of only five varieties. In our research, using 106 varieties harboring QYm.nau-2D, we identified six haplo-blocks with alien introgression fragments of different sizes (Supplementary Figure 5). Haplotype analysis indicated that all six haplotypes harboring QYm.nau-2D might be closely related, although they contain introgression segments of different sizes owing to a few recombination events in this region.

Although the varieties in the Hap_a-I group have the largest introgression (570.4 Mb to 619.2 Mb), they may not represent the original form. A Portuguese landrace (C10) was reported to possess an introgression from ~ 570 Mb to the end of chromosome 2D (Cheng et al. 2019), which encompasses the alien segments in all of the above genotypes. It may represent the original introgression fragment, derived from a terminal translocation between wheat and the wild relative donor (Fig. 8; Supplementary Figure 5). Given the intensive hybridization breeding that used this European landrace (designated Hap_a-o), the intact introgression fragment was then probably broken by recombination in other varieties. In our GWAS collection, the introgression segment was far more frequent in varieties than in landraces (95:11), supporting active selection during the breeding process.

Fig. 8

Proposed diagram for the utilization of QYm.nau-2D in worldwide breeding programs

Based on the geographical distribution of the haplotypes, we propose that the resistance gene has been exploited extensively and independently in various countries (Fig. 8). In addition to the 106 varieties in this study, four sequenced varieties (ArinaLrFor, etc.), five varieties (Yumechikara, etc.) reported by Kobayashi et al. (2020), and Fielder were also included. Breeding programs appear to have started from Hap_a-I, which has both the largest introgressed segment and the greatest number of wheat varieties (59/116). The introduction of Hap_a-I into China is probably related to Rieti, a crossing parent of Nanda2419 (Rieti/Wilhelmina//Akagomughi), a cultivar widely used for wheat production and breeding in China in the 1950s (Jia et al. 2013). Breeding programs in China generated a new haplotype, Hap_a-V, probably in the Huang-Huai-Hai winter wheat-growing region, and subsequently Hap_a-VI (Mianyang 15, etc.) in the Southwest winter wheat-growing region (Supplementary Figure 8). In Europe, Hap_a-I was used to generate two haplotypes, Hap_a-II (Ibis, etc.) and Hap_a-IV (BARBELA GROSSO) (Fig. 8). Hap_a-II has been introduced into different countries, such as Ukraine (Odeskaya 51), the USA (Madsen), and Japan (Yumechikara), as well as northern China (Zhongmai 175, etc.) (Fig. 8). Local breeding efforts using Hap_a-II in China generated a new haplotype, Hap_a-III (Ningchun 4, etc.) (Fig. 8). The largest proportion (70/98, 71.4%) of the WYMV-resistant cultivars came from the Huang-Huai-Hai and North winter wheat-growing regions, followed by those (20/98, 20.4%) from the Middle and Lower Reaches of the Yangtze River and Southwest wheat-growing regions in China. The high frequency of WYMV-resistant varieties in these regions, where winter temperatures favor WYMV pathogenesis, may reflect selection pressure from WYMV epidemics.

While introgressions from CWRs can bring beneficial traits into elite wheat, they may also have deleterious effects on desirable traits, including yield (Summers and Brown 2013). We comprehensively analyzed the genomic and phenotypic impacts of the alien introgression in cultivated wheat varieties and found that the haplotypes (except for the single Hap_a-IV variety) displayed no significant differences for any of the 15 investigated traits. This indicates that the introgression has no obvious negative effects on agronomic performance. Moreover, positive effects on grain yield have been reported for this region. Haplotype effect analysis of local genomic estimated breeding values (GEBVs) showed that one LD block, 2D_b003483, which spanned SNPs from 570,951,341 to 608,198,626 bp on chromosome 2D, exhibited the highest observed local GEBV variance for grain yield under optimum conditions, with estimated haplotype effects ranging from ~ 40 to > 30 kg ha−1 (Voss-Fels et al. 2019). The block was found to be under strong selection for the number of grains per m² and total plant biomass. Given its physical location, this block probably corresponds to the QYm.nau-2D linkage block. In our study, however, these positive effects were not detected, possibly because the agronomic trait data were collected from fields without obvious WYMV infection.

To conclude, this study provides a case study of an alien introgression fragment on chromosome 2D that has been selected for its role in enhancing WYMV resistance. We foresee that further fine mapping of QYm.nau-2D will be impeded by suppressed recombination between the alien segment and the corresponding wheat chromatin. To overcome this recombination suppression, we have introduced a ph1b-based strategy to produce introgressions with reduced segment sizes for the fine mapping and cloning of QYm.nau-2D.