Introduction

Closely related species are useful resources in crop breeding, and efficient use of crop relatives by breeders is important for addressing recent and future climate change (Fita et al. 2015; Prohens et al. 2017). For the identification of agronomically important genes in crop relatives and the confirmation of introduced segments of the related species’ chromosomes, genome-wide polymorphism information can be effective not only within the related species but also between the crop and its relatives (Rasheed et al. 2018). However, reference genome sequence information is not available for most crop relatives. Although cultivated wheat species have many wild relative species in Triticum and Aegilops, reference genome sequences have been published for only three wild relatives (Avni et al. 2017; Luo et al. 2017; Ling et al. 2018). However, in recent studies, next-generation sequencing data enabled in silico analysis of the wild relative species’ genomes based on the chromosomal synteny among wheat homoeologous genomes (Nishijima et al. 2016; Okada et al. 2018).

Einkorn wheat, Triticum monococcum L. (AmAm genome), includes two subspecies, with ssp. monococcum corresponding to the cultivated form and ssp. aegilopoides (Link) Thell. (syn. T. boeoticum Boiss) corresponding to the wild form. Wild einkorn wheat is widely distributed in countries of the eastern Mediterranean, and domestication from ssp. aegilopoides to ssp. monococcum in T. monococcum is inferred to have occurred once, in the Karacadag region of southeastern Turkey (Heun et al. 1997). Triticum urartu Tumanian ex Gandilyan (AA genome), which is closely related to einkorn wheat, is the A-genome donor of tetraploid wheat, namely T. turgidum L. (AABB genome) and T. timopheevii Zhuk (AAGG genome), and common wheat, T. aestivum L. (AABBDD genome) (Dvorak et al. 1988; Takumi et al. 1993). The distribution of T. urartu is restricted almost exclusively to the “Fertile Crescent” (Wang et al. 2017). Reproductive barriers, including abnormal hybrid seed formation and hybrid necrosis, underlie the separation of T. monococcum ssp. aegilopoides and T. urartu (Johnson and Dhaliwal 1976; Gill and Waines 1978; Fricano et al. 2014; Takamatsu et al. 2015). Interspecific hybrids between T. monococcum and T. urartu are almost always sterile, and many chromosomal rearrangements appear to exist between the A and Am genomes (Dubcovsky et al. 1996; Fricano et al. 2014). Postzygotic reproductive isolation could result, at least in part, from nuclear genome differentiation between the two species, which was previously reported at the nuclear and organellar DNA levels (Castagna et al. 1994; Mizumoto et al. 2002; Brandolini et al. 2006).

Triticum urartu is evolutionarily important as the A-genome donor of polyploid wheat species, and the genome sequence of T. urartu has already been published (Ling et al. 2013, 2018). The T. urartu population harbours a large amount of variation in agronomically important traits such as disease resistance and grain quality (Qiu et al. 2005; Guzmán and Alvarez 2012). The recent accumulation of genome information has included surveys of genetic diversity and variation in T. urartu (Luo et al. 2015; Wang et al. 2017; Brunazzi et al. 2018). Cultivated einkorn wheat (ssp. monococcum) is an important resource for improving grain quality, resistance to various diseases, and resistance to abiotic stress in durum and common wheat (Vasu et al. 2001; Tranquilli et al. 2002; James et al. 2006). Domestication of einkorn wheat was achieved by early farmers in the Fertile Crescent, implying that cultivated einkorn wheat may retain a large amount of diversity and useful alleles for various traits (Jing et al. 2007). In recent years, cultivated einkorn wheat has served as a valuable model for wheat genetics and has been useful in the screening of mutant strains to identify mutant alleles (Yan et al. 2004; Murai et al. 2013; Gardiner et al. 2014). Diverse accessions of cultivated einkorn wheat have also been useful for identifying agronomically important genes based on genome-wide association with molecular markers (Jing et al. 2007).

Similarly, wild einkorn wheat (ssp. aegilopoides) has been a useful resource for improving disease resistance and grain quality in common wheat (Rogers et al. 1997; Shi et al. 1998; Anker and Niks 2001; Hovhannisyan et al. 2011). However, in contrast to T. urartu, little genomic information based on next-generation sequencing (NGS) techniques is available for wild einkorn wheat. Whole-genome sequencing and exome sequencing approaches are inconvenient for multiple samples of wild wheat species because of the large genome size and the resulting sequencing costs. RNA-seq is an effective approach for avoiding these problems. The RNA-seq approach enables us to find single-nucleotide polymorphisms (SNPs) and insertions/deletions (indels) covering entire chromosomal regions in various diploid wheat relatives (Iehisa et al. 2012, 2014; Nishijima et al. 2016; Wu et al. 2018; Okada et al. 2018; Miki et al. 2019). To date, RNA-seq data from one accession of wild einkorn wheat have been compared with those from an einkorn wheat cultivar; a large number of polymorphisms were obtained in this comparative study (Fox et al. 2014). Some linkage maps have been constructed, mainly in two mapping populations, permitting comparisons of cultivated and wild accessions of einkorn wheat to find useful genes (Bullrich et al. 2002; Shindo et al. 2002; Hori et al. 2007; Jing et al. 2009; Yu et al. 2017). However, to the best of our knowledge, no genetic map has been constructed for any population derived from intra-subspecies crosses of wild einkorn wheat. For future use of genetic variation in wild einkorn wheat in breeding, genome-wide molecular markers need to be developed to permit efficient detection of DNA polymorphisms in specific chromosomal regions.

Recently, high-quality reference genome sequences have been reported for the A genomes of diploid, tetraploid, and hexaploid wheat (Avni et al. 2017; Ling et al. 2018; International Wheat Genome Sequencing Consortium (IWGSC) 2018). The pseudomolecules of the A genome sequences can be utilized as references for virtual anchoring of the NGS reads and polymorphisms to each chromosome of the Am genome. The DNA polymorphisms from the NGS data showed a genome-wide distribution with high resolution, even between intra-lineage accessions of the wild diploid D-genome species Aegilops tauschii Coss (Nishijima et al. 2016). Thus, the RNA-seq approach has potential for the discovery of genome-wide SNPs and indels, permitting anchoring of these markers to each of the Am-genome chromosomes in wild einkorn wheat. The objectives of the present study are (1) to detect genome-wide polymorphisms distributed across the entire chromosomes of wild einkorn wheat (ssp. aegilopoides), (2) to use the polymorphism data to develop a large number of molecular markers distinguishing the Am-genome chromosomes from A-genome chromosomes, and (3) to confirm the use of polymorphism data in the construction of linkage maps for wild einkorn wheat. To achieve these objectives, we performed RNA-seq analysis of 15 accessions of two diploid wheat species, namely T. monococcum and T. urartu. We provide two examples in which the polymorphism data are converted to PCR-based markers in wild einkorn wheat.

Materials and methods

Plant materials

Fifteen accessions of diploid wheat were used in this study, including three accessions of T. urartu, two of T. monococcum ssp. monococcum, and ten of T. monococcum ssp. aegilopoides. With the exception of DV92 and PI427634, diploid wheat seeds were obtained from the wheat genetic resource centre of the National BioResource Project-Wheat (Japan, https://shigen.nig.ac.jp/wheat/komugi/top/top.jsp) (Table 1). DV92 is an accession widely used as a parental line for population mapping and mutant panel construction (Bullrich et al. 2002; Murai et al. 2013).

Table 1 List of the diploid wheat accessions used in this study

An F2 mapping population was generated from a cross between two wild einkorn wheat accessions: KU-3620 and KU-8276. Seeds of the F2 population, with a population size of 103, were sown in November 2016 and the F2 individuals as well as the two parental accessions were grown individually in randomly arranged pots during the 2016–2017 season in an experimental field at Kobe University (34°43′N, 135°13′E). The heading and flowering times of the F2 individuals were recorded as days after sowing.

A synthetic hexaploid line was used to check the utility of the Am-genome-specific markers developed in this study. For synthetic hexaploid production, a tetraploid wheat cultivar, Langdon (Ldn), was crossed with pollen of the wild einkorn wheat accession KU-3620, and one of the resulting F1 plants was treated with 1 g L−1 colchicine and 2% dimethyl sulfoxide solution for 5 h to obtain selfed seeds. After confirmation of the somatic chromosome number (42) using root tips, a synthetic hexaploid with an AABBAmAm genome was established and designated Ldn/KU-3620.

RNA sequencing

Total RNA was extracted using Sepasol-RNA I Super G (Nacalai Tesque, Kyoto, Japan) from the leaves of 10-day-old plants grown under conditions of 16-h light/8-h dark and 24 °C. Paired-end libraries for RNA-seq were constructed from 6 to 10 µg of total RNA using a TruSeq RNA Library Preparation kit v2 (Illumina, San Diego, CA, USA) according to the manufacturer’s procedure (Sato et al. 2016); the resulting libraries were then sequenced by 300-bp paired-end reads on an Illumina MiSeq sequencer. Five libraries per run were used for sequencing, and approximately 28 million reads were obtained. The sequenced reads were deposited in the DDBJ Sequence Read Archive under accession number DRA007574.

The quality of sequencing reads was evaluated using FASTQC software (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Trimmomatic software, version 0.33 (Bolger et al. 2014), was used to remove adapter sequences, low-quality bases with an average quality score per 4 bp of < 30, and reads of fewer than 50 bp. The filtered reads were aligned to the reference A genome sequence of T. aestivum cv. Chinese Spring (CS) version 1 (International Wheat Genome Sequencing Consortium (IWGSC) 2018) using HISAT2 software version 2.1.0 (Kim et al. 2015). The RNA-seq reads of 10 accessions of Ae. tauschii (Nishijima et al. 2016), 12 accessions of Ae. umbellulata (Okada et al. 2018), and 1 accession of Ae. speltoides Tausch (Miki et al. submitted) were also used for phylogenetic tree construction (Table 1). These RNA-seq reads were obtained from the DDBJ Sequence Read Archive as follows: DRA004604 for Ae. tauschii, DRA006404 for Ae. umbellulata, and DRA007097 for Ae. speltoides.
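The read-filtering criteria described above (a 4-bp window with a mean quality below 30, and a minimum retained length of 50 bp) can be illustrated with a small sketch. This is a simplified illustration of a sliding-window quality filter, not a reimplementation of Trimmomatic, whose behaviour may differ in detail; the function names are hypothetical.

```python
def sliding_window_trim(quals, window=4, min_avg=30.0):
    """Return the number of 5' bases kept after sliding-window trimming.

    The read is cut at the start of the first window whose mean Phred
    quality is below min_avg. Reads shorter than one window are kept whole
    (a simplifying assumption of this sketch).
    """
    for start in range(0, len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            return start
    return len(quals)


def passes_filter(quals, window=4, min_avg=30.0, min_len=50):
    """True if the trimmed read is at least min_len bases long."""
    return sliding_window_trim(quals, window, min_avg) >= min_len
```

For example, a read with 60 bases of quality 35 followed by 10 bases of quality 10 is cut where the first window mean drops below 30 and is still long enough to be retained.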

Phylogenetic analysis

SNPs and indels were called using SAMtools (Li et al. 2009) and Coval (Kosugi et al. 2013) with the same criteria as those described in Nishijima et al. (2016); the depth of read coverage was ≥ 10, and > 95% of the mapped reads included nucleotide sequences that differed from the reference sequence of the A genome. To obtain a high-confidence set of SNPs for construction of phylogenetic trees, we chose SNPs at positions for which the read depth was ≥ 10 and at which no ambiguous nucleotides were detected in any of the tested accessions. CIRCOS (Krzywinski et al. 2009) and R statistical software were used to visualize the distribution of SNPs/indels on the physical map of the A genome. Neighbour-joining (NJ) and maximum likelihood (ML) phylogenetic trees were constructed using Molecular Evolutionary Genetics Analysis (MEGA) software, version 7.0 (Kumar et al. 2016).
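The calling thresholds given above can be expressed as a short sketch. It is only an illustration of the stated criteria (read depth of at least 10, more than 95% of reads differing from the reference, and no ambiguous bases in any accession for the high-confidence set) and does not reproduce the SAMtools/Coval pipeline itself. Treating the 95% criterion as support for a single alternative allele is an assumption of this sketch, and the function names and data layout are hypothetical.

```python
from collections import Counter

MIN_DEPTH = 10        # minimum read depth stated in the text
MIN_ALT_FRAC = 0.95   # more than 95% of reads must differ from the reference


def call_variant(ref_base, read_bases):
    """Return the alternative base if the site passes both thresholds, else None."""
    depth = len(read_bases)
    if depth < MIN_DEPTH:
        return None
    alt_counts = Counter(b for b in read_bases if b != ref_base)
    if not alt_counts:
        return None
    alt_base, alt_n = alt_counts.most_common(1)[0]
    return alt_base if alt_n / depth > MIN_ALT_FRAC else None


def is_high_confidence(site_calls):
    """site_calls maps accession -> (base, depth) at one reference position.

    A site is kept only if every accession has sufficient depth and an
    unambiguous base call.
    """
    valid = {"A", "C", "G", "T"}
    return all(depth >= MIN_DEPTH and base in valid
               for base, depth in site_calls.values())
```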

Marker development, map construction, and QTL analysis

De novo transcriptome assembly of the 12 accessions of einkorn wheat and the 3 accessions of T. urartu was performed using Trinity (Grabherr et al. 2011). If multiple isoforms were detected, the first isoform designated by Trinity was selected as a representative transcript. RNA-seq short reads of T. monococcum ssp. aegilopoides KU-8276, one of the parental accessions of the F2 mapping population, were aligned to the representative transcripts of another parental accession, T. monococcum ssp. aegilopoides KU-3620, using Bowtie2 (Langmead and Salzberg 2012). SNP calling was conducted with the same pipeline described above. SNPs and indels were called when the depth of read coverage was ≥ 10, and > 95% of the aligned reads included nucleotide sequences that differed from sequences of the assembled transcript. The representative transcripts of KU-3620 were anchored to the chromosomes of the A genome of CS (International Wheat Genome Sequencing Consortium (IWGSC) 2018) using GMAP (Wu and Watanabe 2005). According to the location of the anchored transcripts, the SNPs between KU-8276 and KU-3620 were placed on the chromosomes of the A genome.
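Placing a transcript-based SNP on a chromosome amounts to a coordinate conversion through the transcript-to-genome alignment. The sketch below illustrates that conversion under simplifying assumptions: the alignment is represented as ungapped exon blocks with 1-based inclusive coordinates, and the function name and block layout are hypothetical rather than the output format of GMAP.

```python
def transcript_to_genome(pos, blocks, strand="+"):
    """Convert a 1-based transcript position to a chromosome position.

    blocks is a list of (t_start, t_end, g_start, g_end) tuples, one per
    aligned exon, with 1-based inclusive coordinates and g_start <= g_end.
    On the minus strand, increasing transcript positions map to decreasing
    chromosome positions within a block. Returns None if the position falls
    outside every aligned block.
    """
    for t_start, t_end, g_start, g_end in blocks:
        if t_start <= pos <= t_end:
            offset = pos - t_start
            return g_start + offset if strand == "+" else g_end - offset
    return None
```

With two blocks (1, 100, 1000, 1099) and (101, 150, 2000, 2049) on the plus strand, transcript position 120 falls in the second block and maps to chromosome position 2019.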

Based on the SNP and indel information, primer sets were designed using Primer3Plus software (Untergasser et al. 2007). Some of the identified SNPs were converted to cleaved amplified polymorphic sequence (CAPS) markers (Supplementary Table S1). Information for four simple-sequence-repeat (SSR) markers and their respective annealing temperatures was obtained from the GrainGenes website (https://wheat.pw.usda.gov/GG2/index.shtml). PCR amplification, digestion of the PCR products, and visualization of the products were performed as described in our previous study (Sakaguchi et al. 2016).
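A SNP can be converted to a CAPS marker when it creates or abolishes a restriction site in the amplified fragment. The following sketch shows one way such candidates could be screened. The sequences in the usage note are invented for illustration, and the function names are hypothetical; only the EcoRI recognition sequence (GAATTC) is a known fact.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def cut_sites(seq, site):
    """Start indices (0-based) where the recognition site occurs on either strand."""
    seq, site = seq.upper(), site.upper()
    targets = {site, reverse_complement(site)}
    k = len(site)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in targets]


def is_caps_candidate(allele_a, allele_b, site):
    """True if the two alleles differ in where the enzyme recognises the amplicon."""
    return cut_sites(allele_a, site) != cut_sites(allele_b, site)
```

For instance, with the invented amplicons ACGTGAATTCTTAGC and ACGTGAGTTCTTAGC, only the first carries a GAATTC site, so digestion would be expected to distinguish the two alleles.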

Genetic mapping using the genotyping data was performed using the MAPMAKER/EXP version 3.0 package; the logarithm-of-odds (LOD) score threshold was set to 3.0 (Lander et al. 1987). Quantitative trait locus (QTL) analysis of heading and flowering times was conducted by single-marker analysis and composite interval mapping with Windows QTL Cartographer version 2.5 software and the backward regression method (http://statgen.ncsu.edu/qtlcart/WQTLCart.htm). The LOD score threshold for QTL analysis was determined by a 1000-permutation test. The statistical significance of the QTL effect on the examined traits was estimated by Tukey–Kramer’s HSD test.
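The logic of a permutation-based LOD threshold can be sketched as follows. This is a minimal illustration using a single-marker model in which each genotype class has its own mean. It is not the composite interval mapping procedure implemented in Windows QTL Cartographer, and the function names are hypothetical.

```python
import math
import random


def single_marker_lod(genotypes, phenotypes):
    """LOD score comparing a genotype-class-means model with a single-mean model.

    LOD = (n / 2) * log10(RSS0 / RSS1), where RSS0 is the residual sum of
    squares around the overall mean and RSS1 around the genotype-class means.
    """
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    rss0 = sum((y - mean_y) ** 2 for y in phenotypes)
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)
    rss1 = sum((y - sum(ys) / len(ys)) ** 2 for ys in groups.values() for y in ys)
    if rss1 <= 0:
        return float("inf") if rss0 > 0 else 0.0
    return (n / 2) * math.log10(rss0 / rss1)


def permutation_threshold(marker_genotypes, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical LOD threshold: the (1 - alpha) quantile of the maximum LOD
    over all markers when phenotypes are shuffled relative to genotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(single_marker_lod(g, shuffled) for g in marker_genotypes))
    maxima.sort()
    idx = min(len(maxima) - 1, math.ceil((1 - alpha) * len(maxima)) - 1)
    return maxima[idx]
```

Here marker_genotypes is a list of per-marker genotype lists (one genotype code per F2 individual), and a QTL would be declared where the observed LOD exceeds the returned threshold.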

Results

Single-nucleotide polymorphisms in the diploid wheat species

To estimate genome-wide DNA polymorphisms in einkorn wheat and T. urartu, 300-bp paired-end RNA sequencing of ten accessions of T. monococcum ssp. aegilopoides, two accessions of T. monococcum ssp. monococcum, and three accessions of T. urartu was performed, generating 220–410 million filtered paired-end reads for each accession (Table 2, Supplementary Table S2). The filtered reads were aligned to the reference sequence of the A genome of CS (International Wheat Genome Sequencing Consortium (IWGSC) 2018). Of the filtered reads, 86.07–95.00% for T. monococcum ssp. aegilopoides, 84.44–88.46% for T. urartu, and 91.43–97.01% for T. monococcum ssp. monococcum were aligned to the reference sequence.

Table 2 Summary of RNA sequencing information for the 15 accessions of diploid wheat

By conducting pairwise comparisons of all the tested accessions of the diploid wheat species and the A genome of CS, 21,057–109,314 SNPs and 315–1853 indels for T. monococcum ssp. aegilopoides, 45,101–59,378 SNPs and 758–1109 indels for T. monococcum ssp. monococcum, and 33,286–53,419 SNPs and 644–1286 indels for T. urartu were detected (Table 3). These SNPs and indels covered all the chromosomes of the A genome of CS (Fig. 1).

Table 3 Identification of SNPs and indels between the diploid wheat accessions and the A genome of CS
Fig. 1

Distribution of SNPs (a) and indels (b) between the A genome of CS and the examined accessions of einkorn wheat and T. urartu across the seven chromosomes of the A genome. Red, blue, and purple circles indicate the ten T. monococcum ssp. aegilopoides accessions, three T. urartu accessions, and two T. monococcum ssp. monococcum accessions, respectively (color figure online)

Phylogenetic relationships among the diploid wheat species

To clarify the phylogenetic relationships among T. monococcum ssp. aegilopoides, T. monococcum ssp. monococcum, T. urartu, and the other wild diploid wheat species, NJ and ML trees were constructed from the 15 accessions sequenced here together with 10 accessions of Ae. tauschii and 12 accessions of Ae. umbellulata; these trees were based on the high-confidence set of SNPs and included Ae. speltoides as an outgroup species (Fig. 2). The einkorn wheat species and T. urartu were separated from the other diploid wheat species, consistent with the findings of previous reports (Mizumoto et al. 2002; Fricano et al. 2014). The comparison based on the branch lengths of the trees indicated that T. monococcum ssp. aegilopoides had the highest genetic diversity among the einkorn wheat species and T. urartu. The genetic diversity in T. monococcum ssp. aegilopoides was higher than that in Ae. umbellulata but was lower than that in Ae. tauschii. T. monococcum ssp. aegilopoides diverged into three groups. The existence of three groups in wild einkorn wheat was consistent with a previous observation (Pour-Aboughadareh et al. 2017). T. monococcum ssp. monococcum belonged to one of these groups, which included T. monococcum ssp. aegilopoides accessions distributed throughout southeastern Turkey and Iran. The other two groups were geographically separated; members of one group were limited to areas surrounding Ankara, Turkey, while members of the other group were widely distributed throughout southeastern Turkey, Iran, and Iraq.

Fig. 2

Phylogenetic relationships among the diploid Triticum/Aegilops accessions. The neighbour-joining tree (a) and maximum likelihood tree (b) were constructed based on RNA-seq-derived SNPs. Bootstrap probabilities are shown on the branches (number of bootstrap replicates = 1000). A scale bar for genetic distance is shown on the left side of each phylogenetic tree

Although the genetic distance between the Am- and A-genome species was smaller than the genetic distance between the U- and D-genome species, the A-genome species were clearly separated from the Am-genome species (Fig. 2). The three accessions of the A-genome species T. urartu were genetically distinguished with 100% bootstrap probability and geographically separated. All of the tested accessions exhibited divergence from the A genome of CS. The genetic distance between T. urartu and the A genome of CS was larger than that between the wild and cultivated einkorn wheat species.

Development of genetic markers that discriminate between the sub-genomes

To develop genetic markers that distinguish between species or between the Am and A genomes, fixed nucleotide differences (SNPs) that distinguished species or the sub-genomes were evaluated. T. monococcum ssp. aegilopoides, T. monococcum ssp. monococcum, and T. urartu were distinguished from the A genome of CS by 11,903, 14,021, and 5693 fixed nucleotide differences, respectively (Table 4). Of these fixed nucleotide differences, 25, 187, and 1892 were uniquely observed in T. monococcum ssp. aegilopoides, T. monococcum ssp. monococcum, and T. urartu, respectively. A total of 8309 fixed nucleotide differences were observed between the Am genome of T. monococcum ssp. and the A genome of T. urartu; these differences were widely distributed over all of the chromosomes (Fig. 3a). A total of 228 fixed nucleotide differences were observed between T. monococcum ssp. aegilopoides and T. monococcum ssp. monococcum; these differences were frequently located in the distal chromosomal regions (Fig. 3b). The fixed nucleotide differences between T. urartu and the A genome of CS were also widely distributed over all the chromosomes (Fig. 3c). The right (long-arm) end of chromosome 4 lacked SNPs in the comparisons between the two subspecies of T. monococcum and between T. urartu and the A genome of CS.
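The notion of a fixed nucleotide difference used here can be made concrete with a short sketch: a site counts as fixed between two groups when every accession within each group carries the same base and the two groups carry different bases. The data layout, function name, and accession labels below are hypothetical and serve only as an illustration.

```python
def fixed_differences(genotypes, group_a, group_b):
    """Return positions at which group_a and group_b are fixed for different bases.

    genotypes maps position -> {accession: base}. A site is skipped if any
    accession in either group lacks an unambiguous call.
    """
    valid = {"A", "C", "G", "T"}
    fixed = []
    for pos in sorted(genotypes):
        calls = genotypes[pos]
        alleles_a = {calls.get(acc) for acc in group_a}
        alleles_b = {calls.get(acc) for acc in group_b}
        if (len(alleles_a) == 1 and len(alleles_b) == 1
                and alleles_a <= valid and alleles_b <= valid
                and alleles_a != alleles_b):
            fixed.append(pos)
    return fixed
```

A site where one group is uniformly G and the other uniformly A is reported, whereas a site that is polymorphic within either group, or missing a call in any accession, is not.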

Table 4 Summary of fixed nucleotide differences between species
Fig. 3

Chromosomal distribution of fixed nucleotide differences a between the A and Am genomes, b between T. monococcum ssp. aegilopoides and T. monococcum ssp. monococcum, and c between T. urartu and the A genome of CS

Based on the fixed nucleotide differences between the Am and A genomes, seven Am-chromosome-specific CAPS markers were designed, one for each chromosome (Supplementary Table S1). To test the utility of these CAPS markers, the markers were applied to Ldn, CS, T. urartu, T. monococcum ssp. aegilopoides KU-3620, and four individuals of a synthetic hexaploid line (Ldn/KU-3620) with AABBAmAm genomes (Fig. 4). These markers successfully discriminated between the Am and A genomes for each chromosome. The four individuals of the Ldn/KU-3620 line retained all seven chromosomes derived from the Am genome.

Fig. 4

Am-genome-specific CAPS markers derived from the RNA-seq data. Each PCR product was digested with the indicated restriction enzyme. Lanes 1–4: Langdon, Chinese Spring, T. urartu KU-199-16, and T. monococcum ssp. aegilopoides KU-3620, respectively; lanes 5–8: four individuals of the synthetic hexaploid line Ldn/KU-3620

Application of molecular markers for genetic analysis of wild einkorn wheat

In addition to SNPs that were detected based on the alignment of RNA-seq reads to the reference sequence of the A genome of CS (International Wheat Genome Sequencing Consortium (IWGSC) 2018), we also detected SNPs between the F2 parents, T. monococcum ssp. aegilopoides KU-3620 and KU-8276 based on de novo assembled transcripts, as described below. De novo transcriptome assembly of the tested diploid accessions generated 25,422–75,573 transcripts; the N50 values of these assemblies ranged from 914 to 1325 bp. Alignment of the RNA-seq data of short reads of KU-8276 to the representative transcripts of KU-3620 revealed 18,360 SNPs, 8925 of which were shared with those detected based on the A genome of CS (Fig. 5). Thus, the approach using the de novo transcriptome was able to detect a larger number of SNPs between accessions of T. monococcum ssp. aegilopoides. The identified SNPs and indels anchored to chromosomal positions spanning the entire A genome of CS (Table 5).
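For readers unfamiliar with the N50 statistic reported for the assemblies, the sketch below shows the usual definition: the length of the shortest transcript in the smallest set of longest transcripts that together contain at least half of the assembled bases. The function name is hypothetical, and assembly tools may report N50 with slightly different conventions.

```python
def n50(lengths):
    """N50 of a collection of sequence lengths (0 for an empty collection)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0
```

For lengths of 2, 2, 2, 3, 3, 4, 8, and 8, the total is 32 and the two longest sequences already cover half of it, so the N50 is 8.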

Fig. 5

Venn diagram of the SNPs detected between two accessions (KU-3620 and KU-8276) of T. monococcum ssp. aegilopoides. The SNPs were called based on the alignment of RNA-seq reads of KU-8276 to the non-redundant transcripts constructed with de novo assembly of the RNA-seq reads of KU-3620. The SNPs between these two accessions were also detected based on the alignment of reads to the A-genome chromosomes of CS

Table 5 SNPs and indels between two wild einkorn wheat accessions, namely KU-3620 and KU-8276, that were detected based on assembly of the KU-3620 transcripts

Genetic markers for the F2 mapping population were developed based on the SNPs and indels obtained by integrating both approaches. To convert the identified SNPs and indels to PCR-based genetic markers, ten CAPS and two indel markers were designed on chromosome 7Am and applied to the F2 mapping population (Supplementary Table S1). The fragments digested by the restriction enzymes clearly distinguished between the F2 parental accessions of T. monococcum ssp. aegilopoides KU-3620 and KU-8276, and the resulting polymorphisms were used to genotype the F2 individuals. Based on the genotyping data, a linkage map was constructed for chromosome 7Am (Fig. 6a). The chromosome 7Am map included 4 SSR markers (cfa2028, wmc405, barc174, and gwm573) and the 12 markers developed in the present study. To confirm the chromosomal positions of the 12 novel markers, they were anchored to chromosome 7A of CS (Supplementary Table S1). Chromosomal synteny was well conserved in the mapping regions shared between chromosomes 7A and 7Am that were defined in the present work.

Fig. 6

Genetic map construction and QTL analysis of the F2 population of wild einkorn wheat. a A linkage map of chromosome 7Am and QTL likelihood curves of LOD scores showing the locations of QTLs for heading and flowering times. Genetic distances are shown in centimorgans to the left of chromosome 7Am. b, c Frequency distribution of b heading time and c flowering time in the KU-3620/KU-8276 mapping population. d, e Comparison of d heading time and e flowering time among putative genotypes at the 7Am QTL. The KU-3620/KU-8276 individuals were classified based on genotypes of markers between Xcfa2028 and Xa24482. The statistical significance of the QTL effect on the examined traits was estimated by a two-tailed one-way analysis of variance with a post hoc Tukey–Kramer HSD test. Means with the same letters do not differ significantly (p > 0.05)

To easily confirm the usefulness of the constructed map, we conducted QTL analysis of heading and flowering times in the F2 mapping population. The F2 individuals in the KU-3620/KU-8276 population were grown under field conditions. The heading and flowering times of the F2 individuals varied, and transgressive segregation was observed (Fig. 6b, c). Through single-marker analysis using SSR markers from each chromosomal arm of the A genome, SSR markers on the short arm of chromosome 7Am were found to be associated with the phenotypic variation in heading and flowering times in the mapping population. Using the 7Am linkage map constructed in the present study, QTL analysis of these traits was performed by composite interval mapping. Significant QTLs for heading and flowering times were found with LOD scores of 5.17 and 2.69, respectively, within the same region on chromosome 7Am (Fig. 6a). The 7Am QTLs mapped between Xcfa2028 and Xwmc405, and the peaks of QTL likelihood curves of the LOD scores were located in the Xcfa2028–Xa24482 region. The 7Am QTLs explained 23.0% and 11.1% of the heading and flowering time variation in the KU-3620/KU-8276 population, respectively. The genotypic effects of the 7Am QTLs on heading and flowering times were examined under field conditions using F2 individuals selected from the mapping population. The F2 individuals carrying homozygous KU-3620 alleles in the 7Am QTL region (Xcfa2028–Xa24482) showed significantly earlier heading and flowering times than those with homozygous KU-8276 alleles (Fig. 6d, e). The heading and flowering times of the heterozygous F2 individuals were intermediate to those of the homozygous individuals.

Discussion

RNA sequencing is among the technologies used to efficiently detect SNPs and indels in the large genome of wheat species, a genome that is composed primarily (~ 85%) of repeat sequences such as transposons (International Wheat Genome Sequencing Consortium (IWGSC) 2018), facilitating the design of genetic markers covering the whole genome. In the present study, the RNA-seq approach facilitated an efficient search for genome-wide polymorphisms in the diploid wheat species with an A or Am genome; a number of SNPs and indels were detected by comparison with the A genome of CS (Table 3), and the positions of these polymorphisms were confirmed to be distributed throughout the genome (Fig. 1). Similar results have been reported in the wheat D-genome-containing progenitor Ae. tauschii (Iehisa et al. 2012, 2014; Nishijima et al. 2016), the U-genome-containing diploid species Ae. umbellulata (Okada et al. 2018), and section Sitopsis of the genus Aegilops (Miki et al. 2019). The positions of the genome-wide polymorphisms detected by RNA-seq were efficiently determined based on chromosomal synteny conserved among the genomes of wheat relatives (Mayer et al. 2011). Of course, structural rearrangements accumulated on the chromosomes of the wild diploid species during genome differentiation (Wicker et al. 2003; Molnár et al. 2016; Danilova et al. 2017), and inter-chromosomal translocations occurred after allotetraploid wheat speciation (Devos et al. 1995; Dvorak et al. 2018). Such rearrangements sometimes disturb chromosomal synteny; thus, the predicted chromosomal positions of the RNA-seq-derived polymorphisms are not necessarily precise. To solve this problem, the construction of genetic maps using molecular markers based on RNA-seq-derived polymorphism information in the target species is still important. Nonetheless, this putative positional information is expected to be useful for further genetic studies because local synteny is well conserved in each restricted chromosomal region among closely related species (Lu and Faris 2006).

The RNA-seq approach does have a limitation: it detects only the SNPs present in the exons of expressed genes. Polymorphism information cannot be collected from the exons of unexpressed genes, from the introns and promoter regions of the expressed genes, or from intergenic chromosomal regions. Nonetheless, RNA-seq identified many genome-specific SNPs distributed across all the chromosomes in the Am genome, permitting the successful development of genome-specific CAPS markers that distinguished the Am and A genomes. These markers were useful for confirming the alien addition of the Am-genome chromosomes to tetraploid wheat (AABB genome) in synthetic hexaploid lines with AABBAmAm genomes (Fig. 4). Using a similar strategy, U-genome-specific markers were successfully developed to distinguish the U-genome copy from the A- and B-genome-derived copies in our recent study (Okada et al. 2018). U-genome-specific markers are available for validation of interspecific hybridization in crosses between tetraploid wheat and Ae. umbellulata (Okada et al. 2018). Phylogenetic trees constructed in the present study showed that the A and Am genomes are evolutionarily more closely related than the A and U genomes (Fig. 2). The RNA-seq approach allowed the efficient detection of genome-wide polymorphisms between closely related genomes, permitting the development of genome-specific markers. This result also indicated that RNA-seq analysis is appropriate for elucidation of the evolutionary relationships among homoeologous genomes based on genome-wide exon sequences.

Similarly, RNA-seq-derived polymorphisms are an efficient source of information for the development of genome-wide genetic markers; these markers are expected to facilitate linkage map construction, gene mapping, and identification of QTLs for agronomic traits in mapping populations derived from intra-subspecies crosses of wild einkorn wheat. The CAPS markers obtained via the RNA-seq-derived polymorphisms permitted the construction of a genetic map, and the constructed map facilitated the identification of QTLs for agronomic traits in wild einkorn wheat (Fig. 6). Heading and flowering time QTLs have been detected on the short arms of the homoeologous group 7 chromosomes in common wheat (Kuchel et al. 2006; Lin et al. 2008), einkorn wheat (Yu et al. 2017), and Ae. tauschii (Koyama et al. 2017). The heading time QTL reported previously in einkorn wheat is located near Xbarc174 on chromosome 7Am (Yu et al. 2017), whereas the 7Am QTL found in the present study is distal relative to that reported by Yu et al. (2017). The 7Am QTLs were detected here in an F2 population of wild einkorn wheat; thus, the significance of the QTLs for heading and flowering time should be validated using their progeny in further studies. Therefore, although genome information, including the transcriptome, is lacking for many species of wheat relatives, RNA-seq data facilitate genetic analyses of target traits in these wheat relatives.

Our study showed that the cultivated species T. monococcum ssp. monococcum belonged to one of the three groups of T. monococcum ssp. aegilopoides (Fig. 2), suggesting that T. monococcum ssp. monococcum originated recently from a limited subpopulation of T. monococcum ssp. aegilopoides. The recent divergence between T. monococcum ssp. aegilopoides and ssp. monococcum can make it difficult to identify genetic markers that distinguish between these two subspecies. However, using the RNA-seq approach, the present work identified 228 SNPs that differentiate these subspecies. Interestingly, these SNPs were patchily distributed over the distal chromosomal regions (Fig. 3b). This result suggests that in the process of domestication, nucleotide substitutions were not distributed across the entirety of the chromosomes, although only two accessions of cultivated einkorn wheat were examined. Specific chromosomal regions may have contributed to the domestication of T. monococcum ssp. monococcum. The association of certain chromosomal regions with species or subspecies differentiation was also observed between two Aegilops species, Ae. longissima and Ae. sharonensis, both of which belong to section Sitopsis and contain the Sl genome (Miki et al. 2019). In the present study, the number of diploid wheat accessions examined was limited. Previous studies using a large number of diploid wheat accessions showed that two species, namely T. urartu and T. monococcum, accumulated abundant genetic diversity (Heun et al. 1997; Jing et al. 2007). The genome-wide polymorphism data obtained in the present study could be utilized to further evaluate the genetic diversity in the two diploid wheat species and determine domestication-related chromosomal regions in einkorn wheat. These observations indicate that RNA-seq-derived genome-wide polymorphisms can contribute to genetic studies on the evolutionary differentiation of closely related species and subspecies.

Our results indicate that the RNA-seq approach is quite useful for the discovery of genome-wide polymorphisms in wild wheat relatives. The resulting polymorphism data are available for subsequent genetic studies such as phylogenetic analysis, genetic map construction, and target gene mapping in wheat relatives for which sufficient genomic information is lacking. Wheat relatives are believed to share homoeologous chromosomes derived from a predicted common ancestor (Mayer et al. 2011). Therefore, the RNA-seq approach should contribute to genetic analyses and introgression of agriculturally important phenotypes from wild wheat relatives. The wheat relatives carrying genomes other than A, B, and D are useful for wheat breeding, serving as a tertiary gene pool. In wheat breeding processes, target genes can be transferred from chromosomes of wheat relatives to the A-, B- or D-genome chromosomes of common wheat by homoeologous recombination (Qi et al. 2007). In contrast, gene transfer from the tertiary gene pool using homologous recombination is not expected. Therefore, cytogenetic techniques to enhance homoeologous recombination, such as the use of ph (pairing homoeologous) gene mutants, will be helpful in accelerating gene transfer from the Am-genome chromosome to the A-genome chromosome (Dubcovsky et al. 1995). Genome-specific molecular markers are expected to enable confirmation of the introgression of target chromosomal segments from wheat relatives. Moreover, markers that are spread over all of the chromosomes could help determine the precise positions of the chromosomal regions transferred from wild relatives. Indeed, RNA-seq-derived PCR markers have been used to validate alien gene transfer from rye to common wheat (Wu et al. 2018). Thus, the genome-wide polymorphism information obtained from RNA-seq is expected to enlarge the genetic variation in the tertiary gene pool available for future wheat breeding.