INTRODUCTION

The genus Avena L. comprises several cultivated species and a few segetal species (field weeds), as well as wild species, which are of interest as potential sources of valuable traits for breeding. One of the species, A. sativa L. (common oat), is one of the most important cereals, occupying about 10 million hectares of global agricultural land. The common oat provides one of the best concentrated feeds for livestock, and green feed and hay from oat, which are usually used in a mixture with legumes, are of great importance in animal husbandry. Owing to an optimum ratio of nutrients, the groats and flour obtained from oat grains, as well as food products based on them, have high nutritional, dietary, and functional value.

The range of wild and especially of weedy oat species covers the entire grain belt of the world. The wide adaptation of wild species to adverse environments, their adaptability to different soil types and climatic conditions, and their resistance to pathogenic organisms, as well as some traits associated with increased productivity and quality, make them a unique source of starting material for breeding [1–5].

THE SYSTEM OF THE GENUS AVENA IN TAXONOMIC SENSE AND THE ORIGIN OF CULTIVATED SPECIES

The modern system of the genus Avena L. is based on the treatment of C. Linnaeus [6, 7]. In the first edition of Species Plantarum, within the genus Avena, Linnaeus described two species of true oats, A. sativa L. and A. fatua L., and eight species that are now treated as representatives of other genera (Achnatherum, Haeupleria, Trisetaria, and others). Interestingly, Linnaeus considered the presence (A. fatua) or absence (A. sativa) of lemma pubescence as a taxonomically valuable trait for the separation of A. sativa and A. fatua, while such a bright distinguishing trait as the rachilla, easily disarticulating at the joints at the base of the lemma in A. fatua, was not considered important. At the same time, this trait is not only bright but also practically important. Specifically, non-shattering seeds and the preservation of spike integrity until threshing made A. sativa a cultivated species and important agricultural crop [8]. However, in full accordance with the law of homologous series in variation [9], plants with the “fatua-like spike” are found among A. sativa; this trait is apparently controlled by one gene and the “fatuoid” allele is recessive [10].

The form of the cultivated oat A. sativa L. with white glumes is the type-species of the genus. However, Linnaeus described A. sativa by its forms with black glumes (he considered oats with white glumes to be a variant of the latter and designated it as “β”) [6]. Thus, it was unjustified to treat the form with white glumes as a type specimen of the genus Avena until Baum, from authentic material, chose the lectotype Baum, 1974: 579: Herb. Clifford: 25, stored in the British Museum, thereby legitimizing the type A. sativa L. (a form with white glumes) [11].

Considering that modern ideas on the species composition of the genus Avena have changed considerably since the time of Linnaeus, the number of species, subspecies, and varieties ever assigned to the genus Avena (there are at least 436 [12] and only for the flora of Russia, their number exceeds 100 [13, 133, 134]) is impossible to review within the framework of one article. Because of this, in the following, only the species whose assignment to the genus Avena does not raise doubts among the leading modern specialists [1, 2, 12, 13] will be discussed.

The discrepancy in the number of species accepted by different authors in the genus Avena, which ranges from 12–13 [14] to 27 [10], is explained, on the one hand, by the fact that some oat species are indistinguishable at the level of macromorphology (plant “image”), and the differences between them lie in the field of micromorphology [2, 10, 13]. On the other hand, taxonomists disagree about the relative weight (taxonomic value) of particular traits and, first of all, of such a trait as reproductive isolation.

Numerical taxonomy was one of the anti-subjectivity approaches in determining species boundaries that was used in the study of intrageneric diversity in the genus Avena. In this case, the total sample is analyzed at the largest possible number of traits of equal weight or also numerically, but taking into account the weight of the “significant traits” selected by the expert. Using these techniques, Baum [10] studied variation of 29 morphological traits using several thousand herbarium accessions from the main herbaria of the world and in the seed progeny of 5000 oat accessions from different natural populations and elite cultivars. As a result, it was proposed to divide the whole diversity of oats into 26 species plus two variants of cultivated oats A. sativa, type-species and Festuca-like [10].

On the other hand, from a genetic point of view, among the taxonomically significant traits, a special place should be occupied by genetic isolation (reproductive isolation) of a species [14–18]. If this trait is considered as the species criterion, as Ladizinsky does [14, 18], then the number of currently known species of the genus Avena is reduced to 12 (excluding A. macrostachya, which, according to Ladizinsky, should be assigned to the genus Helictotrichon). The arguments of Ladizinsky are as follows. According to Vavilov’s law of homologous series in variation, morphological characters, which are considered species-specific by the representatives of the classical school of taxonomy, are found with varying frequency in different oat species, and therefore, the morphological criterion of the species cannot reveal the real genetic and taxonomic diversity of the genus. Moreover, since the diploid cultivated species of oats, A. brevis, A. hispanica, A. nuda, and A. strigosa, as well as the wild species A. hirtula, A. wiestii, and A. atlantica, can be crossed experimentally and give viable and fertile progeny, there is no reason to divide them into separate species. The other two diploid wild species, A. eriantha and A. clauda, differ in that, in A. clauda, the rachilla is fragile at the base of each flower, while in A. eriantha, it is fragile only at its base. However, this trait seems to be under monogenic control. In experimental conditions, these species easily cross and produce fertile progeny. According to Ladizinsky, it is reasonable to consider them as one species, A. clauda. All hexaploid oat species, A. sativa, A. fatua, A. sterilis, and A. byzantina, easily cross with each other. Moreover, A. fatua, apparently, is not found in the wild, does not actually have its own specific range, and is known only as a weedy plant and often as an introgressive hybrid with A. sativa. According to Ladizinsky, all these plants are merely forms of the species A. sativa [14, 18].
On the contrary, the diploid species A. damascena and A. prostrata are morphologically indistinguishable or difficult to distinguish. However, their crossing yields no progeny, indicating that these are two good cryptic species [14].

The ideas of Ladizinsky were not supported by the majority of taxonomists and geneticists [2–5, 13, 19–22]. In addition to psychological reasons (traditional system of the genus versus the new one), the basis for rejection of the proposal to reduce the number of species in the genus to the number of reproductively isolated groups is the fact that among taxonomist-florists there is a widespread tendency to accept the species as a “monotypic” one, the intraspecific variability of which concerns only insignificant traits [23, 24]. From the point of view of traditional botanical taxonomy, grouping of “morphological” species into “biological” ones, i.e., the transformations similar to those suggested for the genus Avena by Ladizinsky, leads to the fact that the former “species,” reduced in status to the level of subspecies, varieties, and forms, drop out of the system of investigation, scoring, and conservation of biological diversity [17, 24].

However, there are other, more fundamental objections to radical reduction of the number of species in the genus Avena based on the results of crossing experiments. As noted by Baum [10], the recognition of a species as a “biological” species, genetically (reproductively) isolated from the others, is not disproved by the fact that these species can produce viable and productive progeny under experimental conditions. In the wild, Avena species can be completely or almost completely isolated from each other owing to different ranges (geographically), differences in phenology (different flowering time), or a tendency to self-pollination.

Table 1 shows the main data on the distribution, karyotypes, and genomes of 27 species of the genus Avena L. It can be suggested that, over time, as new data on the genetic distances between natural races of wild and weedy species of the genus Avena become available, new species and twin species (sibling and cryptic species) will be identified among the already known species. It is also possible that some of the as yet unaccepted species, which are currently treated as synonyms of already known species [see 10, 12, 13], will prove to be genetically isolated natural populations. There is every reason to expect this. To date, the ecological and geographical populations of the Avena species from the Mediterranean floristic region, especially the Mediterranean islands and the Maghreb countries, and the populations of Central Asia, India, Iran, Iraq, and Turkey have scarcely been studied. The few expeditions undertaken to study the genetic diversity of natural oat populations have led, with amazing regularity, to the discovery of new species and ecological geographical races [1, 2, 14, 25–29]. From the point of view of revealing and preserving the genetic diversity of oats, the search for, collection, and analysis of the group of ecological geographical races and species of weedy oats, largely lost in the conditions of intensive farming, is a special task.

Table 1. Species of the genus Avena and genomic composition of their karyotypes

There are reasonable grounds to suggest that some of the geographically isolated populations that do not always have a specific morphological trait syndrome (again, let us remember Vavilov’s law of homologous series in variation) can turn out, on the phylogenetic tree, to be either twin species or closely related but reproductively isolated sibling species, or species very distant from each other [16, 17]. As concerns Russia, it is of interest to study natural populations of A. barbata var. caspica Hausskn., which differ from the type accessions of A. barbata in the lemma morphology [2, 13]. A group of Avena species related to A. sativa (A. aggr. sativa), in particular, as yet unaccepted species such as the Volga region endemic A. volgensis (Vavilov) Nevski, European-Central Asian A. macrantha (Hack.) Nevski, European-Caucasian-Siberian A. georgica Zuccagni, and Euro-Siberian-Far Eastern A. orientalis Schreb., requires special investigation, as do the forms included in A. aggr. fatua, such as A. septentrionalis Malzew, A. cultiformis (Malzew) Malzew, and A. aemulans Nevski. All the latter are specialized spelt and oat weeds, which nowadays, under intensive farming conditions, are quite rare [13].

A genetics-based system of the genus can be constructed only if the genomic composition of diploid and polyploid species of the genus Avena is correctly determined. The pioneering study of the genomic constitution of species of the genus Avena was carried out by Nishiyama [30], who studied chromosome conjugation in interspecific hybrids of diploids and polyploids and designated the haploid genome of A. strigosa as A, the tetraploid genome of A. barbata as AB', and the genome of A. fatua as ABC, with B' and B being different genomes. Because of this, at the suggestion of Rajhathy and Morrison [31], the genomic constitution of the tetraploid was designated as AB, and that of the hexaploid was designated as ACD.

Comparative analysis of the mitotic chromosome morphology in Avena showed that diploid oat species could have genomes of A or C type. Moreover, it was proposed to distinguish between two variants of the C genomes, Cv and Cp, and five variants of the A genomes (As, Ap, Al, Ad, Ac) (Table 1). The average haploid genome size in diploid species with the A genomes is about 4.4 Gb; the haploid genome size in diploids with the C genomes is 5.0 Gb. In tetraploids, haploid genomes are more diverse in size, with AB < AC (DC) < CC (mean 1C = 8.2, 9.1, and 10.6 Gb, respectively). The haploid genome size in different hexaploid species is approximately the same, 12.6 Gb [19]. It is easily seen that, in polyploid Avena species with the AC genomes, the total genome size is smaller than one would expect (AC < C + A). There is a reduction in the Cx value (genome downsizing) [32], a phenomenon common for polyploids. The nature of this phenomenon lies in the gradual loss by polyploids of a part of the duplicated genes, first of all those whose products do not function in multiprotein complexes [16, 33, 34], and possibly in the elimination of a part of the dispersed repeats of one of the allopolyploid subgenomes.
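The inequality AC < C + A can be checked directly from the mean values quoted above. The following is a minimal sketch that uses only the figures given in the text; the function name and the way the deficit is expressed are illustrative assumptions, not part of the cited studies.

```python
# Mean 1C genome sizes in Gb, as quoted in the text.
DIPLOID_1C_GB = {"A": 4.4, "C": 5.0}
OBSERVED_AC_1C_GB = 9.1  # tetraploids with the AC (DC) genomes


def downsizing(observed_gb, parental_gb):
    """Return (expected, deficit, deficit_percent) for an allopolyploid.

    The expected size is simply the sum of the parental diploid genome sizes;
    the deficit is how far the observed size falls below that sum.
    """
    expected = sum(parental_gb)
    deficit = expected - observed_gb
    return expected, deficit, 100.0 * deficit / expected


expected, deficit, percent = downsizing(
    OBSERVED_AC_1C_GB, [DIPLOID_1C_GB["A"], DIPLOID_1C_GB["C"]]
)
print(f"expected {expected:.1f} Gb, observed {OBSERVED_AC_1C_GB:.1f} Gb, "
      f"deficit {deficit:.1f} Gb ({percent:.1f}%)")
```

With these mean values, the AC tetraploids fall about 0.3 Gb short of the sum of the diploid A and C genomes. The same function could be applied to other genome combinations once the size of each parental genome is known.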

The variants of A genomes differ in the number of acrocentric chromosomes. The Ap, Al, and Ad genomes are species-specific, while the As genome is found in a number of diploid species (A. atlantica, A. strigosa, and A. wiestii). These species with the As genome have different geographic ranges, but in experimental conditions, they cross with each other, and their progeny are fertile [14, 18, 35]. Comparison of several thousand SNPs in the As genomes of these three species shows only small variations, pointing to the fact that these are genetically close ecological geographical races [36].

The trait that distinguishes between the Cp and Cv genomes is the number of acrocentric chromosomes. In the Cp genomes, there are two acrocentric chromosomes with large and small satellites, while in the Cv genome, there is one pair of chromosomes with small satellites. Chromosomes of the A and C genomes differ in that, in the genomes and subgenomes of the A type, chromosomes are more biarmed than in the genomes and subgenomes of the C type [37–40], as well as in completely different C-banding patterns [38–41]. Chromosomes of the C genomes of diploids and the C subgenomes of polyploids not only carry larger heterochromatin C blocks but also in general are darker in color upon Giemsa–Romanowsky and Wright staining after C banding. Jellen [41] and Badaeva [38] name this type of staining “diffuse heterochromatin.” This phenomenon deserves discussion. Recall that facultative heterochromatin (transcriptionally inert euchromatin, inactivated X chromosome of female mammals, and gene-depleted dark G blocks of chromosome arms) is not stained by the C-banding technique [42, 43]. Dark-colored C blocks are the chromosome regions enriched in tandem repeats and proteins characteristic of constitutive heterochromatin (HP1, H3K9me3, H4K20me3, etc.) [43, 44]. Dark staining of chromosome arms in diploid Avena species carrying the C genomes and in the C subgenomes of polyploids indicates that these genomes contain the chromatin regions of special composition, not typical of the chromosome arms. These regions are distributed along the entire length of the chromosome arms and are resistant to depurination and beta elimination of DNA after C banding [45]. The proportion of this chromatin in euchromatin of the chromosome arms in diploid oat species with the A genomes and their derivatives in polyploids is small or absent.
The differences are most likely associated with the expansion in the C genomes and C subgenomes of relatively extended and/or numerous clusters of tandem repeats associated with HP1 and other proteins characteristic of tandem repeats, rather than with retrotransposons. Microsatellites characteristic of the A-subgenome arms (for example, [46]) are either not numerous enough in the A subgenomes to influence C banding or are associated with other proteins that do not produce dark staining of chromosome arms during C banding. A direct indication that different tandem repeat families prevail among the repeats distributed over the chromosome arms in the A and C genomes of Avena species is the fact that whole-genome sequencing of A. atlantica (A genome) showed that, in this species, the most abundant microsatellites dispersed over the arms were (AT)n and (AAC)n, while in A. eriantha carrying the C genome, these were (GGC)n and (TTTA)n microsatellites [36]. In addition, whole-genome sequencing of the A and C genomes showed that, in the chromosomes of A. eriantha, clusters of the 665-bp subtelomeric satellite were also found in the inner regions of the chromosome arms [36].

Diploid species with different genomes do not cross; hybrids between different variants of the A genome can be obtained; however, they are completely or almost completely sterile [35].

Tetraploid species of the genus Avena are diverse in their genomic constitution. The tetraploid species A. macrostachya is characterized by the CmCmCmCm karyotype [38, 47]; the karyotype of A. barbata, A. vaviloviana, A. abyssinica, and A. agadiriana is AABB [37, 39]; and A. maroccana, A. murphyi, and A. insularis are characterized by the DDCC karyotype [38, 46]. All hexaploid species have the AACCDD karyotype [37, 40].

According to different estimates, the A and C genomes of modern diploid species of the genus Avena diverged 5–13 [36] or 19–21 million years ago [20]. The 2- to 4-fold differences in chronology are caused by the ambiguity in determining the divergence time of the main branches of cereals. For example, Inda et al. [48] think that the Avena and Triticum lineages diverged about 15 million years ago [88], Fu [20] suggests that this event took place 25 million years ago, and Wang et al. [49] date this event to between 25 and 51 million years ago. During the divergence period, the A and C genomes accumulated enough differences to be clearly distinguishable in experiments on genomic in situ hybridization (GISH) [50–52]. At the same time, this method fails to differentiate the A and B subgenomes of tetraploid species [50, 53], as well as the A and D subgenomes of hexaploid species of the genus Avena [50, 51].

The question arises of how important this difference is in the pairwise comparison of the A/C, A/B, and A/D genomes. Unfortunately, no universal scale for assessing the resolving power of the GISH method exists so far. At the same time, in allopolyploids and the hybrids resulting from crossing of closely related genera, chromosomes of different origin are usually distinguishable with the help of GISH (review [54]). A rare exception to this rule is the chromosomes of two closely related cereal genera, Leymus Hochst and Psathyrostachys Nevski, which demonstrate similar staining patterns with GISH [55].

The use of FISH and GISH approaches revealed an interesting feature of the genome evolution of polyploid Avena species. In the karyotype/genome of Avena, translocations are encountered rather often, some of which are species-specific, and others of which are found only in certain natural oat populations, lines, and cultivars. Translocations can occur both within one subgenome and between chromosomes of different subgenomes [22, 36, 38, 46, 56, 57]. Moreover, in A. sativa, translocations more often occurred between the chromosomes of the C and D subgenomes than between the C or D subgenome and the A subgenome [22, 58, 59]. Comparison of the linkage groups of A. atlantica and A. eriantha with the genetic maps of Hordeum vulgare L. showed that the A genome of oats (A. atlantica) preserved extended syntenic groups with barley, while the genome of C type (A. eriantha) underwent multiple translocations [36]. Although they do not differ from the A subgenomes in the GISH pattern, the B subgenomes differ from them in a large number of chromosomal rearrangements, which indirectly indicates that tetraploids with the AB genomes are not auto-, but allopolyploids [22].

The nuclear genomes of diploid oat species A. atlantica (As genome) and A. eriantha (Cp genome) were sequenced [36]. The annotated sequences cover 3.69 and 3.78 Gb, respectively (for comparison with the genome sizes determined cytophotometrically, see Table 1; it is suggested that the regions enriched in repeats remained unsequenced and unannotated [36]). The mean G+C content in the sequenced regions of the genome was 44.4% (A. atlantica) and 43.9% (A. eriantha), which was consistent with the data obtained earlier by sequencing the genomes of other cereals (Sorghum bicolor (L.) Moench, 43.9%; Oryza sativa L., 43.6% G+C), but considerably higher than the G+C content in dicotyledons (Carica papaya L., 34%; Arabidopsis thaliana (L.) Heynh., 36%) [60]. Approximately 83% of the A and C genomes of diploid species is represented by transposons, among which the most numerous are LTR retrotransposons, which is typical of plant genomes. In particular, more than 60% of the oat genome is represented by Gypsy- and Copia-like transposons (in a ratio of 2.3 : 1 and 3.5 : 1 for A. atlantica and A. eriantha, respectively). The next most abundant element in the Avena genomes (representing 5% of the genome) is the CMC-EnSpm DNA transposon [36]. Interestingly, 10–14% of repeats in oats with the A and C genomes were not known previously. They are probably specific to the genomes of the genus Avena [36].

High-copy tandem repeat (159)n is located in the centromeric genome regions of A. atlantica and A. eriantha [36]. Satellite sequences with similar repeat unit length of 156 bp were found in the centromeres of Brachypodium distachyon (L.) P. Beauv. and Zea mays L.; the centromeric repeat unit length in Oryza brachyantha A. Chev. & Roehrich is 154 bp [61]. However, the centromeric repeat of Avena differs in sequence from the repeats in the genomes of other genera [36].

As in other cereals, the number of protein-coding genes in the oat genomes is much higher than in the human genome; there are at least 51 100 such genes in A. atlantica and 49 100 genes in A. eriantha, with an average transcript length of 3 kb [36]. The Avena genes are relatively GC-enriched (on average, ~52% in both species, with the genome average of ~44% G+C). The tendency toward GC enrichment of a considerable part of genes because of the high G+C frequency in the third codon position is a property characteristic of grasses, which distinguishes them from dicotyledons [62, 63], as well as of warm-blooded vertebrates, in contrast to invertebrates, amphibians, and fish [64, 65].
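The distinction between the genome-wide G+C content and the G+C content at the third codon position (often abbreviated GC3) can be illustrated with a short calculation. The sketch below is an illustrative assumption of how such values are typically computed; the function names are hypothetical and the example sequence is an invented toy reading frame, not an oat gene.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    return sum(base in "GC" for base in counted) / len(counted)


def gc3_fraction(cds):
    """G+C fraction at the third codon positions of an in-frame coding sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Positions 3, 6, 9, ... of the reading frame (0-based indices 2, 5, 8, ...).
    return gc_fraction(cds[2::3])


toy_cds = "ATGGCCAAGCTGTTCGACTGA"  # hypothetical example, not an Avena sequence
print(f"GC  = {gc_fraction(toy_cds):.3f}")
print(f"GC3 = {gc3_fraction(toy_cds):.3f}")
```

In this toy reading frame, the third-position value (6 of 7 codons) is well above the overall value (11 of 21 bases), which is the kind of pattern the text describes for a considerable part of grass genes.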

Only 2.2–2.3% of protein-coding genes in the studied genomes of diploid Avena species are duplicated [36], which is surprisingly low. Perhaps, these data require verification, since this is the lowest index value among all plants with sequenced genomes. For comparison, in the genome of barley Hordeum vulgare, there are at least 16% of duplicated genes, and in the genome of rice Oryza sativa, at least 49% [66], and on average, 64.5% of genes in plant genomes are duplicated [67].

Bekele et al. [68] recently published a genetic map of A. sativa with high density of polymorphic markers for all 21 linkage groups. Comparison of this map with sequencing data of the A and C genomes of diploid oat species showed that the frequency of crossing over in pericentromeric chromosome regions of Avena was strongly suppressed [36]. Mapping with molecular markers was performed on the basis of the population of recombinant inbred lines (RIL) obtained from crossing of different oat cultivars [22, 36, 69].

The construction of marker-based maps facilitates mapping of genetic loci that are important for breeding. Mapping of the disease resistance genes is of particular interest for oat breeding. These are the genes conferring resistance to barley yellow dwarf virus (BYDV) [70], crown rust [71], powdery mildew [72], and Fusarium head blight [73]. Numerous studies were carried out on mapping of genes controlling grain quality traits, such as the content of protein, oil, phenolic alkaloids, and β-glucans in oat kernels [74–78].

Candidate genes responsible for such critical traits of oats as resistance to Fusarium infection and the accumulation of deoxynivalenol (DON) mycotoxin in grain are mapped using QTL technology. QTLs are most often identified by linkage mapping using experimental F2 families, backcross, and advanced inbred or doubled haploid families [74]. An alternative approach to QTL detection is the genome-wide association study (GWAS) [78–81]. It was demonstrated that the main QTLs that control resistance to mycotoxin accumulation are located on chromosomes 17A/7C, 5C, 9D, 13A, and 14D [81, 82]. In particular, mycotoxin accumulation is most closely associated with the avgbs_6K_95238.1 locus, the product of which belongs to the class of zinc-finger proteins and is a lipase-like protein or a lipase precursor [81]. QTLs were identified at which an increase in the growing season duration and the plant height was accompanied by a decrease in mycotoxin accumulation in oat kernels [83]. Analysis of a large array of oat accessions using the Diversity Array Technology (DArT) markers showed that three independent SNPs were statistically significantly associated with the oat kernel beta-glucan content [78].

QTLs associated with lodging, an important agronomic trait in oats, were identified using GWAS. Experiments were performed using 126 spring and winter oat cultivars (both modern bred and landraces) collected in 27 European countries. It was demonstrated that QTLs associated with plant height were concentrated in the Mrg01, Mrg08, Mrg09, Mrg11, and Mrg13 linkage groups [58, 80]. It is noteworthy that QTLs associated with plant height and mapped to the Mrg01 and Mrg13 linkage groups colocalized with QTLs for heading date, while the QTL “responsible for plant height” on Mrg11 colocalized with the gene for cold tolerance [75, 76, 79, 80].

A trait that is directly associated with productivity and depends on a large number of loci is the number of days of plant development to anthesis. The study of SNPs in the genomes of 682 lines of European oat cultivars showed that the loci associated with this trait were concentrated in the Mrg02 and Mrg24, as well as the Mrg12, Mrg13, and Mrg33 linkage groups [58, 84, 85]. These data make it possible to purposefully select the initial material for breeding already at the early stages of plant development and in early hybrid generations using the marker-assisted selection (MAS) technique [85].

MAS technology, in principle, can be used not only with nuclear genome markers but also with organelle genome markers. Until now, the mitochondrial genome has been described only for the cultivated oat species A. sativa [86]. The mitochondrial genome of common oat is circular, 596 kb in size, and contains six direct repeats of 1–7 kb in size and two inverted repeats of 12 and 3 kb. There are reasonable grounds to believe that the mitochondrial genome can be present in the cell in several isomeric forms and in several circular molecules of different sizes. Fourteen genes of the Avena mitochondrial genome are protein-coding, three genes code for rRNA (rrn26, rrn18, and rrn5), and 18 genes code for tRNA. Two of these genes, coxI and rrn26, are represented in the oat mitochondrial genome in two copies [86]. The duplication of the coxI gene in A. sativa is noteworthy because the duplication of this gene in brown mustard (Brassica juncea (L.) Czern.) was associated with cytoplasmic male sterility [87].

The study of mtDNA polymorphism using restriction endonucleases showed that mitochondria of hexaploid and tetraploid oat species originated from species with the AA genomes. NGS sequencing of the mt genomes of 25 Avena species revealed 1243 parsimony-informative SNPs [20]; however, a detailed annotated description of the mt genomes of wild species has not yet been reported.

Chloroplast genomes of 25 species of the genus Avena were recently sequenced and annotated by Fu et al. [20, 21]. Comparative analysis of the obtained sequences showed that the chloroplast genome size in Avena varied in the range from 135 557 to 136 006 bp (Table 1). Each genome contained 130 genes and 4–6 pseudogenes. Among them, 84 genes encoded proteins, eight genes encoded rRNA, and 38 genes encoded tRNA. Thirteen protein-coding and eight tRNA genes had introns. The degree of similarity of the cp genomes of different Avena species upon pairwise comparison of nucleotide sequences varied from 98.38 to 99.996% with the average of 99.5%. Variations were mainly associated with the intergenic regions. Comparison of the genomes of 25 species revealed 1313 positions at which SNPs were identified; 583 SNPs (44.4%) were found in genes, 714 (54.4%) were found in intergenic regions, and 15 SNPs (1.2%) were found in pseudogenes. The number of microsatellite clusters (SSRs) per genome varied from 256 (A. clauda and A. eriantha) to 280 (A. atlantica), with the average of 276.8. The most common SSRs were A8-18, C8-14, (AT)5-7, and (AG)5. The authors of the cited study [21] correctly supposed that the results of the study of the chloroplast genome polymorphism could be used to develop DNA barcodes applicable for DNA certification of cultivar diversity and verification of conclusions on the taxonomic assignment of accessions to wild Avena species.
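The SSR notation used above (for example, A8-18 or (AT)5-7) can be read as a motif followed by its range of repeat counts. A minimal sketch of how such repeats might be located in a sequence is given below. It assumes that the lower bounds in that notation (8 for mononucleotide runs, 5 for dinucleotide repeats) serve as detection thresholds; the cited study may have used different criteria or dedicated software, and the pattern table and function name here are hypothetical.

```python
import re

# Thresholds are read off the notation in the text (A8-18, C8-14, (AT)5-7, (AG)5);
# treating them as minimum repeat counts is an assumption of this sketch.
SSR_PATTERNS = {
    "A": re.compile(r"A{8,}"),
    "C": re.compile(r"C{8,}"),
    "AT": re.compile(r"(?:AT){5,}"),
    "AG": re.compile(r"(?:AG){5,}"),
}


def find_ssrs(seq):
    """Return a sorted list of (start, motif, repeat_count) for each SSR found."""
    seq = seq.upper()
    hits = []
    for motif, pattern in SSR_PATTERNS.items():
        for match in pattern.finditer(seq):
            hits.append((match.start(), motif, len(match.group()) // len(motif)))
    return sorted(hits)


# Invented toy sequence containing one A run, one (AT)n and one (AG)n repeat.
toy = "GGT" + "A" * 10 + "CGC" + "AT" * 6 + "GTC" + "AG" * 5 + "T"
for start, motif, count in find_ssrs(toy):
    print(f"({motif}){count} at position {start}")
```

Only the forward strand and the four motifs named in the text are searched here; a fuller survey would also cover complementary motifs (for example, T and G runs) and longer repeat units.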

THE ORIGIN OF SUBGENOMES OF POLYPLOID AVENA SPECIES, RESULTS OBTAINED USING NEW-GENERATION SEQUENCING

All terrestrial plant species and species of the genus Avena, in particular, have undergone several rounds of whole-genome duplications and interspecific hybridizations [16, 34, 89]. Backcrossing, intergenomic translocations, the loss of a part of the genes of one of the parents, transposon expansions, and gradual secondary diploidization of karyotypes [16, 34] have given the nuclear genomes of modern plants a complex, mosaic structure, in which genes of several ancestors are present in different proportions. For this reason, mitochondrial and chloroplast genomes seem to be more convenient for reconstructing the origin of plant genera and species. It was demonstrated that, in most plants, including cereals, chloroplast and mitochondrial DNA were characterized by predominantly maternal inheritance, although exceptions to this rule are known [90, 91]. In most plants, including cereals, paternal chloroplast and mitochondrial genomes are lost during microsporogenesis or during fertilization. During pollen development, organelle DNA signals are found in microspores, but at later stages and in mature pollen, they are no longer cytophotometrically detectable. Degradation of chloroplast and mitochondrial DNA in microsporogenesis occurs through the destruction of chloroplast and mitochondrial nucleoids [90, 91]. In cereals, chloroplast DNA of paternal origin is sometimes found in the progeny [91–93], but the probability of transmission of the paternal chloroplast genome to the progeny in cereals is low; for example, in Setaria P. Beauv., it was 3 × 10⁻⁴ [93]. However, it should be noted that, in interspecific and backcross hybrids of cereals, the probability of finding paternal chloroplasts in the progeny increases [94, 95].

It can be suggested that maternal inheritance of mitochondrial and chloroplast genomes, which is typical of cereals, is under concerted control of nuclear and mitochondrial genomes. This mechanism stops functioning properly when nuclear and cytoplasmic genomes of different origin are combined in one cell. This is precisely the case in oats. All polyploid species of the genus Avena, except perhaps A. macrostachya, are allopolyploids with the AC, AB, CD, or ACD genomes (Table 1). Fu et al. [20, 21] attempted to calculate the time of origin of the Avena A and C genomes, the appearance of variants of the A and C genomes, and the appearance of tetraploids. For all the conventionality of the absolute chronological scale used to determine the divergence time of the main phylogenetic branches of cereals, the relative times of species divergence that they give seem to be correct. Fu [20], comparing the SNP patterns in the mitochondrial and chloroplast genomes, inferred that, if the lineages of diploid oat species with the A and C genomes diverged about 20 million years ago, then the A. ventricosa lineage (Cv genome) separated from A. clauda and A. eriantha (Cp genome) about 10–11 million years ago. Among the species with the A nuclear genome, A. canariensis (Ac genome) was the first to separate from the common branch, about 13–15 million years ago. Then, about 11–12 million years ago, the A. damascena branch (Ad genome) was formed. The A. longiglumis branch became independent about 9–10 million years ago; 8–9 million years ago, a chloroplast genome related to that of A. longiglumis was acquired by an ancestor of tetraploids with the AC/DC genome. About 6–7 million years ago, a tetraploid ancestor with the nuclear DC genome (A. murphyi and A. maroccana) appeared, from which a derivative of the chloroplast genome of A. longiglumis was obtained by the hexaploid species A. sterilis, A. sativa, A. hybrida, A. occidentalis, and A. fatua.
It can be suggested that different hexaploid species A. sativa + A. sterilia, A. occidentalis + A. hybrida, and A. fatua appeared independently in different parts of the range. In any case according to Fu [20], the chloroplast genomes of tetraploid A. murphyi and A. maroccana are much closer to the chloroplast genome of A. fatua than to the cp genomes of Asativa and A. sterilis. According to Fu [20], the phylogenetic branches of the hexaploid species (A. sativa + A. sterilis) and (A. occidentalis + A. fatua) diverged 7.4 million years ago; the time of divergence of A. sativa from A. sterilis was estimated at 4.9 million years, and that of A. fatua from A. occidentalis was estimated at 6.5 million years. Among the diploid species of the genus Avena, one of the branches that separated from the common tree about 10 million years ago gave rise to diploid species with the As genomes. The chloroplast genome of this branch about 6–7 million years ago went to the ancestor of tetraploid oats with the genomic formula AB (A. abyssinica, A. barbata, A. vaviloviana). It is suggested that A. agadiriana (nuclear AB genome) appeared independently 6–7 million years ago. This species has a different type of chloroplast genome; its cpDNA is closer than others to the chloroplast genome of A. longiglumis [20].

Our analysis of the origin of polyploid Avena species using locus-specific next-generation sequencing showed that polyploid Avena species lost most of the 35S rRNA genes received from C-subgenome ancestors [96]. One would expect to find 50% of these genes in A. insularis (DDCC karyotype) and 30% in hexaploid species (AACCDD karyotype). In fact, there was about 3.2% in A. insularis and 1.4–2.4% in hexaploids. In all cases, the mean p-distance (measured as the proportion of nucleotide differences between pairwise compared sequences) for the 18S rRNA, ITS1, and 5.8S rDNA sequences in the C subgenomes was an order of magnitude higher than in the sequences of the A subgenomes [96].
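
The p-distance mentioned above can be illustrated with a minimal sketch. It assumes that the reads are already aligned to equal length and that positions with gaps or ambiguous bases are simply skipped; the function names and the toy sequences are hypothetical and are not taken from [96].

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or an ambiguous base
    (anything other than A, C, G, T) are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_p_distance(seqs):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy aligned reads, invented for illustration (not real Avena ITS1 data).
toy_reads = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(round(mean_p_distance(toy_reads), 4))
```

Applied separately to the reads assigned to the C and A subgenomes, such a mean would allow the kind of comparison reported above.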

Molecular phylogenetic analysis of the similarities of the C-subgenome sequences showed that none of the modern diploid Avena species with the C genomes was the direct ancestor of polyploid species. Each of the polyploid species has several different families of the C-subgenome sequences in its genome (Fig. 1). Bootstrap indices are low in all cases because only a small number of substitutions distinguish the compared genomic and subgenomic sequences.

Fig. 1. Molecular phylogenetic tree (minimum-evolution, ME, algorithm) of similarities between ITS1 sequences of the C subgenome and ITS sequences of diploid Avena species.

Molecular phylogenetic analysis of the similarities of the A-subgenome sequences of polyploids showed that none of the modern diploid Avena species with the A genomes was the direct ancestor of polyploid species. Each of the polyploid species has several different families of the A-subgenome sequences. The genome of A. insularis contains a relatively high number of transcribed spacers, displaying sequence similarity to one of the previously studied A. hirtula accessions (Fig. 2).

Fig. 2. Molecular phylogenetic tree (minimum-evolution, ME, algorithm) of similarities between ITS1 sequences of the A subgenome and ITS sequences of diploid Avena species.

Next, targeted sequencing data obtained for the “population” of ITS sequences of the polyploid Avena genomes were analyzed using the TCS software program [97]. The TCS algorithm is based on the probabilistic method of statistical parsimony and makes it possible to assess the probability of relationships between all haplotypes, indicating the number of mutations that distinguish the studied haplotypes. The results of the TCS calculations were processed using the tcsBU software program [98]. The closeness (relatedness) of rDNA from the C and A subgenomes of the polyploid species A. insularis, A. fatua, A. ludoviciana, and A. sterilis can be seen in Figs. 3 and 4.
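
The quantities that a haplotype network is built from can be sketched as follows. This is not the TCS algorithm itself (no parsimony connection limit is estimated); it only collapses identical reads into haplotypes and counts the mutational steps between them, assuming pre-aligned reads of equal length. All names and sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def collapse_haplotypes(reads):
    """Group identical reads; return haplotype -> number of reads."""
    return Counter(reads)


def mutation_steps(hap_a, hap_b):
    """Number of sites at which two aligned haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must be aligned to the same length")
    return sum(1 for a, b in zip(hap_a, hap_b) if a != b)


def one_step_links(haplotypes):
    """Pairs of haplotypes separated by exactly one mutation.

    In a parsimony network these pairs are the most strongly
    supported direct connections.
    """
    return [
        (a, b)
        for a, b in combinations(sorted(haplotypes), 2)
        if mutation_steps(a, b) == 1
    ]


# Toy reads, invented for illustration (not real ITS1 haplotypes).
toy_reads = ["ACGTAC", "ACGTAC", "ACGTAT", "ACGAAT", "ACGTAC"]
haps = collapse_haplotypes(toy_reads)
links = one_step_links(haps)

for hap, count in haps.most_common():
    print(hap, count)
for a, b in links:
    print(a, "<->", b)
```

Read counts per haplotype correspond to node sizes in a network drawing, and the step counts correspond to the number of mutations marked on its edges.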

Fig. 3. System of C-type haplotypes in diploids and polyploids.

Fig. 4. A-type genome system (ITS1 haplotypes) in diploids and polyploids of the genus Avena. Clusters of putative A and D subgenomes and a cluster grouping diploid species with the A genomes are marked.

From the data presented in Fig. 3, it follows that A. pilosa (synonym, A. eriantha) and A. clauda indeed have a common C genome (Cp), which differs from the C genome of the Cv type, characteristic of A. ventricosa, and from the Cm genome of A. macrostachya. The C subgenomes of the studied polyploid species are diverse, but among them a main (core) variant can be distinguished, approximately equidistant from the ITS of diploids carrying the C genome and from that of A. macrostachya. The system of the A-type subgenomes in the studied polyploids is different. The A subgenomes of A. insularis are represented by several families, one of which is close to the A genomes of A. longiglumis and A. canariensis. The genomes of A. fatua, A. sterilis, and A. ludoviciana contain two or three A-type rDNA families, possibly corresponding to the subgenomes named A and D by cytogeneticists. The rDNA sequence (strictly speaking, the ITS1 region) of the putative D-subgenome variants is more distant from the rDNA of the A genomes of diploids and from the variants of the putative A subgenome (Fig. 4).

These findings advance our knowledge on the oat genome evolution and have practical implications for the conservation and use of oat germplasm in breeding.

RESOURCE POTENTIAL OF WILD AND CULTIVATED SPECIES OF THE GENUS OF OATS

Effective use of intraspecific diversity of the Avena species for breeding traits is impossible without a preliminary assessment of the genetic diversity of wild species and landraces of cultivated species of the genus Avena [15]. Moreover, an important task is the certification of new cultivars and landraces of Avena using SSR analysis, which currently seems to be the most informative for these purposes [74, 99, 100]. In the conditions of intensive agriculture, the cultivar certification in a number of cases can be important in the establishment of priorities, authorship of cultivars, and the approval of exclusive rights to the results of breeding.

Under the conditions of intensive agriculture in the last quarter of the 20th and the beginning of the 21st century, both in the West and in the East, signs of “genetic erosion” were observed in modern elite cultivars [100–102]. Molecular genetic studies (SSR, AFLP, and DArT) showed that, compared to modern commercial cultivars, landraces and old oat cultivars were characterized by higher genetic diversity for a wide range of studied traits, which confirmed the value of landraces as sources for breeding programs [100, 103]. Moreover, in cultivars with different types of development (winter crops, spring crops), the level of genetic diversity can be substantially different. For example, in the study of Clos et al. [84], 635 oat lines were compared at 4561 SNPs. Spring oat lines were shown to have lower genetic diversity than cultivars grown in the southern states, where oats are cultivated as a winter crop.
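
One common way to compare diversity between groups of lines genotyped at the same SNPs is Nei's gene diversity, averaged over loci. The sketch below is an illustration only: the text does not state which statistic was used in [84], the function names are hypothetical, and the genotype matrices are invented toy data for inbred lines coded as allele 0 or 1 (None for a missing call).

```python
from collections import Counter


def gene_diversity(calls):
    """Nei's gene diversity at one SNP: 1 minus the sum of squared allele frequencies."""
    called = [c for c in calls if c is not None]
    if not called:
        raise ValueError("no genotype calls at this SNP")
    n = len(called)
    return 1.0 - sum((count / n) ** 2 for count in Counter(called).values())


def mean_gene_diversity(lines):
    """Average gene diversity over SNPs.

    `lines` is a list of rows, one per oat line, each row holding
    the allele call at every SNP in the same order.
    """
    if not lines:
        raise ValueError("no lines supplied")
    per_snp = [gene_diversity(column) for column in zip(*lines)]
    return sum(per_snp) / len(per_snp)


# Toy genotype matrices (rows = lines, columns = SNPs), invented for illustration.
spring_lines = [[0, 0], [0, 0], [0, 1], [1, 0]]
winter_lines = [[0, 0], [1, 1], [0, 1], [1, 0]]

print("spring:", mean_gene_diversity(spring_lines))
print("winter:", mean_gene_diversity(winter_lines))
```

In this toy example the second group has the higher value because both alleles are equally frequent at each SNP.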

Under these conditions, it becomes critically important to develop effective programs for the conservation and enrichment of the genetic diversity of oats through the reproduction of landraces in genetic collections and the use of wild oat species as sources of loci important for breeding [15, 74, 100]. For instance, crossing A. sativa with A. sterilis can be used to introduce genes conferring resistance to powdery mildew and to crown and stem rust into the genomes of elite varieties [5, 104, 105]. A. fatua was involved in programs for the production of oat lines resistant to barley yellow dwarf virus (BYDV) [106]. The hexaploid A. byzantina was a donor of resistance to powdery mildew and crown rust for A. sativa [107]. The Pc91 allele of the crown rust resistance gene was introgressed into cultivated oats from the genome of the tetraploid oat A. magna [108], while the Pc94 allele was introgressed into A. sativa from the diploid A. strigosa [109].

The hexaploid oat species A. fatua and A. sterilis and the tetraploid oat A. macrostachya have been and are used in breeding programs for the development of cold-tolerant cultivars of common oat [110, 111].

Metabolomic analysis is a new effective approach to assessing the resource potential of individual cultivars and populations of wild oat species. Our metabolic profiling of kernels from cultivars and wild Avena species showed that the range of the metabolomic profile variation in elite cultivars was much narrower than in wild species. The metabolites, the content of which decreased in the process of domestication or in which wild oat species differed from the cultivars of this crop, were identified [112]. These findings may be associated with the selection during oat domestication and a decrease in the metabolome diversity during the formation of the “domestication syndrome” [113]. The variety of metabolomic profiles can be lost during the selection process upon the creation of highly specialized isoline-based and intensive modern cultivars, since this process is always accompanied by the decrease in genetic polymorphism of the object of breeding in comparison with metagenomes of many ecotypes and landraces and hundreds of natural races of wild species [112, 113]. In analysis of kernel metabolomic profiles in oat accessions susceptible to infection with Fusarium, correlations between these parameters were revealed. It was demonstrated that high-protein forms of oats were less affected by Fusarium infection and accumulated less toxins, and they were more adaptive to biotic stress [114]. Analysis of naked and hulled forms of common oat revealed the differences in metabolomic profiles of these forms, which serves as another confirmation in favor of the separation of these common oat subspecies [112, 115]. Comparison of the metabolomic profiles of groups of oat cultivars, including landraces, primitive cultivars, and modern cultivars bred in Russia and France, demonstrated considerable differences between the studied groups. 
Differences were revealed upon comparison of metabolites that are important for the formation of the resistance of the cultivar to stress, as well as nutritional, medicinal, and dietary benefits of grain products. The most informationally valuable traits were identified, which made it possible to statistically significantly separate oat accessions of different origin and with different degrees of breeding elaboration. The study showed that realization of breeding programs aimed at the improvement of biochemical parameters of oat grains required the use of genetic diversity resources of Russian landraces and primitive elite cultivars collected and created in the 1920s and 1930s [116].

In recent years, common oat has become one of the most promising and demanded agricultural crops, since it contains a number of valuable compounds that meet the requirements for “functional foods.” One class of such compounds is the polyphenolic avenanthramides, which have antioxidant, anti-inflammatory, and antiatherogenic properties [117]. It was demonstrated that different oat cultivars differed in avenanthramide content. For instance, the Jaak cultivar (1.1–2.7 g/kg) was characterized by a consistently high concentration of the compound compared to other standard cultivars (Belinda, 0.5–1.2 g/kg; Ivory, 0.3–1.7 g/kg). At the same time, the diploid cultivated species A. strigosa showed a very high avenanthramide concentration, up to 4.1 g/kg, and the hexaploid A. byzantina, 3.0 g/kg. By contrast, the wild species of different ploidy A. hirtula, A. barbata, A. fatua, and A. sterilis were distinguished by relatively low avenanthramide concentrations (240–1585 mg/kg) [118]. An even greater variation in kernel avenanthramide concentration was revealed in the analysis of a large set of accessions of cultivated and wild oat species [119]. Oats also contain two different saponin forms, avenacosides (steroid-associated sugars) and avenacins (triterpenoid-associated sugars), which have been shown to lower cholesterol, stimulate the immune system, and have anti-carcinogenic properties [120]. Oat breeding aimed at increasing the content of these substances was not carried out previously; however, interline and interspecific differences in this index were revealed [121].

Among the many products of oat biosynthesis, the greatest role for humans seems to be played by soluble fiber, primarily β-glucans (as well as arabinoxylan, xyloglucan, and some other minor fiber components), which reduce blood cholesterol levels and substantially reduce the risk of cardiovascular diseases [122–124]. A trove of evidence on the beneficial role of oat β-glucans led the US Food and Drug Administration (FDA) to officially declare that soluble fiber from whole oat kernel in the form of oatmeal, bran, and flour reduces the risk of cardiovascular disease, provided that at least 3 g of β-glucans per day is consumed with food. Subsequently, similar recommendations were approved by the European Commission and the competent authorities of Australia, New Zealand, Canada, Brazil, Malaysia, Indonesia, and South Korea [124].

The genetic diversity of oats in terms of kernel β-glucan content was assessed within the framework of two European programs. The data of the HEALTHGRAIN Diversity Screen project showed that the kernel β-glucan and antioxidant content of the five studied oat cultivars was considerably different [125]. In the European project “Avena Genetic Resources for Quality in Human Consumption,” analysis of 658 oat cultivars confirmed the contribution of both genetic and ecological components to the formation of the trait [126]. Interestingly, a high content of β-glucans and other antioxidants in the grain, as compared to cultivated and other wild-growing di- and tetraploid oats, was found in the hexaploid species A. fatua, A. occidentalis, and A. byzantina and in the diploid A. atlantica [123, 126–129].

Sikora et al. [130] analyzed 1700 oat lines isolated from the Belinda cultivar (Sweden) and carrying EMS-induced mutations. Among these, 10 lines in which the concentration of β-glucans in the kernel exceeded 6.7% and 10 lines in which their content was under 3.6% (the concentration of β-glucans in the Belinda cultivar is 4.9%) were identified. The maximum spread in the content of these polysaccharides among 1700 mutant lines was from 1.8 to 7.5%.
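
To relate these concentrations to the daily intake of 3 g of β-glucans mentioned earlier, the following sketch converts a kernel β-glucan percentage into the mass of kernel that would supply that amount. It assumes that the percentages are mass fractions of the kernel and ignores processing losses; the function name is hypothetical.

```python
def kernel_mass_for_target(beta_glucan_percent, target_g=3.0):
    """Grams of oat kernel that contain `target_g` grams of beta-glucans."""
    if beta_glucan_percent <= 0:
        raise ValueError("beta-glucan content must be positive")
    return target_g / (beta_glucan_percent / 100.0)


# Concentrations quoted in the text for the Belinda-derived lines.
examples = [
    ("low-content mutant lines (3.6%)", 3.6),
    ("Belinda parent cultivar (4.9%)", 4.9),
    ("high-content mutant lines (6.7%)", 6.7),
]

for label, percent in examples:
    print(f"{label}: {kernel_mass_for_target(percent):.1f} g of kernel per day")
```

Under these assumptions, the parent cultivar would require roughly 61 g of kernel per day, and the high-content lines noticeably less.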

Currently, the studies of associations between the genotype and the β-glucan and fatty acid content in oats using the GWAS approach are under way. For instance, four loci associated with variation in the fatty acid content and composition in oat grain were identified. However, the genomic regions providing variation in the protein, oil, sugar, and uronic acid content that in turn directly affect the quality of grain still remain unexplored [77].

The data obtained using molecular metabolomic analyses, i.e., so-called mQTL (metabolite quantitative trait loci) and mGWAS (metabolome-based GWAS), make it possible at a new level to qualitatively and quantitatively characterize secondary metabolites of interest for breeding. These analyses can provide information on the relationships of metabolites to each other, as well as to important breeding traits, which can lead to the development of more rational models linking a particular metabolite with the traits of productivity or end product quality. Even more promising is the possibility of studying the relationships between quantitative changes in metabolites and changes in plant phenotype [131].

CONCLUSIONS

Thus, wild and cultivated species of the genus Avena L. are the objects of taxonomic, phylogenetic, genetic, molecular genetic, and “omics” studies. The results of these studies were reported at the 10th International Oat Conference held in 2016 in St. Petersburg [132]. Unfortunately, in Russia, species of the genus Avena L. are not the objects of a large body of studies, though common oat is one of the leading grain crops, which is widely used in agricultural production for fodder purposes and lately for obtaining nutritional, dietary, and functional foods.

The study of the pathways of origin, “domestication,” and the systematic position of species of the genus Avena makes it possible to better understand the genetic nature of all species and, in turn, to outline the directions for the conservation of their biological diversity and the use of their resource potential. The most promising direction in the study and identification of genetic diversity in relation to breeding programs is the use of sequencing techniques and molecular markers. Currently, not a single line of research is complete without DNA technologies, which make it possible, at a new level and rather quickly, to perform genotyping, genetic mapping, and marker-assisted selection (MAS) to identify genotypes carrying valuable alleles of the genes that control different traits important for breeding. This shortens many times over the path of highly resistant and highly productive oat cultivars to the field.