Introduction

Current molecular characterization of ex situ plant germplasm has placed more emphasis on cultivated gene pools and less on exotic gene pools representing wild relative species (Karp 2002). This largely reflects the challenge in species identification, the inadequate coverage of extant germplasm for wild species, the difficulty in characterizing wild species of different habits (annual or perennial) and mating types (outcrossing or inbreeding), and/or the need for substantial effort in the introgression of exotic germplasm into a plant breeding program. However, examples of successful introgressions of exotic disease and pest resistance genes from wild into cultivated species are not lacking (e.g., see Harder et al. 1992). With recent advances in molecular technology, such an introgression is expected to play an important role in unlocking the genetic potential of wild relatives for crop improvement (Tanksley and McCouch 1997; Fridman et al. 2004). Thus, more attention is warranted for characterization of exotic germplasm for plant breeding and research (Hawkes 1990; Jellen and Leggett 2006).

Plant Gene Resources of Canada (PGRC; the Canadian national seed genebank) at Saskatoon maintains a unique world collection of oat germplasm with more than 27,000 accessions of 26 oat species (Wesenberg et al. 1992; Diederichsen et al. 2001). To facilitate the utilization of the exotic oat germplasm in both oat breeding and research, an attempt was made to establish a reference set of exotic germplasm accessions for these Avena species. A large set of exotic oat accessions was morphologically characterized, and baseline data with verified species identity were obtained for the development of reference accessions. This characterization provides a unique opportunity to assess the genetic variation of Avena gene pools and the genetic relationships of Avena species. Knowledge about the Avena gene pools has become more critical in the search for new sources of genes for oat improvement (Ladizinsky 1988; Frey 1991; Leggett 1996; Jellen and Leggett 2006; Fu et al. 2007).

The genus Avena, belonging to the Gramineae family, consists of 30 different species (Baum 1977; Baum and Fedak 1985a, b; Ladizinsky 1998). All the representative species are annual inbreeders, with the exception of A. macrostachya, an outbreeding, perennial tetraploid. These species are generally recognized by oat workers, though there is some disagreement over the classification of some species/taxa (Leggett 1992). Malzew (1930) published the first comprehensive treatment of Avena. Based on morphological, genetic and ecological evidence, Ladizinsky made the first attempt to re-classify the Avena species into six species, in which A. sativa included all classical hexaploid species, and later introduced 14 biological species of Avena (Ladizinsky 1971, 1988). Baum (1977) presented an advanced taxonomic treatment of Avena with seven sections and 27 species.

The Avena species form a distinct polyploidy series ranging from diploid through tetraploid to hexaploid with a basic chromosome number of seven. Diploid species have either the A or C genome, tetraploids have either the AC or AB genome, and hexaploids have the ACD genome designation. The classification of these cytologically distinct genomes was based on their karyotypes and the pairing behavior in their hybrids (Rajhathy and Thomas 1974; Baum 1977; Thomas 1992). The A genome is structurally different from the C genome. The A genome species also displayed minor structural differentiations, which were designated as As (A. atlantica, A. brevis, A. hirtula, A. nuda, A. strigosa, A. wiestii), Ac (A. canariensis), Al (A. longiglumis), Ad (A. damascena), and Ap (A. prostrata). No diploids with the B or D genome have been identified. The three diploid C genome species were separated into two genome types (Cv and Cp) (Leggett and Thomas 1995), and both have been proposed as the putative donors of the C genome of the hexaploids (Rajhathy and Thomas 1974; Chen and Armstrong 1994; Jellen et al. 1994).

Efforts have been made using different molecular techniques to assess Avena species and genome relationships, but these relationships are still poorly understood (Thomas 1995; Li et al. 2000; Drossou et al. 2004; Loskutov 2008). Close relationships between the A and D genomes (Chen and Armstrong 1994; Jellen et al. 1994) and between the A and B genomes (Leggett and Markhand 1995; Katsiotis et al. 1997) were found by genomic in situ hybridizations. Differentiation of the D or B genomes from the A genome was reported with molecular probes (Linares et al. 1998; Irigoyen et al. 2001). Species relationships were largely confirmed with various molecular markers such as isozymes (Sanchez de la Hoz and Fominaya 1989), restriction fragment length polymorphism (RFLP) (Alicchio et al. 1995; Nocelli et al. 1999), randomly amplified polymorphic DNA (RAPD) (Nocelli et al. 1999; Loskutov and Perchuk 2000), microsatellite (Li et al. 2000), and amplified fragment length polymorphism (AFLP) (Drossou et al. 2004). However, these assessments are largely limited to small sets of Avena species assayed with different molecular markers (Li et al. 2000; Drossou et al. 2004), which makes inferences of genetic relationships less comparable across studies. Also, previous molecular studies have rarely addressed the genetic diversity of these Avena species specifically and thus have offered little resolution for understanding the exotic oat gene pools.

The objectives of this study were to (1) assess the genetic diversity of 163 accessions representing 25 Avena species and eight genome types using AFLP markers, and (2) infer the genetic relationships of 25 Avena species and eight known genome types. The AFLP technique (Vos et al. 1995) is a robust, highly effective method of DNA fingerprinting that can be used to assess molecular genetic variability. The AFLP markers, although scored dominantly (i.e., without distinction between homozygotes and heterozygotes) and not always homologous (Mechanda et al. 2004), have been shown to be effective in detecting phylogenetic signals in many plant species (Hodkinson et al. 2000; Koopman 2005; Althoff et al. 2007), including Avena species (Drossou et al. 2004).

Materials and methods

Plant materials

About 240 Avena accessions of diverse geographic origins representing 26 Avena species were selected from the PGRC wild oat collection. Seeds of the selected accessions were planted from 2004 to 2005 in the greenhouse at the Saskatoon Research Centre, Agriculture and Agri-Food Canada. Oat plants were characterized with more than 20 morphological characters including germination, growth habit, leaf morphology, flower characteristics and seed character. A total of 163 accessions of 25 species with confidence of correct species identification were chosen in 2006 for this study (Table 1; Supplementary Table S1). One additional barley accession (CN 2458) was also selected from the PGRC barley collection to serve as an outgroup.

Table 1 AFLP variations for 163 accessions representing 25 Avena species

DNA extraction and AFLP analysis

About 10–15 kernels of each selected accession were grown in the greenhouse. Young leaves were collected from ten 5-day-old seedlings of each accession, bulked, freeze-dried (in a Labconco Freeze Dry System for 3–5 days), and stored at −80°C. DNA extraction and AFLP analysis have previously been described in detail by Fu et al. (2004). Based on the previous AFLP analyses of oat germplasm (Fu et al. 2004, 2005), the five most informative primer pairs (Table 2) were selected for this AFLP analysis. To assess the consistency of the AFLP profiles over the two gels of 164 samples, four randomly selected DNA samples were placed in both gels for each primer pair.

Table 2 AFLP variations revealed by five AFLP primer pairs in 163 Avena accessions

Data analysis

Automatic analysis of banding patterns on ten gels was conducted using GelCompar II™ (Applied Maths, Belgium). The TIFF format gel images were processed from the autoradiographs using a digital camera. The conversion, normalization, and background subtraction of the gel images were conducted following the GelCompar II user’s guide. The image within one gel was normalized using the four duplicate samples as reference lanes. Image alignment among gels produced from the same primer pairs was performed using an external reference (a 30–330 bp AFLP DNA ladder; Promega, Madison, WI, USA) and an internal reference (bands that were monomorphic across gels). The aligned gel images were automatically scored as 1 (present) or 0 (absent). Any AFLP band displaying two or more mismatches between gels over the four duplicate pairs was discarded from analysis.
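The band-discarding rule can be illustrated with a minimal Python sketch. It is not the GelCompar II procedure; the function names, the data layout (one 0/1 score list per duplicated sample on each gel) and the example scores are assumptions made for illustration only.

```python
def count_mismatches(gel_a, gel_b):
    """Per-band count of duplicated samples scored differently on the two gels.

    gel_a, gel_b: lists of 0/1 score lists, one per duplicated sample,
    in the same sample order; each inner list holds one score per band.
    """
    n_bands = len(gel_a[0])
    return [
        sum(1 for a, b in zip(gel_a, gel_b) if a[band] != b[band])
        for band in range(n_bands)
    ]


def reproducible_bands(gel_a, gel_b, max_mismatches=1):
    """Indices of bands with at most `max_mismatches` mismatching duplicate pairs.

    The default of 1 mirrors the rule of discarding bands with two or
    more mismatches over the duplicate pairs.
    """
    return [
        band
        for band, n in enumerate(count_mismatches(gel_a, gel_b))
        if n <= max_mismatches
    ]


# Hypothetical scores: 4 duplicated samples x 5 bands on each gel.
gel_a = [
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1],
    [1, 0, 1, 0, 0],
]
gel_b = [
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [1, 0, 1, 1, 0],
]
kept = reproducible_bands(gel_a, gel_b)
print(kept)
```

In this toy example the third band (index 2) disagrees in two duplicate pairs and is dropped, while bands with zero or one mismatch are retained.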

The selected polymorphic bands were analyzed for the level of polymorphism with respect to primer and species by counting the number of polymorphic bands and generating the summary statistics on the band frequencies. To visualize the variation pattern, the numbers of polymorphic bands were plotted against their frequencies of occurrence in all the assayed Avena accessions. To assess the impact of accession size on the polymorphism observed for a species, a regression was done using SAS PROC REG (SAS Institute 2004) on the number of accessions over the number of polymorphic bands, the mean band frequency, and species-specific proportion of the total AFLP variation obtained from the analysis of molecular variance (AMOVA; Excoffier et al. 1992) given below.
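The summary statistics on band frequencies can be sketched as follows. This is a simplified illustration rather than the procedure actually used: the function names and the toy score matrix are hypothetical, and averaging the frequencies over polymorphic bands only is an assumption.

```python
def band_frequencies(scores):
    """Fraction of accessions in which each band is present.

    scores: list of 0/1 score lists, one per accession.
    """
    n = len(scores)
    return [sum(column) / n for column in zip(*scores)]


def polymorphic_bands(scores):
    """Indices of bands present in some but not all accessions."""
    return [i for i, f in enumerate(band_frequencies(scores)) if 0 < f < 1]


# Hypothetical 0/1 matrix: 4 accessions x 4 bands.
scores = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
freqs = band_frequencies(scores)
poly = polymorphic_bands(scores)
# Assumption: the mean band frequency is taken over polymorphic bands only.
mean_freq = sum(freqs[i] for i in poly) / len(poly)
print(freqs, poly, mean_freq)
```

Applying the same functions to the accessions of a single species would give the per-species counts of polymorphic bands and mean band frequencies of the kind summarized in Table 1.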

To assess AFLP variations across four groups (species, section, ploidy, and genome type), an AMOVA was performed using Arlequin version 3.01 (Excoffier et al. 2005). This analysis not only allows the partition of the total AFLP variation into within- and among-group variation components, but also provides a measure of inter-group genetic distances as the proportion of the total AFLP variation residing between Avena accessions of any two groups (called the Phi statistic; Excoffier et al. 1992). Models involving different levels and types of structuring (species, section, ploidy, and genome type) were applied. Standardized proportions of the total AFLP variation (i.e., member-specific Fst) were also generated for specific members of four groups. Significance of resulting variance components and inter-group genetic distances was tested with 10,100 random permutations.
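For readers unfamiliar with AMOVA, the partition for a single grouping level can be sketched in Python from pairwise squared Euclidean distances between 0/1 band profiles, following the general scheme of Excoffier et al. (1992). This is a minimal illustration and not Arlequin's implementation: it omits the permutation test and the group-specific Fst estimates, and the function names and toy profiles are assumptions.

```python
from itertools import combinations


def sq_dist(x, y):
    """Squared Euclidean distance between two 0/1 profiles
    (equal to the number of bands at which they differ)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def ssd(profiles):
    """Sum of squared deviations computed from pairwise distances:
    the sum over pairs i < j of d2_ij, divided by the number of profiles."""
    n = len(profiles)
    return sum(sq_dist(x, y) for x, y in combinations(profiles, 2)) / n


def amova_one_level(groups):
    """Among- and within-group variance components and Phi_ST.

    groups: dict mapping a group name to a list of 0/1 band profiles.
    """
    everyone = [p for members in groups.values() for p in members]
    n_total, n_groups = len(everyone), len(groups)
    ssd_within = sum(ssd(members) for members in groups.values())
    ssd_among = ssd(everyone) - ssd_within
    msd_within = ssd_within / (n_total - n_groups)
    msd_among = ssd_among / (n_groups - 1)
    # Effective group size, corrected for unequal numbers of accessions.
    n0 = (n_total - sum(len(m) ** 2 for m in groups.values()) / n_total) / (n_groups - 1)
    var_within = msd_within
    var_among = (msd_among - msd_within) / n0
    phi_st = var_among / (var_among + var_within)
    return var_among, var_within, phi_st


# Hypothetical profiles: two groups of two accessions, four bands each.
groups = {
    "group_1": [[1, 1, 0, 0], [1, 1, 1, 0]],
    "group_2": [[0, 0, 1, 1], [0, 0, 0, 1]],
}
var_among, var_within, phi_st = amova_one_level(groups)
print(var_among, var_within, round(phi_st, 3))
```

Restricting `groups` to two entries at a time gives a pairwise Phi value, which is the kind of inter-group distance described above.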

A neighbor-joining analysis of 163 Avena and one barley accessions was also made using PAUP* (Swofford 1998) and a radiation tree was displayed using MEGA 3.01 (Kumar et al. 2004) to confirm the genetic associations of individual accessions representing various Avena species. To assess the genetic associations of the Avena species and genome types, the inter-group genetic distances obtained from AMOVA for both the Avena species and genome types were analyzed using the neighbor-joining method in MEGA 3.01 program (Kumar et al. 2004).

To assess the AFLP similarity among Avena species, a computer program was specifically written with SAS IML (SAS Institute 2004). An AFLP similarity was calculated for each pair of the individual samples between two accessions using Dice’s similarity coefficient (Dice 1945) and the similarities of all the pairs were averaged for these two accessions. An accession with a higher similarity with another accession would mean they are genetically more related than to the other accessions, if the detected AFLP fragments adequately sampled Avena genomes. The computer program is available upon request.
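Dice's coefficient for two 0/1 band profiles is 2a / (2a + b + c), where a is the number of bands shared by both profiles and b and c are the numbers of bands present in only one of them. The averaging step can be sketched in Python as below. This is not the SAS IML program mentioned above; the function names and example profiles are hypothetical.

```python
def dice(x, y):
    """Dice similarity between two 0/1 band profiles: 2a / (2a + b + c)."""
    shared = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    total = sum(x) + sum(y)  # equals 2a + b + c
    # Assumption: two profiles with no bands at all are given similarity 0.
    return 2 * shared / total if total else 0.0


def mean_dice(group_a, group_b):
    """Average Dice similarity over all pairs formed by taking one
    profile from each group."""
    pairs = [(x, y) for x in group_a for y in group_b]
    return sum(dice(x, y) for x, y in pairs) / len(pairs)


# Hypothetical profiles for two groups of samples.
group_a = [[1, 1, 0, 1], [1, 1, 1, 0]]
group_b = [[1, 0, 1, 1]]
print(round(mean_dice(group_a, group_b), 3))
```

Because Dice's coefficient ignores shared absences, it is a common choice for dominant markers such as AFLP, where a shared absent band is weaker evidence of relatedness than a shared present band.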

Results

Five AFLP primer pairs amplified more than 650 AFLP bands for one barley and 163 oat accessions. Removing those mismatched bands between the two gels in more than one of the four duplicated accessions yielded a total of 413 AFLP bands selected for further analyses (Table 2). The primer pair E+AAG/M+CAC had the most polymorphic bands, followed by the primer pairs E+ACG/M+CTG and E+AGG/M+CGC. The individual band frequencies in the assayed accessions ranged from 0.006 to 0.994 and the mean band frequencies found for these primer pairs ranged from 0.342 to 0.650. The number of the polymorphic AFLP bands ranged from 20 to 55 over their occurrence frequencies from 0.01 to 0.99 for the oat accessions alone (Fig. 1).

Fig. 1

Number of polymorphic AFLP bands with respect to their frequencies of occurrence in 163 individual Avena accessions

AFLP variation of the Avena accessions

According to the oat species, the AFLP variation was quantified in this study for the number of polymorphic bands, the mean band frequency, and species-specific proportion of the AFLP variation obtained from the AMOVA (Table 1). The number of polymorphic bands observed for a species with multiple accessions assayed ranged from 30 (A. nuda) to 259 (A. sterilis), depending on the number of the assayed accessions. The mean band frequency ranged from 0.367 (A. hispanica) to 0.526 (A. fatua). The species-specific proportions of the total variation ranged from 0.574 (A. maroccana and A. sterilis) to 0.671 (A. canariensis). Linear regressions of diversity estimates over the numbers of the assayed accessions confirmed the dependence of the number of polymorphic bands on the accession sizes used for a species, but not for the mean band frequency and the species-specific proportion of the total variation. Based on AMOVA, 59.5% of the total variation resided among these oat species and 40.5% was present within individual species (Table 3).

Table 3 Analysis of molecular variance (AMOVA) sum of squares partitioning of total AFLP variation into among- and within-species (or group) components based on four structure models

For the six sections of Avena species defined by Baum (1977), more AFLP variation was found within than among sections (Table 3). Based on the section-specific proportion of the total variation, the most diverse section was Agraria, followed by Ethiopica, Ventricosa, Avena, Pachycarpa, and Tenuicarpa (Table 4). For the three ploidy levels, more AFLP variation was observed within than among ploidy levels (Table 3). When measured by ploidy-specific Fst, the hexaploid species were the most diverse, followed by the diploid and the tetraploid species (Table 4), even though statistical significance of these differences was not tested. Slightly more variation resided among than within eight genome types (Table 3). Genome-type-specific Fst estimates were also similar, with the most diverse genome type being Ad, followed by Ac, Cp, Al, As, ACD, AC, and AB (Table 4).

Table 4 Group-specific proportions of the total AFLP variation (Fst)

Genetic relationships of the Avena species

Three approaches were applied to assess the genetic relationships of 25 Avena species. Based on the differences of 413 polymorphic AFLP bands, 163 individual accessions representing these Avena species were clustered with a neighbor-joining method. Several major patterns of genetic association were observed (Fig. 2). First, as expected with a diploid genome, the outgroup barley accession is close to the three diploid species of the section Ventricosa (A. ventricosa, A. eriantha, and A. clauda). Second, all the species within a ploidy level are closely related and the diploid species are more related to the tetraploid than to the hexaploid species. Third, four species in the Agraria section are mixed and three hexaploid species (A. hybrida, A. fatua, and A. occidentalis) are not well separated at the individual accession level. Fourth, the three sub-groups of A. sativa (common oat, red oat labeled with b, and hull-less oat labeled with h) were relatively well separated. Fifth, three newly discovered species are well positioned in the cluster. The tetraploid species A. insularis is closely related to A. murphyi and A. maroccana and another tetraploid species, A. agadiriana, is closely related to A. abyssinica. The diploid species, A. atlantica, is closely related to A. wiestii and should share the same genome type of As.

Fig. 2

Genetic associations of 163 individual Avena accessions representing 25 Avena species, reflected in the differences of 413 AFLP bands. Each accession is identified with the first two letters of the species name and the numerical code for the accession(s) of the species (see Supplementary Table S1). The species A. strigosa shares the first two letters of the species name with A. sterilis and thus is labeled with the third letter as sr. The barley accession as the outgroup is labeled as Barley

To assess the species relationships directly, the genetic distance was estimated as the proportion of the total AFLP variation residing between any two species with multiple samples. Based on this distance matrix, 22 Avena species were clustered. Although three species (A. ventricosa, A. murphyi, and A. abyssinica) were removed, the species relationships obtained (Fig. 3) remained largely unchanged from those inferred at the individual accession level (Fig. 2). Clearly, the species with Cp genomes (A. clauda and A. eriantha) are the oldest species in Avena, followed by A. canariensis (with Ac genome). All the species with the As genome are closely related, with the exception that A. lusitanica is more related to A. damascena with the Ad genome and A. longiglumis with the Al genome. Two tetraploid species with the AC genome (A. insularis and A. maroccana) are more related to the five hexaploid species with the ACD genome than to the tetraploid species with the AB genome (A. agadiriana, A. barbata, and A. vaviloviana). The hexaploid species closest to the cultivated species A. sativa is A. sterilis.

Fig. 3

Genetic associations of 22 Avena species reflected in the proportion of the total AFLP variation residing among species. The genome type and/or possible four letters representing the known sources of resistance or tolerance to diseases (R crown or stem rust, S smut, P powdery mildew, B barley yellow dwarf virus) are given in the parenthesis following a species. The known sources of resistance were obtained from Harder et al. (1992)

Analyses of the AFLP similarities among 25 Avena species (Table 5) revealed that the AC genome tetraploid A. maroccana is more similar to the Cp genome diploid A. eriantha and the As genome diploid A. wiestii than to the other diploids. It also displayed the highest similarity to the five ACD genome hexaploids among all seven tetraploids. It appears that the As genome diploid A. wiestii had the highest similarity with all three AC genome tetraploids and is the most likely A genome donor of the AC genome. Four As genome diploids with high similarities were also identified as the likely A genome donors of the AB genome, although variation existed among the AB tetraploids (Table 5).

Table 5 AFLP similarities of AC or AB genome tetraploids with C or A genome diploids and ACD genome hexaploids assayed

Genetic relationships of the Avena genome types

To assess the genetic relationships of the Avena genome types, individual accessions were first grouped according to the defined genome types. The genetic distance was estimated by AMOVA as the proportion of the total AFLP variation residing between any two groups of accessions representing two genome types. The resulting distance matrix was clustered for eight genome types. Three clusters were apparent, and the first, formed by the Cp genome, was the most distinct (Fig. 4). The second cluster, for the A genome, indicates that the Ac genome is the oldest, followed by the As, Al and Ad genomes. It appears that the Al genome is more related to the Ad genome than is the As genome. The third cluster shows that the AC genome is more related to the ACD genome than to the AB genome, but the AB genome seems to be the oldest in this cluster.

Fig. 4

Genetic associations of eight genome types of Avena species reflected in the proportion of the total AFLP variation residing among accessions representing various genome types of Avena species. See Table 1 for genome type designations

Discussion

This AFLP analysis not only yielded the first information on genetic variation among and within 25 Avena species for future germplasm exploration, but also provided a comprehensive view of the evolutionary pathways of Avena species and genomes. A majority of the AFLP variation resided among 25 oat species, within six assessed sections of the genus, within three existing ploidy levels, or among eight defined genome types. All the species were clustered together according to their ploidy levels. The C genome diploids appeared to be the most distinct, followed by the Ac genome diploid A. canariensis. The Ac genome seemed to be the oldest of all the A genomes, followed by the As, Al and Ad genomes. The AC genome tetraploids were more related to the ACD genome hexaploids than the AB genome tetraploids. Analysis of AFLP similarity suggested that the AC genome tetraploid A. maroccana was likely derived from the Cp genome diploid A. eriantha and the As genome diploid A. wiestii, and thus might be the progenitor of the ACD genome hexaploids.

The genetic relationships obtained for these 25 Avena species provided several new pieces of information for our understanding of species evolution in the genus (Leggett 1992). First, A. maroccana is the most likely progenitor of the ACD genome hexaploids, not A. insularis as previously suggested (Ladizinsky 1998, 1999). Second, A. wiestii appears to be the likely A genome donor of the ACD genome hexaploids, not A. strigosa as earlier proposed (Rajhathy and Thomas 1974) nor A. canariensis as recently suggested (Li et al. 2000). Third, A. eriantha appears to be the likely C genome donor of the ACD genome hexaploids, which supports the suggestion made by Li et al. (2000). Fourth, the As diploids are the likely donors of either the A or B genome for the AB genome tetraploids, which accords well with the indication that the B genome of the tetraploid A. barbata complex is closely related to the As genome of the diploid A. strigosa (Leggett and Thomas 1995; Leggett 1996). Fifth, the evolutionary sequence of the A genome diploids obtained from this analysis appears to be Ac > As > Al > Ad, rather than the earlier proposed sequence of Ap > Al > Ad > Ac > As (Rajhathy and Thomas 1974; Leggett 1992; Nocelli et al. 1999). However, these relationships, although more comprehensive than any of those reported so far, still need to be further assessed with other more effective molecular tools because the AFLP analysis is not free of limitations in phylogenetic studies due to the possible non-independence and non-homology of AFLP fragments (Koopman 2005).

Three C genome diploids form a group clearly distinct from other Avena taxa. As expected with the similar Cp genomes, A. eriantha and A. clauda are more related to each other than to A. ventricosa with a Cv genome. It appears that A. ventricosa is the progenitor of all the Avena taxa assayed. Two exceptions were found in the grouping of the 11 A genome diploids. Clustering of A. lusitanica of the As genome with A. damascena with the Ad genome and A. longiglumis with the Al genome (Fig. 3) does not seem to support the designation of an As genome for the species proposed by Drossou et al. (2004). The diploid species A. atlantica is closely related to A. wiestii and should share the same genome type of As. Thus, further assessment of the genome type of A. lusitanica and A. atlantica is desirable. Also, A. canariensis appears to represent the oldest lineage in the A genome diploids, as shown in Figs. 2 and 3.

The four AB genome tetraploids appear to be older than the three AC genome tetraploids (Figs. 2 and 3). A. vaviloviana is more related to A. barbata, followed by A. abyssinica and A. agadiriana. A. insularis appears to be the oldest of the three AC genome tetraploids (Fig. 2). The five ACD genome hexaploids are clustered together, but A. hybrida is more related to A. fatua and A. occidentalis. A. sativa included common oat, red oat and hull-less oat (the latter two labeled with the additional letters b and h in Fig. 2, respectively). The red oat and hull-less oat were previously named A. byzantina and A. nudisativa, respectively (Baum 1977), but it appears that the three red oat accessions are well separated from, while all the hull-less oat accessions are still mixed with, the common oat accessions.

Genetic variations among and within related species of a genus have rarely been comprehensively assessed using molecular techniques with respect to germplasm conservation and utilization. This analysis clearly illustrates how diverse Avena species are over section, species, ploidy and genome type (Tables 1 and 3). For example, the diploids generally are more diverse than the tetraploids and hexaploids within a species as shown in Fst (Table 1). The most diverse oat species was A. canariensis, followed by A. nuda, A. wiestii, and A. brevis. The least diverse oat species were A. maroccana, A. sterilis, and A. sativa. Two sister hexaploids, A. sterilis and A. sativa, although displaying similar levels of AFLP variation, had lower variation than the other three hexaploids assayed. Associating these diversity measurements with morphological characters would add some resolution to our understanding of Avena species divergence. For example, A. hybrida, although sharing many morphological characters with its relatives A. fatua and A. occidentalis (Baum 1977), appears to display larger AFLP variation (Table 1), suggesting that A. hybrida might have diverged more rapidly than its relatives.

However, this AFLP analysis would be more informative if all 30 Avena species had been adequately represented with multiple accessions. Five well recognized species were unavailable for this study: A. macrostachya (unknown genome type; 4x), A. hirtula (As; 2x), A. prostrata (Ap; 2x), A. atherantha (ACD; 6x) and A. trichophylla (ACD; 6x), and three species were represented with only one accession (Table 1). Thus, continuous efforts are still needed to collect and characterize germplasm for these under-represented species.

Implications for exotic gene pools

Following the idea of Harlan and De Wet (1971) and based mainly on the ease of gene transfer between the different species, Leggett (1996) elaborated the exotic gene pools of Avena species with all the hexaploids as the primary, two tetraploids (A. murphyi, A. maroccana) as the secondary, and the diploid and all the other tetraploids as the tertiary gene pool. Clearly, A. insularis should be included in the secondary gene pool. The genetic relationships of Avena species reported here should enhance our understanding of these exotic gene pools in terms of genomic structure and relationship. Understanding the evolutionary pathways of Avena species will increase our confidence in introgressions of exotic genes into cultivated germplasm (Jellen and Leggett 2006).

The patterns of AFLP variation reported here with respect to species, section, genome type, and ploidy level are useful in sampling accessions for establishing reference set(s) of exotic Avena germplasm. Relatively lower within-species than among-species variations imply the need to place more emphasis on species representation than within-species sampling. Species with a higher diversity may be considered with a larger representation in the reference sets. As ploidy and genome type explained considerable levels of AFLP variations, Avena accessions may need to be verified for their ploidy and genome types, particularly for those species with little available information. Thus the 163 Avena accessions assayed in this study will be modified accordingly for an adequate representation in the reference set.

Exotic oat germplasm has been well documented as a source of disease resistance for oat improvement (e.g., Harder et al. 1992). To facilitate the inference of the sources of disease resistance, the association of oat genome type with resistance to four diseases (crown or stem rust, smut, powdery mildew, and barley yellow dwarf) was assessed based on the known sources of disease resistance (Fig. 3). It appears that resistance to all four identified diseases is associated with three genome types: A, AB and ACD. Resistance to rust and powdery mildew is also associated with the C genome, but only rust resistance was found in oat plants with the AC genome.