Introduction

Maize is an important crop that is produced worldwide, primarily as a food crop and also as fodder and fuel (Ranum et al. 2014). Although its domestication is relatively recent, the genetic diversity of maize germplasm is higher than that of any other major cereal crop (Tenallion et al. 2001; Vigouroux et al. 2008; Dao et al. 2014). This exceptionally high genetic diversity underlies a phenotypic diversity that allows maize to be cultivated in environments ranging from tropical rainforests to high mountains and to adapt to the short growing season in Canada. As a result, maize is cultivated in over 75 countries worldwide (Prasanna 2012). Analysis of genetic diversity is required to understand the genetics of maize domestication and its dissemination into different environmental habitats. Molecular markers have made it possible to analyze genetic diversity among diverse maize germplasms. Isozymes from accessions derived from particular countries or areas were formerly used for diversity analyses (Sánchez et al. 2000, 2007), but the isozyme method has been replaced by DNA-based markers. Matsuoka et al. (2002) examined microsatellite variations among 193 accessions representing the entire pre-Columbian range from eastern Canada to northern Chile and concluded that all maize arose from a single domestication in southern Mexico about 9000 years ago. Similarly, Vigouroux et al. (2008) conducted additional research on 945 accessions from the same geographic range using more microsatellite markers and identified highland Mexico and the Andes as potential sources of genetic diversity among the elite lines in modern maize breeding programs. By analyzing sequence variations at 21 loci on chromosome 1 among 25 individuals (16 exotic landraces and nine US inbred lines), Tenallion et al. (2001) found roughly one single-nucleotide polymorphism (SNP) per ~100 bp between two randomly chosen maize lines; this level of divergence is comparable to that between humans and chimpanzees (Buckler and Stevens 2006). Recently, Dao et al. (2014) reported substantial levels of SNP variation between local and exotic germplasms at the CGIAR (Consultative Group for International Agricultural Research) institutes.

Because amplified fragment length polymorphism (AFLP) detects numerous anonymous loci with relatively modest technical complexity, it has been used in numerous genetic studies since it was first reported (Vos et al. 1995). AFLP detects restriction site variations derived from base substitutions or insertion/deletion (indel) mutations. Transposable elements (TEs) are genetic entities that can create mutations by changing their positions within a genome (McClintock 1950). Two classes of TEs are recognized by their distinct transposition modes (Finnegen et al. 1989): class I retrotransposons transpose via a “copy-and-paste” mechanism, whereas class II DNA transposons transpose via a “cut-and-paste” mechanism. Both classes are present in all eukaryotic genomes, and together they constitute as much as 85 % of the maize genome (Schnable et al. 2009). In addition to being highly abundant in the maize genome, TEs also cause high levels of variation among maize lines or races. Wang and Dooner (2006) demonstrated remarkable haplotype diversity at the bronze locus in eight sets of inbred maize lines as a result of TE insertions. Because TEs are highly abundant and remain inert in eukaryotic genomes (Wicker et al. 2007) until a genome is challenged (Fedoroff and Bennetzen 2013), they have been utilized as molecular marker systems (Syed and Flavell 2006; Kalendar et al. 2011; Roy et al. 2015a, b).

We analyzed the genetic diversity among accessions or cultivars derived from southern Asia, northern China, and Canada using multi-allele detecting marker systems: AFLP and two TE-based molecular markers, sequence-specific amplified polymorphism (SSAP) and transposon display (TD).

Materials and methods

Plant materials and genomic DNA isolation

Seventy-eight corn accessions or hybrid varieties were used in this study. They consisted of 10 inbred lines obtained from Agriculture and Agri-Food Canada (Ottawa, Canada) and 68 hybrid varieties collected from commercial markets in China, Thailand, India, Vietnam, and Canada (Supplementary Table 1). Plant genomic DNA was extracted from pooled leaf tissue samples from five young plants using the DNeasy Plant Maxi Kit (Qiagen, USA).

Molecular marker analysis

The AFLP, CACTA-TD, and SSAP protocols, except for electrophoresis, followed Roy et al. (2015b). Primer sequences are shown in Table 1. Amplification products were separated electrophoretically on a gel system using a LI-COR 4300 sequencer according to the manufacturer’s protocol (LI-COR Biotech., Lincoln, USA).

Table 1 Adapters and primers used in AFLP, CACTA-TD and SSAP

Data analysis

Only distinct bands in the 200–500 bp range were scored, as either 1 for present or 0 for absent. Faint or orphan bands were not scored, to avoid recording uncertainties. The percentage of polymorphic loci, the observed and effective numbers of alleles, Nei’s gene diversity, and Shannon’s information index were calculated using POPGENE software version 1.31 (Yeh et al. 1999). The genetic diversity matrix data were processed using GenAlEx software version 6.5 (Peakall and Smouse 2005) for the principal coordinates analysis (PCoA) and the analysis of molecular variance (AMOVA). Similarity coefficients were calculated, and cluster analyses were performed, using NTSYS software version 2.1 (Exeter Software, Setauket, NY, USA). The effective multiplex ratio was calculated as the product of the total number of loci and the fraction of polymorphic loci, and the marker index was calculated as the product of the expected heterozygosity and the effective multiplex ratio (Powell et al. 1996; Nagaraju et al. 2001).
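The indices described above can be illustrated with a minimal Python sketch. It is not the POPGENE implementation: as a simplifying assumption it treats band presence and absence at each locus as two allelic states and uses the band frequency directly, whereas software for dominant markers typically estimates allele frequencies under Hardy-Weinberg assumptions. It also averages the expected heterozygosity over polymorphic loci only, which is one reading of Powell et al. (1996). The function names and the toy score matrix are hypothetical.

```python
import math

def band_frequencies(matrix):
    """Frequency of band presence (1) at each locus (column) of a 0/1 matrix."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]

def nei_diversity(p):
    """Nei's gene diversity for a two-state locus: h = 1 - p^2 - q^2."""
    return 1.0 - p * p - (1.0 - p) ** 2

def shannon_index(p):
    """Shannon's information index I = -(p ln p + q ln q), taking 0 ln 0 as 0."""
    return -sum(x * math.log(x) for x in (p, 1.0 - p) if x > 0)

def marker_summary(matrix):
    freqs = band_frequencies(matrix)
    total = len(freqs)
    # A locus is polymorphic if the band is neither fixed nor absent.
    poly = [p for p in freqs if 0.0 < p < 1.0]
    beta = len(poly) / total              # fraction of polymorphic loci
    emr = total * beta                    # effective multiplex ratio
    # Assumption: expected heterozygosity averaged over polymorphic loci.
    h_poly = sum(nei_diversity(p) for p in poly) / len(poly) if poly else 0.0
    return {
        "percent_polymorphic": 100.0 * beta,
        "effective_multiplex_ratio": emr,
        "mean_heterozygosity": h_poly,
        "mean_shannon": sum(shannon_index(p) for p in freqs) / total,
        "marker_index": h_poly * emr,
    }

# Hypothetical scores: 4 accessions (rows) x 4 loci (columns).
toy_scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
]
summary = marker_summary(toy_scores)
```

In this toy matrix the first locus is monomorphic, so three of the four loci contribute to the effective multiplex ratio and the marker index.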

Results

Fingerprinting of maize lines with AFLP and TE-based markers

Eight primer combinations were used for AFLP, SSAP, and CACTA-TD (Table 1). The numbers of polymorphic bands obtained were 178/232 by AFLP, 405/419 by SSAP, and 249/290 by CACTA-TD (Table 2). Thus, the percent polymorphism detected by each marker system was 67 % for AFLP, 91 % for SSAP, and 86 % for CACTA-TD. The average heterozygosity was 0.25 for SSAP, compared with 0.18 for AFLP and 0.16 for CACTA-TD, so SSAP provided the highest marker index. We did not find bands specific to inbred or hybrid cultivars from particular regions.

Table 2 Relative efficiency of molecular markers in determining polymorphism in analyzed maize population

Genetic diversity

DNA pooled from five plants in each line was used for genetic diversity analyses to detect the maximum number of alleles in each line. Of the three marker systems, SSAP showed the highest total gene diversity (Ht), as well as the highest gene diversity within populations (Hs), followed by AFLP and CACTA-TD (Table 3). AFLP and CACTA-TD mostly showed variation within populations rather than between populations. The coefficient of relative differentiation (Gst) was 0.08 for CACTA-TD and SSAP and 0.09 for AFLP. The gene flow estimates were 6.13 for SSAP, 5.68 for CACTA-TD, and 4.4 for AFLP. Higher genetic diversity within populations than between populations was also shown by hierarchical AMOVA; variation within populations was 84 % for AFLP, 86 % for CACTA-TD, and 89 % for SSAP (Table 4). Similar diversity measures (all indices) were found in maize populations from north Asia, south Asia, and North America (Table 3).
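The relationship between Ht, Hs, Gst, and the gene flow estimate can be sketched as follows. The sketch assumes the standard definition Gst = (Ht − Hs)/Ht and the commonly used estimator Nm = 0.5(1 − Gst)/Gst; the text does not state which estimator was applied, so this is an assumption. The input values are hypothetical and are not taken from Table 3. Because the Gst values reported above are rounded to two decimals, feeding them into this formula would not be expected to reproduce the reported gene flow estimates exactly.

```python
def gst(ht, hs):
    """Coefficient of relative differentiation: Gst = (Ht - Hs) / Ht."""
    if ht <= 0:
        raise ValueError("Ht must be positive")
    return (ht - hs) / ht

def gene_flow_nm(gst_value):
    """Gene flow estimate from Gst: Nm = 0.5 * (1 - Gst) / Gst (assumed estimator)."""
    if not 0 < gst_value <= 1:
        raise ValueError("Gst must lie in (0, 1]")
    return 0.5 * (1 - gst_value) / gst_value

# Hypothetical total (Ht) and within-population (Hs) gene diversities.
example_gst = gst(ht=0.30, hs=0.276)
example_nm = gene_flow_nm(example_gst)
```

With these inputs Gst is about 0.08 and Nm about 5.75; small changes in Gst near this range shift Nm noticeably, which is why low differentiation corresponds to high estimated gene flow.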

Table 3 Diversity measures of Zea mays populations by AFLP, CACTA-TD and SSAP
Table 4 AMOVA analysis for the partitioning of AFLP, CACTA-TD and SSAP variation in analyzed maize varieties among and within populations

Cluster analysis

Cluster analyses with all three marker systems showed that the maize lines from Asia grouped separately from the Canadian maize lines (Fig. 1). The AFLP-derived dendrogram separated the maize lines into three clusters at 75 % similarity: Asian maize lines, Canadian hybrid lines, and Canadian inbred lines. Two of the Canadian inbred lines were grouped with northern Chinese maize lines. CACTA-TD profiles divided the maize lines into two clusters at a similarity coefficient of 35 %: Asian maize lines and Canadian maize lines. One of the Canadian hybrid lines did not fall into either of the two large clusters. Within the Asian maize cluster, the southern and northern maize lines were not separated. Among the Canadian lines, the hybrid varieties were clearly separated from the inbred lines, and the hybrid maize lines had higher similarity coefficients than the inbred maize lines. The SSAP dendrogram distributed the genotypes into two major clusters at a similarity coefficient of 56 %. Major cluster 1 was further divided into two sub-clusters: sub-cluster 1 comprised all of the Asian lines except a single Chinese accession (Si Da 204), which fell into sub-cluster 2 together with the Canadian lines. Major cluster 2 contained only three Canadian inbred accessions (CO416, CO423, CO428).

Fig. 1

Cluster dendrogram of 78 maize accessions based on AFLP markers (a), CACTA-TD markers (b), and SSAP markers (c)
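Similarity-based clustering of band profiles, as summarized in these dendrograms, can be sketched in Python as follows. The text does not state which similarity coefficient or clustering algorithm was used, so this sketch assumes the Jaccard coefficient and average-linkage (UPGMA) clustering, both of which are common choices for presence/absence marker data. The accession names and band profiles are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity: shared bands / bands present in either profile."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0

def average_similarity(c1, c2, profiles):
    """Mean pairwise similarity between the members of two clusters (UPGMA)."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(jaccard(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

def upgma(profiles):
    """Return the merge history as (cluster_1, cluster_2, similarity) tuples."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Join the two clusters with the highest average similarity.
        c1, c2 = max(
            combinations(clusters, 2),
            key=lambda pair: average_similarity(pair[0], pair[1], profiles),
        )
        merges.append((c1, c2, average_similarity(c1, c2, profiles)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical band profiles (1 = band present, 0 = band absent).
toy_profiles = {
    "asia_1":   [1, 1, 0, 1, 0, 1],
    "asia_2":   [1, 1, 0, 1, 0, 0],
    "canada_1": [0, 1, 1, 1, 1, 1],
    "canada_2": [0, 0, 1, 1, 1, 1],
}
merges = upgma(toy_profiles)
```

The similarity recorded at each merge corresponds to the level at which two branches join in a dendrogram; in this toy data the two regional pairs join first and the two regions join last, at a much lower similarity.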

PCoA was performed to determine the relationships among maize genotypes with regard to their positions on two coordinate axes. The first and second coordinates accounted for 11.41 and 6.53 % of the variation (cumulative variation of 17.94 %) with AFLP, 10.43 and 6.32 % (cumulative variation of 16.74 %) with TD, and 8.50 and 4.98 % (cumulative variation of 13.49 %) with SSAP. PCoA was based on Nei’s distances and confirmed the division of the corn lines into two major groups, Asian and Canadian (Fig. 2). All of the marker systems revealed that the north and south Asian lines clustered into a single group, whereas the Canadian lines clustered into separate groups, with all of the hybrid Canadian lines in one group (light blue boxes in Fig. 2) and the inbred Canadian lines in another (dark blue boxes in Fig. 2).

Fig. 2

Two-dimensional PCoA plots based on AFLP, TD and SSAP genetic distance data produced using GenAlEx v6.5 software

Discussion

Maize has been adapted to diverse environments ranging from low latitudes in tropical regions to high latitudes in Canada. In maize breeding programs, securing enough inbred lines is important because crosses between genetically unrelated inbred lines are better in terms of recombination than crosses between hybrids derived from similar crosses (Barata and Carena 2006; Phumichai et al. 2008; Reid et al. 2011). We collected commercial hybrid varieties from southern Asia, China, and Canada with the ultimate aim of creating new genetically stable inbred lines from them. This study investigated the genetic diversity and population structure among 78 diverse maize lines, which will be utilized in future breeding programs, using multi-band producing marker systems, namely AFLP and TE-derived marker systems.

The results of this study confirmed that there is significant genetic variation among the maize lines analyzed. The molecular marker systems employed (AFLP, SSAP, and CACTA-TD) clearly discriminated between the geographically diverse maize lines. The polymorphic information content (PIC) values of 0.21 with AFLP and 0.36 with SSAP demonstrated good marker discriminatory power, suggesting considerable variation among these markers. Similar AFLP and SSAP PIC values were reported for genetic diversity studies of dent, waxy, and sweet corns grown in Korea (Roy et al. 2015b). These values also agree with studies of other crops using TD (Kwon et al. 2005; Hirano et al. 2011; Lee et al. 2012) and SSAP (Porceddu et al. 2002; Lou and Chen 2007; Sanz et al. 2007). Our molecular marker data indicated that the overall genetic diversity (Ht) was high among all of the maize lines. SSAP provided higher polymorphism and marker indices of gene diversity, which is congruent with studies of other crops, such as tomato and pepper (Tam et al. 2005) and durum wheat (Mardi et al. 2011), and of maize (Roy et al. 2015b). TEs comprise 85 % of the whole maize genome (Baucom et al. 2009). In this study, we analyzed the Copia-type retrotransposons and the Ji and Opie Sirevirus elements, which are the most abundant retrotransposon sub-families in the maize genome (Sanmiguel and Vitte 2009); this abundance likely explains the higher polymorphism we found with the retrotransposon-based molecular marker system.

We used pooled DNA samples to detect most of the alleles in the heterogeneous hybrid lines. The advantages and disadvantages of pooled DNA sampling analysis have been discussed by Michelmore et al. (1991) and Loarce et al. (1996). Pooled DNA sampling is quick and saves labor, but information on individual genotypes, which is necessary for estimating the genetic structure and genetic variability within populations, is lost. The population structure in our study was found to be geographically restricted. The dendrograms created from similarity coefficients grouped all of the lines into two major geographical divisions. PCoA supported the dendrograms and separated individuals into two distinct groups, Canadian lines and Asian lines.

Maize has been cultivated in a wide range of habitats, from low-latitude tropical countries to high-latitude Canada (Prasanna 2012). Early maturation is an important characteristic of the maize lines grown in Canada and in northern China; however, there was no clear clustering of maize lines according to latitude. Rather, the country of origin was more prominent in the clustering patterns, as the maize lines from Asia were separated from the Canadian lines in all three marker systems. This implies that, in addition to the duration of the growing season, selection for local maize lines is complicated by other factors, including day length, pests, soil, and regional tastes.

Maize breeding relies heavily on hybrid vigor, which is obtained by crossing diverse inbred lines to introduce genetic diversity and maximize recombination. Therefore, understanding the molecular diversity among introduced maize lines is required to design vigorous hybrid crosses. We surveyed the genetic diversity and relationships among maize lines derived from southern Asia, northern Asia, and Canada. AFLP and TE-based marker systems separated Asian maize lines from Canadian lines. Inbred lines are currently being created from commercial hybrids by successive self-pollinations and by producing doubled haploids (Prigge and Melchinger 2012). Because flowering time is important for successful crossing, our molecular data will likely be useful in designing crosses between northern Chinese inbred lines and Canadian inbred lines.