Introduction

Wheat (Triticum aestivum L.) is one of the most important food crops worldwide, feeding about 40% of the world’s population (Gupta et al. 2008). Accordingly, selection for high grain yield is an important goal of wheat breeding programmes. Seedling emergence rate (SER) and tillering are the most important agronomic traits affecting biomass and grain yield potential in wheat. SER is an important criterion for judging seed quality, and it reflects seedling vigour and growth, which indirectly affect yield. Several studies on the relationship between high seed vigour and yield improvement have been reported in rice (Ismail et al. 2012), maize (Cervantes-Ortiz et al. 2007) and wheat (Botwright et al. 2002; Ludwig and Asseng 2010). Tillering is an important agronomic character in wheat, and tiller number changes over the course of development. For winter wheat, tiller numbers increase throughout the autumn until tiller growth stops when the mean temperature reaches 0°C, at which point the maximum tillering of prewinter (MTW) is reached. In the following spring, a great number of new tillers are again formed when the mean temperature reaches 10°C. At the jointing stage, the overall number of tillers reaches its maximum, which is called maximum tillering in spring (MTS). However, only a proportion of tillers survive to contribute to the final number of tillers; that is, the effective tillering at harvest (ETH) depends on both initial tiller growth and ultimate tiller survival (Hodges 1991; Evers et al. 2006). MTW, MTS and ETH are important traits that determine shoot architecture across different wheat growth stages (Li et al. 2010). The architecture of the shoot system affects a plant’s light harvesting potential, the synchrony of flowering and seed set and, ultimately, its reproductive success (Yang et al. 2006; Kuraparthy et al. 2007). The rate of the available tillering (RAT) is a trait derived from ETH and MTS, which reflects potential wheat yields. Hence, the genetic elucidation of seedling emergence and tiller number at different stages of growth has become a focus of wheat genetics and breeding research programmes.

At present, quantitative trait loci (QTL) mapping and genomewide association studies (GWAS) are the two main approaches to dissect the genetic bases of complex traits (Risch and Merikangas 1996). Many wheat QTL mapping studies have been performed on traits associated with seed vigour and tillering (Rebetzke et al. 2005; Spielmeyer et al. 2007; Landjeva et al. 2010; Li et al. 2010). However, in all these studies, although QTL mapping was successful in detecting QTL, the genetic variation explored in the mapping population is restricted to the genomes of only the two parents, and the genetic markers for detected QTL are not necessarily transferable to other populations.

As a complement to QTL mapping, GWAS is a high-resolution and cost-effective method for genetic mapping that uses existing germplasm (such as landraces, elite cultivars and advanced breeding lines) and is based on linkage disequilibrium (LD) (Zhu et al. 2008; Waugh et al. 2009; Ingvarsson and Nathaniel 2011). As genomics has rapidly developed and genotyping costs have dramatically decreased, association mapping has become a promising approach for the genetic dissection of complex agronomic and crop quality traits in cereal crops such as rice, barley and maize (Yu and Buckler 2006; Cockram et al. 2010; Huang et al. 2012).

To achieve high-resolution association analysis, a large number of molecular markers is preferred for association mapping. Single-nucleotide polymorphisms (SNPs) are the most promising molecular markers, and they have two major advantages over other molecular markers: (i) SNPs are the most abundant form of genetic variation within genomes (Zhu et al. 2003); and (ii) a wide array of technologies has now been developed for high-throughput genotyping (Fan et al. 2006). These factors combined make SNPs ideal for GWAS using large samples and high-density markers, which greatly improves the resolution of association mapping. At present, GWAS with SNPs has been widely used to characterize the genetics of many animals and plants, such as humans (Eric et al. 2011), Drosophila (Gerrard et al. 2013), rice (Huang et al. 2010) and maize (Wilson et al. 2004). However, GWAS using SNPs in wheat is a challenge because of wheat’s complex genomic architecture and incomplete genome sequence (Sukumaran and Yu 2014). While a few GWAS in wheat have identified SNP markers for grain yield (Sukumaran et al. 2014) and heading date (Zanke et al. 2014), a GWAS examining the association of SER and tiller number with SNP markers has not yet been reported.

As such, the present study is a GWAS examining SER and tiller number in different growth stages and the association of these traits with 24,355 SNPs genotyped using the 90K Illumina iSelect array (Wang et al. 2014) in a diverse population of winter wheat. The objectives of this study were to identify molecular markers associated with SER and tiller number in different growth stages and explore a set of elite alleles influencing SER and RAT. The ultimate aim of this research was to facilitate molecular breeding and identify strategic combinations of traits in winter wheat.

Materials and methods

Plant material

To ensure a broadly representative sampling of wheat accessions, 182 elite varieties, developed largely from the 1980s to the 2010s, and 150 breeding lines, including two varieties introduced from countries other than China, were selected. A representative collection of 205 varieties and lines was then chosen by discarding accessions with similar pedigrees and yield-related phenotypes, and these 205 accessions were used for the present GWAS. The accessions comprised 77 bred varieties, 55 founder parents and 73 breeder’s lines from 10 provinces that are major winter wheat regions in China, as well as two additional lines, used as founder parents, from Mexico and France (see table 1 of electronic supplementary material at http://www.ias.ac.in/jgenet/). This association mapping panel expressed substantial phenotypic variation in the main wheat agronomic characters, including tiller number at different stages of growth.

Field trials and phenotypic trait evaluation

Field trials were conducted in the 2013–2014 and 2014–2015 growing seasons at the experimental fields of Shandong Agricultural University, Tai’an (116°36′E, 36°57′N) and in the 2014–2015 growing season at the Dezhou Institute of Agricultural Sciences, Dezhou (116°29′E, 37°45′N). The experiment followed a completely randomized block design with two replications at each location. Each plot consisted of three rows of wheat plants: two rows grown for the purpose of evaluating SER with a distance of 2.5 cm between plants, and one row for evaluating MTW, MTS and ETH with a distance of 10 cm between plants. Each row was 2 m in length, and the distance between adjacent rows was 25 cm. The field management followed the standard local practices. RAT was calculated as RAT = ETH/MTS. Trait measurements (i.e. MTW, MTS, ETH and RAT) were estimated from averages of five plants within each plot for subsequent statistical analyses.
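As a minimal illustration of the derived trait, the sketch below computes RAT = ETH/MTS for a single plot. It assumes that RAT is taken as the ratio of the plot means of the five sampled plants; the text does not state whether the ratio was formed before or after averaging. The function name and the tiller counts are hypothetical.

```python
from statistics import mean


def plot_rat(eth_counts, mts_counts):
    """Return RAT = ETH / MTS for one plot, using per-plot averages.

    Assumption: RAT is the ratio of the plot means, not the mean of
    per-plant ratios.
    """
    eth = mean(eth_counts)
    mts = mean(mts_counts)
    if mts <= 0:
        raise ValueError("MTS must be positive to compute RAT")
    return eth / mts


# Hypothetical tiller counts for five plants in one plot.
eth_plants = [4, 5, 3, 4, 4]      # effective tillers at harvest
mts_plants = [10, 12, 9, 11, 8]   # maximum tillers in spring
print(f"RAT = {plot_rat(eth_plants, mts_plants):.2f}")
```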

Analysis of variance (ANOVA) and correlations among phenotypic traits were performed using the statistical software SPSS ver. 17.0 (SPSS, Chicago, USA). Broad-sense heritability (\(h_{B}^{\mathrm{2}}\)) was calculated as

$$h_{B}^{\mathrm{2}} = \sigma_{g}^{\mathrm{2}}/(\sigma_{g}^{\mathrm{2}} + \sigma_{e}^{\mathrm{2}}), $$

where \(\sigma _{g}^{\mathrm {2}}\) is the between-group variance and \(\sigma _{e}^{\mathrm {2}}\) is the within-group variance (Li et al. 2009). The estimates of \(\sigma _{g}^{\mathrm {2}}\) and \(\sigma _{e}^{\mathrm {2}}\) were obtained from the variance estimates included in the ANOVA.
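The heritability formula can be illustrated with a short sketch. It assumes a balanced one-way layout (each genotype is a group with the same number of replicate values) and the usual method-of-moments estimates, with the within-group variance taken as the within-group mean square and the between-group variance as (MS between − MS within)/r. The text does not say exactly how the components were extracted from the SPSS ANOVA, so this is an assumption rather than the authors' procedure; the function name and data are hypothetical.

```python
from statistics import mean


def broad_sense_heritability(groups):
    """Estimate h_B^2 = var_g / (var_g + var_e) from a balanced one-way layout.

    groups: list of lists, one inner list of replicate values per genotype.
    """
    g = len(groups)
    r = len(groups[0]) if groups else 0
    if g < 2 or r < 2 or any(len(x) != r for x in groups):
        raise ValueError("need >= 2 genotypes, each with the same >= 2 replicates")

    grand = mean(v for x in groups for v in x)
    group_means = [mean(x) for x in groups]

    # Mean squares of a one-way ANOVA.
    ms_between = r * sum((m - grand) ** 2 for m in group_means) / (g - 1)
    ms_within = sum(
        (v - m) ** 2 for x, m in zip(groups, group_means) for v in x
    ) / (g * (r - 1))

    var_e = ms_within
    var_g = max((ms_between - ms_within) / r, 0.0)  # truncate negative estimates
    total = var_g + var_e
    if total == 0:
        raise ValueError("no variation in the data")
    return var_g / total


# Hypothetical trait values: three genotypes, two replicates each.
data = [[10, 12], [20, 22], [30, 32]]
print(f"h2 = {broad_sense_heritability(data):.3f}")
```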

DNA extraction and SNP genotyping

DNA was extracted from young leaf tissues of each variety using the protocol recommended by Triticarte (http://www.triticarte.com.au). DNA quality was checked by electrophoresis on 0.8% agarose gels, and DNA concentration was determined with a NanoDrop ND-1000 UV-Vis spectrophotometer (NanoDrop Technologies, Wilmington, USA).

SNP genotyping was performed at the University of California, Davis Genome Center using the novel 90K iSelect wheat chip (Wang et al. 2014), which consists of 81,587 SNP loci distributed across all 21 wheat chromosomes. The SNP data were clustered and called automatically using the Illumina BeadStudio genotyping software (Illumina, San Diego, USA) and were filtered to remove alleles with a detection rate less than 0.8 and/or a minor allele frequency (MAF) less than 0.05 (Li et al. 2014a).
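The filtering rule described above (remove a SNP if its detection rate is below 0.8 and/or its MAF is below 0.05) can be sketched as follows. The data layout is an assumption made for illustration: each SNP is a list with one allele call per accession and `None` for a missing call, and markers are treated as biallelic. The function and marker names are hypothetical.

```python
from collections import Counter


def call_rate_and_maf(calls):
    """Return (call rate, minor allele frequency) for one biallelic SNP."""
    if not calls:
        raise ValueError("no genotype calls supplied")
    called = [c for c in calls if c is not None]
    rate = len(called) / len(calls)
    counts = Counter(called)
    if len(counts) < 2:
        return rate, 0.0  # monomorphic or entirely missing
    maf = min(counts.values()) / len(called)
    return rate, maf


def keep_snp(calls, min_call_rate=0.8, min_maf=0.05):
    """True if the SNP passes both thresholds used in this study."""
    rate, maf = call_rate_and_maf(calls)
    return rate >= min_call_rate and maf >= min_maf


# Hypothetical markers: one passes, one is too rare, one has too many missing calls.
snps = {
    "snp_ok": ["A"] * 6 + ["G"] * 4,
    "snp_rare": ["A"] * 39 + ["G"],
    "snp_missing": ["A"] * 4 + ["G"] * 3 + [None] * 3,
}
kept = [name for name, calls in snps.items() if keep_snp(calls)]
print(kept)
```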

Genetic diversity and composite genetic map

Allele number, MAF and the polymorphism information content (PIC) value for each locus were calculated with the PowerMarker program v3.25 (Liu and Muse 2005). A composite genetic map for the association panel was built for the GWAS based on the published genetic map (Wang et al. 2014).
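For readers unfamiliar with PIC, the sketch below uses the commonly cited definition, PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i. It is assumed here that PowerMarker reports this quantity; the program's exact implementation is not shown in the text. Under this definition a biallelic marker cannot exceed 0.375, reached when both alleles have frequency 0.5. The function name is hypothetical.

```python
def pic(freqs):
    """Polymorphism information content from a list of allele frequencies."""
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be non-negative and sum to 1")
    n = len(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1 - homozygosity - cross_term


# Biallelic SNPs are fully described by their minor allele frequency.
for maf in (0.05, 0.20, 0.50):
    print(f"MAF = {maf:.2f}  PIC = {pic([1 - maf, maf]):.4f}")
```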

Population structure

Population structure analysis for the association panel was performed using the STRUCTURE v2.2 software (Pritchard et al. 2000) with 3297 SNP markers positioned at least 2 cM apart in the genome. STRUCTURE was run 10 times for each subpopulation value K, ranging from 1 to 15, using the admixture model with 100,000 iterations for burn-in and 50,000 iterations for the analysis itself. The K value was determined by using estimates of ln P (D) from the STRUCTURE output and an ad hoc statistic ΔK based on the rate of change in ln P (D) between successive K values (Evanno et al. 2005).
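The ΔK criterion can be sketched as below. This version divides the absolute second-order difference of the mean ln P(D) values by the standard deviation of ln P(D) across replicate runs at K, which is how the statistic is often computed in practice; whether the second difference is averaged before or after taking the absolute value is an assumption here. The function name and the ln P(D) values are hypothetical, not results from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_runs):
    """Compute Delta K for every K that has neighbours K-1 and K+1.

    lnp_runs: dict mapping K -> list of ln P(D) values from replicate runs.
    """
    avg = {k: mean(v) for k, v in lnp_runs.items()}
    result = {}
    for k in sorted(lnp_runs):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # Delta K is undefined at the ends of the K range
        sd = stdev(lnp_runs[k])
        if sd == 0:
            continue  # avoid division by zero when runs are identical
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical ln P(D) values from two replicate runs per K.
runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-880.0, -882.0],
    4: [-875.0, -877.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

The K with the largest ΔK is taken as the most likely number of subgroups, which is how K = 4 is selected later in the Results.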

Genomewide association analysis

Significant marker-trait associations (MTAs) were identified using a general linear model (GLM) in TASSEL 3.0 (http://www.maizegenetics.net/). The population structure (Q matrix) was inferred using STRUCTURE 2.2. The P value determined whether a marker was associated with a QTL, and the \(R^{2}\) value was used to evaluate the magnitude of the QTL effect. A false discovery rate (FDR) procedure was used to adjust for multiple testing across the SNP loci tested. The individual P values from the GWAS were sorted from smallest to largest, and each was multiplied by a coefficient equal to the number of SNP loci tested in the GWAS divided by the rank of that P value, giving the corrected P value for each SNP locus. SNPs with corrected P values ≤ 0.05 were considered to be in significant association with phenotypic traits.
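The correction described above matches the Benjamini-Hochberg form of FDR adjustment, in which each sorted P value is multiplied by m/rank. A minimal sketch follows. The running minimum taken from the largest rank downwards, which keeps the adjusted values monotone and at most 1, is a standard refinement and is not mentioned in the text, so it should be read as an assumption. The function name and the P values are hypothetical.

```python
def fdr_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values, sorted from smallest to largest P.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four SNPs.
raw = [0.5, 0.001, 0.03, 0.02]
adj = fdr_adjust(raw)
significant = [i for i, p in enumerate(adj) if p <= 0.05]
print(adj, significant)
```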

Results

Phenotypic performance

The measurements of the five traits occurring in the three growth stages for the association panel are listed in table 1. Extensive phenotypic variation for these five traits was observed in three environments (i.e. two growing seasons, the second of which was split between two locations) among the 205 winter wheat accessions. Broad-sense heritability of the five traits ranged from 61.78% (MTW) to 74.5% (SER), indicating that both genetic and environmental factors played roles in the expression of these measured traits. The absolute values of both skewness and kurtosis were less than 1.0, indicating approximately normal distributions, which is consistent with these quantitative traits being determined by many alleles of small individual effect.

Table 1 Phenotypic summary of the five traits in the association panel of three environments.

The correlations between SER and tiller number are shown in table 2. SER exhibited a significantly positive correlation with MTS, ETH and RAT. MTS exhibited a significantly positive correlation with ETH, but a significantly negative correlation with RAT. Additionally, ETH exhibited a significantly positive correlation with RAT.

Table 2 Correlation analysis between SER and tiller number.

Genetic diversity and the composite genetic map

A total of 32,432 of the 81,587 SNPs assayed with the functional iSelect bead chip were polymorphic among the 205 winter wheat accessions, giving a polymorphism rate of 39.75%. After SNPs with detection rates below 0.8 and/or MAF less than 0.05 were excluded, 24,355 SNPs that mapped to the 21 wheat chromosomes were used for the MTA analysis.

A total of 48,710 alleles were detected (i.e. two alleles per SNP locus). About 59.9% of the SNPs (14,597/24,355) exhibited MAF exceeding 0.2, which were then selected as markers with normal allele frequencies, and 9.60% of the SNPs (2337/24,355) exhibited almost equal allele frequencies (i.e. MAFs close to 0.5). The overall SNP diversity was expressed by a PIC value. The PIC values estimated for 24,355 SNP markers ranged from 0.05 to 0.38 with a mean of 0.27, and about 69% of SNPs (16,757/24,355) had PIC values over 0.20.

The number of markers and map lengths of the composite genetic map are shown in table 3. The total length of the map was 3674.16 cM, with a mean genetic distance of 0.15 cM between markers. Chromosome 1B contained the highest number of markers (n=2390), followed by 5B (n=2187), whereas chromosome 4D had very few loci (n=78). Among the A, B and D genomes, the B genome contained the most loci (n=12,321) with a total length of 1150.47 cM, followed by the A genome (n=9523) with a length of 1252.51 cM, and the D genome (n=2511) with a length of 1271.18 cM.

Table 3 Information of SNP in the composite linkage map.

Population structure

The potential existence of population structure among the 205 winter wheat accessions was examined using the STRUCTURE program. The ΔK value was plotted against the number of hypothetical subgroups K, with the highest ΔK observed when K = 4 (figure 1), indicating that four subgroups exist in the association panel used in this study (figure 2). Q values of the 205 accessions in the four groups are shown in table 2 of electronic supplementary material. Using the maximum membership probability in STRUCTURE, subgroup 1 (43 varieties), subgroup 2 (32 varieties) and subgroup 3 (105 varieties) were dominated by varieties from Henan province, Hebei province and Shandong province, respectively, and subgroup 4 (25 varieties) was dominated by those from three provinces (Shanxi, Jiangsu and Ningxia). The clustering analysis based on the genotypic data generally corresponded to known pedigrees, and varieties that are closely related by pedigree usually clustered together.

Figure 1 Rate of change in the log probability of data between successive K values (ΔK).

Figure 2 Population structure of the association panel based on SNP markers.

MTA

In this study, 24,355 SNPs that mapped to the 21 wheat chromosomes were used for the MTA analysis. After FDR correction for multiple testing (corrected P ≤ 0.05), 31 significant MTAs were detected for the five traits in the three environments. These SNPs were located on 12 chromosomes and each accounted for 5.35–11.33% of the phenotypic variance explained (PVE). Detailed information on the significant markers for each trait is shown in table 4. Manhattan plots of the GWAS results are shown in figure 3.

Table 4 Summary of marker–trait associations detected for the five traits in three environments.

Figure 3 Manhattan plots of GWAS for the five traits with the general linear model. (a), (b), (c), (d) and (e) indicate Manhattan plots for SER, MTW, MTS, ETH and RAT, respectively; E1, E2 and E3 indicate 2013 Tai’an, 2014 Tai’an and 2014 Dezhou, respectively.

Four MTAs for SER were detected on chromosomes 5A, 7A and 7B and explained 5.51–8.18% of PVE. Of these, the marker Kukri_c28160_2017 on chromosome 7B had the highest \(R^{2}\) (8.18%) and was detected as significant in two environments (figure 3a).

We detected 10 MTAs for MTW on chromosomes 1B, 2B, 3A, 3D and 5B, explaining 7.92–10.14% of PVE. The marker GENE-0592_268 on chromosome 2B exhibited a significant MTA in two environments. Three markers, Excalibur_c25522_755, Excalibur_c25522_791 and Excalibur_c60554_394, located at effectively the same position on chromosome 5B (144 cM), had the highest \(R^{2}\) (10.14%) and the smallest P value (figure 3b).

Five MTAs on chromosomes 1B, 2B and 6B were identified for MTS, which explained 5.35 to 8.38% of the phenotypic variance. The markers Excalibur_c40993_1129 on chromosome 1B, Kukri_c47259_416 on chromosome 2B and RAC875_c14309_317 on chromosome 6B each showed a significant MTA in two environments (figure 3c).

For ETH, seven MTAs on chromosomes 1B, 2A, 2B, 3A, 4B and 7A were found to account for 5.94 to 9.08% of PVE, of which the markers Jagger_c6853_60 on chromosome 2B and wsnp_Ex_c14654_22713386 on chromosome 7A exhibited significant MTAs in two environments (figure 3d).

For RAT, five MTAs on chromosomes 1B, 4A, 5B, 7A and 7B were detected, which explained 6.15–11.33% of PVE. The markers wsnp_Ex_rep_c69766_68723140 on chromosome 1B, BS00073989_51 on chromosome 7A and IAAV3713 on chromosome 7B exhibited significant MTAs in two environments (figure 3e).

Overall, 10 MTAs were consistently detected in two environments and were therefore regarded as relatively stable loci controlling these traits. The highest number of stable loci was found for MTS and RAT, with three MTAs each, followed by ETH with two MTAs. Only one stable MTA each was detected for SER and MTW. Further, most of the MTAs were detected in just a single environment, indicating that SER and tiller-related traits were substantially affected by environmental factors.

Elite allele exploration

The genetic bases of high SER and RAT values have always been a research focus, but it has been a difficult area for breeders. In this study, to explore the elite alleles of these traits, MTAs for SER and RAT were further analysed through a comparison of phenotypic effects between alleles at each locus. Phenotypic effects of alleles for the loci influencing SER and RAT are given in table 5.

Table 5 Phenotypic effect of alleles for the loci of SER and RAT.

Three of the four MTAs for SER exhibited phenotypic differences between alleles that reached a significant level (P<0.01). Of these, alleles T and G of the marker Ra_c14761_1348 were associated with the largest phenotypic difference (13.35%). The phenotypic value of SER with Ra_c14761_1348-T was significantly higher than that with Ra_c14761_1348-G across all three environments, which indicates that Ra_c14761_1348-T is an elite allele when compared with Ra_c14761_1348-G. The alleles wsnp_Ex_c4666_8349206-G and wsnp_Ex_rep_c107017_90850230-T were the two other elite alleles identified in this study.

Four of the five MTAs for RAT exhibited significant phenotypic differences between alleles (P<0.01). Four alleles, wsnp_Ex_rep_c69766_68723140-T, BobWhite_c8436_391-T, Excalibur_c11045_236-A and IAAV3713-G, were elite relative to the alternative allele at each locus, increasing RAT by 4.23 to 14.78%. Excalibur_c11045_236-A had the largest effect, increasing RAT by 14.78%, followed by BobWhite_c8436_391-T (8.47%) and IAAV3713-G (4.23%). The other marker, BS00073989_51, did not exhibit a significant phenotypic difference between its alleles.
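The allele comparison used in this section can be sketched as a two-group contrast at a single locus. The text does not state which test produced the reported P values, nor whether the percentage differences are absolute or relative, so the sketch below reports both kinds of difference together with a Welch t statistic as one plausible choice. The function name and the phenotype values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def allele_effect(values_a, values_b):
    """Compare phenotype values of accessions carrying allele A versus allele B."""
    if len(values_a) < 2 or len(values_b) < 2:
        raise ValueError("each allele group needs at least two accessions")
    mean_a, mean_b = mean(values_a), mean(values_b)
    diff = mean_a - mean_b
    # Standard error of the difference without assuming equal variances (Welch).
    se = sqrt(variance(values_a) / len(values_a) + variance(values_b) / len(values_b))
    return {
        "mean_a": mean_a,
        "mean_b": mean_b,
        "difference": diff,
        "relative_pct": 100 * diff / mean_b if mean_b != 0 else float("nan"),
        "welch_t": diff / se if se > 0 else float("nan"),
    }


# Hypothetical SER values (%) for accessions carrying each allele at one SNP.
ser_allele_t = [80, 82, 84]
ser_allele_g = [70, 72, 74]
effect = allele_effect(ser_allele_t, ser_allele_g)
print(effect)
```

An allele would be called elite in this sense when its group mean is higher and the contrast is significant in the tested environments.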

Discussion

Rapid advancements in sequencing and genotyping technologies over the last decade have enabled the development of SNP arrays even for polyploid crops such as wheat (Edwards et al. 2012). GWAS using SNPs is a promising avenue for future research that combines routinely generated phenotypic data with high-density markers to elucidate the genetic architecture of agronomic traits. Here, we used the currently available 90K SNP array to genotype a mapping population panel, from which we performed GWAS for SER and tiller number and investigated the advantages of this approach.

SNP markers for GWAS in wheat

High-density marker coverage of the genome and reliable genotype data are the preconditions for GWAS. SNP markers are the most common type of sequence variation in the genome (Rafalski 2002), and they are consequently well-suited for genomics approaches that require a high number of markers such as association mapping (Myles et al. 2009). High-throughput SNP genotyping platforms have been available for diploid crops like rice (Huang et al. 2010) or maize (Yang et al. 2006) for some time, but not for polyploid species. For wheat, SNP discovery and detection have both progressed much more slowly for several reasons: (i) the lack of a reference genome sequence, (ii) the availability of gene sequences from only a few genotypes, and (iii) the highly repetitive and duplicated nature of the wheat allopolyploid genome (Edwards and Batley 2010). Consequently, genotypic data in wheat have so far been obtained with traditional markers like AFLPs and SSRs (Zhang et al. 2013), which are laborious and time consuming to use. SNP arrays constitute a major breakthrough in wheat genotyping, enabling the cost-effective, large-scale acquisition of genotypic data in hexaploid wheat (Akhunov et al. 2009). Recently, the wheat 90K iSelect SNP genotyping array has made GWAS possible in wheat. In the present study, we used the currently available 90K SNP array developed by Wang et al. (2014) to genotype a panel of 205 diverse Chinese winter wheat lines. The very high number of SNPs on the 90K array revealed clear genetic differences among the 205 wheat accessions: 32,432 SNPs were polymorphic. Of these 32,432 SNPs, a total of 24,355 mapped to the 21 wheat chromosomes and were used for MTA analysis. The number of markers in our dataset (24,355 SNPs) was larger than that of similar association mapping studies that have been carried out in wheat so far (e.g., Somers et al. 2007; Zhang et al. 2011; Le et al. 2012). A high number of markers is required for genomic studies like GWAS to increase mapping resolution. In the near future, when the complete genome sequence is available for wheat, GWA mapping will be facilitated by SNP markers with known physical locations in the genome, and this can facilitate the discovery of major genes that control the traits under investigation, as has already occurred in rice (Famoso et al. 2011) and maize (Setter et al. 2011).

Population structure analysis

Different levels of population structure in Chinese winter wheat have been reported (Chen et al. 2012). In our study, using the maximum membership probability in STRUCTURE, the association population was classified into four groups (comprising 43, 32, 105 and 25 varieties, respectively) (figure 2). Individuals were unequally distributed among the four subgroups, which indicates that there was a high extent of population structure in our mapping population. Some elite lines are typically used in many crosses and thus contribute more, genetically, to an association panel than other lines that have been used in breeding populations (Würschum et al. 2011). This is also likely to be an important cause of the subpopulation differentiation observed in our mapping population, because our panel included many bred varieties and breeder’s lines that shared one or more founder parents in wheat breeding programmes. Consequently, a correction for population structure might be required to enhance analyses of this dataset. Moreover, the population structure analysis was used to explore the relationships among varieties in the association mapping panel, which is important not only for parental selection but also for breeding system design.

MTA analysis

Seedling emergence is one of the most important indices of seed vigour in agricultural production. Earlier studies have reported QTLs for seed vigour on at least 13 chromosomes, including 1A, 1B, 1D, 2A, 2D, 3A, 3B, 4B, 5A, 5B, 5D, 7A and 7D (Rebetzke et al. 2001; Landjeva et al. 2010; Liu et al. 2013; Xiao et al. 2013; Li et al. 2014b). In the present study, four MTAs were detected for SER on chromosomes 5A, 7A and 7B. Comparison with these earlier reports shows that chromosomes 5A and 7A, identified both here and previously, carry important loci controlling seed vigour.

Earlier studies have detected QTLs for tillering on almost all wheat chromosomes using the traditional QTL mapping approach (Kato et al. 2000; Kuraparthy et al. 2007; Li et al. 2010; Deng et al. 2011). Compared with previous reports, the marker Kukri_c47259_416 on chromosome 2B associated with MTS was located in a similar position to a significant SSR marker reported by Li et al. (2010), who used a biparental population. Marker BS00097939_51 on chromosome 3A in this study was a pleiotropic locus, explaining about 9.60 and 6.01% of PVE in MTW and ETH, respectively. Kato et al. (2000) and Kuraparthy et al. (2007) also detected a QTL for tiller number on chromosome 3A with significant effects. Deng et al. (2011) detected a QTL for ETH on chromosome 4B. Yang et al. (2013) also detected a pleiotropic QTL on chromosome 4B associated with both ETH and MTS. At a similar location, the marker Tdurum_contig4974_355 was associated with ETH in this study. This indicates that association mapping is a viable alternative to QTL mapping for dissecting quantitative traits, such as tillering-related traits, that are influenced by environmental effects. Further, we found some QTLs with significant effects on tiller number that were not reported by previous studies. For example, the major marker Tdurum_contig13879_919 on chromosome 1B was detected in different growth stages and explained 7.98–9.69% of PVE, and thus may play an important role in wheat tiller development. Whether these QTLs correspond to new loci detected using GWAS remains to be confirmed.

The expression of a stable MTA is less affected by the environment. In general, an MTA that is consistent across environments is of great value for marker-assisted selection in breeding varieties adapted to various ecological environments. Therefore, the three stable MTAs for MTS, three for RAT, two for ETH, and one each for SER and MTW detected in two environments in the present study should be used to develop cleaved amplified polymorphic sequence (CAPS) markers for marker-assisted selection and to begin mining SER and tillering genes through bioinformatic analyses.

Elite allele analysis

Identifying useful alleles by association analysis has become a practical strategy for plant genomics research in recent years. In the present study, we established a link between genotypes and SER and RAT phenotypes by analysing differences in phenotypic values between different alleles, and identified elite alleles for SER and RAT. For example, the allele Ra_c14761_1348-T could increase SER by 13.35%, while alleles Excalibur_c11045_236-A and BobWhite_c8436_391-T could increase RAT by 14.78 and 8.47%, respectively. Accordingly, the lines that carry these elite alleles should be used as genetic stock for the improvement of these traits through breeding.

In conclusion, association mapping is a powerful tool for identifying molecular markers associated with agronomic and quality traits in wheat. Thirty-one significant loci were found to underlie variation in SER and tiller number at different growth stages in this population using GWAS. Of these, markers for SER and tillering traits at 10 loci were consistently detected in two environments and should therefore be used to develop CAPS markers. A set of elite alleles for SER and RAT was identified for the purpose of improving these traits through breeding.