Introduction

In the mature common wheat kernel, starch accounts for approximately three-quarters of the grain composition and contains 20–30% amylose (AMS) and 70–80% amylopectin (AMP). Starch also accounts for approximately 65 to 70% of the dry grain weight (Li et al. 1999). Endosperm starch content influences not only grain weight but also grain processing and end-use quality (Hurkman et al. 2003). In fact, not only starch content but also its components, properties and functions affect food quality, such as appearance, flavour, texture, and nutritional value (Martin and Smith 1995; Liu et al. 2003; Song et al. 2008). Starch components, especially the ratio of the components, are very important for processing and end-use quality. The varietal differences in the AMP structure are predominantly due to chain length variation and play a critical role in determining the physicochemical properties of starch in wheat endosperm. AMS content is an important indicator of wheat grain quality. Previous studies showed that AMS was significantly correlated with the quality of noodles, bread and steamed buns. There was a negative correlation between AMS and noodle quality, that is, a high AMS content is associated with a reduced overall noodle score (Wang et al. 1998). In contrast, in sensory evaluations, as the amount of AMP increased, the scores for cohesiveness, springiness, and acceptability of cooked noodles also increased. The proper AMS:AMP ratio improved the freeze–thaw stability and the sensory acceptability of wheat flour dough and noodles (Cho et al. 2007). Steamed bread made from flour with a low AMS content was bulky and had good eating quality (Liu et al. 2003), and a significant positive correlation was found between AMP content in flour and the quality of bread (Zhou 2012). AMP forms a network structure in dough, resulting in a porous material that produces more water vapour during the expansion process and improves bread volume, texture and palatability. The characteristics and functions of AMP are exactly opposite to those of AMS, and the smaller the ratio of AMS:AMP, the more favourable the processing quality of gluten-based food.

Some key enzymes are involved in the biosynthesis of AMS and AMP. AMS synthesis is mainly controlled by granule-bound starch synthase I (GBSS I), the waxy protein, whose encoding genes are located on chromosomes 4AL (Wx-B1), 7AS (Wx-A1) and 7DS (Wx-D1) (Nakamura et al. 1993; Martin et al. 2004). A lack of waxy genes affects the AMS content and further influences starch quality. AMP synthesis involves several key enzymes, such as soluble starch synthase (SSS, involving SSI, SSIIa, SSIII), debranching enzymes (DBE, containing DBEI), and branching enzymes (BE, involving BEI, BEIIa, BEIIb) (Li et al. 2003; Nakamura et al. 2017; Crofts et al. 2017). The expression of these enzymes affects the biosynthesis of AMP. The physical and chemical properties of starch are directly affected by the AMS:AMP ratio. Therefore, expression of these enzymes influences the contents and ratio of AMS to AMP and ultimately affects processing quality (Caballero et al. 2008).

Traditional quantitative trait loci (QTL) mapping and genome-wide association studies (GWAS) have been used as the main methods for dissecting complex traits (Risch and Merikangas 1996). Many QTL mapping studies have been performed to explore and characterize the genetic basis of accumulation of wheat starch and its components (Araki et al. 1999; Batey et al. 2002; Deng et al. 2015, 2018). QTLs were mainly distributed on chromosomes 1A, 1D, 2A, 2D, 3B, 4A, 5A, 7A, 7B and 7D in various populations (McCartney et al. 2006; Sun et al. 2008; Tian et al. 2015; Deng et al. 2017). Although traditional QTL mapping has been very successful for genetic analysis when exploring genetic variation, its relevance is limited to the genetic variation present in the two parents of the mapping population.

GWAS, as a complement to QTL mapping, is the most cost-effective way to use existing germplasm (such as landraces, elite cultivars, and advanced breeding lines) for genetic mapping (Newell et al. 2012; Bradbury et al. 2011; Bandillo et al. 2015). With the rapid development of high-density marker-assisted genotyping techniques and next-generation sequencing (NGS), GWAS has become a widely used method for identifying the genes responsible for the quantitative variation of complex traits (Zhu et al. 2008). This strategy has been successfully used for agronomic traits in rice, maize, barley, common wheat, durum wheat and other crops (Huang et al. 2011; Cockram et al. 2010; Yu and Buckler 2006; Chen et al. 2015; Li et al. 2017a; Ovenden et al. 2017; Liu et al. 2016; Shu et al. 2012).

To improve the resolution of association maps and to cover the entire genome with sufficient resolution, a large number of molecular markers are needed for GWAS (Sajjad et al. 2012). Single nucleotide polymorphism (SNP) markers, which represent third-generation molecular markers, are abundant and evenly distributed across genomes, satisfying the large sample and high-density marker requirements of GWAS (Gupta et al. 2008). At present, GWAS using SNPs has been widely used to illuminate the genetics of many organisms, such as humans (Mick et al. 2011), rice (Huang et al. 2011), and maize (Wilson et al. 2004). However, GWAS in wheat continues to be a challenge due to its complex genomic architecture and incomplete genome sequence (Sukumaran and Yu 2014). Recently, several SNP-based technologies, such as genotyping chips, have become available. For example, the 90 K Illumina iSelect array (Wang et al. 2014) is commonly used in wheat genetics and breeding research, including genetic mapping and association analysis of important agronomic traits.

Compared to traditional QTL mapping, association mapping has advantages, especially in increased QTL resolution and allele coverage. Although previous researchers have dissected the genetic basis for the accumulation of wheat starch and its components using QTL mapping, GWAS has rarely been applied to these traits. Therefore, the present study used GWAS to dissect total starch and its components, with 24,355 SNPs genotyped using the 90 K Illumina iSelect array in a population of diverse winter wheat varieties. The objectives of this study were to identify markers and candidate genes for loci associated with these traits in order to improve wheat starch quality by breeding.

Materials and methods

Plant material and growth conditions

The association mapping panel of 205 wheat genotypes for GWAS comprised 77 released cultivars, 55 founder parents, and 73 breeding lines (Table S1) from 10 provinces that represent the major winter wheat production regions in China. Two lines from Mexico and France were included as additional founder parents.

The panel was grown in the 2013–2014 and 2014–2015 cropping seasons in experimental fields at Shandong Agricultural University, Tai’an (116°36′E, 36°57′N) and Dezhou Institute of Agricultural Sciences (116°29′E, 37°45′N). The experimental fields were arranged in a randomized block design, with two replicates for each environment. All lines were grown in 2 m plots with 3 rows spaced 25 cm apart, and 70 seeds were evenly spaced in each row. Field management followed local procedures. No serious pest damage or lodging problems occurred during the trials.

Measurement of starch components

Starch, AMS and AMP contents were measured by the dual-wavelength method (Jin et al. 2009) with modifications. The main wavelength for determining AMS content was 471 nm, and the comparison wavelength was 632 nm. The main wavelength for determining AMP content was 553 nm, and the comparison wavelength was 740.3 nm. The AMS and AMP contents in each sample were calculated taking the dilution factor of the extract into account, and the total starch content (TSC) was taken as the sum of the AMS and AMP contents.

Analysis of phenotypic data

Analysis of variance (ANOVA) and correlations among phenotypic traits were carried out using SPSS version 17.0 (SPSS Inc., Chicago, IL, USA). Broad-sense heritability ($h_B^2$) was calculated as $h_B^2 = \sigma_g^2 / (\sigma_g^2 + \sigma_{ge}^2/r + \sigma_e^2/(re))$, where $\sigma_g^2$, $\sigma_{ge}^2$ and $\sigma_e^2$ were estimates of genotype, genotype × environment and residual error variances, respectively. Estimates of $\sigma_g^2$, $\sigma_{ge}^2$ and $\sigma_e^2$ were obtained from the ANOVA, which was performed using the PROC GLM procedure in SAS 8.0 (SAS Institute Inc., Cary, NC, USA).
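As a minimal sketch of the heritability calculation, the following Python function evaluates the formula as printed. The variance components passed in are hypothetical placeholders, not estimates from this study. Reading r and e as positive counts, so that "re" is their product, is an assumption, because the text does not define them.

```python
def broad_sense_heritability(var_g, var_ge, var_e, r, e):
    """Evaluate var_g / (var_g + var_ge / r + var_e / (r * e)).

    var_g, var_ge and var_e are the genotype, genotype x environment and
    residual error variance estimates. r and e are the divisors that appear
    in the printed formula; their interpretation is an assumption.
    """
    if r <= 0 or e <= 0:
        raise ValueError("r and e must be positive")
    denominator = var_g + var_ge / r + var_e / (r * e)
    if denominator <= 0:
        raise ValueError("the denominator must be positive")
    return var_g / denominator


# Hypothetical variance components, chosen only to illustrate the arithmetic.
h2 = broad_sense_heritability(var_g=4.0, var_ge=2.0, var_e=8.0, r=2, e=4)
print(f"h2B = {h2:.2%}")
```

With these placeholder inputs the denominator is 4.0 + 1.0 + 1.0, so the function returns two-thirds; real inputs would be the variance components estimated from the ANOVA.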

SNP markers and genotyping

SNP genotyping was performed at the University of California, Davis Genome Center. An Illumina iScan Reader was used to carry out the genotyping assays (Chen et al. 2016). The genetic diversity data were reported previously (Chen et al. 2016, 2017).

DNA extraction and a composite genetic map

DNA was extracted from the young leaf tissues of each variety following the method recommended by Triticarte Pty. Ltd. (http://www.triticarte.com.au). Samples were genotyped using the 90 K iSelect wheat chip, which consists of 81,587 SNP loci distributed across all 21 wheat chromosomes.

The total length of the map was 3674.16 cM, with a mean genetic distance of 0.15 cM between markers. Chromosome 1B contained the most markers (n = 2390), followed by 5B (n = 2187), whereas chromosome 4D had the fewest loci (n = 78). Among the A, B and D genomes, the B genome contained the largest number of loci (n = 12,321) and a total length of 1150.47 cM, followed by the A genome (n = 9523) at 1252.51 cM, and the D genome (n = 2511) at 1271.18 cM (Chen et al. 2017).
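The genome-level figures above can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the marker counts and map lengths quoted in the text. Dividing total length by the number of markers, rather than by the number of marker intervals, is a simplifying assumption that reproduces the reported mean spacing.

```python
# Marker counts and map lengths (cM) per genome, as reported in the text.
genomes = {
    "A": {"markers": 9523, "length_cm": 1252.51},
    "B": {"markers": 12321, "length_cm": 1150.47},
    "D": {"markers": 2511, "length_cm": 1271.18},
}

total_markers = sum(g["markers"] for g in genomes.values())
total_length = sum(g["length_cm"] for g in genomes.values())

# Simplification: average spacing taken as total length / number of markers.
mean_spacing = total_length / total_markers

print(f"total markers: {total_markers}")
print(f"total length:  {total_length:.2f} cM")
print(f"mean spacing:  {mean_spacing:.2f} cM")

# Per-genome spacing shows how sparsely the D genome is covered.
for name, g in genomes.items():
    print(f"{name} genome: {g['length_cm'] / g['markers']:.2f} cM per marker")
```

The totals agree with the 24,355 mapped SNPs and the 3674.16 cM map length stated in the text.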

Population structure

Population structure analysis was performed on genotypic data obtained from unlinked SNP markers in the 205 winter wheat accessions using NJ cluster analysis in STRUCTURE v 2.2 (Chen et al. 2017).

Genome-wide association analysis

Significant marker-trait associations (MTAs) were identified using a mixed linear model (MLM) in TASSEL 3.0. Whether a marker was associated with a QTL was determined by its P value. R2 values were used as estimates of the magnitude of MTA effects. SNPs with corrected P values ≤ 0.01 were considered to be significantly associated with phenotypic traits.
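The thresholding step can be sketched in Python as follows. This is not the TASSEL workflow itself. The marker names, P values and R2 values are hypothetical, the P values are assumed to be already corrected (the correction method is not stated in the text), and the "at least two environments" rule for calling an MTA stable is taken from the way stable MTAs are tabulated later in the paper.

```python
from collections import defaultdict

P_THRESHOLD = 0.01  # significance cutoff stated in the text

# Hypothetical records: (marker, environment, corrected P value, R2).
records = [
    ("marker_1", "E1", 0.004, 0.15),
    ("marker_1", "E2", 0.008, 0.13),
    ("marker_1", "E3", 0.020, 0.09),
    ("marker_2", "E1", 0.030, 0.07),
    ("marker_2", "E4", 0.009, 0.12),
]


def significant_environments(rows, threshold=P_THRESHOLD):
    """Map each marker to the environments where its P value passes the cutoff."""
    hits = defaultdict(list)
    for marker, env, p_value, _r2 in rows:
        if p_value <= threshold:
            hits[marker].append(env)
    return dict(hits)


def stable_markers(hits, min_envs=2):
    """Return markers that are significant in at least min_envs environments."""
    return sorted(m for m, envs in hits.items() if len(envs) >= min_envs)


hits = significant_environments(records)
stable = stable_markers(hits)
print(hits)
print(stable)
```

In practice the records would come from the per-environment association output rather than a hand-written list.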

Identification of candidate genes

To identify the position of important MTA loci in the physical map and to identify possible candidate genes, a BLAST search was performed on the International Wheat Genome Sequencing Consortium database (IWGSC; http://www.wheatgenome.org/, accessed 27th April, 2018) using the sequences of significant SNP markers identified by GWAS. When a SNP marker sequence from the IWGSC was 100% identical to any wheat contig, the sequence was extended 5 kb using the IWGSC BLAST results. The extended sequence was used to run BLAST searches on the National Center for Biotechnology Information (NCBI) database (http://www.ncbi.nlm.nih.gov, 27th April, 2018) and Ensembl Plants (http://plants.ensembl.org/Triticum_aestivum/Tools/BLASt, 27th April, 2018) to confirm possible candidate genes and putative functions.

Results

Population structure

When ∆K values were plotted against the hypothetical number of subgroups (K), the highest ∆K was observed at K = 4, indicating the likelihood of four subgroups in the association panel. Using the maximum membership probability in STRUCTURE, the 205 accessions were assigned to four subpopulations: subgroup 1 (43 accessions), subgroup 2 (32 accessions), subgroup 3 (105 accessions) and subgroup 4 (25 accessions) (Chen et al. 2017). The LD values of the different chromosomes were reported in Chen et al. (2016).

Phenotypic data

The phenotypic values for the wheat starch traits in the different environments are shown in Table 1. Extensive phenotypic variation for AMS, AMP and TSC among the 205 winter wheat accessions was observed across four environments (i.e., two growing seasons and two locations). The AMS contents ranged from 16.47 to 22.99% in the flour, AMP contents ranged from 38.43 to 61.15%, TSC ranged from 55.78 to 82.19%, and the AMS/AMP ratio ranged from 33.01 to 52.78%. Broad-sense heritabilities were 89.31, 68.10, 75.36 and 32.45%, respectively, indicating that both genetic and environmental factors influenced the expression of each trait. No significant differences were found between environments with regard to AMS, AMP and TSC (Table S2).

Table 1 Phenotypic values for starch content and starch composition of wheat flour from 205 winter wheat accessions grown in four environments

Thousand kernel weights (TKW) ranged from 26.33 to 60.13 g, and protein contents ranged from 10.30 to 17.98% across environments (Table 2). Both traits were approximately normally distributed in this population, with absolute values of skewness and kurtosis of less than 1.0. Hence, they behaved as typical quantitative traits controlled by multiple loci.
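The normality criterion used here (absolute skewness and kurtosis below 1.0) can be illustrated with a short Python sketch. It uses simple, uncorrected moment estimators and reports kurtosis as excess kurtosis, which is an assumption; statistical packages often apply bias corrections, so their values may differ slightly. The input values are hypothetical, not data from this study.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from uncorrected central moments."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    if m2 == 0:
        raise ValueError("all observations are identical")
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def approximately_normal(values, limit=1.0):
    """Apply the |skewness| < limit and |kurtosis| < limit rule of thumb."""
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) < limit and abs(kurt) < limit


# Hypothetical trait values, used only to exercise the functions.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skew, kurt = skewness_and_excess_kurtosis(sample)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
print(approximately_normal(sample))
```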

Table 2 Mean phenotypic values for thousand kernel weight (TKW) and grain protein content of 205 winter wheat accessions grown in four environments

Marker-trait associations and elite allele exploration

A total of 24,355 mapped SNPs was used for MTA analysis. Forty-seven significant MTAs were detected for all four traits across environments (Table 3, Table S3, Fig. 1). We further analysed MTAs for AMS and AMP by comparing the phenotypic effects of alleles at each locus to identify elite alleles for the starch components and AMS:AMP ratio (Table S4, Table S5). Nine MTAs were recorded for two starch traits, and there were 11 MTAs for three traits. These SNPs, located on eight chromosomes, each accounted for 11.26–23.83% of the phenotypic variance. Eighteen MTAs on chromosomes 1B, 2A, 3B, 3D, 4A, 5B, 6A, 6B and 7B were identified as being related to AMS:AMP ratio, each explaining 5.92–17.2% of the phenotypic variation. Nine MTAs were detected in two environments: seven in E1 and E2, and two in E3 and E4.

Table 3 Main marker-trait associations detected in at least two of four environments

Fig. 1 Manhattan plots of GWAS for three traits with a mixed linear model. a–c indicate Manhattan plots for total starch content, amylose and amylopectin, respectively; E1, Tai’an 2013; E2, Dezhou 2013; E3, Tai’an 2014; E4, Dezhou 2014

Fifteen MTAs for AMS were identified on chromosomes 2A, 2B, 3A and 4A explaining 11.8–18.41% of the phenotypic variation. Two MTAs, IAAV4464 (2A_112) and JD_c3742_1130 (2A_112), on chromosome 2A were detected in three environments; these MTAs located at the same position had the highest R2 (18.41%) and smallest P values (Fig. 1b). Twelve of the 15 MTAs showed significant phenotypic differences among alleles (P < 0.01; Table S4), and the same MTAs exhibited phenotypic differences in environments E2 and E4 (P < 0.05). Alleles A and G of marker Kukri_c5615_1214 (3A_93) were associated with the largest phenotypic differences (5.80%). The phenotypic value of AMS associated with Kukri_c5615_1214-A (3A_93) was significantly higher than that associated with Kukri_c5615_1214-G (3A_93) across all four environments, indicating that Kukri_c5615_1214-A (3A_93) was a more elite allele than Kukri_c5615_1214-G (3A_93). The marker RFL_Contig4517_1276 (2A_110) revealed no significant phenotypic differences among its alleles.

Twenty-three MTAs for AMP detected on chromosomes 2A, 2B, 3A, 3B, 4A, 6A, 6B and 7D accounted for 11.26 to 22.44% of the phenotypic variation. Among them, six MTAs were detected in three environments. Markers Kukri_c50842_573 (2B_104) and TA004152-0921 (2B_107) on chromosome 2B and Excalibur_c16376_351 (6B_0) and CAP11_c1087_327 (6B_6) on chromosome 6B were identified in all four environments and, except for Excalibur_c16376_351 (6B_0) in E3 and E4, revealed significant MTAs in E2 and E4 (Fig. 1c). Twenty-two MTAs for AMP showed significant allelic differences in phenotype (P < 0.01). Among them, Kukri_c5615_1214 (3A_93) had the most significant effect, increasing AMP by 15.32%, and allele G was identified as an elite allele. Other elite alleles at each locus increased AMP from 1.25 to 4.43%. Eight MTAs for both AMS and AMP were identified; of these, six exhibited highly significant phenotypic differences between alleles for both traits.

Twenty-two MTAs for TSC were detected on chromosomes 2A, 2B, 3A, 3B, 4A, 6A and 6B, explaining 11.31 to 23.83% of the phenotypic variation. Four MTAs were found in four environments and 12 MTAs were detected in three environments. Marker Excalibur_c16376_351 (6B_0) had the highest R2 (23.83%) and was significant in four environments (Fig. 1a).

Eleven of 18 MTAs for AMS:AMP ratio (Table S5) exhibited significant phenotypic differences among alleles (P < 0.05) in at least two environments; six showed significant phenotypic differences between alleles (P < 0.01), and three showed highly significant differences in two environments. Markers BS00022255_51 (1B_57) and D_contig25392_201 (1B_61) had the most significant effects, increasing AMP by 3.77%. The T allele of the former marker and the A allele of the latter were elite alleles.

We identified four significant MTAs for TKW on chromosomes 3A, 4B, 6A and 6D, explaining 5.74 to 11.28% of the phenotypic variation, and SNP locus BS00023893_51 (6A_86) on chromosome 6A was identified in three environments (Table 4). For grain protein content, five MTAs were found on chromosomes 4B, 6B, 7A, 7D and 5A, explaining 6.75 to 10.88% of the phenotypic variation (Table 4).

Table 4 Significant (P ≤ 10⁻⁴) or stable MTAs and percentages of phenotypic variation explained for mean thousand kernel weights from four environments

Putative candidate genes linked to starch-related traits

Significant MTAs identified in more than two environments and correlated with more than one trait were selected for candidate gene prediction (Table 5). For marker IAAV4464 on chromosome 2AL there were four candidates, of which gene TRIAE_CS42_2AL_TGACv1_093900_AA0288950 was related to beta-glucosidase and hydrolysis of O-glycosyl compounds that participate in carbohydrate metabolism. The candidate gene for marker JD_c3742_1130 on 2AL also participates in carbohydrate metabolism. Significant markers RAC875_c6280_292 and Tdurum_contig41127_265 on chromosome 4AL shared the same candidate gene, TRIAE_CS42_4AL_TGACv1_288945_AA0961860, which is expressed in the aleurone layer and endosperm 10–30 days post anthesis. The candidate gene TRIAE_CS42_2AL_TGACv1_093900_AA0288950 for marker BobWhite_c10583_352 was predicted to participate in carbohydrate metabolism. Both genes are novel with unknown function, and are different from Wx-B1. Markers Excalibur_c16376_351 and CAP11_c1087_327 had the same candidate gene related to adenosine diphosphate (ADP) binding. These candidate wheat genes could be related to starch synthesis; their functions will be investigated in future research.

Table 5 Predicted candidate genes for SNP markers significantly associated with amylose (AMS), amylopectin (AMP) and total starch content (TSC) in more than two environments

Discussion

A number of significant loci were identified in this study, suggesting the presence of at least some MTAs with medium and small effects on starch traits. Presumably, we should focus on highly significant or stable MTAs and multi-trait MTAs. Due to differences in marker types and marker positions on different genetic maps, the MTA results of our study were extensively compared with previously reported MTA results involving the same chromosome arms.

In previous QTL mapping studies, at least 10 chromosomes related to the TSC were identified, namely, 1A, 1B, 1D, 2A, 3D, 3B, 4A, 5B, 5D and 7D. Eight chromosomes had QTL for AMS content, including 1B, 2A, 2D, 3A, 3B, 4A, 5D and 7D, and eight chromosomes were identified for AMP content, namely 1B, 2A, 2B, 3A, 3B, 4D, 5A and 5D (McCartney et al. 2006; Sun et al. 2008; Tian et al. 2015; Deng et al. 2015). Starch granule size was related to AMS and AMP (Peterson and Fulcher 2001), that is, larger granules had a higher amylose content than smaller granules, whereas small starch granules had a higher amylopectin content than large starch granules, so A-granules had a higher ratio of amylose to amylopectin (Park et al. 2004; Li et al. 2011). Li et al. (2017b) identified 15 chromosomes, including 2A, 2B, 3A, 4A, 6A, 6B and 7D, that were related to the percentage volumes of A- and B-granules and the ratio of A-/B-granule volumes. In the present study, seven chromosomes were linked to TSC (2A, 2B, 3A, 3B, 4A, 6A and 6B), chromosomes 2A, 2B and 3A were linked to AMS, and eight chromosomes were linked to AMP (2A, 2B, 3A, 3B, 4A, 6A, 6B and 7D). Thus, chromosomes 2A, 2B and 3A were associated with all three traits (AMS, AMP and TSC), and were also related to A- and B-granules and starch components in previous studies. No associations on chromosomes 1A, 1B, 1D, 3D, 5A, 5B and 5D were identified in this study, but chromosomes 4A and 7D were linked to AMP but not AMS, and chromosomes 6A and 6B were related to AMP and TSC. It is interesting that chromosomes 3A and 6D were also related to TKW, and chromosomes 6B and 7D were related to grain protein content. These results indicated that starch content was related to TKW and grain protein content. Starch development also affects grain yield (Hurkman et al. 2003; Tetlow et al. 2004). Moreover, there was a negative correlation between starch content and grain protein content. It is possible that some genes on the same chromosome affect starch content, TKW and grain protein content.

Using association analysis to identify elite alleles has become a useful strategy for plant genomics research (Cai et al. 2014; Li et al. 2012). In this study, we established a link between genotypes and AMS and AMP phenotypes by analysing differences in phenotypic values among various alleles and identified elite alleles for AMS and AMP. For example, the allele Kukri_c5615_1214-G (3A_93) increased AMS and AMP by 5.80 and 15.32%, respectively, and the allele Excalibur_c16376_351-T (6B_0) increased AMP by 4.43%. Consequently, lines that carry these elite alleles could be used as parents for breeding. The marker Kukri_c5615_1214 (3A_93) on chromosome 3A was a stable multi-effect MTA. Its elite allele had a considerable effect in increasing both AMS and AMP contents. These results indicated that these markers were closely related to genes involved in starch synthesis.

GBSS is a key enzyme in AMS synthesis. The genes encoding GBSS are located on 7AS, 4AL and 7DS (Murai et al. 1999; Yan et al. 2007). Previous studies found QTLs for AMS content near the Wx-B1 locus on chromosome 4A (Araki et al. 1999). During grain development, QTsc-4A.1 and QAms-4A.1 located in the Xwmc262–Xbarc343 interval made a large contribution to TSC and AMS synthesis over the whole grain-filling process (Tian et al. 2015). By comparison with the physical map (Cui et al. 2014), QTsc-4A.1 and QAms-4A.1 were placed on chromosome 4AL. In the present study, three SNP markers (RAC875_c6280_292, Tdurum_contig41127_265 and BobWhite_c10583_352) were also detected on chromosome 4A and were closely associated with all three traits. These three SNP markers were associated with unknown genes functioning in carbohydrate metabolism in the aleurone layer and endosperm at 10–30 days post-anthesis. By prediction and comparison, these genes seemed to be new (Table 5) because their positions differ from that of the Wx-B1 locus.

Compared to previous studies, chromosomes 2A, 3A and 6B appeared to be important in control of starch synthesis in the present study. The SNPs IAAV4464 (2A_112) and JD_c3742_1130 (2A_112) on chromosome 2A were related to all three traits (TSC, AMS and AMP). Annotations of the candidate genes predicted participation in carbohydrate metabolism (Table 5). The marker Excalibur_c16376_351 (6B_0) on chromosome 6B had the highest R2 (23.83% for TSC and 22.44% for AMP) and was detected in all four environments. The associated gene was predicted to be a novel gene for starch synthesis (Table 5). Chromosome 6B appears to be an essential chromosome for control of components of starch content.

Conclusions

Thirty-two significant marker-trait associations (MTAs) were detected for total starch (TSC), amylose (AMS) and amylopectin (AMP) contents in four environments. Fourteen MTAs were detected for two traits, and eight MTAs were found for all three traits. The SNPs were distributed across seven chromosomes, 2A, 2B, 3A, 3B, 4A, 6A and 6B. A set of elite alleles was identified, including Kukri_c5615_1214-A, Excalibur_c16376_351-T, BS00022255_51-T and D_contig25392_201-T. Fourteen candidate genes associated with significant markers were identified. The lines that carry these elite alleles could be used as parents in wheat breeding. These results could lay the foundation for fine mapping, gene discovery, and molecular marker-assisted selection of these three traits in wheat.