Introduction

Wheat (Triticum aestivum L.) is one of the world’s most important and widely consumed staple crops, providing approximately 20% of daily calories and protein for human nutrition (Shiferaw et al. 2013). Global temperature is reported to be increasing at a rate of 0.18 °C per decade, and wheat production is estimated to fall by 6% for each additional 1 °C increase in temperature (Cheng et al. 2015; Hansen et al. 2012). Meanwhile, the world population is expected to reach 9.8 billion in 2050 (UN 2017), so wheat yield must continuously increase to meet the challenges of global food and nutritional security (Hunter et al. 2017). Therefore, the identification of wheat genotypes with stable yield and wide adaptation across a range of environments is one of the most important objectives of modern wheat breeding programs worldwide.

Wheat grain yield is the ultimate result of plant growth and development. It is a typical quantitative trait controlled by many genes and strongly influenced by environmental factors and crop management, and it depends on abiotic stress tolerance, disease resistance, adaptation to different soils, and climate change (Tricker et al. 2018; Brinton and Uauy 2019). The primary numerical components of grain yield are thousand grain weight (TGW), grain number per spike, and spike number per unit area; among them, TGW is more stably inherited, with relatively high heritability. Therefore, identification of quantitative trait loci (QTL) or genes for TGW has become a research focus in wheat in recent years (Brinton and Uauy 2019). Moreover, stable QTL expressed consistently across multiple environments are often the preferred loci for fine-mapping, map-based cloning, and marker-assisted selection (MAS). However, most of the reported QTL/genes were detected under yield potential conditions, and few studies have investigated the consistency of QTL across a range of contrasting stress and non-stress environments.

Wheat is a cereal grain, originally from the Fertile Crescent around; meanwhile, wheat is also a cool-season crop and grows best under moderate temperatures. Unfortunately, high-temperature stress caused by global warming is becoming a serious threat to wheat productivity worldwide. According to a recent study by Yang et al. (2017), by the year 2100, heat stress could lead to a yield reduction of 7.1% relative to current levels for winter wheat in China, including the Yellow and Huai River Valley Winter Wheat Region. This region is the most important agroecological production zone in China, accounting for 60–70% of both total harvested area and grain production (Chen et al. 2014). Heat stress affects almost all growth stages of wheat, but the effect of high temperature (> 30 °C) during the grain-filling stage is the most serious (Stone and Nicolas 1995) and can reduce yield by 40% under severe heat stress (Hays et al. 2007). This poses a considerable challenge for wheat breeders in developing new heat-tolerant varieties to address climate change. Thermotolerance is a complex quantitative trait controlled by many genes with minor effects in wheat (Bohnert et al. 2006; Howarth 2005). Several indirect selection parameters for measuring heat tolerance have been reported, including stress susceptibility index (SSI), grain-fill duration (GFD), canopy temperature depression (CTD), and green area under decline (GAUD) (Ayeneh et al. 2002; Kumar et al. 2010, 2013; Reynolds et al. 1994). Among them, SSI, an effective indicator of stress susceptibility based on screening genotypes grown under optimal and stress conditions, has been widely applied in previous studies (Fischer and Maurer 1978; Mason et al. 2010; Paliwal et al. 2012). To date, considerable attempts have been made to understand the genetic mechanisms of heat tolerance by biparental QTL mapping, and large numbers of QTL have been identified on almost all chromosomes in wheat (Mason et al. 2010; Kumar et al. 2013; Paliwal et al. 2012).

Recently, genome-wide association studies (GWAS) based on random, high-density genotyping in natural populations have allowed greater resolution of genes/QTL underlying complex traits, including yield-related traits and biotic and abiotic stress tolerance, in many crops (Oyiga et al. 2018; Lv et al. 2016; Habtemariam et al. 2014). In the current work, a total of 688 winter wheat accessions were genotyped with the Illumina iSelect 90K SNP assay and phenotyped for TGW under normal-sown (NS) and late-sown (LS) conditions. GWAS was performed with data from multiple environmental trials to identify SNP loci associated with TGW under the two sowing dates and putative QTL for stress susceptibility index (SSI) across the entire hexaploid wheat genome.

Materials and methods

Plant materials and experimental design

A set of 688 winter wheat accessions consisting of modern cultivars (431), landraces (70), and breeding lines (129) of Chinese origin and some foreign accessions (58) was used in this study (Table S1). To evaluate grain weight and abiotic stress tolerance accurately and at high throughput, field trials were conducted under two sowing dates: timely (normal sowing, NS) and late (late sowing, LS). Normal sowing was done around October 4 and late sowing around February 18 of the following year. The 688 accessions were selected from more than 1,000 accessions because they could head and mature normally under both sowing dates. The materials were grown at Xianyang (Shaanxi province, 108.93°E, 34.62°N) in 2014–2015 and 2015–2016 (XY15 and XY16); Shijiazhuang (Hebei province, 114.83°E, 38.03°N) in 2014–2015 and 2015–2016 (SJZ15 and SJZ16); and Linfen (Shanxi province, 111.52°E, 36.08°N) in 2014–2015 and 2015–2016 (LF15 and LF16). All plants were grown in rows 2 m long and 0.3 m apart at a sowing rate of 30 seeds per row in randomized complete blocks with three replications. Trials were fertilized and maintained free from weeds, insects, and diseases. Detailed environment characteristics are provided in Table S2. Irrigation and other management at all sites were performed in accordance with local practices.

Phenotyping and statistical analysis

Grain-filling duration (GFD) was calculated as the number of days from anthesis to physiological maturity. When the plants reached maturity, 15–20 representative spikes from different plants were sampled from each replication and threshed together for determination of TGW. The meteorological data of the experimental sites, including sunshine hours, maximum and minimum temperatures, relative humidity, and solar radiation, were obtained from the China Meteorological Data Service Center (https://data.cma.cn). SSI was used to assess the stress tolerance of each accession; a low SSI indicated superior tolerance. SSI of each accession was calculated using the formula of Fischer and Maurer (1978): SSI = (1 − TGWLS/TGWNS)/(1 − XLS/XNS), where TGWLS and TGWNS were the TGW of each accession under late and normal sowing conditions, respectively, and XLS and XNS were the mean TGW of all accessions under late and normal sowing conditions, respectively.
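
As a minimal sketch of the SSI calculation described above, the following Python function applies the Fischer and Maurer (1978) formula to single accessions. The function name, the two accession labels, and the TGW values are hypothetical and serve only to illustrate the arithmetic.

```python
def stress_susceptibility_index(tgw_ls, tgw_ns, mean_ls, mean_ns):
    """SSI = (1 - TGW_LS/TGW_NS) / (1 - X_LS/X_NS)."""
    # Denominator: stress intensity of the whole panel.
    stress_intensity = 1.0 - mean_ls / mean_ns
    if stress_intensity == 0:
        raise ValueError("panel means are equal; SSI is undefined")
    return (1.0 - tgw_ls / tgw_ns) / stress_intensity


# Hypothetical two-accession panel: name -> (TGW under NS, TGW under LS), in g.
panel = {"acc_A": (45.0, 36.0), "acc_B": (50.0, 30.0)}

mean_ns = sum(ns for ns, _ in panel.values()) / len(panel)
mean_ls = sum(ls for _, ls in panel.values()) / len(panel)

ssi = {
    name: stress_susceptibility_index(ls, ns, mean_ls, mean_ns)
    for name, (ns, ls) in panel.items()
}
```

In this toy panel, acc_A loses proportionally less TGW than the panel mean and therefore has an SSI below 1, whereas acc_B has an SSI above 1.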

Phenotypic data from all the environments were subjected to analysis of variance (ANOVA) in R software. Broad sense heritability (H2) of TGW and SSI was calculated across environments from variance components as: \(H^{2} = \sigma_{{\text{g}}}^{2} /\left( {\sigma_{{\text{g}}}^{2} + \sigma_{{{\text{gy}}}}^{2} /y + \sigma_{{{\text{gl}}}}^{2} /l + \sigma_{{{\text{gyl}}}}^{2} /yl + \sigma_{{\text{e}}}^{2} /ylr} \right)\), where \(\sigma_{{\text{g}}}^{2}\) was the genotypic variance, \(\sigma_{{{\text{gy}}}}^{2}\) was the genotype-by-year variance, \(\sigma_{{{\text{gl}}}}^{2}\) was the genotype-by-location variance, \(\sigma_{{{\text{gyl}}}}^{2}\) was the genotype-by-year-by-location variance, \(\sigma_{{\text{e}}}^{2}\) was the residual error variance, y was the number of years, l was the number of locations, and r was the number of replications.
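
The heritability formula above can be sketched in Python as follows. The design values (y = 2 years, l = 3 locations, r = 3 replications) follow the trial description, but the variance components are hypothetical numbers chosen for easy arithmetic, not estimates from this study.

```python
def broad_sense_heritability(var_g, var_gy, var_gl, var_gyl, var_e, y, l, r):
    """H2 = var_g / (var_g + var_gy/y + var_gl/l + var_gyl/(y*l) + var_e/(y*l*r))."""
    phenotypic_variance = (
        var_g
        + var_gy / y
        + var_gl / l
        + var_gyl / (y * l)
        + var_e / (y * l * r)
    )
    return var_g / phenotypic_variance


# Hypothetical variance components (illustration only).
h2 = broad_sense_heritability(
    var_g=20.0, var_gy=1.0, var_gl=1.5, var_gyl=3.0, var_e=9.0,
    y=2, l=3, r=3,
)
```

Each interaction and error term in this example contributes 0.5 to the denominator, so H2 = 20/22, or about 0.91.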

DNA extraction and genotyping

Genomic DNA was extracted from a single plant at the seedling stage for each accession using a modified CTAB method (Cheng et al. 2015). All accessions were genotyped with the Illumina 90K iSelect SNP array (Wang et al. 2014) by CapitalBio Technology Co. Ltd., Beijing (https://www.capitalbiotech.com). SNP allele clustering and genotype calling were performed with GenomeStudio software v.2011.1 (Illumina Inc.). The 90K SNP genotype data were filtered in TASSEL 5.2 to remove loci with minor allele frequency (MAF) < 0.05 or missing data > 10%. All accessions were also screened for functional alleles at the VRN-A1 and VRN-D1 loci (Table S3). The primer sequences and PCR protocols for genotyping VRN-A1 and VRN-D1 were described previously (Fu et al. 2005; Yan et al. 2004).
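
The marker filtering step can be illustrated with a short Python sketch. It is not the TASSEL implementation; it only restates the two criteria (MAF ≥ 0.05 and missing data ≤ 10%) under the assumption that genotypes are coded as counts of one allele (0, 1, or 2) with None for a missing call. The function name and the three example markers are hypothetical.

```python
def passes_filter(calls, min_maf=0.05, max_missing=0.10):
    """Return True if a SNP meets the MAF and missing-data criteria.

    calls: one entry per accession, coded 0/1/2 (allele count) or None (missing).
    """
    observed = [c for c in calls if c is not None]
    missing_rate = (len(calls) - len(observed)) / len(calls)
    if not observed or missing_rate > max_missing:
        return False
    allele_freq = sum(observed) / (2 * len(observed))
    maf = min(allele_freq, 1.0 - allele_freq)
    return maf >= min_maf


# Hypothetical markers scored on 100 accessions.
snp_common = [0] * 60 + [2] * 40                 # MAF 0.40, no missing data
snp_rare = [0] * 97 + [2] * 3                    # MAF 0.03
snp_sparse = [0] * 50 + [2] * 35 + [None] * 15   # 15% missing data

kept = [
    name
    for name, calls in [
        ("snp_common", snp_common),
        ("snp_rare", snp_rare),
        ("snp_sparse", snp_sparse),
    ]
    if passes_filter(calls)
]
```

Only snp_common survives: snp_rare fails the MAF criterion and snp_sparse fails the missing-data criterion.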

Linkage disequilibrium and population structure

Pairwise linkage disequilibrium (LD) between SNP loci was estimated as the squared allele frequency correlation (r2) according to Weir (1996) using TASSEL version 5.2 (Bradbury et al. 2007). The mean r2 values over genetic distance were calculated for the A, B, and D sub-genomes and for the whole genome. The LD decay plot was generated from r2 and the genetic map distance between markers. The genetic distance over which r2 remained above 0.1 was considered to represent the limits of a QTL/locus.
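
To make the r2 statistic concrete, the sketch below computes the squared correlation between the allele codes of two biallelic SNPs. It assumes homozygous lines coded 0/1, in which case this quantity equals the haplotype-based r2; TASSEL's own estimator may handle heterozygous and missing calls differently. The function name and genotype vectors are hypothetical.

```python
import math


def ld_r2(snp_a, snp_b):
    """Squared correlation of allele codes (0/1) between two SNPs.

    Pairs with a missing call (None) at either SNP are skipped.
    Returns NaN when either SNP is monomorphic in the retained pairs.
    """
    pairs = [(a, b) for a, b in zip(snp_a, snp_b) if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return math.nan
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    if var_a == 0 or var_b == 0:
        return math.nan
    return cov ** 2 / (var_a * var_b)


# Hypothetical genotype vectors for four accessions.
complete_ld = ld_r2([0, 0, 1, 1], [1, 1, 0, 0])   # alleles always co-occur
no_ld = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])         # alleles are independent
```

Averaging such pairwise values within bins of map distance gives the decay curve, and the distance at which the mean falls to 0.1 corresponds to the locus limit used here.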

Population structure was determined by the model-based Bayesian clustering software STRUCTURE 2.3.3 (Pritchard et al. 2000). A burn-in period of 10,000 iterations and 50,000 Markov chain Monte Carlo iterations across five independent repeats were implemented to estimate K (number of populations) from 1 to 10. The most likely K in the association panel was estimated using the ad hoc statistic ∆K based on the rate of change in log probability of data between successive values (Evanno et al. 2005).
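
The ∆K criterion can be sketched as follows. This is an illustration, not the STRUCTURE workflow itself: ∆K is computed here from the per-K means of the replicate log-probabilities and the standard deviation of the replicates at K, which is one common reading of Evanno et al. (2005). The function name and the log-probability values are hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(ln_prob):
    """Delta K for each K that has both neighbours K-1 and K+1.

    ln_prob: dict mapping K to a list of ln P(D) values from replicate runs.
    """
    avg = {k: mean(runs) for k, runs in ln_prob.items()}
    delta = {}
    for k in sorted(ln_prob):
        if k - 1 in avg and k + 1 in avg:
            sd = stdev(ln_prob[k])
            if sd > 0:
                second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
                delta[k] = second_diff / sd
    return delta


# Hypothetical ln P(D) values: three replicate runs for each K.
ln_prob = {
    2: [-1000.0, -1002.0, -998.0],
    3: [-900.0, -903.0, -897.0],
    4: [-800.0, -801.0, -799.0],
    5: [-790.0, -792.0, -788.0],
    6: [-785.0, -787.0, -783.0],
}

delta_k = evanno_delta_k(ln_prob)
best_k = max(delta_k, key=delta_k.get)
```

In this toy data set the log-probability stops improving sharply after K = 4, so ∆K peaks there.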

Genome-wide association study

GWAS was performed using a mixed linear model (MLM) incorporating population structure and kinship coefficients in TASSEL 5.2 (Kang et al. 2008). Principal components (PCs) of the association panel were calculated in R using the filtered genotypes. The first five PCs were used to estimate population structure, which could explain 20.84% of the phenotypic variation. The variance–covariance kinship matrix was automatically calculated using the centered IBS method (Endelman and Jannink 2012). The significance of marker–trait associations (MTAs) was initially based on Bonferroni or FDR adjusted P values with cutoff set at 0.5 (Storey et al. 2002). However, Bonferroni or FDR adjusted P values were found to be highly stringent. Hence, considering the potential risk of type II error and combining the GWAS results in all of the environments, the criterion of selecting the P values obtained within the bottom 0.1 percentile of the distribution was utilized (Sehgal et al. 2015; Sukumaran et al. 2012; Sun et al. 2017). Thus, a threshold of − log10 (P) = 3 was used to declare significant marker–trait association (MTA).
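
To show why a Bonferroni correction is far more stringent than the threshold adopted here, the sketch below compares the two on the −log10(P) scale. The marker count of 20,065 is the number of SNPs retained after filtering in this study; the family-wise alpha of 0.05 is an assumption made only for this illustration, and the function names are hypothetical.

```python
import math


def neg_log10(p_value):
    """Convert a P value to the -log10(P) scale."""
    return -math.log10(p_value)


def is_significant(p_value, threshold=3.0):
    """Declare an MTA when -log10(P) reaches the chosen threshold."""
    return neg_log10(p_value) >= threshold


n_snps = 20065   # SNPs retained after filtering
alpha = 0.05     # assumed family-wise error rate (illustration only)

bonferroni_threshold = neg_log10(alpha / n_snps)

# Two hypothetical P values on either side of the -log10(P) = 3 cutoff.
hit = is_significant(5e-4)
miss = is_significant(2e-3)
```

Under these assumptions the Bonferroni cutoff is about 5.6, so a marker would need a P value several hundred times smaller than 0.001 to pass it.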

Results

Variations of TGW under two sowing regimes

The meteorological factors of the six environments during the grain-filling stage were collected under normal- and late-sown conditions. There were significant differences between the two sowing conditions in maximum temperature (p = 0.025) and solar radiation (p = 0.015), while differences in the other meteorological factors were not significant. Moreover, multivariate regression analysis was performed with the meteorological factors and TGW under the two conditions. The absolute value of the estimated regression coefficient was largest for maximum temperature (17.43) (Table S4), which suggests that maximum temperature was the main factor causing the difference in TGW between the two sowing conditions. The maximum temperature during the grain-filling stage of accessions under normal- and late-sown conditions is presented in Fig. S1.

TGW across the multiple environments ranged from 20.08 to 70.25 g under normal conditions and from 7.24 to 55.83 g under late-sown conditions (Fig. S2). These differences were significant and the average reduction in TGW was 12.73 g (Table S1; Fig. 1a). Phenotypic variability of the 688 wheat accessions is presented in the form of box plots (Figs. 1a, S2). To evaluate stress tolerance, SSI for each accession was calculated from the TGW data of the two sowing dates (Figs. 1b, S3). Analysis of variance (ANOVA) for the three examined traits indicated significant variation among genotypes, locations, and years, as well as significant year-by-location and genotype-by-environment interactions (Table 1). TGW had high levels of genotypic variance (\(\sigma_{{\text{g}}}^{2}\)) under both normal- and late-sown conditions, and broad-sense heritabilities (H2) were 97.19% for TGWNS and 91.32% for TGWLS. SSI displayed a somewhat lower H2 of 83.81%. TGWNS was positively correlated with TGWLS and SSI, whereas the correlation between TGWLS and SSI was negative (Fig. S4).

Fig. 1 Distribution of the average phenotype for the 688 wheat accessions across all environments. a Thousand grain weight under normal-sown (TGWNS) and late-sown (TGWLS) conditions; b stress susceptibility index

Table 1 Analysis of variance (ANOVA) and broad sense heritability (H2) for thousand grain weight under normal-sown (TGWNS) and late-sown (TGWLS) conditions and the stress susceptibility index (SSI)

Linkage disequilibrium decay and population structure analysis

Before GWAS, the 90K SNP genotype data were filtered in TASSEL software. In total, 20,065 SNPs with minor allele frequency (MAF) ≥ 0.05 and missing data ≤ 10% were retained for further analysis.

LD was calculated using 7778, 10,291, and 1996 post-filtered SNP markers from the A, B, and D sub-genomes, respectively. The mean r2 values for the A, B, and D sub-genomes as well as for the whole genome gradually decreased with increasing pairwise distance (Fig. S5). The longest LD decay distance was for the D sub-genome (approximately 9 cM) and the shortest was for the B sub-genome (approximately 3 cM). The decay distances for the A sub-genome and whole genome were approximately 4 cM. We performed corresponding gene annotations and synteny analysis of the rice and Arabidopsis thaliana genomes with the Chinese Spring International Wheat Genome Sequencing Consortium (IWGSC) RefSeq v1.0 genome (IWGSC 2018) based on markers and flanking sequences in the LD decay distances.

The ∆K calculated by the STRUCTURE software with the filtered markers was plotted against the number of sub-populations (K). The ∆K (Evanno et al. 2005) peaked at K = 4, indicating four sub-groups, named Sub-G1, Sub-G2, Sub-G3, and Sub-G4, containing 171, 177, 101, and 239 accessions, respectively (Fig. 2). Most landraces were included in Sub-G3, and most of the modern cultivars grown in the Yellow and Huai River Valley Winter Wheat Region were classified as Sub-G1 or Sub-G4; Sub-G2 contained many cultivars from the Northern Winter Wheat Region (Table S1). In terms of phenotypic performance, Sub-G4 accessions had the highest average TGW (48.12 g) across environments and differed significantly from Sub-G1 and Sub-G2 under normal conditions (Fig. S6a). Sub-G4 also had the highest TGW (35.78 g) under late-sown conditions and differed significantly from the other sub-groups; Sub-G3 had the lowest TGW (29.50 g) and also differed significantly from the other groups (Fig. S6b). The SSI values of Sub-G3 (0.69) and Sub-G4 (0.92) were smaller than those of Sub-G1 (1.11) and Sub-G2 (1.13) (Fig. S6c). Thus, accessions in Sub-G3 and Sub-G4 were more stress tolerant and those in Sub-G1 and Sub-G2 tended to be more sensitive.

Fig. 2 Population structure of 688 wheat accessions. a Delta K plotted against putative K ranging from 1 to 10; b Stacked bar plots of STRUCTURE for K = 4 subgroups. Each individual is represented by a vertical bar

Analysis of marker–trait associations

Association studies of TGW at the two sowing dates and of SSI identified 179, 158, and 228 significant SNPs, respectively; among them, 226, 243, and 40 were on the A, B, and D genomes, respectively (Tables S5–S7). SNPs repeatedly detected in at least three individual environments were considered stable in this study.
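
The stability criterion (detection in at least three individual environments) amounts to a simple count, sketched below in Python. The environment codes are those used in this study, but the SNP names and the hit lists are hypothetical placeholders.

```python
from collections import defaultdict


def stable_snps(hits_by_env, min_envs=3):
    """Return SNPs that are significant in at least min_envs environments."""
    envs_per_snp = defaultdict(set)
    for env, snps in hits_by_env.items():
        for snp in snps:
            envs_per_snp[snp].add(env)
    return sorted(snp for snp, envs in envs_per_snp.items() if len(envs) >= min_envs)


# Hypothetical significant SNPs per environment.
hits_by_env = {
    "XY15": ["snp_1", "snp_2", "snp_3"],
    "XY16": ["snp_1", "snp_3"],
    "SJZ15": ["snp_1", "snp_2"],
    "SJZ16": ["snp_4"],
    "LF15": ["snp_3"],
    "LF16": ["snp_1"],
}

stable = stable_snps(hits_by_env)
```

Here snp_1 (four environments) and snp_3 (three environments) are stable, while snp_2 and snp_4 are not.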

One hundred and seventy-nine MTAs for TGW under normal conditions were identified across all wheat chromosomes apart from 3D and 5D in the present study (Table S5). Among them, 16 stable SNPs for TGWNS were mainly distributed on chromosomes 1B, 2B, 3A, 3B, 5A, 5B, and 7D (Table 2). According to LD decay, the 16 MTAs represented 10 QTL (Table 3). The phenotypic variation explained (PVE) by each significant SNP ranged from 1.61% for marker BS00088035_51 on chromosome 1A to 3.89% for marker Ku_c17560_162 on chromosome 3A (Table S5). Among the stable SNPs, two at 65 cM on chromosome 1B and four at 142 cM on chromosome 5A each formed a haplotype block (Fig. 3c, Table 3). Moreover, three SNPs at 30–33 cM on chromosome 5B were tightly linked. Additionally, chromosomes 1B, 3A, and 3B each contained two different loci.

Table 2 SNPs significantly associated with thousand grain weight under normal-sown (TGWNS) and late-sown (TGWLS) conditions and with the stress susceptibility index (SSI) in multiple environments
Table 3 Quantitative trait loci (QTL) for thousand grain weight under normal-sown (TGWNS) and late-sown (TGWLS) conditions and the stress susceptibility index (SSI) identified in the current and previous study
Fig. 3 Comparison of the significant marker–trait associations (MTAs) for thousand grain weight under normal-sown (TGWNS) and late-sown (TGWLS) conditions and stress susceptibility index (SSI). a Heat map of the highest − log10(P) values of marker–trait associations within quantitative trait loci (QTL) associated with TGWNS, TGWLS, and SSI. X-axis represents traits in different environments, Y-axis represents quantitative trait loci (QTL); chromosome 5A Manhattan plot (top) and linkage disequilibrium heatmap (bottom) in the confidence interval at 87–95 cM (b), and 137–145 cM (c). Colored heat maps represent pairwise LD as r2 between marker pairs in the genetic subgroups that were identified in analysis of population structure (see Fig. S5). Blue asterisks represent the location of the significant SNPs

One hundred and fifty-eight MTAs for TGW under late-sown conditions were detected on all wheat chromosomes except chromosome 3D (Table S6). Six stable SNPs representing 5 QTLs for TGWLS were mainly distributed on chromosomes 5A, 5D, and 6B, explaining 1.63–3.01% of the variance (Tables 2 and 3). Among them, two SNPs (IAAV3365 and wsnp_Ex_c5998_10513766) were both located at 91 cM on chromosome 5A, forming a haplotype block (Fig. 3b, Table 3). Another SNP, tplb0049a09_1302, at 140 cM on chromosome 5A was tightly linked to a haplotype block at 142 cM on chromosome 5A that was significantly associated with TGWNS (Fig. 3a, c, Table 3).

For SSI, 8 stable MTAs representing 3 QTLs were identified on chromosomes 4B, 5A, and 5D, explaining 1.87–4.73% of the variance (Tables 2 and 3). Among them, three SNPs at 33 cM on chromosome 4B and four SNPs at 91 cM on chromosome 5A formed two haplotype blocks, respectively. Moreover, we found that the SNP RFL_Contig1091_1538 at 138 cM on chromosome 5D, as well as the haplotype block at 91 cM on chromosome 5A were also detected for TGWLS (Fig. 3a).

Additive effects of superior alleles on TGW and SSI

To facilitate description of allelic effects, marker alleles with positive additive effects leading to increased TGW or decreased SSI were considered to be “superior alleles” and marker alleles leading to decreased TGW or increased SSI were “inferior alleles.” Allelic effects were estimated for TGW under the two sowing conditions and for SSI by using the MTA with the highest − log10(P) value at each locus. The effects of inferior and superior alleles were compared using phenotypes averaged across all environments (Fig. 4a–c). The patterns were similar for TGWNS, TGWLS and SSI: superior alleles additively increased TGWNS (R2 = 0.38) and TGWLS (R2 = 0.083), and decreased SSI (R2 = 0.15) (Fig. 5a–c). Among subgroups, Sub-G1, Sub-G2, and Sub-G4 included more superior alleles for increased TGW than Sub-G3, which was composed mainly of landraces. However, for SSI, Sub-G3 harbored more superior alleles for increased stress tolerance than the other subgroups (Table S1). Additionally, the genetic composition of the 20 genotypes with the highest average TGWNS was examined; these genotypes were rich in superior alleles for TGW but poor in superior alleles for SSI (Fig. 6).
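
The additive-effect analysis can be sketched as a count of superior alleles per accession followed by a simple linear regression of the phenotype on that count. The code below is an illustration under stated assumptions: the locus names, alleles, accessions, and TGW values are all hypothetical, and the regression is ordinary least squares with R2 computed as the squared correlation.

```python
def count_superior(genotype, superior_alleles):
    """Number of loci at which an accession carries the superior allele."""
    return sum(
        1 for locus, allele in superior_alleles.items()
        if genotype.get(locus) == allele
    )


def linear_fit(x, y):
    """Ordinary least squares of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy ** 2 / (sxx * syy)


# Hypothetical superior allele at each of three loci.
superior = {"locus_1": "A", "locus_2": "G", "locus_3": "T"}

# Hypothetical accessions: (genotype, mean TGW in g).
panel = {
    "acc_1": ({"locus_1": "C", "locus_2": "A", "locus_3": "C"}, 38.0),
    "acc_2": ({"locus_1": "A", "locus_2": "A", "locus_3": "C"}, 41.0),
    "acc_3": ({"locus_1": "A", "locus_2": "G", "locus_3": "C"}, 43.0),
    "acc_4": ({"locus_1": "A", "locus_2": "G", "locus_3": "T"}, 47.0),
}

counts = [count_superior(genotype, superior) for genotype, _ in panel.values()]
tgw = [value for _, value in panel.values()]
slope, intercept, r_squared = linear_fit(counts, tgw)
```

A positive slope means that each additional superior allele is associated with a higher mean TGW; for SSI, the same analysis would be expected to give a negative slope.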

Fig. 4 The average phenotype values of the highest − log10(P) SNPs alleles within the quantitative trait loci (QTL) associated with thousand grain weight under a normal (TGWNS) and b stress (TGWLS) conditions and c stress susceptibility index (SSI). Inf represents the inferior allele and sup represents the superior allele

Fig. 5 Superior allelic effects of the highest − log10(P) SNPs within the quantitative trait loci (QTL) based on linear regression analysis. a Thousand grain weight under normal conditions (TGWNS), b Thousand grain weight under stress conditions (TGWLS), and c stress susceptibility index (SSI). At the bottom of the box plot corresponding to each SNP allele

Fig. 6 Distribution of superior alleles for the highest − log10(P) SNPs within the quantitative trait loci (QTL) identified with TGWNS and SSI, in the top 20 high average TGWNS genotypes. X-axis represents cultivar names. The first parenthesis is TGW; the second is SSI. Inf represents the inferior allele and sup represents the superior allele

Discussion

High temperature is the main stress under late-sown condition

High temperature (> 30 °C) after anthesis has detrimental effects on grain filling in wheat (Stone and Nicolas 1995). However, evaluating heat stress effectively and efficiently under field conditions remains difficult, because an optimal stress treatment is critical for consistent detection of QTL. Late sowing, which ensures severe temperature stress, has become a popular method to study heat stress under field conditions, especially in spring wheat (Jamil et al. 2019; Sukumaran et al. 2018). In this investigation, a winter wheat population was planted under normal- and late-sown conditions in multiple years and locations to identify differences in TGW. To avoid failed or insufficient vernalization in winter wheat, the panel was selected from more than 1000 accessions as those that could head and mature normally under the late-sown condition. Moreover, the 688 selected accessions had similar phenology, with most flowering dates falling within one week (data not shown).

In this study, under the normal-sown condition, the maximum temperature rose above 30 °C only as plants approached maturity, whereas under the late-sown condition it had already exceeded 30 °C by mid grain filling (Fig. S1). Moreover, multivariate regression analysis of meteorological factors also indicated that maximum temperature was the main factor causing the decrease in TGW between the two sowing conditions (Table S4). Thus, consistent with previous findings (Jamil et al. 2019; Sukumaran et al. 2018), we conclude that high temperature was the main stress under the late-sown condition in the present study. Under the late-sown condition, TGW was reduced by 18.24–33.27% compared to the normal-sown condition (Fig. S2). Broad sense heritabilities of TGW under normal- and late-sown conditions and of SSI were quite high (83.81–97.19%), as reported in previous studies (Ogbonnaya et al. 2017; Sehgal et al. 2017; Valluru et al. 2017). This is one reason that SSI based on TGW is a suitable criterion for determining tolerance to stress (Sharma et al. 2008).

Linkage disequilibrium and population structure

LD in a population is the foundation of GWAS. We observed the fastest LD decay in the B sub-genome and the slowest in the D sub-genome, consistent with previous studies (Edae et al. 2014; Liu et al. 2017). Relatively rapid LD decay implies that a higher number of markers is required for GWAS, but it can also result in higher mapping resolution (Abdurakhmonov and Abdukarimov 2008). Next-generation sequencing (NGS) platforms could help generate a larger number of markers for GWAS, particularly for the D genome, where marker coverage was low (Lozada et al. 2017). Population structure has a strong influence on the results of GWAS (Flint-Garcia et al. 2003). In this study, population structure inferred by STRUCTURE software indicated that four sub-populations were appropriate for delineating the panel, which suggests that substantial numbers of genotypes did not share the same parents and that diversity within the association panel was high. Most landraces were included in Sub-G3, which was separated from the cultivars, and cultivars from different wheat regions tended to fall into different sub-groups. This trend was observed previously in germplasm drawn from multiple breeding programs (Ogbonnaya et al. 2017).

Consistent and novel MTAs for TGW

Grain weight, as a highly heritable trait, makes significant contributions to yield potential (Peng et al. 2003). Many loci controlling grain weight have been identified on all 21 common wheat chromosomes (Hanif et al. 2016; Sun et al. 2017; Sharma et al. 2008; Gao et al. 2017). For TGWNS, a comparison with previous studies revealed that five of the ten stable loci coincided with known genomic regions (previously known QTL/genes for grain weight and/or grain size) and five were novel (Table 3). Two stable QTLs were detected on chromosome 1B, one at 65 cM and one at 142 cM (Table 3). QTgw.cau.1B_65 was close to a previously mapped TGW QTL that was reportedly selected during breeding improvement (Gao et al. 2017). QTgw.cau.1B_142 was also identified in the Wheat Association Mapping Initiative (WAMI) panel of 287 elite spring bread wheat lines (Sukumaran et al. 2018). Moreover, QTgw.cau.3A_161 on chromosome 3A was closely linked to the TaTGW6-A1 gene (Hanif et al. 2016). The TGW6 gene determines grain weight and encodes a protein with indole-3-acetic acid (IAA)-glucose hydrolase activity in rice (Ishimaru et al. 2013). Furthermore, the QTL QTgw.cau.5A_140-142 on chromosome 5A was identified by four stable SNPs (BS00021860_51, BS00066421_51, IACX5640, and Tdurum_contig82476_64); Zanke et al. (2015) also reported a QTL at this location affecting TGW in European winter wheat lines. QTgw.cau.5B_30-33, identified by three stable SNPs (RAC875_c47084_378, wsnp_Ex_c8962_14947544, and Ex_c8962_467) on chromosome 5B, was close to a previously reported QTL for TGW (Su et al. 2018). Within the confidence interval of this locus, the annotated gene TraesCS5B02G029100, which encodes a serine/threonine-protein phosphatase, is homologous to OsPPKL3, which encodes a protein phosphatase with a Kelch-like repeat domain in rice. OsPPKL3 plays important roles in regulating grain size (Zhang et al. 2012).

Under the late-sown condition, QTgw.cau.6B_72 at 72 cM on chromosome 6B (Table 3) was close to the previously detected marker TaGW2-6B (Mohler et al. 2016). Five SNPs associated with TGWLS were detected at 79 cM on chromosome 6A in three environments (Table S7). This locus was predicted to be TaGW2-6A (Bednarek et al. 2012; Sun et al. 2017). TaGW2 plays a role in both cell division and late grain fill, and significantly affects grain weight by controlling grain width and length (Nadolska-Orczyk et al. 2017). Moreover, TaGW2-6A and TaGW2-6B were also detected in association with TGWNS in multiple environments. In addition, the QTgw.cau.5A_140-142 detected with TGWLS was tightly linked to a haplotype block of stable SNPs (BS00021860_51, BS00066421_51, IACX5640 and Tdurum_contig82476_64) at 142 cM on chromosome 5A, which was significantly associated with TGWNS (Fig. 4a, c). The SNP Tdurum_contig82476_64 was also detected with TGWLS in XY15. Loci of this kind, which can be detected under both normal and stress environments, are considered an important prerequisite for successful marker-assisted selection programs aimed at improving grain yield stability (Cheng et al. 2015).

Promising candidate genes associated with SSI in wheat

SSI is considered an appropriate parameter for selecting stress-tolerant genotypes, and has been widely used in the study of stress tolerance in soybean, wheat and other crops (Ayeneh et al. 2002; Githiri et al. 2006; Talukder et al. 2014). In this study, QSsi.cau.5A_91 was identified by a cluster of stable significant SNPs at 91 cM on chromosome 5A, and another QTL, QSsi.cau.5D_138, associated with SSI at 138 cM on chromosome 5D, was detected in multiple environments. Moreover, the two loci were also stably associated with TGWLS. Interestingly, the vernalization loci VRN-A1 and VRN-D1 were reported near these respective loci (Gao et al. 2017; Ogbonnaya et al. 2017; Yan et al. 2003). Varieties with spring-type alleles are reportedly exposed to heat stress for shorter periods during the grain-filling stage compared to those with winter-type alleles (Ogbonnaya et al. 2017). However, genotyping with allele-specific VRN-A1 primer sets showed that the spring-type allele (Vrn-A1) was rare (MAF < 0.05) among the 688 winter wheat accessions (Table S3), which suggests the presence of a heat tolerance-related gene situated close to VRN-A1. Similar results were also found under heat stress in Mexico with a different population (Sukumaran et al. 2015). Moreover, in the confidence interval of the locus at 91 cM on 5A, the annotated gene TraesCS5A01G383800 encodes a heat shock transcription factor (HSF) (Table S8). Thus, we postulate that TraesCS5A01G383800 is a possible candidate gene for the locus, which will require further investigation. For VRN-D1, 155 accessions had the allele Vrn-D1, and 533 had the allele vrn-D1 (Table S3). The functional marker of VRN-D1 was significantly associated with TGWNS and SSI (Table S9). However, the LD decay distance for the D sub-genome was approximately 9 cM, leaving the possibility that a locus other than VRN-D1 could be involved. The annotated gene TraesCS5D01G393200, located in the confidence interval of the locus at 138 cM on 5D, also encodes an HSF. Further study is needed to understand the biological functions of these HSF loci and genes associated with heat tolerance.

Additionally, the QTL QSsi.cau.4B_33, identified by three stable MTAs (wsnp_Ex_c18318_27140346, wsnp_Ra_c9755_16199734 and wsnp_Ra_c9755_16200944) localized at the same position (33 cM) on chromosome 4B, is likely a novel locus for SSI. Three SNPs (BobWhite_c4303_524, Tdurum_contig28482_427, and RAC875_c45062_305) associated with SSI on chromosomes 1D, 5B, and 7B, respectively, are close to markers that were reported to be associated with heat stress (Valluru et al. 2017). A locus at 0–6 cM on chromosome 6B was associated with SSI (BS00073090_51 and BS00075789_51). This region was implicated in conferring heat tolerance in previous research using a Nongda 3338/Jingdong 6 doubled haploid (DH) population (Guan et al. 2018). A consensus from many studies is that identifying molecular markers closely linked with heat tolerance, and applying them in marker-assisted selection, can increase heat tolerance by aiding the pyramiding of heat tolerance genes (Hays et al. 2007; Mason et al. 2011; Pinto et al. 2010).

Simultaneous improvement of yield and stress tolerance by marker-assisted selection

Resistance to stress in crop plants is an energy-expending process. Consequently, yield potential and resistance to stress are often negatively correlated (Xia et al. 2019; Paliwal et al. 2012). Our study confirmed that TGWNS was positively correlated with TGWLS and SSI, and that TGWLS was negatively correlated with SSI (Fig. S4). This is partly because wheat breeding usually focuses on selection for high yield, and hence progress in improving resistance to stress has been slow (Blum 1996). An analysis of grain weight and stress tolerance in modern Chinese cultivars from different decades showed that increased TGW had been strongly selected in breeding, but stress tolerance was not concurrently improved (Fig. 7a, c). Correspondingly, among the top 20 high-TGW genotypes, almost all had more than 6 TGWNS superior alleles, but most contained only one or no SSI superior allele (Fig. 6). This indicates that genotypes with high TGW accumulated superior alleles for TGWNS but not the superior alleles associated with stress tolerance (Figs. 6 and 7b, d). Therefore, it seems that the current selection strategy in high-yield wheat breeding has resulted in a lack of plasticity, which matters especially now that climate change is beginning to have a major impact. In this study, we found that Sub-G4 had the highest TGW under normal-sown conditions together with a lower SSI (Fig. S6). Moreover, some accessions in this study (e.g., Neixiang5 and Taishan4606) combined high thousand grain weight with strong stress tolerance (Table S1), which indicates that breeding stress-tolerant and high-yield wheat to address global warming is feasible. Rational design based on the combination of precise genetic dissection of agronomic traits and high-resolution chromosome haplotyping analysis has been proposed as a way to increase the effectiveness of pyramiding desirable traits (Zeng et al. 2017). Identification of superior alleles in our study will be beneficial for rational design in simultaneously improving grain yield and yield stability in wheat.

Fig. 7 Average TGW and SSI, and number of superior alleles in landraces, and cultivars released before the 1980s, 1980s, 1990s, 2000s, and 2010s. The distributions of a TGW under normal conditions; b number of superior alleles for TGW under normal conditions; c SSI; and d number of superior alleles for SSI. Values shown are means ± SEM