Introduction

Wheat is one of the most important food crops in the world, serving as the major source of carbohydrate and protein for 35% of the human population (Paux et al. 2008). It has been estimated that wheat yields must increase by over 60% in the next 30 years to meet the demands of growing populations (Langridge 2013). Although significant progress has been made in wheat yield improvement during the last 50 years, yield growth rates are generally at no more than 1% per annum (Fischer and Edmeades 2010; Gao et al. 2017). Therefore, genetic improvement in yield potential is required, along with better crop management, to achieve further increases in wheat yield in farmers’ fields.

Thousand grain weight (TGW), grain number per spike (GNS), and spike number per unit area are three major yield components of wheat. They are less sensitive to the environment than yield itself and are treated as indirect traits for yield improvement (Xu et al. 2017). Historically, genetic gains in grain yield potential have been driven mainly by increased grain number per unit area, but positive contributions of TGW were also observed in China and other countries (Fischer 2008; Gao et al. 2017; Sadras and Lawson 2011). TGW is a complex trait determined by grain size and grain filling (Simmonds et al. 2014). Grain size can be broken into three components: grain length (GL), grain width (GW), and grain thickness. From a developmental point of view, GL is determined in the early stage of grain development and is less influenced by environmental conditions, whereas GW and grain thickness are established later and are more environmentally sensitive (Lizana et al. 2010; Prashant et al. 2012; Xie et al. 2015). Large grains are generally favored in wheat breeding because they contribute not only to TGW, but also to seedling emergence/vigor and consumer preference (Chastain et al. 1995; Gegas et al. 2010). Grain filling can be divided into two components, namely rate and duration (Xie et al. 2015). Grain filling rate (GFR) reflects the efficiency of dry matter accumulation (Shewry 2009), and grain filling duration (GFD) reflects the time that it takes (Xie et al. 2015). Both rate and duration contribute to grain weight, with the former showing a stronger correlation with grain weight than the latter (Wang et al. 2009; Xie et al. 2015). Selection of cultivars with high GFR appears to be a promising choice to increase grain yield in the Yellow and Huai River Valleys Winter Wheat Region in China where grain filling duration cannot be freely prolonged under the annual wheat–maize double-cropping system (Wang et al. 2009).
A negative correlation between TGW and GNS is a typical selection trade-off in breeding, but the molecular mechanisms underlying the individual traits and their interactions are still largely unknown (Xu et al. 2017).

With the use of molecular markers, a large number of quantitative trait loci (QTL) for TGW have been identified on all 21 wheat chromosomes (Cabral et al. 2018; Campbell et al. 1999; Cheng et al. 2017; Cui et al. 2014; Gao et al. 2015; Guan et al. 2018; Huang et al. 2003; Jahani et al. 2019; Jia et al. 2013; Li et al. 2018; Liu et al. 2018; 2019; Prashant et al. 2012; Quarrie et al. 2005; Simmonds et al. 2014; Su et al. 2018; Wang et al. 2019; Wu et al. 2015; Xu et al. 2017; Yan et al. 2017; Yu et al. 2017; Zhai et al. 2018). Many studies were conducted to identify QTL associated with GL and GW with the availability of high-throughput phenotyping pipelines for grain morphology traits (Cabral et al. 2018; Gegas et al. 2010; Li et al. 2018; Maphosa et al. 2014; Su et al. 2018; Wang et al. 2019; Williams and Sorrells 2014; Xiao et al. 2011; Zhai et al. 2018). However, only a few studies focused on GFR (Bhusal et al. 2017; Charmet et al. 2005; Griffiths et al. 2015; Wang et al. 2009; Xie et al. 2015). Charmet et al. (2005) mapped a stable QTL for maximum GFR on chromosome 2B and an environment-specific one on chromosome 3A. Wang et al. (2009) documented six stable QTL associated with mean and/or maximum GFR on chromosomes 1B, 2A, 3B, 5B, 6D, and 7D and seven environment-specific QTL on chromosomes 1A, 2D, 3A, 3B, 3D, 4D, and 5B. An environment-specific QTL on chromosome 7B was reported by Griffiths et al. (2015). Xie et al. (2015) conducted a comprehensive study on GFR by investigating the initial, rapid, late, mean, and maximum GFR of grains from the first, second and third florets within a spikelet. Important QTL clusters on chromosomes 2A, 3B, 4A, 5DL, and 7B, as well as other genetic regions on chromosomes 1A, 1DS, 2D, 3DL, 4DL, 5A, and 7D, were identified (Xie et al. 2015). Bhusal et al. (2017) detected a stable QTL for GFR on chromosome 2A and an environment-specific QTL on chromosome 6D. 
Although hundreds of QTL for grain-related traits have been identified, subsequent fine mapping was reported in only a few publications, and no gene had been isolated through map-based cloning, owing to the lack of a genome sequence, until WAPO1 was identified as a promising candidate gene for a 7AL locus affecting spikelet number per spike (Brinton and Uauy 2018; Kuzay et al. 2019). With the release of wheat genome sequences, more effort will most likely be devoted to map-based cloning. Therefore, rapid identification and validation of major stable QTL for grain-related traits and tightly linked markers are important.

Traditionally, QTL are identified by linkage analysis using progenies derived from crosses between parents showing contrasting phenotypes. Genetically distant parents are chosen in order to generate complete linkage maps. This can lead to unwanted variation that decreases the accuracy of evaluation of target traits and increases the complexity of subsequent fine mapping (Takagi et al. 2013). On the other hand, whenever populations derived from closely related parents are used, the number of DNA markers becomes a limiting factor (Takagi et al. 2013). Genotyping an entire segregating population with markers covering the whole genome is laborious, time-consuming, and expensive (Zou et al. 2016). Recent progress in wheat genome sequencing and the availability of high-throughput chip-based markers have accelerated QTL analysis and made the use of populations derived from genetically closely related parents possible. Selective genotyping of individuals from the high and/or low tails of a phenotypic distribution provides a cost-effective alternative approach for genetic mapping with negligible practical disadvantage in terms of detection power (Sun et al. 2010). This methodology has been successfully used in rice, maize, and rye (Farkhari et al. 2013; Gimhani et al. 2016; Myskow and Stojalowski 2016), whereas it has not been reported in wheat for identification of QTL associated with complex traits such as grain weight and size.

The objectives of this study were to: (1) identify loci controlling TGW, GL, GW, and GFR in a wheat recombinant inbred line (RIL) population developed from two sister lines by a selective genotyping approach, (2) investigate their effects on GNS, and (3) develop and validate breeder-friendly markers for marker-assisted selection (MAS) in wheat breeding.

Materials and methods

Plant materials

The mapping population of 266 F6 RILs was derived from a cross between Zhongmai 871 and Zhongmai 895 (hereafter ZM871 and ZM895, respectively) by single-seed descent. ZM895 was released in 2012 jointly by the Institute of Crop Sciences and the Institute of Cotton Research, Chinese Academy of Agricultural Sciences. Now it is a leading cultivar in the Yellow and Huai River Valleys Winter Wheat Region, with an annual production area around 0.45 million ha. ZM871 and ZM895, developed by pedigree selection and fixed at F5, are two sister lines that could be traced back to a single F2 plant of the Zhoumai 16/Liken 4 cross. As revealed by the genotyping results of the 90 K SNP array, 3% of the markers were polymorphic between ZM871 and ZM895 compared with around 10–15% among genetically distant varieties (Dong et al. 2016; Wang et al. 2014). They exhibited similar agronomic traits such as plant height, heading, and maturity dates. However, ZM895 had higher TGW, larger grain, faster GFR, and lower GNS than ZM871 (Fig. S1). Two sets of materials were used for QTL validation. The first comprised backcross (BC1) F4 populations ZM871/ZM895//ZM871 and ZM871/ZM895//ZM895 of 121 and 175 recombinant inbred lines, respectively. The second consisted of 186 advanced breeding lines from a joint wheat breeding program conducted by the Institute of Crop Sciences and the Institute of Cotton Research, Chinese Academy of Agricultural Sciences. Pedigrees and other relevant information are listed in Table S1.

Two hundred and forty-six F2:8 RILs from the Zhou 8425B/Chinese Spring cross and 275 F2:6 RILs from the Doumai/Shi 4185 cross were used to examine the relationships between the 5B QTL for TGW and GL found in the present work and those reported in our previous studies (Gao et al. 2015; Li et al. 2018). Relevant information about these populations and their parents can be found in previous reports (Gao et al. 2015; Li et al. 2018).

Field trials and phenotyping

Details of the growing environments and traits evaluated in the RIL and BC1F4 populations are summarized in Table S2. Field trials were carried out over three cropping seasons at Anyang and Shangqiu in Henan province (2014–2015, 2015–2016 and 2016–2017) and over two seasons at Zhoukou, also in Henan (2015–2016 and 2016–2017), and at Xianyang in Shaanxi province (2014–2015 and 2015–2016). A randomized complete block design was used for all populations in all environments with no replication in 2014–2015, two replications in 2015–2016 and three replications in 2016–2017. Each plot comprised two 1-m rows spaced 25 cm apart. Thirty seeds were sown evenly in each row. TGW was evaluated in duplicate by weighing 200 grains after the grain had been dried to a constant moisture content at room temperature. GL and GW were calculated using image analysis software (Image-Pro Plus 6.0, http://www.mediacy.com/) after scanning 50 sound, fully developed grains placed on a scanner panel with grains crease-down. Flowering dates were visually assessed and recorded when 50% of the spikes had extruded their anthers. Physiological maturity was recorded when 50% of the peduncles lacked green color. GFD was the number of days from flowering to maturity. Mean GFR (g/day) was calculated as GFR = TGW/GFD. GNS was calculated from the mean grain number of 30 representative spikes of each plot at physiological maturity. Collectively, TGW, GL, GW, and GNS of the ZM871/ZM895 RIL population were evaluated in 10 environments, and GFR of this population was evaluated in six environments, whereas TGW, GL, GW, and GNS of the two BC1F4 populations were evaluated in eight environments, and their GFR was evaluated in four environments.
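The GFD and mean GFR definitions given above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the example dates and TGW value are hypothetical and are not data from the study.

```python
from datetime import date


def grain_filling_duration(flowering: date, maturity: date) -> int:
    """GFD: number of days from flowering to physiological maturity."""
    days = (maturity - flowering).days
    if days <= 0:
        raise ValueError("maturity must fall after flowering")
    return days


def mean_grain_filling_rate(tgw_g: float, gfd_days: int) -> float:
    """Mean GFR (g/day), computed as TGW / GFD."""
    return tgw_g / gfd_days


# Hypothetical plot record (illustrative values only).
gfd = grain_filling_duration(date(2016, 4, 25), date(2016, 6, 4))  # 40 days
gfr = mean_grain_filling_rate(48.0, gfd)                           # 1.2 g/day
```

Because TGW is the weight of one thousand grains, the resulting rate is expressed per thousand grains.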

Among the 186 advanced breeding lines, 62 were sown at Anyang, 26 at Xinxiang in Henan province and the remaining 98 lines were grown at both locations (Table S1). As a result, 160 lines were evaluated at Anyang and 124 lines were assessed at Xinxiang in the 2016–2017 cropping season in 4.0 × 1.6 m six-row plots using a randomized design with no replication, and a check was included every 12 plots. The planting density was 2.4 × 10⁶ plants/ha. TGW, GL, and GW were recorded using a scaled camera-assisted phenotyping system (Wanshen Detection Technology Co., Ltd., Hangzhou) and GNS was evaluated using the same method as for the mapping population.

Field trials and phenotypic evaluation of the Zhou 8425B/Chinese Spring and Doumai/Shi 4185 populations were described in our previous studies (Gao et al. 2015; Li et al. 2018). In brief, the Zhou 8425B/Chinese Spring population was grown at Zhengzhou and Zhoukou in Henan province during the 2012–2013 and 2013–2014 cropping seasons, providing TGW data for four environments. The Doumai/Shi 4185 population was evaluated at Shunyi in Beijing and Shijiazhuang in Hebei province for three successive cropping seasons (2012–2013, 2013–2014 and 2014–2015), providing data of TGW, GL, and GW from six environments.

Genotyping

Genomic DNA was extracted from young leaves using the CTAB method (Doyle and Doyle 1987). Based on mean values of TGW obtained from cropping season 2014–2015, 60 lines showing extreme phenotypes, including 30 lines exhibiting the highest TGW and 30 lines with the lowest TGW, were selected from the mapping population and genotyped using a Wheat 660 K SNP array (http://wheat.pw.usda.gov/ggpages/topics/Wheat660_SNP_array_developed_by_CAAS.pdf). The selected proportion was 11% at each tail of the phenotypic distribution, which is expected to give a 95% probability of detecting QTL with large effects (explaining more than 10% of total phenotypic variation) and a 75% probability of detecting QTL with medium effects (explaining around 7% of total phenotypic variation) (Sun et al. 2010). Genotyping was performed by CapitalBio Corporation (http://www.capitalbio.com) according to the Affymetrix Axiom 2.0 Assay Manual Workflow protocol. Zhou 8425B/Chinese Spring and Doumai/Shi 4185 populations were genotyped using the 90 K SNP array as described by Gao et al. (2015) and Li et al. (2018).
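The tail selection described above amounts to ranking lines by their TGW means and taking a fixed number from each end. The sketch below illustrates this under stated assumptions: the function names and the toy TGW values are hypothetical, and only the counts 30 and 266 come from the text.

```python
def select_tails(phenotypes: dict, n_per_tail: int):
    """Return (low_tail, high_tail) lists of line IDs ranked by phenotype."""
    if n_per_tail < 1 or 2 * n_per_tail > len(phenotypes):
        raise ValueError("tails must be non-empty and must not overlap")
    ranked = sorted(phenotypes, key=phenotypes.get)  # ascending by trait value
    return ranked[:n_per_tail], ranked[-n_per_tail:]


def selected_proportion(n_per_tail: int, population_size: int) -> float:
    """Fraction of the population genotyped in each tail."""
    return n_per_tail / population_size


# Counts from the text: 30 lines per tail out of 266 RILs (about 11%).
proportion = selected_proportion(30, 266)

# Hypothetical TGW means (g) for a toy population of six lines.
toy_tgw = {"RIL1": 41.2, "RIL2": 47.9, "RIL3": 44.0,
           "RIL4": 50.3, "RIL5": 39.8, "RIL6": 45.5}
low, high = select_tails(toy_tgw, 2)
# low  -> ["RIL5", "RIL1"]
# high -> ["RIL2", "RIL4"]
```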

Array-based SNP markers closely linked to the QTL for TGW-related traits were converted into Kompetitive Allele Specific PCR (KASP) markers for QTL confirmation. Allele-specific and common reverse primers for each KASP marker were designed using PolyMarker (http://polymarker.tgac.ac.uk/), a fast polyploid primer design pipeline. Newly designed KASP markers were evaluated for polymorphisms between the parents before genotyping the entire mapping population. Two BC1F4 populations and advanced breeding lines were genotyped with the flanking markers of the QTL regions on chromosomes 1AL, 2BS, and 3AL and three markers (Kasp_5B4, Kasp_5B8, and Kasp_5B11) representing the 5B QTL region. Zhou 8425B/Chinese Spring and Doumai/Shi 4185 populations were genotyped using KASP markers in the 5B QTL region that showed polymorphisms between two parents. KASP assays were performed in a 5 μl reaction volume containing 2.5 μl 2 × KASP Master Mix, 0.056 μl KASP primer mix and 2.5 μl genomic DNA at 30 ng/μl. Fluorescence was detected in a Synergy H1 microplate reader (BioTek Instruments Inc., USA) and the data were analyzed using KlusterCaller 2.24 (KBioscience, UK).
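As a worked example of the reaction recipe above, the following sketch scales the per-reaction volumes to a batch of reactions and computes the DNA input per reaction. The per-reaction volumes and DNA concentration are taken from the text; the function names, the optional overage parameter and the batch size of 96 are assumptions made for illustration.

```python
# Per-reaction volumes (μl) of the premixed reagents, as stated in the text.
PER_REACTION_UL = {"2x KASP Master Mix": 2.5, "KASP primer mix": 0.056}
DNA_UL = 2.5          # genomic DNA volume per reaction (μl)
DNA_NG_PER_UL = 30.0  # genomic DNA concentration (ng/μl)


def premix_volumes(n_reactions: int, overage: float = 0.0) -> dict:
    """Total reagent volumes (μl) for n reactions.

    `overage` is a hypothetical pipetting allowance (e.g. 0.1 for 10%);
    it is not part of the published protocol.
    """
    scale = n_reactions * (1.0 + overage)
    return {name: vol * scale for name, vol in PER_REACTION_UL.items()}


def dna_input_ng() -> float:
    """DNA mass per reaction (ng)."""
    return DNA_UL * DNA_NG_PER_UL


volumes = premix_volumes(96)  # one hypothetical 96-well plate
dna_ng = dna_input_ng()       # 75 ng per reaction
```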

Map construction and QTL analysis

A SNP-based genetic map was constructed using 60 selected RILs from the mapping population for preliminary QTL identification. Markers were discarded if they were monomorphic between parents or missing (treating heterozygous as missing) in either of the two parents, contained > 20% missing data or showed minor allele frequencies < 0.2. The BIN function in IciMapping 4.1 (http://www.isbreeding.net/) was used to remove redundant markers (co-segregating markers) to reduce the complexity of calculation. Linkage analysis was performed with JoinMap 4.0 using the regression mapping algorithm. Linkage groups with fewer than five markers and markers with no linkage were discarded from the subsequent analysis. The remaining linkage groups were assigned to chromosomes based on the 660 K genetic map reported by Cui et al. (2017). Selective genotyping data can be subjected to marker-based analysis, in which trait means are compared between classes defined based on marker genotypes, or to ‘trait-based’ analysis, in which marker allele frequencies are compared between classes of progeny defined based on trait values (Navabi et al. 2009). Single marker analysis (SMA) and selective genotyping mapping (SGM) were performed with IciMapping 4.1 software to find potential chromosome regions responsible for TGW, GL, GW, and GFR. Phenotypic data for the mapping population obtained from three locations in cropping season 2014–2015 were used to declare significant associations between marker genotypes and traits, with default LOD thresholds of 2.5 in SMA and 5.0 in SGM. Proportions of the bottom and top tails used in SGM were set to 0.5. To minimize the probability of false positives in selective genotyping analysis and identify stable QTL, a QTL was declared and chosen for further confirmation only when at least two closely linked SNP markers simultaneously showed significant associations with the TGW-related traits and at least one marker was detected in two or more environments.
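The marker filtering criteria stated above (polymorphic and non-missing in both parents, at most 20% missing data, minor allele frequency of at least 0.2) can be expressed as a simple predicate. This is a minimal sketch, not the pipeline actually used: the genotype encoding ("A", "B", or None) and the function name are hypothetical, and treating heterozygous calls in the progeny as missing is an assumption (the text states this only for the parents).

```python
from typing import Optional, Sequence

Call = Optional[str]  # "A" or "B"; None stands for a missing/heterozygous call


def keep_marker(parent1: Call, parent2: Call, calls: Sequence[Call],
                max_missing: float = 0.20, min_maf: float = 0.20) -> bool:
    """Return True if a marker passes the filters described in the text."""
    # Both parents must be called and must carry different alleles.
    if parent1 is None or parent2 is None or parent1 == parent2:
        return False
    called = [c for c in calls if c is not None]
    if not called:
        return False
    # Discard markers with more than 20% missing data.
    if (len(calls) - len(called)) / len(calls) > max_missing:
        return False
    # Discard markers whose minor allele frequency is below 0.2.
    n_p1 = sum(1 for c in called if c == parent1)
    minor_count = min(n_p1, len(called) - n_p1)
    return minor_count / len(called) >= min_maf


# Toy markers scored on ten hypothetical lines.
balanced = keep_marker("A", "B", ["A"] * 5 + ["B"] * 5)               # kept
monomorphic = keep_marker("A", "A", ["A"] * 10)                       # discarded
sparse = keep_marker("A", "B", ["A", "B", None, None, "A",
                                "B", "A", "B", "A", None])            # discarded
skewed = keep_marker("A", "B", ["A"] * 9 + ["B"])                     # discarded
```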

The genetic maps used for QTL confirmation were constructed with JoinMap 4.0 using 26 KASP markers (Table S3) that were converted from SNP markers closely linked to the preliminarily identified QTL. QTL analysis was performed by inclusive composite interval mapping (ICIM) using IciMapping 4.1. Phenotypic data obtained from individual environments and the best linear unbiased estimators (BLUEs) across 10 (TGW, GL, GW, and GNS) or six (GFR) environments were used for QTL detection. A LOD threshold of 2.0 was set based on 1000 permutation tests at P < 0.01.

Genotypes of KASP markers were merged with those of the 90 K SNP array, and new linkage maps of chromosome 5B were generated for Zhou 8425B/Chinese Spring and Doumai/Shi 4185 populations. Map construction and QTL analysis followed the procedures described in previous reports (Gao et al. 2015; Li et al. 2018).

Physical positions of mapped SNPs in the QTL regions were obtained by blasting SNP flanking sequences against the Chinese Spring reference genome sequence (IWGSC 2018).

Statistical analysis

Phenotypic data analyses were conducted with SAS 9.2 software (SAS Institute Inc, Cary, NC, USA). PROC MIXED was used in the analysis of variance (ANOVA) to evaluate the contributions of lines (RILs) and environments, where environments, lines, line × environment interaction and replicates nested in environments were all considered as random effects. In parallel, a model considering lines as fixed factors was fitted for estimating BLUEs of lines across environments. Adjusted means of each line for each trait in individual environments were separately computed with PROC MIXED. Original phenotypic data obtained from the 2014–2015 cropping season and adjusted mean phenotypic data of each environment obtained from cropping seasons 2015–2016 and 2016–2017 were used for broad-sense heritability (h_b²) estimates and Pearson’s correlation analyses. Broad-sense heritability on a genotype mean basis was estimated following Holland et al. (2003). Genotypic and phenotypic correlation coefficients among different traits were also estimated (Holland 2006). Homozygous lines in the ZM871/ZM895//ZM871 and ZM871/ZM895//ZM895 populations were used to verify QTL effects. The differences in TGW, GL, GW, GFR, and GNS between two classes of genotypes (homozygous for ZM871 and homozygous for ZM895) were calculated by PROC MIXED, treating genotypes as fixed effects, and lines nested in genotypes, environments, environment-related interactions, and replicates nested in environments as random effects. In the advanced breeding lines, QTL effects on TGW, GL, GW and GNS were evaluated by performing Student’s t tests. The effects of the 2BS and 5B QTL were evaluated at individual marker level instead of interval level because of frequent recombination. Generally, QTL repeatedly detected in different environments and/or across multiple genetic backgrounds were considered to be stable.
In the present study, a QTL was considered to be major and stable when it was detected in more than three environments and had significant effects in at least one set of the validation materials, accounting for more than 10% of the phenotypic variation. Associations between KASP markers and phenotypic values in the Zhou 8425B/Chinese Spring and Doumai/Shi 4185 populations were determined by Student’s t tests.
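The Student's t tests mentioned above compare trait means between two genotype classes at a marker. A minimal sketch of the pooled-variance test statistic is shown below; it is an illustration only, the function name and toy values are hypothetical, and it stops at the statistic and degrees of freedom because a p-value additionally requires the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    # Pooled estimate of the common variance of the two classes.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical TGW values (g) for lines carrying either allele at one marker.
zm895_like = [46.0, 48.0, 50.0]
zm871_like = [42.0, 44.0, 46.0]
t_stat, dof = student_t(zm895_like, zm871_like)  # t = sqrt(6), about 2.449; df = 4
```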

Results

Phenotypic evaluations

ANOVA showed that line and line × environment interaction effects were significant for TGW, GL, GW, GFR, and GNS at P < 0.001, and environment effects were significant for TGW, GL, GW, and GNS at P < 0.05 (Table S4). All traits had broad-sense heritabilities exceeding 0.85. Better among-environment correlations were observed for TGW-related traits than for GNS. Among TGW-related traits, TGW and GL had better among-environment correlations (Table S5). Pearson’s correlation coefficients among environments ranged from 0.30 to 0.89 for TGW-related traits and from 0.26 to 0.75 for GNS.

Larger and heavier grains, faster grain filling and more grains per spike are favored by breeders. ZM895 had larger grain size and higher TGW and GFR, but lower GNS, than ZM871 in all environments tested. Continuous distribution and transgressive segregation were observed for TGW, GL, GW, and GFR, indicating polygenic inheritance (Fig. S2). GNS showed a more-or-less bimodal distribution, suggesting the presence of potential major QTL for GNS in the ZM871/ZM895 RIL population (Fig. S2).

At both the phenotypic and genotypic levels, positive correlations among TGW-related traits and negative correlations between GNS and TGW-related traits were observed (Table S6). TGW, GW, and GFR were highly correlated with each other (r = 0.75–0.96) except GW with GFR (r = 0.64) at the phenotypic level. GL was moderately correlated with GFR (r = 0.48 and 0.38) and weakly correlated with GW (r = 0.18 and 0.26). Weak to moderate negative correlations were observed between TGW and GNS (r = −0.37 and −0.26).

SNP-based genetic map construction and QTL identification

After filtering the genotypic data, 39,189 high-quality polymorphic markers from the Wheat 660 K SNP chip were employed for subsequent analysis. Using the BIN function, 5745 non-redundant markers were identified and used for linkage analysis, of which 4231 were grouped into 65 linkage groups representing all chromosomes except 3D (Table S7; Fig. 1).

Fig. 1

Distribution of the 4231 loci in the 65 linkage groups belonging to 20 chromosomes. Linkage groups of the same chromosome are shown in different colors

Eighty-one and 76 markers were significantly associated with TGW-related traits in SMA and SGM, of which 57 (70%) and 56 (74%) were mapped on four chromosomes 1AL, 2BS, 3AL and 5B, each containing 2–3 QTL (Tables 1, S8, S9). Thirty-two markers (56.1 and 57.1%) and five QTL (62.5 and 71.4%) in these four genetic regions were common between SMA and SGM. Collectively, the 1AL and 2BS QTL regions showed significant effects on TGW and GW, whereas the 3AL and 5B QTL regions had significant effects on TGW and GL. Two GFR QTL were identified on chromosomes 2BS and 3AL. All favorable alleles came from ZM895.

Table 1 Four genomic regions harboring QTL for TGW, GL, GW, and GFR identified by selective genotyping

QTL confirmation

To confirm the preliminarily identified QTL based on two tails of the mapping population, tightly linked SNP markers were converted into KASP markers for QTL analysis. In total, 26 KASP markers were used for genetic map construction by genotyping 266 RILs of the ZM871/ZM895 population. The resulting linkage maps represented segments of chromosomes 1AL, 2BS, 3AL and 5B which contained four, six, seven and nine markers, spanning 2.2, 6.6, 8.2 and 19.3 cM in length, respectively (Fig. 2).

Fig. 2

Genetic maps of chromosomes showing QTL for TGW, GL, GW, GFR and GNS in the Zhongmai 871/Zhongmai 895 RIL population. TGW thousand grain weight, GL grain length, GW grain width, GFR mean grain filling rate, GNS grain number per spike

Inclusive composite interval mapping (ICIM) indicated that chromosomes 1AL, 2BS, 3AL, and 5B contained QTL for TGW, GL, GW and/or GFR with favorable alleles from ZM895 (Tables 2, S10; Fig. 2). The QTL on chromosomes 1AL and 2BS showed significant effects on TGW, GW, and GFR, but no significant effect on GL. Qgw.caas-1AL was a major QTL associated with GW explaining 5.0–20.6% of the phenotypic variation. Chromosome 3AL possessed three QTL associated with TGW, GL, and GW, respectively. Qgl.caas-3AL, a GL QTL detected in all 10 environments, explained 5.7–15.7% of the phenotypic variation. Two major QTL for TGW and GL, respectively, and a minor QTL for GFR were identified on chromosome 5B. Qtgw.caas-5B and Qgl.caas-5BS, observed in all 10 environments, explained 5.7–17.1% and 12.0–19.3% of the phenotypic variation of TGW and GL, respectively.

Table 2 QTL for TGW, GL, GW, GFR, and GNS identified on chromosomes 1AL, 2BS, 3AL and 5B using the entire mapping population

No epistatic interaction among different QTL was identified using IciMapping 4.1, indicating all the QTL had additive effects. There was a linear relationship between phenotype and the number of favorable alleles (Fig. 3), with each additional favorable allele contributing additively to higher phenotypic values. The RILs carrying positive alleles at all four QTL regions exhibited 17.7% higher TGW (7.6 g), 4.1% higher GL (0.29 mm), 5.8% higher GW (0.20 mm), and 23.8% higher GFR (0.3 g/day) than those possessing contrasting alleles.

Fig. 3

Linear regressions between number of favorable alleles and across-environment BLUEs of TGW, GL, GW and GFR in the Zhongmai 871/Zhongmai 895 RIL population. BLUE best linear unbiased estimator, TGW thousand grain weight, GL grain length, GW grain width, GFR mean grain filling rate. Numbers of lines carrying the corresponding number of favorable alleles are shown in brackets. x and Y in the equations represent number of favorable alleles and across-environment BLUEs of TGW (a), GL (b), GW (c) and GFR (d), respectively
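The kind of linear regression summarized in Fig. 3 (trait BLUE regressed on the number of favorable alleles) can be sketched with ordinary least squares. The function name and the toy data below are hypothetical and chosen only to make the arithmetic transparent; they are not values from the study and do not reproduce the fitted equations in the figure.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least two")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical class means: number of favorable alleles vs. TGW BLUE (g).
n_favorable = [0, 1, 2, 3, 4]
tgw_blue = [42.0, 44.0, 46.0, 48.0, 50.0]
slope, intercept = fit_line(n_favorable, tgw_blue)
# slope is the estimated gain in TGW per additional favorable allele
```

With these toy values the slope is 2.0 g per allele and the intercept is 42.0 g.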

Pleiotropic effects on GNS

QTL mapping for GNS was conducted using the new KASP linkage maps for four chromosomes to determine whether these QTL regions had significant effects on GNS. A major GNS QTL, explaining 5.2–15.2% of the phenotypic variation, was identified on chromosome 5BS with ZM871 contributing the favorable allele (Tables 2, S10; Fig. 2). RILs carrying ZM871 alleles had 5.7% higher GNS (2.5 grains) than those having ZM895 alleles. Interestingly, this locus did not share a common marker interval with Qtgw.caas-5B in seven of the eight environments (Table S10), suggesting that variation of TGW and GNS was probably controlled by different genes.

QTL validation

ANOVA of data from the BC1F4 populations indicated a significant influence of genotypes on TGW, GL, GW, GFR, and GNS (Table 3). Significant differences between the ZM871 and ZM895 genotypes in TGW, GW and GFR at the 1AL locus, in TGW, GL, GW, and GFR at the 2BS locus, and in TGW, GL and GW at the 3AL locus were present in both populations. Lines with homozygous ZM895 alleles exhibited significantly higher phenotypic values than those with ZM871 alleles irrespective of QTL region, with the differences ranging from 1.3 to 2.3 g for TGW, from 0.11 to 0.14 mm for GL, from 0.04 to 0.10 mm for GW, and from 0.06 to 0.11 g/day for GFR. Unexpectedly, a significant negative effect on GNS contributed by the ZM895 allele was observed for the 3AL QTL in both populations and for the 1AL and 2BS QTL in the ZM871/ZM895//ZM895 population, with differences ranging from 1.3 to 1.9 grains per spike. Differences in GL (0.15 and 0.16 mm) and GNS (2.1 and 2.4 grains per spike) associated with the 5B QTL were significant in both populations, whereas differences in TGW (2.0 g), GW (0.04 mm) and GFR (0.04 g/day) were significant only in the ZM871/ZM895//ZM895 population. Lines homozygous for the ZM895 5B allele had larger grain size, higher rate of grain filling and grain weight, but lower GNS than those possessing the ZM871 allele.

Table 3 Comparison of TGW, GL, GW, GFR and GNS between Zhongmai 895 (ZM895) and Zhongmai 871 (ZM871) genotypes in the two BC1F4 populations

Experiments on the advanced breeding lines provided further evidence for significant effects of all four QTL (Table 4). At the 1AL locus, the ZM895 allele was significantly associated with higher TGW (4.0 and 5.1 g), GL (0.32 and 0.42 mm) and GW (0.10 and 0.12 mm) at both locations. Eighty-one percent of advanced breeding lines grown at Anyang and Xinxiang had the ZM895 genotype, indicating strong positive selection for the ZM895 allele in past field selection. The ZM895 genotype for the 3AL QTL was also present in high frequencies at both locations (72 and 80%, respectively). Significant differences in GW (0.07 and 0.21 mm) were observed at both locations, whereas differences in TGW (5.5 g), GL (0.33 mm) and GNS (3.0 grains) were identified only at Xinxiang, with the ZM895 allele contributing positive effects on TGW-related traits and a negative effect on GNS.

Table 4 Comparison of TGW, GL, GW, and GNS between Zhongmai 895 (ZM895) and Zhongmai 871 (ZM871) genotypes in advanced breeding lines

Effects of the 2BS and 5B QTL were evaluated by investigating the association between marker genotype and phenotype because frequent recombination was apparent among markers used to genotype the lines. As indicated by Kasp_2B55 in the 2BS QTL region, lines carrying homozygous alleles from ZM895 had significantly higher TGW (2.0 and 2.4 g), GL (0.23 and 0.36 mm) and GW (0.05 and 0.06 mm) than those with ZM871 alleles. The ZM895 genotype was present at lower frequencies (29 and 37%, respectively). For the 5B QTL, Kasp_5B4 and Kasp_5B8 were significantly associated with GL and GNS at both locations, with differences between the two genotypes ranging from 0.13 to 0.28 mm for GL and from 1.9 to 3.4 grains per spike for GNS, respectively. In addition, significant differences in TGW (3.0 and 2.8 g) and GW (0.10 mm) were detected between the ZM895 and ZM871 genotypes at the Kasp_5B8 locus. Seventy-nine and 84% of advanced breeding lines tested at the two locations, respectively, had the ZM895 genotype at the Kasp_5B4 locus compared with 54 and 61% at the Kasp_5B8 locus, indicating strong selection on the ZM895 allele at the Kasp_5B4 locus.

Comparison of the 5B QTL

A TGW QTL, flanked by wsnp_Ra_c5634_9952011 and RAC875_c14882_275, was previously identified in the Zhou 8425B/Chinese Spring population (Gao et al. 2015). In the present study, RAC875_c14882_275 was 4.3 cM from Kasp_5B11 and 5.7 cM from Kasp_5B12 on the new linkage map of chromosome 5B (Fig. S3a). Moreover, the TGW QTL was mapped to a marker interval (JD_c20126_516–Kukri_rep_c105540_177) next to the original one, with Zhou 8425B contributing the favorable allele (Fig. S3a). This QTL was detected in two environments, explaining 5.2 and 8.9% of the phenotypic variation, respectively, in agreement with previous results. Zhou 8425B had the ZM871 genotype, whereas Chinese Spring had the ZM895 genotype at Kasp_5B11 and Kasp_5B12 loci. RILs with the ZM871 genotype exhibited significantly higher TGW than those with the ZM895 genotype (Fig. S3b, c). It is possible that the QTL for TGW in the Zhou 8425B/Chinese Spring and ZM871/ZM895 populations are controlled by the same gene.

Previously, a QTL for GL was mapped near Excalibur_c4232_2834 in the Doumai/Shi 4185 population (Li et al. 2018). This locus was confirmed using the new linkage map comprising Kasp_5B4 and Kasp_5B8 (Fig. S4a), with the favorable allele from Doumai. Kasp_5B8, 0.8 cM from Excalibur_c4232_2834, showed significant effects on GL in all six environments (Fig. S4c). Kasp_5B4 was 8.2 cM from Excalibur_c4232_2834, significantly associated with GL in four out of the six environments (Fig. S4b). Shi 4185 had the ZM871 genotype, whereas Doumai had the ZM895 genotype at Kasp_5B4 and Kasp_5B8 loci. Lines with the ZM895 genotype exhibited significantly higher GL than those with the ZM871 genotype (Fig. S4b, c), suggesting that the QTL for GL in the Doumai/Shi 4185 and ZM871/ZM895 populations are likely the same.

Discussion

Selective genotyping is an economical and effective approach for QTL mapping

Using a mapping population with 266 RILs, an 11% selection proportion at each tail for extreme phenotype and a high-density genetic map, we identified QTL associated with TGW-related traits that explain 5–19% of the phenotypic variation, indicating the effectiveness of selective genotyping in genetic analysis of complex traits in wheat. Confirmation of QTL using the entire mapping population is not required, although it could provide a better estimation of QTL effects that are less accurately assessed in selective genotyping. The probability of finding false positives decreases with increased numbers of markers that simultaneously show significant associations (Sun et al. 2010). In the present study, all QTL were represented by more than one marker and were confirmed by ICIM. No other stable QTL was identified by rerunning the selective genotyping analysis using phenotypic data from six or 10 environments (data not shown), in agreement with the similar among-environment correlations (Table S5) of the phenotypic data for 60 extreme lines observed in the cropping season 2014–2015 and those in the following seasons.

Compared with conventional QTL mapping, selective genotyping is cost-effective when the ratio of genotyping to phenotyping costs is higher than one (Gallais et al. 2007). The overall expenses could be further reduced by excluding an appropriate proportion of individuals with intermediate phenotypes after each round of evaluation (Myskow and Stojalowski 2016), or by replacing complex and expensive techniques with quicker, easier and cheaper ones that are accurate enough to distinguish extreme from intermediate phenotypes. However, it should be noted that selective genotyping is limited to one or a few correlated traits per study, whereas the entire population may need to be genotyped if many traits are considered.
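The cost argument can be made concrete with a simple arithmetic sketch. The model below is an assumption made here for illustration, not the model of Gallais et al. (2007): every line is phenotyped once at unit cost, genotyping costs `cost_ratio` units per line, and only the selected lines are genotyped. It ignores statistical power, so it shows how savings grow with the cost ratio but does not derive the threshold of one cited above.

```python
def relative_saving(n_lines, n_genotyped, cost_ratio):
    """Fraction of total cost saved by genotyping only `n_genotyped` lines.

    cost_ratio is the per-line genotyping cost divided by the per-line
    phenotyping cost (phenotyping cost is taken as 1 unit per line).
    """
    full = n_lines * (1.0 + cost_ratio)                # genotype every line
    selective = n_lines * 1.0 + n_genotyped * cost_ratio
    return (full - selective) / full


# 266 RILs phenotyped, 60 extreme lines genotyped (numbers from this study);
# the cost ratios are hypothetical.
for ratio in (0.5, 1.0, 2.0, 5.0):
    saving = relative_saving(266, 60, ratio)
    print(f"genotyping/phenotyping cost ratio {ratio}: saving {saving:.1%}")
```

Under these assumptions the saving increases with the cost ratio, because the genotyping cost avoided on the unselected lines makes up a larger share of the total budget.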

Generally, many crosses and selections are made in breeding programs every year, resulting in many progenies or lines in which multiple favorable alleles from different genetic resources are present. Selective genotyping provides an excellent way for breeders to explore these materials to detect QTL underlying variation in targeted traits, making QTL identification a by-product of breeding programs. This is particularly attractive to breeders who are mainly interested in identifying QTL for marker-assisted selection of traits of interest (Gallais et al. 2007). The effectiveness of, and allele frequency changes at, the QTL regions on chromosomes 1AL, 3AL and 5B in the advanced breeding lines indicate that combining selective genotyping with breeding practice is feasible. Although marker-based and trait-based analyses are equally powerful in biparental populations (Tables 1, S8, S9; Navabi et al. 2009), the former may be more appropriate for breeding populations because not all loci respond to selection.

Comparison with previous reports

In the present study, QTL for TGW-related traits were mapped on chromosomes 1AL, 2BS, 3AL and 5B, and a QTL for GNS was mapped on chromosome 5BS. Previously identified QTL and cloned genes on the chromosomes mentioned above are summarized in Tables S11 and S12, respectively. In addition to consensus maps, the IWGSC (2018) Chinese Spring reference sequence was used as a common coordinating system for comparisons of QTL identified in different studies.

1AL QTL

The linkage map we generated for mapping the 1AL QTL contained only four KASP markers, spanning 2.2 cM and corresponding to an interval of 307.8–439.0 Mb in the IWGSC reference sequences. This low recombination rate (0.017 cM/Mb) indicated that the 1AL QTL is located in the pericentromeric region. It is difficult to compare its position with those of previously reported QTL because of the strong suppression of recombination and the poor relationship between physical and genetic distance in pericentromeric regions (Campbell et al. 1999; Su et al. 2018; Wang et al. 2009; Xiao et al. 2011). Using the 1A consensus map of Somers et al. (2004) as an example, the marker orders in the intervals Glu-A3–Xwmc24 and Xwmc312–Xgwm99 are in accordance with their physical positions, whereas that in Xgwm357–Xcfd22, a 13 cM interval corresponding to about 50–500 Mb, is not (Fig. S5a). In another high-density consensus map, a pericentromeric region of chromosome 1A corresponds to an interval of about 1 cM covering 100–300 Mb (Fig. S5b; Maccaferri et al. 2015). We could not distinguish QTL that co-located within 50–500 Mb, especially those with large confidence intervals identified using low-density maps (Wang et al. 2009; Xiao et al. 2011). TaSnRK2.3-1A, a homoeologue of the plant-specific protein kinase gene SnRK, was reported to have significant effects on TGW and plant height (Miao et al. 2017). This gene corresponds to TraesCS1A02G215900, located at 381.8 Mb in the IWGSC reference sequences. No variation was detected by sequencing TaSnRK2.3-1A from ZM871 and ZM895, indicating that TaSnRK2.3-1A is not the gene underlying the 1AL QTL.
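The recombination rate quoted above is the genetic length of the interval divided by its physical length. The short Python sketch below reproduces that arithmetic for the 1AL interval and, for comparison, applies the same calculation to the 3AL and 5B intervals reported in the following subsections. The function name `recombination_rate` is introduced here for illustration only.

```python
def recombination_rate(genetic_cm, start_mb, end_mb):
    """Average recombination rate (cM/Mb) over a physical interval."""
    return genetic_cm / (end_mb - start_mb)


# (genetic length in cM, physical start in Mb, physical end in Mb),
# taken from the intervals reported in this Discussion.
intervals = {
    "1AL": (2.2, 307.8, 439.0),
    "3AL": (2.6, 497.7, 533.0),
    "5B": (19.3, 27.5, 394.2),
}

rates = {name: recombination_rate(*iv) for name, iv in intervals.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.3f} cM/Mb")
# The 1AL line prints "1AL: 0.017 cM/Mb" (2.2 cM over 131.2 Mb).
```

These are interval-wide averages; recombination within an interval that spans a centromere, such as the 5B interval, is likely to be far from uniform.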

2BS QTL

The 2BS QTL, located in the 41.4 to 44.3 Mb region, overlapped with several QTL identified for TGW and yield components (Cabral et al. 2018; Kumar et al. 2006; Prashant et al. 2012; Xu et al. 2017). A copy of the photoperiod response gene Ppd-B1 was found at 56.2 Mb by a BLAST search of its sequence (Beales et al. 2007) against the IWGSC reference sequences. It is not likely that this is the causal gene underlying the 2BS QTL, because markers distributed from 50 to 100 Mb on chromosome 2B were not polymorphic in the mapping population and there was no significant effect on heading date. Another gene associated with TGW on the short arm of chromosome 2B is TaSus2-2B (Jiang et al. 2011). It was annotated as TraesCS2B02G194200 and is located at 171.0 Mb, clearly distinct from the present QTL. Qgfr.caas-2B probably represents a new locus because only one GFR QTL has previously been identified on chromosome 2B, in the interval Xgwm148–Xgwm388 (100.8–555.7 Mb; Charmet et al. 2005).
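The positional reasoning in this paragraph amounts to comparing physical intervals on a common coordinate system. A minimal Python sketch of that comparison follows, using the positions quoted above. The helper functions `overlaps` and `gap_mb` and the dictionary labels are introduced here for illustration, and single genes are represented as zero-length intervals at their reported positions.

```python
def overlaps(a, b):
    """True if closed intervals a and b (start_mb, end_mb) share any position."""
    return a[0] <= b[1] and b[0] <= a[1]


def gap_mb(a, b):
    """Physical distance (Mb) between two intervals; 0 if they overlap."""
    return max(0.0, b[0] - a[1], a[0] - b[1])


qtl_2bs = (41.4, 44.3)  # present 2BS QTL interval, in Mb

# Positions on chromosome 2B as quoted in the text above.
features = {
    "Ppd-B1 copy": (56.2, 56.2),
    "TaSus2-2B": (171.0, 171.0),
    "GFR QTL in Xgwm148-Xgwm388": (100.8, 555.7),
}

for name, interval in features.items():
    print(
        f"{name}: overlaps={overlaps(qtl_2bs, interval)}, "
        f"gap={gap_mb(qtl_2bs, interval):.1f} Mb"
    )
```

None of the three features overlaps the 41.4–44.3 Mb interval. Physical separation alone does not exclude a candidate gene, so the paragraph also relies on the marker polymorphism and heading date evidence.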

3AL QTL

The 3AL QTL was located in a 2.6 cM interval that corresponds to 497.7–533.0 Mb in the IWGSC reference sequences (Fig. S5c). Although many QTL controlling TGW, GNS and other agronomic traits have been reported on chromosome 3A (Ali et al. 2011; Bennett et al. 2012; Gao et al. 2015; Huang et al. 2004; Jia et al. 2013; Li et al. 2018; Wu et al. 2012; Zhai et al. 2018; Zhang et al. 2014), most of them were located either in the pericentromeric region (about 100–450 Mb; Fig. S5c) or in the distal regions of 3AS (< 25 Mb) and 3AL (> 625 Mb). In addition, Ma et al. (2018) mapped a QTL for plant height, spike length and TGW at 53.6–57.7 Mb, and Ali et al. (2011) identified a QTL for GNS at 597.5–624.0 Mb. Compared with agronomic traits, few QTL for grain size or GFR have been documented on chromosome 3A. Gegas et al. (2010) detected two QTL for grain size and shape close to Xgwm2 at 60.2 Mb and Xbarc19 at 310.7 Mb. Wang et al. (2009) identified a genetic region affecting GFR and GNS but not TGW in the interval Xwmc505–Xwmc264, corresponding to 90.0–625.7 Mb. Two genes on 3AL, Tackx4 and TaTGW6-A1, located at 712.1 and 722.4 Mb, respectively, were reported to have significant effects on grain weight (Chang et al. 2015; Hanif et al. 2015). The present QTL is therefore likely to be a new locus associated with TGW, GL and GW.

5B QTL

We constructed a 19.3 cM linkage map spanning 27.5 to 394.2 Mb on chromosome 5B, on which QTL for TGW, GL, GW, GFR, and GNS were identified (Table 2; Fig. S5d). Many stable QTL for TGW have been identified in the 5B QTL region (Cui et al. 2014; Huang et al. 2003; Ma et al. 2018; Prashant et al. 2012; Quarrie et al. 2005; Su et al. 2018; Wu et al. 2015; Zhai et al. 2018) and some coincide with QTL for GL and/or GW (Cui et al. 2014; Prashant et al. 2012; Su et al. 2018; Wu et al. 2015; Zhai et al. 2018). This genetic region co-located with a TGW QTL in the Zhou 8425B/Chinese Spring population and a GL QTL in the Doumai/Shi 4185 population (Gao et al. 2015; Li et al. 2018). Zhou 8425B contributed the favorable allele, but its genotype was different from that of ZM895. There are several possible explanations for this phenomenon. The LOD peaks, located near Kasp_5B12 in the ZM871/ZM895 population and between JD_c2012_516 and RAC875_14882_275 in the Zhou 8425B/Chinese Spring population, are 5–20 cM apart. We cannot rule out the presence of two different genes responsible for the QTL, a possibility supported by the many QTL reported in the 5B QTL region. Co-location at the present mapping resolution does not guarantee that the same gene is involved. Alternatively, different alleles of the same gene with different relative effects could also lead to this result. TaSAP7-B, corresponding to TraesCS5B02G200000 at 360.9 Mb on chromosome 5B, was significantly associated with TGW and plant height (Wang et al. 2018). However, no difference was observed in the complete coding sequence and partial promoter sequence (about 700 bp upstream of the ATG) between ZM871 and ZM895. Both carried the superior allele. Wang et al. (2009) identified two QTL for GFR in marker intervals Xcfd7–Tx37-38 and Xbarc232–Xbarc275, respectively. Xbarc74 (402.7 Mb) was 14.1 cM from Xcfd7 and 80.4 cM from Xbarc232 at 619.8 Mb, suggesting that Qgfr.caas-5B may be a new QTL for GFR.
As for GNS, several environment-specific QTL have been detected in the pericentromeric region of chromosome 5B (Cui et al. 2014; Li et al. 2015; Tang et al. 2011), but none of them is located in the 35.1–42.1 Mb (Kasp_5B3–Kasp_5B4) interval. The DEP1 gene plays important roles in regulating grain number per panicle and grain yield in rice (Huang et al. 2009). One of its homologs is located at 378.5 Mb (between Kasp_5B9 and Kasp_5B11) on chromosome 5B, at least 10 cM from Kasp_5B4 (42.1 Mb). Given this distance, and because there is no direct evidence that DEP1 plays the same role in wheat, DEP1 is unlikely to be the gene underlying Qgns.caas-5BS. These results suggest that Qgns.caas-5BS is a new GNS QTL.

Applications in wheat breeding

Major stable QTL for yield-related traits and their tightly linked markers are of high importance in molecular breeding. In this study, the QTL for TGW, GW, and GFR on chromosome 1AL showed consistent effects on TGW and GW and negligible TGW–GNS tradeoffs in different genetic backgrounds, and was strongly selected in breeding; it therefore represents a valuable target for MAS to enhance grain size and weight. The availability of time-saving and cost-effective KASP markers could facilitate its use in wheat breeding.

As shown in the present and previous studies, the 5B QTL is located in an important but complex region with more than one gene responsible for TGW, GL, and GNS (Table S11). Results from the Zhou 8425B/Chinese Spring population indicated that associations between the favorable allele and marker genotype depended upon the genetic background. This is a major limitation in the application of MAS in breeding (Liu et al. 2012). In the present study, Qtgw.caas-5B, Qgl.caas-5BS and Qgns.caas-5BS were major stable loci located in different marker intervals. A more precise delimitation of these QTL is needed to determine whether they are caused by the same or closely linked genes before using them to improve grain weight and size or GNS. Therefore, to initiate fine mapping of these QTL, two BC1F4 lines from the ZM871/ZM895//ZM871 population with residual heterozygosity at the 5B QTL were identified and self-pollinated to generate heterogeneous inbred lines (HILs). The availability of genome sequences and high-throughput KASP genotyping systems will make fine mapping easier and faster. The exploration of high-resolution genotyping data in large diversity collections has shown great potential for elucidating candidate genes (Voss-Fels et al. 2019). Applying this approach in our future fine mapping work may save years of self-pollinating, genotyping and phenotyping.