Introduction

Wheat is the most important food crop and ranks first in harvested area, total production, and traded volume worldwide (Hawkesford et al. 2013; Tripathi et al. 2016). Wheat production largely contributes to food security, socioeconomic development, and living standards (Piao et al. 2010). Kernel number per spike (KNPS) is one of the key factors affecting yield (Fischer 2008; Reynolds et al. 2009; Gao et al. 2017). An unbalanced distribution of grains per spikelet along the spike (top, center, and bottom of a spike) has been widely reported and the fertility of the apical and basal spikelets showed greater effects on KNPS than the middle spikelets in wheat spikes (Ferrante et al. 2013a; Guo and Schnurbusch 2015). Therefore, breeding wheat with higher fertility of the apical and basal spikelets could increase KNPS and thus yield (Acreche et al. 2008; Zheng et al. 2016).

Lower fertility or sterility of apical and basal spikelets is commonly observed in cereal crops such as wheat and rice (Satoh-Nagasawa et al. 2006; Meng et al. 2007; Gallavotti et al. 2011; Guo and Schnurbusch 2015). Genetic variation has been found among genotypes, with most varieties showing lower fertility of the apical and basal spikelets. To understand the mechanism of this phenomenon, many morphological and physiological studies of spike development have been conducted (Bancal 2009; Shitsukawa et al. 2009; González et al. 2011; Ferrante et al. 2013b; González-Navarro et al. 2015). Langer and Hanif (1973) found that spikelet development was unsynchronized and depended on position within the spike: the two basal spikelets developed much more slowly, leading to lower fertility or complete sterility of basal spikelets. Spikelet fertility was also found to be strongly affected by environmental factors such as planting density (Mishra and Mohapatra 1987), trace elements (Rerkasem and Jamjod 1997), sowing time (Saifuzzaman et al. 2008), and drought (Dencic et al. 2000). Good crop management is essential to maximize the number of fertile florets and improve grain set in apical and basal spikelets (Ferrante et al. 2010, 2012; Dreccer et al. 2014; Zheng et al. 2014).

Many studies have been conducted in cereal crops to identify genes or quantitative trait loci (QTL) controlling the fertility of apical and basal spikelets (Yamagishi et al. 2004; Li et al. 2009; Tan et al. 2011; Cheng et al. 2011; Akter et al. 2014). In rice, Yamagishi et al. (2004) located three QTL affecting pre-flowering basal floret abortion on chromosomes 1, 10, and 11, respectively. A candidate gene, Short panicle1, was isolated, and mutation of this gene caused significant reductions in basal floret numbers (Li et al. 2009). An interactive effect was also found between different QTL for apical spikelet fertility (Tan et al. 2011). qPAA8, a gene controlling panicle development in rice, has been fine mapped to a 68-kb region on chromosome 8 (Cheng et al. 2011). Few studies in wheat have addressed the inheritance of grain set in apical and basal spikelets (Guo et al. 2015; Guo et al. 2017). After performing genome-wide association studies (GWAS) of 16 floret fertility traits in 210 European winter wheat accessions, Guo et al. (2017) proposed a genetic network underlying floret fertility and related traits, nominating determinants for improved yield performance. A full understanding of the genetics of grain setting at the molecular level is needed for breeders to improve apical and basal floret fertility.

The aim of this study was to identify marker-trait associations (MTAs) for the fertility of the three apical and three basal spikelets. Grain numbers in the apical and basal spikelets of 212 wheat varieties were collected in five environments (year/location combinations). Genome-wide association studies revealed several MTAs controlling grain numbers in apical and basal spikelets. These MTAs have the potential to be used in future fine mapping, cloning, and marker-assisted selection.

Materials and methods

Plant materials

The materials consisted of 212 wheat varieties, including 200 from China, 3 from Italy, 1 from Japan, 1 from Pakistan, and 7 with unknown origins (Table S1). The Chinese varieties were from Jiangsu (63), Henan (22), Shaanxi (21), Shandong (19), Sichuan (15), Beijing (12), Anhui (11), Hunan (9), Hebei (7), Hubei (6), Shanxi (4), Fujian (4), Gansu (3), Guizhou (2), Jiangxi (1), and Zhejiang (1).

Phenotyping

All genotypes were planted at three locations (Jingzhou (JZ) in Hubei province; Yangzhou (YZ) in Jiangsu; and Xinxiang (XX) in Henan) over three growing seasons (2013–2014 (14), 2014–2015 (15), and 2015–2016 (16)). The five environments were designated 14JZ, 14YZ, 15JZ, 15YZ, and 16XX. Field experiments used randomized block designs with three replications. Each line was planted in five 2-m-long rows with a row spacing of 0.2 m. Forty seeds were planted in each row, and seedlings were thinned to about 30 per row at the early seedling stage, giving a final plant density of 75 plants/m². Field management followed local practices. The traits recorded included spikelet number per spike (SNPS), thousand kernel weight (TKW), KNPS, grain numbers of the top three apical spikelets (GNAS), and grain numbers of the bottom three basal spikelets (GNBS). Grain numbers in the three apical spikelets were designated as GNAS1, GNAS2, and GNAS3 from the apex downwards, and those in the three basal spikelets were designated as GNBS1, GNBS2, and GNBS3 from the base upwards.
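The stated plant density can be checked from the plot geometry. The short sketch below assumes that each row occupies an area equal to its length times the row spacing; the function name is illustrative only.

```python
# Plot geometry as described in the text.
ROW_LENGTH_M = 2.0     # each row is 2 m long
ROW_SPACING_M = 0.2    # rows are 0.2 m apart
PLANTS_PER_ROW = 30    # after thinning


def plant_density(plants_per_row, row_length_m, row_spacing_m):
    """Plants per square metre, assuming one row occupies length x spacing."""
    return plants_per_row / (row_length_m * row_spacing_m)


density = plant_density(PLANTS_PER_ROW, ROW_LENGTH_M, ROW_SPACING_M)
print(f"{density:.0f} plants/m2")
```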

Genotyping and statistical analysis

Genomic DNA was extracted according to the CTAB method (Sharp et al. 1989). Descriptive statistics, analysis of variance (ANOVA) of phenotypic data, and G × E interaction analysis were performed using SAS 9.4 (https://www.sas.com/en_us/software/sas9.html). The best linear unbiased prediction (BLUP) method was used to calculate the mean values of each trait (Bernardo 1996a, b; Bernardo et al. 1996). The broad-sense heritability ($h^2$) was calculated according to the formula $h^2 = \sigma_g^2 / (\sigma_g^2 + \sigma_e^2)$, where $\sigma_g^2$ is the genetic variance and $\sigma_e^2$ is the residual variance.
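As a minimal illustration of the heritability formula, the sketch below computes the ratio from two variance components. The function name and the example variance values are hypothetical; in the study the components were estimated in SAS.

```python
def broad_sense_heritability(var_genetic, var_residual):
    """Return var_g / (var_g + var_e), the heritability ratio defined in the text."""
    if var_genetic < 0 or var_residual < 0:
        raise ValueError("variance components must be non-negative")
    total = var_genetic + var_residual
    if total == 0:
        raise ValueError("total variance is zero; heritability is undefined")
    return var_genetic / total


# Hypothetical variance components, for illustration only.
example_h2 = broad_sense_heritability(var_genetic=1.0, var_residual=1.0)
print(f"h2 = {example_h2:.2%}")
```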

SNP genotyping was performed on the BeadStation and iScan instruments and conducted at the Genome Center of the University of California at Davis according to the manufacturer’s protocols (Illumina, USA) (Cavanagh et al. 2013). Data correction, input, and output were performed using GenomeStudio v2011.1 (Wang et al. 2014). Information on chromosome location of polymorphic SNPs was obtained from Cavanagh et al. (2013).

PowerMarker V3.25 was used to estimate genetic diversity of SNPs (Liu and Muse 2005). Population structure of the 212 cultivars was evaluated with 3792 SNP markers distributed on all 21 chromosomes using Structure 2.3.4 (Pritchard et al. 2000). The subpopulation number was estimated using the ΔK model (Evanno et al. 2005).
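The ΔK criterion can be sketched as follows. This is an assumption-laden illustration, not the exact procedure used in the study: it uses the commonly applied means-based form, ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)), where L(K) is the mean log-likelihood over replicate Structure runs. The run values below are invented for illustration.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for every K that has neighbours K-1 and K+1.

    lnp_by_k maps K to a list of log-likelihoods from replicate runs.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        has_neighbours = (k - 1) in means and (k + 1) in means
        if has_neighbours and sds[k] > 0:
            second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_difference / sds[k]
    return result


# Hypothetical replicate log-likelihoods (three runs per K).
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-785.0, -792.0, -778.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, "best K:", best_k)
```

With these invented values the largest ΔK falls at K = 2, which is the kind of peak the criterion looks for.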

The average data from five environments were used for GWAS. The unified mixed model approach (Q + K model) was applied using TASSEL 5.0 to estimate marker-trait associations (MTAs) (Yu et al. 2005; Bradbury et al. 2007; Zhang et al. 2010). After exclusion of SNP loci with minor allele frequencies < 0.05, a uniform suggestive genome-wide significance threshold of P < 3.06 × 10⁻⁴ (1/3271; −log10(P) > 3.51) was applied.
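The threshold arithmetic can be reproduced in a few lines. The sketch below takes the marker count of 3271 from the text and treats the threshold as the reciprocal of that count; the helper name is illustrative.

```python
import math

N_MARKERS = 3271  # marker count used for the threshold in the text

threshold = 1 / N_MARKERS
neg_log10_threshold = -math.log10(threshold)


def is_suggestive(p_value, cutoff=threshold):
    """True when a marker's P value falls below the suggestive threshold."""
    return p_value < cutoff


print(f"{threshold:.2e}")            # 3.06e-04
print(f"{neg_log10_threshold:.2f}")  # 3.51
```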

Markers with significant association with the traits were converted to 1 (favorable allele) and 0 (unfavorable allele) and were used for regression analysis. Trait values of different genotypes were predicted with these markers and compared with the actual values.
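The coding-and-regression step can be sketched as an ordinary least-squares fit on 0/1 allele indicators. This is a minimal sketch under stated assumptions: the text does not name the regression software, the genotype and trait values below are invented, and the normal equations are solved with a small hand-written routine so that the example needs no third-party library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are left unmodified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: markers may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_allele_regression(genotypes, trait):
    """Fit trait = b0 + sum(b_i * x_i); returns [b0, b1, ...]."""
    rows = [[1.0] + [float(g) for g in row] for row in genotypes]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, trait)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, alleles):
    """Predicted trait value for one genotype coded as 0/1 indicators."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], alleles))


# Hypothetical data: two markers, 1 = favorable allele, 0 = unfavorable.
genotypes = [[0, 0], [1, 0], [0, 1], [1, 1], [1, 0], [0, 1]]
trait = [3.0, 3.8, 3.5, 4.3, 3.8, 3.5]
coefficients = fit_allele_regression(genotypes, trait)
print([round(c, 2) for c in coefficients])
```

Predicted values from `predict` can then be compared with the observed trait values, as described above.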

Results

Phenotypic assessment

All eleven traits (SNPS, TKW, KNPS, GNAS1, GNAS2, GNAS3, GNAS, GNBS1, GNBS2, GNBS3, and GNBS) were assessed in five environments (14JZ, 14YZ, 15JZ, 15YZ, and 16XX). The average coefficients of variation for these traits ranged from 6.06 to 164.29%, indicating that grain set in the materials was strongly affected by environment, especially GNBS1. The mean values of GNAS1, GNAS2, GNAS3, and GNAS across the five environments were 1.45, 1.58, 1.82, and 4.85, respectively. All three apical spikelets showed similar fertility, with the uppermost spikelet slightly less fertile. In contrast, the three basal spikelets differed much more in floret fertility, with average grain numbers for GNBS1, GNBS2, and GNBS3 of 0.41, 1.43, and 2.62, respectively (Table 1).

Table 1 Descriptive statistics for nine phenotypic traits assessed in five environments

Table 1 shows that SNPS had the highest h² (50.52–55.43%), followed by TKW (47.56–49.34%). Among the traits associated with grain setting, KNPS had the highest h² (38.76–48.51%). The heritability for grain number of apical and basal spikelets was 30.27–43.62% and 30.00–41.63%, respectively. Significant differences (P < 0.0001) were found among genotypes (G) and among environments (E) for all 11 phenotypic traits. G × E interactions were also significant (Table 2).

Table 2 Analysis of variance of 11 traits in five environments of 212 wheat cultivars

KNPS showed significant correlations (P < 0.01) with both GNAS and GNBS. A significant positive correlation was also found between GNAS and GNBS. However, SNPS was negatively correlated with GNAS and GNBS. TKW was negatively correlated with all other traits, but the correlation was significant only with KNPS (Table 3).

Table 3 Pearson’s correlation coefficients between phenotypic traits in the BLUPs

Allelic diversity and genetic structure

Genotyping of the 212 wheat cultivars using the 9K SNP array identified 3778 polymorphic SNPs. Among them, 1793 were in the A genome chromosomes, 1778 were in the B genome, and 207 were in the D genome (Table S2). The values of gene diversity and polymorphism information content (PIC) ranged from 0.009 to 0.500 and from 0.009 to 0.375, with averages of 0.318 and 0.255, respectively. Major allele frequencies ranged up to 0.995 with an average of 0.765 (Table S2), indicating that the germplasm was highly diverse.
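The upper bounds reported above (0.500 for gene diversity and 0.375 for PIC) are what the usual definitions give for a biallelic SNP with equal allele frequencies. The sketch below uses those commonly cited definitions as an assumption; PowerMarker's exact estimators (for example, any sample-size correction) are not described in the text.

```python
def gene_diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC: gene diversity minus the sum over allele pairs of 2 * p_i^2 * p_j^2."""
    n = len(freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return gene_diversity(freqs) - cross_term


# A biallelic SNP with equal allele frequencies gives the maximum values.
balanced = [0.5, 0.5]
print(gene_diversity(balanced))                    # 0.5
print(polymorphism_information_content(balanced))  # 0.375
```

Because SNPs on the array are biallelic, neither statistic can exceed these values, which is consistent with the ranges reported.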

The number of subpopulations (K) was plotted against the ΔK calculated from the Structure output, and the peak of the line graph was observed at K = 2 (Fig. S1a, b), indicating that the population could be divided into two main subpopulations.

GWAS of grain set-related traits and their phenotypic effects

Of the 3778 SNP markers, 3269 had frequencies above 0.05. Association analyses between the 11 traits and SNP markers showed that there were 112 significant associations (P < 3.06 × 10⁻⁴), with 4, 32, 33, and 43 for TKW, KNPS, apical, and basal grain set numbers, respectively (Fig. 1, Table S3). The associated loci were distributed on all chromosomes except 1D, 3D, 4D, and 5D (Table S3). Twenty-two SNP loci were significantly associated in at least two environments with phenotypic explanation rates (R²) ranging from 6.24 to 18.18% (Table 4). Frequencies of favorable alleles at these associated loci ranged from 7.11 to 93.50%.

Fig. 1

Manhattan and Q-Q plots of ten phenotypic traits with 3778 genome-wide SNP markers, shown as dot plots of the compressed MLM at P < 3.06 × 10⁻⁴. The red horizontal line corresponds to the threshold value for significant association. Green and orange colors separate different chromosomes. a GNAS1. b GNAS2. c GNAS3. d GNAS. e GNBS1. f GNBS2. g GNBS3. h GNBS. i KNPS. j TKW

Table 4 Stable MTAs and phenotypic effects of favorable alleles revealed by GWAS consistently identified in at least two environments

Most of the GNAS-associated loci were distributed on six chromosomes, two on 5A (wsnp_Ex_c2702_5013188 and wsnp_Ex_c31799_40545376), one on 4B (wsnp_Ex_c32500_41144083), and one on 2A (wsnp_Ex_rep_c103167_88181968). Two other loci with minor genetic effects were on 3B and 5B, respectively (Table S3). These six loci explained more than 30% of the phenotypic variation. The total GNAS predicted from these six markers showed a highly significant correlation with the actual numbers (Fig. 2a), and GNAS increased with the number of favorable alleles (Fig. 2b).

Fig. 2

Linear regression of predicted and actual numbers of GNAS (a); boxplot among the 212 wheat cultivars in BLUP between number of favored alleles and GNAS (b). The predicted GNAS are calculated from the following equation: $y = 0.82x_1 + 0.58x_2 + 0.60x_3 + 0.36x_4 + 0.43x_5 + 3.80$, where $y$ is the predicted GNAS; $x_1$ is wsnp_Ex_c2702_5013188; $x_2$ is wsnp_CAP7_c1405_706142; $x_3$ is wsnp_Ku_c35386_44598937; $x_4$ is wsnp_Ex_rep_c68599_67447880; and $x_5$ is wsnp_Ex_c3130_5789791
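The prediction equation in the caption can be applied directly to a genotype coded as 1 (favorable) or 0 (unfavorable) at each marker. In the sketch below the coefficients and marker names are taken from the caption; the function name and the choice to treat an unlisted marker as 0 are assumptions made for illustration.

```python
GNAS_INTERCEPT = 3.80
GNAS_EFFECTS = {
    "wsnp_Ex_c2702_5013188": 0.82,
    "wsnp_CAP7_c1405_706142": 0.58,
    "wsnp_Ku_c35386_44598937": 0.60,
    "wsnp_Ex_rep_c68599_67447880": 0.36,
    "wsnp_Ex_c3130_5789791": 0.43,
}


def predict_gnas(favorable):
    """Predicted GNAS for a genotype given as {marker: 0 or 1}.

    Markers missing from the mapping are treated as 0 (an assumption).
    """
    for marker, value in favorable.items():
        if marker not in GNAS_EFFECTS:
            raise KeyError(f"unknown marker: {marker}")
        if value not in (0, 1):
            raise ValueError(f"allele code for {marker} must be 0 or 1")
    return GNAS_INTERCEPT + sum(
        effect * favorable.get(marker, 0) for marker, effect in GNAS_EFFECTS.items()
    )


no_favorable = predict_gnas({})
all_favorable = predict_gnas({marker: 1 for marker in GNAS_EFFECTS})
print(f"{no_favorable:.2f} to {all_favorable:.2f}")
```

The equation therefore spans predicted values from the intercept (no favorable alleles) to the intercept plus all five effects.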

The SNP wsnp_BF293620A_Ta_2_3 on 5A showed the largest effects on grain numbers in basal spikelets. Three other minor QTL were found on 1A, 2A, and 6B, respectively (Table S3). Similarly, the total GNBS predicted from these four markers showed a significant correlation with the actual GNBS (Fig. 3a) and an increased number of favorable alleles increased the total number of GNBS (Fig. 3b).

Fig. 3

Linear regression of predicted and actual numbers of GNBS (a); boxplot among the 212 wheat cultivars in BLUP between number of favored alleles and GNBS (b). The predicted GNBS are calculated from the following equation: $y = 1.25x_1 + 0.76x_2 + 0.93x_3 + 0.67x_4 + 2.45$, where $y$ is the predicted GNBS; $x_1$ is wsnp_Ex_c54193_57155632; $x_2$ is wsnp_Ra_c4850_8698731; $x_3$ is wsnp_Ex_c15595_23910900; and $x_4$ is wsnp_Ex_c8588_14419007
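To see how the GNBS equation relates predicted values to the number of favorable alleles, the sketch below enumerates all 16 possible 0/1 combinations of the four markers and groups the predictions by allele count. The coefficients come from the caption; the grouping is only an illustration of the equation, not a reproduction of the boxplot, which is based on observed cultivars.

```python
from itertools import product

GNBS_INTERCEPT = 2.45
# Effects of x1..x4 in the order given in the caption.
GNBS_EFFECTS = [1.25, 0.76, 0.93, 0.67]

predictions_by_count = {}
for alleles in product((0, 1), repeat=len(GNBS_EFFECTS)):
    predicted = GNBS_INTERCEPT + sum(b * x for b, x in zip(GNBS_EFFECTS, alleles))
    predictions_by_count.setdefault(sum(alleles), []).append(predicted)

for count in sorted(predictions_by_count):
    values = predictions_by_count[count]
    print(
        f"{count} favorable allele(s): "
        f"{min(values):.2f} to {max(values):.2f} ({len(values)} combinations)"
    )
```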

Stable SNPs for GNAS and GNBS

Two of the 22 stable MTAs were significantly associated with two or more grain-setting traits across environments: wsnp_Ex_c31799_40545376-5A^TT (GNAS1, GNAS2, and GNAS) and wsnp_BF293620A_Ta_2_3-5A^CC (GNBS1, GNBS2, and GNBS) (Table 4, Fig. 4a). The frequencies of the favorable alleles wsnp_Ex_c31799_40545376-5A^TT and wsnp_BF293620A_Ta_2_3-5A^CC were 7.11% and 12.43%, respectively (Fig. 4b). The favorable allele at SNP locus wsnp_Ex_c31799_40545376-5A increased grain set in the first apical spikelet by 0.21 (14JZ) and 0.26 (15JZ), in the second apical spikelet by 0.37 (14JZ) and 0.31 (16XX), and in the top three apical spikelets by 1.36 (14YZ) and 0.69 (BLUP) (Fig. 4c). Among the favorable alleles for basal grain setting, wsnp_BF293620A_Ta_2_3-5A^CC increased the grain set of the first [0.46 (14JZ), 0.33 (15JZ), 0.39 (16XX)], the second [0.20 (14YZ), 0.23 (15YZ), 0.10 (BLUP)], and the bottom three [0.41 (16XX), 0.94 (BLUP)] basal spikelets (Table 4, Fig. 4d).

Fig. 4

Effects of the favorable alleles at wsnp_Ex_c31799_40545376-5A and wsnp_BF293620A_Ta_2_3-5A. a Associated loci identified in the germplasm set by a mixed linear model (P < 3.06 × 10⁻⁴); green and orange colors separate different traits. b Allelic frequencies of wsnp_Ex_c31799_40545376-5A and wsnp_BF293620A_Ta_2_3-5A in the germplasm set; blue columns represent favorable alleles TT and CC, respectively. c Genetic effects of wsnp_Ex_c31799_40545376-5A^TT in selected environments. d Genetic effects of wsnp_BF293620A_Ta_2_3-5A^CC in selected environments

Discussion

KNPS is a fundamental yield component to which apical, basal, and middle spikelets all contribute. Assuming all other yield-determining factors are fixed, an increase in grain set in apical and basal spikelets could modestly but significantly improve the yield of wheat (Arisnabarreta and Miralles 2006; Acreche et al. 2008). However, not enough effort has been made to improve the fertility of apical and basal spikelets in breeding programs. Both GNAS and GNBS were positively correlated with KNPS (P < 0.01) and showed only weak, non-significant negative correlations with TKW (Table 3). Our study also showed a large variation in both GNAS (2.1–7.1) and GNBS (0.5–9.2), indicating great potential for improving the fertility of both basal and apical spikelets and thus increasing the total number of grains per spike.

Many MTAs for GNAS and GNBS were identified in this study. Most of them are in similar positions to those for grain yield and yield components (Table 5). The SNP wsnp_Ku_rep_c68318_67259259 for GNAS1 on chromosome 4B was associated with grain yield (Ain et al. 2015). The SNPs wsnp_RFL_Contig4134_4692458 and wsnp_Ex_c2288_4293430, associated with GNBS on chromosomes 2D and 4A, respectively, were mapped to the QTL intervals Kukri_c14902_1112–RAC875_c77816_365 and Kukri_rep_c106490_583–RAC875_c29282_566 that affected KNPS (Gao et al. 2015). A QTL for spike number/m² (SN) in the marker interval BS00032003_51–BS00070871_51 (Gao et al. 2015) was in a similar position to wsnp_Ex_c607_1204733, which was associated with GNBS in this study. The SNP wsnp_Ex_c11446_18468102 associated with GNBS on chromosome 6A was located in a pleiotropic region affecting TKW and SN (Gao et al. 2015). Another SNP marker, wsnp_Ex_c32500_41144083, associated with GNAS1, GNAS2, and GNAS on chromosome 4B, was close to a QTL interval (IWB67166–IWB25207) that affected days to maturity (Milner et al. 2016). However, its physical location on chromosome 4B is 574.9 Mb, which is far from the Rht-B1 gene (30.8 Mb) (Table S4).

Table 5 Significant SNP loci identified in current and previous studies

Gene pyramiding has proved to be an effective approach for improving not only a plant’s tolerance to biotic stresses (Zheng et al. 2017) and abiotic stresses (Zhou 2011) but also other agronomic traits and yield components (Mirabella et al. 2015). Among all MTAs for different traits, the SNPs wsnp_Ex_c31799_40545376 and wsnp_BF293620A_Ta_2_3 on 5A showed consistent significant association with all grain setting properties under various environmental conditions (Table 4, Fig. 2). The genes overlapping these two loci have not previously been reported as spikelet fertility or yield-related genes according to the published sequence of the hexaploid wheat genome (Table S4, Fig. S2). Interestingly, the Vrn-B gene was found in the 4.5-Mb region, and its physical distance to wsnp_Ex_c31799_40545376 and wsnp_BF293620A_Ta_2_3 was 2 Mb and 1.1 Mb, respectively. Therefore, the new markers can potentially be used in breeding programs to improve the fertility of both basal and apical spikelets. Further studies should be conducted using a segregating population to identify the genes and verify their roles in spikelet fertility in wheat.

In our study, apart from the MTAs on 5A, several other MTAs were identified for GNAS and GNBS. The combination of favorable alleles from different MTAs significantly improved both GNAS and GNBS (Figs. 2 and 3). Of the 212 wheat accessions used in this study, fewer than 10% carried the favorable allele on 5A and only around 10% carried the favorable allele for the other significant MTA on 4B, suggesting that little effort has been made to improve the fertility of apical and basal spikelets. No accessions were found to carry the favorable alleles at both significant MTAs, i.e., the 5A and 4B MTAs, which opens the door for breeders to improve GNAS and GNBS by pyramiding those two loci.

In conclusion, two new MTAs were identified for the fertility of basal and apical spikelets, respectively. Both loci were located on chromosome 5A and had not been found to be associated with any grain setting properties in previous studies. The combination of these loci with other MTAs for spikelet fertility could improve grain setting in both basal and apical spikelets.