Introduction

Bread wheat (Triticum aestivum L.) is one of the most important crops in the world. Given the current and future scenario of increased global demand for grains, breeding efforts will concentrate on improving grain yield (CIMMYT 2017). Hence, the identification of specific and efficient selection criteria, which are not only associated with yield but are also amenable to high throughput application in breeding programs, is of utmost importance. In addition, knowledge of the genetic and molecular bases underlying yield and its components will help to improve selection methods and genetic gain.

Breeding efforts in raising grain yield have mainly focused on increasing grain number per unit area (Austin 1982; Slafer and Andrade 1989; Slafer et al. 1990), as this continues to be the component that best explains yield variation (Slafer et al. 1990; Abbate et al. 1998; Borrás et al. 2004; Shearman et al. 2005; Elía et al. 2016; Ferrante et al. 2017; Lynch et al. 2017). However, grain number per unit area is a difficult trait to select for in early generations of a breeding program, in which little seed is available for accurately assessing traits on a per unit area basis. Besides, its low heritability and high genotype by environment interaction limit genetic progress attainable in this trait by conventional breeding.

Fischer (1984) proposed that, under non-limiting growing conditions (i.e., without water or nutrient limitations and absence of pests and diseases), grain number per unit area in wheat can be considered the product of (1) duration of rapid spike growth period, (2) crop growth rate during the spike growth period, (3) dry weight partitioning to spikes during the spike growth period, and (4) number of grains per unit of spike chaff dry weight, i.e., a “spike fertility” index (SF). This index, also termed “fruiting efficiency” (Ferrante et al. 2012), was considered to be “another partitioning efficiency trait like harvest index” (Fischer and Rebetzke in press).

During the last ~ 25 years, several authors have studied SF in different genetic materials (mainly commercial cultivars) under different environmental conditions, using an ecophysiological approach (Stapper and Fischer 1990; Abbate et al. 1998, 2007; Miralles et al. 1998; González et al. 2005, 2011, 2014; Serrago et al. 2008; Foulkes et al. 2015). As a result, the existence of a positive association between SF and grain number per unit area has been consistently reported (Abbate et al. 1998; Shearman et al. 2005; Acreche et al. 2008; Foulkes et al. 2015; Terrile et al. 2017). Furthermore, Abbate et al. (2007, 2013) and Mirabella et al. (2016) found that differences in SF of commercial cultivars were stable among contrasting environments, including sub-potential ones, with low genotype × environment interaction. Regarding the mode of inheritance, Martino et al. (2015) and Mirabella et al. (2016) reported mid to high heritability values in early generations of segregating populations, and the former authors found that selection for high SF led to greater grain yield through increased grain number. However, the fact that this was observed in the F3 generation and at the spike level poses a need for further investigation in advanced generations.

Studies by García et al. (2014) of a doubled haploid population derived from a cross between two well-adapted, high-yielding spring bread wheat cultivars, in two environments (Buenos Aires, Argentina and Ciudad Obregón, México), and by Martino et al. (2015) and Mirabella et al. (2016) in early generations of the biparental population used in the present study at Balcarce, Argentina, showed transgressive segregation for SF, i.e., the occurrence of a fraction of individuals with phenotypic values exceeding the parents, either in the negative or positive direction (Rieseberg et al. 1999). This is a relevant point as the selection of those extreme phenotypes (the positive ones, in this case) could further increase SF and hence, raise grain yield (Slafer et al. 2015).

The above evidence indicates that SF is a promising selection criterion for increasing yield in breeding programs (Fischer 2007, 2011; Foulkes et al. 2011; González et al. 2011; Lazaro and Abbate 2012; Abbate et al. 2013; García et al. 2014; Martino et al. 2015; Slafer et al. 2015; Elía et al. 2016; Mirabella et al. 2016; Terrile et al. 2017; Fischer and Rebetzke in press). Also, a non-destructive, simple method for fairly accurately estimating SF at maturity, amenable to high throughput phenotyping was proposed by Abbate et al. (2013) and recently valued as a promising selection trait (Fischer and Rebetzke in press). However, before implementing this strategy it is necessary to make certain considerations. For instance, as a negative association between grain number per unit area and grain weight has been reported in some studies (Slafer et al. 1990; Fischer 2007; Acreche et al. 2008; González et al. 2011; Martino et al. 2015; Terrile et al. 2017), a negative association between SF and grain weight might occur as well. Indeed, several papers have reported this (Lazaro and Abbate 2012; Ferrante et al. 2015; González-Navarro et al. 2016; Joudi et al. 2016; Terrile et al. 2017), whereas others have reported no such association (Fischer 2011; González et al. 2014; Ferrante et al. 2012, 2017; Elía et al. 2016). Thus, the possible effect of selecting for high SF on grain number per unit area, grain weight and yield needs to be further ascertained.

The aim of this study was to determine the mode of inheritance of SF and the effect of selection for high SF on yield and yield components in advanced lines from an actual breeding program, in order to establish whether SF can serve as an effective selection criterion in bread wheat breeding programs aimed at increasing grain yield.

Materials and methods

Plant materials

A biparental population of 146 recombinant inbred lines (RILs), developed by Alonso et al. (2018) and studied in early generations [F3, Martino et al. (2015); F1-F2, Mirabella et al. (2016)], was used in all field experiments. The RIL population was derived from the cross ‘Baguette 10’ (B10) × ‘Klein Chajá’ (KCJ), two Argentinean spring bread wheat cultivars with contrasting SF, released in 2000 and 2002, respectively. Both cultivars have similar, intermediate growth cycles and are well adapted to Argentinean wheat growing areas. They differ in spike architecture (Supplementary Fig. 1): KCJ has a long, lax spike with a large number of spikelets and grains and a heavy chaff structure, whereas B10 has a shorter, denser, compact spike with very thin glumes and rachis.

Field experiments

Field experiments were carried out at the experimental station of the Instituto Nacional de Tecnología Agropecuaria (INTA) Balcarce (37°45′ S; 55°18′ W; 130 m a.s.l.), Buenos Aires province, during the 2013, 2014 and 2015 crop seasons. In each crop season, all RILs and the parents were grown in a randomized complete block design with two replications. Experimental units consisted of seven-row, 5 m-long plots with 0.2 m inter-row spacing. All experiments were conducted under optimal nutritional and water conditions, with chemical control of weeds, pests and fungal diseases. Sowing dates were June 27, 2013; July 24, 2014; and July 15, 2015. Anthesis and physiological maturity dates of each plot were recorded when 50% of spikes reached those phenological stages. Physiological maturity was determined as loss of greenness from the peduncle. Weather conditions were recorded with a standard meteorological station located at the experimental station.

Measurements, calculations and statistical analyses

At maturity, 20 random spikes were sampled from the five central rows of each plot. They were cut at the lowest spikelet level, counted, weighed and threshed. Grain weight (g) was obtained as the quotient between total grain weight and grain number per 20-spike sample. Spike chaff dry weight (g) was calculated as the difference between total spike dry weight (i.e., before threshing) and total grain weight. Spike fertility (grains/g) was calculated as the quotient between grain number and spike chaff dry weight per sample (Abbate et al. 2013). Grain yield (g/m2) was determined by mechanical harvest of the five central rows of each plot. Grain number/m2 was calculated as the quotient between grain yield and grain weight.
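The calculations above reduce to a few quotients. The following Python sketch restates them; the class name, function names and sample values are illustrative assumptions, not code or data from the study.

```python
from dataclasses import dataclass


@dataclass
class SpikeSample:
    """One 20-spike sample from a plot (field names are hypothetical)."""
    spike_dry_weight_g: float  # total spike dry weight before threshing
    grain_weight_g: float      # total weight of the threshed grain
    grain_number: int          # grains counted in the sample


def chaff_dry_weight(sample: SpikeSample) -> float:
    """Spike chaff dry weight (g): spike dry weight minus grain weight."""
    return sample.spike_dry_weight_g - sample.grain_weight_g


def spike_fertility(sample: SpikeSample) -> float:
    """Spike fertility (grains per g of chaff)."""
    return sample.grain_number / chaff_dry_weight(sample)


def individual_grain_weight(sample: SpikeSample) -> float:
    """Mean weight of a single grain (g)."""
    return sample.grain_weight_g / sample.grain_number


def grain_number_per_m2(grain_yield_g_m2: float, grain_weight_g: float) -> float:
    """Grain number per m2 from plot yield and mean grain weight."""
    return grain_yield_g_m2 / grain_weight_g


# Made-up values for illustration only.
sample = SpikeSample(spike_dry_weight_g=38.0, grain_weight_g=30.0, grain_number=800)
sf = spike_fertility(sample)          # 800 grains / 8 g of chaff
gw = individual_grain_weight(sample)  # 30 g / 800 grains
gn = grain_number_per_m2(750.0, gw)   # assumed plot yield of 750 g/m2
print(f"SF = {sf:.1f} grains/g, GW = {gw * 1000:.1f} mg, GN = {gn:.0f} grains/m2")
```

Note that grain number/m2 is derived from yield and grain weight rather than counted directly, so any error in either measurement carries over to it.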

Linear mixed models were fitted for SF, grain weight, grain number/m2, and grain yield of the RIL population, using the lme function from the nlme package (Pinheiro et al. 2016) of the R software (R Core Team 2016). The models included replicates within environments (years) and years as fixed factors, and genotypes and genotype × environment (years) interaction as random factors. Linear fixed models were fitted for the above traits in the parental cultivars. The critical level of significance used was p = 0.05 for all tests.

Correlation analysis between SF, grain yield and yield components was carried out with standardized data for each crop season. In addition, best linear unbiased predictors (BLUPs) for each variable and RIL were obtained from lme models and used to estimate genetic correlations between SF, grain yield, and yield components.
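As a minimal illustration of the phenotypic part of this analysis, the sketch below standardizes two traits within one crop season and computes Pearson's correlation from the resulting z-scores. It uses only the Python standard library; the function names and trait values are hypothetical, and the BLUP-based genetic correlations are not reproduced here because they depend on the fitted mixed models.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert values to z-scores (mean 0, sample standard deviation 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pearson_r(x, y):
    """Pearson correlation computed from standardized variables."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical per-line values for one crop season (not data from the study).
sf_values = [80.0, 90.0, 100.0, 110.0, 120.0]  # grains/g
gw_values = [42.0, 40.0, 37.0, 36.0, 33.0]     # mg
r = pearson_r(sf_values, gw_values)
print(f"r = {r:.2f}")
```

Standardizing within each season places the seasons on a common scale, so differences in seasonal means and spreads do not dominate a joint plot of the three years.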

Variance components were estimated from the lme models by restricted maximum likelihood (REML) method (Milliken and Johnson 1992). Narrow-sense heritabilities were estimated according to Hallauer and Miranda (1981), as follows [Eq. (1)]:

$$h^{2} = \frac{\sigma_{g}^{2}}{\sigma_{g}^{2} + \left( \sigma_{ge}^{2}/e \right) + \left( \sigma_{res}^{2}/re \right)} \tag{1}$$

where \(\sigma_{g}^{2}\) is the genotypic variance; \(\sigma_{ge}^{2}\) is the genotype × environment interaction variance; \(\sigma_{res}^{2}\) is the error variance; e is the number of environments, and r is the number of replications per experiment (equal to 2 in all cases).
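A short numerical sketch of Eq. (1) may help. The variance components below are hypothetical; only r = 2 replications and the three crop seasons (e = 3) come from the text.

```python
def heritability(var_g, var_ge, var_res, n_env, n_rep):
    """Heritability following Eq. (1): genotypic over phenotypic variance of line means."""
    phenotypic = var_g + var_ge / n_env + var_res / (n_rep * n_env)
    return var_g / phenotypic


# Hypothetical variance components (not values from Table 3).
h2 = heritability(var_g=10.0, var_ge=3.0, var_res=12.0, n_env=3, n_rep=2)
print(round(h2, 3))  # 10 / (10 + 1 + 2)
```

Because the interaction and error variances are divided by e and re, adding environments or replications raises the estimate even when the underlying variance components stay the same.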

Selection strategies and responses to selection

Data from the three independent experiments described above were used to evaluate responses in grain yield under different selection strategies. Six cases were tested per selection strategy, under different selection intensities: they were the result of combinations of selection in each of the three years, and evaluation of trait response at each of the remaining two years.
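The six cases are the ordered pairs of distinct years, with the first year used for selection and the second for evaluation. A small Python sketch (variable names are illustrative) enumerates them:

```python
from itertools import permutations

years = [2013, 2014, 2015]
# Each ordered pair is (selection year, test year).
cases = list(permutations(years, 2))
for selection_year, test_year in cases:
    print(f"select in {selection_year}, evaluate in {test_year}")
```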

The selection strategies used were:

(a) Selection of the top-yielding 3, 5, 10 or 20% of lines.

(b) Selection of the 3, 5, 10 or 20% of lines with the highest SF values.

(c) Two-step selection. Step 1: selection of lines with grain yield higher than the population average. Step 2: among these, selection of the 3 or 5% of lines with the highest SF values.

Grain yield responses to selection were calculated as the difference between the mean values of the ‘selected group’ and the general mean of the population of each of the two evaluation years. Response to selection was then expressed as a percentage (%RS) of the population mean [Eq. (2)]:

$$\% RS = \frac{\bar{x}_{sg} - \bar{x}_{p}}{\bar{x}_{p}} \times 100 \tag{2}$$

where \(\bar{x}_{sg}\) is the mean of the selected group and \(\bar{x}_{p}\) is the population mean.
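The three strategies and the response calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the number of selected lines is rounded to the nearest integer, and in the two-step strategy the percentage is taken relative to the whole population, which the text does not specify. The response is multiplied by 100 because it is expressed as a percentage.

```python
from statistics import mean


def n_selected(fraction, population_size):
    """Number of lines kept; rounding to the nearest integer is an assumption."""
    return max(1, int(fraction * population_size + 0.5))


def top_lines(trait, fraction, candidates=None):
    """Ids of the lines with the highest trait values in the selection year.

    `trait` maps line id -> value. If `candidates` is given, ranking is
    restricted to those lines, but the number kept is still based on the
    whole population (an assumption).
    """
    pool = list(trait) if candidates is None else list(candidates)
    ranked = sorted(pool, key=lambda line: trait[line], reverse=True)
    return ranked[:n_selected(fraction, len(trait))]


def two_step(yield_sel, sf_sel, fraction):
    """Step 1: keep lines yielding above the population mean; Step 2: top SF."""
    threshold = mean(yield_sel.values())
    above_mean = [line for line, y in yield_sel.items() if y > threshold]
    return top_lines(sf_sel, fraction, candidates=above_mean)


def percent_response(selected, yield_test):
    """%RS: selected-group mean minus population mean, as % of population mean."""
    pop_mean = mean(yield_test.values())
    sel_mean = mean(yield_test[line] for line in selected)
    return 100.0 * (sel_mean - pop_mean) / pop_mean


# Toy data for four hypothetical lines (not data from the study).
yield_sel = {"A": 800.0, "B": 700.0, "C": 600.0, "D": 500.0}   # selection year
sf_sel = {"A": 90.0, "B": 110.0, "C": 120.0, "D": 80.0}        # selection year
yield_test = {"A": 700.0, "B": 900.0, "C": 770.0, "D": 430.0}  # test year

by_yield = top_lines(yield_sel, 0.25)
by_sf = top_lines(sf_sel, 0.25)
combined = two_step(yield_sel, sf_sel, 0.25)
for name, group in [("yield", by_yield), ("SF", by_sf), ("two-step", combined)]:
    print(name, group, round(percent_response(group, yield_test), 1))
```

Selection always uses data from the selection year only, while the response is computed from test-year yields, which keeps the two steps independent as in the six cases described above. The toy numbers carry no biological meaning.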

Results

Weather conditions and phenology

Mean daily temperature, radiation and photothermal quotient values, both per crop season and historical, are shown in Supplementary Table 1. All three crop seasons were warmer than the historical mean, with 2014 as the warmest season. Radiation values were similar to or lower than the historical mean. Figure 1 shows the distribution of anthesis and physiological maturity dates of the 146 RILs and parents. In general, anthesis occurred within ~10-day periods in early November. In 2014, anthesis and physiological maturity dates were more variable than those in 2015. Both dates were intermediate in the 2013 crop season in comparison to the other crop seasons.

Table 1 Means of parental cultivars, Baguette 10 (B10) and Klein Chajá (KCJ), and median, maximum and minimum values of the RIL population (n = 146), for spike fertility (SF), grain weight (GW), grain number/m2 (GN), and grain yield (GY) in 2013, 2014 and 2015 at Balcarce, Argentina (partially published data; Alonso et al. 2018)

Fig. 1 Frequency distribution of dates of anthesis (bars at the left) and physiological maturity (bars at the right) in Julian days for 146 RILs derived from Baguette 10 × Klein Chajá evaluated in three crop seasons: 2013 (white), 2014 (grey) and 2015 (black)

Spike fertility, grain yield and related traits

Table 1 shows the mean values of parental cultivars and median, minimum and maximum values of the RIL population for SF, grain weight, grain number/m2, and grain yield for each of the three crop seasons. Spike fertility values for B10 (the parental cultivar with high SF; 112.4, 115.3 and 96.0 grains/g in 2013, 2014 and 2015, respectively) were consistently and significantly greater than those for KCJ (84.1, 72.8 and 77.1 grains/g in 2013, 2014 and 2015, respectively) (Table 1). This agrees with results reported by Martino et al. (2015) and Mirabella et al. (2016) for the same cultivars; i.e., B10 always showed greater SF values than KCJ, with no significant year or cultivar by year interaction effects. The SF index is the quotient between grain number per spike and spike chaff dry weight. No significant differences in grain number per spike were found between B10 and KCJ or between crop seasons, whereas spike chaff dry weight at maturity was greater for KCJ than for B10, with no cultivar by year interaction (values for KCJ and B10 were 0.6 vs. 0.4, 0.6 vs. 0.3 and 0.7 vs. 0.4 g in 2013, 2014 and 2015, respectively). Hence, the difference in SF between the parents stemmed mainly from chaff weight rather than from grain number per spike. The RILs showed a median SF value between the parental values and minimum and maximum values more extreme than those of the parents in all years (median: 97.9, 92.1 and 89.1 grains/g in 2013, 2014 and 2015, respectively; Table 1). The highest SF values were recorded during the 2013 crop season and the lowest occurred in 2014. For grain weight, the RIL median was greater in 2015 than in 2013 or 2014 (Table 1). Grain yield was more closely related to variation in grain number/m2 than to variation in grain weight, regardless of the genotype or environment. The RIL median grain yield was greater in 2015 (756 g/m2) than in 2013 (700 g/m2), despite a higher median grain number in 2013 (21.5 × 10^3 grains/m2) than in 2015 (17.8 × 10^3 grains/m2), because of the greater median grain weight in 2015 (43 vs. 33 mg in 2013).

A bell-shaped, nearly symmetrical distribution of SF BLUPs was observed in the RIL population (Fig. 2). Transgressive segregation (i.e., RILs with more extreme SF BLUPs than the parents) was detected: 3.5% of individuals showed higher values than B10 and 2.1% of individuals showed lower values than KCJ.

Fig. 2 Distribution of SF BLUPs for the RIL population. The black star and diamond indicate SF BLUPs of parental cultivars Klein Chajá (−12.2) and Baguette 10 (11.6), respectively
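The share of transgressive lines can be obtained by comparing each line's BLUP with the parental BLUPs. The Python sketch below uses the parental values given in the Fig. 2 caption, but the line BLUPs are made up and the function name is hypothetical. It is a plain comparison of predicted values and does not test whether a line differs significantly from a parent.

```python
def transgressive_shares(blups, low_parent, high_parent):
    """Percentage of lines below the low parent and above the high parent."""
    n = len(blups)
    below = sum(1 for b in blups if b < low_parent)
    above = sum(1 for b in blups if b > high_parent)
    return 100.0 * below / n, 100.0 * above / n


# Parental SF BLUPs as given in the Fig. 2 caption; the line BLUPs are made up.
KCJ_BLUP, B10_BLUP = -12.2, 11.6
line_blups = [-15.0, -12.5, -8.0, -3.0, 0.0, 2.5, 6.0, 9.0, 12.0, 14.0]
pct_below, pct_above = transgressive_shares(line_blups, KCJ_BLUP, B10_BLUP)
print(f"{pct_below:.1f}% below KCJ, {pct_above:.1f}% above B10")
```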

Relationships between traits

Phenotypic associations between SF and grain yield, grain number/m2 and grain weight in each of the three crop seasons are shown in Fig. 3. Significant negative associations were detected between SF and grain weight in all three experiments. Significant positive associations were observed between SF and grain number/m2 for all three years and between SF and grain yield in 2014 (Fig. 3).

Fig. 3 Relationship between spike fertility (SF) and (a) grain weight (GW), (b) grain number/m2 (GN) and (c) grain yield (GY) of 146 RILs during three crop seasons: 2013 (diamonds), 2014 (squares) and 2015 (triangles), in standardized units. Pearson’s correlation coefficients (r) shown in each panel are significant (p < 0.05) except for those between SF and GY (panel c) in 2013 (r = 0.05) and 2015 (r = 0.02)

Table 2 shows the coefficients of genetic correlation (r) between BLUPs for SF, grain yield and yield components in the RILs. Grain yield showed a strong association with grain number/m2 (r = 0.81; p < 0.001), but it was not associated with grain weight (r = −0.12; p = 0.149). SF showed positive associations with grain number/m2 (r = 0.55; p < 0.001) and grain yield (r = 0.36; p < 0.001) and a negative association with grain weight (r = −0.50; p < 0.001).

Table 2 Lower left: genetic correlation coefficients (r) between spike fertility (SF), grain yield (GY), grain weight (GW) and grain number/m2 (GN). Upper right: p-values of each r. Best linear unbiased predictor (BLUP) data of the RIL population (n = 146) derived from Baguette 10 × Klein Chajá, evaluated in three crop seasons at Balcarce, Argentina

Estimation of variance components and narrow-sense heritabilities

Variance components and narrow-sense heritabilities for SF, grain yield, grain number/m2, and grain weight are shown in Table 3. For grain yield, the genotype × environment interaction variance was greater than the genetic variance, whereas for grain number/m2 the two were fairly similar. Genetic variances for SF and grain weight were 5.8- and 2.6-fold greater than the genotype × environment variances, respectively. Narrow-sense heritabilities (h2) of SF and grain weight were 0.84 and 0.75, respectively. Grain yield and grain number/m2 showed much lower h2 values (0.28 and 0.35, respectively).

Table 3 Variance components (σg: genetic variance; σge: genotype × environment variance; σRes: error variance) and narrow-sense heritability (h2) of grain yield, grain weight, grain number/m2 and spike fertility of a RIL population (n = 146) derived from Baguette 10 × Klein Chajá, evaluated in three crop seasons at Balcarce, Argentina

Selection strategies and responses to selection

Changes in grain yield (i.e., responses to selection) after the application of different selection strategies are shown in Table 4. When the selection criterion applied was solely high grain yield, the responses were erratic and showed no association with selection intensity: in some test years, grain yield responses were positive and high, but in others, they were zero or negative. For instance, yield responses ranged between −6 and 22.8% and between 0.1 and 28.5% when applying 5 and 3% selection intensities, respectively (Table 4). In contrast, when genotypes were selected only for high SF, grain yield response was positive in all cases (2.3–24%), and was improved, on average, by increased selection intensity. Finally, the highest responses to selection were achieved when a two-step selection strategy for high grain yield (i.e., yield greater than the population average) and high SF was applied. Moreover, no negative responses to selection were observed using this procedure.

Table 4 Response to selection for grain yield (GY) (g/m2 and %) under different selection strategies and selection intensities. Six different cases, constituted by different combinations of testing and selection years, are presented, with data of 146 RILs derived from Baguette 10 × Klein Chajá evaluated for three years at Balcarce, Argentina

Responses to selection under different strategies varied with both selection and test year. When selection was carried out exclusively for high yield, the responses were highly variable depending on both selection and test year. The addition of SF as a selection criterion in combination with high yield led to slightly decreased inter-annual variability in response to selection. In turn, responses to selection were more stable and independent of the selection and test year when high SF was used as the sole selection criterion (Table 4).

Discussion

In this study we showed that the use of spike fertility index as a selection criterion, either solely or in combination with selection for high yield, resulted in higher and more stable yields than selection for high yield alone. Despite the consideration of SF as a promising trait for wheat breeding programs aimed at increasing grain yield, supported by several publications (including González et al. 2011; Pedró et al. 2012; Abbate et al. 2013; and Ferrante et al. 2015), no evidence was available on its efficacy in achieving actual yield increases. In addition, our results confirm previous findings (Martino et al. 2015; Mirabella et al. 2016) showing that SF is a highly heritable trait, controlled by several genes with additive effects.

Contrasting SF values were observed for the parental cultivars, whereas the median value of the RILs was intermediate. Lines with significantly more extreme values than those of the parents were also observed; for this particular trait, genotypes with higher values than the best parent are the ones of potential use for selection. Although the experiments were conducted under optimum nutrient and water conditions with disease and pest control, the crop seasons differed in environmental conditions that affected grain yield and its components (Alonso et al. 2018). Regardless of the environmental effect on the measured traits, similar phenotypic correlations between traits were observed across crop seasons. The significant correlations observed between grain number/m2 and grain yield, SF and grain yield, and SF and grain weight are consistent with those widely reported in the literature (Abbate et al. 1998; Shearman et al. 2005; Acreche et al. 2008; González et al. 2011; Ferrante et al. 2012; García et al. 2014; Martino et al. 2015; Elía et al. 2016; González-Navarro et al. 2016). As a result, selection of lines with high SF should increase grain number/m2 and grain yield, but may negatively affect grain weight, as reported by González et al. (2014) and Slafer et al. (2015). However, genotypes with high SF and average grain weight were observed in this study (Fig. 3), similar to those reported by Bustos et al. (2013). These results support the idea that simultaneous selection for both traits may reduce the trade-off between grain number/m2 and grain weight.

The analysis of SF variance components showed a significant genotype × environment interaction, but it represented only 9% of the total variation, whereas 51% of the variation was genetic, thus resulting in high h2 (0.84). Grain yield and grain number/m2 also showed low levels of genotype × environment interaction (12 and 7% of the total variation, respectively; Table 3), but these traits also showed low genotypic variances resulting in low h2. Relatively low heritability values for grain yield and grain number per unit area similar to those obtained in this study (Table 3) were also reported by Cooper et al. (1997) and Arguello et al. (2016). In contrast, SF showed high heritability, even higher than the values reported by Martino et al. (2015) and Mirabella et al. (2016) in early generations of the same population. This is probably a consequence of differences in the genetic structure of the population across generations and/or in the environmental conditions under which the experiments were carried out.

Phenotypic correlations between SF and grain yield or grain number/m2 showed high inter-annual variability (Fig. 3), probably due to the low heritability values (and high environmental and genotype × environment variances) observed for grain yield and grain number/m2 (Table 3). However, none of these correlations were negative. This is encouraging in terms of assessing the feasibility of applying SF as a selection criterion in breeding for higher grain yield. Furthermore, when environmental and interaction effects are removed and only the genetic values are considered (i.e., when BLUPs are used), SF shows a high genetic correlation with grain number/m2 (Table 2) and a lower, yet significant and positive, correlation with grain yield.

Normally, an essential condition for an advanced line to be released as a commercial cultivar is that it shows high and stable yield across environments (locations and years); otherwise it will probably be discarded. The results of the different selection strategies applied in this study show the advantage of using SF in achieving genetic progress in yield and its stability across years. Selection of the highest yielding lines or those with the highest SF values increased average grain yield across years. When yield was used as the sole selection criterion, responses to selection were highly variable (Table 4). In contrast, response to selection in grain yield when the top SF lines were selected was positive in all cases, and more so as selection intensity increased. Selection for high SF not only increased grain yield but also generated a more stable yield response than selection for high grain yield alone. Furthermore, when the selection criterion applied was a combination of high yield and high SF, additional yield increases were observed, but with some loss in yield stability. These selection schemes should be further validated in breeding programs by evaluating advanced lines with diverse genetic backgrounds under the specific environmental conditions to which they are targeted. Such studies should also include commercial and bread-making quality traits, such as protein content and quality, in order to assess possible trade-offs that could arise under these selection strategies.

Conclusion

Our findings show the feasibility of using spike fertility index as a selection criterion for improving grain yield and stability in bread wheat. Further evaluation of RIL populations or advanced lines with diverse genetic backgrounds under different environmental conditions is required to validate the present results.