Introduction

In maize (Zea mays L.) breeding, genomewide selection (or genomic selection) is typically performed among the progeny within a biparental cross. Suppose a biparental population is formed from the cross between two maize inbreds (A and B) that belong to the Iowa Stiff Stalk Synthetic (BSSS) heterotic group. A training population for the A/B cross, as well as for all other BSSS biparental crosses, can be made by pooling all prior biparental crosses that belong to the same BSSS genetic background. On the other hand, because genomewide selection is most effective when the training population is representative of the A/B population (Schulz-Streeck et al. 2012; Riedelsheimer et al. 2013; Jacobson et al. 2014), the response to selection (R) and predictive ability (rMP) within the A/B population may be higher if the training population includes only those prior biparental populations that are most related to the A/B cross.

Suppose A/* is a prior biparental cross with A as a parent, */B is a biparental cross with B as a parent, and */* is a biparental cross with parents that are different from A and B but have the same genetic background as A and B. These prior biparental crosses lead to three types of training populations to predict the performance of progeny within the A/B biparental cross. In the general combining ability (GCA) model, the training population is formed by pooling all prior A/* populations and all prior */B populations (Jacobson et al. 2014). In the same background (SB) model, the training population is formed by pooling all prior */* populations. In the SB + GCA model, all prior */*, A/*, and */B biparental crosses are pooled as the training population for the A/B biparental cross. Among 30 maize A/B populations, the mean R for grain yield, moisture, and test weight was higher with the GCA model than with the SB model and SB + GCA model (Jacobson et al. 2014). The GCA model therefore provides an easy rule for choosing an ad hoc training population that is smaller but is highly related to each A/B population undergoing genomewide selection.

In the Jacobson et al. (2014) study, the number of crosses (NX) in the training population was kept constant between the GCA model and the SB model, and the number of lines (N) was kept generally similar between the two models. This assumption was needed to make an equal resource comparison between the two models. However, this assumption did not reflect the reality that NX and N are naturally higher in the SB model than in the GCA model because a breeding program has more */* crosses than A/* and */B crosses. In practice, all prior */* crosses can be used in the SB model. As shown later in this study, the training population in the GCA model may consist of 4500 lines from all prior A/* and */B biparental crosses, but the training population in the SB model may consist of 50,000 lines. Because rMP increases as N increases (Daetwyler et al. 2008, 2010; Endelman et al. 2014; Lian et al. 2014), the larger training population may compensate for the lower level of relatedness between the training population and A/B population in the SB model.

Choosing only those */* crosses that meet a specified threshold of similarity with the A/B population may increase the relatedness between the SB training population and the A/B population. Furthermore, the GCA model assumed that both A and B had previously been used as parents of biparental crosses. While a new inbred obtained from an external source (Parra and Hallauer 1996; Phillips 2010) may immediately be used as one of the parents of an A/B cross, the new inbred would not have prior A/* or */B data available. Having either A/* crosses only or */B crosses only may decrease the effectiveness of the GCA model.

Our main objective was to determine whether R and rMP in an A/B population are higher with a single, large training population that is used for all biparental crosses (SB model), or with smaller ad hoc training populations that have a high level of relatedness with a given A/B population (GCA model). Our specific goals were to determine whether the usefulness of the GCA model is diminished in comparison with having prior data on only A or B, or in comparison with a SB model that has a larger NX and N as well as different levels of similarity between the */* populations and the A/B population.

Materials and methods

Phenotypic and marker data

The data and procedures have been previously described (Jacobson et al. 2014, 2015a, b; Lian et al. 2014; Brandariz and Bernardo 2018) but are also briefly described here for the readers’ convenience. Monsanto provided us with testcross phenotypic and SNP marker data for 969 biparental maize populations. The populations were evaluated for grain yield (Mg ha−1 at 155 g H2O kg−1), moisture (g H2O kg−1), and test weight (kg hL−1) at four to 12 environments in the USA from 2000 to 2008 (Jacobson et al. 2014). A total of 27 F2 populations were selected as the A/B populations according to criteria described by Jacobson et al. (2014, 2015a). The lines in the A/B and training populations had the same inbred as the tester. The A and B parents belonged to the same heterotic group, whereas the tester belonged to the opposite heterotic group. Among the 969 biparental crosses, 485 A/B populations were in one heterotic group, whereas 484 A/B populations were in an opposite heterotic group. For a given trait, F2 populations with nonsignificant (p > 0.05) h2 estimates were excluded from the analysis.

The parents of the populations were genotyped with 2911 SNP markers, whereas the progeny in each of the 27 F2 populations was genotyped with 25–123 markers. The genotypes at each locus were coded as 1 if the line was homozygous for the SNP allele from parent A, − 1 if the line was homozygous for the SNP allele from parent B, and 0 if the line was heterozygous. Marker loci that were monomorphic between the two parental inbreds or that had a minor allele frequency less than 0.10 were excluded within each population (Lian et al. 2014; Jacobson et al. 2015a). The SNP data for the progeny were then imputed from the parental SNP data, based on the conditional probability of a non-observed marker genotype given the two flanking marker genotypes (Jacobson et al. 2015a). A previous study with the same data sets showed that marker imputation improved the predictive ability and response to selection, and that around 500 SNP markers were sufficient for grain yield and 1000 SNP markers were sufficient for moisture and test weight (Jacobson et al. 2015a).

Training populations

The training populations were constructed as follows: (1) GCA model, wherein all A/* populations and all */B populations were pooled in the training population for predicting the performance of progeny in the A/B population; (2) A/* populations only; (3) */B populations only; (4) SB model, wherein the training population comprised all available */* populations within each heterotic group; (5) SB model with the same NX and a similar N as the GCA model (SBEqual); (6) SB model with a coefficient of similarity greater than 0.60 between the */* crosses and the A/B population (SB0.60); (7) SB model with a coefficient of similarity greater than 0.70 between the */* crosses and the A/B population (SB0.70); and (8) SB + GCA model, wherein all */*, A/*, and */B populations were pooled. The coefficient of similarity between the parents of an A/B population and training population was calculated as the simple matching coefficient across the SNP loci (Sokal and Michener 1958), as described by Jacobson et al. (2015b).

Coefficient of coancestry

For each A/B population, we estimated the coefficient of coancestry among A, B, and the parents denoted by * to which A and B were crossed. As shown in “Results” section, these coefficients of coancestry (fAB, fA*, and f*B) determined the coefficient of coancestry between an individual in the training population and an individual in the A/B population. The marker-based coefficient of coancestry between any two individuals (X and Y) was estimated as fXY = (SXY − Ɵ)/(1 − Ɵ), where SXY was the marker similarity between X and Y, and Ɵ was the probability that two individuals share alleles that are alike in state but are not identical by descent (Lynch 1988; Bernardo 1993, 2010). Given that SNP loci are biallelic, we assumed Ɵ was equal to 0.50. This value of Ɵ further assumed allele frequencies of 0.50 among unrelated individuals. Nevertheless, Ɵ was expected to remain close to 0.50 as long as marker allele frequencies ranged from about 0.40–0.60. With the latter allele frequencies, Ɵ was [1 − 2(0.60)(0.40)] = 0.52 instead of 0.50.

Response to selection and predictive ability

Marker effects were obtained for all 2911 SNP loci (non-imputed and imputed) by ridge regression–best linear unbiased prediction (RR–BLUP) as implemented in the rrBLUP package (Endelman 2011) in R software (R Core Team 2017). We used RR–BLUP because previous studies have shown that more complex models did not substantially improve the prediction accuracy for yield (Lorenzana and Bernardo 2009; Heffner et al. 2011; Heslot et al. 2012; Riedelsheimer et al. 2012). Marker effects were estimated separately within each cross for each trait according to procedures described by Jacobson et al. (2014). The performance of all N individuals in the A/B population was predicted as y = μ1 + Xm, where y was an N × 1 vector of predicted performance; μ was the estimated overall mean; 1 was an N × 1 vector with elements equal to 1; X was an N × NM incidence matrix with elements of 1, − 1, and 0; and m was an NM × 1 vector of RR–BLUP marker effects averaged across the biparental populations in the training population, e.g., A/* and */B populations in the GCA model, and */* populations in the SB model (Jacobson et al. 2014). The R and rMP were calculated for each A/B population. For each trait, R was calculated as the phenotypic mean of the 10% of lines with the best predicted performance minus the overall mean of the A/B population. The rMP was calculated as the correlation between the marker-predicted and observed values for the progeny in each A/B population. A t test was used to determine whether the R and rMP values across the test populations were significantly different (p < 0.05) among the training population models for each trait.

Results

Training population size, genetic similarity, and coancestry

The number of crosses (NX) and lines (N) varied among the different types of training populations. The ranking of the models in terms of size of the training population (largest to smallest) was as follows: SB + GCA, SB, SB0.60, SB0.70, GCA and SBEqual, and A/* and */B (Fig. 1). The SB + GCA model had a mean (range in parentheses) NX of 347 (338–357) and a mean N of 54,466 (53,406–55,627). The SB model had a mean NX of 320 (289–352) and a mean N of 49,941 (45,422–54,691, Fig. 1). The GCA model had a mean NX of 27 (5–59) and a mean N of 4525 (894–10,171). In terms of N, the SB model training population was 450–6000% as large as the training population in the GCA model. The A/* model training population was 60% of the size of the GCA model training population, and the */B model training population was 40% of the size of the GCA model training population.

Fig. 1
figure 1

Mean genetic similarity between training and A/B populations (top), number of crosses in the training population (middle), and number of lines in the training population (bottom) for eight training population models across the 27 A/B populations. Across the 27 A/B populations, the mean genetic similarity was: 0.80 for the GCA, A/*, and */B models; 0.75 for the SB0.70 model; and 0.72 for the SBEqual, SB, SB0.60, and SB + GCA models. The mean number of crosses in the training population was: 347 for the SB + GCA model; 320 for the SB model; 314 for the SB0.60 model; 198 for the SB0.70 model; 27 for the GCA and SBEqual models; 16 for the A/* model; and 11 for the */B model. The mean number of lines in the training population was: 54,466 for the SB + GCA model; 49,941 for the SB model; 48,992 for the SB0.60 model; 30,774 for the SB0.70 model; 4525 for the GCA model; 4175 SBEqual model; 2726 for the A/* model; and 1777 for the */B model

The marker similarity between the training population and the A/B population (STP,A/B) varied among the different models for constructing the training population. The STP,A/B was highest in the GCA model, with a mean STP,A/B (range in parentheses) of 0.80 (0.73–0.86, Fig. 1). The A/* and */B models had a mean STP,A/B of 0.80 (0.73–0.85). The SBEqual model had the lowest STP,A/B, with a mean of 0.72 (0.63–0.77), but the STP,A/B values were close among the SBEqual, SB, SB0.60, and SB + GCA models.

The coefficient of coancestry between the training population and the A/B population (fTP,A/B) likewise varied among the different models. First, the marker-based coefficients of coancestry had a mean (range in parentheses) of 0.47 (0.21–0.67) for fAB, 0.65 (0.47–0.94) for fA*, and 0.70 (0.45–0.92) for f*B in the GCA model. Second, we found that the expected value of fTP,A/B was (1 + fAB + fA* + f*B)/4 in the GCA model. The estimated fTP,A/B in the GCA model had a mean of 0.71 (0.59–0.80). Third, the marker-based coefficients of coancestry had a mean (range in parentheses) of 0.44 (0.26–0.55) for fA*, and 0.43 (0.26–0.52) for f*B in the SB model. Fourth, we found that the expected value of fTP,A/B was (fA* + f*B)/2 in the SB model. The estimated fTP,A/B in the SB model had a mean of 0.44 (0.28–0.51). Fifth, we found that the expected value of fTP,A/B was (1 + fAB + fA* + f*B)/4 in the A/* and */B models. This expected value was the same as that for the GCA model, and the estimated fTP,A/B had a mean of 0.73 (0.57–0.81) in the A/* model and 0.67 (0.57–0.78) in the */B model. Sixth, the estimated fTP,A/B in the SB + GCA model had a mean of 0.45 (0.28–0.53).

Response to selection and predictive ability with different training populations

A single, large training population (SB model and SB + GCA model) led to lower mean R and mean rMP across the 27 A/B populations compared to the GCA model (Table 1). The differences in R and rMP between the GCA model and SB model were statistically significant (p < 0.05) for moisture and test weight, whereas the differences between the GCA model and SB + GCA model were statistically significant only for test weight. For grain yield, R was 0.22 Mg ha−1 with the GCA model, 0.17 Mg ha−1 with the SB + GCA model, and 0.15 Mg ha−1 with the SB model. The corresponding rMP for grain yield was 0.20 with the GCA model, 0.18 with the SB + GCA model, and 0.16 with the SB model. For moisture and test weight, the SB model and SB + GCA model led to larger decreases in R and rMP (Table 1).

Table 1 Mean and range (in parentheses) of response to selection (R) and predictive ability (rMP) for grain yield, moisture, and test weight across 27 A/B maize populations, for the following training population models: (1) general combining ability (GCA) model with A/* and */B crosses; (2) A/* crosses only (A/*); (3) */B crosses only (*/B); (4) same background model with all */* crosses (SB); (5) SB with the same number of randomly selected */* crosses as the GCA model (SBEqual); (6) SB with a coefficient of similarity between the */* crosses and A/B population greater than 0.60 (SB0.60); (7) SB with a coefficient of similarity between the */* crosses and A/B population greater than 0.70 (SB0.70); and (8) SB + GCA model with */*, A/*, and */B crosses

Restricting the SB model to having the same number of randomly selected crosses (SBEqual) as the GCA model significantly decreased R and rMP for all traits (p < 0.05, Table 1). For grain yield, R was initially 0.22 Mg ha−1 with the GCA model, decreased to 0.15 Mg ha−1 with the SB model, and decreased further to 0.11 Mg ha−1 with the SBEqual model. The corresponding rMP for grain yield was 0.20 with the GCA model, 0.16 with the SB model, and 0.12 with the SBEqual model. Larger reductions were found for moisture and test weight (Table 1). The values of R and rMP with the SBEqual model differed from those reported by Jacobson et al. (2014) because of three reasons: (1) the previous study used unimputed marker data, while in this study marker data were imputed from lower-density screening of the progeny in each biparental cross (see “Materials and methods”); (2) the previous study used 30 instead of 27 A/B populations; and (3) the crosses for the SBEqual training population were selected at random, which meant that the */* crosses used were not the same in the two studies.

Filtering the */* crosses (used in the SB model) to increase similarity with the A/B population (SB0.60 or SB0.70) was ineffective for improving R and rMP compared to the SB model (Table 1). For grain yield, R was 0.22 Mg ha−1 with the GCA model and 0.15–0.16 Mg ha−1 with the SB, SB0.60, and SB0.70 models. Similar trends were found for moisture and test weight (Table 1).

When the training population included only the A/* crosses or only the */B crosses, R and rMP were reduced although the differences were significant (p < 0.05) only for test weight (Table 1). For grain yield, R was 0.22 Mg ha−1 with the GCA model and 0.15–0.16 Mg ha−1 with the A/* and */B models, whereas the corresponding rMP was 0.20 with the GCA model and 0.15–016 with the A/* and */B model. Similar trends were found for moisture and test weight (Table 1).

Discussion

Our main finding was that in genomewide selection in maize, it is better to use an ad hoc training population for each A/B biparental population (GCA model) than to use a single, large training population for all A/B populations (SB model and SB + GCA model). The results also showed the importance of both the relatedness between the training population and A/B population, and the size of the training population.

The SB model and SB + GCA models had, to our knowledge, the largest training populations ever described in the literature for plants (mean N > 50,000; Fig. 1). Despite this very large N in the SB model and SB + GCA model, the R and rMP were higher with the GCA model, which had a mean N of about 4500. This result indicated that the relatedness between the A/B population and training population is more important than the size of the training population. Previous studies likewise highlighted the importance of including related crosses in the training population rather than increasing N by adding unrelated or less related crosses (Riedelsheimer et al. 2013; Jacobson et al. 2014; Lorenz and Smith 2015).

The higher relatedness in the GCA model was evidenced by the higher coefficient of coancestry between the training population and the A/B population (fTP,A/B) in the GCA model (0.71) than in the SB model (0.44) and SB + GCA model (0.45). In accordance with theoretical expectations, the estimated fTP,A/B was equal between the GCA model and the A/* or */B models. Furthermore, fTP,A/B is expected to be highest when individuals in an A/B population are used as the training population for other individuals in the same A/B population [i.e., A/B model, Jacobson et al. (2014)]. For the 27 A/B populations used in this study, the estimated fTP,A/B for the A/B model had a mean (range in parentheses) of 0.74 (0.60–0.84). The mean fTP,A/B of 0.71 in the GCA model was therefore close to the value of fTP,A/B in the A/B model.

The expected values of fTP,A/B were (1 + fAB + fA* + f*B)/4 in the GCA model and A/* and */B models, versus (fA* + f*B)/2 in the SB model. The high fTP,A/B in the GCA and A/* and */B models was partly due to the higher values of fA* and f*B in these three models than in the SB model. These higher values of fA* and f*B in the GCA model and A/* and */B models were likely due to A/B crosses being made primarily within subgroups of parental inbreds. Suppose that the Iowa Stiff Stalk Synthetic maize heterotic group includes three subgroups: B14-type inbreds, B37-type inbreds, and B73-type inbreds. Inbreds within each subgroup are more closely related than inbreds in different subgroups. Furthermore, suppose that A/B crosses are most often (but not always) made within each subgroup. If A and B are both B73-type inbreds, the A/* and */B populations that are pooled into the GCA model training population will be mostly of the B73 background. In contrast, the SB model and SB + GCA model for a B73-type A/B population will also include crosses within the B14 and B37 subgroups. The fA* and f*B will consequently be lower in the SB model than in the GCA model, as was observed in this study.

The GCA model and the A/*, */B, and SB + GCA models obviously assume that the A and B parents have been used as parents of prior biparental crosses. There may be situations in which neither parent has been used in inbred development in previous years. In this situation, the GCA model cannot be used and the SB model will need to be used. The lower R and rMP with the SB model than with the GCA model underscores the difficulty in predicting the performance of progeny of two untested parental inbreds.

Smaller sizes of the training population have been shown to decrease prediction accuracy (rMG), which is defined as the correlation between predicted and true genotypic values and is equal to rMP/h, where h is the square root of heritability (Daetwyler et al. 2008, 2010; Endelman et al. 2014; Lian et al. 2014). The expected prediction accuracy is E(rMG) = r2 [(Nh2/r2Nh2 + Me)]1/2, where r2 is the linkage disequilibrium between a marker and a quantitative trait locus, and Me is the effective number of chromosome segments (Lian et al. 2014). The above equation indicates that there is no minimum N required to reach a certain rMG, because the required N will depend on both h2 and Me (Combs and Bernardo 2013). The equation for E(rMG) applies only if the training population is genetically identical to the population undergoing genomewide selection (i.e., A/B model). For the 969 maize biparental crosses used in this study, Lian et al. (2014) reported mean values of r2 = 0.46, h2 = 0.46, and Me = 82 for grain yield. The mean N in the SB model (49,941) was about 10 times larger than the mean N (4525) in the SBEqual model. However, E(rMG) [with the r2, h2, and Me values reported by Lian et al. (2014)] increases by less than 4% when N increases from 4525 to 49,941. In contrast, if N increases tenfold from 450 to 4500, E(rMG) increases by 31%. Therefore, when N is already large, the increase in rMG due to further increases in N is minor. The training population was also already large with the A/* model (mean N of 2726) and */B model (mean N of 1777). This phenomenon of diminishing returns when N is large also explained why a 50% decrease in N led to less than a 50% decrease in rMP when the training population included prior crosses for only one of the two parents (A/* and */B models) in comparison with the GCA model.

In the SB0.60 and SB0.70 models, filtering the */* crosses to increase the similarity between the training population and the A/B population (STP,A/B) increased fTP,A/B while maintaining a large N (mean N > 36,000, Fig. 1). However, R and rMP were not significantly higher in the SB0.60 and SB0.70 models than in the SB model (Table 1). Because the mean STP,A/B in the GCA model was 0.80, we tried a stricter threshold of 0.80 (SB0.80 model) in 20 of the 27 A/B populations for which such a stricter threshold was possible. The SB0.80 model had a mean (range in parentheses) N of 4983 (164, 12,483) and a mean NX of 34 (1, 88), but the R and rMP in the SB0.80 model were lower than those in the SB0.70 and SB0.60 models (results not shown).

This lack of improvement was noteworthy because the mean N, NX, and STP,A/B were all roughly equal between the SB0.80 model and the GCA model. Suppose A1 and A2 are the parents of A, and B1 and B2 are the parents of B. A grandparental training population can be formed by pooling the A1/*, A2/*, */B1, and */B2 biparental crosses, but such a grandparental training population was found ineffective for predicting the performance of progeny in the A/B biparental population (Hickey et al. 2014). While all of the individual alleles in the A/B population are found among the four grandparents, chromosomal blocks found in the A/B biparental population are represented better in the A/* and */B crosses than in the grandparental crosses.

These previous results (Hickey et al. 2014), as well as those in the current study, suggest that the usefulness of the GCA model may be due to having large blocks of chromosomes in common between the A/* and */B crosses used as the training population and the A/B population undergoing genomewide selection. The current study used data on 2911 SNP markers and the Hickey et al. (2014) study simulated up to 100,000 markers. Given the agreement between our empirical results and the Hickey et al. (2014) simulation results, we speculate that our overall results would remain the same even if a larger number of SNP markers are used. In the maize populations we studied, the mean r2 between adjacent SNP markers was 0.93 within each of the 27 A/B populations, 0.49 across the pool of A/* and */B populations used in the GCA model, and 0.23 across the pool of */* populations used in the SB model. Having a larger number of markers may increase the linkage disequilibrium in the training populations for both the GCA and SB models, but the effect of this higher linkage disequilibrium is unclear given that the linkage disequilibrium in the A/B populations was already very high (0.93) with 2911 SNP markers. The current study and the Hickey et al. (2014) studies both used RR–BLUP, and it remains to be seen whether the combined use of Bayesian models and higher marker densities in the SB model and SB + GCA model would improve R and rMP for traits with no major QTL (such as grain yield) in maize biparental crosses.

Author contribution statement

RB conceived the study and helped revise the final manuscript. SPB conducted the analyses, interpreted the results, and wrote the manuscript.