Introduction

Grain yield in wheat (Triticum aestivum L.) is controlled by many genes and influenced by the interactions between genes and with environments (Heffner et al. 2011; Narjesi et al. 2015). Despite its widely recognized importance, it is still challenging to estimate grain yield across cycles or environments. In addition, the growing human population and climate change call for increasing global crop production and boosting genetic gains for grain yield per cycle (Ray et al. 2013). Genomic selection (GS) is an approach that allows the prediction of genomic estimated breeding values of lines in a breeding population by using genome-wide marker information (Meuwissen et al. 2001). Based on phenotypic and genotypic data from a training population, the GS approach builds a prediction model and predicts the unobserved lines using genotypic data only (Crossa et al. 2017). Compared to traditional approaches such as marker-assisted selection (MAS), GS has some intrinsic advantages: it increases genetic gain by reducing the duration of breeding cycles (Heffner et al. 2010), and it captures minor effect loci based on markers spread over the whole target genome (Hayes et al. 2009). The higher prediction accuracy of GS over MAS for quantitative traits (Arruda et al. 2016; Wang et al. 2014; Zhang et al. 2016) makes GS a promising approach for wheat breeding. With next-generation sequencing technology, GS has been applied to several quantitative traits in wheat, including grain yield (Heffner et al. 2011; Poland et al. 2012a, b; Sun et al. 2017), disease resistance (Juliana et al. 2017; Rutkoski et al. 2012, 2014), and nutritional quality (Heffner et al. 2011; Manickavelu et al. 2017; Velu et al. 2016).

In addition to genotyping, accurate prediction model training for GS requires reliable phenotypes. Because of the high labor and time cost, phenotyping becomes a crucial factor that limits genetic gains in plant breeding. Therefore, substantial efforts have been devoted to the development of high-throughput phenotyping (HTP) platforms in many crops in order to generate large-scale and in-depth phenotyping at low cost and labor intensity (Araus and Cairns 2014; Yang et al. 2014). Field-based HTP platforms have been established by the remote or proximal sensing and imaging technologies, in which the sensors and imaging techniques are differentially deployed based on each of their advantages, the traits of interest, and the experimental design in the field (Araus and Cairns 2014). Recently, HTP platforms have extended their applications to measure different traits in wheat, such as plant height (Holman et al. 2016), growth rate (Holman et al. 2016), vegetation indices (Haghighattalab et al. 2016), and disease resistance (Bauriegel et al. 2011; Devadas et al. 2015).

The majority of HTP platform applications in GS can be grouped into two categories. One takes advantage of the phenotypic data directly generated from the HTP platforms as the primary trait in the genomic prediction model training. For example, Watanabe et al. (2017) applied unmanned aerial vehicle (UAV) remote sensing to collect an indicator of sorghum plant height. They demonstrated that the predictive ability of the GS model based on the phenotypic data measured by UAV was similar to that based on traditional measurements, while the labor cost was significantly lower than for traditional sorghum height measurements. The other improves the prediction accuracy by first using the HTP platforms to measure traits that are genetically correlated with the primary trait, and then incorporating such secondary traits with the primary trait in a multi-trait genomic prediction model. For example, Rutkoski et al. (2016) and Sun et al. (2017) utilized canopy temperature (CT) and normalized difference vegetation index (NDVI) to improve the ability to predict grain yield within a population, leading to an average of 70% improvement in the predictive ability of GS. The traditional hand measurements of CT and NDVI are sensitive to the environmental conditions; in contrast, the data collected from HTP platforms are more robust because the data collection period and measurement errors are significantly reduced. Furthermore, the HTP platforms offer the opportunity to collect time-series data to observe plant growth continuously over time. This enables the comparison of the height of different sorghum accessions at the same growth stage (Watanabe et al. 2017) and allows the selection of wheat cultivars with high grain yield at an early plant growth stage (Sun et al. 2017). Nevertheless, HTP platforms are still sensitive to field variation, which adds to the error variances and must be reduced through improvements in the experimental designs and HTP technologies (Araus and Cairns 2014). Even so, the potential of applying HTP platforms in GS has been demonstrated, and more traits from HTP platforms will become accessible in the near future.

In addition, researchers have investigated different models to extract information from the large data sets collected from HTP platforms, whose response variables have a different structure, for example, time-series data. Rutkoski et al. (2016) utilized a repeatability model for secondary traits by considering each time point within a growth stage as a repeated measurement of the same trait. Sun et al. (2017) proposed a random regression model that is able to capture the trait evolution during the growth stages. Besides, Montesinos-López et al. (2017a, b) applied functional regression analysis to develop prediction equations for yield and other traits using hyperspectral crop image data together with genomic information; the method demonstrated similar prediction accuracy in most cases, but its predictive power was superior to conventional regression techniques in some particular cases.

Nowadays, breeders have gained valuable insights into the implementation of GS in breeding, but those applications were mostly limited to the same population within a breeding cycle (Michel et al. 2016). Auinger et al. (2016) pointed out that the GS predictive ability obtained within cycle could be considered as the upper limit, since the materials within the same cycle share close family relatedness and similar environmental and climatic conditions. However, when predicting across multiple growing cycles, it is expected that the genetic relationships between families in the population would be reduced and that the phenotypic data would be more variable due to the external environments. Together, those two factors reduce the genomic prediction accuracy across cycles. Several researchers have proposed approaches to increase the prediction accuracy for GS across cycles. Auinger et al. (2016) investigated the genomic prediction accuracy for grain yield and other traits across multiple breeding cycles in rye, and suggested that prediction accuracy across cycles could be improved by increasing sample size when the different cycles shared a sufficient number of common parents. In contrast, Michel et al. (2016) evaluated the genomic prediction for grain yield, protein content, and protein yield across five independent breeding cycles in wheat and found that dropping outlier cycles or environments had a negligible effect on the genomic prediction accuracy. Herein, we report an approach to improve the prediction accuracy for GS across cycles by utilizing secondary traits collected from HTP platforms.
The objectives of this study were to: (1) compare the predictive ability for grain yield within cycle and across cycles; (2) determine the ability of secondary traits to improve genomic prediction accuracy across populations and cycles; and (3) identify the optimum growth stage at which to collect secondary traits in order to improve the prediction accuracy for grain yield across cycles in different environments.

Methods and materials

Population and phenotyping

We generated phenotypic data from three different populations grown in three different crop cycles, 2013–2014, 2014–2015, and 2015–2016, as part of the elite yield trials conducted by the International Maize and Wheat Improvement Center (CIMMYT) at the Norman E. Borlaug Research Station, Ciudad Obregón, Mexico. Hereafter, cycles 2013–2014, 2014–2015, and 2015–2016 will be referred to as cycles 2014, 2015, and 2016. Each cycle comprised 1094 lines, including 1092 unique genotypes and two common checks, for a total of 3282 lines across all three populations. Within each cycle, lines were grouped into 39 trials, and each trial contained 28 unique lines and two checks in an alpha-lattice design with three replicates and six blocks. Grain yield was collected for all lines in the three cycles. Days to heading, defined as the number of days from planting until 50% of spikes had emerged from the flag leaf, was recorded for the first replicate of each trial in cycles 2014 and 2016 and for all three replicates in cycle 2015. Canopy temperature (CT) and green NDVI (GNDVI) were collected by hyperspectral and thermal cameras on board an aircraft flown at multiple wheat growth stages (Rutkoski et al. 2016). Days to phenotyping (phenotyping days) for CT and GNDVI were calculated as the phenotype collection date for CT or GNDVI minus the planting date within each cycle. The planting date of lines and the phenotyping date for secondary traits varied among growing cycles, resulting in different phenotyping days for secondary traits in each cycle (Supplemental Fig. 1). We analyzed phenotypic data for three growing cycles in three diverse field conditions: optimal, heat, and drought. The field conditions (plot and irrigation), planting date, average days to heading, and climatic information for each cycle in each environment are summarized in Table 1.
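The phenotyping-days definition above (collection date minus planting date) can be illustrated with a minimal Python sketch. The function name `phenotyping_days` and all dates are hypothetical placeholders chosen for illustration; they are not taken from the study.

```python
from datetime import date


def phenotyping_days(planting: date, collection: date) -> int:
    """Days elapsed between planting and one CT or GNDVI data collection."""
    return (collection - planting).days


# Hypothetical planting date and flight dates (assumptions, not study data).
planting = date(2014, 11, 20)
flights = [date(2015, 1, 15), date(2015, 2, 10), date(2015, 3, 5)]

# One phenotyping-days value per flight within the cycle.
days = [phenotyping_days(planting, flight) for flight in flights]
```

Because planting and flight dates differ among cycles, the same calculation gives a different set of phenotyping days for each cycle.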

Table 1 Field condition and climatic summary for each cycle in optimal, late heat, and drought environments, respectively

Genotyping

Genotyping by sequencing (GBS; Poland et al. 2012a, b) was applied for the genome-wide genotyping. Single nucleotide polymorphisms (SNPs) were called using the TASSEL GBS pipeline (Glaubitz et al. 2014) and the Chinese Spring reference genome (International Wheat Genome Sequencing Consortium, 2014), and they were filtered based on the following criteria: a marker was removed if more than 80% of the individuals had missing data for that SNP or if more than 20% of the individuals were heterozygous for that SNP, and lines that had more than 80% missing markers were removed. In addition, markers with a minor allele frequency of less than 0.01 were removed, and missing data were imputed with the marker mean, resulting in a total of 18,728 GBS SNP markers for 2960 individuals.
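The filtering criteria above can be sketched in plain Python. This is only an illustration under stated assumptions, not the pipeline used in the study: genotypes are assumed to be coded as -1, 0, 1 (0 for heterozygotes) with `None` for missing calls, the filters are assumed to run in the order listed in the text, and the heterozygosity rate is computed over all individuals. The function name `filter_and_impute` is hypothetical.

```python
def filter_and_impute(geno, max_marker_missing=0.80, max_het=0.20,
                      max_line_missing=0.80, min_maf=0.01):
    """Filter a lines x markers matrix and impute missing calls.

    Returns (imputed_matrix, kept_line_indices, kept_marker_indices).
    Assumed coding: -1 / 0 / 1, with 0 = heterozygous and None = missing.
    """
    n_lines = len(geno)
    n_markers = len(geno[0]) if geno else 0

    # Step 1: drop markers with too many missing or heterozygous calls.
    keep_markers = []
    for j in range(n_markers):
        col = [geno[i][j] for i in range(n_lines)]
        missing_rate = sum(v is None for v in col) / n_lines
        het_rate = sum(v == 0 for v in col) / n_lines
        if missing_rate <= max_marker_missing and het_rate <= max_het:
            keep_markers.append(j)
    if not keep_markers:
        return [], [], []

    # Step 2: drop lines with too many missing calls among the kept markers.
    keep_lines = [
        i for i in range(n_lines)
        if sum(geno[i][j] is None for j in keep_markers) / len(keep_markers)
        <= max_line_missing
    ]

    # Step 3: drop low-MAF markers, then impute missing calls with the marker mean.
    final_markers, columns = [], []
    for j in keep_markers:
        observed = [geno[i][j] for i in keep_lines if geno[i][j] is not None]
        if not observed:
            continue
        mean = sum(observed) / len(observed)
        p = (mean + 1) / 2  # allele frequency under the -1/0/1 coding
        if min(p, 1 - p) < min_maf:
            continue
        final_markers.append(j)
        columns.append([mean if geno[i][j] is None else geno[i][j]
                        for i in keep_lines])

    imputed = [[col[r] for col in columns] for r in range(len(keep_lines))]
    return imputed, keep_lines, final_markers


# Toy data (hypothetical): 5 lines x 4 markers.
toy = [
    [1, 1, None, 0],
    [-1, 1, None, 0],
    [1, 1, None, 1],
    [None, 1, None, -1],
    [1, 1, None, 0],
]
imputed, lines, markers = filter_and_impute(toy)
```

In the toy matrix, the third marker is dropped for missingness, the fourth for heterozygosity, and the second for being monomorphic, so only the first marker remains and its single missing call is replaced by the marker mean.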

Statistical models

We applied a two-step GS analysis strategy in this study. In the first step, different statistical models were used to derive best linear unbiased predictions (BLUPs) of each genotype for grain yield, CT, and GNDVI, separately. The BLUPs of grain yield were predicted using the first replicate, and the BLUPs of secondary traits were predicted from the remaining two replicates (Sun et al. 2017). In addition, since the lines in this data set are replicated the same number of times for each cycle within each field condition, differential shrinkage of the BLUPs used as the dependent variable is not an issue for the genomic prediction in the second step.

Grain yield

Best linear unbiased predictions (BLUPs) of each genotype for grain yield were calculated using a mixed model for each cycle in each environment, separately, and BLUPs for grain yield were adjusted for each cycle and environment by including days to heading as a fixed effect in the model (1):

$${\mathbf{y}} = {\mathbf{Xb}} + {\mathbf{Zg}} + {\mathbf{Wt}} + {\mathbf{Qp}} + {\mathbf{e}}$$
(1)

where \({\mathbf{y}}\) is the vector of observations for grain yield, \({\mathbf{X}}\), \({\mathbf{Z}}\), \({\mathbf{W}}\), and \({\mathbf{Q}}\) are incidence matrices corresponding to the fixed effect as days to heading (\({\mathbf{b}}\)), random genetic effect (\({\mathbf{g}}\)), random environmental trial effect (\({\mathbf{t}}\)), and random environmental block effects (\({\mathbf{p}}\)), and \({\mathbf{e}}\) is the random residual errors. The variance and covariance structures are based on the following assumptions: \({\mathbf{g}} \sim \,N\left( {0, {\mathbf{I}}\sigma_{\text{g}}^{2} } \right)\), \({\mathbf{t}}\sim\,N\left( {0, {\mathbf{I}}\sigma_{\text{t}}^{2} } \right)\), \({\mathbf{p}}\sim\,N\left( {0, {\mathbf{I}}\sigma_{\text{p}}^{2} } \right)\), and \({\mathbf{e}}\sim\,N\left( {0, {\mathbf{I}}\sigma_{\text{e}}^{2} } \right)\), \(\sigma_{\text{g}}^{2}\) is the genetic variance, \(\sigma_{\text{t}}^{2}\) and \(\sigma_{\text{p}}^{2}\) are environmental variances, \(\sigma_{\text{e}}^{2}\) is the residual variance, and \({\mathbf{I}}\) is the identity matrix.

Secondary trait

For secondary traits, CT and GNDVI from HTP platforms were collected over wheat growth stages and were considered as longitudinal data. BLUPs of each genotype for secondary traits were predicted by fitting a random regression cubic smoothing spline model for each trait within each year of each environment, separately. Sun et al. (2017) applied a random regression model to capture the continuous change of a secondary trait over wheat growth stages. A covariance at or between each time point can be fitted in the random regression model using a cubic smoothing spline. A cubic smoothing spline is a curve that is joined continuously by piecewise cubic functional segments, and each joint in the curve is referred to as a knot (Meyer 2005; White et al. 1999). More details about random regression models can be found in Meyer (2005). In this model, for each cycle within each environment, the number of knots (q) was the same as the number of time points (n) for each secondary trait in each environment. The matrix notation for the random regression (RR) model is (DeGroot et al. 2007; Mrode 2005; White et al. 1999):

$${\mathbf{y}} =\, {\mathbf{Xb}} + {\mathbf{Z}}_{{\mathbf{s}}} {\mathbf{s}} + {\mathbf{W}}_{{\mathbf{g}}} {\mathbf{g}} + {\mathbf{Z}}_{{\mathbf{g}}} {\mathbf{g}}_{{\mathbf{s}}} + {\mathbf{W}}_{{\mathbf{t}}} {\mathbf{t}} + {\mathbf{Z}}_{{\mathbf{t}}} {\mathbf{t}}_{{\mathbf{s}}} + {\mathbf{W}}_{{\mathbf{r}}} {\mathbf{r}} + {\mathbf{Z}}_{{\mathbf{r}}} {\mathbf{r}}_{{\mathbf{s}}} + {\mathbf{W}}_{{\mathbf{p}}} {\mathbf{p}} + {\mathbf{Z}}_{{\mathbf{p}}} {\mathbf{p}}_{{\mathbf{s}}} + {\mathbf{e}}$$
(2)

Here \({\mathbf{y}}\) is the vector of observations for a secondary trait, \({\mathbf{X}}\) is the incidence matrix corresponding to the fixed effect, which is phenotyping days in the model, and \({\mathbf{b}}\) is the vector of fixed effects. The matrices \({\mathbf{Z}}_{{\mathbf{s}}}\), \({\mathbf{Z}}_{{\mathbf{g}}}\), \({\mathbf{Z}}_{{\mathbf{t}}}\), \({\mathbf{Z}}_{{\mathbf{r}}}\), and \({\mathbf{Z}}_{{\mathbf{p}}}\) are incidence matrices of the spline coefficients for the overall spline, genetic effect, and environmental effects including trial, replicate, and block effects. \({\mathbf{s}}\) is the overall spline parameter with length (q−2), \({\mathbf{g}}_{{\mathbf{s}}}\) is the spline deviation parameter for each genotype with length (q−2) × m where m is the number of genotypes, \({\mathbf{t}}_{{\mathbf{s}}}\) is the spline deviation parameter for trial effects with length (q−2) × t where t is the number of trials, \({\mathbf{r}}_{{\mathbf{s}}}\) is the spline deviation parameter for replicates nested within the trial effects with length (q−2) × r × t where r is the number of replicates, and \({\mathbf{p}}_{{\mathbf{s}}}\) is the spline deviation parameter for block effects nested within replicate and trial with length (q−2) × p × r × t where p is the number of blocks. The matrices \({\mathbf{W}}_{{\mathbf{g}}}\), \({\mathbf{W}}_{{\mathbf{t}}}\), \({\mathbf{W}}_{{\mathbf{r}}}\), and \({\mathbf{W}}_{{\mathbf{p}}}\) are incidence matrices of the linear coefficients relating to the random genetic, random environmental trial, replicate, and block effects.
\({\mathbf{g}}\) is the vector of genetic effects for each genotype including genetic intercept (\(g_{\text{i}}\)) and slope (\(g_{\text{sl}}\)) parameters with length 2m, and \({\mathbf{t}}\), \({\mathbf{r}}\), and \({\mathbf{p}}\) are vectors of environmental (trial, replicate, and block) effects including environmental intercept (\(t_{\text{i}}\), \(r_{\text{i}}\), \(p_{\text{i}}\)) and slope (\(t_{\text{sl}}\), \(r_{\text{sl}}\), \(p_{\text{sl}}\)) parameters with lengths 2t, 2r × t, and 2p × r × t, respectively. \({\mathbf{e}}\) is the residual effect (DeGroot et al. 2007).

The variance components are assumed as: \({\mathbf{s}}\sim\,N\left( {{\mathbf{0}},{\mathbf{D}}\sigma_{s}^{2} } \right)\), \({\mathbf{g}}_{{\mathbf{s}}} \sim\,N\left( {{\mathbf{0}},{\mathbf{D}}\sigma_{gs}^{2} } \right)\), \({\mathbf{t}}_{{\mathbf{s}}} \sim\,N\left( {{\mathbf{0}},{\mathbf{D}}\sigma_{ts}^{2} } \right)\), \({\mathbf{r}}_{{\mathbf{s}}} \sim\,N\left( {{\mathbf{0}},{\mathbf{D}}\sigma_{rs}^{2} } \right)\), \({\mathbf{p}}_{{\mathbf{s}}} \sim\,N\left( {{\mathbf{0}},{\mathbf{D}}\sigma_{ps}^{2} } \right)\), \({\mathbf{g}}\sim\,N\left( {{\mathbf{0}},{\mathbf{I}} \otimes {\mathbf{K}}_{g} } \right)\), \({\mathbf{t}}\sim\,N\left( {{\mathbf{0}},{\mathbf{I}} \otimes {\mathbf{K}}_{t} } \right)\), \({\mathbf{r}}\sim\,N\left( {{\mathbf{0}},{\mathbf{I}} \otimes {\mathbf{K}}_{r} } \right)\), \({\mathbf{p}}\sim\,N\left( {{\mathbf{0}},{\mathbf{I}} \otimes {\mathbf{K}}_{p} } \right)\), \({\mathbf{e}}\sim\,N\left( {{\mathbf{0}},{\mathbf{I}}\sigma_{e}^{2} } \right)\), where \({\mathbf{D}}\) is the identity matrix for splines with dimension (q−2) × (q−2), \({\mathbf{I}}\) is an identity matrix whose order corresponds to the genetic, environmental (trial, replicate, and block), or residual effects, and \(\otimes\) denotes the Kronecker product.
\({\mathbf{K}}_{g}\), \({\mathbf{K}}_{t}\), \({\mathbf{K}}_{r}\), and \({\mathbf{K}}_{p}\) are unstructured covariance matrices: \({\mathbf{K}}_{g} = \left[ {\begin{array}{*{20}c} {\sigma_{{g_{\text{i}} }}^{2} } & {\sigma_{{g_{\text{i}} g_{\text{sl}} }} } \\ {\sigma_{{g_{\text{sl}} g_{\text{i}} }} } & {\sigma_{{g_{\text{sl}} }}^{2} } \\ \end{array} } \right]\), \({\mathbf{K}}_{t} = \left[ {\begin{array}{*{20}c} {\sigma_{{t_{\text{i}} }}^{2} } & {\sigma_{{t_{\text{i}} t_{\text{sl}} }} } \\ {\sigma_{{t_{\text{sl}} t_{\text{i}} }} } & {\sigma_{{t_{\text{sl}} }}^{2} } \\ \end{array} } \right]\), \({\mathbf{K}}_{r} = \left[ {\begin{array}{*{20}c} {\sigma_{{r_{\text{i}} }}^{2} } & {\sigma_{{r_{\text{i}} r_{\text{sl}} }} } \\ {\sigma_{{r_{\text{sl}} r_{\text{i}} }} } & {\sigma_{{r_{\text{sl}} }}^{2} } \\ \end{array} } \right]\), and \({\mathbf{K}}_{p} = \left[ {\begin{array}{*{20}c} {\sigma_{{p_{\text{i}} }}^{2} } & {\sigma_{{p_{\text{i}} p_{\text{sl}} }} } \\ {\sigma_{{p_{\text{sl}} p_{\text{i}} }} } & {\sigma_{{p_{\text{sl}} }}^{2} } \\ \end{array} } \right]\), where subscripts i and sl represent intercept and slope, respectively.

The BLUP for each line at each time point was calculated as the following:

$$ {\text{BLUP}} = {\mathbf{W}}_{{\mathbf{g}}} {\mathbf{g}} + {\mathbf{Z}}_{{\mathbf{g}}} {\mathbf{g}}_{{\mathbf{s}}} $$

The method to calculate \({\mathbf{Z}}_{{\mathbf{g}}}\) was described in White et al. (1999). The ‘predict’ function implemented in ASReml-R can also be used to calculate the BLUP for each line at each time point by including the \({\mathbf{W}}_{{\mathbf{g}}} {\mathbf{g}}\) and \({\mathbf{Z}}_{{\mathbf{g}}} {\mathbf{g}}_{{\mathbf{s}}}\) terms only. The BLUP was predicted at the same time points for each of the three cycles in each environment, and those time points were selected within the range of available phenotyping days across the three cycles (Supplemental Fig. 1). An averaged BLUP across all time points for each cycle was calculated as well.

Heritability and correlation

Variance components for narrow sense heritability for each secondary trait and grain yield in each environment were estimated using the following model:

$${\mathbf{y}} = {\mathbf{Xb}} + {\mathbf{Zg}} + {\mathbf{e}}$$
(3)

where \({\mathbf{y}}\) is the vector of BLUPs of genotypes for secondary traits or for grain yield, \({\mathbf{X}}\) and \({\mathbf{Z}}\) are incidence matrices corresponding to the fixed effect (\({\mathbf{b}}\)) and random genetic effect (\({\mathbf{g}}\)), and \({\mathbf{e}}\) is the vector of random residual errors. The variance and covariance structures are based on the following assumptions: \({\mathbf{g}}\sim\,N\left( {0, {\mathbf{G}}\sigma_{\text{a}}^{2} } \right)\), where \({\mathbf{G}}\) is the genomic relationship matrix and \(\sigma_{\text{a}}^{2}\) is the additive genetic variance, and \({\mathbf{e}}\sim\,N\left( {0, {\mathbf{I}}\sigma_{\text{e}}^{2} } \right)\), where \(\sigma_{\text{e}}^{2}\) is the residual variance and \({\mathbf{I}}\) is the identity matrix. Narrow sense heritability was calculated as: \(h^{2} = \frac{{\sigma_{\text{a}}^{2} }}{{\sigma_{\text{a}}^{2} + \sigma_{\text{e}}^{2} }}\).

Variance and covariance components for correlations were estimated using the bivariate model for each year in each environment:

$$\left[ {\begin{array}{*{20}c} {{\mathbf{y}}_{1} } \\ {{\mathbf{y}}_{2} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {{\mathbf{X}}_{1} } & 0 \\ 0 & {{\mathbf{X}}_{2} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {{\mathbf{b}}_{1} } \\ {{\mathbf{b}}_{2} } \\ \end{array} } \right] + \left[ {\begin{array}{*{20}c} {{\mathbf{Z}}_{1} } & 0 \\ 0 & {{\mathbf{Z}}_{2} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {{\mathbf{g}}_{1} } \\ {{\mathbf{g}}_{2} } \\ \end{array} } \right] + \left[ {\begin{array}{*{20}c} {{\mathbf{e}}_{1} } \\ {{\mathbf{e}}_{2} } \\ \end{array} } \right]$$
(4)

where \({\mathbf{y}}\) are BLUPs of genotypes for grain yield and secondary traits, and subscripts 1 and 2 represent trait 1 (grain yield) and trait 2 (one of the secondary traits, CT or GNDVI), separately, \({\mathbf{X}}\) and \({\mathbf{Z}}\) are the fixed and random effects design matrix, individually, and \({\mathbf{b}}\), \({\mathbf{g}}\), and \({\mathbf{e}}\) are vectors of fixed effects, random genetic, and residual effects for each trait, respectively. Variance components were estimated by assuming \(\left[ {\begin{array}{*{20}c} {{\mathbf{g}}_{1} } \\ {{\mathbf{g}}_{2} } \\ \end{array} } \right]\sim\,N\left( {0, {\mathbf{H}} \otimes {\mathbf{G}}} \right)\), where \({\mathbf{G}}\) is the genomic relationship matrix, and \({\mathbf{H}}\) is the genetic variance–covariance matrix for traits. In addition, \(\left[ {\begin{array}{*{20}c} {{\mathbf{e}}_{1} } \\ {{\mathbf{e}}_{2} } \\ \end{array} } \right]\sim\,N\left( {0, {\mathbf{I}} \otimes {\mathbf{R}}} \right)\), where \({\mathbf{I}}\) is an identity matrix, and \({\mathbf{R}}\) is the residual variance–covariance matrix between traits. Both \({\mathbf{H}}\) and \({\mathbf{R}}\) are assumed as unstructured.

Genetic correlations between secondary traits and grain yield were calculated as:

$$r_{{{\text{g}}\left( {{\text{ST}}, {\text{GRYLD}}} \right)}} = \frac{{{\text{cov}}_{\text{g}} \left( {{\text{ST}}, {\text{GRYLD}}} \right)}}{{\sqrt {{\text{var}}_{\text{g}} \left( {\text{ST}} \right){\text{var}}_{\text{g}} \left( {\text{GRYLD}} \right)} }}$$

where \(r_{{{\text{g}}\left( {{\text{ST}}, {\text{GRYLD}}} \right)}}\) is the genetic correlation between secondary trait (either CT or GNDVI) and grain yield, \({\text{var}}_{\text{g}} \left( {\text{ST}} \right)\) and \({\text{var}}_{\text{g}} \left( {\text{GRYLD}} \right)\) are the genetic variances of secondary trait and grain yield, individually; \({\text{cov}}_{\text{g}} \left( {{\text{ST}}, {\text{GRYLD}}} \right)\) is the genetic covariance between a secondary trait and grain yield.
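Once the variance and covariance components have been estimated, the narrow sense heritability and the genetic correlation follow directly from the formulas in this section. The short Python sketch below shows only that final arithmetic; the function names and the numeric variance components are hypothetical and are not estimates from this study.

```python
import math


def narrow_sense_h2(var_a: float, var_e: float) -> float:
    """h^2 = additive genetic variance / (additive + residual variance)."""
    return var_a / (var_a + var_e)


def genetic_correlation(cov_g: float, var_g_st: float, var_g_gryld: float) -> float:
    """r_g = cov_g(ST, GRYLD) / sqrt(var_g(ST) * var_g(GRYLD))."""
    return cov_g / math.sqrt(var_g_st * var_g_gryld)


# Hypothetical variance components, for illustration only.
h2 = narrow_sense_h2(var_a=0.30, var_e=0.70)
r_g = genetic_correlation(cov_g=-0.12, var_g_st=0.16, var_g_gryld=0.36)
```

A negative genetic covariance, as in this made-up example, gives a negative genetic correlation between the secondary trait and grain yield.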

Cross-validation

In the second step of GS, the BLUPs of individuals except checks for secondary traits and grain yield were utilized as the dependent variables in our genomic prediction models. The predictive ability for grain yield was investigated in two different genomic prediction models: univariate (UV) and bivariate (BV) prediction models. The UV model was the same as model (3), where \({\mathbf{y}}\) is the BLUPs of genotypes only for the grain yield. The BV genomic prediction model was employed to identify the genomic predictive ability for grain yield after including secondary trait in the model fitting, in which the model was the same as model (4). Fivefold cross-validation was applied for all genomic predictions. The predictive ability for grain yield for three cycles was identified in two different ways: within cycle and across cycles. Thus, four different types of cross-validation schemes were evaluated based on different objectives:

1. UV prediction model within cycle: the data within a growing cycle were randomly divided into five equally sized folds, and the grain yield data of 80% of the lines (the training population) were used to predict the grain yield of the remaining 20% of the lines (the testing population) within each growing cycle.

2. BV genomic prediction model within cycle: the data within a growing cycle were randomly divided into five equally sized folds, and the grain yield of 20% of the lines (the testing population) was predicted from the grain yield data of 80% of the lines (the training population) and the secondary trait data of all lines in both training and testing populations within each cycle.

3. UV prediction model across cycles: one of the cycles was considered as the training cycle, and another cycle was considered as the testing cycle. The data in the testing cycle were randomly separated into five equally sized folds, and for every fold, the grain yield of 20% of randomly selected lines in the testing cycle was predicted from the grain yield data of all lines in the training cycle.

4. BV prediction model across cycles: one of the cycles was considered as the training cycle, and another cycle was considered as the testing cycle. The data in the testing cycle were randomly separated into five equally sized folds, and for every fold, the grain yield of 20% of randomly selected lines in the testing cycle was predicted from the grain yield and secondary traits of all lines in the training cycle and the secondary trait of those 20% of lines in the testing cycle.
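The four schemes differ only in which records are visible to the model when a fold is predicted. The Python sketch below shows one way to express that bookkeeping; it does not fit any model, and the function names `five_folds` and `visible_data` as well as the scheme labels are hypothetical.

```python
import random


def five_folds(line_ids, seed=0, k=5):
    """Randomly split line identifiers into k near-equal folds."""
    ids = list(line_ids)
    random.Random(seed).shuffle(ids)
    return [ids[i::k] for i in range(k)]


def visible_data(scheme, fold, cycle_lines, train_cycle_lines=()):
    """Return (lines with grain yield observed, lines with secondary trait observed).

    scheme: 'UV_within', 'BV_within', 'UV_across', or 'BV_across' (assumed labels).
    fold: lines of the testing population whose grain yield is to be predicted.
    cycle_lines: all lines of the cycle that contains the fold.
    train_cycle_lines: all lines of the training cycle (across-cycle schemes only).
    """
    fold = set(fold)
    bivariate = scheme.startswith("BV")
    if scheme.endswith("within"):
        # Training population: the other four folds of the same cycle.
        yield_lines = set(cycle_lines) - fold
        secondary_lines = set(cycle_lines) if bivariate else set()
    else:
        # Training population: every line of the training cycle.
        yield_lines = set(train_cycle_lines)
        secondary_lines = (set(train_cycle_lines) | fold) if bivariate else set()
    return yield_lines, secondary_lines


# Toy usage with ten hypothetical lines in the testing cycle.
folds = five_folds(range(10))
y_obs, st_obs = visible_data("BV_across", folds[0], range(10), ["a", "b", "c"])
```

In every scheme the grain yield of the fold itself is hidden; the bivariate schemes additionally expose the secondary trait of the lines being predicted.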

For each fold, the predictive ability was calculated as the Pearson correlation between the BLUPs of grain yield and the estimated breeding values (EBVs) of grain yield from genomic prediction models of lines in the testing population based on genomic relationship matrix. In addition, cross-validation was conducted for each field condition, separately. The percentage of the improvement in GS with secondary traits was calculated as the predictive ability of GS with secondary trait (BV model) minus the predictive ability of GS with grain yield only (UV model) and then divided by the absolute value of the predictive ability of GS with grain yield only (UV model).
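The two summary statistics described above, the Pearson correlation used as predictive ability and the percentage improvement of the BV model over the UV model, can be written as a short Python sketch. The function names and example values are hypothetical and are not results from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def percent_improvement(bv_ability, uv_ability):
    """(BV - UV) / |UV| * 100, as defined in the text."""
    return (bv_ability - uv_ability) / abs(uv_ability) * 100


# Hypothetical predictive abilities, for illustration only.
moderate_baseline = percent_improvement(0.25, 0.10)
near_zero_baseline = percent_improvement(0.10, -0.02)
```

Dividing by the absolute value of the UV predictive ability keeps the sign of the improvement meaningful when the baseline is negative, but a baseline close to zero inflates the percentage, as the second example shows.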

Software and package

All data analyses were implemented in the R environment (R Development Core Team 2010; Butler et al. 2009), and all models were fitted in ASReml-R (VSN International Ltd). Genomic relationship matrix was calculated according to equation 15 in Endelman and Jannink (2012), using the R package rrBLUP (Endelman 2011).

Results

Phenotypic data summary

Grain yield varied in different environments: the optimal environment produced the highest average grain yield ranging from 6.14 to 7.19 t/ha, followed by the drought environment with 3.28 to 4.51 t/ha, and last, the heat environment yields only 2.33 to 3.84 t/ha (Table 2). In the optimal environment, cycle 2016 had the highest yields, but in the stressed environments, cycle 2015 showed the best performance (Table 2). The grain yield of two cycles showed a moderate heritability ranging from 0.23 to 0.46; however, grain yield in cycle 2014 was highly heritable in the heat environment (0.75, Table 2). The heritability of grain yield was mostly lower than those of secondary traits, CT and GNDVI, ranging from 0.39 to 0.78 (Table 3). For cycle 2015 in the optimal and drought environments, the heritabilities of GNDVI, phenotyped at different time points, increased from 0.60 to 0.75 over growth stages. As a comparison, the heritabilities of secondary traits, CT and GNDVI, for the other cycles were similar over growth stages in all three environments (Table 3). In contrast to the heritabilities of secondary traits, the correlations between secondary traits and grain yield within each cycle varied significantly across the growth stages (Table 4), suggesting that the correlations of secondary traits and grain yield played the dominant role in influencing the predictive ability of GS for grain yield. Consistent with previous studies, our results indicated that CT and grain yield were negatively correlated, whereas the GNDVI and grain yield were positively correlated. In addition, our results showed that the heat environment gave rise to the highest correlation between grain yield and both CT and GNDVI (Table 4).

Table 2 Mean with standard error and heritability of grain yield for each cycle in optimal, late heat, and drought environments, respectively
Table 3 Heritabilities of secondary traits and grain yield at different phenotyping days over wheat growth stages for three cycles in different environments
Table 4 Correlations between secondary traits and grain yield at different phenotyping days over wheat growth stages for three cycles in different environments

Genomic prediction ability

Comparison between within cycle and across cycles

In the three environments, the GS predictive ability for grain yield within each cycle was moderate, ranging from 0.13 to 0.34 with an average of 0.24 (Figs. 1, 2, 3, 2014/2015/2016_UV). In contrast to the optimal and drought environments, the predictive ability for cycle 2015 in the heat environment was the lowest, which was largely determined by the heritability of grain yield. Genomic predictions for grain yield across cycles were evaluated reciprocally across the three cycles. Compared to within cycle, the across cycles predictive abilities for grain yield in the three environments were much lower, ranging from −0.02 to 0.17 with an average of 0.09 (Figs. 1, 2, 3, 15-14/16-14/14-15/16-15/14-16/15-16_UV), and cycles from 2014 to 2016 even showed negative or zero predictive abilities for grain yield in the optimal environment.

Fig. 1 Comparison between within cycle analysis and across cycles analysis for genomic selection with or without a secondary trait in the optimal environment. Model: UV: univariate model; BV: bivariate model with the average best linear unbiased predictions of secondary trait; Cycle: 2014_UV/2015_UV/2016_UV: genomic selection within each cycle using univariate model; 2014_BV/2015_BV/2016_BV: genomic selection within each cycle using bivariate model; 15-14_UV/16-14_UV/14-15_UV/16-15_UV/14-16_UV/15-16_UV: genomic prediction across cycles using univariate model, where the first number represents the training cycle and the second number represents the testing cycle; 15-14_BV/16-14_BV/14-15_BV/16-15_BV/14-16_BV/15-16_BV: genomic prediction across cycles using bivariate model, where the first number represents the training cycle and the second number represents the testing cycle

Fig. 2 Comparison between within cycle analysis and across cycles analysis for genomic selection with or without a secondary trait in the drought environment. Model: UV: univariate model; BV: bivariate model with the average best linear unbiased predictions of secondary trait; Cycle: 2014_UV/2015_UV/2016_UV: genomic selection within each cycle using univariate model; 2014_BV/2015_BV/2016_BV: genomic selection within each cycle using bivariate model; 15-14_UV/16-14_UV/14-15_UV/16-15_UV/14-16_UV/15-16_UV: genomic prediction across cycles using univariate model, where the first number represents the training cycle and the second number represents the testing cycle; 15-14_BV/16-14_BV/14-15_BV/16-15_BV/14-16_BV/15-16_BV: genomic prediction across cycles using bivariate model, where the first number represents the training cycle and the second number represents the testing cycle

Fig. 3

Comparison between within-cycle and across-cycle analyses for genomic selection with or without a secondary trait in the heat environment. Model: UV: univariate model; BV: bivariate model with the average best linear unbiased predictions of the secondary trait; Cycle: 2014_UV/2015_UV/2016_UV: genomic selection within each cycle using the univariate model; 2014_BV/2015_BV/2016_BV: genomic selection within each cycle using the bivariate model; 15-14_UV/16-14_UV/14-15_UV/16-15_UV/14-16_UV/15-16_UV: genomic prediction across cycles using the univariate model, where the first number represents the training cycle and the second number represents the testing cycle; 15-14_BV/16-14_BV/14-15_BV/16-15_BV/14-16_BV/15-16_BV: genomic prediction across cycles using the bivariate model, where the first number represents the training cycle and the second number represents the testing cycle

Predictive ability with secondary traits

When CT or GNDVI was included in the genomic prediction model for grain yield within cycle, the predictive abilities improved by 18% on average for the three cycles in all three environments, with CT increasing accuracy by 26% and GNDVI by 10% (Figs. 1, 2, 3, 2014/2015/2016_BV). This is consistent with our previous study (Sun et al. 2017), which concluded that secondary traits can improve the GS predictive ability within the same growing cycle. Furthermore, our results showed that the predictive ability across cycles improved by 146% on average (Figs. 1, 2, 3, 15-14/16-14/14-15/16-15/14-16/15-16_BV). CT improved the predictive ability by an average of 202% and GNDVI by 90%. Note that the large improvement in predictive ability in percentage terms can be partly ascribed to the low or negative predictive ability across cycles in our populations when secondary traits were absent. In addition, among environments, the improvement from secondary traits was greatest in the optimal environment and smallest in the drought environment. In particular, no visible improvement for GS was observed either within cycle or across cycles by using GNDVI in the drought environment.
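
The dependence of a percentage improvement on the baseline can be illustrated with a short Python sketch. The formula below is an assumed, conventional definition of relative improvement; the exact calculation and the handling of zero or negative baselines in this study are not restated here, and all numbers are hypothetical.

```python
def percent_improvement(uv, bv):
    """Relative change (%) of the bivariate (BV) over the univariate (UV) predictive ability."""
    if uv <= 0:
        # A zero or negative baseline makes the ratio uninterpretable.
        raise ValueError("baseline predictive ability must be positive")
    return 100.0 * (bv - uv) / uv


# Hypothetical baselines, each receiving the same absolute gain of 0.10.
absolute_gain = 0.10
for uv in (0.30, 0.10, 0.05):
    bv = uv + absolute_gain
    print(f"UV={uv:.2f}  BV={bv:.2f}  improvement={percent_improvement(uv, bv):.0f}%")
```

In this toy case, the same absolute gain of 0.10 corresponds to improvements of about 33%, 100%, and 200% as the baseline falls from 0.30 to 0.05, which is why percentage gains across cycles should be read together with the absolute predictive abilities.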

The optimum date

CT and GNDVI from HTP platforms were phenotyped over the course of the wheat growth stages, and the predictive ability of secondary traits was investigated at selected phenotyping time points so that breeders can determine the optimal time points to use for breeding value estimation and selection. The results showed that the predictive ability for grain yield was improved by using secondary traits in both the optimal and drought environments, whereas the improvement was less evident in the heat environment, probably due to a limited number of time points (Figs. 4, 5, 6). Secondary trait data collection from the HTP platforms started 45 days after planting, and periodic phenotyping lasted more than 2 months (January to March) for the optimal and drought environments, and 1 month (April to May) for the heat environment (Supplemental Fig. 1). Based on the available phenotyping dates for secondary traits in our populations, our study suggested that the optimum timings for CT and GNDVI phenotyping were about 100 to 120 days after planting for the optimal and drought environments, and about 70 days for the heat environment. Given that planting in the heat environment typically started 3 months later than in the other two environments, all three environments shared a similar optimum timing, namely around late March to early April. Additionally, we quantified the predictive ability of using secondary traits for GS within each cycle; likewise, our results indicated that the optimum date of phenotyping for use in genomic prediction was late March, except for cycle 2016 in the optimal condition (Supplemental Figs. 2–4).
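
As a hedged illustration of how an optimum phenotyping time point could be picked from such results, the Python sketch below compares bivariate predictive abilities at several dates with the univariate baseline. The function name and all values are hypothetical placeholders, not results from this study.

```python
def best_time_point(bv_by_day, uv_baseline):
    """Return the day with the highest bivariate predictive ability, that ability,
    and the sorted days on which the bivariate model beats the univariate baseline."""
    best_day = max(bv_by_day, key=bv_by_day.get)
    useful_days = sorted(day for day, r in bv_by_day.items() if r > uv_baseline)
    return best_day, bv_by_day[best_day], useful_days


# Hypothetical predictive abilities keyed by phenotyping date (days after planting).
bv_by_day = {60: 0.12, 80: 0.18, 100: 0.27, 120: 0.25}
uv_baseline = 0.15  # hypothetical predictive ability without secondary traits

day, ability, useful = best_time_point(bv_by_day, uv_baseline)
print(day, ability, useful)
```

Reporting the dates that beat the baseline alongside the single best date reflects the trade-off between the highest predictive ability and earlier phenotyping.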

Fig. 4

Predictive ability for grain yield using secondary traits at different time points across years in the optimal environment. Date: phenotyping days after planting; 60: predictive ability of grain yield with secondary traits collected at 60 days after planting using the bivariate genomic selection model, and likewise for the other numbers; UV: predictive ability of grain yield without secondary traits using the univariate genomic selection model

Fig. 5

Predictive ability for grain yield using secondary traits at different time points across years in the drought environment. Date: phenotyping days after planting; 65: predictive ability of grain yield with secondary traits collected at 65 days after planting using the bivariate genomic selection model, and likewise for the other numbers; UV: predictive ability of grain yield without secondary traits using the univariate genomic selection model

Fig. 6

Predictive ability for grain yield using secondary traits at different time points across years in the heat environment. Date: phenotyping days after planting; 67: predictive ability of grain yield with secondary traits collected at 67 days after planting using the bivariate genomic selection model, and likewise for the other numbers; UV: predictive ability of grain yield without secondary traits using the univariate genomic selection model

Discussion

Genomic prediction across cycles without secondary traits

The genetic relationships between the observed lines in the training population and the unobserved lines or selection candidates in the testing population are often one of the main factors that govern the accuracy of GS (Crossa et al. 2017). In our population, the principal component analysis of genetic relationships showed no evidence of strong population structure among the three growth cycles (Fig. 7). This agreed with our expectation for populations from CIMMYT, because lines in the three cycles are derived from several of the same parents and are thus closely related. Previous studies have indicated that common ancestors in both training and testing cycles can improve the genomic prediction across cycles (Auinger et al. 2016). Despite the inherent family relatedness between training and testing cycles, the predictive ability for grain yield across cycles was generally low in this study compared with that within each cycle. In addition, previous studies indicated that increasing the training population size increased the GS accuracy for traits controlled by many genes with minor effects (Asoro et al. 2011; Hoffstetter et al. 2016; Lorenz et al. 2012). We evaluated the across-cycle predictive ability of GS by using two of the three cycles as the training population to predict the remaining cycle, and our results suggest that the accuracy for the testing cycle remained similar, without visible improvement (Supplemental Table 1). This may be explained by a limitation of the methodology, in that the improvement in accuracy gained from increasing the population size reaches a plateau (Asoro et al. 2011). Moreover, training population size has less effect on a training population composed of related lines than on one composed of unrelated lines (Asoro et al. 2011; Rutkoski et al. 2015). Therefore, for populations sharing related lines but with low GS accuracy across populations, utilizing secondary traits highly correlated with the trait of interest can be a useful approach to improve the GS accuracy across cycles and populations. This study indicated that secondary traits can improve the genomic prediction across cycles and revealed the optimum time point to collect secondary traits. The synergy of GS and HTP platforms offers the opportunity to increase the genetic gain by reducing the breeding time and labor cost per cycle. Meanwhile, by taking advantage of secondary traits collected at multiple time points from HTP platforms, breeders can select the optimum and appropriate phenotyping time for the secondary trait depending on the breeding objectives and the resources available in practical breeding programs.

Fig. 7

Principal component analysis based on the genomic relationship matrix. Each group represents one wheat growing cycle

Secondary traits improve predictive ability for grain yield across cycles

Previous studies (Rutkoski et al. 2016; Sun et al. 2017) together with this work demonstrated that including secondary traits in multivariate genetic prediction models significantly improved genomic predictive ability for grain yield within the same population or cycle. The advantage of using secondary traits to improve GS for grain yield lies in the genetic correlations between the secondary traits and grain yield (Jia and Jannink 2012). CT generally demonstrated superior predictive ability for grain yield compared to GNDVI because of its higher correlations with grain yield, as shown in Figs. 1, 2, 3 and Table 4. For GS across cycles, we investigated the relationships between the improved predictive ability and the correlations of grain yield with secondary traits, where the correlations were calculated from three types of populations: training cycle only, testing cycle only, and both training and testing cycles (Fig. 8). For CT in the stressed environments and for GNDVI in all three environments, our results indicated that the improved predictive ability can be mainly ascribed to the correlations between grain yield and secondary traits in the testing cycle only (Supplemental Table 2). This illustrates the difficulty of genomic prediction across cycles or environments in the stressed environments, mainly because of considerable environmental variances and unpredictable genotype × environment (G × E) interactions between cycles, such as the severity and the timing of the stress (Araus 2002; Ovenden et al. 2018). In this regard, the correlation between secondary traits and grain yield in the testing cycle governs the genomic prediction accuracy for the grain yield of unobserved lines across cycles. By contrast, the improvement in predictive ability across cycles in the optimal environment can be largely attributed to the correlations between secondary traits and grain yield in the training population, as exemplified by CT (Fig. 8; Supplemental Table 2).

Fig. 8

Relationship between the improved predictive ability and the correlations between the secondary traits and grain yield. Improved predictive ability: predictive ability for grain yield with the secondary trait minus that without the secondary trait; pop: the correlation between the secondary traits and grain yield from the population including both training and testing cycles; test: the correlation between the secondary traits and grain yield from the testing population only; train: the correlation between the secondary traits and grain yield from the training population only

The optimum time for genomic prediction using secondary traits

To apply secondary traits efficiently for increasing genomic prediction accuracy across cycles, determining the optimum collection time for the secondary traits in the testing cycle is essential. Among CIMMYT wheat growing cycles and available time points, our study suggested that the optimum stage for collecting secondary traits was between late March and early April in all three field conditions, although there was no single optimum phenotyping date. Moreover, even though the predictive ability from the secondary traits at early time points was not as high as at the later stages, early time points still had potential advantages in increasing the genetic gain per cycle. For example, using secondary traits collected before heading date improved the predictive ability by 89% on average. Hence, selecting the optimum collection time for secondary traits allows the breeder to maximize the genetic gain of GS, whereas collecting secondary traits at early time points enables breeders to eliminate lines before harvest, saving time and labor costs. Therefore, these results are valuable for breeders seeking to optimize resource allocation in practical breeding programs.

The comparison between GNDVI and CT

GNDVI failed to improve the predictive ability for grain yield in the drought environment and was consistently inferior to CT for genomic prediction of grain yield in all environments. The inconsistency of its correlations with grain yield across different environments or cycles is a major barrier to the application of GNDVI in GS across cycles. GNDVI is usually positively correlated with grain yield; however, the correlation becomes negative under drought-stressed environments (Rutkoski et al. 2016; Sun et al. 2017), probably because the plants tend to avoid or escape the drought conditions at an early stage. Therefore, GNDVI was not useful for GS for grain yield across environments when the environments or management in the training population differed significantly from those in the testing population. In contrast to the other two cycles, the drought environment defined in our study for cycle 2015 received accumulated precipitation and thus showed positive correlations between GNDVI and grain yield (results not shown), which is inconsistent with the 2014 and 2016 cycles. Adjusting grain yield for days to heading provided a partial solution to eliminate the discrepancy in the drought environment (Table 4); however, the advantage of GNDVI in improving the genomic prediction accuracy for grain yield across cycles was compromised by precipitation differences across cycles (Fig. 5). Therefore, without knowledge of the environmental and climatic conditions for different cycles or environments, CT from HTP platforms was superior to GNDVI in terms of predicting grain yield across cycles or environments.

Future directions

Even though no population structure existed among the three cycles based on the principal component analysis of genetic relationships (Fig. 7), we observed low predictive ability for grain yield across cycles in the absence of secondary traits. Accordingly, genotype-by-environment (G × E) interactions played the major role in impeding the prediction accuracy across cycles in this population. The genotypes behaved differently in response to the environments because of G × E interactions, which enhanced the phenotypic variation across environments and lowered the accuracy of genomic prediction across environments or cycles (Heslot et al. 2014). For example, based on the climatic data (Table 1), considerable precipitation mitigated the stress environments in cycle 2015, leading to higher grain yield than in the other two growing cycles. A number of studies have indicated that including G × E interaction terms in different models improves the predictive accuracy, as exemplified by the G × E interaction kernel regression model (Cuevas et al. 2017), the integration of crop modeling into GS (Heslot et al. 2014), and the reaction norm model (Jarquín et al. 2014), where the accuracy was improved by more than 10% on average (Crossa et al. 2017). Recently, Montesinos-López et al. (2017a, 2018) proposed Bayesian functional regression models to predict grain yield, in which two types of basis functions (B-splines and Fourier) and all wavelengths of the reflectance data from the HTP platforms are used in the analysis. They found that including the Band × E interaction term in the model provided the best accuracy (Montesinos-López et al. 2017b). Therefore, the combination of both approaches, G × E interactions and secondary traits, shows promising potential for GS because it can improve the genomic prediction accuracy by exploiting the genetic correlations between environments (Falconer and Mackay 1996; Heslot et al. 2014) and the genetic correlations between traits (Jia and Jannink 2012).

Conclusion

In conclusion, our study demonstrated that the prediction accuracies across cycles were improved by including secondary traits in the genomic prediction models, and identified the optimum date for secondary trait collection. The analysis of our dataset revealed the vital role of secondary traits, which improved genomic prediction of grain yield across cycles by an average of 146%. In addition, secondary traits showed a remarkable capability to distinguish genotypes under heat- and drought-stressed environments for GS across cycles or environments, allowing breeders to make selections at an early stage and to capture the environmental variances for GS across environments. Our results indicate that, to improve the genomic prediction accuracy for grain yield in the CIMMYT breeding cycles, late March and early April are the optimum times for secondary trait collection. This suggested collection time falls within the range of wheat heading to early grain filling stages, and therefore, these results should also be applicable to other wheat breeding programs.

Author contribution statement

JS performed the analysis and drafted the manuscript. MES, JAP, RPS, and JC planned the study and supervised the analysis. SM, PJ, LCH, GV, JHE were involved in collecting the phenotyping data. JER and JLJ provided statistical analysis suggestions.