Introduction

Marker-assisted selection (MAS) has been implemented widely in breeding for mono- or oligogenic resistance traits and has the potential to play an even more important role in the future. However, to date, the genetic improvement of many important polygenic resistance traits through MAS has posed significant challenges (Miedaner and Korzun 2012). Despite a large number of published quantitative trait loci (QTL) mapping studies focusing on quantitative resistance traits, very few reports demonstrate the successful application of QTL-based MAS in a practical breeding programme (St. Clair 2010). Among the few successful examples are the improvement of resistance to the soybean cyst nematode (Heterodera glycines Ichinohe) in soybean (Glycine max L.) and the improvement of Fusarium head blight resistance in wheat (Triticum aestivum L.), caused mainly by Fusarium graminearum (Schwabe) (Cahill and Schmidt 2004; Anderson et al. 2008; Bernardo 2008). Both diseases have high economic impacts, and in both cases, it was possible to identify QTL with effects sufficiently large and stable to be exploited routinely for MAS in different genetic backgrounds.

For the genetic improvement of native resistance to stalk damage caused by the European corn borer (ECB) Ostrinia nubilalis (Hübner) in maize (Zea mays L.), MAS has also been considered a promising approach. The ECB is a major insect pest of maize, causing up to 30 % yield losses in some cases (Meissle et al. 2010). ECB larvae feed on young leaves during the early growth of maize and start tunnelling within the stalks and ears before the flowering stage. Under current Central European climatic conditions, damage is caused primarily by ECB stalk tunnelling after flowering, which results in damage to stalk vascular tissue and direct harvest loss through stalk breakage (Melchinger et al. 1998). Considering that rising average temperatures are expected to increase ECB pressure in the future (Trnka et al. 2007), varieties with improved resistance to ECB stalk damage would be a significant contribution to securing maize yield productivity in Central Europe. The native genotypic variation for resistance to ECB stalk damage has been described as polygenic (Schön et al. 1993; Krakowsky et al. 2002). Several QTL mapping studies have been carried out, mainly in the USA (Schön et al. 1993; Cardinal et al. 2001; Jampatong et al. 2002; Krakowsky et al. 2007; Orsini et al. 2012) and in Central Europe (Bohn et al. 2000; Papst et al. 2001, 2004). A few of these studies reported QTL with sizeable effects that should be promising for MAS. Flint-Garcia et al. (2003), for instance, compared QTL-based marker-assisted selection with phenotypic recurrent selection and confirmed the potential of MAS in improving ECB resistance. However, to our knowledge, the routine implementation of this approach in practical breeding programmes is limited.

Recent advances in marker genotyping technologies, coupled with new and powerful statistical methods, have allowed the development of MAS towards genome-wide selection (Meuwissen et al. 2001). This approach differs from traditional QTL-based MAS in its ability to exploit information provided by dense genome-wide single nucleotide polymorphism (SNP) markers, which are used to predict the total genetic value of genotypes (genome-wide prediction, GP). Statistical methods making use of information from all available SNP markers are able to cover a large number of small genetic effects and should be suitable for highly polygenic traits (de los Campos et al. 2013). In addition, GP has been shown to capture adequately large-effect QTL and additionally cover the remaining genome-wide effects in a single statistical model (Wimmer et al. 2013). Due to the quantitative nature of resistance to ECB stalk damage, we hypothesise that a genome-wide approach should be more effective than a QTL-based approach with respect to predicting genotypic values. First studies of GP applied to polygenic disease resistance traits in plants are encouraging. For example, Technow et al. (2013) reported GP accuracies of up to 0.70 for Northern corn leaf blight (Exserohilum turcicum Pass.) resistance in maize, while Rutkoski et al. (2012) evaluated the prospects of GP for Fusarium head blight resistance in North American wheat, concluding that germplasm from different origins could be used to train accurate prediction models. Thus, it seems that GP can also be applied successfully to traits possessing large-effect QTL already exploited in traditional MAS.

The first—and main—objective of our study was to evaluate the potential of GP relative to QTL-based MAS in improving resistance to ECB stalk damage in Central European elite maize material. We used established resistance traits to assess different components of resistance to ECB stalk damage. Stalk tunnelling by ECB was evaluated based on the number of feeding tunnels in the stalk and tunnel length (Cardinal et al. 2001; Flint-Garcia et al. 2003; Krakowsky et al. 2007; Ordas et al. 2010; Tefera et al. 2011), while a stalk damage rating score was used to assess the tolerance of plants to stalk breakage under ECB pressure (Hudon and Chiang 1991; Kreps et al. 1998; Melchinger et al. 1998; Bohn et al. 2000; Papst et al. 2004; Orsini et al. 2012). We used three connected doubled haploid (DH) maize populations and conducted individual- and joint-population QTL mapping as well as GP analyses of ECB resistance traits. A direct comparison of the cross-validated predictive abilities of QTL and genome-wide prediction models was performed. Further objectives were to evaluate the predictive performance of GP models for ECB stalk damage in: (i) predictions across biparental populations and (ii) predictions of testcross performance based on GP models trained at the line per se level.

Materials and methods

Plant material

Four elite early maturing inbred lines, originating from the Central European dent heterotic pool, were used as the parents of the DH populations. Three inbred lines selected for resistance to ECB stalk damage in observation trials under high ECB pressure (R1, R2 and R3) were crossed to the same susceptible line (S1) for the development of three connected populations: Pop1 (R1 × S1), Pop2 (R2 × S1) and Pop3 (R3 × S1), comprising 85, 243 and 262 DH lines, respectively. DH lines were developed using in vivo haploid induction technology (Röber et al. 2005). The success of DH line production varied for the three populations, thereby leading to differences in sample size. In the 2011/2012 winter nursery, testcrosses were produced with a subset of DH lines by crossing each DH line with an ECB-susceptible flint tester. All genetic material was proprietary and supplied by KWS SAAT AG, Einbeck, Germany.

Field experiments

Field trials were conducted during 2011 and 2012 at two locations in Southern Germany under artificial ECB infestation conditions and under natural infestation conditions at four locations characterised by high ECB pressure in Germany, France and Italy (see Supplementary Table S1 for detailed information on the field environments). In 2011, the three populations were evaluated for agronomic and stalk damage traits as DH lines per se. Based on these results, bidirectional selection was performed, where 25 % resistant and 10 % susceptible DH lines were selected from each population under the side condition of similar flowering time. In 2012, the selected 31 (Pop1), 85 (Pop2) and 91 (Pop3) DH lines were evaluated as lines per se and as testcrosses in separate trials at each location. Parental lines or their testcrosses were included as duplicate entries in the respective trials. All experiments were sown in single-row plots of 20 plants in a 40 × 15 alpha-lattice design, with two replications in 2011, and a 31 × 7 alpha-lattice design, with two replications in 2012. One exception was the trial at Ferrara (Italy) in 2012, which was sown in four replications in order to exploit the extreme level of natural ECB pressure typical for this site.

Artificial ECB infestation

In the two artificially infested locations, ECB larvae were applied at three time points at intervals of one to two weeks. At each time point, approximately 20 neonate larvae were placed in the whorl or leaf collar of ten plants per plot. The first infestation date was synchronised with the occurrence of natural ECB moth flight. The last infestation was carried out shortly before or during flowering time. ECB egg masses, obtained from the entomology unit of Dr. P. Aupinel, INRA Le Magneraud, France, were incubated for 2–3 days at 80 % relative humidity and temperatures between 15 and 30 °C, depending on the developmental stage at delivery time. A few hours after hatching, the larvae were mixed in corncob grit and applied to plants using specific volume dispensers (Mihm 1983).

Evaluation of traits

Days to anthesis (ANT) was recorded as the number of days between sowing and the date when 50 % of the plants in a plot had visible anthers. ANT was scored in three locations for both years (Supplementary Table S1) at two- to three-day intervals during the flowering period. Stalk damage rating (SDR) was used to evaluate the tolerance of plants to stalk breakage under ECB pressure. SDR is based on a 1–9 rating scale adapted according to Hudon and Chiang (1991), where the extremes are one for intact plants and nine for breakages below the ear. Scoring was performed on a plot basis. At all locations, SDR was assessed before harvest (from the end of August in Italy to mid-October in Germany). After the final SDR scoring, resistance to ECB stalk tunnelling was evaluated in the two artificially infested locations by recording the number of feeding tunnels in the stalk (NT) and the cumulative tunnel length in centimetres (TL). To score NT and TL, the stalks of five infested plants per plot were split from the bottom to the eighth internode above the ground. For both traits, plot averages were calculated across the five plants.

Marker analysis and linkage maps

DH populations (N = 590) and their parents were genotyped with 4,790 single nucleotide polymorphism (SNP) markers, distributed evenly across the genome, using a custom Illumina Infinium SNP array (Illumina Inc., San Diego, California, USA). The SNPs represented a subsample of the Illumina MaizeSNP50 BeadChip (Ganal et al. 2011), and their physical positions on the genome were assigned based on the B73 RefGen_v2 sequence (Schnable et al. 2009). Out of the 4,790 original SNP markers, a subset of high-quality SNPs, polymorphic in at least one of the populations, was selected according to the following criteria: (i) a call rate higher than 0.90, (ii) a minor allele frequency higher than 0.05 and (iii) less than 10 % missing values. After these selection steps, 2,411 SNPs were available across the three populations for further analysis. DH lines with more than 20 % missing data in these 2,411 SNPs (4, 29 and 36 DH lines in Pop1, Pop2 and Pop3, respectively) were discarded, thereby leaving a total of N = 521 DH lines for further analysis. For each marker and population, deviations from the expected segregation ratio were tested with a Chi-square test, with correction for multiple testing by the sequentially rejective Holm–Bonferroni method (Holm 1979). The marker-based genetic distance between the four parental lines was calculated using Rogers’ distance (Rogers 1972). To investigate patterns of molecular variation among and within populations, a principal component analysis (PCA) of the marker data was performed in R (R Development Core Team 2013). Linkage maps were constructed individually for each population by using a maximum likelihood mapping approach and Haldane’s mapping function (Haldane 1919). To allow for the construction of a genetic consensus map across the three populations, we identified clusters of SNP markers colocalising in each of the three individual genetic maps. For each of these clusters, only one marker that was polymorphic in all populations was retained. The consensus map was calculated using 1,034 SNPs, which were polymorphic in at least one of the three populations. All linkage maps were constructed with JoinMap version 4.1 software (Van Ooijen 2006).
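The segregation test with Holm–Bonferroni correction described above can be illustrated with a minimal sketch. It assumes an expected 1:1 segregation of the two parental alleles in a biparental DH population; the function names and the allele counts in the usage lines are hypothetical and are not taken from the study.

```python
import math


def chi2_pvalue_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For 1 df, P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def segregation_pvalue(n_a, n_b):
    """Chi-square test of 1:1 segregation for one marker (assumed DH expectation)."""
    expected = (n_a + n_b) / 2.0
    stat = (n_a - expected) ** 2 / expected + (n_b - expected) ** 2 / expected
    return chi2_pvalue_1df(stat)


def holm_reject(pvalues, alpha=0.05):
    """Sequentially rejective Holm-Bonferroni procedure; True marks a rejected test."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next with alpha/(m-1), ...
        if pvalues[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test is retained, all larger p-values are retained too
    return reject


# Hypothetical allele counts (parent A, parent B) for four markers in one population.
counts = [(110, 104), (150, 64), (120, 94), (107, 107)]
pvals = [segregation_pvalue(a, b) for a, b in counts]
distorted = holm_reject(pvals, alpha=0.05)
```

In this toy input only the second marker is flagged as distorted; the third has a nominal p-value of about 0.08 and is retained.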

Statistical analyses of phenotypic data

Analyses of variance in individual environments were performed using PLABSTAT version 3A software (Utz 2011). Outlying observations were identified by means of residual diagnostic plots in R. The analysis across locations and within years was based on the following linear mixed model implemented in the ASReml-R software package (Butler et al. 2009):

\[ y_{ijkm} = \mu + g_{i} + l_{j} + gl_{ij} + r_{kj} + b_{mkj} + e_{ijkm} \]

\( y_{ijkm} \) trait observation

\( \mu \) overall mean

\( g_{i} \) random effect of genotype i

\( l_{j} \) fixed effect of location j

\( gl_{ij} \) random interaction effect of genotype i with location j

\( r_{kj} \) random effect of replication k nested within location j

\( b_{mkj} \) random effect of incomplete block m nested within replication k nested within location j

\( e_{ijkm} \) random residual error.

Variance components were estimated within the three populations by introducing a categorical variable into the model assigning DH lines to their respective population. Heritabilities (\( h^{2} \)) on a progeny mean basis were calculated according to Hallauer and Miranda (1981). To extract the best linear unbiased estimates (BLUEs) for each genotype and trait, the above model was fitted with genotype as a fixed effect. BLUEs averaged across locations represented the phenotypic input data for the QTL and genomic prediction analyses. Components of covariance between traits, which were evaluated in the same experimental unit, were estimated by expanding the above model to a bivariate model. Covariance components between DH lines per se and testcrosses were estimated using a two-stage analysis. This approach was necessary, since DH lines per se and their corresponding testcrosses were evaluated in separate trials. BLUEs were first calculated for individual locations and then used in a bivariate model across locations. Genotypic correlations were calculated according to Mode and Robinson (1959). Approximate standard errors of heritability and genotypic correlations were calculated by applying the delta method as suggested by Holland et al. (2003). Pearson’s phenotypic correlations between traits and between DH line per se and testcross performance were calculated based on BLUEs averaged across the locations.
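The expressions for heritability and genotypic correlation are not written out in the text. The forms below are the standard ones for a multi-location trial and are given only as an assumption about what was computed, with L denoting the number of locations, R the number of replications per location, and x and y two traits (or the line per se and testcross evaluations).

```latex
\[
  h^{2} = \frac{\sigma_{g}^{2}}
               {\sigma_{g}^{2} + \dfrac{\sigma_{gl}^{2}}{L} + \dfrac{\sigma_{e}^{2}}{L R}},
  \qquad
  r_{G} = \frac{\sigma_{g(xy)}}
               {\sqrt{\sigma_{g(x)}^{2}\,\sigma_{g(y)}^{2}}}
\]
```

Here \( \sigma_{g}^{2} \), \( \sigma_{gl}^{2} \) and \( \sigma_{e}^{2} \) are the genotypic, genotype × location and residual variance components of the mixed model, and \( \sigma_{g(xy)} \) is the genotypic covariance estimated from the bivariate model.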

QTL mapping

QTL analyses were carried out for each trait, using phenotypic data from the unselected DH populations evaluated in 2011 (N = 521). Analyses based on the genetic consensus map were performed both within the individual biparental populations and combined across all three populations (hereafter referred to as joint-population QTL analyses), using the MCQTL software package (Jourjon et al. 2005). We used forward stepwise regression along with the iQTLm method (Charcosset et al. 2001). For each trait and population (including the joint-population scenario), empirical LOD thresholds at the 0.05 genome-wide significance level were assessed from 1,000 permutations, according to Churchill and Doerge (1994). LOD support intervals of QTL positions were defined as the map distance in cM spanning a LOD drop of one unit on each side of the LOD peak. QTL were defined as colocalising if their respective LOD support intervals overlapped. A connected additive QTL model was implemented in the joint-population analysis. In our case, the connected model estimated four allelic effects at each QTL (the effects of the R1, R2, R3 and S1 parental alleles), with the effect of the common parent S1 assumed to be the same in all three populations. The total proportion of variance explained by the model (\( R^{2} \)), as well as the proportion of variance explained by individual QTL, was calculated according to Mangin et al. (2010).
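The logic of a permutation-based genome-wide LOD threshold and of a one-LOD support interval can be sketched as follows. This is a deliberately simplified illustration using a single-marker scan on 0/2-coded DH genotypes; it is not the iQTLm procedure of MCQTL, and all function names are hypothetical.

```python
import math
import random


def marker_lod(y, genotypes):
    """LOD score of a single-marker regression for one 0/2-coded marker."""
    n = len(y)
    mean_all = sum(y) / n
    rss0 = sum((v - mean_all) ** 2 for v in y)  # null model: overall mean only
    rss1 = 0.0                                  # marker model: one mean per genotype class
    for allele in (0, 2):
        group = [v for v, g in zip(y, genotypes) if g == allele]
        if not group:
            return 0.0  # monomorphic marker carries no information
        group_mean = sum(group) / len(group)
        rss1 += sum((v - group_mean) ** 2 for v in group)
    if rss0 == 0.0:
        return 0.0
    if rss1 == 0.0:
        return float("inf")
    return (n / 2.0) * math.log10(rss0 / rss1)


def permutation_threshold(y, markers, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the genome-wide maximum LOD under permutation."""
    rng = random.Random(seed)
    shuffled = list(y)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype association
        maxima.append(max(marker_lod(shuffled, m) for m in markers))
    maxima.sort()
    rank = math.ceil((1.0 - alpha) * n_perm) - 1
    return maxima[min(max(rank, 0), n_perm - 1)]


def lod_support_interval(positions, lods, drop=1.0):
    """Map interval around the LOD peak within which the LOD stays above peak - drop."""
    peak = max(range(len(lods)), key=lambda i: lods[i])
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]
```

Here `markers` is a list of marker columns, each holding the genotypes of all lines. Two QTL would be called colocalising under the definition above if the intervals returned by `lod_support_interval` overlap.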

Genome-wide prediction

All polymorphic SNP markers meeting quality criteria (N = 2,411) were used in the genome-wide prediction of ECB resistance traits and ANT. Marker genotypes were coded 0 or 2 according to the number of copies of the minor allele, and missing marker genotypes were imputed using the BEAGLE software package (Browning and Browning 2009). A genomic best linear unbiased prediction (GBLUP) model was used to predict the genetic values of DH lines. The realised relationship matrix between the DH lines of the three populations was computed based on marker data according to the method proposed by Habier et al. (2007). All analyses were performed using the genomic prediction framework provided by the synbreed R package (Wimmer et al. 2012).
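To make the GBLUP step concrete, the following sketch builds a realised relationship matrix from 0/2-coded markers and predicts untested lines from tested ones. It rests on several assumptions that are not taken from the study: the relationship matrix is scaled by 2Σp(1 − p), which is one common formulation; the ratio of residual to genetic variance (`lam`) is treated as known rather than estimated by REML; and the overall mean is approximated by the training-set mean. The analyses themselves were run in the synbreed R package, so this is an illustration of the principle only, with hypothetical function names.

```python
def realized_relationship(markers):
    """Marker-based relationship matrix for lines x markers genotypes coded 0/2."""
    n, m = len(markers), len(markers[0])
    p = [sum(row[j] for row in markers) / (2.0 * n) for j in range(m)]  # allele frequencies
    w = [[row[j] - 2.0 * p[j] for j in range(m)] for row in markers]    # centred genotypes
    scale = 2.0 * sum(pj * (1.0 - pj) for pj in p)
    return [[sum(a * b for a, b in zip(w[i], w[k])) / scale for k in range(n)]
            for i in range(n)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gblup_predict(g, y_train, train_idx, test_idx, lam):
    """Predict test lines as mu + G[test, train] (G[train, train] + lam I)^-1 (y - mu)."""
    mu = sum(y_train) / len(y_train)  # simplification: training mean instead of a GLS estimate
    a = [[g[i][k] + (lam if i == k else 0.0) for k in train_idx] for i in train_idx]
    alpha = solve(a, [v - mu for v in y_train])
    return [mu + sum(g[t][k] * al for k, al in zip(train_idx, alpha)) for t in test_idx]


# Hypothetical toy data: four DH lines, three markers, phenotypes for the first three lines.
toy_markers = [[0, 2, 0], [2, 2, 0], [0, 0, 2], [2, 0, 2]]
G = realized_relationship(toy_markers)
prediction = gblup_predict(G, [1.0, 2.0, 3.0], [0, 1, 2], [3], lam=1.0)
```

The untested fourth line shares most alleles with the third line and fewest with the first, so its predicted value lies above the training mean.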

Cross-validation for QTL and genome-wide prediction models

A graphical representation of the different cross-validation (CV) scenarios used in this study is given in Supplementary Fig. S1. The prediction performances of QTL-based and GP models were compared in a joint-population framework and also within individual biparental populations based on the unselected DH lines (N = 521) phenotyped in 2011 (hereafter referred to as joint-population and within-population prediction scenarios, respectively). For both QTL and GP analyses, we performed fivefold CV. The data set was split into five subsets: four subsets comprising 80 % of the DH lines built the estimation set (ES) and were used for model training, whereas the remaining subset (20 % of DH lines) constituted the test set (TS). The process was replicated ten times with varying allocations of DH lines to the five CV subsets. For joint-population prediction, sampling was performed by taking population information into account. Out of 521 DH lines, 81 (15 %), 214 (41 %) and 226 (43 %) belonged to Pop1, Pop2 and Pop3, respectively. These proportions were maintained in each of the five CV subsets. Each ES comprised DH lines from all three populations, while each TS was subdivided into three separate TSs comprising only DH lines from Pop1, Pop2 and Pop3, respectively. For each of the five CV subsets, the predictive performance of the two models was evaluated once for the combined TS across populations and additionally for the three population-specific TSs. In contrast, for within-population prediction, the ES and TS were constructed through the random sampling of DH lines within biparental populations.
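The joint-population sampling scheme (fivefold CV, ten replications, population proportions maintained in every subset) can be sketched as below. The population sizes are those reported above; the function names and the use of the replication index as random seed are illustrative assumptions.

```python
import random


def stratified_folds(populations, k=5, seed=0):
    """Assign each DH line to one of k CV subsets, stratified by population."""
    rng = random.Random(seed)
    by_pop = {}
    for idx, pop in enumerate(populations):
        by_pop.setdefault(pop, []).append(idx)
    folds = [0] * len(populations)
    for members in by_pop.values():
        rng.shuffle(members)
        # Dealing the shuffled lines out in turn keeps each population's share per subset.
        for rank, idx in enumerate(members):
            folds[idx] = rank % k
    return folds


def cv_splits(populations, k=5, n_rep=10):
    """Yield (replication, fold, ES indices, TS indices, TS indices by population)."""
    for rep in range(n_rep):
        folds = stratified_folds(populations, k=k, seed=rep)
        for fold in range(k):
            es = [i for i, f in enumerate(folds) if f != fold]
            ts = [i for i, f in enumerate(folds) if f == fold]
            ts_by_pop = {}
            for i in ts:
                ts_by_pop.setdefault(populations[i], []).append(i)
            yield rep, fold, es, ts, ts_by_pop


populations = ["Pop1"] * 81 + ["Pop2"] * 214 + ["Pop3"] * 226
splits = list(cv_splits(populations))
```

Each of the 50 resulting splits provides one combined TS and three population-specific TSs. For within-population prediction, the same function can be applied to the lines of a single population.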

Identical allocation of DH lines to CV subsets was used for both the QTL and the GP approaches across all CV replications. For the QTL model, predictions of DH lines in the TS were based on the sum of additive effects of all significant QTL detected in the ES, whereas GBLUP predictions were based on the effects of all polymorphic SNP markers estimated in the ES. Given that no built-in option for CV has so far been available with the MCQTL program (B. Mangin, personal communication), we developed a new R package called cvMCQTL that performs this task by running a CV loop on the QTL mapping routine of MCQTL. Furthermore, cvMCQTL runs in combination with the synbreed R package and is available for download from the supplementary material in this manuscript. We also combined information obtained from QTL analyses with GBLUP. Markers closest to each of the QTL detected in each ES were fitted as fixed effects in GBLUP. For the joint-population prediction scenario, fixed marker effects were modelled in GBLUP in the same way as they were modelled in the QTL analysis, i.e. by assuming four different allelic effects for the four parental lines. Predicted values in the TS were calculated as the sum of the fixed effects at the QTL and the remaining random genome-wide marker effects estimated within this GBLUP framework.
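One way to write the combined model is given below. The notation is an assumption introduced here for illustration and does not appear in the original text: \( \mathbf{Q} \) holds the genotypes at the markers closest to the QTL detected in the ES, \( \boldsymbol{\beta} \) their fixed effects, and \( \mathbf{G} \) the realised relationship matrix.

```latex
\[
  \mathbf{y} = \mathbf{1}\mu + \mathbf{Q}\boldsymbol{\beta} + \mathbf{g} + \mathbf{e},
  \qquad
  \mathbf{g} \sim N(\mathbf{0}, \mathbf{G}\sigma_{g}^{2}),
  \quad
  \mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_{e}^{2})
\]
\[
  \hat{y}_{t} = \hat{\mu} + \mathbf{q}_{t}^{\prime}\hat{\boldsymbol{\beta}} + \hat{g}_{t}
  \quad \text{for a DH line } t \text{ in the TS}
\]
```

With no QTL fitted, the model reduces to standard GBLUP. In the joint-population scenario, \( \mathbf{Q} \) would carry one column per parental allele at each QTL.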

Predictive abilities of the different models were calculated as Pearson’s correlation coefficients between predicted and observed trait values in each TS. An overall mean predictive ability with standard deviation was calculated according to Luan et al. (2009).

GP was evaluated in additional CV scenarios: (i) prediction across biparental populations (hereafter referred to as across-population prediction) and (ii) prediction of ECB stalk damage resistance of testcrosses using models trained on DH line per se performance (hereafter referred to as testcross prediction). In across-population prediction, the ES comprised either an individual population or a merged data set of two populations, whereas the remaining population(s) represented the TS. Because of the predefined ES and TS structure, the replication of different CV subsets was not possible. Instead, a bootstrapping approach with 1,000 replications and sampling with replacement in the TS was implemented, in order to obtain the uncertainty measures of predictive abilities.
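A minimal sketch of the bootstrap used to attach an uncertainty measure to an across-population predictive ability is shown below. Predicted and observed values of the TS lines are resampled jointly with replacement, and the Pearson correlation is recomputed in each replicate. Function names and the fixed seed are illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def bootstrap_predictive_ability(predicted, observed, n_boot=1000, seed=0):
    """Mean and standard deviation of the predictive ability over bootstrap samples of the TS."""
    rng = random.Random(seed)
    n = len(predicted)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample TS lines with replacement
        values.append(pearson([predicted[i] for i in idx], [observed[i] for i in idx]))
    mean = sum(values) / n_boot
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n_boot - 1))
    return mean, sd
```

The model itself is not refitted within the bootstrap. Only the TS is resampled, which matches the fixed ES and TS structure of the across-population scenario.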

For testcross prediction, the model was trained using data from DH lines per se (N = 521). The following phenotypes were used: (i) only 2011 DH lines per se data, (ii) data from (i) plus 2012 data for the resistant fraction of DH lines per se and (iii) data from (i) plus 2012 data for the resistant and susceptible fractions of DH lines per se (Supplementary Fig. S1). Two-year observations were fitted as replicated phenotypes in GBLUP including a fixed year effect. Given that only a selected subset of DH lines was evaluated as testcrosses, the CV procedure was slightly modified. In each of the ten CV replications, we obtained a predicted genetic value for N = 195 selected DH lines evaluated as testcrosses. Out of these 195 lines, N = 137 belonged to the resistant and N = 58 to the susceptible fractions, respectively. Predictive abilities were calculated separately for the resistant and susceptible fractions by correlating the N = 137 and N = 58 predicted values against phenotypic observations of their corresponding testcrosses, respectively. The mean predictive ability and its standard deviation were calculated based on the ten CV replications.

Results

Marker analysis and genetic maps

The overall number of polymorphic SNP markers across populations was 2,411, and 365 of these SNPs were polymorphic in all three populations. Pop2 had the highest number of polymorphic SNPs (1,660), followed by Pop3 (1,340) and Pop1 (977). The polymorphic SNPs of Pop2 were well distributed across the genome, while Pop1 lacked polymorphism on chromosome 8. The parent R2 displayed the highest genetic distance to the other three parental lines (Supplementary Table S2), thereby explaining the higher degree of polymorphism in Pop2. The PCA clearly separated the populations into three distinct clusters (Supplementary Fig. S2). In a small fraction of Pop3 DH lines, we found SNPs segregating in the progeny that were monomorphic in the parents. These SNPs were removed from linkage mapping and the QTL analyses. Furthermore, approximately 40 % of the 1,340 polymorphic SNPs in Pop3 showed significant segregation distortion at the Holm–Bonferroni adjusted 0.05 significance level. In total, 133 markers displaying a heavily distorted segregation ratio were removed from the linkage analysis of Pop3, thus leaving 930 mapped markers for QTL analysis. It cannot be excluded that during the process of DH line production for Pop3, cross-pollination may have occurred at an unknown rate. However, the PCA using all 1,340 polymorphic SNPs indicated neither any apparent substructure within Pop3 nor the presence of outlying genotypes (Supplementary Fig. S2). The order of markers along the genetic map of Pop3 was consistent with the reference physical map and the Pop1 and Pop2 genetic maps. The consensus map across all three populations included 1,034 informative SNPs and displayed a total length of 1,787 cM over the ten chromosomes (Supplementary Fig. S3).

Quantitative genetic analysis

Due to climatic conditions, the overall severity of ECB damage was higher in 2012 than in 2011, as shown by the mean trait values (Table 1). Because of substantially taller plants and an overall higher centre of gravity, testcrosses tend to break more often below the ear compared to inbred lines per se, thus leading to higher average SDR scores in the testcrosses compared to the inbred lines per se. For TL and NT, testcross means were significantly lower than for their corresponding inbred lines evaluated in the same year. Differences between population means for SDR were not significant in 2011, while significant differences were observed in 2012, particularly between Pop1 and Pop3. Similar results were observed for the other traits, but in most cases, population mean differences were small. Significant genetic variance was observed for all traits in 2011, with the exception of TL in Pop1 (Table 1). The selected genotypes evaluated in 2012 showed significant genetic variance for all combinations of SDR and ANT, while for TL and NT significant genetic variance was only observed in half of the population-trait combinations. When compared across populations, heritabilities were similar for Pop2 and Pop3 but Pop1 often showed lower heritability estimates (Table 1). Heritabilities were medium to high for SDR across six locations and low to medium for TL and NT across two locations. Heritability for ANT was high in the unselected populations (2011) and decreased in the selected DH lines per se in 2012. This was expected due to the selection for similar flowering time. However, genetic variance and heritability for ANT were unexpectedly high in the selected testcrosses of Pop2. An extended version of Table 1, including estimates of genotype × location and residual error variances, is provided in Supplementary Table S3.

Table 1 Estimates of the mean, genetic variance (\( \hat{\sigma }_{g}^{2} \)) and heritability (\( \hat{h}^{2} \)) (± standard errors) for the traits stalk damage rating (SDR), tunnel length (TL), number of tunnels (NT) and days to anthesis (ANT) for three DH populations (Pop1, Pop2 and Pop3). Estimates are given for the unselected populations evaluated as DH lines per se in 2011 as well as for the selected DH lines per se and their testcrosses evaluated in 2012

Bidirectional selection for ECB stalk damage traits was performed based on results from the 2011 field trials. Selection was based primarily on SDR under the side condition of similar flowering time, and with TL and NT serving as supporting criteria for DH lines showing more intermediate SDR values. Figure 1 shows the adjusted means of the selected DH lines per se and their testcrosses evaluated in 2012 for SDR. In the DH lines per se (Fig. 1a), there was a clear difference in the mean SDR of the 147 DH lines selected for low SDR and the 60 DH lines selected for high SDR in 2011. The difference was still significant but less pronounced in testcrosses (Fig. 1b). The means of the two selected fractions differed significantly also for TL and NT at the DH line per se level, but not at the testcross level (data not shown). The mean SDRs of resistant parents R1, R2 and R3 were significantly lower than the mean SDR of susceptible parent S1 in both DH lines per se and testcrosses, with the sole exception of the R1 testcross. As expected, flowering time distribution was found to be similar for both the resistant and susceptible genotypes in both DH lines per se and testcrosses, although the means of the two fractions were significantly different in the testcrosses (Supplementary Fig. S4). This difference was driven mainly by the testcrosses of Pop2. Indeed, the testcross of parent R2 flowered significantly later than the testcrosses of the other parents, which was in contrast to the flowering time of the respective parental lines per se (Supplementary Fig. S4).

Fig. 1 Distributions of the adjusted means of stalk damage rating (SDR) for the selected DH lines per se (a) and their corresponding testcrosses (b) evaluated in 2012 across six locations, and scatterplot of the correlation between DH lines per se and testcrosses for SDR (c) with phenotypic (\( r_{P} \), ** significant at p < 0.01) and genotypic (\( r_{G} \), ++ coefficient exceeds twice its standard error) correlation coefficients. Genotypes from the resistant and susceptible fractions are represented by grey and black colour, respectively. Arrows show the means of the parental lines R1, R2, R3 and S1. The least significant difference (LSD) at the 5 % probability level and the means of the resistant (M.res.) and susceptible (M.sus.) fractions are also shown in the graphs. Significance of difference between M.res. and M.sus. is indicated by *** (p < 0.001)

Genotypic correlations between resistance traits were high for unselected DH lines per se (Table 2). In testcrosses, correlations between SDR and stalk tunnelling traits were not significant or low due to low heritabilities. Medium to high negative genotypic correlations between ANT and the stalk damage traits were observed in 2011. As expected, these correlations were considerably lower in 2012 after selection for reduced variation in ANT. The genotypic correlation between DH lines per se and testcrosses was high for SDR (Fig. 1c).

Table 2 Phenotypic (above diagonal) and genotypic (below diagonal) correlation coefficients between the traits stalk damage rating (SDR), tunnel length (TL), number of tunnels (NT) and days to anthesis (ANT). Estimates are given for DH lines per se evaluated in 2011 as well as for selected DH lines per se and their testcrosses evaluated in 2012

QTL mapping

The results of the joint-population QTL analyses, based on the unselected DH lines per se evaluated in 2011 (N = 521), are presented in Table 3. Eight, ten, four and nine QTL were identified for traits SDR, TL, NT and ANT, respectively. \( R^{2} \) values for individual QTL varied between 0.04 and 0.21, while total \( R^{2} \) values representing the proportion of phenotypic variance explained by the model fitting all detected QTL simultaneously ranged between 0.25 for NT and 0.47 for SDR and ANT. The QTL for SDR on chromosomes 3 and 5 colocalised with QTL for both TL and NT. The SDR QTL on chromosome 2 colocalised with a TL QTL. All other QTL for trait TL did not colocalise with SDR. Additive allelic effects contributed by parents R1, R2, R3 and S1, estimated from a connected QTL model, were small to moderate for all traits (Table 3). As expected, at most QTL, the alleles contributing to lower trait values originated from the resistant parents, and few QTL were detected where the resistance allele was contributed by the common susceptible parent S1. One of these was the QTL for SDR on chromosome 8, which had a negative effect sign for the S1 parental allele. This QTL was found in a similar region to the QTL for TL on chromosome 8, which also showed a negative effect sign for the S1 allele.

Table 3 Chromosome (Chr.), position (Pos.), LOD support interval (LOD S.I.), LOD score at the QTL position, proportion of variance explained (\( R^{2} \)) and additive effects of QTL alleles derived from parents R1, R2, R3 and S1 detected in the joint analysis across the three unselected populations (N = 521) evaluated as DH lines per se in 2011 for the traits stalk damage rating (SDR), tunnel length (TL), number of tunnels (NT) and days to anthesis (ANT)

Figure 2 illustrates a comparison between QTL that were detected for SDR and the trait ANT in the joint-population analysis. Four QTL for SDR colocalised with ANT QTL. The negative genetic correlation between SDR and ANT observed in the quantitative genetic analysis was partially reflected in the QTL results as opposite signs of allelic effects at the colocalising QTL between the two traits.

Fig. 2 LOD profiles of the joint-population QTL analysis based on DH lines per se evaluated in 2011 (N = 521) for the traits stalk damage rating (SDR, continuous line) and days to anthesis (ANT, dashed line). The arrows indicate the positions of the detected QTL in the final model. Horizontal lines attached above each arrow indicate the LOD support interval of the QTL position

Predictive abilities of QTL-based and genome-wide prediction models

The prediction performance of QTL-based and GBLUP models was compared using fivefold CV. Table 4 presents mean predictive abilities of the two approaches from within-population and joint-population prediction scenarios. Mean predictive abilities of GBLUP were consistently higher than the corresponding mean predictive abilities of the QTL-based model. Figure 3 shows this comparison by means of scatterplots of each of the 50 individual joint-population predictive ability values calculated from combined test sets across the three populations. For all traits, the predictive ability of GBLUP was higher in every single test set of the CV scheme. We observed correlations of r = 0.40 to r = 0.56 between the predictive abilities obtained from the two different approaches. In general, population-trait combinations that showed higher GBLUP predictive abilities were characterised by higher predictive abilities of the QTL model, and vice versa (Table 4). The higher heritabilities of SDR and ANT, when compared to TL and NT, generally resulted in higher predictive abilities of both the GBLUP and QTL models. High GBLUP predictive abilities of approximately 0.70 were observed for SDR and ANT in all populations, and Pop3 showed the highest predictive ability of GBLUP for all traits and in both prediction scenarios. The highest mean predictive ability of the QTL model was observed for the joint-population prediction of the ANT trait in Pop1. In this specific case, the QTL-based model was able to predict almost as accurately as its GBLUP counterpart.
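
The cross-validation scheme described above can be sketched in a few lines. This is a minimal illustration and not the routine used in the study. It assumes that the 50 test sets arise from ten random replicates of fivefold CV, takes predictive ability to be the Pearson correlation between predicted and observed values within a test set, and substitutes a hypothetical single-covariate regression on simulated data for the QTL-based or GBLUP model. All function names are illustrative.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def cv_folds(n, k=5, reps=10, seed=1):
    """Yield (estimation_idx, test_idx) for `reps` random k-fold partitions."""
    rng = random.Random(seed)
    for _ in range(reps):
        idx = list(range(n))
        rng.shuffle(idx)
        for f in range(k):
            test = idx[f::k]
            in_test = set(test)
            est = [i for i in idx if i not in in_test]
            yield est, test

def predictive_abilities(y, predict, k=5, reps=10, seed=1):
    """Return one predictive ability per test set.

    predict(est_idx, test_idx) must return predictions for test_idx
    using only data from est_idx.
    """
    out = []
    for est, test in cv_folds(len(y), k, reps, seed):
        y_hat = predict(est, test)
        out.append(pearson(y_hat, [y[i] for i in test]))
    return out

# Simulated data (hypothetical): one covariate x and a phenotype y = 2x + noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(100)]
y = [2 * xi + rng.gauss(0, 1) for xi in x]

def simple_regression(est, test):
    """Stand-in prediction model: least-squares line fitted in the estimation set."""
    mx = mean(x[i] for i in est)
    my = mean(y[i] for i in est)
    b = sum((x[i] - mx) * (y[i] - my) for i in est) / sum((x[i] - mx) ** 2 for i in est)
    return [my + b * (x[i] - mx) for i in test]

abilities = predictive_abilities(y, simple_regression)
print(len(abilities), round(mean(abilities), 2))
```

Running two models through the same folds (same seed) gives paired predictive abilities per test set, which is the kind of pairing that a scatterplot such as Fig. 3 requires.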

Table 4 Mean predictive abilities (± standard deviation) obtained from cross-validation of the QTL-based and GBLUP models for the traits stalk damage rating (SDR), tunnel length (TL), number of tunnels (NT) and days to anthesis (ANT) evaluated in 2011. For within-population prediction (within) and joint-population prediction (joint), predictive abilities calculated based on separate test sets for each population are reported for Pop1, Pop2 and Pop3
Fig. 3

Comparison of the predictive abilities of QTL-based and GBLUP models from the joint-population prediction scenario calculated based on combined test sets across the three populations. Predictive abilities from each test set of QTL-based cross-validation are plotted against predictive abilities from the corresponding test set of GBLUP cross-validation. Plots are shown for all four traits based on unselected DH lines per se evaluated in 2011. Pearson’s correlation coefficients (r) between the plotted predictive abilities are given (**, correlation significant at p < 0.01)

The small Pop1 population benefitted more from joint-population prediction than the two larger Pop2 and Pop3 populations (Table 4). Joint-population predictive abilities in Pop2 and Pop3 were in several cases similar to or only marginally higher than the corresponding values from within-population prediction. For the QTL-based prediction of SDR in Pop2 and NT in Pop3, within-population predictive abilities were higher than joint-population predictive abilities. Fitting the QTL detected in each ES as fixed effects in GBLUP slightly decreased prediction performance for SDR in both the within and joint-population prediction scenarios (Supplementary Table S4).

Across-population prediction and testcross prediction

For the trait SDR, results from across-population prediction scenarios are summarised in Table 5. Mean predictive abilities ranged between 0.36 and 0.61 and, as expected, were in general significantly lower compared to predictive abilities observed in the within- and joint-population prediction scenarios. Variations in across-population prediction performance reflected the varying degree of relatedness between the three resistant parental lines—as seen in Supplementary Table S2. Training the model with DH lines from Pop1 resulted in significantly higher prediction accuracy for the DH lines of the more related Pop3 than for the DH lines from Pop2. Using DH lines from Pop1, in addition to lines from Pop2 for model training when predicting the genotypic values of lines from Pop3, increased predictive ability from 0.53 to 0.61 (Table 5). In contrast, when predicting the SDR of lines from Pop2, the addition of genotypes from Pop1 did not increase the predictive ability that was already achieved with DH lines from Pop3. However, Pop2 was able to predict the other two populations with good accuracy despite being the least related family. Across-population prediction was also performed by keeping the sample size of all three populations constant (Table 5, scenario B), i.e. through a random reduction of the sample size of Pop2 and Pop3 to match the size of the smaller Pop1 (N = 81) in the respective ES and TS. Mean predictive abilities were very similar to the corresponding values obtained when using Pop2 and Pop3 with their respective original sample size.
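
The layout of the across-population scenarios, and the random size reduction used in scenario B, can be sketched as follows. This is an illustrative reconstruction under stated assumptions: each population is treated as a test set predicted from every non-empty combination of the remaining populations, which may not match the exact set of scenarios in Table 5. Only the size of Pop1 (N = 81) is taken from the text; the sizes of Pop2 and Pop3, the line identifiers and the function names are hypothetical.

```python
import random
from itertools import combinations

def across_population_scenarios(names):
    """List (estimation_set, test_set) pairs: each population is predicted
    from every non-empty combination of the remaining populations."""
    scenarios = []
    for ts in names:
        others = [n for n in names if n != ts]
        for size in range(1, len(others) + 1):
            for es in combinations(others, size):
                scenarios.append((es, ts))
    return scenarios

def equalise_sizes(populations, seed=1):
    """Randomly reduce every population to the size of the smallest one."""
    rng = random.Random(seed)
    target = min(len(lines) for lines in populations.values())
    return {name: rng.sample(lines, target) for name, lines in populations.items()}

# Hypothetical line identifiers; only the size of Pop1 (81) comes from the text.
populations = {
    "Pop1": [f"P1_{i}" for i in range(81)],
    "Pop2": [f"P2_{i}" for i in range(150)],  # placeholder size
    "Pop3": [f"P3_{i}" for i in range(160)],  # placeholder size
}

scenarios = across_population_scenarios(list(populations))
reduced = equalise_sizes(populations)

for es, ts in scenarios:
    print(" + ".join(es), "->", ts)
```

Repeating the random reduction with different seeds and averaging the resulting predictive abilities would guard against an unlucky subsample; whether the study did so is not stated in this section.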

Table 5 Predictive abilities (± standard deviation) from across-population prediction with the three biparental populations Pop1, Pop2 and Pop3. Predictive abilities are based on the trait stalk damage rating evaluated in unselected DH lines per se in 2011. Numbers in brackets represent the number of genotypes included in the respective estimation and test sets under scenario A (prediction across complete populations). Predictive abilities under scenario B (prediction across populations of equal size) were obtained by reducing the size of Pop2 and Pop3 to the size of Pop1 (N = 81) in both estimation and test set

Moderate predictive abilities between 0.43 and 0.44 were achieved in the testcross prediction of SDR (Table 6). The three scenarios did not differ significantly in their predictive abilities. Thus, training the model by adding phenotypic information on the selected DH lines per se evaluated in 2012 did not increase the prediction performance of the testcrosses. In Table 6, we report predictive abilities based on the resistant fraction of the selected lines. When correlations between predicted genotypic and observed phenotypic values were calculated using all selected lines, i.e. the resistant and the susceptible fractions, predictive abilities increased to approximately 0.50.

Table 6 Mean predictive abilities (± standard deviation) of the GBLUP model for testcross prediction of the trait stalk damage rating. The model was trained using the same number of genotypes but varying the number of replicated phenotypic observations across years (A–C). Predictive abilities are given for the resistant fraction of testcrosses

Discussion

The potential of genome-wide prediction for ECB resistance improvement

The genetic improvement of native resistance to ECB stalk damage in maize is a challenging task (Flint-Garcia et al. 2003). Screening under natural infestation becomes increasingly difficult due to mandatory crop rotation and increased chemical ECB control in many European regions affected by the pest. Moreover, natural infestation is often unreliable, as not every season offers climatic conditions suitable for effective resistance screening, even at locations with high ECB pressure. Thus, ECB resistance breeding requires cost- and labour-intensive artificial infestation to guarantee a high and homogeneous level of infestation in at least a subset of testing locations.

In this study, we evaluated the potential of genome-wide prediction for improving resistance to ECB stalk damage in Central European elite maize. Intermediate-to-high prediction accuracies were obtained for all resistance traits at the DH line per se level, and for SDR at the testcross level. In general, GP accuracies were comparable to or even higher than those reported by Technow et al. (2013), for Northern corn leaf blight resistance in maize, and by Rutkoski et al. (2012), for Fusarium head blight resistance in wheat. Our results demonstrate that GP can be an effective strategy for increasing genetic gain for ECB stalk damage resistance per unit time by predicting the performance of unphenotyped individuals based on their DNA profile.
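
To make concrete how unphenotyped individuals can be predicted from their marker profile, the following sketch implements a bare-bones GBLUP-type prediction in plain Python. It is not the model fitted in this study. The relationship matrix scaling (column-centred markers divided by the number of markers), the use of the estimation-set mean as the only fixed effect, the variance ratio `lam` supplied by the user instead of being estimated, and all simulated data are assumptions made for illustration.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def relationship_matrix(Z):
    """G = W W' / m, with W the column-centred marker matrix (assumed scaling)."""
    n, m = len(Z), len(Z[0])
    means = [sum(row[j] for row in Z) / n for j in range(m)]
    W = [[row[j] - means[j] for j in range(m)] for row in Z]
    return [[sum(a * b for a, b in zip(W[i], W[k])) / m for k in range(n)]
            for i in range(n)]

def gblup_predict(G, y, est, test, lam):
    """Predict test-set values from estimation-set phenotypes.

    lam is the assumed ratio of residual to genetic variance.
    """
    mu = sum(y[i] for i in est) / len(est)
    A = [[G[i][k] + (lam if i == k else 0.0) for k in est] for i in est]
    alpha = solve(A, [y[i] - mu for i in est])
    return [mu + sum(G[t][i] * a for i, a in zip(est, alpha)) for t in test]

# Simulated data (hypothetical): 60 DH-like lines, 40 markers coded -1/1.
rng = random.Random(7)
n, m = 60, 40
Z = [[rng.choice((-1, 1)) for _ in range(m)] for _ in range(n)]
effects = [rng.gauss(0, 1) for _ in range(m)]
g = [sum(z * e for z, e in zip(row, effects)) for row in Z]
y = [gi + rng.gauss(0, 3) for gi in g]

est, test = list(range(48)), list(range(48, 60))
G = relationship_matrix(Z)
# Residual variance 9 over genetic variance of roughly m on the scale of G.
predicted = gblup_predict(G, y, est, test, lam=9 / m)
```

The lines in `test` contribute no phenotypes to the prediction; their values are inferred solely from their genomic relationships with the lines in `est`.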

Comparison of the QTL-based and genome-wide approaches

Resistance breeding strategies based on QTL mapping results can be applied successfully for traits influenced by at least one or a few QTL with major effects (Miedaner and Korzun 2012). A QTL-based approach allows the identification and characterisation of specific resistance alleles and their directed introgression and/or pyramidisation into elite genetic backgrounds. Targeted combinations of alleles influencing different, even negatively correlated, traits (e.g. ECB resistance and flowering time) may be feasible with a QTL approach, and especially when resistance alleles originate from non-adapted genetic sources, the introgression of individual genomic regions into elite material might be advantageous over genome-wide approaches.

The multi-parental design of the three large and connected populations used in this study provided a suitable framework for QTL mapping. Thus, several QTL with sizeable effects could be identified for all traits. However, as has been shown previously, QTL effects are optimistically biased due to model selection, unless they are derived from an independent validation sample (Schön et al. 2004). Therefore, we developed a cross-validation R routine for MCQTL to obtain unbiased estimates of the proportion of variance explained by the multi-parental QTL mapping approach. After bias correction, statistical models fitting putative QTL identified for the three resistance traits explained only a small proportion of the respective phenotypic variance (<30 %), thus corroborating the polygenic nature of resistance to ECB stalk damage reported in previous studies (Schön et al. 1993; Melchinger et al. 1998; Krakowsky et al. 2002; Papst et al. 2004) and suggesting that for the marker-based improvement of ECB resistance, a genome-wide approach might be more promising.
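
The reason why an independent validation sample is needed can be illustrated with a small simulation. The sketch below is not the MCQTL cross-validation routine developed for this study. It uses a deliberately simple stand-in for QTL detection (choosing the single marker with the highest squared correlation) on hypothetical data in which the phenotype is pure noise, so that any variance apparently explained in the estimation set is a product of model selection.

```python
import random
from statistics import mean

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def select_and_validate(markers, y, est, test):
    """Pick the marker with the highest R^2 in the estimation set and report
    its R^2 in the estimation set and in the independent test set."""
    def r2(j, idx):
        return r_squared([markers[i][j] for i in idx], [y[i] for i in idx])
    best = max(range(len(markers[0])), key=lambda j: r2(j, est))
    return best, r2(best, est), r2(best, test)

# Simulated data (hypothetical): 100 lines, 200 markers coded -1/1, and a
# phenotype unrelated to any marker.
rng = random.Random(42)
markers = [[rng.choice((-1, 1)) for _ in range(200)] for _ in range(100)]
y = [rng.gauss(0, 1) for _ in range(100)]

est, test = list(range(60)), list(range(60, 100))
best, r2_est, r2_test = select_and_validate(markers, y, est, test)
print(best, round(r2_est, 3), round(r2_test, 3))
```

Because the marker is chosen for its fit in the estimation set, its R² there is inflated in expectation, whereas the test-set R² is not affected by the selection step. Averaging the test-set value over many CV splits gives the kind of bias-corrected estimate referred to above.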

This conclusion was confirmed by the results of the GBLUP analysis. Predictive abilities obtained with GBLUP in the joint-population scenario were, on average, 31 % higher for SDR and 62 % higher for TL than those obtained using QTL-based prediction. Moreover, in a comparison across 50 CV test sets, predictive abilities obtained with GBLUP surpassed those obtained with the QTL-based approach without exception. A similar comparison of GBLUP and QTL-based predictions was carried out by Peiffer et al. (2013) for stalk strength in maize, using nearly 5,000 inbred lines. As in our study, the authors reported cross-validated predictive abilities of GBLUP that were consistently higher than those of QTL models.

Several factors contribute to the differences in prediction performance between the QTL-based and the GBLUP approaches. For one, the genetic architecture of the trait under study has a strong influence on the relative efficiency of the two methods. For highly polygenic traits, it is difficult to achieve accurate estimates of both QTL positions and effects, as model selection becomes a challenge when the contribution of individual genes to quantitative trait variation is small. Nonetheless, even with large-effect QTL, it can be difficult to select a set of markers that maximises prediction accuracy, if the population exhibits long-range linkage disequilibrium and low trait heritability (Wimmer et al. 2013). Here, the relatively large LOD support intervals of QTL positions, and the more pronounced superiority of GBLUP over the QTL-based approach for the two traits with lower trait heritability (TL and NT), point in this direction.

We also investigated the potential of an integrated QTL–GP approach, by considering that a few QTL with sizeable effects were detected particularly for the trait SDR (e.g. QTL on chromosomes 2 and 3 with R² > 0.10 and > 0.20, respectively). Zhao et al. (2014) showed that an integrated approach can be effective if the two sources of information complement each other. In their study on predicting heading time and plant height across unrelated wheat genotypes, prediction accuracy could be increased significantly by exploiting information on well-characterised functional mutations. However, our results showed that with sufficient marker coverage, the standard GBLUP model captured the QTL effects adequately, and accounting for QTL information in the GBLUP model did not increase predictive abilities. A further recent study by Zhang et al. (2014) proposed a method for increasing genome-based prediction accuracy by systematically exploiting the wealth of published QTL information (as SNP markers detected in either QTL mapping or genome-wide association studies) through integration into the GBLUP genomic relationship matrix. It remains to be seen whether this could also be a valid approach to predicting ECB resistance in maize.

In summary, we conclude that our results encourage the application of a whole-genome-based approach over a QTL-based approach for the marker-assisted improvement of ECB stalk damage resistance in elite maize. Implementation of GP in practice should be facilitated if sources of resistance are recombined with elite material in earlier selection cycles, as was the case in this study, to allow for the simultaneous genome-based prediction of resistance and other agronomically important traits.

Joint-, within- and across-population prediction

For Pop2 and Pop3, the predictive abilities of within-population prediction were almost as high as or even higher than those obtained from joint-population prediction despite the substantially smaller sample sizes of the respective ES. One explanation for this finding could be that at least partially different resistance alleles segregate in the three populations, a notion which is supported by results from the QTL analysis. Among the larger QTL identified for SDR, one of the QTL on chromosome 2 and the QTL on chromosome 8 mainly segregated in Pop2, whereas Pop3 provided major allelic effects at the second QTL on chromosome 2 and the QTL on chromosome 5 (data not shown). Only the two QTL on chromosomes 3 and 6 were detected in all three populations, and the effects were congruent in sign for the three resistant parents.

The across-population prediction scenario also supported the hypothesis concerning different resistance alleles. Prediction accuracies that could be obtained across populations were in good agreement with the level of relatedness among the parental lines. Resistant parents R1 and R3 showed higher pairwise molecular genetic similarity with each other than each of them with R2, and consequently, the mutual predictive abilities of Pop1 and Pop3 were substantially higher than their respective predictive abilities for DH lines derived from Pop2. Interestingly, predictive abilities in the across-population prediction involving Pop2 differed significantly for reciprocal scenarios. Substantially higher predictive abilities could be obtained when Pop2 constituted the ES and Pop1 and Pop3 the TS, compared to when Pop1, Pop3 or both constituted the ES and Pop2 the TS. This effect remained significant when the sample size of the estimation sets was kept constant. We hypothesise that due to the higher genetic distance of R2 to the susceptible line S1, a higher number of predictive haplotypes could be formed in Pop2 compared to Pop1 and Pop3, thus leading to a higher effective sample size and higher predictive power of Pop2. Additionally, a QTL on chromosome 8 was mainly segregating in Pop2 and displayed sizeable effects for SDR. The smaller effects at this QTL in Pop1 and Pop3 may have contributed to significant losses in prediction accuracy for lines from Pop2. Given that this locus colocalised with an important QTL for ANT, it is possible that SDR in Pop2 is more strongly influenced by flowering time than in Pop1 and Pop3.

Our results indicate that the choice of population employed in model training can significantly affect predictive abilities. As a result, parental lines should be chosen to maximise the effective sample size of a training population, especially when breeding schemes allow for only a few recombination events. If different sources of resistance are available for a given trait, model training can be performed in multi-parental populations, thereby ensuring predictive power across a diverse spectrum of resistance alleles and genetic backgrounds.

Testcross prediction

The genotypic correlation between line per se and testcross performance is an important parameter for optimising hybrid breeding programmes (Mihaljevic et al. 2005). Selection based on line per se performance may be advantageous due to the larger additive genetic variance that can be exploited in inbred lines and the accelerated phenotypic evaluation (Miedaner et al. 2014). Given that the correlation between predicted and observed values was calculated for the selected fraction of resistant DH lines only, and as prediction was performed across different years, the predictive abilities of 0.43 to 0.44 obtained for SDR in all three scenarios can be considered high. These results are also reflected in the high genotypic correlation between DH lines per se and testcrosses evaluated for SDR in 2012 (Fig. 1) and the absence of significant genotype × year interaction variance (data not shown). The non-significant genotype × year interaction variance also explains why the three testcross prediction scenarios yielded almost identical results, i.e. including additional data from selected DH lines per se evaluated in 2012 in model training did not lead to a significant increase in predictive abilities for testcrosses. In contrast to SDR, the predictive abilities of testcross prediction were low for stalk tunnelling traits TL and NT (data not shown) due to the low heritabilities observed for these traits at the testcross level (Table 1). The artificial infestation of testcrosses might have been less effective than for the inbred lines per se, because its success could be related significantly to the developmental stage of the plants during the first days of larval establishment. We accounted for differences in plant development by conducting artificial infestation for testcrosses one week earlier than for the inbred lines. In spite of this measure, we may not have obtained a sufficient level of ECB pressure to allow for an effective evaluation of stalk tunnelling traits at the testcross level.

In summary, our results demonstrate that, for improvements in ECB stalk breakage tolerance in testcrosses, it is promising to use GP models trained on DH lines per se phenotyped in one season at multiple locations. Considering the promising results obtained for TL and NT with DH lines per se, the potential of GP for improving ECB stalk tunnelling resistance at the testcross level deserves further investigation.

Resistance traits and flowering time

In contrast to the tight genetic correlation of 0.94 between TL and NT, genetic correlations between tunnelling traits and SDR in the DH lines per se were intermediate, thus indicating that SDR might have been affected by factors other than ECB feeding. Environmental factors unrelated to ECB damage may affect SDR, although their impact should be low in lodging-tolerant elite maize materials screened under severe ECB pressure. It remains to be seen whether training GP models based on SDR will be effective for the long-term improvement of ECB stalk damage resistance or whether a multi-trait approach including SDR and tunnelling traits needs to be implemented. TL and NT phenotyping is extremely resource intensive; however, prediction accuracies obtained for the two traits at the DH line per se level were of a similar order as observed for SDR given that both traits had substantially lower heritabilities. The higher heritability of SDR was at least to some extent a consequence of the fact that SDR could be scored in all six field environments in both years. Thus, a genome-based, multi-trait approach should be considered for the simultaneous genetic improvement of the different components contributing to resistance to ECB stalk damage in maize.

High negative genetic correlations were observed between resistance traits and ANT in the unselected populations. These correlations were reflected at the genetic level by common positions of QTL, showing in some cases opposite signs of additive effects. This was most evident for an SDR/ANT colocalising QTL on chromosome 8 (bin 8.05) with a physical LOD support interval including the well-characterised flowering time vgt1 locus (Salvi et al. 2007; Ducrocq et al. 2008). It is possible that this and other loci affecting ANT have an influence on ECB stalk damage traits. This correlation has been reported in early phenotyping research (Russell et al. 1974; Jarvis and Guthrie 1980; Hudon and Chiang 1991) as well as in QTL mapping studies (Bohn et al. 2000; Krakowsky et al. 2004). Given that early maturity is an important breeding goal in Central Europe, a negatively correlated selection response of maturity and resistance is not desired. However, results from the evaluation of the selected DH lines per se in 2012 demonstrated that it is possible to maintain genetic variation for resistance traits while considerably reducing variation in flowering time. Thus, improving resistance traits and maturity simultaneously by constructing a genome-based selection index should be a viable option in resistance breeding.

Conclusions

Marker-assisted selection can be effective for improving resistance to ECB stalk damage in maize. Our results demonstrate that the efficiency of MAS can be increased considerably when progressing from a QTL-based towards a genome-wide prediction approach. Because of the cost- and time-consuming phenotyping of ECB resistance traits, GP can be a meaningful strategy for increasing genetic gain per unit time, by predicting the performance of unphenotyped individuals based on their DNA profile. The potential of GP for improving stalk breakage tolerance (SDR) was demonstrated at both the DH line per se and testcross levels. Improvements of stalk feeding resistance (TL and NT) were shown at the DH line per se level. Recombining progeny from the three populations evaluated herein may further increase the overall level of resistance due to different resistance alleles segregating. With the availability of native resistance to ECB stalk damage in elite maize germplasm adapted to Central European conditions, our results may open up avenues for implementing an integrated genome-based selection approach for the simultaneous improvement of yield, maturity and ECB resistance.

Author contribution statement

CCS, PW, BK and MO conceived the study; FF, PW, JE, MD and BK conducted experiments; TP, MD and RK provided materials; VW and FF developed the cross-validation routine for the MCQTL program; FF and WM analysed the data; CCS and PW supervised the research; FF, PW and CCS wrote the manuscript; all authors read, edited and approved the manuscript.