Introduction

Perennial ryegrass (Lolium perenne L.) is one of the most widely sown forage grass species in the world, and is the main source of energy and protein for grazing livestock in higher fertility pastures in New Zealand. The development of cultivars with potential for increased productivity is a key objective in forage improvement.

Seed yield is a measure of the total saleable seeds recovered from harvesting and processing a seed crop. An economic level of seed yield is vital to the production and successful commercialisation of any forage grass. However, seed yield in perennial ryegrass, as in other forage grasses, has received less attention from plant breeders than herbage production and forage quality (Marshall and Wilkins 2003), because these species are primarily used as forage.

Seed yield is a quantitative trait, with its expression influenced by many genes and by the environment (Casler et al. 2003). Identification of and selection for relevant component traits is a means to enhance genetic gain for complex, quantitative traits in plant breeding (Donald 1968; Sparnaaij and Bos 1993). Such an approach dissects complex traits into components under separate and probably simpler genetic control, providing useful information on the genetic architecture and insights into the physiological control of the overarching trait. Improved genetic gain for the complex trait may subsequently be achievable by selecting on positively correlated components with sufficient heritability. Based on glasshouse and field studies, component traits with significant contributions to seed yield in perennial ryegrass have been identified. These include increased seed-set (higher proportion of ovules forming seeds), greater seed number per tiller and more reproductive tillers per plant (Marshall and Wilkins 2003; Studer et al. 2008). However, only a few traits were measured in these studies. A more comprehensive analysis of associations among traits contributing to seed yield is therefore timely.

As a molecular strategy for accelerating trait improvement in plants (Heffner et al. 2009; Hayes et al. 2013), genomic selection (Meuwissen et al. 2001), underpinned by low cost single nucleotide polymorphism (SNP) marker systems (Elshire et al. 2011), is poised to supersede marker-assisted selection that uses molecular markers linked to quantitative trait loci (QTL) (e.g. Barrett et al. 2008). However, QTL analysis remains an effective tool for deciphering the genetic architecture of quantitative traits, identifying genetic interactions and inter-dependencies amongst traits, and providing a platform for fine mapping and identification of candidate genes.

Regardless of the state of play of genetic and bioinformatics technology, the desirable phenotype to be bred for still needs to be identified. Seed yield is a complex trait, and the identification of its component traits and their interactions, and related traits and the genetic factors controlling them, remains only partly investigated despite a number of insightful studies in the last 10 years, including, among others, Armstead et al. (2008), Byrne et al. (2009), and Brown et al. (2010). For example, the first mentioned study primarily investigated QTL for seed set, the second study investigated a group of three coincident QTL for days to heading, spike length and spikelets per spike, while the third was confined to spike description traits, without investigating threshed seed yield.

Considering the above, the aims of this research were (i) to evaluate relationships amongst traits associated with seed yield potential in a New Zealand-adapted perennial ryegrass population; and (ii) to identify genomic regions associated with phenotypic variation in those traits via QTL discovery and comparative analysis. In this paper we present a comprehensive analysis of associations amongst 13 seed-yield-related traits in the I × S mapping population (Sartie et al. 2011), measured at one location (Palmerston North, New Zealand), and identification of putative QTL for those traits. Concurrent assessment of four traits in the same mapping population at a second location (Lincoln, New Zealand) enabled identification of environmentally stable QTL for seed yield per spike (SYSp) and days to heading (DTH) and, for the latter, further validation was afforded by repeated QTL detection in consecutive years.

Materials and methods

Plant material

Two hundred progeny of a previously-described perennial ryegrass mapping population, I × S (Sartie et al. 2011), and the two parents were assessed for seed yield potential. Plants of mapping population I × S are F1 full-sibling progeny from a cross between one heterozygous plant each of ‘Grasslands Impact’ hybrid ryegrass (L. × boucheanum syn. L. hybridum) and ‘Grasslands Samson’ perennial ryegrass (L. perenne). Both parents are infected by New Zealand common toxic endophyte (Epichloë festucae var. lolii). Details on morphological and vegetative characteristics of the I × S population and the parents are discussed elsewhere (Sartie et al. 2009, 2011; Faville et al. 2012).

Field assessment of seed production traits at Palmerston North

The field experiment for study of seed yield related traits (Table 1, Fig. 1) and their associations was carried out from July 2003 to February 2004 at AgResearch in Palmerston North, New Zealand (40°21′S, 175°37′E) on a fertile, alluvial silt loam soil with pH of 5.5. Four clonal replicates of the 200 mapping population progeny and of the two parents were transplanted to the field on 14–15 July 2003, using a randomised complete block design with 60 cm between plants. Each replicate contained one copy of each plant genotype, comprising 10–15 tillers. A total of 200 kg N/ha was applied as urea in three equal portions on 1st September, 1st October and 15th October. The plant growth regulator Moddus® (trinexapac-ethyl, TE) was applied at 200 g TE/ha at first spike emergence (5th November) and again three weeks later to prevent plants from lodging (Chastain et al. 2003). Stem rust (Puccinia graminis) was observed and controlled with Systhane 125 fungicide, applied at 20 mL/100 L of water/ha in December 2003 and January 2004.

Table 1 Seed yield related traits measured, trait abbreviations, and methodology for trait assessment
Fig. 1

a Stylistic representation of the perennial ryegrass inflorescence showing key components. b Conceptual relationships among seed yield-related traits and seed yield per plant (SYPlant) in perennial ryegrass, showing SYPlant as a product of RTiller × FSp × FSUtil × TSW. SYSp = seed yield per spike; RTiller = no. of reproductive tillers; SpktSp = spikelets per spike, FSpkt = florets per spikelet, TSW = thousand seed weight, SpLen = spike length, FSp = florets per spike, FSUtil = floret site utilisation, DTH = days to heading, SOH = spread of heading, PGHabit = plant growth habit

During November and December, data were collected on heading date [DTH (2003)], spread of heading (SOH), and plant growth habit (PGHabit), as defined in Table 1. Three spikes were harvested randomly from each plant 7 weeks after heading, placed in sealed plastic bags and kept in a refrigerator. Plants were harvested with hand-shears to about 10 cm above ground level between 6 and 26 January 2004, as seed heads ripened. In order to minimize seed shattering losses of early-emerging heads and immaturity at harvest of late-emerging heads, harvesting was carried out when most of the spikes had brown or gold coloured glumes. After harvest, heads were air dried for three weeks, then stored in paper bags in a dry warehouse until they could be hand processed to determine seed-yield-related traits. The total number of reproductive tillers (RTiller) from each plant was counted. Seed yield per plant (SYPlant) and thousand seed weight (TSW) were measured, while data on spikelets per spike (SpktSp), florets per spikelet (FSpkt) and spike length (SpLen) were collected from the three spikes that were harvested earlier from each plant, as defined in Table 1. Florets per spike (FSp) and seed yield per spike (SYSp) were calculated from those data (Table 1). Floret site utilisation (FSUtil; the number of saleable seeds per plant divided by the number of florets counted per plant post anthesis, Elgersma 1985) was also calculated as: FSUtil = [SYPlant (g) × (1000/TSW (g))]/[RTiller × FSp]. Seed moisture content was measured for four randomly selected genotypes using the high constant temperature oven method (International Seed Testing Association, 2004) at 130 °C (± 2 °C) for 1 h.
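As an illustration of the FSUtil calculation described above, the following minimal Python sketch converts seed yield per plant to a seed count via TSW and divides by the number of florets per plant. The function name and the input values are hypothetical and are not taken from the trial data.

```python
def floret_site_utilisation(sy_plant_g, tsw_g, r_tiller, fsp):
    """FSUtil = (SYPlant x 1000 / TSW) / (RTiller x FSp).

    sy_plant_g : seed yield per plant (g)
    tsw_g      : thousand seed weight (g per 1000 seeds)
    r_tiller   : number of reproductive tillers per plant
    fsp        : florets per spike
    """
    if tsw_g <= 0 or r_tiller <= 0 or fsp <= 0:
        raise ValueError("TSW, RTiller and FSp must all be positive")
    seeds_per_plant = sy_plant_g * 1000.0 / tsw_g   # number of saleable seeds
    florets_per_plant = r_tiller * fsp              # number of floret sites
    return seeds_per_plant / florets_per_plant


# Hypothetical plant: 20 g of seed, TSW of 2.0 g, 100 reproductive tillers,
# 200 florets per spike -> 10 000 seeds from 20 000 florets.
print(floret_site_utilisation(20.0, 2.0, 100, 200))  # 0.5
```

Because FSUtil is derived arithmetically from SYPlant, TSW, RTiller and FSp, measurement error in any of the four inputs propagates into it.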

Plants were maintained in the field through winter 2004 and heading date measured as in the previous year [DTH (2004), Table 1] to evaluate the year-to-year repeatability of heading date differences among the genotypes.

Field assessment of seed production traits at Lincoln

For assessment of key seed yield traits in a second environment, population I × S was evaluated for SYPlant, SYSp, RTiller and DTH at a Lincoln site near Christchurch, New Zealand, as part of an existing trial of similar design to that of the Palmerston North trial (Faville et al. 2012). Traits were evaluated as described for the Palmerston North experiment, including measurement of DTH in two consecutive years.

Weather data

Data on temperature, rainfall and sunshine during the period of experimentation were recorded for Palmerston North at a weather station 600 m from the experimental plots, and at Lincoln using the closest National Institute of Water and Atmospheric Research weather station (Lincoln Broadfield, except Christchurch Aero for sunshine) (Table 2).

Table 2 Mean monthly temperature, rainfall and sunshine observed in close proximity to the Palmerston North (PN) and Lincoln (LIN) experimental sites during July 2003 to January 2004

Statistical analysis

Statistical analysis of phenotypic data from each of the Palmerston North and Lincoln locations was carried out using the linear mixed models option in GenStat v 8.1 (GenStat 2005). A random effects linear model was applied in the analyses, using the REML algorithm, with I × S genotype, replicate, row and column effects considered as random. Adjusted phenotypic means were based on Best Linear Unbiased Predictors (BLUPs) (White and Hodge 1989), enabling adjustment for random error across replicates, columns and rows within replicates. The significance of an estimated variance component was determined by the ratio of the component relative to its standard error. If the variance component of a model term was more than two standard errors from zero, then the variance component was considered significant (p < 0.05). After inspection of normal probability plots, data transformation was not considered necessary.
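The two-standard-error rule for variance components can be expressed as a short check. This is only a sketch of the decision rule as stated above; the function name and the example numbers are hypothetical, and the variance components themselves would come from the REML fit.

```python
def variance_component_is_significant(component, standard_error, n_se=2.0):
    """Return True if the component lies more than n_se standard errors from zero."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return abs(component) / standard_error > n_se


# Hypothetical estimates (component, SE); not values from Table 3.
print(variance_component_is_significant(4.0, 1.5))  # True  (ratio about 2.67)
print(variance_component_is_significant(1.0, 0.8))  # False (ratio 1.25)
```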

The Palmerston North dataset contained 1616 observations for DTH (200 F1 progeny + 2 parents = 202 genotypes × 2 years × 4 replicates) and 808 observations for each of the other eleven traits which were measured in 1 year only. Similarly, the Lincoln dataset consisted of 1616 observations for DTH and 808 observations for each of SYPlant, SYSp and RTiller. Broad sense heritability for each trait was calculated from variance components yielded by the random effects model as:

$$H_{b} = \frac{{{{\upsigma }}_{g}^{2} }}{{{{\upsigma }}_{g}^{2} + \left( {\frac{{{{\upsigma }}_{\varepsilon }^{2} }}{{n_{r} }}} \right)}}$$

where $\upsigma_{g}^{2}$ = genotypic component of variance, $\upsigma_{\varepsilon}^{2}$ = residual variance of genotypes and $n_{r}$ = number of replications (Burton and DeVane 1953). Trait associations were evaluated by correlation and principal component analysis (PCA) (Sartie et al. 2011) using Minitab version 10.51 (Minitab Inc, 2081 Enterprise Drive, State College, PA).
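The single-site heritability formula above can be sketched in Python as follows. The function name and the variance values are hypothetical placeholders; only the formula and the use of four replicates follow the text.

```python
def broad_sense_heritability(var_g, var_e, n_reps):
    """H_b = var_g / (var_g + var_e / n_reps), on a genotype-mean basis."""
    if n_reps <= 0:
        raise ValueError("number of replicates must be positive")
    denominator = var_g + var_e / n_reps
    if denominator <= 0:
        raise ValueError("phenotypic variance of genotype means must be positive")
    return var_g / denominator


# Hypothetical variance components with the four clonal replicates used here:
# 4 / (4 + 8/4) = 0.667
print(round(broad_sense_heritability(var_g=4.0, var_e=8.0, n_reps=4), 3))
```

Dividing the residual variance by the number of replicates means the estimate describes the repeatability of genotype means rather than of single plants.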

Analysis of a dataset combining trait data from the Palmerston North and Lincoln sites for DTH, SYPlant, SYSp and RTiller was also completed, using the linear mixed models option in GenStat as described above and with site additionally considered as a fixed effect. The linear model also included a genotype-by-site interaction effect. Heritability was estimated from variance components as:

$$H_{b} = \frac{{{{\upsigma }}_{g}^{2} }}{{{{\upsigma }}_{g}^{2} + \frac{{{{\upsigma }}_{gs}^{2} }}{{n_{s} }} + \left( {\frac{{{{\upsigma }}_{\varepsilon }^{2} }}{{n_{r} n_{s} }}} \right)}}$$

where $\upsigma_{gs}^{2}$ = genotype-by-site component of variance and $n_{s}$ = number of sites.
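A corresponding sketch for the across-site formula is given below, together with the ratio of the genotype-by-site to the genotypic variance component, which is used in the Results as an index of G × E strength. Function names and input values are hypothetical.

```python
def across_site_heritability(var_g, var_gs, var_e, n_reps, n_sites):
    """H_b = var_g / (var_g + var_gs / n_s + var_e / (n_r * n_s))."""
    if n_reps <= 0 or n_sites <= 0:
        raise ValueError("n_reps and n_sites must be positive")
    return var_g / (var_g + var_gs / n_sites + var_e / (n_reps * n_sites))


def interaction_ratio(var_gs, var_g):
    """Ratio of genotype-by-site variance to genotypic variance."""
    if var_g <= 0:
        raise ValueError("genotypic variance must be positive")
    return var_gs / var_g


# Hypothetical components with four replicates at each of two sites:
# 4 / (4 + 2/2 + 8/8) = 0.667, and ratio 2/4 = 0.5
print(round(across_site_heritability(4.0, 2.0, 8.0, n_reps=4, n_sites=2), 3))
print(interaction_ratio(2.0, 4.0))
```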

QTL analysis

Quantitative trait loci analysis was conducted using the I × S consensus genetic linkage map marker data (Sartie et al. 2011). The mean value for each trait from 188 F1 genotypes in the present study was used for QTL analysis implemented in MapQTL® 4.0 software (Van Ooijen et al. 2002). Simple interval mapping (IM) was performed and then multiple QTL model (MQM) mapping was used to refine the position and magnitude of QTL. The default mapping step size of 5 cM was used for both IM and MQM. Co-factors for MQM analysis were selected using a procedure based on forward selection followed by backward elimination (Van Ooijen et al. 2002) as described in Sartie et al. (2011).

Permutation testing (n = 1000) was performed for each trait to establish LOD thresholds for QTL declaration at linkage group-wide or genome-wide significance (p < 0.05) (Churchill and Doerge 1994). QTL position was described by LOD peak position and 1- and 2-LOD support intervals. An additional criterion for declaration of a significant QTL was the presence of markers within the 2-LOD support interval that were significantly associated with the trait in a non-parametric Kruskal–Wallis analysis, executed in MapQTL® 4.0. QTL names are constructed as: trait name-year of trait assessment-linkage group (LG) number.
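The logic of a permutation-derived LOD threshold can be illustrated with a simplified sketch. It is not the MapQTL implementation: it assumes markers coded 0/1 and a single-marker regression LOD, LOD = −(n/2) log10(1 − r²), in place of interval mapping, and all names and data are hypothetical. Phenotypes are shuffled relative to genotypes, the maximum LOD across markers is recorded for each permutation, and the (1 − α) quantile of those maxima is taken as the threshold.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """Single-marker regression LOD: -(n/2) * log10(1 - r^2)."""
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    spp = sum((p - mean_p) ** 2 for p in phenotypes)
    sgp = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    if sgg == 0 or spp == 0:
        return 0.0  # monomorphic marker or constant trait: no evidence
    r2 = min(sgp * sgp / (sgg * spp), 1.0 - 1e-12)
    return -(n / 2.0) * math.log10(1.0 - r2)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the maximum LOD under permutation."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        max_lods.append(max(marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm - 1e-9) - 1)
    return max_lods[index]


# Simulated example: 188 individuals, 20 unlinked markers, one true effect.
rng = random.Random(1)
markers = [[rng.randint(0, 1) for _ in range(188)] for _ in range(20)]
trait = [2.0 * g + rng.gauss(0.0, 1.0) for g in markers[0]]
threshold = permutation_threshold(markers, trait, n_perm=200)
print(marker_lod(markers[0], trait) > threshold)
```

Restricting the set of markers to one linkage group gives a linkage group-wide threshold, whereas passing all markers gives a genome-wide threshold.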

As described previously (Sartie et al. 2011), phenotypic trait means for the four different QTL genotype classes at a locus (ac, ad, bc, and bd) calculated in MapQTL 4.0 were used to report QTL in terms of the individual parental effects (i.e. the difference in effect of the alleles inherited from each parent, ‘I’ and ‘S’), following the model of Knott et al. (1997) as used by Sewell et al. (2000, 2002).
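One common way of expressing parental effects from the four genotype class means is sketched below. The contrasts are written here as differences between averages of class means, which is an assumption: scaling conventions (sums versus averages) differ between reports, and the assignment of 'parent 1' (alleles a and b) and 'parent 2' (alleles c and d) to 'I' and 'S' is not specified in this section. The function name and example values are hypothetical.

```python
def parental_effects(ac, ad, bc, bd):
    """Contrasts among the four QTL genotype class means of a full-sib family.

    parent1     : effect of substituting allele a for allele b
    parent2     : effect of substituting allele c for allele d
    interaction : departure from additivity of the two parental effects
    """
    parent1 = (ac + ad) / 2.0 - (bc + bd) / 2.0
    parent2 = (ac + bc) / 2.0 - (ad + bd) / 2.0
    interaction = (ac + bd) / 2.0 - (ad + bc) / 2.0
    return {"parent1": parent1, "parent2": parent2, "interaction": interaction}


# Hypothetical class means for a trait (arbitrary units).
print(parental_effects(ac=10.0, ad=8.0, bc=6.0, bd=4.0))
# {'parent1': 4.0, 'parent2': 2.0, 'interaction': 0.0}
```

Under this formulation, a QTL with a strong effect from one parent and a near-zero effect from the other indicates that only one parent is heterozygous for alleles of contrasting effect at that locus.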

For mapped ryegrass ESTs, sequence alignment by the Basic Local Alignment Search Tool, BLASTN (Altschul et al. 1990, 1997) (threshold values of < E-15; SID > 85% over > 100 bp) was used to estimate identity with homologous rice genome positions in the MSU rice pseudomolecule assembly hosted at www.gramene.org. Where possible the MSU assembly was also used to identify rice genome positions for QTL-linked markers reported from previous studies.

Results

Phenotype analysis

Palmerston North experiment

Data are presented in a logical rather than chronological collection sequence: SYPlant, SYSp as the overarching traits, measured primary data (RTiller, SpktSp, FSpkt, TSW and SpLen), derived secondary data (FSp, FSUtil), and other measures (DTH, SOH, PGHabit). A majority of traits exhibited statistically significant differences between the parents and among the progeny, and transgressive segregation (Table 3). The range among genotypes in SYPlant was greater than 5×, whereas for the primary yield component traits the range was smaller (RTiller 3.2×, SpktSp 1.5×, FSpkt 1.6×). Although parent ‘I’ (from ‘Grasslands Impact’) had more tillers per plant pre-flowering than parent ‘S’ (‘Grasslands Samson’) (data not shown), the latter had more reproductive tillers (RTiller) and greater seed yield per plant (SYPlant) than parent ‘I’ (Table 3).

Table 3 Mean (± SE, standard error of the mean), range, genotype variance component ($\upsigma_{g}^{2}$ ± SE), genotype-by-environment variance component ($\upsigma_{gs}^{2}$ ± SE) and broad sense heritability ($H_{b}$) for seed yield-related traits of the I × S perennial ryegrass mapping population progeny and parents assessed in the field as spaced plants in 2003 and 2004, at Palmerston North and Lincoln. S: ‘Samson’ parent; I: ‘Impact’ parent. $\mathrm{LSD}_{0.05}$ = least significant difference amongst I × S genotypes at p < 0.05

When trait associations are assessed by simple correlation analysis (Table 4) SYPlant is most strongly associated with seed yield per spike (SYSp) and RTiller. SYSp in turn is most closely correlated with floret site utilisation (FSUtil) while RTiller displays a negative correlation with spread of heading (SOH). The only significant negative correlations between seed yield components involve FSUtil, and these were modest, but greater SOH is associated with lower SYPlant, later heading with lower florets per spikelet (FSpkt), and prostrate growth habit (PGHabit) with lower TSW (Table 4).

Table 4 Coefficients of correlation between seed yield component traits of the I × S perennial ryegrass mapping population assessed in the field as spaced plants at Palmerston North and Lincoln (suffix ‘Lin’); (p < 0.05 if r ≥ 0.14; p < 0.01 if r ≥ 0.18; ns = non-significant, p > 0.05)
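The significance cut-offs quoted in the Table 4 caption can be approximately reproduced from the relationship between a correlation coefficient and its test statistic, r = t/√(df + t²) with df = n − 2. The sketch below assumes n = 200 genotypes and substitutes a normal quantile for the t quantile (a close approximation at this sample size); both are assumptions rather than details stated in the caption.

```python
import math
from statistics import NormalDist


def critical_r(n, alpha):
    """Approximate two-sided critical value of Pearson's r for sample size n."""
    if n <= 2:
        raise ValueError("n must exceed 2")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # normal approximation to t
    df = n - 2
    return z / math.sqrt(df + z * z)


print(round(critical_r(200, 0.05), 2))  # 0.14
print(round(critical_r(200, 0.01), 2))  # 0.18
```

The thresholds apply to the absolute value of r, so negative correlations of the same magnitude are equally significant.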

When trait associations are assessed by principal component analysis to extract trait associations corrected for the effects of variables on each other (Table 5), the first three principal components (PCs) collectively account for 57.9% of the data variation and show a progressively diminishing link with SYPlant (coefficients 0.479, 0.257, and 0.172, respectively) driven largely by FSpkt contribution to FSp and SYSp (PC1), retention of FSUtil in competition with spikelets per spike (SpktSp) and FSpkt (PC2), and high RTiller (PC3). PC4 (10.8% of data variation) is of interest as it indicates only a modest link between 1000-seed weight (TSW) and SYPlant (Elgersma 1990b), but confirmation of a positive relationship between high TSW and erect growth habit (Table 5).

Table 5 Coefficients of principal component analysis of seed yield per plant and component traits for the I × S perennial ryegrass mapping population assessed in the field at Palmerston North as spaced plants in 2003. Coefficients of absolute value less than 0.15 have been suppressed

Lincoln experiment and across-site analysis

Data for SYPlant, SYSp, RTiller and days to heading (DTH) from Lincoln also exhibited significant genotypic variation (Table 3). Mean trait values and ranges amongst I × S genotypes for Lincoln data were similar to those observed in Palmerston North, except that the range for SYPlant was smaller and mean RTiller was considerably lower than in Palmerston North. SYPlant, SYSp, DTH (2003) and DTH (2004) from Lincoln were significantly (p < 0.05) correlated with their respective Palmerston North data (Table 4), with the highest correlations involving DTH (r = 0.50–0.65). Of the seed yield traits, the highest correlation between sites was for SYSp (r = 0.43). RTiller from Lincoln did not correlate significantly with Palmerston North RTiller. Within the Lincoln site, the magnitude of correlations amongst traits was similar to that observed for Palmerston North but RTiller was only weakly correlated with SYPlant at Lincoln.

Across-site statistical analysis resulted in significant $\upsigma_{gs}^{2}$ components (Table 3) for the four measured traits, confirming a G × E influence for all. Based on the ratio of $\upsigma_{gs}^{2}$ to $\upsigma_{g}^{2}$, the G × E effect was strongest for SYPlant, which had an interaction ratio of 0.94 compared with 0.29, 0.25 and 0.36 for SYSp, DTH (2003) and DTH (2004), respectively. This was broadly consistent with the estimated heritabilities for these traits (Table 3).

QTL analysis

In total 42 significant QTL, located at 21 discrete genomic positions, were detected by MQM analysis using data from the Palmerston North experiment (Fig. 2). QTL were identified for all the traits except RTiller and were present on all seven linkage groups (LG), with two to five significant QTL declared per trait. QTL were generally of low or moderate effect, explaining on average 12.3% of total phenotypic variation (Vp). Major QTL were identified for some traits, notably DTH for which a QTL accounting for 39% (2003) and 28% (2004) Vp was detected on LG2 (Table 6). Total Vp explained by all QTL for a trait ranged from 19.8% (FSpkt) to 74.9% (PGHabit), with the mean across all traits 40.2%. However, the true magnitudes of QTL effects in this study are likely to be smaller than reported as the analysis was conducted in a relatively small population (n = 188) (Elgersma 1990a), making Vp values prone to upwards bias (Beavis 1994).

Fig. 2

Genetic linkage map developed for F1 perennial ryegrass mapping population I × S, showing QTL for seed yield and seed yield related traits, as measured in the field using spaced plants at Palmerston North, New Zealand in 2003 and 2004. QTL for selected traits at the Lincoln site and across environments are also shown. Marker names are shown at left of linkage groups (LG) and QTL locations (2-LOD confidence intervals) are indicated by rectangles at right of each LG (QTL names as per Table 6). The length of linkage groups in centimorgans (cM) is indicated by the scale at the left of the figure

Table 6 QTL controlling seed yield and related traits, detected on a genetic linkage map by simple interval mapping (IM) and multiple QTL model (MQM) analyses in I × S F1 perennial ryegrass mapping population data in 2003 and 2004 at Palmerston North, New Zealand

Eleven of the 42 QTL occurred in positions independent of other traits, most notably SpLen for which all four QTL occurred at positions free of QTL for other traits (Fig. 2). By contrast, clustering of multiple QTL (2-LOD interval overlap) occurred at ten genome locations (Fig. 2), most notably on LG2, 4 and 6, indicating a potential common genetic basis for QTL at these positions. QTL for SYPlant on LG2 and 6 both co-located with QTL for SYSp and other component or related traits.

The LG2 QTL cluster identifies relationships of SYPlant with SYSp, FSpkt, SpktSp, FSp, TSW and PGHabit. Estimated parental effects (phase and size) were conserved between SYPlant and SYSp (Table 6), with a strong effect due to alleles segregating from parent ‘I’. Consistent parental effects were also estimated for FSp, FSpkt, SpktSp and PGHabit at this position, but these differed from SYPlant/SYSp in that approximately equal effects were conferred by alleles segregating from both parents. For TSW the alleles segregating from parent ‘S’ exerted a large effect, opposite to that estimated for the other traits. Differing patterns of parental effects amongst QTL at the LG2 position may be due to the influence of alternative alleles at a single QTL, or may reflect tight linkage amongst separate loci modulating the expression of the separate traits. With the resolution of QTL mapping achieved in this study, it is not possible to determine decisively which of these scenarios holds.

The LG6 position identified a relationship of TSW, FSUtil and SYSp with SYPlant. Here the parental effects were consistent amongst all QTL (Table 6), implying a common genetic basis. A QTL for SOH was detected in close proximity (Fig. 2) and may be a product of the same genetic locus, but was characterised by an opposing set of parental effects (Table 6). High SOH is likely to lead to high seed loss before harvest and thus links to low FSUtil and low seed yield.

Large effect QTL for DTH on LG2 and LG4 were detected in both years (2003 and 2004), while smaller effect QTL were detected at different positions on LG7 in 2003 and 2004, respectively (Fig. 2). DTH (2003) and DTH (2004) QTL at the LG2 and LG4 positions shared consistent additive parental effects (Table 6). QTL for other traits at these positions had parental effects in opposition to those of DTH (SOH on LG2; FSpkt and FSp on LG4), in line with the negative correlations observed between these traits and DTH (Table 4).

The strong correlation observed between FSUtil and SYSp (Table 4) was manifested in the co-alignment of QTL for these traits at four positions on LG5, 6 and 7. In all positions, QTL parental effects were conserved for the two traits, suggesting a common genetic factor.

QTL analysis of data for the four traits measured at Lincoln revealed a total of seven QTL, for SYSp and DTH (Fig. 2, Table 6). No QTL were detected for RTiller, as for Palmerston North, and additionally no significant QTL for Lincoln SYPlant were found. The QTL for SYSp (LG6 and LG7), DTH 2003 (LG2, LG4, LG7) and DTH 2004 (LG2 and LG4) (Table 6) co-located with their Palmerston North equivalents; no Lincoln-specific QTL were detected for any trait. In all cases these additional QTL shared similar direction and magnitude of parental effects with their counterparts discovered in Palmerston North data, suggesting a common genetic basis (Table 6). QTL analysis based on BLUP means from a G × E analysis for SYPlant, SYSp and DTH confirmed QTL positions for all three traits identified by the Palmerston North and Lincoln analyses (Fig. 2, Table 6).

Discussion

Associations between seed yield component traits

Seed yield is a complex trait, and the identification of its component traits and their interactions, and related traits and the genetic factors controlling them, will provide a better understanding for the genetic improvement of seed yield potential in perennial ryegrass. Conceptually, seed yield per plant (SYPlant) can be understood as the product of number of reproductive tillers (RTiller) and seed yield per spike (SYSp). SYSp in turn can be understood as the product of florets per spike (FSp), floret site utilisation (FSUtil) and 1000-seed weight (TSW) (Fig. 1b). FSp is the product of florets per spikelet (FSpkt) and spikelets per spike (SpktSp). Spike length (SpLen) may also influence SpktSp, while there is no logical basis for predicting the influence of DTH, SOH and PGHabit on these other traits. The statistics presented here for these traits are not all derived independently, with some being arithmetic functions of others.
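The conceptual relationships described above can be written compactly as follows. The factor of 1000, which converts thousand seed weight to weight per seed, is implied by the FSUtil formula given in the Materials and methods rather than stated in this paragraph.

$$\mathrm{SYPlant} = \mathrm{RTiller} \times \mathrm{SYSp},\qquad \mathrm{SYSp} = \mathrm{FSp} \times \mathrm{FSUtil} \times \frac{\mathrm{TSW}}{1000},\qquad \mathrm{FSp} = \mathrm{SpktSp} \times \mathrm{FSpkt}$$

Substituting the second and third expressions into the first gives SYPlant as the product RTiller × SpktSp × FSpkt × FSUtil × TSW/1000, which makes explicit why several of the reported statistics are not independent of one another.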

By simple correlation analysis SYPlant is most closely correlated with SYSp (Table 4), but this is as much a mathematical as a biological association because SYSp equals SYPlant/RTiller. Using PCA to isolate trait associations contributing independently to SYPlant is revealing. Firstly, it is seen that the associations among the component traits of seed yield are comparatively complex. There is not a “single size” PC encompassing most of the traits and which describes plants with either superior or inferior seed yield (as often occurs in analyses of this kind). Rather, a series of four components was identified (Table 5), each describing an association between a specific subset of seed yield component traits, and each explaining a part of the observed variation for seed yield amongst genotypes. PC1 is rather strongly associated with SYPlant (coefficient + 0.479) and with three of the four measured primary seed yield component traits: RTiller, SpktSp, and especially FSpkt (coefficient + 0.349). PC2 can be seen as identifying intra-plant competition factors, such as a resource-allocation trade-off reflected by a negative association between increased FSUtil (coefficient + 0.439) and decreased SpktSp and FSpkt (coefficients − 0.246 and − 0.361, respectively). This indicates that plants forming fewer florets and attaining higher FSUtil tended to have higher SYPlant (coefficient + 0.257). This finding corroborates the research of Rumball and Foote (2008), who found that selection for inflorescence branching to increase FSp reduced SYPlant. A recent study (Abel et al. 2017), based on path and correlation analysis of seed yield components, also identified SpktSp and FSpkt as traits with a direct effect on seed yield in perennial ryegrass.

PC3 involves SOH (coefficient − 0.352), a trait not generally considered in discussion of seed yield potential. This indicates that plants with greater uniformity in flowering date (i.e. less spread) generally produce more reproductive tillers (coefficient + 0.655) with a modest contribution to SYPlant (coefficient + 0.172). Finally, PC4 is of high biological interest because it is the first to substantively involve TSW (coefficient + 0.707), a trait widely held to correlate with seed vigour, and anecdotally presumed to be important to seed yield. However, this PC accounts for < 11% of the data variation and only a modest contribution to SYPlant (coefficient + 0.189) is indicated. Indeed, TSW is more strongly associated in PC4 with erect growth habit (PGHabit coefficient − 0.517) than with SYPlant (Table 5). These findings indicate that TSW is not an important determinant of seed yield in perennial ryegrass. Nevertheless, other reports (Rowarth et al. 1999) show that for forage species, TSW is a seed quality factor associated with increased seedling weight (Jin et al. 1996), vigour (Bean 1980) and field emergence (Rowarth and Sanders 1996), and hence improved field production (Hampton 1986), making it a useful target for breeders.

Previous studies by several other researchers have identified SYSp and RTiller as traits that significantly influence seed yield in perennial ryegrass (Bugge 1987; White 1990; Elgersma 1990b; Marshall and Wilkins 2003; Studer et al. 2008). However, the recognition that SYSp is itself a complex trait, determined by primary traits FSUtil, SpktSp, FSpkt and TSW, and that there are several associations amongst subsets of component traits contributing independently to SYPlant, improves our understanding of seed yield in perennial ryegrass.

Indications for breeding targets

From a perspective of prioritising breeding targets aimed at improving seed yield in perennial ryegrass, the present data provide a number of firm indications.

Firstly, in deconstructing SYPlant and SYSp to more fundamental components, an obvious trait of interest is FSUtil. The trait has high heritability (0.94, Table 3), a high correlation with SYPlant (Table 4) and contributes to SYPlant via independent trait associations in PC1 and PC2 (Table 5). However, there is a practical problem in that determining FSUtil in large numbers of plants for selection purposes would impose severe logistical difficulties. The possible use of QTL markers to indirectly select for FSUtil is discussed below. Another possible avenue for FSUtil enhancement is indirectly, by leveraging the modest association observed between FSUtil and DTH (i.e. higher FSUtil associated with later heading: Table 4; PC2 in Table 5). DTH also has high heritability (0.94, Table 3) and selecting for later heading has further implications for on-farm management of perennial ryegrass, as head emergence tends to both increase total herbage accumulation rate and decrease forage digestibility (Parsons 1988). In forage seed production, late heading in perennial ryegrass is reported to reduce floret fertility (Anslow 1963) and also the numbers of spikelets and florets per spike (Hill and Watkin 1975). These reports support the current results, except that in our study the later heading parent (I), which had reduced seed yield as a result of reduced FSp and SpktSp, showed a compensatory increase in FSUtil (Table 3).

Secondly, uniformity of heading within a plant, measured in this data set as spread of heading (SOH), is vital in seed production to minimize seed shattering losses of early-emerging heads and immaturity at harvest of late-maturing heads, and thereby make harvest date determinations easier. The recommended practice for managing harvesting time in commercial grass seed production is to harvest seeds at an average seed moisture content of 45% (Silberstein et al. 2010). This practice, however, may be associated with a substantial reduction in seed yield of late or early heading plants. Our results show that with reduced SOH, increased FSUtil and SYPlant are also expected (Tables 4 and 5). This adds further importance to the goal of developing cultivars with synchronised heading of tillers within a population, as it not only minimises yield losses due to seed production practice but also maximises seed yield per plant. Measurement of SOH presents few logistical problems and our results revealed moderate broad sense heritability for SOH, making selection for this trait worthwhile.

Thirdly, a high number of reproductive tillers would intuitively be expected to enhance SYPlant. However, after extracting the positive effect of RTiller captured in PC1 and associated with reduced SOH, trait interactions offset much of any potential advantage from high RTiller. In particular, high RTiller negatively affected FSUtil (Table 4; PC3 in Table 5), presumably reflecting within-plant competition factors, like the SpktSp/FSUtil and FSpkt/FSUtil trade-offs noted above. Also, total tiller number per plant pre-flowering did not translate to high RTiller (tillers from which seed-heads were harvested) or high SYPlant (data not presented). This is consistent with previous observations that increased plant tiller number in late winter-early spring does not always result in increased seed yield (Brown 1977; Hampton 1986). Differences in tiller number per plant may also reflect plant spacing and light capture strategy (Matthew et al. 1995). For instance, there was a poor correlation between seed yield of spaced and drilled plants of perennial ryegrass, and this was attributed to changes in the number of spikelets (Elgersma 1990a). However, spaced plants are also more likely than drilled ones to develop a high tiller number per plant, and differences in the production of RTiller may be another factor contributing to the poor correlation. The potential for lack of correlation between results from spaced plants and drilled trials is a salutary reminder that in any plant breeding programme trait stability across environments must be confirmed at various points in the programme.

Fourthly, selection for TSW is less likely than selection for other traits discussed above to enhance SYPlant but, where desired, could be influenced indirectly by selection for erect growth habit (Table 5), if this association proved robust across populations and environments. Measuring PGHabit is simpler than measuring TSW, and indirectly selecting plants for TSW based on erect PGHabit will enable selection of superior plants before they are harvested.

QTL for seed yield and related traits

Consistent with previous studies (Yamada et al. 2004; Armstead et al. 2008; Studer et al. 2008; Byrne et al. 2009; Brown et al. 2010; Paina et al. 2016), quantitative trait locus (QTL) analysis identified numerous genomic locations responsible for conditioning seed yield and seed yield-related traits. Two QTL of moderate effect were identified for SYPlant itself on LG2 and LG6, together accounting for 23% of the phenotypic variation observed in the mapping population. These co-aligned with QTL for component traits of SYPlant (SYSp, FSpkt, FSp, FSUtil, TSW, among others; Fig. 2) and these associations therefore reflect the impact of those component traits on SYPlant. The resolution of QTL mapping does not allow for assessment of whether QTL co-linearity is due to a common genetic factor or whether there are closely-linked QTL which act independently of each other, but cumulatively. QTL of this type may well be responsible for the various independent trait associations identified in successive PCs by PCA. Similarly, QTL for SYSp on LG5 and LG7 co-locate with QTL for FSUtil, but in this case without involvement of other traits.
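
The notion of QTL co-location used here can be made explicit with a simple interval-overlap check: two QTL are flagged as co-located when they lie on the same linkage group and their support intervals overlap. The sketch below is a minimal illustration under stated assumptions; the function name, the tuple layout and all positions in the example are hypothetical and are not the values reported in Table 6 or Fig. 2.

```python
from collections import defaultdict
from itertools import combinations


def colocated_qtl(qtl):
    """Return (linkage_group, trait_a, trait_b) for every pair of QTL on the
    same linkage group whose support intervals overlap.

    qtl is a list of (trait, linkage_group, start_cM, end_cM) tuples.
    """
    by_group = defaultdict(list)
    for trait, group, start, end in qtl:
        if start > end:
            raise ValueError(f"interval start exceeds end for {trait} on {group}")
        by_group[group].append((trait, start, end))

    pairs = []
    for group, items in by_group.items():
        for (t1, s1, e1), (t2, s2, e2) in combinations(items, 2):
            # Two closed intervals overlap when each starts before the other ends.
            if s1 <= e2 and s2 <= e1:
                pairs.append((group, t1, t2))
    return pairs


# HYPOTHETICAL positions, for illustration only.
example = [
    ("SYPlant", "LG2", 40.0, 55.0),
    ("FSUtil", "LG2", 50.0, 62.0),
    ("TSW", "LG2", 70.0, 80.0),
    ("SYPlant", "LG6", 10.0, 25.0),
    ("FSp", "LG6", 20.0, 30.0),
]
pairs = colocated_qtl(example)
for group, trait_a, trait_b in pairs:
    print(f"{group}: {trait_a} overlaps {trait_b}")
```

An overlap of this kind indicates only positional coincidence. As noted in the text, it cannot distinguish a single common genetic factor from closely linked QTL acting independently.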

As with earlier studies (Yamada et al. 2004; Armstead et al. 2008; Studer et al. 2008; Byrne et al. 2009; Paina et al. 2016), multiple QTL were detected for DTH, including a QTL of large effect on LG2 (28–29% Vp), and these were, with one exception, stable across years and environments (see below). It seems likely that these, as well as QTL for other seed yield-related traits, are the same QTL as those detected in previous reports (Supplementary Table 2), but it is difficult to confirm this definitively. In silico comparison of ryegrass QTL from different studies, based on physical positions of marker sequences in the rice genome, was limited by the availability of DNA sequence information for many of the QTL-linked markers. Amongst the QTL that could be compared in this way, the DTH locus on LG7 (Hd3a described by Armstead et al. 2004) and the SpLen QTL on LG2 (Brown et al. 2010) appear to correspond positionally with equivalent QTL in those studies. QTL-linked markers for DTH from both studies mapped by BLASTN to an interval of 2.7–2.9 Mb on rice chromosome 6, and those for SpLen to an interval of 29.7–31.4 Mb on rice chromosome 4.

Co-location of QTL within the current study also occurred for trait combinations that would not logically be expected to be interdependent, one example being the QTL for FSpkt, FSp, DTH, and PGHabit at 61 cM on LG4 (Table 6, Fig. 2). As noted above, this may be explained either by pleiotropic effects or by close linkage of two different genes. Another example from our data that fits this scenario is that QTL for the vegetative traits leaf appearance interval and ligule appearance interval in the same mapping population (Sartie et al. 2011) occur at or in close proximity to the LG6 SYPlant QTL (Table 6; Fig. 2). A canonical correlation analysis (Matthew et al. 1994) to explore links between the vegetative trait phenotypic data and these reproductive trait data found there was indeed a highly significant statistical association between leaf appearance interval and SYPlant, explaining a portion of the variance for those traits (Supplementary Table 1). Further linkages between vegetative traits and heading date were also observed. Pauly et al. (2012) noted that the LG7 DTH QTL has been associated in other mapping populations with vegetative traits, including herbage yield, plant height and leaf lamina length, and proposed there may be a link between heading date and traits associated with plant growth, focused on the genes Hd3 and Hd1. Our data support this relationship, with the I × S LG7 QTL for DTH co-aligning with QTL for leaf elongation rate and leaf elongation duration (Sartie et al. 2011). In our current study, QTL for FSUtil and SYSp were also identified in this LG7 region (Table 6; Fig. 2). Furthermore, we note that the DTH QTL on LG2 occurs at positions previously associated in population I × S with QTL for dry matter yield (Sartie et al. 2011) and plant growth (Faville et al. 2012). The LG4 DTH QTL also aligns closely with QTL for leaf morphogenetic (Sartie et al. 2011) and field growth traits (Faville et al. 2012).

As a point for future research, understanding of the genetic control of inflorescence development and the associated molecular systems is growing rapidly (Bommert et al. 2005; Kellogg 2007). With the emergence also of new genomic sequence resources for ryegrass (Byrne et al. 2015), it would be of interest to identify molecular mechanisms linked to particular QTL, especially those implicated in regulation of intra-plant competition, such as the SpktSp, FSpkt and TSW loci.

G × E interactions for seed yield and related traits

In their quest for understanding phenotypic variation via integrated approaches in the field environment, Pauli et al. (2016) recognised that physiological traits, plant developmental phases and growth conditions are interrelated, and that this relationship requires an understanding of genotype-by-environment (G × E) interactions. Such interactions are widely observed in perennial ryegrass for herbage yield and other agronomic traits (Jafari et al. 2003; Conaghan et al. 2008; Kerr et al. 2012; Easton et al. 2015; Fé et al. 2015). In this study, we were able to explore the influence of G × E on seed production after measuring a subset of traits for population I × S at two contrasting locations. Linear mixed model analysis based on trait data from both sites showed that G × E interaction, measured as the genotype-by-site variance component σ²_gs, was highly influential for SYPlant (Table 3), and this was reflected in a lack of QTL for SYPlant in common between the two environments. In contrast, smaller (albeit significant) G × E interaction was detected for both SYSp and DTH, and QTL were identified at similar locations on the same linkage groups (four of five QTL positions for DTH and two of six QTL for SYSp), indicating QTL that are stable across environments. Environmentally-stable QTL and their associated markers represent particularly robust candidates for development of marker-assisted breeding to improve seed yield and manipulate heading date in perennial ryegrass.
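
The effect of a large genotype-by-site variance component on the repeatability of genotype rankings can be illustrated with the commonly used entry-mean heritability expression, H² = σ²_g / (σ²_g + σ²_gs/s + σ²_e/(s·r)), where s is the number of sites and r the number of replicates per site. The sketch below assumes this expression rather than the exact model fitted in this study; the function name and all variance values are hypothetical, and only the number of sites (two) is taken from the text.

```python
def entry_mean_heritability(var_g, var_gs, var_e, n_sites, n_reps):
    """Broad-sense heritability on an entry-mean basis across sites.

    var_g  : genotypic variance
    var_gs : genotype-by-site interaction variance
    var_e  : residual variance
    """
    if n_sites < 1 or n_reps < 1:
        raise ValueError("n_sites and n_reps must be at least 1")
    if min(var_g, var_gs, var_e) < 0:
        raise ValueError("variance components must be non-negative")
    phenotypic = var_g + var_gs / n_sites + var_e / (n_sites * n_reps)
    if phenotypic == 0:
        raise ValueError("total phenotypic variance is zero")
    return var_g / phenotypic


# HYPOTHETICAL variance components; two sites as in this study,
# three replicates per site assumed for illustration.
h2_small_gxe = entry_mean_heritability(1.0, 0.2, 1.0, n_sites=2, n_reps=3)
h2_large_gxe = entry_mean_heritability(1.0, 2.0, 1.0, n_sites=2, n_reps=3)
print(f"small G x E: {h2_small_gxe:.2f}; large G x E: {h2_large_gxe:.2f}")
```

Under these placeholder values, increasing the interaction variance while holding the other components fixed lowers the across-site heritability, which is in line with the observation that traits with strong G × E interaction yielded fewer QTL in common between environments.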

Conclusions

This study has provided an improved understanding of component traits associated with seed yield in perennial ryegrass, which may be targeted for selection to improve seed yield potential via conventional and molecular breeding approaches. Superficially, variation in SYPlant was strongly associated with variation in RTiller and SYSp. Deconstruction of SYSp to component traits by PCA pointed to FSUtil as being of high importance to seed yield, and showed that TSW was less important than previously believed. QTL were discovered for all investigated traits except RTiller, and these may be used to inform further dissection of seed yield, an economically significant trait, to candidate gene level. QTL-linked markers may also be utilised in MAS or as fixed factors in a genomic prediction model for improving seed yield-related traits and heading date. Nevertheless, validation of QTL effects in other environments and genetic backgrounds should be a prerequisite for deployment in MAS for improved seed yield in perennial ryegrass.