Introduction

The success of hybrid rice is based on the availability of genetic variation and efficient selection strategies for germplasm that make it possible to exploit hybrid vigor commercially. Accurate assessment and assignment of parental lines into heterotic groups, defined as related or unrelated genotypes from the same or different populations, which display similar combining ability and heterotic response when crossed with genotypes from other genetically distinct germplasm groups (Melchinger and Gumber 1998), have become a prerequisite for a successful commercial hybrid breeding program as shown in maize (Melchinger and Gumber 1998; Reif et al. 2005), sunflower (Reif et al. 2013), sorghum (Menz et al. 2004), triticale (Fischer et al. 2010), and rice (Lu and Xu 2010). Traditionally, heterotic groups are evaluated by combining ability analysis, which involves a multi-environment evaluation of parents and hybrids; however, advances in molecular marker technology have made it possible to combine information on parental pedigree and field trials with molecular marker data to detect and establish heterotic groups (Melchinger 1999).

Hybrid rice has been used in commercial production for four decades. Currently, there are about 17 million hectares of hybrid rice production in China and another 4 million hectares in other countries, mainly in Bangladesh, India, Indonesia, Myanmar, the Philippines, the US and Vietnam. Heterotic rice hybrids are generally derived from parents that are distant in geographic origin or belong to different ecotypes (Yuan 1977; Lin and Yuan 1980). In the early stage of hybrid rice development in China, two heterotic groups, that is, early season indica from southern China and mid- or late-season indica from Southeast Asia, were identified for three-line hybrid rice based on wild abortive (WA) male sterile cytoplasm (Yuan 1977). More heterotic groups were studied and identified for three-line hybrids derived from other male sterile cytoplasms and for two-line hybrid rice based on thermo-sensitive genic male sterility (Wang and Lu 2007; Lu and Xu 2010). For other types of rice hybrids, however, such as tropical indica, and temperate and tropical japonica, no clear information is available for a definition of heterotic groups. Parents of tropical indica hybrid rice are still categorized by fertility reaction (restorer or maintainer of male sterility). The lack of a systematic study of heterotic groups could be one of the main reasons for the low yield heterosis observed in tropical hybrid rice, which results from unpredictable combinations of parents.

The International Rice Research Institute (IRRI) has been one of the major developers of hybrid rice in the world, with many hybrid rice products (parents, breeding lines and hybrids) produced and disseminated. IRRI-bred hybrid rice germplasm has been playing a major role in hybrid rice programs in public and private organizations globally. The long-term goal of IRRI’s hybrid rice breeding program is to develop and disseminate broad-based heterotic germplasm, parents and hybrids fitting multi-environments with high yield, good quality, multiple resistances and tolerance to diseases and insects, and other production-required traits. There is, therefore, a crucial need not only to understand and verify the heterotic groups and patterns of IRRI hybrid rice germplasm but also to provide a guideline reference to maximize germplasm potential useful for increasing heterosis. In a previous study, we analyzed the genetic diversity and germplasm structure of hybrid rice parents historically developed at IRRI using simple sequence repeats (SSR) and single nucleotide polymorphism (SNP) markers (He et al. 2012). Both types of marker data revealed a consistent germplasm structure with six groups within two major clusters. The objectives of this study were to (1) evaluate the magnitude of yield heterosis among parents grouped by SSR markers; (2) examine the consistency between marker-based group and heterotic performance of hybrids; and (3) identify foundational male and female hybrid parents in discrete germplasm pools and construct a set of core heterotic groups to provide a reference for tropical indica hybrid rice breeding.

Materials and methods

Plant materials

A sample of 18 WA-cytoplasmic three-line hybrid rice parents, including 12 maintainer (B) lines and 6 restorer (R) lines, was selected from a distance-based cluster generated from 207 SSR markers (Table 1) (He et al. 2012). Parents were selected so that the lines (1) represented the original 6 groups clustered from the SSR markers; (2) covered as much of the allelic variation as possible (59 % of that in the original 168 hybrid rice parents); and (3) were widely used in tropical hybrid rice breeding and production. A diallel mating design without reciprocals was used among the parents to develop 153 F1 hybrids in the wet season (WS) of 2009 and the dry season (DS) of 2010. Two commercial inbred varieties (IR72, IRRI 123) and three commercial hybrids (IR75217H, IR78386H and Pioneer001) were included as checks.
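
The number of crosses follows directly from the design: a diallel without reciprocals (and without selfs) among n parents gives n(n − 1)/2 hybrids. The short sketch below enumerates the crosses to confirm the count of 153 for 18 parents. The parent labels are hypothetical placeholders, not the line names in Table 1.

```python
from itertools import combinations

# Hypothetical parent labels (placeholders for the 18 lines listed in Table 1).
parents = [f"P{i:02d}" for i in range(1, 19)]

# Diallel without reciprocals and without selfs: each unordered pair occurs once.
crosses = list(combinations(parents, 2))

n = len(parents)
# The enumerated count must match the closed-form n(n - 1)/2.
assert len(crosses) == n * (n - 1) // 2
print(len(crosses))  # 153
```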

Table 1 Parents and their yield means (g plant−1) and GCA across environments

Field experiments

A total of 176 entries (153 hybrids, 18 parents, and 5 checks) were evaluated in field trials in five environments (year, season and location): (1) 2010WS and 2011DS, Los Baños, Philippines, 14°10′N/121°13′E; (2) 2010WS and 2011DS, General Santos, Philippines, 6°5′N/125°14′E; and (3) 2011DS, Hyderabad, India, 17°22′N/78°28′E. All entries were grown in a randomized complete block design with two replications. Forty 21-day-old seedlings were transplanted in 4 rows with 10 plants per row at spacing of 20 cm × 20 cm. Field management followed local recommendations for the two different cropping seasons. Data on days to heading were recorded at the heading stage. At ripening stage, five healthy plants in the central rows were harvested and measured for six agronomic traits: plant height, number of panicles per plant, number of spikelets per panicle, spikelet fertility, 1,000-grain weight, and grain yield per plant. All hybrids had normal fertilities. To focus on the main objectives, we analyzed only the data on grain yield per plant, yield heterosis and yield combining ability.

Statistical analyses

The genetic diversity between each pair of parents was measured as Cavalli-Sforza chord genetic distance (GD) (Cavalli-Sforza and Edwards 1967) using PowerMarker v3.25 (Liu and Muse 2005) as previously described (He et al. 2012). A cluster based on the C.S. chord GD matrix of the 18 parents was generated using Darwin 5 software (Perrier and Jacquemoud-Collet 2006). The following statistical model was used for the analysis of variance (ANOVA):

$$Y \, = \, \mu \, + \, E \, + \, R \, + G \, + \, EGI \, + \, e$$

where Y is the observed value of yield from each test unit, μ is the population mean, E is the environmental (Env) effect, R is the replication effect within each environment, G is the genotype (parent or F1 hybrid) effect, EGI is the genotype-by-environment interaction effect (referred to as GEI below), and e is the residual effect. The environment and hybrid were treated as fixed factors and the replication-within-environment was considered to be a random factor. The significance of environmental variance was tested against the replication-within-environment term. For all other significance tests, the experimental error term was used. ANOVA was conducted for the yield of parents and hybrids for each environment and across all environments using PROC GLM (SAS Institute Inc 2012). General combining ability (GCA) effects of the parents and specific combining ability (SCA) effects of the crosses were estimated for each environment and across environments following Griffing’s method 2 model 1 (Griffing 1956) using an R program (R Development Core Team 2011). Yield heterosis for each hybrid was calculated as (1) mid-parent heterosis (MPH) = 100 × (F1 − MP)/MP, (2) better-parent heterosis (BPH) = 100 × (F1 − BP)/BP, and (3) standard heterosis over inbred check (SDHI) or over hybrid check (SDHH) = 100 × (F1 − CK)/CK, where F1 is the hybrid yield, MP is the mean yield of both parents, BP is the yield of the better-yielding parent, and CK is the yield of the check variety, either the inbred check (IRRI 123) or the hybrid check (IR75217H). Although five checks were included in the study and the hybrid IR78386H yielded more than the hybrid IR75217H, only IRRI 123 and IR75217H were used for calculations of SDHI and SDHH because these two varieties have long been used in commercial rice production and as standard variety checks for inbred and hybrid yield trials hosted by IRRI, the Philippines Rice Research Institute, and elsewhere.
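
The three heterosis measures defined above can be written as one small function. This is a minimal sketch for a single hybrid, not the code used in the study. The function name `heterosis` and the example yields are illustrative assumptions. Passing the inbred check yield as `check` gives SDHI, and passing the hybrid check yield gives SDHH.

```python
def heterosis(f1, parent_a, parent_b, check):
    """Return MPH, BPH and standard heterosis (all in percent) for one hybrid.

    f1, parent_a, parent_b and check are yields in the same unit
    (for example g per plant).
    """
    mid_parent = (parent_a + parent_b) / 2.0      # MP
    better_parent = max(parent_a, parent_b)       # BP
    return {
        "MPH": 100.0 * (f1 - mid_parent) / mid_parent,
        "BPH": 100.0 * (f1 - better_parent) / better_parent,
        "SDH": 100.0 * (f1 - check) / check,      # SDHI or SDHH, depending on the check
    }


# Illustrative yields in g per plant (made-up values, not data from the study).
example = heterosis(f1=36.0, parent_a=28.0, parent_b=32.0, check=30.0)
print(example)  # {'MPH': 20.0, 'BPH': 12.5, 'SDH': 20.0}
```
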
Correlations between GD estimates and means of hybrid yield, yield heterosis and combining ability were calculated using PROC CORR of SAS (SAS Institute Inc 2012).
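
For readers unfamiliar with the combining ability analysis described above, the sketch below shows the least-squares estimators usually quoted for Griffing's method 2 (parents plus one set of F1s, no reciprocals). It is not the R program used in the study. It assumes the input is a symmetric table of entry means already averaged over replications (and environments, if applicable); the function name `griffing_method2` and the toy numbers are hypothetical.

```python
def griffing_method2(means):
    """Estimate GCA and SCA effects from a symmetric n x n table of entry means.

    means[i][i] is the mean of parent i; means[i][j] == means[j][i] is the
    mean of the F1 between parents i and j (no reciprocals).
    """
    n = len(means)
    # X_i.: row total of the symmetric table (includes the parent itself).
    row_tot = [sum(means[i]) for i in range(n)]
    # X..: total of the n(n + 1)/2 distinct entries (upper triangle + diagonal).
    grand = sum(means[i][j] for i in range(n) for j in range(i, n))

    gca = [
        (row_tot[i] + means[i][i] - 2.0 * grand / n) / (n + 2)
        for i in range(n)
    ]
    sca = {}
    for i in range(n):
        for j in range(i + 1, n):
            sca[(i, j)] = (
                means[i][j]
                - (row_tot[i] + means[i][i] + row_tot[j] + means[j][j]) / (n + 2)
                + 2.0 * grand / ((n + 1) * (n + 2))
            )
    return gca, sca


# Toy check with a purely additive table (made-up numbers): x_ij = mu + g_i + g_j.
mu, g_true = 30.0, [2.0, -0.5, -1.5]
toy = [[mu + gi + gj for gj in g_true] for gi in g_true]
gca, sca = griffing_method2(toy)
```

On a purely additive table such as `toy`, the estimated GCA effects recover the values in `g_true` and every SCA effect is zero, which is a convenient sanity check before applying the estimators to real means.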

Results

Genetic distance and groups of parents

The C.S. chord genetic distance calculated from 207 SSR markers for the original 168 parents ranged from 0 to 0.7593, with an average of 0.4469. The GD of the 18 sampled parents ranged from 0.1246 to 0.6487, with an average of 0.4694, which was slightly higher than that of the original parent population. Based on the criteria of parental selection, the sample was nevertheless considered fairly representative of the allelic variation and the cluster structure of the original population (He et al. 2012) (Figs. 1, 2). The sampled parents were clearly clustered into six groups (G) similar to those in the original parent population with two major clusters, i.e., one with three B-line groups (G4, G5 and G6), and another one with one B-line group (G1) and two R-line groups (G2 and G3). It is noted that G1 is clustered into the R-line sub-cluster and grouped closer to the G3 R-line group than to the G2 R-line group, and G4 is closer genetically to the R-line groups than are the other two B-line groups (G5 and G6). Among the groups, the average inter-group GD (0.4942) was higher than the average intra-group GD (0.2833); however, the G4 (0.4296) and G5 (0.4145) groups had intra-group GD values similar to the inter-group values and much higher than those of the other intra-groups (Table 2). The B × R groups had the highest allelic divergence with an average GD of 0.4868, higher than the GDs in B × B (0.4830) and R × R (0.3262) groups.
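
As background for the GD values reported above, the sketch below computes a commonly quoted form of the Cavalli-Sforza and Edwards chord distance, averaged over loci. The exact scaling used by PowerMarker is assumed rather than verified here, and the function name `cs_chord_distance` and the allele labels are hypothetical. The allele frequencies in the example are made up.

```python
import math


def cs_chord_distance(freqs_x, freqs_y):
    """Chord distance between two lines or populations, averaged over loci.

    freqs_x and freqs_y are lists with one dict per locus, each mapping an
    allele label to its frequency in X and Y, respectively.
    Assumed form: D = (2 / (pi * m)) * sum_over_loci sqrt(2 * (1 - sum_a sqrt(p_a * q_a))).
    """
    m = len(freqs_x)
    total = 0.0
    for px, py in zip(freqs_x, freqs_y):
        alleles = set(px) | set(py)
        overlap = sum(math.sqrt(px.get(a, 0.0) * py.get(a, 0.0)) for a in alleles)
        # Guard against tiny negative values caused by floating-point rounding.
        total += math.sqrt(max(0.0, 2.0 * (1.0 - overlap)))
    return 2.0 * total / (math.pi * m)


# Two homozygous lines scored at two loci (made-up alleles): same allele at
# the first locus, different alleles at the second.
line_a = [{"150bp": 1.0}, {"210bp": 1.0}]
line_b = [{"150bp": 1.0}, {"214bp": 1.0}]
d = cs_chord_distance(line_a, line_b)
```

Under this form, two lines that share no alleles at any locus reach the maximum of 2√2/π (about 0.90), and identical lines have a distance of 0.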

Fig. 1

Cluster of 18 WA-cytoplasmic three-line hybrid rice parents based on genetic distance calculated from 207 SSR markers. Letter and number combination refers to parent as marker-based group_fertility group_parent code

Fig. 2

Distribution of genetic distances of the original 168 and the 18 hybrid rice parents sampled for this study based on 207 SSR markers

Table 2 Genetic distance among groups and performance of hybrid yield and yield heterosis

Parental and hybrid performance

ANOVA was performed to determine the statistical significance of the different sources of variation affecting parental and hybrid grain yields (Table 3). Parental yield did not differ significantly across environments, but differed significantly among genotypes (P < 0.0001) and for GEI (P < 0.01), which accounted for 21 and 28 % of the total sums of squares, respectively. The average yield of the parents was 29.0 g plant−1, ranging from 19.5 g plant−1 (V20B) to 37.2 g plant−1 (IR64R) across the environments (Table 1), and the average yield (31.8 g plant−1) of R-lines was significantly (P < 0.01) higher than the average yield (27.7 g plant−1) of B-lines.

Table 3 Analysis of variance, including degrees of freedom (DF), mean squares (MS), and percent contribution to total sums of squares (%SS) over 5 environments for parent and hybrid yield, hybrid yield standard heterosis over inbred CK (SDHI) and hybrid CK (SDHH), and mid-parent heterosis (MPH) and better-parent heterosis (BPH)

Hybrid yields differed significantly for environment (P < 0.01), genotype (P < 0.0001), GEI (P < 0.01), and replication-within-environment (P < 0.01) with an average yield of 33.3 g plant−1 and a range of 21.1 to 43.6 g plant−1 (Table 3; Fig. 3a). The environment, genotype and GEI contributed 23, 19 and 31 % of the total sums of squares, respectively. The environments in which the field experiments were conducted differed geographically and seasonally, and the hybrid genotypes were diverse; thus, large effects due to environment and genotype were expected. The significant and relatively large percentage of the total variation attributable to GEI suggests that hybrids responded differentially to environment for yield. The highest and the lowest yields of hybrids across environments were observed in the 2010WS at General Santos (39.3 g plant−1) and at Los Baños (27.0 g plant−1) (Table 2). Inter-group hybrids yielded significantly (P < 0.01) more than intra-group hybrids. R × R and B × R hybrids yielded similarly, with average yields of 34.9 and 34.5 g plant−1, respectively, and their yields were significantly (P < 0.05) higher than the yield (31.7 g plant−1) of B × B hybrids. Among the individual hybrid groups, the highest yielding hybrid group was a B × R group (G3 × G5), which produced 38.6 g plant−1, significantly (P < 0.05) higher than the yields of 19 of the other 20 hybrid groups. The G3 × G6 hybrid group, which is also a B × R group, was the second highest yielding group, but its yield was significantly higher than the yields of only 12 of the other 19 hybrid groups.

Hybrid yield heterosis and combining ability

All four sources of variation, including environment, genotype, GEI and replication-within-environment, were significant (from P < 0.05 to P < 0.0001) for MPH and BPH. GEI was the major source of variation, contributing 36 and 38 % of the total sum of squares for MPH and BPH, respectively (Table 3). The variation patterns of MPH and BPH were very similar, i.e., GEI was the major source contributing to the total sum of squares, followed by genotype, and environment was the least influential component for MPH and BPH. As for the yield heterosis over checks, only three major sources of variation (genotype, GEI and replication-within-environment) were statistically significant (either P < 0.01 or P < 0.0001). The environment effect was not a significant contributor to the total variation of SDHI and SDHH, even though it accounted for a relatively large portion of the total sums of squares. GEI was still the most important factor affecting the heterosis over checks.

The hybrids had a significant (P < 0.05) yield advantage of 4.3 g plant−1 (14.8 %) over their parents across environments, with an average of 18.1 % and a range of −29.6 to 53.2 % for MPH, and an average of 3.5 % and a range of −37.9 to 34.3 % for BPH across environments (Fig. 3b, c). High MPH and BPH were observed in the environments of the 2011DS at General Santos and the 2010WS at Los Baños (Table 2) because of the relatively low yields of the parents in these two environments. On average, the hybrids outyielded the inbred check by 5.6 %, with a range of −38.0 % to 36.4 %, but yielded 4.5 % less than the hybrid check. The highest hybrid yield advantage over commercial varieties was observed at 2011DS, Hyderabad, where the hybrids yielded 16.7 and 15.7 % more than the inbred and hybrid CKs, respectively, indicating a better environment than Los Baños and General Santos in the Philippines for hybrid rice extension. The inter-group hybrids had significantly (P < 0.001) higher yield and yield heterosis than the intra-group hybrids. Among the 9 hybrid groups that produced higher than the average yield (33.3 g plant−1) of all experimental hybrids, one was from R × R, two were from B × B and the rest were all from B × R groups. The hybrids in the G3 × G5 group had the highest yield, yield heterosis and combining ability among the hybrid groups, followed by the G3 × G6 hybrid group. It was noted that the G4 × G6 hybrids had relatively high MPH and BPH, but they were non-competitive compared with the commercial checks. The R × R hybrids yielded well but had low MPH and BPH because of their relatively high-yielding parents. The B × R hybrids did not differ significantly in yield from the R × R hybrids, but were significantly better than the R × R hybrids in yield advantage over their parents and in combining ability.

Fig. 3

Performance of hybrid grain yield (a), mid-parent heterosis (b), and better-parent heterosis (c)

Thirty-three hybrids (21.6 %) among the 153 experimental hybrids outyielded the inbred check by more than 15 %. The parents involved in those 33 heterotic hybrids were mainly from the groups G3 (33.3 %), G5 (21.2 %), G2 (19.7 %), and G6 (13.6 %). The parents in the G1 and G4 groups contributed only 4.5 and 7.6 %, respectively, to those 33 heterotic hybrids, and no hybrid derived from the parents IR62829B, IR70368B, or V20B was among those top-yielding hybrids.

The parent IR64R had the highest GCA for yield, followed by IR69712-154-2-3-1-3R; they belong to the R-line groups G2 and G3, respectively (Table 1). With the exception of IR72102-4-159-1-3-3R, the R-lines had positive GCA for yield. Among the 12 B-lines, only 3 had positive GCA values for yield. The line with the lowest GCA for yield was V20B, which is in the G1 group and had the lowest parental yield. The crosses of G3 × G5, G3 × G6, G2 × G6 and G2 × G5 had high SCA values, hybrid yield and yield heterosis. On average, SCA values of the B × R groups were better than those of the B × B and R × R groups, and those of inter-group crosses were better than those of intra-group crosses.

Association of hybrid yield heterosis and marker-based genetic distance

Correlation analyses were performed to detect associations between hybrid performance and parental genetic distance based on SSR markers for each environment and across environments (Table 4). No significant correlations were found between GD and hybrid yield in any single environment or across environments, but some significant correlations with SCA, MPH and BPH were found in particular environments and across environments, either for all individual hybrids or for particular hybrid groups based on parental fertility categories. For all of the association tests, GD was either not significantly associated, or significantly but only weakly associated, with SCA, MPH and BPH. The highest correlation was that of GD with MPH at 2010WS, General Santos in the B × R group (r = 0.4221, P < 0.01). However, the correlations between GD and hybrid SCA and heterosis increased greatly when they were calculated on marker-based groups rather than on the basis of individual hybrids or groups based on parent fertility categories. Across the five environments, the R² values of correlations of GD with SCA, MPH and BPH increased almost twofold as compared to the R² values based on all individual hybrids or in the B × R group.
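
The relationship between the correlation coefficient r and the R² values discussed above can be illustrated with a short sketch. It computes a Pearson correlation in plain Python and squares it. The study itself used PROC CORR in SAS; the function name `pearson_r` and the GD and MPH values below are made-up illustrations, not data from Table 4.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up genetic distances and mid-parent heterosis values for five hybrids.
gd = [0.20, 0.35, 0.45, 0.50, 0.60]
mph = [5.0, 12.0, 10.0, 20.0, 18.0]

r = pearson_r(gd, mph)
r_squared = r ** 2  # share of variance in MPH linearly associated with GD
```

By the same arithmetic, the highest reported correlation (r = 0.4221) corresponds to an R² of about 0.18, so GD accounted for less than one-fifth of the variation in MPH even in the best case for individual hybrids.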

Table 4 Correlation coefficient (r) between SSR marker-based genetic distance and hybrid yield and yield heterosis

Discussion

Parental grouping based on molecular markers

The parents we selected were based on a study investigating genetic diversity of IRRI-bred hybrid rice parents and are considered to be a fair representation of the original parent population as they maintained the same cluster structure and similar allelic variation as those in the original population. In general, the GD among inter-group parents was higher than that among intra-group parents. However, parents of two B-line intra-groups (G4 and G5) had high genetic variation, as they showed GD values similar to those of the parents in inter-groups and as shown in the cluster structure. The parents in the B × R groups had the highest genetic divergence, which is desirable for making heterotic hybrids within parental categories of current IRRI hybrid rice germplasm. The parental cluster shows that some IRRI B-lines (G1 and G4) were genetically close to the R groups (G2 and G3) (Fig. 1), revealing a high degree of genetic similarity with R-lines that resulted from insufficient attention given to genetic diversity in B-line breeding. Historically, the majority of IRRI B- and R-lines developed at earlier stages were derived directly from inbred breeding programs, shared many common ancestors, and were selected under the same environment with similar agronomic criteria without further breeding, which resulted in a relatively high genetic uniformity among hybrid rice parents. This could be one of the reasons for the lower hybrid rice heterosis observed in the tropics than in the subtropics. In the last few years, this issue has been addressed by separating the hybrid breeding from inbred breeding programs and by developing B- and R-line heterotic groups individually to maximize genetic diversity among hybrid rice parents.

Yield and yield heterosis

It is noteworthy that significant GEI effects were detected for both parent and hybrid yields and for heterosis, owing to differential responses to environments modulated by variation in climate, soil, and cultural practices across locations and seasons. Evaluating the GEI effect is an important component of hybrid rice breeding for developing products adapted to different environments. It is difficult to develop a “universal high-yielding or high heterotic” hybrid for different environments in the tropics because of a more divergent and fluctuating environment as compared to the subtropics. The success story of Shanyou 63 hybrid rice, which occupied about 50 % of 17 million hybrid rice hectares annually in China during the 1990s to 2000s, could never happen in the tropics. Some varieties, such as IR64R, have relatively high and consistent yield or combining ability across environments because of their wide suitability in tropical environments. This kind of germplasm is suitable for developing widely adapted hybrid rice products. Some varieties, however, always perform poorly in the tropics, especially those imported from the subtropical and temperate regions. Use of these parents directly in hybrid rice development in the tropics is questionable, even though they are elite parents in a particular cropping region, such as V20B, which was a prominent and widely used hybrid rice parent in China from the 1980s to 2000s. A successful parent must have some degree of adaptedness to the area where the hybrid is grown (Troyer 2006). High yield potential and high combining ability of a parent could fail to meet the expectation of hybrid heterosis once the parent is moved to a poorly adapted cropping environment.

On average, our hybrids produced 14.8 % higher yield than their parents, with a wide range of variation in yield and heterosis, indicating a good opportunity to select heterotic hybrids for production. Based on an acceptable 15 % of SDHI (a general cutoff of hybrid rice yield advantage over an inbred rice variety for commercial production), there is a 21.6 % chance of our experimental hybrids being qualified as commercial hybrids without consideration of seed production, grain quality, and other factors. The opportunity for success is even higher if high-yielding parents are selected because of high and significant correlations between hybrid yield and MPH (r = 0.6179, P < 0.0001) and BPH (r = 0.6607, P < 0.0001) as revealed in this study.

Heterotic groups based on markers

Genetic diversity estimates are helpful in classifying germplasm into heterotic groups for hybrid crop breeding (Menz et al. 2004). Previously, the relative performance of inbred lines of known origin and pedigree, which largely relies on breeders’ empirical experience, was commonly used to combine parents from different genetic backgrounds to develop heterotic hybrids. Molecular markers have been used in rice to assess the genetic relationships of rice ecotypes or sub-species (Garris et al. 2005; McNally et al. 2009; Zhao et al. 2010; Ali et al. 2011; Thomson et al. 2012) and hybrid rice parents (Xu et al. 2002); however, information is scarce on assessing heterotic groups among tropical rice inbred lines and populations, and no conclusive study has been conducted to clearly define heterotic groups of tropical hybrid rice parents. Many of those studies investigating genetic diversity in rice with molecular markers dealt with large pools of sub-species or ecotypes, such as aus, indica, aromatic, temperate japonica, and tropical japonica from rice germplasm collections, and are of limited value to practical hybrid rice breeding because hybrids between sub-species often fail to produce yield heterosis owing, for example, to excessive vegetative growth and partial fertility. Heterotic groups that are applied in breeding and production are different from the parental groups generated from germplasm collections based on molecular markers. It is still a challenge to find agreement between high, producible yield heterosis and high divergence among rice sub-species or ecotypes.

Our study generally agrees with the conclusion derived from many studies of hybrid crops that the correlation between marker-based GD and hybrid performance is too small to be used for predicting hybrid performance (Dudley et al. 1991; Xie 1993; Zhang et al. 1995; Saghai Maroof et al. 1997; Zhao et al. 2010). However, our present results showed that the association and prediction could be enhanced when parental groups are formed first by molecular markers. This approach may not predict the best hybrid combination, but it has practical value for assigning existing and new hybrid rice germplasm into heterotic groups and for increasing opportunities to develop desirable hybrids from the best heterotic groups, which is consistent with a previous study in maize (Lanza et al. 1997).

Heterotic groups in tropical indica hybrid rice

Previous studies related to heterotic groups in hybrid rice (Yuan 1977; Liu et al. 2002; Wang and Lu 2006a, b; Wang and Lu 2007; Xu et al. 2002) drew the following general conclusions: (1) two major heterotic groups in indica WA three-line hybrid rice, i.e., early season indica varieties from central and southern China and indica varieties from Southeast Asia, mostly from IRRI; (2) two major groups as represented by Xie Qinzhao and Zhenshan97 in the female pool, and two major groups as represented by IR24 and IR26, and some other breeding lines with IRRI variety ancestors, such as Ming Hui 63 (an IR30 offspring) in the male pool; and (3) R-lines are more divergent than B-lines (in China). However, a previous study showed that all WA R-lines used in China could be considered as one single group because they shared many ancestors and clustered closely together (Xie et al. 2012) with similar heterotic response to WA-based female parents. Our present study also shows that IRRI-bred B-lines are more divergent genetically than the R-lines.

Opportunities are generally high to obtain superior hybrids derived from parents from an inter-population rather than parents from an intra-population. All but one of the 33 heterotic hybrids identified in this study were from inter-population crosses. Hybrids with high MPH or BPH are not necessarily commercially competitive because of low-yielding parents, such as the hybrid of V20B × IR80151B, which had the highest MPH and BPH, but low SDHI (0.3 %) and SDHH (9.1 %). The ultimate goal of hybrid rice breeding is to produce a hybrid with high yield and high heterosis over the parents and commercial checks.

For the tropical hybrid rice parents generated from IRRI, two heterotic groups could be classified based on current B- and R-line germplasm categories, as represented by parents in G5 and G6 as females and parents in G2 and G3 as males (Table S1). The parents with low or no possibility of producing heterotic hybrids, such as those in the G1 and G4 groups, have limited value for developing heterotic hybrids within current IRRI hybrid rice germplasm and they have to be further improved to combine with germplasm from other possible heterotic pools or to be changed to diverge from current R-line pools. G5 and G6 B-line groups combined with G2 and G3 R-line groups are our preferred choices of hybrids for current IRRI-bred hybrid rice germplasm. On average, the hybrids derived from these four crossing patterns produced 27.4, 11.9, and 14.9 % of MPH, BPH, and SDHI, respectively. We also tracked the pedigrees of all 11 IRRI-bred hybrids released in the Philippines for commercial production, and found that 6 parents were from G2 and G3 (36 %), 10 parents were from G5 and G6 (59 %), and only 1 parent was from G1, but none from G4. It should be noted that the core set of parents from this study generally fits tropical Asian environments based on current IRRI hybrid rice parents. Further heterotic groups could be changed or enhanced when new parents with new traits/germplasm are integrated and adapted to the targeted cropping region.