Introduction

Utilization of heterosis has become a major strategy for increasing plant productivity, especially in rice, to feed an ever-increasing human population on decreasing arable land (Masood et al. 2005). There have been many reports on heterosis since Schnell et al. (1982) pointed out that heterosis was a major yield factor in all breeding categories except line breeding. The yield components of F1 hybrids have usually been used to explore methods for estimating and predicting yield heterosis in rice (Xangsayasane et al. 2010; Melchinger et al. 2008; Gartner et al. 2009; Cho et al. 2004), and a full or partial diallel cross has often been conducted to define heterosis and heterobeltiosis (Zhang et al. 1994; Torres and Geraldi 2007). The level of heterosis is clearly a function of the combination of the two parents used for offspring production. This poses a major challenge for plant breeders, as several thousand parental combinations usually have to be tested to identify the best ones (Gartner et al. 2009; Umakanta 2002).

Breeding strategies based on the selection of hybrids require an expected level of heterosis as well as specific combining ability (SCA), and the performance of F1 hybrids depends on the choice of parents. But how can parental lines that will give heterotic combinations be chosen without making all possible crosses among the potential parents? Several methods have been tried with variable success, including per se performance, genetic diversity as determined through geographic origin, multivariate analysis using morphological and agronomic traits, isozyme and molecular polymorphism, combining ability and mitochondrial complementation. Among these methods, per se performance is based on the assumption that, in general, high-yielding parents produce a larger proportion of high-yielding hybrids than do low-yielding parents (Mohammadi et al. 2008). In the study of Melo et al. (2001), which involved ten commercial single-cross hybrids of different origins, a strong correlation was detected between per se performance and SCA, and between per se performance and heterosis. However, the correlation between SCA and genetic distance was not significant.

Molecular markers have been used in rice to reveal the genetic relationships among ecotypes or sub-species and among hybrid parents (Xangsayasane et al. 2010; Xiao et al. 1996; Xu et al. 2002). The assessment of genetic diversity is important in plant breeding. Molecular marker-based genetic diversity analysis (MMGDA) has potential for assessing changes in genetic diversity over time and space (Duwvick 1984), and studies on MMGDA are numerous and reflect potential applications in rice (Ravi et al. 2003; Xu et al. 2002). However, a poor relationship between molecular marker-based genetic distance and hybrid performance has been observed (Hua et al. 2002). Nevertheless, Xie et al. (2013) defined heterotic groups among 168 tropical indica hybrid rice germplasm accessions from the International Rice Research Institute (IRRI) using 207 SSR markers. Their result was supported by the eleven IRRI-bred commercial hybrids released in the Philippines, showing that association and prediction could be enhanced when parental groups are first formed by molecular markers.

The objectives of the present research were to (1) evaluate the magnitude of yield heterosis among lines grouped by molecular markers and yield performance; (2) examine the consistency between marker-based grouping and yield performance; (3) analyze combining ability effects for yield and yield components of the selected lines and their hybrids; and (4) explore the use of positive loci (PLs), including effective-increasing loci (ILs) and effective-decreasing loci (DLs), for predicting heterosis for yield and yield components.

Materials and methods

Plant materials and field experiments

In line with the long-term goal of IRRI’s hybrid breeding program, 688 IRRI inbred lines were evaluated by IRRI in six sets of yield trials over three previous consecutive seasons: the dry season of 2007 (07DS), the wet season of 2007 (07WS) and the dry season of 2008 (08DS). According to their yield performance records (data provided by IRRI, not shown), the five lines with the lowest yield (L group) and the five with the highest yield (H group) from each trial set were re-collected, giving 60 lines in total (Huang et al. 2013). Twelve rice lines were further selected, based on the unweighted neighbor-joining tree of the 60 IRRI inbred lines as well as their yield performance records (Huang et al. 2013), to establish the 12 × 12 diallel crosses. Table 1 lists the parental lines. They were selected following the principle of Xie et al. (2013) that the lines should (1) represent the original groups clustered from the simple sequence repeat (SSR) and intron length polymorphism (ILP) markers; (2) cover a maximum of the allelic variation of the original 60 lines; and (3) consist of 6 lines with the highest yield record (H group) and 6 with the lowest yield record (L group) according to the previous yield performance records (Huang et al. 2013). All possible crosses except reciprocals were made; accordingly, 66 F1 hybrids were developed from the diallel crosses at the IRRI experimental station in the 2011 wet season. All F1 hybrids and their parents were planted in the 2011 dry season in a randomized complete block design with 3 replications. The inbred line PSB Rc82 and the hybrid combination IR75217H (Mestizo 3) were used as controls. Each plot contained 36 plants at a spacing of 25 × 20 cm. Five plants were randomly chosen for evaluation of yield and yield components, i.e. number of effective panicles per plant (NEP) bearing more than 10 grains, total number of grains per panicle (TNGP), filled grain percentage (FGP), thousand kernel weight (TKW) and grain weight per plant (GWP).

Table 1 Rice parents used in the half diallel crosses

Statistical analysis

Seventy SSR and thirty-six ILP markers were used in this research. Banding profiles generated by all markers were compiled into a binary data matrix based on the presence (1) or absence (0) of the selected band. Only clear and unambiguous bands were scored. Similarity coefficients were calculated for all pairwise combinations of the 12 lines according to the method developed by Nei and Li (1979): $S_{ij} = 2N_{ij}/(N_i + N_j)$, where $S_{ij}$ is the similarity between genotypes $i$ and $j$, $N_{ij}$ is the number of bands shared by $i$ and $j$, and $N_i$ and $N_j$ are the numbers of bands present in $i$ and $j$, respectively. Genetic distances (GDs) were calculated as $GD = 1 - S_{ij}$ (Zhang et al. 2007). Neighbor-joining analysis of the similarity matrix, using the unweighted pair group method with arithmetic averages (UPGMA), was performed in DARwin 5.0 (Perrier et al. 2003). Least-square means based on a general linear model were used to estimate the general combining ability (GCA) and SCA effects of the F1 crosses according to Griffing’s experimental model I, method 4 (Griffing 1956), using the software DIALLEL-SAS05 (Zhang et al. 2005). Heterosis for yield and yield components was calculated as (1) mid-parent heterosis, $MPH = (F_1 - MP)/MP \times 100$; (2) better-parent heterosis, $BPH = (F_1 - BP)/BP \times 100$; and (3) over-CK heterosis, $CKH = (F_1 - CK)/CK \times 100$, where $F_1$, $BP$ and $CK$ refer to the performance of the F1 hybrid, the better-performing parent and the control, respectively, and $MP = (P_1 + P_2)/2$, where $P_1$ and $P_2$ are the performances of the two parents.
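
The distance and heterosis formulas above can be illustrated with a minimal Python sketch. The function names, band scores and trait values below are hypothetical and chosen only for illustration; the original calculations were done with DARwin 5.0 and other software, not with this code.

```python
def nei_li_distance(bands_i, bands_j):
    """Return GD = 1 - S_ij, with S_ij = 2*N_ij / (N_i + N_j) (Nei and Li 1979).

    bands_i, bands_j: equal-length lists of 0/1 scores (1 = band present).
    """
    if len(bands_i) != len(bands_j):
        raise ValueError("band vectors must have the same length")
    n_ij = sum(1 for a, b in zip(bands_i, bands_j) if a == 1 and b == 1)
    n_i = sum(bands_i)  # bands present in line i
    n_j = sum(bands_j)  # bands present in line j
    if n_i + n_j == 0:
        raise ValueError("no bands scored in either line")
    return 1.0 - 2.0 * n_ij / (n_i + n_j)


def heterosis(f1, p1, p2, ck):
    """Return (MPH, BPH, CKH) in percent, assuming a higher trait value is better."""
    mp = (p1 + p2) / 2.0   # mid-parent value
    bp = max(p1, p2)       # better-performing parent
    mph = (f1 - mp) / mp * 100.0
    bph = (f1 - bp) / bp * 100.0
    ckh = (f1 - ck) / ck * 100.0
    return mph, bph, ckh


# Hypothetical band scores for two lines at six loci.
line_a = [1, 1, 0, 1, 0, 1]
line_b = [1, 0, 1, 1, 0, 0]
gd = nei_li_distance(line_a, line_b)

# Hypothetical grain weights per plant (g) for a hybrid, its parents and the control.
mph, bph, ckh = heterosis(f1=30.0, p1=20.0, p2=25.0, ck=24.0)
print(round(gd, 4), round(mph, 2), round(bph, 2), round(ckh, 2))
```

In this toy case the two lines share 2 of their 4 and 3 bands, so the similarity is 4/7 and the distance is 3/7.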

Screening positive and effect-increasing/decreasing loci

For each locus, all the F1 hybrids were divided into two groups, a homozygous group (HO) and a heterozygous group (HE), according to the methods described by Zha et al. (2008). If an F1 hybrid was homozygous (or heterozygous) at a locus, as judged from the genotypes of its parents, it was assigned to HO (or HE). For each locus, the difference in trait performance between HE and HO was calculated. The locus was considered a positive locus (PL) for the trait if the difference was statistically significant. A PL was further defined as an effective-increasing locus (IL) if the performance of HE was significantly higher than that of HO, or otherwise as an effective-decreasing locus (DL). These calculations were done in Microsoft Excel 2007 as described by Zha et al. (2008). GDs based on PLs, ILs and DLs were also calculated. The Pearson correlation coefficients between each type of GD and the corresponding F1 performance, heterosis and specific combining ability were calculated with SAS PROC CORR (SAS Institute Inc. 2002).
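
The screening rule can be sketched in Python as follows. The text does not name the significance test that was used, so the exact permutation test here is an assumption made for illustration, as are the function names, the 0.05 threshold and the trait values. Exact enumeration is only practical for small groups; with 66 hybrids a t-test or a sampled permutation test would be needed instead.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(he_values, ho_values):
    """Two-sided exact permutation p-value for the difference between group means."""
    pooled = list(he_values) + list(ho_values)
    n_he, n_ho = len(he_values), len(ho_values)
    observed = abs(mean(he_values) - mean(ho_values))
    total = sum(pooled)
    hits = 0
    count = 0
    # Enumerate every way of relabelling n_he of the hybrids as "HE".
    for idx in combinations(range(len(pooled)), n_he):
        s = sum(pooled[i] for i in idx)
        diff = abs(s / n_he - (total - s) / n_ho)
        if diff >= observed - 1e-9:  # tolerance for floating-point rounding
            hits += 1
        count += 1
    return hits / count


def classify_locus(he_values, ho_values, alpha=0.05):
    """Return 'IL', 'DL' or 'NS' (not a positive locus) for one locus and one trait."""
    if not he_values or not ho_values:
        return "NS"  # cannot compare when one group is empty
    if exact_permutation_p(he_values, ho_values) >= alpha:
        return "NS"
    return "IL" if mean(he_values) > mean(ho_values) else "DL"


# Hypothetical grain weights (g) of hybrids that are heterozygous (HE)
# or homozygous (HO) at one marker locus.
he = [30.1, 31.5, 29.8, 32.0]
ho = [24.0, 23.5, 25.1, 24.4]
print(classify_locus(he, ho))  # HE clearly above HO
print(classify_locus(ho, he))  # roles swapped
```

A locus classified as "IL" or "DL" counts as a PL for that trait; loci returning "NS" are excluded from the PL-based distances.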

Results

Genetic distances and clustering analysis of parents

An example of the SSR and ILP patterns generated by the representative primers RM236 and RI01970 is shown in Fig. 1. All bands, except the faint and ambiguous ones, were used for the analysis of GDs and clusters. GDs revealed by SSR and ILP markers for the 60 IRRI rice lines ranged from 0.0880 to 0.7771, with an average of 0.6007. The GDs of the 12 selected lines varied from 0.2288 to 0.7169, with an average of 0.5882, which was slightly lower than that of the 60 lines. Based on the criteria of parental selection (Xie et al. 2013), the selected lines were nevertheless considered fairly representative of the allelic variation and the cluster structure of the original population (Fig. 2a). The 12 selected lines were distinctly clustered into four groups (Fig. 2b): lines with a low-yield performance record (L) were in groups I and III, and those with a high-yield performance record (H) were in groups II and IV (Table 1). The average inter-group GD (0.63149) was significantly higher than the average intra-group GD (0.3401) (P < 0.001). The highest GD (0.6602) was between group II (H) and group III (L), while the lowest (0.2794) was within group IV (H).

Fig. 1 SSR (a, RM236) and ILP (b, RI01970) profiles of the 60 lines generated with primers. M is the DNA marker 1000

Fig. 2 Unweighted neighbor-joining tree of (a) the 60 IRRI rice lines and (b) the 12 lines selected for diallel crosses, based on Nei’s genetic similarity derived from 70 SSR and 36 ILP markers

Parental and hybrid performance

The performance of the 66 hybrids for yield and yield-related traits was analyzed (Table 2). For each trait, the differences among the hybrids were significant (p < 0.001). The five highest GWP values were found in the hybrids HET53/HET05 (53.4 g), HET25/HET20 (49.3 g), HET58/HET04 (36.9 g), HET59/HET26 (36.6 g) and HET58/HET07 (36.5 g), while the lowest GWP values, from 20.6 to 22.8 g, were in HET53/HET04, HET12/HET05, HET48/HET04, HET26/HET04 and HET05/HET04. The hybrids derived from the parental lines HET20, HET25 or HET05 had elite yield performance. The average FGP of all the hybrids was high (84.4 %). Four hybrids, i.e. HET53/HET07, HET26/HET20, HET53/HET26 and HET33/HET07, had the highest FGP (>91 %). TKW ranged from 23.0 g (HET26/HET07) to 28.9 g (HET53/HET20). TNGP ranged from 92.3 (HET05/HET04) to 160.3 (HET58/HET33), with an average of 122.0. The average NEP was 11.4, with the lowest value in HET59/HET05 (8.5) and the highest in HET53/HET05 (18.7). For each trait of yield and yield components, the performance of the hybrids was significantly higher than that of the control. For each yield component, there was no significant difference among the four clustering groups, nor between or within the H and L groups. This indicated a poor relationship between the performance for yield components and the clustering groups revealed by molecular markers.

Table 2 Record of number of effective panicle per plant (NEP), total number of grains per panicle (TNGP), filled grain percentage (FGP), thousand kernel weight (TKW) and grain weight per plant (GWP) for all hybrids from the 12 × 12 diallel crosses

Combining ability effects

The GCA effect, which reflects the average performance of a parent in crosses and is an estimate of its breeding value, was estimated for each parental line (Table 3). For NEP, HET05 and HET25 from group I showed significant positive GCA effects, while HET12 from group III and HET59 from group IV had significant negative effects. Hybrids derived from parents with higher GCA effects also had higher NEP values; a similar pattern was found for TNGP, FGP and TKW. For GWP, HET04 showed a significant negative effect, while HET20, HET05 and HET25 had positive but non-significant GCA effects.

The SCA effect was then calculated to judge the usefulness of a particular cross in exploiting heterosis (Table 4). For GWP, four of the top five positive SCA effects, those of the hybrids of HET48 crossed with HET05, HET12, HET53 and HET59, were statistically significant, while the SCA effect of the cross between HET48 and HET20 was significantly negative and was the minimum value. This indicated that HET48 could be used to select elite hybrids in specific combinations with other parents. For FGP, the maximum SCA effects (significantly positive) were found in the crosses HET12/HET25, HET07/HET33, HET07/HET25 and HET26/HET59, while the minimum values (significantly negative) were in HET07/HET20 and HET20/HET25. For TKW and NEP, the SCA effects were not significant, except for the maximum positive values in HET07/HET25 and HET05/HET53 (TKW) and the minimum negative value in HET26/HET59 (NEP).
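
For readers unfamiliar with how GCA and SCA effects are obtained from a half diallel without parents, the following Python sketch applies the standard Griffing method 4 estimators to cross means. It is an illustration under simplifying assumptions (balanced data, one mean per cross, no replication or significance tests) and is not the DIALLEL-SAS05 procedure used in this study; the parent names and values are hypothetical.

```python
from itertools import combinations


def griffing_method4(parents, cross_means):
    """Estimate GCA and SCA effects from a half diallel of F1 means (no parents).

    parents: list of p parent names (p >= 3).
    cross_means: dict mapping a tuple (a, b), ordered as in `parents`,
                 to the mean trait value of that cross.
    """
    p = len(parents)
    if p < 3:
        raise ValueError("method 4 needs at least three parents")
    crosses = list(combinations(parents, 2))
    grand_total = sum(cross_means[c] for c in crosses)
    # Array total of each parent: sum over all crosses that involve it.
    totals = {a: sum(cross_means[c] for c in crosses if a in c) for a in parents}
    gca = {a: (p * totals[a] - 2.0 * grand_total) / (p * (p - 2)) for a in parents}
    sca = {
        (a, b): cross_means[(a, b)]
        - (totals[a] + totals[b]) / (p - 2)
        + 2.0 * grand_total / ((p - 1) * (p - 2))
        for (a, b) in crosses
    }
    return gca, sca


# Hypothetical cross means for four parents (six crosses).
parents = ["A", "B", "C", "D"]
cross_means = {
    ("A", "B"): 10.0, ("A", "C"): 12.0, ("A", "D"): 14.0,
    ("B", "C"): 11.0, ("B", "D"): 13.0, ("C", "D"): 18.0,
}
gca, sca = griffing_method4(parents, cross_means)
print(gca)
print(sca)
```

By construction the GCA effects sum to zero over parents, and the SCA effects of each parent sum to zero over its crosses.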

Table 3 Estimates of general combining ability (GCA) of parents for number of effective panicle per plant (NEP), total number of grains per panicle (TNGP), filled grain percentage (FGP), thousand kernel weight (TKW) and grain weight per plant (GWP)

Screening of PLs, ILs and DLs

A total of 337 loci were detected in the parents, from which PLs, effective-increasing loci (ILs) and DLs were screened. The number of PLs ranged from 15 (NEP) to 81 (TKW) (Table 4). The PLs consisted of ILs and DLs. Only five ILs were screened for NEP, while the highest number, 29, was screened for TKW; the number of DLs ranged from 5, for GWP, to 52.

Table 4 Estimates of specific combining ability (SCA) of each cross for number of effective panicle per plant (NEP), total number of grains per panicle (TNGP), filled grain percentage (FGP), thousand kernel weight (TKW) and grain weight per plant (GWP)

Relationships of GDs with F1 performance and SCA effect

In this study, Nei’s GDs were calculated based on four types of loci, i.e. PLs, ILs, DLs and all loci. The Pearson correlation coefficients of the GDs with F1 phenotypic performance and SCA effects are shown in Table 5. The correlations of GDs based on DLs with F1 phenotypic value and SCA effect were negative, whereas the correlations of GDs based on ILs with F1 phenotypic value and SCA effect were significantly positive. GDs based on PLs or all loci showed significant positive correlations with F1 phenotypic value and SCA effect only for TKW and GWP. These results indicated the usefulness of ILs for predicting positive SCA effects and phenotypic values of yield and yield components.
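
As a rough illustration of this analysis, the sketch below computes a Nei and Li distance restricted to a chosen subset of loci (for example the ILs of one trait) and a Pearson correlation between such distances and F1 values. All names and numbers are hypothetical; the published correlations were obtained with SAS PROC CORR.

```python
from math import sqrt


def gd_on_loci(bands_i, bands_j, loci):
    """Nei and Li distance between two lines using only the loci indexed in `loci`."""
    shared = sum(1 for k in loci if bands_i[k] == 1 and bands_j[k] == 1)
    total = sum(bands_i[k] for k in loci) + sum(bands_j[k] for k in loci)
    if total == 0:
        raise ValueError("no bands present at the selected loci")
    return 1.0 - 2.0 * shared / total


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Distance between two hypothetical parents using loci 0-2 only.
parent_1 = [1, 1, 0, 1, 0, 1]
parent_2 = [1, 0, 1, 1, 0, 0]
gd_subset = gd_on_loci(parent_1, parent_2, loci=[0, 1, 2])

# Hypothetical IL-based distances of five crosses and their F1 grain weights (g).
gd_il = [0.2, 0.4, 0.5, 0.7, 0.9]
f1_gwp = [22.0, 25.0, 24.0, 30.0, 33.0]
r = pearson_r(gd_il, f1_gwp)
print(gd_subset, round(r, 3))
```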

Table 5 Pearson correlation of genetic distances based on four types of loci (DLs, ILs, PLs and all loci) with phenotypic values and SCA effect values derived from the 12 × 12 diallel crosses

Relationships between GDs and heterosis

Since heterosis is a general phenomenon in the entire plant kingdom, a promising approach is to investigate its molecular basis in rice as a model species. Therefore, correlations of heterosis with GDs based on the four types of loci were assessed for yield and yield components (Table 6). In the 12 × 12 diallel crosses, GDs based on DLs for NEP were significantly negatively correlated with the three types of heterosis, i.e. BPH, CKH and MPH (Table 7). GDs based on ILs were significantly positively correlated with heterosis, except for BPH and MPH for both FGP and GWP. Notably, the correlation between GD based on ILs and CKH for TKW was strong (r = 0.671, P < 0.01). In contrast, GDs based on PLs or all loci were poorly correlated with heterosis, except with CKH and MPH for TKW. GDs based on ILs thus showed higher correlations with heterosis than those based on PLs or all loci, suggesting that ILs could be used in further studies on the prediction of heterosis for yield and yield-related traits.
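
The significance of a correlation such as r = 0.671 can be checked with the usual t statistic for a Pearson coefficient. The worked example below assumes that the correlation was computed across all n = 66 hybrids, which the text does not state explicitly.

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{0.671 \times \sqrt{64}}{\sqrt{1-0.671^{2}}}
  = \frac{5.368}{0.7415}
  \approx 7.24, \qquad \mathrm{df} = n - 2 = 64
```

Under that assumption the statistic is well above the two-sided critical value of about 2.65 for P = 0.01 with 64 degrees of freedom, which is consistent with the reported significance level.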

Table 6 Pearson correlation of genetic distances based on four types of loci (DLs, ILs, PLs and all loci) with heterosis for yield and yield components derived from the 12 × 12 diallel crosses
Table 7 Pearson correlations between GSCA and SCA effect derived from the 12 × 12 diallel crosses

Discussion

Diallel mating designs are important tools in plant breeding programs for obtaining information on the inheritance of quantitative traits. A partial diallel consisting of crosses between two groups of parents is useful when there are distinct groups of populations and the breeder is not interested in evaluating crosses between parents of the same group. Variety diallel crosses have been widely used in plant breeding as a practical means of evaluating the combining ability and heterotic patterns of varieties or inbred lines in cross combinations. In this research, half diallel crosses were used to analyze combining ability effects and heterosis for agronomic traits. We found that the hybrids derived from parents with high GCA for GWP, TKW and FGP performed well for the corresponding traits. Based on these results, GCA may be used as a guide for breeding elite hybrid combinations when GCA values are estimated under the same conditions, or may be used to predict heterosis for a specific breeding target. Hybrid breeding can also be called GCA breeding; many reports support this concept, especially in the prediction of yield performance (Gopal et al. 2008; Worku et al. 2008), disease resistance and drought tolerance (Yanchuk 1996). Even lines with low GCA could be used as test lines in breeding (Lobato-Ortiz et al. 2010).

SCA has also been reported to be useful for predicting heterosis (Ahangar et al. 2008; Ni et al. 2009; Torres and Geraldi 2007). In the present study, strong correlations were found between SCA and heterosis for most of the yield and yield-related traits. Nevertheless, it is impractical to use SCA for heterosis prediction, because SCA is specific to a cross between two particular parents. Once the SCA values of all crosses have been obtained through experiments such as diallel crosses, the phenotypic values of all the hybrids are already known, so a correlation analysis between SCA and heterosis becomes unnecessary. GCA differs from SCA in that it describes the combining ability of a parent for a trait. GCA is usually considered to be controlled by additive genes (Vacaro et al. 2002; Yang et al. 1997). It can be easily recombined and accumulated through gene flow, and its heritability is high (Lu 1999). We therefore propose a new concept, the general sum of combining ability (GSCA), as a substitute for SCA. The GSCA of a hybrid is the sum of the GCAs of its two parents. Using the GCAs of the twelve parents in Table 3, we calculated the GSCAs of the 66 hybrids for the five traits. The correlations between SCA and GSCA were then estimated, together with their correlations with phenotypic value and the three types of heterosis for each corresponding trait (Table 7). The correlation between GSCA and SCA was statistically significant for each trait of yield and yield components. The correlations between GSCA and phenotypic value, between SCA and phenotypic value, between GSCA and heterosis, and between SCA and heterosis were all significant. GSCA could therefore be expected to substitute for SCA in predicting promising combinations among parental lines. The obvious advantage of GSCA over SCA is that it is estimated from GCA rather than from a complicated crossing design. Breeders usually have many breeding lines with known GCA values; when they exchange lines with others, the GSCA can be calculated directly to estimate SCA values. This approach needs more supporting evidence, since it is based mainly on the present 12 × 12 half diallel crosses.
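
Because GSCA is simply the sum of two parental GCA values, candidate crosses can be ranked before any of them is made. A minimal Python sketch of that idea follows; the line names and GCA values are hypothetical and are not taken from Table 3.

```python
from itertools import combinations


def gsca_table(gca):
    """Return {(a, b): GCA_a + GCA_b} for every pair of lines with a known GCA."""
    lines = sorted(gca)
    return {(a, b): gca[a] + gca[b] for a, b in combinations(lines, 2)}


def rank_crosses(gca):
    """List candidate crosses from the highest to the lowest GSCA."""
    table = gsca_table(gca)
    return sorted(table.items(), key=lambda item: item[1], reverse=True)


# Hypothetical GCA effects for grain weight per plant (g) of four lines.
gca_gwp = {"A": -1.5, "B": -2.5, "C": 1.0, "D": 3.0}
ranking = rank_crosses(gca_gwp)
for cross, value in ranking:
    print(cross, value)
```

A ranking of this kind reflects only the parental GCA effects, so it would still need to be confirmed by field evaluation of the top-ranked crosses.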

Prediction of heterosis is of great importance to rice breeders, whose concern is how to achieve it with high efficiency and accuracy. Many methodologies have been attempted, but none has been ideal. Scientists have used the heterozygosity of positive molecular loci to predict heterosis. Zhang et al. (1994) pointed out that PLs were strongly correlated with yield and yield-related traits, and that heterozygosity at PLs was strongly correlated with mid-parent heterosis for yield and other traits. Zha et al. (2008) screened ILs and DLs from PLs in two sets of diallel crosses and found that GDs based on ILs were significantly correlated with the performance of F1s for most traits, with an average correlation of 0.76. The correlations were stable over different environments, except for GWP. He et al. (2002) reported that ILs could be used to predict heterosis for yield and yield-related traits. Although the correlation coefficients between DLs or ILs and phenotypic values, SCAs or heterosis were statistically significant, their magnitude was low, so the DLs or ILs could not be used directly for heterosis prediction. However, if stable DLs and ILs can be identified from different sets of diallel crosses under different environments, they may be useful for prediction in breeding.

In the present research, we found that ILs and DLs were significantly correlated with heterosis, except for a few traits. The frequencies of ILs for yield and yield-related traits among all the loci were also calculated. The ILP marker RI00297 had 9 ILs detected in the diallel crosses, RM149 had 6 ILs, and RM514, RM276, RM252, RM232 and RM216 had 5 each. SSR markers with high frequencies of ILs have been reported as QTLs for yield and yield components, such as RM149 and RM216 (Cho et al. 1991; Jiang et al. 2004), RM232 (Xiao et al. 1998), RM252 (Xu et al. 2001), RM276 (Temnykh et al. 2000), RM514 (Moncada et al. 2001) and RM213 (Brondani et al. 2002). According to this research, ILs could be used in further studies on the prediction of heterosis for yield and yield-related traits, and GSCA may be another useful parameter, in combination with ILs, for breeders selecting elite hybrids.