Introduction

Rice is a major staple food crop for more than half of the world’s population. When all developing countries are considered together, rice provides 27% of dietary energy supply and 20% of dietary protein intake (www.fao.org). The observed levelling-off of yield in rice cultivars, along with the adverse effects of climate and deteriorating environmental health in rice-growing areas, is of concern for today’s food security priorities. It is increasingly recognized that exploiting the gene pools of wild progenitor species is the fastest and most acceptable approach to achieving the twin goals of high productivity and adaptability in any crop (Gur and Zamir 2004; McCouch et al. 2007; Kovach and McCouch 2008; Swamy and Sarla 2008). It is therefore important to identify QTLs from wild × cultivated crosses and to introgress them into cultivated varieties through marker-aided selection (MAS). This would give impetus to marker-assisted breeding on the one hand and enable gene discovery on the other, in support of sustainable rice production (Fridman et al. 2004; Ashikari and Matsuoka 2006).

Yield is governed by several quantitative trait loci (QTLs) across the genome, and their effects change with the genomic and environmental context. So far, only a few yield QTLs have been used successfully in MAS (Liang et al. 2004). Most QTLs for yield have been mapped in early generations (F2, BC2, and BC2F2) and in a limited number of environments and genetic backgrounds. Use of such QTLs in MAS is less likely to give the desired results because of their inconsistent performance in subsequent generations and in different environments. However, it is difficult to evaluate mapping populations in all possible environments in order to identify large-effect QTLs that are consistent across genetic backgrounds.

Major-effect yield QTLs that are precisely located and identified at the same chromosomal position across studies are the most useful for MAS and for positional cloning to identify candidate genes (Swamy and Sarla 2008; Price 2006). Different approaches can be followed to find the precise location of common large-effect QTLs across studies. One approach is a bibliographic review of QTLs affecting a trait and their comparison across studies for co-location and effect, supported by statistical analysis and graphical representation (Chardon et al. 2004). Another approach is the joint analysis of raw data collected from mapping populations in several experiments. However, this second approach is generally not possible because raw data from individual studies are unavailable and data structures differ vastly.

QTL meta-analysis is an approach to identify consensus QTLs across studies, to validate QTL effects across environments and genetic backgrounds, and to refine QTL positions on the consensus map (Goffinet and Gerber 2000). It requires independent QTLs for the same trait obtained from different plant populations, different locations, or different environmental conditions (Goffinet and Gerber 2000). The consistent QTL identified by meta-analysis for a set of QTLs at a confidence interval of 95% is called a meta-QTL (MQTL). The meta-QTL with the smallest confidence interval (CI) and a consistent, large effect on the trait is useful in MAS. In plants, the concept of meta-analysis has been applied to the analysis of QTLs/genes for blast resistance (Ballini et al. 2008) and root traits in rice (Courtois et al. 2009), plant height in the Poaceae family (Lin et al. 1995), lint fiber length in cotton (Rong et al. 2007), cyst nematode resistance in soybean (Guo et al. 2006), fusarium head blight in wheat (Loffler et al. 2009), flowering time (Chardon et al. 2004) and drought tolerance in maize (Hao et al. 2010), and disease resistance in cocoa (Lanaud et al. 2009) (Table 1).

Table 1 Previous reports on meta-analysis of QTLs of different traits in crop plants

QTL regions harbor many genes; among them, a few key genes may be more important in the regulation of a complex trait. Meta-QTL regions with refined positions allow candidate genes to be shortlisted more accurately. Candidate genes that are shortlisted in common across the meta-QTLs are more likely to regulate yield. Superior alleles of such key genes can also be mined from different sources and incorporated in elite cultivars to develop new varieties.

In this study, QTL meta-analysis was carried out for yield QTLs reported from inter-specific crosses to develop a consensus map and to identify consensus yield QTLs. This should provide MQTLs with large effects and small CIs for possible use in MAS or for fine mapping to deduce candidate genes for gene discovery.

Materials and Methods

Consensus QTLs for yield from inter-specific crosses were identified in three main steps. First, reliable data on QTLs for yield were compiled in a bibliographic review of QTL mapping studies (Supplementary Table 1). Second, a consensus map was created and the QTLs of the individual studies were projected onto it. Third, a meta-analysis was performed on QTL clusters to identify the consensus MQTLs.

Bibliographic Review and Synthesis of Yield QTL Data

QTL information was collected from 11 published reports involving inter-specific crosses in rice. The details of the wild species, size of the mapping population, number of markers used, and yield QTLs identified are given in Table 2. These studies involved one of four wild species, Oryza rufipogon, Oryza grandiglumis, Oryza glumaepatula, or Oryza nivara, as the donor parent and Oryza sativa as the recipient parent. The O. sativa recipients were indica, tropical japonica, or temperate japonica. In all, 76 QTLs were reported for yield per plant or yield per plot, and the additive effect at each QTL was derived from either the wild allele or the cultivated allele (Xiao et al. 1998; Moncada et al. 2001; Septiningsih et al. 2003; Brondani et al. 2002; Marri et al. 2005; Tian et al. 2006; Yoon et al. 2006; Tan et al. 2007; Kaladhar 2006; Swamy 2008).

Table 2 Overview of yield QTLs identified from inter-specific crosses of rice

Development of Consensus Map

Genetic maps comprising a large number of genetic markers have been published in rice (McCouch et al. 2002; Temnykh et al. 2001). In the present study, the rice map of Temnykh et al. (2001) was used as the reference map, onto which the markers of the 11 studies were projected to develop a consensus map (Fig. 2). Chromosomes connected to the reference map by fewer than two common markers were excluded before creation of the consensus map.
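The exclusion rule can be illustrated with a minimal sketch. The marker names and map contents below are hypothetical placeholders, and this is not how Biomercator implements the check; the sketch only shows one way to count the markers shared between a study chromosome and the reference chromosome and to keep chromosomes that meet the two-marker threshold.

```python
# Hypothetical marker sets per chromosome (placeholder names, not real data).
reference_map = {
    1: {"mkA", "mkB", "mkC", "mkD"},
    2: {"mkE", "mkF", "mkG"},
}
study_map = {
    1: {"mkA", "mkC", "mkX"},  # shares two markers with the reference
    2: {"mkE", "mkY"},         # shares only one marker with the reference
}

MIN_COMMON = 2  # threshold stated in the text


def projectable_chromosomes(study, reference, min_common=MIN_COMMON):
    """Return the chromosomes of `study` that share at least `min_common`
    markers with the same chromosome of `reference`."""
    keep = []
    for chrom, markers in sorted(study.items()):
        common = markers & reference.get(chrom, set())
        if len(common) >= min_common:
            keep.append(chrom)
    return keep


kept = projectable_chromosomes(study_map, reference_map)
print(kept)
```

In this toy input only chromosome 1 would be retained for projection; chromosome 2 would be dropped because a single shared marker cannot anchor an interval.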

QTL Projections

For all the QTLs, 95% CI on their respective original maps was estimated using the approach described by Darvasi and Soller (1997):

$$ \mathrm{CI} = \frac{530}{N R^{2}} $$

where N is the population size and R² is the proportion of the phenotypic variance explained by the QTL. The CI was re-estimated to control for heterogeneity in CI calculation methods across studies. QTL positions were projected by using a simple scaling rule between the original QTL flanking marker interval and the corresponding interval on the consensus chromosome. For a given QTL position, the new CI on the consensus linkage group was approximated with a Gaussian distribution around the most likely QTL position. All projections of QTLs onto the consensus map were performed using Biomercator 2.0 (Arcade et al. 2004) (www.genoplante.org).
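As a worked illustration of the two calculations described here, the following minimal sketch computes the Darvasi and Soller CI and applies a linear rescaling of a QTL position between flanking markers. The function names, the input values, and the reading of the "simple scaling rule" as a proportional rescaling are assumptions made for illustration; Biomercator's internal implementation may differ.

```python
def darvasi_soller_ci(n, r2):
    """95% CI in cM from population size `n` and R² given as a proportion (0-1)."""
    if n <= 0 or not 0 < r2 <= 1:
        raise ValueError("n must be positive and r2 must lie in (0, 1]")
    return 530.0 / (n * r2)


def project_position(pos, orig_left, orig_right, cons_left, cons_right):
    """Rescale `pos` (cM) from the original flanking-marker interval
    [orig_left, orig_right] to the consensus interval [cons_left, cons_right].
    Assumes a proportional (linear) scaling between the two intervals."""
    if orig_right == orig_left:
        raise ValueError("flanking markers must be at different positions")
    fraction = (pos - orig_left) / (orig_right - orig_left)
    return cons_left + fraction * (cons_right - cons_left)


# Hypothetical example: 200 lines, QTL explaining 10% of phenotypic variance.
ci = darvasi_soller_ci(200, 0.10)

# Hypothetical example: QTL at 15 cM between markers at 10 and 20 cM on the
# original map; the same markers sit at 40 and 48 cM on the consensus map.
new_pos = project_position(15.0, 10.0, 20.0, 40.0, 48.0)

print(round(ci, 1), round(new_pos, 1))
```

With these inputs the CI is 530 / (200 × 0.10) = 26.5 cM, and the QTL, which lies halfway between its flanking markers, is placed halfway along the consensus interval at 44 cM.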

Meta-analysis

The meta-analysis was performed on the QTL clusters on each chromosome using Biomercator 2.0 (Arcade et al. 2004). The Akaike information criterion (AIC) was used to select the QTL model on each chromosome (Akaike 1974). According to this criterion, the QTL model with the lowest AIC value is considered the best-fitting model, and it indicates the number of meta-QTLs.
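The model-selection idea can be sketched as follows. This is a simplified illustration and not Biomercator's implementation: it assumes each projected QTL position is Gaussian with a standard deviation of CI/3.92 (a 95% CI spans 2 × 1.96 standard deviations), that QTLs sharing a meta-QTL form contiguous groups along the chromosome, that the consensus position of a group is the inverse-variance weighted mean, and that models with one to four meta-QTLs are compared. The input positions and CIs are hypothetical.

```python
import math
from itertools import combinations


def weighted_mean(positions, sds):
    """Inverse-variance weighted mean of QTL positions."""
    weights = [1.0 / s ** 2 for s in sds]
    return sum(w * x for w, x in zip(weights, positions)) / sum(weights)


def group_mean(group):
    """Consensus position of a group of (position, sd) pairs."""
    return weighted_mean([x for x, _ in group], [s for _, s in group])


def log_likelihood(groups):
    """Gaussian log-likelihood when every group shares one true position."""
    ll = 0.0
    for group in groups:
        mu = group_mean(group)
        for x, s in group:
            ll += -0.5 * math.log(2 * math.pi * s ** 2) - (x - mu) ** 2 / (2 * s ** 2)
    return ll


def best_model(qtls, max_k=4):
    """qtls: list of (position_cM, ci_cM). Returns (k, aic, meta_positions)
    for the contiguous partition with the lowest AIC = 2k - 2 ln L."""
    data = sorted((x, ci / 3.92) for x, ci in qtls)
    n = len(data)
    best = None
    for k in range(1, min(max_k, n) + 1):
        # Choose k-1 cut points to split the sorted QTLs into k groups.
        for cuts in combinations(range(1, n), k - 1):
            bounds = (0,) + cuts + (n,)
            groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
            aic = 2 * k - 2 * log_likelihood(groups)
            if best is None or aic < best[1]:
                best = (k, aic, [group_mean(g) for g in groups])
    return best


# Hypothetical cluster: two QTLs near 11 cM and two near 61 cM, each with an 8 cM CI.
k, aic, meta_positions = best_model([(10.0, 8.0), (12.0, 8.0), (60.0, 8.0), (62.0, 8.0)])
print(k, [round(p, 1) for p in meta_positions])
```

For this toy cluster the two-meta-QTL model has the lowest AIC, because a single consensus position fits the data poorly while a third or fourth position adds a parameter penalty without a matching gain in likelihood.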

In Silico Identification of Candidate Genes

The 23 meta-QTLs were analyzed for the presence of genes regulating yield traits based on the comparative genomics approach with O. sativa Nipponbare sequences (www.rapdatabase.org). Gene content was noted based on annotated data of homologous regions in Nipponbare using RAP, Build 5 (www.rapdatabase.org). It is assumed that the genes identified in Nipponbare regions are homologous and collinear to those underlying the yield enhancing QTLs mapped in different studies involving different donor wild species and recipients.
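A minimal sketch of this kind of interval lookup is given below. The annotation records, coordinates, and MQTL names are hypothetical placeholders rather than RAP data, and the sketch does not query any database; it only illustrates collecting the gene families annotated inside each MQTL physical interval and listing the families shared by more than one MQTL.

```python
from collections import Counter

# Hypothetical annotation records: (chromosome, start_bp, end_bp, gene_family).
genes = [
    (2, 1_200_000, 1_203_000, "F-box"),
    (2, 1_250_000, 1_252_500, "cytochrome P450"),
    (2, 5_000_000, 5_004_000, "PPR"),
    (8, 300_000, 302_000, "F-box"),
    (8, 310_000, 313_000, "NAM"),
]

# Hypothetical MQTL physical intervals: name -> (chromosome, start_bp, end_bp).
mqtls = {
    "MQTL_a": (2, 1_100_000, 1_300_000),
    "MQTL_b": (8, 250_000, 600_000),
}


def families_in_interval(records, chrom, start, end):
    """Gene families whose annotated span overlaps [start, end] on `chrom`."""
    return {fam for c, s, e, fam in records if c == chrom and s < end and e > start}


per_mqtl = {name: families_in_interval(genes, *iv) for name, iv in mqtls.items()}

# Count in how many MQTL regions each family occurs; keep those seen at least twice.
occurrence = Counter(fam for fams in per_mqtl.values() for fam in fams)
common_families = sorted(fam for fam, count in occurrence.items() if count >= 2)
print(common_families)
```

In this toy input only the F-box family occurs in both intervals, so it would be the only family carried forward as a shared candidate.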

Results

Overview

In the 11 populations of rice derived from inter-specific crosses, population size ranged from 96 (Brondani et al. 2002) to 353 lines (Thomson et al. 2003). The number of markers used ranged from 80 (Marri et al. 2005) to 150 (Moncada et al. 2001). The number of locations for phenotyping varied from 1 to 3. The studies covered all the chromosomes, except that chromosomes 4, 6, 7, 10, 11, and 12 were not covered in Marri et al. (2005) and chromosomes 7, 9, and 10 were not covered in Kaladhar (2006). From the 11 studies, 76 yield QTLs were reported, distributed on all the chromosomes except chromosome 10. The number of QTLs per population ranged from 1 to 24. The number of QTLs per chromosome ranged from three each on chromosomes 5, 6, and 7 to 16 on chromosome 1. Chromosomes 1, 2, and 8 had the highest numbers, with 16, 13, and 10 QTLs, respectively (Fig. 1); the number of yield QTLs ranged from three to nine on the other chromosomes carrying QTLs. The trait-increasing allelic effect was derived from the wild species in 43 of the 76 QTLs. Of the 76 QTLs, the trait-increasing allele came from O. nivara in 27%, from O. rufipogon in 29%, from O. grandiglumis in 1%, and from O. sativa in 43%. In the O. glumaepatula × O. sativa study, the trait-increasing QTL alleles were derived only from O. sativa (Brondani et al. 2002).

Fig. 1 Distribution of yield QTLs on rice chromosomes

Development of Consensus Map

The linkage map of rice by Temnykh et al. (2001) is a widely used reference map. This map contained most of the markers used in the 11 QTL mapping studies from inter-specific crosses, and marker orders were almost the same. In developing a consensus map, at least two markers must be common between the reference map and each projected map. After integration of all the maps, no marker inversions were found between the projected maps and the reference map. The consensus map contained 699 markers, including SSR, RFLP, and AFLP markers and genes. The map covered a total length of 1,676 cM with an average distance of 3 cM between markers.

Meta-analysis

A total of 76 QTLs were reported for yield from 11 studies involving inter-specific crosses. All these QTLs were projected on the consensus map, and meta-analysis was performed on the QTL clusters on each chromosome. The meta-analysis of 76 QTLs resulted in identification of 31 meta-QTLs on 11 chromosomes (Fig. 2). However, at eight QTL regions the meta-QTLs (three on chromosome 5, two on chromosome 11, and one each on chromosomes 4, 7, and 9) had only one initial QTL. Meta-analysis by definition requires at least two QTLs in a QTL cluster, so only the 23 meta-QTLs with two or more QTLs were considered for further analysis. The number of meta-QTLs along with their AIC values and confidence intervals is given in Table 3. The number of meta-QTLs identified on each chromosome varied from one to four. There were four meta-QTLs each on chromosomes 1, 2, and 3, three on chromosome 8, two each on chromosomes 4, 11, and 12, and one meta-QTL each on chromosomes 6, 7, and 9. In general, the confidence intervals of the meta-QTLs were narrower than those of their respective original QTLs. Fifteen of the 23 meta-QTLs were narrower than the mean of their initial QTLs. The confidence intervals of the meta-QTLs varied from 4.4 cM for the marker interval RM248–RM210 on chromosome 8 to 35.1 cM for the marker interval RM342–RM515, also on chromosome 8. At seven loci on chromosomes 1, 2, 3, 8, and 11, the MQTLs were narrower than the smallest QTL reported in that region. At four QTL clusters the meta-analysis reduced the confidence intervals to around 5 cM. These were RM626–RM9 (5.5 cM) on chromosome 1, RM174–RM492 (4.6 cM) and RM3874–RM3515 (5.2 cM) on chromosome 2, and RM248–RM210 (4.4 cM) on chromosome 8. The physical length of the MQTLs varied from 0.18 Mb (RM3874–RM3515) to 11.8 Mb (RM501–RM11) (Fig. 3). At four MQTL regions the physical length was around 500 kb or less. These regions were RM3874–RM3515 (0.18 Mb) and RM174–RM492 (0.28 Mb) on chromosome 2, RM342–RM515 (0.34 Mb) on chromosome 8, and RM20A–RG574 (0.59 Mb) on chromosome 12. It is interesting to note that the two MQTLs on chromosome 2 with small physical intervals also had small genetic intervals. These regions are important for MAS and for functional analysis to identify candidate genes for yield.

Fig. 2 Chromosome-wise output of meta-QTLs identified by meta-analysis of 68 reported yield QTLs. Vertical lines on the left of chromosomes indicate the confidence interval, horizontal lines indicate the variance, and MQTLs are in red. Markers and genetic distance (cM) are shown on the right of chromosomes

Table 3 Yield QTLs identified by meta-analysis

Fig. 3 Genetic and physical intervals of MQTLs

Table 4 Meta-QTL regions tagged with QTLs for different yield components and candidate genes

Genes Underlying Meta-QTLs of Yield

The gene content was analyzed in the 23 meta-QTL regions to deduce the candidate genes. Most of the genes present in the MQTLs were genes for hypothetical and expressed proteins, pseudo-genes, genes for signal transduction and stress tolerance, and transposable elements. The presence of specific kinds of transposons and retrotransposons may have some functional significance. However, many annotated genes/gene families were common across the MQTL regions; these may be probable candidate genes for yield (Table 4). In the eight MQTL regions shorter than 1.3 Mb, genes for LRR kinase, NAM, pentatricopeptide repeat proteins, cytokinin oxidase, F-box protein, AP2-domain-containing proteins, and zinc-finger transcription factors were present. Overall, in most of the MQTLs, genes for cytochrome P450, pentatricopeptide (PPR) repeat-containing protein-like, zinc-finger (AN1-like)-like protein, no apical meristem (NAM), and F-box-like protein were commonly observed. The candidacy of these genes for yield and yield traits has already been demonstrated in other crops.

Discussion

Naturally Occurring Alleles for Yield Improvement

Wild progenitors have emerged as an important gene pool for mapping of yield QTLs in several crops (McCouch et al. 2007; Swamy and Sarla 2008). They are a rich source of naturally occurring alleles for the further improvement of yield. Crosses with wild species also generate much new and unknown variation in the form of transgressive segregants, but this variation is rarely exploited; it arises from many genetic and epigenetic factors (Wang et al. 2005; Dong et al. 2006). Among the wild species, AA genome progenitors of cultivated rice are more useful in introgression. It is clear from the overview of QTLs that more of the yield-enhancing QTLs came from the AA genome wild progenitors O. rufipogon (29%) and O. nivara (27%) than from the genetically distant wild species such as O. glumaepatula and O. grandiglumis. A favorable effect of wild allele introgression from AA genome species has been reported in several previous studies (Xiao et al. 1998; Thomson et al. 2003; Tian et al. 2006; Rahman et al. 2008). The prevalence of favorable QTLs for yield from wild species is highest on chromosomes 1, 2, and 8. QTLs for yield were not identified on chromosome 10.

Meta-analysis of QTLs

Meta-analysis reduced the total of 68 QTLs to 23 (33%) independent meta-QTLs on ten different chromosomes. In general, the MQTLs were narrower than the mean of their initial QTLs. At seven meta-QTL regions the confidence interval was narrower than the smallest QTL in that region. At six of these loci the confidence interval was reduced to less than 8 cM, a 1.8-fold reduction in length relative to the smallest QTLs. The greatest reduction in QTL length was observed on chromosome 2, where the MQTL was eight times smaller than the smallest QTL in the group and had a confidence interval of only 4.6 cM. Four MQTLs had a physical interval of around 500 kb, and three of these also had a reasonably small genetic distance. These four MQTLs also had a high mean phenotypic variance (PV) of more than 10%. The markers flanking each of these four MQTLs are suitable for MAS to increase the yield of elite cultivars. MQTL 8.2 was 326 kb long with a 35-cM interval, indicating that it lies in a recombination hot spot and is suitable for fine mapping. If the few selected high-priority, trait-increasing, major-effect MQTLs are pooled by marker-aided selection, yield may increase even in the presence of the extensive phenotypic buffering that takes place (Fu et al. 2009); a yield increase of about 15–20% may be expected from marker-aided introgression of these high-priority yield QTLs.
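The recombination hot spot argument rests on the ratio of genetic to physical length. A minimal sketch of that arithmetic is shown below, using the genetic and physical intervals reported in the Results for three MQTL regions; the variable names are illustrative, and no genome-wide average is assumed.

```python
# (genetic interval in cM, physical interval in Mb), as reported in the Results.
mqtl_intervals = {
    "RM3874-RM3515 (chr 2)": (5.2, 0.18),
    "RM174-RM492 (chr 2)": (4.6, 0.28),
    "RM342-RM515 (chr 8)": (35.1, 0.34),
}

# Recombination rate per region, expressed as cM per Mb.
cm_per_mb = {name: cm / mb for name, (cm, mb) in mqtl_intervals.items()}

for name, rate in sorted(cm_per_mb.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.1f} cM/Mb")

highest = max(cm_per_mb, key=cm_per_mb.get)
```

The RM342–RM515 region on chromosome 8 shows by far the highest ratio of the three, about 103 cM/Mb, which is the basis for describing it as a likely recombination hot spot.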

It is clear from our study that meta-analysis is useful in identifying consensus and precise QTLs. In earlier meta-QTL studies for different traits in wheat, maize, cotton, and soybean, a 10% to 21% reduction in total QTLs was reported by QTL meta-analysis, and the average reduction in the CI of the QTLs varied from two to four times that of the original QTLs (Guo et al. 2006; Ballini et al. 2008; Rong et al. 2007; Courtois et al. 2009; Loffler et al. 2009; Hao et al. 2010; Lanaud et al. 2009). In some of these studies, MQTLs were also used for deducing candidate genes. In wheat, four MQTLs for Fusarium head blight resistance were recommended for MAS after meta-analysis (Loffler et al. 2009).

Identification of Candidate Genes

Map-based cloning of QTLs is the commonly used approach for identifying candidate genes underlying complex traits. Using this approach, candidate genes have been identified for heading date, tiller number, submergence tolerance, grain number, and grain yield in rice (Yano et al. 2001; Salvi and Tuberosa 2005; Ashikari et al. 2005; Xu et al. 2006). However, this approach is time consuming, and the results depend on the effect and consistency of the QTLs across generations. A combination of meta-analysis and comparative in silico mapping can be an efficient and rapid approach for identifying new candidate genes for trait variation. Based on this approach, candidate genes were deduced for flowering time and drought tolerance in maize (Chardon et al. 2004; Hao et al. 2010) and for lint fiber length in cotton (Rong et al. 2007). In the present study, in silico candidate gene analysis of the meta-QTLs yielded several candidate genes involved in yield. Some of the important genes/gene families with sufficient evidence to support their candidacy in rice and other crops are listed and discussed further. These were cytochrome P450, cytokinin oxidase, PPR repeat-containing protein-like, zinc-finger (AN1-like)-like protein, F-box-like protein, and NAM-like proteins.

Cytochrome P450 is a high-priority gene family associated with yield. This is supported by a recent report of a rice brassinosteroid-deficient mutant, osdwarf 4-1, in a gene encoding a cytochrome P450 protein, which showed increased biomass and grain yield under dense planting (Sakamoto et al. 2005). Cytochrome P450 also has a role in the homeostasis of cytokinin, which regulates growth and development in wheat and grain yield in rice (Ashikari et al. 2005; Xin et al. 2010). PPR repeats are present in the promoter region of Rf genes of rice and regulate embryogenesis and fertility restoration in rice (Bentolila et al. 2002; Akagi et al. 2004; Xu et al. 2009; Wang et al. 2010). Thus, the association of PPR genes with 15 of the 23 meta-QTLs for yield may be through an increase in fertility. Zinc-finger (AN1-like)-like proteins are known to be involved in stress tolerance and in the regulation of rice plant architecture (Zhang et al. 2010). Suppression of this gene resulted in a drastic increase in leaf and tiller angles, shortened shoot height, and reduced grain production in rice (Mukhopadhyay et al. 2004; Wang et al. 2008). F-box proteins play an important role in floral development and stress tolerance. They are expressed during various stages of panicle and seed development, regulating grain yield (Jain et al. 2007).

The shortlisted common candidate genes underlying precise meta-QTLs can be used for further functional analysis to define their functions and to identify important yield-enhancing genes in rice. In addition, the presence of specific kinds of transposable elements may have some functional significance in yield improvement. The large amount of variation that is usually observed in inter-specific crosses is mainly due to many genetic and epigenetic factors, including transposable elements (Wang et al. 2005; Yu et al. 2010).

Conclusions

Meta-analysis of yield QTLs helped to identify the most precise and concise MQTLs. The meta-QTLs with a small physical and genetic interval are useful in MAS/pyramiding. The QTLs for use in MAS are also targets for fine mapping and positional cloning for gene discovery. The shortlisted candidate genes underlying meta-QTLs can be cloned to unravel the molecular mechanisms regulating yield. In breeding rice for higher yield, this study provides insights into the location of important loci introgressed from wild species of rice.