1 Introduction

Understanding the genetics of quantitative traits has been a common focus over the last few decades. Ever since Sax [1] demonstrated the use of a simple t-test for finding the association between the seed weight and color of beans, methods of mapping quantitative trait loci (QTL) in plants have evolved steadily. Over the last three decades, however, there has been renewed interest in studying the genetics of these traits due to the availability of large numbers of genomic resources including mapping populations, molecular markers, linkage maps, and computational tools. Progress in this area has been unprecedented. Consequently, a large number of statistical methods are now available that suit the nature of the trait and mapping population as well as the objective of the research. As a result, it has become possible to rapidly identify QTLs as well as candidate genes associated with individual traits. A large number of marker-trait associations (MTAs) for different traits have also been identified in different crops, and several of these have been deployed successfully in crop improvement programs through marker-assisted selection (MAS) [2, 3]. Some of the QTLs identified over the years have also been cloned successfully in different crop plants [2, 4, 5]. The literature on this subject has grown correspondingly. Many reviews describing different methods of QTL analysis and its various dimensions, with special emphasis on crop plants, have appeared over the years [2, 6,7,8,9,10,11,12,13,14,15]. A partial list of references on statistical genetics is available at http://pages.stat.wisc.edu/~yandell/statgen/reference/software.html.

In this chapter, the different methods of QTL analysis that are based on the principle of linkage are discussed without going into much of the underlying statistics (Fig. 1). A comparison of these methods, the factors affecting them, and recent trends in QTL analysis in crop plants are also discussed, along with the computer programs available for analysis of the data. Association mapping, which is based on the principle of linkage disequilibrium (LD), is not covered here; it is treated in another chapter in this book.

Fig. 1

Pictorial representation of different methods of QTL analysis. When a biparental population has been genotyped with molecular markers but a linkage map is not available, single marker analysis (SMA) (t-test or ANOVA) can be used to identify marker-trait associations. When a linkage map is available, the data can be analyzed through simple interval mapping (SIM), composite/inclusive composite interval mapping (CIM/ICIM), multiple trait mapping (MTM), multiple interval mapping (MIM), Bayesian interval mapping (BIM), expression QTL (eQTL) or metabolite or protein QTL (mQTL/pQTL) mapping. The criteria used in each of these interval mapping approaches are given in the box below the method. The relative robustness of the results of these methods is shown with arrows. When a germplasm collection or natural population is genotyped, a genome wide association study (GWAS) can be performed, while multiparental populations enable joint linkage and association mapping (JLAM).

2 Methods of Linkage-Based QTL Mapping

The different methods of linkage-based QTL analysis can be divided into four main categories depending upon the principle involved: (1) single-marker analysis when a linkage map is not available, (2) interval mapping when a linkage map is available, (3) meta-QTL analysis, and (4) joint linkage and association mapping. These methods are discussed in the following sections.

In plant-breeding experiments, data on a trait are recorded in various ways (during growth stages, at maturity), either on a continuous scale or in several ordered categories. Therefore, the methods of QTL mapping suited to the nature of the trait being studied are also discussed in this section.

2.1 Identification of Marker-Trait Association (MTA) When Linkage Map Is Not Available: Single-Marker Analysis

During the initial years, when marker resources were limited and statistical programs for the development of linkage maps and interval mapping were still in their infancy, MTAs were identified using rather simple approaches. The approach of bulk segregant analysis (BSA) proposed by Michelmore et al. [16] was very commonly used. In this approach, molecular markers that are polymorphic between the parental genotypes of the mapping population are first screened on two pools (bulks) of DNA samples from individuals that differ for the trait of interest. Markers that also distinguish the two bulks are selected as putative markers and are then used for genotyping the whole mapping population. A putative QTL can thus be detected by analyzing these markers with any of the single-marker analysis (SMA) methods. BSA is still considered a rapid approach (a shortcut) for detecting the linkage of a marker with a QTL for a trait of interest. Several important QTLs that were first identified using BSA were later confirmed by advanced methods of interval mapping. The advantage of this method is that it avoids much of the cost of genotyping the entire population with every marker.

Although proposed more than two decades ago, the approach still remains popular among the scientific community for quick analysis of the data. Large numbers of studies have used the principle of BSA and identified important QTLs for various traits in different crop plants. Recently, BSA was used for the identification of major grain yield QTLs under drought stress in rice [17]. Similarly, in another study using the whole genome-resequencing approach (also called QTL-seq) in rice, two bulks comprising 20–50 individuals with extreme phenotypic values were analyzed and QTLs for important agronomic traits were identified [18]. Although initially proposed to be used in biparental populations, the principle of DNA pooling from extreme genotypes for the rapid identification of QTLs has also seen application in an association mapping experiment. Using this approach, recently Kujur et al. [19] identified three major QTLs and candidate genes for seed weight in chickpea. Because of its simple, time- and cost-effective features, BSA still holds promise in the QTL-mapping programs.
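As a rough illustration of how two bulks can be compared in a resequencing-based analysis such as QTL-seq, the sketch below computes a per-SNP allele frequency in each bulk (often called the SNP-index) and the difference between the bulks. The function names, the read-count input format, and the example counts are assumptions made for illustration; they are not taken from the studies cited above, and a real analysis would add read-depth filters, sliding-window averaging, and statistical confidence limits.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a SNP that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def delta_snp_index(high_bulk, low_bulk):
    """Difference in SNP-index between two bulks, SNP by SNP.

    Each bulk is a list of (alt_reads, total_reads) tuples, one per SNP,
    with the SNPs listed in the same order in both bulks (hypothetical format).
    """
    if len(high_bulk) != len(low_bulk):
        raise ValueError("both bulks must cover the same SNPs")
    return [snp_index(*h) - snp_index(*l) for h, l in zip(high_bulk, low_bulk)]


# Illustrative read counts for three SNPs (made-up numbers).
high = [(10, 20), (18, 20), (9, 20)]
low = [(10, 20), (2, 20), (11, 20)]
deltas = delta_snp_index(high, low)
# SNPs whose difference is far from zero are candidates for linkage to a QTL;
# the cutoff of 0.5 used here is arbitrary.
candidates = [i for i, d in enumerate(deltas) if abs(d) >= 0.5]
print(deltas, candidates)
```

In this toy input only the second SNP shows a large difference between the bulks, which is the pattern expected for a marker linked to a locus affecting the trait.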

Different methods commonly used for SMA include the t-test, ANOVA, and simple regression [7, 20].

(i) t-test, ANOVA, or regression approach: One of the simplest ways to determine whether an association exists between a molecular marker and the trait of interest is to conduct a single-factor analysis of variance (ANOVA). In this method the marker and the trait of interest are considered as independent and dependent variables, respectively. A marker-trait association (MTA) is declared only if the trait means of the marker genotype classes differ significantly. Based on this simple analysis, a QTL can be inferred to be located in the vicinity of the identified marker. Similarly, linear regression can be used for the identification of an MTA and can help in estimating the phenotypic variation arising from the QTL linked to the marker. The advantage of this approach is that it is computationally very easy and can be performed even when a linkage map is not available, a situation that often arises when too few markers are available to construct one. However, the major drawback of this method is that the further a QTL is from a marker, the less likely it is to be detected, because recombination between the marker and the QTL weakens the apparent effect of the marker. Several QTL mapping studies in crop plants have utilized this approach for the identification of QTLs for a variety of traits. Many of these QTLs were subsequently confirmed using interval mapping.
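The single-marker test described above can be sketched in a few lines. The example below is a minimal illustration, assuming a population with only two marker classes (for example doubled haploids or recombinant inbred lines coded 0 and 1); the function name and the data are hypothetical. It returns the difference between class means, the pooled-variance t statistic, and the proportion of phenotypic variance explained by the marker, which for two classes equals t²/(t² + df).

```python
from math import sqrt
from statistics import mean, variance


def single_marker_test(genotypes, phenotypes):
    """Two-class single-marker analysis for one marker.

    genotypes  : list of 0/1 marker classes, one per individual
    phenotypes : list of trait values in the same order
    Returns (difference of class means, t statistic, R^2).
    """
    class_a = [y for g, y in zip(genotypes, phenotypes) if g == 0]
    class_b = [y for g, y in zip(genotypes, phenotypes) if g == 1]
    n_a, n_b = len(class_a), len(class_b)
    df = n_a + n_b - 2
    diff = mean(class_b) - mean(class_a)
    # Pooled within-class variance (classical equal-variance t-test).
    pooled = ((n_a - 1) * variance(class_a) + (n_b - 1) * variance(class_b)) / df
    t_stat = diff / sqrt(pooled * (1 / n_a + 1 / n_b))
    # For a two-class marker, regression R^2 follows directly from t.
    r_squared = t_stat ** 2 / (t_stat ** 2 + df)
    return diff, t_stat, r_squared


# Made-up data: eight lines, four in each marker class.
marker = [0, 0, 0, 0, 1, 1, 1, 1]
trait = [10, 11, 9, 10, 14, 15, 13, 14]
print(single_marker_test(marker, trait))
```

The t statistic would then be compared with a t distribution on n − 2 degrees of freedom; that step is omitted here to keep the sketch free of external libraries.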

2.2 Identification of QTL When Linkage Map Is Available

The era of development of framework linkage maps and interval mapping in plants began with the availability of the interactive computer package MAPMAKER [21, 22]. Ever since its release, it has been by far the most used computer program for the development of linkage maps. It not only provided the basis for framework maps, but also introduced the principle of simple interval mapping (SIM) for the mapping of QTL by scanning the interval between each pair of adjacent markers in the genome. Not only did it facilitate QTL identification, but it also addressed the shortcomings of SMA. During the 1990s, the majority of QTL mapping studies were carried out using the principle of SIM. It was only when the principle of combining IM with multiple regression was introduced [23,24,25] that the problems of SIM were addressed. This method was later named “composite interval mapping” (CIM; [25]). This was a significant development and changed the way QTL mapping studies were carried out. CIM became the method of choice and by far the most popular QTL mapping approach amongst the scientific community. In order to reduce false-positive associations and to increase the efficiency of QTL detection, improvements in the form of empirical thresholds and permutation tests have also been proposed [26, 27]. Another method called inclusive composite interval mapping (ICIM), which fixed the problem of arbitrary cofactor selection in CIM, was later proposed by Li et al. [28]. The advantage of this method is that it selects the significant cofactors and estimates their effects using stepwise regression before IM is conducted, and these effects are then held fixed during genome scanning. This method has been found to improve QTL detection efficiency over that of CIM and has been used in many studies. Interval mapping can be accomplished using any of the available methods including SIM, CIM, ICIM, and several variants proposed later. A comparison between different methods of QTL analysis is given in Table 1.

Table 1 Comparison between different methods of QTL analysis
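The idea of an empirical, permutation-based significance threshold for a genome scan can be sketched as follows. This is a simplified illustration and not the implementation of any particular program: the scan uses a single-marker regression LOD score at each marker as a stand-in for a full interval-mapping scan, and the function names, the marker matrix layout, and the defaults (1000 permutations, 5% level) are assumptions. The trait values are shuffled relative to the genotypes, the largest LOD across the genome is recorded for each shuffle, and the chosen upper quantile of those maxima is used as the threshold.

```python
import random
from math import log10


def lod_score(genotypes, phenotypes):
    """LOD for regression of the trait on one marker: -(n/2) * log10(1 - r^2)."""
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant trait: no evidence
    r2 = min(sxy ** 2 / (sxx * syy), 1 - 1e-12)  # guard against log10(0)
    return -(n / 2) * log10(1 - r2)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold from the permutation distribution of the maximum LOD.

    markers : list of markers, each a list of genotype codes across individuals
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any marker-trait association
        max_lods.append(max(lod_score(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(int((1 - alpha) * n_perm), n_perm - 1)
    return max_lods[index]
```

A marker or interval whose observed LOD exceeds the value returned by `permutation_threshold` would be declared significant at the genome-wide level `alpha`; because the threshold is derived from the data themselves, it adapts to the marker density, population size, and trait distribution of the experiment.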

Several variants of QTL mapping were proposed subsequent to CIM that offered a better understanding of the genetics of complex traits. These include the simultaneous analysis of multiple marker intervals and identification of epistatic (interacting) QTLs (multiple interval mapping, MIM), the simultaneous analysis of multiple traits taking into account trait correlations, the analysis of dynamic and ordinal traits, and many more. These methods are discussed in greater detail in the following sections. Some of these methods, despite once being considered computationally intensive, are now used on a regular basis due to advances in computational tools. A large number of QTL mapping studies using one or another of these approaches have been conducted in different crop plants, and it is not possible to include all of them in this chapter.

2.3 Identification of Interacting or Epistatic QTLs: Two-Locus Analysis

The principle of epistasis has been known to geneticists for a long time and its importance in plant breeding has been well documented [29]. However, the majority of the earlier studies identified only QTLs having a main effect (M-QTL) (single-locus analysis). This was mainly because of the computational complexity involved in using multiple QTLs in the statistical model [30], which increases further if higher order interactions are involved [31]. Single-locus analysis therefore did not allow the identification of interacting QTLs (QTL × QTL; two-locus analysis). It is also logical to think that there may be QTLs that may or may not have a main effect, but can interact with another such QTL [32, 33]. These types of interacting QTLs also contribute significantly to trait variation. Therefore, it was thought appropriate to incorporate the principle of epistasis into QTL interval mapping. Accordingly, multiple interval mapping (MIM) was proposed by Kao et al. [34]. Similarly, in another study, a mixed model approach was proposed by Wang et al. [30] that enabled the identification of not only QTL × QTL (QQ) interactions, but also QTL × environment (QE), and QTL × QTL × environment (QQE) interactions. A three-stage search strategy for the mapping of epistatic QTLs has also been proposed by Laurie et al. [35]. In this approach, the main effect QTLs are identified first, followed by the identification of epistatic QTLs interacting significantly with other QTLs, and, finally, new epistatic QTLs are searched for in pairs. These methods not only improved the precision of the commonly used approach of CIM, but also increased the efficiency of QTL-mapping experiments, as interacting QTLs (QQ and QE) that contribute significantly to the total variation of the trait could be identified. These approaches have also been included in the commonly used QTL-mapping software QTL Network and QTL Cartographer [30, 36]. A large number of studies involving identification of such interactions have now been carried out in different crops including rice [30, 37, 38], wheat [33, 39,40,41], maize [42, 43], and barley [44, 45]. It was also shown that in wheat the proportion of variation explained by QQ and QE or QQE varies from trait to trait [39].
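To make the distinction between main effects and QTL × QTL interaction concrete, the sketch below fits a two-locus model of the form y = mu + a1·x1 + a2·x2 + aa·x1·x2 to a population with two genotype classes per locus, coded −1 and +1. This coding, the function name, and the example data are assumptions made for illustration; it is not the procedure implemented in QTL Network or QTL Cartographer. Because the model has four parameters and four genotype combinations, the least-squares estimates can be read directly from the four cell means.

```python
from collections import defaultdict


def two_locus_effects(locus1, locus2, phenotypes):
    """Estimate mean, two additive main effects and their interaction.

    locus1, locus2 : genotype codes (-1 or +1) for each individual
    phenotypes     : trait values in the same order
    Returns (mu, a1, a2, aa) for y = mu + a1*x1 + a2*x2 + aa*x1*x2.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x1, x2, y in zip(locus1, locus2, phenotypes):
        sums[(x1, x2)] += y
        counts[(x1, x2)] += 1
    cells = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    if any(counts[c] == 0 for c in cells):
        raise ValueError("all four two-locus genotype classes must be present")
    m = {c: sums[c] / counts[c] for c in cells}
    mu = (m[1, 1] + m[1, -1] + m[-1, 1] + m[-1, -1]) / 4
    a1 = (m[1, 1] + m[1, -1] - m[-1, 1] - m[-1, -1]) / 4
    a2 = (m[1, 1] - m[1, -1] + m[-1, 1] - m[-1, -1]) / 4
    aa = (m[1, 1] - m[1, -1] - m[-1, 1] + m[-1, -1]) / 4
    return mu, a1, a2, aa


# Made-up data in which neither locus has a main effect but the pair interacts.
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, -1, 1, -1, 1, -1, 1, -1]
y = [12, 8, 8, 12, 12, 8, 8, 12]
print(two_locus_effects(x1, x2, y))  # (10.0, 0.0, 0.0, 2.0)
```

In this example a single-locus scan would find nothing at either locus, since both main effects are zero, whereas the two-locus model recovers an interaction effect of 2. This is the situation described above of QTLs that lack a main effect but interact with another locus.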

Molecular marker-based QTL mapping studies have provided more evidence for epistasis than the conventional biometric approaches of quantitative genetics. Therefore, for long-term progress in plant breeding, one cannot ignore the importance of epistasis [29]. In order to completely dissect the trait in terms of its total variation, it is imperative that these interactions, including higher order interactions, be identified [31]. However, the methodology for addressing the issue of higher order interactions is still underdeveloped.

2.4 Mapping QTL for Correlated Traits Simultaneously

It is a common practice to conduct QTL analysis separately for each trait. However, it is often observed that some of the traits are significantly correlated with each other. The ability to identify and use a common QTL governing more than one trait can accelerate and increase the efficiency of MAS programs significantly. Multiple-trait QTL analysis is QTL analysis applied to several traits simultaneously and can help in the identification of pleiotropic QTLs. The importance of such pleiotropic QTLs and multi-trait QTL analysis has earlier been advocated and also empirically demonstrated [46,47,48]. By taking into account the correlation structure among the traits, this type of analysis was shown in these studies to improve the statistical power of QTL detection and the precision of parameter estimation. Later this approach was also incorporated into the popular QTL analysis program “QTL Cartographer” and other software, and was successfully used in wheat [40, 49], sorghum [50], and other crops for different traits.

Recently an improvement over the existing method of multiple-trait analysis was proposed by Silva et al. [51], which takes into account the genetic and environmental correlations between traits and provides more details on the genetic architecture of complex traits by separating pleiotropic QTLs from closely linked non-pleiotropic QTL and QE interactions. Further, it can also estimate the total genotypic variance-covariance matrix between the correlated traits and decompose it in terms of QTL-specific variance-covariance matrices. It is expected that this method of multiple-trait multiple-interval mapping (MTMIM) of correlated traits will be more rewarding and can enhance the speed of MAS.

2.5 Mapping QTL Using Prior Information: A Bayesian Approach for QTL Mapping

In genetics, Bayesian analysis has been used for a long time and has now become an integral part of QTL mapping studies. It is often said that statistics deals with uncertainty that is relative to the information we have [52]. In other words, the less information, the more uncertainty, and vice versa. The commonly used methods of QTL analysis (SMA, SIM, and CIM), also called frequentist methods, treat the unknown parameters as fixed quantities. Bayesian analysis, by contrast, expresses uncertainty about the parameters through a prior distribution based on previously gathered information, and updates it with the data to obtain the posterior distribution according to Bayes’ rule. It therefore allows for easy and systematic incorporation of prior knowledge into the data analysis [53]. Accordingly, a Bayesian model consists of three components: (1) prior distribution, (2) conditional distribution, and (3) posterior distribution.
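The three components listed above can be illustrated with the simplest possible case: a single QTL effect with a normal prior, updated by an estimate whose sampling error is also treated as normal with known standard error. This conjugate normal model is an assumption chosen for illustration only; Bayesian QTL mapping methods in practice estimate QTL number, positions, and effects jointly, typically by Markov chain Monte Carlo. The function name and the numbers are hypothetical.

```python
def normal_posterior(prior_mean, prior_sd, estimate, std_error):
    """Posterior mean and SD of an effect under a normal prior and normal likelihood.

    prior_mean, prior_sd : the prior distribution of the effect
    estimate, std_error  : the observed effect and its standard error
                           (the conditional distribution of the data)
    """
    prior_precision = 1 / prior_sd ** 2
    data_precision = 1 / std_error ** 2
    post_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = (prior_mean * prior_precision + estimate * data_precision) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# The same observed effect (2.0 with standard error 1.0) under two different priors.
print(normal_posterior(0.0, 1.0, 2.0, 1.0))   # informative prior centred on zero
print(normal_posterior(0.0, 10.0, 2.0, 1.0))  # vague prior
```

With the informative prior the estimate is pulled halfway toward zero, while with the vague prior the posterior mean stays close to the observed value of 2.0. The comparison shows in miniature how the choice of prior can influence the conclusions drawn from the same data.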

Although once considered to be computationally demanding, in recent years the Bayesian approach has become an integral part of not only QTL analysis experiments, but also of association mapping [54, 55] and genomic selection (GS) experiments [56, 57]. All of this has been made possible by advances in computational methodologies over the last few years. In one of its earliest demonstrated uses in QTL mapping, Satagopan et al. [58] used the Bayesian principle for estimating the locations and effect parameters for multiple QTLs with pre-specified numbers of QTLs in a DH progeny of Brassica napus. Since then, a large number of studies on crop plants involving the principles of Bayesian statistics have been published, and it has now become an almost integral part of any analysis.

With the growing interest in this approach, new models were also proposed that facilitated the analysis of binary and ordinal traits [59, 60], interacting QTLs/epistasis [61,62,63], permutation testing [64], QE interactions [65], multiple QTL analysis [66], multiple trait analysis [67, 68], and pleiotropy [69]. The complexity of identifying epistatic QTLs, appropriate model selection, and many other issues that earlier plagued the efficient analysis of QTLs were addressed in these studies. The only concern that might limit the use of the Bayesian approach in analysis is that different conclusions can be drawn by different researchers if they use different priors in their analysis [2, 70]. Notwithstanding this, Bayesian statistics is the preferred choice of the statistician and will be used for a long time in all aspects of genetic analysis.

2.6 The Analysis of Traits for Which Data Are Recorded Periodically: QTL Mapping for Dynamic Traits

In the majority of QTL mapping studies, the data on a quantitative trait measured at a fixed time point or stage of growth/ontogenesis are used for analysis. This way of analyzing the data can identify QTLs and estimate their effects, which are accumulated over time from the beginning of growth until the time of actual observation. However, it is well known that the development of a trait is the end result of differential activities of many related QTLs, which are expressed during the life cycle of the crop. This is because developmental traits are under the control of genes that are expressed at specific stages of development in response to the existing environmental conditions. Traits for which phenotypic values change over time during the period of growth are called dynamic traits. Wu et al. [71] called the QTL mapping of such traits time-related mapping (TRM), as opposed to time-fixed mapping (TFM) for the traits for which the data are recorded at a fixed time or stage. Later, Wu and Lin [72] termed this aspect “functional mapping.” The advantage of this approach is that repeated observations of the same individuals over different developmental stages are a form of replication that can increase the statistical power of QTL detection. Another important advantage is that the stage of growth at which the heritability of the trait is highest can also be identified. The QTLs identified at this stage will be more useful for a breeding program involving MAS [73]. A very common example is plant height in crops, for which differences are visible during early growth but are neutralized or minimized towards maturity.

Several QTL mapping studies have been carried out for dynamic traits in different crop plants and have reported some common as well as growth-stage specific QTLs. Earlier this approach was successfully used in rice to identify QTLs associated with increased grain filling percentage per panicle [74]. Similarly, dynamic QTLs for seed reserve utilization were identified during three germination stages in rice [75]. It was observed that more QTLs are expressed at the late germination stage. Osman et al. [76] used this approach along with conditional analysis for growth and yield traits under submergence conditions in maize and identified some common and some stage-specific QTLs. Similarly, in a study in triticale, a population comprising 647 doubled haploid lines derived from four families was phenotyped for plant height using a precision phenotyping platform at multiple time points. The study identified main effect and epistatic QTLs for plant height for each of the time points. Some of these QTLs were detected at all time points whereas others were specific to particular developmental stages, and the contribution of the QTL to the genotypic variance of plant height also varied with time [77]. A Bayesian nonparametric approach was also proposed for the analysis of dynamic traits [78], which offers advantages over the existing methods of analysis. The only limitation of this method is that it cannot be used for traits on which periodical observations are not possible (for example, grain protein content, grain yield, etc.).

2.7 Analysis of Traits for Which Data Are Scored on a Numeric Scale: QTL Mapping for Ordinal Traits

QTL analysis is usually based on data that are recorded on a continuous scale, with the assumption that they are normally distributed. However, many quantitative traits in plants, such as disease resistance or quality parameters, are recorded on a scale of several ordered categories based on intensity or severity. Although these traits are quantitative in nature, the data do not show continuous variation and therefore contain less information. These types of traits are called ordinal traits, and appropriate statistical treatment is required to deal with this type of trait distribution. Nevertheless, in many earlier published reports of QTL mapping, data on ordinal traits were analyzed in the same way as those on continuous traits. One reason for this was the lack of statistical tools to deal with these traits. However, QTL mapping methods for dealing with ordinal traits have evolved over the years, with more emphasis on traits studied in humans than in plants.

Earlier methods for QTL analysis of ordinal traits in backcross populations using the generalized linear model (GLM) were proposed by Hackett and Weller [79], and Xu and Atchley [80], and were later extended to four-way crosses by Rao and Xu [81]. An improvement over the existing GLM method was later proposed by Xu and Xu [82] in the form of a multivariate model to deal with the ordinal traits based on the EM algorithm. Subsequently, the principle of MIM described earlier for continuous traits was also extended to ordinal and binary traits for the identification of multiple QTL effects and epistasis [83]. This method is also included in the popular QTL analysis program QTL Cartographer. More recently, another approach based on an efficient hierarchical GLM was proposed for the identification of main-effect QTL and QE interactions governing ordinal traits in association mapping (AM) experiments [84].

2.8 Meta-QTL Analysis

During the last two decades, there has been a surge in the number of QTL mapping studies in different crop plants, which has resulted in several thousand published articles (source, Google Scholar). QTL mapping for the same traits is often carried out in different genetic backgrounds in the same crop, leading to the identification of several QTLs. This necessitates the integration of QTL mapping results from these individual experiments performed on the same crop to identify common as well as novel loci/alleles underlying complex traits, for their effective use in crop improvement programs [2]. Meta-analysis of QTLs is an important approach that integrates information from multiple QTL-mapping studies and allows greater statistical power for QTL detection and more precise estimation of their genetic effects. Besides this, meta-QTL analysis can help to refine the genomic regions of interest frequently identified in different studies, and can provide the closest flanking markers [85]. Hence, a meta-analysis can be more rewarding than individual studies and can give greater insight into the genetic architecture of complex traits [86].

Because of its ability to integrate results from several individual QTL mapping studies, this approach has been used in many crops, and several meta-QTLs have been identified. In one of the first examples, Chardon et al. [87] used the approach of Goffinet and Gerber [88] to study the genetic basis of flowering time in maize by integrating the results of several mapping studies. From the total of 313 QTLs used for the study, they identified 62 consensus QTLs and also reported a twofold increase in the precision of QTL position estimation relative to the original studies. Several such studies were later carried out in different crops, covering disease resistance in cocoa [89]; fiber quality, yield, and biotic and abiotic stress tolerance in cotton [90, 91]; drought tolerance in rice [92]; late blight resistance and plant maturity traits in potato [85]; root genetic architecture in rice [93] and maize [94]; and protein concentration in soybean [95]. A list of several such studies carried out in cereals is also given in Gupta et al. [2]. These studies have been made computationally feasible by the availability of software tools like BioMercator [96] and MetaQTL [97].
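The gain in positional precision from combining studies can be illustrated with a deliberately simplified calculation. The sketch below assumes that several studies have detected the same underlying QTL, that each reported position is normally distributed around the true position, and that each study supplies a 95% confidence interval from which a standard deviation can be recovered. Under those assumptions the consensus position is the inverse-variance weighted mean of the reported positions. This is only the single-QTL case and is not the full procedure of the cited meta-analysis methods or software, which also decide how many distinct QTLs the reported positions represent; the function name and numbers are hypothetical.

```python
def consensus_position(positions, ci_widths, z=1.96):
    """Inverse-variance weighted consensus of QTL positions from several studies.

    positions : reported QTL positions (cM) on a common consensus map
    ci_widths : full widths of the reported 95% confidence intervals (cM)
    Returns (consensus position, width of its 95% confidence interval).
    """
    # Recover each study's variance from its confidence interval width.
    variances = [(w / (2 * z)) ** 2 for w in ci_widths]
    weights = [1 / v for v in variances]
    total = sum(weights)
    position = sum(w * p for w, p in zip(weights, positions)) / total
    consensus_sd = (1 / total) ** 0.5
    return position, 2 * z * consensus_sd


# Two made-up studies reporting the same QTL with equally wide intervals.
print(consensus_position([50.0, 54.0], [7.84, 7.84]))
```

With two equally precise studies, the consensus interval is narrower than either original interval by a factor of the square root of two, and it narrows further as more studies are added.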

2.9 Mapping of QTLs for Gene Expression and for Large and Small Molecular Weight Compounds: The Concept of Genetical Genomics

As is the case with many physiological traits, variation in gene expression (mRNA), as well as that of large and small molecular weight compounds (protein or metabolite), often shows a quantitative distribution, thereby allowing its genetic dissection using the commonly used methods of QTL mapping [98]. Earlier, the term genetical genomics was restricted only to the mapping of expression QTL (eQTL) [99]. However, the last decade has seen tremendous progress in terms of cost-effective high-throughput genotyping techniques, which made it possible to study the complexity of traits by measuring not only gene expression, but also thousands of proteins and metabolites to map eQTL, protein QTL (pQTL), and metabolite QTL (mQTL), respectively [100,101,102]. In experiments involving genetical genomics, data on gene expression or individual proteins or metabolites can be used as a phenotype in QTL analysis. The large-scale data on gene expression (genetical genomics), if combined with genetics, can help in connecting phenotypic variation to genotypic diversity and can lead to the identification of genetic regulatory loci, and ideally genes, which explain the observed variation [98]. The rationale behind this approach is that a specific gene’s expression level is easier to quantify than the more complex developmental or physiological traits. Thus, if the loci governing differential gene expression patterns are identified and compared with the loci controlling a specific physiological trait, one can gain a better understanding of the complex traits [103]. Integration of omics data in genetic studies can thus reduce the number of candidate genes for a given QTL from hundreds to a manageable list [98].

The earlier studies on genetical genomics predominantly utilized microarrays for the analysis of mapping populations in a variety of species. However, experiments involving microarrays are very expensive, which limits their use in such studies. Metabolomics platforms, on the other hand, are much cheaper per sample than transcriptomics, enabling large populations to be studied with sufficient replication for moderate-to-low heritability traits. Moreover, most metabolomics platforms are higher-throughput than transcriptomics, allowing for rapid analysis. Therefore, in recent years there has been an increasing number of reports pertaining to mQTL analysis in plants. Some of them have been described elsewhere ([2, 104]; also see Alseekh et al. [105]). Although earlier these studies were more common in model species like Arabidopsis ([106] and references therein [107]), they are also being carried out in different crops including potato [108], brassica [109], tomato, and wheat. In tomato, a comprehensive mQTL analysis detected a total of 679 mQTLs for secondary metabolism in fruit pericarp in 76 introgression lines [105]. Similarly, in wheat, mQTL analysis was combined with QTL analysis for agronomic traits in a doubled haploid population [110]. These studies are not limited to biparental populations, but are also becoming very popular in association mapping (AM) experiments (for details, see Luo [111]).

Genetical genomics has provided considerable insight into the influence of genetic factors on a biological system. However, as with any quantitative trait, molecular networks are also influenced by environmental conditions. Therefore, for a better and more complete understanding of these networks, it is necessary that this interaction component (genotype × environment) is also studied. Accordingly, a modified concept called generalized genetical genomics was proposed by Li et al. [112], which combines genetic perturbations with carefully chosen environmental perturbations to study the plasticity of molecular networks. This will help in understanding how a genotype responds to different environmental conditions. The utility of this approach was demonstrated in Arabidopsis by identifying G × E interactions in the metabolism of germinating seeds [113]. Although these studies offer a great deal of information, few such studies have been conducted in crop plants, possibly because of the cost associated with such experiments [113].

2.10 Identification of QTLs Using Multiparental Mapping Populations: Joint Linkage-Association Mapping

Generally, QTL mapping is carried out using a biparental mapping population for which parental genotypes exhibit contrasting phenotypes for the trait of interest. However, it is well recognized that such a mapping population will segregate for only those alleles/QTLs for which the parental genotypes differ. This leaves out many important QTLs that are controlling the trait but are not detected just because the parental genotypes do not segregate for them. Therefore, another important approach based on the principle of LD called association mapping (AM), also called genome wide association studies (GWAS), was suggested. Large numbers of studies involving AM have been published in different crop plants and are beyond the scope of this chapter. For further details, readers are referred to another chapter on this aspect in this book as well as detailed reviews [114,115,116]. It was also realized that linkage-based interval mapping and LD-based AM have their own advantages and limitations when used independently and therefore it was proposed to integrate these two approaches into one approach called joint linkage-association mapping (JLAM) [117]. This type of analysis has been facilitated by the availability of next-generation multiparental mapping populations like Multi-parent Advanced Generation Intercross (MAGIC) populations, Nested Association Mapping (NAM) populations, Multiline Cross Inbred Lines (MCILs), and Recombinant Inbred Advanced Intercross Lines (RIAILs) [2, 118].

These populations have been developed in many important crops including wheat, rice, maize, chickpea, pigeonpea, peanut, barley, oat, and tomato (for details, see the reviews in [119, 120]). Where it is not feasible to develop multiparental populations, one can instead perform JLAM using a number of biparental populations together with an association-mapping population genotyped with a common set of markers. Several variants of JLAM were later proposed, including one for the analysis of multi-trait data [121,122,123]. The utility of JLAM was shown by Lu et al. [124] in maize. Using the NAM population, they identified 18 new QTLs and candidate genes for drought tolerance that had not been identified by either of the two methods individually. Recently, in rapeseed, this method identified two major pleiotropic QTLs for seed weight and silique length [125]. Another advantage of JLAM is that it can effectively address the issue of rare alleles, which is a matter of concern in any AM study [114]. Given these features, this method is likely to be used in many more crops.

2.11 Quantitative Resistance Loci (QRLs) Governing Quantitative Disease Resistance (QDR)

It is now well recognized that disease resistance in crop plants is quantitative in nature, involving major as well as minor QTLs. Accordingly, the loci involved are described either as R genes (having major effect) or as quantitative resistance loci (QRLs), which govern quantitative disease resistance (QDR) in crop plants [126]. When dealing with QRLs, the data on QDR are analyzed in the same way as in a QTL analysis experiment for any morphological or agronomic trait. It is therefore unnecessary to make a distinction between QRLs and QTLs. This is also evident from the fact that in several earlier studies involving QDR, the term QTL was used instead of QRL. In the last few years, large numbers of these so-called QRLs have been identified in different crop plants including cereals and legumes, which subsequently led to map-based cloning of some of these QRLs. A partial list of such cloned QRLs in cereals is available in Gupta et al. [2].

In recent years, advances in whole-genome sequencing, accompanied by the availability of high-throughput marker approaches like GBS, have brought down the cost of genotyping drastically. These advances in genotyping technologies, if accompanied by precise and high-throughput phenotyping for QDR, will facilitate the elucidation of complex forms of disease resistance and the QRLs associated with them in crop plants [127,128,129]. It is expected that the knowledge gained from a detailed understanding of QDR and the associated QRLs will help in breeding varieties for disease resistance in crop plants in the coming years. An optimal strategy is therefore needed to use the identified QRLs effectively and efficiently in breeding programs aimed at disease resistance [128, 129].

Some of the earlier successful examples of MAS for QRLs include: (1) MAS for single QRL for Fusarium head blight (FHB) in wheat [130], leaf rust in barley [131], white mold in common bean [132]; (2) multiple QRLs (pyramiding or stacking) for stripe rust in barley [133], common bacterial blight (CBB) in common bean [134], FHB in wheat ([135]; for a review, see Miedaner and Korzun [136]), root and stem rot in pepper [137]; and (3) QRLs plus qualitative resistance genes for stripe rust in barley [138], bean golden mosaic virus (BGMV) in common bean [139], potato virus Y in pepper [140], and many others.

2.12 Discovery and Introgression of Useful QTLs from Wild-Type or Unadapted Germplasm: Advanced Backcross QTL Analysis

One of the reasons often given for the limited use of identified QTLs in crop improvement programs is that QTL identification and varietal development are treated as separate activities. In order to deal with this issue and to harness the potential of wild/unadapted germplasm in breeding programs, Tanksley and Nelson [141], while working on tomato, proposed a novel method of QTL mapping called advanced backcross QTL (AB-QTL) analysis. The important feature of this method is that one can simultaneously detect and transfer useful QTLs from wild/unadapted relatives to a popular cultivar. The backcross population (BC2, BC3) is developed from a cross between the superior cultivar and a wild species carrying the desirable trait, and molecular markers are used to monitor the transfer of desirable QTLs.

Advanced backcrossing is a means of reducing the number of donor parent alleles present in any given backcross inbred line. The reason for delaying QTL analysis until an advanced generation such as BC2 or BC3 is that it allows phenotypic selection to reduce the frequency of deleterious alleles, while at the same time favorable donor alleles at QTLs can be more easily recognized. Since its demonstrated success in tomato, the method has been used in several crops including wheat, barley, and rice for the transfer of desirable QTLs for a variety of traits from wild/unadapted germplasm. Details of these studies are readily available in several reviews and book chapters. In recent years, it has been applied in barley for proline accumulation and leaf wilting under drought stress conditions [142]; in rice for salinity tolerance [143], grain shape [144], and reproductive stage drought resistance [145]; and in peanut for resistance to root knot nematode [146]. Given its practical significance in breeding programs, this method is likely to remain in use for a long time.
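The reduction in donor parent alleles with each backcross can be illustrated with simple arithmetic. The sketch below is not taken from the AB-QTL literature cited here; it assumes no selection and unlinked loci, so that the F1 carries half of the donor genome and each backcross to the recurrent parent halves the expected donor share. The function name is hypothetical.

```python
def expected_donor_fraction(n_backcrosses: int) -> float:
    """Expected donor-genome fraction in generation BCn, assuming no selection.

    The F1 (n_backcrosses = 0) carries 1/2 donor genome; every backcross
    to the recurrent parent halves that expectation.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 0.5 ** (n_backcrosses + 1)


if __name__ == "__main__":
    for n in (1, 2, 3):
        share = expected_donor_fraction(n)
        print(f"BC{n}: expected donor genome = {share:.2%}")
```

Under these assumptions the expected donor share is 12.5% in BC2 and 6.25% in BC3. Phenotypic or marker-based selection during backcrossing would shift the realized values away from these expectations.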

3 Factors Affecting Results of QTL Mapping in Plants

Several factors that influence the results of any QTL-mapping experiment have been widely discussed in the literature either using computer simulations or empirical data (e.g., [8, 147,148,149]). Important factors amongst them are trait heritability, nature and size of mapping population, number of markers, and method of analysis (Table 2). All these factors are related to each other. For example, a mapping population of an average size of n = 200 will yield a low-density linkage map, which in turn will limit the precision and resolution of the QTL so identified. The end result will be that the estimates of QTL effects will be biased as QTLs with small effects will not be identified and those that are closely linked will not be separated. These factors are discussed in more detail in the following sections. There are other issues that should be considered before initiating the QTL-mapping experiment, and which have been discussed in greater detail by Wurschum et al. [150].

Table 2 Factors influencing results of QTL mapping using biparental populations

3.1 Heritability of the Trait

It is well known that the majority of quantitative traits exhibit low heritability, which makes it difficult to detect a minor-effect QTL with a small population size and a limited number of markers. Another issue with low-heritability traits in QTL mapping is that QTL effects tend to be overestimated. This has been demonstrated empirically as well as by simulation in several studies. Although heritability of the trait cannot be increased, scoring the data in a dynamic fashion wherever possible can help in identifying the stage of crop growth at which heritability for the given trait is highest. This can also help in identifying novel loci that are specific to a growth stage and often escape detection. Similarly, the mapping population can be evaluated at different locations and over several years for the trait of interest to resolve location and year effects.
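The benefit of evaluating a population across locations and years can be sketched with the standard entry-mean expression for broad-sense heritability, H² = σ²g / (σ²g + σ²ge/e + σ²ε/(e·r)), where e is the number of environments and r the number of replicates per environment. This expression comes from general quantitative genetics rather than from this chapter, and the variance components used below are made-up illustrative values.

```python
def entry_mean_heritability(var_g: float, var_ge: float, var_error: float,
                            n_env: int, n_rep: int) -> float:
    """Broad-sense heritability of line means across environments.

    var_g     : genotypic variance
    var_ge    : genotype x environment interaction variance
    var_error : residual (plot-level) variance
    n_env     : number of environments (location-year combinations)
    n_rep     : number of replicates per environment
    """
    if n_env < 1 or n_rep < 1:
        raise ValueError("n_env and n_rep must be at least 1")
    var_of_means = var_g + var_ge / n_env + var_error / (n_env * n_rep)
    return var_g / var_of_means


if __name__ == "__main__":
    # Hypothetical variance components, chosen only for illustration.
    var_g, var_ge, var_error = 1.0, 1.0, 4.0
    for n_env, n_rep in ((1, 1), (2, 2), (4, 2)):
        h2 = entry_mean_heritability(var_g, var_ge, var_error, n_env, n_rep)
        print(f"{n_env} env x {n_rep} rep: H2 of means = {h2:.3f}")
```

The variance components themselves do not change, but the line means used as phenotypes in the QTL analysis become more repeatable as environments and replicates are added.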

3.2 Size and Nature of Mapping Population

Often, small mapping populations are used in linkage mapping experiments. Although a framework linkage map can be developed with a small population, such a population may not be suitable for QTL mapping. Therefore, larger populations have always been recommended for improving the precision of QTL mapping studies. It has been shown that with a population size of >200, methods like ICIM achieve unbiased estimates of QTL position and effect. In contrast, with a smaller population size, there is a tendency for the QTL to be located towards the center with overestimated QTL effects [148]. Earlier studies also showed that statistical power, QTL effect estimates, and precision of QTL localization improve with larger populations [147, 151, 152]. Therefore, sufficiently large populations are needed for QTL mapping studies [29]. However, population size cannot be increased arbitrarily because of the cost of phenotyping all the lines. This issue can be overcome to some extent by using a large number of markers and high-density marker maps, which can increase the precision of QTL mapping.
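The overestimation of QTL effects in small populations can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the procedure used in the studies cited above: a single marker sits exactly at the QTL, lines fall into two homozygous genotype classes (as in a DH or RIL population), the true additive effect is 0.2 residual standard deviations, and a QTL is declared when |t| exceeds an arbitrary threshold of 3. All names and parameter values are hypothetical.

```python
import random
import statistics


def mean_significant_effect(n_lines: int, true_effect: float = 0.2,
                            n_reps: int = 400, t_threshold: float = 3.0,
                            seed: int = 1) -> float:
    """Average |estimated effect| over the replicates in which the QTL is declared."""
    rng = random.Random(seed)
    detected = []
    for _ in range(n_reps):
        groups = {-1: [], 1: []}
        for _ in range(n_lines):
            genotype = rng.choice((-1, 1))
            # Phenotype = additive QTL effect + standard normal residual.
            groups[genotype].append(true_effect * genotype + rng.gauss(0.0, 1.0))
        low, high = groups[-1], groups[1]
        if len(low) < 2 or len(high) < 2:
            continue  # variance cannot be estimated for this replicate
        estimate = (statistics.fmean(high) - statistics.fmean(low)) / 2
        pooled_var = ((len(low) - 1) * statistics.variance(low)
                      + (len(high) - 1) * statistics.variance(high)) / (n_lines - 2)
        std_err = (pooled_var * (1 / len(low) + 1 / len(high))) ** 0.5 / 2
        if abs(estimate / std_err) > t_threshold:
            detected.append(abs(estimate))
    return statistics.fmean(detected) if detected else float("nan")


if __name__ == "__main__":
    for n in (50, 200, 500):
        print(f"n = {n}: mean estimated effect when detected = "
              f"{mean_significant_effect(n):.3f} (true effect 0.2)")
```

Because only replicates that pass the significance threshold contribute to the average, small populations report the QTL mainly when sampling error has inflated the estimate, whereas large populations detect it in most replicates and the average stays close to the true value.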

3.3 Number of Markers in the Linkage Map

Recent advances in cost-effective high-throughput genotyping techniques have made it possible to generate thousands of data points in several crops. These advances are also being effectively utilized in several GWAS and GS experiments. However, in the majority of the earlier studies on QTL mapping, linkage maps were developed using a rather limited number of markers. Using computer simulations, it was earlier shown that a marker density of 10–20 cM is sufficient for precise QTL detection and that there is no added advantage from higher marker densities [147, 153]. It is therefore often debated whether biparental QTL mapping studies would benefit from high-density maps. In contrast, later studies showed that high-density maps could increase the probability and precision of QTL detection between two recombination breakpoints, and that tightly linked markers could be identified [154,155,156]. Moreover, two tightly linked QTLs can also be separated using high-density maps [148]. However, in a recent study based on computer simulation as well as on experimental data from DH populations in maize, it was shown that high-density maps improved neither the QTL detection power nor the predictive power for the proportion of genotypic variance explained [157]. Nevertheless, the authors observed that the precision of QTL localization, the precision of effect estimates for small- and medium-sized QTLs, and the power to resolve closely linked QTLs profited from an increase in marker density from 5 to 1 cM. From an MAS point of view, precise estimates of QTL effects are desirable, and these gains may outweigh the higher costs of high-density genotyping [157].
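To make these marker densities concrete, map distances can be converted to recombination fractions with Haldane's mapping function, r = (1 − e^(−2d))/2 with d in Morgans, which assumes no crossover interference. The sketch below is an illustration only: the choice of Haldane's function, the 150 cM chromosome length, and the function names are assumptions, not details from the cited studies. A QTL lying midway between two flanking markers is half the marker spacing away from its nearest marker.

```python
import math


def haldane_recombination_fraction(distance_cm: float) -> float:
    """Recombination fraction for a map distance in cM (Haldane, no interference)."""
    if distance_cm < 0:
        raise ValueError("distance_cm must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def markers_needed(chromosome_length_cm: float, spacing_cm: float) -> int:
    """Evenly spaced markers needed to cover one chromosome, including both ends."""
    if chromosome_length_cm <= 0 or spacing_cm <= 0:
        raise ValueError("lengths must be positive")
    return math.ceil(chromosome_length_cm / spacing_cm) + 1


if __name__ == "__main__":
    chromosome_cm = 150.0  # hypothetical chromosome length
    for spacing in (20.0, 10.0, 5.0, 1.0):
        # Worst case: the QTL sits midway between two flanking markers.
        r = haldane_recombination_fraction(spacing / 2)
        n = markers_needed(chromosome_cm, spacing)
        print(f"spacing {spacing:>4} cM: {n:>3} markers, "
              f"QTL-to-nearest-marker r = {r:.4f}")
```

The number of markers grows inversely with the spacing, while the recombination fraction between a QTL and its nearest marker shrinks roughly in proportion to the spacing, which is one way to frame the cost-versus-precision trade-off discussed above.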

3.4 Method of Analysis

Different methods of QTL mapping have been discussed in the earlier sections. The choice of method for QTL analysis also influences the outcome of the study. For example, ICIM has been found to be more powerful in separating tightly linked QTLs than the commonly used IM [148]. As has been discussed, the importance of interacting QTLs (QQ, QE, and QQE) should not be underestimated. Therefore, while conducting any QTL analysis, it is necessary to choose an appropriate method that will identify not only main-effect QTLs but also the different interactions with high precision.

4 Computer Programs for QTL Analysis

Over the years, several QTL mapping approaches have been proposed, making it possible to identify thousands of marker-trait associations in crop plants. Credit for these studies also goes partly to the availability of different computer programs that allowed them to be carried out rapidly. Since the development of the popular computer program MAPMAKER/QTL [158], large numbers of such programs have become available that can be efficiently used for the identification of QTLs using either biparental QTL mapping or association mapping. The majority of these programs are available free of cost. In recent years, a shift has also been seen from the use of standalone programs to open-source environments like R, which runs on a variety of platforms and provides an environment for statistical computing and graphics (http://www.r-project.org/). A comprehensive, though not exhaustive, list of different types of software that can perform QTL analysis, along with their features, is given in Table 3. Similarly, a detailed list of computer programs available for AM is given in Gupta et al. [114].

Table 3 List of computer programs available for QTL analysis

5 Conclusion and Outlook

During the last two decades or more, significant progress has been witnessed in studies involving complex quantitative traits in crop plants. This has been facilitated by the availability of cost-effective high-throughput genotyping techniques as well as by the constantly improving field of statistical genomics. Several of the identified QTLs for various traits have been, and are being, successfully used in crop improvement programs through MAS. Starting from SMA and SIM to ICIM, and more recently BIM, QTL-mapping approaches have evolved over the years. These advances have improved not only the understanding and precision of QTL-mapping results but also the outcome of MAS programs. The increasing emphasis on the identification of interacting QTLs (QQ, QE, and QQE) has also provided a new dimension to traditional QTL mapping studies. With growing interest in the area of genetical genomics involving eQTL, pQTL, and mQTL, coupled with generalized genetical genomics, it is expected that a better understanding of the biosynthetic pathways underlying complex traits will be gained.

In the future, biparental QTL mapping and AM/GWAS, performed either independently or in combination, will be used in many more crops, drawing on recent advances in genomics. Methods like JLAM can harness the benefits of both approaches together, as has been successfully demonstrated in maize [150, 181]. Similarly, recent advances in the area of GS will address the issue of minor QTLs by considering the effects of all the markers simultaneously. Thus, it is evident that the progress made in the area of QTL mapping is substantial and will benefit further from recent advances in computational tools. This success will translate into the crop-improvement programs of the future.