Introduction

Quantitative trait locus (QTL) mapping is useful for dissecting complex traits (Lander and Botstein 1989; Lynch and Walsh 1998, p 379; Mackay 2001). Most QTL mapping studies in plants have used designed mapping populations, such as F2 or backcross populations between two inbreds (Kearsey and Farquhar 1998). As an alternative approach, in silico mapping aims to exploit existing phenotypic and genomic databases to discover QTLs (Grupe et al. 2001).

The mixed-model approach, which was developed to exploit massive amounts of phenotypic and pedigree data in animal breeding (Henderson 1984), has been successfully adapted in plants (Panter and Allen 1995; Bernardo 1996). The integration of genomic data in the mixed-model approach for the purpose of QTL mapping in hybrid crops was first suggested by Bernardo (1998). Recently, Parisseaux and Bernardo (2004) found that in silico mapping via a mixed-model approach can detect repeatable associations across different populations. Specifically, they attempted to identify simple sequence repeat (SSR) markers associated with different traits in maize (Zea mays L.) by utilizing the following data already available in a private breeding program: (1) multilocation phenotypic data for 22,774 single-cross hybrids; (2) SSR marker data at 96 loci for the 1,266 parental inbreds of the single-cross hybrids; (3) pedigree records for the 1,266 parental inbreds. By in silico mapping via a mixed-model approach, they detected 37 SSR markers with significant effects for plant height, 24 for smut [Ustilago maydis (DC.) Cda.] resistance, and 44 for grain moisture.

The statistical power of in silico mapping via the Parisseaux and Bernardo (2004) approach, however, remains unknown. It has been shown that the heritability and genetic architecture (e.g., number of QTLs and distribution of effects) of the trait and the resources available for QTL mapping (e.g., sample size and number of markers) affect the statistical power of designed QTL mapping experiments (Lande and Thompson 1990; Beavis 1994; Utz and Melchinger 1994). These genetic and non-genetic factors are also expected to affect the power of in silico mapping via a mixed-model approach. Moreover, the false discovery rate (FDR), which is the probability of a QTL being false given that a QTL has been declared, has recently been proposed as another criterion to assess the ability to detect QTLs (Benjamini and Hochberg 1995; Fernando et al. 2004). Our objective was to evaluate, by computer simulation, the power of in silico mapping via a mixed-model approach in hybrid crops. We used maize as a model species, but the results should generally apply to other hybrid crops.

Materials and methods

Maize breeding comprises two stages, inbred development and hybrid testing (Hallauer 1990). During inbred development, pairs of elite inbreds that belong to the same heterotic group are crossed to form an F2 or backcross population from which new inbreds are developed. Lines are developed by selfing and are crossed to one or two inbred testers from a different heterotic group. The testcrosses are then evaluated in field trials at several locations (Smith et al. 1999). During hybrid testing, single-cross hybrids made from pairs of inbreds from different heterotic groups are evaluated in multilocation performance trials. The tested hybrids, however, often account for only 10% to 15% of all potential combinations of single-crosses (Bernardo 1996).

Our simulation mimicked this two-stage breeding process in maize. First, we considered two opposite heterotic groups, each having a total of n1=n2=112 inbreds developed from different ancestral inbreds. Second, we assumed that n=600 or 2,400 hybrids, among all potential single-cross hybrids (112×112=12,544) between the two heterotic groups, had data available from multilocation performance trials. The number of inbreds in each heterotic group and the number of hybrids with available phenotypic data were chosen to agree with the empirical data of Parisseaux and Bernardo (2004).

We conducted a total of 64 simulation experiments. These 64 experiments had contrasting values of six different parameters: level of initial linkage disequilibrium (t=10 or 20 generations of random mating), significance level (α=0.01 or 0.0001), number of QTLs (l=20 or 80), heritability (H=0.40 or 0.70), number of markers (m=200 or 400), and sample size (n=600 or 2,400 hybrids). For each experiment, 50 runs were conducted with different locations of QTLs and markers on the genetic map and different inbreds and hybrids. Data from each run of a particular experiment were individually analyzed with the mixed-model method and the results from 50 runs were then summarized. We wrote a simulation program in C++ and conducted the simulation and data analysis on an IBM Power4 supercomputer at the Supercomputing Institute for Digital Simulation and Advanced Computation, University of Minnesota.
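The factorial layout described above can be enumerated as a quick check on the experiment count. The following Python sketch is illustrative only; the dictionary keys and the function name are chosen here and are not identifiers from the original C++ program.

```python
from itertools import product

# Two contrasting levels for each of the six simulation parameters (values from the text).
LEVELS = {
    "t": (10, 20),            # generations of random mating
    "alpha": (0.01, 0.0001),  # significance level
    "l": (20, 80),            # number of QTLs
    "H": (0.40, 0.70),        # heritability during inbred development
    "m": (200, 400),          # number of markers
    "n": (600, 2400),         # number of tested hybrids
}


def enumerate_experiments(levels=LEVELS):
    """Return one dict per combination of parameter levels."""
    names = list(levels)
    combos = product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


experiments = enumerate_experiments()
print(len(experiments))  # number of parameter combinations
```

Each of these combinations would then be replicated 50 times, as stated above.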

Inbred development

Two ancestral inbreds in heterotic group 1 were crossed and random-mated for t=10 or 20 generations. Likewise, two ancestral inbreds in heterotic group 2 were crossed and random-mated for t=10 or 20 generations. The purpose of considering both t=10 and t=20 was to create different levels of initial linkage disequilibrium between the QTLs and markers (i.e., high for 10 generations of random mating and low for 20 generations of random mating). A total of 16 founder inbreds were obtained through single-seed descent from the random-mated F2 population of each heterotic group. These founder inbreds were denoted by I_{1,1}, I_{2,1}, ..., I_{16,1} for heterotic group 1 and I_{1,2}, I_{2,2}, ..., I_{16,2} for heterotic group 2, where the first subscript indexes the founder and the second subscript indexes the heterotic group. For heterotic group 1, chain crosses were made among the 16 founder inbreds, i.e., I_{1,1}×I_{2,1}, I_{2,1}×I_{3,1}, ..., I_{16,1}×I_{1,1}. The recombinant inbreds from I_{1,1}×I_{2,1} were then testcrossed to I_{1,2}, the recombinant inbreds from I_{2,1}×I_{3,1} were testcrossed to I_{2,2}, and so on. Heritability on a testcross-mean basis (across testing locations) during inbred development was 0.40 or 0.70. Based on testcross performance, the best two recombinant inbreds from each cross were selected, resulting in a total of 32 second-cycle inbreds. Using the same chain-cross and testing system, a total of 64 third-cycle inbreds were developed. The same process of inbred development was performed simultaneously for heterotic group 2. Coefficients of coancestry among inbreds within each heterotic group were calculated from pedigree records by tabular analysis (Emik and Terrill 1949).
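The bookkeeping of the chain-cross scheme can be sketched as follows. This Python fragment only tracks pedigree labels and counts; it does not simulate genotypes or selection. It assumes that the third cycle was formed by chain-crossing the 32 second-cycle inbreds in the same circular fashion, and the label format and function names are hypothetical.

```python
def chain_crosses(parents):
    """Pair each parent with the next one and close the chain (last x first)."""
    k = len(parents)
    return [(parents[i], parents[(i + 1) % k]) for i in range(k)]


def next_cycle(parents, selected_per_cross=2):
    """Label the inbreds retained from each chain cross (selection is not simulated)."""
    return [
        f"({a}x{b})-{s + 1}"
        for a, b in chain_crosses(parents)
        for s in range(selected_per_cross)
    ]


founders = [f"I{i}_1" for i in range(1, 17)]  # 16 founders of heterotic group 1
second_cycle = next_cycle(founders)            # 16 crosses x 2 selected inbreds
third_cycle = next_cycle(second_cycle)         # assumed: 32 crosses x 2 selected inbreds
group_size = len(founders) + len(second_cycle) + len(third_cycle)
print(len(second_cycle), len(third_cycle), group_size)
```

Under these assumptions the three cycles sum to the 112 inbreds per heterotic group used in the simulation.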

Hybrid testing

A total of n=600 or 2,400 hybrids were assumed to be evaluated in P performance trials. Each performance trial included 30 different hybrids. The effects of performance trials were randomly drawn from a normal distribution with zero mean, and the variance of performance trial effects was scaled to account for 70% of the total variation (Delacy and Cooper 1990). The heritability was adjusted upward from H=0.40 or 0.70 during inbred development to H=0.67 or 0.88 during hybrid testing. This adjustment was necessary given that the number of locations in each performance trial is about three times greater during hybrid testing than during inbred development (Smith et al. 1999).
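The upward adjustment of heritability can be reproduced with a short calculation. The sketch below assumes the usual entry-mean definition H = VG/(VG + VE/s), so that multiplying the number of locations s by a factor k divides only the non-genetic term by k; this gives H_k = kH/[1 + (k − 1)H]. The function name is hypothetical, and the formula is an assumption about how the adjustment was made rather than a statement from the source.

```python
def adjust_heritability(h, k):
    """Entry-mean heritability after the number of locations is multiplied by k.

    Assumes H = VG / (VG + VE / s), so only the non-genetic term shrinks with k.
    """
    if not 0.0 < h < 1.0:
        raise ValueError("heritability must lie strictly between 0 and 1")
    return k * h / (1.0 + (k - 1.0) * h)


for h in (0.40, 0.70):
    # Three times as many locations during hybrid testing (from the text).
    print(f"H={h:.2f} -> {adjust_heritability(h, 3):.3f}")
```

With k=3, the adjusted values are approximately 0.667 and 0.875, which agree with the reported 0.67 and 0.88 after rounding.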

Genetic model

We considered a published maize linkage map with 1,749 cM for ten chromosomes (Senior et al. 1996). A total of l=20 or 80 QTLs and m=200 or 400 markers were randomly located across the genome. Four alleles were present at each QTL or marker locus, but each heterotic group had only two alleles at a locus. The two ancestral inbreds in heterotic group 1 had a QTL genotype of Q1Q1 or Q4Q4 and a marker genotype of M1M1 or M4M4 at the odd-numbered loci, and a QTL genotype of Q2Q2 or Q3Q3 and a marker genotype of M2M2 or M3M3 at the even-numbered loci. The two ancestral inbreds in heterotic group 2 had a QTL genotype of Q2Q2 or Q3Q3 and a marker genotype of M2M2 or M3M3 at the odd-numbered loci, and a QTL genotype of Q1Q1 or Q4Q4 and a marker genotype of M1M1 or M4M4 at the even-numbered loci. For single crosses between heterotic groups, the four possible genotypes were Q1Q2, Q1Q3, Q2Q4, and Q3Q4 at each QTL locus, and M1M2, M1M3, M2M4, and M3M4 at each marker locus.

The effects of the QTLs followed a geometric series. The effect of the ith QTL was a function of a^i, where a=0.9 for l=20 QTLs and a=0.98 for l=80 QTLs (Lande and Thompson 1990). The genotypic values for the four homozygous genotypes at each QTL were a^i for Q1Q1, (1/2)a^i for Q2Q2, −(1/2)a^i for Q3Q3, and −a^i for Q4Q4. Assuming complete dominance among alleles at each QTL, the genotypic values at the ith QTL for single-cross hybrids were a^i for Q1Q2, a^i for Q1Q3, (1/2)a^i for Q2Q4, and −(1/2)a^i for Q3Q4. The differences in testcross general combining ability (GCA) effects were a^i between Q1 and Q4, and (1/2)a^i between Q2 and Q3. The testcross dominance deviations were −(1/4)a^i for Q1Q2, (1/4)a^i for Q1Q3, (1/4)a^i for Q2Q4, and −(1/4)a^i for Q3Q4.

Mixed model

The mixed model used for analysis with k<m markers was

y=Xβ+M1α1+M2α2+M3δ+Z1g1+Z2g2+e

where y=n×1 vector of observed performance for a given trait; β=p×1 vector of fixed effects associated with performance trials; α1=2k×1 vector of GCA effects associated with the marker alleles in heterotic group 1; α2=2k×1 vector of GCA effects associated with the marker alleles in heterotic group 2; δ=4k×1 dominance deviation vector for four hybrid genotypes; g1=n1×1 vector of background GCA effects, not associated with the marker being tested, of the inbreds in heterotic group 1; g2=n2×1 vector of background GCA effects, not associated with the marker being tested, of the inbreds in heterotic group 2; e=n×1 vector of residual effects; X, M1, M2, M3, Z1, and Z2 were incidence matrices of 1s and 0s relating y to β, α1, α2, δ, g1 and g2, respectively.

The variances of the random effects were Var(g1)=G1VGCA(1), Var(g2)=G2VGCA(2), and Var(e)=RVR, where VGCA(1) and VGCA(2)=GCA variances in heterotic groups 1 and 2; VR=residual variance; G1=n1×n1 matrix of coefficients of coancestry among group-1 inbreds; G2=n2×n2 matrix of coefficients of coancestry among group-2 inbreds; R=n×n matrix with the off-diagonal elements being zero and the diagonal elements being the reciprocal of the number of locations from which each phenotypic data point was obtained. Best linear unbiased estimates (BLUE) of β, α1, α2, and δ (fixed effects), and best linear unbiased predictions (BLUP) of g1 and g2 (random effects) were obtained by solving the mixed-model equations for single crosses (Henderson 1985). Restricted maximum likelihood (REML) estimates of the variances were obtained through iteration (Henderson 1985). We assumed equal numbers of testing locations (s) among performance trials. Correspondingly, R became an identity matrix (I) and Var(e) was equal to I(VR/s).
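For reference, the mixed-model equations implied by this variance structure can be written in the standard Henderson form. The block below is a sketch, with the shorthand W=[X M1 M2 M3], b=[β' α1' α2' δ']', λ1=VR/VGCA(1), and λ2=VR/VGCA(2) introduced here for compactness (these symbols are not part of the original notation). It assumes that G1 and G2 are invertible; if W is not of full column rank, a generalized inverse or constraints on the fixed effects would be needed.

```latex
\begin{bmatrix}
W'R^{-1}W   & W'R^{-1}Z_1                        & W'R^{-1}Z_2 \\
Z_1'R^{-1}W & Z_1'R^{-1}Z_1 + \lambda_1 G_1^{-1} & Z_1'R^{-1}Z_2 \\
Z_2'R^{-1}W & Z_2'R^{-1}Z_1                      & Z_2'R^{-1}Z_2 + \lambda_2 G_2^{-1}
\end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{g}_1 \\ \hat{g}_2 \end{bmatrix}
=
\begin{bmatrix} W'R^{-1}y \\ Z_1'R^{-1}y \\ Z_2'R^{-1}y \end{bmatrix}
```

In this form only the variance ratios λ1 and λ2, not the individual variances, enter the coefficient matrix.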

Data analysis

In silico mapping via a mixed-model approach comprised three steps. In the first step, a mixed model ignoring the marker data (i.e., without α1, α2, and δ in the model) was fitted to obtain estimates of VR, VGCA(1), and VGCA(2) using the above formulas. In the second step, single-marker analysis was performed for each marker, using VR/VGCA(1) and VR/VGCA(2) ratios obtained from the first step rather than from new estimates obtained by iteration. Using REML estimates of VR/VGCA(1) and VR/VGCA(2) ratios obtained from the first step reduced the computational time 200- to 400-fold, with little impact on the markers chosen for the multiple-marker analysis (i.e., the third step). An F-test was performed to test the significance (α=0.01 or 0.0001) of the marker effects (α1, α2, and δ) as described by Kennedy et al. (1992). To reduce multicollinearity, only the marker with the most significant P-value was chosen if several adjacent markers were significant.
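The thinning rule at the end of the second step can be illustrated with a small sketch. It assumes that "adjacent" means consecutive in map order on one chromosome, so that each unbroken run of significant markers contributes only its marker with the smallest P-value. The function name and the example P-values are hypothetical.

```python
def thin_adjacent(p_values, alpha):
    """Return indices of retained markers on one chromosome.

    p_values must be in map order. Within each run of consecutive significant
    markers (P < alpha), only the marker with the smallest P-value is kept.
    """
    kept, run = [], []
    for idx, p in enumerate(p_values):
        if p < alpha:
            run.append(idx)
            continue
        if run:
            kept.append(min(run, key=lambda j: p_values[j]))
            run = []
    if run:  # close a run that reaches the end of the chromosome
        kept.append(min(run, key=lambda j: p_values[j]))
    return kept


# Hypothetical single-marker P-values for six markers in map order.
example = [0.5, 0.005, 0.001, 0.2, 0.009, 0.3]
print(thin_adjacent(example, alpha=0.01))
```

In this example, markers 1 and 2 form one run and marker 4 stands alone, so indices 2 and 4 would be carried forward to the multiple-marker analysis.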

In the third step, multiple-marker analysis was performed by simultaneously fitting effects for those markers retained from the single-marker analysis. The BLUE of fixed effects and BLUP of random effects were obtained by solving the mixed-model equations until convergence of VR, VGCA(1), and VGCA(2). An F-test at the α=0.01 or 0.0001 significance levels was performed for the fixed marker effects. Based on the complete dominance model, a=1.5×(α1+α2+δ) was considered the estimated effect for the significant marker. The power of in silico mapping via a mixed-model approach was evaluated based on the results from the final multiple-marker analysis. A marker is expected to have a significant regression coefficient only if it is adjacent to a QTL (Doerge et al. 1994; Whittaker et al. 1996). A true positive was therefore declared if a marker had at least one significant regression coefficient for α1, α2, or δ and a QTL was present in either or both of the marker’s adjacent intervals. A false positive was declared if a marker had at least one significant regression coefficient for α1, α2, or δ but no QTL was present in either of the marker’s adjacent intervals.
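The true-positive and false-positive rule can be sketched for a single chromosome as follows. The sketch assumes that a marker's adjacent intervals are bounded by its nearest neighboring markers on the map and that, for a terminal marker, the outer interval extends to the end of the chromosome; both are assumptions of this illustration, as are the function name and the example positions.

```python
def classify_markers(marker_pos, qtl_pos, significant):
    """Label each significant marker on one chromosome as 'TP' or 'FP'.

    marker_pos:  sorted marker positions in cM
    qtl_pos:     QTL positions in cM
    significant: indices into marker_pos of markers declared significant
    """
    labels = {}
    for j in significant:
        # Bounds of the two intervals flanking marker j (open-ended at chromosome ends).
        left = marker_pos[j - 1] if j > 0 else float("-inf")
        right = marker_pos[j + 1] if j + 1 < len(marker_pos) else float("inf")
        has_qtl = any(left < q < right for q in qtl_pos)
        labels[j] = "TP" if has_qtl else "FP"
    return labels


# Hypothetical chromosome: four markers, one QTL at 12 cM.
print(classify_markers([0.0, 10.0, 20.0, 30.0], [12.0], significant=[1, 3]))
```

Here the marker at 10 cM flanks the QTL and would count as a true positive, whereas the marker at 30 cM would count as a false positive.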

In this manuscript, power refers to the ability to detect a given QTL, whereas average power refers to the mean power across QTL. Power was calculated as the proportion of the number of times a particular QTL was detected out of 50 runs. The correlation was calculated between the true effect of a QTL and the power to detect that QTL. Average power was calculated as the number of true positives divided by the total number of QTLs simulated, averaged across 50 runs. The FDR was calculated as the number of false positives divided by the total number of significant markers detected, averaged across 50 runs.
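These summary statistics can be expressed compactly. The following sketch takes per-run counts of true and false positives and returns the average power and the FDR. Runs in which no marker was significant have an undefined FDR and are skipped here when averaging the FDR, which is an assumption of this sketch rather than a rule stated in the text. The function name and the example counts are hypothetical.

```python
def summarize_runs(runs, n_qtl):
    """Average power and FDR across runs.

    runs:  list of (true_positives, false_positives) counts, one tuple per run
    n_qtl: total number of QTLs simulated in each run
    """
    power = [tp / n_qtl for tp, _ in runs]
    # FDR is undefined when nothing was declared significant; skip such runs.
    fdr = [fp / (tp + fp) for tp, fp in runs if tp + fp > 0]
    avg_power = sum(power) / len(power)
    avg_fdr = sum(fdr) / len(fdr) if fdr else float("nan")
    return avg_power, avg_fdr


# Two hypothetical runs with 20 simulated QTLs each.
print(summarize_runs([(5, 5), (10, 0)], n_qtl=20))
```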

For a true positive, the true effect of a significant marker was assumed to be equal to the effect of the adjacent QTL (ai). The true effect was zero for a false positive. The correlation was calculated between the estimated effects and the true effects of significant markers for each run whenever possible—i.e., more than two significant markers were identified for that particular run. The mean and median of these correlation coefficients across 50 runs were calculated. A chi-square test was applied to the ratio between the number of times the effect was overestimated and the number of times the effect was underestimated, given the null hypothesis of a 1:1 ratio.
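The chi-square test against a 1:1 ratio reduces to a one-line statistic with one degree of freedom. The sketch below computes it from the counts of overestimates and underestimates and obtains the P-value from the complementary error function, using the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable. No continuity correction is applied; whether one was used in the original analysis is not stated. The function name and the example counts are hypothetical.

```python
import math


def chi_square_1to1(n_over, n_under):
    """Chi-square statistic (1 df) and P-value for a 1:1 ratio of two counts."""
    total = n_over + n_under
    if total == 0:
        raise ValueError("at least one observation is required")
    # With expected counts total/2 each, the statistic simplifies to (o - u)^2 / (o + u).
    stat = (n_over - n_under) ** 2 / total
    # For 1 df: P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


stat, p_value = chi_square_1to1(30, 10)  # hypothetical counts
print(f"chi-square = {stat:.2f}, P = {p_value:.4f}")
```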

Results

The average map distance between loci (both QTLs and markers) under different combinations of the number of QTLs and the number of markers ranged from 3.6 cM to 8.0 cM (Table 1). The corresponding recombination frequencies (r) in the initial F2 population between the ancestral inbreds, for the Kosambi mapping function, ranged from 0.04 to 0.08. The linkage disequilibrium before random mating (D0) at the genome level was similar among different combinations of the number of QTLs and markers. After t=10 generations of random mating, the linkage disequilibrium [Dt=D0(1−r)^t] varied from 0.09 to 0.16. After t=20 generations of random mating, Dt varied from 0.04 to 0.11.
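The quantities in this paragraph can be approximated with a few lines of arithmetic. The sketch below assumes that the average spacing is the map length divided by the total number of loci (ignoring chromosome ends), applies the Kosambi mapping function r = (1/2)tanh(2d) with d in Morgans, and evaluates the decay Dt=D0(1−r)^t for a user-supplied D0. The function names are hypothetical, and no value of D0 is taken from Table 1.

```python
import math

MAP_LENGTH_CM = 1749.0  # total map length (from the text)


def mean_spacing_cm(n_qtl, n_markers, map_length=MAP_LENGTH_CM):
    """Approximate average distance between loci, ignoring chromosome ends."""
    return map_length / (n_qtl + n_markers)


def kosambi_r(d_cm):
    """Recombination frequency for a map distance in cM (Kosambi mapping function)."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


def ld_after(d0, r, t):
    """Linkage disequilibrium after t generations of random mating."""
    return d0 * (1.0 - r) ** t


for n_qtl, n_markers in ((20, 200), (80, 400)):
    d = mean_spacing_cm(n_qtl, n_markers)
    print(f"l={n_qtl}, m={n_markers}: {d:.2f} cM, r={kosambi_r(d):.3f}")
```

The sparsest and densest combinations give spacings of about 8.0 and 3.6 cM and recombination frequencies of about 0.08 and 0.04, consistent with the ranges reported above.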

Table 1 Genome-wide linkage disequilibrium before and after 10 or 20 generations of random mating with different numbers of QTLs and markers

Both the number of QTLs controlling a trait and the heritability of the trait had a prominent effect on the average power of in silico mapping (Table 2). The average power decreased as the number of QTLs increased. The average power across experiments (i.e., averaged across experiments at a specified level of a factor) decreased from 0.34 when 20 QTLs controlled the trait, to 0.17 when 80 QTLs controlled the trait. This decrease in the average power was more evident at more stringent significance levels. The FDR across experiments also decreased, from 0.41 when 20 QTLs controlled the trait, to 0.23 when 80 QTLs controlled the trait. Moreover, the average power increased as the heritability increased. Increasing heritability from 0.40 to 0.70 led to an increase in the average power across experiments from 0.21 to 0.29, but it led to only a slight increase in the FDR, from 0.30 to 0.34. Notably, the gain in average power (37%) outweighed the loss in FDR (15%).

Table 2 Average power, false discovery rate (FDR), mean and median (in parentheses) of the correlation between estimated and true effects, and χ2 test for the number of underestimates versus the number of overestimates from a ratio of 1:1, using in silico mapping via a mixed-model approach

Having more resources available for in silico mapping led to a higher average power. When the number of markers increased from 200 to 400, the average power across experiments increased from 0.21 to 0.29 (Table 2). The FDR across experiments, on the other hand, increased from 0.27 to 0.37. With a sample size of 600 tested hybrids, the average power across experiments was 0.19 and the FDR across experiments was 0.26. Increasing the sample size to 2,400 tested hybrids led to a higher average power across experiments (0.32), but also to a higher FDR across experiments (0.38). The gain in the average power across experiments (69%), however, still outweighed the loss in the FDR across experiments (45%).

We found a strong association between the average power and the FDR. An increase in the average power at different levels of each parameter generally led to an increase in the FDR, and vice versa. However, when a high marker density and a large sample size were used, a more stringent α level allowed a compromise between the average power and the FDR. With α=0.01, the average power across experiments was 0.33 and the FDR across experiments was 0.47 (Table 2). A more stringent significance level of α=0.0001 led to an average power across experiments of 0.17, half of that at α=0.01, and an FDR across experiments of 0.17, a 63% decrease from α=0.01 (Table 2). At both significance levels, the maximum average power (0.59 for α=0.01, and 0.47 for α=0.0001) was achieved when a trait was controlled by 20 QTLs and had a heritability of 0.70, linkage disequilibrium was high (i.e., ten generations of random mating), and in silico mapping was conducted with 400 markers and 2,400 hybrids. The minimum average power (0.11 for α=0.01, and 0.01 for α=0.0001) was achieved when a trait was controlled by 80 QTL and had a heritability of 0.40, linkage disequilibrium was low (i.e., 20 generations of random mating), and in silico mapping was conducted with 200 markers and 600 hybrids. The FDR closely followed this pattern for average power.

The mean of correlation coefficients between the true effects and the estimated effects was moderately high when 20 QTLs controlled a trait but was low when 80 QTLs controlled a trait (Table 2). The median of correlation coefficients followed the same pattern. Average power was low when a trait was controlled by 80 QTLs and had a heritability of 0.40, linkage disequilibrium was low (i.e., 20 generations of random mating), and in silico mapping was conducted with 200 markers and 600 hybrids at the α=0.0001 significance level (Table 2). Consequently, the mean and median of correlation coefficients were negative.

The correlations between the true effects of individual QTLs and the power to detect the QTLs ranged from 0.69 to 0.98 and were all highly significant (results not shown). In our model, the upper quartile QTLs (e.g., QTLs 1 to 20 for l=80) had larger effects than the lower quartile QTLs (e.g., QTLs 61 to 80 for l=80). With α=0.01, the average power across experiments for the upper quartile QTLs (0.53) was 3.5-fold that for the lower quartile QTLs (0.15). With α=0.0001, the average power across experiments for the upper quartile QTLs (0.37) was about tenfold that for the lower quartile QTLs (0.04).

Discussion

In silico mapping has four advantages over designed mapping experiments (Parisseaux and Bernardo 2004). First, in silico mapping exploits larger populations than designed mapping experiments. Second, the phenotypic data used in in silico mapping are obtained through more extensive testing under multiple, diverse environments. Third, the hybrids and inbreds tested typically represent wider genetic backgrounds. Fourth, the data used for in silico mapping are available without extra cost.

In plant breeding programs, the phenotypic data are highly unbalanced and the inbreds and hybrids have a pedigree structure. The original in silico mapping procedure proposed by Grupe et al. (2001) does not consider pedigree structures and becomes less powerful when data are unbalanced. In contrast, in silico mapping via a mixed-model approach accommodates unbalanced data, pedigree relationships, and different heterotic groups of parental inbreds by fitting relevant terms in the mixed model. Furthermore, the relative effects of the QTLs are measured by the regression coefficients of the significant markers, and the approximate positions of the QTLs are indicated by the locations of the significant markers. Interval mapping could conceivably be used within the method we described to estimate the location of a QTL within a marker interval. An interval mapping approach, however, might be computationally prohibitive.

The power of in silico mapping via a mixed-model approach in hybrid crops was affected by the heritability and genetic architecture of the trait (e.g., number and effects of QTLs), the resources available for mapping (e.g., number of markers and sample size), and the genetic structure of the particular breeding population (e.g., initial linkage disequilibrium among the founder inbreds). As with designed mapping experiments (Haley and Knott 1992; Beavis 1994), a large sample size, high marker density, high heritability, and small number of QTLs led to the highest power for in silico mapping via a mixed-model approach. A higher power to detect QTLs is expected for a trait with a high heritability and supposedly controlled by few QTLs (e.g., grain moisture and plant height in maize). A moderate power to detect QTLs is expected for a complex trait with a low to medium heritability and supposedly controlled by few QTLs (e.g., most disease resistance traits in maize). A low power to detect QTLs is expected for a complex trait with a low heritability and supposedly controlled by many QTLs (e.g., stalk lodging and root lodging in maize).

The strong association between the average power and FDR reflected a main challenge in dissecting complex traits. A compromise between the power to detect QTLs and the risk of having false positives can be reached by choosing an appropriate significance level for a given level of resources (e.g., sample size and marker density) and trait complexity (e.g., number of QTLs and heritability).

It would be useful to directly compare the power of a designed QTL mapping experiment (e.g., Haley and Knott 1992; Beavis 1994) with the power of in silico mapping via a mixed-model approach. Previous simulations of designed QTL mapping experiments (Haley and Knott 1992; Beavis 1994), however, involved less realistic assumptions, i.e., few, unlinked QTLs with equal effects underlying a complex trait. In contrast, we considered up to 80 randomly located (and hence potentially linked) QTLs whose effects followed a geometric series. Assuming few, unlinked QTLs with equal effects optimizes the power to detect QTLs. A recent study by Bernardo (2004), however, considered a typical designed QTL mapping experiment with more realistic assumptions of genetic architecture of a complex trait, as in our study. Specifically, Bernardo (2004) considered a trait controlled by 30–100 QTLs, a heritability of 0.20 to 0.80, a sample size of 150 F2-derived families, and 100 markers. For a significance level of α=0.0001 to 0.01, the average power to detect QTLs ranged from 0.01 to 0.17 (versus 0.01 to 0.59 in the current study). These results suggest that the power of in silico mapping via a mixed-model approach compares favorably with the power of designed QTL mapping experiments. Although this comparison involves different population sizes in the two studies, those sizes do reflect the population sizes typically available for the two approaches.

Previous studies using designed mapping populations have found that if only a small proportion of underlying QTLs were detected, the total variation accounted for by these detected QTLs was overestimated (Beavis 1994; Utz and Melchinger 1994). Our results agreed with this general finding. While a lower power was achieved when 80 QTLs controlled a trait, the effects of significant markers were generally overestimated. Two factors presumably led to differences between the estimated effects and true effects of QTL in our study: (1) a QTL effect being confounded with the recombination distance between the QTL and a significant marker, and (2) a QTL effect being confounded with the effects of linked QTLs. The balance between these confounded factors determines the estimates of marker effects. As the number of QTLs increased, separating the effects of different linked QTLs became more difficult.

In gene discovery, one might want to identify primarily those QTLs with large effects. For traits controlled by few QTLs, the estimated effects of the significant markers were highly correlated with the effects of the underlying QTLs. This was not the case for traits controlled by many QTLs. The significant correlation between the true effects of QTLs and the power to detect QTLs, however, suggested that if an identified QTL is verified to be true, it is more likely to be one with large effect than with small effect, regardless of the number of QTLs underlying the trait.

Overall, our results indicated that gene discovery in hybrid crops can be initiated by in silico mapping via a mixed-model approach. It would be necessary, however, to find an acceptable compromise between the power to detect QTLs and the FDR. As with other QTL mapping methods, the results from in silico mapping should be followed by fine-mapping of the target regions, sequence analysis, and functional tests of gene effects (Glazier et al. 2002). In hybrid crops for which multiple heterotic groups exist, in silico mapping via a mixed-model approach can be applied to different heterotic patterns. Subsequently, the markers or the genomic regions that show a repeatable association with the trait of interest across different populations can be considered as the prime targets for further analysis (Parisseaux and Bernardo 2004). Cross-validation by conducting in silico mapping in multiple heterotic patterns would result in better control of the overall FDR and provide increased confidence in conducting further investigation of putative QTL regions.