Introduction

Quantitative trait loci (QTL) mapping continues to be an important method for studying the genetic architecture of quantitative traits. QTLs are genomic regions that significantly affect complex traits. Recently, most studies on epistatic QTL mapping have used a multiple interval mapping (MIM) model based on a frequentist approach. These investigations have shown that epistasis regularly determines quantitative traits (Goto et al. 2019; Xu et al. 2021; Yang et al. 2020), as emphasized by Mackay (2014). However, most of the genotypic variance is additive (Hill et al. 2008). Epistasis is the interaction between non-allelic genes. In recent investigations, the epistatic QTL mapping was based on 80–200 recombinant inbred lines (RILs), using high to average single-nucleotide polymorphism (SNP) density (one SNP/0.7–4.7 cM). The theoretical background for a multiple QTL model with epistasis was developed in a series of studies that included theory and evaluation of the method efficacy based on field and/or simulated dataset (Boer et al. 2002; Carlborg et al. 2000; Jannink and Jansen 2001; Kao et al. 1999; Sen and Churchill 2001; Yi and Xu 2002; Yi et al. 2003).

The MIM model is a multiple QTL model and its likelihood function is a finite normal mixture (Kao et al. 1999). From the analysis of field data of a backcross design, Kao et al. (1999) observed that MIM identified more QTLs than composite interval mapping and interval mapping. In addition, the epistasis contributed 10–14% of the genotypic variance. Zeng et al. (1999) emphasized that MIM can improve the power of detection for minor QTL and the precision of estimating QTL positions. For mapping multiple interacting QTLs, Carlborg et al. (2000) proposed a weighted least-squares approach. On the basis of a simulated F2 dataset assuming none to 16 QTLs and a single pair of epistatic QTLs, they concluded that the genetic algorithm can be effective for the simultaneous mapping of multiple epistatic QTLs. The idea of a two-dimensional scan for searching epistatic QTLs was presented by Sen and Churchill (2001) using a Bayesian framework. From the analysis of field data from a backcross design and based mainly on the two-dimensional scan, the authors declared five QTLs explaining 3–13.5% of the phenotypic variance and two epistatic pairs explaining 6 and 6.5%.

The one-dimensional maximum likelihood approach proposed by Jannink and Jansen (2001) combines across- and within-population information using three doubled-haploid parents. The authors concluded that the method might double the power to detect first-order epistasis. Boer et al. (2002) proposed a penalized likelihood method for mapping epistatic QTLs with a one-dimensional genome search. The analysis of a simulated backcross dataset including three QTLs and all epistatic digenic interaction effects showed that the inclusion of the epistatic effects increased the QTL power of detection but under an effective dimension of 3 for epistatic interactions. Other Bayesian frameworks were proposed by Zuanetti and Milan (2022), Balestre and de Souza (2016), Yi et al. (2003), and Yi and Xu (2002).

Using simulated and field F2 data, Zuanetti and Milan (2022) proposed the method of data-driven reversible jump (DDRJ) for modeling epistasis and compared its efficacy with the usual reversible jump (RJ), QTLBim, MIM, and LASSO. DDRJ outperformed the other methods in regard to the power of detection and mapping precision without increasing the number of false positive QTLs. Balestre and de Souza (2016) proposed an RJ method for epistasis analysis. From the analyses of a simulated F2 and an empirical F2:3 datasets, the authors concluded that their method is effective to map epistatic QTLs. Using three simulated backcross data, Yi et al. (2003) observed no detection of false QTL under a no QTL model, a lower number of detected QTLs when ignoring epistasis, and a high power of detection for the epistatic QTLs. The genetic model adopted by Yi and Xu (2002) includes the additive × additive, additive × dominance, dominance × additive, and dominance × dominance epistatic effects. The analyses of three simulated backcross datasets, assuming two to three QTLs and one, two, and four epistatic effects, showed that fitting the epistatic model improved the power of detection for QTL with no main effect.

On the basis of the previously described methodological investigations on epistatic QTL mapping, the general conclusion is that this procedure is powerful, efficient to control the false positive rate (FPR), and precise to localize QTLs. That is, the power of detection for individual QTL and for epistatic pairs is close to 100%, there are no false positive QTL or there is an efficient control of the FPR, and the localization of the QTL is very precise. Compared with the almost-perfect results from most of the methodological studies, the investigations of Laurie et al. (2014) and Wei et al. (2010) provide some more realistic results as the influence of the trait heritability, the epistatic heritability, and the type of epistasis on the detection power of epistatic pairs. However, in both studies, there is also an almost-perfect control of the FPR, and the QTL mapping is very precise. It is surprising that an FPR close to zero (0.04) for both individual and epistatic QTLs is associated with a high power of detection, between 78 and 100%. It is also surprising that an increase in the power of QTL detection, from 8–16% to 90–99%, is associated with a decrease or an insignificant increase in the FPR.

Using theoretical and empirical results, we show that these almost-perfect results from simulated datasets are because of a combination of simplified specification of the coefficients of epistatic effects and no inclusion of minor genes. Thus, the objective of this simulation-based study on the efficacy of epistatic QTL mapping is to show that mapping epistatic QTLs is not an almost-perfect process. Differently from the previous studies based on simulation, we included 90 minor genes, defined the most known digenic epistasis types, and generated the additive, dominance, and four epistatic genetic values based on Kempthorne (1954).

Materials and methods

Simulation

The simulated dataset was generated using the software REALbreeding (available on request). The program has been developed using Xojo (https://www.xojo.com/). REALbreeding has been used in studies related to genomic selection (Viana et al. 2018), genome-wide association study (Pereira et al. 2018), QTL mapping (Viana et al. 2017), linkage disequilibrium (LD) (Andrade et al. 2019), population structure (Viana et al. 2013b), heterotic grouping/genetic diversity (Viana et al. 2020), and plant breeding (Viana et al. 2013a). It can also be used in research in human genetics, animal genetics and breeding, population genetics, and evolution. The program simulates individual genotypes for genes and molecular markers, and phenotypes in three stages, using inputs from the user. The first stage (genome simulation) is the specification of the number of chromosomes, molecular markers, and genes, as well as marker type and density. The second stage (population simulation) is the specification of the population(s) and sample size or progeny number and size. A population is characterized by the average frequency for the genes (biallelic) and markers (first allele). The final stage (trait simulation) is the specification of the minimum and maximum genotypic values for homozygotes, the minimum, and maximum phenotypic values (to avoid outliers), the direction and degree of dominance, and the broad sense heritability. Thus, the genetic effects are not sampled from a distribution. They are computed using the information provided by the user. For each gene (QTL and minor gene), the software computes the parametric (true) additive (A) and dominance (D) genetic values. On the basis of Kempthorne (1954, 1973), for each pair of epistatic genes (QTL-QTL, QTL-minor gene, and minor gene-minor gene), the software computes the additive × additive, additive × dominance, dominance × additive, and dominance × dominance genetic values. Finally, it sums the computed values to provide the genetic values for the trait.

The types of digenic epistasis are: complementary (G22 = G21 = G12 = G11 and G20 = G10 = G02 = G01 = G00; proportion of 9:7 in a F2, assuming independent assortment), duplicate (G22 = G21 = G20 = G12 = G11 = G10 = G02 = G01; proportion of 15:1 in a F2, assuming independent assortment), dominant (G22 = G21 = G20 = G12 = G11 = G10 and G02 = G01; proportion of 12:3:1 in a F2, assuming independent assortment), recessive (G22 = G21 = G12 = G11, G02 = G01, and G20 = G10 = G00; proportion of 9:3:4 in a F2, assuming independent assortment), dominant and recessive (G22 = G21 = G12 = G11 = G20 = G10 = G00 and G02 = G01; proportion of 13:3 in a F2, assuming independent assortment), duplicate genes with cumulative effects (G22 = G21 = G12 = G11, and G20 = G10 = G02 = G01; proportion of 9:6:1 in a F2, assuming independent assortment), and non-epistatic genic interaction (G22 = G21 = G12 = G11, G20 = G10, and G02 = G01; proportion of 9:3:3:1 in a F2, assuming independent assortment). Gij is the genotypic value for two epistatic genes, where i and j (i, j = 2, 1, or 0) are the number of copies of the gene that increases the trait expression.

The genotypic values for interacting genes are used to compute the additive, dominance, and epistatic genetic values, on the basis of Kempthorne (1954). Detailed description can be found in Viana and Garcia (2022). In summary, the parametric values of the 36 parameters for the nine genotypic values are obtained by solving the equations β = (XVX)–1 XVy, under the restrictions defined by Kempthorne (1954), where X is the incidence matrix, \(V={\rm{diagonal}}\{{f}_{ij}^{(n)}\}\) is the diagonal matrix of the genotype probabilities (n is the number of selfings), and y is the vector of the genotypic values (Gij) (i, j = 0, 1, and 2). The genotype probabilities for the non-inbred population (as F2) are presented by Viana (2004). The genotype probabilities for inbred populations (as RILs) are presented by Viana and Garcia (2022) (Supplementary information).

Dataset

REALbreeding crossed two contrasting inbred lines, assuming genes in association, and generated the populations F1, F2, and F3. The number of simulations of the segregant generations was 50. The sample size for the F2 and F3 were 400 and 4000 (400 progeny of size 10). The F2 plants were genotyped and phenotypes, and the F3 plants were phenotyped. We assumed genotyping for 975 SNPs distributed in 10 chromosomes of 100 cM. The number of SNPs per chromosome ranged from 93 to 100, and the average density was an SNP/cM. We simulated grain yield (g/plant) assuming 10 epistatic QTLs and 90 minor genes (7–10 per chromosome). We allocated one QTL in chromosome 1 (Q1), two QTLs in chromosomes 3 (Q2 and Q3), 5 (Q4 and Q5), and 7 (Q6 and Q7), and three QTLs in chromosome 9 (Q8–Q10). The minimum distance between linked QTLs was 20 cM. The epistatic pairs were Q1–Q5, Q2–Q4, Q3–Q8, Q6–Q7, and Q9–Q10. Thus, we defined two epistatic pairs in the same chromosome and three epistatic pairs in distinct chromosomes. We assumed seven scenarios of the same type of epistasis and an admixture of the distinct types. The trait broad sense heritability was 60%. The QTL heritabilities varied from approximately 1–13%, depending on the generation and epistasis type (7% on average). In general, one to four QTLs showed heritability between 1 and 5% (low), four to five QTLs had heritability between 6 and 9% (intermediate), and two to three QTLs showed heritability in the range 10–13% (high). The scenario of no epistasis included six QTLs with heritabilities in the range of 6–8%. Thus, for both scenarios, the average QTL heritability was 7%.

We also generated 50 simulations of a RIL population derived from the same parents, using a single seed descent process. The sample size was 400. Because we kept the trait heritability at 60%, the QTL heritabilities increased, ranging from approximately 2–22%, depending on the epistasis type (10% on average). Finally, we also processed the analysis of the RIL dataset, assuming a sample size of 600 and an admixture of epistasis types.

Efficiency of the epistatic QTL mapping

The dataset was processed using the R package qtl version 1.48.1 (Broman et al. 2003). We performed a two-dimension scan for the two-QTL model. The output for the epistatic QTLs from a two-dimension scan depends on five LOD thresholds. These LOD thresholds are for testing the full two-locus model versus the null (no QTL) model, the two-locus additive model versus the null model, interaction, the full two-locus model versus a one-locus model, and the two-locus additive model versus a one-locus model (Broman and Sen 2009). Because Broman and Sen (2009) declare that we are inclined to ignore the maximum LOD for interaction (Mi(j, k)), we processed the data using and ignoring Mi(j, k). Because a permutation test based on the expectation-maximization method is extremely time-consuming, we used 1000 permutations for the single-QTL model (one-dimension scan) but 100 permutations for the two-QTL model. For computing QTL detection power, FPR, and mapping precision (bias in the QTL positioning), we used a REALbreeding tool (eQTL summary). A declared QTL was assumed as a true QTL when the bias between its estimated and actual positions was lower than 20 cM. Because epistatic QTL mapping is more complex to interpret than non-epistatic QTL mapping, we first processed preliminary analyses using the single- and two-QTL models, the F2 and F2:3 designs, and two sample sizes (200 and 400), based on the first simulation data assuming no epistasis and an admixture of epistasis types. The main questions to be answered were: (1) do the two-QTL models provide a more effective analysis relative to the single-QTL model when there is epistasis? (2) does the F2:3 design provide a more effective analysis relative to the F2 design? and (3) does increasing the sample size significantly improve the epistatic QTL mapping?

Results

On the basis of the results from the preliminary analysis (Table 1), we realized that: (1) the two QTLs model provides a more effective analysis relative to the single-QTL model when there is epistasis; (2) the F2:3 design did not provide a more effective analysis relative to the F2 design; and (3) increasing the sample size significantly improves the epistatic QTL mapping. Under no epistasis and higher sample size, note that the power of detection for QTLs with heritability between 6 and 8% was 0.67. The higher FPR (0.20) is attributable to the threshold of 3.9. The FPR would be zero by using a threshold of 4.9 (26% higher). The bias was only 1 cM, on average. Assuming an admixture of epistasis types, note that fitting the two-QTL models to the F2 or F2:3 data of 400 genotyped plants provided the best results concerning power and FPR. Thus, for assessing the efficacy of the epistatic QTL mapping, we used the F2 data with 400 plants. Under this scenario, the thresholds assuming 100 and 250 permutations were essentially the same. An impressive negative result was a very high FPR under epistasis.

Table 1 Power of QTL detectiona, FPRa, and average bias (cM) in the positioning of the true QTLs from the analyses of the first simulation data, using the single- and two-QTL models, the F2 and F2:3 designs, and two sample sizes (200 and 400), assuming no epistasis (No) and an admixture of epistasis types (Ad) and defining thresholdsb at 5% from 1000 (single-QTL model) or 100 (two QTLs model) permutations.

The differences between the parents’ genotypic values and the magnitude of the F2 genotypic variance were lower assuming duplicate epistasis and higher under non-epistatic genic interaction (Table 2). Comparing with the scenario of no epistasis, where the ranges for the parents and the F2 genotypic variance were 120 g/plant and 367.7 (g/plant)2, respectively, note that epistasis significantly decreased the difference between parents and the F2 genotypic variance. This is impressive considering that it was assumed only 10% of epistatic genes.

Table 2 Genotypic values of the parents (P1 and P2) and F1 (g/plant), parametric F2 mean and genotypic variance (GV), thresholdsa at 5% from 100 permutations using data from simulation 1, average power of QTL detection for the low (L), intermediate (I), and high (H) heritability QTLs and for the epistatic pairs (E), FPRb, and bias (cM) in the positioning of the true QTLs, from the analyses of the F2 data assuming seven epistasis typesc and an admixture of epistasis types (Ad).

Regardless of the epistasis type, ignoring the maximum LOD for interaction (Mi(j, k)) maximizes the power of QTL detection irrespective of the QTL heritability (Table 2). The average power of QTL detection for low, intermediate, and high heritability QTLs were 56, 67, and 74%, respectively. However, even taking into account the Mi(j, k) obtained by permutations, the FPR was very high, in the range of 60–70%, approximately (65% on average). The high FPR is attributable mainly to the minor genes. The average FPR ignoring the false declarations in chromosomes with no QTL was 45% (range between 40 and 50%). Thus, we can estimate an average FPR of 23% (range of 15–30%), ignoring minor genes. But minor genes cannot be ignored. Note that taking into account or ignoring Mi(j, k), the power of detection for epistatic pairs is generally zero (average value of only 7%). Only under duplicate and duplicate genes with cumulative effects, ignoring Mi(j, k), we observed one or two epistatic pairs with power greater than or equal to 50%. Interestingly, for these three pairs, only one pair involved high heritability epistatic QTLs. Thus, we did not observe a positive association between the power of epistatic pairs and QTL heritability. As expected, due to the low power of detection for epistatic pairs, we observed a low FPR for epistatic pairs (maximum of 15%).

We analyzed the increase in the power of detection for the epistatic pairs by decreasing the Mi(j, k) obtained by permutations but filtering in the file of maximum LODs. But the results were disappointing (Table 2). The average power doubled (14%), but only one to three out of the five epistatic pairs showed approximate power of at least 10% (this means that only one in ten similar pairs would be declared), and the FPR was very high (85 to 97%, with one exception). Finally, we investigated the best balance between the power of QTL detection and FPR for both QTLs and epistatic pairs. That is, we tried to minimize the FPR by keeping a reasonable power of detection. Again, we filtered in the file of maximum LODs from the analysis ignoring Mi(j, k), by specifying a high threshold for the full two-locus model versus a one-locus model. The consequences of keeping an FPR of up to 30% for both QTLs and epistatic pairs were a significant decrease in the power of detection, proportional to the QTL heritability. The average power of detection for the low, intermediate, and high heritability QTLs were 17, 22, and 31%, respectively, and the average power of detection for epistatic pairs was 8%.

In regard to mapping precision, we observed an average bias in the positioning of the true declared QTLs in the range of 1.1, for duplicate genes with cumulative effects, and 6.0 cM, for duplicate epistasis (Table 2), for the best balance between the power of detection and FPR. The problems of very high FPR for individual QTLs and very low power of detection for epistatic pairs were also observed in a RIL population, even increasing the sample size to 600 (Table 3). Comparing with the results from F2, the analysis of the RILs provided, on average, a slightly lower QTL power of detection of approximately −7%, regardless of the maximum LOD for interaction. The average FPR was 67%, irrespective of the Mi(j, k) too. Using RILs did not increase the power of detection of epistatic pairs as well as the mapping precision. The average power of detection for epistatic pairs ranged from 0 to 18% and the bias in the positioning the epistatic QTLs increased 2 and 5%, ignoring and taking into account Mi(j, k). Again, regardless of the epistasis type, 60–100% of the epistatic pairs were not declared. Those pairs mapped showed low average power of detection, 19%. The value for F2 was 37%. No significant advantage was obtained by increasing the sample size to 600.

Table 3 Average genotypic value and genotypic variance (GV) in the RIL population, thresholdsa at 5% from 100 permutations using data from simulation 1, average power of QTL detection for the low (L), intermediate (I), and high (H) heritability QTLs and for the epistatic pairs (E), FPRb, and bias (cM) in the positioning of the true QTLs, from the analyses of the RIL data assuming seven epistasis typesc and an admixture of epistasis types (Ad).

Discussion

The results from this simulation-based study show that mapping epistatic QTLs is a challenge. Regardless of the epistasis type, the basic procedures of r/qtl maximized the QTL power of detection, by ignoring Mi(j, k), but associated with a very high FPR of approximately 60–70%. Furthermore, the lack of power to detect epistatic pairs is also disappointing. In F2, under complementary, recessive, dominant and recessive, and non-epistatic genic interaction, no epistatic pair was detected. For the other epistasis types and admixture of types, only one to two out of five epistatic pairs showed a power of at least 10%, approximately, with no clear association between power and the interacting QTL heritabilities (48.7% on average). Broman and Sen (2009) emphasized that they “are inclined to ignore Mi(j, k)” in the rule (8.3) and that they “place greater reliance on the numeric summaries from summary.scantwo”. This procedure alone, however, did not clearly provide the best balance between power and FPR for QTLs and epistatic pairs. We stress that this negative general result is not due to the use of r/qtl. We also analyzed the first simulation data using Windows QTL Cartographer (Wang et al. 2011), performing the sequential search proposed by Laurie et al. (2014), and QTL IciMapping (Meng et al. 2015), observing similar results to those provided by r/qtl. The problem could be a simplified specification of the coefficients of epistatic effects implemented in the software available.

Any software for QTL mapping process the phenotypic and molecular datasets from code specified under quantitative genetics theory. For F2 data, the genotypic value for two non-epistatic genes is G = ma + mb + θaaa + θbab + γada + γbdb = M + A + D, where, for each gene, m is the mean of the homozygotes, θ = 1 and γ = 0 if AA or BB, θ = 0 and γ = 1 if Aa or Bb, and θ = −1 and γ = 0 if aa or bb, a is the deviation between the genotypic value of the homozygote of greater expression and m, d is the dominance deviation (Falconer and Mackay 1996). The interval mapping is based on the conditional probabilities of the QTL genotypes in an interval flanked by two molecular markers. For example, if there is a QTL in a given interval, the average genotypic value for the individuals with SNP genotype mn (for the first SNP) op (for the second SNP) is Gmnop=M + [P(QQmnop) P(qqmnop)]a + P(Qqmnop)d, where P(QQmnop), P(qqmnop), and P(Qqmnop) are the conditional probabilities (Haley and Knott 1992). Because all software use correct modeling for the additive and dominance deviations, the QTL mapping of non-epistatic QTLs is really a powerful process that provides effective control of the FPR and precise positioning of the true declared QTLs, irrespective of the statistical approach (Viana et al. 2017).

However, correctly modeling epistasis in QTL mapping is a challenge to overcome. The genotypic value for two interacting genes is G = ma + mb + θaaa + θbab + γada + γbdb + I = M + A + D + I, where I is the epistatic effect (a specific value for each one of the nine genotypes). Partitioning the epistatic effect based on Kempthorne (1954), we have a much more complex situation: \({G}_{ijkl}=M+{\alpha }_{i}^{1}+{\alpha }_{j}^{1}+{\alpha }_{k}^{2}+{\alpha }_{l}^{2}+{\delta }_{ij}^{1}+{\delta }_{kl}^{2}+{({\alpha }^{1}{\alpha }^{2})}_{ik}+{({\alpha }^{1}{\alpha }^{2})}_{jk}+{({\alpha }^{1}{\alpha }^{2})}_{il}+{({\alpha }^{1}{\alpha }^{2})}_{jl}+{({\alpha }^{1}{\delta }^{2})}_{ikl}+{({\alpha }^{1}{\delta }^{2})}_{jkl}+{({\delta }^{1}{\alpha }^{2})}_{ijk}+{({\delta }^{1}{\alpha }^{2})}_{ijl}+{({\delta }^{1}{\delta }^{2})}_{ijkl}=M+A+D+AA+AD+DA+DD\), where α is the average effect of a gene, δ is the dominance value, and (αα), (αδ), (δα), and (δδ) are the additive × additive, additive × dominance, dominance × additive, and dominance × dominance effects, respectively. Assuming two epistatic QTLs in two intervals, the 81 parametric average genotypic values in F2 are complex to derive since the SNP genotype probabilities depend on the gamete probabilities regarding four SNPs and two QTLs. Only in this way could the correct coefficients for the epistatic effects be derived. In all the previous theoretical studies on epistatic QTL mapping there is a simplified specification of the coefficients of epistatic effects. Yi et al. (2003), Boer et al. (2002), Zeng et al. (1999), and Kao et al. (1999) assumed that the coefficient of the epistatic effect is the product of the coefficients of the effects, multiplied by an indicator variable, to define if the QTLs are epistatic or not. When modeling individual epistatic effects, the same general rule was assumed (Kao and Zeng 2002; Yi and Xu 2002). That is, the coefficient of the additive x additive effect, for example, was defined as the product between the coefficients for the two deviations (correctly attributed). However, as demonstrated in the Appendix, even assuming that the SNPs in the first interval have independent assortment (because independence or free recombination (1/2)) relative to the SNPs in the second interval, the coefficients for the epistatic effects do not follow this product rule. The same result was theoretically proved by Weir and Cockerham (1976), for the components of the genotypic variance in inbred populations, under LD. We were also surprised by the disappointing results from RILs. In this generation, the most significant component of the genotypic variance is the additive variance, higher than the additive × additive variance. The other epistatic components have a low magnitude, even with a high percentage of epistatic genes (Viana and Garcia 2022). Thus, even in RILs, there is a problem of theoretically modeling the epistatic effects.

Concerning the differences between this study and the previous investigations on the efficacy of mapping epistatic QTLs, including Laurie et al. (2014) and Wei et al. (2010), our results show that ignoring the effects of minor genes when computing the genotypic values explains the lower FPR previously observed. This study showed that approximately 2/3 of the FPR was due to minor genes. Furthermore, in the previous studies, there is generally a simplified specification of the epistatic effects when generating the simulated genotypic values. Most previous studies on the efficacy of epistatic QTL mapping ignored the types of epistasis and sampled epistatic effects from a Normal distribution, assuming IN(0, \({\sigma }_{I}^{2}\)). Only Wei et al. (2010) and Carlborg et al. (2000) considered five to six epistasis types. In regard to high-order epistasis, the epistasis type specification for three or more genes is complex, except for complementary and duplicate (for three epistatic genes, proportions of 27:37 and 63:1, respectively, in an F2 generation, assuming independent assortment). On the basis of Kempthorne (1954), the computation of the parametric genetic effects relative to three epistatic genes depends on the F1 gamete probabilities, where, for example, P(ABC) = (1–rab)(1–rbc)/2, assuming no interference. However, Maki-Tanila and Hill (2014) showed that the majority of the epistatic variance is due to two-locus interactions.

In the simulation-based studies on epistatic QTL mapping of Zuanetti and Milan (2022) and Balestre and de Souza (2016), the authors did not include minor genes and adopted the same product rule to specify the coefficients of the epistatic effects. However, their results also show that mapping epistatic QTLs is not an almost-perfect process. From a single simulated dataset, the DDRJ method proposed by Zuanetti and Milan (2022) provided a power of detection of 50% for the main QTLs and 40% for the epistatic pairs. The FPR for epistatic pairs achieved 78%. The authors also concluded that RJ, MIM, and LASSO showed a greater rate of false positives and/or a lower power of QTL detection, especially when estimating epistatic effects. Furthermore, because MIM was affected by the model selection criterion, the method may overestimate or underestimate the number of QTLs. The RJ method proposed by Balestre and de Souza (2016) provided a power of detection of 31% for the main QTLs and 14% for the epistatic pairs, associated with FPRs of 20 and 25%, respectively. The average bias was 6.1 cM.

The number of applied studies involving epistatic QTL mapping has increased. However, because most of them do not include any information on candidate genes (ontology and level of expression) and gene networks, the investigations do not prove that each declared QTL and epistatic pair are true declarations. Tadmor-Levi et al. (2019) and Li et al. (2018) observed QTLs with significant epistatic effects but with one or both interacting QTLs with no significant additive effects. Goto et al. (2019) found 15 pairs of epistatic QTLs from 27 main-effect QTLs, for chicken body weight. Regarding maize grain yield and five related traits, Yang et al. (2020) found 49 QTLs and 24 epistatic pairs. For rice grain yield and seven related traits, Chen et al. (2021) found 37 QTLs and 47 epistatic pairs. However, Xu et al. (2021) mapped only two QTLs with significant epistatic effects and Hu et al. (2021) did not find epistatic QTLs using MIM.

Concluding, on the basis of Kempthorne (1954) to simulate the components of the genotypic values assuming 10 epistatic QTLs and 90 minor genes, we show that mapping epistatic QTLs is a challenge. The basic procedures for the most important software available provided average power of detection for QTLs in the range of 40–65%, associated with low power for the epistatic pairs and a very high FPR for QTLs. Increasing the average power for epistatic pairs highly increased the related FPR. Adopting a procedure to find the best balance between power and FPR, there was a significant decrease in the power of QTL detection, associated with a low average power for epistatic pairs and an average FPR of 31% for QTLs and 16% for epistatic pairs. Under the best balance, the procedure is precise for positioning QTLs, showing an average bias of 3.5 cM. The main reasons for these negative results are a simplified specification of the coefficients of epistatic effects, as theoretically proved, and the effects of minor genes since 2/3 of the FPR for QTLs were due to them. However, on the basis of the great flexibility of the software available, we have confidence that breeders can achieve a superior balance between power and FPR when processing field data. Finally, we hope that this study, including the partial derivation of the coefficients of the epistatic effects, motivates investigations on how to increase the power of detection for QTLs and epistatic pairs, effectively controlling the FPR.