Introduction

The process of plant domestication and diversification was accompanied by the selection of features that increase adaptation to cultivation as well as features that are desirable for human consumption and use (Gepts 2004; Purugganan and Fuller 2009). These so-called domestication traits are associated with increased ease of harvest and include seed retention (non-shattering), increase in fruit and seed size, changes in reproductive strategy and plant architecture, and loss of seed dormancy (Fuller 2007; Meyer and Purugganan 2013). Many studies have demonstrated that domestication traits are genetically controlled (Tanksley 2004; Doebley et al. 2006; Meyer and Purugganan 2013; Olsen and Wendel 2013; Zuo and Li 2014). The underlying loci have been mapped in segregating populations that were derived from crosses between wild and cultivated plants. For example, seed weight QTL were identified in various legume species including mungbean (Fatokun et al. 1992), cowpea (Fatokun et al. 1992), common bean (Koinange et al. 1996), pea (Timmerman-Vaughan et al. 1996) and soybean (Maughan et al. 1996). Increase in the size of fruit and vegetables appears most relevant in species with edible fruits such as tomato (Grandillo et al. 1999; Tanksley 2004), pepper (Zygier et al. 2005), cucumber (Yuan et al. 2008), melon (Monforte et al. 2014), watermelon (Sandlin et al. 2012) or cherry (Olmstead et al. 2007; Campoy et al. 2014).

The identification of the genes underlying quantitative characters such as fruit weight is an arduous task due to the quantitative nature of the trait. Many loci are thought to control weight, each with a varying effect on the character. Some loci play a major role, explaining up to 30 % of the variance (Frary et al. 2000) whereas others play a smaller role of only 7 % of the total variance (van der Knaap and Tanksley 2003; Huang and van der Knaap 2011). In addition, many loci of even smaller effects are typically not detected as a result of the lack of statistical power and the overshadowing magnitude of the effect exhibited by major loci that are segregating in the population.

The first cloned vegetable fruit weight QTL that resulted from the domestication and selection process is FW2.2 (Frary et al. 2000). This gene encodes a protein of the cell number regulator family (CNR) (Guo et al. 2010) and is thought to be localized at the plasma membrane (Cong and Tanksley 2006; Libault et al. 2010). We recently cloned the second fruit weight QTL, FW3.2 (Chakrabarti et al. 2013). The underlying gene encodes a cytochrome P450 and is an ortholog of KLUH, which has been shown to regulate organ size in Arabidopsis (Anastasiou et al. 2007). Association mapping of the fw3.2 locus led to the identification of an SNP, M9, located in the promoter of SlKLUH that is highly significantly correlated with the increase in fruit weight. Further genetic evaluations aimed at testing the functional relevance of the M9 SNP demonstrated that fruit weight was indeed segregating according to the allele of M9 (Chakrabarti et al. 2013). However, in certain populations, the segregation was not as significant or was only observed in backcross populations. These findings suggest that other loci with a larger and/or epistatic effect were segregating, negating the impact of SlKLUH on fruit weight. Two other loci also control fruit weight by regulating the number of locules in a fruit. The genes underlying lc and fas are most likely the orthologs of WUSCHEL and YABBY2, respectively (Lippman and Tanksley 2001; Barrero et al. 2006; Cong et al. 2008; Muños et al. 2011).

With the advent of the reference genome sequence of tomato, we sought to test the utility of a next generation sequencing approach to expedite mapping of quantitative trait loci. In the past, populations derived from distantly related parents were used, featuring many nucleotide polymorphisms that facilitated molecular marker development and fine mapping. Populations derived from wide crosses also yielded many QTL segregating for the traits (Grandillo et al. 1999; van der Knaap and Tanksley 2003). In contrast, populations derived from more closely related parents should lead to a reduction in the number of segregating loci that impact the trait and should expedite fine mapping. Moreover, the number of nucleotide polymorphisms is not limiting when employing a whole genome resequencing approach. When few QTL are predicted to segregate, a bulk segregant analysis (BSA; Michelmore et al. 1991) followed by whole genome sequencing (QTL-seq) should be a good approach to expedite the initial mapping of loci (Takagi et al. 2013).

The main focus of this study was to utilize the QTL-seq approach to map fruit weight and locule number QTL in several segregating tomato populations. In addition, we sought to determine the genetic basis of the minor role of fw3.2 in certain populations and to evaluate the general feasibility of the QTL-seq method to map loci underlying quantitative traits in tomato and other crops.

Materials and methods

Plant material

CC4 corresponded to Moneymaker; CC7 corresponded to LA0147; CC9 corresponded to VIR1011; CC37 corresponded to VIR933 (No 2759 Enano); CC39 corresponded to VIR347 (No 347 Yablochnyi); these accessions were obtained from Dr. Mathilde Causse (INRA, Avignon, France) (Ranc et al. 2008). LA1589, LA1655, LA2690 and LA2845 were obtained from the TGRC (http://tgrc.ucdavis.edu). LYC0440 and LYC1891 were obtained from IPK Gatersleben, Germany (http://www.ipk-gatersleben.de). Rio Grande, Howard German and Heinz 1439 were obtained from Tomato Growers Supply Company (http://www.tomatogrowers.com). The latter nine accessions were used in a previous study (Rodríguez et al. 2011). Six intraspecific F2 populations derived from independent crosses were developed and evaluated for fruit weight or locule number (Table 1). One F2 population (12S143) was a cross between Solanum lycopersicum var. cerasiforme (SLC) accessions, whereas all other F2 populations were crosses between Solanum lycopersicum var. lycopersicum (SLL) and SLC accessions. The three populations that were evaluated for fruit weight only (12S139, 12S141, and 12S143) were segregating for the M9 SNP at the fw3.2 locus (Chakrabarti et al. 2013) while fw2.2 was fixed. The three populations that were evaluated primarily for locule number (12S75, 12S76, and 12S97) were segregating for fas while lc was fixed. To prevent the impact of fas, which is known to control locule number and weight, F2 plants were selected to be heterozygous or homozygous mutant at fas. For most of the locule number populations, fruit weight varied extensively in the parents and fruit weight alleles were segregating in the F2 populations (Table 1). The F2, F1 and parents were grown in the experimental fields at the Ohio State University/OARDC Wooster campus in summer 2012. Previously, F2 populations derived from the crosses Rio Grande × LA1589 and Howard German × LA1589 led to the identification of fw11.2 (Gonzalo and van der Knaap 2008). To confirm this QTL, the corresponding backcross populations (BC2 and BC1F4, respectively) were generated and evaluated in the present work.

Table 1 List of accessions used in this study

Phenotypic correlations between pairs of traits within an experiment and for each trait between experiments were calculated using the Pearson correlation coefficient (rp).

DNA isolation and sequencing of pooled samples

Genomic DNA was extracted from young leaves using the DNeasy Plant Mini Kit (Qiagen, Valencia, CA). DNA quantity was measured with the Qubit® 2.0 fluorometer (Invitrogen-Molecular Probes, Eugene, OR). The instrument was calibrated with the Qubit™ dsDNA HS Standards according to the manufacturer’s specifications.

The QTL-seq approach was applied to three populations: two for fruit weight (12S139 and 12S143) and one for locule number (12S75). For each population, two bulks comprising 10 plants each were generated. The bulks consisted of pooled DNA from plants featuring the most extreme values for the trait. Libraries with an approximate insert size of 300 bp were generated for each bulk and barcoded at the Genome Technology Access Center (GTAC) facility at Washington University (St Louis, MO). Two bulks were pooled per lane on the Illumina HiSeq 2000 and paired-end sequenced at 101 bp. FASTQ files were assessed using the FastQC program (version 0.10.1; Andrews 2010) and filtered for a quality value of at least 28. The summary statistics show genome coverage in each of the bulks (Supplementary Table S1).

Analyses of the short reads

The sequence data were trimmed and filtered prior to analysis. The short reads obtained from the two DNA-bulks were aligned against the tomato genome sequence (The Tomato Genome Consortium 2012) using Bowtie 2 (Langmead and Salzberg 2012). The preset option “--very-sensitive-local” of Bowtie 2 was used. By default, these settings attempt to find either an exact or a 1-mismatch end-to-end alignment for the read before trying the multiseed heuristic. For each read, only the best alignment was retained and multiple alignments were not permitted when mapping to the reference genome. Files were converted to BAM files using SAMtools (Li et al. 2009), sorted and then compared to locate duplicate records using Picard software (http://picard.sourceforge.net). Re-alignment (BAQ) was done to avoid false SNP calls near indels. The resulting files were used for SNP calling with GATK (McKenna et al. 2010; DePristo et al. 2011). The proportion of alleles corresponding to each of the two parental genomes was calculated by counting the number of reads harboring a SNP compared to the reference genome sequence (AD_ALT) and dividing by the total number of reads (DP) (SNP-index = AD_ALT/DP). This so-called SNP-index was calculated for all positions in the genome. Since both parental genomes were likely to harbor SNPs relative to the reference genome, we filtered out the positions where both bulks presented the same allele, which reduced the number of false-positive SNPs. To identify regions of the genome associated with fruit weight and locule number, the SNP-indices were subtracted between the bulks (ΔSNP-index = SNP-index_‘Largest’ − SNP-index_‘Smallest’) and sliding window analysis was applied, averaging the ΔSNP-index within a 1 Mb window with a 10 kb step increment using an in-house developed Python script. The average was plotted for all chromosomes to discern QTL from the regions that did not contribute to the variation for the traits of interest. We expected most genomic regions to show a ΔSNP-index of 0; only regions with high absolute ΔSNP-index values would suggest a major contribution to the trait variation.
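The SNP-index and sliding-window calculation described above can be illustrated with a minimal Python sketch. This is not the in-house script used in the study; the function names, the input layout (a mapping from position to an (AD_ALT, DP) pair per bulk) and the rule used to drop positions where both bulks carry the same allele are assumptions made for illustration only.

```python
from bisect import bisect_left


def delta_snp_index(largest, smallest):
    """Return sorted (position, delta SNP-index) pairs for one chromosome.

    `largest` and `smallest` map a position to (AD_ALT, DP) for each bulk
    (hypothetical input layout). Positions where both bulks are fixed for
    the same allele are skipped; this is one possible reading of the
    filtering step, not necessarily the authors' exact rule.
    """
    deltas = []
    for pos in sorted(largest.keys() & smallest.keys()):
        alt_l, dp_l = largest[pos]
        alt_s, dp_s = smallest[pos]
        if dp_l == 0 or dp_s == 0:
            continue  # no coverage in one of the bulks
        idx_l = alt_l / dp_l  # SNP-index = AD_ALT / DP
        idx_s = alt_s / dp_s
        if idx_l == idx_s and idx_l in (0.0, 1.0):
            continue  # both bulks show the same allele
        deltas.append((pos, idx_l - idx_s))
    return deltas


def sliding_window_mean(deltas, chrom_length, window=1_000_000, step=10_000):
    """Average the delta SNP-index in windows [start, start + window).

    Returns (window midpoint, mean delta) pairs; windows without any
    variant are omitted.
    """
    positions = [p for p, _ in deltas]
    prefix = [0.0]  # prefix sums allow a constant-time sum per window
    for _, value in deltas:
        prefix.append(prefix[-1] + value)
    averages = []
    start = 0
    while start < chrom_length:
        lo = bisect_left(positions, start)
        hi = bisect_left(positions, start + window)
        if hi > lo:
            mean_delta = (prefix[hi] - prefix[lo]) / (hi - lo)
            averages.append((start + window // 2, mean_delta))
        start += step
    return averages
```

With the default arguments the windows are 1 Mb wide and advance in 10 kb steps, matching the parameters stated in the text; the averaged values can then be plotted per chromosome.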

Marker development and genotyping

PCR-based markers that distinguish wild and cultivated alleles of the known fruit weight and locule number loci, fw2.2, fw3.2, fas and lc (Supplementary Table S2), were evaluated in the mapping parents and F2 populations if the genes were known to be segregating. The new QTL identified by QTL-seq were also mapped in all six populations when we were successful in identifying polymorphic markers in those populations. Indels (≤25 bp) and SNPs that were in the vicinity of the QTL were converted into PCR-based markers (Supplementary Table S2). The Kruskal–Wallis test or one-way analysis of variance (ANOVA) was used to test the significance of the association of the alleles with fruit weight or locule number variation in the populations. The degree of dominance of the alleles at a specific locus was calculated as the d/a ratio, where d = Aa − (AA + aa)/2 and a = (AA − aa)/2, where AA is the mean value for plants homozygous for the parent 1 allele, aa is the mean value for plants homozygous for the parent 2 allele and Aa is the mean value for heterozygous plants. The percentage of phenotypic variation explained by each QTL (R²) was estimated by multiple-regression analysis, using as explanatory variables the most significant markers for each QTL.
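As a small worked illustration of the d/a ratio defined above, the following Python sketch computes the degree of dominance from three genotype class means. The function name and the example values are hypothetical and are not taken from the study.

```python
def degree_of_dominance(mean_AA, mean_Aa, mean_aa):
    """Return d/a, with d = Aa - (AA + aa)/2 and a = (AA - aa)/2.

    The arguments are the mean trait values of the two homozygous classes
    and of the heterozygous class at one marker.
    """
    a = (mean_AA - mean_aa) / 2            # additive effect
    d = mean_Aa - (mean_AA + mean_aa) / 2  # dominance deviation
    if a == 0:
        raise ValueError("d/a is undefined when both homozygous means are equal")
    return d / a


# Hypothetical class means (g): a heterozygote at the midpoint is additive.
print(degree_of_dominance(30.0, 20.0, 10.0))  # 0.0
# A heterozygote equal to the AA class indicates complete dominance of A.
print(degree_of_dominance(30.0, 30.0, 10.0))  # 1.0
```

Values near 0 indicate additivity, values near 1 or −1 indicate complete dominance of one allele, and values beyond that range indicate overdominance.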

Data analyses were performed with the R open-source software (version 3.1.1; R Core Team 2014).

Heritability estimates

The broad-sense heritability (\(H^{2}\)) of each trait was calculated by the variance components method according to Mather and Jinks (1982). The phenotypic variance for the parent lines (\(V_{P_{1}}\) and \(V_{P_{2}}\)) and F1 progeny (\(V_{F_{1}}\)) is due to environmental factors, whereas the phenotypic variance in the F2 (\(V_{F_{2}}\)) is the sum of genetic (\(V_{\text{G}}\)) and environmental (\(V_{\text{E}}\)) variances. Broad-sense heritability can be estimated by:

$$H^{2} = \frac{V_{\text{G}}}{V_{F_{2}}} = \frac{V_{F_{2}} - \tfrac{1}{4}\left(V_{P_{1}} + V_{P_{2}} + 2V_{F_{1}}\right)}{V_{F_{2}}}.$$
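The estimator above translates directly into a few lines of Python. The sketch below assumes that each generation is summarized by its sample variance (n − 1 denominator); the function name and the toy values are illustrative only.

```python
from statistics import variance


def broad_sense_heritability(p1, p2, f1, f2):
    """Estimate H^2 from trait values of both parents, the F1 and the F2.

    The environmental variance is taken as (V_P1 + V_P2 + 2 * V_F1) / 4,
    and the genetic variance as V_F2 minus that environmental estimate.
    """
    v_e = (variance(p1) + variance(p2) + 2 * variance(f1)) / 4
    v_f2 = variance(f2)
    return (v_f2 - v_e) / v_f2


# Toy values only: V_P1 = V_P2 = V_F1 = 2 and V_F2 = 16 give 14/16.
print(broad_sense_heritability([1, 3], [2, 4], [1, 3], [0, 4, 8]))  # 0.875
```

Because the environmental variance is estimated from the non-segregating generations, a high variance in the parents or the F1 lowers the estimate and can even make it negative.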

Epistatic interactions

Epistatic digenic interactions of fw3.2 with other loci were determined based on the following model of analysis of variance (ANOVA) for each pair of genes that is involved in fruit size variation.

$$Y_{ijl} = \mu + {\text{fw}}3.2_{i} + {\text{QTL}}_{j} + ({\text{fw}}3.2 \times {\text{QTL}})_{ij} + e_{ijl}$$

where \(\mu\) represents the population mean value; \({\text{fw}}3.2_{i}\), the effect of the fw3.2 locus; \({\text{QTL}}_{j}\), the effect of the QTL detected here; \(({\text{fw}}3.2 \times {\text{QTL}})_{ij}\), the interaction between gene effects; and \(e_{ijl}\), the experimental errors.
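To make the interaction term of this model concrete, the following Python sketch estimates it from the genotype-class means as cell mean − row mean − column mean + grand mean, using unweighted means. This only illustrates what the term measures. It is not the two-way ANOVA significance test used in the study, it assumes every genotype combination is observed, and the genotype labels and values are hypothetical.

```python
from statistics import mean


def interaction_effects(cells):
    """Estimate interaction effects for a two-locus layout.

    `cells` maps (fw3.2 genotype, QTL genotype) to a list of trait values.
    Row, column and grand means are unweighted means of the cell means.
    """
    rows = sorted({i for i, _ in cells})
    cols = sorted({j for _, j in cells})
    cell_mean = {key: mean(values) for key, values in cells.items()}
    row_mean = {i: mean(cell_mean[(i, j)] for j in cols) for i in rows}
    col_mean = {j: mean(cell_mean[(i, j)] for i in rows) for j in cols}
    grand = mean(cell_mean.values())
    return {
        (i, j): cell_mean[(i, j)] - row_mean[i] - col_mean[j] + grand
        for i in rows
        for j in cols
    }


# Hypothetical fruit weights (g): purely additive loci give zero interaction.
additive = {
    ("WT", "P1"): [10, 12], ("WT", "P2"): [20, 22],
    ("DERIVED", "P1"): [15, 17], ("DERIVED", "P2"): [25, 27],
}
print(interaction_effects(additive))
```

When the effect of one locus depends on the genotype at the other, as in the epistasis hypothesis tested here, the estimated interaction effects depart from zero.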

Confirming the QTL interval for fw11.2

To determine the interval of the fw11.2 QTL more precisely, nine F2 plants carrying a recombination between markers 13EP232 and 13EP236 were selected from the 12S139 population. Seeds from each recombinant F2 plant were sown and approximately 12 seedlings that were homozygous recombinant and 12 that were homozygous non-recombinant were identified with molecular markers. These plants were transplanted in the field for fruit weight evaluation in summer 2013. Average fruit weights were determined from twenty ripe fruits per plant. Student’s t test was used to detect significant differences between the homozygous recombinant and non-recombinant genotypes in each F3 family.

Results

Trait variation in the segregating populations

Three F2 populations were evaluated for fruit weight (FW) and another three were evaluated for both FW and locule number (LC) (Table 1). The frequency distributions showed continuous variation for both characters (Figs. 1, 2), indicating that FW and LC were quantitatively inherited traits. For FW, the mean values of the F1 and F2 plants were skewed towards the small fruited parent. Three of the six families were genotypically selected prior to transplanting in the field to be heterozygous (12S75 and 12S76) or homozygous mutant (12S97) for fas. Since fas has a large impact on fruit weight, the selection for the mutant allele was expected to result in an average larger fruit in the 12S97 F2 population. However, the 12S97 family was still skewed towards the small fruited parent (Fig. 2g), suggesting that the small fruit alleles of the segregating loci were dominant over the large fruit alleles. The 12S143 population exhibited transgressive segregation for FW as some plants carried smaller and others larger fruit than either parent (Fig. 1c). The CC37 parent of this population carried slightly smaller fruit compared to the CC39 parent while carrying the mutant alleles of lc and fw3.2, typically leading to increased FW (Table 1). This suggested that another FW QTL must be segregating in this population and that the effect of increased fruit weight originated with the CC39 parent.

Fig. 1 Frequency distribution of fruit weight in three F2 tomato populations. a 12S139, b 12S141, c 12S143

Fig. 2 Fruit weight and locule number distributions in three F2 populations. FW (a) and LC (b) distributions and FW to LC correlation (c) in 12S75 population. FW (d) and LC (e) distributions and FW to LC correlation (f) in 12S76 population. FW (g) and LC (h) distributions and FW to LC correlation (i) in 12S97 population

For LC, the segregation in the F2 populations showed similar trends as for FW. In general, the LC mean values of the F2 were skewed towards those of the low locule number parent and the F1, or were even smaller (Fig. 2). In contrast, the LC mean values in the 12S97 population were higher than those of the F1 and both parents. In addition, even though all F2 plants carried the mutant allele of fas, LC varied from 2.25 to 5.40 in this population, suggesting the presence of another locus that enhanced the effect of fas on locule number.

Broad-sense heritability was calculated for the six populations (Supplementary Table S3). In the case of FW, heritability estimates showed a wide range (0.12–0.93), with the minimum and maximum H² values in the 12S139 and 12S97 populations, respectively. The low H² was likely due to high variation in fruit weight for some parents and the F1 (Supplementary Table S3). The H² for LC was higher and ranged between 0.40 and 0.93. Additionally, the correlation between FW and LC was relatively high, suggesting that fruit weight QTL positively increase locule number or vice versa (Fig. 2c, f, i).

Association of the known FW and LC QTL with the traits

The three populations that were primarily studied for variation in fruit weight (12S139, 12S141 and 12S143; Fig. 1) were used in a previous study aimed at validating the segregation of the M9 SNP with the fw3.2 QTL (Chakrabarti et al. 2013). The alleles of fw2.2 and fas were fixed in all populations whereas lc segregated in two of the three populations. To confirm the previous findings using a different population size, F2 seedlings from the same F1 were evaluated for fruit weight. As expected, only the 12S141 population showed strong association of the M9 SNP and FW (PVE of 0.35), whereas in the other two families, the segregation was barely or not significant (Table 2). This suggested that other fruit weight QTL were segregating in 12S139 and 12S143. Interestingly, lc was associated with fruit weight only in the 12S143 population and not in the 12S141 population. These results combined suggested that genetic background could have a large impact on the FW trait or, in the case of lc, that linkage to FW QTL might be the cause of marker-trait association.

Table 2 Significant markers associated with fruit weight and locule number in each F2 population

In the tomato germplasm, variation in locule number appears to be higher than what can be explained by alleles at the two loci, lc and fas (van der Knaap et al. 2014). Since LC and FW traits showed significant correlation, we wanted to find out whether the known FW loci fw2.2 or fw3.2 segregated for locule number. We genotyped the alleles in the three populations and found that the known FW QTL were not associated with locule number except for fw2.2 and only in the 12S97 population (Table 2). Taken together, these results suggested the presence of another LC QTL controlling locule number variation in these populations.

QTL-seq of three populations

We selected three populations for the identification of additional fruit morphology loci using the QTL-seq approach. Two populations were selected to identify FW QTL, 12S139 and 12S143, and one to identify LC QTL, 12S75. In each of the two fruit weight populations, we identified a single QTL, fw11.2 and fw1.1, respectively (Fig. 3). The results for the other chromosomes did not show the presence of any additional QTL in these populations (Figs. S1 and S2). Even though fw11.2 mapped close to fas and fw11.3, it was distinct from these two known fruit weight loci (Fig. 3a). The fw1.1 QTL was associated with the pericentromeric region of chromosome 1 (Fig. 3b). Pericentromeric regions are notoriously large in tomato and occupy 3/4 of the chromosome (Sim et al. 2012). The greatly suppressed recombination frequency of the pericentromeric regions explains why fw1.1 spanned more than two-thirds of the entire chromosome. For LC we identified four QTL on chromosomes 2, 3, 5 and 6 (Fig. 4, S3). The lcn2.4 locus on chromosome 2 was quite distant from lc and other previously mapped locule number QTL (Barrero et al. 2006). Therefore, these QTL might represent hitherto unknown loci controlling locule number.

Fig. 3 Identification of fruit weight QTL in 12S139 and 12S143 populations. a Average values of ΔSNP-index calculated by sliding window analysis for chromosome 11 in 12S139 population. Known fruit weight and shape genes positions are indicated. b Average values of ΔSNP-index calculated by sliding window analysis for chromosome 1 in 12S143 population. Red lines indicate the approximate position of QTL (color figure online)

Fig. 4 Identification of locule number and fruit weight QTL in 12S75 population. a Average values of ΔSNP-index calculated by sliding window analysis for chromosome 2. b Average values of ΔSNP-index calculated by sliding window analysis for chromosome 3. c Average values of ΔSNP-index calculated by sliding window analysis for chromosome 5. d Average values of ΔSNP-index calculated by sliding window analysis for chromosome 6. Red lines indicate the QTL approximate position (color figure online)

To confirm that the identified regions were associated with FW or LC, we used the genome sequence data to identify SNPs for molecular marker development. For FW, the two QTL were easily confirmed and explained 18 and 29 % of the phenotypic variance, respectively (Table 2; Supplementary Table S4). For LC, three out of the four QTL were confirmed with markers mapped in the entire population. However, the effect and significance of each of the QTL on the trait were relatively low (Table 2). This result suggested that increased locule number was under the control of numerous loci of relatively small effect in this population. When we associated the LC QTL with FW in the 12S75 population, lcn3.1 was instead found to have a highly significant effect on FW. Therefore, this region was likely to carry a fruit weight locus that had a pleiotropic effect on locule number. Because of its larger effect on FW, we renamed the locule number QTL to fw3.3.

Single marker analysis of the newly identified QTL in all six populations

We wanted to know whether the newly identified QTL, namely fw1.1, fw3.3, fw11.2, lcn2.4, lcn5.1 and lcn6.1, explained the variation in FW and/or LC in the other populations that were not used in the QTL-seq experiment. The molecular markers that were developed for the confirmation of the loci in the QTL-seq populations were surveyed in the other populations (Supplementary Table S5). For certain QTL, all markers tested were monomorphic and therefore we could not determine with certainty that these regions were not associated with either FW or LC. For those markers that were segregating, fw3.3 was associated with FW in the 12S141 population in addition to the 12S75 population (Table 2). Even though fw3.2 and fw3.3 were both segregating and linked in population 12S141 (15.3 cM and 27 recombinant plants out of 95), the distance between these QTL implied that they represented distinct loci controlling fruit weight. With respect to LC, the most striking QTL was identified in the 12S97 population on chromosome 6. Whereas this QTL was not significantly associated with LC in 12S75, lcn6.1 explained 14 % of the variance in the 12S97 population. In this population, the fas locus was fixed for the mutation and therefore, lcn6.1 appeared to be a modifier of the mutation in FAS by increasing locule number from an average of 3.0 to 3.5 (Table 2).

Interaction of fw3.2 with fw11.2 or fw1.1

The effect of fw3.2 was different in distinct genetic backgrounds. The lack of significant segregation at this locus in two of three populations was at least in part due to other segregating loci impacting the trait (Fig. 3). To determine whether fw11.2 and fw1.1 showed epistatic or additive interaction with fw3.2, we plotted the effect of the loci in the different genetic backgrounds (Fig. 5). These data showed that the effect of fw3.2 appeared negligible when fw11.2 carried the homozygous cultivated allele, suggesting that these two loci were epistatic to one another. However, the interactions tested by two-way ANOVA were not significant for either population (P values of 0.2250 and 0.1705, respectively). The lack of significance may be due to the high variation observed in fruit weight in the F2 population, or the effects of the QTL on the trait may indeed be additive.

Fig. 5 Digenic interactions between fruit weight QTL using the markers significantly associated to them. a FW3.2 × FW11.2, b FW3.2 × FW1.1. Alleles at FW3.2 locus are named WT for the wild-type allele; DERIVED for derived allele and HET for heterozygous. At FW11.2 locus, CC7 homozygous for the CC7 allele, HET heterozygous, CC9 homozygous for the CC9 allele. At FW1.1 locus, CC37 homozygous for the CC37 allele, HET heterozygous, CC39 homozygous for the CC39 allele

Fine mapping of fw11.2

To further delineate the fw11.2 QTL, we identified recombinant F2 plants between the markers 13EP232 and 13EP236, a 1.62-Mb region on chromosome 11. Progeny testing of nine families showed that the QTL was located between 50.6 and 51.3 Mb (Table 3). This region comprised 66 genes encoding a range of functions such as DNA-RNA binding proteins including transcription factors (12 %), other binding proteins (18 %), enzymes (13 %), and many proteins of unknown function (33 %). These fine mapping results were consistent with previous results derived from crosses between cultivated tomato and a wild relative (Supplementary Table S6).

Table 3 Progeny test of fw11.2 recombinant plants

Discussion

Fruit weight and locule number are quantitatively inherited and many loci of both small and large effects control these traits. In the past, distantly related parents were used for the development of F2 or RIL populations because they offer a high number of segregating loci and nucleotide polymorphisms to associate molecular markers with traits (Grandillo et al. 1999; van der Knaap and Tanksley 2003; Lin et al. 2014). Delineation of each QTL in populations derived from these wide crosses is laborious due to the need to conduct repeated backcrossing to fix background markers. It would typically take more than 5 years after initial QTL identification to delineate a QTL to a defined interval on the chromosome (e.g., van der Knaap and Tanksley 2003; Zhang et al. 2012). With the public release of the tomato reference genome (The Tomato Genome Consortium 2012) and the knowledge of a few genes that control FW and LC (Frary et al. 2000; Cong et al. 2008; Muños et al. 2011; Chakrabarti et al. 2013), genetic studies of quantitatively inherited characters can now be expedited. This is because segregation of the known genes can be avoided or taken into account in the selection of parents for populations. Moreover, whole genome sequencing ensures that the number of nucleotide polymorphisms is not a limiting factor in marker development. Thus, more closely related parents can be selected, which means fewer SNPs and fewer segregating QTL for the traits. In this study, we used a BSA-whole genome sequencing approach to map loci controlling FW and LC in tomato. We mapped four highly significant QTL: three FW (fw1.1, fw3.3, fw11.2) and one LC (lcn6.1) QTL. Within months of the identification of fw11.2, we confined its location to a 0.7 Mb region and a reduced number of candidate genes. Pursuing this strategy should allow us to further reduce this region to a few candidate genes within another year. Thus, one of the crucial factors in the successful implementation of the QTL-seq method in tomato was the selection of parents for this study.

A critical consideration for the successful implementation of QTL-seq and gene identification is also the size of the population. Whereas nearly 100 F2 individuals were sufficient to delineate a single QTL (12S139 and 12S143), this size was too small for the reliable identification of three LC QTL segregating in one population (12S75). Moreover, better delineation of the QTL using more recombinant F2 plants for progeny testing is desirable when the goal is to clone the underlying gene. Thus, even for the populations that segregate for a single QTL, a larger population would have resulted in a shorter interval of the location of the gene in the first generation of progeny testing (the F3), which would speed up the eventual cloning of the gene. In contrast, when the goal is to identify closely linked markers without further gene identification, a population size of 400 is desirable. In all, a researcher needs to consider available growing space, labor costs, the ease of trait evaluation and the number of individual populations to decide on the optimal F2 population size and reach the intended goal. For tomato FW and LC traits, a population size of approximately 200 individuals is preferred when undertaking a QTL-seq approach for the identification of the underlying genes.

It is broadly accepted that the power and precision of genetic mapping are significantly affected by the magnitude of trait heritability (Flint et al. 2005). QTL detection is easier for traits where the number of contributing loci is low and heritability is high (Moreau et al. 1998). Broad-sense heritability (H²) has been used as an index of reliability of phenotypic selection for genetic characteristics (Holland et al. 2003). However, our data suggest that H² may not be a reliable parameter in predicting putative QTL. For instance, in the 12S139 and 12S143 populations, broad-sense heritability for fruit weight was 0.11 and 0.83, respectively. Yet, the fw11.2 QTL was detected in the 12S139 population despite its low H² value. Our relatively low heritability values were likely due to the high variation in the trait values for the parents. More replication of parents and F1 could have improved the accuracy of the heritability values in this study.

The feasibility of detecting QTL by QTL-seq can also be affected by the abundance of sequencing reads at a specific locus. The minimum read depth must ensure enough coverage across the genome, as this is required not only to detect variants but also to adequately assess the contribution of each parent to the bulked DNAs. In tomato, the relatively small size of its genome (950 Mb; The Tomato Genome Consortium 2012) has allowed us to re-sequence DNA-bulks at 20-fold coverage, which was sufficient to detect variants (Dohm et al. 2008). These results show that QTL-seq is a cost-efficient approach for de novo QTL identification in tomato. This conclusion is in line with conclusions from the first publication describing a QTL-seq approach in rice (Takagi et al. 2013), another crop species with a compact genome. It remains to be determined whether QTL-seq can be successfully applied in other crop species with larger genomes. But for genomes the size of tomato or smaller, the method is highly effective for identifying markers linked to quantitative traits of agronomic interest.

Author contribution statement

EIB conducted the analyses, research and constructed the figures and tables; JVH extracted DNA from the six populations and developed the scripts for the analyses of the QTL-Seq data; ZH developed the locule number populations and identified the fw11.2 locus in one of the populations; EvdK initiated the project, supervised the research and analyses, and wrote the paper with EIB.