Introduction

Apple fruit shape index (FSI) is an important aspect of the exterior quality of a fruit. In East Asia, cultivars with large FSI have a greater potential for the fresh market. Like most other important traits, the FSI of apples is not only genetically controlled but also influenced by environmental factors. Although FSI can be affected by the application of plant growth regulators, selecting elite cultivars and breeding lines with larger FSI is a potential approach for improving apple fruit quality.

Very few studies have considered the inheritance of FSI in apples. Klein (1958) treated FSI as a qualitative characteristic, classifying fruit according to FSI value. Brown (1960) pointed out that apple fruit shape is a quantitative trait controlled by several polygenes. The average FSI of hybrid progenies is usually close to the mean of the parental FSI values. Higher levels of heterozygosity in the parents result in a higher degree of segregation in the offspring. However, fruit development is also greatly influenced by environmental factors, which further complicates the study of fruit shape inheritance.

A segregation ratio of 1:2:1 for flat, round and cone-shaped apples was previously found in a hybrid population of ‘Ralls’ × (‘Starkrimson’ + ‘Richard Delicious’). Based on this finding, the inheritance of fruit shape was hypothesised to be controlled by two major genes (Guo et al. 1994). Round or oblate shape in pear (Pyrus communis L.) is dominant over cone or oval shape (Wang et al. 1991). Fruit shape in peach (Prunus persica Batsch) is a qualitative characteristic controlled by a single gene; flat fruit is dominant over round fruit (Lesley 1939).

Over the past two decades, many molecular markers tightly linked to important traits of apples have been identified, including fruit colour (Cheng et al. 1996), fruit quality (Costa et al. 2005), acid content (Yao et al. 2006), columnar growth (Hemmat et al. 1997) and fruit diameter (Kenis et al. 2008). However, no molecular marker linked to FSI has been identified in apple.

The objective of this study was to clarify the inheritance of apple FSI using a hybrid population of ‘Jonathan’ × ‘Golden Delicious’ (Malus domestica Borkh). To this end, we examined FSI frequency distributions for two consecutive years and screened this population for simple sequence repeat (SSR) and amplified fragment length polymorphism (AFLP) markers linked to FSI.

Materials and methods

Plant material

A hybrid population (‘Jonathan’ × ‘Golden Delicious’) totaling 1733 seedlings was used in the analysis of inheritance. Controlled hybridisations were performed in early May 2002. The young seedlings were planted at the Changli Institute of Pomology (Hebei, China) in May 2003 at a density of 0.5 m × 2 m. Conventional management and disease/pest control were applied to the seedlings.

Phenotyping

In 2008 and 2009, 1162 and 971 seedlings, respectively, bloomed and set enough fruit for evaluation (400 were common to both years). The length and diameter of each fruit were measured with a vernier caliper. The FSI of a seedling was calculated as the average length/diameter ratio of ten fully ripe apples from that seedling.
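The per-seedling calculation described above can be sketched as follows. This is a minimal illustration only; the function name and the caliper readings are hypothetical and are not data from this study.

```python
from statistics import mean


def fruit_shape_index(measurements):
    """Return the mean length/diameter ratio for one seedling.

    `measurements` is a list of (length_mm, diameter_mm) pairs,
    one pair per sampled fruit.
    """
    if not measurements:
        raise ValueError("at least one fruit is required")
    return mean(length / diameter for length, diameter in measurements)


# Hypothetical caliper readings (mm) for ten apples from one seedling.
example_fruits = [
    (62.0, 74.0), (60.5, 73.0), (63.0, 75.5), (61.0, 72.0), (59.5, 71.5),
    (64.0, 76.0), (62.5, 74.5), (60.0, 72.5), (61.5, 73.5), (63.5, 75.0),
]
example_fsi = fruit_shape_index(example_fruits)
print(f"FSI = {example_fsi:.2f}")
```

Averaging the individual ratios, rather than dividing mean length by mean diameter, follows the wording of the method, in which the ratio is taken per fruit first.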

Estimation of experimental variation and required sample size for phenotyping

To estimate the minimum number of apples that must be phenotyped to obtain reliable FSI scores, the levels of genetic and experimental variation were assessed by F test using 30 seedlings randomly chosen from the 400 seedlings that fruited in both 2008 and 2009. Because the hybrid seedlings were raised for both breeding and mapping purposes, and to reduce the cost of field management, they were planted at a density of 10,000 seedlings per hectare. At this density, a seedling produces no more than 30 apples per year when alternate bearing occurs. According to our “Data collection standards for seedling evaluation in apple breeding”, ten apples were sampled and evaluated per seedling. The FSI data in this paper were shared with the seedling evaluation carried out for breeding purposes, except for the 30 seedlings fruiting in both years that were used for the F test. Among those 30 seedlings, the smallest number of available apples was 18, so three subsets of six apples each were used for the F test. If six apples are sufficient for phenotyping, the FSI data obtained from ten apples for breeding evaluation can safely be used here.

Analysis of inheritance

Frequency distribution diagrams of the FSI phenotypes for 2008 (1162 seedlings), 2009 (971 seedlings) and the combined data (1733 seedlings) were generated with MS Excel 2003. The segregation ratio between low and high FSI was examined by χ2 test. To determine detailed heredity parameters, the average value (Sm) of the internal variances of the two independent distributions was calculated. Theoretically, this average should include only the polygene effect variance (Si) and the environmental effect variance (Se), i.e., Sm = Se + Si. Se was estimated by averaging the variances of the 400 seedlings that fruited in both years. Si was then calculated as Si = Sm − Se. The major gene effect variance (Sa) was estimated from the variance between the mean phenotype values of the large and small FSI groups. Because the interaction between environmental and genetic effects cannot be estimated in this experimental design, the population variance (S) was formulated as S = Sa + Se + Si. Finally, the heritabilities of the major genes and the polygenes were expressed as Sa/S and Si/S, respectively.
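The variance bookkeeping above can be written out as a short sketch. The function name and the three input variances are hypothetical; they are not the values of Table 2 and were chosen only to show how Si, S and the two heritabilities follow from Sm, Se and Sa.

```python
def heritability_components(s_m, s_e, s_a):
    """Split the population variance following S = Sa + Se + Si.

    s_m: mean within-distribution variance (Sm = Se + Si)
    s_e: environmental effect variance (Se)
    s_a: major gene effect variance (Sa)
    """
    s_i = s_m - s_e              # polygene variance, Si = Sm - Se
    if s_i < 0:
        raise ValueError("Se exceeds Sm; check the inputs")
    s_total = s_a + s_e + s_i    # population variance S
    return {
        "Si": s_i,
        "S": s_total,
        "h2_major": s_a / s_total,     # Sa / S
        "h2_polygene": s_i / s_total,  # Si / S
    }


# Hypothetical variances for illustration only (not taken from Table 2).
demo = heritability_components(s_m=0.0010, s_e=0.0001, s_a=0.0030)
print(f"major genes: {demo['h2_major']:.1%}, polygenes: {demo['h2_polygene']:.1%}")
```

Because the genotype-by-environment interaction is left out of the model, the two heritabilities and Se/S add up to one by construction.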

Construction of segregating sub-populations and bulk segregants

The division between large and small FSI values was determined from the frequency distribution patterns. The crossing point of the two Gaussian distributions was taken as the cut-off between large and small FSI (Fig. 1). To minimise environmental error, the large-FSI sub-population used for identifying linked molecular markers consisted of all 38 seedlings with the large FSI phenotype in 2008, while the small-FSI sub-population consisted of 100 seedlings randomly selected from those with the small FSI phenotype in the same year.
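How the crossing point was located is not described in detail. One possible way to compute it, assuming each sub-distribution has already been summarised by a weight, a mean and a standard deviation, is a bisection search between the two means. All parameter values below are hypothetical and serve only to make the sketch runnable.

```python
import math


def weighted_normal_pdf(x, weight, mu, sigma):
    """Density of a normal component scaled by its mixing weight."""
    z = (x - mu) / sigma
    return weight * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def density_gap(x, small, large):
    """Small-FSI density minus large-FSI density at x."""
    return weighted_normal_pdf(x, *small) - weighted_normal_pdf(x, *large)


def crossing_point(small, large, tol=1e-9):
    """Find x between the two means where the weighted densities are equal.

    `small` and `large` are (weight, mean, sd) tuples.
    """
    lo, hi = small[1], large[1]
    if density_gap(lo, small, large) <= 0 or density_gap(hi, small, large) >= 0:
        raise ValueError("densities do not cross between the two means")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if density_gap(mid, small, large) > 0:
            lo = mid   # small-FSI component still dominates: move right
        else:
            hi = mid   # large-FSI component dominates: move left
    return (lo + hi) / 2.0


# Hypothetical component parameters: (weight, mean FSI, standard deviation).
small_component = (31 / 32, 0.85, 0.04)
large_component = (1 / 32, 1.02, 0.02)
cutoff = crossing_point(small_component, large_component)
print(f"cut-off FSI = {cutoff:.3f}")
```

Seedlings with FSI above the returned value would be assigned to the large-FSI group, and the remainder to the small-FSI group.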

Fig. 1

Frequency distributions of FSI of progenies from the hybrid cross ‘Jonathan’ × ‘Golden Delicious’. The three frequency distributions of FSI were generated with the data in the year 2008, 2009 and 2008 + 2009, respectively

According to the bulked segregant analysis (BSA) protocol (Michelmore et al. 1991), 30 seedlings with large FSI values and 30 seedlings with small FSI values were chosen randomly from the segregating sub-populations, and their DNA was pooled into the large and small bulks, respectively.

DNA isolation

Genomic DNA was extracted from the leaves of each individual in the segregating population following the protocol of Maguire et al. (1994). After treatment with RNase, the concentration and purity of the DNA were measured using an ultraviolet spectrometer, and the final concentration of each DNA sample was adjusted to 100 ng/μl.

Screening for SSR and AFLP markers

A total of 180 published microsatellite primers were screened for polymorphisms between the large and small FSI DNA bulks (see Electronic Supplementary Material) (Guilford et al. 1997; Gianfranceschi et al. 1998; Hokanson et al. 1998; Liebhard et al. 2002; Yamamoto et al. 2004; Silfverberg-Dilworth et al. 2006). Amplification for the SSR analysis was conducted in a total volume of 20 μl (20 ng template DNA, 200 μM of each dNTP, 2 μl of 10× buffer, 0.4 μM of the 3′ and 5′ primers and 0.4 U of Taq DNA polymerase). The amplification procedure followed that reported by Kijas et al. (1997): 94 °C for 4 min; 32 cycles of 94 °C for 1 min, 55 °C for 40 s and 72 °C for 1 min; and a final extension at 72 °C for 5 min. The PCR products were separated on 8% polyacrylamide gels and visualised by silver staining. The DNA ladder was ‘pUC18/Msp I’ (RealTimer BioTech Co. Beijing). To validate the identified markers, 3–5 markers close to the predicted site of each target gene were selected from the published maps (Liebhard et al. 2002; Silfverberg-Dilworth et al. 2006) and tested.

AFLP analysis was performed according to the method described by Vos et al. (1995). Genomic DNA was digested with two restriction enzymes, EcoRI and MseI, and the appropriate adaptors were ligated to the respective ends of the DNA fragments. PCR amplification was performed in a 20 μl volume containing 2.0 mM MgCl2, 160 μM of each dNTP, 2.0 μl of 10× reaction buffer, 0.16 μM of forward primer, 0.16 μM of reverse primer, 1.0 U of Taq DNA polymerase (Roadgen) and 50 ng of genomic DNA. PCR products were first checked by 1.0% agarose gel electrophoresis, and the gels were photographed under UV light. The amplified products were then separated on 8% polyacrylamide gels and visualised by silver staining. The DNA ladder was ‘pUC18/Msp I’ (RealTimer BioTech Co. Beijing) for both agarose and polyacrylamide gel electrophoresis.

Analysis of linkage and genotypes of the parents

Linkage between the molecular markers and the target trait was analysed using the Mapmaker/Exp 3.0 (Lincoln et al. 1992) and MapDraw 2.0 (Liu and Meng 2003) software packages. The parental genotypes were inferred from the segregation ratios of the markers in the segregating population and from their presence or absence in the parents.

Results

Sample size required for adequate phenotyping

Analysis of variance showed significant differences in the FSI values between the seedlings (F = 68.93, F 0.05 = 1.47, F 0.01 = 1.72) and between sampling years (F = 154.27, F 0.05 = 3.85, F 0.01 = 6.66). No significant difference in FSI was detected between replicates (F = 1.93, F 0.05 = 3.0). These results suggest that the FSI trait segregated significantly in the F 1 population and that the minimum sample size for adequate phenotyping was six apples.

Robustness over years

Based on the 400 seedlings common to both years, although the F test showed significant variation between years, the Pearson product-moment correlation coefficient between the FSI values of 2008 and 2009 was highly significant (r = 0.714, r 0.05 = 0.098, r 0.01 = 0.128), indicating that FSI values are robust over years.
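As a cross-check on the comparison with the critical r values, the reported correlation can also be converted into a t statistic with n − 2 degrees of freedom, using the standard relation t = r·√(n − 2)/√(1 − r²). This is an equivalent route shown for illustration, not necessarily the one taken in the analysis; the function name is hypothetical.

```python
import math


def t_statistic_for_r(r, n):
    """t statistic for testing a Pearson correlation r from n paired values."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Values reported in the text: r = 0.714 for 400 seedlings.
t_value = t_statistic_for_r(0.714, 400)
print(f"t = {t_value:.1f} with {400 - 2} degrees of freedom")
```

A t value of roughly 20 lies far beyond the conventional 1% threshold for several hundred degrees of freedom, which agrees with r greatly exceeding the tabulated r 0.01.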

Segregation ratios and a first genetic model

The FSI values of the parental cultivars, ‘Jonathan’ and ‘Golden Delicious’, were 0.83 and 0.87, respectively. The FSI values of the 1162 hybrid offspring sampled in 2008 ranged from 0.7 to 1.08, and those of the 971 seedlings sampled in 2009 ranged from 0.71 to 1.07. The frequency distribution patterns of the FSI values are shown in Fig. 1 (a, b). In each year, two mutually independent distributions can be clearly separated at the crossing point, 0.97 for 2008 and 0.95 for 2009. Hence, although FSI is a quantitative trait, major genes are involved in its genetic control.

In the 2008 sample, the cut-off between large and small FSI values was 0.97; 1124 seedlings displayed small FSI values and 38 displayed large FSI values. This 1124:38 ratio does not deviate significantly from the 31:1 ratio expected for five segregating loci, whereas it does deviate from the 15:1 and 63:1 ratios expected for four and six segregating loci, respectively. Similarly, in the 2009 sample, the 940 seedlings with FSI values below 0.95 were regarded as having small FSI and the 31 seedlings with FSI values above 0.95 as having large FSI; this ratio also fits 31:1 (Table 1). The frequency distribution of the combined (2008 and 2009) data displayed the same pattern as the two individual years (Fig. 1c), with a cut-off value of 0.97. The χ2 test again showed that the ratio of small to large FSI (1687:46) was not significantly different from 31:1 (Table 1).
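The goodness-of-fit reasoning can be reproduced with a few lines of arithmetic. The sketch below assumes that each of n loci segregates 1:1 and that large FSI requires one particular allele at every locus, which gives an expected small:large ratio of (2^n − 1):1. It uses the plain Pearson χ2 statistic without continuity correction; whether a correction was applied for Table 1 is not stated, so the exact figures may differ slightly. Function names are illustrative.

```python
def expected_ratio(n_loci):
    """Small:large ratio if large FSI needs one specific allele at each of
    n loci, every locus segregating 1:1 (an assumption of this sketch)."""
    return 2 ** n_loci - 1, 1


def chi_square_two_classes(observed_small, observed_large, ratio):
    """Pearson chi-square statistic (1 df, no continuity correction)."""
    total = observed_small + observed_large
    parts = sum(ratio)
    expected = [total * ratio[0] / parts, total * ratio[1] / parts]
    observed = [observed_small, observed_large]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 df, alpha = 0.05

# Counts reported for 2008: 1124 small-FSI and 38 large-FSI seedlings.
for n_loci in (4, 5, 6):
    ratio = expected_ratio(n_loci)
    chi2 = chi_square_two_classes(1124, 38, ratio)
    verdict = "fits" if chi2 < CRITICAL_5PCT_DF1 else "rejected"
    print(f"{n_loci} loci, {ratio[0]}:{ratio[1]} -> chi2 = {chi2:.2f} ({verdict})")
```

Under these assumptions only the five-locus ratio stays below the 5% critical value for the 2008 counts, and the same holds when the 2009 (940:31) and combined (1687:46) counts are tested against 31:1.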

Table 1 Criteria, segregation ratios and χ2 tests of small and large FSI in the progeny of ‘Jonathan’ × ‘Golden Delicious’

These segregation data imply that the variance in the values of FSI in this hybrid population is caused by the segregation of three to five independently segregating major genes (five in the case of different single parental loci, and three in case two loci segregate in both parents) and that large FSI values require the absence of all five alleles for low FSI.

Heritability estimates

The detailed heredity parameters are listed in Table 2. The heritabilities of the major genes and the polygenes in this population were estimated at 75.00 and 22.50% for 2008 and at 75.51 and 22.45% for 2009, respectively.

Table 2 Estimates of heredity parameters of FSI in the offspring of ‘Jonathan’ × ‘Golden Delicious’

Screen for microsatellite markers

The low-FSI bulk consists of randomly selected seedlings from the entire sub-population with low FSI scores. Many of these seedlings will thus carry some of the +alleles for high FSI. In contrast, seedlings in the large-FSI bulk will carry the +allele at all five loci and will thus lack the marker alleles for low FSI. By contrasting these bulks, we searched for markers that are present in the low-FSI bulk and (almost) absent from the large-FSI bulk. Note that polymorphic markers may be present at low frequency in the large-FSI bulk because of recombination between the targeted gene and linked markers.

Thirty-eight primer pairs from the set of 180 published microsatellite primers showed polymorphisms between the large and small FSI DNA bulks. Of these 38 primer pairs, CH02d08, Hi16d02, CH01f12, GD147 and CH01d03 co-segregated with small FSI (Table 3). As expected, the polymorphic marker alleles were not amplified from the large FSI DNA bulk and were amplified from hardly any of the individual seedlings in this bulk. In contrast, the differential allele fragments (220, 220, 200, 200 and 160 bp for CH02d08, CH01f12, GD147, CH01d03 and Hi16d02, respectively) were amplified only from the small FSI bulk and from several of its individual seedlings.

Table 3 Microsatellite and AFLP markers linked to small FSI in apple

CH02d08 and Hi16d02 are located on LG11 (Silfverberg-Dilworth et al. 2006). The polymorphic marker allele of the CH02d08 primer pair was 220 bp in size and was amplified from four of the 38 individuals with large FSI values, indicating a recombination rate of 10.5%; the map distance between CH02d08 and the target gene was 10.7 cM. The Hi16d02 primer pair amplified a specific 160 bp product from five of the 38 individuals with large FSI values in the segregating population. The recombination rate was therefore 13.2%, and the map distance between Hi16d02 and the target gene was 13.5 cM. Interestingly, the map distance between CH02d08 and Hi16d02 is 24.0 cM according to the published linkage map (Silfverberg-Dilworth et al. 2006). This value is close to the sum of the map distances between the target gene and the two markers (10.7 + 13.5 = 24.2 cM), which suggests that CH02d08 and Hi16d02 most probably flank the target gene associated with small FSI.
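The step from recombinant counts to map distances can be illustrated with a mapping function. The text does not name the function used; the Kosambi function is assumed here because it reproduces the reported distances from the reported recombinant counts. The function name is illustrative.

```python
import math


def kosambi_cm(recombinants, total):
    """Map distance in cM from a recombination fraction (Kosambi function).

    d = 25 * ln((1 + 2r) / (1 - 2r)), with r = recombinants / total.
    """
    r = recombinants / total
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Recombinant counts among the 38 large-FSI seedlings, as reported in the text.
for marker, recombinants in (("CH02d08", 4), ("Hi16d02", 5)):
    rate = recombinants / 38
    distance = kosambi_cm(recombinants, 38)
    print(f"{marker}: r = {rate:.1%}, distance = {distance:.1f} cM")
```

The same function applied to 10, 11, 8 and 7 recombinants out of 38 gives distances that round to the 29.3, 33.0, 22.4 and 19.3 cM reported for the other markers in this section.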

CH01f12 and GD147 are located in LG10 and LG13, respectively (Silfverberg-Dilworth et al. 2006). These two markers displayed recombination in ten of the 38 individual seedlings with large FSI values (a recombination rate of 26.3%). The distance between the target gene and both CH01f12 and GD147 was 29.3 cM.

According to published linkage maps, the marker CH01d03 maps to both LG4 and LG12 (Silfverberg-Dilworth et al. 2006). On LG12, another marker, CH05g07, was also polymorphic between the two bulks, although its recombination rate was much higher (more than 30%). On LG4, none of the nine microsatellite primers tested (NZ05g08, CH04e02, Hi23g02, Hi04c10x, etc.) amplified a polymorphic product from the DNA bulks. Therefore, the FSI-linked CH01d03 locus was assigned to LG12. The CH01d03 allele displayed recombination in 11 of the 38 individual plants with large FSI values (a recombination rate of 28.9%). The distance between the target gene and CH01d03 was 33.0 cM.

Validation of microsatellite markers

To validate the microsatellite markers identified above and to narrow down their distance to the target genes, markers near the predicted sites of the major genes were tested in the same segregating population. CH02c11, located on LG10, was linked with FSI at a distance of 22.4 cM (a recombination rate of 21.1%). Two SSR markers on LG11, CH04a12 and Ch02d12, were closely linked to the target gene, with map distances of 2.6 and 5.3 cM (2.6 and 5.3% recombination), respectively. On LG12, the recombination rate between CH03c02 and FSI was 10.5%, and the map distance was 10.7 cM.

AFLP marker

A total of 256 AFLP (Vos et al. 1995) markers were used to screen for loci linked to FSI. Only one primer pair (M-CTA/E-AGG) produced a polymorphic band (240 bp) that was amplified from the small FSI bulk but not from the large FSI bulk. Most individuals with small FSI values and the maternal parent cultivar, ‘Jonathan’, showed amplification with this primer set, whereas only seven of the 38 seedlings (18.4%) with large FSI values did. The map distance between the target gene and M-CTA/E-AGG-0240 was 19.3 cM.

The relationship between SSR and AFLP markers

Linkage analysis was performed for the SSR markers and the one AFLP marker using the Mapmaker/Exp 3.0 software. With a LOD threshold of 3 and a maximum distance of 37.2 cM, CH02d08, CH04a12, Ch02d12, Hi16d02 and M-CTA/E-AGG-0240 were located on the same linkage group. The distance between the AFLP marker and CH02d08 was 34.7 cM, but that between M-CTA/E-AGG-0240 and the other three SSR markers was larger than 37.2 cM. Linkage was also found between CH01f12 and CH02c11, and between CH01d03 and CH03c02. GD147 was left unlinked. The markers identified in this study were assigned and mapped to four linkage groups (Fig. 2). Thus, the variance in FSI values observed in this hybrid population is putatively controlled by the segregation of five major genes.

Fig. 2

Mapping of the FSI loci. Numbers on the left of each linkage group (LG) indicate the map distance (in cM). Bold on the right of each LG indicates the position of the FSI locus and their markers. G Golden Delicious, J Jonathan; the following number means the linkage group

Genotypes of the parents

Chi-square analysis showed that the segregation ratios of all SSR markers except Hi16d02 in the hybrid population were not significantly different from 1:1 at the 5% significance level. The ratios for Hi16d02 and M-CTA/E-AGG-0240 were not significantly different from 1:1 at the 1% significance level. The SSR markers CH01f12 and CH02c11, associated with small FSI values, were present only in the pollen parent, ‘Golden Delicious’, and not in the maternal parent, ‘Jonathan’. In contrast, the other SSR markers and the AFLP marker M-CTA/E-AGG-0240 were present only in the female parent, ‘Jonathan’ (Table 3). Based on these findings, both parents were expected to display small FSI values, which is consistent with their actual phenotypes. The genotype of ‘Jonathan’ may be presumed to be f10f10F11-1f11-1F11-2f11-2F12f12F13f13 and that of ‘Golden Delicious’ F10f10 f11-1f11-1f11-2f11-2f12f12f13f13 (F: dominant allele for small FSI; f: recessive allele for large FSI; subscript number: linkage group).

Discussion

Inbred lines are rarely available for fruit crops and the genotypes of parental cultivars are usually unknown. Most traits segregate extensively in the F 1 population, and for many fruit species, it is often impossible to obtain an F 2 population due to self-incompatibility. Therefore, early genetic studies on fruit crops have mainly focused on qualitative traits, e.g., columnar tree habit in apples (Hemmat et al. 1997), cling/free stone and white/yellow flesh in peaches and nectarines (Warburton et al. 1996). More recently, joint segregation analysis (JSA) was developed for the analysis of inheritance of quantitative traits in plants (Gai et al. 2003). This method has been efficiently used in chickpea (Cicer arietinum L.) and many other plants (Anbessa et al. 2006). Most traits associated with important economic interest are quantitative traits, making analysis of frequency distribution patterns a potential approach for analysing the inheritance in horticultural crops.

By means of molecular markers or quantitative trait loci (QTLs), polygenes can be treated as groups of Mendelian factors. However, the small segregating populations or the small numbers of individuals in segregant bulks used in many reports may easily result in false-positive associations, especially for quantitative traits that are strongly affected by environmental factors. For example, Wu et al. (2004) composed DNA pools of only six individuals from an F 1 population (69 seedlings) to screen for molecular markers linked to the non-acid/acid fruit trait in peach, while Yao et al. (2006) used pools of just eight individuals when screening for markers linked to the acid/low-acid trait in apple. To minimise environmental error in this study, we examined 138 seedlings in the segregating population (selected from 1162 individuals in total), as well as bulks containing DNA from 30 individuals with large FSI values and 30 individuals with small FSI values.

The most extreme phenotypes should be selected to generate segregation bulks when screening molecular markers for qualitative traits or even some quantitative traits (Michelmore et al. 1991; Chagué et al. 1997). The segregation within either of the two distributions shown in Fig. 1 is caused by the combined effects of polygenes, major genes and environmental variation. Thus, if the BSA bulks are composed of a sufficient number of individual seedlings with extremely large or small FSI values, the molecular markers identified will be linked both to major genes and to some of the polygenes. In this study, individuals were instead selected randomly from each of the two distributions, with either large or small FSI values, to reduce interference from the polygenes. We proposed that FSI is controlled by five major genes. This hypothesis was confirmed by molecular markers, and the identified major gene loci were located on LG10, LG11, LG12 and LG13. This finding agrees with previous results indicating that QTLs controlling fruit diameter are present on LG5, LG10 and LG17 (Kenis et al. 2008), the QTL on LG10 co-localising with our major gene locus. Kenis et al. (2008) reported this QTL to lie between the SSR markers CH02c11 and COL, a region that overlaps the assigned position of our gene, 22.4 cM downstream of marker CH02c11. We also inferred the genotypes of the parents from the χ2 analysis of the marker segregation ratios in the hybrid population. In general, the smaller the genetic distance between a marker and its target gene, the more accurately the marker predicts the trait. However, the map distances between the target genes and the markers identified here are still rather large. Additional markers with smaller map distances to the target genes will be required before such markers can be used in practical breeding programs.

The separation of the large- and small-FSI sub-populations is caused by the segregation of non-additive major genes; otherwise, FSI would be normally distributed. However, the small-FSI part of the frequency distribution consists of 31 different FSI genotypes. For a better understanding of the mode of action of the different genes, it would have been interesting to genotype these subsets for the identified SSR markers.