Introduction

Hybridization is an important mechanism in species formation either leading to new species or preventing the differentiation and evolution of different species (Barton and Hewitt 1985; Futuyma and Shapiro 1995; see also Seehausen 2004 for a review). The fate of hybrids and the temporal and spatial dynamics of hybrid zones depend on hybrid fitness in comparison to non-hybrids (Barton and Hewitt 1989). It is often assumed that hybrid fitness is lower than non-hybrid fitness because of genetic incompatibilities (Dobzhansky 1970) and that hybrid zones are maintained over time due to the continuing gene flow between the parental species (tension zone model; Barton and Hewitt 1985). However, hybrids can be more fit than either of the parental species in a specific habitat or they can outperform parents due to heterosis or evolutionary novelty (see Hewitt 1988 for a review).

The classic view of ecological speciation states that reproductive isolation evolves in allopatry during adaptation to different environments (by-product mechanism) and is reinforced by reduced hybrid fitness (Butlin 1989). Hybrid zones are assumed to form via secondary contact after geographic divergence in allopatry. Nevertheless, parapatric speciation may occur under particular conditions, such as ecological adaptation followed by in situ evolution of assortative mating (see Jiggins and Mallet 2000 for a review). Thus, information on crossing among races or varieties and on the fitness of hybrids is critical for understanding ecological speciation and the evolution of hybrid zones (see Schluter 2001 for a review).

Hancornia speciosa Gomes (Apocynaceae) is a Neotropical tree widely distributed in savannas and open vegetation from the Northeast towards Central-West Brazil, Paraguay, Bolivia, and Peru. The species is hermaphroditic and is pollinated mainly by moths. Controlled pollination experiments suggest that the species is self-incompatible (Darrault and Schlindwein 2005). H. speciosa was split into six varieties based on morphological differences in leaves and flowers (Monachino 1945), with parapatric or sympatric geographic distributions. H. speciosa var. speciosa (Gomes) occurs from the Northeast towards the Central-West and North Brazil; H. speciosa var. maximilliani (A. DC.) and H. speciosa var. lundii (A. DC.) occur in Southeast Brazil; H. speciosa var. cuyabensis (Malme) occurs in Mato Grosso, Central-West Brazil; H. speciosa var. gardneri (A. DC.) Muell. Arg. and H. speciosa var. pubescens (Nees and Martius) Muell. Arg. occur in central Brazil. The morphological differentiation suggests long-term restriction on gene flow and ecological adaptation or mutation of characters determined by a low number of genes (Coelho and Valva 2001; Chaves 2006). In fact, high genetic differentiation has been observed among populations of H. speciosa var. speciosa in Northeast Brazil using microsatellites (Amorim et al. 2015) and among populations of H. speciosa var. gardneri and H. speciosa var. pubescens in Central-West Brazil using RAPD (F_ST = 0.197, Moura et al. 2011). However, some varieties are sympatric or parapatric with contact zones, and individuals with intermediate phenotypes may be observed, suggesting no restriction on gene flow (Chaves 2006) and the maintenance of hybrid zones. Thus, a study of mating among varieties and progeny fitness is critical to understanding hybrid zone dynamics and the evolution of differentiation among varieties. If in situ parapatric differentiation is occurring in H. speciosa, we predict no cross-pollination among varieties or lower hybrid fitness.

The edible fruit pulp of H. speciosa is used as raw material for candies, ice cream, and juice by small- and medium-sized enterprises in Central-West and Northeast Brazil, playing an important role in local economies. Despite its importance to the human population, all fruit markets are based on harvesting from populations in the wild, and no systematic breeding program or seed orchards have yet been developed (Lederman and Bezerra 2006). No information on pollen dispersal is available for H. speciosa nor whether the different varieties can cross-pollinate. In addition, one of the main constraints to a H. speciosa breeding program is the slow growth of seedlings in nursery, the long generation time compared to annual crops (c. 5 years of age at the first reproduction and c. 10 years to full production), and the strong flowering and fruiting seasonality, usually restricted to the spring (Lederman and Bezerra 2006). In addition, the potentially existing self-incompatible system may constrain the numbers of compatible mating within a germplasm collection. Therefore, the assessment of pollen flow dynamics and pollen contamination in a germplasm collection is highly important in order to define breeding methods and develop management practices that reduce pollen contamination (El-Kassaby and Askew 1998). The information on the genetic diversity conserved in a germplasm collection, the reproductive system, and pollen-mediated gene flow may provide fundamental information for breeding programs and genetic resource conservation (e.g., Buiteveld et al. 2001; Stoehr and Newton 2002; Moriguchi et al. 2004).

In this work, we studied cross-pollination among H. speciosa varieties in a germplasm collection, to better understand the evolution of varieties and generate useful information for a breeding program. We compared progeny fitness in a nursery to address whether hybrid fitness differed from non-hybrids and whether maternal and paternal taxa make different contributions to fitness. We also compared the genetic diversity and differentiation among varieties. For this, we analyzed open-pollinated progeny arrays from the germplasm collection. Progeny arrays were genotyped using seven microsatellite loci for paternity assignment. Progeny fitness was determined by measuring six quantitative traits in the nursery (days to shoot, growth rate, leaf width, leaf length, stem diameter, and plant height).

Materials and methods

The germplasm collection

The study was performed in the H. speciosa germplasm collection of “Escola de Agronomia, Universidade Federal de Goiás” (EA/UFG hereafter), a study plot planned as a basis for breeding programs. The collection was planted in December 2005 in an experimental area within the University Campus at Goiânia, GO, Brazil. For this, 109 open-pollinated progeny arrays were sampled in 34 wild populations comprising four varieties (Supporting Information Table S1). In each population, three to five progeny arrays were sampled (c. 20 seeds per mother tree) and grown in a nursery in a completely randomized block design. Four seedlings per progeny from 58 mother trees were planted in the field in a completely randomized block experimental design (four blocks) in 5 × 6 m spacing (Ganga et al. 2009). One seedling from another 84 mother trees was planted in two other blocks in 5 × 6 m spacing. The collection then had 274 adult individuals in an orchard of 220 × 60 m with four varieties: H. speciosa var. pubescens (65 individuals), H. speciosa var. gardneri (159), H. speciosa var. speciosa (21), and H. speciosa var. cuyabensis (29). Variety identification was checked again in adult plants in germplasm collection using leaf morphology following Monachino (1945).

Experimental design and sampling

We analyzed open-pollinated progeny arrays to evaluate natural crossing within and among varieties and offspring fitness. Plants started to flower at c. 5 years old (2010), and by 2012 (7 years old), all plants had already had at least one flowering season. Twenty-seven mother trees were randomly chosen for seed sampling in the fruiting season (August and September) of 2012. Fruit development in H. speciosa is very slow, so the fruits collected in 2012 originated from the flowering season of 2011. From the 27 mother trees, 14 were H. speciosa var. gardneri, 7 H. speciosa var. pubescens, and 6 H. speciosa var. cuyabensis. H. speciosa var. speciosa had no fruits in the fruiting season of 2012. Differences in the number of mother trees were mainly due to the number of flowering and fruiting trees for each variety. Seeds (560) were grown in the nursery in a completely randomized experimental design, and leaves from 320 seedlings were collected for DNA extraction and genotyping for parentage, mating system, and genetic diversity analyses (see below). The number of seedlings analyzed per mother tree differed due to variation in germination (Table 1). We also sampled leaves from all adults (274) to genotype for parentage, mating system, and genetic diversity analyses.

Table 1 Comparison of outcrossing rates and number of pollen donors among Hancornia speciosa mother trees in the germplasm collection

Genetic analysis

We genotyped all adults and progeny arrays using seven microsatellite loci (Rodrigues et al. 2015). DNA was extracted from leaves using the CTAB 2 % protocol (Doyle and Doyle 1987). Forward primers were marked with fluorescent dyes 6-FAM, HEX, and NED (Applied Biosystems, CA). PCR was performed in 15-μl reactions with 10-ng template DNA, 1× reaction buffer (10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2), 250 μM of each dNTP, 250 μg BSA, 2.16 μM of each primer, and 1 U Taq DNA polymerase (Phoneutria, BR). Amplifications were performed in a GeneAmp PCR System 9700 (Applied Biosystems, CA) with one cycle at 95 °C for 5 min; 30 cycles of 95 °C for 1 min, annealing temperature for 1 min (specific for each primer pair, see Rodrigues et al. 2015), and 72 °C for 1 min; and a final step at 72 °C for 30 min to enforce 3′ Taq adenylation. PCR fragments were electrophoresed with GeneScan 500 internal lane standard (ROX, Applied Biosystems, CA) in an ABI Prism 3100 automated DNA sequencer (Applied Biosystems, CA) and automatically sized using the software GeneMapper® v4.1 (Applied Biosystems, CA).

All individuals were genotyped twice and genotypes were visually reviewed and then Micro-Checker software (van Oosterhout et al. 2004) was used to detect errors due to stutter bands, allele drop-out and null alleles. We found no evidence of genotyping errors due to stutter bands or allele drop-out when the raw data were analyzed. We also found no evidence of null alleles.

Parentage analysis, gene flow among varieties, and mating system

To analyze crossing among varieties and offspring fitness, we first determined seed parentage using an assignment test implemented in the software CERVUS 3.0 (Kalinowski et al. 2007).

Because low DNA quality produced unreliable genotypes for some individuals, we successfully genotyped 258 of the 274 adults. Thus, the 258 adult trees were considered candidate pollen donors for the 320 seeds analyzed. For parentage determination, we performed two simulations: one allowing selfing and another allowing only cross-pollination. For both simulations, we considered a genotyping error of 0.01 and assumed that 95 % of the candidates were sampled, because some individuals had not been genotyped and there was an arboretum close to the germplasm collection. We performed 10,000 parentage simulations at 95 % (strict) and 90 % (relaxed) confidence.

We also obtained pollen dispersal distances and the variance of dispersal distance (σ_p²). To verify whether the pollen dispersal distance distribution followed the adult distribution in the experimental area, we performed a Kolmogorov-Smirnov test. We estimated the number of pollen donors for each mother tree from the proportion of seeds sired by each pollen donor, N_p = 1/∑p_i² (Burczyk et al. 1996), where p_i is the proportion of seeds sired by pollen donor i.
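As a rough illustration of the estimator above, the following Python sketch computes N_p for one mother tree from a list of paternity assignments. The function name and the donor labels are hypothetical and are not part of the original analysis; the sketch assumes that p_i is the proportion of a mother tree's assigned seeds sired by donor i.

```python
from collections import Counter

def effective_pollen_donors(sires):
    """Return N_p = 1 / sum(p_i^2) for one progeny array.

    `sires` holds the assigned pollen donor of each seed (hypothetical labels);
    p_i is the proportion of assigned seeds sired by donor i.
    """
    counts = Counter(sires)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no assigned seeds")
    return 1.0 / sum((n / total) ** 2 for n in counts.values())

# Hypothetical progeny array: 8 seeds, donor A sired 4, B sired 2, C and D one each.
example = ["A", "A", "A", "A", "B", "B", "C", "D"]
n_p = effective_pollen_donors(example)
```

When all donors sire equal shares, N_p equals the number of donors; unequal shares, as in the hypothetical array above, give a smaller value.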

We estimated the differentiation of allelic frequencies among the pollen pools sampled by the females in the population, Φ_FT (Smouse et al. 2001; Austerlitz and Smouse 2001), using the software TWOGENER (Austerlitz and Smouse 2001). To verify whether pollen pool differentiation was correlated with spatial distance, we performed a Mantel test with 10,000 permutations between pairwise Φ_FT and spatial distance matrices.
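The permutation logic of a Mantel test can be sketched in plain Python as follows. This is a minimal illustration and not the implementation used in the study: the function names, the one-tailed p-value, and the two small matrices are assumptions made for the example.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(a, b, permutations=10000, seed=0):
    """Return (r, one-tailed p) for two symmetric distance matrices."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    fixed = upper_triangle(b, identity)
    observed = pearson(upper_triangle(a, identity), fixed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of `a` together
        if pearson(upper_triangle(a, order), fixed) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Hypothetical example: four mother trees placed along a line at 0, 1, 3 and 7 m.
positions = [0.0, 1.0, 3.0, 7.0]
spatial = [[abs(p - q) for q in positions] for p in positions]
genetic = [row[:] for row in spatial]  # identical matrix, so r should be 1
r, p = mantel(genetic, spatial, permutations=999)
```

Rows and columns are permuted together so that each tree keeps its full set of pairwise distances; shuffling individual matrix cells would break that dependence.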

Finally, we analyzed the H. speciosa mating system using the mixed-mating correlated model implemented in the software MLTRWin (Ritland 2002). Mating system parameters were obtained under the assumption of a common gene pool for all varieties, given the evidence of gene flow among varieties (see below). We estimated the multilocus outcrossing rate (t_m), the single-locus outcrossing rate (t_s), biparental inbreeding, i.e., outcrossing between related individuals (t_m − t_s), the multilocus correlation of outcrossed paternity within progeny arrays (r_p), and the correlation of self-fertilization between progeny (r_s). The standard error for each estimate was obtained with 10,000 bootstrap resamples. We obtained pollen and ovule allele frequencies and tested for deviation from homogeneous distribution using contingency tables for each locus. The Fisher exact test was performed using a Markov Chain method, implemented in the software Genepop 4.2.2 (Rousset 2008).

Using the multilocus correlation of outcrossed paternity within progeny arrays (r_p), we estimated the effective number of pollen donors (N_ep = 1/r_p) in the germplasm collection, i.e., the equivalent number of male parents in an idealized population where all males in the neighborhood have equal fertilities. We also estimated the proportion of half sibs [P_HS = t_m(1 − r_p)] and full sibs (P_FS = t_m r_p) in the progeny arrays.
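The three formulas above can be applied directly, as in this short Python sketch. The function name is hypothetical; the input values are the overall estimates reported in the Results (t_m = 0.990, r_p = 0.076).

```python
def sibship_summary(t_m, r_p):
    """Return (N_ep, P_HS, P_FS) from the outcrossing rate and paternity correlation."""
    if r_p <= 0:
        raise ValueError("r_p must be positive to compute N_ep = 1 / r_p")
    n_ep = 1.0 / r_p           # effective number of pollen donors
    p_hs = t_m * (1.0 - r_p)   # proportion of half sibs
    p_fs = t_m * r_p           # proportion of full sibs
    return n_ep, p_hs, p_fs

# Overall estimates reported in the Results section.
n_ep, p_hs, p_fs = sibship_summary(t_m=0.990, r_p=0.076)
```

With these inputs the sketch gives values that round to N_ep = 13.2, P_HS = 0.915 and P_FS = 0.075.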

Offspring fitness analysis

We measured the fitness component in offspring for six quantitative traits. Plant height (PH, cm) and stem diameter (SD, mm) were measured 133 days after seed germination. Growth rate (GR, mm/day) was obtained from the regression coefficient (b) of height (Y, mm) and variation in time (X, day). For this, we measured plant height seven times, from germination day to 133 days later. Leaf length (LL, mm) and width (LW, mm) were obtained from the mean value among three leaves per plant, also measured up to 133 days after germination. We also obtained the number of days to shoot (DS).
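Growth rate as the regression coefficient b of height on time can be sketched with an ordinary least-squares slope. The function name, the measurement days, and the heights below are hypothetical values chosen for illustration; only the number of measurements (seven) and the 133-day span come from the text.

```python
def growth_rate(days, heights_mm):
    """Least-squares slope b of height (mm) on time (days), in mm/day."""
    n = len(days)
    if n != len(heights_mm) or n < 2:
        raise ValueError("need at least two paired measurements")
    mean_x = sum(days) / n
    mean_y = sum(heights_mm) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, heights_mm))
    return sxy / sxx

# Hypothetical seedling: seven height measurements from germination to day 133.
days = [0, 22, 44, 66, 88, 110, 133]
heights_mm = [12.0, 25.0, 41.0, 52.0, 68.0, 79.0, 95.0]
gr = growth_rate(days, heights_mm)
```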

We estimated heterosis as the difference in mean value of each quantitative trait between varieties and hybrid offspring. To test for heterosis or exogamic depression, we performed a one-way ANOVA with planned contrasts corresponding to the comparison of crossing within varieties, hybrids, and the contrast between varieties × hybrids. A significant result for the contrast varieties × hybrids may indicate heterosis or exogamic depression. For this analysis, we excluded H. speciosa var. speciosa (S) because it had no fruits to analyze heterosis effects. Three variety crosses were considered: H. speciosa var. gardneri (G×G), H. speciosa var. cuyabensis (C×C), and H. speciosa var. pubescens (P×P). For hybrid crosses, we also considered three groups (without reciprocal cross effect): H. speciosa var. cuyabensis (C) × H. speciosa var. gardneri (G), H. speciosa var. cuyabensis (C) × H. speciosa var. pubescens (P), and H. speciosa var. gardneri (G) × H. speciosa var. pubescens (P).

To verify maternal effects on the fitness of offspring, we also used a one-way ANOVA with planned contrast between hybrids and reciprocals. Regarding the maternal effect, we expected to find a strong asymmetry between maternal and paternal effects in the ANOVA and differences among reciprocal crosses. For hybrid crosses and reciprocals, we excluded H. speciosa var speciosa (S) and considered the hybrids and reciprocals: C×G; G×C; C×P; P×C; G×P; P×G (the first variety corresponds to maternal plant).

We also compared the offspring fitness among varieties by performing a one-way ANOVA including H. speciosa var. speciosa hybrids without reciprocal cross effect. For this, we compared the six hybrids: H. speciosa var. cuyabensis × H. speciosa var. gardneri (C×G/G×C), H. speciosa var. cuyabensis × H. speciosa var. pubescens (C×P/P×C), H. speciosa var. cuyabensis × H. speciosa var. speciosa (C×S), H. speciosa var. gardneri × H. speciosa var. pubescens (G×P/P×G), H. speciosa var. gardneri × H. speciosa var. speciosa (G×S), and H. speciosa var. pubescens × H. speciosa var. speciosa (P×S) (the first variety corresponds to maternal plant).

Genetic diversity in adults, progeny arrays, and varieties

We characterized the genetic diversity conserved in the germplasm collection and compared the genetic diversity and polymorphism at the seven microsatellite loci for adults and progeny arrays. We estimated the number of alleles per locus (A) and observed (Ho) and expected heterozygosity (He) based on Hardy-Weinberg equilibrium (Nei 1978). Analyses were performed with the software FSTAT 2.9.3.2 (Goudet 2002) and comparisons between adults and progeny arrays and among varieties were performed using the permutation test (with 10,000 permutations). We also estimated for adults the probability of genetic identity for each locus (I_i) (Paetkau et al. 1995), which corresponds to the probability of two random individuals displaying the same genotype, and the paternity exclusion probability (Q_i) (Weir 1996), which corresponds to the power with which a locus excludes a random individual as the pollen donor of an offspring. The combined probability of paternity exclusion, Q_C = 1 − Π(1 − Q_i), and the combined probability of genetic identity, I_C = Π I_i, were also estimated for the seven loci.
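The two combined probabilities are simple products over loci, as this Python sketch shows. The function names and the per-locus values are hypothetical and serve only to illustrate the arithmetic; they are not the estimates from Table 4.

```python
import math

def combined_exclusion(q_per_locus):
    """Q_C = 1 - prod(1 - Q_i): probability that at least one locus excludes."""
    return 1.0 - math.prod(1.0 - q for q in q_per_locus)

def combined_identity(i_per_locus):
    """I_C = prod(I_i): probability that two individuals match at every locus."""
    return math.prod(i_per_locus)

# Hypothetical per-locus values for seven loci (illustration only).
q_values = [0.80, 0.75, 0.70, 0.65, 0.60, 0.20, 0.10]
i_values = [0.02, 0.03, 0.05, 0.04, 0.06, 0.40, 0.55]
q_c = combined_exclusion(q_values)
i_c = combined_identity(i_values)
```

Both formulas treat the loci as independent, which is why adding loci drives Q_C towards 1 and I_C towards 0 even when individual loci are weakly informative.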

To compare molecular genetic diversity among varieties, we estimated the mean number of alleles per locus (A), allelic richness based on rarefaction analysis (Mousadik and Petit 1996), observed heterozygosity (Ho), and expected heterozygosity (He) based on the Hardy-Weinberg equilibrium (Nei 1978). All analyses were performed using the software FSTAT 2.9.3.2 (Goudet 2002) and comparisons between adults and progeny arrays and among varieties were performed using the permutation test (with 10,000 permutations) also implemented in FSTAT 2.9.3.2. Genetic differentiation among varieties in the germplasm collection was verified using a hierarchical analysis of variance (Weir 1996). Total variance was partitioned into variance components based on varieties (among groups), populations within varieties (among populations within groups), progenies within populations, and within individuals.

Results

Parentage analysis, gene flow among varieties, and mating system

There were no self-fertilization events among the most likely parentages, and no difference was found in paternity assignment between the two simulations, one allowing selfing and another allowing only cross-pollination. Thus, we show only the results from the simulation allowing selfing, with critical ∆ = 3.88 at 95 % confidence (89 % of the assignments) and ∆ = 2.32 at 90 % confidence (11 % of the assignments).

Of the 320 seeds analyzed, we could assign paternity for 209 (65.3 %); 186 of these assignments (89 %) were made at 95 % (strict) confidence and the remainder at 90 % (relaxed) confidence. We identified 121 different pollen donors, but the number of pollen donors per mother tree ranged from 1.0 to 12.4 (Table 1). We also detected mating among the four varieties of H. speciosa in the germplasm collection (see Supporting Information Table S2). H. speciosa var. gardneri sired the greatest number of seeds (113), followed by H. speciosa var. pubescens (57), H. speciosa var. cuyabensis (28), and H. speciosa var. speciosa (11). The distribution of the number of seeds sired by varieties followed the number of trees of each variety in the germplasm collection (contingency table, χ² = 0.002, p = 0.997). An individual of H. speciosa var. gardneri (Fam5) sired the most seeds (9), followed by another H. speciosa var. gardneri (6) and an individual of H. speciosa var. pubescens (Fam12, six seeds). However, most trees sired only one seed. All trees that sired more than three seeds were H. speciosa var. gardneri or H. speciosa var. pubescens.

Although maximum pollen dispersal distance was greater than 160 m (165.6 m), mean dispersal distance was short (53.5 m, SD = 93.0 m); most pollen dispersal (64 %) occurred at distances less than 60 m, and only 11 % occurred at distances greater than 100 m. The frequency distribution of pollen dispersal distance was significantly different from the adult distance distribution (Kolmogorov-Smirnov test = 0.187, p = 0.035). Analyses performed using TWOGENER showed low differentiation in the pollen pool received by mother trees (Φ_FT = 0.038). The differentiation in the pollen pool among mother trees was not related to pairwise spatial distance (r = 0.014, p = 0.253).

The multilocus outcrossing rate (t_m) was equal to 1.0 in all progeny arrays analyzed (Table 1). However, single-locus outcrossing rates (t_s) ranged from 0.693 to 1.000 (Table 1); thus, biparental inbreeding varied widely among families, ranging from 0.000 to 0.307. The overall multilocus (t_m = 0.990, SE = 0.007) and single-locus (t_s = 0.899, SE = 0.016) outcrossing rates were high, and biparental inbreeding was low but significantly different from zero (t_m − t_s = 0.091, SE = 0.016), showing that 9.1 % of outcrossing was between closely related individuals (Table 1). We also found a high variation in the number of pollen donors per mother tree (Table 1), but the effective number of pollen donors over all mother trees was high (N_ep = 13.2). The allele frequency distributions of pollen and ovule pools were heterogeneous for all loci (p < 0.01), indicating non-random sampling of the pollen pool by maternal trees.

Correlation of selfing was close to 1.0 (r_s = 0.999, SE = 0.107) showing that no seeds were sired by self-pollination. Multilocus correlation of outcrossed paternity within progeny arrays was low but significantly different from zero (r_p = 0.076, SE = 0.014), indicating that for each mother tree, the probability that the same pollen donor sired two random sibs was 7.6 %. In fact, the proportion of full sibs was low (P_FS = 0.075) and the proportion of half sibs was high (P_HS = 0.915).

Offspring fitness

Fitness components were evaluated for 200 individuals (seedlings that had data for both quantitative traits and genotypes). ANOVAs with planned contrasts to test for heterosis or exogamic depression showed significant variation in stem diameter among offspring varieties and days to shoot and leaf width for hybrid offspring (Fig 1, see Supporting Information Tables S3-S8 for ANOVA results). Although hybrids of H. speciosa var pubescens tended to have longer leaf length (Table 2) and greater differences in leaf length compared to varieties offspring (see Supporting Information Table S9), the differences were not significant. No significant effect for the contrast hybrids vs variety was detected for any fitness component (Tables S3-S8), showing no evidence of heterosis or exogamic depression. We found significant differences in plant heights for reciprocal crosses (Table S8, Fig 2). The effect was due to greater plant heights for the C×G cross than for G×C (Fig 2, Table 3). The analysis including H. speciosa var speciosa showed significant differences among hybrids only for leaf widths (Tables S3-S8).

Fig. 1

Comparison among fitness component means (bars are 95 % confidence intervals) for varieties and hybrid offspring. DS number of days to shoot, GR growth rate in millimeters per day, LW leaf width in millimeters, LL leaf length in millimeters, SD stem diameter in millimeters, PH plant height in centimeters

Table 2 Fitness component means and their standard deviations (in parentheses) for variety and hybrid offspring

Fig. 2

Comparison among fitness component means (bars are 95 % confidence intervals) for hybrids and their reciprocals offspring. DS number of days to shoot, GR growth rate in millimeters per day, LW leaf width in millimeters, LL leaf length in millimeters, SD stem diameter in millimeters, PH plant height in centimeters. The codes for variety crosses are C, H. speciosa var. cuyabensis; G, H. speciosa var. gardneri; P, H. speciosa var. pubescens. The left letter corresponds to the maternal parent

Table 3 Fitness component means and their standard deviations (in parentheses) for hybrids and their reciprocal offspring

Genetic diversity in adults, progeny arrays, and varieties

Most loci showed high polymorphism, with numbers of alleles ranging from 2 to 27 (Table 4). Expected heterozygosity was high for all loci except Hs11 and Hs30 (Table 4). Nevertheless, the combined probability of genetic identity (3.34 × 10^−11) and of paternity exclusion (0.9999) showed that, together, the seven loci were suitable for parentage analysis (Table 4).

Table 4 Characterization of the seven microsatellite loci of Hancornia speciosa and comparison of genetic diversity and polymorphism between adults (237 individuals) and progeny arrays (320 seedlings) from the germplasm collection

Both adults and progeny arrays presented high mean numbers of alleles (17.1 and 15.7, respectively, Table 4) that did not significantly differ (p > 0.05, Table 4). Although some alleles were lost in the progeny arrays (Hs01, Hs05, Hs08, Hs16), leading to lower genetic diversity for some loci, no significant reduction in overall observed and expected heterozygosity was observed (p > 0.05, Table 4).

All varieties had high polymorphism and genetic diversity (Table 5). The number of alleles per locus, allelic richness, and genetic diversity did not differ among varieties (p > 0.05). Genetic differentiation among varieties (F_CT = 0.019, p < 0.001) and among populations within varieties (F_SC = 0.053, p < 0.001) was significant but low (Table 6). Most variation was due to non-random mating within populations (F_IS = 0.162, p < 0.001), leading to a total fixation index (F_IT) equal to 0.222.
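The reported fixation indices can be cross-checked with the standard multiplicative relation among hierarchical F-statistics, 1 − F_IT = (1 − F_IS)(1 − F_SC)(1 − F_CT). The Python sketch below assumes that this three-level relation applies to the hierarchy analyzed here; the function name is hypothetical.

```python
def total_fixation(f_is, f_sc, f_ct):
    """F_IT from 1 - F_IT = (1 - F_IS)(1 - F_SC)(1 - F_CT)."""
    return 1.0 - (1.0 - f_is) * (1.0 - f_sc) * (1.0 - f_ct)

# Values reported in the text (Table 6).
f_it = total_fixation(f_is=0.162, f_sc=0.053, f_ct=0.019)
```

Under this assumption the result agrees with the reported F_IT = 0.222 to within the rounding of the three input values.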

Table 5 Comparison of genetic diversity and polymorphism among Hancornia speciosa varieties from the germplasm collection
Table 6 Hierarchical analysis of variance for Hancornia speciosa in the germplasm collection

Discussion

Our results showed that the four varieties of H. speciosa can cross-pollinate in the germplasm collection. Although the varieties sired different numbers of seeds, the differences were most likely due to the proportion of individuals of each variety grown in the germplasm collection and also to unequal flowering among varieties (LJ Chaves, unpublished data). Although individuals of all varieties flowered at 5 years of age, most individuals of H. speciosa var. speciosa flowered only at 7 years. The difference in flowering phenology among varieties may be related to climate and photoperiod differences between the germplasm collection locality (central Goiás, Central-West Brazil) and the area where each variety occurs. For instance, H. speciosa var. speciosa occurs mainly in the Central-North towards the Northeast of Brazil, a drier region with a different photoperiod compared to central Goiás, where the germplasm collection was established. The plants of this variety showed a lower growth rate in the experimental plot in comparison to the other varieties (Ganga et al. 2009). H. speciosa var. speciosa had the lowest number of sired seeds, the lowest number of individuals in the germplasm collection, and just a few individuals fruiting during our study. However, the difference in the number of days to shoot (DS) among hybrids was mainly due to lower DS in hybrids of H. speciosa var. pubescens × H. speciosa var. gardneri. Hybrids of H. speciosa var. speciosa showed similar fitness compared to the other hybrids.

The results from cross-pollination among the varieties suggest no reproductive isolation and that the morphological differences among varieties are most likely due to genetic drift or selection of loci controlling morphological characteristics (see Chaves 2006). In fact, hierarchical analysis of variance in allele frequency showed that genetic differentiation among varieties and among populations for microsatellite loci are very low, and the differentiation is due to non-random mating among varieties, populations, and individuals in parental populations. In fact, although varieties and hybrids differed in some fitness components, hybrid offspring showed no heterosis or exogamic depression. Because our analyses excluded environmental effect on fitness, since fitness components were analyzed under the same nursery conditions (environment independent model), we cannot test for parapatric differentiation due to ecological adaptation. However, all crosses were viable and resulted in no exogamic depression for any fitness component analyzed, evincing that morphological differentiation among varieties is most likely due to genetic drift and ecological adaptation. In addition, although varieties can potentially cross-pollinate, as shown in this work, the mating in parental populations is non-random, which may facilitate the fixation of alleles leading to morphological differentiation.

We found evidence for maternal effect in seedling height because hybrid offspring of H. speciosa var cuyabensis × H. speciosa var gardneri had greater fitness than offspring from reciprocal crosses. Maternal effect in seed performance, such as seed germination and growth, is an important component of plant fitness (see Donohue 2009 for a review). However, it is important to note that our experiment was carried out in a common garden, and seeds were grown in a nursery under the same environmental conditions (soil, irrigation, photoperiod). Environmental maternal effect may be better evaluated in field experiments to better understand the importance of this mechanism to the evolution of varieties.

The maximum pollen flow distance (c.165 m) was less than the maximum distance between two trees in the germplasm collection (c. 240 m). The shorter distance may be due to the high density of individuals in the collection, leading to little pollen carried over by pollinators, as observed for another high density Neotropical tree, Caryocar brasiliense (Collevatti et al. 2010). However, we could not assign the pollen donor for 111 seeds (35 % of the seeds analyzed). This result may be due to missing data, since 16 adults could not be genotyped and also due to pollen immigration to the germplasm collection. Indeed, individuals of H. speciosa from an arboretum 1.5 km away from the germplasm collection may be a potential source of pollen contamination. The lack of significant correlation of the pollen pool differentiation among mother trees and the spatial distance may also be due to the immigration of foreign pollen into the germplasm collection, supporting the hypothesis of pollen contamination.

Our results also showed that, despite some alleles not having been sampled in the progeny array, the overall genetic diversity in the germplasm collection at EA/UFG was conserved in the progeny array, as the number of alleles and expected heterozygosity did not differ statistically. This result is important for breeding programs and conservation of genetic resources of H. speciosa in the EA/UFG germplasm collection because breeding programs may begin with a sample with high genetic diversity. This result differs from those reported for a germplasm collection of another Neotropical fruit tree, Hymenaea courbaril (Fabaceae, Feres et al. 2009) that had greater genetic diversity in adults than in progeny arrays.

The high outcrossing rate in the germplasm collection was expected given the high tree density (190.3 individuals/ha). High-density stands usually present higher multilocus outcrossing rates because pollinators move more frequently among flowering plants (Murawski and Hamrick 1991). Besides the lack of self-pollination, our results also showed a high proportion of biparental inbreeding (9.1%), taking into account the experimental design. Although sibs were grown in the germplasm collection, the seedlings were planted in a completely randomized block design with only one sib per block (four blocks) and two additional blocks without replications. Thus, sibs were in different blocks, which may decrease the probability of mating between closely related individuals. The low multilocus correlation of outcrossed paternity (0.076) may also be due to the high number of individuals flowering in the germplasm collection, which may lead to greater movement of pollinators among trees. The high effective number of pollen donors in the H. speciosa germplasm collection may likewise be due to the experimental design (randomized blocks), which may decrease the probability of unsuccessful mating between closely related individuals caused by a potential self-incompatibility system (see Darrault and Schlindwein 2005), and to the high density of trees in the collection. In fact, a self-incompatibility system was described for H. speciosa elsewhere (Darrault and Schlindwein 2005), and no self-pollination was observed in our study. In addition, the species presented a high outcrossing rate (c. 1.0), which further supports self-incompatibility. The effective number of pollen donors (13.2) was greater than that reported for populations of other Neotropical trees, such as Caryocar brasiliense (7.4, Collevatti et al. 2010), Solanum lycocarpum (10.2, Martins et al. 2006), Manilkara huberi (5.9, Azevedo et al. 2007), and Myracrodruon urundeuva (6.8, Gaino et al. 2010), and also greater than that of wind-pollinated species in seed orchards (e.g., Picea abies, Nep = 6.7, Burczyk et al. 2004; Pinus sylvestris, Nep = 12.2, Torimaru et al. 2012). However, we observed broad variation in the number of pollen donors among mother trees. Although sample size may explain a large part of the variation in Nep among mother trees (r² = 0.41), trees with the same sample size showed either high or low Nep (see Table 1), indicating that pollinator service may vary greatly among mother trees, most likely due to differences in flower abundance among them (Ghazoul 2005). Despite the variation in the number of pollen donors, the low differentiation in pollen pool among mother trees and the lack of correlation with spatial distance among trees indicate high movement of pollinators among mother trees in the germplasm collection. This is important for starting a breeding program using, for example, the Breeding Without Breeding strategy (BWB, El-Kassaby and Lstiburek 2009), because efficient recombination among genotypes maintains high variability and, consequently, allows high genetic gains.
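The two paternity statistics reported above, the multilocus correlation of outcrossed paternity (0.076) and the effective number of pollen donors (13.2), are consistent with the reciprocal relationship commonly used in correlated-mating models, in which the effective number of pollen donors is taken as the inverse of the paternity correlation. The short sketch below only illustrates that arithmetic; the function name is hypothetical, and the assumption that the estimate was obtained in exactly this way is ours, not a statement from this section.

```python
def effective_pollen_donors(paternity_correlation: float) -> float:
    """Return the effective number of pollen donors as 1 / r_p.

    Assumes the reciprocal relationship commonly used in correlated-mating
    models; the function name is illustrative only.
    """
    if not 0.0 < paternity_correlation <= 1.0:
        raise ValueError("paternity correlation must lie in (0, 1]")
    return 1.0 / paternity_correlation


# Multilocus correlation of outcrossed paternity reported in the text.
r_p = 0.076
n_ep = effective_pollen_donors(r_p)
print(f"Effective number of pollen donors: {n_ep:.1f}")  # 13.2
```

Under this relationship, a paternity correlation close to zero means that few outcrossed sibs within a progeny array share the same father, which corresponds to many effective pollen donors per mother tree.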

We found heterogeneity in the allele frequency distribution between the pollen and ovule pools, which violates an assumption of the mixed-mating model (Ritland and Jain 1981). This result indicates potential heterogeneity in reproductive success, pollen immigration, post-pollination selection, or non-random mating due to self-incompatibility and inbreeding. The low number of progeny arrays analyzed or unequal sample sizes among mother trees may also cause heterogeneity (Ritland and Jain 1981). Although pollen immigration cannot be ruled out, we hypothesize that unequal flowering within and among varieties in the germplasm collection, together with self-incompatibility, caused the observed heterogeneity. Other tropical tree species have also shown significant differences in allele frequency between the pollen and ovule pools, such as Caryocar brasiliense (Collevatti et al. 2001), Acacia aroma and Acacia macracantha (Casiva et al. 2004), and Solanum lycocarpum (Martins et al. 2006).

In conclusion, our study showed that the four varieties H. speciosa var. gardneri, H. speciosa var. pubescens, H. speciosa var. cuyabensis, and H. speciosa var. speciosa can cross-pollinate in the EA/UFG germplasm collection. We also found no heterosis or outbreeding depression in hybrids of the four varieties, raising the hypothesis that morphological differentiation among varieties is most likely due to ecological adaptation. The maternal effect on plant height, observed for at least one hybrid cross, deserves more in-depth investigation to better understand its role in the evolution of varieties and hybrid zones. The four varieties have low but significant genetic differentiation at neutral microsatellite loci. Our results also showed high diversity in the germplasm collection, and this genetic diversity was conserved in the progeny arrays. As a consequence, for a breeding program, a sample of seeds from the collection represents a broad composite population with one generation of recombination. Moreover, our results support the evidence that H. speciosa is self-incompatible, with an outcrossing rate close to 1.0.