Introduction

Natural selection alters the distribution of allelic frequencies at target loci in finite populations differently from other evolutionary forces such as genetic drift, migration, and mutation (Hartl and Clark 2006). Typically, selection reduces variation within populations; nevertheless, the balancing selection class of models forms a general exception to this rule. Balancing selection results in stable non-neutral distributions of allelic frequencies and therefore in the maintenance of alleles in the population. One typical balancing selection model is negative frequency-dependent selection (NFDS; Maruyama and Nei 1981), and probably the most classical example of NFDS involves plant self-incompatibility (SI) systems (Wright 1939).

Outcrossing in plants is functionally favored by dedicated genetic systems (Barrett 2003) and almost half of plant species employ SI systems to limit inbreeding (Barrett 2002). RNase-based gametophytic self-incompatibility (GSI) is the most phylogenetically widespread system (∼60 % of SI species, Franklin-Tong and Franklin 2003). Generally, a pollen grain sharing one of the self-incompatibility alleles (S-alleles) of the style fails to achieve fertilization, since the pollen tube cannot develop (Franklin-Tong and Franklin 2003). Self-fertilization is therefore impossible and mating between relatives is reduced.

NFDS arises in the genomic region associated with SI (Castric and Vekemans 2004) because any new incoming S-allele has a selective advantage in mating, which generates a high number of alleles within populations. This outcome is the product of the sexual selection process described above, which results in higher fitness for rare and novel alleles (Wright 1939). S-alleles are tightly linked owing both to close physical association on the chromosome and to a reduced recombination rate in the area (Kamau et al. 2007). A second consequence of NFDS at the population level follows from the assumption that all S-alleles are selectively equivalent, so that the strength of selection acting on an S-allele depends only on its frequency. Selective equivalence among the different allelic specificities should therefore lead to equal allelic frequencies at equilibrium (isoplethy hypothesis, i.e., all S-alleles should settle to an identical frequency, equal to the inverse of the number of S-alleles; Steiner and Gregorius 1999; Stoeckel et al. 2011; Wright 1939). In turn, the expected genetic signature of NFDS among populations is a reduction in population differentiation (compared to neutral loci) because of an increased effective migration rate at the genomic area associated with the S-locus (Glémin et al. 2005; Stoeckel et al. 2008). NFDS can be considered an evolutionary force that is modulated by migration (Schierup 1998) and that opposes genetic drift, since this selection pressure directly increases the frequency of rare alleles and thus limits the loss of alleles caused by drift (Wright 1939). Furthermore, if a plant species is characterized by both self-incompatibility and local dispersal, an antagonism can exist between their evolutionary effects. This antagonism will be manifested differently in markers linked to the fairly large genomic area associated with the S-locus (due to reduced recombination near the S-locus; Kamau et al. 2007; Kawabe et al. 2006) than in markers that are unlinked.

Molecular data are now available to explore the impacts of ribonuclease (S-RNase) based GSI systems at a fine level because the pistil and pollen components of such systems have now been identified and the corresponding genes have been sequenced in some species (Newbigin et al. 2008).

Prunus avium L. (wild cherry) is a self-incompatible, multipurpose, scattered forest tree species used for its high quality timber, fruits and aesthetic value, and it presents a very good model to test NFDS and related hypotheses. Its GSI is determined by the multi-allelic S-locus that codes for S-RNases in the style (Bošković and Tobutt 1996) and for S-haplotype-specific F-box proteins in the pollen (Yamane et al. 2003). When the S-allele of the pollen grain matches one of the S-alleles of the style, these ribonucleases inhibit the growth of the pollen tube in the style. Various S-alleles have been cloned and shown to contain several highly conserved regions, and both allele-specific and consensus primers have been designed to amplify across each of the two introns (Sonneveld et al. 2001; Sonneveld et al. 2003). Both introns, especially the second one, present substantial length polymorphism, allowing most of the alleles to be distinguished according to the size of polymerase chain reaction (PCR) amplification products.

For GSI systems, Wright (1939) stated that “under a moderate mutation/immigration rate of 10^−3, a population of 50 individuals should harbor about eight different S-alleles, whereas a population of 200 individuals would have about 15 S-alleles”. Empirical studies on S-allele diversity, employing either diallel crossing schemes or molecular-genetic approaches to assess S-genotypes, demonstrated that the number of distinct S-alleles in the examined populations is very high (Busch and Schoen 2008; Castric and Vekemans 2004). S-allele genotyping in P. avium led to the rapid confirmation of S-alleles and incompatibility groups in many cultivars and genotypes, as well as to the identification of putative novel S-alleles through correlations with new PCR products (Choi et al. 2002; De Cuyper et al. 2005; Hauck et al. 2002; Sonneveld et al. 2003; Tao et al. 1999; Yamane et al. 2000; Zhou et al. 2002). S-allele typing has become a useful tool for biodiversity assessment in wild cherry populations, accessions (De Cuyper et al. 2005; Schueler et al. 2006), and germplasm collections (Gisbert et al. 2008; Marchese et al. 2007; Schuster et al. 2007). Up to 16 different S-alleles and 29 incompatibility groups have been reported thus far using different methods (Schuster et al. 2007).

Herein, we present results concerning allelic distribution and genetic structure at the gametophytic self-incompatibility locus in five wild cherry (P. avium L.) populations. We also used previously published data on eleven nuclear microsatellite (SSR) loci for the same populations (Ganopoulos et al. 2011) to compare population genetic structure and differentiation between microsatellites and the S-locus, together with data on the S-allele distribution in Greek (Ganopoulos et al. 2010) and other European (Boskovic and Tobutt 2001; De Cuyper et al. 2005) sweet cherry cultivars. The objectives of this study were: (1) to determine the allelic diversity of S-RNases by estimating numbers of alleles and allele frequencies at the S-locus for five natural wild cherry populations in Greece, (2) to compare S-locus genetic diversity of Greek natural populations to the genetic diversity of the Greek and other western European cultivars, and (3) to investigate the occurrence of NFDS and its consequences regarding (a) the number of S-alleles and genetic diversity at the S-locus, (b) isoplethy, (c) S-locus neutrality, (d) migration rate, and (e) population differentiation.

Materials and methods

Study site and sampling design

Five natural populations, covering most of the northern Greek range of P. avium, were sampled (Table 1). Sample sizes per population were influenced by census population sizes, the discontinuous distribution of wild cherry in nature, and the need to avoid sampling clonal or filial structures. We collected young leaves from individual trees spaced at least 150 m apart in order to avoid sampling related individuals. No individual was found to be a vegetative propagule of another according to an earlier nSSR analysis (Ganopoulos et al. 2011). The sample size of each population ranged from 15 to 20 individuals (mean 18.6); 93 individuals were sampled in total (Table 1). Collected leaves of each individual were stored at −80 °C prior to DNA extraction. Total genomic DNA was isolated using the procedure described by Doyle and Doyle (1987). DNA was quantified with a UV spectrophotometer (Eppendorf Biophotometer, Hamburg, Germany). Samples were then diluted to a working concentration of 20 ng/μL.

Table 1 Geographic locality of Prunus avium natural populations studied

Assessment of S-allele polymorphisms

Fluorescently labelled primers (IRD800 or IRD700) PaConsI-F and PaConsI-R2 (VBC-GENOMICS, Austria) were used in PCR to amplify regions encompassing the first intron of each S-RNase (Sonneveld et al. 2006). PCR products were electrophoresed and visualized on 6.5 % polyacrylamide gels using a LI-COR 4300 DNA analyzer. Results were analyzed using LI-COR SAGAGT software. Amplification products including the second intron of each S-RNase were generated using the primers PaConsII-F and PaConsII-R and scored on agarose gels as described by Sonneveld et al. (2003). The 1 kb DNA ladder (Invitrogen) was loaded at the center of each gel. Twenty-one standard cultivars with known S-alleles were also included in the analysis as references (Ganopoulos et al. 2010).

S-allele frequency distribution and S-locus population genetic analysis

S-allele frequencies were calculated as a percentage of all alleles identified in 93 wild cherry genotypes using the software GenAlEx 6 (Peakall and Smouse 2006). Corresponding calculations were performed on three large data-sets reported previously (Boskovic and Tobutt 2001; De Cuyper et al. 2005; Ganopoulos et al. 2010) for comparative purposes. To obtain population-wide estimates of S-allele diversity for populations in which not all individuals were sampled, the total number of S-alleles was estimated following Paxman (1963) from the relation a = n [1 − (1 − 2/n)^m], where “a” is the observed number of S-alleles, “n” is the estimated total number of S-alleles in the population, and “m” is the number of plants sampled per population (Brennan et al. 2002; Holderegger et al. 2008). Paxman estimates were obtained by iteration and rounded to whole numbers. For comparative purposes, we applied the Paxman (1963) procedure to all SSR loci as well.
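The iterative solution of the Paxman relation can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function names and the choice of bisection as the iteration scheme are assumptions, and the relation is taken as a = n [1 − (1 − 2/n)^m] with a observed alleles and m sampled diploid plants.

```python
def expected_observed_alleles(n, m):
    """Expected number of alleles seen in m diploid plants when the
    population holds n equally frequent S-alleles (Paxman relation)."""
    return n * (1.0 - (1.0 - 2.0 / n) ** m)


def paxman_estimate(a, m, iterations=200):
    """Solve a = n * (1 - (1 - 2/n)**m) for n by bisection.

    a: observed number of S-alleles (at least 2)
    m: number of plants sampled
    Assumption: bisection stands in for the unspecified iteration
    mentioned in the text.
    """
    if a < 2 or m < 1:
        raise ValueError("need a >= 2 observed alleles and m >= 1 plants")
    if a >= 2 * m:
        # Every sampled gene copy is distinct: no finite n fits.
        raise ValueError("no finite estimate when a >= 2m")
    lo = float(a)          # the expectation at n = a never exceeds a
    hi = 2.0 * lo
    for _ in range(60):    # expand the bracket until it contains the root
        if expected_observed_alleles(hi, m) >= a:
            break
        hi *= 2.0
    else:
        raise ValueError("could not bracket a solution")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expected_observed_alleles(mid, m) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Illustrative call: 12 alleles observed in 15 plants.
    print(round(paxman_estimate(12, 15)))
```

Rounding the returned value to the nearest integer mirrors the rounding to whole numbers described in the text.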

To detect the presence of equal allelic frequencies in different populations (isoplethy hypothesis as expected in an infinite panmictic population), we tested the frequencies of S-alleles for equal distribution in each of the five populations studied using a modification by Davies of the chi-squared test by Mantel (1974), as described in Campbell and Lawrence (1981). As proposed by Mantel (1974), this statistic follows approximately a chi-squared distribution with n_a − 1 degrees of freedom, n_a being the number of alleles (Stoeckel et al. 2008).
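A test statistic of this kind can be sketched as below. The formula is an assumption, since the text does not spell it out: it is the form usually attributed to Mantel (1974), obtained by treating each allele count as binomial over r plants with probability 2/n_a, and it does not reproduce the Davies modification. The function name is hypothetical.

```python
def mantel_isoplethy_statistic(allele_counts, n_plants):
    """Mantel-type chi-squared statistic for equal S-allele frequencies.

    allele_counts: number of copies of each S-allele in the sample
    n_plants: number of diploid plants sampled (r), so 2r gene copies
    Returns (statistic, degrees_of_freedom).

    Assumed form: T = (n_a - 1) * sum((C_j - 2r/n_a)^2) / (2r * (1 - 2/n_a)),
    compared against a chi-squared distribution with n_a - 1 df.
    """
    n_a = len(allele_counts)
    if n_a < 3:
        raise ValueError("the statistic needs at least three alleles")
    if sum(allele_counts) != 2 * n_plants:
        raise ValueError("allele counts must sum to 2 * n_plants")
    expected = 2.0 * n_plants / n_a
    sum_sq = sum((c - expected) ** 2 for c in allele_counts)
    statistic = (n_a - 1) * sum_sq / (2.0 * n_plants * (1.0 - 2.0 / n_a))
    return statistic, n_a - 1


if __name__ == "__main__":
    # Hypothetical counts for four alleles in ten plants.
    print(mantel_isoplethy_statistic([10, 5, 3, 2], 10))
```

The p-value would then be read from a chi-squared distribution with the returned degrees of freedom, using a statistics table or library.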

Standard population genetic estimates, i.e., the average number of alleles per locus (A), effective number of alleles (n_e), gene diversity (expected heterozygosity, H_E), observed heterozygosity (H_O), and Shannon's diversity index (I), were calculated using GenAlEx 6 (Peakall and Smouse 2006). In addition to the above parameters, we calculated allelic richness (AR) using the hierarchical rarefaction method available in HP-RARE (Kalinowski 2005). Pairwise F_ST estimates, which characterize the level of population genetic structure, were calculated for the S-locus using the FSTAT version 2.9.3.2 software (Goudet 1995) based on the approach of Weir and Cockerham (1984). We plotted H_E vs. F_ST to illustrate the performance of the S-locus compared to the nSSR loci. Historical gene flow (Nm) among populations was estimated indirectly from the F_ST statistics.
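For orientation, the single-locus diversity indices named above can be computed from allele counts as in the following sketch. It uses the textbook definitions (n_e = 1/Σp², H_E = 1 − Σp², I = −Σ p ln p) without any small-sample correction, so it is an assumption that these match the GenAlEx output exactly; the function name is hypothetical.

```python
import math


def diversity_indices(allele_counts):
    """Return n_e, H_E and Shannon's I for one locus in one population.

    allele_counts: copies of each allele observed in the sample.
    Textbook definitions, no small-sample correction (assumption).
    """
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("need at least one observed allele copy")
    freqs = [c / total for c in allele_counts if c > 0]
    homozygosity = sum(p * p for p in freqs)   # expected homozygosity
    return {
        "n_e": 1.0 / homozygosity,             # effective number of alleles
        "H_E": 1.0 - homozygosity,             # gene diversity
        "I": -sum(p * math.log(p) for p in freqs),  # Shannon index
    }


if __name__ == "__main__":
    # Hypothetical counts for a locus with five alleles.
    print(diversity_indices([12, 9, 8, 6, 5]))
```

Under perfect isoplethy with k alleles, these definitions give n_e = k and H_E = 1 − 1/k, which is why the effective number of alleles is a convenient summary of how even the S-allele frequencies are.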

The hierarchical distribution of genetic variation among and within populations for both markers was also characterized by an S-genotype-based analysis of molecular variance (AMOVA; Excoffier et al. 1992; Michalakis and Excoffier 1996). We conducted the hierarchical AMOVA using the GenAlEx 6 software (Peakall and Smouse 2006) as well. The tests were implemented using estimates of Φ_ST based on distances calculated from allelic data. Tests of significance were performed using 9,999 permutations within the total dataset.

Neutrality of each locus was tested by comparing the observed allele frequencies with the expected distribution of neutral alleles based on Ewens's sampling theory (Ewens 1972) using Watterson's homozygosity test (Watterson 1974). The Ewens–Watterson neutrality test was implemented in the software package Arlequin ver. 3.0 (Excoffier et al. 2005) to test neutrality at the S-RNase and microsatellite loci. The statistics for the test were calculated using the algorithm given by Manly (1985), using 1,000 simulated samples. Garrigan and Hedrick (2003) proposed applying this test to both neutral and putatively selected loci in order to distinguish between demographic events and balancing selection.
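The logic of the homozygosity test can be illustrated with a small Monte Carlo sketch. It is not the Arlequin implementation and does not follow Manly's (1985) algorithm: as an assumption, neutral samples with the observed number of alleles are drawn as the cycle lengths of a uniformly random permutation with exactly k cycles, which has the Ewens distribution conditional on k. All function names are hypothetical.

```python
import random


def cycle_count_table(n_max, k_max):
    """c[n][k]: number of permutations of n items with exactly k cycles."""
    c = [[0] * (k_max + 1) for _ in range(n_max + 1)]
    c[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(1, min(n, k_max) + 1):
            c[n][k] = c[n - 1][k - 1] + (n - 1) * c[n - 1][k]
    return c


def sample_neutral_counts(n_genes, k_alleles, c, rng):
    """Draw allele counts from the neutral distribution given k alleles."""
    founds = [False] * (n_genes + 1)
    k = k_alleles
    # Backward pass: gene i founds a new allele with prob c[i-1][k-1]/c[i][k].
    for i in range(n_genes, 0, -1):
        if rng.randrange(c[i][k]) < c[i - 1][k - 1]:
            founds[i] = True
            k -= 1
    counts = []
    # Forward pass: non-founders copy a uniformly chosen earlier gene.
    for i in range(1, n_genes + 1):
        if founds[i]:
            counts.append(1)
            continue
        r = rng.randrange(i - 1)
        for j, size in enumerate(counts):
            if r < size:
                counts[j] += 1
                break
            r -= size
    return counts


def homozygosity(counts):
    """Watterson's F: sum of squared allele frequencies."""
    n = sum(counts)
    return sum((x / n) ** 2 for x in counts)


def ewens_watterson_test(counts, n_sim=1000, seed=0):
    """Return (observed F, mean simulated F, share of simulated F <= observed)."""
    rng = random.Random(seed)
    n, k = sum(counts), len(counts)
    c = cycle_count_table(n, k)
    f_obs = homozygosity(counts)
    sims = [homozygosity(sample_neutral_counts(n, k, c, rng))
            for _ in range(n_sim)]
    p_low = sum(f <= f_obs for f in sims) / n_sim
    return f_obs, sum(sims) / n_sim, p_low


if __name__ == "__main__":
    # Hypothetical, fairly even allele counts at one locus.
    print(ewens_watterson_test([9, 8, 8, 7, 6, 6, 5, 5]))
```

A small share of simulated values at or below the observed F indicates an allele distribution more even than expected under neutrality, which is the direction expected under balancing selection.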

Population genetics indices were compared to results from a neutral molecular codominant marker analysis (nSSR) of the same populations (Ganopoulos et al. 2011). A total of 11 nSSR loci were employed by Ganopoulos et al. (2011) as part of a larger study. Nevertheless, one of these loci (EMPaS02) is closely linked to the Sc-locus (at 3.3 cM; Cachi and Wünsch 2011). As the S-locus genomic area is characterized by reduced recombination (Kamau et al. 2007), EMPaS02 was removed from the Ganopoulos et al. (2011) nSSR data set and population genetics indices were re-calculated to obtain baseline estimates from truly neutral markers. Therefore, the following data sets were employed in comparative analyses: (a) S-locus, (b) nSSR (11 loci; Ganopoulos et al. 2011), (c) nSSR except EMPaS02 (10 loci).

Results and discussion

The frequency distribution of the S-alleles in wild populations

We detected a total of 16 S-alleles in the five wild cherry populations studied. We used Vaughan et al. (2008) as a background reference for defining and assigning S-alleles. The number of S-alleles per population was moderately high, ranging from 7 to 12 with a mean of 10.20 (s = 2.17) per population (Table 1, Fig. 1). The maximum number of 12 S-alleles was found in populations Katafito and Vorras. Even with a small sample size (n = 15, population Vorras), a large number of S-alleles was revealed (12 S-alleles). The numbers of S-alleles per population changed only slightly when adjusted for incomplete sampling (Paxman 1963), presenting an average of 11.00 (s = 2.91; Table 1). A similar result was also observed when the Paxman (1963) procedure was applied to SSR loci (results not shown). According to NFDS predictions, a high number of S-alleles within populations is expected. Indeed, when the average number of S-alleles per population was compared to the corresponding value of neutral SSR markers (Ganopoulos et al. 2011), the number of alleles at the S-locus was higher and the observed difference was statistically significant according to a t test (t = 2.963, p = 0.031). The same pattern held for the effective number of alleles (n_e = 6.884 at the S-locus, while n_e = 4.253 at 11 SSR loci). Allelic richness (AR), i.e., the number of S-alleles corrected for sample size, ranged from 5.810 (Nimphaea) to 8.360 (Hayntou) with an average of AR = 7.558, compared to AR = 6.849 for SSR markers. Although in a strict sense true allelic richness can only be detected at the DNA sequence level, by and large the above results support Wright’s (1939) prediction of high allelic richness at the S-locus. This outcome is fully congruent with most pertinent studies, which report high allelic richness at the S-locus (Castric and Vekemans 2004; Lawrence 2000).

Fig. 1 Relative occurrence (percent) of different S-alleles in 93 wild cherry (Prunus avium) genotypes (blue) compared to the frequencies found by Ganopoulos et al. (2010) in 21 sweet cherry cultivars (red), by Boskovic and Tobutt (2001) in 67 sweet cherry cultivars (green) and those reported by De Cuyper et al. (2005) in 65 wild cherry accessions (lilac)

Overall allelic frequency ranged from 2.15 % (S28) to 47.3 % (S30). Out of a total of 16 S-alleles, six (37.5 %) were common to all wild cherry populations. Thirteen S-alleles were found in at least two of the five populations used in this study. The most frequent alleles (S7, S17, S29, S30, S31, and S32) were found in all populations. A few S-alleles were private, occurring at low frequencies in the Cholomontas (two S-alleles) and the Katafito (one S-allele) populations (Fig. 2). The occurrence of S-alleles in natural Greek wild cherry genotypes was compared to those found in 21 sweet cherry cultivars of wide cultivation in Greece (Ganopoulos et al. 2010). A total of 10 S-alleles (S6, S7, S10, S14, S17, S19, S28, S29, S30, S32) were present only in the wild populations and two S-alleles (S5, S9) were present only in cultivars, while six S-alleles (S1, S2, S3, S4, S13, S30) were present in both (Fig. 1). Furthermore, results were compared to data from 67 cultivars examined by Boskovic and Tobutt (2001) and from 65 Belgian wild cherry accessions originating from natural populations (De Cuyper et al. 2005; Fig. 1). A total of seven S-alleles were common to the Boskovic and Tobutt (2001) study, while ten alleles were common to the De Cuyper et al. (2005) study. Alleles S1 occurred at high frequency in the Greek wild cherry genotypes, although they were rare in the De Cuyper et al. (2005) survey. Nevertheless, the frequencies of alleles S1 and S2 were similar to the Boskovic and Tobutt (2001) study (Fig. 1). Five S-alleles found in the Greek wild cherry populations (S28, S29, S30, S31, S32) have not been detected in European cultivars and natural accessions by Boskovic and Tobutt (2001) and De Cuyper et al. (2005). However, they were recently found in wild cherry genotypes of western Europe and have been characterized as novel alleles by Vaughan et al. (2008).
Conversely, alleles S5, S9, S12, S16, S18, S20, S21, and S22 were absent in the Greek populations studied; these belong to well-known international sweet cherry cultivars (Boskovic and Tobutt 2001), as well as to eastern sweet cherry genotypes (Ipek et al. 2011) and are present in the Belgian wild cherry population (De Cuyper et al. 2005) in addition to being present in the Greek sweet cherry cultivars (Ganopoulos et al. 2010).

Fig. 2 S-allele frequencies in five Greek natural wild cherry (Prunus avium) populations

Comparative analyses of genetic diversity and gene flow in natural wild cherries and orchard cultivars in Northern Greece have indicated that these groups are very well differentiated and that realized genetic introgression is practically absent (Ganopoulos et al. 2012). Cultivars do not appear to be crossbreeding with wild trees due to significant disjunction in flowering time and a considerable geographic distance between the two groups (Ganopoulos et al. 2012). The genetic base of the cultivar collection is restricted compared to natural populations, probably as a result of prolonged and strong anthropogenic selection (Ganopoulos et al. 2012). Nevertheless, the presence of S-alleles private to cultivars, relative to natural populations, indicates that at least some cultivars are not of local origin.

The tests for equal S-allele distribution within each population confirmed the presence of isoplethy in all populations but one, i.e., in 80 % of the populations studied (Table 2). According to NFDS predictions, equal allelic frequencies at equilibrium are expected at the S-locus in natural populations (Wright 1939). The isoplethy hypothesis was supported in Pyrus pyraster (Holderegger et al. 2008); nevertheless, it was not confirmed in P. avium (Schueler et al. 2006; Stoeckel et al. 2008) or in Prunus lannesiana (Kato and Mukai 2004). Different behavior of populations within a species is not uncommon and has been observed before in P. avium (Stoeckel et al. 2008), while deviations from isoplethy have been explained on theoretical grounds (Campbell and Lawrence 1981; Muirhead 2001). In two reviews of empirical studies, Lawrence (2000) reported no departure from the identical allelic frequencies hypothesis in 16 out of 19 sampled populations from 12 species with GSI, whilst Castric and Vekemans (2004) reported that, of 11 species with GSI, a highly significant rejection of the isoplethy hypothesis was obtained in two cases. Further results of mixed nature have been presented for perennial woody angiosperms in recent studies. Overall, it appears that, although allelic frequency tests have low statistical power because of generally low sample sizes, most species with GSI support the equal allelic frequencies hypothesis, as does this study. The prevalence of isoplethy provides further support for the occurrence of NFDS in the Greek wild cherry populations studied.

Table 2 Genetic diversity at the S-locus and 11 neutral SSR loci in Prunus avium

Genetic diversity and marker neutrality

Besides the large number of S-alleles, high levels of diversity at the S-locus (Table 2) were observed. Gene diversity was H_E = 0.846 at the S-locus (Table 2), higher than the respective value at 11 SSR loci (H_E = 0.740; Ganopoulos et al. 2011). When the EMPaS02 nuclear microsatellite locus, which is closely linked to the S-locus, was removed from the analysis, neutral marker diversity of the remaining ten nSSR loci was even lower (H_E = 0.685, Table 2). These results agree with the generally high S-allele diversity that is a distinct feature of the S-locus in self-incompatible plant species (Lawrence 2000). At the population level, the average heterozygosity was highest in the Hayntou (H_E = 0.878) and lowest in the Nimphaea (H_E = 0.764) populations. High levels of genetic diversity at the S-locus were indicated especially in populations Katafito and Hayntou (Table 2). All pairwise F_ST values (0.0063–0.0342) were highly significant (p < 0.001; Table 3). It can therefore be deduced that the levels of genetic diversity at the P. avium S-locus are in good agreement with theoretical models of S-locus population dynamics.

Table 3 Pairwise F_ST values at the S-locus for the five Prunus avium natural populations

The Ewens–Watterson homozygosity test (Watterson 1974) revealed a strong deviation from neutrality for the S-locus (p < 0.001; Table 2). On the other hand, no departure from neutrality was detected for nSSR loci (0.207 < F(p) < 0.374; Table 2). Similar results were found by Schueler et al. (2006), also in P. avium (in a comparison of the S-locus and nSSR loci), as well as by Garrigan and Hedrick (2003) in a comparison of MHC and SSR loci in sheep. Especially for selectively neutral loci that are in synteny with the S-locus in P. avium, the above results may be regarded as supporting the finding of rapid linkage disequilibrium decay in this species (Arunyawat et al. 2012). The sharp difference between the results of the S-locus neutrality test and the nSSR loci neutrality tests in this analysis provides additional clear evidence of the impact of NFDS in the wild cherry populations studied.

Gene flow and genetic structure

The elevated number of S-alleles in the wild cherry populations potentially indicates a high effective gene flow among populations for the genomic region associated with the GSI system. Indeed, gene flow based on data from the RNase-based gametophytic self-incompatibility alleles was Nm = 6.754. This value is almost six times higher than the corresponding value estimated from neutral marker (nSSR) data for the same populations and individuals (Nm = 1.190; Ganopoulos et al. 2011). Thus, substantial historical gene flow is clearly suggested at the S-locus. Theory predicts a greater rate of effective migration of S-alleles, which are under balancing selection, compared to alleles of neutral molecular markers (Muirhead 2001; Castric and Vekemans 2004). Consequently, our results also agree with NFDS expectations regarding elevated gene flow as estimated by S-allele analysis.
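An indirect gene flow estimate of this kind is commonly derived from F_ST under Wright's island model. The sketch below assumes that approximation, Nm = (1 − F_ST) / (4 F_ST); the text does not state which formula or software setting produced the reported values, so the rounded F_ST values quoted in this paper need not reproduce the published Nm figures exactly. The function name is hypothetical.

```python
def nm_from_fst(fst):
    """Indirect estimate of the number of migrants per generation.

    Assumes Wright's island-model approximation
    Nm = (1 - F_ST) / (4 * F_ST); this choice is an assumption,
    not a formula stated in the source.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("F_ST must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


if __name__ == "__main__":
    # Illustrative values only: lower differentiation implies higher Nm.
    for fst in (0.02, 0.05, 0.10, 0.20):
        print(fst, round(nm_from_fst(fst), 3))
```

Because Nm under this approximation is a decreasing function of F_ST, a locus with lower population differentiation necessarily yields a higher indirect gene flow estimate.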

Another finding that is consistent both with the above results and with theoretical predictions concerns population differentiation at the S-locus. The S-locus F_ST value of 0.035 was strikingly low compared to the high degree of population differentiation observed at neutral loci in this species (F_ST = 0.097 for 11 nSSRs; Ganopoulos et al. 2011). This result can be readily visualized in Fig. 3, where H_E and F_ST are plotted for all loci. When the EMPaS02 nSSR locus, which is closely linked to the S-locus, was removed, the magnitude of the difference was not substantially altered (F_ST = 0.092). Therefore, population differentiation was found to be approximately three times lower in the P. avium S-locus genomic region than in the rest of the genome. This outcome agrees with Stoeckel et al. (2006, 2008), who found the F_ST at the S-locus to be three times lower than F_ST at seven microsatellite loci among three P. avium populations from France. Similarly, wild pear (P. pyraster) populations exhibited a three times lower F_ST at the S-locus (F_ST = 0.011) than at five microsatellite loci (F_ST = 0.039; Holderegger et al. 2008). In addition, Glémin et al. (2005) found that population differentiation at the S-locus was half of that at 11 nSSR loci in self-incompatible Brassica insularis.

Fig. 3 Plot of expected heterozygosity (H_E) and fixation index F_ST for the S-locus and the nSSR loci

Additional support for the above findings was obtained through the use of S-locus AMOVA. AMOVA showed that approximately 95 % of the total variation at the S-locus was partitioned within populations, while differentiation among populations accounted for only approximately 5 % of the total variation (Φ_ST = 0.048; P = 0.001; Table 4). A lower genetic structure at the S-locus when compared with marker loci seems to be a general feature of plant SI systems (Glémin et al. 2005; Brennan et al. 2002; Kamau et al. 2007). In general, our results clearly show a lower genetic structure at the S-locus than at neutral loci in the same populations, in agreement with the relevant literature (Kamau et al. 2007).

Table 4 S-genotype-based analysis of molecular variance (AMOVA) for the five Prunus avium populations

Conclusions

By analyzing S-locus allelic data derived from five natural populations of wild cherry (P. avium) and comparing population genetics parameters between S-locus data and neutral molecular marker data, we detected many of the signatures expected from a genomic region under negative frequency-dependent selection. These are: (a) a high number of alleles and allelic richness at the S-locus, (b) prevalence of isoplethy, (c) high S-allele genetic diversity, (d) a strong departure from neutrality at the S-locus, (e) a greater rate of effective migration of S-alleles compared to alleles of neutral molecular markers, and (f) a lower population genetic structure at the S-locus. These results indicate that negative frequency-dependent selection is a major evolutionary force at the genomic area associated with the S-locus, in accordance with the predictions of Wright (1939).

In practical terms, these results indicate the absence of restrictions concerning mate availability in the relatively small natural wild cherry populations studied. The lack of available data regarding the composition of S-alleles in Greek natural populations is now largely resolved. The determination of S-locus genotypes is paramount in designing crosses in breeding programs and our results are useful for the proper selection of plus trees. Furthermore, they can also contribute to the advance of population genetics and self-incompatibility research in wild cherry.