Introduction

Fig (Ficus carica L.; 2n = 26), a member of the Moraceae family, is known worldwide for its fruit and for the latex present in all plant parts. Figs are eaten fresh or dried and are rich in phenolic antioxidants, nutrients and fiber (Vinson et al. 2005), which makes them well suited to the human diet. According to Vavilov (1951), the fig originated in Transcaucasia, whereas other authors place its origin in southern Arabia or the eastern Mediterranean basin (Stover 2007). From its center of origin, fig spread to the Mediterranean basin, and in the 16th century to the east (Southeast Asia, China, Japan) as well as the New World. Nowadays, fig is cultivated worldwide, including in Australia and South Africa.

In Greece, fig is cultivated almost throughout the country, with three main production centers, namely Attica (Mesogaia region), Evia (Kimi and Taxiarches regions) and Messinia (Peloponnese prefecture). Fig cultivation in Greece has declined in recent decades: in 1961 fig was cultivated on 250,000 acres, with a production of 169,259 tons and a yield of 700 kg/acre, whereas in 2011 only 38,000 acres were cultivated, with a total production of 9400 tons and a yield of 250 kg/acre (FAO stat 2012, www.fao.org).

Fig is threatened by genetic erosion, mainly due to urbanization, monoculture, and the introduction of plant material from remote regions. Today, because of the increasing interest in fresh and dried figs, fig cultivation is gaining new prospects. Since fig is not subject to intensive breeding (Flaishman et al. 2008), the improvement of fig cultivars should be based on the rich genetic diversity present in established fig populations. The ambiguity in the description of fig cultivars, the lack of official cultivar names and the existence of variants within cultivars hinder proper fig cultivar identification (Perez-Jimenez et al. 2012). Condit (1955) listed more than 700 fig cultivars, and considerable confusion about cultivar identity and relationships still exists.

Plant germplasm characterization, aiming at its conservation, is traditionally carried out using morphological or agronomical traits. These criteria are often variable across years and locations since these phenotypic traits are influenced by the genotype-environment interactions. In contrast, DNA-based data are stable, reliable, and detectable in all tissues regardless of developmental and differentiation stage and are not confounded by environmental, pleiotropic, and epistatic effects (Mondini et al. 2009). Molecular markers such as microsatellites (simple sequence repeats, SSRs), RAPDs, ISSRs, RFLPs and others have been used in fingerprinting and assessing genetic diversity in various fig collections (Papadopoulou et al. 2002; Chatti et al. 2010; Perez-Jimenez et al. 2012; Ganopoulos et al. 2015; Boudchicha et al. 2018; Rodolfi et al. 2018; Ergül et al. 2021). Microsatellites provide adequate resolution of germplasm differences due to their high polymorphism and codominance, are simple, quick, relatively inexpensive, and exhibit high reproducibility among laboratories, and as a result are still used for fingerprinting in plant species (Gupta and Varshney 2000; Mondini et al. 2009).

In the present study, the genetic diversity in fig germplasm was evaluated for an established population in an ex situ collection (60 fig accessions, from four Mediterranean countries and the USA) and an identification key for fig cultivars was developed using SSRs. The work was also aimed at assigning each accession to a group based on genetic diversity, thus evaluating the structure of the fig population.

Materials and methods

Plant material

In the present study the fig population comprised 60 accessions categorized by geographical origin, with 41 accessions from Greece (central-eastern, central-western, northern Greece, as well as Crete, Peloponnesus, Lesvos, Syros) and 19 from other Mediterranean countries (Italy (12), Cyprus (6), Turkey (1), Spain (1); Suppl. Table S1). In particular, 33 of them were selected from different regions of Greece for their special agronomical, morphological, and fruit quality characteristics, according to information obtained from farmers and the authors’ personal field inspections. The name of an accession usually represents a geographical qualifier, or sometimes derives from a morphological or fruit quality characteristic. Plants were vegetatively propagated by cuttings and planted in the experimental orchard of the Agricultural University of Athens (AUA). The remaining 27 fig accessions were acquired as trees, planted in the same experimental orchard and used as reference material. Eight of the 27 accessions were Greek fig cultivars analyzed previously using RAPD markers (Papadopoulou et al. 2002). The remaining 19 accessions were internationally well-known fig varieties from other countries (Suppl. Table S1).

DNA isolation

Plant DNA was isolated from fig leaves of all 60 accessions using the CTAB method (Murray and Thompson 1980). The DNA concentration was estimated spectrophotometrically and its integrity was evaluated by electrophoresis on a 0.8% agarose gel followed by ethidium bromide staining. The suitability of the DNA as a PCR template was checked by PCR using primers for the ITS (Internal Transcribed Spacer) locus, following the methodology described by Roy et al. (2010).

SSR analysis

Eight SSR markers, namely MFC1 to MFC8, developed by Khadari et al. (2001) for fig, were used in this study. DNA amplification reactions were carried out in a total volume of 25 μl containing 0.5 mM of each PCR primer, 200 mM of each deoxynucleotide triphosphate, 1.5 mM of MgCl2, 1 U of Taq DNA polymerase (New England Biolabs, USA) and 50 ng of template DNA (Khadari et al. 2001), using a PTC-200/A100 thermocycler (BioRad, USA). Negative control reactions without template DNA were also performed.

PCR products were resolved using a 12% non-denaturing PAGE in a 20 × 20 gel (Biorad Protean II, USA) at 60 V for 20 min, followed by 180 V for 5 h. Gels were subsequently stained with ethidium bromide and photographed under UV, with photos digitized for further analysis. A DNA ladder (50 bp GeneRuler, Thomas Scientific, USA) was loaded in three wells in each gel, in asymmetric locations, to avoid gel orientation problems when scoring and assist in allele size determination (Hoffman and Amos 2005).

The amplified bands per SSR were scored for each fig accession using GelAnalyzer (2010a) (http://www.GelAnalyzer.com). Only gels/lanes with unambiguous band patterns, after background subtraction using GelAnalyzer, were considered for allele assignment. A band was accepted when the corresponding fluorescence intensity value, from the digitized photos, was > 10 fluorescence units. Bands with a fluorescence intensity < 35% of the main fluorescence intensity value (i.e. stutter bands) were filtered out following previous recommendations (Ewen et al. 2000; UPOV/INF/17/1 2010). The microsatellite alleles were sized using a standard curve generated for each gel from the known fragment sizes of the DNA ladder. Since the PCR products of a fig accession per microsatellite were electrophoresed two to four times (each time in a different gel), the size of an allele was estimated several times, and the mean value was recorded as the allele size. The allele sizes, for each microsatellite and accession, were recorded in an Excel spreadsheet (Microsoft Inc., Redmond, USA), thus producing a data matrix for data storage and further processing. SSR allele size data were binned using Flexibin (Amos et al. 2007), following the methodology described by Ghosh et al. (1997). A final correction of the allele sizes was done by visual gel inspection as previously reported (Pompanon et al. 2005; Hoffman and Amos 2005).
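The band-acceptance rules described above can be illustrated with a short sketch. It assumes that each lane is available as a list of (size in bp, fluorescence intensity) pairs after background subtraction; the function names `filter_bands` and `mean_allele_size`, the data layout and the example values are hypothetical and are not part of GelAnalyzer.

```python
from statistics import mean

MIN_INTENSITY = 10        # a band must exceed 10 fluorescence units
STUTTER_FRACTION = 0.35   # bands below 35% of the main band count as stutter


def filter_bands(bands):
    """Return the (size_bp, intensity) pairs that pass both thresholds."""
    accepted = [(size, inten) for size, inten in bands if inten > MIN_INTENSITY]
    if not accepted:
        return []
    main = max(inten for _, inten in accepted)  # strongest band in the lane
    return [(size, inten) for size, inten in accepted
            if inten >= STUTTER_FRACTION * main]


def mean_allele_size(replicate_sizes):
    """Average the size estimates of one allele across replicate gels."""
    return mean(replicate_sizes)


# Hypothetical lane: two true alleles, one stutter band, one background band.
lane = [(186, 120.0), (184, 30.0), (158, 95.0), (150, 8.0)]
kept = filter_bands(lane)
size_186 = mean_allele_size([185.6, 186.2, 186.2])
```

In this invented lane the 184 bp band falls below 35% of the strongest band and the 150 bp band falls below the 10-unit floor, so only the 186 bp and 158 bp bands are retained.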

Estimation of genotyping error

MicroChecker v.2.2.3 (Van Oosterhout et al. 2004) was used to statistically estimate the percentage of null alleles per SSR (alleles that fail to amplify because of nucleotide changes in the flanking sequences of the SSR), which are the main non-technical contributor to genotyping error.

To estimate the genotyping error, a randomly selected subset comprising 20% of the genotypes of the fig population was reanalyzed (Pompanon et al. 2005) following the same methodology, except that a different PCR thermocycler (Eppendorf Mastercycler Gradient 5341, USA), a different DNA polymerase (Phusion® High-Fidelity, NEB, USA), and a new fig DNA preparation were used. In addition, to strengthen the reliability of the results: (a) independent PCRs, for each of the eight SSRs, were conducted again for approximately 20% of the fig accessions, selected at random, and these accessions were re-genotyped, (b) the products of each PCR were re-electrophoresed two to four times in different gels, and (c) the allele sizes were scored independently by two different persons (Hoffman and Amos 2005).

Data and cluster analyses

Based on the SSR allele size data, genetic variability parameters were computed for all 60 fig accessions originating from Greece and other Mediterranean countries using GenAlEx v.6.5 (Peakall and Smouse 2012). The parameters computed per locus were the number of alleles (Na), the effective number of alleles (Ne), the observed (Ho) and expected (He) heterozygosity, the fixation index (F), and the χ2 test for deviation from Hardy-Weinberg equilibrium (HWE) (Table 1). The private alleles summary (PAS) per country and the number of genotypes for all SSR loci were also computed.
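The per-locus parameters listed above follow from allele frequencies and genotype counts. The sketch below uses the standard frequency-based definitions (Ne = 1/Σp², He = 1 − Σp², F = 1 − Ho/He) and is an illustration only, not the GenAlEx implementation; the function name `locus_stats`, the input layout and the example genotypes are assumptions.

```python
from collections import Counter


def locus_stats(genotypes):
    """Compute Na, Ne, Ho, He and F for one codominant locus.

    `genotypes` is a list of (allele1, allele2) tuples; None marks an
    accession with no amplification product (missing data).
    """
    scored = [g for g in genotypes if g is not None]
    n = len(scored)
    if n == 0:
        raise ValueError("no scored genotypes at this locus")
    counts = Counter(allele for g in scored for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]   # allele frequencies
    sum_p2 = sum(p * p for p in freqs)
    ho = sum(1 for a, b in scored if a != b) / n      # observed heterozygosity
    he = 1 - sum_p2                                   # expected heterozygosity
    f = 1 - ho / he if he > 0 else float("nan")       # fixation index
    return {"Na": len(counts), "Ne": 1 / sum_p2, "Ho": ho, "He": he, "F": f}


# Hypothetical locus: three heterozygotes, one homozygote, one missing value.
stats = locus_stats([(186, 190), (186, 190), (186, 190), (186, 186), None])
```

With an excess of heterozygotes, as in this invented example, Ho exceeds He and F becomes negative.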

Table 1 Genetic parameters per SSR marker for the fig accessions from the Mediterranean basin (Italy, Greece, Turkey, Cyprus, Spain)

To depict the genetic relationships among accessions of the established fig population, DARwin v6 (Perrier and Jacquemoud-Collet 2006) was employed. Missing allelic data were handled using the pairwise allele deletion option, at a threshold of 70%. A DARwin file with the extension “.DIS” stores the dissimilarity lower semi-matrix (without the diagonal) as computed by the software. Dissimilarity re-sampling was done with 10,000 bootstraps, and each semi-matrix was successively recorded at the end of the file. Dissimilarity-based cluster analysis was performed and the dendrogram was constructed following the Weighted Neighbor-Joining (WNJ) method with 10,000 bootstraps. In the dendrogram, the scale defines the edge length. To determine the genetic relationships of fig accessions per country of origin, MEGA11 (Tamura et al. 2021) was employed using Nei’s distance (Nei 1972). Analysis of molecular variance (AMOVA) was carried out in GenAlEx for all SSR loci, based on the groups revealed by the above cluster analysis.
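For reference, Nei’s (1972) standard genetic distance between two groups is D = −ln(Jxy / √(Jx·Jy)), where Jx, Jy and Jxy are sums over loci of the squared and cross-multiplied allele frequencies. The sketch below illustrates that formula only; it does not reproduce the MEGA11 workflow, and the function name `nei_distance`, the input layout and the example frequencies are assumptions.

```python
import math


def nei_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance between two groups.

    Each argument is a list with one dict per locus mapping
    allele -> frequency; both lists must cover the same loci in order.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        alleles = set(fx) | set(fy)
        jx += sum(fx.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(fy.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)
    if jxy == 0.0:
        return math.inf          # no shared alleles at any locus
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Hypothetical single-locus example: one group fixed for allele 186,
# the other carrying alleles 186 and 190 at equal frequency.
d = nei_distance([{186: 1.0}], [{186: 0.5, 190: 0.5}])
```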

Population structure

The population structure was investigated using a non-Bayesian procedure, the Discriminant Analysis of Principal Components (DAPC) (Jombart et al. 2010), implemented in the adegenet package for R (R Development Core Team 2011), in which the variance in the sample is partitioned into a between-group and a within-group component without making assumptions about panmixia. The number of clusters was assessed using the find.clusters function, which runs successive K-means clustering with an increasing number of clusters (k). The optimal number of clusters was selected using the Bayesian Information Criterion (BIC) to identify the best-supported model, and therefore the number and nature of the clusters.

Establishment of an identification key for fig

The polymorphism information content (PIC) value per locus was estimated using PICCalc software (Nagy et al. 2012). To establish an identification (Id) key for fig, the methodology of Tessier et al. (1999) was followed. According to this method two parameters were estimated: the confusion probability Cj and the discriminating power Dj (Dj = 1 − Cj). These parameters were computed from the frequencies of the allelic patterns revealed per SSR locus in the established fig population.
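These indices can be sketched as follows, assuming the usual formulations: PIC = 1 − Σp_i² − ΣΣ2p_i²p_j² over allele frequencies, and Cj = Σ p_i(N·p_i − 1)/(N − 1) over the frequencies p_i of the allelic patterns at locus j among N accessions. The code is an illustration rather than the PICCalc implementation; the function names and example values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content from the allele frequencies of a locus."""
    sum_p2 = sum(p * p for p in allele_freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1 - sum_p2 - cross


def confusion_probability(patterns):
    """Cj: probability that two randomly drawn accessions share a pattern.

    `patterns` holds one allelic-pattern label per accession; with pattern
    count c_i = N * p_i, Cj equals sum(c_i * (c_i - 1)) / (N * (N - 1)).
    """
    n = len(patterns)
    counts = Counter(patterns)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def discriminating_power(patterns):
    """Dj = 1 - Cj."""
    return 1 - confusion_probability(patterns)


# Hypothetical locus: four accessions showing patterns A, A, B and C.
dj = discriminating_power(["A", "A", "B", "C"])
pic_value = pic([0.5, 0.5])
```

In the invented example only one of the six possible pairs (the two accessions with pattern A) is confused, so Cj = 1/6 and Dj = 5/6.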

To determine the sequence and the number of SSRs that are sufficient to establish an identification key for fig: (a) an alphabetic character was given to each SSR allele, with fig accessions exhibiting the same SSR allele pattern obtaining the same alphabetic character designation, (b) SSR loci were sorted by descending order of Dj, (c) fig accessions with the same alphabetic designation per SSR were pooled together in the same group, and (d) the total number of expected and the observed non-differentiated fig pairs of comparisons were determined after the completion of grouping.
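The grouping procedure in steps (a) to (d) can be sketched as below. The sketch assumes that each accession already carries one pattern letter per SSR locus and that the Dj values are known; the names `undifferentiated_pairs` and `key_progress`, the accession labels, the pattern letters and the Dj values are all hypothetical.

```python
from collections import Counter


def undifferentiated_pairs(profiles, loci):
    """Count accession pairs sharing the same pattern at every locus in `loci`."""
    groups = Counter(tuple(p[locus] for locus in loci) for p in profiles.values())
    return sum(c * (c - 1) // 2 for c in groups.values())


def key_progress(profiles, dj):
    """Add loci in descending order of Dj and track the unresolved pairs."""
    order = sorted(dj, key=dj.get, reverse=True)
    n = len(profiles)
    total_pairs = n * (n - 1) // 2            # all pairwise comparisons
    steps = []
    for i in range(1, len(order) + 1):
        steps.append((order[i - 1], undifferentiated_pairs(profiles, order[:i])))
    return total_pairs, steps


# Hypothetical data: four accessions scored at two loci.
profiles = {
    "acc1": {"MFC6": "a", "MFC3": "a"},
    "acc2": {"MFC6": "a", "MFC3": "b"},
    "acc3": {"MFC6": "b", "MFC3": "b"},
    "acc4": {"MFC6": "a", "MFC3": "a"},
}
total, steps = key_progress(profiles, {"MFC3": 0.5, "MFC6": 0.8})
```

For a collection of 60 accessions the same pair count gives 60 × 59 / 2 = 1770 comparisons.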

Results

Genetic diversity among fig genotypes revealed by SSR markers

In the present study, 60 fig accessions were examined using eight microsatellites (SSRs), designated MFC1, MFC2, MFC3, MFC4, MFC5, MFC6, MFC7 and MFC8 (Khadari et al. 2001). For each fig accession showing a banding pattern, one allele (homozygous individual) or two alleles (heterozygous individual) were identified, and their size was estimated. All figs revealed PCR banding pattern for seven out of the eight SSRs (Suppl. Fig. S1). For MFC6, no banding pattern (no PCR products) was observed for 12 out of 60 fig accessions, namely seven from Greece (Acc. No 118, 124, 131, 139, 140, 142, 146), three from Italy (Acc. No 251, 252, 253), one from Cyprus (Acc. No 236) (Suppl. Fig. S1), and one from Turkey (Acc. No 230). PCR reactions for the 12 above-mentioned fig accessions were repeated, confirming the initial observation.

MicroChecker analysis did not reveal statistically significant evidence of stuttering, scoring errors, large allele dropout or null alleles, except in the case of MFC1 (null alleles only). To validate the above analysis, the genotyping procedure was repeated for approximately 20% of the fig accessions, selected at random, and no new alleles appeared for any of the SSRs.

In total, the population of 60 fig accessions under study yielded 58 different genotypes for the eight SSR loci. The eight SSRs produced 30 alleles, with a mean of 3.75 alleles per SSR. The observed heterozygosity Ho (0.557) was higher than the expected heterozygosity He (0.489). The fixation index (F) was negative (−0.152). In particular, six of the eight SSRs (MFC2, MFC4, MFC5, MFC6, MFC7 and MFC8) had negative F values, while the remaining two (MFC1 and MFC3) had positive F values. Based on the χ2 test, three SSRs (MFC5, MFC6, and MFC8) followed HWE, while the remaining five deviated from HWE at a significance level of α = 0.05 (Table 1).

We detected private alleles in Greek fig accessions as follows: for MFC1 the 192 bp allele (in genotype 109); for MFC2 the 158 bp allele (in genotype 145) and the 186 bp allele (in genotypes 101, 112, 121, 123, 128, 132, 155, 157, and 162); for MFC3 the 126 bp allele (in genotypes 118 and 131) and the 154 bp allele (in genotypes 101 and 155); for MFC6 the 284 bp allele (in genotype 127) and the 290 bp allele (in genotype 161). In Cyprus the private allele was a 170 bp MFC1 allele (in genotypes 220, 233, 236, and 249).

Lastly, a comparison was made for the genetic parameters found in the literature where the MFC SSRs were utilized (Suppl. Table S2). The values of the genetic parameters obtained in the present study agree with the previously published work.

Genetic relationship among fig genotypes—population structure of fig germplasm collection

The dendrogram (Fig. 1) shows that the fig population studied can be divided into three large groups, named Cluster I, II, and III, with each cluster subdivided into two subgroups, 1 and 2 (subgroup I-1, subgroup I-2 etc.), resulting in a total of six subgroups. Figs from different countries were found in the same subgroup, except for subgroup III-2, which contains figs only from Greece. The AMOVA for the six subgroups showed that 89% of the total variability within the population is due to genetic differences between the individual fig accessions of each subgroup and only 11% is accounted for by differences between the subgroups. The Fst value of 0.132 indicates a small to negligible difference between the six subgroups. When the country of origin of the fig accessions was considered, based on Nei’s genetic distance, two major groups were formed, one including the figs from Greece, Italy and Spain, and the other the figs from Cyprus and Turkey (Suppl. Fig. S2).

Fig. 1

Dendrogram showing the genetic relationships of 60 fig accessions based on SSR data. The dendrogram was constructed in DARwin v 6.0.15 using the WNJ method (confirmed by 10,000 bootstraps). The scale indicates the edge length.

The genetic structure of the fig cultivars was investigated by non-Bayesian population assignment analysis. For the DAPC analysis, 20 PCA axes and three discriminant functions were retained. The DAPC analysis indicates a partial sub-structuring of the fig accession groups. The Cypriot and Italian accessions were grouped separately, while the remaining accessions clustered together, most likely due to gene flow between these populations (Fig. 2). The results were consistent when the analysis was conducted with clone correction (data not shown).

Fig. 2

Scatterplot of individuals on the two principal components of DAPC. The graph represents the individuals as symbols and the groups as inertia ellipses. Eigenvalues of the analysis are displayed in the inset. The numbers refer to the country of origin: 1 (blue) = Greece; 2 (yellow) = Italy; 3 (dark orange) = Turkey; 4 (grey) = Cyprus; 5 (red) = Spain. (Color figure online)

Allele patterns per SSR locus analysis, in fig accessions, revealed different genotypes, ranging from two genotypes for MFC4, MFC5, and MFC8, to nine genotypes for MFC3, as shown in Table 1. Combining the data for all eight SSR markers indicated the presence of 58 different genotypes for the 60 fig accessions studied (Suppl. Table S2).

Identification key for figs

In this study, an identification (Id) key was generated based on the discriminating power (Dj) of each SSR. The SSR loci were ordered hierarchically according to their Dj values, with MFC6 ranking first as the most discriminative of the eight SSRs. The resulting order was MFC6-MFC3-MFC1-MFC2-MFC7-MFC8-MFC5-MFC4. MFC4 and MFC5, having the smallest Dj, were not included in the Id key as they lacked discriminative power (Table 2). The 60 fig accessions in this study produced 1770 pairwise comparisons. Using the above Id key, 58 out of the 60 fig accessions analyzed were differentiated.

Table 2 Genetic Information indexes

Discussion

Genetic erosion threatens local fig varieties and landraces and intensifies the need to protect them. For this reason, an ex situ fig collection was established in the experimental orchard of the Agricultural University of Athens. It comprised mainly uncharacterized Greek fig accessions, together with a few partly characterized Greek varieties (Papadopoulou et al. 2002) and some well-known foreign cultivars. An ex situ collection in which landraces are preserved without accompanying information on their characteristics and a record of their region of origin would be of limited value. The information describing plant germplasm can be used for identification purposes, for recognizing deficiencies of the collection and for planning future efforts to strategically enrich it with new plant material. Such information includes morphological and agronomical features as well as biochemical and molecular data. In the present study, the genetic characterization of the established fig population was carried out using SSR markers. Such investigations of plant genetic resources are a prerequisite for breeding crops to face new challenges, including climate change. The data obtained were also used to propose an identification key.

In a number of genetic studies using microsatellites, genotyping errors that are due to null alleles (nonamplified alleles), DNA degradation and low DNA concentrations are increasingly recognized as important factors that could render the conclusions doubtful (Hoffman and Amos 2005). Especially null alleles that usually result from changes in flanking region sequence of the SSR could alter the estimation of the genetic parameters of the population under study. In the present study, the frequency of null alleles was statistically estimated per SSR, and it ranged from 0 to 3.53%. This range is considered non-significant since frequencies of 5–8% introduce only a small bias in the genetic parameters investigated (Chapuis and Estoup 2007). Moreover, re-genotyping 20% of the members of the population studied reinforced the reliability of the results.

The dendrogram (Fig. 1) shows that in subgroup I-1 two Italian varieties, namely Dottato (accession number 234) and San Pietro (accession number 213), are genetically close, as previously reported (Papadopoulou et al. 2002). However, these two varieties differ phenotypically in leaf shape and in the size of the cavity within the fruit (Ntanos et al. 2015). In addition, in the same subgroup two figs from Greece, namely Maurosykia (accession number 108) and Zakynthos (accession number 160), appear genetically and morphologically close, even though they were collected in very distant regions of Greece. Minor differences between Maurosykia and Zakynthos need to be further investigated. Finally, in subgroup II-1 the genotypes Vasilika Mellisi (140) from Greece and Rosso Dendro (214) from Italy are similar. However, these two varieties differ phenotypically in leaf size and shape, and the fruit stalk is significantly longer in Rosso Dendro (Ntanos et al. unpublished data).

In the fig population studied in the present work, the parameters that quantify genetic variability were calculated using eight microsatellites. Our results are consistent with published data for fig (Suppl. Table S1) (Khadari et al. 2003; Giraldo et al. 2005, 2008; Saddoud et al. 2007; Achtak et al. 2009; Aradhya et al. 2010; Caliskan et al. 2012; Perez-Jimenez et al. 2012; Ganopoulos et al. 2015; Boudchicha et al. 2018; Rodolfi et al. 2018; Ergül et al. 2021). The differences in genetic parameters among published works may be due to the selection of different SSRs, the genotypes analyzed, and differences in the methodology followed for the examination of the samples. Our finding that the observed heterozygosity is higher than expected, resulting in a negative F value, agrees with previous works (Suppl. Table S2). Negative F values can result from negative assortative mating (Lachance 2016), owing to fig’s entomophilous pollination, or from heterotic selection by man. Klekowski (1988) pointed out that perennial species tend to exhibit high heterozygosity as a mechanism to overcome the harmful effects of residual mutations. Despite the high observed heterozygosity in our fig population, we observed limited genetic grouping, as previously published (Aradhya et al. 2010; Caliskan et al. 2012; Perez-Jimenez et al. 2012).

From the present study it appears that the geographical origin could not be the main criterion for classification, as described also elsewhere (Giraldo et al. 2008; Aradhya et al. 2010; Boudchicha et al. 2018). Only Ikegami et al. (2009), using a population of eight Japanese and 11 foreign imports, suggested that figs from Japan are not genetically related to figs from other countries. Many authors attribute the absence of grouping of figs to the propagation of the fig tree, which favors the exchange of plant material between different geographical areas (Giraldo et al. 2008; Aradhya et al. 2010). The same partial lack of clustering was observed in the DAPC analysis. We opted to use this approach, instead of a Bayesian one, because the latter assumes that markers are not linked and that the population is panmictic. The DAPC analysis is a convenient approach for clonal or partially clonal populations (Jombart et al. 2010), where it aims at maximizing the discriminatory capacity of the between-groups variance.

The development of a reliable tool for variety identification is necessary to ensure the identity of the plant material for registration, for protection of the variety and for management of the propagating material. In the present study, we proposed an identification key using six microsatellites that could distinguish 58 of the 60 fig genotypes in the ex situ collection (98% resolution). A study of morphological characteristics showed that fig accessions that did not differ by molecular analysis could differ in the color of the fruit peel and of the fruit flesh (Ntanos et al. 2015). This suggests that a combined approach involving molecular and morphological analyses may be necessary to increase the resolution power.