Introduction

The olive (Olea europaea L. subsp. europaea), which originated in the Eastern Mediterranean area, has been cultivated throughout the Mediterranean basin since ancient times for its fruit and oil. It is believed to have been domesticated as far back as 3500–3700 BC (Zohary and Hopf 1994), and references to the use and trade of olive oil date back to 2000–3000 BC (Baldoni and Belaj 2010). Today, the olive is one of the most widely cultivated fruit crops in the world, and the Mediterranean region, in particular, produces 95 % of the world’s olives. Italy ranks second after Spain (FAO 2004) in terms of production.

The olive (2n = 2x = 46) is an allogamous, longevous and evergreen species, and most of the cultivars are not usually considered to be self-compatible (Mookerjee et al. 2005). The great longevity of this species, the lack of replacement of traditional and well-adapted cultivars combined with its allogamous nature gave rise to very rich germplasm, which has conserved most of its variability intact (Fiorino and Lombardo 2002; Baldoni and Belaj 2010). Olive cultivars are believed to be varieties of unknown origin presumably selected by growers over the centuries from wild genotypes, perhaps those producing the larger fruits, and are generally vegetatively propagated. Hence, the high genotypic diversity of olive varieties could be explained by human selection in response to local environmental and agronomic conditions (Besnard et al. 2001a; Angiolillo et al. 2006; Bracci et al. 2009). It is also likely that crosses between wild local olive genotypes and introduced cultivars have occurred in many areas, thereby leading to new cultivars (Besnard et al. 2001b). Bartolini et al. (1998) stated that at least 1,200 varieties are cultivated, 5,300 names are known, and in Italy alone, 538 cultivars are recognised.

In Italy, each region has its own local cultivars, and many seedling trees grow spontaneously. The larger olive-producing Italian regions are Apulia, Calabria, Campania and Sicily. The use of the species, both as table and oil cultivars, is well-documented in these regions, with many archaeological and written relics dating back to ancient times, attested also by the presence of monumental trees. This species was brought to Sicily with the Phoenicians in the sixth century BC and from this region moved throughout Southern Italy. Cultivation then began to move north with the Romans, who worked intensely on developing grafting technology (Zohary and Hopf 1994; Besnard et al. 2001b; Rugini et al. 2011). In Sicily, Bottari and Spina (1952) described 29 cultivars and landraces which mostly contributed to olive regional production. In the 1980s, work began in the “Dipartimento DEMETRA” in Palermo and since then it has collected and studied, at the morphological level, 37 Sicilian accessions (Caruso et al. 2007), 25 of which have been characterised at the molecular level by using molecular markers (La Mantia et al. 2005; Caruso et al. 2007; Marchese et al. 2008). On the Island, almost 92 % of olive production is used for olive oil extraction and numerous cultivars produce extremely high quality oil (Caruso et al. 2007). Table olive production is also significant (8 %), based on the Sicilian cultivars, ‘Nocellara Etnea’, ‘Nocellara del Belice’, ‘Ogliarola Messinese’ and ‘Moresca’, as they have large-sized fruits of high commercial value. In the Calabria region, olive cultivation was developed over many centuries and Calabrian germplasm is characterised by a remarkable variety of cultivars. The first morphological characterisation of cultivars was reported by Caruso (1883) followed by many authors (Grippo 1923; Zito 1931; Catanea 1934; Pavirani 1959; Chimenti 1963; D’Amore et al. 1977; Parlati et al. 1995, 1999; Mafrica et al. 1996; Motisi et al. 2001; Lombardo et al. 2004). 
These studies reported that Calabrian olive cultivars, almost exclusively used for oil production, usually have vigorous growth habits and small-sized fruits, and show great variability in agronomical behaviour and adaptability to environmental conditions. Generally, in these papers, the number of cultivars described is small, and the morphological descriptions are very schematic and were conducted using different methodologies, making any comparison between genotypes difficult, especially amongst those exhibiting a high degree of similarity. Molecular screening of Calabrian cultivars, together with those of numerous Southern Italian regions, was performed by Carriero et al. (2002) and Muzzalupo et al. (2009) with SSR markers, most of which have since been discarded because of their many drawbacks, as reported in the literature (reviewed in Baldoni et al. 2009). In Campania, thanks to its orographic and climatic heterogeneity, there is a rich olive genetic heritage. Most of the traditional cultivars are used exclusively for oil production; only a few cultivars, such as ‘Ortice’, ‘Ortolana’ and ‘Caiazzana’, are suitable both as table olives and for oil production. A number of local cultivars are thought to be resistant to cold (e.g. ‘Rotondella’) or drought (e.g. ‘Carpellese’ and ‘Pisciottana’), and a few cultivars are putatively considered to be resistant to black mould, peacock spot and knot diseases (e.g. ‘Pisciottana’) (Pugliano 2000). Campanian olive oils vary considerably and possess typical and distinctive sensory characteristics (Sacchi et al. 1999; Di Vaio et al. 2013). Pugliano (2000) morphologically described 66 olive accessions, whereas the most representative cultivars were characterised at the molecular level by means of AFLP and/or SSR markers by Rao et al. (2009), Muzzalupo et al. (2009) and Corrado et al. (2009).
Considering the historical links and the geographical proximity between these three regions, it is highly likely that their local olive germplasm is genetically related.

It is known that olive cultivar differentiation based on morphological descriptions is not particularly reliable, as it can be influenced by environmental conditions and requires skilled staff (Belaj et al. 2001). Furthermore, the presence of both native and foreign olive cultivars with ambiguous naming, together with the interchange of plant material over the centuries, makes it difficult to ensure cultivar identification and to fully understand the pattern of geographic distribution of olive cultivars (Sarri et al. 2006). Therefore, a wide range of molecular markers has been employed to study genetic diversity in olive cultivars from the whole of the Mediterranean area (reviewed in Bracci et al. 2011). Of the molecular markers available, microsatellites or SSRs (simple sequence repeats) developed in olive (Rallo et al. 2000; Sefc et al. 2000; Carriero et al. 2002; Cipriani et al. 2002; De la Rosa et al. 2002; Marrazzo et al. 2002; Sabino Gil et al. 2006) enable parentage analysis and diversity studies (reviewed in Bracci et al. 2011). In this study, the diversity of major and extensively cultivated olive cultivars from three Southern Italian regions was investigated morphologically, by measuring reliable and stable traits, and at the molecular level, by using most of the currently recommended SSR markers. Our work aimed to investigate relationships in order to ascertain, for the first time, the existence of putative parents and/or siblings by parentage simulation analysis, to test for the putative existence of genetic structure and, finally, to clarify how genetic diversity is partitioned at the micro-scale level for regional germplasm.

Material and methods

Plant material

In this investigation, 68 accessions of O. europaea, representing the diversity of olive germplasm from three Southern Italian regions—Calabria, Campania and Sicily—were collected in Spring from the olive cultivar collections of the ‘Azienda Sperimentale Regionale Improsta’—Eboli (Salerno; Campania), the ‘Azienda Carboj E.S.A.’—Menfi (Agrigento; Sicily) or, as in the case of the Calabrian accessions, from local farms (Fig. 1).

Fig. 1 Map of Calabria, Campania and Sicily and main areas of cultivation of the 68 olive cultivars studied

Morphological characterisation and clustering analysis of the data

The description of 13 quantitative and 17 qualitative morphological traits was recorded (Table 1) for 2 years (2009–2010), from either the collection fields or farms, following indications provided in literature (Bottari and Spina 1952; Barranco et al. 2000; Bartolini et al. 2005; Caruso et al. 2007). Quantitative traits were transformed as ordinal characters using the classes reported by Caruso et al. (2007) to minimise environmental effects. Five variables (leaf dimensions, flesh to pit ratio, and fruit and pit weight) were excluded from the analysis as they were considered to be more influenced by environmental conditions.

Table 1 List of morphological traits analysed in the 68 cultivars from Calabria, Campania and Sicily

Classical cluster analysis was carried out on the remaining 25 morphological variables (Table 1). The distance metric used between objects was “percent”, that is, the percentage of comparisons between values that result in disagreement between two profiles. Clustering was performed with a joining algorithm using average linkage.
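The clustering step described above can be illustrated with a minimal sketch. It assumes that “percent” distance is the fraction of traits whose ordinal classes disagree, and that average linkage is computed as the mean pairwise distance between members of two clusters; the function names and the three example profiles are hypothetical and are not taken from the study or from Systat.

```python
from itertools import combinations


def percent_disagreement(a, b):
    """Fraction of traits whose ordinal classes differ between two profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of traits")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_linkage(profiles):
    """Agglomerative (joining) clustering with average linkage.

    profiles: dict mapping a cultivar name to a sequence of trait classes.
    Returns the list of merges as (cluster_a, cluster_b, distance), where
    each cluster is a tuple of cultivar names.
    """
    def dist(c1, c2):
        # Mean of all pairwise distances between members of the two clusters.
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(percent_disagreement(profiles[i], profiles[j]) for i, j in pairs)
        return total / len(pairs)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Join the two closest clusters at each step.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((c1, c2, dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical profiles with four ordinal traits each (illustration only).
EXAMPLE_PROFILES = {
    "A": [1, 2, 3, 1],
    "B": [1, 2, 3, 2],
    "C": [3, 1, 1, 2],
}

for left, right, d in average_linkage(EXAMPLE_PROFILES):
    print(left, right, round(d, 3))
```

In this toy example the two profiles that disagree on a single trait are joined first, and the third profile is joined at the mean of its distances to both.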

Canonical discriminant analysis (CDA) was also implemented in order to distinguish between groups of cultivars with the same origin. One of the major assumptions of CDA is that each predictor variable is normally distributed. Therefore, only the 15 explanatory variables that were normally distributed were selected (Table 1). A graph was built for the first two canonical functions (Can1 and Can2), illustrating the 80 % confidence ellipses of the mean vectors for each cultivar, in order to visualise multivariate trends for all treatments jointly.

All the statistical analyses based on morphological traits were performed using the Systat statistical program (SYSTAT Software Inc., Chicago, IL).

SSR analysis

Genomic DNA was extracted from fresh leaves according to the method of Doyle and Doyle (1987) with modifications. Its quantity was determined by comparing the fluorescent yield of the samples with λ DNA standards [Gibco BRL, Paisley, Scotland, UK] on 0.8 % (w/v) agarose gel in TAE (40 mM Tris–acetate, 10 mM EDTA, pH 8). The DNA was further quantified by spectrophotometer at 260 and 280 nm and diluted in order to perform SSR analysis.

A total of 12 SSR loci, chosen on the basis of their polymorphism and reproducibility, were used: OeUA-DCA 03, 04, 07, 09, 14, 16, 17 and 18 (Sefc et al. 2000), UDO43 (Cipriani et al. 2002; Marrazzo et al. 2002), GAPU 101 and 103 (Carriero et al. 2002), and EMO90 (De la Rosa et al. 2002) (Table 2). Nine of these microsatellites had been proposed previously as being the most suitable for olive fingerprinting studies (Baldoni et al. 2009). Polymerase chain reactions (PCRs) were performed in a final volume of 8 μL, containing 1X PCR buffer (Buffer 10x, Roche Diagnostic Indianapolis, IN, USA), 0.2 mM of each dNTP (Roche Diagnostic Indianapolis, IN, USA), 0.312 mM of each primer (Roche Diagnostic Indianapolis, IN, USA), 1.5 mM MgCl2 (2.5 mM for the primer pair UDO43), 20 ng genomic DNA and 1 U Taq polymerase (Roche Diagnostic Indianapolis, IN, USA), using a 7300 System Thermal Cycler (Applied Biosystems, USA). Reactions were performed using the PCR cycles described by La Mantia et al. (2005). The forward primer was labelled with one of two fluorescent dyes, namely 6-FAM or HEX (Invitrogen), and amplification products were sized by capillary electrophoresis on an ABI-PRISM 3130 Genetic Analyser (Applied Biosystems). Data were collected and analysed using the GeneMapper® software v3.7 (Applied Biosystems). Peaks were considered to correspond to alleles; genotypes showing a single peak for a particular locus were considered to be homozygous.

Table 2 Summary statistics for 12 microsatellite markers in 66 olive cultivars

Genetic analysis and paternity inference analysis

The number of alleles per locus (Na), the observed (Ho) and expected (He) heterozygosity, the estimated frequency of null alleles (Fnull), the polymorphic information content (PIC) and the deviation from the Hardy–Weinberg equilibrium (HWE), which was inferred by sequential Bonferroni correction, were calculated using CERVUS 3.0 software (Marshall et al. 1998; Kalinowski et al. 2007). Paternity inference analysis was performed using CERVUS 3.0, based on the “likelihood” approach of Thomson (1975; 1976) and Meagher (1986), by selecting settings for both “parents being unknown” and “incomplete parental sampling”. Internal simulations were run (set to 10,000) to determine the significance of LOD scores (the logarithm of the likelihood ratio). Relaxed and strict confidence levels were set to 95 and 99 %, respectively, and the proportion of loci mistyped was set to 0.005.
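As an illustration of the per-locus statistics named above, the following minimal sketch computes Na, Ho, He and PIC from diploid genotypes at one locus. It is not the CERVUS implementation: it assumes the uncorrected expected heterozygosity (one minus the sum of squared allele frequencies), whereas CERVUS may apply a sample-size correction, and it uses the commonly cited PIC formula. The function name and the example genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summary statistics for one SSR locus.

    genotypes: list of (allele1, allele2) tuples, one per cultivar,
    with alleles given as fragment sizes in bp.
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]

    # Observed heterozygosity: share of cultivars carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n

    # Expected heterozygosity (assumption: no small-sample correction).
    he = 1.0 - sum(p * p for p in freqs)

    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {"Na": len(freqs), "Ho": ho, "He": he, "PIC": he - cross}


# Hypothetical genotypes at a single locus (illustration only).
example = [(186, 190), (186, 186), (190, 196), (186, 196)]
print(locus_stats(example))
```

A locus at which Ho is well below He, as reported later for DCA07, is the typical signature that motivates estimating a null allele frequency.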

Cluster and structure analysis based on microsatellites

Microsatellite alleles were scored as present (1) or absent (0), and the genetic distance between the cultivars was analysed with NTSYS-pc, version 2.02k (Rohlf 1993). The similarity matrix was obtained using the simple matching coefficient (Sokal and Michener 1958) based on SIMQUAL (Similarity of Qualitative Data), available in the NTSYS-pc software, and was used to construct a UPGMA dendrogram. Dendrogram robustness was also assessed by bootstrap analysis starting from allele frequency data (distance method “shared allele”; tree method UPGMA), running 1,000 iterations with the program PowerMarker (Liu and Muse 2005). A consensus tree was calculated with the “Consense” module of the PHYLogeny Inference Package (PHYLIP, version 3.69). Mantel tests (Mantel 1967) were also performed with PowerMarker (Liu and Muse 2005) to investigate relationships and fit between genetic/morphological, morphological/geographic and genetic/geographic distances of the cultivars, using simple matching dissimilarity matrices. The p values were also calculated.
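The presence/absence scoring and the simple matching coefficient described above can be sketched as follows. This is an illustration under stated assumptions, not the NTSYS-pc code: the helper names, locus labels and genotypes are hypothetical, and the coefficient is taken as the share of allele positions at which two cultivars agree (both present or both absent).

```python
def binary_profile(genotype, allele_index):
    """Score each (locus, allele) combination as present (1) or absent (0).

    genotype: dict mapping a locus name to a tuple of two allele sizes.
    allele_index: ordered list of (locus, allele) pairs observed in the data set.
    """
    return [1 if allele in genotype[locus] else 0 for locus, allele in allele_index]


def simple_matching(x, y):
    """Simple matching coefficient: proportion of positions with equal scores."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical loci, alleles and genotypes (illustration only).
allele_index = [("L1", 100), ("L1", 102), ("L1", 104), ("L2", 200), ("L2", 204)]
cultivar_a = {"L1": (100, 102), "L2": (200, 200)}
cultivar_b = {"L1": (100, 104), "L2": (200, 204)}

profile_a = binary_profile(cultivar_a, allele_index)
profile_b = binary_profile(cultivar_b, allele_index)
print(profile_a, profile_b, simple_matching(profile_a, profile_b))
```

Because shared absences count as matches, this coefficient tends to give high similarity values when many alleles are rare, which is worth keeping in mind when reading similarity levels from the dendrogram.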

The software package Structure 2.3.1 (Pritchard et al. 2000; Falush et al. 2003, 2007; Evanno et al. 2005; Hubisz et al. 2009) was employed, using the SSR data, in order to infer relationships between the three collections of olive germplasm, obtain the most consistent grouping of the 68 cultivars studied, and identify putative admixed or exchanged cultivars. The ‘admixture’ model was used, specifying one to seven populations (K) and a burn-in length of 10,000 followed by 90,000 runs at each K, with ten replicates for every K. To select the most appropriate number of populations (K), the log likelihood for each K (L(K)) was adopted (Rosenberg et al. 2002). Cultivars with membership probabilities equal to or above 0.80 were considered to belong to the same group.
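The 0.80 membership rule stated above can be expressed as a short sketch. It assumes that each cultivar comes with one membership coefficient per group, as reported by Structure, and that a cultivar below the threshold in every group is treated as putatively admixed; the function name and the example coefficients are hypothetical.

```python
def assign_group(memberships, threshold=0.80):
    """Return the 1-based group index of a cultivar, or None if admixed.

    memberships: list of membership coefficients, one per group (K values).
    threshold: minimum coefficient required to assign the cultivar to a group.
    """
    best = max(range(len(memberships)), key=memberships.__getitem__)
    if memberships[best] >= threshold:
        return best + 1
    return None


# Hypothetical membership coefficients for K = 3 (illustration only).
print(assign_group([0.92, 0.05, 0.03]))  # clearly assigned to group 1
print(assign_group([0.55, 0.40, 0.05]))  # below threshold in every group
```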

Results

Cluster and canonical discriminant analysis based on morphological traits

A good correlation between the 2 years of morphological data was observed. The dendrogram, which was constructed using 25 morphological traits (listed in Table 1) and based on the distance “percent” and the “average linkage method”, showed two main clusters, I and II (Fig. 2). Cluster I included all of the Sicilian cultivars and five Campanian cultivars: ‘Oliva Bianca’, ‘Racioppella’, ‘Cornia’, ‘Ravece’ and ‘Ortice’; cluster II grouped Calabrian and Campanian cultivars. Cluster II contained two quite well-defined sub-clusters, namely sub-cluster A, which included the Campanian cultivars, and sub-cluster B, which incorporated the Calabrian cultivars. Only two Campanian cultivars, ‘Ogliarola Campana’ and ‘Pisciottana’, cultivated in areas bordering on the Calabria region, were also included in sub-cluster B (Fig. 2).

Fig. 2 Cluster analysis, based on the statistical analysis of 25 morphological traits of 68 olive cultivars studied, based on the distance ‘percent’ and the ‘average linkage method’. (1) Sicilian, (2) Calabrian, (3) Campanian

Results of the canonical discriminant analysis based on the region of origin are shown in Fig. 3. The coordinates on the two axes, which describe the complete discriminant space, correspond to the three groups and to the individual values of the 68 retained profiles. The first axis (Can 1) separated the profiles of the Sicilian genotypes sharply from those of the other two regions, which showed a complete overlap. The cultivar profiles from the Calabria region, however, were separated from Campanian cultivars by the second axis, in spite of a slight overlap. Despite this, few cultivars were misclassified. The Sicilian cultivars ‘Bottone di Gallo’ and ‘Nocellara Etnea’ were located amongst the Campanian cultivars, and the Calabrian ‘Cassanese’ and the Campanian ‘Ravece’ were classified as Sicilian cultivars. Two Campanian cultivars, ‘Racioppella’ and ‘Ogliarola Campana’, were grouped with the Calabrian cultivars, whilst the classification of the Calabrian cultivars ‘Minuta Maierato’ and ‘Ottobratica Rotondella’ was uncertain.

Fig. 3 Canonical Discriminant Analysis (CDA) of the 68 olive cultivars studied, according to the region of origin, based on morphological traits. Empty triangle = Sicily; dark circle = Calabria; empty square = Campania

SSR diversity

The 12 SSR primer pairs belonging to the series OeUA-DCA (Sefc et al. 2000), GAPU (Carriero et al. 2002), EMO (De la Rosa et al. 2002) and UDO (Cipriani et al. 2002; Marrazzo et al. 2002) successfully amplified polymorphic and reproducible alleles in all 68 cultivars, allowing a unique profile to be obtained for most of them (97 %) and, in two cases, synonymy to be revealed. A total of 157 alleles were found; the number of alleles per locus varied from five for SSR locus EMO90 to 19 for SSR locus DCA04, with an average of 13.08 (Table 2). For 66 cultivars, excluding from the calculation the two cultivars which could represent cases of synonymy, the mean expected heterozygosity (He) was 0.84 (ranging from 0.73 for EMO90 to 0.91 for DCA09), the mean observed heterozygosity (Ho) was 0.83 (varying from 0.53 for DCA07 to 0.985 for DCA09 and DCA18), and the mean polymorphic information content (PIC) was 0.81 (ranging from 0.678 for EMO90 to 0.895 for DCA09) (Table 2). Only SSR locus DCA07 showed a significantly high estimated null allele frequency (+0.2284). In six cases (DCA03, DCA09, DCA14, DCA16, DCA18 and GAPU101), Ho was higher than He, indicating high genetic variability amongst the cultivars analysed (Table 2).

The probability of identity (PI) was estimated as being between 3.3 × 10⁻² for the SSR locus DCA09 and 2.13 × 10⁻¹ for EMO90. The value of the total probability of identity for the 12 SSR loci analysed, which indicates the probability that two unrelated genotypes chosen at random from all surveyed genotypes have the same profile, was very low (7.23 × 10⁻¹⁴) (Table 2).
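To make the relation between per-locus and total PI concrete, the sketch below uses the widely used formula for unrelated individuals under Hardy–Weinberg proportions and multiplies across loci. Both the formula and the independence of loci are assumptions here, since the text does not state how PI was computed; the function names and allele frequencies are hypothetical.

```python
from itertools import combinations
from math import prod


def probability_of_identity(freqs):
    """Per-locus PI for two unrelated individuals.

    PI = sum(p_i^4) + sum_{i<j} (2 * p_i * p_j)^2, i.e. the sum of squared
    expected genotype frequencies under Hardy-Weinberg proportions.
    """
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def combined_pi(per_locus_values):
    """Total PI across loci, assuming the loci are independent."""
    return prod(per_locus_values)


# Hypothetical allele frequencies (illustration only).
two_alleles = probability_of_identity([0.5, 0.5])
ten_alleles = probability_of_identity([0.1] * 10)
print(two_alleles, ten_alleles, combined_pi([two_alleles, ten_alleles]))
```

The example shows why loci with many evenly distributed alleles contribute most to discrimination: the per-locus value shrinks quickly as the number of alleles grows, and the product over 12 such loci becomes very small.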

Allele frequencies varied from a minimum of 0.007 to a maximum of 0.41 for allele 186 bp at the EMO90 locus. In many cases, the allelic frequency was very low, particularly for loci having a high number of alleles. ‘Rare alleles’ were also found: allele 216 bp of the locus DCA03 was present only in ‘Minuta Maierato’ and ‘Minuta Zungri’, allele 162 bp of DCA04 only in ‘Verdello’, allele 153 bp of DCA07 only in ‘Cassanese’, allele 199 bp of DCA09 only in ‘Carolea’, allele 150 bp of DCA14 only in ‘Cornia’, allele 188 bp of DCA16 only in ‘Salella’, allele 171 bp of DCA17 only in ‘Ciciarello’, and allele 166 bp of GAPU103 only in ‘Aitana’. In addition, allele 166 bp of the DCA07 locus was mostly restricted to the Sicilian germplasm, apart from its presence in two Calabrian cultivars (‘Cassanese’ and ‘Oliva d’ogghiu’), and allele 196 bp of the EMO90 locus was present mostly in the Calabrian germplasm, with the exception of its presence in the Sicilian cultivar ‘Santagatese’ and two Campanian cultivars, ‘Cornia’ and ‘Pisciottana’ (Supplementary Table 3, available from http://dx.doi.org/10.5061/dryad.3dv00).

The non-exclusion probability between two unrelated individuals (NE-I) and two hypothetical full siblings (NE-SI), calculated with CERVUS, ranged from 0.017 (DCA09) to 0.122 (EMO90), and from 0.303 (DCA09) to 0.420 (EMO90), respectively (Table 2). These values depict the probability that the genotypes at a single locus do not differ between two randomly chosen genotypes. This probability is calculated in two ways by CERVUS, either by assuming that the two individuals are unrelated, or by assuming that the two individuals are full sibs. Simulation of parentage analysis was used to assess the power of this set of SSR markers to assign parentage and to find putative parentage assignments. Critical LOD values were 6.22 and 3.7 for the single parent analysis, and 16.27 and 12.33 for parent pair (sexes unknown), for strict (99 %) and relaxed (95 %) confidence levels, respectively. Notably, single parent simulation showed that the following ten pairs or groups of cultivars can be reciprocally parents or siblings based on LOD scores and confidence level values: ‘Minuta Zungri’/‘Minuta Maierato’ (7.24), ‘Chianota’ (identical to ‘Olivo di Mandanici’)/‘Olivo a Rappu’ (7.72), ‘Caiazzana’/‘Femminella’/‘Racioppella’ (7.88), ‘Cerasuola’/‘Nocellara del Belice’ (9.08), ‘Giarraffa’/‘Ravece’ (9.73), ‘Carolea’/‘Tombarella’ (10.7), ‘Minuta’/‘Vaddarica’ (11.2), ‘Cavalieri’/‘Crastu’/‘Ogliarola Messinese’ (11.2), ‘Tonda Campana’/‘Ottobratica Rotondella’/‘Rotondella’ (15) and ‘Ghiastrina’/‘Ottobratica Perciasacchi V.J.’/‘Ottobratica Perciasacchi V.T.’/‘Ottobratica std’ (17).
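The comparison of the reported LOD scores with the single-parent critical values can be sketched as below. The LOD scores and critical values are those given in the text; the function name, the dictionary layout and the use of "greater than or equal to" at the thresholds are assumptions made for illustration and do not reproduce the CERVUS assignment procedure.

```python
# Critical LOD values for the single parent analysis, as reported in the text.
STRICT_LOD = 6.22   # 99 % confidence
RELAXED_LOD = 3.7   # 95 % confidence


def confidence_level(lod):
    """Classify a single-parent LOD score against the critical values."""
    if lod >= STRICT_LOD:
        return "strict (99 %)"
    if lod >= RELAXED_LOD:
        return "relaxed (95 %)"
    return "not significant"


# LOD scores of the ten putative parent/sibling groups, as reported in the text.
REPORTED_LOD = {
    "Minuta Zungri / Minuta Maierato": 7.24,
    "Chianota / Olivo a Rappu": 7.72,
    "Caiazzana / Femminella / Racioppella": 7.88,
    "Cerasuola / Nocellara del Belice": 9.08,
    "Giarraffa / Ravece": 9.73,
    "Carolea / Tombarella": 10.7,
    "Minuta / Vaddarica": 11.2,
    "Cavalieri / Crastu / Ogliarola Messinese": 11.2,
    "Tonda Campana / Ottobratica Rotondella / Rotondella": 15,
    "Ghiastrina / Ottobratica group": 17,
}

for group, lod in REPORTED_LOD.items():
    print(f"{group}: LOD {lod} -> {confidence_level(lod)}")
```

Under these assumptions, every reported group exceeds the strict single-parent threshold.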

Cluster analysis (Fig. 4) showed the grouping of cultivars in two main clusters, one of which (cluster I) contained mainly Calabrian and Campanian cultivars, and the other (cluster II) mostly Sicilian cultivars, confirming the results of CDA (Fig. 3). Only five Sicilian cultivars, ‘Minuta’, ‘Nasitana’, ‘Olivo di Mandanici’, ‘Santagatese’ and ‘Vaddarica’, grown in the north-eastern tip of the island, showed a closer relationship to Calabrian and Campanian cultivars, sharing a level of similarity with them of approximately 0.8. Five Calabrian cultivars, ‘Carolea’, ‘Cassanese’, ‘Oliva d’ogghiu’, ‘Imperiale’ and ‘Tombarella’, and one Campanian cultivar, ‘Ortice’, were located inside a sub-group of cluster II, showing close relationships with the Sicilian cultivars. Two cultivars, ‘Miseo’ and ‘Salella’, clustered together and appeared as the most isolated of cluster I.

Fig. 4 Consensus UPGMA dendrogram showing relationships among the olive germplasm from three Southern Italian regions, (1) Sicily, (2) Calabria and (3) Campania, based on 12 SSR loci. Significant bootstrapping values are reported

Two cases of identity were found: ‘Olivo di Mandanici’ (Sicily)/‘Chianota’ (Calabria) and the Campanian ‘Biancolilla Campana’/‘Carpellese’ were indistinguishable at the molecular level with the markers used. ‘Biancolilla Campana’/‘Carpellese’, widespread in Southern Campania, were grouped with the Calabrian cultivar ‘Ottobratica V.J.’ (common on the South Ionian sea side of the Calabrian region) with a coefficient of genetic similarity equal to approximately 0.99. The cultivar ‘Ottobratica V.J.’ (Calabria) differed from ‘Biancolilla Campana’/‘Carpellese’ by only one allele at the UDO43 locus.

The Calabrian ‘Ottobratica Rotondella’ and the two Campanian cultivars ‘Tonda Campana’ and ‘Rotondella’ grouped together, showing a coefficient of genetic similarity of approximately 0.982. ‘Tonda Campana’ and ‘Ottobratica Rotondella’ differed by only one allele at the DCA07 locus (149 and 147 bp, respectively) (Supplementary Table 3, available from http://dx.doi.org/10.5061/dryad.3dv00). For this locus, ‘Rotondella’ was found to be homozygous for allele 128 bp and differed from the other two cultivars regarding SSR alleles at locus UDO43. The genotypes ‘Ghiastrina’, ‘Ottobratica Perciasacchi V.J.’, ‘Ottobratica Perciasacchi V.T.’ and ‘Ottobratica std’, grown in Central and Southern Calabria, were found to group together. ‘Ghiastrina’, ‘Ottobratica Perciasacchi V.J.’ and ‘Ottobratica Perciasacchi V.T.’ differed by only one allele at the locus DCA04 (194, 186 and 190 bp, respectively) (Supplementary Table 3, available from http://dx.doi.org/10.5061/dryad.3dv00). The cultivar ‘Ottobratica V.J.’ from Calabria showed all SSR profiles as being distinct from the remaining cultivars of the ‘Ottobratica’ group.

Interestingly, the UPGMA clustering based on SSRs reflected all ten putative parent/sibling groups indicated by the parentage simulation performed with CERVUS. Generally, these groups were confirmed by their bootstrap values (Fig. 4).

Clustering based on SSR diversity differed from clustering built on morphological trait variability, with the following exceptions: the four Calabrian accessions (‘Ghiastrina’, ‘Ottobratica Perciasacchi V.J.’, ‘Ottobratica Perciasacchi V.T.’ and ‘Ottobratica std’); ‘Biancolilla Campana’ and ‘Carpellese’ which showed a close morphological distance of about 0.37, and two pairs (‘Cavalieri’/‘Crastu’ and ‘Cerasuola’/‘Nocellara del Belice’) of the ten groups of putative parent/sibling cultivars (Fig. 2). The Sicilian cultivar ‘Olivo di Mandanici’ and the Calabrian cultivar ‘Chianota’, sharing the same SSR profiles, were shown to differ at the morphological level and clustered in two different groups (Figs. 2, 3 and 4).

A Mantel test (Mantel 1967) was conducted to determine the correlation between SSR profiles, morphological data and geographic origins. We found low but significant correlations between morphological data and SSR allelic frequency (r = 0.24; p < 0.001), between origin and SSR allelic frequency (r = 0.22; p < 0.001), and between origin and morphological data (r = 0.50; p < 0.001).
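For readers unfamiliar with the procedure, a Mantel test can be sketched as a permutation test on the correlation between two distance matrices. The version below is a minimal one-sided illustration, not the PowerMarker implementation: it assumes Pearson correlation on the upper triangles and joint row/column permutations of the second matrix; the function names, the number of permutations and the example matrices are hypothetical.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: returns (observed r, permutation p value)."""
    rng = random.Random(seed)
    n = len(d1)
    x = upper_triangle(d1)
    r_obs = pearson(x, upper_triangle(d2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        # Permute rows and columns of d2 together to keep it a valid distance matrix.
        rng.shuffle(order)
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical distance matrices for four objects (illustration only).
d_genetic = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
d_morph = [[0, 2, 4, 6], [2, 0, 2, 4], [4, 2, 0, 2], [6, 4, 2, 0]]
print(mantel(d_genetic, d_morph, permutations=99))
```

With only four objects the number of distinct permutations is small, so the p value in this toy example is coarse; with 68 cultivars the permutation space is large enough for thresholds such as p < 0.001 to be meaningful.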

To assess the genetic structure of the three olive germplasm collections, we used the software Structure 2.3.1. It is well known that when K approaches its true value, L(K) reaches a plateau and has high variance between runs (Rosenberg et al. 2002); in our analysis the plateau was reached at K = 3. In Fig. 5, the three groups were marked as group 1 (black), group 2 (light grey) and group 3 (dark grey). Nineteen of the Sicilian cultivars were found in the black group (which appeared strongly structured); in this group, only two Sicilian cultivars, ‘Bottone di Gallo’ and ‘Santagatese’, had a membership value lower than 0.8. The Sicilian cultivar ‘Olivo di Mandanici’, identical at the SSR level to ‘Chianota’, belonged to the light grey group, which comprised thirteen Calabrian and twelve Campanian cultivars. Five Sicilian cultivars (‘Brandofino’, ‘Minuta’, ‘Nasitana’, ‘Vaddarica’ and ‘Verdello’) belonged to the dark grey group, which also included ten Calabrian cultivars (‘Cassanese’, ‘Carolea’, ‘Imperiale’, ‘Oliva d’ogghiu’, ‘Tombarella’, ‘Sinopolese’, ‘Ciciarello’, ‘Minuta Maierato’, ‘Minuta Zungri’ and ‘Miseo’) and three Campanian cultivars (‘Cornia’, ‘Ortice’ and ‘Salella’). Grouping from the structure analysis reflected the UPGMA clustering exactly. The Calabrian and the Campanian germplasm shared most of their allelic makeup, contrary to the Sicilian cultivars, which tended to separate from them. It is likely that cultivars with membership values lower than 0.8 are crossbred. Overall, the genetic diversity was consistent with the geographic area of origin.

Fig. 5 Genetic structure of 68 olive accessions, considering K = 3. Each vertical bar represents an olive cultivar, named under the respective bar. Colours (black, light grey and dark grey) represent the three groups defined by the K value. Olive cultivars showing more than one colour may have an intermixed genetic makeup resulting from crossing. The vertical axis indicates the membership value

A further canonical discriminant analysis on molecular data and fruit size, as grouping variables, was also performed (data not shown). The analysis was able to discriminate genotypes having large (>6.1 g) or small (<2.0 g) fruit sizes. Moreover, a correlation was found between the presence of some alleles of the loci DCA03, DCA04 and GAPU101 and the fruit size. In particular, alleles 240 and 244 bp of DCA03, allele 152 bp of DCA04 and alleles 185 and 191 bp of GAPU101 occurred in cultivars which had a small fruit size, whilst alleles 250, 252 and 254 bp of DCA03, allele 166 bp of DCA04 and alleles 199, 201 and 207 bp of GAPU101 were present in cultivars which had a very large fruit size.

Discussion

The 68 olive cultivars were analysed morphologically over 2 years, and phenotypically they were considered extremely representative of the olive germplasm variability present in these three Southern Italian regions.

Fifteen leaf, fruit and endocarp traits can be considered the most important for discriminating cultivars at the phenotypic level. Cluster analysis based on morphological traits showed that the grouping of the cultivars reflected their geographical origin. In the light of this finding, it seems that phenotypic traits are moulded by the area of origin and cultivation. Of the three regions, the Sicilian cultivar group was found to be the most compact. Therefore, it is likely that olive cultivars from Sicily were subjected to limited exchange or diffusion from their areas of origin, most probably due to geographical isolation, although historical links exist between these regions. Geographical closeness between Calabria and Campania may have favoured the exchange of plant material within the area and/or inter-crosses. Canonical discriminant analysis confirmed the cluster analysis, indicating an even stronger grouping, which reflected the area of origin and cultivation. In particular, two endocarp characters, pit base shape and pit shape, were found to be the most important on the second canonical function, whereas longitudinal curvature of the blade, leaf apex angle and fruit apex shape displayed more discriminant power on the first canonical function. Most of the Sicilian genotypes, which were separated by the first canonical function, presented epinastic leaves, an acute leaf apex angle and a rounded fruit apex shape.

The selected 12 SSR loci, nine of which were reported to be the most highly resolving SSRs for olive (Baldoni et al. 2009), were shown to be highly polymorphic in this investigation and gave reproducible amplification patterns for all 68 olive cultivars analysed. The average number of alleles per locus (Na) reported in this study was 13, higher than the values obtained by La Mantia et al. (2005), 7.8 using 12 SSRs and a set of 50 olive accessions; by Lopes et al. (2004), 9.6 with 14 SSRs screened in 130 accessions; and by Belaj et al. (2012), 11.35 analysing 23 SSRs in 361 cultivars from 19 different countries. The average expected heterozygosity (He), 0.84, was also higher than in other published works, namely 0.76 (La Mantia et al. 2005), 0.68 (Lopes et al. 2004) and 0.62 (Belaj et al. 2012), indicating remarkably large genetic variation amongst the germplasm studied from these three Southern Italian regions. For all 12 SSR markers used, PIC values of at least 0.67 were recorded, indicating that all loci were highly informative and suitable for individual identification. The value of the total probability of identity (PI) was very low (7.23 × 10⁻¹⁴), demonstrating that the 12 SSR markers used in our study were extremely powerful for genotyping olive cultivars. The presence of “rare alleles” also revealed great genetic variability within the olive germplasm of Southern Italian regions.

The groups identified by morphological analysis did not correspond to the clustering based on molecular analysis, except for the cluster of four Calabrian cultivars (‘Ghiastrina’, ‘Ottobratica Perciasacchi V.J.’, ‘Ottobratica Perciasacchi V.T.’ and ‘Ottobratica std’), which shared from 78 to 84 % of morphological traits, and the pair of cultivars ‘Rotondella’ and ‘Tonda Campana’, which differed by only one allele at the locus DCA07. The four Calabrian cultivars also showed very few differences at the DCA04 locus and only one at the locus DCA09, occurring in ‘Ottobratica std’. These differences appear to be of somatic origin, as they are too few to have arisen through sexual reproduction in a species such as olive, which is predominantly allogamous and has a high degree of heterozygosity at the genomic level (Zohary and Spiegel-Roy 1975). However, these putative cases of synonymy deserve further investigation with additional molecular markers. Close proximity in the morphological analysis was also observed between ‘Biancolilla Campana’ and ‘Carpellese’, which were indistinguishable at the SSR level, but not between ‘Olivo di Mandanici’ (Sicily) and ‘Chianota’ (Calabria), which were also identical at the SSR level. Twelve of the 25 morphological characteristics analysed differed between ‘Olivo di Mandanici’ and ‘Chianota’, whilst ‘Biancolilla Campana’ and ‘Carpellese’ differed in eight of them. For ‘Biancolilla Campana’ and ‘Carpellese’, the differences were minor and concerned the pit shape and pit bundles. For ‘Olivo di Mandanici’ and ‘Chianota’, apart from fruit veraison and fruit shape (length/width), the differences concerned four fruit traits and five pit traits. However, as these cultivars were from extremely close areas of cultivation, it is likely that they are closely related; further analysis is needed to clarify this point.

Notably, the Mantel tests showed significant, albeit weak, correlations between genetic and morphological, morphological and geographic, and genetic and geographic data. As already stated by several authors and confirmed by our results, it is likely that multi-local selection and breeding of olive cultivars occurred in each area of origin (Besnard et al. 2001a; Owen et al. 2005; Dominguez-Garcia et al. 2012).
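For readers unfamiliar with the Mantel test, the sketch below shows the general idea: the correlation between the corresponding entries of two distance matrices is compared with the correlations obtained after randomly permuting the rows and columns of one matrix. It is a simplified, one-sided version using Pearson's r; the number of permutations, the seed and the function names are assumptions, and it is not the implementation used in this study.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after reordering rows and columns."""
    return [matrix[order[i]][order[j]] for i, j in combinations(range(len(order)), 2)]


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value) for two distance matrices."""
    order = list(range(len(dist_a)))
    values_a = upper_triangle(dist_a, order)
    r_observed = pearson(values_a, upper_triangle(dist_b, order))
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(order)  # permute the labels of the second matrix only
        if pearson(values_a, upper_triangle(dist_b, order)) >= r_observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return r_observed, p_value
```

Permuting whole rows and columns, rather than individual entries, preserves the dependence among distances that share a cultivar, which is why a significant result can coexist with a weak correlation coefficient.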

Parentage simulation performed with CERVUS allowed identification, for the first time, of ten statistically significant cases of putative parent/sibling relationships. In most cases, these cultivars originated within the same region, from very close areas of cultivation. Therefore, it is evident that olive diversification arose not simply through the occurrence of somatic mutations but also through sexual reproduction. Concerning the Sicilian cultivar ‘Giarraffa’ and the Campanian ‘Ravece’, which were found to be related, as well as ‘Olivo di Mandanici’ (Sicily), ‘Chianota’ and ‘Olivo a Rappu’ from Calabria, the Campanian ‘Tonda Campana’ and the Calabrian ‘Ottobratica Rotondella’, our analysis confirmed the occurrence of hybridization between cultivars or exchanges of plant material. These results were partially confirmed by the canonical discriminant analysis performed on morphological variables. Regarding ‘Minuta Zungri’ and ‘Minuta Maierato’, we can assume there is a certain degree of relatedness, as they are from the same area of cultivation, the Southern Tyrrhenian coast of Calabria. Local growers have also commented on their morphological similarity, together with the cultivar ‘Ciciarello’. ‘Tombarella’ and ‘Carolea’, both of which have large fruits and originate from the Ionian side of Calabria, are likely to be closely related and possibly derive from landraces or genotypes introduced by the Greeks. For similar reasons, the Campanian cultivars ‘Caiazzana’/‘Femminella’/‘Racioppella’ and the three Sicilian cultivar pairs ‘Cerasuola’/‘Nocellara del Belice’, ‘Minuta’/‘Vaddarica’ and ‘Crastu’/‘Ogliarola Messinese’ may also be related.
As regards ‘Ghiastrina’/‘Ottobratica Perciasacchi V.J.’/‘Ottobratica Perciasacchi V.T.’/‘Ottobratica std’ and ‘Ottobratica Rotondella’/‘Rotondella’/‘Tonda Campana’ (also indicated as putative siblings by CERVUS), we are more inclined to presume that these are bud sports originating from somatic mutation events, and not the result of sexual reproduction. The UPGMA dendrogram showed relationships between putative parent/sibling groups, which were supported by bootstrap values. Although we are aware that variants of the same length may be homoplasic, we are fairly confident that these putative parent/sibling cases are genuine, especially because none of the SSRs used in our study presented the drawback of allelic homoplasy, as demonstrated by Baldoni et al. (2009) through SSR sequencing.
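The logic underlying any parent/offspring hypothesis at codominant loci can be illustrated with a simple Mendelian compatibility check: a true parent and its offspring must share at least one allele at every locus. The sketch below is a deliberately simplified exclusion test and not the likelihood-based procedure implemented in CERVUS; the function names and the allele sizes in the example are hypothetical.

```python
def shares_allele(genotype_a, genotype_b):
    """True if two diploid genotypes have at least one allele in common."""
    return bool(set(genotype_a) & set(genotype_b))


def incompatible_loci(candidate_parent, offspring):
    """Loci (sorted by name) at which the pair shares no allele.

    Both arguments map locus names to diploid genotypes; only loci typed
    in both profiles are compared. An empty list means that the pair
    cannot be excluded as parent and offspring on these markers.
    """
    shared_loci = candidate_parent.keys() & offspring.keys()
    return sorted(
        locus
        for locus in shared_loci
        if not shares_allele(candidate_parent[locus], offspring[locus])
    )


# Hypothetical allele sizes, for illustration only.
parent = {"DCA03": (230, 243), "DCA04": (130, 160)}
child = {"DCA03": (243, 250), "DCA04": (170, 180)}
print(incompatible_loci(parent, child))  # ['DCA04']
```

A strict exclusion rule of this kind is sensitive to genotyping errors and somatic mutations, which is one reason why likelihood-based approaches that tolerate occasional mismatches are generally preferred.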

In Italy, olive cultivars with large fruits are used as table olives, whilst small-fruited cultivars are used for olive oil production. In some previously published molecular analyses, groupings of olive cultivars based on their usage and fruit size were found (Fabbri et al. 1995; Besnard et al. 2001b; Belaj et al. 2001). In our study, we found an association between fruit size and the presence of some alleles of the loci DCA03, DCA04 and GAPU101. However, so far no association between olive fruit size and SSRs or other molecular markers has been reported in any of the published linkage maps (De la Rosa et al. 2003; Wu et al. 2004; Khadari et al. 2010; Dominguez-Garcia et al. 2012).

The structure analysis, with a K value equal to 3, gave us an understanding of genetic relationships at the micro-scale level between and within the three olive germplasm collections. Sicilian cultivars had the highest membership value and a more homogeneous genetic makeup, probably due to geographical isolation. The Calabrian and the Campanian cultivars seemed to have a less distinct and more intermixed genetic structure, possibly because they shared putative ancestors. Six Campanian cultivars presented genetic affinity with the Sicilian germplasm, and six Sicilian cultivars had a membership value lower than 0.1, confirming plant material exchanges between regions. This may explain the non-grouping of ‘Brandofino’, ‘Bottone di Gallo’, ‘Caiazzana’, ‘Ravece’ and ‘Ortice’ with cultivars of the alleged germplasm sources in the cluster analysis based on morphological traits. On the other hand, the presumably different genetic background of ‘Minuta’, ‘Nasitana’, ‘Olivo di Mandanici’, ‘Santagatese’ and ‘Verdello’ was not evident from the cluster analysis and the CDA based on morphological traits.

The lack of complete correspondence between morphological and molecular data indicates that, although morphological characterisation is useful when describing cultivars, it is not sufficient to reflect olive genetic diversity. It is possible that cultivars with different genetic backgrounds tend to assume similar forms under the pressure exerted by human selection and agronomic conditions. DNA marker analysis is therefore fundamental for depicting the level of genetic structure and for discriminating between olive varieties showing similar phenotypes, given that morphological characterisation provides different information, often leading to overestimation or underestimation of the real level of genetic diversity.

Our results will be used to construct an olive database based on morphological, phenological and molecular data of most Southern Italian regions, which will allow comparison and identification of cultivars and the exchange of reliable genetic material among institutes for future research.