Introduction

The olive tree (Olea europaea L.) is currently the most widely cultivated temperate fruit crop in the world and is characterised by an extensive legacy of clonally propagated traditional cultivars (Bartolini and Petrucelli 2002; FAO 2008; Rallo 2005). This great genetic diversity is the result of empirical and local selection of exceptional trees since the olive was domesticated about 6,000 years ago in the Middle East (Zohary and Spiegel-Roy 1975). Commercial shipping and human migrations spread the olive westward across the Mediterranean Basin, leading to complex genetic relationships among cultivars (Besnard et al. 2013).

In the last 20 years, important socioeconomic changes in many Mediterranean countries have driven significant technological improvements in olive cultivation. These changes are increasing the risk of genetic erosion of olive germplasm because local traditional cultivars are being replaced by a few cultivars that are suitable for the new mechanically harvested plantations. Therefore, the identification and conservation of traditional olive cultivars are currently high-priority tasks that are needed to ensure the sustainable use of those cultivars in the future (Rallo et al. 2013).

Germplasm banks are facilities that are designed to achieve this goal by providing characterisation and long-term “ex situ” conservation of genetic resources. Clonally propagated fruit crops such as olive are typically conserved in “live collections”, which are suitable selected field plantations where the crop can fulfil its normal biological cycle (van Hintum et al. 2000). Almost 100 collections of olive genetic resources have been established, both in the Mediterranean Basin and in new olive-growing regions of the world, such as North and South America, Australia, China and South Africa (Bartolini et al. 2005). The World Olive Germplasm Bank of Cordoba (WOGBC) was established in 1970 and it is currently one of the largest olive germplasm banks; WOGBC contained 499 accessions from 21 countries at the time that this study was undertaken (Caballero et al. 2006). In 2003, a second world olive germplasm bank was established at the experimental orchard of Tessaout, in Marrakech, Morocco. This bank contains olive cultivars from other collections, such as the WOGBC, as well as local genetic resources (Haouane et al. 2011).

Initially, olive cultivars were named for their outstanding morphological traits or utility of production. Denominations are also frequently based on the locality of origin of the propagating material (Rallo 2005). Consequently, synonymies (different names for the same cultivar) and homonyms (the same name for different cultivars) are extremely frequent among and within olive-growing countries (Barranco et al. 2000a). Additionally, the occurrence of clonal mutations, which may or may not have some phenotypic expression, makes the characterisation of olive cultivars a challenging process that requires experience in both morphological and molecular identification (Cantini et al. 2008; Corrado et al. 2009; Díez et al. 2012).

To ensure an efficient conservation of olive genetic resources, the accessions of the WOGBC have been characterised using both morphological and molecular markers. Initially, morphological markers were used to characterise the accessions by applying the morphological scheme proposed by Barranco and Rallo (1984). Later, a simplified scheme proposed by Barranco et al. (2000a) was adopted as reference by the International Union for the Protection of New Varieties of Plants (1991). This morphological scheme allowed for the identification of 272 different cultivars from Spain (Barranco et al. 2005). Additionally, various molecular markers, including isozymes (Trujillo et al. 1995), RAPD (Belaj et al. 2002, 2003a, b) and SSR markers, have been applied to complete the morphological descriptions. SSR markers became the marker of choice for the identification of the entire collection after studying their performance in the collection (Belaj et al. 2003c) and following successful experiences in other fruit crops, such as grape, sweet cherry or pear (Bowers et al. 1996; Kimura et al. 2002; Wünsch and Hormaza 2002).

The WOGBC has paid special attention to the authentication of olive cultivars. The concept of cultivar authentication has primarily been used in the context of modern food technology to guarantee that the commercial, edible product matches the cultivar specified on the label (Downey and Boussion 1996; Melchiade et al. 2007; Mouly et al. 1997). Similarly, we considered an accession of the WOGBC to be authentic if it matched samples coming from the area of origin of the putative cultivar to which it belongs. Therefore, authentication guarantees that a cultivar distributed worldwide corresponds to the original cultivar growing in its area of origin. The conservation of a reference collection of endocarps has proved to be a useful tool for this purpose. Unfortunately, authentication is a pending task in most olive collections. Nevertheless, authentication should be an essential pre-requisite for exchanging plant material among collections, researchers and the nursery industry to avoid the extended confusion among denominations and true-to-type cultivar names reported in most olive cultivar collections around the world (Bartolini et al. 2005).

This is the first study describing the complete process of characterisation, identification and authentication of a world olive germplasm collection. We used 33 selected SSRs complemented by the use of morphological markers to achieve three main goals: (a) the characterisation, identification and authentication of the olive cultivars in the germplasm collection; (b) the establishment of a consistent, easy to manage and affordable protocol for the management of olive germplasm banks; and (c) the detection of common errors arising from any misnaming and mislabelling that occurs during the processes of sending, receiving or propagating plants.

In summary, this study attempts to ease and encourage the labour of other olive germplasm banks by describing a management pipeline to efficiently preserve olive genetic resources. To support this overarching goal, we also provide the SSR profiles and endocarp morphological descriptions of the cultivars analysed herein.

Materials and methods

Several common terms related to the management of a germplasm bank could result in confusion for readers unfamiliar with this field. For this reason, we have included a glossary with critical terms used in this study (Online resource 1).

Plant material

We studied 824 trees belonging to 499 accessions of cultivated olive (O. europaea L.) from 21 countries of origin (Table 1). These accessions are conserved in live collections in the WOGBC located at the Instituto de Investigacion y Formacion Agraria y Pesquera de Andalucia centre “Alameda del Obispo”, Cordoba (Southern Spain). Each accession was given a unique identifier composed of the letters COR plus six digits corresponding to the chronological arrival of the sample to the WOGBC (e.g. COR000789).

Table 1 The number of olive accessions and trees analysed per country and the number of genotypes and cultivars that were identified and authenticated after the identification process

For 260 accessions, only one tree was available for analysis, while for each of the remaining 239 accessions, from two to eight trees were analysed to confirm their identity (Online resource 2). All the accessions were planted on their own roots except four accessions that were grafted onto the cv. Oblonga.

DNA extraction and amplification

Total genomic DNA was extracted from fresh leaves using the CTAB method described by de la Rosa et al. (2002). DNA quality and quantification were assessed by electrophoresis on 0.8 % (w/v) agarose gels.

A set of 33 SSR markers was selected based on their polymorphism level, PCR amplification reproducibility and easy interpretation among 77 SSR markers developed for the olive (Carriero et al. 2002; Cipriani et al. 2002; de la Rosa et al. 2002; Gil et al. 2006; Sefc et al. 2000). The selection of the most suitable set of SSR markers was carried out by testing their performance (clear amplification, easy interpretation and polymorphism) on a set of 48 different cultivars from the WOGBC. These cultivars were selected according to their variability and representativeness of the original collection (Díez et al. 2012).

Differences of 1 bp between alleles were checked by re-amplification to establish whether a coding error had occurred. Whenever present, replicates of the same accessions were compared. In cases of mismatch, both replicates were analysed again using newly collected plant material. These rare differences were considered correct SSR profiles if the differences were confirmed in the second amplification.

The SSR amplification was performed in a total volume of 20 μl, containing 2 ng of genomic DNA, 1× supplied PCR buffer (Biotools, Spain), 200 μM of each dNTP (Roche), 0.25 units of Taq DNA polymerase (Biotools, Spain) and 0.2 μM of forward (fluorescently labelled) and reverse primers. The PCR reactions were carried out on a thermal cycler (Perkin-Elmer-9600) using the following program: denaturation at 94 °C for 5 min, 35 cycles of 94 °C for 20 s, 50 °C for 30 s and 72 °C for 30 s and a final extension at 72 °C for 7 min. Detection of amplification products was carried out with an automated sequencer ABI 3130 Genetic Analyser (Applied Biosystems/HITACHI) using the internal standard GeneScan 400 HD-Rox. Two cultivars, Arbequina and Frantoio, were used as controls in all runs.

Data analysis

Genetic characterisation by SSR markers

The allele profiles were sized (base pairs) and characterised using Genescan 3.7 (Applied Biosystems). The following parameters were calculated for each SSR locus using the Power Marker V3.23 (Liu and Muse 2005) software package: average number of alleles; number of alleles making up each genotype; number of unique alleles; observed heterozygosity (Ho); expected heterozygosity (He); and polymorphism information content (PIC) (Botstein et al. 1980). Null allele frequency per locus was tested using the program Cervus v 2.0 (Marshall et al. 1998).
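As an illustration of the per-locus parameters listed above, the following minimal sketch computes Ho, He and PIC for a single diploid locus. It assumes the textbook definitions (He as one minus the sum of squared allele frequencies, PIC as in Botstein et al. 1980) and is not the Power Marker implementation, which may apply bias corrections. The function name locus_stats and the allele sizes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Diversity parameters for one SSR locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid genotype.
    Assumption: plain (uncorrected) frequency estimators are used.
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    freqs = [count / len(alleles) for count in counts.values()]

    # Observed heterozygosity: share of genotypes carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    # Expected heterozygosity: 1 - sum(p_i^2).
    he = 1.0 - sum(p * p for p in freqs)
    # PIC: He minus the sum over allele pairs of 2 * p_i^2 * p_j^2.
    pic = he - sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))

    return {"n_alleles": len(counts), "Ho": ho, "He": he, "PIC": pic}


# Hypothetical allele sizes (bp) for four genotypes at one locus.
example = [(100, 102), (100, 100), (102, 104), (100, 104)]
stats = locus_stats(example)
```

For the hypothetical example, three alleles occur at frequencies 0.5, 0.25 and 0.25, so He is 0.625 and Ho is 0.75.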

The genotypes were discriminated by the pair-wise comparison of their SSR profiles using “Excel Microsatellite Toolkit” (Park 2001). To evaluate the genetic relationships among the different genotypes, a matrix containing only the different SSR profiles was built, with amplified alleles scored as present (1) or absent (0). This matrix was used to perform a cluster analysis based on the unweighted pair group method with arithmetic mean (UPGMA) algorithm using Dice’s similarity index (Dice 1945) implemented in the statistical software NTSYS-PC v2.02 (Rohlf 1998).
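The presence/absence scoring and Dice's index described above can be sketched as follows. This is only an illustration under stated assumptions: each SSR profile is represented as a dictionary from locus name to a pair of allele sizes, the helper names are hypothetical, and the allele sizes are invented. It does not reproduce the NTSYS-PC workflow.

```python
def to_presence_set(ssr_profile):
    """Turn {locus: (allele_a, allele_b)} into the set of (locus, allele)
    pairs that would be scored as present (1) in a binary matrix."""
    return {(locus, allele)
            for locus, alleles in ssr_profile.items()
            for allele in alleles}


def dice_similarity(profile_a, profile_b):
    """Dice's index: 2 * shared alleles / (alleles in A + alleles in B)."""
    a, b = to_presence_set(profile_a), to_presence_set(profile_b)
    return 2 * len(a & b) / (len(a) + len(b))


def similarity_matrix(profiles):
    """Pairwise Dice similarities for {genotype_name: ssr_profile}."""
    names = sorted(profiles)
    return {(x, y): dice_similarity(profiles[x], profiles[y])
            for x in names for y in names}


# Hypothetical profiles; locus names are taken from the text, sizes are invented.
genotype_1 = {"ssrOeUA-DCA9": (162, 172), "ssrOeUA-DCA16": (150, 150)}
genotype_2 = {"ssrOeUA-DCA9": (162, 182), "ssrOeUA-DCA16": (150, 150)}
matrix = similarity_matrix({"G1": genotype_1, "G2": genotype_2})
```

A dissimilarity matrix (1 minus Dice) built this way could then be passed to any average-linkage clustering routine, which is what UPGMA refers to.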

The characterisation of the accessions by SSR markers (Online resource 2) was complemented by the use of morphological markers to (a) authenticate the putative cultivars; (b) check whether small allelic differences between pairs of genotypes led to differential phenotypic expression in the selected endocarp traits; (c) confirm cases of synonymy; and (d) detect possible errors during the propagation and establishment of the collection.

Morphological characterisation

The morphological characterisation was independently carried out by three trained observers, using a representative sample of 40–50 endocarps per tree over at least 2 years. We evaluated a minimum of 11 characteristics of the endocarp included in the pomological scheme developed and described by Barranco et al. (2000a, 2005): weight, shape in position A, symmetry in positions A and B, position of maximum transverse diameter in position B, shape of apex in position A, shape of base in position A, roughness of surface, number of grooves on basal end, distribution of the grooves on basal end and presence of mucro (Online resource 3). The morphological profile of each sample was the combination of its level of expression for each one of the 11 endocarp traits that were evaluated. We compared the morphological profiles of the samples by conducting pair-wise comparisons between them. The traits of endocarps are the most discriminating and stable ones, while other characteristics, such as those of the fruit, are more influenced by environmental conditions. Moreover, endocarps may also be conserved for a long time and they are easily exchanged among collections. For these reasons, the description of the endocarp has been frequently used to catalogue olive cultivars (Barranco et al. 2000a, 2005; Fendri et al. 2010; D’Imperio et al. 2011).

Authentication and denomination of the cultivars

The identified cultivars were authenticated by comparison with control samples of endocarps coming from the corresponding countries and areas of origin. Recently, a comparison with control DNA samples has been added to the authentication process to complement the morphological control. These control samples are part of the reference collection, which is progressively being established in the Department of Agronomy at the University of Cordoba, Spain. The accessions involved in cases of synonymy and homonymy, as well as those whose SSR profiles did not match any cultivar present in the WOGBC, were re-named following previously described criteria (Barranco and Rallo 1984; Barranco et al. 2000a, 2005; Caballero et al. 2006) (Online resource 2).

Results

Overall genetic diversity and nested sets of SSRs for identification purposes

A total of 466 alleles were amplified from the entire collection, of which 67 were unique alleles (present in only a single genotype) (Table 2). The number of alleles per SSR ranged from 5 (GAPU82) to 36 (ssrOeUA-DCA10), with an average of 14.12 alleles per locus. Allele frequencies varied between 0.001 and 0.93; it is noteworthy that 254 alleles (53.3 %) showed a frequency below or equal to 0.01. The allelic differences between pairs of different genotypes ranged from 1 to 59 alleles, with an average of 40.74 alleles (Online resource 4).

Table 2 Diversity parameters of the 33 SSR markers used in this study characterising the entire collection of olive germplasm: size range (base pairs), number of alleles (Na), number of unique alleles (Nu), observed (Ho) and expected (He) heterozygosity, null allele frequency (An), polymorphic information content (PIC) and whether or not the SSR was involved in the description of molecular variants (MV)

The Ho varied between 0.973 (GAPU-101) and 0.168 (GAPU-11e17), with an average of 0.65 (Table 2). The He ranged from 0.875 (UDO99-043) to 0.324 (GAPU82), with a mean of 0.69. Twenty-six SSR markers had a PIC value higher than 0.5 (Table 2).

Despite the valuable information given by the set of 33 SSRs in terms of genetic variability, using it for routine identifications could be quite time-consuming and expensive. For this reason, we also defined three nested sets of SSRs for identification purposes that were selected according to their discrimination capacity in the WOGBC (Fig. 1). Five SSRs (UDO99-043, ssrOeUA-DCA9, ssrOeUA-DCA16, ssrOeUA-DCA3 and GAPU101) could be used to distinguish between 79 % of the accessions. Ten SSRs (UDO99-043, ssrOeUA-DCA9, ssrOeUA-DCA16, ssrOeUA-DCA3, GAPU101, ssrOeUA-DCA11, ssrOeUA-DCA4, UDO99-005, sseOeIGP07 and GAPU89) could be applied to discriminate between 93 % of the accessions. A larger set of 17 SSRs (UDO99-043, ssrOeUA-DCA9, ssrOeUA-DCA16, ssrOeUA-DCA3, GAPU101, ssrOeUA-DCA11, ssrOeUA-DCA4, GAPU103, UDO99-005, ssrOeIGP7, GAPU89, ssrOeUA-DCA18, ssrOeUA-DCA8, ssrOeUA-DCA10, GAPU82, UDO99-042 and ssrOeUA-DCA15) was needed to discriminate between 100 % of the accessions from the WOGBC.
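One way to picture how nested marker sets and a discrimination curve such as that in Fig. 1 could be derived is a greedy search that repeatedly adds the marker yielding the most distinct profiles. The sketch below is an assumption for illustration only: the text states that the sets were selected by discrimination capacity but does not specify the procedure, and the discrimination measure used here (distinct profiles divided by number of genotypes) as well as all names and data are hypothetical.

```python
def distinct_profiles(genotypes, markers):
    """Number of distinct multilocus profiles when only `markers` are used."""
    return len({tuple(profile[m] for m in markers)
                for profile in genotypes.values()})


def greedy_marker_order(genotypes, markers):
    """Greedily order markers by added discrimination capacity.

    Returns the chosen markers and, for each step, the fraction of
    genotypes that are told apart (distinct profiles / genotypes).
    """
    chosen, curve = [], []
    remaining = list(markers)
    while remaining:
        best = max(remaining,
                   key=lambda m: distinct_profiles(genotypes, chosen + [m]))
        chosen.append(best)
        remaining.remove(best)
        curve.append(distinct_profiles(genotypes, chosen) / len(genotypes))
        if curve[-1] == 1.0:  # every genotype already distinguished
            break
    return chosen, curve


# Hypothetical genotypes scored at three hypothetical markers A, B and C.
toy = {
    "G1": {"A": (1, 2), "B": (5, 5), "C": (7, 8)},
    "G2": {"A": (1, 2), "B": (5, 6), "C": (7, 8)},
    "G3": {"A": (1, 3), "B": (5, 5), "C": (7, 8)},
    "G4": {"A": (1, 3), "B": (5, 6), "C": (7, 9)},
}
order, curve = greedy_marker_order(toy, ["A", "B", "C"])
```

In the toy data, marker A alone separates the genotypes into two profiles and adding B separates all four, so marker C is never needed.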

Fig. 1

Percentages of genotypes discriminated using an increasing number of SSR markers (one to seventeen SSRs). The red dashed lines indicate the five and ten SSRs sets to appreciate their discrimination capacity

Identification of the WOGBC by SSR markers

The set of 33 SSR markers revealed 411 different genotypes among the 824 trees belonging to the 499 accessions analysed herein (Table 1). The process of identifying each accession is compiled in Online resource 2; each different genotype is coded with an ordinal number (Online resources 2 and 5).

To discriminate between the 411 different genotypes, we first compared the SSR profiles of the trees within accessions and those accessions represented by only one tree. The results for the 499 accessions (Online resource 2) were distributed between three groups as follows: (a) 297 accessions gave rise to unique SSR profiles (not duplicated in any other part of the entire collection); (b) 166 accessions had SSR profiles in common with other accessions, resulting in the identification of 80 different SSR profiles. For instance, we detected the same genotype among accessions coming from different countries, including Picholine Marocaine (COR00101 and COR001479) from Morocco, Mission de San Vicente (COR001133) from Mexico and Mission Nieland (COR000716) from the USA. We considered these last two accessions to be duplicated and a case of synonymy because their denominations were accepted and well known in their respective areas of origin; (c) 36 accessions accounted for more than one genotype, some of which were duplicated within this group. Therefore, this third group gave rise to 34 different SSR profiles after pairwise comparisons (Online resource 2). We found, for example, that the accession Manzanilla Picua (COR000377) from Spain showed two different SSR profiles and that Zaity (COR000788) from Syria had three different profiles. These cases could be due to the accidental mixture of plants and mislabelling during the propagation phase.
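The three-way grouping of accessions described above can be expressed as a small routine. This is a sketch under assumptions: each tree is reduced to a pair of accession identifier and hashable SSR profile, the group labels "unique", "shared" and "mixed" are our shorthand for groups (a), (b) and (c), and the identifiers in the example are hypothetical.

```python
from collections import defaultdict


def classify_accessions(trees):
    """Assign each accession to one of three groups.

    trees: iterable of (accession_id, profile) pairs, one per analysed tree;
    `profile` must be hashable (e.g. a tuple of allele sizes).
    """
    profiles_by_accession = defaultdict(set)
    accessions_by_profile = defaultdict(set)
    for accession, profile in trees:
        profiles_by_accession[accession].add(profile)
        accessions_by_profile[profile].add(accession)

    groups = {}
    for accession, profiles in profiles_by_accession.items():
        if len(profiles) > 1:
            # (c) trees of the same accession carry different genotypes
            groups[accession] = "mixed"
        elif len(accessions_by_profile[next(iter(profiles))]) > 1:
            # (b) the single profile also occurs in another accession
            groups[accession] = "shared"
        else:
            # (a) the single profile occurs nowhere else in the collection
            groups[accession] = "unique"
    return groups


# Hypothetical accessions and profile labels.
toy_trees = [
    ("ACC1", "p1"), ("ACC1", "p1"),
    ("ACC2", "p2"), ("ACC3", "p2"),
    ("ACC4", "p3"), ("ACC4", "p4"),
]
groups = classify_accessions(toy_trees)
```

Accessions flagged as "mixed" are the candidates for the plant mixtures and mislabelling mentioned at the end of the paragraph.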

Complementing SSRs with morphological markers to identify olive cultivars

The analysis of the entire WOGBC by 33 SSR markers was complemented with the evaluation of 11 endocarp traits, allowing the identification of 332 different olive cultivars (Online resources 2 and 3).

The 11 endocarp traits were polymorphic, discriminating 246 different morphological profiles that were coded with an ordinal number (Online resources 1, 2 and 3). As the morphological traits were approximately half as powerful as the use of molecular markers (246 morphological vs. 411 SSRs profiles), we used the evaluation of the endocarp as a complement for the identification of olive cultivars.

We found 61 different genotypes sharing the same endocarp profile as other cultivars in the collection. In those cases, the information given by the SSR markers was taken as the main criterion to consider the genotypes to be different cultivars. We finally applied the same criterion to 25 cultivars without data for endocarp profiles until their endocarps are available in the collection.

We paid special attention to the pairs of different genotypes that presented small allelic differences between them and shared high similarity indexes (0.9). We checked the consistency and reproducibility of these small allelic variations by re-amplification using new plant material. Up to 25 SSRs were involved in the amplification of these allelic differences, independently of their length and repetitive motif (Table 2). If the SSR variability was confirmed, we proceeded to compare the endocarps of the accessions with their closest cultivars. If no morphological differences were observed, we considered the accessions to be molecular variants, representing cases of intracultivar variability of their closest cultivar. Alternatively, if the endocarp morphology was different, we considered them to be different cultivars. In these latter cases, previous information about the morphological characterisation of other organs as well as about their agronomic performance, if available, was also taken into account to make the decision. Therefore, we found 130 genotypes that could be considered as molecular variants of 48 different cultivars because no morphological difference in the endocarp was observed between them (Table 3; Figs. 2 and 3a). By contrast, only four pairs of cultivars (Chemlali-744-Chetoui, Azulejo-Manzanilla Cacereña, Cordovil Castello Branco-Verdial de Badajoz and Zarza-Lechín de Sevilla) showed identical or nearly identical SSR profiles but presented morphological differences (Fig. 3b). We would like to make clear that the cv. Chemlali mentioned above is not the cv. Chemlali de Sfax, one of the most important cultivars in Tunisia. The accession COR000744 arrived at the WOGBC with the generic denomination of Chemlali, a common homonymy in Tunisia; however, after the identification process, it was determined to be almost identical to cv. Chetoui, also from Tunisia. To avoid confusion between this accession and the well-known cv. Chemlali de Sfax, we added a code number to the original accession name, which from now on is Chemlali-744.
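The decision procedure described above for closely related genotypes can be summarised as a small rule. The sketch below is a simplification under assumptions: the similarity value of 0.9 is treated as a lower threshold for "small allelic differences", unconfirmed differences are treated as genotyping errors, and the additional information on other organs and agronomic performance is not modelled. The function name and labels are hypothetical.

```python
def classify_pair(similarity, endocarps_differ, confirmed=True, threshold=0.9):
    """Classify a pair of accessions from SSR similarity and endocarp data.

    similarity: Dice similarity index between the two SSR profiles.
    endocarps_differ: True if the endocarp profiles are morphologically different.
    confirmed: whether the allelic differences were reproduced on
        re-amplification with new plant material.
    threshold: assumed cut-off above which differences count as "small".
    """
    if similarity < threshold:
        # Clearly different SSR profiles: SSRs are the main criterion.
        return "different cultivars"
    if not confirmed:
        # Differences not reproduced: treat the profiles as identical.
        similarity = 1.0
    if endocarps_differ:
        return "different cultivars"
    if similarity == 1.0:
        return "same cultivar"
    return "molecular variant"
```

With the values given in the caption of Fig. 3, a pair with a similarity index of 0.983 and no endocarp differences would be labelled a molecular variant, whereas Zarza and Lechín de Sevilla (SI = 0.991, clear endocarp differences) would be labelled different cultivars.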

Table 3 Information on cultivars with molecular variants including their area of cultivation, the number of molecular variants (no. MV) and Dice similarity index (SI) range
Fig. 2

UPGMA dendrogram based on the Dice similarity index of 48 different cultivars with molecular variants. Numbers in red indicate groups of molecular variants of each cultivar whose name is specified in the rightmost box. The morphological characterisations are provided in Online resource 2

Fig. 3

Contrasting patterns of molecular and morphological differences among accessions. a Endocarps belonging to three synonymous accessions (Mission Nieland, Sigoise and Menara) that exhibit subtle genetic differences (similarity index, 1–0.983) but no morphological differences from their closest cultivar “Picholine Marocaine.” b Endocarps belonging to cultivars “Zarza” and “Lechín de Sevilla,” which exhibited subtle genetic differences (SI = 0.991) and clear morphological differences

Authentication process, synonyms and homonyms

We performed the authentication of 200 cultivars, 172 by comparisons of their endocarps with those of the same cultivar coming from its originating country and 28 by both endocarp and SSR profiles. For example, we authenticated the accessions COR000231 and COR001477, named Arbequina, by comparing their SSR profiles and endocarps with the authentic control sample of the cv. Arbequina coming from Catalonia, Spain, its natural area of origin. By contrast, 28 cultivars did not match their respective authentic control samples (Table 1; Online resource 2). The remaining 104 cultivars (31 %) could not be authenticated because their corresponding authentic reference samples of endocarps from the countries of origin were not available in our collection.

The authentic control samples were also very helpful to determine 37 new and 15 previously described cases of synonyms among cultivars (different names for the same cultivar used in different growing areas). For instance, a new synonymous group was identified that included four names of the cv. Picholine Marocaine, originally from Morocco; this group was formed by the accessions Alameño de Marchena (COR000254) and Cañivano Blanco (COR000052) from Spain, Mission Nieland (COR000716) from the USA, Haouzia (COR000835) and Menara (COR000836) from Morocco and Sigoise (COR000119) from Algeria (Table 4).

Table 4 Cases of synonyms found in the identification process at the WOGBC

When a synonym was described, the cultivar was given the denomination that it holds in its wider and original area of cultivation. For instance, in the previously mentioned case, the name of the cultivar was Picholine Marocaine. Similarly, we previously found that Frantoio and Oblonga were synonymous denominations used for the same cultivar in Italy and the USA, respectively. Nevertheless, Frantoio was chosen as the reference name because Italy is the putative area of origin for this cultivar and it is cultivated in a significantly greater area in Italy than in the USA.

Several possible synonymous cases are pending confirmation because of the lack of authentic control samples from their countries of origin, including the pairs: Abbadi-AbbadiShalal, AbadiAbou Gabra-1033-Bent Al kali, Adramitini-Ayvalik, Ayrouni-Verdial de Huévar, Kokerrmadh Berati-Frantoio, Kusha-Mixani, Pecoso-Pico Limón, Torcio de Huelma-Nevado Rizado and Sevillana-Sevillana de Abla, which could be synonyms of the cultivars Bent al Kali, Ayvalik, Verdial de Huevar, Frantoio, Mixani, Pico Limón, Sevillana de Abla, respectively (Online resource 2).

Additionally, seven new cases of homonyms (the same name used for different cultivars) were discovered. For example, the denomination Toffahi included two different cultivars, cv. Toffahi from Egypt and cv. Toffahi-1,000 from Syria. Similarly, the denomination Trylia included two cultivars, cv. Gemlik from Turkey and cv. Trylia-992 from Syria (Table 5). The accessions that did not match any identified cultivar from the WOGBC kept their original name followed by the significant digits of their accession code at the bank. For example, the accession COR000361, named Chorruo de Castro del Rio, was renamed as Chorruo de Castro del Rio-361 after ensuring that it did not match the cv. Chorruo de Castro del Rio (Online resource 2).
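The renaming convention for unmatched accessions can be illustrated with a short helper. This is a sketch only: it assumes accession codes follow the format given in the plant material description (the letters COR followed by six digits), and the function names are hypothetical.

```python
import re


def significant_digits(accession_code):
    """Return the accession number without leading zeros, e.g. 'COR000361' -> '361'.

    Assumes the code is 'COR' followed by exactly six digits.
    """
    match = re.fullmatch(r"COR(\d{6})", accession_code)
    if match is None:
        raise ValueError(f"unexpected accession code: {accession_code!r}")
    return str(int(match.group(1)))


def rename_unmatched(original_name, accession_code):
    """Append the significant digits of the accession code to the original name."""
    return f"{original_name}-{significant_digits(accession_code)}"


new_name = rename_unmatched("Chorruo de Castro del Rio", "COR000361")
```

Applied to the example in the text, the helper reproduces the denomination Chorruo de Castro del Rio-361.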

Table 5 Cases of homonyms found in the identification process at the WOGBC

Propagation errors and mislabelling

Possible errors (within and among accessions), which might occur at any step during the establishment of the plants in the collection, were identified. For example, the accession Picual (COR000303) did not match the control cv. Picual for either morphological or molecular markers; however, it matched cv. Pico Limon. Similarly, the accession Hamed (COR000722) from Egypt was reliably identified as cv. Manzanilla de Sevilla from Spain and three accessions labelled as Desconocida (Unknown; COR000954, COR001481 and COR001464) were identified as cvs. Verdial de Huevar, Caballo and Uovo di Piccione (Online resource 2).

Discussion

Overall genetic diversity and nested sets of SSRs for identification purposes

The main purpose of this paper was to identify the accessions of the WOGBC and to propose a consistent pipeline to fulfil this goal based on our experience. However, the task of a germplasm bank is not only the identification and conservation of its accessions but also the characterisation of their variability (van Hintum et al. 2000). For this reason, we increased the number of SSRs commonly used for identification purposes (Baldoni et al. 2009; Haouane et al. 2011) not only to improve this task, but also to provide optimised markers to further characterise the WOGBC genetic variability. To do so, we selected a set of 33 SSRs from among 77 polymorphic markers based on their high variability, reproducible patterns, easy interpretation, and polymorphism.

The overall levels of polymorphism, heterozygosity and PIC obtained for the entire collection were consistent with those observed in previous studies, thus corroborating the high variability and heterozygosity values of olive germplasm (Belaj et al. 2003c; Bracci et al. 2009; Cantini et al. 2008; Erre et al. 2010; Koehmstedt et al. 2010; Sarri et al. 2006). The large number of alleles amplified in the entire collection, many of which appeared at low frequencies, also demonstrated this high variability. The international origin of the cultivars, as well as the occurrence and accumulation of point mutations throughout the long history of clonal reproduction of these cultivars, might have strongly contributed to this variability (Charafi et al. 2008; Díez et al. 2011; Soleri et al. 2010).

The heterozygote deficiency detected at some SSR loci, such as GAPU82, GAPU-11e17, ssrOeUA-DCA10, ssrOeUA-DCA15 and UDO99-042, could be related to the presence of null alleles at these markers (Table 2). These markers were still selected for identification purposes because of their polymorphism level, clarity of amplification and ease of interpretation. Nevertheless, these markers should be used with caution in parentage and population structure analyses because of their null allele frequency (Dakin and Avise 2004).

The application of regular identification protocols in large collections is costly and labour intensive. Therefore, one of the goals of this study was to propose the gradual use of three nested sets of SSRs according to their discriminant capacity for an efficient and progressive identification of the olive accessions in a germplasm bank (Fig. 1). The application of five and ten SSRs was enough to discriminate between 79 and 93 % of the genotypes, respectively, and 17 SSRs were able to discriminate between all of them. Thus, when a set of samples coming from the survey of a region is received in the WOGBC, the first task would be to eliminate duplicated samples. This step could be done quickly, easily and, most importantly, economically by applying from five to ten SSRs, depending on the level of similarity of the samples. Afterwards, the different genotypes should be better characterised by applying larger sets of SSRs with the aim of accurately evaluating their genetic diversity.

Most of the SSRs have already been systematically used at the WOGBC to conduct preliminary identification tasks and have been successfully employed in previous studies focused on the characterisation of olive germplasm (Belaj et al. 2007; Bracci et al. 2009; Díez et al. 2011; Erre et al. 2010; Noormohammadi et al. 2007). It is worth mentioning that our 5-SSR set was included in the set of 11 SSRs proposed by Baldoni et al. (2009) for olive identification purposes. However, only five and seven markers of our 10-SSR and 17-SSR sets, respectively, were represented in the set of Baldoni et al. (2009). Reciprocally, the 11 SSRs proposed by Baldoni et al. (2009) were included in our large set of 33 SSR markers, with the exception of ssrOeUA-DCA14. Despite the congruency of the minimum 5-SSR set between both studies, the different selection of markers in the subsequent sets could be due to two main reasons. First, we screened SSRs that were not tested by Baldoni et al. (2009), for example those reported by Gil et al. (2006). Second, the SSRs proposed by Baldoni et al. (2009) were selected using a set of 21 cultivars and tested in a sample of 77 accessions. By contrast, our SSR sets were selected using a group of 48 cultivars and tested on the 499 accessions contained in the WOGBC.

Identification of olive cultivars

The SSR profiles were complemented with the evaluation of the endocarp traits to identify 332 olive cultivars in the WOGBC; similar methods have been employed in other fruit species, such as chestnut, sweet cherry, strawberry or grape (Garcia et al. 2002; Ganopoulos et al. 2011; Martín et al. 2009; Zulini et al. 2005).

The information given by the SSR markers was taken as the main criterion to consider the different genotypes to be different cultivars, taking into account that: (a) the discrimination capacity of the SSR markers was approximately double that exhibited by the endocarp traits (411 vs. 246 profiles); (b) the phenotypic changes derived from genetic differences might be expressed in other organs or features that were not evaluated, such as leaves, tree architecture or fatty acid compounds. This last reason highlights the relevance of the phenotypic characterisation of the cultivars in a germplasm bank.

Coupling between molecular and phenotypic differences is a classic controversial topic in the identification of cultivars (Staub and Meglic 1993). Qualitative morphological traits have been successfully used for cataloguing the olive cultivars of Spain (Barranco et al. 2005). Indeed, endocarp traits were successfully used to overcome the initial confusion caused by cases of synonymy and homonymy for two main reasons: first, they were the most discriminating and stable morphological traits (Barranco et al. 2000a, 2005); and second, endocarps could be indefinitely preserved and exchanged among collections. Although endocarp traits present these positive features, SSR markers have greatly exceeded their discrimination capacity while excluding the effects of environmental variation. Nevertheless, morphological descriptors are still necessary to complement the UPOV descriptions and they have been very helpful in the identification process of the WOGBC. We described four pairs of cultivars (Chemlali-744-Chetoui; Azulejo-Manzanilla Cacereña; Cordovil Castelo Branco-Verdial de Badajoz and Zarza-Lechin de Sevilla) (Fig. 3b) with the same or highly similar SSR profiles that showed morphological and even agronomical differences (Barranco et al. 2005). We hypothesised that these pairs of cultivars might have originated from the same cultivar through somatic point mutations, which might trigger major morphological changes without affecting the amplified SSR regions. However, this is only a hypothesis, and further research would be necessary to disentangle the genetic mechanisms underlying this variation. Nonetheless, we detected cases such as Lechin de Sevilla and Zarza (Fig. 3b), where the morphological differences were so evident that the effect of environmental variation might be just a secondary force. These cases highlighted the usefulness of complementing molecular markers with morphological descriptors to characterise and identify olive cultivars.

Different cultivars and molecular variants

We described 130 genotypes that could be considered molecular variants of 48 different cultivars because they showed no morphological differences in the endocarp and only small genotypic differences (Table 3; Fig. 2). Despite the possibility of genotyping errors, which must always be taken into account, the small genotypic differences could be due to somatic mutations and can be considered intracultivar variation. Taking this phenomenon into account, and in terms of cultivars, the set of five SSRs previously described discriminated 93 % of the cultivars; 10 SSRs were sufficient to distinguish 100 % of the cultivars, albeit without revealing intracultivar variation; and 17 SSRs discriminated all cultivars in the collection, including intracultivar variation. Intracultivar variation has been frequently observed in olive using different molecular markers, including RAPD, AFLP and SSR markers (Banilas et al. 2003; Charafi et al. 2008; Cipriani et al. 2002; Díez et al. 2011; Garcia-Diaz et al. 2003; Khadari et al. 2008). Cases of somatic mutations giving rise to variation within cultivars have often been described in grape, a fruit crop that closely resembles olive in its diffusion history, propagation system and diversity of cultivars (Riaz et al. 2002; This et al. 2006). The frequent detection of somatic mutations in these crops might be due to two facts: (a) traditional cultivars have been continuously clonally propagated and could have accumulated somatic mutations without accompanying phenotypic consequences in crop morphology and agronomic performance (Díez et al. 2011; Riaz et al. 2002); and (b) mutations are more likely to occur in highly variable and neutrally evolving genomic regions such as SSRs. Additionally, highly polymorphic di-nucleotide SSRs are the most widely used for identification purposes in olive and other clonal fruit crops (Baldoni et al. 2009; Haouane et al. 2011; Irish et al. 2010; Laucou et al. 2011; Motilal et al. 2011; Zhang et al. 2009). For these reasons, new SSRs with core repeats from three to six nucleotides long have recently been developed in grape (Cipriani et al. 2008) as well as in olive (González-Plaza et al. 2011). The polymorphism, stability and transferability of these markers are currently being assessed in accessions from the WOGBC.
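The idea that a small subset of SSRs can already separate most or all genotypes can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the accession names, marker names and allele sizes below are invented, and the exhaustive search over marker subsets is only an assumption that is practical for a handful of loci.

```python
from itertools import combinations

# Hypothetical genotype table: accession -> {marker: (allele1, allele2)}.
# All names and allele sizes are invented for illustration only.
profiles = {
    "acc_A": {"ssr1": (120, 124), "ssr2": (200, 200), "ssr3": (150, 156)},
    "acc_B": {"ssr1": (120, 124), "ssr2": (200, 204), "ssr3": (150, 156)},
    "acc_C": {"ssr1": (118, 124), "ssr2": (200, 204), "ssr3": (150, 156)},
    "acc_D": {"ssr1": (118, 124), "ssr2": (200, 204), "ssr3": (150, 152)},
}

def distinct_profiles(profiles, markers):
    """Count the distinct multilocus profiles observed over the given markers."""
    return len({tuple(p[m] for m in markers) for p in profiles.values()})

def smallest_discriminating_set(profiles, markers):
    """Return the first smallest marker subset that separates every accession,
    or None if even the full set leaves some accessions identical."""
    for size in range(1, len(markers) + 1):
        for subset in combinations(markers, size):
            if distinct_profiles(profiles, subset) == len(profiles):
                return subset
    return None

all_markers = ["ssr1", "ssr2", "ssr3"]
for marker in all_markers:
    share = distinct_profiles(profiles, [marker]) / len(profiles)
    print(marker, f"{share:.0%} of accessions resolved as distinct profiles")
print(smallest_discriminating_set(profiles, all_markers))
```

With real data, accessions that remain identical over all markers would be candidate synonyms or clones, so a result of `None` is informative rather than an error. The number of subsets grows exponentially with the number of markers, so a greedy or ranked selection would be needed for larger panels.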

Authentication process, synonyms and homonyms

The authentication process guarantees that a cultivar being distributed worldwide corresponds to the original cultivar growing in its putative area of origin. Therefore, authentication should be an essential requirement before a germplasm bank distributes plant material to other collections, researchers and nursery plant certification agencies.

In this study, 200 accessions were authenticated, 172 based only on their identity with control samples of the endocarp from their original growing area and 28 by the comparison of both endocarp and DNA control samples. DNA profiles are a valuable complement for the authentication of cultivars with subtle morphological differences or when technicians are not well trained in morphological characterisation. The authentication of the entire collection was not possible due to the lack of control samples in the WOGBC (Table 1; Online resource 2). Several factors hinder the development of a complete collection of authentic control endocarp samples: (a) the absence of characterisation studies of olive cultivars in many olive growing countries; (b) the fact that several old local cultivars are disappearing from their autochthonous areas of origin; and (c) the incomplete or erroneous passport data associated with many accessions received in germplasm banks, which make their comparison with control samples impossible (Trujillo et al. 2006).

Authentic control samples were helpful resources to compare and describe cases of synonyms and homonyms. We found 43 new cases of synonyms among the cultivars (Table 4). Olive growing has been traditionally linked to human migrations, which may have blurred the fingerprints of independent domestication events and led to complex relationships among cultivars (Baldoni et al. 2006; Bracci et al. 2009; Díez et al. 2011; Koehmstedt et al. 2010; Sarri et al. 2006; Soleri et al. 2010). Lately, the globalisation process has reinforced this trend by increasing the movement of olive cultivars within and among countries. The synonymy case revealed by this study involving the accessions named Alameño de Marchena (COR000254; Spain), Cañivano Blanco (COR000052; Spain), Haouzia (COR000835; Morocco), Menara (COR000836; Morocco), Mission Nieland (COR000716; USA) and Sigoise (COR000119; Algeria), which all corresponded to the main Moroccan cultivar cv. Picholine Marocaine, is a clear example of this phenomenon. This case is especially remarkable since the cv. Mission was thought to have originated from California (USA) and to be part of the olive production identity of this area (Soleri et al. 2010). According to our results, cv. Mission Nieland and cv. Picholine Marocaine are synonymous; this result is in accord with those obtained in the USA by Koehmstedt et al. (2010) between Zitoun, a well-known synonymous name of Picholine Marocaine (Barranco et al. 2005; Khadari et al. 2008), and Mission Nieland.
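As a rough illustration of how candidate synonyms can be flagged from genotype data, the following Python sketch groups accession names that share an identical multilocus profile. The names and allele sizes are invented, and requiring an exact match is an assumption: it ignores near-identical profiles (molecular variants) and the morphological confirmation described in this study.

```python
from collections import defaultdict

# Hypothetical SSR profiles, one tuple of allele pairs per accession name.
# Names and allele sizes are invented for illustration only.
profiles = {
    "Name_1": ((120, 124), (200, 204)),
    "Name_2": ((120, 124), (200, 204)),
    "Name_3": ((118, 124), (200, 204)),
    "Name_4": ((120, 124), (200, 204)),
}

def candidate_synonyms(profiles):
    """Group accession names that share exactly the same multilocus profile."""
    groups = defaultdict(list)
    for name, profile in profiles.items():
        groups[profile].append(name)
    # Only groups with more than one name are candidate synonymy cases.
    return [sorted(names) for names in groups.values() if len(names) > 1]

print(candidate_synonyms(profiles))
```

Each group returned is only a candidate: comparison with authentic control samples is still needed to decide which name, if any, is the reference denomination.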

The correct identity of each tree is essential to avoid the propagation of mislabelled cultivars, which could have harmful consequences for the nursery industry, research and breeding programs. Despite the actions taken to avoid errors, we found that the trees of 36 accessions (7.2 %), corresponding to 34 different genotypes (8.2 %), did not share the same SSR profile. Similarly, errors due to mislabelling have been reported for French olive cultivars (Khadari et al. 2003), apple (Evans et al. 2011), Cicer (Shan et al. 2005), persimmon (Badenes et al. 2003) and cacao (Motilal and Butler 2003).
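The within-accession check described here can be sketched in Python as follows. The accession codes, tree identifiers and allele sizes are invented, and exact profile identity is assumed as the criterion, so genotyping error or small somatic differences would need separate handling in practice.

```python
# Hypothetical per-tree genotypes: (accession, tree) -> multilocus SSR profile.
# Codes and allele sizes are invented for illustration only.
trees = {
    ("ACC_1", "tree_1"): ((120, 124), (200, 204)),
    ("ACC_1", "tree_2"): ((120, 124), (200, 204)),
    ("ACC_2", "tree_1"): ((118, 124), (200, 204)),
    ("ACC_2", "tree_2"): ((118, 126), (200, 204)),
}

def inconsistent_accessions(trees):
    """Return accessions whose trees do not all share one SSR profile."""
    seen = {}
    for (accession, _tree), profile in trees.items():
        seen.setdefault(accession, set()).add(profile)
    return sorted(acc for acc, found in seen.items() if len(found) > 1)

print(inconsistent_accessions(trees))
```

An accession flagged in this way would then be examined tree by tree to decide whether the discrepancy reflects a planting or labelling error or genuine intracultivar variation.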

WOGBs and international networks

International initiatives are currently being established to identify and authenticate the accessions included in the WOGBs of Cordoba and Marrakech using a common protocol and to gather a complete worldwide collection of authentic control samples.

The molecular characterisation efforts at the WOGBs of Cordoba and Marrakech used 11 SSR markers in common (Haouane et al. 2011). Nine of these markers were slightly more polymorphic and had greater allelic richness in the Marrakech collection than in Cordoba; however, these values were not significantly different (Kruskal–Wallis test; P > 0.05; Online resource 6). This similarity in genetic diversity is remarkable given the distinctive composition and geographical areas represented in each collection. The collections share the names of only 153 accessions (Haouane et al. 2011). The WOGBC includes a large proportion of samples from the northern shore of the Mediterranean Basin along with a large collection of Spanish cultivars, but North Africa is poorly represented. By contrast, the Marrakech collection, with 505 olive genotypes from 24 countries, has a more balanced representation of the Mediterranean olive growing areas, with countries such as Morocco, Algeria and Egypt well represented (Haouane et al. 2011). The distinctive and rich composition of both WOGBs highlights the need for cooperative international projects to characterise the collections and to guarantee the efficient conservation of their genetic resources.
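For readers unfamiliar with the test, a minimal Python sketch of a two-group Kruskal-Wallis comparison is given below. The per-marker values are invented and do not reproduce Online resource 6, and the exact way the test was applied in this study is not specified here. The function computes the tie-corrected H statistic and uses the chi-square approximation with one degree of freedom, which is only approximate for small samples and assumes that not all pooled values are equal.

```python
import math

def kruskal_wallis_two_groups(a, b):
    """Tie-corrected Kruskal-Wallis H for two groups and its chi-square
    (1 degree of freedom) p-value. Assumes not all pooled values are tied."""
    pooled = sorted(a + b)
    n = len(pooled)
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i                             # size of this tie group
        tie_term += t**3 - t
        i = j
    groups = (a, b)
    rank_sums = [sum(rank_of[x] for x in g) for g in groups]
    h = 12 / (n * (n + 1)) * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    h -= 3 * (n + 1)
    h /= 1 - tie_term / (n**3 - n)            # correction for ties
    h = max(h, 0.0)                           # guard against rounding below zero
    p = math.erfc(math.sqrt(h / 2))           # chi-square survival function, 1 df
    return h, p

# Hypothetical number of alleles per shared marker in each collection.
cordoba = [10, 12, 14, 16]
marrakech = [11, 13, 15, 17]
h, p = kruskal_wallis_two_groups(cordoba, marrakech)
print(f"H = {h:.3f}, P = {p:.3f}")
```

In this invented example, one collection has slightly higher values at every marker, yet the test does not reject equality at the 0.05 level, which mirrors the kind of outcome reported above.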

Despite the international efforts, we will never be able to authenticate the cultivars that have already disappeared from the areas where they were once sampled. This real case of genetic erosion should be taken into account when promoting the characterisation and conservation of olive cultivars in those countries where this task has not even begun. The combined application of a minimum of morphological traits and SSRs can be an efficient and affordable tool for the identification of olive cultivars in those countries where germplasm characterisation is still in the early stages. Recently, high-throughput markers have been applied to the characterisation of olive cultivars (Belaj et al. 2012) and other fruit crops, such as grape (Myles et al. 2011). These methods provide powerful tools for both deep genetic characterisation and association mapping to be applied in the near future.

Conclusions

The identification of the accessions of any olive germplasm bank should be compulsory before the distribution of any plant material from that bank. Only the diffusion of true-to-type cultivars will avoid the worldwide confusion between denominations and cultivars that exists in almost every world germplasm collection (Bartolini et al. 2005). This work illustrates how the use of 17 SSRs and 11 endocarp traits has allowed for the identification of different sources of error (propagation, mislabelling, inconsistent denominations). In addition, the method led to the establishment of synonymies, homonymies, intracultivar molecular variability and partial authentication of cultivars housed in one of the world’s largest olive germplasm banks. Our protocol could be useful for managing olive germplasm banks and identifying the true-to-type cultivars to be preserved and exchanged.