Introduction

The olive (Olea europaea L. subsp. europaea) is one of the oldest and economically most important fruit species in the Mediterranean Basin countries. The olive became highly valued in these countries because of its many uses, involving both the wood and the fruit. The wood may be used for furniture and firewood, and the fruit is consumed as table olives or pressed for oil, which is used for cooking, lighting, and in practical medicine and cosmetics (Owen et al. 2005). Furthermore, in recent years, olive cultivation has been steadily expanding to more geographical zones in response to increased consumer demand for olive oil, owing to its high nutritional value and recognised health benefits. The commercial cultivation of olive has been introduced and successfully managed in countries with subtropical and warm temperate climates, such as Australia, South Africa, USA, China, Argentina and Brazil (do Val et al. 2012; Spennemann and Allen 2000).

Most olive accessions are self-incompatible or partially self-compatible, needing to be fertilised by compatible pollinisers to ensure a commercial yield (Lavee 1986). Ease of propagation by seedlings along with cultivar intercompatibility resulted in a high degree of diversity with a large number of native accessions in areas where the crop had been established. For this reason, identification of genetic diversity and differences in olive accessions is essential for conservation of genetic resources and breeding (Owen et al. 2005).

Germplasm characterization and knowledge of genetic diversity is a key first step for the proper management of genetic resources and in starting the pre-breeding process. Although more than 2,600 distinct olive accessions have been described worldwide (Belaj et al. 2004; Cipriani et al. 2002; Fendri et al. 2014; Muzzalupo et al. 2014; Owen et al. 2005; Rallo et al. 2005; Sarri et al. 2006; Trujillo et al. 2014), analyses at the micro-scale (regional) level are still lacking (Baldoni et al. 2006). Thus, there is an increased interest in studying and collecting old autochthonous olive accessions that could provide a better knowledge of the history of the crop and be used in the breeding programmes to select genotypes that could be better adapted to certain environmental conditions.

Spain is the world's leading producer of olive oil. The available data for 2010 indicate that olive groves account for close to 5 million hectares in the EU, concentrated in Spain (50 %), Italy (26 %) and Greece (22 %). These three countries account for about 97 % of EU olive oil production, with Spain producing approximately 62 % of this amount. In terms of oil quality, in 2010, Spain produced 35 % extra virgin oil and 32 % virgin oil (FAOSTAT 2013). Olive oil consumption has increased by about 70 % over the past 20 years, primarily because of growing evidence of its nutritional properties and its positive effects on health. Additionally, olive oil consumption remains geographically concentrated in the producing areas, with Italy (30 %), Spain (22 %) and Greece (10 %) being the countries with the highest consumption. In countries such as USA, France, UK, Germany, Australia, Japan and Canada, the increase in consumption has exceeded 100 % in the last 10 years.

In Spain, each province has its own local cultivars, with many olive trees growing spontaneously. Although the largest Spanish producer is Andalusia (Southern Spain) with over 70 % of the whole production, other regions such as Catalonia (Northeastern Spain), Aragon (Northern Spain) or Extremadura (South-western Spain) are also important growing areas. The utilisation of both table and oil accessions is well documented in these regions. Several prospections have also been carried out during the last decades in Valencia, Extremadura, Catalonia and Andalusia (Barranco et al. 2005; Tous and Romero 1993).

Despite the high diversity that is supposed to exist in Aragon, the local cultivars only represent a very small area in olive groves, which might cause plant genetic erosion in the future, resulting in a limited gene pool and, consequently, inbreeding depression in cross-pollination, yield reduction and population bottlenecks in the ensemble of the olive population. Different accessions under the denominations of ‘Empeltre’ and ‘Arbequina’ represent more than 80 % (50 and 32 %, respectively) of the Aragon orchards. Other important accessions, such as ‘Bolvina de Belchite’, ‘Caspolina’, ‘Manzanilla’, ‘Negral de Sabiñán’, ‘Picual’, ‘Royal’ and ‘Verdeña’, represent less than 15 %. Despite this diversity and the risk of its loss, so far only one prospection by a private nursery in 2006 has been undertaken in Aragon. Some of the material collected was previously morphologically characterised (Espada and Gracia 2009; Viñuales 2007). However, olive cultivar differentiation based on morphological descriptions is not particularly reliable, as it can be influenced by environmental conditions and requires skilled staff (Belaj et al. 2001), needing to be complemented by molecular analysis.

Genetic diversity studies in olive have applied different molecular marker systems, which have proved to be valuable tools for identifying and characterising olive genotypes, as well as for examining structure, differentiating among olive accessions, characterising germplasm collections, and exploring cultivar identity (Besnard et al. 2001; Lopes et al. 2004; Sanz-Cortés et al. 2001; Trujillo et al. 1995). Among all markers developed in olive, microsatellites or simple sequence repeats (SSRs) are, to date, the most widely used for parentage analysis, DNA fingerprinting of accessions, and diversity and even taxonomic studies (Brito et al. 2008; Carriero et al. 2002; Cipriani et al. 2002; De la Rosa et al. 2002; Ercisli et al. 2011; Rallo et al. 2000; Roubos et al. 2010; Sabino Gil et al. 2006; Zaher et al. 2011).

Many important crops have complex population structures that arose from a long domestication and breeding history (Flint-Garcia et al. 2003). During the last decades, molecular data have been greatly applied in studies on genetic diversity, population structure and phylogeography of plant species to understand population dynamics and evolutionary processes (Newton et al. 1999). These methods can also be used for studying genetic structure in germplasm collections. This fact is an important aspect of future association studies (Shriner et al. 2007). General agreement exists among researchers that the incorporation of population structure into statistical models used in association mapping is necessary to avoid false positives (Flint-Garcia et al. 2003). To date, several studies in pear (Ferreira dos Santos et al. 2011), plum (Horvath et al. 2011), sweet cherry (Mariette et al. 2010), peach (Font i Forcada et al. 2013), olive (Breton et al. 2008; Erre et al. 2010), and chestnut (Pereira-Lorenzo et al. 2010) have been carried out. The most popular model-based clustering technique is STRUCTURE software (Falush et al. 2003; Pritchard et al. 2000), assuming that within populations loci are in linkage equilibrium and Hardy–Weinberg equilibrium.

Therefore, this study was undertaken as the first molecular approach for characterising the Aragon olive accessions. SSR markers were utilised to identify these accessions and to determine the genetic diversity and relatedness among them. Additionally, the analysis of the genetic structure of olive was included to assess the impacts of selection and bottlenecks in the olive germplasm.

Materials and methods

Plant material

A total of 163 ungrafted accessions were collected from three different areas of Aragon, NE Spain (Fig. 1; Table 1). Most of them (~90 %) were prospected and characterised in abandoned sites, considered as remnant trees in these regions. Four of them, under the name of ‘Acebuche’ were considered as wild accessions of O. europaea subsp. sylvestris. Most of this plant material was prospected from singular trees that, at the moment of their collection, were actively growing. An important group of accessions was taken from a previous survey of local and old cultivars, comprising 36 accessions previously selected because of their historical value and outstanding size and maintained as monumental trees in an orchard in Morata de Jalón. In addition, ten reference accessions were included in the study, since the names of some accessions were common with those of other Spanish regions.

Fig. 1 Map of the region of Aragón (Spain) indicating the location of the Olea europaea samples: filled squares Province of Huesca, filled circles Province of Teruel, and filled triangles Province of Zaragoza

Table 1 Olive accessions classified by their area of location and spread in Aragon, Spain

Amplification of SSR loci and fragment separation

For DNA extraction, leaf samples were collected from young shoots from the upper part of each tree, frozen immediately in liquid nitrogen, and stored at −20 °C. Genomic DNA was isolated following the CTAB extraction method based on Doyle and Doyle (1987) and diluted to 10 ng μL−1 to carry out PCR amplifications.

Eleven microsatellite markers, DCA9, DCA11, DCA13, DCA15, DCA18, DCA71b, GAPU59, UDO99-11, UDO99-19, UDO99-24, and UDO99-43, previously developed from O. europaea subsp. europaea (Carriero et al. 2002; Cipriani et al. 2002; Sefc et al. 2000) were screened in the diverse set of accessions collected in this study.

Polymerase chain reactions were performed in a 10 μL volume containing 1× PCR buffer, 1.5 mM MgCl2, 0.2 mM dNTPs, 0.2 µM of each primer, one unit of Taq DNA Polymerase (Invitrogen, Madrid, Spain) and 20 ng of genomic DNA. The cycling profile included an initial denaturation for 1 min at 94 °C, followed by 35 cycles of 15 s at 94 °C, 15 s at 50 °C, and 1 min at 72 °C, and a final extension of 2 min at 72 °C.

The PCR reactions were carried out in a 96-well block Thermal cycler (Applied Biosystems, Madrid, Spain). PCR products were detected using an ABI PRISM 3130 Genetic Analyser and GeneMapper analysis software (Applied Biosystems). For capillary electrophoresis detection, forward SSR primers were labelled with 5′-fluorescence dyes PET, NED, VIC and 6-FAM and the size standard used in the sequencer was Gene Scan™ 500 Liz® (Applied Biosystems).

Data analysis and genetic diversity

Simple sequence repeat data were scored on the basis of the presence or absence of marker alleles, generating a binary matrix. These data were used to estimate genetic similarity between individuals based on Nei and Li (1979) in the NTSYSpc-2.11 software package (Exeter Software Setauket, NY, USA). A dendrogram was generated using the unweighted pair group method with arithmetic mean (UPGMA). To evaluate the consistency of the dendrogram obtained by UPGMA, the cophenetic correlation coefficient (CCC) was calculated (Sokal and Rohlf 1962) with 1,000 simulations and analysed by the cophenetic correlation value (r). After grouping, the intra and inter distances were computed, considering that a correlation value of r between 0.85 and 0.9 showed a good adjustment (Mantel 1967).
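The similarity, clustering and cophenetic steps described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the NTSYSpc implementation: the function names and the four band profiles are hypothetical, the distance is assumed to be 1 minus the Nei and Li similarity, and the permutation test with 1,000 simulations is not reproduced.

```python
from itertools import combinations
from math import sqrt


def nei_li_similarity(a, b):
    """Nei and Li (1979): 2 * shared alleles / (alleles in a + alleles in b)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    return 2.0 * shared / total if total else 0.0


def upgma_cophenetic(dist):
    """UPGMA on a symmetric distance matrix (list of lists).

    Returns {(i, j): height} with i < j, where height is the level at which
    samples i and j are first joined (their cophenetic distance).
    """
    n = len(dist)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member samples
    d = {(i, j): dist[i][j] for i, j in combinations(range(n), 2)}
    coph = {}
    next_id = n
    while len(clusters) > 1:
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        for i in clusters[a]:
            for j in clusters[b]:
                coph[(min(i, j), max(i, j))] = height
        # Average linkage: size-weighted mean of the two old distances.
        na, nb = len(clusters[a]), len(clusters[b])
        new_d = {}
        for c in clusters:
            if c in (a, b):
                continue
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return coph


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical presence (1) / absence (0) profiles for four accessions.
profiles = {
    "acc_A": [1, 1, 0, 1, 0, 1],
    "acc_B": [1, 1, 0, 1, 1, 0],
    "acc_C": [0, 1, 1, 0, 1, 0],
    "acc_D": [0, 0, 1, 0, 1, 1],
}
names = list(profiles)
dist = [[1.0 - nei_li_similarity(profiles[p], profiles[q]) for q in names]
        for p in names]
coph = upgma_cophenetic(dist)
pairs = list(combinations(range(len(names)), 2))
r = pearson([dist[i][j] for i, j in pairs], [coph[(i, j)] for i, j in pairs])
print(f"cophenetic correlation r = {r:.2f}")
```

The cophenetic correlation is simply the Pearson correlation between the original pairwise distances and the heights at which each pair is joined in the tree, which is why a value close to 1 indicates that the dendrogram represents the matrix well.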

Principal coordinate analysis (PCoA) was conducted using Nei’s unbiased genetic distance pairwise population matrix to determine whether observed patterns in molecular data support the partitioning of the olive tree samples into specific groupings.

Genetic parameters such as the number of alleles per locus (N), the effective number of alleles detected per locus ($N_e$), the observed heterozygosity ($H_o$ = number of heterozygous individuals/number of individuals scored), the expected heterozygosity ($H_e = 1 - \sum_i \rho_i^2$, where $\rho_i$ is the frequency of the ith allele), and Wright’s fixation index ($F = 1 - H_o/H_e$), used to compare both heterozygosities (Wright 1951), were estimated using PopGene 1.31 software (Yeh et al. 1997).
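As a worked illustration of these definitions, the following sketch computes the parameters for a single locus from diploid genotypes. It is not the PopGene implementation; the function name and the genotypes are hypothetical, and the effective number of alleles is assumed to be the usual reciprocal of the sum of squared allele frequencies.

```python
from collections import Counter


def locus_stats(genotypes):
    """Diversity parameters for one SSR locus.

    genotypes: list of (allele1, allele2) tuples, one per diploid individual,
    with alleles given as fragment sizes in bp.
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = {allele: c / (2 * n) for allele, c in counts.items()}
    sum_sq = sum(p * p for p in freqs.values())
    ho = sum(1 for a, b in genotypes if a != b) / n  # observed heterozygosity
    he = 1.0 - sum_sq                                # expected heterozygosity
    return {
        "N": len(freqs),                 # number of alleles
        "Ne": 1.0 / sum_sq,              # effective number of alleles (assumed formula)
        "Ho": ho,
        "He": he,
        "F": 1.0 - ho / he if he > 0 else float("nan"),  # Wright's fixation index
    }


# Hypothetical genotypes at one locus for four accessions.
example = [(150, 150), (150, 152), (152, 156), (150, 156)]
stats = locus_stats(example)
print(stats)
```

In this toy data set three of the four individuals are heterozygous, so the observed heterozygosity exceeds the expected value and F is negative; a positive F would indicate the opposite, a deficit of heterozygotes.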

Population structure analysis

To assign individuals to populations based on the SSR genotypes and to investigate the population structure, the Bayesian clustering method was used, as implemented in the software package Structure (version 2.3.2) (Pritchard et al. 2000). The aim of this software is the identification of population structure by clustering individuals into genetically distinguishable groups on the basis of allele frequencies.

The ad hoc statistic ΔK (Evanno et al. 2005) was used to set the number of populations (K). Individual and admixture analysis were performed using Structure software assuming an admixture model where the allelic frequencies were correlated. This method uses a Markov Chain Monte Carlo (MCMC) algorithm to cluster individuals into populations on the basis of multilocus genotype data (Pritchard et al. 2000). A burn-in of 20,000 and 250,000 MCMC replications seemed to be the best fit for our data at K = 2. The analysis was run for K values ranging from two to five inferred clusters with ten independent runs each. The results were displayed graphically in a bar graph/chart.
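The ΔK statistic can be illustrated with a short sketch. It assumes the commonly used form of the Evanno et al. (2005) statistic, the absolute second-order difference of the mean ln P(D) across runs divided by the standard deviation of ln P(D) at that K, and it uses the sample standard deviation. The function name and all likelihood values are made up for illustration and are not results of this study. Note that ΔK is only defined for K values with a tested neighbour on each side.

```python
from statistics import mean, stdev


def delta_k(ln_p):
    """Evanno-style delta K from replicate Structure runs.

    ln_p: dict mapping K -> list of ln P(D) values, one per independent run.
    Returns a dict mapping each interior K to its delta K value.
    """
    m = {k: mean(v) for k, v in ln_p.items()}
    result = {}
    for k in sorted(ln_p):
        if k - 1 in ln_p and k + 1 in ln_p:
            second_diff = abs(m[k + 1] - 2 * m[k] + m[k - 1])
            result[k] = second_diff / stdev(ln_p[k])
    return result


# Made-up ln P(D) values, three runs per K (illustration only).
ln_p = {
    1: [-5200.0, -5201.0, -5199.0],
    2: [-4800.0, -4802.0, -4798.0],
    3: [-4750.0, -4760.0, -4740.0],
    4: [-4740.0, -4770.0, -4725.0],
}
dk = delta_k(ln_p)
best_k = max(dk, key=dk.get)
print(dk, "-> best K:", best_k)
```

The K with the largest ΔK is taken as the most likely number of clusters, because it marks the point where the likelihood stops improving sharply relative to the run-to-run variation.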

Results

SSR loci and allelic frequency

The 11 SSR primer pairs belonging to the series DCA, UDO and GAPU successfully amplified the 163 olive accessions studied (Table 2). A total of 176 alleles were produced with those primers, with sizes ranging from 120 (UDO99-11) to 284 bp (DCA15). DCA11 detected the highest number of alleles (24), followed by UDO99-43 with 23 and DCA9 with 22. DCA13 detected the lowest number of alleles, only five, much lower than for the second least polymorphic primer (DCA15 with eight alleles). Observed heterozygosity ranged between 0.43 (DCA13 and UDO99-24) and 0.87 (DCA71), with an average of 0.62 across the 11 SSRs (Table 2).

Table 2 Summary statistics for 11 SSR markers in the 163 Spanish olive accessions analysed with their size range, number of alleles (Na), observed heterozygosity (Ho), expected heterozygosity (He), polymorphic information content (PIC) and Wright’s fixation index (F)

The high values obtained for the number of alleles per locus and for heterozygosity confirmed the wide genetic diversity shown by the olive populations studied. These high values may also be due to the higher resolution of capillary electrophoresis, which separates close alleles more efficiently than non-automated techniques. Expected and observed heterozygosity values were compared by means of the fixation index (F), which averaged 0.23 and ranged between 0.026 (DCA71) and 0.43 (UDO99-11). F was positive for all 11 primers, indicating that observed heterozygosity was lower than expected at every locus. High F values in combination with individuals in homozygosis (or showing only one peak) for these primers suggest the presence of a null allele (Brookfield 1996). The mean polymorphic information content (PIC) was 0.80, ranging from 0.62 for DCA13 to 0.89 for DCA71. Based on the PIC values, the primer DCA13 appears to be the least informative, followed by UDO99-24 (0.71), whereas SSR DCA71 appears to be the most informative. The cophenetic correlation coefficient between the cophenetic matrix and the original SSR data was r = 0.88, indicating that the original matrix data were well represented in the dendrogram.
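The text does not state which formulas were used for PIC or for the null-allele inference, so the sketch below rests on two assumptions: PIC follows the widely used Botstein et al. (1980) definition, and the null-allele frequency is approximated with the first Brookfield (1996) estimator, (He − Ho)/(1 + He). The function names and input values are hypothetical and are not taken from Table 2.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphic information content (assumed Botstein et al. 1980 form).

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    sum_sq = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - cross


def brookfield_null_frequency(he, ho):
    """Approximate null-allele frequency (assumed Brookfield estimator 1)."""
    return (he - ho) / (1.0 + he)


# Hypothetical allele frequencies at one locus (must sum to 1).
freqs = [0.5, 0.25, 0.25]
print(f"PIC = {pic(freqs):.4f}")

# Illustrative heterozygosity values, not taken from the study's tables.
print(f"null allele frequency = {brookfield_null_frequency(0.80, 0.62):.3f}")
```

A heterozygote deficit (Ho below He) gives a positive estimate, which is the pattern that the fixation index flags as compatible with null alleles.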

Genetic relationships among accessions and olive accession discrimination

Genetic structure was defined by two different approaches, the dendrogram derived from the UPGMA cluster analysis (Fig. 2a), and the PCoA performed on Nei’s unbiased genetic distance matrix (Fig. 3). Additionally, supplementary material where the dendrogram is split in three parts is shown as SM1, SM2 and SM3.

Fig. 2 Dendrogram based on the diversity analysis of the 163 Spanish olive accessions based on UPGMA analysis using the similarity matrix generated by the Nei and Li coefficients after amplification with 11 SSRs (Huesca cvs. green colour, Zaragoza cvs. red colour and Teruel cvs. black colour) (a). The structure bar graph results obtained on the whole set of data from K = 2 to K = 5 (b) (colour figure online)

Fig. 3 Principal coordinates analysis (PCoA) plot of the olive accessions based on the first two principal coordinates (coord. 1 = 35.4 % and coord. 2 = 21.7 %). This PCoA plot was based on SSR data illustrating the genetic relationship based on binary genetic distance among the 163 Spanish olive accessions

Examination of the UPGMA cluster allowed the genotypes to be classified into two main groups of very different sizes. The first group (1), indicated in green, consisted only of accessions collected in the province of Huesca. The second group (2) also included accessions from Huesca (green), together with all accessions from Zaragoza (red) and, to a lesser extent, the few accessions from Teruel (black) (Fig. 2a). Consequently, most accessions from Huesca may be differentiated from the others.

The group 1, including Huesca accessions, showed two main subgroups, which may be further subdivided according to the schematic dendrogram shown in Fig. 2a and in the supplementary material (SM1, SM2 and SM3). The first subgroup (1.1) included a group of accessions from the Somontano area, such as ‘Minutera de Labata’, ‘Minutera de Boltaña’, ‘Minutera de Ibieca’, ‘Minutera de Castillonroy’ and ‘Minutera de Alquézar’, all of them with the same popular denomination and showing a close genetic distance among them, although they may be differentiated at the molecular level. Other accessions included in this same subgroup are ‘Cerruda de Abizanda’, ‘Cerruda de Artasona’, ‘Royeta de Baells’ and ‘Royeta de Secastilla 1’, with the same pattern of popular denominations and genetic closeness. In the second subgroup (1.2) such accessions as ‘Cerruda de Secastilla’, ‘Cerruda de Boltaña’, ‘Verdeña de Baells’ and ‘Verdeña de Lascuarres’, closely grouped, may be highlighted. Therefore, not all accessions with the same popular denomination are grouped together according to their genetic similarity, as the different accessions under the ‘Cerruda’ denomination show.

Group 2 of the cluster may also be divided into two subgroups. The first subgroup (group 2.1) comprised accessions from the three provinces and may be further divided into two smaller sections. The first section (2.1.1) included mostly accessions from Huesca, such as ‘Manzanilla’, ‘Manzanilla de Binéfar’, ‘Acebuche-2’, ‘Lanau’, ‘Royeta de Asque’ and ‘Nación’, with the exception of one accession from Zaragoza that could be identified as ‘Picual’. The second section (2.1.2) included accessions from the three provinces and may be a group of accessions linking together the different provinces, although showing very close relationships among some of the accessions. Thus, ‘Caspolina’ (old tree), ‘Caspolina’ and ‘Sevillano’ (Caspolina) clustered together and showed a very close genetic distance among them since they probably belong to the same population. The same pattern was found for three accessions from ‘Morata de Jalón’ (5, 8 and 15) and ‘Boltaña-2’, which clustered together and showed a very close genetic distance. This also happens with two accessions from ‘Morata de Jalón’ (7 and 11) and ‘Olivón de Secastilla-3’. Two cases of identity were found in this section of the cluster. One is that of ‘Massanac’-1 and 2, indistinguishable at the molecular level and additionally clustering very close to ‘Massanac’-3. The other case of identity was found between three accessions from ‘Morata de Jalón’ (21, 27 and 29) and ‘Manzaneta’, showing a coefficient of genetic similarity of approximately 0.99.

The second subgroup of group 2 (2.2) included mostly accessions from the province of Zaragoza and may be divided into three sections. The first section (2.2.1) contained accessions from two provinces, showing a very close similarity among the different ‘Arbequina’ clones, but not with ‘Bolvina de Caspe’. The second section (2.2.2) showed wider genetic distances, with similarity indices lower than 0.5. A single case of identity was observed in this section with three accessions from ‘Morata de Jalón’ (18, 19 and 20), grouping together with ‘Empeltre de Belchite’, with a coefficient of genetic similarity of approximately 0.97. The third section (2.2.3) comprised mainly accessions from Zaragoza, including 15 accessions from ‘Morata de Jalón’.

Finally, the PCoA analysis showed that the 163 olive accessions were separated into two main groups (Fig. 3), as supported by the UPGMA cluster previously explained. In the PCoA analysis, the first two principal axes explained a total of 57.1 % of unbiased genetic distance, with 35.4 % and 21.7 % for coordinates 1 and 2, respectively.

Genetic population structure

The olive accessions were evaluated for population stratification using Structure software. Bar graphs/charts were obtained with different values of K, from two to five. The uppermost hierarchical level of population structure suggested a clear maximum in the log probability of the data for K = 2. This partitioning level corresponds to a very strong differentiation into three major groups. The first two groups clustered 109 Huesca accessions (bar graphs/charts corresponding to the first group of green colour and the red colour for K = 2, Fig. 2b), with a significant admixture between them suggesting allele sharing. A few accessions from the province of Zaragoza (‘Morata de Jalón’ 1 and 30, and both Caspolinas) clustered together with the Huesca accessions.

Interestingly, the third group (the last green group for K = 2, Fig. 2) also included accessions from the three provinces, as already observed in the different sections of the dendrogram. Thus, nine accessions from Teruel and 46 from Zaragoza (both with green colour for K = 2, Fig. 2b) were included, as also some accessions from Huesca, such as ‘Empeltre’ (collected in Huesca near the places of other ‘Empeltre’ accessions of Zaragoza), as well as ‘Albás’, ‘Acebuche’, ‘Royeta de Mipanas’, ‘Royeta de Secastilla-4’, ‘Arbequina’, ‘Verdeña’ or ‘Dulce’.

Admixture was also observed among clusters, with several specimens showing Bayesian assignment probabilities ranging between 0.1 and 0.9. To investigate the sub-structure of our sample, we also calculated the percentages of membership from K = 3 to K = 5 (Fig. 2). As at K = 2, these results showed a differentiation into three major groups, but with a more complex, admixed structure. The proportion of genotypes assigned to each population was not symmetric, and many accessions were strongly assigned to one population or another, indicating that a real population structure exists (Pritchard et al. 2000).
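The distinction between strongly assigned and admixed accessions can be sketched as a simple threshold on the membership coefficients. The 0.9 cut-off is taken from the 0.1 to 0.9 range mentioned above (for two clusters, a probability within that range is equivalent to a largest membership below 0.9). The function name, accession labels and membership values are hypothetical.

```python
def classify_membership(q_row, threshold=0.9):
    """Label one individual from its cluster membership coefficients.

    q_row: list of membership proportions (one per cluster, summing to 1).
    Returns "cluster k" (1-based) if the largest proportion reaches the
    threshold, otherwise "admixed".
    """
    best = max(range(len(q_row)), key=lambda k: q_row[k])
    if q_row[best] >= threshold:
        return f"cluster {best + 1}"
    return "admixed"


# Hypothetical membership coefficients for K = 2 (illustration only).
q_matrix = {
    "accession_1": [0.97, 0.03],
    "accession_2": [0.55, 0.45],
    "accession_3": [0.08, 0.92],
}
for name, q_row in q_matrix.items():
    print(name, classify_membership(q_row))
```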

Discussion

The 11 SSRs utilised in this study showed their usefulness in characterising the genetic diversity of the olive accessions from Aragon. Although the SSRs were not all the same as in other studies, the mean number of alleles per marker, 16, was similar to 16.5 obtained by Belaj et al. (2010), but slightly higher than 13.2 obtained by Sarri et al. (2006) and 13 reported by Marra et al. (2013), and much higher than 3.93 obtained by Harbi et al. (2012) and 3.6 reported by Ercisli et al. (2012). The average heterozygosity value of 0.62 was lower than the value of 0.83 obtained by Marra et al. (2013) in Italian accessions from Calabria and Sicily and than the value of 0.74 found by Belaj et al. (2010) in wild and domesticated Spanish accessions, but slightly higher than 0.60 obtained by Muzzalupo et al. (2014), also in Italian accessions. Our result was also similar to 0.65 obtained by Sarri et al. (2006) for some western accessions and higher than 0.47 obtained by Ercisli et al. (2012) in Croatian and Turkish genotypes. The generally high heterozygosity found per locus within the Spanish local accessions is typical of naturally outcrossing, perennial, and clonally propagated species that are highly selected for greater adaptability, vigour, and productivity under cultivation (Belaj et al. 2010; Díez et al. 2011; Sanz-Cortés et al. 2001). Although most accessions studied in this work had not previously been characterised at the molecular level, these results confirm that the geographical diversity of the different accessions may be related to their genetic diversity.

Concerning the NTSYS analysis, the results demonstrated the presence of synonyms among these different accessions, comparable to those reported in the literature. Taking into account that the accessions studied were mostly collected from different orchards all over the region, with the exception of the collection of Morata de Jalón, which in turn came from old orchards, the level of identity was very low. Only a few cases of synonymy were found, mostly among accessions coming from the same area, as shown by their shared geographical denomination with different identification numbers. This could be because different growing conditions affected their appearance, so that they were originally identified as different accessions.

A high genetic diversity was found in subgroup 1.1, which included only accessions from the province of Huesca, with all the different denominations of ‘Minutera’, ‘Cerruda’, ‘Royeta’ and ‘Verdeña’. This diversity corroborates the high olive variability in the Somontano area. No synonymies were found in this cluster, confirming the uniqueness of each accession, even those belonging to the same general denomination, showing that a denomination may apply to the same type of tree but with small differences according to their different geographical origins. As most accessions are probably very ancient cultivars, the different denominations may be due to mutations from an original genotype giving better adaptation to the different areas where they spread and have been collected. Subgroup 1.2 includes another group of accessions from the Somontano area, but also two close accessions, ‘Manzanilla de Binéfar’ and ‘Manzanilla de Secastilla’, supposed to be introduced cultivars since they are grouped with the reference accession ‘Manzanilla’ (Kochmstedt et al. 2012). This reference cultivar originated in southern Spain and could have been introduced in the past in Aragon. However, its genetic dissimilarity to the two accessions with the same name may rule out their relationship. A possible explanation for this improper synonymy could be their phenotypic similarity, resulting in the two autochthonous and probably related accessions being named after a widespread and well-known cultivar. Some of the other introduced cultivars could have crossed with local accessions, spontaneously giving rise to new accessions that received the same denomination because of their similarity. This could have happened with the three ‘Sevillano’ (syn. ‘Caspolina’) accessions included in subgroup 2.1, since all of them come from Caspe and group together, as already established (Belaj et al. 2001, 2003), even if they are not identical.

Although the different accessions from Morata de Jalón were recovered without knowing their real origin, the results of the dendrogram make it possible to establish their relationship with the reference accessions. Although accessions 1, 5, 7, 8, 11 and 15 cluster together with ‘Boltaña 2’, their origin cannot be established because ‘Boltaña 2’ is not a cultivar name but a reference name taken from the nearby town. Therefore, these accessions cannot be considered as clones from the same population. Accessions 35 and 36 could also be identified as ‘Arbequina’ types because they cluster with the two accessions ‘Arbequina’ and ‘Old Arbequina’, although they are not identical to them. The reference accession ‘Royal de Alloza’ clusters with two accessions identified as ‘Royeta’ (from Secastilla and from Mipanas), thus suggesting that ‘Morata de Jalón’-2 could also be considered as belonging to the ‘Royeta’ group, as well as ‘Rebuxenca’ from Teruel.

The rest of the accessions from Morata de Jalón clustered within subgroup 2.2. This subgroup includes the reference accession ‘Empeltre’, the most common in Aragon, since it extends to more than 50 % of the olive area. ‘Empeltre’ has been considered as showing very high variability since this name was given to different individuals grafted onto old wild olive bushes (Belaj et al. 2004; De la Rosa et al. 2002). Consequently, accessions 10, 14, 16, 18, 19 and 20 from Morata de Jalón can be considered as belonging to the ‘Empeltre’ group, as well as ‘Moixoneca’ from Teruel, although maintaining their identity. Accessions 9, 13 and 31 can be related to ‘Bolvina de Belchite’. Finally, accessions 3, 4, 6, 12, 17, 22, 23, 24, 25, 26, 28, 32, 33 and 34 belong to the same group as the reference accession ‘Negral de Saviñán’. Additionally, although ‘Alquecerana’ and ‘Siso’ were collected in the province of Huesca and received their local geographical names, they are probably clones of ‘Negral de Saviñán’ introduced in the past and later considered as local accessions in their own right.

Clusters obtained by Structure software for population stratification were compared with the UPGMA dendrogram. There is an agreement between clusters representing genetic diversity and population structure (Fig. 2a) according to their geographical origin. Three clear groups were observed at K = 2 in the genetic structure analysis (Fig. 2b), as also previously shown by the UPGMA method. The first two groups (the first group of green colour and the red colour at K = 2) include the accessions from Huesca and the third (the second green group at K = 2) includes the accessions from Teruel and Zaragoza, as well as some from Huesca. Adaptation to local environmental conditions could explain the subgroups generated for the olive accessions into the different genetic clusters according to the structure analysis at K = 3, 4 and 5. It is noteworthy that ‘Empeltre’ and ‘Arbequina’, the most widespread accessions throughout Aragon, clustered together in the third group. The propagation system in the past included grafting the same genotype onto the same rootstock, as is the case of ‘Empeltre’ grafted on ‘Royal’. The solution to horticultural problems such as poor rooting, tolerance to environmental stresses, or more recently the adaptation to high-density planting and possibility of mechanical harvesting, as is the case of Arbequina, could explain the development of grafting techniques in Aragon, which are not common in other olive growing regions. Similar results on the genetic structure were obtained by other authors (Baldoni et al. 2006; Díez et al. 2011; Erre et al. 2010). Contrasting patterns of genetic diversity between 40 accessions from the Mediterranean basin have also been found by Breton et al. (2008) and population structure for 22 and 68 Italian olive accessions was shown by Albertini et al. (2011) and Marra et al. (2013), respectively.

In many fruit species, domestication occurred relatively late, so the bottleneck was relatively recent and its duration short, causing diversity loss and genetic drift. The study of the population structure of olive in Aragon (Spain) has provided valuable information about bottlenecks and evolutionary processes of this species, although demographic bottlenecks probably did not directly affect the Mediterranean accessions. However, the continuing genetic erosion caused by the continuous gene flow between wild and domesticated species, and the land degradation, especially in rural areas, must be taken into account (Belaj et al. 2007).

Conclusions

The identification of local germplasm is the first key step in the development of typical olive oil production. Moreover, the study of local cultivars is very important in preserving the biodiversity from the risk of genetic erosion due to the introduction of few superior cultivars in modern orchards and could expand the supply of different olive cultivars in Aragon. Research on the Spanish olive germplasm in the different growing areas has been also performed with the aim of determining the identity of minor local varieties so as to use them as an interesting resource for breeding programmes.

Our results represent a first molecular approach to olive genotypes from most Northeastern Spanish areas, allowing the comparison and identification of accessions as well as the exchange of reliable genetic material among institutes for future research.

Author contribution

AFiM, CFiF and MJRC designed the study, collected leaf samples and performed the SSR analysis. CFiF carried out the statistical and the structure analysis, while RSiC supervised the research and guided data interpretation. All authors participated in the writing of the manuscript and approved the final version.