Introduction

The olive (Olea europaea L.) is one of the most ancient tree crops of the Mediterranean area, with an excellent ability to survive and reproduce in harsh environments characterized by poor soils and low water availability.

Identification and conservation of olive genotypes growing under semi-natural conditions in extreme environments, showing medium to large fruit sizes, good oil quality and considerable tree sizes, are important priorities for the knowledge and promotion of local olive genetic resources (Bagnoli et al. 2009; Hakim et al. 2010; D’Imperio et al. 2011). Germplasm reassessment is particularly important in olive production, as olives remain one of the most economically significant and widespread fruit crops. Unlike most other fruit species, olives have a very large genetic patrimony represented by over 1,200 cultivars and an abundance of wild trees; in addition, a considerable number of ancient cultivated forms are still waiting to be identified and characterized. Wild plants and semi-natural ecotypes are of invaluable agronomical interest and their loss could have dramatic consequences for olive diversity (Beghe et al. 2011). The distribution of olive throughout Iran, which lies outside the traditional Mediterranean range of the species, follows different patterns, such as the colonization of pre-desert areas with very limited water availability and sub-saline lands with extreme temperature variations (Noormohammadi et al. 2009).

A few documents show that in the tenth and eleventh centuries olives were cultivated in the following regions: Nisapur, Gorgan, Deylam, Ramhormuz, Arrajan, and Fars. This distribution likely reflects the situation as it existed in pre-Islamic Persia (Floor 2005). Historical analysis has shown that olives were one of the main agricultural products in the ancient city of Gorgan (also known as Jorjan). Abvdlf Khzrjy, who visited Jorjan about 1,100 years before present (BP), reported: “The dates, bergamot and olives and pomegranates, walnuts and sugar cane growth, is very high” (Rechinger 1968).

Today, Iran has an olive production area larger than 122,000 hectares, yielding 6,000 tons of olive oil per year (International Olive Council 2000), and is starting to be recognized as a prominent olive producer. Olive cultivation occurs in many provinces, but more than 80 % is concentrated in the northern province of Gilan. Harsh climatic conditions throughout the country are not suitable for olive growth; however, some olive trees may have survived in a few remote places, holy areas and archaeological sites, either because they resprout easily from stumps or because they were maintained on sacred sites such as necropolises.

In the Golestan province, temperatures vary from lows of −5.9 °C to highs of 36.6 °C, with an annual rainfall of 564 mm in the west and 127.6 mm in the east (www.IRIMO.ir). The area suitable for olive cultivation in Golestan is represented by the northern valley of the Alborz mountains, between the Caspian Sea and the northern arid area. The olive cultivation area has been increasing in recent years with the introduction of foreign Mediterranean cultivars, which show limited production, probably due to their poor adaptation to the local agro-ecological conditions.

Analysis of genetic diversity through SSR markers can detect variants and discriminate among different genetic resources (Belaj et al. 2012; Diez et al. 2011; D’Imperio et al. 2011; Doveri et al. 2008; Hannachi et al. 2008; Muzzalupo et al. 2010; Tatjana et al. 2013). To date, microsatellite markers are recognized as the most effective tool for cultivar discrimination in olive, owing to their high polymorphism and reproducibility, and previous studies have applied them to a large number of olive genotypes (Baldoni et al. 2009; Haouane et al. 2011; Noormohammadi et al. 2009; Rehman et al. 2012; Sarri et al. 2006).

This work presents the results of molecular analyses of unidentified old olives surviving as isolated trees or groups of trees in some areas of Golestan province. These olives represent plants whose status of cultivation or level of wildness is unknown and will be referred to as ecotypes in this article. The analysis of their genetic structure, assessing the level of diversity and their relationships to reference cultivars, should provide valuable knowledge on the origin of this unknown olive germplasm and highlight its potential agro-ecological value.

Materials and methods

Plant material and geo-climatic data

A total of 23 olive ecotypes were collected from eight areas in the Golestan province along a 255-km area, from the western to the eastern side of the province (Fig. S1). These trees were selected from local small populations or isolated trees, sometimes represented by ancient plants, based on trunk and stump size, as in the case of Uzineh3 and Nasrabad 1 and 2 trees, reaching 2.5 m trunk diameter. Samples were labeled based on collecting sites. These ecotypes were compared to seven main Iranian cultivars and 11 representative Mediterranean varieties (Table 1) in order to evaluate their relationships with the most cultivated olive trees.

Table 1 Sample name, cultivation status and region of diffusion

To evaluate the geographical distribution of these trees and establish their climatic adaptation, the following data were recorded: latitude, longitude, altitude (m above sea level, asl) and climatic data, including average, minimum and maximum temperature based on the average values of the coldest and hottest months, and annual rainfall (mm/year) calculated over the last 10 years (Table 2). Sampling site altitudes ranged from 26 to 750 m asl. Rainfall ranged from 127 mm for the Ghazanghayeh area to 564 mm for the Lemesk, Nasrabad and Ayeshkalaleh collecting sites. The lowest average temperatures were −5.9 °C for Ghazanghayeh and 1.6 °C for the Lemesk, Nasrabad and Ayeshkalaleh sites. The highest average temperatures were recorded in Uzineh and Azadshahr (36.6 °C) and also in Ghazanghayeh (30.6 °C).

Table 2 Geo-climatic data for 23 olive accessions in Golestan province

DNA extraction and molecular analysis

DNA was extracted from leaves using the Plant GenElute Extraction Kit (SIGMA-ALDRICH®). DNA samples were analyzed by 11 SSR markers widely used for olive cultivar fingerprinting (Baldoni et al. 2009; Haouane et al. 2011) including: DCA3, DCA5, DCA9, DCA14, DCA16, DCA18, EMO90, GAPU71B, GAPU101, GAPU103A and UDO-043 (Sefc et al. 2000; de la Rosa et al. 2002; Carriero et al. 2002; Cipriani et al. 2002). Forward primers carried VIC, FAM, PET or NED labels at their 5′-end.

PCR amplifications were performed in reaction volumes of 25 μL containing 25 ng of template DNA, 10X PCR buffer, 200 μM of each dNTP, 10 pmol of each primer and 2 U of Perfect Taq DNA Polymerase (5 PRIME, Eppendorf). Amplifications were performed with the PCR System 9600 Thermal Cycler (Applied Biosystems, Foster City, CA), using the following cycling conditions: an initial denaturation step at 95 °C for 5 min, followed by 35 cycles of 95 °C for 30 s, annealing for 30 s at the temperature suggested by the original authors of each marker, and 72 °C for 25 s, followed by a final elongation step at 72 °C for 30 min. PCR products were loaded on the ABI 3130 Genetic Analyzer (Applied Biosystems-Hitachi) using the internal GeneScan™-500 LIZ Size Standard (Applied Biosystems). Output data were analyzed by GeneMapper 3.7 (Applied Biosystems-Hitachi).

Fruit morphological analysis, oil content and composition

Fruits could be obtained from only a few samples for preliminary morphological and oil analyses. In particular, four ecotypes (Uzineh3 and Ghazanghayeh1, 2, and 8) were selected and analyzed for fruit, stone and leaf morphology based on International Olive Council (IOC) parameters.

Oil content and fatty acid composition were also recorded for the same samples. For oil composition, six main fatty acids (palmitic, palmitoleic, stearic, oleic, linoleic and linolenic acids) were analyzed.

Data analysis

GenAlEx 6.5 (Peakall and Smouse 2012) was used to detect the total number of alleles at each locus and for each ecotype sample (using the Iranian and Mediterranean cultivars as reference olive genotypes). The following measurements were made: number of alleles (Na), number of effective alleles (Ne), number of private alleles (Np), Shannon’s information index (I), observed (Ho) and expected heterozygosity (He), fixation index (F), mean fixation index (Fis) and pairwise population fixation index (Fst). Micro-Checker version 2.2.3 (Van Oosterhout et al. 2004) was used to estimate the presence of null alleles, which are responsible for heterozygote deficiency, at the eleven loci by using four different methods (Oosterhout, Chakraborty, Brookfield 1 and Brookfield 2).
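
The per-locus measurements listed above can be illustrated with a minimal sketch. It assumes the usual frequency-based definitions (Ne = 1/Σp², I = −Σ p ln p, He = 1 − Σp², F = (He − Ho)/He) and complete diploid genotypes; the function name and the allele sizes are hypothetical and are not taken from the data set.

```python
from collections import Counter
from math import log

def locus_diversity(genotypes):
    """Diversity indices for one locus.

    genotypes: list of (allele, allele) tuples, one per diploid sample,
    with no missing data (an assumption of this sketch).
    """
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    sum_p2 = sum(p * p for p in freqs)

    na = len(freqs)                               # number of alleles
    ne = 1.0 / sum_p2                             # effective alleles
    shannon = -sum(p * log(p) for p in freqs)     # information index
    ho = sum(1 for a, b in genotypes if a != b) / n
    he = 1.0 - sum_p2
    f = (he - ho) / he if he > 0 else float("nan")
    return {"Na": na, "Ne": ne, "I": shannon, "Ho": ho, "He": he, "F": f}

# Hypothetical allele sizes (bp) for four samples at one SSR locus.
example = [(100, 102), (100, 100), (102, 104), (100, 102)]
stats = locus_diversity(example)
```

A negative F, as in this toy example, indicates an excess of heterozygotes relative to the expectation under random mating.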

Phylogenetic and molecular evolutionary analyses were completed with DARwin v.5 software (Perrier and Jacquemoud-Collet, 2006) using the neighbor joining (NJ) method with 10,000 bootstrap replications.
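
The tree is later described as being built on a simple matching dissimilarity matrix. As a hedged illustration, the sketch below computes such a dissimilarity between two diploid SSR profiles, assuming the allelic form d = 1 − (1/L) Σ m_l/π, where m_l is the number of alleles shared at locus l, π is the ploidy and L is the number of loci. Whether this matches the exact option used in DARwin is an assumption; the function name and allele sizes are hypothetical.

```python
from collections import Counter

def simple_matching_dissimilarity(ind_a, ind_b, ploidy=2):
    """ind_a, ind_b: dict mapping locus name -> tuple of allele sizes."""
    loci = [locus for locus in ind_a if locus in ind_b]
    shared = 0
    for locus in loci:
        # Multiset intersection counts alleles matched one-to-one.
        overlap = Counter(ind_a[locus]) & Counter(ind_b[locus])
        shared += sum(overlap.values())
    return 1.0 - shared / (ploidy * len(loci))

# Hypothetical profiles at two loci.
tree_1 = {"DCA3": (232, 239), "DCA9": (162, 172)}
tree_2 = {"DCA3": (232, 232), "DCA9": (162, 172)}
d = simple_matching_dissimilarity(tree_1, tree_2)
```

Two samples with identical profiles at every locus give d = 0, which is how clonal duplicates appear in the matrix.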

To establish parent-offspring relationships among reference cultivars and ecotypes, a parentage analysis was carried out using CERVUS software version 3.0.3 (Marshall et al. 1998). In cases where the parental candidates identified from the likelihood procedure had more than two loci mismatches, the offspring were not assigned to any parental candidate.
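
The mismatch rule described above can be sketched as follows. This covers only the exclusion step (a parent and its offspring must share at least one allele per locus) with the two-locus tolerance stated in the text; it does not reproduce the likelihood (LOD) calculation performed by CERVUS. Function names and allele sizes are hypothetical.

```python
def count_mismatches(parent, offspring):
    """Count loci at which the pair shares no allele.

    parent, offspring: dict mapping locus name -> (allele, allele).
    Loci missing in either profile are skipped.
    """
    mismatches = 0
    for locus, (a, b) in offspring.items():
        if locus not in parent:
            continue
        if a not in parent[locus] and b not in parent[locus]:
            mismatches += 1
    return mismatches

def is_candidate_parent(parent, offspring, max_mismatches=2):
    """Keep the candidate only if mismatching loci do not exceed the limit."""
    return count_mismatches(parent, offspring) <= max_mismatches

# Hypothetical three-locus profiles.
candidate = {"DCA3": (232, 239), "DCA9": (162, 172), "DCA16": (150, 156)}
seedling = {"DCA3": (232, 245), "DCA9": (180, 184), "DCA16": (156, 156)}
n_mismatch = count_mismatches(candidate, seedling)
```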

To explore the partitioning of the genetic variance into different geographical and genetically distinguishable groups, we analyzed populations using the Mantel test in GenAlEx, based on correlation coefficients among triangular matrices of genetic and geographic distances.
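
A permutation-based Mantel test of this kind can be sketched as below. It is a generic, standard-library illustration and not the GenAlEx implementation: the statistic is the Pearson correlation between the upper triangles of the two matrices, and significance is assessed by permuting the rows and columns of one matrix together. The number of permutations, the seed and all names are assumptions.

```python
import random
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]

def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided p-value) for two square distance matrices."""
    n = len(genetic)
    identity = list(range(n))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson(geo, upper_triangle(genetic, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel samples in the genetic matrix only
        if pearson(geo, upper_triangle(genetic, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)
```

The squared correlation r² corresponds to the R² usually reported alongside the regression line of a Mantel plot.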

To detect changes in diversity measurements within the assessed material, Pearson’s correlation coefficients were calculated for Nei’s gene diversity, as well as Ho, Na, I, uHe and F against several geographical and climatic parameters. Positive or negative correlations were assessed using SPSS software version 17.0 (IBM Corporation, NY, USA).

All SSR data were also analyzed using STRUCTURE 2.3.4 software (Pritchard et al. 2000) to identify the possible breakdown of samples into different populations and to identify hidden population structures. This was completed by clustering individuals into genetically distinguishable groups on the basis of allele frequency. An admixture model for independent alleles with 10,000 iterations was applied to analyze from K = 1 to K = 10 for all samples or from K = 1 to K = 6 when considering only the Golestan ecotypes. To identify the true population number, ΔK values were calculated as mean of absolute values of difference between successive likelihood values of K divided by the standard deviation of L(K).
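
The ΔK calculation can be sketched as follows, assuming the statistic is the one of Evanno et al. (2005), ΔK = |L(K+1) − 2L(K) + L(K−1)| / s[L(K)], computed here from the mean L(K) over replicate runs. The number of replicate runs is not stated in the text, so the log-likelihood values below are invented purely for illustration.

```python
from statistics import mean, stdev

def delta_k(runs):
    """runs: dict mapping K -> list of L(K) values from replicate runs.

    Returns a dict K -> delta K for every K that has both neighbours
    (K - 1 and K + 1) available.
    """
    means = {k: mean(values) for k, values in runs.items()}
    result = {}
    for k in sorted(runs):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(runs[k])
    return result

# Invented L(K) values for three replicate runs per K.
replicates = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -903.0, -897.0],
    3: [-800.0, -801.0, -799.0],
    4: [-795.0, -800.0, -790.0],
}
dk = delta_k(replicates)
best_k = max(dk, key=dk.get)
```

ΔK is undefined for the smallest and largest K tested, which is why the tested range should extend beyond the expected number of groups.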

Results

Microsatellite polymorphisms

The 11 loci used in this study displayed a high degree of polymorphism among Golestan ecotypes, Iranian and Mediterranean reference cultivars and revealed 77 alleles for Golestan samples, 61 for Iranian cultivars and 77 for Mediterranean varieties, for a total of 215 alleles. Allele numbers for all samples ranged from a minimum of three at the DCA5 locus to a maximum of 14 alleles at DCA16. The number of effective alleles ranged from 3.46 for the Golestan ecotypes to 3.88 for Iranian reference cultivars and 4.96 for Mediterranean cultivars, with a mean value across groups of 4.1. The Shannon information index ranged from 1.42 in ecotypes to 1.69 in the Mediterranean cultivars, with a mean of 1.53. Expected heterozygosity (He) varied from 0.63 in olive ecotypes to 0.71 in Iranian cultivars and 0.76 in Mediterranean cultivars, with an average of 0.72; in all cases these values were lower than the observed heterozygosity (Ho), although some ecotype samples, such as Uzineh3, showed a high level of homozygosity.

Null-allele analysis of all microsatellite loci with Micro-Checker indicated that ten SSRs were not affected, whereas null alleles were highly probable at UDO-043 (Table 3), as suggested by the excess of homozygotes at this locus. To avoid any bias due to null alleles, the data from the UDO-043 locus were excluded from further analyses of genetic variability and differentiation among the analyzed samples.
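
To show how a homozygote excess translates into a null-allele frequency estimate, the sketch below implements two of the closed-form estimators named in the methods, assuming their usual definitions: Chakraborty, r = (He − Ho)/(He + Ho), and Brookfield 1, r = (He − Ho)/(1 + He). The iterative Oosterhout method and Brookfield 2 are not reproduced, and the heterozygosity values are hypothetical rather than those observed at UDO-043.

```python
def null_allele_frequency(he, ho):
    """Estimate null-allele frequency from expected and observed heterozygosity."""
    return {
        "chakraborty": (he - ho) / (he + ho),
        "brookfield1": (he - ho) / (1.0 + he),
    }

# Hypothetical locus with a marked heterozygote deficiency.
estimates = null_allele_frequency(he=0.80, ho=0.40)
```

Both estimators are zero or negative when Ho is equal to or greater than He, in which case there is no evidence of null alleles from heterozygote deficiency.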

Table 3 Null alleles analysis by Microchecker software

The values of He and Ho did not vary significantly after excluding UDO-043; the fixation index (F) decreased only for the ecotypes, from −0.07 to −0.14, and its standard error also decreased, from 0.07 to 0.03.

The pairwise Fixation index (Fst) values among the three groups showed considerable differences in all cases, with the lowest scores detected between the ecotypes and Mediterranean cultivars (0.08) and between the Iranian and Mediterranean cultivars (0.09). Surprisingly, the highest value (0.11) was recorded between Golestan ecotypes and Iranian cultivars (Table 4).
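
For orientation, a frequency-based pairwise Fst at a single locus can be sketched as Fst = (Ht − Hs)/Ht, where Hs is the mean expected heterozygosity of the two groups and Ht is the expected heterozygosity of the pooled allele frequencies. Equal weighting of the two groups is an assumption, and the exact estimator and the averaging over loci used by GenAlEx may differ; all names and frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """freqs: dict mapping allele -> frequency within one group."""
    return 1.0 - sum(p * p for p in freqs.values())

def pairwise_fst(freqs_a, freqs_b):
    """Single-locus Fst between two groups given their allele frequencies."""
    alleles = set(freqs_a) | set(freqs_b)
    pooled = {a: (freqs_a.get(a, 0.0) + freqs_b.get(a, 0.0)) / 2
              for a in alleles}
    hs = (expected_heterozygosity(freqs_a)
          + expected_heterozygosity(freqs_b)) / 2
    ht = expected_heterozygosity(pooled)
    if ht == 0:
        return 0.0  # both groups fixed for the same allele
    return (ht - hs) / ht

# Hypothetical allele frequencies at one locus.
ecotypes = {150: 0.6, 152: 0.3, 160: 0.1}
cultivars = {150: 0.2, 152: 0.5, 154: 0.3}
fst = pairwise_fst(ecotypes, cultivars)
```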

Table 4 Pairwise population Fixation index values (Fst)

Compared with the main Iranian cultivars, Golestan ecotypes carried a large number of private alleles (Table S1), with loci DCA16, DCA3, DCA9 and GAPU103A showing the maximum values (16 for DCA16 and five for each of the others), whereas only the DCA5 locus showed no exclusive alleles.
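
Private alleles are simply alleles observed in one group and in none of the groups it is compared with. A minimal sketch of that set operation, with hypothetical group names and allele sizes, is:

```python
def private_alleles(groups):
    """groups: dict mapping group name -> set of alleles seen at one locus.

    Returns a dict mapping group name -> alleles found only in that group.
    """
    result = {}
    for name, alleles in groups.items():
        others = set().union(
            *(a for other, a in groups.items() if other != name)
        )
        result[name] = alleles - others
    return result

# Hypothetical allele sizes (bp) at one locus.
observed = {
    "ecotypes": {150, 152, 160},
    "iranian": {150, 154},
}
private = private_alleles(observed)
```

Summing the sizes of these sets over loci gives the number of private alleles (Np) for each group.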

Cluster analysis

The Neighbor-Joining dendrogram obtained with DARwin v.5 was based on a simple matching dissimilarity matrix. It revealed the presence of three main clusters: one including most of the Golestan ecotypes; a second composed of all the Mediterranean cultivars, six out of seven Iranian cultivars and the Livan ecotype from Golestan; and a third group including only the Ganaveh ecotype and the Zard Iranian variety (Fig. 1). Within the first cluster most of the Ghazanghayeh samples grouped together and two samples were identical (Ghazanghayeh6 and 9), but no further geographical aggregations were observed.

Fig. 1

Neighbour-joining dendrogram based on a simple matching dissimilarity matrix obtained with DARwin v.5. Bootstrap values ≥50, indicated along the branches, represent the percentage of times out of 10,000 that two accessions grouped together during bootstrap analysis. Branch length is proportional to the distance between nodes

Parentage

Kinship analysis was performed with CERVUS on all SSR data to determine the parentage of Golestan ecotypes among genotypes with different genetic backgrounds. The results indicated that, among cases with the highest LOD scores, the Ghazanghayeh1, 3, 6, 8 and 9, Azadshahr1, 3 and 4 and Ayeshkalaleh1 samples were the most likely parental candidates of other plants present at the same sites (Table 5).

Table 5 Parentage analysis of CERVUS with lowest mismatch in 41 olive studied samples

Inferred population structure

Population structure analysis showed the stabilization of ∆K values at K = 3 (Fig. 2). The three main groups separated Golestan ecotypes from Iranian cultivars and from Mediterranean varieties. A degree of admixture was observed only for the Livan ecotype (<20 %), with both Iranian and Mediterranean cultivars, and, to a lesser extent, for the Iranian variety Zard.

Fig. 2

Genetic structure of the Golestan olive ecotypes compared with Iranian and Mediterranean olive cultivars at K = 3. Each vertical bar represents a single accession, and colors distinguish the three groups defined by the K value. Olive samples with more than one color indicate an admixed genetic composition

The population substructure analysis within ecotypes, performed by hierarchical population structure analysis, could not divide the Golestan ecotypes into different subgroups and revealed a high percentage of admixture among these samples.

Geographical and climatic data

The Golestan province represents one of the most climatically variable areas in Iran. Pearson’s correlation coefficients, calculated to determine whether correlations exist between geo-climatic and molecular data, revealed no correlations (Table 6).

Table 6 Pearson’s correlation for geoclimatic and molecular data based on SPSS software

The Mantel test, performed to assess the correlation between genetic and geographic distances based on two triangular matrices using GenAlEx software, showed a positive but not significant correlation (R² = 0.0485; Fig. 3).

Fig. 3

Mantel test diagram, performed to assess the correlation between genetic (GD) and geographic (GGD) distance matrices

Morphological and agronomical characters

To give an idea of the potential interest of these trees for cultivation and to exclude their confusion with wild olives naturally occurring in the Mediterranean area, we performed a preliminary survey of the four plants for which fruit material was available (Fig. S2, Table S2).

Despite the strongly unfavorable climatic conditions where they grow, fruits of the Ghazanghayeh trees were bigger (7.06–12.83 g) than those of most internationally known table olive cultivars, according to the international standards (IOC 2000). This large fruit size was accompanied by a large stone size, leading to low flesh/pit ratios (0.97–1.50). In contrast, the Uzineh sample showed a medium–low fruit weight (1.67 g) but a very light stone of elliptic shape, leading to a high flesh/pit ratio (4.68).

The fruit oil content on a total fresh fruit weight basis was also quite high, reaching 25.6 % for Ghazanghayeh1 and 23.7 % for Uzineh3, whereas for the other two samples the oil percentage did not exceed 7.3 % (Table S3). Oleic acid, the main fatty acid in olive triacylglycerols, reached percentages up to 73.9 % (Uzineh3); palmitic and linoleic acids showed percentages within normal ranges as well.

Discussion

Golestan is a province located in the northeast of Iran, on the border with Turkmenistan, facing the Caspian Sea and bounded to the south by the Alborz mountains. The valley along these mountains has fertile soils, water availability and a mild climate, while to the north the region is largely characterized by an arid desert area. Along the southern valley and up to the northeastern pre-desert area, it is possible to find isolated or small groups of olive trees growing naturally in the wild, without any sign of cultivation and under difficult or extreme eco-climatic conditions, sometimes close to sacred areas or archaeological sites, including the Livan, Lemesk and Nasrabad sites. In some cases, trees show considerable trunk or root plate sizes, indicating an old age, likely more than a hundred years, and their fruits may reach a large size, comparable to that of the best table olive varieties typical of the Mediterranean area. Neither information nor documents are available to explain the origin of these plants; however, they appear to be highly respected and protected by local people.

A representative sample of these olive ecotypes was collected from all sites and, where available, fruits were also harvested for a preliminary evaluation of their agro-technological value.

Molecular analysis of these ecotypes indicated they were quite diverse compared to other olive varieties, and pairwise population fixation index values confirmed the clear separation obtained by cluster and structural analysis among Golestan and Mediterranean cultivars, also demonstrating a higher differentiation with the main Iranian cultivars. Only Zard, one Iranian variety currently cultivated in the Golestan province, carried some similarity to some of these ecotypes.

They also exhibited a large number of private alleles, highlighting a deep difference from the cultivated varieties and suggesting that these ecotypes contributed little to the development of the cultivated varieties and may have evolved independently. At the same time, their population substructure highlighted a high percentage of admixture among different sites, and parentage analysis revealed close kinship relationships among all ecotypes, even when located at considerable distances. In general, the geographical distance among these olive ecotypes was unrelated to their genetic diversity.

Based on these discrepancies, but taking into account the present distribution of variation among the few olives still alive, it can be assumed that the olive plants surveyed in this work most likely represent the survivors of ancient olive cultivations established thousands of years ago in the region and then abandoned because of geo-climatic disturbances or socio-economic changes. A few original plants may have survived for a while and may have disseminated seeds able to germinate and establish new trees. In fact, the significant number of shared alleles within Ghazanghayeh samples may support a common origin from seeds that were naturally dispersed and underwent successive inbreeding among Ghazanghayeh ecotypes. The two identical samples from Ghazanghayeh most likely derived from clonal propagation, strengthening the hypothesis that they originated from cultivated plants.

The distance detected from the present cultivars, both Iranian and Mediterranean, can be attributed to the fact that they have developed in other areas, drawing from different gene pools, while paleo-varieties of Golestan, despite their high agricultural value, have remained isolated for a long time and have not contributed to the formation of the varieties currently grown.

Finally, morphological and biochemical analyses performed on a restricted but important subset of samples revealed they may carry desirable fruit size, high oil content and balanced fatty acid composition, in line with the best Mediterranean commercial varieties.

It is important to underline that, in the whole area under evaluation, naturally growing olive trees have never been reported (Akhani et al. 2010; Heshmati 2013) or evaluated from an ethnobotanical point of view (Khoshbakht and Hammer 2006), apart from the new olive plantations established under the most favourable conditions along the Alborz Valley (Ayoubi et al. 2011; Ghavami et al. 2007), which rely on foreign cultivars or on a few national varieties. The occurrence of these trees is probably too scarce and patchy to arouse the interest of botanists and agronomists. By contrast, the presence of olives in the Gorgan area has been reported since 1,000 years BP (Floor 2005; Rechinger 1968).

These preliminary data identify the potential value of these ecotypes, but for a complete understanding of their agronomical proficiency, more comprehensive studies should be undertaken.

Conclusions

The present work demonstrates that in the Golestan province, which lies a great distance from the core region of Mediterranean olive cultivation, an unexpected level of olive variation is preserved. These olives, genetically far from most common olive cultivars, likely originated as survivors of ancient cultivated olives.

The most interesting samples, which certainly include the Ghazanghayeh group, are represented by plants well adapted to strongly adverse soil and climate conditions and bearing large fruits of considerable weight. The quality and quantity of table olives and olive oil should also be thoroughly evaluated to consider their use in new olive plantations and to improve olive products. Confirmation of the genetic differences found in the ecotypes studied in the present work through a larger sample set could open a new window on the origin and spread of olive tree cultivation beyond the boundaries of the Mediterranean Basin.

The discovery of such a source of variability could help the cultivation of this important plant for human consumption in unsuitable areas. Additional information needs to be collected and plans to safeguard and exploit this material should be undertaken; meanwhile, a propagation program has been established, as well as an ex situ collection for their preservation.

The emerging trend to build new olive orchards using non-native ecotypes or a few Iranian cultivars should take into consideration this interesting source of variation, which shows strong resilience to environmental limitations. As a consequence of this preliminary work, it will be possible to perform wider agronomical evaluations and incorporate ecotypes to develop “new” olive genotypes suitable for cultivation under arid or cold conditions and for production as table olives and olive oil.