Introduction

Corylus avellana L., the European hazelnut, is diploid (2n = 2x = 22), monoecious, dichogamous, and wind-pollinated and has sporophytic incompatibility that enforces cross-pollination. Its geographical distribution extends from Europe and North Africa to the Caucasus region and Asia Minor. It is the source of important cultivars in Europe and Turkey, which were selected over many centuries from local wild populations (Thompson et al. 1996). A few superior cultivars were spread beyond their area of origin by trade and human migration. Despite this long cultivation history, little is known about the origin and domestication of the species. European hazelnut is one of the most important tree nut crops in terms of worldwide production. The Black Sea countries account for the majority of world production: Turkey (610,264 tons, average of 2009–2011), Azerbaijan (28,564 tons), and Georgia (20,567 tons). Other important producers are Italy (114,991 tons), the USA (35,079 tons), and Spain (16,988 tons) followed by Iran, China, France, and Greece (FAOstat 2011), with significant new plantings in Chile. About 90 % of the world crop is shelled and sold as kernels, while the remaining 10 % is sold in-shell for fresh consumption. The primary user of kernels, the food industry, requires cultivars that produce nuts with few defects and has precise requirements for morphological, chemical, and physical characteristics of the kernels.

In recent years, efforts to improve the efficiency and effectiveness of agro-biodiversity conservation have been made for most crop species as required by the Convention on Biological Diversity (CBD 1992), the International Treaty on Plant Genetic Resources for Food and Agriculture (FAO 2001), and the Global Plant Conservation Strategy (CBD 2002). Agro-biodiversity includes plant genetic resources (PGR) for food and agriculture: (a) modern cultivars, breeding lines, and genetic stocks that are widely and actively conserved by plant breeders and gene banks and (b) obsolete cultivars, landraces (e.g., farmer populations of crop plants), ecotypes (e.g., natural plant populations), and wild and weedy relatives that still need to be actively conserved (Polegri and Negri 2010). PGR are the raw material required for genetic improvement, allowing a crop to adapt to unpredictable environmental changes and to resist diseases, thereby helping to guarantee food security for future generations (FAO 2001). In recent decades, PGR have usually been conserved by ex situ methods. More recently, in situ conservation, sometimes referred to as “on-farm conservation,” has been proposed as a better strategy for protecting the diversity of cultivated and wild plant species in their natural habitats (Jarvis et al. 2000).

In hazelnut, about 400 clonal cultivars have been described and are maintained in different ex situ germplasm repositories (Thompson et al. 1996). A total of 510 accessions are conserved in 13 European hazelnut collection fields: four in Italy, three in Portugal, two in Spain, and one each in Slovenia, France, and Greece (Rovira et al. 2011). More than 700 Corylus accessions are preserved in the major world hazelnut collection located in Oregon (USA), while a collection containing 20 registered cultivars and more than 400 accessions collected from the Black Sea coast is in Turkey (Gürcan et al. 2010a). In contrast, the in situ conservation strategies proposed by Jarvis et al. (2000) have not yet been applied to hazelnut PGR, although a first on-farm exploration was conducted in northern Spain (Asturias) by Ferreira et al. (2010). Over three consecutive years (2008–2010), the EU AGRI GEN RES project SAFENUT (“Safeguard of almond and hazelnut genetic resources: from traditional uses to modern agro-industrial opportunities”) aimed to increase knowledge of genetic diversity in the European hazelnut. Objectives included description of cultivars from different ex situ European collections as well as the on-farm exploration, description, and in situ conservation of local endangered PGR. This characterization was carried out using different sets of descriptors: morphological, biochemical, and molecular, as well as ecological and cultural aspects (Bacchetta et al. 2011).

Identification of accessions and analysis of genetic diversity in collections (ex situ and in situ) are important in the management and utilization of PGR. Traditional methods to characterize and identify hazelnut accessions or cultivars are based on morphological and phenological descriptors (Bioversity International 2008). In recent years, DNA markers have proven to be useful for accurately identifying cultivars due to their high discriminating power at a relatively low cost. In C. avellana, microsatellite or simple sequence repeat (SSR) markers have been developed (Bassil et al. 2005a, b; Boccacci et al. 2005; Gürcan and Mehlenbacher 2010a, b; Gürcan et al. 2010b) and mapped (Mehlenbacher et al. 2006; Sathuvalli et al. 2011). These loci have been used to fingerprint accessions in collections, identify synonyms, determine parentage, and assess genetic relationships among cultivars (Boccacci et al. 2006, 2008; Gökirmak et al. 2009; Gürcan et al. 2010a). SSR markers have also been used to investigate the genetic diversity and structure of different populations (Boccacci and Botta 2010; Gökirmak et al. 2009; Gürcan et al. 2010a) and to compare local cultivars and wild hazelnuts (Campa et al. 2011).

The present work reports the results of a hazelnut germplasm exploration conducted on-farm within the SAFENUT project in five southern European countries (Portugal, Spain, Italy, Slovenia, and Greece). The main aims were to characterize hazelnut landraces using morphological descriptors and SSR markers and to investigate their genetic relationships with wild forms and well-known cultivars. The information will be useful to identify landraces for in situ preservation, further evaluation in ex situ collections, and use in breeding programs. The origin and diffusion of the cultivated germplasm in southern Europe, particularly in the Italian Peninsula, will also be discussed.

Materials and methods

Plant material

A total of 153 hazelnut accessions were analyzed in this study: (a) 77 landraces mostly surveyed on-farm during the SAFENUT project (2008–2010) (Table 1), (b) 57 reference cultivars from different European and Turkish collections (Supplementary Table 1), and (c) 19 wild hazelnuts sampled in the sites of Vejano (Latium, central Italy) and Benevento (Campania, South Italy), where wild populations are still present. The landraces were surveyed in the traditional areas of hazelnut cultivation in five southern European countries (Table 1). Among them, 5 were collected in northern Portugal, 10 in northern Spain (Asturias), 52 in six Italian regions [6 in Piedmont (northwestern Italy), 10 in Liguria (northwestern Italy), 1 in Marche (central Italy), 12 in Latium (central Italy), 3 in Calabria (southern Italy), and 20 in Sicily], 5 in Slovenia, and 5 in northern Greece. Farmers were contacted, informed of the reasons for the project, and interviewed about the presence of old endangered cultivars on their farms. Information on agronomic and qualitative traits, as well as use, local names, tradition, and social context, was also collected. The set of reference cultivars included those well known and grown in the above-mentioned countries, together with the most important ones from the Black Sea region. Given the high number of landraces and cultivars from Italy, some wild forms were included to investigate the origin and circulation of hazelnut in the Italian Peninsula.

Table 1 Locations of 77 landraces characterized in five southern European countries

Morphological observations

A total of 20–50 nuts were collected in situ from each surveyed landrace. Husks or involucres, nuts, and kernels were characterized using 14 qualitative standard descriptors (Table 2), following Thompson et al. (1978), the UPOV (1979), and Bioversity International (2008) guidelines.

Table 2 Proportion of phenotypic classes of morphological descriptors of hazelnut fruits collected from landraces shown to have unique SSR genotype

DNA extraction and SSR analysis

Total genomic DNA was extracted from 0.25 g of young leaves or immature catkins using the modified procedure of Thomas et al. (1993). A total of ten SSR loci, selected by Boccacci and Botta (2010) for the SAFENUT project, were analyzed: CaT-B107, CaT-B501, CaT-B502, CaT-B503, CaT-B504, CaT-B505, CaT-B507, CaT-B508 (Boccacci et al. 2005), CaC-B020, and CaC-B028 (Bassil et al. 2005a). PCR amplifications were performed in a volume of 15 μl containing 40 ng DNA, 0.5 U Taq-DNA polymerase (AmpliTaq Gold, Applied Biosystems, Foster City, CA, USA), 1.5 μl 10× PCR buffer (100 mM Tris–HCl, pH 8.3, 500 mM KCl), 2 mM MgCl2, 200 μM dNTPs, and 0.5 μM of each primer. The PCR conditions were: a first denaturation step at 95 °C for 9 min, followed by 26 cycles of denaturation (30 s at 95 °C), annealing (45 s at 55 °C and 50 °C for CaT-B502), and extension (90 s at 72 °C). The final elongation step was carried out at 72 °C for 30 min. Amplification products were analyzed using an ABI-PRISM 3130 Genetic Analyzer capillary electrophoresis instrument (Applied Biosystems, Foster City, CA, USA). Results were processed with GeneMapper software (Applied Biosystems), and alleles were designated by their size in base pairs using a GeneScan-500 LIZ standard (Applied Biosystems).

Data analysis

Microsatellite data obtained at ten SSR loci for 153 hazelnut accessions were processed using the software Identity 4.0 (Wagner and Sefc 1999) to identify accessions with identical SSR profiles. When two or more accessions had identical SSR profiles, only one was retained for further analysis.
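
The duplicate screening described above can be illustrated with a minimal sketch. It is not the Identity 4.0 implementation; it only groups accessions whose multilocus genotypes are identical, treating each locus as an unordered allele pair. The accession names, allele sizes, and the helper names `genotype_key` and `group_identical_profiles` are hypothetical and serve only as an example.

```python
from collections import defaultdict

def genotype_key(profile):
    """Normalize a profile: sorted allele pair per locus, loci in name order."""
    return tuple(
        (locus, tuple(sorted(alleles))) for locus, alleles in sorted(profile.items())
    )

def group_identical_profiles(profiles):
    """Return lists of accession names that share an identical multilocus profile."""
    groups = defaultdict(list)
    for name, profile in profiles.items():
        groups[genotype_key(profile)].append(name)
    return list(groups.values())

# Hypothetical allele sizes (bp) at two loci, for illustration only.
profiles = {
    "Landrace A": {"CaT-B107": (112, 128), "CaT-B501": (119, 125)},
    "Cultivar X": {"CaT-B107": (128, 112), "CaT-B501": (119, 125)},
    "Landrace B": {"CaT-B107": (112, 130), "CaT-B501": (119, 119)},
}

groups = group_identical_profiles(profiles)
# Keep one representative per set of identical profiles for further analysis.
unique_representatives = [members[0] for members in groups]
duplicate_sets = [members for members in groups if len(members) > 1]
```

In this toy data set, the first two accessions form one duplicate set and a single representative of that set would be carried forward.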

The genetic relationships among the different genotypes were investigated using two types of analysis. An unweighted pair-group method using arithmetic average (UPGMA) was used to construct and draw a dendrogram from the genetic similarity matrix using mega v. 5.05 (Tamura et al. 2011). Genetic distances (1,000 bootstraps) were computed as: D = [1 − (proportion of shared alleles)], using Microsat software (Minch 1997). A principal coordinate analysis (PCoA) was computed by GenAlEx 6.2 (Peakall and Smouse 2006).
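
As a hedged illustration of the distance D = 1 − (proportion of shared alleles), the sketch below uses one common formulation for pairs of diploid individuals: alleles in common are counted at each locus (0, 1, or 2) and divided by twice the number of loci. The exact handling in the Microsat software (e.g., of missing data) may differ; the function name `shared_allele_distance` and the genotypes are assumptions made for the example.

```python
from collections import Counter

def shared_allele_distance(ind_a, ind_b):
    """D = 1 - Ps for two diploid genotypes given as {locus: (allele1, allele2)}."""
    loci = sorted(set(ind_a) & set(ind_b))
    if not loci:
        raise ValueError("no loci in common")
    shared = 0
    for locus in loci:
        # Multiset intersection counts the alleles in common (0, 1, or 2).
        common = Counter(ind_a[locus]) & Counter(ind_b[locus])
        shared += sum(common.values())
    return 1.0 - shared / (2 * len(loci))

# Hypothetical genotypes at two loci: 3 of 4 alleles are shared.
a = {"L1": (112, 128), "L2": (119, 125)}
b = {"L1": (112, 130), "L2": (119, 125)}
d_ab = shared_allele_distance(a, b)
```

A matrix of such pairwise distances is the kind of input from which a UPGMA dendrogram can be built.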

The program structure v. 2.3.3 (Pritchard et al. 2000), a model-based Bayesian clustering method, was used to infer population structure and assign individuals to sub-populations. Structure was run five independent times for each K value ranging from 1 to 10. The admixture model was applied and allele frequencies were assumed to be correlated. A burn-in period of 100,000 generations and 200,000 Markov chain Monte Carlo replications were used. All individuals were treated as having known origin and assigned to one of eight geographical groups. Among them, accessions from Italy were assigned to different geographical groups in order to investigate the origin and spread of cultivated germplasm in the Italian Peninsula. Landraces and cultivars were assigned to one of six groups: the Iberian Peninsula (Spain and Portugal, 18 accessions), Sicily (22), southern Italy (7), central Italy (12), northwestern Italy (13), and Balkans–Black Sea (Slovenia, Greece, and Turkey, 27). Wild individuals were from Latium (9) and Campania (10). The statistic ΔK (Evanno et al. 2005) was calculated by structure harvester software (Earl and vonHoldt 2012) and used to select the optimal K value.
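
The ΔK criterion can be sketched as follows. This is an illustrative calculation, not the structure harvester code: it assumes ΔK is taken as the absolute second-order difference of the mean log-likelihoods, divided by the sample standard deviation of the runs at that K. The log-likelihood values and the function name `delta_k` are invented for the example.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Evanno-style Delta K from {K: [Ln P(D) of each independent run]}.

    Defined only for K values whose neighbours K-1 and K+1 were also run.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result

# Invented log-likelihoods for three runs per K, for illustration only.
runs = {
    1: [-5000.0, -5001.0, -4999.0],
    2: [-4800.0, -4802.0, -4798.0],
    3: [-4500.0, -4501.0, -4499.0],
    4: [-4480.0, -4485.0, -4475.0],
    5: [-4470.0, -4478.0, -4462.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

Note that ΔK cannot be evaluated for the smallest and largest K tested, because both neighbouring values are needed.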

Genetic diversity within and differentiation among the eight geographical populations were investigated. Popgene software (Yeh et al. 1997) was used to calculate observed (N a) and effective (N e) number of alleles, observed (H o) and expected heterozygosity (H e), Nei’s (1978) coefficient of genetic identity (G i) and genetic distance (G d), and gene flow (N m) (Slatkin and Barton 1989). The fixation index (F st) was estimated according to Weir and Cockerham (1984) using the program F-stat (Goudet et al. 1995). The significance levels of the F st values were determined after 560 permutations.
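
For readers unfamiliar with these statistics, the sketch below computes the four within-population measures for a single locus using textbook definitions: N a is the count of distinct alleles, N e = 1/Σp², H o is the proportion of heterozygous individuals, and H e = 1 − Σp². It is not the Popgene implementation, which may apply small-sample corrections; the genotypes and the function name `locus_diversity` are assumptions for illustration.

```python
from collections import Counter

def locus_diversity(genotypes):
    """Return (Na, Ne, Ho, He) for one locus from diploid genotypes [(a1, a2), ...]."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    freqs = [count / len(alleles) for count in counts.values()]
    homozygosity = sum(p * p for p in freqs)
    na = len(counts)                 # observed number of alleles
    ne = 1.0 / homozygosity          # effective number of alleles
    ho = sum(a != b for a, b in genotypes) / len(genotypes)
    he = 1.0 - homozygosity          # expected heterozygosity, uncorrected
    return na, ne, ho, he

# Hypothetical genotypes (allele sizes in bp) of four individuals at one locus.
genotypes = [(112, 128), (112, 112), (128, 130), (112, 130)]
na, ne, ho, he = locus_diversity(genotypes)
```

Population-level values would then be averaged over loci.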

The Shannon–Weaver index was calculated for phenotypic diversity in each trait in the 42 landraces. The diversity index was calculated as H = −Σ p_i ln p_i, where p_i is the frequency of the phenotypic class i for each trait, as reported in Table 2 (Shannon and Weaver 1949). A principal component analysis (PCA) using the 14 morphological descriptors for 42 landraces and 11 standard cultivars (‘Casina,’ ‘Barcelona,’ and ‘Negret’ from Spain; ‘Nocchione,’ ‘Tonda Gentile Romana,’ ‘Tonda di Giffoni,’ and ‘Tonda Gentile delle Langhe’ from Italy; ‘Tombul’ from Turkey; ‘Cosford’ from England; ‘Istrska dolgoplodna’ and ‘Istrska okrogloplodna’ from Slovenia) was performed using PAST v. 2.12 software (Hammer et al. 2001).
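
A short worked example of the diversity index may help. The sketch below evaluates the index from class frequencies using the natural logarithm. For the descriptor ‘presence of double kernels,’ the Results report 92.9 % of accessions without double kernels; the example assumes, as a simplification, that the remaining 7.1 % fall in a single class. The function name `shannon_index` is hypothetical.

```python
import math

def shannon_index(class_frequencies):
    """H = -sum(p_i * ln p_i) over phenotypic classes with non-zero frequency."""
    total = sum(class_frequencies)
    proportions = [f / total for f in class_frequencies if f > 0]
    return -sum(p * math.log(p) for p in proportions)

# 'Presence of double kernels': 92.9 % absent; the other 7.1 % is assumed
# here to belong to one 'present' class.
h_double_kernels = shannon_index([92.9, 7.1])
```

Under this two-class assumption the index is about 0.26, in line with the lowest value reported in the Results; descriptors with many evenly represented classes give higher values.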

Results

Sets of duplicates

Microsatellite analysis identified 42 unique genotypes and ten sets of duplicate accessions among the 77 investigated landraces. Accessions listed as duplicates were similar for nut and husk morphology. Thirty-five landraces showed SSR profiles identical to a reference cultivar (Supplementary Table 1).

Among the landraces surveyed in the Iberian Peninsula, three sets of duplicates were detected. The first set consisted of three accessions from Portugal (‘Cartuxeria/Tubulosa,’ ‘Dawton,’ and ‘Purpurea’) whose SSR profiles were identical to ‘Fructo rubro’ (syn. ‘Pellicule rouge’). All had small, long, thin-shelled nuts and long tubular husks. The second set was the pair ‘Raul’ and the Turkish cultivar ‘Karidaty’ (syns. ‘Imperiale de Trebizonde,’ ‘Kargalak’). The third set consisted of six accessions from northern Spain which were identical to ‘Casina,’ the most common cultivar in this area.

Three sets of accessions with the same SSR profile were found among the accessions collected in Italy. The first set included six accessions of ‘Tonda di Biglini’ (Piedmont) and ‘Tonda Gentile delle Langhe’ (‘TGL’). The latter cultivar represents over 90 % of the orchards in the Piedmont region and is well known for kernel quality. However, there were phenological and carpological differences between ‘Tonda di Biglini’ and ‘TGL.’ ‘Tonda di Biglini’ nuts matured 10–15 days earlier and had thicker shells, a lower percentage of kernel by weight, and a higher frequency of double kernels (data not shown). The second set was ‘Meloni’ and ‘Nocciola della Madonnella’ from Latium, which were identical to ‘Tonda Gentile Romana’ (‘TGR’), which represents ∼85 % of the local nut production. Nut maturity in ‘Meloni’ was about 15 days earlier than in ‘TGR.’ The third set consisted of ‘Nocchia rosa’ (Latium), three accessions of ‘Tonda Calabrese’ (Calabria), and six accessions of ‘Caraffara’ (Sicily), all of which were identical to ‘Nocchione.’ All had round-oblate nuts of medium size in short husks. ‘Nocchione’ is the main pollinizer of ‘TGR’ in the Latium region. It is also the most widely grown cultivar in Sicily, where it is known under different names: ‘Nostrale,’ ‘Comune,’ or ‘Mansa’ (Catania and Messina provinces); ‘Racinante’ (Enna province); and ‘Santa Maria del Gesù’ (Palermo province). Alberghina (1982) attributed the small morphological differences observed among the above-mentioned cultivars to environmental factors and renamed the group ‘Siciliana.’ Recently, Boccacci et al. (2006) and Gökirmak et al. (2009) confirmed their identical profiles at 24 and 21 SSR loci, respectively.

Accessions ‘CV/1’ and ‘CV/2’ from Slovenia were identical to ‘Barcelona’ (syn. ‘Castanyera’). ‘Barcelona’ is commercially important in the USA and in France, where it is known as ‘Fertile de Coutard.’ It is of minor importance in Spain, where it is known as ‘Castanyera’ in the northeast (Tarragona province) and ‘Grande’ in the north. In Portugal, it is called ‘Grada de Viseu’ (Mehlenbacher and Miller 1989).

Among the Black Sea cultivars, three sets of synonyms were noted. The first set was the pair ‘Patem small’ from Greece and ‘Fructo rubro.’ The second set was ‘Argiroupoli’ and ‘Patem large’ from Greece and the cultivar ‘Yassi Badem’ from Turkey. Kernels of ‘Yassi Badem’ resemble almonds in size and shape and are consumed fresh, but are not suitable for processing. Finally, the third set was the pair ‘Polykarpos’ and ‘Tombul Ghiaghli’ from Greece; the latter is commonly cultivated there.

Morphological characterization

Morphological observations revealed high phenotypic diversity among the 42 unique landrace genotypes (Table 2). The H index calculated for each of the 14 morphological descriptors averaged 1.1, ranging from 0.26 (‘presence of double kernels’) to 1.57 (‘kernel shape’); the highest values were for ‘nut shape’ (1.50) and ‘kernel shape’ (1.57). The most frequent numbers of nuts per cluster were 2–3 (46.2 %) and 1–2 (30.8 %). The most common involucre type was longer than the nut (47.4 %). Of the nut characters, the small (40.5 %) and medium (37.5 %) sizes were most common. Nut shape was highly variable, but the globular (33.3 %) and long cylindrical (26.2 %) shapes were the most common. The majority of nuts had a light brown shell color (64.3 %), few (33.3 %) or medium stripes (47.6 %), and a small (44.1 %) or medium (44.1 %) pistil scar. Among kernel descriptors, almost all accessions (92.9 %) had no double kernels. The majority had medium (45.2 %) or small (40.5 %) size kernels, and the most common shapes were ovoid, long cylindrical, and globular (28.6, 28.6, and 23.8 %, respectively). The skin (pellicle) was most often slightly corky (57.1 %), and the internal cavity was most often small (53.1 %). Concerning the ‘percentage of kernel by weight,’ 31.7 % of landraces showed medium values (45.1–50.0 %), while 58.6 % had values less than 45.0 %. Finally, 43.9 % of them had a high ‘percentage of kernel caliber >12 mm’ (75.1–100 %). These descriptors can be used to identify accessions suitable for the kernel or in-shell markets. The main morphological and technological traits are reported for the 42 landraces in Table 3.

Table 3 Morphological and technological parameters of 42 landraces with unique SSR profiles and standard cultivars ‘Tonda Gentile delle Langhe’ and ‘Negret’

In the PCA based on 14 morphological descriptors for 42 landraces and 11 reference cultivars, the first two components (PC1 and PC2) explained 38.7 % of the total variation. PC1 accounted for 25.1 % and was positively correlated with nut and kernel size. PC2 accounted for an additional 13.6 % and was mostly associated with nut and kernel shape. The PCA scatter-plot split the samples into three main groups (Fig. 1). Among Italian landraces, the northwestern accessions (Liguria) were separated from those from central (Latium) and southern (Sicily) Italy. Ligurian landraces, with the exception of ‘Noscello,’ were grouped on the right side of the scatter-plot with ‘Casina,’ ‘Istrska dolgoplodna,’ ‘Negret,’ and ‘Tombul.’ The accessions collected in Sicily and Latium, except ‘Selvaggiola lunga,’ clustered in two adjacent groups. The group in the upper left contained: (a) ‘Barrettona,’ ‘Cappello del Prete,’ ‘Itavex,’ and ‘Madonnella’ from Latium; (b) ‘Selvaggiola riccia,’ ‘Selvaggiola SIC6,’ SIC13, and SIC16, ‘Selvaggiola agostara,’ ‘Selvaggiola tardiva SIC8,’ and ‘Trichette’ from Sicily; and (c) the standard cultivars ‘Barcelona,’ ‘Nocchione,’ ‘TGR,’ ‘Tonda di Giffoni,’ and ‘TGL.’ The group in the lower part of the graph included: (a) ‘Allungata,’ ‘Nocciola Ada,’ ‘Nocciola Benedetta,’ ‘Nocciola centenaria,’ and ‘Nocciola lunga’ from Latium and ‘San Vicino Vittori’ from Marche; (b) ‘Minnulara,’ ‘Selvaggiola tardiva SIC12,’ and ‘Selvaggiola’ SIC4, SIC7, and SIC17 from Sicily; and (c) ‘Cosford.’ The PCA did not separate the landraces from the Iberian Peninsula, Slovenia, and Greece into distinct groups.

Fig. 1 PCA two-dimensional scatter plot based on the first two principal components (PC1 and PC2) generated for 42 landraces and 11 standard cultivars based on 14 morphological traits

Genetic relationships and population structure analysis

A dendrogram was constructed depicting the genetic relationships among the 42 unique landrace genotypes, 57 cultivars, and 19 wild individuals. Accessions were grouped into eight clusters (Fig. 2). Group A included cultivars from the Black Sea region (Turkey and Greece) as well as three accessions surveyed in Italy (‘San Vicino Vittori’ from Marche, ‘Lunghera’ and ‘Seigretta’ from Liguria) and one surveyed in Greece (‘Philio’). Group B was a mixture of cultivars and landraces from different geographical areas. ‘T/10’ surveyed in Slovenia appeared adjacent to the cultivars ‘TGL’ (Italy), ‘Trenet,’ and ‘Morell’ (Spain), while the landraces ‘Ciasetta’ from Liguria and ‘Nocciola Benedetta,’ ‘Nocciola lunga,’ and ‘Allungata’ from Latium were adjacent to the Turkish cultivars ‘Yassi Badem’ and ‘Yuvarlak Badem.’ Group C contained cultivars from the Iberian Peninsula and landraces surveyed in Asturias (‘Allande 3,’ ‘Robriguedo 2,’ ‘Las Cuevas 1,’ and ‘Priero 1’) and Liguria (‘Noscello,’ ‘Menoia,’ and ‘Bardina’). The landraces from northern Spain constituted a sub-group with ‘Casina’ and ‘Noscello.’ Most Italian cultivars and landraces were placed in the main cluster D, which includes germplasm from central Italy (Latium), south Italy (Campania), and Sicily. The accessions ‘T/0’ and ‘T/16’ from Slovenia and ‘Quinta Vila Nova Do Rego’ from Portugal were also placed in this group. Group E was composed of only four landraces surveyed in Liguria (northwestern Italy). Finally, the last three clusters (F, G, and H) comprised wild genotypes and a few cultivars. Clusters F and G included both wild individuals and landraces from Latium, while almost all wild individuals from Latium and Campania were included in the larger group H with ‘Tonda rossa’ and ‘Tonda bianca.’ These two cultivars are grown only in Avellino province (Campania, South Italy), are distinct from the other cultivars in Campania, and are morphologically similar.

Fig. 2 UPGMA dendrogram based on SSR analysis of 42 unique landrace genotypes (LR), 57 cultivars (CV), and 19 wild individuals (W)

In the PCoA, the first two PCs explained 48.7 % of the total variation. The first coordinate explained 26.1 % of the variation and the second coordinate an additional 22.6 %. The projection of 118 hazelnut accessions on a two-dimensional plane defined by the first two PCs (Fig. 3) showed a tendency to separate the cultivated accessions from the wild genotypes. Considering the geographical origin of the cultivars and landraces analyzed, the scatter-plot showed a tendency of the central–southern Italian accessions to cluster together in the lower half of the graph. Accessions from the Black Sea were preferentially placed in the upper left and those from the Iberian Peninsula in the upper right. Among the northern Italian accessions, three (‘Lunghera,’ ‘Seigretta,’ and ‘Trietta’) clustered with those from the Black Sea, seven (‘Bardina,’ ‘Del Rosso,’ ‘Dell’Orto,’ ‘Gianchetta,’ ‘Menoia,’ ‘Noscello,’ and ‘Tapparona’) clustered with the Iberian accessions, and three (‘Catainetto,’ ‘Ciasetta,’ and ‘TGL’) were placed in an intermediate position along the X-axis of the graph.

Fig. 3 Two-dimensional plot obtained from PCoA for 118 hazelnut genotypes classified in eight geographical groups and analyzed at ten SSR loci

The 118 hazelnut genotypes were further evaluated for population stratification using the structure software. SSR data were analyzed increasing the number of subpopulations (K) from 1 to 10. The estimation of ΔK revealed the highest value for K = 3 (ΔK = 48.1), indicating the existence of three groups, composed mainly of Turkish, wild, and central–southern Italian accessions, respectively (Fig. 4). Several genotypes were not clearly placed in separate groups, such as those from Spain or Liguria that clustered both with the Turkish and wild accessions. At K = 4 (ΔK = 21.2) and K = 5 (ΔK = 12.0), these three initially identified groups remained almost constant, whereas several Spanish accessions showed the tendency to constitute a separate group, Ligurian accessions were placed both with the Turkish and Spanish accessions, and some cultivated forms collected in Latium showed introgression with the local wild germplasm (Fig. 4). Comparing these results with the UPGMA dendrogram and the PCA scatter-plot, there was general agreement about the population subdivisions and the genetic relationships among genotypes.

Fig. 4 Hierarchical organization of genetic relatedness of 118 unique hazelnut genotypes based on ten SSR markers and analyzed by the STRUCTURE program, with three, four, and five populations (K = 3, K = 4, and K = 5). Legend (geographical groups): 1 Iberian Peninsula, 2 northwestern Italy, 3 central Italy, 4 southern Italy, 5 Sicily, 6 Balkans–Black Sea, 7 wild Latium, 8 wild Campania

Differentiation among geographical gene pools

On the basis of their geographical area of origin, the 118 unique genotypes were divided into eight gene pools (Tables 4 and 5). The observed (N a) and effective (N e) number of alleles and the observed (H o) and expected heterozygosity (H e) were calculated to evaluate the level of genetic diversity within each gene pool (Table 4). N a and N e ranged from 4.2 to 7.8 (average 6.3) and from 3.1 to 4.7 (average 3.8), respectively. H o (average 0.81) was generally higher than H e (average 0.71) in each group, with the exception of the wild individuals from Latium. The level of genetic diversity observed was high, a consequence of the self-incompatibility mating system of C. avellana and wind pollination, and similar to that found by other authors (Boccacci and Botta 2010; Boccacci et al. 2006, 2008; Gökirmak et al. 2009; Gürcan et al. 2010a; Campa et al. 2011).

Table 4 Genetic diversity for hazelnut accessions classified in eight geographical groups
Table 5 Genetic identity (G i), genetic distances (G d), gene flow (N m), and genetic differentiation (F st) among and between hazelnut gene pools analyzed with SSR markers

Genetic identity (G i), genetic distance (G d), fixation index (F st), and gene flow (N m) were calculated to investigate genetic differentiation among gene pools (Table 5). G i was the highest among gene pools comprised of cultivated accessions, ranging from 0.667 to 0.918. On the contrary, G i values were lower between wild and cultivated groups, ranging from 0.400 to 0.694. Correspondingly, a higher G d was found between cultivated and wild groups, ranging from 0.086 (Iberian Peninsula vs. northwestern Italy) to 0.916 (wild Campania vs. Balkans–Black Sea). All pairwise comparisons yielded significant differentiation values, ranging from 0.015 (Iberian Peninsula vs. northwestern Italy) to 0.194 (wild Campania vs. Sicily) with P (Fst not >0) < 0.05 for each F st value, and an equal distribution of N m values between gene pools was observed.
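
The identity and distance values quoted above are linked by Nei's standard relation, in which genetic distance is the negative natural logarithm of genetic identity. The sketch below applies that relation to the two extreme identity values given in the text; the function name `nei_distance` is hypothetical, and the calculation is only a consistency check, not a reproduction of the Popgene analysis.

```python
import math

def nei_distance(identity):
    """Nei's standard genetic distance from genetic identity: D = -ln(I)."""
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in (0, 1]")
    return -math.log(identity)

d_closest = nei_distance(0.918)   # highest identity reported in the text
d_farthest = nei_distance(0.400)  # lowest identity reported in the text
```

Rounded to three decimals, these give 0.086 and 0.916, matching the smallest and largest G d values reported.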

Discussion

Mislabeling and the existence of synonyms and homonyms are important challenges for germplasm conservation. In the past decade, SSR markers have become very valuable tools in the management of ex situ hazelnut collections. In their study, Boccacci et al. (2006) reported six sets of synonyms among 78 accessions from European collections, Gökirmak et al. (2009) found 72 duplicates among 270 accessions from USDA-ARS-NCGR and OSU germplasm repositories, and 6 Turkish accessions conserved in the US collection fields were found to be synonyms of cultivars from the HRI collection by Gürcan et al. (2010a).

Among the 77 landraces surveyed in five southern European countries (Portugal, Spain, Italy, Slovenia, and Greece), SSR profiles indicated that 42 were unique genotypes, while 35 accessions appeared to be synonyms. A total of ten sets of duplicates were found between landraces and some reference cultivars. Among them, landraces from Portugal, Slovenia, and Greece had profiles identical to those of six foreign cultivars: ‘Barcelona’ (syn. ‘Castanyera’) from Spain, ‘Fructo rubro’ from the Balkans, and ‘Karidaty’ (syns. ‘Imperiale de Trebizonde,’ ‘Kargalak’), ‘Yassi Badem,’ ‘Palaz,’ and ‘Tombul Ghiaghli’ from Turkey. Hazelnut is cultivated on only a few hectares in Portugal and Slovenia. Beginning in the 1980s, several introduced cultivars have been evaluated for growth and nut production in both countries alongside local cultivars (Solar and Štampar 2011). In Greece, hazelnut cultivation began when Greek immigrants came from the Pontus region in northern Turkey. With them they brought Turkish cultivars that are still cultivated today, such as ‘Extra Ghiaghli’ (a clone of ‘Tombul’), ‘Tombul Ghiaghli,’ and ‘Sivri Ghiaghli.’ Landraces surveyed in Spain and Italy showed SSR profiles identical to, and morphological traits similar to, those of important local cultivars: ‘Casina’ in northern Spain, ‘TGL’ in Piedmont, ‘TGR’ in Latium, and ‘Nocchione’ (syn. ‘Siciliana’) in central and southern Italy. Hazelnut growing has a strong tradition in Spain and Italy, where orchards have a limited number of cultivars. In Spain, the northeastern province of Tarragona (Catalonia) accounts for 88 % of the total area planted to hazelnut, and ‘Negret’ is the most widely planted cultivar. Minor hazelnut-growing areas include Asturias and adjacent regions in northern Spain, where cultivated forms are found in small orchards and gardens. In the past, hazelnut was an important crop in this region, and ‘Casina’ was the most commonly grown cultivar (Ferreira et al. 2010; Campa et al. 2011).
In Italy, 98 % of the producing surface is located in four regions: Piedmont, Latium, Campania, and Sicily. Production is limited in Liguria, Sardinia, Emilia, Veneto, and Calabria. A wide and varied hazelnut germplasm base exists in the Italian Peninsula, but it has been little studied in Sicily and minor growing regions. In Campania, seven main varieties are cultivated (‘Camponica,’ ‘Mortarella,’ ‘Riccia di Talanico,’ ‘San Giovanni,’ ‘Tonda di Giffoni,’ ‘Tonda bianca,’ and ‘Tonda rossa’). In Piedmont (‘TGL’) and Latium (‘TGR’), production is based on a single cultivar, while ‘Nocchione’ (syn. ‘Siciliana’) is the main pollinizer of ‘TGR’ in Latium.

In their studies, Boccacci et al. (2006) and Gökirmak et al. (2009) reported that ‘Nocchione’ and a group of Sicilian cultivars, renamed ‘Siciliana’ by Alberghina (1982), had identical profiles at 24 and 21 SSR loci, respectively. This was an unexpected result, since these cultivars are grown in two distant Italian regions: Latium and Sicily (Boccacci et al. 2006). Our results confirm this synonymy and clarify the origin of ‘Nocchione.’ Structure analyses revealed that ‘Nocchione’ grouped (98 %) with the accessions from southern Italy and Sicily rather than with those from Latium (central Italy). Bayesian clustering and admixture analysis can be considered a standard method to identify the ancestral populations from which cultivars originated and to quantify genetic relationships with probabilities and proportions (Breton et al. 2008). Thus, our data indicate that ‘Nocchione’ originated in southern Italy, most likely in Sicily. Moreover, the molecular analysis of most Sicilian accessions surveyed in situ confirmed the existence of a single dominant cultivar in local orchards (Alberghina 1982). In fact, six ‘Caraffara’ accessions, known as ‘Nostrale’ in Enna province, showed the same SSR profile as ‘Siciliana,’ indicating that Sicily is very likely the area of origin of ‘Nocchione,’ from which it spread to central and southern Italy. It is probable that ‘Nocchione’ was also introduced into Calabria during the second half of the nineteenth century (Piccirillo et al. 2007), as indicated by the genetic identity between ‘Nocchione’ and ‘Tonda Calabrese.’

Morphological characterization revealed a wide diversity among the 42 unique landraces. The H index was high (average of 1.1), and most of the phenotypic classes were present for the evaluated descriptors (Table 2). These accessions should be considered original and valuable PGR that represent additional local genetic diversity in need of in situ conservation. In addition, some landraces showed morphological and technological traits appreciated by the market (Table 3). Accessions ‘Robriguedo-2’ (Asturias), ‘Noscello’ (Liguria), ‘Barrettona,’ ‘Itavex,’ ‘Cappello del Prete,’ ‘Madonnella’ (Latium), and ‘Selvaggiola Tardiva SIC12’ (Sicily) were of interest for the food industry. Nuts with globular or ovoid shape, kernels of medium size, and a caliber ≥ 12 mm are the ideal traits for industrial processing (Garrone and Vacchetti 1994). In contrast, ‘Selvaggiola SIC3,’ ‘Trichette’ (Sicily), ‘San Vicino Vittori’ (Marche), and ‘T/16’ (Slovenia) showed the large nut and kernel size desired by the in-shell market.

The study of the genetic relationships and population structure among wild forms, landraces, and cultivars in a geographic area can supply information about putative domestication events, evolutionary relationships, and gene flow between them. The UPGMA tree (Fig. 2), the PCoA scatter plot (Fig. 3), and the structure analyses (Fig. 4) revealed a high level of differentiation between wild and cultivated forms. The wild genotypes from Latium and Campania were closely related but were separated from cultivars and landraces. Nevertheless, introgression and admixture were identified between wild accessions and some landraces from Campania (‘Tonda bianca’ and ‘Tonda rossa’) or from Latium (‘Nocciola centenaria,’ ‘Cappello del prete,’ and ‘Barrettona’). Similar results were obtained by Campa et al. (2011) for 40 wild hazelnuts collected in northern Spain and 62 locally cultivated accessions, genotyped at 13 SSR markers. The SSR data are in agreement with the general idea that most currently important hazelnut cultivars were selected over centuries from local wild populations and that a few were spread outside their areas of origin by trade and human migration (Thompson et al. 1996).

The cultivated forms tended to form two main groups, one in the Mediterranean basin in the West (Spain–Italy) and one in the Black Sea basin in the East (Turkey). These are two of the four major geographical gene pools described in the European hazelnut: English, central European, Spanish–Italian, and Black Sea (Gökirmak et al. 2009). A high level of genetic similarity between cultivars grown in the Iberian and Italian Peninsulas has been reported by other authors (Boccacci and Botta 2010; Boccacci et al. 2006; Gürcan et al. 2010a). In our study, however, almost all accessions from the Iberian Peninsula were separated from Italian ones. The cultivars from northeastern Spain (Tarragona) were more closely related to accessions surveyed in northern Spain (Asturias) than to cultivars in central and southern Italy; the landraces surveyed in Asturias showed a tendency to cluster into a separate ‘Casina’ group. These results agree with those of Campa et al. (2011) and support their hypothesis that local hazelnuts belong to the northeastern Spanish germplasm but constitute a separate group. A significant genetic differentiation between the Spanish and Italian gene pools was also observed by Boccacci and Botta (2010). Thus, it is probable that Spain and Italy are two hazelnut diversification areas, between which gene flow occurred across the western and central Mediterranean basin as a consequence of human migration and trade during and after the Roman civilization (Boccacci and Botta 2009, 2010). Among the accessions from Italy, most from central and southern areas comprised the largest gene pool, while some landraces surveyed in Liguria (‘Gianchetta,’ ‘Dell’Orto,’ ‘Tapparona,’ and ‘Del Rosso’) formed a differentiated group. A congruent topology was found in the PCA scatter plot obtained from morphological data (Fig. 1).
These genetic and morphological data indicate less gene flow between northern and southern Italy, whereas exchange of plant material very likely occurred in southern Italy between Campania and Sicily. The existence of a main gene pool in southern Italy supports the hypothesis that it was an important centre of origin and diffusion of hazelnut cultivars, as suggested by Boccacci and Botta (2009), who analyzed cultivars from Spain, Italy, Turkey, and Iran at 13 chloroplast SSR (cpSSR) loci. The grouping of several Italian landraces did not fit their geographic origin. Some accessions from Latium (‘San Vicino Vittori,’ ‘Nocciola Benedetta,’ ‘Nocciola lunga,’ and ‘Allungata’) and Liguria (‘Trietta,’ ‘Lunghera,’ and ‘Seigretta’) were genetically closer to Turkish cultivars, while others from Liguria (‘Noscello,’ ‘Menoia,’ ‘Catainetto,’ and ‘Bardina’) showed genetic similarity to Spanish accessions. It is likely that gene flow occurred from the western Mediterranean to northern Italy and from the Black Sea to northern and central Italy, most likely as a consequence of commercial exchanges. In fact, during the eleventh century, hazelnuts produced in Turkey were traded in Liguria on the Genoa market (Rosengarten 1984). These results support the hypothesis that hazelnut cultivation and cultivars were not introduced from the eastern Mediterranean/Black Sea basin into Campania and Sicily by Greeks or by Arabs, as reported by Boccacci and Botta (2009).

The genetic diversity calculated between each pair of geographical gene pools (Table 5) also supported the above-mentioned considerations: (a) high genetic differentiation between northern and southern Italian groups, (b) low genetic diversity among central and southern Italian gene pools, (c) higher genetic similarity between Iberian and northwestern Italian groups and between Balkans–Black Sea and northwestern and central Italian groups, and (d) low gene flow between southern Italy and the Black Sea. Finally, these results also indicated that northeastern Spain, southern Italy, and the Black Sea could have been three important hazelnut domestication areas.
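
For readers unfamiliar with pairwise comparisons of gene pools, the sketch below shows one way such a matrix can be computed from allele frequencies. It uses Nei's (1972) standard genetic distance purely as an illustrative stand-in; the statistic actually reported in Table 5 is not specified in this section, and the pool names, locus label, and frequencies are hypothetical.

```python
import math
from itertools import combinations


def nei_distance(freq_x, freq_y):
    """Nei's (1972) standard genetic distance between two gene pools.

    freq_x and freq_y map locus -> {allele: frequency}.
    Used here as an assumed example measure, not the study's own statistic.
    """
    loci = freq_x.keys() & freq_y.keys()
    jx = jy = jxy = 0.0
    for locus in loci:
        px, py = freq_x[locus], freq_y[locus]
        jx += sum(p * p for p in px.values())    # homozygosity in pool x
        jy += sum(p * p for p in py.values())    # homozygosity in pool y
        jxy += sum(p * py.get(allele, 0.0) for allele, p in px.items())
    if jxy == 0:
        return math.inf  # no shared alleles (or no shared loci)
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical allele frequencies at one made-up locus (not study data).
pools = {
    "pool_1": {"locus1": {176: 0.5, 180: 0.5}},
    "pool_2": {"locus1": {176: 0.5, 180: 0.5}},
    "pool_3": {"locus1": {176: 1.0}},
}
pairwise = {
    (a, b): nei_distance(pools[a], pools[b])
    for a, b in combinations(sorted(pools), 2)
}
```

Pools with identical allele frequencies give a distance of zero, and the distance grows as the pools share fewer alleles, which is the pattern summarized in points (a) to (d) above.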

Archeological findings, historical documents, pollen data, and cpSSR analysis support the hypothesis that Campania (southern Italy) was an important centre of origin and diffusion of hazelnut cultivars (Boccacci and Botta 2009). It seems likely that this germplasm originated from the post-glacial refuge in southern Italy (Palmé and Vendramin 2002) and spread around the Mediterranean Sea beginning with the Roman civilization. Our data support this hypothesis, indicating that cultivars were shared among Latium, Campania, and Sicily. Moreover, genetic relationships also showed that the Sicilian cultivars ‘Napoletana’ and ‘Napoletanedda’ were very close to those from Campania, indicating their introduction to Sicily from Campania. Thus, it can be hypothesized that a first gene flow occurred from Campania southward to Sicily and northward to Latium, while a second gene flow carried cultivars from Sicily to Latium and Calabria. These results confirm that hazelnut cultivation was not introduced to Sicily by Arabs but from Campania by the Romans. The Arabs dominated Sicily beginning in the second half of the ninth century, but hazelnuts were already being cultivated in Roman times (Boccacci and Botta 2009).

In summary, the molecular and morphological characterizations of surviving on-farm landraces were useful for identifying duplications and mistakes as well as the most interesting accessions and provided justification for their in situ preservation. These materials have been grafted and propagated on their own roots for planting in two hazelnut collections, one at IRTA in Reus (Spain) and one in the country where the material originated, for evaluation ex situ and use in breeding. Findings about genetic relationships and population structure also raise an interesting question about the origin and diffusion of the hazelnut germplasm cultivated in southern Europe. According to several authors (Boccacci and Botta 2009, 2010; Gökirmak et al. 2009; Gürcan et al. 2010a), C. avellana seems to have been domesticated independently in six different areas: the British Isles, central Europe, Spain, Italy, the Black Sea, and Iran. Our results are in agreement with these conclusions, indicating the existence of three main germplasm groups in the Mediterranean basin, which could correspond to three domestication areas: northeastern Spain and southern Italy in the West and the Black Sea region in the East. Moreover, the data indicate the existence of secondary gene pools in the Iberian (Asturias) and Italian (Liguria and Latium) Peninsulas, where local varieties were domesticated later from wild forms and/or from introduced ancient domesticated varieties, followed by a relatively local evolution that could include crosses among them and with local hazelnuts. The introduction of plant material from other areas influenced the local gene pool, but it is more likely that this was due to introgression of genes from foreign germplasm into local accessions followed by selection rather than to the direct adoption of introduced cultivars.