Introduction

Persian walnut (Juglans regia L.), a monoecious tree with a long history of cultivation in the Middle East and Europe, is a major nut crop in Iran (McGranahan et al. 1998; Vahdati 2000). The high level of genetic diversity among walnuts in Iran is credited to the country being one of the origins of the species (Ebrahimi et al. 2011; Karimi et al. 2015; Khadivi-Khub et al. 2015b; Vahdati et al. 2015). A heterodichogamous pollination system combined with thousands of years of sexual propagation in Iran has allowed the species to retain a high level of phenotypic and genetic diversity as well as a variety of vegetative growth and nut characteristics (Khadivi-Khub et al. 2015b). However, this diversity has been problematic when selecting stock for establishment of modern orchards capable of producing a superior product for export (Ebrahimi et al. 2011; Khadivi-Khub and Ebrahimi 2015). The Azadshahr region, located in north-eastern Iran, is one of the main sites of walnut production and one of the primary areas for walnut orchard expansion in the country. The majority of walnut cultivation occurs in the southern part of Azadshahr within the Rudbar, Kashidar and Vamenan villages. Unique to this region is the dramatic climate variation near the Hyrcanian forest. The Hyrcanian forest is well suited to natural walnut distribution (Jafari Sayadi et al. 2012), and the subtropical microclimates within its range are useful for selecting low-chill walnut genotypes. While morphological studies of nut traits are necessary for selecting superior trees in perennial nut species (Fatahi et al. 2010; Karimi et al. 2015; Khadivi-Khub 2014), genetic studies have also been undertaken in Iran (Ebrahimi et al. 2011; Karimi et al. 2015; Vahdati et al. 2015) and elsewhere. Morphological markers are fairly simple to use for evaluation of genotypes but can be influenced by plant growth stage and various environmental factors (Nadeem et al. 2018).
Several types of molecular markers have been employed for assessment of genetic diversity and relationships in walnut, such as RAPD (Fatahi et al. 2010; Woeste et al. 1996), ISSR (Christopoulos et al. 2010), AFLP (Bayazit et al. 2007) and SSR (Ebrahimi et al. 2011; Foroni et al. 2006; Vahdati et al. 2015). While all of these techniques provide useful information, microsatellites are ideal for characterizing relationships among individuals because of their co-dominant inheritance, hyper-variability, and reproducibility of genotyping results in a laboratory setting (Foroni et al. 2006; Weising et al. 2005).

The objective of our research was to combine morphological and genetic methods to look for a correlation between superior nut quality traits and genetic markers. We also wanted to assess the overall diversity of Persian walnut genotypes in Azadshahr using morphological and nut traits, and to determine the genetic structure of, and gene flow among, selected superior genotypes using microsatellite DNA markers.

Materials and methods

Plant materials and evaluation

Research was conducted in 2015 in the Azadshahr region of Golestan province in north-eastern Iran (Fig. 1). Tree density and accessibility varied within each site; therefore, we grouped trees into four geographic populations (Rudbar, Saidabad, Kashidar and Vamenan) (Table 1, Fig. 1). Trees within 15 km of each other were considered a part of the same population (Malvolti et al. 1993). Populations consisted of mature walnut trees, all originating from open pollinated seedlings from orchards or on home sites. Between 15 and 40 individual trees were selected for study from each population (N = 102 total individuals sampled) depending on availability. We used a modified IPGRI walnut descriptor (International Plant Genetic Resources I 1994) to assess 28 morphological characteristics related to fruit, leaf, and tree growth. Finally, 39 superior phenotypes, based on the three most valuable fruit characteristics, were selected for further genetic analysis (Table 2).

Fig. 1 Geographic location of collection sites for the studied walnut genotypes from Golestan Province, Azadshahr area in Northeastern Iran (for explanation of region codes, see Table 1)

Table 1 Collection locations: basic geographical information for each study site in Azadshahr, Iran
Table 2 Pomological traits: mean values for pomological traits in 39 superior walnut genotypes selected for SSR analysis

DNA extraction

Six young leaves were collected from each individual tree in summer 2014 for use in DNA extractions. Genomic DNA was extracted using a modified salting-out method (Miller et al. 1988). We ground 1 g of fresh leaves in liquid nitrogen, mixed the powder with 2 × volume of extraction buffer (10 mM Tris–HCl, 1% Triton X-100, 5 mM MgCl2 and 0.32 M sucrose, pH 8), centrifuged the sample (7000 rpm for 5 min), and discarded the supernatant. We then added 600 μl of lysis buffer (10 mM Tris–HCl, 2 mM EDTA, 400 mM NaCl, pH 8.2) to the pellet, vortexed for 15 s, added 40 μl of 10% SDS and 70 μl of proteinase K, and incubated the sample overnight in a water bath (37 °C). After digestion was complete, we added 200 μl of saturated NaCl (6 M), shook gently, centrifuged the sample (2500 rpm for 15 min), and collected the supernatant. The supernatant was then mixed with an equal volume of chloroform and centrifuged (3000 rpm for 10 min). We then added 0.1 volume of sodium acetate (3 M) and precipitated the DNA with ethanol. DNA was quantified spectrophotometrically (260 and 280 nm) and visualized on 0.8% agarose gels stained with ethidium bromide.

SSR primers and PCR amplification

Ten primer pairs developed for black walnut were selected based on polymorphism information content (PIC), heterozygosity rate, and number of observed alleles (Dangl et al. 2005; Wang et al. 2008; Woeste et al. 2002) (Table 3). Primers were synthesized by Tacapo Zist Co. (Iran). Polymerase chain reaction (PCR) amplifications were performed in 12 μl reactions: 6 μl of PCR master mix 1 × (CinnaColon, Iran), 2 μl primer pair (10 pM) and 1 μl (100 ng) template DNA. Touchdown PCR was performed as follows: 5 min at 95 °C, then 35 cycles of 1 min at 95 °C, 1 min at the annealing temperature (see Table 3), and 2 min at 72 °C, followed by a final extension at 72 °C for 10 min. PCR products were stored at 4 °C until needed. All PCR products were visualized first on 2% agarose gels in 1× TAE buffer and then on a 1 mm thick, 6% non-denaturing polyacrylamide gel. Gels were pre-run at 90 W for 2 h. DNA bands were visualized by silver staining (Bassam et al. 1991). Alleles were sized and scored visually using a transcription ruler alongside a reference DNA ladder (Jena Bioscience GmbH).

Table 3 Microsatellite loci: properties of microsatellite loci used to characterize 39 walnut genotypes from Azadshahr, Iran

Data analysis

Mean parameter values for each individual were used for statistical analysis of morphological traits. Genotypes were clustered on pairwise morphological distances using Ward’s method (Anderberg 1988), and the dendrogram based on morphological data was constructed using SPSS 16 software. PCR products from the 39 superior phenotypes were scored visually, and the following parameters were calculated for each locus using GenAlEx 6.3 software (Peakall and Smouse 2006): number of alleles (Na), number of effective alleles (Ne), Shannon’s information index (I), observed heterozygosity (Ho), and expected heterozygosity (He). A phenogram for the 39 individuals was constructed using UPGMA (unweighted pair-group method using arithmetic average) based on the simple matching dissimilarity matrix of allelic data in DARwin6 software (Perrier and Jacquemoud-Collet 2006). Bootstrap analyses were performed with 1000 replications. GenAlEx 6.3 software was used to generate the principal co-ordinate analysis (PCoA) based on the dissimilarity matrix (Peakall and Smouse 2006). Population structure was analyzed with STRUCTURE software (Pritchard et al. 2003): values of K (hypothesized number of populations) from 1 to 8 were tested by running 20 replicate simulations per K with 750,000 Markov chain Monte Carlo repetitions and 50,000 burn-in iterations. The most likely K was selected using the delta K method proposed by Evanno et al. (2005).
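The per-locus parameters listed above can be illustrated with a minimal sketch. It assumes the textbook definitions (Ne = 1/Σp², I = −Σp ln p, He = 1 − Σp², Ho = proportion of heterozygous individuals); the function name `locus_diversity` and the allele sizes are hypothetical, and the sketch is not a reproduction of the software output used in this study.

```python
import math
from collections import Counter

def locus_diversity(genotypes):
    """Diversity parameters for one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    total = len(alleles)
    freqs = [count / total for count in counts.values()]
    sum_p2 = sum(p * p for p in freqs)  # expected homozygosity
    return {
        "Na": len(counts),                                  # number of alleles
        "Ne": 1.0 / sum_p2,                                 # effective alleles
        "I": -sum(p * math.log(p) for p in freqs),          # Shannon's index
        "Ho": sum(a != b for a, b in genotypes) / len(genotypes),
        "He": 1.0 - sum_p2,                                 # expected heterozygosity
    }

# Hypothetical allele sizes (bp) for four trees at a single locus
example = [(104, 110), (104, 104), (110, 116), (104, 116)]
stats = locus_diversity(example)
print(stats)
```

With these made-up genotypes the allele frequencies are 0.5, 0.25 and 0.25, so Ne is 1/0.375 and He is 0.625, while three of the four trees are heterozygous (Ho = 0.75).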

Results

Morphological diversity

Variability was observed among the individuals for all quantitatively measured traits (Table 4). Individuals Ka17 and Va31 had the highest average nut weight (19.79 g), while Va31 had the highest kernel weight (9.4 g) (Table 2). Among the individual walnuts studied, nut weight ranged from 6.1 to 19.79 g (Table 4). Kernel weight and kernel fill percentage ranged from 2.9 to 9.4 g and 26.51 to 60.34%, respectively (Table 4). The greatest variance was observed in kernel fill percentage and leaf length (Table 4). Thirty-nine superior walnut phenotypes, chosen mainly on the three most valuable fruit characteristics (kernel weight, nut weight and kernel fill percentage), were selected for further genetic analysis (Table 2).

Table 4 Physical descriptions: descriptive analysis of morphological traits in the studied walnuts

Cluster analysis of morphological data

Cluster analysis of morphological data showed that individuals clustered based on geographical region. Individuals sorted into four main groups that correlated closely with geographic location (Fig. 2). Most individuals of the Vamenan and Kashidar regions and two Rudbar individuals (Rud4 and Rud11) sorted into the second and third clusters, although a few (Ka20, Va22, Va24, Va33, and Va42) sorted into the first cluster. The Vamenan and Kashidar groups were closer to each other than to any other group and were similar in elevation (Table 1). Nearly all individuals collected from Saidabad were assigned to the third cluster. Saidabad individuals were geographically separate from the other selections and also grew at much lower elevations (Table 1). The fourth cluster was composed primarily of Rudbar individuals, with a few Vamenan and Kashidar individuals and a single Saidabad sample included. The Rudbar region lies between Kashidar and Vamenan in elevation and between Saidabad and Vamenan geographically (Fig. 1).

Fig. 2 Dendrogram. Clustering analysis for 102 walnut individuals collected from the Kashidar (Ka), Rudbar (Rud), Saidabad (Sai), and Vamenan (Va) regions of Azadshahr based on 28 morphological traits

Allelic variation at microsatellite loci

All selected primer pairs were polymorphic and yielded a total of 79 alleles ranging in size from 104 to 268 bp (Table 3). The number of alleles per locus averaged 7.9 and ranged from 3 in WGA005 to 12 in WGA054 (Table 5). The Ne value ranged from 2 in WGA005 to 9.55 in WGA054 (Table 5). Shannon’s information index (I) averaged 1.34; the most informative locus was WGA054 (I = 2.35) and the least informative was WGA005 (I = 0.85). The He value ranged from 0.5 in WGA005 to 0.9 in WGA054, with an average of 0.67 across all loci (Table 5). The Ho value ranged from 0.54 in WGA202 to 0.93 in WGA001 and WGA276, with an average of 0.70 across all loci (Table 5). Ho exceeded He at the WGA001, WGA005, and WGA276 loci.

Table 5 SSR diversity: SSR diversity parameters in the studied Azadshahr, Iran populations
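As a rough consistency check on the extremes reported for Table 5, the relation below links Ne and He. It assumes the textbook definitions of both indices in terms of allele frequencies p_i; the source does not state which formulas the software applied, so this is an illustration and not a description of the authors' calculation.

```latex
N_e = \frac{1}{\sum_i p_i^2}, \qquad
H_e = 1 - \sum_i p_i^2
\quad\Longrightarrow\quad
H_e = 1 - \frac{1}{N_e}
% WGA005: N_e = 2    gives H_e = 1 - 1/2    = 0.50
% WGA054: N_e = 9.55 gives H_e = 1 - 1/9.55 \approx 0.90
```

Under that assumption, the reported He values for WGA005 (0.5) and WGA054 (0.9) agree with the reported Ne values (2 and 9.55) after rounding.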

Population structure analysis, clustering and principal coordinate analysis (PCoA) of SSR genotypes

Similarity coefficients among genotypes spanned a wide range (0.04–0.80), underscoring the high diversity of genotypes found in the Azadshahr region of Iran. The greatest genetic similarity was observed between Va2, Ka11, and Sai5 with the Rud9, Va4, and Ka20 genotypes. The SSR genotyping results were used to perform population structure analysis for the 39 walnut genotypes under an admixture model using the STRUCTURE program (Pritchard et al. 2003). Figure 3 shows that the highest delta K (Evanno et al. 2005) was obtained at K = 4, suggesting that the population used in this study can be divided into four clusters. Clustering analysis using the 10 SSR loci divided the observed genotypes into three main groups (Fig. 4). The cophenetic correlation coefficient between the dendrogram and the original similarity matrix was significant and relatively high (r = 0.87). The PCoA explained 23.35% of the variance on the first axis and 16.5% on the second axis (Fig. 5). In the PCoA, genotypes sorted clearly into four distinct groups, which aligned with the cluster analysis (Fig. 5). All genotypes collected from Saidabad sorted into the first cluster (Cluster 1) (Fig. 4). Application of PCoA to these samples resulted in two subgroups, G1a and G1b. Group G1a included Sai3, Sai5, Sai11, Sai15, and Sai16 from Saidabad and Va33 from Vamenan. Group G1b was composed of Sai1, Sai2, Sai6, Sai12, Sai13, Sai14, Va22, Va24, Va42, Ka20, Rud2, and Rud8 (Fig. 5). The members of this subgroup, except Rud8, were all placed in the second group (G1b). The second cluster, which included five Vamenan, two Rudbar, and seven Kashidar genotypes, matched well with group 3 (G3) of the PCoA (Figs. 4, 5). Cluster 3 was made up of the fourth group of genotyped samples: Ka13, Ka15, Ka21, Va4, and Va9. Although the PCoA placed Va30 and Rud13 in G3, the dendrogram sorted them into group 4 (G4) (Fig. 5).
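The delta K criterion used to choose K can be sketched as follows. This is a minimal illustration of one common way to compute it (absolute second difference of the mean log-likelihood across replicate runs, divided by the standard deviation of the replicates at that K); the function name `delta_k`, the use of the sample standard deviation, and the log-likelihood values are assumptions made for the example, not data from this study.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K -> list of ln P(D) values from replicate runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # second difference is undefined at the ends of the range
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / spread
    return result

# Hypothetical ln P(D) values, three replicate runs per K
runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-880, -890, -870],
    4: [-875, -885, -880],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Delta K cannot be evaluated at the smallest or largest K tested, which is one reason the tested range (here K = 1 to 8) is usually wider than the range of plausible values.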

Fig. 3 a Estimated delta K of walnut population structure from Bayesian Structure analysis. b Triangle plot of Structure analysis when K = 4 and the colors correspond to prior population labels. For a given point, each of the three components is given by the distance to one edge of the triangle. Individuals who are in one of the corners are therefore assigned completely to one population or another. c STRUCTURE analysis for 39 walnut genotypes when K = 4

Fig. 4 Cluster analysis of 39 walnut genotypes collected from the Vamenan (Va), Kashidar (Ka), Rudbar (Rud) and Saidabad (Sai) areas of Azadshahr, Iran based on 10 simple sequence repeats (SSRs). G1 (Group 1), G2 (Group 2), G3 (Group 3) and G4 (Group 4) represent PCoA results. The numbers on the branches represent bootstrap values
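The dissimilarity matrix behind a dendrogram of this kind can be illustrated with a short sketch. It assumes a simple matching dissimilarity for diploid allelic data defined as one minus the average proportion of shared alleles per locus; this definition, the function names, and the allele sizes are assumptions made for the example and may differ from the exact computation performed by the software used in the study.

```python
from collections import Counter

def shared_alleles(genotype_a, genotype_b):
    """Number of alleles two genotypes share at one locus (multiset overlap)."""
    return sum((Counter(genotype_a) & Counter(genotype_b)).values())

def simple_matching_dissimilarity(ind_a, ind_b, ploidy=2):
    """Dissimilarity between two individuals typed at the same loci.

    ind_a, ind_b: lists of per-locus genotypes, e.g. [(104, 110), (200, 204)].
    """
    if len(ind_a) != len(ind_b):
        raise ValueError("individuals must be typed at the same loci")
    matched = sum(shared_alleles(a, b) / ploidy for a, b in zip(ind_a, ind_b))
    return 1.0 - matched / len(ind_a)

# Hypothetical genotypes at two loci (allele sizes in bp)
tree_1 = [(104, 110), (200, 204)]
tree_2 = [(104, 104), (204, 200)]
d = simple_matching_dissimilarity(tree_1, tree_2)
print(d)
```

In this example the two trees share one of two alleles at the first locus and both alleles at the second, so three of four allele copies match. Allele order within a genotype does not matter because the comparison is made on multisets.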

Fig. 5 Principal coordinate analysis (PCoA) via distance matrix of 39 walnut genotypes from four populations within Azadshahr, Iran based on 10 simple sequence repeats (SSRs). Solid lines represent cluster analysis (cluster) results

Population analysis

The pairwise population dissimilarity matrix of Nei’s genetic distance among walnut populations is presented in Table 6. Genetic distances ranged from 0.218 between Vamenan and Rudbar to 0.569 between Kashidar and Saidabad. There were no significant differences among the four studied J. regia populations in Ho, He, Ne, or I (Table 7). The Vamenan population had the highest Ne (4.26), I (1.57), Ho (0.74) and He (0.73) values of the four populations (Table 7), which may be related to high genetic diversity in this area. AMOVA revealed that most of the diversity was located among individuals (92.58%) rather than among subpopulations.
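For readers unfamiliar with the distance measure, the sketch below computes Nei's standard genetic distance (the 1972 form, D = −ln I, where I is the normalized identity of allele frequencies summed over loci). The text does not say which variant of Nei's distance was used, so the choice of the 1972 form is an assumption, and the function name and allele frequencies are hypothetical.

```python
import math

def nei_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency.
    Loci must be listed in the same order in both populations.
    """
    jx = jy = jxy = 0.0
    for freq_x, freq_y in zip(pop_x, pop_y):
        jx += sum(p * p for p in freq_x.values())
        jy += sum(q * q for q in freq_y.values())
        # Alleles absent from one population contribute zero to the cross term
        jxy += sum(p * freq_y.get(allele, 0.0) for allele, p in freq_x.items())
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)

# Hypothetical allele frequencies at a single locus
pop_a = [{104: 0.5, 110: 0.5}]
pop_b = [{104: 0.5, 116: 0.5}]
distance = nei_distance(pop_a, pop_b)
print(round(distance, 4))
```

Two populations with identical allele frequencies give a distance of zero, and the distance grows as fewer alleles are shared, which is the sense in which the Kashidar and Saidabad pair is the most differentiated in Table 6.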

Table 6 Genetic distance: pairwise population dissimilarity matrix of Nei’s genetic distance among walnut populations in Azadshahr, Iran
Table 7 Genetic diversity of the studied walnut populations in Azadshahr area, Iran

Discussion

The analysis of morphological characteristics showed that individuals from the Azadshahr region display high levels of diversity in assorted pomological traits such as kernel fill percentage, nut size, nut weight, and ease of kernel separation. The greatest nut weight (19.79 g) in our study was less than the nut weight reported for walnuts in Fars province, Iran (20.28 g) (Khadivi-Khub et al. 2015a), Adilcevaz, Turkey (23.81 g) (Sen and Tekintas 1990) and Kamal-Abad, Iran (20 g) (Atefi 2001). Our nut weights were greater than those reported in Himachal Pradesh, India (18.60 g) (Sharma and Sharma 1998), east Anatolia, Turkey (17.04 g) (Yarilgac et al. 1999), north-eastern Anatolia, Turkey (16.01 g) (Aslantaş 2006), and the Oltenia region of Romania (18.40 g) (Cosmulescu and Botu 2012). Ideal kernels weigh 6–10 g and have a kernel fill percentage of at least 50% (Khadivi-Khub et al. 2015a). Also, the most desirable walnut types should have a shell thickness between 0.70 and 1.50 mm (Akca and Ozongun 2004). Twelve of our identified genotypes neared the preferred weight and kernel fill range. All of our genotypes had shell thickness measurements greater than 1.17 mm. Khadivi-Khub et al. (2015a, b) reported that shell thickness in 142 of 232 walnut genotypes collected from the Fars province in Iran was roughly 0.92 mm. This disparity may result from genetic differences between local populations or could potentially be linked to climate effects. The Azadshahr region has a shorter growing season, greater humidity, and a warmer winter than the Fars province. Both nut and kernel quality are strongly affected by genotype, environment, and their interaction (McGranahan and Leslie 2012). Kernels in 62 of the 102 genotypes in this study were very lightly pigmented, and 30 genotypes had kernels that were easily extracted from the shell. In general, the most remarkable genotypes have lightly colored kernels that are simple to extract from the shell. These two traits were key in previous selections of superior walnut genotypes (Arzani et al. 2008; Cosmulescu and Botu 2012; Yarilgac et al. 1999; Yarilgac et al. 2001).

The average number of alleles per locus was 7.9 (Table 5). Karimi et al. (2015) studied the diversity of Persian walnut populations in the northern and western regions of Iran using 11 SSR loci and obtained 5.7 alleles per locus. Fingerprinting of Persian walnut genotypes in Iran using nine SSR loci (Ebrahimi et al. 2011) yielded 5.1 alleles per locus. Similarly, 5.1 alleles per locus were reported by Ruiz-Garcia et al. (2011) in characterizing 57 walnut cultivars originating from Spain and the United States. The average allele number in our study was greater than in these studies because we utilized more geographically and genotypically diverse walnut populations. Our mean allele number was lower than those reported by Doğan et al. (2014) and Wang et al. (2008); however, those studies used different Juglans species and hybrids. The observed heterozygosity of 0.70 for our selected SSR markers was similar to that reported previously by Ebrahimi et al. (2011) (0.72) and Foroni et al. (2006) (0.68), who also showed high levels of heterozygosity among different walnut genotypes. Five of the ten loci used in this study (WGA001, WGA069, WGA276, WGA032 and WGA054) amplified the same number of alleles previously reported by Dangl et al. (2005) and Woeste et al. (2002), whereas the remaining five loci generated a higher number of alleles. These differences likely result from the high levels of genetic variation within the walnut germplasm studied. We found a higher percentage of private alleles, 20.25% (16/79), than the 14% reported for Persian walnut populations in south-eastern Iran by Vahdati et al. (2015) (Table 3). The quantity of private alleles in our study may be related to (1) low sample sizes or (2) the non-random selection of superior walnut phenotypes for genotyping.

Cluster analysis of morphological data showed that individuals clustered based on geographical region. The Saidabad population had the highest dissimilarity distance when compared to the other populations. The clustering analysis and PCoA of SSR data clearly indicated that Saidabad genotypes clustered away from the other groups despite moderate levels of polymorphism within the SSRs. A few individuals from Vamenan (Va33, Va22, Va24 and Va42), Kashidar (Ka20) and Rudbar (Rud2 and Rud8) also clustered in the first group (Cluster 1) alongside some Saidabad genotypes. Saidabad village is farther north than the other studied regions and its elevation is significantly different (Fig. 1). The population analysis based on the dissimilarity matrix of Nei’s genetic distance also confirmed this result. Cluster and principal coordinate analyses based on SSR data produced similar results with a few differences. PCoA sorted the genotypes into four main groups and confirmed the K value from the STRUCTURE analysis, while the dendrogram showed three main groups. Principal co-ordinate analysis is known to be less sensitive to distances between close neighbours, but represents distances between clusters more accurately. Meanwhile, cluster analysis generally reproduces distances between close neighbours faithfully, but shows distortion amongst members of different clusters (Sneath and Sokal 1973). Differences in genetic diversity parameters between the four studied walnut populations were few, while greater diversity existed among individuals. These observations were supported by AMOVA, in which more than 92% of diversity could be attributed to individuals. The Vamenan population displayed the highest genetic diversity in the area.

Conclusion

Persian walnut populations in north-eastern Iran maintain high levels of morphological and genetic variation. Fruit morphology traits (nut size, kernel weight, etc.) have long been used to select individuals for improvement of orchards in Iran. Unfortunately, selection using these traits alone has not proven to be a reliable method for estimating future production. The walnut genotypes in our study have many diverse nut traits; however, shell thickness, a trait important for nut improvement through breeding and selection, is sub-par. Evaluating superior trees based on shell thickness and kernel quality can improve selection of promising genotypes for breeding. Diversity among Azadshahr region walnut genotypes was confirmed with morphological and DNA markers. These results illustrated the effectiveness of SSR markers for genotype differentiation. Cluster analysis using morphological and SSR markers clearly separated Saidabad genotypes from other populations and indicated that geographical distribution may have led to genotypic isolation. This isolation may have resulted from hills or mountains acting as geographical barriers or from other environmental factors (e.g. rainfall, soil type). Collection of additional genotypes from other Iranian locations for comparison with those of the Azadshahr region may further strengthen our results.