Introduction

Hares are widely distributed in different types of habitats in the Iranian Plateau including steppes, woodlands, semidesert areas, foothills, and fields (Blanford 1876; Lay 1967; Etemad 1978; Flux and Angermann 1990; Firouz 2008; Ziaie 2008; Smith et al. 2018); however, their taxonomic status has long been under debate. In fact, the taxonomic history of the lagomorphs of Iran is a striking example of how different the conclusions of different authors may be, even based on the same data. Additionally, interconnection of the North African hares (L. capensis sensu lato) and hares from Europe (L. europaeus) with Asian L. tolai in Iran has added more confusion (Table S1; Angermann 1983). However, no authors have considered more than two or three species to be present in Iran (Ellerman and Morrison-Scott 1951; Etemad 1978; Hoffmann and Smith 2005; Firouz 2008; Karami et al. 2008; Ziaie 2008; Smith et al. 2018), and the apparent proliferation of species is largely due to differing views on species limits and other taxonomic issues.

In particular, the L. capensis complex continues to be controversial, and the status of arabicus, tibetanus, and tolai, whether as subspecies of L. capensis or as valid species in their own right, has long been a challenge to taxonomists (Ellerman and Morrison-Scott 1951; Petter 1961; Sludskii et al. 1980; Angermann 1983; Pierpaoli et al. 1999; Ben Slimen et al. 2005, 2008; Kasapidis et al. 2005). Lepus tibetanus seems to be mainly allo- or parapatric with L. tolai, and has sometimes been synonymized with L. tolai (Angermann 1983), but is usually considered a separate species (Ognev 1940; Sokolov and Orlov 1980; Smith et al. 2018). Additionally, L. craspedotis has variously been treated as a subspecies of L. tolai (Blanford 1876), L. tibetanus (Ognev and Heptner 1929; Ognev 1940; Hoffmann and Smith 2005) or L. arabicus (Ellerman and Morrison-Scott 1951).

The main reasons for such taxonomic uncertainties in hares seem to be a lack of taxonomic agreement and comprehensive identification keys (Ognev 1940; Ellerman and Morrison-Scott 1951; Flux and Angermann 1990; Hoffmann and Smith 2005; Ziaie 2008; Smith et al. 2018), complicated and exacerbated by intraspecific variation and interspecific overlaps in morphological characters, phenotypic plasticity, and high levels of hybridization (Lado 2015). Introgression is very common in hares, especially when two or more species are sympatric. Thulin et al. (1997) have reported introgression of L. timidus mtDNA into L. europaeus in Sweden, while introgression of L. europaeus mtDNA into L. timidus has also been reported (Thulin et al. 2006; Zachos et al. 2010). Additionally, introgression of mtDNA of L. timidus into other species of hares has also been documented (Alves et al. 2003; Melo-Ferreira et al. 2005, 2007; Fredsted et al. 2006). Liu et al. (2011) reported extensive introgression between several lineages of Chinese hares, and concluded that eight species of hares exist in China: L. capensis, L. comus, L. hainanus, L. mandshuricus, L. oiostolus, L. sinensis, L. timidus, and L. yarkandensis. They found no evidence that L. capensis should be divided into L. tibetanus and L. tolai as suggested by Hoffmann and Smith (2005). The extensive gene flow that seems to occur almost any time two species occur in sympatry would be expected to diminish or muddle morphological differences between taxa, and make it problematic to clarify the taxonomic status and evolutionary history of populations and taxa in lagomorphs based on morphological data alone (Anderson 1948; Levin 1979; Wu et al. 2011).

The most recent view is that the most widespread hares of Iran are L. europaeus and L. tolai (Hoffmann and Smith 2005; Karami et al. 2008; Ziaie 2008), with reports of Lepus [capensis] arabicus and Lepus [capensis] tibetanus still needing confirmation. In this study, we re-evaluate the taxonomic status of the hares of the Iranian Plateau and adjacent areas, based on a statistical analysis of variations in morphometric and morphological characters as well as phylogenetic analyses based on mitochondrial and nuclear markers. We also use ecological niche modeling to test ecogeographic rules relating to variation in body and appendage size, and present a revised view of hare distribution and habitat preferences.

Materials and Methods

Sampling

Overall, we studied 108 specimens from Iran belonging to the genus Lepus (Tables 1 and S2; Fig. 1). In total, 83 specimens were collected from steppe, farmland, woodland, scrubland, and semidesert areas of the Iranian Plateau and adjacent areas from January 2014 to October 2016. These specimens were deposited at the Zoology Museum of Ferdowsi University of Mashhad (ZMFUM). Additionally, 25 specimens housed in the collection of the Iranian Department of Environment in Tehran were investigated (Table S2). Also, one Oryctolagus cuniculus from Sweden and four tissue samples of L. tolai from an area adjacent to the Selenga River, southeast of Lake Baikal, Russia (the type locality of L. tolai) were obtained from the collections of the Zoological Museum of Moscow University (ZMMU, Moscow).

Table 1 Specimens used in morphometric and molecular analysis, sample localities, numbers of the localities on the map, and sample size
Fig. 1 Geographical distribution and sampling localities of the genus Lepus in Iran. Circles show sampling localities (light green circles = L. europaeus; red circles = unclassified group; blue circles = the Golestan population; and olive green circle = L. tolai from the type locality). For the localities corresponding to the numbers on the map, see Table 1

External and Cranial Measurements

Specimens were weighed (grams) and four external measurements were taken using a ruler with an accuracy of 1 mm. Thirty-two craniodental variables were measured on each specimen using a Vernier caliper with an accuracy of 0.01 mm (Riga et al. 2001; Pintur et al. 2014; Fig. S1). The abbreviations of the variables are as follows: Head and body length (HBL); Hind foot length (HFL); Ear length (EL); Tail length (TL); Weight (W); Total skull length (TSL); Condylobasal length (CB); Basal length (BL); Occipitonasal length (ONL); Dental length (DL); Greatest length of nasals (GLN); Parietal length (PL); Frontal length (FL); Viscerocranium length (VCL); Length of tooth row (UTR); Length of upper diastema (LD); Palatal length (PAL); Greatest width of occipital condyles (GWC); Greatest width across openings of external acoustic meatus (GWO); Greatest neurocranium width (GNW); Width between facial tubercles (WFT); Aboral zygomatic width (AZW); Greatest width of nasals (GWN); Palatal width (PALW); Posterior zygomatic width (PZW); Anterior nasal width (ANW); External nasal length (ENL); Smallest frontal width (SFW); Post palatal width (PPW); Tympanic bulla width (TBW); Tympanic bulla length (TBL); Foramen incisivum length (FIL); Rostral width (RW); Mandible length (ML); Mandible height (MH); Length of lower cheek-teeth (LCTRL); Length of lower diastema (M4).

Identification and Morphometric Analyses

Taxonomic names and ranking followed Hoffmann and Smith (2005) and Smith et al. (2018). Sex of the specimens was determined by inspection of gonads. For morphometric analysis, the specimens were divided into three age classes based on morphological features: (1) obvious juvenile, (2) subadult, and (3) adult. The morphological characters used for the age classification include ossification of sutures and bones of the skull, non-porous supraorbital bones, and non-porous premaxillaries and corpus mandibulae. To exclude age bias, an orthogonal projection of the log-transformed data along the vector of age variation was used (Burnaby 1966). A modified Factor analysis was applied as an ordination method. First, the vector of age variation was calculated as the first eigenvector of the between-group covariance matrix computed with nested two-factor MANOVA (F = 1.03, P = 0.45). In the two-factor MANOVA, the variable containing age classes 1 and 3 (we excluded age class 2 in order to minimize errors arising from inaccuracy in the determination of age) and the identifier of the one-species geographical sample were used as the grouping variables. The age factor was nested in the geographical sample. The age-reduced dataset was then obtained by calculating the eigenvectors of the within-group covariance matrix (with geographic samples as groups) of the dataset. Second, the nested two-factor MANOVA was performed to calculate the influence of sex, with the sex factor nested in the geographical sample. Third, the initial data matrix was multiplied by the matrix of the eigenvectors. In this way, the original data were scale-unified and projected into the space of intergroup variation without alteration of the initial space (Obolenskaya et al. 2009). Principal Component Analysis (PCA) was computed on the basis of the covariance matrix (Bookstein et al. 1985) and the age-corrected data. Normality was tested using the Shapiro-Wilk test, while homogeneity of variances was checked using Levene’s test (P value ≥0.05).
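The age correction described above can be illustrated with a minimal sketch of a Burnaby-style orthogonal projection. It assumes a single, already estimated age vector and log-transformed measurements stored as rows. The function names, the toy measurements, and the age vector are hypothetical and are not taken from the study, in which the age vector was derived from the between-group covariance matrix.

```python
import math

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def project_out(rows, v):
    """Remove from each row its component along v: x - (x.v / v.v) * v."""
    vv = dot(v, v)
    return [[x - dot(row, v) / vv * vi for x, vi in zip(row, v)] for row in rows]

# Hypothetical skull measurements (mm) for three specimens, three variables each.
raw = [[80.0, 35.0, 40.0], [88.0, 38.5, 44.0], [84.0, 36.0, 43.0]]
logged = [[math.log10(x) for x in row] for row in raw]

# Hypothetical age vector (assumption: equal loading on all variables).
age_vector = [1.0, 1.0, 1.0]

corrected = project_out(logged, age_vector)
```

After the projection, every corrected row is orthogonal to the age vector, so variation along that direction no longer contributes to later ordinations.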

Extraction, Amplification and Sequencing

DNA was extracted from liver, fresh skin, kidney, and muscle using the QIA Quick DNEasy Kit (Qiagen, Inc), according to the manufacturer’s instructions. Primers used for amplification and sequencing of the mitochondrial cytochrome b (cyt b) and the nuclear transferrin (TF) are given in Table S3. The polymerase chain reaction (PCR) parameters consisted of 40 cycles: 40 s at 95 °C, 60 s at 45 °C, and 2 min at 72 °C for cyt b and 35 cycles: 30 s at 95 °C, 30 s at 62 °C, and 1 min at 72 °C for TF. PCR products were purified and cleaned of primers with 0.5 μl ExoTAP (Exonuclease I and FastAP Thermosensitive Alkaline Phosphatase) (Werle et al. 1994). The fresh samples were sequenced using three internal primers for cyt b (Table S3), and the sequences were deposited in GenBank (Table S2). To amplify and sequence the degraded DNA of old museum materials, extraction, PCR amplification, and sequencing followed the procedures described in Mohammadi et al. (2018). Additional PCR primer pairs targeting 14 short (70–300 bp) overlapping fragments of the cyt b gene were designed to sequence the degraded DNA (Table S3 and Fig. S2).

Molecular Phylogeny

We selected one mitochondrial protein-coding gene, cytochrome b (cyt b), and one nuclear locus, transferrin (TF), that consistently yielded a single amplification product and exhibited genetic variation. A total of 1138 base pairs (bp) of the complete cyt b gene were sequenced for 83 individuals from Iran for which tissues were available (including specimens from the type locality of L. craspedotis), as well as for four Lepus from the region of the type locality of L. tolai, southeast of Lake Baikal, Russia, and one O. cuniculus from Sweden. Moreover, 464 bp of transferrin were sequenced for 32 hares from Iran (at least seven selected from each main clade), one Lepus from the region of the type locality of L. tolai, and one O. cuniculus. Additionally, 35 transferrin and 72 cyt b sequences, including L. tibetanus [capensis] pamirensis (LC073697) from near the type locality of L. tibetanus, were retrieved from GenBank. The sequences were aligned and trimmed using MegAlign 4.03 in the DNAstar package (DNAstar Inc.). The choice of substitution model was determined based on the Bayesian Information Criterion (Schwarz 1978) calculated in jModeltest (Darriba et al. 2012). We reconstructed phylogenies based on single locus analysis using Bayesian inference (BI) implemented in MrBayes v. 3.2 (Ronquist et al. 2012) and Maximum Likelihood (ML) in PAUP 4.0b10 (Swofford 2003). The GTR + G + I model was implemented for cyt b and the GTR + I model for TF as suggested by jModeltest. Default priors and MCMC proposal distributions were implemented in MrBayes. The MCMC chains were run for 50 million generations, with tree topology sampled every 5000 generations and the first 25% of trees discarded as burn-in. Posterior probability (PP) values of 0.95 and higher were considered strong clade support. ML analyses were carried out under heuristic tree search with ten random addition sequence replicates and tree bisection reconnection (TBR) branch swapping. To assess support for internal nodes, we ran nonparametric bootstrapping (500 replicates) (Felsenstein 1985) under ML, with a single random addition sequence replicate per bootstrap replicate. Maximum-likelihood bootstrap (BP) values ≥75% were considered strong support. A median-joining haplotype network was generated using the program Network ver. 5.0.0.1 (Fluxus Technology, Suffolk, Great Britain).
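The sampling scheme of the Bayesian analysis implies a fixed number of retained trees, which the following minimal sketch works out. It counts a single run and ignores the initial state (both simplifying assumptions), and the function name is hypothetical.

```python
def retained_samples(generations, sample_every, burnin_fraction):
    """Return (total sampled trees, trees discarded as burn-in, trees kept)."""
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded

# Settings stated in the text: 50 million generations, sampling every 5000,
# first 25% of trees discarded as burn-in.
total, discarded, kept = retained_samples(50_000_000, 5_000, 0.25)
print(total, discarded, kept)  # 10000 2500 7500
```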

Species Data for Ecological Analyses

In total, 164 occurrence records (presence-only) were collected using a combination of the personal field expedition data and literature from Turkey, Iran, Turkmenistan, and the archive of mammals recorded in Zoological Museum of Golestan University (ZMGU) (Table S4). The dataset was manually georeferenced and structured using BioVeL services (Hardisty et al. 2016). All new data were submitted to the Global Biodiversity Information Facility (http://www.gbif.org/).

Ecological niche analysis and models were executed on the BioVeL portal (http://portal.biovel.eu/), using workflows for taxonomic refinement (Mathew et al. 2014) and Ecological Niche Modeling (ENM; Leidenberger et al. 2015). In total, 20 biogeoclimatic variables (Table S4) were obtained from http://www.worldclim.org through BioVeL’s services (generating 1 km² resolution of climatic data layers) based on average monthly climate data from weather stations on a 10 arc-second resolution grid (Hardisty et al. 2016).

Statistical Analysis of Ecological Data

Bioclimatic variables were analyzed to compare the environmental preferences of the different Operational Taxonomic Units (OTUs) of hares. To this end, we log-transformed all variables using RStudio v. 1.0.143 (RStudio Team 2017). For pairs of variables with a Pearson correlation coefficient (r) > 0.75 (Rissler et al. 2006), only one variable in each pair was retained, leaving a total of eight variables for further analysis and ecological niche modeling (Table S5). All bioclimatic variables were checked for homogeneity of the variances and pairwise differences in tolerance using Welch’s t-test and Games-Howell post hoc test, respectively. Both statistical tests were performed using SPSS version 22 (Armonk, NY: IBM Corp. 2013). To define the differences in the distribution of populations over the range of environmental variables, we generated box-and-whisker diagrams and ran a principal component analysis (PCA) on the environmental envelopes using RStudio (RStudio Team 2017).
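The correlation-based variable reduction can be sketched as follows. This is an illustrative example only: the text does not say which member of a correlated pair was retained, so the greedy "keep the first one encountered" rule, the use of the absolute value of r, the variable labels, and the toy values are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def filter_correlated(variables, threshold=0.75):
    """Greedily keep variables whose |r| with every kept variable is <= threshold."""
    kept = []
    for name in variables:
        if all(abs(pearson(variables[name], variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical values at five localities (labels are illustrative only).
variables = {
    "bio1": [10, 12, 14, 16, 18],
    "bio2": [20, 24.5, 28, 32.5, 36],    # nearly collinear with bio1
    "bio12": [300, 100, 400, 150, 350],  # weakly correlated with bio1
}
kept = filter_correlated(variables)
```

With these toy values the second variable is dropped because it is almost perfectly correlated with the first, while the third is retained.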

Testing Ecogeographic Rules with Ecological Niche Modelling

We generated ecological niche models to describe and compare the geographical space of suitable habitat as occupied by each OTU. Additionally, ENM was applied to test whether the unclassified group diffused in their distributional range neutrally or displaced their pure parental range. In this respect, the fundamental niche of each OTU was obtained applying ENM and then compared to realized niche plotted based on sampling localities. Models were built, tested, and projected using the openModeller webservice suite (Muñoz et al. 2011) and applying three algorithms, including the Environmental Distance (ED) algorithm v. 0.5 (Farber and Kadmon 2003), Maximum Entropy (MAXENT) v. 1.0 (Phillips et al. 2006), and Support Vector Machine (SVM) v. 0.5 (Schölkopf and Smola 2001). Analyses were executed using version 20 of the Ecological Niche Modelling (ENM) workflow (http://purl.ox.ac.uk/workflow/myexp-13007) in batch mode (called data sweep) to estimate potential distribution maps with favourable biotic, environmental, and geographical conditions.

Models were created with the following specifications. MAXENT models were set to run with 10,000 background points (including input points) drawn from the mask. Feature selection was automated, allowing the algorithm to combine feature types when fitting a model, and perform 500 replicates. Tolerance for detecting model convergence was set to 0.00001, while sample threshold was set to 80 (product), 10 (quadratic), and 15 (hinge). SVM models were set to execute the C-SVC algorithm with radial basis kernels, a gamma value of 1/k (where k is the number of layers), and a cost value of 1. ED models were set to run with Mahalanobis distance (Mahalanobis 1936).
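To make the distance metric used in the ED models concrete, the sketch below computes a Mahalanobis distance for two environmental variables. It is a simplified illustration, not the openModeller implementation: it measures distance to the centroid of the presence points (an assumption), handles only the two-variable case, and uses hypothetical toy coordinates.

```python
import math

def mahalanobis_2d(point, sample):
    """Mahalanobis distance from point to the centroid of a 2-variable sample."""
    n = len(sample)
    mx = sum(p[0] for p in sample) / n
    my = sum(p[1] for p in sample) / n
    # Sample covariance matrix S = [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mx) ** 2 for p in sample) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in sample) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in sample) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] S^-1 [dx dy]^T, using the closed-form inverse of a 2x2 matrix
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Hypothetical presence points in a two-variable environmental space.
presence = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
distance = mahalanobis_2d((3.0, 1.0), presence)
```

Unlike Euclidean distance, this metric rescales each axis by the variance of the presence points and accounts for covariance between variables, so a cell is judged by how unusual it is relative to the observed environmental envelope.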

Models were tested using ten-fold cross-validation to calculate the area under the curve (AUC) as well as the omission error using the lowest presence threshold (LPT). Overall, we executed nine niche models (i.e., three algorithms for three OTUs; Table 4). All niche modelling algorithms returned raster output maps with probabilistic values for favorable biotic, environmental, and geographical conditions ranging from 0 (no suitable habitat) to 254 (maximum suitable habitat). We converted all raster maps into binary outputs using a 50% threshold to indicate suitable habitat for all OTUs and models. Finally, the maps of the three algorithms were overlaid for each OTU to show the consensus among the model projections. All maps were processed using the qGIS software package v. 2.6 (Quantum GIS Development Team 2014). We also quantified the relationships of body size with latitude, annual precipitation, and annual temperature to test ecogeographic rules (Bergmann 1847; Allen 1877). For all analyses, normality of data was assessed with the Kolmogorov-Smirnov test, and body size and ecological variables were log10 transformed.
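The thresholding and overlay steps can be sketched on small grids. This is an illustration under stated assumptions: the cutoff is taken as half of the maximum value (127), cells at or above the cutoff count as suitable, and the three 2 x 2 rasters are invented toy values rather than model output.

```python
def binarize(raster, max_value=254, fraction=0.5):
    """Convert a 0..max_value suitability raster to 0/1 using a fractional cutoff."""
    cutoff = max_value * fraction  # 127.0 for a 50% threshold on a 0-254 scale
    return [[1 if cell >= cutoff else 0 for cell in row] for row in raster]

def consensus(binary_maps):
    """Count, per cell, how many binary maps mark the cell as suitable."""
    rows, cols = len(binary_maps[0]), len(binary_maps[0][0])
    return [[sum(m[r][c] for m in binary_maps) for c in range(cols)]
            for r in range(rows)]

# Hypothetical 2 x 2 outputs of the three algorithms for one OTU.
maxent = [[0, 200], [130, 254]]
svm = [[10, 126], [127, 250]]
ed = [[0, 254], [100, 128]]

agreement = consensus([binarize(m) for m in (maxent, svm, ed)])
print(agreement)  # [[0, 2], [2, 3]]
```

A cell value of 3 means that all three algorithms agree the cell is suitable, while 0 means none do.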

All data generated or analyzed during this study are included in this published article (and its additional files). Sequence data have been deposited in GenBank with the accession numbers MN098872 to MN098993. All other data are available from the authors upon reasonable request.

Results

Taxonomy

The population of hares from the west of Iran (the northwest and the southwest) exhibited morphological and morphometric characters of typical L. europaeus. The total length of the skull was more than 87 mm (Table 2). The black color on the ear was an extended broad smirch; the black mid-dorsal stripe of the tail covered only the midline. White hairs conspicuously covered the lateral upper part of the tail.

Table 2 External and morphometrical characters for studied individuals of L. europaeus, Golestan population, and unclassified group from Iran

Hares from the lowlands between the Alborz Mountains and the southeastern corner of the Caspian Sea (Golestan Province; henceforth referred to as the Golestan population) showed specific morphological and morphometric features compared to hares from other parts of Iran. The dorsal pelage was sandy brown with a grey tint. The black edge on the ears was narrow and the hairs formed an ear tuft, the mid-dorsal stripe of the tail was broad and black and covered most of the width of the upperpart of the tail, and the white color of the upperpart of the tail was restricted to a narrow line along the edge. The head and body length was less than 480 mm (407.57 ± 17.45 mm). The total length of the skull was less than 87 mm (mean value = 84.76 ± 0.77 mm). The auditory bullae of the skull were inflated, the nasals were short (GLN = 34.88 ± 0.89 mm), and the zygomatic arch was broad (AZW = 38.53 ± 1.32 mm) compared to that of L. europaeus. The average weight was 1720.53 ± 150.74 g (Table 2). This population was identified as most similar to L. tolai/tibetanus based on external morphology.

Hares from central and eastern Iran (except for the Golestan population), and some individuals from southwestern Iran, were generally smaller than both L. europaeus and the Golestan population, with a total skull length that was often more than 78 mm and less than 87 mm (Table 2). These specimens could not be identified based on current identification keys, as they showed a mixture of morphological characters between L. europaeus and the Golestan population. For instance, in some individuals, the black spot on the ear was like that in the Golestan population and in some others like that in L. europaeus. Additionally, three specimens with total body and skull lengths characteristic of L. europaeus, but with other ambiguous morphological characters, were found. This entire population thus showed signs of possible introgression or hybridization and did not conform to any known taxon, and will henceforth be referred to as the unclassified group.

Morphometric Analyses

In total, 82 specimens were included in a MANOVA test performed on morphometric characters associated with locality and sex for each OTU. The influence of sex in the morphometric analysis was not significant (P ≥ 0.45); thus, we analyzed females and males together. Descriptive statistics related to the three groups of hares from the Iranian Plateau are shown in Table 2.

Pairwise comparisons between the three groups of hares, using Tukey’s test, revealed that 89.18% of characters were significantly different (P ≤ 0.05) between the Golestan population and L. europaeus from western Iran (Table 2). The only measurements that were not significantly different were palatal length, tympanic bulla width, tympanic bulla length, and tail length. Additionally, the Golestan population differed significantly from the unclassified group in six out of 37 external and cranial characters (16.21% of characters; P ≤ 0.05): head and body length, dental length, anterior nasal width, external nasal length, foramen incisivum length, and ear length. The mean values of head and body length, dental length, anterior nasal width, external nasal length, and foramen incisivum length were significantly higher in the Golestan population than in the unclassified group. In contrast, the Golestan population has shorter ears than the unclassified group. Lepus europaeus has the highest mean value in all morphometric characters compared to the other two groups except for tympanic bulla width and tympanic bulla length, which are lower than those of the unclassified group. Moreover, the unclassified group has the lowest mean value of all measurements except for greatest width across the openings of the external acoustic meatus, greatest neurocranium width, palatal width, post palatal width, tympanic bulla width, tympanic bulla length, basioccipital length, and ear length (Table 2). There were also significant differences for 91.89% of characters between L. europaeus and the unclassified group, but no significant differences in ear length, palatal length, and tympanic bulla length. Based on Factor Analysis (FA) of the cranial characters, the first two factors explain 90.5% of variance (89.8% + 0.7%) (Fig. 2). Lepus europaeus, the Golestan population, and the unclassified group disperse along the first two axes. Lepus europaeus occupies a distinct multivariate space, but the Golestan population and the unclassified group overlap broadly.

Fig. 2 Scatter plots of FA1 and FA2 scores based on the cranial characters for hares from Iran (green = L. europaeus, red = unclassified group, and blue = Golestan population)

Based on PCA, the first two principal components explain 80.9% of total variance. The first principal component of the skull measurements explains 51.1% of overall variation. The contribution of each particular skull measurement to this component varies from 0 to 96% (TSL). The second PC explains 29.8% of overall variation and is most correlated with TBW. In concordance with the FA on the cranial characters, PCA results also indicated overlap between cranial measurements of the Golestan population and the unclassified group in eastern Iran (Supplementary Fig. S3).

Testing Ecogeographic Rules

Overall, a pattern of decreasing general body size in lower latitudes was found in hares from the Iranian Plateau. The Golestan population occurs in an area too restricted to vary geographically and was thus not analyzed for geographical variation. In L. europaeus all measurements (35 out of 41 characters including four proportional characters EL/HBL, TBW/TSL, TBL/TSL, ENL/GNW) except GNW, PALW, SFW, TBL, EL, and TL were significantly correlated with latitude (Fig. 3a), while the analysis of body size variation in relation to annual precipitation showed no significant correlation. For the unclassified group, latitude per se has less explanatory power, and just seven out of 41 characters were positively correlated with latitude (Fig. 3b). However, 32.4% of characters (12 out of 37) in the unclassified group (Fig. 3b) showed a statistically significant positive correlation with annual precipitation. In L. europaeus, a significant negative correlation with annual mean temperature in the Iranian Plateau was observed for 62.16% of characters (23 out of 37) (Fig. 3a). Notably, ear length and tympanic bulla width were positively correlated with annual mean temperature. In the unclassified group only 13.51% of characters (five out of 37) were significantly negatively correlated with increasing annual mean temperature (Fig. 3b).
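A test of this kind amounts to correlating a log10-transformed body-size character with a log10-transformed geographic or climatic variable. The sketch below shows one way to compute the slope and Pearson's r in that log-log space. The latitudes and skull lengths are synthetic values constructed for illustration, not measurements from the study, and the function name is hypothetical.

```python
import math

def log10_regression(x, y):
    """Least-squares slope and Pearson r of log10(y) regressed on log10(x)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    syy = sum((b - my) ** 2 for b in ly)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Synthetic example: skull length grows with the square root of latitude.
latitude = [27.0, 30.0, 33.0, 36.0, 39.0]
skull_length = [15 * lat ** 0.5 for lat in latitude]

slope, r = log10_regression(latitude, skull_length)
```

A positive slope with a significant r for a size character against latitude is the pattern expected under Bergmann's rule; a significance test on r would be needed in practice and is omitted here.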

Fig. 3 Relationship of body size regressed on latitude, temperature, and precipitation. The figure shows a pattern of decreasing general body size in lower latitudes in hares from Iran (a) L. europaeus, and (b) Unclassified group. The results revealed that body size is under selection in relation to environment dependence to latitude. The abbreviations are as follows: External nasal length (ENL); Occipitonasal length (ONL); Greatest length of nasals (GLN); Aboral zygomatic width (AZW); Total skull length (TSL); Condylobasal length (CBL); Dental length (DL); Rostral width (RW); Parietal length (PL); Length of upper diastema (LD); Posterior zygomatic width (PZW); Ear length (EL): Head and body length (HBL); Tympanic bulla width (TBW): Total skull length (TSL); Tympanic bulla length (TBL): Total skull length (TSL); External nasal length (ENL): Greatest neurocranium width (GNW); Greatest width of nasals (GWN); Greatest width across openings of external acoustic meatus (GWO); Mandible length (ML); Frontal length (FL)

Molecular Analysis

Mitochondrial Cyt b Results

The cyt b data consisted of 102 haplotypes among 159 investigated individuals of genus Lepus from Asia, Africa, and Europe (Fig. 4a). Network results of cyt b show that the unclassified group and L. europaeus constitute a haplogroup. Similar to the phylogenetic results (Fig. 4b), three L. tolai from the type locality are found in the same haplogroup as L. timidus and one in the same haplogroup as L. tolai.

Fig. 4 a The median-joining network of hare haplotypes obtained with 72 mtDNA sequences from GenBank and 87 mtDNA sequences provided by this study. The size of each circle is proportional to the haplotype frequency. Numbers on the lines indicate the number of mutations (no number indicates single mutation). b Phylogenetic tree based on 159 mtDNA cyt b sequences of hares from Asia, Africa, and Europe in MrBayes. Posterior probability and bootstrap supports are provided for the divergence of major lineages, respectively. Oryctolagus cuniculus is not shown on the tree. Colors represent species names; red: unclassified group; light green: L. europaeus; olive: L. tolai; violet: L. timidus; blue: L. cf. tibetanus; yellow: L. sinensis; light grey: L. capensis s. l. and light blue: L. capensis s.s., respectively (for more details see Table S2)

We obtained similar topological estimates from Maximum likelihood (ML) (not shown) and Bayesian analyses (Fig. 4b). The L. europaeus samples from Iran (A2, Fig. 4b) as well as 35 specimens of the unclassified group (A1, Fig. 4b), including topotypes of craspedotis, from eastern Iran are all part of one large, well-supported clade that also includes samples from many parts of Europe (A3, Fig. 4b). There is a tendency toward geographic structuring in the clade, between western (A3, Fig. 4b) and eastern (A1-A2, Fig. 4b) samples, with all samples from Iran falling in the eastern clade together with samples from Russia, Anatolia, and Cyprus (A1-A2, Fig. 4b), slightly diverged from a clade of L. europaeus consisting of European samples from Germany, Greece, Italy, Poland, and Sweden (A3, Fig. 4b).

The seven individuals morphologically identified as L. tolai/tibetanus from Golestan (D, Fig. 4b) clustered with one sequence of L. tibetanus [capensis] pamirensis (LC073697) and five sequences of L. capensis-2 sensu Liu et al. (2011) from northwest China. This clade (D, Fig. 4b) is sister to a clade including two sister clades (B and C, Fig. 4b), the first dominated by L. capensis sensu Liu et al. (2011), but also including sequences identified as L. tolai by other studies (B, Fig. 4b), and one sequence from an individual morphologically identified as L. tolai from near the type locality of this species. The second sister clade (C, Fig. 4b) is primarily made up of L. timidus, but also contains three sequences from individuals morphologically identified as L. tolai from the type locality of this species (Fig. 4b).

Transferrin Sequence Results

The transferrin sequence data consisted of 18 haplotypes among 67 investigated individuals of genus Lepus from Asia, Africa, and Europe (Fig. 5a). The transferrin network results, unlike cyt b, show that the unclassified group and samples from Golestan, together with one topotypical L. tolai sample, constitute a haplogroup separate from L. europaeus and L. timidus haplogroups (Fig. 5a, b).

Fig. 5 a Median-joining network obtained for the transferrin dataset. Circle sizes are proportional to the number of the same haplotypes observed in the dataset. Branch lengths are proportional to the number of mutations between haplotypes. Numbers on the lines indicate the number of mutations (no number indicates single mutation; see Table S2). b Phylogenetic tree depicting relationships of hares from Iran based on the analysis of the transferrin dataset and reconstructed using the Bayesian method. Posterior probability and bootstrap supports are provided for the divergence of major lineages, respectively. The unclassified group is highlighted in light grey and L. tolai from the type locality in dark grey. Colors indicate species according to bibliographical data: red: unclassified group; light green: L. europaeus; olive: L. tolai; violet: L. timidus; blue: L. cf. tibetanus; light orange: L. oiostolus; light grey: L. capensis s. l.

In the phylogenetic reconstruction based on the nuclear marker transferrin, all specimens morphologically identified as L. europaeus from Iran are part of a clade also containing L. europaeus from throughout southern Europe. This clade is sister to a clade containing L. timidus from western Europe, Russia, and China, although support for this topology is weak and it is not justified to consider any of the structure mentioned above as supported.

All specimens morphologically identified as L. tolai/tibetanus from the Golestan population, all individuals of the unclassified group, and one specimen from the type locality of L. tolai (Selenga River, southeast of Lake Baikal, Russia) constitute a well-supported clade, sister to a clade of GenBank sequences from North Africa referred to as L. capensis s.l. (Fig. 5b).

Ecological Preferences

In the PCA of the ecological variables, the first three principal components explain 89.3% of variance. The first factor (PC1) accounted for 45.7% of the overall variance (eigenvalue 0.013), whereas the second factor (PC2) accounted for 38.3% (eigenvalue 0.002). The third factor accounted for only 5.3% (Fig. 6). In spite of relative overlap in ecological space between L. europaeus and the unclassified group in optimal ecological tolerance, the variables related to PC1 (precipitation of warmest quarter, altitude, mean temperature of wettest quarter, and precipitation of coldest quarter) contribute toward separating the three groups (Fig. 6). In addition, PC2 (correlated with altitude, precipitation of warmest quarter, mean temperature of wettest quarter, and mean temperature of driest quarter) and PC3 (related to mean temperature of wettest quarter, altitude, precipitation of coldest quarter, and precipitation of warmest quarter) distinguished the three groups. Box plots show discrimination in preferred altitude and mean diurnal range of the suitable habitat between the Golestan population and the other two groups (Fig. 7). Moreover, the unclassified group is relatively differentiated in isothermality and precipitation of coldest quarter of the preferred habitat compared with the Golestan population and L. europaeus.

Fig. 6

Niche overlap analysis of three groups of hares from Iran showing the principal component analysis plots of environmental variables for all three OTUs of hares from Iran for eight predictor variables. Ellipses represent 57% of hypervolume for each species; dots represent presence of each group at environmentally unique locations (light green circles = L. europaeus, red circles = unclassified group, and blue circles = Golestan population)

Fig. 7

Box-and-whisker diagrams showing the variation of eight key environmental variables between and within three OTUs of hares (light green boxes = L. europaeus, red boxes = unclassified group, blue boxes = Golestan population). Variable and unit are shown on the y-axis. Color boxes indicate 50/75% of the sample points and are limited by the 1st (bottom) and 3rd quartile (top). The black line within the box displays the median

Tests of equality of means (Welch’s test) demonstrated violation of homogeneity of variances for ecological variables related to OTUs (Welch’s ANOVA; P < 0.009) (Table S6). The Games-Howell post hoc test revealed significant (P ≤ 0.008) differences between the geoclimatic tolerance of the Golestan population and that of the unclassified group in precipitation of warmest quarter, precipitation of coldest quarter, mean diurnal range, isothermality, and altitude. Between the Golestan population and L. europaeus, environmental variables of the suitable habitat differ significantly (P ≤ 0.004) in mean diurnal range, mean temperature of wettest quarter, mean temperature of driest quarter, and altitude. Additionally, between L. europaeus and the unclassified group, precipitation of the warmest quarter, precipitation of the coldest quarter, isothermality, temperature seasonality, mean temperature of the wettest quarter, and mean temperature of the driest quarter of the suitable habitat were different (P ≤ 0.007) (Table 3).
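For readers unfamiliar with Welch’s ANOVA, the test statistic weights each group by its sample size divided by its variance, so that unequal variances among OTUs do not distort the comparison of means. The sketch below, in plain Python, computes the statistic and its degrees of freedom under that standard formulation. The function name `welch_anova` and the altitude values are hypothetical, and this is not the software used in the study; the Games-Howell post hoc step is not covered.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's heteroscedastic one-way ANOVA."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight each group by n / s^2 so that noisier groups count for less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    grand_mean = sum(w * m for w, m in zip(weights, means)) / total_w

    between = sum(w * (m - grand_mean) ** 2
                  for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical altitude values (m) for three groups; NOT data from this study.
golestan = [20.0, 45.0, 60.0, 30.0]
unclassified = [900.0, 1400.0, 1100.0, 1650.0]
europaeus = [1800.0, 2300.0, 1500.0, 2600.0]
f_stat, df1, df2 = welch_anova([golestan, unclassified, europaeus])
print(f"F = {f_stat:.2f}, df1 = {df1}, df2 = {df2:.2f}")
```

The P value is the upper tail of the F distribution with (df1, df2) degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, df1, df2)` returns it.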

Table 3 Multiple comparisons of difference between means of environmental variables for three groups of hares (G.p – Golestan population, u.g. – unclassified group, L.e – L. europaeus) using Games-Howell Post Hoc test

As a whole, the lowest mean temperature in wettest and driest quarter, the highest altitude, the highest precipitation in coldest quarter, and the highest temperature seasonality contribute most to the preferable habitat for L. europaeus in the Iranian Plateau, compared to the Golestan population and the unclassified group (Table 3 and Fig. 8a). On the other hand, optimal habitat for the unclassified group in comparison to L. europaeus and the Golestan population is characterized by the highest mean diurnal range, the highest isothermality, the highest mean temperature of the wettest and driest quarter in the Iranian Plateau and adjacent areas (Table 3 and Fig. 8b). The most important ecological variables describing the range limits of the Golestan population compared to the other two groups are the highest precipitation in the warmest quarter, the lowest mean diurnal range, the lowest altitude, and the lowest isothermality (Table 3 and Figs. 7 and 8c).

Fig. 8

Maps showing the potential distribution for (a) L. europaeus, (b) unclassified group and (c) Golestan population in Iran. The color scale in the potential distribution (PD) maps indicates habitat suitability, ranging from 0 (unsuitable, in white) to 254 (maximum suitability, in dark red)

Ecological Niche Models and Distribution of Suitable Habitat

As shown by the high AUC values (all >0.9; Table 4), both the SVM and MAXENT models generated very good predictions of suitable habitat. The SVM model performed marginally better than the ED and MAXENT models for all three groups. Omission errors were acceptable (based on the acceptable omission level described in Peterson et al. 2008) for all three models (<7%), except for the Golestan population and the ED prediction for the unclassified group (12%). The restricted habitat distribution and few records of the Golestan population in Iran may have lowered the predictive power of the models for this group. SVM is the algorithm that makes the largest geographical predictions of suitable habitat (Fig. 8; light red), while MAXENT is the algorithm with the narrowest geographical predictions (Fig. 8; dark red).
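The two evaluation measures used here can be illustrated briefly. AUC is the probability that a randomly chosen presence point receives a higher suitability score than a randomly chosen background point, and the omission error is the fraction of known presence points that a model predicts as unsuitable at a given threshold. The following Python sketch shows both under those general definitions; the function names, scores, and threshold are hypothetical and do not reproduce the evaluation procedure of the modeling software.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: P(presence score > background score), ties count 0.5."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def omission_rate(presence_scores, threshold):
    """Fraction of presence points scored below the suitability threshold."""
    missed = sum(1 for p in presence_scores if p < threshold)
    return missed / len(presence_scores)


# Hypothetical suitability scores on a 0-1 scale (illustration only).
presence = [0.92, 0.81, 0.77, 0.40, 0.88]
background = [0.10, 0.35, 0.52, 0.20, 0.05, 0.61]
print(f"AUC = {auc(presence, background):.3f}")
print(f"Omission at 0.5 = {omission_rate(presence, 0.5):.0%}")
```

A model can combine a high AUC with a non-trivial omission error, which is why both measures are reported side by side.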

Table 4 Results of model tests for all models of current distribution (based on BioClimate variables)

Based on the niche modeling results, the southern part of the Caspian Sea, south Turkmenistan, and some parts of Iraq provide suitable habitat for the Golestan population. The southern Alborz and the northern coast of the Persian Gulf were also predicted to be suitable for the Golestan population, but we have no evidence of it occurring there. For L. europaeus, the high-altitude ranges of the Alborz, Zagros, and Kopet Dagh mountains were suggested to provide the most suitable habitat, but almost the entire Iranian Plateau and lower mountains of western Afghanistan (altitudes less than 2000 m a.s.l.), with the exception of the central deserts and the eastern coastline of the Persian Gulf and Oman Sea, were also predicted to be suitable. It is also predicted that geoclimatic habitats suitable for the unclassified group include the whole range of the Iranian Plateau (with the exception of the central deserts and the coastline of the Oman Sea), Anatolia, and most parts of Afghanistan except plains and deserts in the southwest and the high mountain areas (Fig. 8b). The consensus models, however, suggested that the western part of the Iranian Plateau provides the most suitable habitat for L. europaeus, while the eastern part is the most suitable for the unclassified group.

Discussion

Morphological and Genetic Taxonomic Assessment of the Hares of Iran

Previous studies have demonstrated that latitudinal and geographical adaptation as well as frequent cases of hybridization have made it difficult to determine the taxonomic status of hares. Introgressants generally show transitional phenotypic and morphometric characters of their pure parental forms (Anderson 1948; Wu et al. 2011). Confusion in diagnostic characters in a hybrid zone was also described in Barton and Hewitt (1985) and Barton (2001) and reported in hares (Thulin et al. 2006; Thulin and Tegelström 2002). Although Smith et al. (2018) mentioned that conspecificity of L. europaeus, L. capensis, and hares of North Africa (incertae sedis) is plausible, our nuclear data provide evidence of relatively high divergence between at least North African hares (incertae sedis) and L. europaeus in the nuclear transferrin gene.

In spite of the comprehensive sampling from different localities, we have identified only three groups of hares in the Iranian Plateau and adjacent areas based on morphological traits (Fig. 2), despite different species having been listed from the Plateau in previous publications (Table S1).

Taxonomic Identity of the Hares of Western Iran

The identity of Lepus europaeus in western Iran is uncontroversial. It is well differentiated from the other two groups in craniometric measurements (Table 2 and Fig. 2), and our samples also fit this species in mitochondrial and nuclear characters. The morphological characteristics of samples identified as L. europaeus are in agreement with diagnostic characters provided by Ellerman and Morrison-Scott (1951) and Palacios (1996).

Taxonomic Identity of the Hares of Golestan, Northeastern Iran

The morphological characteristics of samples from Golestan, northeastern Iran are in agreement with diagnostic characters for both L. tolai and L. tibetanus provided by Smith et al. (2018) and Ognev (1940). Angermann (1983) also referred to samples from Golestan (Gorgan and Pahlavi Desh) as L. capensis tibetanus. The main problem with the identity of these samples is overlap in diagnostic characters between L. tolai and L. tibetanus and lack of comparative material, particularly concerning DNA sequences from the type locality of L. tibetanus.

The majority of samples in the two clades in Fig. 4 that contain sequences of L. tolai (clades B and C, Fig. 4) have largely been assigned taxonomic identities other than L. tolai in the original studies. Liu et al. (2011) considered these clades to be L. capensis and L. timidus, respectively. We agree that the latter most probably represents L. timidus, but disagree regarding the former. The name L. capensis is in very broad use, often including L. europaeus and other taxa. Our study shows that both sequences of L. europaeus from Europe (clade A, Fig. 4) and sequences of L. capensis from Africa form distinct clades (clades F and G, Fig. 4), and no sequences phylogenetically close to any of these clades are represented in the material of Liu et al. (2011). Thus, using the name L. capensis for any of the samples of Liu et al. (2011) seems erroneous from a phylogenetic point of view. The identification process of Liu et al. (2011) seems to have relied on single characters judged to be diagnostic for each taxon, and it is not possible for us to assess their identifications. They found two clades representing their L. capensis, but judged that there was no ground for dividing it into L. tolai and L. tibetanus. However, in the light of the additional information provided by our data, a clade containing their L. capensis also contains three sequences of L. tolai from other studies, including one sequence from this study of a presumed L. tolai sample from the type locality. We thus suggest that there is a possibility that this clade represents L. tolai, but acknowledge that our material is insufficient for certain conclusions and that further research is needed.

The diagnostic morphological characters separating L. tolai and L. tibetanus are qualitative rather than quantitative, and many authors have considered the differences too slight to elevate these two to separate taxonomic ranks (Ellerman and Morrison-Scott 1951; Angermann 1983). In fact, the L. capensis group is taxonomically very controversial. Some authors gave species rank to L. tolai (Gromov and Baranova 1981) but merged L. tibetanus with either L. capensis (Corbet 1978) or L. tolai (Gureev 1964), while others retained both L. tolai and L. tibetanus as separate species (Ognev 1940; Hoffmann and Smith 2005; Smith et al. 2018). Angermann (personal communication) stated that “In my longterm studies I was unable to find any morphological characters separating these forms from the widespread Lepus capensis s.l.”. However, the Golestan samples form a clade with L. capensis-2 sensu Liu et al. (2011) (clade D, Fig. 4). We cannot judge the taxonomic identity of the samples of Liu et al. (2011), but also note that a single sample of L. capensis pamirensis is part of this clade. This sample came from an area geographically not far from the type locality of L. tibetanus, albeit from the other side of a major mountain range. According to Smith et al. (2018), L. capensis pamirensis is a synonym of L. tibetanus, and it is thus possible that the correct name of this clade could be L. tibetanus. However, the taxonomic ranking of tibetanus as a species is still under debate. The transferrin data are conflicting, as one sequence, identified as L. tolai from the type locality, is part of the Golestan clade. Unfortunately, Liu et al. (2011) did not use transferrin, so we cannot compare with their data. Our assessment is that both morphological and genetic data suggest that the Golestan hares are best regarded as L. tibetanus on currently available evidence, but further research is needed, primarily comparison with material from the type localities and core ranges of L. tibetanus and L. tolai.

Taxonomic Identity and Evolutionary Origin of the Unclassified Group, from Eastern Iran

The unclassified group consisted of individuals from the eastern part of the Iranian Plateau, sharing a combination of morphological characters typical of both the Golestan population and of L. europaeus, but with morphometric traits generally smaller than both (Table 2). However, the Golestan population and the unclassified group were not highly differentiated in external and craniodental characters (Figs. 2 and S3). Genetically this population was characterized by having mitochondrial DNA of L. europaeus and nuclear transferrin of the Golestan population (Figs. 4 and 5). There were no samples that had mitochondrial DNA from the presumed L. tolai/tibetanus population in Golestan and transferrin of L. europaeus. This population thus consists of individuals that appear to be of mixed ancestry.

The Nature of the Gene Flow in the Unclassified Group, from Eastern Iran

The discordance between mtDNA and nuclear phylogenies here can conceivably be interpreted as either introgression or incomplete lineage sorting. However, the Golestan population and L. europaeus from outside the hybrid zone demonstrated complete lineage sorting in both nuclear transferrin and the mitochondrial markers, making the latter a less plausible alternative. In addition, shared haplotypes between the unclassified group and L. europaeus were not randomly distributed (specimens with mitochondrial DNA of L. europaeus type and transferrin of L. tibetanus type are exclusively distributed in eastern Iran), strongly contradicting a hypothesis of retained shared ancestral polymorphisms. Our results thus indicate that the Golestan population and L. europaeus are able to hybridize, and that the hares in eastern Iran are descendants of ancestral populations of these two.
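The geographic argument above amounts to cross-tabulating each individual's marker combination against its sampling region. A minimal Python sketch of that tabulation follows; the records, the region and lineage labels, and the `classify` helper are made up for illustration and are not the study's data or code.

```python
from collections import Counter


def classify(mtdna, nuclear):
    """Label one individual by whether both markers point to the same lineage."""
    if mtdna == nuclear:
        return "concordant"
    return "discordant"


# Hypothetical records: (region, mtDNA lineage, transferrin lineage).
samples = [
    ("west", "europaeus", "europaeus"),
    ("west", "europaeus", "europaeus"),
    ("Golestan", "Golestan", "Golestan"),
    ("east", "europaeus", "Golestan"),
    ("east", "europaeus", "Golestan"),
]

# Count marker combinations per region.
counts = Counter((region, classify(mt, nuc)) for region, mt, nuc in samples)
discordant_regions = {region for (region, label) in counts
                      if label == "discordant"}
print(counts)
print("Regions with discordant individuals:", discordant_regions)
```

Under retained ancestral polymorphism, discordant individuals would be expected in several regions; confinement to a single region, as in this toy table, is the pattern the text interprets as introgression.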

We believe the most likely evolutionary background to this to be secondary contact, with one species dispersing into the range of the other. However, once two species started to hybridize, maternal mtDNA could have introgressed in both directions. In this case, almost all individuals show mitochondrial introgression from a parental population with L. europaeus mtDNA into a population with the nuclear genome of the Golestan population. The fact that we do not see the reverse case, i.e., introgression of mtDNA from the Golestan population into L. europaeus, calls for attention.

According to Haldane’s rule, the heterogametic sex is reproductively inferior (Haldane 1922); therefore, hybrids have to backcross with the most common pure parental form. Hybrid females would be expected to increase in frequency faster than male hybrids, and backcrosses between hybrid females and pure males would be more common than the other way around according to the “mother’s curse effect” (Gemmell et al. 2004; Smith et al. 2010). Because of maternal inheritance of mitochondrial DNA in mammals, hybrids commonly represent the mitochondrial genome from one species and the nuclear genome from the other parental form.

Under selective neutrality, the local population is in equilibrium but an invader is far from the original gene pool and the genome of the subpopulation of frontrunners would be expected to become gradually diluted by the genome of the local population. Therefore, there is a net flow of genes from the local population to the invaders (Currat et al. 2008; Excoffier et al. 2008). Drovetski et al. (2015) suggested that when one population expands into the range of another one, introgression of neutral nuclear loci will be asymmetric from the local to the invading population. Their data also showed an introgressive sweep of mtDNA that had the opposite direction of the nuDNA introgression. A scenario like that could explain the origin of the unclassified population, if an ancestral population of L. europaeus invaded a part of the range of the same taxon as the Golestan population, in eastern Iran. Following this scenario, the nuclear genome of the Golestan population could have swept the genome of the invaders, whereas the mitochondrial genome of the invaders could have swept the local population.

In many cases hybridization would result in loss of ecological adaptations (Johnson 2000; Land and Lacy 2000), due to outbreeding depression in hybrid individuals (Lynch 1991). Hybrid forms usually have lower fitness compared to pure parental forms, in which case the hybrid zone is often restricted to a narrow suture zone (Fickel et al. 2008; Shurtliff 2013). However, in this case the presumed hybrid population inhabits a region that appears to differ ecologically from both of the presumed parental populations (Fig. 8), suggesting a higher degree of adaptation to the conditions of the inland arid areas of the Iranian Plateau (an area of approximately 1,000,000 km²) by the unclassified population, than that of either of the parental populations.

Evidence of Gene Flow between L. tolai and L. timidus

Lepus tolai overlaps extensively with L. timidus in the type locality region (Hoffmann and Smith 2005), and of our own samples from southeast of Lake Baikal, identified as L. tolai, three show L. timidus cyt b haplotypes (clade C, Fig. 4).

Taxonomy and Geographical Distribution

As inferred from our results, the Golestan population tentatively identified as L. tolai/tibetanus according to morphology and mtDNA is distributed outside of the high Iranian Plateau, through the lowlands and semidesert areas along the southeast Caspian Sea, where fluctuations in humidity serve to moderate both monthly and annual temperature (Khalili 1973). Lepus europaeus occupies habitats in the western part of the Iranian Plateau from the northwest Alborz Mountains south through the foothills of the Zagros Mountains. This is considered the southeastern-most limit of the range of L. europaeus (Angermann 1983; Hoffmann and Smith 2005).

The unclassified group, with mitochondrial DNA of L. europaeus and the nuclear transferrin of the Golestan population, occupies most of the eastern and central parts of the Iranian Plateau. Different subpopulations of this unclassified population have previously been described as L. capensis habibi (northeast of Iran), L. c. cheybani (central parts of Iran), and L. c. petteri (southeast Iran). Although morphological and morphometric variations within the unclassified population seem to be related to ecogeographic conditions that may not contradict such phenotypic grouping, the apparent hybrid origin of the population speaks against giving taxonomic rank to its parts until more is known about the current situation regarding gene flow and possible assortative mating.

This study also sheds light on the taxonomy of L. craspedotis from Baluchistan, Iran. In spite of the attribution of craspedotis to various species (Ognev and Heptner 1929; Ognev 1940; Hoffmann and Smith 2005), our results demonstrate that the taxon craspedotis most likely represents a subpopulation of the unclassified population, characterized by the same combination of mitochondrial DNA from L. europaeus and transferrin from the Golestan population as the rest of the unclassified population. It seems that variation in morphological and morphometric characters within both L. europaeus and the unclassified introgressed population reflects adaptations to local climatic, geographical, and altitudinal factors. The morphological differences resulting from such local adaptations seem to be responsible for much of the taxonomic confusion previously surrounding the hares of Iran.

Morphological Adaptions to Local Conditions

Populations of L. europaeus from southwest Iran tend to have proportionately longer ears, compared to head and body length, than populations from farther north (Fig. 3a), which is probably an example of convergent evolution based on Allen’s rule in warm habitats (Allen 1877). Allen’s rule postulates that mammals living in a cold climate prevent heat loss by decreasing the length of appendages such as the tail and ears. In contrast, in warm climates, appendages tend to be longer in order to decrease body temperature. Furthermore, a clinal variation from north to south in general body size was observed within L. europaeus (Fig. 3a), suggesting that body size is under selection in relation to environmental conditions related to latitude. Decreasing mean values of these measurements can be an adaption to the warmer and drier climate in the south of Iran, in accordance with Bergmann’s rule (Bergmann 1847). However, we find annual mean temperature, as a climatic variable related to latitude, to show more correlation with body size in L. europaeus (Fig. 3a), while the body size of the Golestan population appears to be more correlated with annual precipitation. In both cases, this can be interpreted as a response to the factors limiting productivity in the regions. Plant productivity determines availability of food for hares, and in the high altitudes inhabited by L. europaeus it is correlated with temperature, while in the semidesert areas occupied by the unclassified group the main limiting factor is humidity supplied by precipitation. Differences in the size of the body and appendages thus seem to be a thermoregulatory response to latitudinal trends in temperature variations.
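The comparison behind Allen’s rule can be expressed as a ratio of ear length to head and body length, correlated with a climatic variable. The sketch below shows that calculation in plain Python. All measurements and temperatures are invented for illustration and are not taken from this study, and the helper names `relative_ear_length` and `pearson_r` are assumptions of the example.

```python
import math


def relative_ear_length(ear_mm, head_body_mm):
    """Ear length as a proportion of head and body length."""
    return ear_mm / head_body_mm


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented values: (ear length mm, head and body length mm, annual mean temp C).
records = [
    (105.0, 580.0, 9.0),
    (110.0, 570.0, 12.0),
    (118.0, 560.0, 16.0),
    (126.0, 550.0, 21.0),
]
ratios = [relative_ear_length(ear, body) for ear, body, _ in records]
temps = [temp for _, _, temp in records]
print(f"r(relative ear length, temperature) = {pearson_r(temps, ratios):.2f}")
```

A positive coefficient in such a table would be consistent with proportionately longer ears in warmer localities; it does not by itself separate thermoregulatory effects from other correlates of latitude.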

Conclusions

In our data, only L. europaeus in western Iran is taxonomically uncontroversial. Samples from Golestan fit L. tolai/tibetanus in morphological and mtDNA characters. These two groups are inferred to be the parental populations of a population of hybrid origin inhabiting much of eastern Iran, seemingly occupying a slightly different niche than either of the parental species.

This study provides the first evidence of introgression of mitochondrial DNA from L. europaeus into what is inferred to be L. tolai/tibetanus during ancient secondary contact, and also from L. timidus into L. tolai in the type locality of the latter. In general, the unclassified population is here interpreted to be descended from an L. tolai/tibetanus population with introgressed mtDNA from L. europaeus. The amount of admixed nuclear DNA remains to be determined. The unclassified population is morphologically most similar to the Golestan population, but shows some differences that may be related to adaptions to different altitude, precipitation, and seasonality in southeastern Iran. The maintenance of the mixed genome and morphological differences over a wide area providing a different niche than the parental populations suggests that the unclassified population may possibly be viewed as a currently independently evolving population, i.e., an incipient species.