Introduction

Biodiversity is being lost at an astonishing rate due to forces associated with human population growth, habitat alteration and global climate change (McNeely et al. 1990; Chapin III et al. 2000). The situation is particularly troublesome for freshwater ecosystems, as freshwater biodiversity has declined at a faster rate compared to terrestrial and marine systems (Ricciardi and Rasmussen 1999; Jenkins 2003). Freshwater ecosystems contain 0.01% of the world’s water by volume; yet these systems are inhabited by 6% of all described species (Dudgeon et al. 2006). Fishes are the most-studied indicators of biodiversity decline in freshwater ecosystems (Moyle et al. 2011) but our knowledge and understanding of freshwater fish diversity, patterns of endemism, and genetic variation is limited, with considerable amounts of diversity yet to be described (Myers et al. 2000; Abell et al. 2008).

With limited resources for conservation, it is vital that ecologists are able to identify the appropriate level of conservation units for protection. Species have long been considered the foundation of biodiversity (McNeely et al. 1990), and a standard metric to monitor environmental conditions (Noss 1990). However, with the expansion of genetic data in conservation, and the uncertainty in species designation, conservation managers have begun to incorporate diversity below the taxonomic species level (i.e. subspecies, evolutionarily significant units (ESUs), distinct population segments). Delimitation of fauna into definable conservation units is especially difficult for organisms that are morphologically indistinguishable or cryptic (Bernardo 2011). Cryptic diversity is commonly misclassified by traditional approaches, where two or more units are mistakenly classified as one. With the increase in DNA-based studies, cryptic diversity is rapidly being discovered across taxa and bioregions (Bickford et al. 2007; Pfenninger and Schwenk 2007). Once cryptic diversity has been identified by molecular analysis, whether and how it is delimited into a conservation unit, such as a “species” or an “ESU”, is controversial (Niemiller et al. 2013; Murphy et al. 2015). Though the identification and recognition of cryptic diversity further complicate taxonomy and conservation management, they provide a more realistic picture of biodiversity and extinction risks. An accurate accounting of cryptic diversity is essential for identification of biodiversity hotspots, species distributions, and reserve design (Bickford et al. 2007; Bernardo 2011; Piggott et al. 2011).

Speckled dace (Rhinichthys osculus), a small (typically less than 100 mm fork length) freshwater cyprinid that occurs throughout much of the western United States, displays considerable amounts of cryptic diversity that have yet to be categorized (e.g., Hoekzema and Sidlauskas 2014). Speckled dace inhabit a wide assortment of habitats (Moyle 2002) but only subtle morphological differences have been recorded between speckled dace populations exhibiting species level genetic divergence (Hoekzema and Sidlauskas 2014). Presently, speckled dace are considered to be a single, wide ranging species comprised of a number of subspecies. There are four subspecies listed under the Endangered Species Act and four subspecies considered to be of conservation concern in California (Moyle et al. 2015).

Several phylogenetic studies have undertaken the task of untangling the evolutionary history of speckled dace (Oakey et al. 2004; Pfrender et al. 2004; Smith and Dowling 2008; Ardren et al. 2010; Billman et al. 2010; Hoekzema and Sidlauskas 2014). Molecular assessments have shown that speckled dace represent a polyphyletic taxon containing a number of undescribed taxa (Ardren et al. 2010; Hoekzema and Sidlauskas 2014). In general, these studies have shown that deep genetic divergence occurs among speckled dace isolated in separate river basins. Diversification of speckled dace, like many freshwater fishes of the American west, can be attributed to complex geologic and climatic processes that caused extended periods of isolation between basins, interspersed with episodes of dispersal associated with drainage rearrangements (Minckley et al. 1986).

Understanding of speckled dace taxonomy and evolutionary history is complicated by the potential for hybridization. Genetic studies investigating hybridization in speckled dace are limited, but evidence of hybridization between introduced speckled dace and relict dace (Relictus solitarius) in the Great Basin has been reported (Houston et al. 2012). Also, Smith (1973) suggested hybridization between speckled dace and two other species (longnose dace Rhinichthys cataractae and redside shiners Richardsonius balteatus) based upon morphological analyses.

Our research investigates speckled dace in the Klamath–Trinity Basin in south central Oregon and northwestern California. Due to genetic distinctiveness and geographic isolation, speckled dace occurring in the Klamath–Trinity Basin are recognized as an endemic subspecies, Klamath speckled dace (R. osculus klamathensis) (Oakey et al. 2004; Pfrender et al. 2004). Klamath speckled dace are nearly continuously distributed throughout the entire Klamath–Trinity Basin and occur at high abundance in many areas in the basin (Moyle 2002). While speckled dace are common and native to the Klamath–Trinity Basin, they are absent from all adjacent coastal basins, with the exception of introduced populations (Moyle 2002).

Two studies have resolved evidence for multiple speckled dace lineages within the Klamath–Trinity Basin. First, Pfrender et al. (2004) uncovered evidence of two highly divergent mitochondrial DNA haplotypes in the Upper Klamath River and hypothesized that there might be two reproductively isolated forms of speckled dace co-occurring in the region. Second, Kinziger et al. (2011) found deep divergence in nuclear microsatellites and mitochondrial DNA between speckled dace occurring in the Trinity River system versus those from the Klamath River system. While these studies suggest the presence of multiple distinct genetic groups, the numbers and distributions of the previous sampling locations were insufficient to provide a clear understanding of speckled dace genetic structuring within the Klamath–Trinity Basin.

In this study, we conducted a higher resolution assessment of speckled dace genetic structure in the Klamath–Trinity Basin using both nuclear microsatellites and mitochondrial DNA markers over a broader geographic range of sampling sites. Our objective was to identify cryptic genetic groups, their geographic boundaries, contact zones and levels of hybridization. Overall, our analysis resolved highly complex patterns of genetic structuring, including (1) deep genetic divergence between speckled dace in the Trinity River system and the Klamath River system (hereafter referred to as the “Trinity” and “Klamath” groups, respectively; Fig. 1), (2) evidence for hybridization between the Klamath and Trinity groups near the confluence of the Klamath and Trinity rivers, specifically in Tish Tang Creek (TT) and the Salmon River (NFS and SFS; hereafter these three sites are referred to as Klamath X Trinity hybrids), (3) identification of a genetically distinctive group of speckled dace isolated by a waterfall in Jenny Creek (JEN), a tributary to the Klamath River, and (4) an extremely rare mitochondrial haplotype (SEV01) restricted to the upper reaches of the Klamath River basin that is highly divergent from all other speckled dace examined herein.

Fig. 1

Location of the 25 Klamath–Trinity Basin speckled dace collections in California and Oregon, USA. Collections from the Klamath River system (red), Trinity River system (green), Jenny Creek (purple), and Klamath X Trinity hybrids (blue). Circles represent collections for this study or from Kinziger et al. (2011) and the triangle represents the location (tributary to Little Butte Creek, TLB) of sequence data provided by Thomas Dowling. Site abbreviations as in Table 1. (Color figure online)

Table 1 River system, collection site, site abbreviation (ID), latitude, longitude, collection date, and the Humboldt State University Fish Collection numbers (HSU ID) from Klamath–Trinity basin speckled dace

Materials and methods

Field collections

Speckled dace were collected at 25 sites throughout the Klamath–Trinity Basin (Fig. 1; Table 1). Collections of the Klamath group (N = 13) spanned from the mouth of the Klamath River to above Klamath Lake. Collections of the Trinity group (N = 9) included sample sites throughout the Trinity River system, including those above Trinity Lake. Collections of the Klamath X Trinity hybrids consisted of three sites near the confluence of the Klamath and Trinity rivers, including Tish Tang Creek (TT) and the Salmon River (NFS and SFS). Jenny Creek was represented by a single collection. Specimens were collected using seine nets, or a backpack electrofisher and were supplemented by specimens from Kinziger et al. (2011) which were archived at the Humboldt State University (HSU) Fish Collection. Specimens were euthanized using an overdose of tricaine methanesulfonate and vouchered whole, or were anesthetized and then a caudal fin clip was collected before releasing the fish. Whole specimens/tissue were preserved in 95% ethanol and deposited into the HSU Fish Collection.

Microsatellite genotyping methods

Speckled dace were genotyped at nine microsatellite loci (Baerwald and May 2004; Turner et al. 2004; Girard and Angers 2006; Table S1). Whole genomic DNA was extracted from fin tissue using the Chelex DNA extraction method (Walsh et al. 1991) and microsatellite loci were amplified via polymerase chain reaction. Amplifications were performed as either 10 or 12.7-µl reactions using GoTaq Colorless Master Mix (Promega, Madison, WI) in an MJ Research (Waltham, MA) PTC-100 or an Applied Biosystems (Grand Island, NY) 2720 thermal cycler. The forward primer of each primer pair was labeled with a WellRED fluorescent dye (Sigma-Aldrich, St Louis, MO) for identification. Products were visualized and allele sizes determined with a Beckman-Coulter CEQ 8000 Genetic Analysis System (Brea, CA). Allele sizes were scored twice; any discrepancy was either resolved or the genotype was removed. Individuals missing more than two loci from their multi-locus genotype were removed from the dataset. Tests for conformance to Hardy–Weinberg equilibrium (HWE) and linkage disequilibrium were conducted using GENEPOP V4.2 (Raymond and Rousset 1995). Loci were checked for null alleles, stutter peaks, and large allele dropout using MICRO-CHECKER v 2.2.3 (Van Oosterhout et al. 2004).

Microsatellite analysis

Nuclear diversity

Observed heterozygosity (H o), Hardy–Weinberg expected heterozygosity (H e), and allelic richness (A) were calculated in ARLQUIN 3.11 (Excoffier et al. 2005). Rarified allelic richness (A R) and private allelic richness (A p), both standardized to a sample size of 48 genes, were calculated using HP-Rare (Kalinowski 2005). Permutation tests (2000 replicates) for significant differences in A R and H e between the Klamath and Trinity groups were conducted using the software FSTAT v2.9.3 (Goudet 1995).
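The diversity statistics named above can be illustrated with a short Python sketch. It is not the ARLQUIN or HP-Rare implementation: the function names and toy genotypes are hypothetical, the small-sample correction applied to H e is an assumption, and rarefaction is written in its standard form (the expected number of distinct alleles in a random subsample of g gene copies).

```python
from collections import Counter
from math import comb


def observed_heterozygosity(genotypes):
    """Proportion of diploid genotypes (allele pairs) that are heterozygous."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(alleles):
    """Expected heterozygosity from a list of gene copies at one locus.

    Assumption: uses the n/(n-1) small-sample correction.
    """
    n = len(alleles)
    sum_sq = sum((c / n) ** 2 for c in Counter(alleles).values())
    return n / (n - 1) * (1 - sum_sq)


def rarefied_allelic_richness(alleles, g):
    """Expected number of distinct alleles in a random subsample of g gene copies."""
    n = len(alleles)
    if g > n:
        raise ValueError("subsample size g cannot exceed the number of gene copies")
    total = comb(n, g)
    # For each allele, 1 - P(allele is absent from the subsample).
    return sum(1 - comb(n - c, g) / total for c in Counter(alleles).values())


# Hypothetical toy data: four diploid individuals scored at one locus.
genotypes = [(100, 100), (100, 104), (104, 108), (100, 104)]
alleles = [a for pair in genotypes for a in pair]

h_o = observed_heterozygosity(genotypes)
h_e = expected_heterozygosity(alleles)
a_r = rarefied_allelic_richness(alleles, g=2)
```

In the study itself the rarefaction size was 48 genes; the toy example uses g = 2 only because the toy sample is tiny.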

Population structure

Pairwise estimates of genetic differentiation (F ST) between sites and tests of their significance were conducted using FSTAT v2.9.3 (Goudet 1995). Pairwise estimates of standardized F ST (FST), which expresses F ST relative to the largest value obtainable, were calculated in GENODIVE (Meirmans and Van Tienderen 2004). A graphical depiction of genetic divergence between sites was generated by constructing a neighbor-joining tree of Cavalli-Sforza and Edwards chord distances using the software PHYLIP v3.68 (Felsenstein 2005). Branch support was evaluated by a bootstrap analysis with 1000 replicates.

Two different genetic clustering approaches were employed to analyze the microsatellite dataset. The first approach utilized a Bayesian clustering algorithm implemented in the software STRUCTURE v 2.3.4 (Pritchard et al. 2000) to estimate the number of discrete genetic clusters (K) of individuals. Each individual’s proportional assignment to each cluster, called the admixture proportion (q), was also calculated and can be used to estimate hybridization levels. After discarding the first 100,000 steps of the MCMC simulations as burn-in, 100,000 additional steps were performed. A total of 20 iterations were conducted at each level of assumed K = 1 … 12. To estimate the number of clusters in the data, the ad hoc statistic ∆K (Evanno et al. 2005) was calculated using STRUCTURE HARVESTER (Earl and vonHoldt 2012), where the K with the largest ∆K is taken as the most likely number of clusters. The results from STRUCTURE were visualized using the software DISTRUCT V 1.1 (Rosenberg 2004).
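The ∆K statistic can be sketched as follows. This is an illustrative Python version, not the STRUCTURE HARVESTER code. It assumes ∆K is computed from the across-run means of ln P(D|K) as |L(K+1) − 2L(K) + L(K−1)| divided by the standard deviation of L(K) across runs. The function name and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each assumed K to the list of ln P(D|K) values
    obtained from replicate runs at that K.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the smallest and largest K
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            raise ValueError(f"zero variance among runs at K={k}")
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical replicate log-likelihoods (two runs per K, for brevity).
runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -894.0],
    4: [-885.0, -895.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

Because ∆K needs both neighbours, it cannot be evaluated at K = 1 or at the largest K tested, which is one reason the statistic is described in the text as ad hoc.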

The second genetic clustering method, Discriminant Analysis of Principal Components (DAPC) (Jombart et al. 2010), uses a multivariate approach to visualize population differentiation with no assumptions of population genetic models. The analysis attempts to maximize between-group variation among predefined groups while minimizing within-group variation. The DAPC analysis was conducted in the program R (R Core Team 2016) using the package ADEGENET (Jombart 2008).

Isolation-by-distance

Conformance to an isolation-by-distance (IBD) gene flow model was tested by evaluating the relationship between river distances and genetic distances with a Mantel test (10,000 randomizations) in the software IBDWS v.3.23 (Jensen et al. 2005). Pairwise river distances (km) between collection sites were calculated in GIS ArcMap 10.1 and genetic distances consisted of pairwise FST.
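The logic of a Mantel test can be sketched in a few lines of Python. This is a generic one-tailed permutation version under stated assumptions, not the IBDWS implementation: site labels of one matrix are shuffled jointly across rows and columns, and the P value is the share of permutations whose correlation is at least as large as the observed one. The function names, site positions, and distances are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after relabelling sites."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(geo, gen, permutations=999, seed=0):
    """One-tailed Mantel test; returns (observed r, permutation P value)."""
    rng = random.Random(seed)
    identity = list(range(len(geo)))
    x = upper_triangle(geo, identity)
    r_obs = pearson(x, upper_triangle(gen, identity))
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute site labels, not individual matrix cells
        if pearson(x, upper_triangle(gen, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical sites along one river (positions in river km).
positions = [0.0, 10.0, 25.0, 45.0, 70.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.0005 * d for d in row] for row in geo]  # idealized linear IBD
r, p = mantel(geo, gen)
```

Permuting whole site labels, rather than individual pairwise values, preserves the non-independence among entries that share a site, which is the reason a Mantel test is used here instead of an ordinary regression test.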

Mitochondrial DNA

A subset of 236 speckled dace from all 25 sites (average of nine individuals per site, range 2–23 individuals) was sequenced for a 530-bp fragment of the mitochondrial cytochrome b gene (cyt b). Amplification of DNA was conducted with primers LA and HA (Dowling and Naylor 1997) using the following thermal cycling routine: 35 cycles of 94 °C for 60 s, 48 °C for 60 s, and 72 °C for 120 s. The primer LA and three primers designed for this study were used for sequencing (Table S2). PCR products were purified and sequenced at the High-Throughput Genomics Center (University of Washington, Department of Genome Sciences) using an Applied Biosystems 3730xl sequencer. Chromatograms were visually inspected and manual corrections were made to the sequences. Sequences were aligned in MEGA6 (Tamura et al. 2013) using MUSCLE (Edgar 2004).

Mitochondrial DNA diversity and population structure

DNASP v5.10 (Librado and Rozas 2009) was used to estimate the number of haplotypes (H), the number of variable sites (S), nucleotide diversity (π), haplotype diversity (Hd), and average percent sequence divergence (Dxy). Uncorrected p-distances were calculated as the average number of nucleotide differences between site pairs using MEGA6 (Tamura et al. 2013).
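For readers unfamiliar with these summary statistics, the Python sketch below shows how uncorrected p-distance, haplotype diversity (Hd), and nucleotide diversity (π) can be computed from aligned sequences. It is a simplified illustration, not the DNASP or MEGA6 code: it assumes equal-length sequences with no gaps or ambiguity codes, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def haplotype_diversity(seqs):
    """Probability that two randomly drawn sequences are different haplotypes."""
    n = len(seqs)
    sum_sq = sum((c / n) ** 2 for c in Counter(seqs).values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean pairwise p-distance over all pairs of sampled sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical 10-bp alignment of four individuals (three haplotypes).
sample = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
hd = haplotype_diversity(sample)
pi = nucleotide_diversity(sample)
```

Multiplying π by 100 gives the percent values used when nucleotide diversity is reported in the Results.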

A maximum-likelihood (ML) tree was generated using the software MEGA6, with branch support estimated via 1000 bootstrap pseudoreplicates. Publicly available sequences were included to evaluate monophyly of speckled dace originating from the Klamath–Trinity River drainage and provide comparisons to other drainages (Table S3). Additionally, we included a speckled dace haplotype from Dead Indian Creek (42.25194, −122.4516), a tributary of Little Butte Creek in the Rogue River drainage, Oregon (Dowling et al. in press) (hereafter referred to as TLB), and additional speckled dace haplotypes. Rhinichthys atratulus served as an out-group to root the tree (Dowling et al. 2002). The additional sequences were obtained from GenBank and Dryad Digital Repository (https://doi.org/10.5061/dryad.ht554; Table S3). The best model of sequence evolution was identified using jModelTest v2.1.4 (Posada 2008) based upon Akaike’s Information Criterion corrected for small sample size (AICc; Burnham and Anderson 2002). To visualize relationships among the mtDNA haplotypes, a 95% maximum parsimony haplotype network was constructed with TCS 1.21 (Clement et al. 2000).

Results

Microsatellites

A total of 1075 individuals were assayed in the microsatellite data with an average of 43 fish per location (range 30–48). Preliminary tests indicated that the locus CypG 13 consistently departed from Hardy–Weinberg expectations and it was therefore removed. The final eight microsatellite loci were highly polymorphic, containing a total of 258 alleles (mean 32.3 alleles/locus; range 5–78). Out of a total of 200 tests (8 loci and 25 populations) for conformance to Hardy–Weinberg expectations (HWE), nine were significant after Bonferroni correction (critical value = 0.00025, Rice 1989). A less conservative multiple test correction, the B-Y FDR method, found twenty-seven tests that were significant (critical value = 0.008506, Narum 2006). Departures from HWE were likely due to null alleles, as suggested by MICRO-CHECKER, but there was no evidence of stuttering, large allele dropout, or linkage disequilibrium in the loci. No single locus or site consistently departed from expectations, eliminating locus- and site-specific factors as causes for the deviations.
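The two critical values quoted above follow from simple arithmetic, sketched here in Python. The sketch assumes a nominal α of 0.05, which the text does not state explicitly, and takes the B-Y critical value as α divided by the harmonic sum over the number of tests. The function names are hypothetical.

```python
def bonferroni_critical(alpha, n_tests):
    """Bonferroni critical value: alpha divided by the number of tests."""
    return alpha / n_tests


def by_fdr_critical(alpha, n_tests):
    """B-Y FDR critical value: alpha divided by the sum of 1/i for i = 1..n_tests."""
    harmonic_sum = sum(1 / i for i in range(1, n_tests + 1))
    return alpha / harmonic_sum


ALPHA = 0.05  # assumed nominal significance level

# 8 loci x 25 collection sites = 200 Hardy-Weinberg tests.
hwe_bonferroni = bonferroni_critical(ALPHA, 8 * 25)
hwe_by = by_fdr_critical(ALPHA, 8 * 25)

# 25 sites give 25 * 24 / 2 = 300 pairwise comparisons.
n_pairs = 25 * 24 // 2
pairwise_bonferroni = bonferroni_critical(ALPHA, n_pairs)
```

Under that assumption the sketch reproduces the reported values: 0.00025 and 0.008506 for the 200 HWE tests, and about 0.000167 for 300 pairwise comparisons among 25 sites.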

Genetic diversity and population structure

Genetic diversity was higher in the Klamath group than in the Trinity group (Table 2). Mean expected heterozygosity (H e) was 0.73 (range 0.69–0.74) in the Klamath and 0.52 in the Trinity (range 0.49–0.56). Mean rarified allelic richness (A R) was 13.10 (11.82–14.26) in the Klamath and 9.08 (8.03–9.76) in the Trinity. Expected heterozygosity and rarified allelic richness were significantly lower in the Trinity in comparison to the Klamath (P = 0.001). Among the three sites containing hybrids between the Trinity and Klamath groups, one had relatively high diversity (TT, A R = 12.55, H e = 0.67) whereas the other two locations in the Salmon River exhibited levels of diversity that were intermediate to the Klamath and Trinity groups (NFS: A R = 10.56, H e = 0.62; SFS: A R = 9.19, H e = 0.59). The genetically distinctive speckled dace from JEN had low values of diversity (A R = 7.58, H e = 0.45) compared to geographically proximate locations.

Table 2 Genetic diversity at eight microsatellite loci for Klamath–Trinity Basin speckled dace, including genetic group, site, site abbreviation (ID), sample size (N), observed heterozygosity (H o), expected heterozygosity (H e), allelic richness (A), rarified allelic richness (A R), and rarified number of private alleles (A p)

A total of 257 of the 300 pairwise F ST values (range 0.00–0.345) were significant (P ≤ 0.000167, Table S4). Significant F ST was generally resolved in pairwise comparisons between Klamath and Trinity groups whereas comparisons among sites within each group were not generally significant. The mean standardized FST for comparisons between Klamath and Trinity was 0.488, whereas the mean level of divergence among sites within the Klamath (0.051) and within the Trinity (0.029) was much lower (Table 3). The three Klamath X Trinity hybrid sites were more similar to the Trinity (mean standardized FST 0.11) than to the Klamath (0.406). JEN was resolved as divergent from all collection sites (range 0.539–0.625).

Table 3 Mean pairwise standardized genetic differentiation (FST) at eight microsatellite loci among the main genetic groups of Klamath–Trinity Basin speckled dace

The unrooted neighbor-joining tree revealed two distinct clusters of speckled dace, one containing all sites from the Klamath and a second containing all sites from the Trinity. The Klamath X Trinity hybrids (TT, NFS and SFS) were resolved as most similar to the Trinity group (Fig. 2). Jenny Creek (JEN) was resolved as very divergent from all other populations and was located just outside of the Klamath cluster.

Fig. 2

Unrooted neighbor-joining tree based on Cavalli-Sforza and Edwards chord distances calculated using eight microsatellite loci assayed in Klamath–Trinity Basin speckled dace. The numbers on the branches are bootstrap values from 1000 replicates. Only values above 75 are shown. Trinity branches are green, Klamath branches are red, Jenny Creek (JEN) is purple and the Klamath X Trinity hybrids (TT, NFS, and SFS) are blue. (Color figure online)

Bayesian cluster analysis using STRUCTURE suggested that the data were best described by two genetic clusters (K = 2) using the ad hoc statistic ∆K (Figs. S1 and S2). All 20 replicate runs at K = 2 resulted in an identical clustering pattern (Figs. 3 and S1). In the individual assignment plot the two distinct clusters were consistent with recognition of Klamath and Trinity genetic groups. The distribution of individual admixture proportions (q) indicated that most individuals were assigned as either pure Trinity (q > 0.9) or pure Klamath (q < 0.1), with 10% of individuals (n = 109) being assigned as hybrids (0.1 < q < 0.9) (Fig. 4a). However, hybrids were not assigned with high confidence, as none of the individuals with intermediate q values had 90% probability intervals that were entirely contained between 0.1 and 0.9, except for three individuals from the Salmon (NFS and SFS) (Fig. 4b–d). Tish Tang (TT) contained the highest proportion of hybrids, with 48% of assayed individuals assigned as hybrids (Fig. 4b). The Salmon River sites (NFS and SFS) showed lower levels of hybridization, with 23% of individuals assigned as hybrids (Fig. 4c, d).
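The assignment rule described above can be written compactly. The Python sketch below is illustrative only: the thresholds come from the text, q is taken as membership in the Trinity cluster (as in Fig. 4), and the function names and example values are hypothetical.

```python
def classify_individual(q, lower=0.1, upper=0.9):
    """Assign an individual from its point estimate of q (Trinity membership)."""
    if q > upper:
        return "Trinity"
    if q < lower:
        return "Klamath"
    return "hybrid"


def is_confident_hybrid(interval, lower=0.1, upper=0.9):
    """True only if the whole probability interval lies between the thresholds."""
    low, high = interval
    return lower < low and high < upper


# Hypothetical individuals: (point estimate of q, 90% probability interval).
individuals = [
    (0.97, (0.88, 1.00)),
    (0.03, (0.00, 0.12)),
    (0.45, (0.05, 0.85)),  # intermediate q, but interval reaches below 0.1
    (0.50, (0.30, 0.70)),  # intermediate q with interval fully inside 0.1-0.9
]
labels = [classify_individual(q) for q, _ in individuals]
confident = [
    classify_individual(q) == "hybrid" and is_confident_hybrid(ci)
    for q, ci in individuals
]
```

Separating the point-estimate rule from the interval rule mirrors the distinction drawn in the text between individuals assigned as hybrids and the much smaller number assigned as hybrids with confidence.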

Fig. 3

Results from Bayesian cluster analysis based upon eight microsatellite loci for Klamath–Trinity Basin speckled dace. Each individual is displayed as a thin horizontal line divided into sections, whose length is equal to the probability of membership to a cluster (q), while populations are differentiated by thick black lines. Two genetic clusters (K = 2) was the most probable number of clusters based on the ad hoc statistic ∆K (Fig. S1) and all 20 iterations showed the same genetic clustering. Assuming three clusters (K = 3) produced evidence of multimodality: 19/20 iterations had Jenny Creek as a distinct cluster, while the remaining iteration found the Salmon River populations (NFS and SFS) distinct

Fig. 4

Individual admixture proportions (q) estimated by Bayesian cluster analysis (K = 2) of eight microsatellite loci assayed in Klamath–Trinity Basin speckled dace. Shown are point estimates of q and 90% probability intervals for (a) all individuals in the data set, (b) Tish Tang Creek (TT), (c) North Fork Salmon (NFS), and (d) South Fork Salmon (SFS). Individual q values close to zero and one signify Klamath and Trinity clusters, respectively. Individual q values are sorted from low to high for presentation

The Bayesian cluster analysis assuming K = 3 distinct groups resulted in two different solutions across the 20 independent STRUCTURE runs (Fig. 3). The most common solution resolved JEN as a third distinct cluster (19 of 20 runs) and the other solution resolved the Salmon River sites as distinct (1 of 20 runs). At K = 4 there was evidence for multimodality, and in some solutions assignments were symmetric across all sites, suggesting that the number of clusters represented by the data was being overestimated (not shown). The multivariate DAPC analysis produced results similar to those of the STRUCTURE analysis (Figs. S3, S4).

Isolation-by-distance

Tests for isolation-by-distance (IBD) were conducted separately for the Klamath and Trinity groups due to the large divergence resolved between them. The Klamath X Trinity hybrids were grouped in the Trinity IBD analysis as they were more closely aligned with the Trinity group in the Bayesian cluster analysis. The relationship between pairwise genetic distances (FST) and river distance was significant for both the Klamath and Trinity groups (P < 0.05) and the intercept was essentially zero in both cases, which is consistent with an IBD model of gene flow (Fig. S4). However, geographic distance explained only 15% of the variation in genetic differentiation between sites in the Klamath (R2 = 0.145, P = 0.0115), whereas it explained 58% of the variation in the Trinity (R2 = 0.581, P = 0.0005). The slope of the Trinity IBD relationship (6.350e-04) was nearly twice that of the Klamath IBD relationship (3.550e-04). The Trinity IBD analysis performed without the Salmon sites (NFS and SFS) gave the same results (data not shown). Jenny Creek (JEN) was excluded from tests of IBD because of its deep level of divergence from all sites.

Mitochondrial results

For mtDNA, the average number of fish sequenced per location was nine and ranged from 2 to 23 (Table 4). A total of 79 unique haplotypes were defined by 88 variable nucleotide positions (GenBank accession numbers MF066096-MF066172). Mean nucleotide diversity was 1.91% in the mtDNA dataset. Within collection sites, the number of haplotypes ranged from 2 to 9 and nucleotide diversity ranged from 0.0 to 3.1%. jModelTest selected the Kimura two-parameter model (K80) of sequence evolution as the best fit to describe the mtDNA sequence data.

Table 4 Mitochondrial DNA sequence diversity for Klamath–Trinity speckled dace including genetic group, site abbreviation (ID), number of sequences (N), number of variable sites in the sequences (S), sequence diversity (π), unique haplotypes (H), and haplotype diversity (Hd)

Similar to the pattern observed in the microsatellite data, the Klamath displayed higher levels of mtDNA diversity than the Trinity (Table 4). The Klamath group contained 47 haplotypes (N = 103) with a mean haplotype diversity of 0.88, while the Trinity contained 21 haplotypes (N = 84) with a mean haplotype diversity of 0.54. Mean nucleotide diversity was also lower in the Trinity (0.10%) compared to the Klamath (1.0%). The Klamath X Trinity hybrids contained 13 haplotypes (N = 36) with a mean haplotype diversity of 0.57 and a mean nucleotide diversity of 0.7%. Among the Klamath X Trinity hybrids, the Salmon River populations (NFS and SFS) had reduced levels of mtDNA diversity, while TT had elevated levels of diversity. Jenny Creek (JEN) contained four haplotypes (N = 12) with a haplotype diversity of 0.561 and a mean nucleotide diversity of 0.2%.

Mitochondrial structure

The Klamath–Trinity Basin speckled dace were resolved as nearly monophyletic [bootstrap (BS) 77], and exhibited a sister group relationship with nearby basins in California and Oregon (Sacramento, Pit River, and Goose Lake Basin, Fig. 5). There were two exceptions to monophyly for Klamath–Trinity speckled dace: (1) a divergent haplotype (sequence divergence > 5.8%) identified in a single individual found in Seven Mile Creek, a tributary to Klamath Lake, which was resolved in a basal position in our tree (hereafter referred to as SEV01), and (2) a haplotype from the tributary of Little Butte Creek (TLB), Rogue River Basin, that aligned with Jenny Creek (JEN) haplotypes.

Fig. 5

Maximum Likelihood tree generated from unique mitochondrial cytochrome b haplotypes for Klamath–Trinity speckled dace and publicly available speckled dace haplotypes. A single R. atratulus sequence served as an out-group. Support for the tree was established via 1000 bootstrap replicates and only values above 70 are shown

Excluding the SEV01 haplotype, the Klamath–Trinity Basin contained three primary clades: (1) Trinity (BS = 75), (2) Klamath (BS = 90), and (3) Jenny (BS = 70) (Fig. 5). The Trinity clade was composed exclusively of individuals that originated from the Trinity River. The Klamath clade included the majority of the individuals collected from the Klamath River, 10 of the 15 speckled dace from the hybrid location in the lower Trinity River (TT), and the hybrids from NFS and SFS. The Jenny clade comprised all 12 individuals examined from JEN, five individuals from the upper Klamath River and the two individuals examined from a tributary to Little Butte Creek (TLB). The Trinity clade was resolved as sister to the Jenny clade (BS = 89) and the Klamath clade was sister to this group. The close relationship between the Trinity clade and the Jenny clade was unexpected given the distances involved and the geographic proximity of the Klamath and Jenny groups. The Klamath clade contained considerably more within-group structuring than the Trinity clade, including a well-supported (BS = 88) branch containing haplotypes from above Upper Klamath Lake in Oregon.

In the 95% maximum parsimony network, Klamath–Trinity Basin speckled dace were divided into two unconnected primary networks (excluding the single divergent haplotype SEV01, Fig. 6). The same three clades resolved by the ML tree were evident: (1) Trinity, (2) Klamath, and (3) Jenny. The SEV01 haplotype (not shown) was highly divergent from all other groups (5.87–6.16%). The Klamath displayed considerable structure in the haplotype network and possessed the largest within-group percent sequence diversity (1.22%, Table 5). The Trinity had the lowest within-group percent sequence diversity (0.150%) and contained one primary haplotype at high frequency in all populations. Klamath and Trinity displayed the largest sequence divergence from one another (2.96%), followed by Klamath versus the Jenny group (2.64%); the lowest was Klamath versus Klamath X Trinity hybrids (1.21%). Among the Klamath X Trinity hybrids, NFS and SFS contained only Klamath haplotypes, whereas TT contained a mixture of Klamath and Trinity haplotypes and had a sequence diversity of 1.06%.

Fig. 6

Two unconnected ninety-five percent parsimony haplotype networks based on mitochondrial cytochrome b sequences for Klamath–Trinity Basin speckled dace. Circles represent unique haplotypes and the diameter of the circle corresponds to relative haplotype frequency. Small black dots represent inferred haplotypes. The circles are color coded to display the proportion of individuals from the Klamath (red), Trinity (green), Jenny/Rogue (purple) or Klamath X Trinity (blue) with the haplotype. (Color figure online)

Table 5 Average percent sequence divergence of mitochondrial cyt b sequences (Dxy) in Klamath–Trinity Basin speckled dace.

Discussion

Collections from throughout the Klamath–Trinity basin combined with analysis of both mitochondrial DNA and microsatellite loci revealed complex within-basin patterns including evidence for multiple cryptic groups within R. o. klamathensis. Overall, our analysis indicates the presence of three genetically distinct groups of speckled dace within the Klamath–Trinity Basin: (1) Klamath, (2) Trinity, and (3) Jenny, each named according to its geographic distribution. The most notable result of this study was the extent of genetic divergence among the speckled dace lineages occurring in the Klamath–Trinity Basin. The level of genetic divergence in mtDNA cyt b (1.38–6.16%) between groups found in the Klamath–Trinity Basin is comparable to divergences found between recognized Rhinichthys species (McPhail and Taylor 2009).

Klamath and Trinity groups

The Klamath group is distributed throughout the Klamath River and its tributaries and the Trinity group is distributed in the Trinity River and its tributaries (Fig. 1). The area near the confluence of the Klamath and Trinity rivers and the Salmon River appear to be zones of contact and hybridization between the Klamath and Trinity groups (see below).

The precise biogeographic processes in the Klamath–Trinity Basin that have impacted speckled dace populations during the Pliocene–Pleistocene are unclear because of the substantial number of geomorphic rearrangements that have occurred in this geologically active region over the past few million years (Minckley et al. 1986; Aalto 2006). Furthermore, repeated glacial advance, retreat and associated erosion make reconstruction of historical river drainage patterns uncertain (Anderson 2008). The evidence that remains suggests that the northwest Klamath Mountains are the result of recent uplift during the Pliocene–Pleistocene and that, before then, the area was a low-lying peneplain (Aalto 2006). Before this uplift, the Trinity and Klamath Rivers presumably had different drainage patterns than they do today, though the nature of their flow and outlet are obscured (Anderson 2008).

We hypothesize that the Klamath and Trinity groups diverged in allopatry and are now in secondary contact for several reasons. First, the estimated divergence time between the Klamath and Trinity groups ranges from 1.18 to 2.47 million years ago based upon Smith and Dowling’s (2008) estimate of cyt b mutation rate of 1.2–2.5% per million years for speckled dace. This estimated Pleistocene divergence time for Klamath and Trinity groups is consistent with the estimated time of uplift for the northwest Klamath Mountains (Aalto 2006). Second, the confinement of the Klamath and Trinity groups to their respective river systems, and the sharp northward bend of the Trinity River where it connects with the Klamath River, suggest that these systems were once separate. Finally, the extent of divergence (~ 3%) between the Trinity and Klamath groups is similar to levels observed between speckled dace isolated in different river systems (Pfrender et al. 2004; Ardren et al. 2010). All potential within-basin barriers, such as Burnt Ranch Gorge on the Trinity River and Ishi Pishi Falls on the Klamath River, are surmountable to fishes and are unlikely to serve as long-term migration barriers to speckled dace, which have appreciable dispersal capacity for their size (Pearsons et al. 1992; Brown and Moyle 1997).
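The divergence-time range in the first argument can be approximated with a simple strict-clock calculation. The sketch below is illustrative only: it assumes that the 1.2–2.5% per million years rate is a pairwise sequence divergence rate (so that time equals divergence divided by rate), and it uses the rounded ~3% Klamath–Trinity divergence, so its bounds differ slightly from the 1.18–2.47 million years reported above. The function name and arguments are hypothetical.

```python
def divergence_time_range(divergence_pct, rate_low_pct_per_myr, rate_high_pct_per_myr):
    """Return (youngest, oldest) divergence time in millions of years.

    Assumes a strict molecular clock and that the rates are expressed as
    pairwise sequence divergence (percent per million years).
    """
    if rate_low_pct_per_myr <= 0 or rate_high_pct_per_myr < rate_low_pct_per_myr:
        raise ValueError("rates must be positive and ordered low <= high")
    # The fastest rate gives the youngest estimate; the slowest gives the oldest.
    youngest = divergence_pct / rate_high_pct_per_myr
    oldest = divergence_pct / rate_low_pct_per_myr
    return youngest, oldest


# Rounded ~3% divergence from the text; rate bounds from Smith and Dowling (2008).
youngest, oldest = divergence_time_range(3.0, 1.2, 2.5)
print(f"{youngest:.2f}-{oldest:.2f} million years")
```

Both bounds fall within the Pleistocene, which is the property the argument relies on.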

Klamath × Trinity hybridization

Most individuals examined from the Klamath and Trinity rivers were resolved as non-admixed as indicated by the bimodal distribution of individual admixture coefficients (Fig. 4). However, our analysis suggested the existence of contact and hybridization between Klamath and Trinity groups in the geographic region near the confluence of the Klamath and Trinity Rivers, including Tish Tang Creek (TT), and the Salmon River (NFS and SFS).

Hybridization in Tish Tang Creek was suggested in the analysis of microsatellite loci by the intermediate positioning of this collection between the Klamath and Trinity groups in the tree-based analysis (Fig. 2). Further, in the Bayesian cluster analysis, a large proportion (48%) of individuals from TT had intermediate admixture coefficients (0.1 < q < 0.9), although none of these putative hybrids had probability intervals that could be used to make confident assignments of hybridization (Fig. 4). In the mtDNA analysis, Tish Tang Creek was the only location studied that contained a mix of haplotypes originating from the Klamath and Trinity groups, a pattern also suggestive of hybridization (Fig. 6). Hybridization in this region is in accord with geography, as pure Trinity and Klamath individuals appear to be confined to their respective river systems and presumably contact each other near the confluence of the two rivers.

Hybridization between the Klamath and Trinity groups in the Salmon River (NFS and SFS) was indicated by discordance between nuclear and mtDNA markers (Avise 2004). All Salmon River speckled dace had Klamath mtDNA haplotypes (n = 21), whereas nuclear microsatellite analysis aligned the Salmon River populations with the Trinity group (Figs. 2, 3). The individual admixture coefficients from Bayesian cluster analysis indicated the presence of several individuals in the Salmon River with probability intervals contained entirely within 0.1 < q < 0.9 (Fig. 4c, d), indicating the existence of hybridization between the Klamath and Trinity lineages within the Salmon River.
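The admixture criteria used in this section can be summarized as a small decision rule. The sketch below is illustrative only: the function name, the category labels, and the example values are hypothetical and are not data from this study. It assumes each individual has an admixture coefficient q and a probability interval from the Bayesian cluster analysis.

```python
def classify_individual(q, interval, lower=0.1, upper=0.9):
    """Classify one individual from its admixture coefficient and interval.

    q        : admixture coefficient (proportion of ancestry from one group)
    interval : (low, high) probability interval around q
    Labels are hypothetical shorthand for the categories discussed in the text.
    """
    low, high = interval
    if not (0.0 <= low <= q <= high <= 1.0):
        raise ValueError("expected 0 <= low <= q <= high <= 1")
    # Interval entirely inside (lower, upper): hybrid ancestry is well supported.
    if lower < low and high < upper:
        return "confident hybrid"
    # Intermediate point estimate, but the interval reaches a pure-group range.
    if lower < q < upper:
        return "putative hybrid"
    return "non-admixed"


# Hypothetical example values, not data from this study.
examples = [(0.50, (0.20, 0.80)), (0.50, (0.05, 0.95)), (0.97, (0.90, 1.00))]
for q, interval in examples:
    print(q, interval, classify_individual(q, interval))
```

Under this rule, the Tish Tang Creek individuals described above would fall in the middle category, whereas several Salmon River individuals would fall in the first.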

The geographic mechanism responsible for creating contact and hybridization in the Salmon River is unclear, as sites we examined near the confluence of the Klamath and Salmon Rivers (BB and CN) appeared to consist of pure Klamath individuals, suggesting the absence of contemporary contact between the Trinity and Klamath groups through the mouth of the Salmon River (Fig. 1). However, hybrid zones can be dynamic in space and time (Buggs 2007), and it is conceivable that Trinity and Klamath groups may have historically contacted one another through the lower Salmon River but they no longer do so. Another possibility is that the contact between the lineages may have resulted from headwater capture. There is a well-established river capture wherein a Trinity River tributary was diverted into the South Fork Salmon River during the Pleistocene (Hershey 1900; Sharp 1960; Fig. 1). Finer-scale sampling around the Salmon River and the confluence of the Trinity and Klamath rivers combined with application of molecular markers with improved resolution for hybrid identification are needed to more clearly define the geographic extent, frequency, and patterns of hybridization.

The mechanism(s) responsible for limiting hybridization to a relatively small geographic region (e.g., near the confluence of the Klamath and Trinity rivers and the Salmon River) and for keeping the Klamath and Trinity groups pure outside of this area are uncertain. While hybrid zones can develop along environmental gradients (Harrison 1990), there are no readily apparent environmental gradients across the Trinity and Klamath rivers. Alternatively, interactions with predators have been shown to be important in constraining the distribution of speckled dace (Baltz et al. 1982; Harvey et al. 2004), as have water velocity and spawning substrate (Smith and Dowling 2008; Peden and Hughes 1981).

Jenny group

The analysis of both mtDNA and microsatellites indicated that the Jenny group was genetically distinct in comparison to the Klamath and Trinity groups (see also Pfrender et al. 2004). The Jenny group is distributed in two disjunct river basins, Jenny Creek, a tributary to the upper Klamath River, and a tributary of Little Butte Creek (TLB) of the Rogue River Basin (Fig. 1). We hypothesize that the Jenny group originally diverged within Jenny Creek and was recently introduced from Jenny Creek into TLB via artificial or natural means. The original genetic divergence presumably resulted from isolation caused by a 10-m waterfall created by Pleistocene lava flow in the lower portions of Jenny Creek (Hohler 1981). A dwarf form of the Klamath smallscale sucker (Catostomus rimiculus) is also isolated to Jenny Creek (Hohler 1981).

Our analysis indicated (1) the presence of JEN mtDNA haplotypes in the Upper Klamath River, and (2) nuclear admixture between the Jenny and Klamath groups in the Upper Klamath River (Fig. 3). These patterns may be the result of one-way downstream leakage of the Jenny group into the upper Klamath River. Alternatively, the Jenny Creek group may once have been present in the upper Klamath River but experienced significant drift within Jenny Creek due to isolation and small population numbers. Jenny Creek’s isolation and lack of outside contact could also explain the low levels of genetic diversity we observed (Tables 2, 5).

The genetic similarity in mtDNA between JEN and TLB supports a recent human-induced transfer between these rivers, as hypothesized by Bond (1994). Artificial transfer of Jenny Creek speckled dace into TLB may have resulted from water connections created by the US Bureau of Reclamation. However, a natural introduction via headwater capture is possible, as the TLB and Jenny Creek systems are geographically close (3 m vertically and 0.4 km horizontally). Available evidence suggests the Jenny group is restricted to Jenny Creek and TLB and does not occur elsewhere in the Rogue River Basin, as Pfrender et al. (2004) found that speckled dace from Little Butte Creek were more similar to those from other coastal Oregon streams than to those from Jenny Creek. Jenny Creek speckled dace appear to represent a unique and isolated form that has gone undetected by traditional taxonomic surveys. We recommend incorporating collections from nearby coastal basins (Coos, Umpqua) and utilizing additional genetic markers to clarify the genetic relationships and geographic distribution of Jenny Creek speckled dace.

SEV01 mitochondrial DNA haplotype

We detected a single mitochondrial DNA haplotype (SEV01) from the upper Klamath River that was highly differentiated from all other Klamath–Trinity Basin haplotypes (Fig. 5). Pfrender et al. (2004) identified the same haplotype from the upper Klamath River. In our analysis, however, we resolved this haplotype in only 1.67% of individuals examined (1/60) in the upper Klamath River, whereas Pfrender et al. (2004) resolved it at a higher frequency, 12.73% (7/55). In contrast to the mtDNA results, the individual with the SEV01 haplotype was resolved as a member of the Klamath group by STRUCTURE analysis of nuclear microsatellites. This discordance between markers suggests a hybridization event between the Klamath group and the SEV01 lineage. Pfrender et al. (2004) proposed that the SEV01 lineage and the main Klamath lineage may represent two reproductively isolated forms co-occurring in the upper Klamath River, but our analysis does not support this hypothesis. While further study is needed to understand the origin of SEV01 and its relationship with the main Klamath lineage, the low frequency of the haplotype may hamper research.

Phylogenetic relationships

Phylogenetic analysis of mitochondrial DNA generally resolved Klamath–Trinity Basin speckled dace as a cohesive group, excluding the SEV01 haplotype and the Jenny group. The close relationship of Klamath–Trinity Basin speckled dace to those of the Sacramento, Pit, Goose Lake and Warner basins likely reflects the historic connections these drainages had with the hypothesized westward-flowing Proto-Snake River (Oakey et al. 2004; Ardren et al. 2010). The close association that the Oregon Great Basin drainages (Goose and Warner) share with the Pit and Sacramento basins is explained by the historic Goose Lake overflows that spilled into the Pit River (Baldwin 1981).

Phylogenetic relationships among the Klamath, Trinity, and Jenny groups were discordant between mitochondrial and nuclear markers, suggestive of gene-tree and species-tree discordance. In the analysis of mitochondrial DNA, Jenny Creek was resolved as sister to the Trinity group, but in the analysis of nuclear microsatellites, Jenny Creek appeared more closely related to the Klamath group (Fig. 6). Incomplete lineage sorting of mitochondrial DNA or a hybridization event between Jenny Creek speckled dace and other forms are potential explanations for these patterns (Ballard and Whitlock 2004; Carstens and Knowles 2007; Waters et al. 2010). These findings highlight that single genes, or even two genes, may fail to resolve true species relationships and that multiple independent genes are needed to do so (Maddison and Knowles 2006; Heled and Drummond 2010).

Within-group genetic structure

Our analysis indicated the Klamath group contained significantly higher levels of mitochondrial DNA and nuclear microsatellite diversity than the Trinity group. This finding is in accord with a recent survey of genetic diversity in speckled dace across 10 northern California sites which indicated that Klamath group speckled dace had higher genetic diversity than the other sites studied (Kinziger et al. 2011). The higher level of genetic diversity within the Klamath group is consistent with biogeographic patterns, which support the upper Klamath River as a zoogeographic province for fishes (Moyle 2002).

Our field sampling specifically targeted sites isolated by impassable dams, but these barriers produced no appreciable within-group genetic structuring in our study (Fig. 1). Based on the high level of genetic diversity detected in this study, the effective population sizes are most likely too large for the dams to have impacted isolated populations via drift. However, the Upper Klamath River sites (SEV, SPR, LNK, and SPE) did exhibit some minor sub-structuring (Fig. 2). The Klamath and Trinity groups both conformed to an IBD model of gene flow, but the Klamath group displayed a weak IBD relationship, with less within-river-system genetic differentiation than the Trinity group despite the greater distances in the Klamath (Fig. S5). The Trinity group's stronger IBD relationship could be explained by the steep gradients and high elevations, which have created physical barriers to gene flow (Castric et al. 2001).
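An IBD relationship of this kind is commonly evaluated by correlating pairwise genetic distance with pairwise geographic (river) distance and assessing significance by permutation, as in a Mantel test. The sketch below is a minimal illustration of that general approach and is not necessarily the procedure used in this study. The site positions and genetic distances are hypothetical toy values, and all function names are assumptions.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the above-diagonal entries of a square matrix into a list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided p) for a positive genetic-geographic correlation."""
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_vec)
    rng = random.Random(seed)
    order = list(range(n))
    count = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns together so each site keeps its own distances.
        permuted = [[genetic[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= observed:
            count += 1
    return observed, count / (permutations + 1)


# Hypothetical toy data: five sites at river positions given in km.
positions = [0, 12, 30, 55, 90]
geographic = [[abs(a - b) for b in positions] for a in positions]
# Toy genetic distances: linear in river distance plus a small fixed offset.
genetic = [[0.0 if i == j else 0.002 * geographic[i][j] + 0.005 * ((i + j) % 2)
            for j in range(5)] for i in range(5)]

observed_r, p_value = mantel_test(genetic, geographic)
print(f"Mantel r = {observed_r:.3f}, p = {p_value:.3f}")
```

In this framing, a weak IBD relationship corresponds to a lower correlation (or a shallower slope of genetic on geographic distance), and a stronger one to a higher correlation.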

Conclusion

The discovery of previously unknown cryptic diversity of speckled dace in the Klamath–Trinity Basin illustrates the usefulness of molecular studies for cataloging the planet’s biodiversity. We recommend additional morphologic and ecological studies to determine if the cryptic genetic groups may warrant formal taxonomic recognition in the future. Fine-scale sampling throughout the Klamath–Trinity Basin combined with mitochondrial and nuclear markers was crucial for discerning genetic structure and cryptic diversity. The genetic structure of speckled dace in the Klamath–Trinity Basin and other western basins is much more complicated than contemporary drainage patterns would suggest, reflecting historical climatic and geological upheaval of the west. Our study and other recent studies (Hoekzema and Sidlauskas 2014) have uncovered far more diversity within speckled dace than previously thought. Possibly, the single taxonomic designation of speckled dace needs to be revised to accurately represent this diversity.