Introduction

Within riverine systems, many species are strongly affected by the unidirectional flow of water. River current is particularly important for population connectivity, especially in systems fragmented by natural or anthropogenic barriers and in species with limited swimming abilities. Many riverine fishes experience passive downstream drift as larvae or adults (e.g., Brown and Armstrong 1985; White and Harvey 2003) and, without compensatory upstream movement, the overall pattern of migration will be biased in a downstream direction. In these situations, upstream populations provide migrants to downstream populations while receiving few migrants themselves, resulting in an overall source-sink metapopulation (e.g., Hänfling and Weetman 2006; Barson et al. 2009). Thus, genetic diversity is generally highest in downstream populations and lowest in those furthest upstream (e.g., Barson et al. 2009; Dehais et al. 2010). Upstream populations, with reduced genetic diversity and little opportunity for “rescue” by migrants from downstream populations, may thus be particularly vulnerable to extirpation.

The western brook lamprey Lampetra richardsoni (WBL) is a small, freshwater-resident fish found in coastal streams from southern Alaska to northern California (Scott and Crossman 1973). Lamprey larvae of all species are filter-feeders in the sediment of streams and rivers, where they remain burrowed for several years until they undergo metamorphosis. During the larval phase, most movement is expected to occur through passive drifting with the current (reviewed in Dawson et al. 2015). However, unlike parasitic lampreys, which generally migrate out of the stream after metamorphosis to feed on actinopterygian fishes in the ocean or in lakes, brook lampreys do not feed following metamorphosis. Instead, they undergo rapid sexual maturation, and spawn and die in their natal stream within 6–10 months of metamorphosis (reviewed in Docker 2009). Parasitic lampreys, which do not necessarily return to their natal streams to spawn (Waldman et al. 2008; Moser et al. 2015), show minimal genetic differentiation among even widely separated locations, as has been observed in sea lamprey Petromyzon marinus on each side of the Atlantic Ocean (Bryan et al. 2005; Almada et al. 2008; Waldman et al. 2008), Pacific lamprey Entosphenus tridentatus in North America (Goodman et al. 2008; Spice et al. 2012; Hess et al. 2013), Arctic lamprey Lethenteron camtschaticum in Asia (Yamazaki et al. 2014), and European river lamprey Lampetra fluviatilis in northern Europe (Bracken et al. 2015). Furthermore, these large-bodied parasitic lampreys are capable of swimming several hundred kilometers upstream (and surmounting some barriers) during their spawning migrations (Moser et al. 2015), thus providing opportunity for gene flow within as well as among river systems. With their limited dispersal, brook lampreys are expected to show greater population structuring compared to their closely related parasitic counterparts. This has in fact been shown in European brook lamprey Lampetra planeri compared to European river lamprey and in non-parasitic versus parasitic populations of Arctic lamprey (Bracken et al. 2015; Yamazaki et al. 2011b, respectively). Furthermore, since small-bodied brook lamprey adults are capable of upstream migrations of only a few kilometers (Moser et al. 2015), gene flow within a river system may also be limited.

There is a great deal of concern about declining populations of native parasitic lampreys (Maitland et al. 2015; Clemens et al. 2017) but, given their life history differences, conservation strategies developed for anadromous species may not be suitable for brook lampreys. The large parasitic species have long been valued for food and ceremonial purposes, but brook lampreys also play important ecological roles in freshwater systems (Docker et al. 2015). WBL are not considered endangered or threatened (Maitland et al. 2015), although this is partly due to the lack of information regarding population structure and trends, as well as information about specific threats to the species and its habitat (United States Fish and Wildlife Service 2004). The present study thus aims to address some of these knowledge gaps through examination of WBL genetic population structure in the Columbia River Basin (CRB).

The Columbia River system is approximately 2000 km long, with a drainage basin of approximately 668,000 km². Many large and small tributaries drain into the mainstem Columbia, and over 200 dams have been constructed within the CRB (Gelfenbaum and Kaminsky 2010). Most of the CRB remained unglaciated during the last glacial maximum (McPhail and Lindsey 1986), and the Columbia River has been suggested as a glacial refugium for a number of fish species, particularly salmonids in the genus Oncorhynchus (e.g., Bickham et al. 1995; McCusker et al. 2000; Beacham et al. 2006; Martin et al. 2010). WBL are believed to have persisted in the Columbia River during glaciation (McPhail and Lindsey 1986) and are currently present in the CRB at least as far upstream as the Naches River (Fig. 1).

Fig. 1

Map of 23 locations where western brook lamprey were collected. Site labels correspond to those given in Table 1. Squares indicate sites where both microsatellite and mtDNA data were collected; multiplication symbols indicate sites where only mtDNA data were collected. Squares and multiplication symbols are color-coded according to region

Table 1 Collection information for western brook lamprey Lampetra richardsoni

The objective of the present study was to gain insight into both contemporary and historical population structure in WBL in this large, fragmented riverine system using analysis of microsatellites and mitochondrial DNA (mtDNA). Microsatellite loci, which have a relatively high mutation rate, are suitable for analysis of contemporary ecological questions (Selkoe and Toonen 2006). In contrast, mtDNA retains historical patterns that are more quickly lost in microsatellites, especially with ongoing gene flow among populations (Avise 2000; Heckel et al. 2005). Combining mtDNA and microsatellites can be particularly informative (Howes et al. 2006; Flanders et al. 2009; Sala-Bozano et al. 2009), and doing so here should help identify appropriate conservation units and suitable management strategies for WBL.

Materials and methods

Study design

We obtained permits from the states of Washington (administered through the Washington Department of Fish and Wildlife) and Oregon (administered through the Oregon Department of Fish and Wildlife) authorizing the capture of and tissue collection from the lamprey used in this study. For microsatellite analysis, approximately 40 WBL were collected from each of 15 sites within the CRB and two sites in coastal Washington (sites 1–17; Table 1; Fig. 1). Sites were grouped into seven regions to facilitate geographic analyses: five provinces (Coastal Washington, Columbia Estuary, Lower Columbia River, Columbia Gorge, and Columbia Plateau) and two sub-basins within the Lower Columbia (Lewis and Salmon sub-basins). Bonneville Dam demarcates the Lower Columbia from the Columbia Gorge provinces, and the Dalles Dam (whose reservoir inundates the former Celilo Falls, a barrier present prior to dam construction) divides the Columbia Gorge and Plateau. Sample sites were chosen to allow examination of gene flow and genetic diversity in several different situations: across saltwater barriers and dams; in different sub-basins separated by the mainstem Columbia; and within individual sub-basins. A subset of individuals from each of sites 1–17 were also analyzed using cytochrome b (see below). Since the Columbia Gorge and Columbia Plateau regions were poorly represented among the original 17 sites, samples were included from six additional sites within these regions (sites A–F; Table 1; Fig. 1) that were collected for other studies (Docker et al. 2016; M.F. Docker, unpublished data). Sample sizes from these additional sites were not large enough to permit microsatellite analysis but allowed cytochrome b analysis to be conducted on a total of 23 sites.

Sample collection

Lamprey were collected using electrofishing (see Harris et al. 2016) and briefly anesthetized using tricaine methanesulfonate (MS-222); preliminary morphological identification was made using the characters in Goodman et al. (2009). A small fin clip was taken for genetic analysis and stored in 95% ethanol. Lamprey were released once they had recovered from anesthesia.

Identifications were verified genetically or with adult records since larvae can be difficult to distinguish morphologically. At sites 1–17, larvae from the genus Entosphenus were eliminated using the assay described by Goodman et al. (2009), whereby a 433-bp fragment of mitochondrial cytochrome b digested with the enzyme HaeIII produced fragments of different sizes in Entosphenus and Lampetra. At sites A–F, the microsatellite assay described by Docker et al. (2016) was used to distinguish Entosphenus from Lampetra. Within the genus Lampetra, the parasitic western river lamprey L. ayresii and non-parasitic Pacific brook lamprey L. pacifica also occur in the CRB (Kostow 2002; Reid et al. 2011). Mitochondrial DNA sequencing confirmed that no specimens used in the current study were Pacific brook lamprey (Boguski et al. 2012). Western river lamprey and WBL cannot be distinguished genetically (Boguski et al. 2012) or morphologically as larvae (Docker 2009), but they are morphologically distinct at metamorphosis and as adults. Metamorphosing and adult lamprey were collected from 16 of 23 sites in the present study, and all Lampetra were morphologically verified to be WBL. At the remaining seven sites, historical records likewise suggest that only non-parasitic Lampetra (i.e., WBL) are present.
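The restriction assay above relies on HaeIII cutting at its recognition site (GGCC, cleaved between GG and CC), so that sequence differences between genera change the fragment sizes. The following is a minimal sketch of an in silico digest in Python; the function name and the example sequences are hypothetical and are not the cytochrome b sequences or fragment sizes from Goodman et al. (2009).

```python
def digest_fragment_lengths(sequence, site="GGCC", cut_offset=2):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset=2 places the cut between GG and CC, as HaeIII does.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequences, for illustration only.
one_site = "AAAAGGCCTTTT"
two_sites = "AAGGCCAAAAGGCCA"
print(digest_fragment_lengths(one_site))
print(digest_fragment_lengths(two_sites))
```

Two amplicons of equal length can then be told apart by comparing the returned fragment-length lists, which is the logic a gel-based assay reads visually.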

Calculation of river distances

River distances between populations were determined using ArcGIS (ESRI 2015). For each pair of sites, the total distance between the two sites was separated into three parts: tributary distance from the first site to the Columbia River mainstem; distance in the mainstem; and tributary distance from the mainstem to the second site (Online Resource 1—Table S1). For calculations of individual site distance upstream in the mainstem (see Table 1), the mouth of the Klaskanine River was designated as the furthest downstream point (i.e., 0 km upstream).
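The three-part decomposition can be sketched as follows. This is an illustrative Python example rather than the ArcGIS workflow used in the study: the `Site` fields and the example values are hypothetical, and the sketch assumes the two sites lie in different tributaries that each join the mainstem directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str              # hypothetical site label
    tributary_km: float    # distance from the site down to the mainstem confluence
    confluence_km: float   # mainstem position of that confluence (km upstream of the reference point)


def river_distance_components(first, second):
    """Return (tributary leg from first site, mainstem leg, tributary leg to second site)."""
    mainstem_km = abs(first.confluence_km - second.confluence_km)
    return first.tributary_km, mainstem_km, second.tributary_km


# Hypothetical values, not taken from Table S1.
site_a = Site("A", tributary_km=12.0, confluence_km=30.0)
site_b = Site("B", tributary_km=5.5, confluence_km=140.0)
legs = river_distance_components(site_a, site_b)
print(legs, sum(legs))
```

Keeping the three legs separate, instead of storing only their sum, is what later allows tributary and mainstem distances to enter the regression models as distinct predictors.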

DNA extraction, microsatellite amplification, and fragment size determination

DNA was extracted using the Qiagen DNeasy Kit (Qiagen) following the manufacturer’s instructions. Eight microsatellite primer pairs developed or optimized for Lampetra (Lri-1, Lri-2, Lri-3, Lri-5, Lri-7, Lri-9, Lri-10; Luzier et al. 2010) or Entosphenus (Etr-3; Spice et al. 2011) were labeled with Ned and Pet (Applied Biosystems) and 6-Fam and Hex (Sigma Life Science) fluorescent dyes. PCR amplification was performed with a 10 µL reaction volume containing approximately 50 ng genomic DNA, 1 × KAPA Taq Buffer A, 2.5 mM MgCl2, 0.2 mM of each dNTP, 0.4 µM of each primer, and 0.2 U Taq DNA Polymerase (KAPA Biosystems). Thermal cycler conditions were an initial denaturation at 94 °C for 2 min; followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 52 to 58 °C (depending on the locus; Luzier et al. 2010; Spice et al. 2011) for 30 s, and extension at 72 °C for 30 s; followed by a final extension at 72 °C for 5 min. Microsatellite fragments were size fractionated using a 3730XL DNA Analyzer (Applied Biosystems) and allele sizes were determined using GENEMAPPER v. 4.0 (Applied Biosystems).

Cytochrome b amplification and sequencing

A 438-bp fragment of mitochondrial cytochrome b was amplified in 181 individuals (Table 1). Primer sequences were 5′-CACCGTTGTAGAATTCAACTATAAG-3′ (Glu-F; Boguski et al. 2012) and either 5′-GTTAGGGTGGCGTTTGATACTG-3′ (cytb-518-R-Entosphenus) or 5′-GTTAAGGTGGCGTTTGATACTG-3′ (cytb-518-R-Lampetra). PCR amplification was performed in a 30 µL reaction volume containing approximately 50 ng genomic DNA, 1× GoTaq® Flexi Buffer, 2.5 mM MgCl2, 0.2 mM of each dNTP, 0.4 µM of each primer, and 0.6 U GoTaq® Flexi DNA Polymerase (Promega). PCR amplification conditions were an initial denaturation at 95 °C for 2 min; followed by 10 cycles of denaturation at 95 °C for 1 min, annealing at 60 °C for 1 min, and extension at 72 °C for 1.5 min; followed by 10 cycles of denaturation at 95 °C for 1 min, annealing at 58 °C for 1 min, and extension at 72 °C for 1.5 min; followed by 10 cycles of denaturation at 95 °C for 1 min, annealing at 55 °C for 1 min, and extension at 72 °C for 1.5 min; followed by a final extension at 72 °C for 5 min. PCR products were purified and amplified using the BigDye Terminator v. 3.1 Cycle Sequencing Kit (Applied Biosystems) and sequenced using a 3500 Genetic Analyzer (Applied Biosystems).
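The stepped annealing profile is easier to check when written out as data. The Python sketch below encodes the three annealing stages described above and expands them into a flat list of steps; the names `ANNEALING_STAGES` and `build_program` are illustrative, and hold times ignore thermal cycler ramping.

```python
# (annealing temperature in °C, number of cycles), as described in the text
ANNEALING_STAGES = [(60, 10), (58, 10), (55, 10)]


def build_program(stages):
    """Expand the cycling profile into (step name, temperature °C, minutes) tuples."""
    steps = [("initial denaturation", 95, 2.0)]
    for anneal_c, cycles in stages:
        for _ in range(cycles):
            steps.append(("denaturation", 95, 1.0))
            steps.append(("annealing", anneal_c, 1.0))
            steps.append(("extension", 72, 1.5))
    steps.append(("final extension", 72, 5.0))
    return steps


program = build_program(ANNEALING_STAGES)
total_cycles = sum(cycles for _, cycles in ANNEALING_STAGES)
total_minutes = sum(minutes for _, _, minutes in program)
print(total_cycles, total_minutes)
```

Under these assumptions the profile amounts to 30 cycles and 112 min of programmed hold time.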

Genetic diversity and population differentiation

A test for null alleles was performed using MICROCHECKER v. 2.2.3 (Van Oosterhout et al. 2004), and null allele frequency was calculated in FREENA (Chapuis and Estoup 2007). In order to determine whether null alleles had a substantial effect on the results, pairwise and global FST values with and without a correction for null alleles were calculated in FREENA and compared. Affected loci were retained for further analysis since the effect of null alleles was small (see “Results”) and discarding the loci would have resulted in a loss of power.

Using GENEPOP v. 4.2 (Raymond and Rousset 1995; Rousset 2008), observed and expected heterozygosity (HO and HE, respectively) were calculated for each locus and population. As well, each locus was tested for Hardy–Weinberg Equilibrium (HWE), and significance of deviations from HWE was determined after Bonferroni correction. FIS was calculated using FSTAT v. 2.9.3.2 (Goudet 1995). Allelic richness (AR) and private allelic richness (pAR) were standardized using rarefaction in HP-RARE v. 1.1 (Kalinowski 2005). Effective population size (Ne) was estimated in COLONY v. 2.0.6.1 (Jones and Wang 2010), using a medium run length with random mating and both male and female polygamy permitted (Johnson et al. 2015). A test for recent population bottlenecks was performed in BOTTLENECK v. 1.2.02 (Piry et al. 1999) using the stepwise mutation model and a one-sided Wilcoxon signed rank test for heterozygote excess. To test the hypothesis that populations further upstream or at higher elevations may have lower genetic diversity, one-sided Spearman’s rank correlation tests were performed in base R v. 3.4.0 (R Core Team 2016). These tests excluded samples from sites 1 and 2, as these sites were outside the CRB. HO, AR, pAR, FIS, and Ne were used as dependent variables in separate tests with elevation and total river distance upstream as independent variables. GENEPOP was used to calculate Weir and Cockerham’s (1984) FST for all pairwise comparisons of populations. Significance of FST values after false discovery rate correction was determined using FSTAT.
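For readers unfamiliar with these summary statistics, the sketch below shows how HO, HE, and FIS relate to one another for a single locus in a single population. It is a simplified Python illustration using the uncorrected formulas HE = 1 − Σp² and FIS = 1 − HO/HE; it is not the estimator implemented in GENEPOP or FSTAT, and the genotypes are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """genotypes: list of (allele1, allele2) tuples for one locus in one population.

    Returns (observed heterozygosity, expected heterozygosity, F_IS).
    """
    n = len(genotypes)
    h_obs = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    h_exp = 1.0 - sum((c / total_alleles) ** 2 for c in counts.values())
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is


# Hypothetical genotypes (allele sizes in bp), not data from the study.
example = [(150, 150), (150, 154), (154, 154), (150, 150)]
h_obs, h_exp, f_is = heterozygosity(example)
print(h_obs, h_exp, f_is)
```

A positive FIS, as in this toy example, indicates fewer heterozygotes than expected under Hardy–Weinberg proportions, which is also the signature that null alleles can produce.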

Geographic structure and population clustering

Single and multiple regressions of river distance and site elevation versus genetic distance (linearized FST) were performed using the lm function in base R. Sites 1 and 2 were excluded from this analysis as they are outside the CRB. River distance was separated into three components (tributary distance from the first site to the Columbia River mainstem, distance in the mainstem, and tributary distance from the mainstem to the second site). Models were compared using AICC (Burnham and Anderson 2002), which was calculated using the R package AICcmodavg (Mazerolle 2017).
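Two quantities in this step can be written out explicitly. The sketch below assumes that "linearized FST" means FST/(1 − FST), the usual transformation for isolation-by-distance regressions, and uses the small-sample AICc correction 2k(k + 1)/(n − k − 1). It is a Python illustration rather than the R code used in the study, and the least-squares AIC shown here omits additive constants that R's log-likelihood includes, so only differences between models fitted to the same data are comparable.

```python
import math


def linearized_fst(fst):
    """Assumed transformation: FST / (1 - FST)."""
    return fst / (1.0 - fst)


def aicc_least_squares(rss, n, k):
    """AICc for a least-squares fit.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters (coefficients, intercept, and residual variance).
    """
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Hypothetical inputs for illustration.
print(linearized_fst(0.5))
print(aicc_least_squares(rss=10.0, n=20, k=3))
```

Because the correction term grows with k, adding a predictor such as elevation lowers AICc only if it reduces the residual sum of squares enough to offset the penalty.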

STRUCTURE v. 2.3.4 (Pritchard et al. 2000) was used to analyze population clustering. To compensate for the presence of null alleles, the recessive alleles setting was used. Runs were performed using the admixture model with allele frequencies correlated, with an initial burn-in of 1,000,000 iterations followed by 1,000,000 iterations, and 20 replicates of each K. Preliminary analyses (data not shown) indicated that the most likely number of population clusters was probably less than 10; therefore, K values from 1 to 10 were examined. The most likely number of population clusters was determined using the Evanno et al. (2005) method as implemented in STRUCTURE HARVESTER v. 0.6.94 (Earl and VonHoldt 2012). For this K, results of replicates were combined using CLUMPP v. 1.2.2 (Jakobsson and Rosenberg 2007) and visualized using DISTRUCT v. 1.1 (Rosenberg 2004). To identify additional substructure, the complete dataset was subdivided into the clusters indicated by STRUCTURE, and similar analysis was performed for each cluster. In each of these clusters, K values from 1 to the number of sites were examined.
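The ΔK statistic of Evanno et al. (2005) is the absolute second-order rate of change of ln P(X|K) divided by the standard deviation of ln P(X|K) across replicates. The Python sketch below computes it from replicate means, which is one common reading of the method; the log-likelihood values are hypothetical and the function assumes consecutive integer values of K.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """lnp_by_k: dict mapping K to a list of replicate ln P(X|K) values.

    Returns a dict mapping each interior K to its delta-K value.
    """
    ks = sorted(lnp_by_k)
    means = {k: mean(lnp_by_k[k]) for k in ks}
    result = {}
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        second_difference = abs(means[next_k] - 2 * means[k] + means[prev_k])
        result[k] = second_difference / stdev(lnp_by_k[k])
    return result


# Hypothetical replicate log-likelihoods, not output from the study.
runs = {
    1: [-9000.0, -9002.0, -8998.0],
    2: [-8400.0, -8401.0, -8399.0],
    3: [-8350.0, -8360.0, -8340.0],
    4: [-8330.0, -8300.0, -8360.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because ΔK needs a value on either side of K, it is undefined for the smallest and largest K examined, so the method cannot by itself support K = 1.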

A neighbor-joining tree was constructed using POPULATIONS v. 1.2.30 (O. Langella, available from http://bioinformatics.org/~tryphon/populations/) with Cavalli-Sforza and Edwards’ (1967) chord distance and 500 bootstrap replications on loci. This tree was visualized in TREEVIEW v. 1.6.6 (Page 1996).
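As an illustration of the distance underlying the tree, the sketch below implements one common formulation of the Cavalli-Sforza and Edwards chord distance, averaged over loci: D = (2/(πr)) Σ √(2(1 − Σ √(x·y))), where r is the number of loci and x and y are allele frequencies in the two populations. The exact scaling used by POPULATIONS is not stated in the text, so this Python version should be read as an assumption, and the allele frequencies are hypothetical.

```python
import math


def chord_distance(pop_x, pop_y):
    """pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency."""
    total = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        alleles = set(freqs_x) | set(freqs_y)
        cos_theta = sum(
            math.sqrt(freqs_x.get(a, 0.0) * freqs_y.get(a, 0.0)) for a in alleles
        )
        # Guard against tiny negative values caused by floating-point rounding.
        total += math.sqrt(2.0 * max(0.0, 1.0 - cos_theta))
    return (2.0 / (math.pi * len(pop_x))) * total


# Hypothetical single-locus allele frequencies.
pop_1 = [{"a": 0.5, "b": 0.5}]
pop_2 = [{"c": 1.0}]
print(chord_distance(pop_1, pop_1))
print(chord_distance(pop_1, pop_2))
```

The distance is zero for identical frequencies and reaches its maximum, 2√2/π, when two populations share no alleles at any locus.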

Direction of gene flow

To determine the direction of gene flow (i.e., primarily downstream versus bidirectional), spatial models were constructed to compare the proportion of genetic differentiation among sites explained by bidirectional connectivity of sites (distance-based Moran’s eigenvector maps; dbMEM) versus only downstream connectivity (asymmetric eigenvector maps; AEM).

dbMEM models were constructed following the methods of Borcard and Legendre (2002), Dray et al. (2006), and Blanchet et al. (2011). Pairwise distances between sites were converted to a minimum spanning tree matrix in R using the package ape (Paradis et al. 2004); the minimum spanning tree matrix was used as the binary edge matrix. Edge weighting formulas were binary (unweighted), distance^α, 1 − (distance/distance_max)^α, and 1/distance^α, with α values of 1 to 3. Adjacent sites (those coded as 1 in the binary matrix) were given the weight corresponding to that edge. The largest weight was identified as the threshold value, and all non-adjacent sites were assigned weights of 4 × threshold. The weighted matrix was transformed into eigenvectors through principal coordinates analysis (PCoA) in R using ape.
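The weighting step can be sketched as follows. This Python example is illustrative only and is not the R code used in the study: the scheme names are hypothetical labels for the four formulas, distance_max is assumed to be the largest distance among adjacent pairs, the diagonal is assumed to be zero, and the distances are invented.

```python
def edge_weight(distance, scheme, alpha=1, distance_max=None):
    """Weight for one adjacent pair of sites under one of the four schemes."""
    if scheme == "binary":
        return 1.0
    if scheme == "power":       # distance^alpha
        return float(distance) ** alpha
    if scheme == "scaled":      # 1 - (distance / distance_max)^alpha
        return 1.0 - (distance / distance_max) ** alpha
    if scheme == "inverse":     # 1 / distance^alpha
        return 1.0 / float(distance) ** alpha
    raise ValueError(f"unknown scheme: {scheme}")


def weighted_matrix(distances, adjacency, scheme, alpha=1):
    """Weight adjacent pairs; give every non-adjacent pair 4 x the largest weight."""
    n = len(distances)
    adjacent = [(i, j) for i in range(n) for j in range(n) if i != j and adjacency[i][j]]
    distance_max = max(distances[i][j] for i, j in adjacent)
    weights = {
        (i, j): edge_weight(distances[i][j], scheme, alpha, distance_max)
        for i, j in adjacent
    }
    threshold = max(weights.values())
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                matrix[i][j] = weights.get((i, j), 4 * threshold)
    return matrix


# Hypothetical three-site example: sites 0-1 and 1-2 are adjacent in the spanning tree.
distances = [[0, 10, 30], [10, 0, 20], [30, 20, 0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(weighted_matrix(distances, adjacency, "power", alpha=1))
```

In this toy case the largest adjacent weight is 20, so the non-adjacent pair of sites 0 and 2 receives a weight of 80.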

AEM models were constructed following the methods of Blanchet et al. (2008a). The confluence of the Klaskanine River and the mainstem Columbia River was designated as the furthest downstream point. For each site, all river segments connecting it to this point were coded as 1, and all other segments were coded as 0 [see Blanchet et al. (2008a) for a diagram demonstrating how river segments are coded]. The same weighting schemes were used as for the dbMEM models; the weight of each edge was determined by multiplying its binary value by the corresponding weight. This weighted edge matrix was transformed into eigenvectors in R using package AEM (Blanchet et al. 2015).
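The binary coding of river segments can be sketched in Python as below. The river network, node names, and function name are hypothetical; each segment is named after the node at its upstream end, and the network is assumed to be a tree draining to a single outlet. This illustrates the coding idea only and is not the AEM package implementation.

```python
def sites_by_edges(downstream_of, sites, outlet):
    """Build the binary sites-by-edges matrix used as input for AEM.

    downstream_of: dict mapping each node to the next node downstream.
    Each river segment (edge) is identified by the node at its upstream end.
    """
    edges = sorted(downstream_of)
    matrix = []
    for site in sites:
        on_path = set()
        node = site
        while node != outlet:
            on_path.add(node)           # segment leaving this node toward the outlet
            node = downstream_of[node]
        matrix.append([1 if edge in on_path else 0 for edge in edges])
    return edges, matrix


# Hypothetical network: two tributaries joining above a single outlet.
network = {"trib_A": "confluence_1", "trib_B": "confluence_1", "confluence_1": "outlet"}
edges, matrix = sites_by_edges(network, ["trib_A", "trib_B"], "outlet")
print(edges)
print(matrix)
```

Both sites share the segment below the confluence but differ in their tributary segment, which is how the matrix encodes shared downstream connectivity. Multiplying each column by an edge weight gives the weighted version described above.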

Redundancy analysis (RDA) was performed to determine how much of the genetic variation was explained by the spatial variation included in each model. This RDA was performed using the R package vegan (Oksanen et al. 2016). The independent variables were the spatial eigenvectors produced by AEM or dbMEM models. The dependent variables were the eigenvectors produced by principal coordinates analysis (PCoA) of the pairwise FST values (Vangestel et al. 2012; Orsini et al. 2013) using ape. After initial redundancy analysis, forward selection of spatial eigenvectors was performed following the method of Blanchet et al. (2008b) using the R package packfor (Dray et al. 2016). Eigenvectors were added one by one to the model until one of two stopping criteria was reached: (a) the adjusted r² equalled that of the model with all eigenvectors included, or (b) the p value of the next eigenvector being added was ≥ 0.05. After forward selection, redundancy analysis was performed again to determine which model had the highest adjusted r² (i.e., the most explanatory power). Finally, variation partitioning was performed for all models, using vegan, to compare the importance of bidirectional connectivity of sites (i.e., dbMEM eigenvectors) versus downstream connectivity (i.e., AEM eigenvectors) in determining genetic differentiation among populations.
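The two stopping criteria can be illustrated with a simplified Python sketch. It assumes that each candidate eigenvector's p value and the adjusted r² of the model after adding it have already been computed, whereas the actual procedure refits the model at every step; the function names and candidate values are hypothetical.

```python
def adjusted_r2(r2, n_sites, n_predictors):
    """Adjusted r-squared for a model with n_predictors explanatory variables."""
    return 1.0 - (1.0 - r2) * (n_sites - 1) / (n_sites - n_predictors - 1)


def forward_select(candidates, global_adj_r2, alpha=0.05):
    """candidates: (name, p_value, adjusted r2 after adding), in order of entry."""
    selected = []
    for name, p_value, adj_r2_if_added in candidates:
        if p_value >= alpha:                     # criterion (b): not significant
            break
        selected.append(name)
        if adj_r2_if_added >= global_adj_r2:     # criterion (a): global model reached
            break
    return selected


# Hypothetical candidate eigenvectors.
steps = [("V1", 0.001, 0.40), ("V2", 0.020, 0.55), ("V3", 0.300, 0.60)]
print(forward_select(steps, global_adj_r2=0.70))
print(adjusted_r2(0.5, n_sites=15, n_predictors=2))
```

Here V3 is excluded because its p value exceeds 0.05. Capping selection at the adjusted r² of the full model is intended to limit overfitting when many spatial eigenvectors are available relative to the number of sites.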

Cytochrome b sequence analysis

Sequences were manually verified using CHROMAS LITE v. 2.1.1 (Technelysium; available from http://technelysium.com.au/), and aligned in GENEIOUS v. 5.6.6 (Kearse et al. 2012) using the Geneious Alignment method with default settings. Median-joining networks were constructed using NETWORK v. 4.6.1.1 (Bandelt et al. 1999) with default settings; these networks included samples from the present study and range-wide samples from Boguski et al. (2012).

Results

Genetic diversity and population differentiation

A total of 660 WBL were genotyped at eight microsatellite loci with a missing data rate of 10.44%. Summary statistics for each location are given in Online Resource 1—Table S2. Most loci had low (< 5%) and/or not statistically significant frequency of null alleles; however, null alleles were present at loci Etr-3 and Lri-10 at nine and seven of the 17 sites, respectively. Corrected and uncorrected pairwise and global FST values were very similar, with corrected values only slightly lower (Table 2; Online Resource 1—Tables S3 and S4). Thus, it appears that the presence of null alleles caused a slight overestimation of population differentiation; this was considered in the interpretation of the results. Locus Lri-9 displayed heterozygote excess in five populations. Homozygote excess was observed in one population each for loci Lri-2, Lri-3, and Lri-7; four populations for Lri-10; and seven populations for Etr-3. Among the 17 sites, average HO ranged from 0.159 (site 17) to 0.474 (site 4); FIS ranged from − 0.328 (site 7) to 0.378 (site 17). Ne ranged from 6 (site 10) to 40 (site 2); AR ranged from 1.683 (site 7) to 3.564 (site 1); pAR ranged from 0.01 (site 13) to 0.33 (site 5). Bottlenecks were not detected in any population (p ranging from 0.6563 to 1 for 17 tests). Distance upstream was negatively correlated with HO (ρ = − 0.5077, p = 0.0334) and Ne (ρ = − 0.5683, p = 0.0170), and elevation was negatively correlated with HO (ρ = − 0.5077, p = 0.0334; Online Resource 1—Table S5). Pairwise FST values ranged from − 0.0026 (between sites 7 and 9) to 0.7117 (between sites 7 and 17), with an average of 0.2929 (Table 2). Out of 136 pairwise comparisons, 125 were significant after false discovery rate correction (p < 0.000327).
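The ρ values reported above are Spearman rank correlations. As a reminder of what the statistic measures, the sketch below computes ρ as the Pearson correlation of tie-averaged ranks in plain Python; it does not compute the one-sided p values reported in the text, and the distances and heterozygosities are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical distances upstream (km) and observed heterozygosities.
distance_km = [10, 50, 120, 300, 410]
h_obs = [0.45, 0.40, 0.42, 0.30, 0.20]
rho = spearman_rho(distance_km, h_obs)
print(rho)
```

A negative ρ, as in this toy example, corresponds to diversity declining with distance upstream.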

Table 2 FST values for pairwise comparisons among populations of western brook lamprey, calculated from eight microsatellite loci

Geographic structure and population clustering

Although all components of river distance were significantly correlated with genetic distance, the best model was the one that included downstream tributary distance and upstream tributary distance as independent variables (p = 3.14 × 10⁻¹², adjusted r² = 0.3934, AICC = 70.2023; Online Resource 1, Table S6). The effects of downstream and upstream tributary distances were similar (coefficients of 0.0085 and 0.0071, respectively, with overlapping 95% confidence intervals). This model was not improved by adding site elevations as additional independent variables (p = 3.91 × 10⁻¹¹, adjusted r² = 0.3975, AICC = 71.8755).

STRUCTURE HARVESTER indicated that the most likely number of genetic clusters for WBL was two (mean lnP(K) = − 8412.7850; mean ΔK = 645.7748; Fig. 2; Online Resource 1—Table S7). Cluster 1 contained sites 7–15 (the Lewis and Salmon sub-basins and Gibbons Creek from the Lower Columbia). Within cluster 1, two sub-clusters were identified (mean lnP(K) = − 3066.0850; mean ΔK = 1359.0665); sub-cluster A contained sites 11–15 (the Salmon sub-basin plus Lockwood and Gibbons creeks), and sub-cluster B contained sites 7–10 (the Lewis sub-basin except for Lockwood Creek). Cluster 2 contained sites 1–6, 16, and 17 (Coastal Washington, the Columbia Estuary, Columbia Gorge, Columbia Plateau, and most of the Lower Columbia) and could be divided into three sub-clusters (mean lnP(K) = − 4006.1050; mean ΔK = 177.0766). Sub-cluster C contained Walla Walla River (the Columbia Plateau); sub-cluster D contained sites 3, 4, and 6 (both Columbia Estuary sites and North Fork Scappoose in the Lower Columbia); and sub-cluster E contained sites 1, 2, and 5 (both Coastal Washington sites and Merrill Creek in the Lower Columbia). Trout Lake Creek (site 16) was intermediate between sub-clusters C and E.

Fig. 2

Bar plot illustrating membership in western brook lamprey population clusters, using data from eight microsatellite loci. Each individual is represented by a single vertical bar broken into colored segments; the height of each color within the bar represents the proportionate genetic assignment of that individual to a population cluster. Numbers beneath the bar plot are color-coded according to region and indicate collection locations corresponding to those given in Table 1 and Fig. 1. a Population clusters present in the entire dataset. Cluster 1 = light gray, cluster 2 = dark gray. b Sub-clusters present in Cluster 1. Cluster 1a = light gray, cluster 1b = dark gray. c Sub-clusters present in Cluster 2. Cluster 2a = light gray, cluster 2b = mid-gray, cluster 2c = dark gray. (Color figure online)

A neighbor-joining tree showed generally strong support for geographic grouping of WBL (Fig. 3). Sites 1 and 2 (Coastal Washington) clustered together with 80% bootstrap support. Most of the Lewis sub-basin (sites 7–10) clustered together with 90% bootstrap support. Within the Salmon sub-basin, geographic groupings were not as strong; these sites (12–14) clustered together with 52% bootstrap support. Two sites outside of the Salmon sub-basin (site 11, Lockwood Creek, and site 15, Gibbons Creek) also grouped close to the Salmon sub-basin sites.

Fig. 3

Neighbor-joining tree showing the relationships among western brook lamprey collected from 17 sites, using data from eight microsatellite loci. Scale bar shows Cavalli-Sforza and Edwards’ (1967) chord distance; bootstrap values greater than 50% are shown at nodes. Site numbers correspond to those given in Table 1 and Fig. 1

Direction of gene flow

One of the dbMEM models (i.e., including bidirectional connectivity of sites) and two of the AEM models (including only downstream connectivity) were rejected completely because no spatial eigenvectors were retained after forward selection. The three best models were those weighted by distance^α with α = 1, 2, and 3 (adjusted r² = 0.812, 0.800, and 0.687, respectively; Online Resource 1, Table S8). The dbMEM models explained little variation (− 0.003 to 0.094) on their own; a larger portion of the variation was attributed solely to AEM models (0.183–0.434). The largest portion of the variation was shared between AEM and dbMEM models (0.381–0.523).

Cytochrome b analysis

Seven haplotypes were found in the samples analyzed in the present study: three that were shared with other west coast Lampetra (H1 to H3) and four that have been reported only in the CRB (H4 to H7; Fig. 4). Coastal haplotypes not present in the CRB were H8 to H10 and H13 to H16; haplotypes H11 and H12 were found in the CRB by Boguski et al. (2012) but not in the present study. Haplotype frequency and distribution are given in Online Resource 1—Tables S9 and S10, along with GenBank accession numbers. Within the CRB, several haplotypes were unique to particular regions: H5 occurred only in the Lewis sub-basin and H7, which was distinct from all other CRB haplotypes by at least five mutations (1.1%), occurred in only one site (site 6, North Fork Scappoose Creek). H3 occurred in the present study only in the Columbia Estuary; however, this haplotype was shared with samples from Coastal Oregon in Boguski et al. (2012). H2, which was found by Boguski et al. (2012) in all six regions from Alaska to Point Conception, was also widely distributed within the CRB, occurring in five of the seven regions examined in the present study (i.e., all except Coastal Washington and the Columbia Estuary). Virtually all (53/54) of the WBL surveyed from the upstream-most CRB sites (i.e., sites 16–17 and A–F in the Columbia Gorge and Columbia Plateau) possessed H2; only one individual from Trout Lake Creek had a haplotype other than H2.

Fig. 4

Median-joining networks constructed from 438 base pairs of cytochrome b sequence. Labels H1 through H16 indicate haplotypes. a Network including Lampetra samples from the entire west coast of North America, from Boguski et al. (2012) and the present study. Color-coding indicates regions used by Boguski et al. (2012). Haplotype frequency is given in Online Resource 1—Table S9. b Network including only western brook lamprey samples from the present study. Color-coding indicates regions given in Table 1. Haplotype H5 was found in sites 7–10; H2 and H6 were found in site 11; H7 was found only in site 6. Haplotype frequency is given in Online Resource 1—Table S10. (Color figure online)

Discussion

In the Columbia River Basin, clear geographic structuring and moderate to high genetic differentiation were evident among populations of western brook lamprey. This contrasts dramatically with the low levels of genetic differentiation seen among anadromous lampreys from even widely-disparate locations (e.g., Spice et al. 2012; Yamazaki et al. 2014). A trend of isolation by distance (IBD) was present, indicating that some gene flow among geographically proximate populations occurs, but most of this gene flow appears to take place in tributaries rather than in or through the mainstem Columbia River. Spatial models indicated that gene flow occurs primarily in a downstream direction, and genetic diversity decreased from downstream to upstream. Microsatellite genetic structure usually corresponded to mtDNA genetic structure and contemporary river connectivity, but the few discrepancies noted may help provide insight into historical processes within the CRB.

Contemporary connectivity among WBL populations

This study is novel in that it examined population structure of a freshwater-resident, non-parasitic lamprey in a single large river basin and showed moderate to high genetic differentiation among populations. Most studies of lamprey population structure to date have focused on anadromous parasitic species and found low (although sometimes significant) genetic differentiation among locations (e.g., Almada et al. 2008; Spice et al. 2012; Hess et al. 2013; Yamazaki et al. 2014). FST values for WBL populations separated by < 570 km ranged from − 0.0026 to 0.7117, with an average of 0.2929; these values are more than an order of magnitude greater than those for anadromous Pacific lamprey separated by up to 2600 km (maximum 0.0584, average 0.0165; Spice et al. 2012).

Most population genetic studies conducted on brook lampreys used only mtDNA markers (Mateus et al. 2011; Boguski et al. 2012), compared sites between river basins separated by saltwater barriers (Rougemont et al. 2016, 2017), or included relatively few sites within a single basin. Nevertheless, these studies also suggest that brook lamprey populations tend to be genetically distinct over small to moderate spatial scales. In France, in tributaries of the Garonne River separated by less than 100 km, genetic differentiation between two European brook lamprey populations was moderate and significant (FST = 0.210; Rougemont et al. 2015). Among four European brook lamprey populations separated by 108–222 km in the Humber River drainage in the U.K., FST values were likewise moderate and significant (0.103–0.191; Bracken et al. 2015). Lack of genetic differentiation among populations of brook lampreys has been reported only over very small spatial scales (FST = − 0.020 to 0.005 among three populations of Lethenteron sp. N separated by approximately 10 km; Yamazaki et al. 2011a) or in recently established populations (FST = 0.037 between two non-parasitic populations of Arctic lamprey separated by approximately 40 km but apparently founded by the anadromous parasitic form within the last century; Yamazaki et al. 2011b).

Even in comparison to other freshwater fish species with limited dispersal, WBL have relatively strong population structure. The comparatively high FST values seen in the present study may be partially due to the large size of the CRB; however, moderate differentiation was sometimes present even between spatially proximate sites. For example, FST was 0.2687 between lower Cedar Creek and North Fork Chelatchie Creek (a tributary to Cedar Creek), which are separated by only 10.6 km. In contrast, in Rocky Mountain sculpin Cottus sp., pairwise FST values ≤ 0.02 were observed among sites separated by 0.2–11 km within the same river, even though dispersal in this species appears very limited; 50% of tagged individuals moved ≤ 10 m over a 5-month period (Ruppert et al. 2017). In the more mobile but still non-migratory European chub Squalius cephalus, which is able to cover distances of up to 15 km to spawn, pairwise FST values among 11 sites within a 219 km stretch of the mainstem Durance River ranged from 0 to 0.055 with a global FST of 0.015 (Dehais et al. 2010). Population structure observed in guppy Poecilia reticulata was more similar to that observed in the present study (Barson et al. 2009). In five pairs of upstream and downstream sites separated by less than 10 km, pairwise FST values calculated from eight microsatellite loci were significant and ranged from 0.013 to 0.927. Upstream and downstream sites, however, were also separated by changes in elevation, which would presumably be a barrier to upstream gene flow for fish without strong swimming abilities. In the present study, genetic distance was much more strongly affected by river distance than by elevation; however, elevation was negatively correlated with observed heterozygosity. Future research should examine the effects of elevation changes on gene flow in WBL.

The presence of significant IBD shows that some gene flow among WBL populations occurs (Hutchison and Templeton 1999); however, tributary rather than total distance was the best predictor of FST, indicating that most gene flow occurs within tributaries rather than in or through the mainstem of the Columbia River. Although evidence suggests that WBL larvae drift downstream into the mainstem fairly often (Harris and Jolley 2017), and Lampetra sp. larvae that may be WBL have been detected in large rivers such as the Willamette River (Jolley et al. 2012), the results of the present study suggest that larvae that drift into the mainstem Columbia rarely succeed in re-entering tributaries and spawning. Our finding that genetic distance was better predicted by AEM models (including only downstream connectivity) than dbMEM models (bidirectional connectivity) also indicates that there is very little upstream gene flow.

In river systems, asymmetric gene flow is expected to result in reduced genetic diversity in upstream populations (Morrissey and de Kerckhove 2009). This should be particularly evident in brook lampreys since compensatory upstream movement for spawning is limited relative to the largely passive downstream movement experienced during the prolonged larval period. Other studies of freshwater fishes with limited migration have found that both allelic richness and heterozygosity increase from upstream to downstream (Hänfling and Weetman 2006; Dehais et al. 2010). The same pattern has been observed in other sedentary riverine organisms, such as mussels (e.g., Mock et al. 2010, 2013). In WBL in the CRB, heterozygosity (but not allelic richness) was lower in upstream and high elevation sites. Lack of correlation between allelic richness and upstream distance may be due to a confounding effect related to historical patterns of post-glacial colonization (see below).

Headwater areas are expected to have particularly low genetic diversity due to the lack of incoming gene flow from upstream; however, they may also have a high frequency of private alleles (Morrissey and de Kerckhove 2009). In the present study, relatively low genetic diversity was observed at upstream sites within the Lewis and Salmon sub-basins. The frequency of private microsatellite alleles was not correlated with upstream distance, but unique or mostly unique mtDNA haplotypes (H5 and H6) were observed in both the Lewis and Salmon sub-basins. Relatively high Ne (39) was found at lower Cedar Creek (furthest downstream in the Lewis sub-basin). Very low Ne was found in the remaining sites, with the lowest Ne (6) at upper Cedar Creek (furthest upstream in the Lewis sub-basin). Headwater areas tend to have higher stream gradients and less fine sediment than areas lower in a watershed, potentially giving them a lower capacity to support larval lamprey (see Dawson et al. 2015). Given the low effective population sizes and unique haplotypes observed in headwater sites, conservation of WBL in headwater areas may be particularly important. Unlike large-bodied parasitic lampreys that are capable of swimming up to several hundred kilometers upstream, brook lampreys will be more limited in their recolonization abilities if extirpated. Anadromous Pacific and sea lampreys have successfully colonized or recolonized rivers within years of the removal of dams or natural barriers (e.g., Farlinger and Beamish 1984; Hogg et al. 2013; Jolley et al. 2018). Where barriers persist, translocation from downstream sites is being used to introduce Pacific lamprey into upstream sub-basins (Ward et al. 2012). However, given the moderate to strong genetic differentiation observed in WBL in the CRB, translocation in this species could severely disrupt population structure and local adaptation. 
In vulnerable headwater regions, therefore, conservation efforts should focus on habitat conservation and restoration (see Maitland et al. 2015).

Historical connectivity among WBL populations in the CRB

One of the major questions raised by the present study is whether the population structure observed in WBL is due to patterns of colonization or present connectivity. The answer is likely that the effects of both can be seen within the CRB. In other relatively sedentary freshwater fish species, changes in drainage systems have resulted in phylogeographic patterns that reflect past rather than present river connectivity (Waters et al. 2001; Adamson et al. 2012; Wu et al. 2016). In mottled sculpin Cottus bairdii, for example, microsatellite population structure was found to be more reflective of historical patterns of colonization than present patterns of connectivity (Homola et al. 2016). Similarly, the presence of an unexpected haplotype in European river and/or brook lampreys in one tributary of the Tagus River was attributed to geological events which previously isolated this tributary (Mateus et al. 2011), and microsatellite data likewise show that lampreys on the Iberian Peninsula have been isolated for long periods (Mateus et al. 2016). The present study found two major genetic groups (one comprising the Lewis and Salmon sub-basins and the other comprising all other sites), with additional sub-structuring apparent within each larger group. This sub-structuring roughly corresponds to contemporary drainages. It should be noted that sampling focused heavily on the Lewis and Salmon sub-basins, which may have made it easier to detect genetic groupings in these areas. There were several cases where the genetic groupings did not correspond exactly to present geography, and these discrepancies may provide information about the history and biogeography of the CRB.

Coastal Washington, the Columbia Estuary, and the Lower Columbia were generally grouped together by microsatellite analyses (except for site 15; see below); however, some degree of historical isolation among and within these sites was suggested by the mtDNA analysis. Haplotype H7, for example, was found only in North Fork Scappoose Creek, which drains into Scappoose Bay and then into the Multnomah Channel (a distributary of the Willamette River) near its confluence with the Columbia River. Haplotype H7 was the most distinctive haplotype identified in the present study, differing by at least 1.1% from any other WBL examined to date, including those from nearby (12 km linear distance, 55 km river distance) Merrill Creek. Most of the Willamette drainage samples examined by Boguski et al. (2012) were Pacific brook lamprey, but the WBL in North Fork Scappoose Creek differed from Pacific brook lamprey by 3.7% in cytochrome b sequence and were clearly not Pacific brook lamprey. The remaining two Willamette drainage samples analyzed by Boguski et al. (2012) contained haplotypes H11 and H12, which are distinct from both H7 from North Fork Scappoose (1.1–1.4%) and H2 (0.9–1.1%), which is common in other parts of the CRB and along the west coast. During the last glaciation, the Missoula Floods resulted in periodic flooding of the Willamette Valley (Waitt 1985). Perhaps WBL colonized different parts of the Willamette Valley during these floods and were subsequently isolated, resulting in the development of distinctive haplotypes, before drainage patterns shifted again to allow the present river connectivity. The distinctiveness of haplotypes H7, H11, and H12 (and restriction of Pacific brook lamprey to this drainage) indicates that the Willamette drainage warrants further study.

One surprising result was the low genetic differentiation (FST = 0.0199) between the two Coastal Washington sites. Although all individuals from these two rivers had the Coastal Washington-specific haplotype H1, suggesting colonization by a common lineage, patterns of colonization alone cannot explain the low genetic differentiation observed with microsatellite data. Other pairs or groups of sites sharing region-specific haplotypes had much higher pairwise FST values (e.g., 0.0927 between the two Columbia Estuary sites; up to 0.2687 among sites 7 to 10, the tributaries of the mainstem Lewis River). Anadromous western river lamprey might mediate contemporary gene flow between coastal WBL populations. Western river lamprey and western brook lamprey form a species pair, and the two cannot be genetically differentiated using cytochrome b sequence (Docker 2009; Boguski et al. 2012). Although size differences between non-parasitic WBL and parasitic western river lamprey generally prevent successful hybridization, some interbreeding has been observed in the laboratory (Beamish and Neville 1992). As well, WBL populations may occasionally give rise to western river lamprey (see Jolley et al. 2016); however, western river lamprey have not been directly recorded from either Bear River or South Fork Naselle River. Alternatively or additionally, non-parasitic lampreys may have some ability to osmoregulate in salt water (see Dawson et al. 2015), perhaps permitting contemporary gene flow mediated by WBL. The Bear and South Fork Naselle rivers drain into the Willapa Bay estuary, and Lampetra sp. larvae have been detected in estuarine areas in Ellsworth Creek, a tributary of the Naselle River (Silver 2015). With only two Coastal Washington sites included in the present study, it is difficult to properly assess the possibility or magnitude of gene flow through salt water or estuaries. 
Future studies should examine population structure of WBL in a larger number of coastal rivers and directly estimate rates of migration between populations.

The Lewis and Salmon sub-basins appeared to contain WBL from at least two different lineages, which did not directly correspond to contemporary drainages. Most Lewis sub-basin sites surveyed (sites 7–10, i.e., excluding Lockwood Creek) drain into the mainstem of the Lewis River. These four sites grouped together in the microsatellite analyses, and all individuals sequenced had the Lewis-specific haplotype H5. However, Lockwood Creek, which drains into the East Fork Lewis River, grouped with the Salmon sub-basin in both microsatellite and mtDNA analyses. The haplotype distribution of Lockwood Creek was more similar to the Salmon sub-basin (a mix of H2 and H6) than to the rest of the Lewis sub-basin (exclusively the Lewis-specific haplotype H5). Tributaries of Salmon Creek are within a few kilometers of the East Fork Lewis River, and a post-glacial headwater capture seems plausible. Headwater capture has been documented as a mode of dispersal and gene flow in many other freshwater fish species (e.g., Waters et al. 2001; Burridge et al. 2006; Adamson et al. 2012), and it may be particularly important in non-migratory species such as WBL. As well, the lower elevation of the Salmon sub-basin may make contemporary connectivity with Lockwood Creek possible. The rest of the Lewis sub-basin appears to have been colonized by a different lineage. Cytochrome b sequence of WBL from the tributaries of the mainstem Lewis River differed by only 1–2 substitutions (0.23–0.46%) from those sampled by Boguski et al. (2012) in northwestern Washington, including those from one tributary of the Chehalis River. The Chehalis River and surrounding areas were a refugium—separate from the Columbia River refugium—for other freshwater fish species during the most recent glaciation (McPhail and Lindsey 1986), and it seems likely that the mainstem Lewis River and its tributaries were colonized from this refugium. 
Relatively recent colonization and successive founder events may contribute to the low microsatellite genetic diversity found in this area. It is notable that, although all individuals from North Fork Chelatchie Creek had the Lewis-specific haplotype H5, North Fork Chelatchie Creek was somewhat distinct from the rest of the Lewis sub-basin (FST = 0.212–0.269). As well, STRUCTURE analysis placed about one-third of individuals from this site in the Salmon sub-cluster rather than the Lewis sub-cluster. The reasons for this are unclear, and further study of population structure in this area is recommended.

Gibbons Creek, the uppermost site in the Lower Columbia, grouped with the Salmon sub-basin in microsatellite analyses; this is surprising because the river distance between these sites is about 110 km. Pairwise FST values between Gibbons Creek and the Salmon sub-basin were only 0.026–0.083, suggesting some degree of historical connectivity. In contrast, FST values between Gibbons Creek and the other two Lower Columbia sites outside of the Lewis and Salmon sub-basins (Merrill and North Fork Scappoose creeks, sites 5 and 6) were 0.200 and 0.277, respectively, despite their closer proximity (83 and 22 km). Thus, although Gibbons Creek extends only a few kilometers upstream from the Columbia mainstem, it seems to defy simple categorization as a Lower Columbia River population, and the genetic similarity between WBL in Gibbons Creek and those in the Salmon sub-basin requires further study. Since Gibbons Creek is separated from the Salmon sub-basin by the Washougal River, historical headwater capture is unlikely, although a major flooding event (such as the Missoula Floods) could have provided a temporary connection between these drainages. If individuals from Gibbons Creek or the mainstem Columbia were swept into the Salmon sub-basin, this would account for the presence of the H2 haplotype alongside the H6 haplotype common in the Salmon sub-basin. Further study of this region (including any WBL present in the Washougal River drainage) is necessary to ascertain the nature of the connection between Gibbons Creek and the Salmon sub-basin.

In general, the Columbia Gorge and Plateau regions displayed low haplotypic diversity; of the 54 individuals sequenced from these regions, 53 had the common haplotype H2. This suggests WBL may have persisted primarily in the Lower Columbia during glaciation, and upstream regions were later colonized from downstream and subjected to multiple founder events. However, both microsatellite and mtDNA diversity were higher than expected in Trout Lake Creek, a tributary of the White Salmon River. Four of the five individuals sequenced from Trout Lake Creek had the common haplotype H2; however, one individual had haplotype H6, which otherwise occurs only in the Salmon sub-basin and Lockwood Creek. Both headwater capture and contemporary connectivity between Trout Lake Creek and Salmon Creek seem unlikely, as river distance between these sites is approximately 230 km and linear distance is approximately 80 km. This site may have been colonized by more than one group of WBL, but analysis of other sites in this region will be needed to better elucidate colonization patterns. Trout Lake Creek is also of particular interest because it is upstream of several waterfalls and the Condit Dam site. Condit Dam was in place for over 100 years and, along with the waterfalls, was a clear barrier to upstream passage of fish (see Jolley et al. 2018). In the present study, WBL samples were collected prior to the decommissioning of the dam in 2011. Additional study of the White Salmon River and its tributaries would therefore also be of interest to determine whether decommissioning the dam has caused any changes in population connectivity or genetic diversity.

Conclusions

Although considerable research and management efforts are being directed at anadromous Pacific lamprey (e.g., Wang and Schaller 2015; Reid and Goodman 2015; Clemens et al. 2017), WBL have received little research attention, and a lack of knowledge about this species inhibits conservation efforts. Most programs designed to monitor anadromous lamprey abundance (e.g., during upstream migration) are not suitable for detecting brook lampreys. Even in the CRB, where lamprey abundance is closely monitored, most of the available data are for Pacific lamprey (Kostow 2002). The present study aimed to rectify some of these information deficits through examination of historical and contemporary WBL population structure in the CRB. Future studies using other markers (e.g., single nucleotide polymorphisms) and more extensive sampling outside of the Lewis and Salmon sub-basins may provide additional insight into WBL population structure in this large river basin. However, even conservative interpretation of the current results shows significant geographic structuring and limited gene flow among populations. Gene flow appeared to occur primarily in a downstream direction, resulting in a decrease in genetic diversity in upstream sites and suggesting that WBL populations in these areas may be particularly vulnerable to local extinction. Furthermore, most gene flow appeared to take place in tributaries rather than through the mainstem Columbia River. Therefore, whereas Pacific lamprey management efforts are focusing on improving passage at mainstem dams (Moser et al. 2015; Clemens et al. 2017), WBL will likely benefit more from conservation of populations in individual watersheds, particularly headwater areas. 
Compared to Pacific lamprey, WBL will be far more limited in their ability to recolonize headwater areas if extirpated and, given the moderate to strong genetic differentiation observed over even relatively short distances, translocation within the CRB could severely disrupt population structure and local adaptation. Conservation efforts, therefore, should focus on habitat protection and restoration.

The present study also demonstrated that conservation decisions should be informed by both historical and contemporary patterns of river connectivity. Although population structure often corresponded to contemporary river connectivity, there were some cases in the study where historical connectivity provided a better explanation for the observed patterns of genetic diversity. Due to their limited dispersal, non-migratory freshwater fish species such as WBL may be ideal for examining biogeography and patterns of colonization. Additional analysis of WBL populations in the broader Cascade Range area and the Willamette drainage may shed light on the biogeography of the CRB as a whole.