Introduction

The emerging field of landscape genetics explores the relationship between landscape features and gene flow (Manel et al. 2003; McRae 2006). Genetic data can be used to indirectly evaluate dispersal, which is essential for effective wildlife conservation and management due to the role of dispersal in recruitment and maintaining metapopulations in fragmented landscapes (Seidensticker et al. 1973; Sweanor et al. 2000). Genetic analysis is particularly useful for reclusive and solitary predators, such as the cougar, which can be challenging to locate and directly observe in the wild. The genetic structure of a population integrates the results of successful dispersal events—those that have resulted in survival and reproduction over the past few generations (Cushman et al. 2006). Genetic gradients among individuals can be quantified by calculating genetic distances between pairs of individuals, and can serve as a proxy for dispersal (Jombart et al. 2008). Where population structure is present, the degree of connectivity between populations may be assessed using genetic clustering analyses (Guillot et al. 2005).

The cougar’s reclusive nature and low density make it difficult to evaluate the influence of natural and anthropogenic landscape features on dispersal success. The primary mechanism for gene flow in cougars is the dispersal of subadults, especially males, away from natal areas following independence between one and two years of age (Logan and Sweanor 2010). Much of what is known about habitat use during dispersal comes from small, short-term studies of radio-tagged individuals (Beier 1995; Sweanor et al. 2000; Robinson et al. 2008; Hornocker and Negri 2010). Dispersing subadults in the Rocky Mountains used habitat types similar to those used locally by resident adults (Newby 2011). Factors shown to influence cougar movement include elevation, forest cover, paved roads, and human development (Beier 1995; Dickson and Beier 2002; Dickson et al. 2005; Kertson et al. 2011; Newby 2011). Cougars selected low-elevation areas or canyon bottoms in western Washington (Kertson et al. 2011), the Rocky Mountains (Newby 2011), and southern California (Dickson and Beier 2007). Although cougars may cross open areas, they spend the majority of their time in forests with a developed understory, which provides stalking cover and concealment of food caches (Logan and Irwin 1985; Beier 1995; Kertson et al. 2011; Newby 2011). Cougars may use unimproved roads while traveling; however, roads with high traffic volume may pose a mortality risk (Taylor et al. 2002; Dickson et al. 2005) and reduce gene flow in cougars and other large mammals (McRae et al. 2005; Riley et al. 2006; Balkenhol and Waits 2009; Shirk et al. 2010; Parks et al. 2015). Cougar space use has also been negatively correlated with residential density in western Washington (Kertson et al. 2011).

Gene flow and genetic structure in cougar populations range from high gene flow and panmixia to geographically structured populations. Sinclair et al. (2001, n = 50) and Anderson et al. (2004, n = 257) explored differences in genetic structure within Utah and the Wyoming Basin, respectively, but reported that those populations exhibited high gene flow and low genetic structure, and each suggested a single megapopulation. Similarly, Castilho et al. (2011, n = 37) and Miotto et al. (2011, n = 111) found no evidence of genetic structure in Brazilian cougars. However, several studies have reported genetic structure. Walker et al. (2000, n = 25) and Holbrook et al. (2012, n = 245) found genetically distinct cougar populations in Texas, which Holbrook et al. (2012) ascribed primarily to isolation by distance. Ernest et al. (2003, n = 431) analyzed samples throughout occupied habitats in California, found variable levels of genetic structure, and identified areas where gene flow may be at risk. McRae et al. (2005, n = 540) evaluated samples from several regions in the Rocky Mountain range and reported genetic structuring at two levels: a north–south differentiation, and genetic isolation by distance within regions. Andreasen et al. (2012, n = 739) sampled cougars across Nevada and eastern California, finding five genetically distinct subpopulations that were separated by desert basins. Balkenhol et al. (2014, n = 371) detected spatial genetic differentiation in Idaho cougars, which they attributed to urban development, forest cover and geographic distance. Naidu (2015, n = 401) found evidence for several subpopulations in the southwestern U.S. and northern Mexico; the boundaries between these subpopulations largely corresponded with interstate highways. Although cougars are widely distributed throughout the Pacific Northwest, their genetic structure is not well documented in Washington and British Columbia.
Juvenile cougars have been documented dispersing between 190 and 250 km from their natal site in Washington (R. Beausoleil, unpublished data), and this capacity for long-distance dispersal suggests that inbreeding due to limitations on gene flow should be limited at the regional scale. However, one of only three previously identified genetic bottlenecks in North American cougars comes from the Olympic Peninsula (Culver et al. 2000). The low level of heterozygosity observed in Olympic cougars led Beier (2010) to suggest that reintroductions may be needed to ward off inbreeding depression. Evidence of genetic isolation and inbreeding depression has been reported in Florida panthers (Puma concolor coryi), a subspecies of the cougar, prompting the U.S. Fish and Wildlife Service to translocate cougars from the closest population, in Texas, in an effort to reverse decades of inbreeding (Pimm et al. 2006). Connectivity between individuals on the Olympic Peninsula and the nearby Cascade Mountains—a distance of about 100 km—remains unresolved.

In this study, we described the genetic structure of cougars across Washington and south-central British Columbia in order to evaluate the natural and anthropogenic features that influence population connectivity in this region. We used genetic cluster analysis to test for population structure and spatial principal component analysis (sPCA) to identify patterns in individuals’ allele frequencies within the study area. We hypothesized that the observed structure was due to one or a combination of several landscape resistance factors that have been shown to influence cougar movement: elevation, forest cover, human population density and highways. We determined the relative influence of each factor on gene flow by relating genetic distance to landscape resistance using multiple regression on distance matrices and boosted regression tree analysis.

Materials and methods

Study area

The study area comprised all of Washington as well as south-central British Columbia (Fig. 1). It included the mostly forested Cascade, Olympic and Blue Mountain Ranges, in addition to the shrub-steppe expanses of the Okanogan Valley and Columbia Plateau (Fig. 1). Elevation ranged from 0 to 4392 m above sea level. Human population density varied considerably across the study area, ranging from zero in roadless wilderness to densely populated urban centers such as Seattle.

Fig. 1

Locations of samples collected from cougars for genetic analysis in Washington and British Columbia, 2003–2010, Washington Department of Fish and Wildlife

Sample collection and genotyping

Washington Department of Fish and Wildlife (WDFW) (2011) collected 612 blood and tissue samples from cougars across the state of Washington between 2003 and 2010. Additionally, during the same timeframe the British Columbia Ministry of Forests, Lands and Natural Resource Operations obtained 55 samples for a total of 667 samples. In Washington, samples were taken from all known mortalities (those harvested by hunters and roadkills) and from live animals sampled during research. Locations of each animal were based on reported kill or capture site and were accurate to within 10 km (Fig. 1). All genotyping was performed by the WDFW Molecular Genetics Laboratory in Olympia, Washington. DNA was extracted from blood and tissue samples using DNeasy 96 Blood and Tissue Kits (Qiagen, Los Angeles, CA), or NucleoSpin Tissue Kit (Macherey–Nagel, Bethlehem, PA), following the manufacturers’ protocols. Polymerase chain reaction (PCR) was used to amplify 18 previously characterized microsatellite markers (Menotti-Raymond and O’Brien 1995; Culver 1999; Menotti-Raymond et al. 1999) in six multiplexes (Table 1). The thermal profile for all multiplexes, except Fco F, was (1) initial denature at 94 °C for 2 min; (2) three cycles of 94 °C for 30 s (denature), 60 °C for 30 s (annealing), 72 °C for 60 s (extending); (3) 36 cycles of 94 °C for 30 s, 50 °C for 30 s, 72 °C for 60 s, and (4) 72 °C for 10 min. Multiplex Fco F differed at step 2 with a 62 °C annealing temperature and step 3 with a 52 °C annealing temperature. The PCR products were visualized with an ABI3730 capillary sequencer (Applied Biosystems) and sized using the Gene-Scan 500-Liz standard (Applied Biosystems, Foster City, CA).

Table 1 Allelic diversity for 18 microsatellite loci from 667 cougars sampled in Washington and British Columbia, 2003–2010, showing PCR multiplex, the number of alleles, expected heterozygosity (HE) and observed heterozygosity (HO)

We checked for amplification and allele scoring errors using Microchecker version 2.2.3 (van Oosterhout et al. 2004). We tested for deviations from Hardy–Weinberg and linkage equilibria using Genepop version 4.1 (Rousset 2008); alpha was adjusted using a simple Bonferroni correction to accommodate multiple tests (Rice 1989).
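
As an illustration only (not the code used in this study), the simple Bonferroni adjustment amounts to dividing the family-wise alpha by the number of tests. The sketch below assumes a family-wise alpha of 0.05, which the text does not state, and assumes one Hardy–Weinberg test per locus and one linkage test per pair of loci; the function name `bonferroni_alpha` is hypothetical.

```python
from math import comb


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a simple Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


N_LOCI = 18   # microsatellite loci genotyped
ALPHA = 0.05  # ASSUMPTION: family-wise error rate, not given in the text

# One Hardy-Weinberg test per locus.
hwe_alpha = bonferroni_alpha(ALPHA, N_LOCI)

# One linkage-disequilibrium test per unordered pair of loci.
ld_alpha = bonferroni_alpha(ALPHA, comb(N_LOCI, 2))
```

With 18 loci there are 153 pairwise comparisons; with 17 loci the count drops to 136, which matches the number of linkage comparisons reported in the Results.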

Cluster analysis

We explored the pattern of population structure within the study area by clustering samples based on their allele frequencies using Geneland (version 3.3.0; Guillot et al. 2005). The program estimates the number of clusters, or subpopulations, within a sample of individuals and assigns individuals to clusters by minimizing Hardy–Weinberg and linkage disequilibria within groups. Geneland also uses the geographic coordinates of each individual as part of the clustering process (Guillot et al. 2005). We used the spatial model with null alleles and uncorrelated allele frequencies. We specified an uncertainty of 10 km for the coordinates of each individual, performed 10^6 iterations, of which every 100th was retained, and assumed a maximum of 10 clusters.

Spatial principal components analysis (sPCA)

Genetic clustering algorithms, such as Geneland, are designed to identify discrete groups of individuals; therefore, we also used sPCA to detect clinal population structure. Principal component analysis of allele frequencies can capture the variation contained in many allele frequencies and distill this down to a few synthetic variables. Spatial PCA is a modified version of PCA on allele frequencies where the principal component score for each individual is multiplied by Moran’s I, a measure of spatial autocorrelation for that individual (Jombart et al. 2008). Individuals with allele frequencies similar to their neighbors will have a positive I value, while individuals with allele frequencies quite different from their neighbors will have a negative I value. Spatial PCA breaks spatial autocorrelation into global structure, where neighbors are positively autocorrelated, and local structure, where neighbors are negatively autocorrelated. Global structure arises when individuals are more genetically similar to their immediate neighbors than expected if the spatial distribution were random, as in a genetic cline or spatially distinct genetic groups. Conversely, local structure indicates that individuals are genetically different from their immediate neighbors, as happens when genetically similar individuals avoid mating with each other, and instead select mates with whom they share fewer alleles (Jombart et al. 2008). Spatial autocorrelation was calculated between neighboring points as defined by a Gabriel graph connection network (Gabriel and Sokal 1969). We tested for significant global and local structure using a Monte Carlo randomization test with 999 permutations, as described in Jombart et al. (2008).
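
The two building blocks named above, a Gabriel graph and Moran's I, can be sketched in a few lines of Python. This is a simplified, univariate illustration with binary neighbor weights, not the multivariate sPCA procedure of Jombart et al. (2008); the function names and the toy transect of six sample points are assumptions made for the example.

```python
def gabriel_graph(points):
    """Neighbor sets under the Gabriel criterion: i and j are linked when no
    third point lies strictly inside the circle whose diameter is segment ij."""
    def d2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    n = len(points)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            dij = d2(points[i], points[j])
            blocked = any(
                d2(points[i], points[k]) + d2(points[j], points[k]) < dij
                for k in range(n) if k not in (i, j)
            )
            if not blocked:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def morans_i(values, neighbors):
    """Moran's I for one variable, using binary symmetric neighbor weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(len(nb) for nb in neighbors.values())
    cross = sum(dev[i] * dev[j] for i, nb in neighbors.items() for j in nb)
    return (n / weight_sum) * (cross / sum(d * d for d in dev))


# Hypothetical transect of six individuals spaced one unit apart.
points = [(x, 0) for x in range(6)]
graph = gabriel_graph(points)

cline = [0, 1, 2, 3, 4, 5]            # scores change smoothly: global structure
alternating = [1, -1, 1, -1, 1, -1]   # neighbors differ: local structure

i_cline = morans_i(cline, graph)
i_alternating = morans_i(alternating, graph)
```

On this toy transect the cline gives a positive I and the alternating pattern a negative I, mirroring the distinction between global and local structure.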

Descriptive statistics

We calculated total number of alleles, expected heterozygosity (Nei 1987), and observed heterozygosity for each population cluster identified by the Geneland analysis using Microsatellite Toolkit version 3.1.1 (Park 2001). We calculated the average number of alleles and private alleles per locus for each population cluster using rarefaction to account for unequal sample sizes with the program HP-Rare v. 1.1 (Kalinowski 2004, 2005). To compare genetic differentiation between clusters we calculated pairwise estimates of FST (Weir and Cockerham 1984) using Genepop version 4.1. We also used GENALEX v. 6.4 to estimate inbreeding coefficients (FIS) for each cluster (Peakall and Smouse 2006).
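
For readers unfamiliar with these summary statistics, the following minimal sketch computes observed heterozygosity, Nei's unbiased expected heterozygosity, and an inbreeding coefficient for a single locus. It is illustrative only: the genotypes are invented, and the exact estimators implemented in Microsatellite Toolkit and GENALEX may differ in detail (for example in small-sample corrections).

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (H_O, H_E) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    H_E uses the unbiased form 2n/(2n-1) * (1 - sum(p_i^2)).
    """
    n = len(genotypes)
    h_obs = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    sum_p2 = sum((c / (2 * n)) ** 2 for c in counts.values())
    h_exp = (2 * n / (2 * n - 1)) * (1 - sum_p2)
    return h_obs, h_exp


def f_is(h_obs, h_exp):
    """Inbreeding coefficient: the proportional deficit of heterozygotes."""
    return 1 - h_obs / h_exp


# Hypothetical genotypes at one locus for four individuals.
sample = [(1, 1), (1, 2), (2, 2), (1, 2)]
h_o, h_e = heterozygosity(sample)
fis = f_is(h_o, h_e)
```

A positive value indicates fewer heterozygotes than expected under random mating, and a negative value indicates an excess.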

Landscape resistance analysis

We generated landscape resistance surfaces using GIS data layers for elevation, forest canopy cover, human population density and highways. Elevation data for the U.S. was taken from the National Elevation Dataset at a resolution of 30 m (USGS 2012), and for Canada from Terrain Resource Information Management Digital Elevation Model at a resolution of 25 m (Crown Registry and Geographic Base 2012). Percent canopy cover data was derived from Landsat imagery at a resolution of 100 m (WHCWG 2010). Human population density was based on census data from 2000 in the U.S. and 2001 in Canada, and ranged from <10 to >80 acres per dwelling unit at a resolution of 100 m (WHCWG 2010). Highways were classified as freeways, major highways and secondary highways (WHCWG 2010), where resistance was equal to the annual average daily traffic volume (28,000, 10,000, and 4000 vehicles per day, respectively) at a resolution of 100 m (WSDOT 2012). The untransformed raw values of each layer were rescaled to range between values of 1 and 2 in order to standardize resistance estimates and allow for evaluation of the relative importance of each factor. The resolution of each layer was reduced to 300 m by 300 m by aggregating cells based on the average cell value to maintain practical computation times. All sample points were at least 70 km from the map boundary, except where boundaries coincided with actual barriers to dispersal, such as Puget Sound; this buffer was used to minimize the risk of overestimating resistance near map edges (Koen et al. 2010).
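
The rescaling and aggregation steps can be illustrated with a small sketch. It assumes a linear min-max rescaling to the interval 1 to 2 and block averaging with a factor of 3 (100 m cells to 300 m cells); the text does not specify the exact transformation or GIS tool, and the tiny traffic grid below is invented for the example.

```python
def rescale_1_to_2(grid):
    """Linearly rescale raw cell values so the minimum maps to 1 and the maximum to 2."""
    values = [v for row in grid for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[1.0 for _ in row] for row in grid]
    return [[1.0 + (v - lo) / (hi - lo) for v in row] for row in grid]


def aggregate_mean(grid, factor):
    """Coarsen a grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[rr][cc]
                     for rr in range(r, r + factor)
                     for cc in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Hypothetical 3 x 3 patch of 100 m cells: vehicles per day, 0 where no highway.
traffic = [
    [0, 0, 28000],
    [0, 4000, 28000],
    [0, 10000, 28000],
]
scaled = rescale_1_to_2(traffic)     # every cell now lies between 1 and 2
coarse = aggregate_mean(scaled, 3)   # a single 300 m cell
```

Because every layer ends up on the same 1 to 2 scale, differences in fitted effects reflect the spatial pattern of each layer rather than its original units.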

We calculated pairwise resistance estimates for each landscape variable between every pair of individuals using Circuitscape version 3.5.8 (McRae et al. 2008) as it more realistically accounts for the presence of multiple dispersal pathways and the effect of the width of dispersal pathways than least cost path analysis (McRae 2006). Samples from the Blue Mountains of southeastern Washington were excluded from the landscape resistance analysis due to their geographic isolation and the artificial barriers imposed by the boundaries of the study area. Specifically, when calculating landscape conductance due to forest canopy cover with Circuitscape, current would be forced across the unforested Columbia basin, when it seems more likely that cougars would follow forested corridors outside of the study area in Idaho to reach the Blue Mountains. After these samples were removed a total of 633 (95 %) individual samples remained. Elevation, human population density and highway traffic volume were run as resistance surfaces, while forest canopy cover was run as a conductance surface, where conductance is simply the reciprocal of resistance (McRae and Shah 2011). We used an eight neighbor, average resistance/conductance cell connection scheme for each grid.
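
The idea behind a circuit-theoretic resistance distance can be sketched on a toy grid. The code below is not Circuitscape; it is a small illustration that treats each cell as a node, connects the eight neighbors with resistors equal to the average of the two cell values (diagonals scaled by the square root of 2, an assumption about how diagonal links are weighted), and solves for the effective resistance between two cells. A conductance surface would first be converted with resistance = 1 / conductance.

```python
import math


def effective_resistance(grid, src, dst):
    """Effective resistance between two cells of a resistance grid."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    lap = [[0.0] * n for _ in range(n)]          # graph Laplacian of conductances
    for r in range(rows):
        for c in range(cols):
            # Visit each undirected edge once: right, down, and both diagonals.
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    length = math.sqrt(2) if dr and dc else 1.0
                    res = length * (grid[r][c] + grid[rr][cc]) / 2
                    i, j = r * cols + c, rr * cols + cc
                    lap[i][i] += 1 / res
                    lap[j][j] += 1 / res
                    lap[i][j] -= 1 / res
                    lap[j][i] -= 1 / res
    s, d = src[0] * cols + src[1], dst[0] * cols + dst[1]
    keep = [k for k in range(n) if k != d]       # ground the destination node
    # Augmented system: inject one unit of current at the source.
    a = [[lap[i][j] for j in keep] + [1.0 if i == s else 0.0] for i in keep]
    m = len(a)
    for col in range(m):                         # Gaussian elimination
        piv = max(range(col, m), key=lambda row: abs(a[row][col]))
        a[col], a[piv] = a[piv], a[col]
        for row in range(col + 1, m):
            f = a[row][col] / a[col][col]
            for k in range(col, m + 1):
                a[row][k] -= f * a[col][k]
    v = [0.0] * m
    for row in range(m - 1, -1, -1):             # back substitution
        tail = sum(a[row][k] * v[k] for k in range(row + 1, m))
        v[row] = (a[row][m] - tail) / a[row][row]
    return v[keep.index(s)]                      # voltage at source = resistance


one_lane = effective_resistance([[1, 1, 1]], (0, 0), (0, 2))
two_lanes = effective_resistance([[1, 1, 1], [1, 1, 1]], (0, 0), (0, 2))
```

Widening the corridor from one row to two lowers the effective resistance between the same endpoints, which is the property that distinguishes this approach from a single least-cost path.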

We used multiple regression on distance matrices (Legendre et al. 1994) to evaluate relationships between genetic distance and resistance estimates for each landscape variable. Multiple regression on distance matrices has proven more accurate than Mantel tests in evaluating alternative hypotheses of landscape resistance in simulation studies (Balkenhol et al. 2009) and, unlike the Mantel test, the scaling of the relationship between landscape features and genetic distance does not need to be defined a priori. A linear relationship between resistance variables and genetic distance is still assumed under multiple regression on distance matrices, and resistance variables must be screened for multicollinearity. We used PCA of allele frequencies to calculate genetic distances between individuals; we created a distance matrix in R derived from the first principal component scores for each individual (Patterson et al. 2006; Shirk et al. 2010). While multiple regression on distance matrices produces coefficients and R² values identical to those produced with ordinary multiple regression, significance must be determined using permutation tests because the individual elements of a distance matrix are not independent from one another (Legendre et al. 1994). In order to evaluate the contribution of geographic distance alone, we also included a pairwise distance matrix based on the Euclidean distance between the coordinates for each genotyped individual, generated using the Ecodist package in R (Goslee and Urban 2007). Each resistance distance matrix was included as a term in a linear model, where genetic distance was the response variable:

$${\text{G}}\sim {\text{R}}_{\text{E}} + {\text{C}}_{\text{F}} + {\text{R}}_{\text{P}} + {\text{R}}_{\text{H}} + {\text{R}}_{\text{G}}$$

where G is the genetic distance, RE the resistance due to elevation, CF the conductance due to forest canopy cover, RP the resistance due to human population density, RH the resistance due to highways, and RG the resistance due to geographic (Euclidean) distance. Resistance estimates were z-transformed to standardize partial regression coefficients. P values were derived from 1000 random permutations of the response (genetic distance) matrix. All regression modeling was performed using the Ecodist package in R (Goslee and Urban 2007).
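
The permutation logic of multiple regression on distance matrices can be sketched as follows. This is a from-scratch Python illustration, not the Ecodist implementation: the function names, the eight invented sample points and the two toy predictor matrices are all assumptions, and significance is assessed here on the model R² only. The key step is that rows and columns of the response matrix are shuffled together, so whole individuals are permuted rather than single matrix cells.

```python
import random


def lower(mat):
    """Unfold the lower triangle of a square distance matrix into a vector."""
    return [mat[i][j] for i in range(len(mat)) for j in range(i)]


def zscore(v):
    mean = sum(v) / len(v)
    sd = (sum((x - mean) ** 2 for x in v) / (len(v) - 1)) ** 0.5
    return [(x - mean) / sd for x in v]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x


def ols(y, xs):
    """Least squares of y on predictor vectors xs plus intercept: (slopes, R^2)."""
    cols = [[1.0] * len(y)] + xs
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * col[i] for b, col in zip(beta, cols)) for i in range(len(y))]
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return beta[1:], 1 - ss_res / ss_tot


def mrdm(response, predictors, n_perm=199, seed=1):
    """Return (standardized slopes, R^2, permutation P value for R^2)."""
    xs = [zscore(lower(p)) for p in predictors]
    slopes, r2 = ols(lower(response), xs)
    rng = random.Random(seed)
    n = len(response)
    hits = 1                                    # the observed arrangement counts once
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)                      # permute individuals, not cells
        shuffled = [[response[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if ols(lower(shuffled), xs)[1] >= r2:
            hits += 1
    return slopes, r2, hits / (n_perm + 1)


# Hypothetical data: 8 individuals with coordinates and a landscape score.
xy = [(0, 0), (1, 3), (4, 1), (6, 5), (2, 7), (8, 2), (5, 9), (9, 8)]
score = [3, 1, 4, 1, 5, 9, 2, 6]
geo = [[((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5 for b in xy] for a in xy]
res = [[abs(p - q) for q in score] for p in score]
gen = [[0.5 * geo[i][j] + 2.0 * res[i][j] for j in range(8)] for i in range(8)]

slopes, r2, p_value = mrdm(gen, [geo, res])
```

Because the toy genetic distances were constructed as an exact linear combination of the two predictors, R² is essentially 1 here; real data would show far more scatter.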

Geographic distance is a component of all resistance estimates; therefore, some correlation was expected between resistance estimates for each landscape variable. As with other forms of linear regression, multiple regression on distance matrices assumes that the independent variables are uncorrelated. We calculated pairwise correlations between all resistance distance matrices using Mantel tests with the Ecodist package in R (Goslee and Urban 2007); we used the Pearson correlation method and significance was based on 1000 permutations. As a complement to correlation analysis, we calculated the variance inflation factor for each resistance estimate using the Companion to Applied Regression (car) package in R (Fox and Weisberg 2011); a variance inflation factor greater than 10 generally indicates that the terms are too highly correlated to be included in the same model (Marquardt 1970).
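
To make the link between a Mantel correlation and a variance inflation factor concrete, the sketch below computes a Mantel r as the Pearson correlation between the lower triangles of two matrices, and a variance inflation factor as 1 / (1 − R²). It is a simplified illustration: the permutation test for significance is omitted, the function names are hypothetical, and with more than two predictors R² comes from regressing each predictor on all of the others, not from a single pairwise correlation.

```python
def lower(mat):
    """Unfold the lower triangle of a square distance matrix into a vector."""
    return [mat[i][j] for i in range(len(mat)) for j in range(i)]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel_r(a, b):
    """Mantel statistic: Pearson correlation of corresponding matrix cells."""
    return pearson(lower(a), lower(b))


def vif_from_r2(r_squared):
    """Variance inflation factor for a predictor whose R^2 on the others is given."""
    return 1.0 / (1.0 - r_squared)


# Two-predictor special case: R^2 equals the squared pairwise correlation.
vif_pairwise = vif_from_r2(0.74 ** 2)   # about 2.2
vif_threshold_r2 = 1 - 1 / 10           # a VIF of 10 corresponds to R^2 = 0.9
```

Under this two-predictor simplification, a correlation of 0.74 implies a variance inflation factor of roughly 2.2, well below the threshold of 10.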

An alternative to linear regression, boosted regression tree analysis is a recently developed machine learning technique that can evaluate the relative influence of independent variables on a response variable, and is appropriate for nonlinear data (Elith et al. 2008; Balkenhol 2009). The response data is repeatedly split into two groups based on a single variable, while keeping the groups as homogeneous as possible. Boosted regression trees minimize deviance by adding, at each step, a new tree that best reduces prediction error. The relative influence of each predictor variable is measured by the number of splits it accounts for weighted by the squared improvement to the model, averaged over all trees (Elith et al. 2008). The regression tree model with the lowest deviance based on cross-validation consisted of 1100 regression trees and a learning rate of 0.05. Given that geographic distance is a component of every resistance estimate we chose not to model interactions between predictor variables. Our analysis constrained conductance/resistance due to forest canopy cover, human population density and highways to result in a monotonic increase in genetic distance. Resistance due to elevation was not similarly constrained, allowing for a nonmonotonic or Gaussian response. We used the packages gbm (Ridgeway 2013) and gbm.step (Elith et al. 2008) for boosted regression tree analysis.
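
The mechanics described above (stagewise fitting of small trees, shrinkage by a learning rate, and relative influence accumulated from split improvements) can be sketched from scratch. This is not the gbm package or gbm.step: it uses single-split trees (so no interactions are modeled), squared-error loss, no subsampling, no cross-validation and no monotonicity constraints, and the toy data are invented.

```python
def fit_stump(xs, resid):
    """Best single split on the residuals: (gain, feature, threshold, left, right)."""
    n = len(resid)
    total = sum(resid)
    best = None
    for f in range(len(xs[0])):
        order = sorted(range(n), key=lambda i: xs[i][f])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += resid[order[k - 1]]
            lo, hi = xs[order[k - 1]][f], xs[order[k]][f]
            if lo == hi:
                continue                      # cannot split between equal values
            right_sum = total - left_sum
            # Reduction in squared error achieved by this split.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, f, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best


def boost(xs, y, n_trees=200, learning_rate=0.05):
    """Return (fitted values, relative influence in percent per predictor)."""
    pred = [sum(y) / len(y)] * len(y)
    influence = [0.0] * len(xs[0])
    for _ in range(n_trees):
        resid = [obs - p for obs, p in zip(y, pred)]
        stump = fit_stump(xs, resid)
        if stump is None:
            break
        gain, f, threshold, left, right = stump
        influence[f] += gain
        pred = [p + learning_rate * (left if x[f] <= threshold else right)
                for p, x in zip(pred, xs)]
    total = sum(influence)
    relative = [100 * v / total if total else 0.0 for v in influence]
    return pred, relative


# Hypothetical data: the first predictor drives the response, the second barely.
xs = [(i % 7, (3 * i) % 5) for i in range(35)]
y = [3.0 * a + 0.3 * b for a, b in xs]
pred, relative_influence = boost(xs, y)
```

In this toy example nearly all of the relative influence falls on the first predictor, which is the same kind of summary reported for the landscape variables in the Results.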

Results

Population genetics

We detected significant homozygote excess at 16 of 18 loci when all individuals were pooled into a single population. This could have resulted from the presence of null alleles or from the Wahlund effect, which arises when genetically structured subpopulations are treated as a single population. Estimated frequencies of null alleles were ≤5.1 % for all but one locus (FCA293, 13.5 %). Geneland clustering revealed multiple populations in the study area (described below). After separating individuals into the four clusters indicated by Geneland, the estimated frequency of null alleles at locus FCA293 was still greater than 10 % in two of four clusters; therefore, this locus was dropped and all subsequent analyses were based on the remaining 17 loci (Table 1).
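
The Wahlund effect can be shown with a short worked example. The numbers are invented: two equally sized subpopulations, each in Hardy–Weinberg equilibrium at a biallelic locus, with allele frequencies of 0.2 and 0.8.

```python
def pooled_heterozygosity(freqs, sizes):
    """Observed and expected heterozygosity after pooling subpopulations.

    freqs: frequency of one allele at a biallelic locus in each subpopulation.
    sizes: number of individuals sampled from each subpopulation.
    Each subpopulation is ASSUMED to be in Hardy-Weinberg equilibrium.
    """
    total = sum(sizes)
    h_obs = sum(n * 2 * p * (1 - p) for p, n in zip(freqs, sizes)) / total
    p_bar = sum(n * p for p, n in zip(freqs, sizes)) / total
    h_exp = 2 * p_bar * (1 - p_bar)
    return h_obs, h_exp


h_obs, h_exp = pooled_heterozygosity([0.2, 0.8], [100, 100])
apparent_fis = 1 - h_obs / h_exp
```

Each subpopulation has a heterozygosity of 0.32, but the pooled allele frequency of 0.5 predicts 0.50, so the pooled sample shows an apparent homozygote excess even though mating is random within each group.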

Eight loci were out of Hardy–Weinberg equilibrium (HWE) after Bonferroni correction for multiple tests. Concurrent with HWE testing, we detected significant departures from linkage equilibrium in 83 of 136 (61 %) pairwise comparisons between loci after Bonferroni correction. Seven of 17 (41 %) loci occur on separate chromosomes or linkage groups and should be considered independent (Menotti-Raymond et al. 1999), while one locus, FCA166, has yet to be mapped. After separating individuals into clusters identified by Geneland, no consistent patterns of linkage or Hardy–Weinberg disequilibria between clusters remained. All 17 retained loci were polymorphic, with between 2 and 9 alleles per locus and 91 total alleles globally.

Cluster analysis

Support was highest for four populations in the study area from Geneland simulations. These populations corresponded with the Blue Mountains in southeastern Washington, northeastern Washington, western Washington following the Cascade Mountains, and the Olympic Peninsula (Fig. 2).

Fig. 2

Posterior probability of membership in Geneland clusters (1–4 shown) for cougars in Washington and British Columbia, 2003–2010

The total and mean number of alleles was highest in the northeast and Cascades clusters, even after correcting for unequal sample sizes (Table 2). Although the mean alleles per locus was lower for the Blue Mountains than the Cascades cluster, the number of private alleles per locus was nearly equivalent (Table 2). Both expected and observed heterozygosity were lower in the Olympic cluster than in all other clusters, indicating lower genetic diversity, and possibly greater isolation of this cluster (Table 2). Population differentiation (FST) increased with distance between clusters; differentiation was lowest between the northeast and Cascades clusters, and highest between the Olympic and Blue Mountain clusters (Table 3). The geographically adjacent Olympic and Cascades clusters showed a considerable degree of differentiation (FST = 0.145), nearly equal to that observed between the Cascades and Blue Mountains clusters (FST = 0.151), which are separated by more than 200 km of unforested shrub-steppe.

Table 2 Genetic diversity for cougar population clusters identified using Geneland
Table 3 Genetic differentiation (FST) between cougar population clusters

sPCA

The first two global sPCA axes explained most of the spatial genetic variation; as can be seen in Fig. 3, they had the highest eigenvalues and were well differentiated from the other axes. Therefore, only these two axes were retained. Additionally, the Monte Carlo randomization test for global structure was highly significant [max(t) = 0.016, P = 0.001]. Local sPCA axes (axes with negative eigenvalues in Fig. 3) explained little spatial genetic variation and were poorly differentiated from each other; no evidence of local structure was found [max(t) = 0.0028, P = 0.74].

Fig. 3

Spatial principal component analysis eigenvalues for cougars in Washington and British Columbia, 2003–2010; the first two global axes (darker shading) were retained while no local axes (negative values) were retained

The first global sPCA axis displayed strong east–west genetic differentiation across the study area; the strongest separation between neighboring samples was found along the Okanogan Valley and edge of the Columbia River Basin (Fig. 4). The second global sPCA axis clearly separated out individuals on the Olympic Peninsula and in the Blue Mountains from the rest of the state, as well as showing a weak east–west gradient in genetic similarity in northeastern Washington, coinciding approximately with the Columbia River (Fig. 5).

Fig. 4

Map of spatial principal component analysis scores from axis 1 for each individual. Genetic similarity is represented by color and size of squares; squares of different color are strongly differentiated, while squares of similar color but different size are weakly differentiated. Data were collected from cougars sampled in Washington and British Columbia during 2003–2010

Fig. 5

Map of spatial principal component analysis scores from axis 2 for each individual. Genetic similarity is represented by color and size of squares; squares of different color are strongly differentiated, while squares of similar color but different size are weakly differentiated. Data were collected from cougars sampled in Washington and British Columbia during 2003–2010

Landscape resistance analysis

While we found significant correlations between landscape resistance variables, all Mantel r values were <0.75 (Table 4). The most highly correlated resistance/conductance surfaces were human population density and forest canopy cover (Mantel r = 0.74, P = 0.001). In contrast to the Mantel test results, all variance inflation factor coefficients were <4, suggesting that multicollinearity was not an issue.

Table 4 Correlations among landscape resistance features in Washington and British Columbia

Only conductance due to forest canopy cover and resistance due to geographic distance were significant predictors of genetic distance, and the final model explained 14.9 % of the variation in genetic distance (Table 5). The null hypothesis that there was no relationship between any explanatory variable and genetic distance was rejected (F = 7003.1, P = 0.001; Table 5).

Table 5 Multiple regression on distance matrices test statistics with genetic distance among cougar samples as the response variable and elevation, forest canopy cover, human population density, highways and geographic distance as potential predictor variables

The boosted regression tree model explained 19.2 % of the deviance in genetic distance. Of the explained deviance, conductance due to forest canopy cover had the highest relative influence on the model (53.6 %), followed by resistance due to geographic distance (31.8 %), human population density (8.9 %), elevation (3.0 %), and highways (2.8 %; Fig. 6).

Fig. 6

Partial dependence plots from boosted regression tree modeling, in order of decreasing relative influence, for cougars in Washington and British Columbia, 2003–2010. The Y-axes show the marginal effect of resistance on genetic distance. Negative marginal effects correspond to a decrease in genetic distance between individuals, and vice versa. The X-axis for the geographic distance plot is shown in units of km, while all other X-axes are shown in terms of Circuitscape resistance (unitless). In order to model landscape resistance, the reciprocal of forest cover was used, where resistance was greatest for unforested areas and least for densely forested areas. The relative influence of each variable on the explained deviance is shown in parentheses; total deviance explained by the model = 19.2 %

Discussion

Our results suggest that cougar populations in Washington and south-central British Columbia are structured as a metapopulation, not a single, panmictic population. The results of Geneland clustering largely agreed with those of spatial PCA, showing four clusters in the study area. State-wide analyses in Nevada (Andreasen et al. 2012), Oregon (Musial 2009) and California (Ernest et al. 2003) revealed spatially structured cougar populations; however, similar analyses in Wyoming (Anderson et al. 2004) and Utah (Sinclair et al. 2001) did not. Anderson et al. (2004) found less genetic differentiation between cougars in Wyoming than we observed in Washington, yet found a stronger relationship between genetic and geographic distance (r = 0.61, P = 0.011). This suggests that although there was an isolation by distance effect, the sparsely developed Wyoming landscape may be more permeable to movement than that of Washington. In Utah, Sinclair et al. (2001) found little evidence of population structure; however, this may have been due to sampling design and low sample size. In that study, genetic structure was evaluated using F-statistics where populations were defined a priori by management units, which may not have held any biological relevance, and each unit consisted of only 5 individual samples.

While cougars on the Olympic Peninsula were genetically differentiated from others in the study area, they do not appear to be as isolated as previously thought. The Olympic cluster had the lowest mean observed heterozygosity, 0.33, of the four clusters (Table 2); this value was similar to that found by Culver et al. (2000), 0.31, for Olympic cougars. The percentage of polymorphic loci for this cluster, however, was much higher in the present study, 94 %, than was previously found (50 %; Culver et al. 2000). This difference may be attributable to our larger sample size (26 vs 4 samples). The Olympic cluster also had the highest inbreeding coefficient (FIS) of any cluster, at 0.078 (Table 2), yet this value was relatively low compared with those reported for small or isolated populations in California (0.03–0.20; Ernest et al. 2003) and the Intermountain West (0.036–0.227; Loxterman 2010). This evidence suggests that although the Olympic cougar population is small and somewhat isolated, translocations do not appear to be necessary at this time.

The boundaries between population clusters were not sharply defined, as evinced by mixed membership in the clustering results and by variation in spatial PCA scores of adjacent individuals. This implies that limited gene flow has occurred between clusters. Musial (2009) detected a genetic cline in Oregon cougars where the eastern foothills of the Cascades meet the high desert, separating the state into eastern and western clusters. This closely resembles the pattern of differentiation we observed in the first sPCA axis, and between the Cascades and northeastern clusters in Geneland clustering, aligning approximately with the Okanogan Valley. Musial (2009) attributed this isolation to unsuitable habitat, characterized by low slope and the lack of vegetative cover, between the eastern and western clusters. Habitat in the Okanogan Valley is similar to that of the clinal region in Oregon; however, the unforested corridor in Washington is far narrower, ranging from 17 to 36 km in width. Similarly, Loxterman (2010) and Balkenhol et al. (2014) identified the largely agricultural and urban Snake River Plain as a barrier to gene flow for cougars in Idaho. The Okanogan Valley has also been shown to be a partial barrier to gene flow in mountain goats (Parks et al. 2015). Furthermore, the separation between the Cascades and the Olympic cluster aligns with the sparsely forested and heavily developed I-5 corridor between Seattle and Portland.

While we found a significant correlation between genetic and geographic distance, geographic distance alone cannot explain the genetic structure observed. If distance were the only factor influencing allele frequencies, then both north–south and east–west genetic clines should be apparent. North–south clines were notably absent, even in the Cascades cluster which covers over 480 km from the northern to southern tip, more than twice the average male dispersal distance in Washington.

The results of multiple regression on distance matrices and boosted regression tree analysis both suggest that forest canopy cover has the strongest influence on gene flow, followed by geographic distance. Balkenhol et al. (2014) also found a negative effect of geographic distance and positive effect of forest cover on genetic distance between cougars. Within individual home ranges, Elbroch and Wittmer (2012) found that cougars used forested habitats more than expected if space use were random. These factors explain the genetic differentiation observed between the Cascades and northeastern clusters, which were separated by the unforested Okanogan Valley. Though the Blue Mountains cluster was not included in landscape resistance modeling, forest cover and geographic distance could both logically have contributed to the differentiation of this cluster from the others, as it is separated from them by the wide shrub-steppe expanse of the Columbia River Basin.

We found congruence in variable selection between the final multiple regression on distance matrices model and the boosted regression trees model. While multiple regression on distance matrices assumes a linear relationship between variables, we reached the same conclusion using boosted regression trees, which make no such assumption. This suggests that any nonlinear relationships in the data did not have a strong effect on the results. However, this may not be the case for all datasets, highlighting the need for exploratory analysis and inference based on multiple methods.
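The linear model underlying multiple regression on distance matrices can be sketched as follows: the unfolded genetic distance matrix is regressed on standardized predictor matrices, and significance is judged by permuting individuals in the response matrix. This is a minimal standard-library illustration under stated assumptions, not the implementation used in this study. The function names are hypothetical, the test is applied to R² only, and the "non_forest" matrix is an invented stand-in for a resistance distance derived from canopy cover.

```python
import random

def upper_triangle(m):
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def standardize(v):
    mean = sum(v) / len(v)
    sd = (sum((x - mean) ** 2 for x in v) / len(v)) ** 0.5
    return [(x - mean) / sd for x in v]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit(y, xs):
    """Standardized least-squares coefficients and R^2."""
    y = standardize(y)
    xs = [standardize(x) for x in xs]
    xtx = [[sum(p * q for p, q in zip(xi, xj)) for xj in xs] for xi in xs]
    xty = [sum(p * q for p, q in zip(xi, y)) for xi in xs]
    beta = solve(xtx, xty)
    fitted = [sum(b * x[k] for b, x in zip(beta, xs)) for k in range(len(y))]
    ss_res = sum((o - f) ** 2 for o, f in zip(y, fitted))
    return beta, 1.0 - ss_res / sum(v ** 2 for v in y)

def mrdm(response, predictors, permutations=999, seed=0):
    """Permutation test on R^2, shuffling individuals in the response."""
    rng = random.Random(seed)
    n = len(response)
    xs = [upper_triangle(p) for p in predictors]
    beta, r2 = fit(upper_triangle(response), xs)
    hits, order = 1, list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        perm = [[response[order[i]][order[j]] for j in range(n)]
                for i in range(n)]
        if fit(upper_triangle(perm), xs)[1] >= r2:
            hits += 1
    return beta, r2, hits / (permutations + 1)

# Hypothetical data for seven individuals.
coords = [0, 12, 30, 45, 70, 95, 120]
openness = [0, 3, 1, 4, 2, 6, 5]
n = len(coords)
geographic = [[abs(a - b) for b in coords] for a in coords]
non_forest = [[abs(a - b) for b in openness] for a in openness]
# Toy response: linear in both predictors plus a small symmetric wiggle.
genetic = [[0.02 * geographic[i][j] + 0.5 * non_forest[i][j]
            + 0.1 * ((i * j) % 3) for j in range(n)] for i in range(n)]

beta, r2, p_value = mrdm(genetic, [geographic, non_forest])
```

Because the toy response is built to be nearly linear in both predictors, the fit here is far tighter than anything expected from real data. A tree-based method applied to the same unfolded distances would not require the linear form assumed in `fit`.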

Potential sources of error in this analysis included the imprecision associated with cougar sample coordinates, as well as non-uniform sample coverage across the study area. Coordinates for most genetic samples were based on hunter descriptions, and were estimated to be accurate within 10 km. Therefore, error could have been introduced into pairwise resistances at short distances if kill sites were incorrectly placed. This was a random source of error, however, and should not have resulted in a systematic bias for any variable. Furthermore, Graves et al. (2012) found only a small reduction in the strength of landscape genetic relationships under a scenario of simulated spatial uncertainty. Hunters are required to report the Game Management Unit (GMU) in which they harvested a cougar, and since GMU boundaries are delineated on the basis of roads, all sample locations were on a known side of a highway. Cougar samples were obtained opportunistically; this irregular sampling design provides a wide range of distances for pairwise comparisons, but can undersample or oversample some areas (Storfer et al. 2007). Indeed, sample coverage is very poor in wilderness areas and national parks (Fig. 1), due to lack of access or prohibitions against hunting. The one variable this could have affected meaningfully was highways, as most cougar samples were obtained in proximity to paved roads. Maletzke (2010) reported mean cougar home range sizes from 199 to 753 km² in Washington, depending on sex, age class and hunting pressure, which suggests that the majority of cougars have at least some exposure to highways in their daily movements. A bias toward proximity to highways in sample collection may not necessarily translate to a misrepresentative sample, then, if the home ranges of most cougars overlap one or more highways.

Multiple regression on distance matrices and boosted regression trees both highlighted the importance of forest canopy cover and geographic distance; however, the two models explained only 15 and 19 % of the variation in genetic distance, respectively. Models based on pairwise dissimilarities between points, as in distance matrices, generally have less explanatory power than those based on variables measured at the points themselves (Legendre and Fortin 2010). Model fit in this study was similar to that reported by other researchers working with vagile predators (Balkenhol 2009; Garroway et al. 2011), and was likely limited by the cougar’s ecological niche as a habitat generalist. Clearly, however, other factors are influencing gene flow in northwestern cougars; these could include prey distribution and density, sport hunting, and intraspecific territoriality and social interactions. The influence of sport hunting on cougar gene flow is difficult to quantify, because it can both restrict dispersal, through direct mortality of immigrants, and encourage dispersal, when resident males are killed and dispersing subadults from other areas move in to take their place (Robinson et al. 2008; Cooley et al. 2009).

Management implications

Cougars are typically managed independently by state agencies; however, our research showed that populations overlap political boundaries. Therefore, managers may wish to explore a larger, landscape-scale approach when constructing management zones or data analysis units. Toward that end, we encourage future genetic research in Washington to include other jurisdictions such as British Columbia, Idaho, and Oregon to establish a consistent sampling procedure and series of genetic markers. Since almost all agencies in North America have mandatory sealing requirements (Beausoleil et al. 2008), we recommend managers include tissue collection in their data gathering protocols. Even with a limited budget, sample collection is inexpensive, and samples may be archived for decades when stored properly (Beausoleil and Warheit 2015).

Forested corridors appear to be essential for facilitating cougar movements, maintaining landscape connectivity, and preserving gene flow in the study area. With four distinct genetic clusters identified, regional population stability and gene flow may depend on the ability of subadult cougars to disperse; losing this connectivity could potentially result in genetic isolation. Therefore, we recommend that agencies develop and implement strategies to identify and preserve these connections. As fragmentation of forested lands continues, forested corridors will become increasingly important in maintaining genetic connectivity and population stability for cougars in the northwest.