Introduction

The loss of genetic diversity and inbreeding depression can be important precursors to population extinction (Frankham 2005). For this reason, recovery plans for endangered and threatened species often aim to maintain, or even increase, genetic diversity, while also minimizing inbreeding. Achieving these objectives is key for ensuring resilience and adaptability of a species (Harris et al. 2006) and important for overall restoration success (Hughes and Stachowicz 2004; Crawford and Whitney 2010). To accomplish these aims, it helps to have prior knowledge of both local and range-wide genetic variation and degree of inbreeding to inform restoration and management decisions (Dobson et al. 1997; Fenster and Dudash 1994; Maschinski and Haskins 2012; Neale 2012). Although the distribution of genetic diversity and levels of inbreeding can be inferred from life-history traits (Hamrick and Godt 1996; Nybom 2004; Duminil et al. 2007), such as breeding system, pollination vector and dispersal mechanism (Thiel-Egenter et al. 2009; Kramer et al. 2011; Meirmans and Hedrick 2011), as well as habitat preferences (Alvarez et al. 2009; Meirmans and Hedrick 2011) and historical events (Hewitt 2000; Hu et al. 2009), this does not replace empirical studies. As the levels of neutral genetic diversity can depend on phylogenetic histories (Gitzendanner and Soltis 2000), a comparison to a related taxon can sometimes be useful in establishing expectations (Karron 1987; Fréville et al. 1998; Gitzendanner and Soltis 2000) and identifying any potential hybridization (Egger 1994; Hersch-Green and Cronn 2009).

One group of high conservation priority is endemic species, which often provide a unique and critical component of ecosystems (Burlakova et al. 2011; Myers et al. 2000). Edaphic endemics that occupy specialized niches are often restricted to small and fragmented habitats (Lesica et al. 2006; Jiménez-Mejías et al. 2015). Due to anthropogenic pressures, many of these habitats are being increasingly fragmented and, consequently, many of the edaphic endemic species that grow at these sites are now classified as threatened or endangered (Brooks et al. 2002). The threats faced by these taxa are similar to those associated with small population sizes, including genetic drift and inbreeding depression (Ellstrand and Elam 1993; Aguilar et al. 2008), and with fragmentation, such as increased isolation (Vrancky et al. 2012; Breed et al. 2013; Ison and Wagenius 2014) and limited recruitment (Gibson et al. 2012). Combined, these threats can lead to lower genetic diversity within populations and high genetic differentiation among them (Gitzendanner and Soltis 2000; Fant et al. 2014; Tapper et al. 2014). These concerns become especially important when considering the restoration of these taxa (Hufford and Mazer 2003; Weeks et al. 2011), particularly for identifying suitable source populations that have not experienced a reduction in genetic diversity, are not heavily inbred and are also of local provenance (Brown and Briggs 1991; Hufford et al. 2012; Fant et al. 2013b).

One common group of edaphic endemics in North America comprises the taxa specialized on the serpentine rock outcrops found along the west coast from CA, USA to BC, Canada (Kruckeberg 1984). These habitats all occur on bedrock that is extremely basic, very low in silica, and rich in ferromagnesian minerals. Although this is a stressful environment, these outcrops support a number of endemic taxa that are specialized and restricted to this soil type (Anacker et al. 2011). Because these outcrops are discontinuous and usually restricted in size, the plant populations they support usually have a naturally patchy and fragmented distribution throughout the region (Wolf 2001). However, many of these outcrops are also threatened by urbanization, overgrazing, and climate change, and therefore many of the plant populations on them have become vulnerable to extinction (USFWS 1995; Elam et al. 1998; Damschen et al. 2011). In the San Francisco Bay Area, serpentine soil outcrops are found in eight of the nine counties in the region. These outcrops support 13 plant species that are federally listed as endangered or threatened, and many more that are of special concern or state listed. Due to the high endemism, a region-wide recovery plan has been developed to protect the habitat and the species found on these outcrops (Elam et al. 1998).

One of these endemic serpentine species is Castilleja affinis ssp. neglecta (Zeile) T. I. Chuang and Heckard (Orobanchaceae), which is only known from six sites (formerly ten sites, but four sites were either misidentified or not found). Its natural distribution is restricted to the San Francisco Bay Area, and consequently it is federally listed as endangered (USFWS 1995; Niederer and Weiss 2011). Currently, C. affinis ssp. neglecta is the focus of active restoration efforts and there is a desire to increase the size of known populations and reintroduce individuals to historic sites (Elam et al. 1998; Niederer and Weiss 2011). C. affinis ssp. affinis is a more widespread subspecies within the group and co-occurs within the nine counties surrounding the San Francisco Bay Area. Previous karyotyping suggests that ploidy levels can vary in the C. affinis complex (Chuang and Heckard 1992), but populations of both C. affinis ssp. neglecta and C. affinis ssp. affinis are described as hexaploid in the San Francisco Bay Area. Heckard (1968) proposed that these taxa are likely of allopolyploid origin, although the evolution of genomes within the genus Castilleja is thought to be the product of a combination of allo- and autopolyploidization (Heckard 1968; Meirmans and Van Tienderen 2012). Although these subspecies share ploidy, a close phylogenetic relationship, overlapping ranges, and breeding systems, they differ in floral color and soil preference. C. affinis ssp. affinis has large red bracts and is believed to be bird pollinated (Grant 1994), whereas C. affinis ssp. neglecta has a yellowish flower thought to attract bee pollinators, although to our knowledge no pollination study has been conducted. Such a shift in pollinators can drive reproductive isolation between subspecies (Kay and Sargent 2009) and influence population genetic structure (Hughes et al. 2005; Kramer et al. 2011; Toon et al. 2014).
In addition, hybrid swarms and introgression in Castilleja (Egger 1994; Hersch-Green and Cronn 2009) can influence the morphology and genetics of closely related species and complicate restoration strategies (Ownbey 1959; Hersch and Roy 2007). C. affinis ssp. affinis is known to hybridize with Castilleja wightii and sometimes forms hybrid swarms (Heckard 1968; Mark Egger [UTW], pers. comm.); however, to date there is no evidence of hybridization with C. affinis ssp. neglecta.

Clarifying the genetic relationships within and between these two subspecies is important in designing management protocols to conserve C. affinis ssp. neglecta. The major goals of this study are to determine: (1) whether populations of C. affinis ssp. neglecta differ genetically and morphologically from neighboring populations of C. affinis ssp. affinis; (2) whether there is evidence of hybridization or shifts in ploidy in the extant populations of C. affinis ssp. neglecta; and (3) the genetic structure of known populations of C. affinis ssp. neglecta, specifically whether there is evidence of inbreeding within populations and whether the populations are genetically distinct. We used morphometric measurements (flower, leaf, color) of the key characteristics that distinguish the two subspecies to confirm taxonomic divergence and identify potential hybridization. Genetic structure and diversity were assessed with microsatellite markers to examine genetic diversity and isolation within and among subspecies. We hypothesize that C. affinis ssp. neglecta will differ from its common congener for both morphological and neutral genetic markers, that there will be no evidence of recent hybridization in extant populations, and that, because C. affinis ssp. neglecta has smaller, fragmented populations, these populations will have less diversity and higher inbreeding.

Methods

Study system

The genus Castilleja (~180 species) is a group of root hemiparasitic species in the broomrape family, distributed throughout North and South America from coastal dunes to alpine meadows (Hickman 1992; Tank and Olmstead 2008). C. affinis ssp. affinis and C. affinis ssp. neglecta are closely related subspecies found along the west coast of the United States. C. affinis ssp. neglecta is a federally listed endangered serpentine endemic that differs from its sister subspecies in having yellow to peach colored floral bracts and shorter flowering stems. Like many of the region’s serpentine endemics, C. affinis ssp. neglecta has a narrow distribution and small population sizes, which currently range from fewer than 20 to 1000 plants (California Natural Diversity Data Base 2011). C. affinis ssp. affinis, which is much more widespread, has red floral bracts and long flowering stems.

Study area

Castilleja affinis ssp. neglecta occurs in the Inner North Coast District and San Francisco Bay Area floristic provinces. It is a serpentine soil endemic limited to six locations, all in the San Francisco Bay Area (<400 m above sea level). C. affinis ssp. affinis is found throughout California in chaparral and the Sierra Nevada foothills, along with some coastal scrub areas in central and southern California (Fig. 4). Although C. affinis ssp. affinis has a very wide floristic range, sampling was restricted to the floristic provinces where C. affinis ssp. neglecta co-occurs. A total of six populations of C. affinis ssp. neglecta and six of C. affinis ssp. affinis were visited in the San Francisco Bay Area and along the coast of central California in May 2013 (Table 1). All known C. affinis ssp. neglecta sites were visited, as determined through California Natural Diversity Database (CNDDB) records and conversations with land managers. Selection of C. affinis ssp. affinis collection sites was based on herbarium vouchers, databases, and conversations with land managers. Sources included geo-referenced locations found on the Jepson Herbarium eFlora website (http://ucjeps.berkeley.edu/IJM.html) and the Consortium of California Herbaria (http://ucjeps.berkeley.edu/consortium/), along with the CNDDB data in ArcGIS shapefiles. Populations where the focal subspecies were previously photographed by Mark Egger [UTW] were also identified and revisited. Finally, land managers and California Native Plant Society chapter members were contacted to confirm subspecies occurrence at populations presumed from herbarium vouchers and CNDDB data.

Table 1 Site name, collection code, individuals sampled (n), GPS coordinates, voucher identification by Mark Egger (UTW)

Sample collection

One to three herbarium voucher specimens were collected at each population and deposited at the Chicago Botanic Garden’s Nancy Poole Rich Herbarium, Glencoe, IL (CHIC). Duplicate herbarium vouchers collected at Point Reyes National Seashore were deposited at Point Reyes National Seashore Herbarium (PORE), Point Reyes Station, CA and remaining duplicate vouchers were sent to the University of Washington (UTW), Seattle, WA as part of the extensive Castilleja herbarium collection of Mark Egger. The herbarium vouchers were verified by Mark Egger [UTW].

One to four leaves were collected below the inflorescences from 15 to 35 individuals at each population, and one to five flowers were collected from approximately half (roughly 15) of the same individuals. All leaves were digitally scanned for morphometric comparisons, along with a metric ruler, using a Canon LiDE 60 (LED Indirect Exposure) scanner. Leaves were then stored in coin envelopes with silica to be used for DNA extraction at the Chicago Botanic Garden’s Harris Family Foundation Plant Genetics Laboratory. The color of the bract tips subtending the flowers was scored using a Royal Horticultural Society (RHS) yellow to red color chart (The Royal Horticultural Society, London, England 2007). These color chart scores were then converted to red–green–blue (RGB) ratios for analysis (Azalea Society of America, Inc 2007). Flowers and subtending bracts were then stored in tubes of 75% ethanol solution and taken to the Chicago Botanic Garden for morphometric measurements.

Morphometrics

The floral measurements were adapted from Crosswhite and Crosswhite (1970), who identified informative floral and bract traits for Castilleja sessiliflora. Fifteen morphological traits were selected based on characteristics used in identifying the subspecies in the field (Hickman 1992; Jepson Flora Project 2013; Mark Egger [UTW], pers. comm.). These included two leaf measurements (width and length), three floral bract measurements (width, length, and lobe length), three calyx measurements (length, width and lobe length), three corolla measurements (length, widest width, narrowest width) and finally beak length, herkogamy, stigma exsertion and stamen exsertion (Fig. 1). Leaves were measured from digitally scanned leaf images using the program ImageJ (Rasband 2010), while flowers were measured from preserved specimens using digital calipers.

Fig. 1

Illustration of a Castilleja bract, the flower (with corolla and calyx combined), and the corolla on its own, with the locations of floral measurements indicated

DNA extraction and genotyping

Genomic DNA was extracted from ~1 cm² of silica-dried leaf tissue, using a modified 2× cetyltrimethylammonium bromide (CTAB) method adapted for silica-dried leaves (Doyle and Doyle 1987). The quality and concentration of DNA were evaluated using a Nanodrop 2000 (ThermoScientific, USA). The eleven microsatellite primers used (A01, A101, A102, B04, B104, B116, C104, C105, D101, D103, and D119) were developed for C. sessiliflora (Fant et al. 2013a) but amplified reliably and were variable in both subspecies. Each forward primer was labeled with WellRed D2, D3, or D4 fluorescent dye (Sigma-Aldrich, St. Louis, MO, USA), which allowed samples to be multiplexed within a single polymerase chain reaction (PCR). For each reaction, approximately 1 µl of genomic DNA (100 ng/µl) was mixed with 3 µl of DNA-grade water, 5 µl of MyTaq™ Master mix (Bioline, Taunton MA, USA), 0.25 µl of BSA (10 mg/ml), 0.25 µl of magnesium sulfite (25 mM), 0.125 µl of two blue (D4) labelled-forward (10 mM) and reverse primers (10 mM), 0.25 µl of a green (D3) labelled-forward (10 mM) and reverse primer (10 mM), and 0.5 µl of a black (D2) labelled-forward (10 mM) and reverse primer (10 mM). Thermal cycler conditions for the multiplex PCR reactions started with an initial denaturation at 94 °C for 4 min, followed by 35 cycles of 95 °C for 40 s, 55 °C for 40 s, and 72 °C for 1 min, and ended with a final extension at 72 °C for 10 min. PCR products were analyzed on a CEQ 8000 Genetic Analysis System with GenomeLab 400 internal size standard (Beckman Coulter, Brea, CA, USA). Before samples were run on the CEQ 8000 Genetic Analysis System, labeled multiplexed PCR products were mixed with 30 µl of HiDi (Azco Biotech., San Diego, CA) and 0.3 µl of 400 bp size standard ladder (Beckman Coulter, Brea, CA, USA) in each well of a 96-well plate.
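As a quick check on the recipe above, the per-reaction volumes and the programmed cycling time can be totaled with a few lines of Python. This is only a sketch: it assumes that each stated primer volume applies to every individual primer (forward and reverse), with two D4 primer pairs and one pair each of D3 and D2. The function names and the 10% pipetting overage are hypothetical and are not taken from the protocol.

```python
# Per-reaction volumes in microlitres, as listed in the text.
# Assumption: each primer volume is per individual primer (forward and
# reverse), with two D4 pairs and one pair each of D3 and D2.
REACTION_UL = {
    "genomic DNA": 1.0,
    "DNA-grade water": 3.0,
    "MyTaq master mix": 5.0,
    "BSA": 0.25,
    "magnesium additive": 0.25,
    "D4 primers (2 pairs x 2 primers x 0.125)": 2 * 2 * 0.125,
    "D3 primers (1 pair x 2 primers x 0.25)": 2 * 0.25,
    "D2 primers (1 pair x 2 primers x 0.5)": 2 * 0.5,
}


def total_volume(recipe):
    """Total volume of one reaction in microlitres."""
    return sum(recipe.values())


def master_mix(recipe, n_reactions, overage=0.10, per_sample=("genomic DNA",)):
    """Scale shared reagents for n reactions.

    The overage is a hypothetical pipetting allowance; template DNA is
    excluded because it is added to each well individually.
    """
    factor = n_reactions * (1 + overage)
    return {name: vol * factor
            for name, vol in recipe.items() if name not in per_sample}


def cycling_minutes(initial_min=4, cycles=35, step_seconds=(40, 40, 60), final_min=10):
    """Programmed hold time only; ramping between temperatures is ignored."""
    return initial_min + cycles * sum(step_seconds) / 60 + final_min


reaction_total = total_volume(REACTION_UL)   # 11.5 under the stated assumption
plate_mix = master_mix(REACTION_UL, 96)      # shared reagents for a 96-well plate
run_minutes = cycling_minutes()              # about 95.7 minutes of programmed holds
```

Under this reading of the recipe, each reaction totals 11.5 µl; a different reading of the primer volumes would change the total accordingly.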

As some of the primers used amplified multiple regions (Fant et al. 2013a), C. wightii, a diploid species, was used to determine whether multiple peaks resulted from gene duplication, in which case multiple regions would also appear in a diploid species, or simply from ploidy differences. Three primers (B116, D119, and B104) amplified multiple regions in the diploid C. wightii; however, the regions produced were sufficiently separated from each other that they could be scored separately. These distinct regions showed no sign of linkage in C. wightii and hence were treated separately in the analysis. Because the ability to check for null alleles and linkage disequilibrium in a hexaploid dataset is limited, the primers were checked for null alleles using exact tests in MICRO-CHECKER (Van Oosterhout et al. 2004) and for linkage disequilibrium using Fisher’s method in GENEPOP (Raymond and Rousset 1995) in the diploid C. wightii and in the original C. sessiliflora (Fant et al. 2013a). For C. affinis ssp. affinis and C. affinis ssp. neglecta, a total of 14 regions were amplified and scored. Although peak height can be used to score allele frequencies, it is difficult to quantify and not reliable, so only the presence or absence of a peak was scored. The maximum number of peaks observed was used to determine ploidy variation. All bands occurred within the region described in Fant et al. (2013a). As the microsatellites amplified a maximum of six peaks for both C. affinis ssp. affinis and C. affinis ssp. neglecta, both were scored as hexaploids.
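The scoring rule described above (presence or absence of each peak, with the maximum peak count used as an indicator of ploidy) can be sketched in Python as follows. The data structure, function names, and allele sizes are hypothetical illustrations, not the scoring software or data used in the study.

```python
def max_peaks(genotypes):
    """Largest number of distinct peaks seen at any locus in any individual.

    genotypes: {individual: {locus: set of allele sizes}}
    """
    return max(len(alleles)
               for loci in genotypes.values()
               for alleles in loci.values())


def presence_absence(genotypes):
    """Build a 0/1 table with one column per (locus, allele) combination."""
    columns = sorted({(locus, allele)
                      for loci in genotypes.values()
                      for locus, alleles in loci.items()
                      for allele in alleles})
    table = {ind: [1 if allele in loci.get(locus, set()) else 0
                   for locus, allele in columns]
             for ind, loci in genotypes.items()}
    return columns, table


# Hypothetical peak calls (allele sizes in bp) for two individuals.
example = {
    "ind1": {"A01": {150, 154, 158}, "B04": {200}},
    "ind2": {"A01": {150, 152, 154, 156, 158, 160}, "B04": {200, 204}},
}

observed_max = max_peaks(example)          # six peaks, consistent with a hexaploid
columns, table = presence_absence(example)
```

Because only presence or absence is recorded, an individual showing three peaks at a hexaploid locus carries no information about how the remaining three allele copies are distributed among those peaks.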

Statistical analysis

Morphological data

All morphological analyses were performed using R Statistical Software version 3.1.0 (R Core Team 2014). As each measurement represents a different aspect of the same flower, correlations among the floral traits and leaf measurements were tested using the corr.test function in the R package psych (Revelle, in preparation) to identify traits that might be responding independently of each other. A linear mixed effects model (lme) was used to compare morphological traits between subspecies and populations. When comparing subspecies, we included population as a random variable to account for variation between populations. A Tukey’s honest significant difference (HSD) test was conducted on each trait using the R package agricolae (De Mendiburu 2009) to determine the similarity between populations. To look at whole-flower differences, rather than single-trait variation, a non-metric multidimensional scaling (NMDS) was performed using the R package vegan (Oksanen et al. 2013) on floral, bract and leaf morphological data and RGB values to compare the populations and subspecies. As the NMDS does not handle negative data, stigma exsertion, stamen exsertion and herkogamy were relativized to the maximum value.
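The need to remove negative values before ordination can be illustrated with a short Python sketch. It shows one possible reading of the relativization step (min-max rescaling to the 0 to 1 range), which is an assumption and not necessarily the exact transformation applied here. It also uses Bray-Curtis dissimilarity, the default in vegan's metaMDS, as an example of a measure that is only meaningful for non-negative data; the text does not state which dissimilarity was used. Function names and values are hypothetical.

```python
def rescale_nonnegative(values):
    """Min-max rescaling to [0, 1]; one possible way to remove negatives."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-negative trait vectors."""
    if any(x < 0 for x in a) or any(y < 0 for y in b):
        raise ValueError("Bray-Curtis is only meaningful for non-negative data")
    denominator = sum(x + y for x, y in zip(a, b))
    if denominator == 0:
        raise ValueError("both vectors are all zeros")
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Hypothetical stigma exsertion values (mm); a negative value would mean
# the stigma sits inside the corolla.
exsertion = [-1.5, 0.0, 2.5]
scaled = rescale_nonnegative(exsertion)   # [0.0, 0.375, 1.0]
```

Any monotonic transformation that maps the traits onto a non-negative scale would serve the same purpose; the choice affects the relative weight each trait receives in the dissimilarity.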

Genetic data

GenoDive V 2.0, which has been designed for use with polyploid data (Meirmans and Van Tienderen 2004), was used to calculate all genetic parameters, including genetic diversity (He), polymorphic loci (Ap), number of private alleles, inbreeding (GIS), and genetic differentiation (GST and AMOVA). The parameters used for the AMOVA analysis were “only populations”, “no grouping” and “ploidy independent infinite allele model (Rho)”. GenoDive was also used to calculate Nei’s genetic distance (Nei 1987) and the pairwise distances between populations and between subspecies. The parameters for GIS were calculated from genetic distances, under the Polyploid Dosage Corrected G′ST (Nei) option. The polyploid dosage correction in GenoDive corrects for the unknown dosage of alleles in marker phenotypes, which allows estimation of GST with less bias (Meirmans 2013). Nei’s genetic distance indices were calculated separately for every population and separately for every locus. GenAlEx (Peakall and Smouse 2012) was used to determine the number of private alleles among subspecies and within a subspecies.
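For readers unfamiliar with these statistics, the textbook forms of expected heterozygosity and Nei's G-statistics can be written out in a few lines of Python. This is a simplified sketch: it omits the sample-size and polyploid dosage corrections that GenoDive applies, and all numeric inputs are hypothetical rather than values from this study.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies p_i at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1 - sum(p * p for p in freqs)


def g_is(h_o, h_s):
    """Inbreeding coefficient: heterozygote deficit relative to expectation."""
    return 1 - h_o / h_s


def g_st(h_s, h_t):
    """Differentiation: share of total diversity found among populations."""
    return (h_t - h_s) / h_t


# Hypothetical values for illustration only.
he = expected_heterozygosity([0.5, 0.3, 0.2])   # 1 - (0.25 + 0.09 + 0.04) = 0.62
inbreeding = g_is(h_o=0.54, h_s=0.62)           # positive value: heterozygote deficit
differentiation = g_st(h_s=0.60, h_t=0.63)      # small value: little differentiation
```

A positive GIS indicates fewer heterozygotes than expected under random mating, while values near or below zero indicate no evidence of inbreeding.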

A principal components analysis (PCA) of the microsatellite data was also conducted using POLYSAT, a package in R Statistical Software developed for analyzing microsatellites of autopolyploid and allopolyploid species (Clark and Jasieniuk 2011). Allele frequencies and a pairwise distance matrix using Lynch distance were calculated between all samples by subspecies, then between populations of each subspecies. The pairwise distance matrix was used to create a PCA of the microsatellite data with the R function cmdscale for classical multidimensional scaling of a data matrix, both to compare C. affinis ssp. affinis with C. affinis ssp. neglecta and to compare the populations of C. affinis ssp. neglecta. Lynch distance was selected over Bruvo distance because it has been shown to work better for distinguishing subspecies and populations (Clark and Jasieniuk 2011). POLYSAT was also used to estimate allele frequencies in populations, which were then used to create datasets suitable for the statistical programs SPAGeDi (Hardy and Vekemans 2002) and STRUCTURE (Pritchard et al. 2000).
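The band-sharing idea behind the Lynch distance can be sketched in Python as below. The sketch assumes the usual definition (one minus the number of shared alleles divided by the average number of alleles in the two genotypes, averaged over loci) and is not the polysat source code; the function names and genotypes are hypothetical.

```python
def lynch_distance(alleles_a, alleles_b):
    """1 - (shared alleles) / (mean number of alleles in the two genotypes)."""
    a, b = set(alleles_a), set(alleles_b)
    if not a or not b:
        raise ValueError("both genotypes must contain at least one allele")
    return 1 - 2 * len(a & b) / (len(a) + len(b))


def mean_lynch_distance(ind1, ind2):
    """Average the per-locus distance over loci scored in both individuals."""
    shared_loci = [locus for locus in ind1
                   if locus in ind2 and ind1[locus] and ind2[locus]]
    if not shared_loci:
        raise ValueError("no loci scored in both individuals")
    return sum(lynch_distance(ind1[locus], ind2[locus])
               for locus in shared_loci) / len(shared_loci)


# Hypothetical genotypes: allele sizes in bp at two loci.
plant_x = {"A01": {150, 154, 158}, "B04": {200, 204}}
plant_y = {"A01": {150, 154, 160, 162}, "B04": {200, 204}}

d_a01 = lynch_distance(plant_x["A01"], plant_y["A01"])   # 1 - 4/7
d_mean = mean_lynch_distance(plant_x, plant_y)            # (3/7 + 0) / 2
```

Because this measure depends only on which alleles are present, it does not require allele copy number, which is what makes it usable for hexaploid marker phenotypes.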

SPAGeDi (Hardy and Vekemans 2002) was used to calculate two measures of pairwise genetic distance among populations: (1) Weir and Cockerham FST (Weir and Cockerham 1984), and (2) rho (Ronfort et al. 1998). Although both measures are calculated assuming that the dataset is derived from an autopolyploid, Meirmans and Van Tienderen (2004) demonstrated that rho (Ronfort et al. 1998) is robust regardless of the assumption made about the origin of ploidy, while FST was included because it is a more conventional measure and a better statistic for making demographic inferences (Meirmans and Hedrick 2011; Whitlock 2011). To test for isolation by distance, pairwise genetic distances between populations were regressed against Euclidean distance (km). Correlation between pairwise genetic distance and geographic distance was checked using Mantel (1967) tests (10³ permutations) in GENALEX (Peakall and Smouse 2012). To determine whether pairwise genetic distances between subspecies could be best explained by geographic distance or taxonomic separation, the pairwise comparisons were separated into (a) pairwise distances between populations of C. affinis ssp. affinis, (b) pairwise distances between populations of C. affinis ssp. neglecta and (c) pairwise distances between populations of the two subspecies.
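The logic of a Mantel test can be sketched in Python as follows. This is a generic one-tailed permutation version, not the GenAlEx implementation; the function names, site positions, distances, and permutation count are hypothetical and chosen only to make the example runnable.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the above-diagonal entries of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """One-tailed Mantel test: permute population labels of one matrix."""
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_vec)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Rows and columns are permuted together so matrix structure is kept.
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical sites along a transect (km) with genetic distance
# proportional to geographic distance.
positions = [0, 10, 25, 60]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.002 * d for d in row] for row in geo]
r, p = mantel(gen, geo, permutations=199)
```

Permuting whole populations, rather than individual matrix cells, is what distinguishes a Mantel test from an ordinary correlation test, because pairwise distances that share a population are not independent.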

The Bayesian clustering analysis software STRUCTURE (Pritchard et al. 2000) was used to visualize genetic structure within and among subspecies. Both subspecies were treated as hexaploid, based on the maximum number of unique alleles. As it was not possible to infer copy numbers of markers, each unique allele was scored once. Given the limitations of using hexaploid data in STRUCTURE without knowing allele frequencies or whether these subspecies are derived from auto- or allopolyploids, we chose to use three parallel datasets that make different assumptions. The first dataset (polysat) was derived from polysat, which replaces unknown additional alleles with a common allele from the dataset (as recommended in the STRUCTURE manual). In the second dataset (non-polysat), all confirmed alleles were scored once and the remaining alleles, up to six, were scored as missing, which requires little guesswork about allele frequencies but likely results in biased estimates. Finally, we treated the third dataset (AFLP) as dominant data, converting it to the equivalent of an AFLP dataset in STRUCTURE. This format reduces the amount of information we can draw from the dataset but makes fewer assumptions. To convert to the AFLP dataset, each allele was treated as present or absent (binary). Hence the absence of an allele was scored as six zeros, while the presence of that allele was scored as a one with the remaining five copies as missing, to account for the fact that we did not know the frequency of the allele. The parameters for STRUCTURE were a ploidy level of six, a burn-in period of 100,000, and 100,000 MCMC reps after the burn-in, using the admixture model. For the STRUCTURE analysis comparing subspecies, K values from 1 to 10 were tested, and for the analysis of populations within each subspecies K values from 1 to 12 were tested. The methods described by Evanno et al. (2005), as implemented in Structure Harvester (Earl and vonHoldt 2012), were used to choose the most likely K.
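The second and third encodings described above can be illustrated with a short Python sketch. It assumes -9 as the missing-data code, a common convention in STRUCTURE input files, and the function names and allele sizes are hypothetical. It shows only how one locus of one individual would be laid out across six allele copies, not the full STRUCTURE file format.

```python
MISSING = -9   # assumed missing-data code (a common STRUCTURE convention)
PLOIDY = 6


def non_polysat_copies(alleles, ploidy=PLOIDY):
    """Score each confirmed allele once; pad remaining copies as missing."""
    alleles = sorted(set(alleles))
    if len(alleles) > ploidy:
        raise ValueError("more alleles than the assumed ploidy allows")
    return alleles + [MISSING] * (ploidy - len(alleles))


def aflp_copies(alleles, all_alleles, ploidy=PLOIDY):
    """Treat each allele as a dominant marker.

    Absent allele  -> six zeros.
    Present allele -> a single 1, with the other five copies missing,
                      because the allele's dosage is unknown.
    """
    observed = set(alleles)
    return {allele: ([1] + [MISSING] * (ploidy - 1)
                     if allele in observed else [0] * ploidy)
            for allele in sorted(all_alleles)}


# Hypothetical locus with three alleles in the dataset; this plant shows two.
locus_alleles = {150, 154, 158}
plant = {150, 158}

codominant = non_polysat_copies(plant)        # [150, 158, -9, -9, -9, -9]
dominant = aflp_copies(plant, locus_alleles)  # one marker column per allele
```

The AFLP-style encoding multiplies the number of marker columns (one per allele instead of one per locus) while discarding the information that alleles at the same locus are alternatives to one another, which is the trade-off noted in the text.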

Results

Morphometric differences among subspecies

Most of the 15 morphological traits measured (three bract, ten floral, and two leaf measurements) were found to be significantly correlated (p < 0.05). The exceptions were leaf length, which was not significantly correlated with any floral or bract measurements, and leaf width, which was only significantly correlated with bract width and length. The three bract and ten floral measurements were strongly correlated (r² > 0.50) with each other, with some of the strongest correlations (r² > 0.70) between corolla length, stigma exsertion, stamen exsertion and herkogamy. Most of the traits showed significant differences between the subspecies, confirming they are useful for distinguishing the taxa. The exceptions were leaf length, narrowest corolla width, beak length, stamen exsertion and herkogamy. An NMDS showed distinct differences between the two subspecies with very little overlap (Fig. 2a); however, as nearly all traits were correlated, it was not possible to identify which traits were driving this separation. Despite no significant difference in leaf length between the subspecies, populations of C. affinis ssp. neglecta had significantly thinner leaves than populations of C. affinis ssp. affinis, and their bracts were also significantly shorter and thinner. All the floral traits were shorter or thinner in C. affinis ssp. neglecta. In both subspecies the stamen filament was shorter than the corolla (inserted) and the anther and stigma extended past the corolla (exserted), although stigmas were significantly less exserted in C. affinis ssp. neglecta (Table 2).

Fig. 2

a–c NMDS comparison of morphological data; images are repeated to highlight a differences between subspecies, b differences between populations of C. affinis ssp. affinis and C. affinis ssp. neglecta, and c differences between populations of C. affinis ssp. neglecta relative to C. affinis ssp. affinis. d–f PCA comparison of genetic data; images are repeated to highlight d differences between subspecies, e differences between populations of C. affinis ssp. affinis and C. affinis ssp. neglecta, and f differences between populations of C. affinis ssp. neglecta and C. affinis ssp. affinis

Table 2 Mean values ± CI (95%) for morphological characters, and color ranges for C. affinis subspecies

The most distinguishing trait between the taxa was floral color. C. affinis ssp. neglecta had more yellowish floral bracts, while the populations of C. affinis ssp. affinis had orange-red floral bracts. C. affinis ssp. neglecta had floral bracts that ranged from yellow to a yellowish-pink color in the RHS color chart, although the CNAC population had one individual with red floral bracts. Using the RGB color ranges to quantify differences, both subspecies had floral bracts with similar red (R) absorption values (162–255) on the RGB scale, although the average R-value was highest in C. affinis ssp. neglecta. The largest differences were in the values for green (G) and blue (B) absorption. Castilleja affinis ssp. neglecta, with more yellow floral bracts, had greater absorption in both the green and the blue spectra (G = 41–234 ± 4.6, B = 16–187 ± 2.7), while C. affinis ssp. affinis, which had redder bracts, showed much lower absorption in the green and the blue spectra (G = 23–108 ± 2.1, B = 13–57 ± 1.5).

Morphometric results within subspecies

Within a subspecies, all the traits varied significantly (p < 0.001) between populations. However, in the NMDS, five of the C. affinis ssp. neglecta populations overlapped (Fig. 2b), suggesting considerable morphological cohesion. The one exception was the population CNAC, which was consistently different from the rest. This population had the longest leaves, bracts, calyx, calyx lobe, and corolla, as well as the widest corolla of all populations. The RGB values showed that three C. affinis ssp. neglecta populations (CNPH, CNPC, and CNNR) had more yellowish-pink floral bracts, while the remaining three populations were more strongly yellow. The populations of C. affinis ssp. affinis showed significantly more morphological differentiation than those of C. affinis ssp. neglecta (Fig. 2b, c), with the NMDS analysis showing large differences between the populations of C. affinis ssp. affinis and only moderate overlap. The population with the largest leaves was CASCS, while CAPC and CATB had the largest and widest bracts of the five C. affinis ssp. affinis populations. The population CATB had the shortest floral traits and lowest stigma exsertion of all populations.

Microsatellite data

Both subspecies generated a maximum of six peaks across all fourteen markers, as expected for a hexaploid, although a majority of individuals and markers showed far fewer peaks. Some markers produced a high level of variability, with up to 38 alleles per locus (Table 1); however, most alleles were rare. The most frequent alleles were common to both subspecies and all populations; consequently, there was considerable overlap between the two subspecies for these neutral markers. The AMOVA showed that 98% of the genetic variation was shared across subspecies, and there was low differentiation between the subspecies when averaged across populations (FST = 0.05 and rho = 0.25; Table 3).

Table 3 Average geographic (km) and genetic (Fst and Rho) pairwise distances between populations

Three measures of genetic diversity (average number of alleles per locus, average number of private alleles per locus, and HS) were used to compare the two subspecies. The average number of alleles for populations of C. affinis ssp. neglecta ranged from 7.4 to 9.6, with an average of 8.5 ± 0.4, which was slightly higher than in C. affinis ssp. affinis, whose average number of alleles ranged from 5.4 to 9.1, with an average of 6.5 ± 0.6. Similarly, genetic diversity (HS) in C. affinis ssp. neglecta ranged from 0.56 to 0.68, with an average of 0.63 ± 0.02. This is slightly higher than in C. affinis ssp. affinis, whose genetic diversity (HS) ranged from 0.49 to 0.63, with an average of 0.56 ± 0.02. By contrast, populations of C. affinis ssp. affinis showed a larger variation in the average number of private alleles per locus, ranging from 0.2 to 1.2, and a higher average (0.6 ± 0.15) compared to C. affinis ssp. neglecta, which ranged from 0.3 to 0.6, with an average of 0.4 ± 0.05. Nonetheless, a comparison of all metrics for genetic diversity showed no significant differences between populations of the subspecies. This was likely a product of the high variability between populations of C. affinis ssp. affinis, driven by large differences in sample sizes. By contrast, inbreeding showed a significant difference between the two subspecies (t9 = −3.7, p = 0.005), with none of the populations of C. affinis ssp. affinis showing any evidence of inbreeding, while four of the six populations of C. affinis ssp. neglecta showed some evidence of moderate inbreeding (GIS > 0.07).

All populations of C. affinis ssp. neglecta had levels of genetic diversity equivalent to that of C. affinis ssp. affinis (Table 1). CNMR had the highest diversity for all three measures (Â = 9.6; Prv = 0.6, HS = 0.68), while CNPC had the lowest (Â = 7.4; Prv = 0.3; HS = 0.56). CNAC and CNNR were the only two that showed no evidence of inbreeding (GIS = −0.04 and −0.02 respectively); all the remaining populations showed low to moderate inbreeding with the highest in CNPC (GIS = 0.13).

The PCA generated for microsatellite data using Lynch Distances in Polysat showed a clear separation between the two taxa (Fig. 2d), despite some individuals of C. affinis ssp. neglecta overlapping with C. affinis ssp. affinis individuals. A majority of individuals that overlapped with C. affinis ssp. affinis individuals were predominately from two of the six populations (CNAC and CNPH; Fig. 2e). Most individuals of C. affinis ssp. affinis tended to cluster (Fig. 2f), suggesting a high similarity. By contrast, individuals of C. affinis ssp. neglecta showed a much larger spread (Fig. 2e), suggesting greater differences in these neutral markers.

When comparing average pairwise genetic distance (Fst and rho) between populations of C. affinis ssp. neglecta, there were very low levels of genetic differentiation (average Fst < 0.05 and average rho < 0.20) between all populations (Table 1). This is in contrast to some C. affinis ssp. affinis populations (CAPB, CASCS and CATB), which showed much higher genetic differentiation (average Fst > 0.05 and average rho > 0.20). Not surprisingly, two of these populations were designated as ambiguous (Mark Egger [UTW], pers. comm.). When comparing pairwise differences between subspecies, the pairwise differences for C. affinis ssp. neglecta were much higher (average Fst > 0.05 and average rho > 0.20) for all populations except CNAC and CNPH, which were the most similar to C. affinis ssp. affinis populations (Tables 1, 3). The isolation by distance graph showed similar patterns regardless of the measure of genetic distance used (Fst or rho). For the same geographic distances, the pairwise genetic distance between populations of the same subspecies was smaller than the pairwise genetic distance between populations of different subspecies. This suggests that taxonomic distinctions explain the separation between the groups better than geographic distance. Genetic distance increased with geographic distance within both subspecies, but was smaller for C. affinis ssp. neglecta population pairs than for C. affinis ssp. affinis population pairs. Genetic distance between populations of different subspecies was higher and showed no correlation with geographic distance (Fig. 3).
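One common way to test for isolation by distance is a Mantel-style permutation test, which correlates the pairwise geographic and genetic distance matrices and compares the result against randomly relabeled populations. The text does not state which test was used, so the sketch below is only an illustration under that assumption. The function names and the two small matrices are hypothetical placeholders, not data from this study.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the values above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(geo, gen, permutations=999, seed=0):
    """Mantel correlation and one-sided permutation p value."""
    rng = random.Random(seed)
    n = len(geo)
    geo_flat = upper_triangle(geo)
    observed = pearson(geo_flat, upper_triangle(gen))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        # Relabel populations in the genetic matrix (rows and columns together)
        rng.shuffle(order)
        permuted = [[gen[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(geo_flat, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Placeholder matrices (NOT study data): four populations along a line
positions_km = [0, 1, 3, 7]
geo = [[abs(a - b) for b in positions_km] for a in positions_km]
gen = [[0.01 * d for d in row] for row in geo]  # genetic distance proportional to km

r, p = mantel(geo, gen)
print(round(r, 3), p)
```

Running the same test separately on within-subspecies and between-subspecies population pairs would mirror the three comparisons plotted in Fig. 3.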

Fig. 3

Pairwise geographic distance (km) versus genetic distance as measured by (a) Fst (Weir and Cockerham 1984), assuming autoploidy, and (b) rho (Ronfort et al. 1998), which is robust regardless of ploidy type. Comparisons between pairs of C. affinis ssp. affinis populations are closed circles, comparisons between pairs of C. affinis ssp. neglecta populations are open circles, and triangles represent pairwise comparisons between populations of C. affinis ssp. affinis and C. affinis ssp. neglecta. (Color figure online)

Although the three datasets analyzed in STRUCTURE carry different assumptions about auto- and allopolyploid origins, allele frequencies and codominance, all produced relatively similar results. Using STRUCTURE Harvester (Earl and vonHoldt 2012), all datasets identified K = 2 or 3 as the best explanation of the data, matching the subspecies divisions, but all produced a secondary peak at K = 7 (polysat and non-polysat data) or K = 6 (AFLP dataset), associated with population differences. At the smallest value of K (K = 2 or K = 3), the division aligned with the subspecies distinction, with all of the individuals from C. affinis ssp. neglecta populations falling into one genetic cluster (>75% to cluster 1) and all of the individuals from C. affinis ssp. affinis populations falling into the second (>85% to cluster 2) (Fig. 4a, b). The exception was one population of C. affinis ssp. neglecta, CNAC, which seemed to be intermediate (~50% to both clusters). The distinction between the subspecies was much more pronounced with the polysat and non-polysat data than with the AFLP dataset.
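STRUCTURE Harvester is commonly used to apply the Evanno ΔK criterion, which divides the absolute second-order change in mean log likelihood across successive K values by the standard deviation of the log likelihood at K. Assuming that criterion was used to pick the primary and secondary peaks described above (the text does not say so explicitly), a minimal sketch looks like this. The function name `delta_k` and the replicate log-likelihood values are hypothetical placeholders, not STRUCTURE output from this study.

```python
import statistics


def delta_k(lnp_by_k):
    """Evanno-style delta K from replicate log likelihoods keyed by K.

    Uses |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd L(K), so it is
    defined only for K values that have both neighbours present.
    """
    mean = {k: statistics.mean(v) for k, v in lnp_by_k.items()}
    sd = {k: statistics.stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in mean and k + 1 in mean and sd[k] > 0:
            second_diff = abs(mean[k + 1] - 2 * mean[k] + mean[k - 1])
            result[k] = second_diff / sd[k]
    return result


# Placeholder replicate log likelihoods (NOT study data), three runs per K
runs = {
    1: [-5000, -5002, -4998],
    2: [-4500, -4501, -4499],
    3: [-4450, -4455, -4445],
    4: [-4440, -4450, -4430],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

In a real analysis, a secondary peak such as the one reported at K = 6 or K = 7 would appear as a second local maximum in the ΔK values.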

Fig. 4

Map of the distribution of the more widespread subspecies, Castilleja affinis subsp. affinis (red), and the endemic subspecies, C. affinis subsp. neglecta (green), within California, with a box highlighting overlapping ranges and the study area. The inset pie graphs represent the population breakdown of STRUCTURE output among subspecies (a, b) and among populations within both subspecies (c, d). (Color figure online)

At the higher values of K (K = 7 for polysat and non-polysat data, or K = 6 for the AFLP dataset), individuals separated out by population identity. As with K = 2, K = 6 for the AFLP dataset showed that in many cases each population had a majority association (>50%) with one genetic group, whereas K = 7 using the polysat (or non-polysat) dataset produced much clearer distinctions between populations (>60%) (Fig. 4c, d). Some trends were common to both datasets. Two populations of C. affinis ssp. neglecta, CNPH and CNPC, clustered together into the same genetic group, which is not surprising as they are geographically close. One population, CAPB, was always distinct from the others. The remaining populations comprised a combination of the remaining genetic groups. It is worth noting that the polysat and non-polysat data produced much clearer distinctions between populations than the AFLP dataset. With the polysat data, all populations of C. affinis ssp. neglecta could be separated into four groups: CNPH and CNPC formed one group, CNAC and CNMP the second, CNRM the third and CNNR the fourth. Meanwhile, all C. affinis ssp. affinis populations were predominantly composed of the three remaining groups, with CAPB standing out as being composed of one distinct group and all remaining populations composed of a mixture of the other two groups.

Discussion

Castilleja affinis ssp. neglecta, a serpentine soil endemic found only in the San Francisco Bay Area, is defined as a separate subspecies from its common congener C. affinis ssp. affinis. Morphological and genetic comparisons of C. affinis ssp. neglecta and C. affinis ssp. affinis populations revealed significant differences between the two subspecies, supporting their separation as distinguishable taxonomic units (Chuang and Heckard 1992; Hickman 1992). Overall, we found that within the Bay Area, C. affinis ssp. affinis has significantly longer and wider red floral bracts, while C. affinis ssp. neglecta has smaller yellow floral bracts, consistent with subspecies descriptions (Hickman 1992). The adaptation of C. affinis ssp. neglecta to serpentine soil within the vicinity of populations of its widespread congener (Cacho and Strauss 2014) might suggest these subspecies represent budding speciation, where a new species forms within or at the edge of an ancestral species, as described by Anacker and Strauss (2014). Given the patchy distribution of these serpentine outcrops, and that its congener, C. affinis ssp. affinis, is found growing throughout the neighboring habitats, the movement from general to serpentine habitats may have occurred multiple times (Kay et al. 2011).

The strong morphological consistency between populations of C. affinis ssp. neglecta, despite potential gene flow from congeners growing in the vicinity, suggests there could be selection favoring these floral traits (Yost et al. 2012). The differences could be driven by abiotic factors, such as soil preferences, which have been shown to generate variation in floral morphology and mating systems (Hamrick et al. 1979; Clegg 1980; Holtsford and Ellstrand 1992; Jogesh et al. 2017). Alternatively, these differences could be driven by biotic selection, such as an adaptation to a different suite of pollinators (Kay and Sargent 2009). Although no pollinator observations were conducted, the floral differences between the subspecies align with different expectations of pollinator syndromes (Grant 1966, 1994; Duffield 1972; Beardsley et al. 2003; Rosas-Guerrero et al. 2014), where the smaller, yellow flowers of C. affinis ssp. neglecta would be associated with bee pollination, while the large red flowers of C. affinis ssp. affinis are more likely hummingbird pollinated. If these subspecies do attract different suites of pollinators, this would reinforce reproductive isolation between them, despite the range overlap (Kay and Sargent 2009).

We found C. affinis ssp. neglecta populations had levels of genetic diversity equivalent to its more common congener, suggesting that this species is not limited by genetic diversity. Other similar studies have found that diversity in rare species often overlaps with their widespread congeners (Stebbins 1980; Gitzendanner and Soltis 2000). We also found that the level of genetic differentiation between populations of C. affinis ssp. neglecta was equivalent to, or even lower than, that found between populations of C. affinis ssp. affinis growing over a similar geographic area. This is not consistent with other studies that have found higher levels of differentiation in endemic subspecies restricted to fragmented habitats (Stebbins and Major 1965; Loveless and Hamrick 1984; Fiedler 1987; Wolf and Thorp 2011). This might be complicated by the poor taxonomic resolution of species in the Coastal California complex of Castilleja species, of which both C. affinis ssp. affinis and C. affinis ssp. neglecta are part (Pennell 1951; Chuang and Heckard 1992; David Tank [U. Idaho], pers. comm.). In addition, the overall low values for differentiation (FST and rho) likely underestimate the true value of genetic differentiation, since dosage differences cannot be determined accurately in hexaploids (Dufresne et al. 2014). This is somewhat supported by the elevated levels of inbreeding found in C. affinis ssp. neglecta populations, despite equivalent levels of realized gene flow (Slatkin 1985). This pattern may not be unexpected given that C. affinis ssp. neglecta inhabits restricted and locally patchy habitats (Frankham 1998; Fréville et al. 1998; Thompson 1999). However, other potential drivers of the higher inbreeding could be lower colonization ability and shorter dispersal distances. While seed dispersal was not directly observed, the seeds of Castilleja species are lightweight and either fall a short distance from the parent plant or could possibly be dispersed short distances by the wind (Caplow 2004). Because of their strong habitat specificity, edaphic endemics often show lower investment in pollen and seed dispersal, especially when compared to their more widespread congeners (Byers and Meagher 1997; Lavergne et al. 2004). As both subspecies share similar seed and pod structures, which suggest gravity dispersal, the differences are more likely driven by pollen dispersal. The change in flower color of C. affinis ssp. neglecta, from red to yellow, would make them more attractive to bees and other insect pollinators (Chittka and Waser 1997; Campbell et al. 2010), which have shorter average pollen movement distances than hummingbirds (Cronk and Ojeda 2008; Kramer et al. 2011).

As some populations of both subspecies showed some genetic or morphological overlap, this might indicate hybridization, incomplete divergence or misidentification. In particular, two populations of C. affinis ssp. neglecta (CNAC and CNPH) showed low genetic differentiation (Fst and rho) from C. affinis ssp. affinis populations, and individuals from both populations showed considerable overlap for microsatellite markers in the PCA plots. The CNAC population was also consistently morphologically different from the rest of the C. affinis ssp. neglecta populations, which, together with the low genetic differentiation, could be the result of hybridization with nearby C. affinis ssp. affinis populations. Although field identification of some Castilleja hybrids is possible (Egger 1994), it is difficult to know to what level hybridization has occurred and the role it may have on diversity and possible cryptic taxa known in Castilleja (Rieseberg 1995; Hersch-Green and Cronn 2009). Hybridization is common in the genus Castilleja (Chuang and Heckard 1992; Egger 1994; Hersch-Green and Cronn 2009; Hersch-Green 2012) and therefore cannot be ruled out as an explanation for the molecular overlap observed in these populations, despite both populations having morphological characters consistent with other C. affinis ssp. neglecta populations. A similar study by Sambatti and Rice (2006) found evidence of hybridization in ecotypes of Helianthus exilis, where selection maintained the strong ecotypic distinction in spite of extensive gene flow. One of the populations did have one red-flowering individual and overall larger flowers, which might be consistent with hybridization (Rieseberg 1995). Alternatively, this morphological and genetic overlap might simply represent the recent and incomplete separation of these subspecies. Regardless, we feel that the ecological differences, along with the morphological and genetic divergence, fit what Crandall et al. (2000) describe as an evolutionarily significant unit (ESU), and support C. affinis ssp. neglecta as a high conservation priority that merits separate management practices to prevent its extinction.

Conservation implications for C. affinis ssp. neglecta

The successful recovery and restoration of a species requires a complete knowledge of the species' status, including biotic and abiotic community preferences, demographic information, and knowledge of genetic diversity within the species as a whole (Amos and Balmford 2001). Monitoring all of these factors is important for identifying negative trends that could require further management (Noss 1990). Unfortunately, genetic monitoring is often neglected, although it can provide critical information for guiding management activities (Laikre et al. 2010). C. affinis ssp. neglecta is a federally endangered species for which active restoration plans are in place. Genetic diversity for all populations of C. affinis ssp. neglecta was found to be equivalent to that of its common congener, and similar to that of other endangered Castilleja species, C. levisecta and C. grisea, which also showed high genetic diversity despite factors including overgrazing (Helenurm et al. 2005) and geographic isolation (Godt et al. 2005). There was, however, some evidence of elevated levels of inbreeding in some populations, which is likely a result of the limited dispersal between patches. This higher inbreeding, along with the documented low population recruitment (Nieder and Weiss 2011), suggests some populations may benefit from genetic augmentation (Frankham et al. 2011).

One restoration option currently being considered is the augmentation of small populations using seed from neighboring populations. When moving plants from one population to another to increase a population's numbers, practitioners are often faced with the concerns that (1) the environments are so different that the plants will not thrive or (2) the introduced plants will be sufficiently genetically different that progeny from these crosses will show a decline in fitness, known as outbreeding depression (Frankham et al. 2011). Land managers who know the environments they are working in can best address the first concern. As for the genetic concerns, given the small geographic range, and the fact that all the populations share similar climate and soil preferences and show only minor morphological differences, the risk of outbreeding depression with C. affinis ssp. neglecta is likely reduced (Frankham et al. 2011). There was evidence that populations in close proximity were most similar, and therefore using proximal populations to augment populations will be the least risky option. Informed mixing of source populations may be a useful approach to maximize genetic diversity while minimizing inbreeding in reintroduction efforts (Hufford et al. 2012; Neale 2012; Pekkala et al. 2012; Fant et al. 2013b; Maschinski et al. 2013). Mixing seed sources for the rare Florida coastal Jacquemontia reclinata resulted in hybrid vigor and positive population growth for populations that had mixed source materials (Maschinski et al. 2013). Ideally, conducting a similar experiment on C. affinis ssp. neglecta could demonstrate whether this would also be the case for this subspecies. We support mixing geographically close populations as a source of seeds to enhance another population of C. affinis ssp. neglecta (Havens et al. 2015). The exception would be the CNAC and CNPH populations of C. affinis ssp. neglecta, which showed some genetic structure similar to C. affinis ssp. affinis; we advise against using these populations as source material because of possible introgression between the two subspecies at these locations. As a federally endangered species, C. affinis ssp. neglecta would benefit from restoration efforts that aim to increase genetic diversity and reduce inbreeding, such as informed mixing of proximal populations.