Introduction

When introductions of new plant species occur, gene flow between previously allopatric species or populations is a possibility. Creation of novel hybrids or genotypes may create increased genetic variation (Anderson 1949; Stebbins 1959; Abbott 1992), new interactions between genes (Templeton 1981), and the transfer of favorable genetic traits such as cold tolerance or disease/herbivore resistance (Milne and Abbott 2000; Abbott et al. 2003; Whitney et al. 2006; Rieseberg et al. 2007), all of which may enhance a species’ ability to invade (Ellstrand and Schierenbeck 2000; Vila et al. 2000; Lee 2002; Sakai et al. 2001). Most novel hybrids that are created are not selectively advantageous, but evidence is growing that hybrids or novel genotypes are involved in many invasions (Schierenbeck and Ellstrand 2008). Infamous recent examples include marsh grasses in the genus Spartina (Ainouche et al. 2008; Sloop et al. 2008), the freshwater aquatic plant Myriophyllum spp. (Moody and Les 2002), and the Brazilian pepper tree (Schinus terebinthifolius; Williams et al. 2005). In addition to potentially stimulating invasiveness, novel hybrids also may affect control efforts by presenting phenotypes that are novel to potential classical biological control agents.

Saltcedars (Tamarix ramosissima Ledeb. and T. chinensis Lour.; family Tamaricaceae) are diploid deciduous shrubs or small trees. Tamarix ramosissima is distributed across temperate Asia, while T. chinensis is restricted to China, Korea, and Japan (Baum 1978). Since their introduction into North America in the 1800s, these plants have become common in riparian habitats (Robinson 1965), and now inhabit 470,000–650,000 ha of the western USA (Zavaleta 2000). Saltcedar invasions are widely thought to have significant negative ecological and environmental impacts due to ground water consumption, alteration of hydrologic regimes, deposition of salts on underlying soils, reduced recreational use of riparian areas, reduced biodiversity, and poor wildlife habitat quality (Di Tomaso 1998; Zavaleta 2000; Bailey et al. 2001; Shafroth et al. 2005).

The two saltcedar species T. ramosissima and T. chinensis are difficult to distinguish morphologically (Crins 1989), differing slightly in sepal margin, petal shape, and filament insertion (but see Baum 1967; Baum 1978). Because of this morphological similarity some authors have, not unreasonably, listed T. ramosissima as a synonym of T. chinensis (e.g., Allred 2002). Recent molecular work has shown that in their native range the two species are genetically distinct for a nuclear DNA sequence marker but within the USA invasion many of the plants (23%) are novel hybrids (Gaskin and Schaal 2002). Another study, using both nuclear and chloroplast DNA sequence markers, found relatively rare and localized hybrids between athel (T. aphylla (L.) Karst.) and both saltcedar species (Gaskin and Shafroth 2005). Clearly, hybrids between Tamarix spp. exist within the USA invasion, but estimates of saltcedar hybrid frequency have been made using only a single locus DNA sequence marker. This may underestimate hybrid frequency (e.g., an F1 hybrid back-crossed with one of its parental species may produce progeny with a homozygotic genotype for the single locus marker and appear to be a parental type, when it actually contains genetic material from both species). Additionally, the single locus marker does little to inform us about levels of introgressive hybridization (e.g., how much of a plant’s genetic material comes from either parental species). Other molecular markers, such as microsatellites, ISSRs (Inter Simple Sequence Repeats), and AFLPs provide multi-locus data, and are thus better suited for determining patterns of hybridization and introgression. AFLPs have been used in many hybridization/introgression studies of animals (e.g., Young et al. 2001; Wang and Porter 2004), wild plants (e.g., Gobert et al. 2002; Scotti et al. 2002; Choler et al. 2004; Guo et al. 2005), crop plants (e.g., Rieseberg et al. 1999; Hansen et al. 2003), and invasive plants (e.g., O’Hanlon et al. 1999; Tranel and Wassom 2001; Bleeker 2003).

Novel hybrids may present a problem for biological control programs using highly host-specific agents. Some genotypes of biological control targets have been found to have differential susceptibility to fungal or insect control agents (e.g., Burdon et al. 1984; Bruckart et al. 2004; Goolsby et al. 2006). If a biological control agent is effective against some invasive types but not others, there is potential for the uncontrolled types to increase their range, as happened with ecotypes of Chondrilla juncea L. (Burdon et al. 1981). Host-specificity testing of biological control agents should therefore be done on a range of parental and hybrid types, especially when the invasion contains novel hybrids or genotypes. Recent efforts at classical biological control of saltcedar in the western USA have resulted in the field release of a leaf-feeding chrysomelid beetle, Diorhabda elongata Brullé, which has established and extensively defoliated saltcedar at some sites but failed to establish at other sites (DeLoach et al. 2004). This indicates that there are gaps in our knowledge about saltcedar, Diorhabda, or the ecology of their interactions in the USA.

Analysis of invasion identities and population dynamics using genetic markers has recently contributed much information about invasive species (Roderick and Navajas 2003; Sakai et al. 2001). Here, we used AFLPs to determine if USA saltcedar plants are pure parental genotypes, F1 hybrids, or introgressed individuals. We did this by comparing USA plants to plants from the native range of each species. This information will be used in future host-specificity tests of biological control agents being developed to control the saltcedar invasion.

Materials and methods

We obtained DNA from silica-dried material of 213 plants, including 45 T. chinensis from China, 58 T. ramosissima from Asia (China, 1 plant; Azerbaijan, 2; Republic of Georgia, 4; Turkmenistan, 13; Iran, 15; Kazakhstan, 22), and 110 saltcedars from the USA, including northern plants (Montana, 48; Wyoming, 2), southern plants (Texas, 30; Arizona, 2; New Mexico, 5; Oklahoma, 3), and mid-latitude plants (Utah, 20) (Fig. 1). Collections were mainly focused on populations that are sources of plants being used in biological control host-specificity tests (Ft. Peck Reservoir, Montana, and Tornillo, Texas).

Fig. 1

Location of USA collection sites. Number next to each location is the number of saltcedar plants sampled at that location

DNA extraction and AFLP analysis

Genomic DNA was extracted from approximately 20 mg of silica-dried material using a modified CTAB method (Hillis et al. 1996). The AFLP method followed Vos et al. (1995) with these modifications: restriction and ligation were performed during a single step in an 11 μl reaction containing 500 ng genomic DNA, 2 U MseI, 1 U EcoRI, 1 × T4 DNA ligase buffer, 0.45 U T4 DNA ligase, 0.05 M NaCl, 0.5 × BSA, 4.5 μM MseI adaptor, 0.45 μM EcoRI adaptor, and H2O. The restriction-ligation was incubated at room temperature overnight, and then 5.5 μl of the product was diluted to 100 μl in TE (15 mM Tris and 0.1 mM EDTA). A pre-selective polymerase chain reaction (PCR) was performed in a 20 μl reaction containing 4 μl of the diluted, restricted-ligated product, 1 × PCR buffer, 1.5 mM MgCl2, 0.2 mM each dNTP, 0.2 μM of each pre-selective amplification primer (MseI + C and EcoRI + A), 0.5 U Taq polymerase and H2O. The pre-selective PCR consisted of 20 cycles of: 30 s at 94°C, 60 s at 56°C, and 60 s at 72°C. Ten microliters of the pre-selective amplification product was diluted to 200 μl in TE (15 mM Tris and 0.1 mM EDTA). The selective amplification was performed in a 20 μl reaction containing 3 μl of the diluted pre-selective amplification product, 1 × PCR buffer, 1.5 mM MgCl2, 0.2 mM each dNTP, 0.1 μM MseI selective primer, 0.05 μM EcoRI selective primer dye-tagged with D4 (blue), 0.5 U of Taq polymerase and H2O. The selective PCR consisted of 120 s at 94°C; 10 cycles of: 20 s at 94°C, 30 s at 66°C (decreasing by 1°C each cycle), 120 s at 72°C; 25 cycles of: 20 s at 94°C, 30 s at 56°C, 120 s at 72°C. One microliter of each selective PCR product was combined with 0.3 μl of 600 bp size standard and 28.7 μl of de-ionized formamide and loaded into a Beckman Coulter CEQ 2000 fragment analyzer.

Fragments were first scored using the Fragment Analysis module of the Beckman Coulter CEQ Genetic Analysis System software (bin width of one nucleotide, accepted peak height = 10% of second highest peak). These bins were then manually screened and scored using Genographer (Benham 2001) to visualize fragments, making this a semi-automatic scoring method, as suggested by Papa et al. (2005). Gel images were normalized across total signal to avoid errors in scoring due to band intensity. All selective primer combinations of MseI + CAA, CAC, CAT, CTA, or CTA and EcoRI + AAG, ACC, or ACT were pre-screened for eight plants and the three most polymorphic primer pairs were chosen (MseI + CAT/EcoRI + ACC, MseI + CTA/EcoRI + ACC, and MseI + CTA/EcoRI + ACT). Repeat runs, starting from restriction/ligation, were run for ten plants (5% of all plants) and scored blindly, then compared to the original runs to calculate AFLP error rates.

NTSYS-pc ver. 2.1 software (Rohlf 1992) (SIMQUAL program) was used to calculate the Dice (1945) similarity coefficient (as suggested by Bonin et al. 2007, and identical to the Nei and Li (1979) coefficient): 2a/(2a + b + c), where a = number of bands present in both samples, and b and c = number of bands present in only the first or only the second sample, respectively. Principal coordinates analysis (PCOA) was performed on Dice similarity coefficients using the DCENTER and EIGEN programs of NTSYS.
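For readers who want to see the coefficient worked through, the following is a minimal Python sketch of the standard Dice coefficient, 2a/(2a + b + c), applied to two presence/absence (1/0) band profiles. It is not the NTSYS implementation; the function name and the toy profiles are illustrative assumptions.

```python
def dice_similarity(profile_a, profile_b):
    """Dice coefficient 2a / (2a + b + c) for two 0/1 AFLP band profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of fragments")
    pairs = list(zip(profile_a, profile_b))
    shared = sum(1 for x, y in pairs if x == 1 and y == 1)   # a: band in both
    only_a = sum(1 for x, y in pairs if x == 1 and y == 0)   # b: band in first only
    only_b = sum(1 for x, y in pairs if x == 0 and y == 1)   # c: band in second only
    denominator = 2 * shared + only_a + only_b
    if denominator == 0:
        # Neither sample has any band; returning 0.0 is an arbitrary choice here.
        return 0.0
    return 2 * shared / denominator


# Hypothetical profiles for two plants scored at six fragments.
plant_1 = [1, 1, 0, 1, 0, 1]
plant_2 = [1, 0, 0, 1, 1, 1]
print(dice_similarity(plant_1, plant_2))  # 0.75
```

Shared absences (0/0) do not enter the calculation, which is one reason this coefficient is often preferred for dominant markers such as AFLPs.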

Because there is putatively no hybridization between the saltcedars in their native range (Gaskin and Schaal 2002), we were able to identify diagnostic markers, defined as bands that were found 100% of the time in one parental species, and never in the other parental species.
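The definition of a diagnostic marker can be sketched in a few lines of Python. This is only an illustration of the rule stated above (present in every plant of one parental species, absent from every plant of the other); the function name and the toy band profiles are hypothetical.

```python
def find_diagnostic_markers(species_a_profiles, species_b_profiles):
    """Return indices of fragments found in all species-A plants
    and in no species-B plant (0/1 band profiles)."""
    n_fragments = len(species_a_profiles[0])
    diagnostic = []
    for j in range(n_fragments):
        in_all_a = all(profile[j] == 1 for profile in species_a_profiles)
        in_no_b = all(profile[j] == 0 for profile in species_b_profiles)
        if in_all_a and in_no_b:
            diagnostic.append(j)
    return diagnostic


# Hypothetical native-range profiles scored at five fragments.
native_chinensis = [[1, 1, 0, 1, 0],
                    [1, 1, 1, 1, 0]]
native_ramosissima = [[0, 1, 0, 0, 1],
                      [0, 0, 1, 0, 1]]

print(find_diagnostic_markers(native_chinensis, native_ramosissima))  # [0, 3]
print(find_diagnostic_markers(native_ramosissima, native_chinensis))  # [4]
```

Calling the function with the species swapped gives the markers diagnostic for the other parent, so both directions need to be checked.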

A Bayesian model-based clustering analysis, NewHybrids (Anderson and Thompson 2002), was also used to look at the relation between parental species and USA invasive plants. NewHybrids computes posterior probabilities that the samples belong to either parental type or to distinct hybrid classes such as F1 and F2 hybrids. The program allows analysis over multiple generations of interbreeding. It is likely that Tamarix species have been interbreeding in the USA for over 150 years; however, as suggested by the program authors, we restricted the number of generations to two, which gave us six possible classes (the two parental types, F1 and F2 hybrids, and the two backcrosses of F1 hybrids to the parental types). Parental species (T. chinensis and T. ramosissima from their native ranges) were identified during the data input stage. We ran five Markov chains, which all converged on a common space within 1,000 sweeps. After that we reset the average category probabilities and ran the program for over 200,000 sweeps for each chain, starting at different random points each time. The posterior probabilities for each class of each plant varied by <1% between the five chains.

Mantel r correlations were calculated with the program “Isolation by Distance, on the Web” (Jensen et al. 2005), comparing matrices of the pairwise geographic distances and pairwise similarity coefficients. Significance of correlation (P ≤ 0.05) was tested by using 1,000 random permutations.
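The logic of a Mantel permutation test can be sketched as follows. This is a simplified stand-alone illustration in Python, not the procedure implemented in “Isolation by Distance, on the Web”: the two-sided comparison, the function names, the seed, and the toy matrices are all assumptions made for the example.

```python
import math
import random


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle, optionally after relabelling samples."""
    n = len(matrix)
    if order is None:
        order = list(range(n))
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_test(distance, similarity, permutations=1000, seed=0):
    """Return (Mantel r, permutation P-value) for two square matrices."""
    rng = random.Random(seed)
    dist_vec = upper_triangle(distance)
    observed = pearson_r(dist_vec, upper_triangle(similarity))
    order = list(range(len(distance)))
    as_extreme = 0
    for _ in range(permutations):
        rng.shuffle(order)  # permute sample labels of one matrix only
        permuted = pearson_r(dist_vec, upper_triangle(similarity, order))
        if abs(permuted) >= abs(observed):
            as_extreme += 1
    # Count the observed arrangement itself among the permutations.
    return observed, (as_extreme + 1) / (permutations + 1)


# Hypothetical example: five plants along a transect, similarity falling with distance.
positions = [0, 1, 2, 3, 4]
distance = [[abs(p - q) for q in positions] for p in positions]
similarity = [[1.0 - 0.1 * abs(p - q) for q in positions] for p in positions]
r, p_value = mantel_test(distance, similarity)
print(r, p_value)
```

Permuting the sample labels of one matrix, rather than shuffling individual cells, preserves the dependence among pairwise values that share a plant, which is the reason a Mantel test is used instead of an ordinary correlation test.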

We were able to develop an index of introgression, defined as the estimated proportion of the genome originating from T. chinensis (as opposed to T. ramosissima) for three analyses of the AFLP data. Pearson product-moment correlation coefficients were used to assess the degree of concordance between these indices, as well as the associations between one of the indices and latitude and longitude. Correlation coefficients were calculated using SAS (SAS Institute Inc. 2000).

Results

Initial output from the fragment analyzer software listed 318, 276, and 295 fragments (total = 889) between 50 and 704 bp for MseI + CAT/EcoRI + ACC, MseI + CTA/EcoRI + ACC, and MseI + CTA/EcoRI + ACT, respectively. We manually screened these fragments and scored 43, 74, and 31 unambiguous, polymorphic fragments, respectively, between 50 and 610 bp for a total of 148 fragments.

AFLP error rates

Knowledge of error rate in AFLP analysis is important as errors can bias final conclusions (Bonin et al. 2004). Within the 148 fragments scored for each of the ten plants where the AFLP procedure was duplicated (1,480 total fragments) we found 18 differences between the duplicate runs for an error rate of 1.2%. This is equivalent to an error of 1.8 miscalled fragments out of 148 scored fragments per plant.
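The arithmetic behind these figures can be checked with a short Python sketch. The function name and the toy duplicate profiles are illustrative assumptions; only the totals (18 differences, 148 fragments, ten plants) come from the text.

```python
def aflp_error_rate(original_runs, repeat_runs):
    """Fraction of fragment calls that differ between duplicate runs."""
    mismatches = 0
    comparisons = 0
    for original, repeat in zip(original_runs, repeat_runs):
        for call_a, call_b in zip(original, repeat):
            comparisons += 1
            if call_a != call_b:
                mismatches += 1
    return mismatches / comparisons


# Hypothetical duplicate profiles for two plants (one mismatch in eight calls).
toy_rate = aflp_error_rate([[1, 0, 1, 1], [0, 0, 1, 0]],
                           [[1, 0, 0, 1], [0, 0, 1, 0]])
print(toy_rate)  # 0.125

# Totals reported above: 18 differences over 148 fragments x 10 plants.
reported_rate = 18 / (148 * 10)
print(round(100 * reported_rate, 1))   # 1.2 (percent)
print(round(reported_rate * 148, 1))   # 1.8 miscalled fragments per plant
```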

Similarity coefficients

Dice similarity coefficients ranged from 0.81 to 1.00 (mean = 0.91) between native T. chinensis plants, 0.47–0.99 (mean = 0.73) between native T. ramosissima plants, and 0.39–0.98 (mean = 0.65) between invasive USA plants. The maximum similarity between T. ramosissima and T. chinensis plants from Asia was 0.45, indicating that the two species are genetically distinct in their native range. When comparing native with invasive plants, similarity coefficients ranged from 0.37 to 0.91 (mean = 0.62) between native T. chinensis and invasive USA plants, and from 0.31 to 0.85 (mean = 0.65) between native T. ramosissima and invasive USA plants.

Diagnostic markers

We found seven diagnostic fragments (fragments found in all plants of one species and never in the other species) for T. chinensis and none for T. ramosissima. Two of the USA plants contained zero T. chinensis diagnostic fragments, and 17 contained all seven diagnostic fragments, suggesting that 91 of the 110 USA plants (83%) are of hybrid origin. As an index of introgression based on the T. chinensis diagnostic markers, we divided the number of diagnostic markers (from zero to seven) found in each USA plant by the total number of diagnostic markers, such that 1 represents “pure” T. chinensis and 0 represents “pure” T. ramosissima.
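As a hedged illustration of this index, the Python sketch below counts how many diagnostic fragments a plant carries and divides by the number of diagnostic fragments. The function name, fragment positions, and toy profile are hypothetical; the plant counts in the last lines are those reported above.

```python
def diagnostic_marker_index(profile, diagnostic_positions):
    """Proportion of T. chinensis diagnostic fragments present in one plant."""
    present = sum(profile[j] for j in diagnostic_positions)
    return present / len(diagnostic_positions)


# Hypothetical plant scored at ten fragments, seven of which are diagnostic.
diagnostic_positions = [0, 2, 3, 5, 6, 8, 9]
usa_plant = [1, 0, 0, 1, 1, 1, 0, 0, 1, 0]
index_value = diagnostic_marker_index(usa_plant, diagnostic_positions)
print(round(index_value, 3))  # 4 of 7 diagnostic fragments present

# Plants with neither zero nor all seven markers are counted as hybrids.
hybrids = 110 - 2 - 17
print(hybrids, round(100 * hybrids / 110))  # 91 83
```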

Principal coordinates analysis

Results of the principal coordinates analysis based on Dice similarity coefficients are shown in Fig. 2. The variance of the first two principal coordinates accounted for 31 and 5% of the total variation, respectively. This analysis also indicates that T. ramosissima and T. chinensis plants from Asia are genetically distinct, and that the USA plants lie on a continuum between the parental genotypes, particularly along the first axis of the PCOA. As an index of introgression for the PCOA analysis, we divided the PCOA axis 1 score of each USA plant by the difference between the maximum PCOA axis one score (a T. chinensis individual) and the minimum PCOA axis one score (a T. ramosissima individual) so that the index ranges from zero to one.
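One way to obtain an index bounded by zero and one from axis 1 scores is min-max rescaling, sketched below in Python. Subtracting the minimum score before dividing by the range is an assumption on our part, made so that the T. ramosissima extreme maps to 0 and the T. chinensis extreme maps to 1; the function name and score values are hypothetical.

```python
def pcoa_introgression_index(axis1_score, min_score, max_score):
    """Rescale a PCOA axis 1 score to the interval [0, 1].

    min_score: axis 1 score of the most extreme T. ramosissima individual.
    max_score: axis 1 score of the most extreme T. chinensis individual.
    """
    if max_score == min_score:
        raise ValueError("axis 1 scores of the two extremes must differ")
    return (axis1_score - min_score) / (max_score - min_score)


# Hypothetical axis 1 scores.
ramosissima_extreme = -0.4
chinensis_extreme = 0.6
print(pcoa_introgression_index(0.1, ramosissima_extreme, chinensis_extreme))
```

If the sign of axis 1 were reversed in a given ordination (eigenvector signs are arbitrary), the two extremes would simply be swapped in the call.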

Fig. 2

Clustering of native T. chinensis, native T. ramosissima and USA invasion plants based on principal coordinates analysis of Dice similarity coefficients. The first two principal coordinates accounted for 31 and 5% of the total variation, respectively

Bayesian analysis

Posterior probabilities of each USA plant, sorted by geographic region, are shown in Fig. 3. The maximum posterior probability for each USA plant was >0.5 and 90% of the USA plants had maximum posterior probabilities >0.8. Based on maximum posterior probabilities, plants from Montana and Wyoming were classified as either T. ramosissima (n = 4) or F1s backcrossed with T. ramosissima (Bc T. ramosissima, n = 47). Plants from Utah were classified as either Bc T. ramosissima (n = 15) or F2 (n = 5). Plants from the southern states were classified as Bc T. ramosissima (n = 2), F2 (n = 10), Bc T. chinensis (n = 17), or T. chinensis (n = 10).

Fig. 3

Graphical summary of the results of a Bayesian model-based clustering analysis (NewHybrids) for 110 USA invasion plants. Each bar represents one individual and individuals are grouped by northern (Montana, Wyoming), central (Utah), and southern (Arizona, New Mexico, Texas, Oklahoma) locations. The posterior probabilities of each individual being a parental T. chinensis or T. ramosissima, F1 or F2 hybrid, or F1 backcrossed to a parental type are shown using the indicated bar fill patterns

Only 14 USA plants (13%) were classified as parental types. No USA plants were classified as F1, an unexpected outcome given the high numbers of individuals classified as backcross and F2. In their native ranges, all 45 T. chinensis plants were classified as T. chinensis and 52 of the 58 native T. ramosissima were classified as T. ramosissima (the other 6 were classified as Bc T. ramosissima).

As an index of introgression for the Bayesian analysis, we multiplied each of the six posterior probabilities by the expected proportion of the genome originating from T. chinensis for their respective class and then summed the resulting six values. For example, if an individual had posterior probabilities of 0.9, 0.08 and 0.02 for being T. chinensis, an F1, and a Bc T. chinensis, respectively, the value of the index for that individual would be (0.9 × 1) + (0.08 × 0.5) + (0.02 × 0.75) = 0.955.
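The weighted sum can be sketched in Python as follows. The class weights are the standard Mendelian expectations for the proportion of the genome derived from T. chinensis in each of the six classes, which we assume are the values intended; the function name, dictionary, and example probabilities are hypothetical.

```python
# Expected proportion of the genome from T. chinensis in each genotype class
# (standard expectations; assumed here, not taken from the program output).
EXPECTED_CHINENSIS_PROPORTION = {
    "T. chinensis": 1.0,
    "Bc T. chinensis": 0.75,
    "F1": 0.5,
    "F2": 0.5,
    "Bc T. ramosissima": 0.25,
    "T. ramosissima": 0.0,
}


def bayesian_introgression_index(posteriors):
    """Posterior-weighted expected proportion of T. chinensis genome."""
    total = sum(posteriors.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("posterior probabilities should sum to 1")
    return sum(prob * EXPECTED_CHINENSIS_PROPORTION[genotype_class]
               for genotype_class, prob in posteriors.items())


# Hypothetical plant; classes with zero posterior probability are omitted.
example = {"T. ramosissima": 0.1, "Bc T. ramosissima": 0.7, "F2": 0.2}
print(round(bayesian_introgression_index(example), 3))  # 0.275
```

Because the index averages over all classes rather than using only the best-supported one, it remains informative for plants whose maximum posterior probability is low.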

Concordance among indices of introgression

The AFLP data indicate more extensive introgression than the single-locus, nuclear PPCL sequence data used in the earlier study (Gaskin and Schaal 2002). Gaskin and Schaal (2002) reported that 23% of the individuals were hybrids based on PPCL data and, for the specimens used in this study, 35% of the 91 individuals with PPCL sequences assignable to T. chinensis or T. ramosissima were actually hybrids. The diagnostic AFLP markers and Bayesian analysis of the AFLP data indicate that 83 and 87% of the individuals in this study were hybrids, respectively.

Comparisons of the PPCL and the three AFLP-based indices of introgression are shown in Fig. 4. The median values of the three AFLP-based indices are lowest (in terms of proportion T. chinensis) for individuals classified as T. ramosissima based on PPCL data, intermediate for individuals classified as hybrids, and highest for individuals classified as T. chinensis. However, for each of the AFLP-based indices, the range of the proportion T. chinensis values for each PPCL classification is quite broad, particularly for individuals classified as T. chinensis by PPCL where the range between the 10th and 90th percentiles of the AFLP-based indices exceeds 0.60 in each case.

Fig. 4

Concordance among four indices of introgression for USA invasion plants. Each horizontal bar is a box plot showing the 10th, 25th, 50th, 75th and 90th percentiles for the distribution of the proportion T. chinensis values estimated using 1 of the 3 AFLP-based indices of introgression. Separate box plots are shown for each of three hybrid-status categories (parental T. chinensis, parental T. ramosissima, or hybrid) based on single locus PPCL sequence data (Gaskin and Schaal 2002)

Concordance among the AFLP-based indices is much stronger. The AFLP-diagnostic marker index had the lowest correlation coefficients with the other two AFLP-based indices (AFLP-Bayesian: r = 0.832; AFLP-PCOA: r = 0.863; P < 0.0001 in both cases). The AFLP-diagnostic marker index also produced higher proportion T. chinensis values than the other two AFLP-based indices. Concordance between the AFLP-Bayesian and AFLP-PCOA was high (r = 0.972, P < 0.0001), even for individuals that had low maximum posterior probabilities in the Bayesian analysis.

Geographic variation

The isolation-by-distance test showed that the correlation between geographic distance and similarity coefficients in the USA was moderate (r = −0.66) and highly significant (P < 0.0001). The correlation was negative, indicating that a larger geographic distance between plants correlates with less genetic similarity. The AFLP-PCOA index of introgression was strongly and negatively correlated with latitude (r = −0.869, P < 0.0001) and positively correlated with longitude (r = 0.238, P = 0.0122). Similar correlations exist between latitude, longitude and the other two AFLP-based indices of introgression.

Discussion

The two saltcedar species found in Asia are genetically distinct as evidenced by their separation in the principal coordinates analysis. Additionally, the Bayesian analysis does not categorize any of the native T. ramosissima as T. chinensis, nor vice-versa. The presence of T. chinensis diagnostic markers also indicates that the two species are genetically distinct.

The principal coordinates analysis indicates that invasive saltcedars in the USA form a continuum between the parental types. Diagnostic marker and Bayesian analyses suggest that 83 and 87% of the USA plants, respectively, are genetically intermediate to the parental species. Our study indicates a much higher rate of hybridization than found in an earlier single locus study of USA plants (23%, Gaskin and Schaal 2002), which is not surprising because our AFLP-based analyses are based on more loci (7 diagnostic marker loci or 148 loci for the PCOA and Bayesian analyses). Differences in the number of loci used for analysis most likely account for the very weak concordance between the hybrid-status classification based on a single locus and the three AFLP-based indices of introgression. The diagnostic marker introgression values were less strongly correlated with the PCOA and Bayesian introgression values than the latter two were with each other, most likely reflecting differences in the number of loci used. Diagnostic marker introgression values also tended to be higher than PCOA and Bayesian introgression values. This is presumably because all of the markers were diagnostic for T. chinensis.

The very high degree of concordance between the PCOA and Bayesian analyses is surprising given the radically different approaches and assumptions of the two analyses. The Bayesian analysis was constrained to two generations of introgression resulting in six genotype classes whereas the PCOA analysis makes no assumptions about the number of generations of introgression. It is likely that Tamarix species have been interbreeding for more than two generations in the USA since their introduction over 150 years ago. One anomaly in the Bayesian analysis, that more individuals were classified as F2 than F1 individuals, could arise because there have been more than two generations of introgression. However, the simulation studies of Anderson and Thompson (2002) show that the Bayesian analysis, while working well to distinguish between purebred and hybrid individuals, produces less robust classifications of individuals to specific hybrid types when most of the allele frequencies are not at near-diagnostic levels.

The high level of hybridization, possibly dating from their 1800s introduction into the USA, no doubt led to confusion in morphologically determining plants as one species or the other, especially when examining only USA plants. This led to an apparently reasonable conclusion of synonymy of the two species by Allred (2002), but AFLP and DNA sequence data from the native range support taxonomic retention of both parental species, rather than placing T. ramosissima as a synonym of T. chinensis.

Seehausen’s (2004) hybrid swarm theory, along with earlier theories regarding hybridization as a source of genetic variability (e.g., Stebbins 1959; Lewontin and Birch 1966; Templeton 1981), suggests that adaptive radiation, under certain selection regimes, can be enhanced via new combinations of genotypes. The high proportion of Tamarix hybrids in the USA may have facilitated this plant’s successful invasion. Though we have not tested the Tamarix hybrids for selective advantages over their parental types regarding characteristics that may enhance invasion in western USA environments, the high ratio of hybrids to parental types (>5:1) suggests that such an advantage may exist.

This study was not intended to explore the full geographic extent of Tamarix in the USA; however, we did find significant geographic variation. Results of the Mantel correlation test indicate that genetic similarity decreases as the geographic distance between pairs of samples increases. Latitudinal rather than longitudinal separation may contribute most to the geographic distance effect. In the Bayesian analysis, Montana and Wyoming plants tended to be categorized as predominantly Bc T. ramosissima and a few parental T. ramosissima types. The Utah plants were either Bc T. ramosissima or F2, while the southern plants were predominantly Bc T. chinensis, F2, or parental T. chinensis. The AFLP-based indices of introgression were strongly correlated with latitude and less strongly, albeit significantly, correlated with longitude.

The geographic variation in genetic makeup of the USA plants is based on a small sample of plants from a limited number of locations. If the pattern were to hold true following more intensive sampling, it would be interesting to contemplate the reason behind this genetic structuring. Is natural selection creating the difference between plants from different latitudes, or is there a historical component where T. ramosissima were primarily introduced to the northern range and T. chinensis to the southern range?

Whatever the reasons for this possible pattern, these results highlight that the USA invasion may be composed primarily of novel genotypes and hybrids. These novel genotypes would have no evolutionary history with Asian biological control agents. Whether that is a factor for future control efforts remains to be seen, but as a result of this research a selection of saltcedars that represent the full range of parental and hybrid genotypes found in the invasion are now being included in selected host-specificity tests and ecological studies.