Introduction

The Chinese mitten crab, Eriocheir sinensis (H. Milne Edwards), is native to China, where it is distributed between latitudes 24–43° N and longitudes 112–124° E (Zhao et al. 1988). In the Yangtze River valley of China, adult Chinese mitten crabs begin to migrate downstream in October and reach the estuary during November to December. Mating and egg release occur at salinities between 9 and 25 (a salinity near 18 is commonly used in artificial reproduction). During April and May, the five zoeal stages develop in the brackish water of the estuary. The megalopa stage occurs during the last part of May and the first part of June, when its tolerance for low salinity increases and it is able to survive in fresh water. After 7–10 days, the megalopae develop into benthic juvenile crabs that migrate into fresh water. The juvenile crabs grow in freshwater streams and lakes. When sexual maturity is reached in their second year, they migrate back to saline waters to reproduce. Generally, in China, the life span of the Chinese mitten crab is about 22 months for males and 24 months for females.

The expansion of global trade has resulted in an increasing number of species translocations (Everett 2000). The Chinese mitten crab was introduced into Europe and North America mainly through ships’ ballast water (Panning 1939; Cohen and Carlton 1997; Hanfling et al. 2002; Herborg et al. 2003, 2007). In 1912, the first record of Eriocheir sinensis was documented in a tributary of the Weser River in North Germany (Peters 1933), and in 1914, it was also found in the Elbe River. From there, it reached the Baltic Sea via the Kiel Canal in 1927 and Finland in 1933. The Chinese mitten crab has now spread across most areas of Continental Europe: north to Sweden and Finland, west to the Tyne River in the UK (Herborg et al. 2005), south to the Tagus River in Portugal (Cabral and Costa 1999), and east to the Caspian Sea region (Slynko et al. 2002; Robbins et al. 2006).

In North America, an adult male was collected in a water-intake pipe in the Detroit River at Windsor, Ontario, Canada in 1965. In addition, three adult specimens, a female and two males, were collected by fishermen in Lake Erie in 1973 (Nepszy and Leach 1973). Meanwhile, several crabs were collected in the Great Lakes region between 1973 and 1994 (Cohen and Carlton 1995). None of these areas, however, currently supports a self-sustaining population of mitten crabs because the Great Lakes are too fresh for mitten crabs to successfully reproduce (Rudnick et al. 2000). In California, the Chinese mitten crab was first detected by commercial shrimp trawlers in the open waters of the South San Francisco Bay in 1992. Since then, the population has quickly developed in the estuary (Cohen and Carlton 1997; Rudnick et al. 2000).

Since the Chinese mitten crab became established in Continental Europe and North America, a growing number of studies have addressed its introduction pathways, colonization history, and potential impacts. For instance, Cohen and Carlton (1997) suggested that the San Francisco population most likely originated from Asia, especially China, through ballast water or international importation. In contrast, Hanfling et al. (2002), based on the mtDNA COI sequence, found a single haplotype in the San Francisco population that was absent from the native populations but present in the European populations. This indicated that the source of the San Francisco population was European rather than Asian. This result is surprising and not easily explained, and it does not clarify the source of the European populations. Herborg et al. (2007) then used microsatellite markers to examine the genetic structure and differentiation of native and colonized populations. However, the Yangtze River population, which is the largest and most important population in the native distribution, was not included in their study. Therefore, the genetic relationship between the native and colonized populations of the Chinese mitten crab remains unclear. In this paper, sequence variation in the mtDNA COII and Cytb genes and in nuclear ITS fragments was investigated in native (China) and colonized (European and San Francisco) populations of the Chinese mitten crab. The aims of the study are as follows: (1) to obtain more exact information on the population structure and differentiation of the Chinese mitten crab in its native and colonized distributions, (2) to examine the source of the San Francisco population, and (3) to determine which rivers in the native distribution are the most likely origin of the European populations of the Chinese mitten crab.

Materials and methods

Sample collection

Specimens of the Chinese mitten crab were collected from seven rivers during 1998–2005. These included three river populations in mainland China (Yangtze, Yellow, and Liaohe Rivers) collected in 1998, the North American (San Francisco Bay) population in 1999, and three European populations in 2002–2005. Leg muscle was removed from each specimen, preserved in 95% ethanol, and stored at −20°C.

DNA extraction, PCR amplification, and sequencing

The total genomic DNA from the muscles was extracted using a proteinase K and phenol–chloroform procedure. The quantity and quality of the extracted DNA were estimated on 1% agarose gels stained with ethidium bromide (EB).

The polymerase chain reaction (PCR) was used to amplify the mitochondrial COII gene, a partial Cytb fragment, and the nuclear ITS region (ITS1-5.8S rRNA-ITS2). The primers used to amplify the complete COII gene were as follows: COII-F (5′-CAT CAC CTT GTC AAG GTG AAA-3′) and COII-R (5′-CAT GGT CAG TCT CAG GAT TCA-3′). The primers for the partial Cytb fragment were as follows: Cytb-F (5′-TCG TGA CGT AAA TTA CGG TTG-3′) and Cytb-R (5′-TGT CCA GTA ATG ACA TAA GGG-3′). These primers were designed from the reported mtDNA sequences of the Chinese mitten crab (Sun et al. 2005; GenBank accession no. NC_006992). The nuclear ITS was partially amplified using the primers ITS-F (5′-TCA TTA ACG TGT TTG CCC GA-3′) and ITS-R (5′-TGG TTT CTT TTC CTC CGC TT-3′), which were designed from the reported ITS sequences of the Chinese mitten crab (Tang et al. 2003; GenBank accession no. AF316388). PCR was performed on an Eppendorf Thermal Cycler in a reaction mixture of 50 μl containing 2 μl DNA polymerase (Tiangen products, China), 5 μl PCR buffer (Tiangen products, China), 2 μl template DNA (50 ng/μl), 2 μl dNTP (0.4 mM), 4 μl primers (0.2 μM each), and 35 μl distilled water. The amplification conditions were an initial denaturation at 94°C for 5 min, followed by 30 cycles of 94°C for 30 s, 54°C for 30 s, and 72°C for 1 min, and a final extension at 72°C for 10 min. Successful PCR amplification was verified by agarose gel electrophoresis.
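As a quick consistency check on the reaction recipe and thermal profile described above, the component volumes can be summed and the nominal cycling time computed. This is only a bookkeeping sketch: the dictionary keys and variable names are illustrative, and ramp times between temperature steps are ignored (an assumption, since they depend on the instrument).

```python
# Component volumes (in microlitres) as listed in the text.
reaction_mix_ul = {
    "DNA polymerase": 2,
    "PCR buffer": 5,
    "template DNA": 2,
    "dNTP": 2,
    "primers": 4,
    "distilled water": 35,
}

total_volume_ul = sum(reaction_mix_ul.values())

# Thermal profile as (temperature in deg C, duration in seconds).
initial_denaturation = (94, 5 * 60)
cycle_steps = [(94, 30), (54, 30), (72, 60)]
n_cycles = 30
final_extension = (72, 10 * 60)

# Nominal run time, ignoring ramping between temperatures (assumption).
cycle_seconds = sum(duration for _, duration in cycle_steps)
total_seconds = (
    initial_denaturation[1] + n_cycles * cycle_seconds + final_extension[1]
)
total_minutes = total_seconds / 60

# Template mass per reaction: 2 ul of a 50 ng/ul stock.
template_ng = reaction_mix_ul["template DNA"] * 50

print(f"Total volume: {total_volume_ul} ul")
print(f"Nominal cycling time: {total_minutes:.0f} min")
print(f"Template DNA per reaction: {template_ng} ng")
```

The listed components add up to the stated 50 μl, and the programme runs for roughly 75 min before ramping is taken into account.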

All amplified products were purified using a 3S Spin PCR Product Purification Kit (Biocolor Inc., China) following the supplier’s instructions. The purified products of the COII gene were then cloned into the plasmid pUG-T18, and the cloned products were sequenced using M13 primers. The purified PCR products of the partial Cytb and ITS fragments were sequenced directly using the PCR primers. All specimens were sequenced on an Applied Biosystems ABI 3730 DNA sequencer.

Sequence alignment and data analyses

The DNA sequences were edited using the BioEdit software (Hall 1998), aligned with CLUSTAL X version 1.83 (Thompson et al. 1994), and checked by eye. The aligned sequences were used to analyze the population structure and genetic variation using Arlequin version 3.01 (Excoffier et al. 2005). Genetic diversity was assessed by estimating the gene diversity (h) and nucleotide diversity (π) for the mtDNA using Tajima’s (1983) and Nei’s (1987) methods, and by the Watterson estimator (θ W) for the ITS (Watterson 1975). The pairwise fixation index (F ST) was used to test the genetic differentiation between populations.
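The three diversity measures named above can be illustrated on a toy alignment. The sketch below is not the Arlequin implementation; it assumes gap-free, equal-length sequences, and the function names and the four example sequences are invented for illustration. Gene diversity follows Nei's (1987) unbiased form, h = n/(n − 1) × (1 − Σ p_i²); π is the mean number of pairwise differences per site; and θ W is S divided by a_n = Σ 1/i (i = 1 … n − 1) and by the alignment length.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's (1987) gene diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def pairwise_differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    mean_diff = sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)
    return mean_diff / length


def watterson_theta(seqs):
    """Watterson's (1975) estimator per site: S / (a_n * L)."""
    n = len(seqs)
    length = len(seqs[0])
    seg_sites = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1 / i for i in range(1, n))
    return seg_sites / a_n / length


# Hypothetical toy alignment (not data from this study).
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGT"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
theta_w = watterson_theta(toy)
print(h, pi, theta_w)
```

With three haplotypes among four sequences and two segregating sites over eight positions, the toy data give h = 5/6, π = 0.125, and θ W of about 0.136.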

The Analysis of Molecular Variance (AMOVA; Excoffier et al. 1992), as implemented in Arlequin, was used to assess the population structure of Eriocheir sinensis. In the AMOVA, populations were grouped according to continent (China, Europe, and North America). The neutrality tests of Tajima’s D (Tajima 1989) and Fu and Li’s D (Fu and Li 1993), based on the total number of segregating sites, were performed in Arlequin version 3.01 and DnaSP version 4.0 (Rozas et al. 2003), respectively. The R 2 statistic (Ramos-Onsins and Rozas 2002), which has more power for detecting population growth in small samples, was also estimated using DnaSP version 4.10 (Rozas et al. 2003).
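For readers unfamiliar with the neutrality statistic, the following sketch computes Tajima's (1989) D from three summary quantities: the sample size n, the number of segregating sites S, and the mean number of pairwise differences k. It is a plain transcription of the published formula, not the Arlequin code, and the input values in the example are hypothetical. A negative D reflects an excess of rare variants, which is the pattern expected under population expansion.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's (1989) D from n sequences, S segregating sites, and
    the mean number of pairwise differences k."""
    if n < 4:
        raise ValueError("the variance term is zero for n < 4")
    if seg_sites == 0:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    # Numerator: difference between the two estimators of theta.
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)


# Hypothetical inputs: 10 sequences and 16 segregating sites.
d_rare_excess = tajimas_d(10, 16, 3.0)   # few pairwise differences
d_intermediate = tajimas_d(10, 16, 8.0)  # many pairwise differences
print(d_rare_excess, d_intermediate)
```

Significance is normally judged against a null distribution (for example by coalescent simulation), which this sketch does not attempt.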

Phylogenetic analysis was performed using Bayesian and Neighbour Joining methods. Eriocheir japonicus served as an outgroup. The minimum spanning network trees from haplotypes were also created using the program Network 4.2.0.1.

Results

Sequence variation

In total, 143 individuals from the seven populations were sequenced for the mtDNA COII and Cytb genes, and 90 individuals for the ITS segment. The complete sequence length of the COII gene was 693 bp, and the unambiguous lengths of the Cytb and ITS segment were 766 and 706 bp, respectively.

The complete COII gene sequence in all individuals (n = 143) contained 95 segregating sites, 49 of which were parsimony informative sites, and 46 were singleton sites. The Cytb fragment contained 22 segregating sites, 7 of which were parsimony informative sites, and 15 were singleton sites. In the combined data of both the COII and Cytb, 95 segregating sites and 31 parsimony informative sites were found in all individuals. As expected for protein coding genes, no insertions or deletions were found in either mtDNA sequence.
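The site categories used here can be made concrete with a small sketch. It assumes the common convention that a site is parsimony informative when at least two character states each occur in at least two sequences, and it treats every other variable site as a singleton site, so that the two categories sum to the number of segregating sites, as in the counts reported above. The function name and toy alignment are illustrative only.

```python
from collections import Counter


def classify_sites(seqs):
    """Count segregating, parsimony-informative, and singleton sites
    in a gap-free alignment of equal-length sequences."""
    segregating = 0
    informative = 0
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) < 2:
            continue  # invariant site
        segregating += 1
        # Informative: at least two states, each present in >= 2 sequences.
        if sum(c >= 2 for c in counts.values()) >= 2:
            informative += 1
    return {
        "segregating": segregating,
        "parsimony_informative": informative,
        "singleton": segregating - informative,
    }


# Hypothetical toy alignment (not data from this study).
site_counts = classify_sites(["AACGT", "AACGA", "ATCGT", "ATCCT"])
print(site_counts)
```

In the toy alignment, the second column (A, A, T, T) is parsimony informative, whereas the fourth and fifth columns each carry a variant seen in only one sequence.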

Information on the variable sites in the mtDNA sequences of all populations is given in Table 1. The three Chinese populations of Eriocheir sinensis exhibited 50 segregating sites and 16 parsimony informative sites in the COII sequence, and 16 segregating sites and 5 parsimony informative sites in the Cytb sequence. The three European populations of Eriocheir sinensis exhibited 27 segregating sites and 7 parsimony informative sites in the COII sequence, and 7 segregating sites and 4 parsimony informative sites in the Cytb sequence. Compared with the populations of these two continents, the San Francisco population showed slightly fewer polymorphic sites in both mtDNA sequences.

Table 1 Sample size, number of haplotypes, haplotype diversity (h), nucleotide diversity (π), and neutrality test statistics in mtDNA for seven populations of Eriocheir sinensis

The alignment of ITS sequence contained a total of 97 segregating sites with 27 parsimony informative sites (Table 2). That is, 38, 22, and 9 segregating sites and 15, 15, and 8 parsimony informative sites were detected in the Chinese, European, and San Francisco populations, respectively.

Table 2 Sample size, polymorphic sites, Watterson’s estimator (θ W), nucleotide diversity (π), and neutrality test statistics in ITS for seven populations of Eriocheir sinensis

Population structure and genetic diversity

There were 57, 25, and 83 haplotypes detected in the mtDNA COII, Cytb, and the combined sequences, respectively. Nine haplotypes in the COII and three haplotypes in the Cytb were shared among all populations, and 10 haplotypes were shared in the combined sequences (Table 3). Haplotype H1 was both the most widespread (all populations) and the most common (35 individuals) in the COII sequence. Haplotype H3 was the most common (53 individuals) in the Cytb sequence. Only one haplotype (H42) in COII and two haplotypes (H55, H58) in the combined sequences were shared exclusively by the European and San Francisco populations.
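A tabulation of shared haplotypes of this kind can be sketched as follows. The code collapses identical sequences into haplotypes, counts them per population, and then picks out haplotypes found in more than one population or restricted to the colonized populations. The population codes follow the abbreviations used in Fig. 1, but the sequences and counts are invented for illustration and do not correspond to the haplotypes reported here.

```python
from collections import defaultdict


def haplotype_table(samples):
    """Map each distinct sequence to its count per population.

    `samples` is a list of (population, sequence) pairs.
    """
    table = defaultdict(lambda: defaultdict(int))
    for population, sequence in samples:
        table[sequence][population] += 1
    return table


def shared_haplotypes(table, min_populations=2):
    """Return haplotypes found in at least `min_populations` populations."""
    return {
        seq: dict(counts)
        for seq, counts in table.items()
        if len(counts) >= min_populations
    }


# Hypothetical samples (not data from this study).
samples = [
    ("YZ", "ACGT"), ("YZ", "ACGA"), ("EB", "ACGT"),
    ("SF", "ACGT"), ("SF", "TCGT"), ("TM", "TCGT"),
]
colonized = {"EB", "RN", "TM", "SF"}

shared = shared_haplotypes(haplotype_table(samples))
# Shared haplotypes that never occur in a native population.
colonized_only = {seq for seq, counts in shared.items() if set(counts) <= colonized}
print(shared)
print(colonized_only)
```

Here one toy haplotype is shared between a native and two colonized populations, while another occurs only in colonized populations, mirroring the two categories distinguished in the text.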

Table 3 Absolute frequency of shared haplotypes from seven populations of Eriocheir sinensis

The haplotype diversity (h) and nucleotide diversity (π) of the two separate and combined mtDNA sequences within each population are presented in Table 1. The COII exhibited higher haplotype and nucleotide diversities than the Cytb. In the COII, the three native populations displayed high haplotype diversity, ranging from 0.8737 to 0.9737, while the colonized European and San Francisco populations exhibited relatively low haplotype diversity (0.7692–0.9263). Among the populations, the Yangtze River population showed high haplotype diversities (0.8370–0.9899) and nucleotide diversities (0.0029–0.0032), while the Rhine population showed the lowest haplotype diversities (0.4103–0.8717) and nucleotide diversities (0.0007–0.0029). The San Francisco population also displayed low haplotype and nucleotide diversity. Tajima’s D and Fu and Li’s D were significant only for the mtDNA sequences from the Yangtze River population and for the Cytb sequence from the San Francisco population. Similarly, the Yangtze River population showed low R 2 statistic values.

In the ITS fragment, the Yangtze River population showed the highest number of variable sites (segregating sites and parsimony informative sites) and the highest genetic diversity (θ W), while the European and San Francisco populations exhibited similar values of Watterson’s estimator (θ W = 0.0041–0.0050). On the other hand, the highest nucleotide diversity (π) and K value (average number of nucleotide differences) were detected in the San Francisco population, whereas the lowest Watterson’s estimator and nucleotide diversity were found in the Liaohe River population. Tajima’s D was significantly negative in the Yangtze River population and not significant in the other populations. In contrast, Fu and Li’s D values were not significant in any population. The R 2 statistic was lowest in the Yangtze River population.

Geographic differentiation

The AMOVA in Table 4 showed that the genetic differentiation among the three continents was not significant for any of the studied sequences. However, it was highly significant among populations within continents and within populations. Variation within populations accounted for most of the total variance (78.49–85.41%).

Table 4 Hierarchical Analysis of Molecular Variance (AMOVA) for Eriocheir sinensis

On the other hand, most of the pairwise F ST values between populations were significant (Tables 5 and 6). In the mtDNA COII and Cytb sequences, the pairwise F ST value was not significant between the Liaohe and Yellow River populations in China. However, the three populations in China had significant F ST values with each population in Europe and San Francisco (except between the Yangtze River and Rhine River populations in the COII sequence, where the value was not significant). Within Europe, all pairwise F ST values were significant, except between the Elbe and Thames populations in the COII, Cytb, and combined data. The San Francisco population had the largest F ST values in comparisons with the other populations, except for the Thames River population.
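To make the meaning of a pairwise F ST value concrete, the sketch below uses the sequence-based estimator of Hudson, Slatkin and Maddison (1992), F ST = 1 − H w / H b, where H w is the mean number of pairwise differences within populations and H b the mean number between them. This is a deliberately simple stand-in and not the AMOVA-based estimator or the permutation test implemented in Arlequin, so its values would not match Tables 5 and 6 exactly. The function names and toy sequences are hypothetical, and each population needs at least two sequences.

```python
from itertools import combinations, product


def mean_diff(pairs):
    """Mean number of differing sites over an iterable of sequence pairs."""
    pairs = list(pairs)
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


def hudson_fst(pop1, pop2):
    """F_ST = 1 - H_w / H_b (Hudson, Slatkin and Maddison 1992)."""
    h_within = (
        mean_diff(combinations(pop1, 2)) + mean_diff(combinations(pop2, 2))
    ) / 2
    h_between = mean_diff(product(pop1, pop2))
    if h_between == 0:
        raise ValueError("no between-population differences; F_ST undefined")
    return 1 - h_within / h_between


# Hypothetical toy populations (not data from this study).
pop_a = ["AAAA", "AAAT", "AAAA"]
pop_b = ["TTAA", "TTAT", "TTAA"]
fst = hudson_fst(pop_a, pop_b)
print(round(fst, 4))
```

The two toy populations differ by two fixed sites and share one polymorphism, giving F ST = 8/11. Values near zero (or slightly negative, which can arise by sampling) indicate little differentiation.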

Table 5 Population pairwise F ST values in the COII (below diagonal) and Cytb (above diagonal) for seven populations of Eriocheir sinensis

In the ITS sequences, no significant F ST value in the native populations was found between the Liaohe and Yellow River populations. Interestingly, no significant F ST values were detected between the three native populations and the Elbe River population, and between the Yangtze and Rhine River populations. Nonetheless, significant F ST values were found from each comparison of the other populations. The San Francisco population exhibited a significantly high genetic divergence with the Chinese and European populations.

Evolutionary relationships

With the Japanese mitten crab (Eriocheir japonicus) as an outgroup, the Bayesian and NJ analyses of all studied sequences consistently showed that all haplotypes or specimens from the three continents were mixed with one another and clustered into one large group (not shown). This indicated that some colonized haplotypes or specimens were closer to native haplotypes or specimens than to other colonized haplotypes or specimens. Likewise, the haplotype networks from all the studied sequences showed that the native and colonized haplotypes were mixed throughout the networks. Figure 1 shows the haplotype network of the Cytb sequence.

Fig. 1

Haplotype network for the three continents of Eriocheir sinensis in the Cytb sequence. Note: YZ, Yangtze River; LH, Liaohe River; YR, Yellow River; EB, Elbe River; RN, Rhine River; TM, Thames River; SF, San Francisco Bay

Discussion

Genetic variability and structure

Because of founder effects or genetic bottlenecks in colonized populations, both classical theory and empirical studies indicate that genetic variability is much lower in colonized populations than in native ones (Rozas et al. 1990; Williamson 1996; Austerlitz et al. 2000; Stepien et al. 2002; Piccinali et al. 2007). In this study, the overall genetic variability, expressed as haplotype and nucleotide diversity, was higher in the native populations than in the colonized European and San Francisco populations. In particular, low genetic variability was detected in the mtDNA sequences of the Rhine, Thames, and San Francisco populations. There was also some differentiation in haplotype composition between the native and colonized populations. Furthermore, a marked loss of singletons was associated with the colonized populations. For example, the singleton variants at positions 93, 386, and 582 in the mtDNA COII sequence showed much higher frequencies of loss in the European and San Francisco populations (30.3–50.0%) than in the native populations (0.0–17.3%).

Some reports have also indicated that the genetic variability in colonized populations can be equivalent to that in native ones (Stepien and Tumeo 2006). In the present study, the genetic variability in the Elbe River population was the highest among the colonized populations and was at the same level as that of the native populations. This may have the following alternative explanations: (1) the Elbe River population had undergone large genetic change and adaptive evolution since colonization in the 1900s; (2) the Elbe River population was likely introduced from China via multiple rivers because two shared haplotypes (H21 and H22) in COII were only detected in the Elbe, Liaohe, and Yellow Rivers; and (3) the other European populations were isolated from the Elbe River population, having a smaller founder population size than the latter.

Genetic differentiation

Genetic differentiation commonly increases between native and colonized populations, and also among colonized populations. In our study, the native populations of the Chinese mitten crab displayed significant genetic differentiation from those in Europe and North America (Tables 5 and 6). Among the three native populations, there was significant differentiation between the Yangtze and the two other native populations, but no significant differentiation was found between the Liaohe River and Yellow River populations. This was likely because of the long distance (more than 1,000 km) between the Yangtze River and the two other native rivers, and the short distance (100–200 km) between the Liaohe River and Yellow River, which allows large-scale gene flow. In addition, the population expansion in the Yangtze River population may have increased its genetic differentiation from the other two populations.

Table 6 Population pairwise F ST values in the COII + Cytb (below diagonal) and the ITS (above diagonal) for seven populations of Eriocheir sinensis

Hanfling et al. (2002), based on the mtDNA COI sequence, found significant differentiation among the populations from different continents, but no significant differentiation within the European populations. Conversely, Herborg et al. (2006), using microsatellite markers, detected significant differentiation within the European populations. In our study, large-scale genetic differentiation was found, as shown by the mostly significant pairwise F ST values. This is in agreement with the results of Herborg et al. (2006), but is in contrast to those of Hanfling et al. (2002). The result also supports the view that an expansion-drift and gene flow pattern was involved in the European population of the Chinese mitten crab (Herborg et al. 2006).

The pairwise F ST values in the mtDNA and ITS sequences indicated that the San Francisco population was strongly differentiated from the Chinese and European populations, with the exception of the Thames population in the mtDNA. The F ST differences between the mtDNA and ITS sequences may have resulted from the small sample sizes and sampling variance in the ITS sequence. The lack of significant differentiation between the Thames population and the San Francisco Bay population suggests that gene flow may have occurred between the two populations.

Origin of the European and North American populations

Since the Chinese mitten crab was first discovered outside its native range in the Weser in 1912, this species has colonized most of Continental Europe (Herborg et al. 2003, 2005, 2006). In 1992, the Chinese mitten crab was first collected in the San Francisco Estuary and then spread rapidly to river areas upstream of the Delta (Rudnick et al. 2000, 2003, 2005). Some reports have discussed the possible pathways and invasion history of the Chinese mitten crab to Europe and North America (Peters 1933; Panning 1939; Cohen and Carlton 1997; Hanfling et al. 2002; Herborg et al. 2003, 2007). It is evident that the Chinese mitten crab in Europe was introduced from China via ballast water (Peters 1933; Panning 1939; Hanfling et al. 2002; Herborg et al. 2003, 2007). However, the source population of the Chinese mitten crab in San Francisco Bay is under dispute. Cohen and Carlton (1997) suggested that the Chinese mitten crab in San Francisco most likely originated from Asia by either ballast water or international importation, and that Europe seemed an unlikely immediate source. Based on the result that the single haplotype in the mtDNA COI sequence in the San Francisco population could be detected in the European populations but not in the Chinese populations, Hanfling et al. (2002) indicated that the San Francisco population was established by a single event and originated from the European populations. In the present study, some haplotypes possessed by the San Francisco population were found in both the native and European populations, but some were detected only in the European populations (Table 3). This suggests that the Chinese mitten crab in San Francisco may have had two introduction sources, one from the Chinese populations and another from the European populations.

The Yangtze River basin is the major distribution area of the Chinese mitten crab, followed by the Liaohe River basin. Based on haplotype composition and frequency, the Yangtze River population was the most likely origin of the European populations. However, some haplotypes were shared by the Liaohe and Yellow River populations and the European (Elbe River) population. This indicated that the European populations of the Chinese mitten crab originated from multiple rivers in China, that is, not only from the Yangtze River but also from the Yellow or Liaohe Rivers.

Furthermore, the absence of significant neutrality test and R 2 values in the colonized populations indicates that the founder populations in Europe and San Francisco were of moderate size. In addition, the AMOVA result showing no significant differentiation among the three continents indicates a colonization history involving introductions on multiple occasions; that is, the frequent ship traffic from each continent gives a higher probability of introduction of the Chinese mitten crab via ballast water.