Introduction

Successful establishment of non-native species is related, in large part, to high propagule pressure, high genetic diversity and release from natural enemies (Sakai et al., 2001). In recent years, global trade has increased the opportunities for repeat introductions of non-native organisms, resulting in increased propagule pressure (Wilson et al., 2009). Multiple introductions play an important role in the expansion and establishment of invasive species and are one of the main factors mediating genetic diversity in introduced populations (Roman & Darling, 2007). While small founding populations from a single introduction event are expected to have lower genetic variation than native populations, partly as a result of genetic bottlenecks, drift or founder effects (Bock et al., 2014), the release of numerous individuals into the novel area increases the likelihood of sufficient genetic diversity and lowers the potential for negative genetic consequences of inbreeding, stochastic diversity loss through drift and lowered response to selection (Roman & Darling, 2007). Accordingly, recent genetic studies have revealed that reduction in genetic diversity in non-native populations is less frequent than expected (e.g. Cristescu, 2015). Moreover, multiple introductions may create populations that are even more genetically diverse than a single source population, particularly when the species in its native range is highly structured (i.e. the species forms genetically disparate native range populations; Kolbe et al., 2008).

Release of non-native species from natural enemies, such as parasites, in new areas may provide them with an advantage over local species, leading to increased population growth due to the absence of top-down regulation (Keane & Crawley, 2002). While local species have to cope with a full range of pathogens, introduced species usually harbour a decreased number of their natural parasites and often show a lower proportion of infected individuals compared to native populations (Torchin et al., 2003). Parasite release, or the opposite process, parasite co-introduction, depends on several mechanisms. The probability of parasite release/co-introduction reflects the size of the founding host population (number of introduced individuals), its origin (from low/high parasitized sources) and its developmental stage (e.g. eggs vs. adults). If the parasite is co-introduced, its survival and establishment may also reflect the size of its own founding population. Moreover, in parasites with complex life cycles, presence of all hosts necessary for completion of the life cycle is crucial for their survival and establishment in the new environment (summarized in MacLeod et al., 2010).

The pumpkinseed sunfish Lepomis gibbosus (L., 1758), a North-American centrarchid fish species, was introduced into Europe via a number of complex routes, including multiple source populations, multiple introduction sites and repeat introductions since the end of the 19th century (Soes et al., 2011). Historical data indicate several pumpkinseed introduction events of small to large numbers from different parts of North America (summarized in Wiesner et al., 2010), and this was supported by a recent phylogeographical study which identified at least two important source regions in North America (Yavno et al., 2020). The first introductions mainly took place in two countries, France and Germany, from where the fish were likely distributed to other parts of Europe by aquarists, anglers and hobbyists, either intentionally or accidentally (Wiesner et al., 2010). Thus, human-mediated introductions probably caused the present discontinuous distribution of pumpkinseed in Europe (García-Berthou et al., 2005). The species is now established in at least 28 European countries (Copp & Fox, 2007), and some of these populations, particularly in the southern part of Europe, are considered invasive (Cucherousset et al., 2009). Presently, the European invasive range of this species stretches from the Iberian Peninsula in the west to the Sea of Azov and coastal rivers in the east (Diripasko et al., 2008; Ribeiro & Collares-Pereira, 2010). Interestingly, despite the complex nature of pumpkinseed introduction into Europe, Yavno et al. (2020) observed low genetic diversity, suggesting that genetic variance may be less important than expected for the successful establishment of pumpkinseed and its spread into new environments.

Non-native pumpkinseed populations in Europe are known to host at least 11 North-American parasite species. The majority of these introduced parasites are monogeneans, ectoparasites with a direct life cycle, and include seven dactylogyrid and two gyrodactylid species (Sterud & Jørgensen, 2006; Ondračková et al., 2011, 2019a, b, 2020; Havlátová et al., 2015; Kvach et al., 2018). The remaining two are endoparasitic species with complex life cycles: one larval diplostomid digenean (Kvach et al., 2017; Stoyanov et al., 2017; Ondračková et al., 2019a) and one larval proteocephalid cestode (Kvach et al., 2020). The monogeneans co-introduced to Europe are mostly specific to centrarchid fish hosts. Similarly, metacercariae of Posthodiplostomum centrarchi Hoffman, 1958 appear specific to centrarchid sunfishes, as indicated by recent molecular studies (Locke et al., 2010; Kvach et al., 2017), and such high specificity limits potential host switching of these parasites to native fish species. On the other hand, the larval cestode, Proteocephalus ambloplitis (Leidy, 1887), recently found in Europe (Kvach et al., 2020), is non-specific and may infect a wide range of intermediate fish hosts from different fish families (Amin, 1990), potentially representing a threat to native European fish species (Kvach et al., 2020).

The main objective of this study was to determine the genetic differentiation of non-native European pumpkinseed populations and to identify how their genetic structure relates to the distribution and abundance of parasite species. We tested 14 European pumpkinseed populations, covering seven river basins of the Black Sea, North Sea, Aegean Sea and Atlantic drainages, for six microsatellites in order to assess (i) whether and how European populations are structured and (ii) how they vary in genetic diversity. The same fish used for genotyping were also subjected to (iii) parasitological examination. In addition to overall parasite species richness and abundance, we evaluated the species richness and abundance of North-American species (i.e. co-introduced, or at least co-evolved) as both the number and abundance of North-American parasites may potentially reflect the propagule pressure (MacLeod et al., 2010). We also assessed infection by local parasite species (i.e. either native or introduced from parts of the world other than North America). Parasite load of acquired local parasites may reflect host genetic diversity, as the loss of genetic variation that has been observed in European pumpkinseed populations (Yavno et al., 2020) is expected to increase host susceptibility to parasites (Luikart et al., 2008). Parasite infection was compared with genetic diversity of L. gibbosus at both the population and individual levels to assess (iv) the association between heterozygosity and allelic richness (representing host genetic diversity) and parameters of parasite infection. Finally, as parasites have been recognized as a useful tool for discriminating host populations (MacKenzie, 2002; Poulin & Kamiya, 2015), we tested (v) whether parasite communities can serve as natural tags for identifying genetic lineages of non-native pumpkinseed populations in Europe.

Materials and methods

Fish and parasite sampling

Fish were sampled by electrofishing (SEN portable backpack electrofishing gear, F. Bednář, Czech Republic; maximum output 225/300 V, frequency 75–85 Hz), seine-netting (7 m long beach seine, mesh size 4 × 4 mm) or angling, depending on habitat type. Fourteen localities were sampled from late spring to summer during 2015–2018 from across the pumpkinseed’s European distribution area (Table 1, Fig. 1), covering basins of the Rivers Danube (Black Sea drainage), Elbe (North Sea), Garonne (Atlantic Ocean) and Struma (Aegean Sea). In addition, fish tissue samples were collected from the Landeira and Coruche Reservoirs (Sado and Tejo basins, Atlantic drainage) and the Knielingen pond (Rhine River basin, North Sea drainage), along with corresponding parasitological data published by Kvach et al. (2017) and Ondračková et al. (2019a). In total, 20–21 adult fish (fish maturity determined according to gonad development) were collected per population and transported alive to the laboratory. Fish were individually sacrificed prior to measurement of standard length (SL, nearest 1 mm) and dissection. The caudal fin of each fish was preserved in 96% ethanol for further DNA analysis. Fin tissues of 3 individuals from each sampling site are deposited in the Genetic Bank of the Institute of Vertebrate Biology, Czech Academy of Sciences, which is part of the National Animal Genetic Bank, Czech Republic, under unit IDs/catalog numbers IVB-F-2150-2191. Fish were examined under an Olympus SZX 10 binocular microscope (Olympus Optical Co., Japan) for the presence of parasites according to standard methods within 3 days of capture (Kvach et al., 2016). Fish skin, fins, gills and opercula were examined for ectoparasitic species, while the eyes, heart, liver, spleen, kidney, peritoneal cavity, intestinal tract and muscle tissue were examined for endoparasitic species. Protozoan parasites and myxozoans were recorded and identified from fresh material to the lowest possible taxonomic level.
Monogenean parasites were preserved in a mixture of ammonium picrate and glycerine (1:1) and identified according to the shape and size of haptor hard parts and copulatory organ (Malmberg, 1970). Trematodes and cestodes were removed from the relevant tissues, preserved in 4% hot formaldehyde, stained with iron acetic carmine, dehydrated in ethanol of increasing concentration and mounted in Canada balsam as permanent slides (Georgiev et al., 1986; Cribb & Bray, 2010). Nematoda, Copepoda, Branchiura and Bivalvia were preserved in 4% formaldehyde. Prior to identification, nematodes were mounted as glycerol temporary slides (Moravec, 2013). Mounted parasites were identified using an Olympus BX51 light microscope equipped with phase contrast, differential interference contrast and Olympus MicroImage™ Digital Image Analysis software (Olympus Optical Co., Japan), using corresponding keys (Beverley-Burton, 1984; Gusev et al., 1985; Bauer, 1987; Hoffman, 1999; Kuchta et al., 2008; Moravec, 2013).

Table 1 Populations of Lepomis gibbosus analysed in the present study, with detailed characteristics on the location related to river basin/sea drainage, coordinates (latitude and longitude), date of sampling, number of fish analysed (No.), fish standard length (SL, mm) with mean and range (min–max), mean ± SD parasite abundance and total parasite species richness
Fig. 1 Map indicating the 14 Lepomis gibbosus populations analysed during this study, showing the identification of the three clusters obtained from STRUCTURE analysis: (i) “French” lineage in purple, (ii) “Mid-European” lineage in green, (iii) “Edge” lineage in orange. Localities are indicated by numbers: 1 Landeira Reservoir, 2 Coruche Reservoir, 3 Bordeaux Lac, 4 Bègles Plage, 5 Koenigsmacker pond, 6 Knielingen pond, 7 Karolinka stonepit, 8 Opatovice sandpit, 9 Donbas sandpit, 10 Helpun borrow pit, 11 Lake Neusiedl, 12 Kula Reservoir, 13 Ognyanovo Reservoir and 14 Kocherinovo borrow pit

Microsatellite genotyping

Samples from 282 individuals from the 14 localities were genotyped for 6 polymorphic microsatellite loci (Lmar9, Lmar14, Lmar18, Lmar29, RB7, and RB20) previously isolated from different centrarchid species (Colbourne et al., 1996; DeWoody et al., 1998; Schable et al., 2002). DNA was extracted from ethanol-preserved tissue samples (fish caudal fins) using the Invisorb® Spin Tissue Mini Kit (STRATEC Biomedical AG) following the standard protocol. PCR was performed using the PCR Multiplex Kit (Qiagen) under standard protocols, with an annealing temperature of 51.5°C for all six loci. Amplifications were carried out in a total reaction volume of 10 µl. The PCR products were electrophoresed on the ABI Prism®3130 Genetic Analyser (Applied Biosystems) and analysed using GeneMapper® v. 3.7 (Applied Biosystems).

Data analysis

To assess fish host genetic diversity, the GenAlEx package v. 6.41 (Peakall & Smouse, 2006) was used to compute basic statistical parameters, i.e. number of alleles, Shannon’s information index (I), observed heterozygosity (Ho), expected (He) and unbiased expected heterozygosity (UHe), fixation index (F) and percentage of polymorphic loci (%P). A Hardy–Weinberg equilibrium (HWE) test was performed for each locus in all populations using the Markov chain method (“Exact probability test”) in Genepop v. 4.0.10 (Raymond & Rousset, 1995; Rousset, 2008) under the following parameters: dememorisation = 10,000, batches = 100, iterations per batch = 5000. The frequency of null alleles for each locus and population was calculated using the FreeNA software package (Chapuis & Estoup, 2007). Mean allelic distance (d2) was determined as the squared difference in repeat units between the two alleles at a microsatellite locus in one individual, averaged over loci; it acts as a useful indicator of inbreeding (Coulson et al., 1998). Because the data were not normally distributed, differences in mean allelic distance between populations were tested using the Kruskal–Wallis ANOVA test followed by Mann–Whitney multiple comparison post hoc tests.
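The diversity parameters named above follow standard definitions, and a minimal Python sketch may help make them concrete. It is not the GenAlEx implementation: the function names, the input layout (allele sizes in base pairs) and the motif lengths are assumptions made for illustration, with He = 1 − Σp², UHe = He · 2N/(2N − 1) and F = 1 − Ho/He taken as the usual formulas.

```python
from collections import Counter


def locus_stats(genotypes):
    """Diversity statistics for one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples, one per individual
    (hypothetical input layout; missing genotypes must be removed first).
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]
    ho = sum(a != b for a, b in genotypes) / n    # observed heterozygosity
    he = 1 - sum(p * p for p in freqs)            # expected heterozygosity
    uhe = he * 2 * n / (2 * n - 1)                # small-sample correction
    f = 1 - ho / he if he > 0 else float("nan")   # fixation index
    return {"Na": len(counts), "Ho": ho, "He": he, "UHe": uhe, "F": f}


def mean_d2(genotype, motif_bp):
    """Mean squared allelic distance (d2) for one individual.

    genotype: list of (size_a, size_b) in base pairs per locus, or None
    where the locus failed to amplify.
    motif_bp: repeat-motif length in base pairs per locus (assumed known).
    """
    squared = []
    for locus, motif in zip(genotype, motif_bp):
        if locus is None:
            continue  # skip missing loci instead of counting them as zero
        a, b = locus
        squared.append(((a - b) / motif) ** 2)
    return sum(squared) / len(squared)
```

For example, an individual typed as (100, 104) at a dinucleotide locus and (200, 200) at a tetranucleotide locus has a mean d2 of ((−2)² + 0²)/2 = 2.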

Five microsatellite loci (Lmar9, Lmar18, Lmar29, RB7, and RB20) were amplified in all individuals of L. gibbosus to evaluate fish host genetic structuring. Missing genotypes only occurred at locus Lmar14, with negative amplification success in 40% of samples. To avoid potential bias, population-level genetic variability was analysed using the reduced dataset of these 5 loci (i.e. excluding locus Lmar14). The population genetic structure of microsatellite genotypes for 282 individuals sampled from the 14 populations was then analysed using the Bayesian clustering algorithm implemented in the program STRUCTURE v. 2.3.4 (Hubisz et al., 2009). The program was run with 20 independent simulations for each K from 1 to 12, with 1 million iterations for each simulation (burn-in = 100,000 iterations), using an admixture model and a correlated allele frequencies model (λ = 1). The output of the STRUCTURE analysis was post-processed using CLUMPAK software (Kopelman et al., 2015) to identify separate groups of runs on the basis of similarity between Q-matrices for each K using the LargeKGreedy algorithm, random input order and 2,000 repeats. Different modes were identified for similarity scores from the 20 runs for each K value at a threshold of 0.9. The web-based software STRUCTURE HARVESTER (Earl & vonHoldt, 2012) was used for summarizing output data from STRUCTURE. The likelihood of K (Ln Pr(X|K)), the ΔK criterion using the method of Evanno et al. (2005) and the proportion of similar runs that formed the major modes for each K were used to infer the best number of real populations in the dataset.
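The ΔK criterion can be illustrated with a short Python sketch. It assumes the commonly used form ΔK = |L(K+1) − 2L(K) + L(K−1)| / SD[L(K)], where L(K) is the mean Ln Pr(X|K) over replicate runs and SD is taken here as the sample standard deviation; the function name and input layout are hypothetical, and the sketch is not the STRUCTURE HARVESTER code.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours.

    lnp_by_k: dict mapping K to a list of Ln Pr(X|K) values, one per
    replicate STRUCTURE run (hypothetical input layout).
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    delta_k = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined for the smallest and largest K
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # identical replicates would give a division by zero
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta_k[k] = second_diff / sd
    return delta_k
```

Because the second-order difference needs both neighbours, runs for K = 1 to 12 yield ΔK values only for K = 2 to 11. The K with the highest ΔK is then weighed together with Ln Pr(X|K) and the proportion of similar runs, as described above.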

Mean parasite abundance was calculated for all parasite species, including species native to North America and those acquired in the host’s new range. The latter comprise parasites native to the new range and non-native parasites introduced from regions other than North America (in our case, Asia). Mean abundance was expressed as the mean number of parasites per host across all hosts examined in a sample (infected and uninfected), while frequency of occurrence was calculated as the percentage of localities where each species was present. The parasite community was analysed at both the infracommunity (all parasites in a single host) and component community (all parasites in a host population) level (Bush et al., 1997).
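The descriptors defined above reduce to simple arithmetic, sketched below in Python. The function names and the data layout (one parasite count per examined host) are assumptions made for illustration only.

```python
def mean_abundance(counts_per_host):
    """Mean number of parasites per host, over ALL hosts examined
    (uninfected hosts are included as zeros)."""
    return sum(counts_per_host) / len(counts_per_host)


def frequency_of_occurrence(counts_by_site):
    """Percentage of localities where a parasite species was present.

    counts_by_site: dict mapping locality name to the list of per-host
    counts of one parasite species (hypothetical layout).
    """
    present = sum(1 for counts in counts_by_site.values() if sum(counts) > 0)
    return 100 * present / len(counts_by_site)


def component_richness(counts_by_species):
    """Number of parasite species found in a host population.

    counts_by_species: dict mapping species name to the list of per-host
    counts in one host population (hypothetical layout).
    """
    return sum(1 for counts in counts_by_species.values() if sum(counts) > 0)
```

With four hosts carrying 0, 3, 5 and 0 individuals of a species, mean abundance is 8/4 = 2; a species present at three of four localities has a frequency of occurrence of 75%.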

To test for an association between parasite infection and genetic variance at the individual level, i.e. individual fish heterozygosity (measured as the proportion of heterozygous loci) and mean allelic distance, a generalized linear model (GLM) with a Poisson error distribution and locality as a random factor was used. At the population level, associations between parasite infection and genetic variance, i.e. heterozygosity and allelic richness, were tested using the Spearman correlation test. As total parasite abundance was significantly affected by the abundance of North-American parasites, and mean heterozygosity was strongly associated with allelic richness (with the results for total parasite abundance/species richness and allelic richness being comparable), only outputs related to North-American parasite infection and fish host heterozygosity are presented in results.
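For the population-level test, Spearman's coefficient is the Pearson correlation computed on ranks, with tied values given their average rank. The sketch below uses only the Python standard library and returns the coefficient without a P value; it illustrates the statistic and is not the Statistica or R routine used in the study, and the function names are assumptions.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current block of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation coefficient between two equal-length lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

In this study, x and y would each hold 14 values, one per population (for example mean heterozygosity and parasite species richness).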

To assess whether the parasite community data corresponded to genetic population structure, discriminant analysis was performed on the parasite abundance data to classify differences between parasite communities among genetic lineages suggested by STRUCTURE analysis. Initially, the analysis was performed to discriminate populations along the three clusters, showing the most likely representation of European pumpkinseed population structuring. Subsequently, a further analysis was performed to discriminate populations classified according to suggestions produced by STRUCTURE for all other K (i.e. K = 2, 4–10). The percentage of correctly classified individuals among the three lineages was used to represent the discrimination power. The Spearman rank correlation test was used to evaluate the relationship between the parasite community data discriminatory power (arcsin transformed proportion of correctly classified individuals) and ΔK criterion values calculated by STRUCTURE. All analyses were performed using Statistica v. 13.1 for Windows (StatSoft, 2017) and the R software package v. 3.0.3 (R Core Team, 2015).
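The discrimination power and its transformation can be sketched as follows. The text states only that the proportion was "arcsin transformed"; the sketch assumes the common arcsine square-root form, asin(√p), which is an assumption rather than a confirmed detail of the analysis. Function names and lineage labels are illustrative.

```python
import math


def proportion_correct(true_lineage, predicted_lineage):
    """Share of individuals assigned by the discriminant analysis to the
    genetic lineage they actually belong to."""
    hits = sum(t == p for t, p in zip(true_lineage, predicted_lineage))
    return hits / len(true_lineage)


def arcsine_sqrt(p):
    """Arcsine square-root transformation of a proportion in [0, 1]
    (assumed form of the 'arcsin transformation')."""
    return math.asin(math.sqrt(p))
```

One transformed value would be obtained for each K and then correlated with the corresponding ΔK value. Because the transformation is monotonic, it leaves a rank-based correlation unchanged.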

Results

Fish host microsatellites

Six loci were amplified in the 14 pumpkinseed populations sampled. All six loci were polymorphic with 6 to 20 alleles per locus and the mean frequency of microsatellite null alleles per population was less than 5% for all loci (for a summary of amplified alleles in the six loci, see Online Resource 1). The total number of alleles per population, measured for the five loci amplified in all populations (i.e. excluding locus Lmar14), ranged from 10 in the Landeira Reservoir and Donbas sandpit to 31 at Bègles Plage (Table 2). Average allelic richness was strongly correlated with the total number of alleles and ranged from 2 to 6.2 (corresponding to the same sampling sites mentioned above). All except four populations exhibited 100% polymorphic loci. Observed heterozygosity (Ho) corresponded to the expected heterozygosity (He), with Ho being slightly higher than He in most populations. The number of effective alleles and the Shannon information index differed between populations and were strongly associated with allelic richness (Table 2). Three populations, Karolinka stonepit (Elbe basin) and the Landeira and Coruche Reservoirs (Atlantic drainage), were characterized by a higher F-index compared to other populations, indicating the possible occurrence of inbreeding (Table 2). Two of these three populations also showed low allelic distance (mean d2 = 23.7 and 26.8 at Landeira and Coruche, respectively), which was similar to that at the Donbas sandpit (mean d2 = 17.4) but significantly lower compared to other populations (Kruskal–Wallis test, H = 97.4, P < 0.001; Mann–Whitney multiple comparisons, P < 0.001 for all comparisons except the Ognyanovo Reservoir and the Opatovice sandpit, where P ≤ 0.012). Private alleles (15 in total) were observed in nine populations, with a maximum of three private alleles in the population from Lake Neusiedl (Danube basin).
The majority of populations were in HWE (Table 2). The exception was the Lake Neusiedl population, where a significant deviation from HWE (P < 0.01) was observed, attributable to an excess of heterozygotes. A marginally significant deviation from HWE (P < 0.05) was also observed in the Bordeaux Lac population (Table 2).

Table 2 Total (Total NA) and mean (Mean NA) number of alleles per 5 microsatellite loci, number of private (NAp) and effective (NAe) alleles, Shannon Information index (I), expected heterozygosity (He), unbiased expected heterozygosity (UHe), observed heterozygosity (Ho), percentage of polymorphic loci (P), Fixation index (F) and deviation from Hardy–Weinberg equilibrium (HWE)

The best supported model in STRUCTURE separated genetic variation into three clusters based on a combination of likelihood of K (Ln Pr(X|K)), the ΔK criterion (Evanno et al., 2005), and the proportion of similar runs. Individuals were arranged into 2 to 12 clusters (see Online Resource 2), with K = 3 being the most likely solution representing the population structuring of non-native pumpkinseed analysed in this study (for the geographic distribution of the clusters, see Figs. 1, 2). The three clusters corresponded with (i) the “French” lineage, representing populations in the region of original species introductions (purple in Figs. 1, 2); (ii) the “Mid-European” lineage, including populations along the Southern European Invasion corridor interconnecting the rivers Rhine, Main and Danube, supplemented by the River Struma population (green); and (iii) the North–South “Edge” lineage, including populations introduced in the 1970s–1980s from uncertain sources in the Elbe river basin (Czech Republic) and Portuguese reservoirs within the Atlantic basin (orange). An increase in the number of clusters led to successive separation of individual populations, first from the “Mid-European” and “Edge” groups, followed by the “French” group in models for K > 10.

Fig. 2 Genetic population structure for K = 3 estimated in STRUCTURE from 282 pumpkinseed sunfish Lepomis gibbosus from 14 localities. Each individual is represented by a vertical line partitioned into three coloured segments, the length of each colour being proportional to the estimated membership coefficient (Q). Black lines separate different sampling localities: 1 Landeira Reservoir, 2 Coruche Reservoir, 3 Bordeaux Lac, 4 Bègles Plage, 5 Koenigsmacker pond, 6 Knielingen pond, 7 Karolinka stonepit, 8 Opatovice sandpit, 9 Donbas sandpit, 10 Helpun borrow pit, 11 Lake Neusiedl, 12 Kula Reservoir, 13 Ognyanovo Reservoir and 14 Kocherinovo borrow pit

Composition of parasite communities

The pumpkinseed’s parasite fauna comprised 34 parasite species across 14 European localities, including nine non-native species from North America, four non-native species from East Asia and 15 native European (or worldwide) species, with six taxa not identified to species level (Table 3). Species richness at particular sites ranged from three (Donbas sandpit) to ten (Bègles Plage, Bordeaux Lac, Knielingen and Koenigsmacker ponds). Mean parasite abundance varied greatly between sampling sites (Table 3), with maximum values at Lake Neusiedl (246) and Kocherinovo borrow pit (269), and lowest values at the Opatovice and Karolinka ponds (3). None of the parasite species were found at all sampling sites, and a large proportion (68%) were observed at just 1 to 2 sites. Three North-American species (i.e. the monogeneans Onchocleidus dispar (Mueller, 1936) and O. similis Mueller, 1936 and the digenean Posthodiplostomum centrarchi Hoffman, 1958) had the highest frequency of occurrence (64–79%; Table 3), while the most frequent native parasite was a branchiuran, Argulus foliaceus Leach, 1814, present at 43% of sites. The majority of parasites infecting pumpkinseed in Europe were non-native North-American species (Table 3, Fig. 3), representing 91.1% of all parasite individuals found (overall mean abundance 95.1). Native European parasites represented just 2.2%, with a mean abundance of 2.3. The most abundant non-native Asian parasite was the copepod Neoergasilus japonicus (Harada, 1930), observed at five sampling sites with a mean abundance of 6.4.

Table 3 Abundance (or presence indicated as +) of parasite species collected from Lepomis gibbosus at 14 sites in Europe, with indication of parasite natural distribution
Fig. 3 Map indicating the proportion of parasites in 14 European pumpkinseed populations analysed in this study: North-American parasites in red, acquired European and Asian parasites in dark blue and unidentified parasites in white. Localities are indicated by numbers: 1 Landeira Reservoir, 2 Coruche Reservoir, 3 Bordeaux Lac, 4 Bègles Plage, 5 Koenigsmacker pond, 6 Knielingen pond, 7 Karolinka stonepit, 8 Opatovice sandpit, 9 Donbas sandpit, 10 Helpun borrow pit, 11 Lake Neusiedl, 12 Kula Reservoir, 13 Ognyanovo Reservoir and 14 Kocherinovo borrow pit

Parasite infection and microsatellite diversity

At the population level, parasite species richness was significantly associated with both microsatellite heterozygosity and allelic richness (N = 14, rs = 0.72, P = 0.004 and rs = 0.81, P < 0.001, respectively). Similar, though weaker, results were found for North-American parasites (rs = 0.58, P = 0.03 and rs = 0.74, P = 0.003, respectively). Mean abundance of all parasites was not associated with either heterozygosity or allelic richness (all P > 0.1), but allelic richness was positively correlated with abundance of North-American parasites and negatively correlated with abundance of acquired parasites (rs = 0.57, P = 0.03 and rs = − 0.65, P = 0.011, respectively). Though the association between abundance of acquired parasites and heterozygosity was not significant (rs = − 0.50, P = 0.067), a negative relationship was still apparent. At the individual level, neither abundance nor species richness of North-American or acquired Eurasian parasite species was significantly associated with individual heterozygosity or mean allelic distance (GLM, all P > 0.05).

Discriminant power of parasite communities

The three genetic clusters structuring European pumpkinseed populations showed a high correspondence with the results of parasite community discriminant analysis (Fig. 4). In total, 89.4% of the fish hosts were correctly classified for the three genetic clusters, with the best classified individuals originating from the “Edge” cluster (just 3% of all false classifications). The percentage of correctly classified individuals based on composition of parasite communities correlated significantly with the ΔK criterion values (see Online Resource 2) calculated in STRUCTURE (N = 10, rs = 0.82, P = 0.004), with a minimum of 80.5% correctly assigned individuals for the nine genetic clusters.

Fig. 4 Plot of discriminant analysis based on abundance of metazoan parasite species, separated by three genetic lineages of European pumpkinseed populations: “French” lineage: purple circles, “Mid-European” lineage: green squares and “Edge” lineage: orange triangles (colours correspond to the results of genetic population structuring in STRUCTURE, Fig. 2)

Discussion

Host genetic lineages

Genetic differentiation of the non-native European pumpkinseed populations across 14 sites corresponded partially with historical records and analysis of mitochondrial variation (Yavno et al., 2020). The optimal outcome based on microsatellite analysis within STRUCTURE software (Pritchard et al., 2000) separated the dataset into three groups: “French” populations with high genetic diversity (i.e. high allelic richness and heterozygosity), North/South “Edge” populations with generally low genetic diversity, and West/East “Mid-European” populations along the Rhine-Main-Danube invasion corridor (Panov et al., 2009), exhibiting low to high allelic richness and heterozygosity. Historical data indicated that the first countries where pumpkinseed were introduced (in the late 19th century) were France and Germany, with hundreds of individuals introduced from at least two different sources in North America. It was from these two countries that the species spread into other countries (Wiesner et al., 2010). Of the European populations analysed, sites located in France (covering the Garonne and Upper Rhine basins) showed the highest genetic diversity, which may reflect the high propagule pressure (number of individuals introduced and number of introduction events) in this region (Roman & Darling, 2007). Despite this, microsatellite diversity in these genetically diverse populations was still lower than that of pumpkinseed in their native range (Weese et al., 2012). With the exception of locus Lmar29, allelic richness in the five loci examined was approximately two times lower than in fish from 12 native populations in the Adirondack region of New York State and Central Ontario (Weese et al., 2012).

The North/South “Edge” lineage includes populations introduced more recently (i.e. within 20–40 years) into waterbodies along the River Elbe (North; Czech Republic) and the rivers Tejo and Sado (South; Portugal). Populations along the Elbe first appeared around 1998–2005 (V. Horak & V. Jelinek, personal communication), though the original source remains unknown. Clear genetic differentiation from populations along the Rhine (the region from which pumpkinseed first dispersed into Central Europe, including the Elbe river basin; Wiesner et al., 2010) indicates that pumpkinseed from the Karolinka and Opatovice ponds originated from other sources, possibly associated with recreational fishing, which is popular in both the Czech Republic (Nechanska et al., 2016) and Portugal (Ribeiro et al., 2009). Pumpkinseed introductions to the Landeira and Coruche Reservoirs in the late 1970s appear to have originated from successive East/West introductions along Iberian drainages, which may have reduced allelic richness and contributed to higher inbreeding values (Ribeiro et al., 2009; Kvach et al., 2017). Low allelic richness and heterozygosity in these populations indicate that they were founded by a limited number of individuals (lower propagule pressure). The only study to date to have analysed microsatellites in European pumpkinseed populations covered eight reservoir populations in Portugal (Bhagat et al., 2011). Genetic differentiation of these populations indicated two distinct groups, suggesting that current pumpkinseed distribution in Portugal is the result of at least two colonization events or a single event including fish from more genetically distinct populations. In partial support of this, the close genetic similarity between the two Portuguese populations used in this study indicates a common source of introduction.

The largest lineage, termed the West/East “Mid-European” lineage in this study, included seven populations along the southern invasion corridor connecting the Rhine, Main and Danube (Panov et al., 2009). This lineage corresponds, more or less, to populations with the highest occurrence of the European ND1-1 haplotype of the mitochondrial ND1 gene (Yavno et al., 2020). Presence of populations from localities along the Southern invasion corridor corresponds well with available historical data. According to Gariloaie (2007), for example, pumpkinseed gradually expanded from Germany along the Rhine, Oder and Danube towards Eastern Europe. Aside from Helpun, where pumpkinseed were introduced over 2010–2011 (Ondračková et al., 2019b), most of the populations first appeared during the 1970s and 1980s (noted in Kvach et al., 2017 for Bulgarian populations and Kritscher, 1973 for Lake Neusiedl; unknown for Knielingen pond). These populations were established following introduction from adjacent water bodies after flood events, release by aquarists or unintentional stocking (Kritscher, 1973; Trichkova et al., 2007; Uzunova & Zlatanova, 2007; Ondračková et al., 2019b), with the means of introduction being an important factor affecting genetic diversity. The “Mid-European” lineage includes both populations with relatively high allelic richness and heterozygosity (but still lower than that of the “French” lineage) and populations with low genetic diversity (Donbas), similar to populations in the “Edge” lineage. The Lake Neusiedl population proved the most diverse population in this lineage, showed the highest number of private alleles and was the only population that showed a highly significant deviation from HWE (Table 2).
The original founder population at this site is presumed to have been small as the fish were apparently introduced by aquarists (Kritscher, 1973); nevertheless, numbers increased rapidly and the species is now the third most abundant fish species in the lake (Sallai, 2019). As the actual number of founding individuals and introduction events remain unknown; however, more detailed studies are required to clarify why the pumpkinseed population in Lake Neusiedl is now so diverse.

Parasite infection

The pumpkinseed’s parasite fauna in its native range comprises over a hundred species (Hoffman, 1999), of which 11 were introduced to Europe (see Table 3, Havlátová et al., 2015 and Kvach et al., 2018) along with their fish hosts (monogeneans, 9 spp.) or by other potential vectors (digeneans, cestodes; means of introduction not confirmed to date). The probability of parasite co-introduction increases with the number of infected hosts in the founding population or with the number of introduction events (Torchin & Mitchell, 2004). Our data clearly support this, showing that both the abundance and the species number of North-American parasites were significantly and positively associated with host genetic diversity, measured as allelic richness and heterozygosity. The highest North-American parasite abundance and species richness were observed in “French” lineage populations, which also exhibited the highest genetic diversity among the European populations used in this study. Co-introduced parasite fauna in this lineage may even have been underestimated as at least one other North-American monogenean species has been documented from another French locality (Havlátová et al., 2015). This corresponds with the results of García-Berthou et al. (2005), who found that France and Germany were the main recipient countries of non-native fishes from North America.

High species richness and/or abundance were also observed in genetically diverse populations of the “Mid-European” lineage, namely the Knielingen pond in Germany, Lake Neusiedl in Austria and the Kocherinovo borrow pit in Bulgaria. Since the initial wave of introductions in France and Germany during the 19th century, hundreds of individuals have been introduced elsewhere (Wiesner et al., 2010). In addition to establishing populations with high genetic variability, a high number of introduced organisms is likely to support co-introduction of native parasite fauna. Further, in those populations where genetic data indicate a large founding population and/or multiple introduction events, one would expect a sufficient number of founding parasite species/individuals to survive and become established in the new environment. Conversely, host population bottlenecks in the initial stages of species translocation resulting from small founder populations may disrupt parasite transmission to new areas, due to their absence in the founding population or presence at such low densities that the parasite population cannot be sustained (MacLeod et al., 2010). Populations with low allelic richness, heterozygosity or proportion of polymorphic loci typically had no, or only a few, North-American parasites in the parasite community. This was only partially true for Portuguese populations, however, where three of nine North-American parasite species were observed at relatively high abundance, possibly due to the co-occurrence of other centrarchids in the reservoirs (Ribeiro et al., 2009). Indeed, largemouth bass (Micropterus salmoides (Lacépède, 1802)) has been introduced multiple times across Portugal and Spain due to its popularity as an angling species (Ribeiro et al., 2009).

Non-native species commonly accumulate local parasite species from the introduced range, though usually in insufficient numbers to achieve richness comparable to that in the species’ native range (Torchin et al., 2003). Native pumpkinseed populations, for example, commonly harbour 14–21 species per population (e.g. Cone & Anderson, 1977; Chapman et al., 2015), compared to 3–11 spp. in non-native European populations. Though 22 species have been acquired by pumpkinseed throughout its non-native European range (Table 3), most of these are found at low abundance and occur at only 1–2 sites. As an exception, the non-native copepod Neoergasilus japonicus, originating from East Asia, has been recorded in five widely separated populations in France, Bulgaria and the Czech Republic, and was recorded at an exceptionally high mean abundance of 82 copepods per fish at the Donbas sandpit site. According to Nagasawa & Sato (2015) and Ondračková et al. (2019b), pumpkinseed and other centrarchids appear to be highly susceptible hosts for this parasite species, potentially contributing to the spread of N. japonicus in Europe. Of the native European parasites, only the larval gryporhynchid cestode Valipora campylancristrota (Wedl, 1855) successfully infected pumpkinseed at five localities in Portugal, Bulgaria and the Czech Republic, with only the Portuguese populations showing a mean abundance above 1. Though the number of local parasites infecting pumpkinseed was generally very low, an increased abundance of acquired parasites was observed in populations with decreased allelic richness and, to a certain extent, heterozygosity, supporting the theory that increased parasitism may be a consequence of reduced heterozygosity in wild populations (Luikart et al., 2008).

Parasites as tags of host genetic lineages

The use of parasites as biological tags has gained wider acceptance in recent decades, particularly for discriminating the origin of commercial fish (Poulin & Kamiya, 2015). In this study, 252 of the 282 fish analysed (89.4%) were correctly classified to their respective genetic lineage (for K = 3) according to their parasite communities. Moreover, the probability of correct classification increased significantly with increasing values of Evanno’s ΔK, which is a good predictor of the actual number of clusters. In other words, the composition of the parasite community was highly associated with the genetic lineage of the fish hosts. For pumpkinseed, other methods widely used for population discrimination, such as meristic analysis, do not work well owing to the species’ high morphological plasticity (Bhagat et al., 2011); hence, use of parasites as biological tags may provide a more reliable tool for the discrimination of pumpkinseed origin. Use of non-native species as a model could potentially suffer some limitations, however, as the European distribution of pumpkinseed is not natural, instead resulting from intentional or unintentional introductions over different periods in a wide range of countries and localities. As a result, distant populations may in fact be related, as in the case of the Norwegian population, which was probably introduced from the Czech Republic (Sterud & Jorgensen, 2006), while relatively close populations sharing the same river basin may in fact represent different genetic lineages, as observed in the Koenigsmacker and Knielingen ponds in the Rhine basin (Fig. 1). Interestingly, despite the abovementioned limitations, the probability of correctly assigning a fish to its genetic lineage within its non-native range was higher than that found for natural populations of marine fish, which show a moderate to high classification percentage (44% in Poulin & Kamiya, 2015; 71% in Marengo et al., 2017). As noted by Poulin & Kamiya (2015), however, improvements in discriminatory power when using fewer host groups (i.e. three in our case) may actually represent an artefact. Nevertheless, a high classification success (minimum of 80.5% for K = 9) was achieved even when higher numbers of fish groups were separated by STRUCTURE analysis. As such, we suggest that parasite community composition, alongside more costly genetic analyses, could prove a useful biological tool for discriminating non-native fish populations and their interrelationships.