Introduction

Conserving the intraspecific diversity of species under exploitation is essential to maintain their adaptive potential and resilience to environmental changes (Hilborn et al. 2003), and promote their sustainability. The underestimation of the number of stocks due to the lack of knowledge of the population genetic structure in species with fishing importance can generate the loss of genetic diversity and affect its fishing potential (Ward 2000; Viñas et al. 2011). Therefore, a detailed understanding of population genetic structure is especially relevant for fisheries management and conservation (Hauser et al. 2002; Viñas et al. 2011).

The shrimp Pleoticus muelleri (Bate 1888) is one of the most important resources of Argentine fisheries (Boschi 1997; Fernández and Hernández 2002). This decapod crustacean has a wide geographic distribution along the Southwestern Atlantic Ocean from Espíritu Santo, Brazil (20° S), to Santa Cruz, Argentina (50° S) (Boschi 1989). It has demersal-benthic habits and remains within the marine environment throughout its life cycle (Boschi 1997). Its greatest concentrations are found along the Patagonian coast, in areas with temperatures between 6 and 20 °C, and salinities between 31.5 and 33.5 PSU (Boschi 1986). It has been caught at depths ranging from 3 to 100 m (Bertuche et al. 2000b) and has become one of Argentina's main fish export products due to its high commercial value in international markets (Fischbach et al. 2006). In 2018, this crustacean registered a historical export maximum that represented 61% of the total of the fishing sector, and an annual income of more than 1,200 million US dollars (Piedrabuena and Salama 2021).

Pleoticus muelleri's fishery on the Patagonian coast began to develop in the 1980s, and encouraged research on this resource. In 1984, the first closure area was established, prohibiting fishing activity in a delimited zone south of the Golfo San Jorge (Mazarredo) to protect the shrimp's breeding and growth grounds. In the 1990s, after the Golfo San Jorge was declared a biological and economic unit for the exploitation of the shrimp resource, an adaptive management system with dynamic closure areas was established to avoid overfishing of young individuals, with sizes smaller than those desirable for the fishery (growth overfishing) (De Carli 2012; Marcucci et al. 2017). Currently, management strategies for this species lack information on its genetic diversity; they are based on the precautionary approach (Weiss 1992) with permanent monitoring (Bertuche et al. 2000a), and assume that the resource behaves as a single stock (De Carli et al. 2012). Annual shrimp catches fluctuate significantly, with minimums of 1100, 6500 and 7500 t in 1987, 1995 and 2005, respectively (Fischbach et al. 2006; Sánchez et al. 2012). As of 2008, the fishing capacity was concentrated in waters of national jurisdiction at the same time that fishing effort in the Golfo San Jorge began to be reduced (Marcucci et al. 2017). Thereafter, total catches increased until they reached their maximum in 2018 with 254,925.7 t; since then, catches have varied annually. In 2020, a decrease of 27.9% was registered compared to 2018, and in 2021 catches increased again (223,653.7 t) (Navarro et al. 2014, 2019, 2022; Dirección Nacional de Coordinación y Fiscalización Pesquera 2021).

In particular, P. muelleri has planktonic larval stages whose position in the water column varies according to environmental conditions (e.g., luminosity, turbulence and transparency) (Boschi 1989). The species is characterized by a large population size and a high dispersal capacity, and inhabits an open marine environment without recognizable migration barriers; it could therefore easily represent a single genetically homogeneous (panmictic) population with high levels of gene flow (Machado-Schiaffino et al. 2011; Palumbi 1994). Nevertheless, it cannot be ruled out that organisms with these characteristics exhibit population genetic structure determined by different factors, such as the environment or life history traits (Machado-Schiaffino et al. 2011), even across large geographic distances (Ward et al. 1994; Marcucci et al. 2017).

The first genetic information of P. muelleri was used to identify species of crustaceans in food products (Calo-Mata et al. 2009). Then, De Carli (2012), Marcucci et al. (2017) and Carvalho-Batista et al. (2018) investigated several mitochondrial DNA sequences covering all together the entire geographical distribution of the species. However, none of these studies found genetically structured populations. According to the characteristics of this species and previous studies, we tested the hypothesis of panmixia, analyzing the distribution of the genetic variability of P. muelleri in all its geographical range, using a Restriction Site-Associated DNA Sequencing approach (RADSeq, Miller et al. 2007; Baird et al. 2008; Davey and Blaxter 2010). RADSeq is the most widely used Next Generation Sequencing technique for the detection and genotyping of single nucleotide polymorphisms (SNPs) in ecological and evolutionary studies of non-model organisms, since it does not require prior information on the genome of the species under study (Andrews et al. 2016). In the field of fisheries, it has been implemented in studies of phylogenomics and species delimitation (Díaz-Arce et al. 2016; Pedraza-Marrón et al. 2019), population genomics (Deagle et al. 2015; Xuereb et al. 2018), traceability (Jiang et al. 2020), detection of markers linked to sex (Carmichael et al. 2013) and natural selection (Chen et al. 2020), among others (Cancela et al. 2010; Kumar and Kocour 2017). The implementation of this technique allowed us to perform powerful population genetic analyses in P. muelleri, and to obtain results never attained before in this species. This investigation provides basic knowledge that can be used to design better strategies for its sustainable management and conservation.

Materials and methods

Sample collection

A total of 42 individuals of Pleoticus muelleri were collected from eight sampling sites in the Southwestern Atlantic Ocean; i.e., south (SGJ, N = 6) and north of Golfo San Jorge (NGJ, N = 6); Jueves Santo (JS, N = 5), Rawson (RA, N = 2) and Deseado (DE, N = 5) off Rawson (Argentina); Punta del Diablo (UR, N = 7) (Uruguay); Rio Grande do Sul (RG, N = 2); and Macaé (RJ, N = 9) (Brazil) (Fig. 1). Argentinian specimens were obtained from commercial vessels between 2013 and 2017, whereas the samples from Uruguay and Brazil were acquired in 2013 through a collaboration between the Universidade Federal do Rio Grande do Sul and the Universidad Nacional de la Patagonia Austral (Supplementary Table S1).

Fig. 1

Map of the Southwestern Atlantic Ocean showing the geographic origin of the samples of Pleoticus muelleri used in this study. For each sampling location, the genetic clustering graph obtained from the STRUCTURE analysis (658 SNPs) for K = 2 clusters is shown. Each vertical bar represents an individual, and the colors correspond to that individual's estimated membership fraction in each of the K inferred clusters (red and green clusters). The upper left corner of the figure shows the graph of DeltaK, i.e., mean(|L''(K)|)/sd(L(K)) (Evanno et al. 2005), as a function of K (potential number of genetic clusters).

DNA extraction and library preparation

High quality genomic DNA of each sample was isolated from a portion of telson muscle preserved in 96% ethanol, using either the DNeasy Blood & Tissue kit (Qiagen), following the procedure indicated for animal tissue, or a salting-out method for DNA extraction (Aljanabi and Martinez 1997). In both cases, after proteinase K tissue digestion, 3 µl of RNase A (PBL, Argentina, 10 mg·ml−1) were added to each extract, which was then incubated at 37 °C for one hour to remove residual RNA. The process then continued following the protocol for each method. DNA concentration was measured using the Qubit dsDNA BR assay kit with a Qubit fluorometer (Invitrogen; Thermo Fisher Scientific, Inc.). DNA integrity was inspected by 1% agarose gel electrophoresis.

For the RAD library preparation (Miller et al. 2007; Baird et al. 2008; Davey and Blaxter 2010), one microgram of total DNA (20–50 ng·µl−1) was used following the protocol described in Roesti et al. (2013), modified from Hohenlohe et al. (2010). In brief, each sample was individually subjected to restriction digestion using the SbfI enzyme (New England BioLabs Inc.), followed by the ligation of a P1 barcoded adapter to the restricted DNA of each individual (Supplementary Table S2). DNA was multiplexed, sheared with a Covaris M220 sonicator, size selected (300–500 bp), end repaired, and ligated to the Illumina P2 adapter (Supplementary Table S2). Ligation products were enriched using 18 cycles of high-fidelity PCR amplification (Supplementary Table S2). Final library concentration was quantified using a Qubit fluorometer (Invitrogen; Thermo Fisher Scientific, Inc.). The molar concentration of the library and the median size of the library smear were examined by agarose gel electrophoresis. The library was paired-end sequenced on an Illumina HiSeq 4000 at the Genomics & Cell Characterization Core Facility, University of Oregon.

Sequence data analysis

Data quality was first checked using FastQC version 0.11.8 (Andrews 2010). Stacks version 2.4 (Catchen et al. 2013) was used to build loci de novo from raw reads in the absence of a reference genome. Paired end sequences were demultiplexed using process_radtags according to individual barcodes, and reads with low quality scores were filtered out. Additionally, reads containing the adapter sequence were filtered, and the remainder of the SbfI enzyme recognition site was identified from each sequence. Clone_filter was used to filter PCR duplicates. After this, single end reads were used. Assembly parameters, i.e., m, M and n, were optimized as follows. We ran the denovo_map pipeline several times, varying just one parameter each time. We varied m from 1 to 6, M from 0 to 8 and n from 1 to 6, while keeping all other parameters constant (m = 3, M = 2 and n = 0). We extracted the number of (i) assembled loci, (ii) polymorphic loci, and (iii) SNPs for each run of the program and selected those parameters that maximized the number of polymorphic loci present in 80% of the samples (r80 loci rule, Paris et al. 2017; Rivera-Colón and Catchen 2022). Then, the final denovo_map.pl with optimal parameters, i.e., m = 4, M = 2, n = 4, was run, and the populations module of Stacks was used to set the minimum number of populations a locus must be present in to process that locus (p 8) and the minimum proportion of individuals in a population required to process a locus for that population (r 0.80), and to remove from further analysis SNPs based on minor allele frequency (min-maf 0.035) and loci that exceeded the maximum allowed observed heterozygosity (max-obs-het 0.70), following the recommendation of Rochette and Catchen (2017). A Variant Call Format (vcf) file was exported. Using R software version 4.2.2 (R Core Team 2022), loci with more than four SNPs were eliminated and only the first SNP per locus was retained for the analyses. The missing data threshold was 0.90 (max-missing in VCFtools version 0.1.13, where 0 allows sites that are completely missing and 1 indicates no missing data allowed). VCFtools was also used to export the file in PLINK format (Danecek et al. 2011).
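
The r80 selection step can be illustrated with a minimal Python sketch. It assumes that each candidate run has already been summarised as, for every polymorphic locus, the number of samples in which that locus was genotyped. The function names and the toy counts are hypothetical; they are not Stacks output and do not reproduce the values obtained in this study.

```python
def count_r80_loci(samples_per_locus, n_samples, min_fraction=0.80):
    """Count polymorphic loci genotyped in at least min_fraction of the samples."""
    return sum(1 for n in samples_per_locus if n / n_samples >= min_fraction)


def best_parameter(runs, n_samples):
    """Return the parameter value whose run yields the most r80 loci."""
    return max(runs, key=lambda value: count_r80_loci(runs[value], n_samples))


# Hypothetical toy input: for each tested value of one assembly parameter,
# the number of samples (out of 10) in which each polymorphic locus was found.
toy_runs = {
    1: [10, 9, 7, 5],
    2: [10, 9, 8, 8, 6],
    3: [10, 8, 7, 7, 6],
}
print(best_parameter(toy_runs, n_samples=10))
```

In this toy input the second parameter value is selected because it yields the largest number of loci shared by at least 80% of the samples.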

Loci in linkage disequilibrium (LD) equating to an r2 value of more than 0.5 were removed from some analyses, i.e., genetic diversity estimates, pairwise genetic differentiation (FST) and STRUCTURE, using PLINK's --indep command (SNP window size: 300; SNPs shifted per step: 5; variance inflation factor (VIF): 2), which recursively removes SNPs within a sliding window (PLINK 2.0, Chang et al. 2015).
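
The correspondence between the VIF threshold and the r2 value quoted above follows from the definition VIF = 1/(1 − R2), where R2 is the multiple correlation coefficient of a SNP regressed on the other SNPs in the window. The short Python sketch below only illustrates this relation; the function names are hypothetical, and treating R2 as a pairwise r2 is an approximation.

```python
def vif_from_r2(r2):
    """Variance inflation factor for a multiple correlation coefficient R^2."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2)


def r2_from_vif(vif):
    """Inverse relation: the R^2 implied by a given VIF threshold."""
    if vif < 1.0:
        raise ValueError("VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif


print(vif_from_r2(0.5))  # 2.0
print(r2_from_vif(2.0))  # 0.5
```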

Genetic diversity

Estimates of genetic diversity, such as the number of private alleles per population, the percentage of polymorphic loci, nucleotide diversity (Pi), observed (HO) and expected heterozygosity (HE), and the inbreeding coefficient (FIS), were obtained using the populations program in Stacks (Catchen et al. 2013). The allelic richness (R) (El Mousadik and Petit 1996) was calculated using the hierfstat 0.5-7 R package (Goudet and Jombart 2020). FST between populations and their significance were calculated following the Weir and Cockerham (1984) formulation as implemented in Arlequin 3.5.2.2 (Excoffier and Lischer 2010) with 99,999 permutations. Sampling locations with N = 2, i.e., RA and RG, were excluded from this analysis to avoid low sample size bias.
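
As a reading aid for the heterozygosity and FIS estimates, the following Python sketch applies the textbook definitions for a biallelic locus (HE = 2pq and FIS = 1 − HO/HE). It is an illustration under these assumptions only: the exact estimators implemented in Stacks may differ (e.g., small-sample corrections), and the input values are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def inbreeding_coefficient(h_obs, h_exp):
    """F_IS as the relative heterozygote deficit, 1 - H_O / H_E."""
    if h_exp == 0:
        raise ValueError("F_IS is undefined for a monomorphic locus")
    return 1.0 - h_obs / h_exp


# Hypothetical locus: allele frequency 0.25 and 30% observed heterozygotes.
h_e = expected_heterozygosity(0.25)
f_is = inbreeding_coefficient(0.30, h_e)
print(h_e, round(f_is, 3))
```

A positive value, as in this toy locus, indicates fewer heterozygotes than expected under random mating.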

A neighbor joining (NJ) analysis was performed using the poppr package version 2.8.7 in R software (Kamvar et al. 2014, 2015) using Nei's genetic distance calculated between individuals with 10,000 bootstrap replicates.

Population structure analysis

We performed a Discriminant Analysis of Principal Components (DAPC, Jombart et al. 2010) using the R package adegenet v2.0.1 (Jombart 2008) to visualize relationships among groups of samples. The “find.clusters” function was executed retaining 35 PCs (accounting for 89.9% of the variance), and the Bayesian Information Criterion (BIC) was used to determine the optimal number of clusters (K). Then, two DAPC analyses were run with 21 PCs, explaining 60.9% of the variance: one with the optimal K = 2, and the other with the samples preassigned to the original sampling sites (i.e., SGJ, NGJ, JS, RA, DE, UR, RG and RJ). The alleles contributing most to the first discriminant function (DF1), i.e., those above a threshold loading of 0.006 (argument threshold), were identified, and their allele frequencies per population were graphed.

Population structure was explored with STRUCTURE version 2.3.4 software (Pritchard et al. 2000). We chose the admixture model and assumed correlated allele frequencies (Falush et al. 2003; Hubisz et al. 2009). Three independent runs were performed for each number of genetic clusters evaluated (K = 1–8). The initial burn-in period was set to 100,000, and the run length to 150,000 steps. STRUCTURE HARVESTER v0.6.94 (Earl and vonHoldt 2012) was used to process STRUCTURE results, and to perform the Evanno method (Evanno et al. 2005), to detect the number of K groups that best fitted the data.
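
The DeltaK statistic of the Evanno method can be sketched in Python as follows. The sketch assumes the common formulation in which the second-order difference is taken on the mean L(K) across replicate runs and divided by the standard deviation of L(K); the function name and the toy log-likelihood values are hypothetical and are not results of this study.

```python
from statistics import mean, stdev


def evanno_delta_k(ln_prob):
    """Return {K: DeltaK} for every K that has neighbours K-1 and K+1.

    ln_prob maps K to the list of L(K) values from replicate runs.
    A K whose replicates are all identical (sd = 0) would raise an error.
    """
    avg = {k: mean(values) for k, values in ln_prob.items()}
    delta_k = {}
    for k in sorted(ln_prob):
        if k - 1 in avg and k + 1 in avg:
            second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
            delta_k[k] = second_diff / stdev(ln_prob[k])
    return delta_k


# Hypothetical toy input: three replicate runs for K = 1 to 4.
toy_ln_prob = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-890, -895, -885],
    4: [-888, -880, -896],
}
delta_k = evanno_delta_k(toy_ln_prob)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Note that DeltaK cannot be computed for the smallest and largest K tested, so this criterion by itself can never select K = 1.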

Detection of loci associated with environmental variables

BayPass (Gautier 2015) was used to identify loci subjected to selection. All loci with a minor allele frequency of at least 0.035 were used (1740 loci in total). We chose the standard covariate model (STD), which allows identifying genetic markers associated with environmental variables (Gautier 2015). This model corrects for the scaled covariance matrix of population allele frequencies (Ω, calculated under the core inference model) to completely remove isolation by distance or substructure effects. We considered the annual average surface (https://doi.org/10.5067/MODSA-AN4D9; http://navigator.oceansdata.ca) and bottom seawater temperature (Baldoni et al. 2015; http://navigator.oceansdata.ca), the annual average surface (https://doi.org/10.5067/SMP50-3TMCS; http://navigator.oceansdata.ca) and bottom seawater salinity (Baldoni et al. 2015; Piola et al. 2018; http://navigator.oceansdata.ca), and the annual average surface seawater chlorophyll a concentration (https://doi.org/10.5067/AQUA/MODIS/L3B/CHL/2018) at each sampling site (Table 1). Spearman's rank correlation coefficients were calculated to estimate the relationship between the variables using the R function cor, and showed that some environmental variables were statistically correlated (surface vs. bottom seawater temperature: r = 0.84; surface seawater temperature vs. bottom seawater salinity: r = 0.9; bottom seawater temperature vs. bottom seawater salinity: r = 0.77; P < 0.05). Therefore, only three uncorrelated variables, i.e., bottom seawater temperature, surface seawater salinity and chlorophyll a concentration, were used for this analysis. BayPass uses BFs (Bayes Factors) to associate SNPs with population specific covariates. Associations with BF ≥ 10 were considered as statistically significant according to Jeffreys’ rule (Jeffreys 1961).
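
The screening of environmental variables for correlation can be illustrated with a self-contained Python sketch of Spearman's rank correlation (Pearson correlation computed on average ranks). This is not the R function cor used in the study, and the environmental values below are hypothetical placeholders rather than the data in Table 1.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five sites (not the data used in this study).
temperature = [8.0, 9.5, 11.0, 14.0, 21.0]
salinity = [33.2, 33.4, 33.5, 33.9, 35.8]
chlorophyll = [2.1, 0.9, 1.5, 0.4, 1.0]
print(spearman(temperature, salinity), spearman(temperature, chlorophyll))
```

In such a screen, one variable of each strongly correlated pair would be dropped before the association analysis.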

Table 1 Summary of annual average environmental variables (Baldoni et al. 2015; https://doi.org/10.5067/AQUA/MODIS/L3B/CHL/2018; https://doi.org/10.5067/MODSA-AN4D9; https://doi.org/10.5067/SMP50-3TMCS; http://navigator.oceansdata.ca; Piola et al. 2018)

Pcadapt R package (version 4, Privé et al. 2020) was also used for outlier detection. This is a statistical tool to detect genetic markers involved in biological adaptation based on Principal Component Analysis of individual genotype data. To choose the number of principal components (K) to run pcadapt we followed Cattell’s graphical rule, i.e., we kept PCs to the left of the straight line of the “scree plot” (Supplementary Fig. S1.a) that displays the eigenvalues in descending order (Luu et al. 2017). In addition, “score plots” were used to assess the value of K that corresponds to a relevant level of population structure (Luu et al. 2017, https://bcm-uga.github.io/pcadapt/articles/pcadapt.html; Supplementary Fig. S1.b). Once the “pcadapt” function was executed with the optimal number of PCs, the R package qvalue (version 2.22, Storey et al. 2020) transformed the P-values into q-values. For an α = 0.01 value, SNPs with q-values less than α were considered as outliers with an expected false discovery rate bounded by α (Luu et al. 2017). Only those loci detected consistently as candidates by BayPass and pcadapt were retained.

Potentially adaptive loci were identified in the catalog of RAD loci obtained from Stacks, and blasted against GenBank (http://blast.ncbi.nlm.nih.gov/Blast.cgi). The search focused on the protein database using the nucleotide query (BLASTX 2.13.0+, Altschul et al. 1997) and the non-redundant protein sequence (nr) database. Sequences with percentage identity and query cover of 75% or more were reported. The possible function of matched proteins was searched in the UniProtKB database (The UniProt Consortium 2021). Finally, we verified whether allelic variants of these sequences were synonymous or non-synonymous, and the allele frequencies per population were graphed.
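
Checking whether a variant is synonymous or non-synonymous amounts to translating the codon that carries each allele and comparing the resulting amino acids. The Python sketch below does this with the standard genetic code; the example codons are hypothetical, chosen only to illustrate the two outcomes, and the reading frame is assumed to be known (e.g., from the BLASTX alignment).

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon, alt_codon):
    """Return (label, reference amino acid, alternative amino acid)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    label = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return label, ref_aa, alt_aa


# Hypothetical codons: a first-position change and a third-position change.
print(classify_substitution("CTG", "ATG"))
print(classify_substitution("CTG", "CTA"))
```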

The STRUCTURE analysis was also performed after removing potentially selected loci from the dataset.

Results

Sequence data analysis

After demultiplexing, quality filtering and PCR clone removal, a total of 156,136,888 paired reads were retained, with an average of 3,717,544.95 (SD = 1,618,830.36) per sample. The mean depth of coverage for processed samples, calculated by the single end denovo_map pipeline, was 81.38x (SD = 28.40). The widely shared loci (R 0.80) were 21,981, composed of 10,713 variant sites. A total of 1740 loci met all the filters specified in “Materials and methods” (i.e., p 8, r 0.80, min-maf 0.035, max-obs-het 0.70, loci with ≤ 4 SNPs, only the first SNP per locus, and max-missing 0.90). Removal of loci under LD resulted in a 658 loci matrix.

Genetic diversity

Estimates of HO and HE over the 658 loci varied across sampling locations (Table 2). Positive FIS values indicated that individuals within a population were related; however, FIS values were low (FIS < 0.05), indicating random mating within sampling locations (Hartl and Clark 1997), except for RJ, which showed average values (FIS = 0.0835, Table 2) according to the scale proposed by Hartl (2000). The individuals of UR showed the highest values of Pi, HO, HE and R, indicating the highest genetic diversity among the eight locations, whereas RG showed the lowest values. RG and RA displayed the lowest percentages of polymorphic loci (34.35% and 36.93%, respectively), suggesting that many SNPs were monomorphic with one allele fixed; however, this was probably due to the small sample size at these locations. Allelic richness was low and similar at all sites (R = 1.344–1.410). The number of private alleles per population ranged from 0 (SGJ, JS, RA and RG) to 5 (RJ) (Table 2).

Table 2 Summary of genetic diversity statistics (658 SNPs) for Pleoticus muelleri from eight sampling locations (SGJ: south of Golfo San Jorge—NGJ: north of Golfo San Jorge—JS: Jueves Santo—RA: Rawson—DE: Deseado—UR: Punta del Diablo—RG: Rio Grande do Sul—RJ: Macaé)

Pairwise FSTs revealed that genetic differentiation between pairs of Argentinian populations was low and not significant (Table 3). However, these populations showed significant differences with the populations from UR and RJ (Table 3). In addition, UR also differed significantly from RJ (P < 0.01) (Table 3).

Table 3 Pairwise FST values over 658 SNPs across six sampling locations of Pleoticus muelleri

The NJ dendrogram showed two well supported clusters (bootstrap support over 98%, Fig. 2), one including all Argentine samples and the other one all RJ samples. In turn, UR and RG individuals were scattered in both groups.

Fig. 2

NJ dendrogram of individuals of Pleoticus muelleri based on the analysis of 1740 RAD (Restriction site Associated DNA) loci. Individuals are identified with sampling location followed by sex (M male, F female) and the sample identification number. SGJ south of Golfo San Jorge, NGJ north of Golfo San Jorge, JS Jueves Santo, RA Rawson, DE Deseado, UR Punta del Diablo, RG Rio Grande do Sul, RJ Macaé

Population structure analysis

The DAPC plot (Fig. 3) showed differentiation among individuals consistent with FST values and the NJ analysis (Table 3; Fig. 2). The “find.clusters” function identified two genetic clusters (Supplementary Table S3), one formed by all Argentine samples and the other by all Brazilian samples. Most of the UR samples were also in this last cluster, except for one (i.e., UR_F23). Then, the DAPC analysis was performed using K = 2 (Fig. 3a). Based on the DF1 (eigenvalue = 552.7) DAPC calculated the membership probabilities of each individual for the different groups (Supplementary Table S4) which can be interpreted as proximities of individuals to the different clusters. All individuals had probabilities of 1 of belonging to cluster 1 and 0 to cluster 2, or vice versa, except for three individuals from UR and one from RG that had membership probabilities between 0 and 1 (see Supplementary Table S4), suggesting possible admixture.

Fig. 3

Genomic variation by non-parametric Discriminant Analysis of Principal Components (DAPC) of individuals of Pleoticus muelleri based on the analysis of 1740 RAD (Restriction site Associated DNA) loci and 21 principal components (explaining 60.9% of the total variance). a Density plot generated using K = 2 (red and black clusters), result of the “find.clusters” function (DF1 eigenvalue = 552.67). b Density plot of individual scores on the first discriminant function (eigenvalue = 124.97) using preassigned populations according to sampling sites. c Scatterplot of the first and second discriminant functions (eigenvalue = 9.87) using preassigned populations according to sampling sites. SGJ south of Golfo San Jorge, NGJ north of Golfo San Jorge, JS Jueves Santo, RA Rawson, DE Deseado, UR Punta del Diablo, RG Rio Grande do Sul, RJ Macaé

A second DAPC performed with samples preassigned to the original sampling locations yielded similar results. The DF1 (eigenvalue = 124.97) showed that individuals are genetically structured in groups arranged along a latitudinal gradient (Fig. 3), with the Argentinian sampling sites largely overlapping (Fig. 3b, c). The second discriminant function (eigenvalue = 9.87) contributed little to explaining the differentiation between samples. Four loci contributed most to the genetic differentiation linked to the geographic distribution (DF1) (Fig. 4a). The allele frequency graphs showed that the allele frequencies of these loci changed throughout the geographic distribution of this species (Fig. 4b).

Fig. 4

SNPs that contributed to explain the genetic structure along the latitudinal gradient. a Loading plot based on 1740 RAD (Restriction site Associated DNA) loci of Pleoticus muelleri. Numbers above peaks indicate the identification number (ID) of the most contributing SNPs (above a threshold loading of 0.006) to the first discriminant function (eigenvalue = 124.97) of the DAPC analysis. b Allele frequencies per population of each contributing locus. SNP #: SNP ID in the total catalog. SGJ south of Golfo San Jorge, NGJ north of Golfo San Jorge, JS Jueves Santo, RA Rawson, DE Deseado, UR Punta del Diablo, RG Rio Grande do Sul, RJ Macaé

The STRUCTURE analysis revealed the presence of two genetic clusters (K = 2; Fig. 1). Individuals found in the south of P. muelleri's distribution (SGJ, NGJ, JS, RA and DE) had an estimated membership to one of the two clusters (green color, Fig. 1) of > 0.850 (except for one individual in JS, which had a membership of 0.636), while individuals in the north (RJ) had an estimated membership > 0.883 to the other cluster (red color, Fig. 1), indicating correspondence between the genetic groups detected and the geographic origin of individuals (Fig. 1). In general, individuals found in in-between locations (UR and RG) showed intermediate estimated membership values to both clusters, suggesting admixture of these two neighbouring groups (Fig. 1).

Detection of loci associated with environmental variables

BayPass identified 20 loci correlated with environmental variables (Table 4). Of these, 17 were associated with bottom seawater temperature, two with surface seawater salinity, and one with chlorophyll a (Table 4). For the analysis using pcadapt, two principal components (K) were retained (see Supplementary Fig. S1.a and b). Pcadapt detected 13 possible outlier loci (q-value < 0.01, Supplementary Fig. S1c), nine of which coincided with loci identified by BayPass as correlated with bottom seawater temperature (Table 4, Supplementary Table S5). The remaining four loci were not identified by BayPass (IDs 874, 1125, 1554 and 13142).

Table 4 BayPass results of the STD covariate model

The four loci that were identified by DAPC as the major contributors to the genetic differentiation in DF1 (i.e., IDs 4054, 9481, 10209 and 11727; Fig. 4a) were also detected by BayPass and pcadapt (Table 4).

Only one of the nine potentially adaptive RAD loci identified by BayPass and pcadapt (ID 1240, Table 4) blasted with a high percentage of coverage and identity (≥ 75%) in the GenBank protein search (Supplementary Table S6). It matched the protein monocarboxylate transporter 13-like or 12-like of P. chinensis, P. japonicus and P. monodon (Penaeoidea superfamily) (Supplementary Table S6). Allelic variants found for locus 1240 throughout the geographic distribution of P. muelleri (see Supplementary Fig. S2) were non-synonymous, i.e., they coded for either leucine or methionine. Monocarboxylate transporters belong to a superfamily of membrane transport proteins (Major Facilitator Superfamily, MFS) that facilitate the movement of solutes through cell membranes in response to chemiosmotic gradients (Pao et al. 1998). Some authors have found that this type of protein can be involved in osmoregulation (Ertl et al. 2019; McCarty et al. 2022), hypoxia and acid–base regulation in marine invertebrates (Tresguerres et al. 2020).

The STRUCTURE analysis performed after removing the potentially selected loci from the dataset (656 loci), gave similar results to the previous ones (K = 2, Supplementary Fig. S3).

Discussion

This work is the first attempt to use genome wide data to assess broad-scale population differentiation and genetic structure of Pleoticus muelleri. A total of 1740 loci were genotyped using the RADSeq technique. In contrast to what we hypothesized, all our results pointed to the existence of population structure along the studied geographic range of P. muelleri. Two genetic clusters were identified, one found in the south (SGJ, NGJ, JS, RA and DE), and the other in the north of its distribution (RJ). Individuals found in intermediate locations (UR and RG) suggested the existence of admixture between these two neighbouring groups. Nine potentially adaptive loci were correlated with bottom seawater temperature, and/or with variables correlated with it, i.e., surface seawater temperature and bottom seawater salinity, and four of them explained much of the genetic structure found along the geographic distribution of this species.

Pleoticus muelleri inhabits the continental shelf off eastern South America, which is characterized by a strong contrast in water mass characteristics. In the upper layer, it is influenced by the continental discharge of the Río de la Plata (34° S) and, more locally, of the Patos Lagoon (32° S). The position of this low salinity plume (S < 33) varies seasonally, reaching 28° S during winter, while it is limited to the south of 32° S in summer. Below this low salinity layer, the relatively cold, fresh Subantarctic Shelf Water is found south of 33° S; whereas to the north, warm, salty Subtropical Shelf Water occurs. Between both extends a relatively narrow frontal zone, i.e. the Subtropical Shelf Front. The front has approximately a N-S direction, and appears as a shelf extension of the confluence of the subtropical Brazil Current and the subantarctic Malvinas Current, situated near the mouth of the Río de La Plata (Piola et al. 2000; Matano et al. 2010). All these conditions can be contributing to the reduction of gene flow observed between northern and southern populations of Pleoticus muelleri.

Certain flow fields, such as the one generated by converging marine currents, may be capable of acting as barriers to dispersal in benthic marine species with planktonic larvae, even in the absence of other constraints (Gaylord and Gaines 2000; Hohenlohe 2003; Pelc et al. 2009). Furthermore, the association of some SNPs with environmental variables, identified through BayPass, suggested the existence of divergent selection between northern and southern populations that could be related to bottom seawater temperature, and/or to variables correlated with it, i.e., surface seawater temperature and bottom seawater salinity. Although this association does not necessarily imply a causal relationship, it suggests that temperature and/or salinity may act as selective pressures in the genetic divergence observed.

Changes in temperature and salinity can affect the growth rate and physiological functions of various marine invertebrates, such as shrimp, oysters, mussels and clams (Ertl et al. 2019). In addition, variations in water temperature can modify the breeding season and induce maturation and spawning (Sancinetti et al. 2019). The candidate locus that matched monocarboxylate transporters in the protein search (i.e., 1240) showed non-synonymous allele frequency variation between northern and southern populations of P. muelleri. Monocarboxylate transporters have been proposed to have a role in osmoregulation (Ertl et al. 2019; McCarty et al. 2022), hypoxia and acid–base regulation in marine invertebrates (Tresguerres et al. 2020). However, the available information is still too scarce to confirm the identity and function of this locus, the potential adaptive advantage of its different alleles and, therefore, the underlying adaptive mechanism. Finally, we cannot rule out that other environmental factors not considered in this study, such as grain particle size, organic matter content and sediment texture, might be involved in adaptation (Ruello 1973; Costa and Fransozo 2004; Costa et al. 2004; Sancinetti et al. 2014). Additional studies, such as whole genome sequencing, can help investigate the adaptive mechanism involved in P. muelleri's differentiation, enabling the discovery of genes linked to the candidate loci reported in this study that may have a known function in related species.

The fact that the STRUCTURE analysis including only the putatively neutral loci gave similar results to the one performed with all loci reveals that genetic differentiation affects more loci than just the selected ones. This could indicate either that the lack of gene flow as a result of selection has already affected neutral loci (Tigano and Friesen 2016), or that differentiation originated because of the presence of a physical barrier to dispersal, making it difficult to distinguish between ecologically driven divergence and allopatric differentiation with subsequent adaptation (Teske et al. 2019). Therefore, we cannot know whether the observed intermediate zone constitutes a primary or a secondary contact zone. To determine the scope of the hybrid area, samples from intermediate zones will be studied.

To inspect whether the inclusion of localities with small sample size (N = 2), i.e., RA and RG, had an effect on population structure analyses, DAPC and STRUCTURE were run without including these samples. Results obtained were the same as those obtained when including all samples. They revealed the existence of two clusters, one in the south and one in the north of P. muelleri's distribution, with individuals found in the intermediate location (UR) suggesting the existence of admixture between these two neighbouring groups (results not shown).

Overall, these novel markers constitute a powerful tool for studies of genetic structure and population assignment in P. muelleri. This high-resolution genetic information is expected to be useful in improving conservation and management policies for this species, since the underestimation in the number of stocks can result in the loss of intraspecific diversity (Viñas et al. 2011) due to overexploitation, harming the adaptation capacity to changes or modifications in the environment (Villaseñor Gómez 2005; Rocha and Gasca-Pineda 2007). Resilience and sustainability of commercially relevant species depend on identifying population structure and adaptive diversity to preserve all discrete biological units of fisheries resources (Hilborn et al. 2003; Mullins et al. 2018; Clucas et al. 2019).