Introduction

Species delimitation

Accurate species delimitation analyses are a keystone for most aspects of biological sciences including the recognition of biodiversity hotspots, detecting declines in populations, assessing environmental impacts, developing effective management strategies and monitoring the trade of endangered species (Krishnamurthy and Francis 2012). Furthermore, a correct understanding of species diversity and distribution ranges is also fundamental in the study of speciation mechanisms (Tan et al. 2021). Hence, species is the basic unit of analysis in many scientific disciplines in which proper identification of specimens is the crucial first step in a large number of biological analyses (da Silva et al. 2018; De Queiroz 2007).

Traditionally, specimen identification has been based primarily on the description and comparison of morphological characters to define and address species (Vitecek et al. 2017). However, an integrative approach based on independent sources of information (anatomical, ecological, geographical, molecular, morphological, physiological, etc.) has been proposed as an improved method for delineation and description of taxa (Dayrat 2005). On this basis, morphology-based approaches have been complemented in the last decades with the arrival of molecular techniques, such as the use of a fragment of the mitochondrial protein-coding gene cytochrome oxidase subunit I (COI) as a DNA barcode to identify specimens when compared with an available dataset (DNA barcoding) (Hebert et al. 2003). Nevertheless, the use of single-locus analyses can lead to uncertainties or inaccuracies due to the biological nature of the marker used (events of introgression, hybridization, etc.) or differences between single-gene and species trees (Krishnamurthy and Francis 2012). Nowadays, DNA-based species delimitation analyses rely mostly on the study of the branching pattern of multi-locus phylogenetic trees to define one and/or several thresholds to group the specimens in species-like clusters, such as the poisson tree processes (PTP) (Zhang et al. 2013). This method is based on the hypothesis of a higher number of nucleotide substitutions between species than within species and has been improved into a multi-rate poisson tree processes (mPTP), to consider possible variations in the substitution ratio due to differences in the evolutionary process among branches (Kapli et al. 2017). These analyses are available in a web server which includes another modification: adding Bayesian support values at the poisson tree processes (bPTP) to delimited species thanks to Markov Chain Monte Carlo sampling; where a higher Bayesian support value on a node indicates that all descendants from that node are more likely to belong to one species (Zhang et al. 2013).

In addition, the coalescent theory has taken a place in species delimitation analyses, and its use has grown in the last years (Fujita et al. 2012). One of the most popular algorithms is general mixed Yule coalescent (GMYC), which differentiates species by estimating the node with the most recent common ancestor in each species, and therefore, estimating a threshold time before which all nodes reflect diversification events (Pons et al. 2006).

An improvement of this approach is the use of several points in a phylogenetic tree, the multiple GMYC (mGMYC), that allows for variable transitions, from coalescent to speciation, among different lineages (Monaghan et al. 2009).

Other strategies for DNA-based species delimitation are available, such as relying on the existence of a gap among the genetic distances calculated from the data, as in the software automatic barcode gap discovery (ABGD) (Puillandre et al. 2012). In ABGD, several thresholds are tested to obtain distinct species-like clusters assuming the existence of a gap among the genetic distances calculated from the data, when values within and between groups of individuals are compared (Puillandre et al. 2012).

Genus Gaidropsarus

The genus Gaidropsarus Rafinesque, 1810 is characterized by an impressive ecological diversity despite its otherwise preserved morphology and its moderate number of species. Their main characteristics are an elongated and slender body, one barbel on the chin and two at each anterior nostril in the snout, and a first dorsal fin with a ray followed by a row of small fleshy filaments (Cohen et al. 1990). The position of the genus Gaidropsarus in the gadiforms phylogeny has been a point of debate for a long time, with several hypotheses proposed over the years, being alternatively placed in the families Lotidae (Froese and Pauly 2020) and Gadidae (Nelson 2016), or even creating a specific family called Gaidropsaridae, which seems to be definitive (Roa-Varón et al. 2020).

There are currently a total of 13 species in this genus, of which 9 are found in the North Atlantic Ocean and the Mediterranean Sea (Svetovidov 1986a,b): Gaidropsarus argentatus (Reinhardt, 1837), Gaidropsarus biscayensis (Collett, 1890), Gaidropsarus ensis (Reinhardt, 1837), Gaidropsarus granti (Regan, 1903), Gaidropsarus guttatus (Collett, 1890), Gaidropsarus macrophthalmus (Günther, 1867), Gaidropsarus mediterraneus (Linnaeus, 1758) Gaidropsarus vulgaris (Cloquet, 1824) and Gaidropsarus gallaeciae (Bañon, et al. 2022). In reality, true diversity of species in this genus is not yet fully known, as evidenced by the recent description of new species, such as Gaidropsarus pakhorukovi (Shcherbachev 1995), Gaidropsarus mauli (Biscoito and Saldanha 2018) and Gaidropsarus gallaeciae (Bañón et al. 2022). In fact, it is sometimes difficult to differentiate some Gaidropsarus species from each other because of their well-conserved morphology (Svetovidov 1986b). Moreover, there is a general lack of knowledge of the morphological variability of this genus due to the absence of representative specimens in the museums (Balushkin 2009). Thus, it is not surprising that when the morphology-based available knowledge has been compared with molecular data, discrepancies have arisen (Barros-García et al. 2018). The use of genetic distances among 171 COI sequences indicated: (i) a possible synonymy between the Atlantic G. macrophthalmus and the Mediterranean endemism G. biscayensis, (ii) a possible synonymy between G. mediterraneus and G. guttatus, a Macaronesian endemic, (iii) a low genetic divergence between two boreal deep-sea species (G. argentatus and G. ensis), and between G. granti and G. vulgaris, and (iv) the existence of several putative unknown deep-sea Gaidropsarus species in the North Atlantic Ocean (Barros-García et al. 2018).

Objectives

This investigation aims to apply single and multi-locus species delimitation analyses on a dataset obtained from several species of Gaidropsarus from the North Atlantic Ocean and the Mediterranean Sea, in an attempt to resolve certain taxonomic discrepancies previously found.

Materials and methods

Sampling

The specimens (n = 44) were captured in several locations of the North Atlantic Ocean and the Mediterranean Sea (Fig. 1). As a general rule, specimens were immediately frozen onboard and a muscle sample was obtained in the laboratory and preserved in 95% ethanol at −20 ºC. Specimens were identified to the species level by examining morphological features following the available taxonomic literature (Svetovidov 1986a,b). Moreover, COI nucleotide sequences from the same specimens were obtained and compared in the available repositories (BOLD systems and GenBank) to cross-check the morphological identification. The specimens were deposited in the “Museo de Historia Natural da Universidade de Santiago de Compostela” (Santiago de Compostela, Spain) and the “Colección de Fauna Marina del Centro Oceanográfico de Málaga” (CFM-IEOMA; Málaga, Spain). All the sequences of the different genetic markers used in the present study are publicly available in GenBank (Table 1).

Fig. 1
figure 1

Map of the sampled points in the North Atlantic Ocean and the Mediterranean Sea. Each point shows the species found and the number of specimens of each one between brackets

Table 1 Species-level identification based on morphology, specimen ID, and accession numbers for each specimen of Gaidropsarus used in this study

DNA extraction, PCR amplification and sequencing

Total DNA was purified from 25 mg of muscle tissue taken from each specimen according to the spin column protocol of the Tissue DNA Extraction Kit (Omega-Biotek). Parts of three mitochondrial: cytochrome oxidase subunit I (COI), cytochrome B (CytB) and NADH dehydrogenase 2 (NAD2), and two nuclear markers: rhodopsine (Rho) and Zic family member 1 (ZIC1) were amplified by PCR employing specific conditions and primers (supplemental material Table 1). Amplicons were sequenced with the BigDye Terminator v3.1 Cycle Sequencing Kit, and the resulting products were resolved in an ABI3130 Genetic Analyzer at the “Centro de Apoyo Científico y Tecnológico a la Investigación” (CACTI, University of Vigo). Forward and reverse chromatograms were visually inspected and finally assembled with SEQSCAPE v. 2.5 (applied biosystems).

For the subsequent analyses, first each marker was considered independently, and then clustered in three independent groups of datasets; “mtDNA” (COI + CytB + NAD2), “nDNA” (Rho + ZIC1), and “concatenated” (the sum of mtDNA + nDNA).

Phylogenetic analyses

The final dataset comprised 9 putative Gaidropsarus species and 2 outgroup species; cod (Gadus morhua) and the spotted gar (Lepisosteus oculatus) for a total of 220 Gaidropsarus sequences plus the outgroups mined from Genbank.

Since most of the species delineation analyses applied rely on the phylogenetic tools used, trees obtained by Bayesian inference (BI) were used for GMYC analyses, and maximum likelihood (ML) trees for bPTP analyses, as previously described (Tang et al. 2014).

The best partitioning scheme was estimated with PartitionFinder2 (Lanfear et al. 2016), spanning the range from a single partition for the entire alignment to each gene and codon position treated as a partition (supplemental material Table 2). For every partition, the most appropriate nucleotide substitution model was selected using jModelTest 2.1.8 (Darriba et al. 2012), based on the Akaike Information Criterion (AIC; Akaike 1973).

The ML analysis was performed using RAxMLv.8.2.10, conducting 1000 rapid bootstrap replicates, and the best-scoring ML tree was evaluated under the GTRGAMMA model implemented on the CIPRES Science Gateway portal (Miller et al. 2010).

Bayesian inference was used to build a phylogenetic tree using BEAST 2.6.2 (Bouckaert et al. 2019). BEAUti was used to assemble the XML files with the following settings: the models used were those obtained from jModelTest 2.1.8 following the partitions suggested by PartitionFinder2 with empirical frequencies, Yule model was selected as tree prior while all other variables were not modified from their default values. Markov Chain Monte Carlo (MCMC) chain length was set to 5 × 107, logging every 1000 samples. Three different runs were carried out for a final number of 50,000 sampled trees/run. To check the convergence of the analyses, the resulting trace files were inspected with TRACER.1.7.0 (Rambaut et al. 2018). The runs were merged using Logcombiner, discarding the first 25% of each run as burn-in. A maximum clade credibility tree with posterior probability limit set at 0.90 and mean node heights was constructed with TreeAnnotator v.2.4.5 and visualized with FigTree v.1.4.3 (http://tree.bio.ed.ac.uk/software/ figtree/).

Species delineation analyses

The online version of automatic barcode gap discovery software (ABGD; available at https://bioinfo.mnhn.fr/abi/public/abgd/abgdweb.html) was used, based on calculated p distances using the default priors for the relative gap width (1.5), Pmin (0.001) and Pmax (0.1) (Puillandre et al. 2012). All analyses were carried out with 10 steps and 20 bins of distance distribution.

To estimate the number of putative species in the data, a variation of the poisson tree processes (PTP) was used. The bPTP analysis adds Bayesian support values to delimited species on the input tree. The analyses were carried out in the webserver (https://species.h-its.org/ptp/) with default settings removing the outgroup to optimize the delimitation results (Zhang et al. 2013).

The BI trees obtained were analysed in the webserver (https://species.h-its.org/gmyc/) for either single and multiple threshold approaches for the generalized mixed Yule-coalescent (GMYC) method, which identifies the time threshold that defines coalescent or speciation processes on the branching patterns of ultrametric trees (Pons et al. 2006).

Results

mtDNA data

Phylogenetic analyses of the combined mitochondrial markers showed a fully resolved topology (Fig. 2a; supplemental material Fig. S1). Both G. mediterraneus and G. guttatus were located at the same branch with mixed individuals of both species, in two different but closely related clades.

Fig. 2
figure 2

Collapsed consensus trees obtained with Bayesian inference and maximum likelihood for three combined datasets; a mitochondrial DNA markers (COI, CytB and ND2), b nuclear DNA markers (Rhodopsin and ZIC1), and c mtDNA and nDNA concatenated. Nodes with Bayesian posterior probability and bootstrap values over 0.9 are highlighted with a white diamond. Each collapsed branch show the species represented in each one. The results of the delimitation analyses are included; automatic barcode gap discovery (ABGD) in black, bPTP in blue, GMYC single threshold in green and multiple threshold GMYC in red. Clusters obtained in bPTP with a Bayesian support over 0.9 are highlighted with an asterisk. The outgroup individuals have been discarded in this figure since they have been not used in the species delimitation analyses

In addition, G. granti and G. vulgaris showed a close relationship as well as G. argentatus and G. ensis, while G. biscayensis and G. macrophthalmus are mixed (Fig. 2a). On the contrary, the six individuals assigned to G. gallaeciae clustered together in an independent branch.

Five out of the nine putative species showed no incongruences among all the delimitation analyses: G. argentatus, G. ensis, G. granti, G. gallaeciae and G. vulgaris. Among the others, four analyses (ABGD, bPTP and GMYC) indicated a single cluster between G. biscayensis and G. macropthalmus (Fig. 2a). Only the GMYC with the multi-threshold approach divided these species into two independent clusters, and yet each cluster contained sequences from both species (supplemental material Fig. S1). The same results were obtained for the species G. guttatus and G. mediterraneus, with the difference that, in this case, the phylogenetic analyses divided them into two well-supported clusters combining sequences from both species (Fig. 2a). All the clusters obtained with bPTP showed a high level of confidence of Bayesian support (> 0.9) except for the cluster which brings together G. guttatus and G. mediterraneus. The results for the mtDNA markers independently analysed are summarized in the supplemental material (supplemental material Fig. S2).

nDNA data

Regarding the nuclear markers, interestingly ZIC1 showed a highly conserved nature with all substitutions found in Gaidropsarus being synonymous and an average genetic distance among all samples of 0.7%, whereas Rho resulted in a more variable marker, with an average genetic distance of 3.6% among all samples.

Phylogenetic analyses of nDNA markers showed a partially resolved phylogeny with G. guttatus and G. mediterraneus splitting from the other species in the Gaidropsarus lineage (Fig. 2b; supplemental material Fig. S3). Gaidropsarus granti and G. gallaeciae are well supported as well as another two clades formed by the combination of G. argentatus/G. ensis, and G. biscayensis/G. macrophthalmus, respectively. The branches not supported in the tree topology included sequences from G. vulgaris, G. argentatus (GDT002) and G. guttatus specimens (GGT003, GGT005), and one allele from GGT002 and GGT004 (Fig. 2b; supplemental material Fig. S3).

Species delimitation analyses showed a gradient in the results; ABGD differentiated only two groups, the first including G. guttatus and G. mediterraneus and a larger second group with the rest of the sequences (Fig. 2b). Both bPTP and GMYC analyses showed the joint grouping of G. argentatus and G. ensis, as well as the independence of G. granti and the division between G. guttatus and G. mediterraneus. Single threshold GMYC and bPTP agreed in one cluster for G. gallaeciae and G. biscayensis/macrophthalmus, while mGMYC divided the former into two groups and the latter in five. The unresolved part of the consensus tree is considered a single cluster by both GMYC and mGMYC, while bPTP separated it, including G. argentatus and G. guttatus sequences. The results for the independently analysed nDNA markers are graphically summarized in the supplemental material (supplemental material Fig. S4 and Fig. S5).

Concatenated data

The phylogenetic trees obtained with the dataset comprising all genetic markers showed a fully resolved topology, with G. mediterraneus and G. guttatus splitting firstly in the Gaidropsarus clade (Fig. 2c; supplemental material Fig. S6). Sequences grouped in two independent clusters, one comprising sequences of both species and the other of G. guttatus specimens only, but with low statistic support. Gaidropsarus argentatus and G. ensis showed a close relationship as G. granti and G. vulgaris. All the individuals belonging to G. biscayensis and G. macrophthalmus clustered together in the same way as the specimens of G. gallaeciae (Fig. 2c).

The species delimitation analyses were fully congruent only in the case of G. gallaeciae (Fig. 2c). ABGD and bPTP analyses were able to differentiate between the pairs of species formed by G. argentatus/G. ensis, and G. granti/G. vulgaris, but only the latter with high confidence values in bPTP. The GMYC algorithm, in both single and multi-threshold approaches, combined them as single units. Conversely, the synonymy between G. macrophthalmus and G. biscayensis is supported by three of four analyses performed (ABGD, bPTP and GMYC), while mGMYC divided them into two clusters but mixing sequences from both species (Fig. 2c; supplemental material Fig. 6). Both ABGD and GMYC algorithms supported the existence of a single cluster for the grouping of G. mediterraneus and G. guttatus specimens, while bPTP and mGMYC differentiated the two branches obtained in the phylogenetic analyses (Fig. 2c).

Discussion

Phylogenetic relationships

Phylogenetic trees obtained through single-locus mitochondrial data were almost identical since all mitochondrial markers are inherited as a single locus. In addition, and as expected, the combination of the mtDNA markers in a multi-locus dataset improved the statistical support of the phylogenetic trees obtained when compared with the single-locus ones (Janko et al. 2011).

The phylogenetic tree obtained with mtDNA showed longer branches and better statistically supported clusters than those obtained with nDNA. These results could be explained in a scenario of recent divergence among some of the species and by the higher substitution rate observed in mtDNA compared to nDNA (Brown et al. 1979). Nevertheless, both mtDNA and nDNA strongly acknowledged the close relationship between G. mediterraneus and G. guttatus in the Gaidropsarus lineage in agreement with previous studies, with the remaining species in a second group (Francisco et al. 2014).

Gaidropsarus biscayensisGaidropsarus macrophthalmus

Previously, it has been proposed that G. biscayensis could be distinguished morphologically from G. macrophthalmus by the number of second dorsal fin rays and by the anal fin ranges (Iwamoto and Cohen 2016). However, in further taxonomical revisions these magnitudes were found to overlap, invalidating them as distinctive (Barros-García et al. 2018). This geographical scenario was also observed between the Atlantic Lepidion eques (Günther 1887) and the Mediterranean Lepidion lepidion (Risso, 1810), resulting in the latter as the only valid species with an Atlantic-Mediterranean distribution (Bañón et al. 2013; Barros-García et al. 2016). In both cases, G. biscayensisG. macrophtalmus and L. equesL. lepidion, the slight morphological differences recorded could be explained by the different environmental conditions, particularly temperature, between the Atlantic Ocean and the Mediterranean Sea (Bañón et al. 2013).

The analyses carried out in the present study showed a general agreement in considering G. biscayensis and G. macrophthalmus as a single species, despite the different approaches and dataset combinations tested. It is true that, since mtDNA has been included in the analyses, the possibility of hybridization/introgression events must be considered (Harrison and Larson 2014). However, no evidence of mixed alleles has been found in the nuclear markers considered, so these possibilities can be discarded. Present results are in agreement with previous results obtained by combining morphological data and DNA Barcoding (Barros-García et al. 2018). Therefore, G. biscayensis should be considered a junior synonym of G. macrophthalmus, resulting in a single species with an Atlantic–Mediterranean distribution, as occur with other Gaidropsarus species.

Gaidropsarus guttatusGaidropsarus mediterraneus

During their original descriptions, the high similarity between G. guttatus and G. mediterraneus was reported (Collet 1905). Recently, a revision of the morphological data available showed overlaps in the main characters measurements used to distinguish between these two species (Barros-García et al. 2018). Only colour patterns and distribution areas remained as distinguishing features (Cohen and Russo 1979). Therefore, it is not surprising that a DNA barcoding analysis was not able to differentiate them, producing a single cluster where individuals of both species were mixed (Barros-García et al. 2018).

Far from helping to clarify the taxonomic problem of these two species, the nDNA analysis showed incongruent results when compared with mtDNA, misplacing some of these individuals with other species. It is well known that nDNA possess specific characteristics such as a lower mutation rate and higher effective population sizes than compared with mtDNA which slows down the time required for divergence (Torres-Hernández et al. 2022). The different responses of the two sets of DNA data could be due to a variety of biological phenomena such as introgression of the mtDNA or incomplete lineage sorting at the nDNA level. Similar results with mixed individuals in nDNA-based trees have been found in other fishes like Nemacheilidae (Cypriniformes) (Dvořák et al. 2022). Taking into account the mtDNA results, an introgression event should be unlikely, but further analyses will be required to clarify the evolutionary phenomenon behind these results (Kornilios et al. 2020).

In any case, the undeniable fact is that, apart from colouring, neither the examination of morphological characters nor the mtDNA sequences comparisons can distinguish between specimens of one species or another. However, taking into account that not all the distribution areas of these species are represented in the data, caution should be exercised when interpreting the results obtained. Therefore, further analyses will be required to fully accept the synonymy between G. guttatus and G. mediterraneus.

Gaidropsarus argentatusGaidropsarus ensis

The boreal Gaidropsarus argentatus and G. ensis are deep-sea species with an overlap in their North Atlantic Ocean distribution areas. Although their colouration and morphology are similar, they can be easily differentiated by the length of the first ray of the first dorsal fin, which is longer than its head in G. ensis and shorter in G. argentatus (Svetovidov 1986b). Previous DNA barcoding and phylogenetic studies indicated a close relationship between these two species (Francisco et al. 2014; Barros-García et al. 2018). Interestingly, species delimitation analyses based on mtDNA clearly distinguished them, whereas nDNA markers did not, since phylogenetic analyses based on nDNA grouped them similarly to G. guttatus and G. mediterraneus. In a recent speciation scenario, nDNA trees may fail to differentiate between species since not enough time has passed to accumulate substitutions in nuclear markers compared to the faster-evolving mtDNA genes (Brown et al. 1979). Therefore, this discrepancy observed in the species delimitation analyses performed with mitochondrial and nuclear DNA could be interpreted as evidence of a recent speciation phenomenon between G. argentatus and G. ensis.

Gaidropsarus grantiGaidropsarus vulgaris

While morphological examination places G. granti as a species close to G. mediterraneus and G. guttatus (Svetovidov 1986a), molecular data place it close to G. vulgaris (Francisco et al. 2014; Barros-García et al. 2018). The latter two species can be differentiated from each other by their colouration pattern, habitat and distribution. Gaidropsarus vulgaris is found in the European continental shelf down to a depth of 120 m, while G. granti is a far less common species, associated with the occurrence of cold-water corals habitats and mainly inhabiting islands and seamounts, at greater depths (Bañón et al. 2020). Analyses relying on mtDNA data clearly distinguished between the two species in a similar way to those performed for the boreal G. argentatus and G. ensis. Despite clustering almost all sequences together, nDNA indicated the independence of G. granti, which could be indicative of a speciation event older than in the previous case of the boreal species, in a way in which nDNA markers would have had some time to begin to acquire species-level substitutions (Brown et al. 1979). Since it is a rare species, the sample size of G. granti in this investigation is low and its related results should be interpreted cautiously. Nevertheless, the analysis of the data set used suggests that its status as an independent species must be maintained although in close relationship with G. vulgaris.

Gaidropsarus gallaeciae

Six deep-water specimens of Gaidropsarus captured in two independent locations in the North Atlantic Ocean did not fit with any current description of Gaidropsarus species. Both mtDNA and nDNA data, when employed in the phylogenetic and species delimitation analyses, indicated that this set of individuals belongs to the same species, different from the other eight nominal species considered in this investigation. These results were confirmed after the morphological comparison of these specimens with the material available, and therefore, described as a new species; Gaidropsarus gallaeciae (Bañón et al. 2022).

Final remarks

The aim of this study was to shed new light on several taxonomic incongruities regarding the known Gaidropsarus species of the North Atlantic Ocean and the Mediterranean Sea. Thus, two possible synonymies among the already recognised species, and the presence of a newly discovered species for the northern hemisphere have been detected. Gaidropsarus biscayensis and G. guttatus were described under the premises of the existence of phylogeographic barriers between the Mediterranean Sea and the Atlantic Ocean for the former, and between European continental waters and the Macaronesian region for the latter. These assumptions have led to an overestimation of Gaidropsarus species in the shallow waters of the North Atlantic Ocean and the Mediterranean Sea. Conversely, the recently discovered Gaidropsarus mauli (Biscoito and Saldanha 2018), and Gaidropsarus gallaeciae (Bañón et al. 2022) indicate the presence of a certain number of yet unknown Gaidropsarus species in the deep-sea ecosystems of the North Atlantic Ocean, despite the relatively good knowledge of the fish fauna of this geographic area (Barros-García et al. 2018; Bañón et al. 2021).

The existence of shared nuclear alleles among different Gaidropsarus species could be evidence of a far more complex evolutionary history than expected for this lineage of teleost fishes. Increasing the number of individuals from more areas and examining more regions of their genomes using next-generation sequencing techniques may be necessary to try to clarify the evolution, and thus the taxonomy, of the North Atlantic and Mediterranean species of the genus Gaidropsarus.