Introduction

DNA barcoding is widely recognised as a valuable molecular tool for fast and accurate species identification and biodiversity assessment (Hebert et al., 2016; Hendrich et al., 2015). In animals, it is based on sequencing and analysing a comparatively short, standardised fragment from the 5′ end of the mitochondrial cytochrome c oxidase I (COI) gene (Hebert et al., 2003; Valentini et al., 2009). This technique relies on the legacy of morphology-based taxonomy because, for the purpose of specimen identification, the barcodes are linked with already known species (Gibbs, 2018). For the method to work well in the identification process, the taxonomic coverage of the sequence database must be almost exhaustive (Ekrem et al., 2007) and the identification success must be tested with new sequences. With the rapid increase in the number of available barcodes, identifications become straightforward, at least in theory: if a specimen falls within a known Barcode Index Number (BIN), it is matched to the corresponding species (except in cases of barcode sharing); if not, it may represent a new or overlooked species (Gibbs, 2018; Ratnasingham & Hebert, 2013). Even though species delimitation using a single locus has many shortcomings (Mallo & Posada, 2016), DNA barcoding still has the immense advantage of being a fast and standardised method that generates compatible datasets and spares biologists from gathering sequences from scratch. In practice, data amassed from all over the globe can be integrated and compared.

For insects, the numerous barcodes for Central Europe were mostly generated by the ‘Barcoding Fauna Bavarica’ (BFB) and the ‘German Barcode of Life’ (GBOL) projects (Morinière et al., 2019), but comprehensive barcode libraries were also published for Northern Europe (e.g. Huemer et al., 2014; Pentinsaari et al., 2014a; Pohjoismäki et al., 2016). Concerning Coleoptera, the largest barcoding efforts in Europe were directed towards particular geographic regions (Hendrich et al., 2015; Pentinsaari et al., 2014a) and ecological (Rougerie et al., 2015) or taxonomic (Baselga et al., 2015; Bergsten et al., 2012a, b; Fossen et al., 2016; Horecka et al., 2017; Lopes et al., 2015; Magoga et al., 2018; Marin et al., 2010; Pentinsaari et al., 2014b; Raupach et al., 2010, 2016, 2018) groups. No comprehensive studies are available for the Mediterranean region except the DNA barcode reference library provided for the Iberian butterflies by Dincă et al. (2015) and Ortiz et al. (2017), and for the Euro-Mediterranean leaf beetles (Baselga et al., 2015; Lopes et al., 2015; Magoga et al., 2018). Southeastern Europe remains poorly covered, owing to limited funding and the limited availability of taxonomic experts. Nevertheless, Romania is the only country in this region where a library of DNA barcodes for almost all known butterflies was created (Dincă et al., 2011).

DNA barcodes are available for many representatives of Cerambycidae due to several large DNA barcoding projects (Grebennikov et al., 2017; Hebert et al., 2016; Hendrich et al., 2015; Pentinsaari et al., 2014a; Rougerie et al., 2015) or various studies that focused on integrative taxonomy (e.g. Etzler et al., 2013; Jin et al., 2019; Kvamme et al., 2012; Li et al., 2016; Wallin et al., 2009), intraspecific diversity (Jeon et al., 2015; Wu et al., 2020) or economic topics (Torres-Vila & Bonal, 2019; Wu et al., 2017). Yet, other studies use publicly available sequences to solve taxonomic issues (Wallin et al., 2017; Zamoroka et al., 2019). Complete mitogenomes are now available for over 100 species of Cerambycidae (Nie et al., 2021; Zhang et al., 2021).

However, almost no species of the tribe Dorcadionini has a published, publicly available DNA barcode. The exception is Iberodorcadion fuliginator (Linnaeus, 1758), one of the northernmost species, whose distribution reaches Germany. It was barcoded during the BFB and GBOL projects (Hendrich et al., 2015; Rulik et al., 2017).

Dorcadionini is a morphologically diverse Palaearctic tribe in the subfamily Lamiinae (Cerambycidae) and consists of six genera (Danilevsky, 2006, 2020; Danilevsky & Kasatkin, 2007; Danilevsky et al., 2005). It is a speciose group of flightless coleopterans, representing about 40% of the European longhorn beetle fauna (Sama et al., 2010), with most species and subspecies being distributed in the south, where the DNA barcoding effort is the lowest.

The species of Dorcadionini from Romania are well understood taxonomically, with most taxonomic issues probably resolved (Dascălu, 2018; Dascălu & Fusu, 2012; Pesarini & Sabbadini, 2010; Sama et al., 2010; Zamoroka, 2019). Several species are represented by more than one subspecies, allowing DNA barcoding at a subspecific level. Being flightless, they have low dispersal capabilities (Baur et al., 2005, 2020) and are therefore prone to significant phylogeographic structure, which might be a problem for molecular species delimitation (Moritz & Cicero, 2004; Mutanen et al., 2012; Sukumaran & Knowles, 2017). Yet, most species in the study area probably immigrated at the beginning of the Holocene (Taberlet et al., 1998; Vitali & Schmitt, 2017) and are hence less genetically structured and less challenging to delimit molecularly (but see “Results”). It follows that Dorcadionini are a good model for testing DNA barcoding success and molecular species delimitation. The aims of the current study were to provide the barcodes for all the species of Dorcadionini (Cerambycidae, Coleoptera) from the Romanian fauna and partly from other European countries and to analyse the effectiveness of DNA barcodes in discriminating 152 specimens representing 21 species of Dorcadion, two species of Neodorcadion and one of Iberodorcadion used as outgroup.

We discuss our findings by comparing the results with other groups, where speciation is better understood, since there is a general feeling that the taxonomy of Dorcadionini, with so many taxa, is an unnecessary complication that could be much simplified. Speciation is a complex and usually gradual phenomenon that is by no means complete in all the cases one investigates. It is additionally complicated for the human observer by the partial or complete fusion, through hybridisation and introgression, of previously independently evolving lineages. DNA barcoding is a tool that can give us clues to the natural phenomena that underlie the observed biological diversity, but neither it nor any other genetic technique in our toolkit will solve all taxonomic issues; they only give us a more complete picture to guide our subjective (but this time better informed) decisions to lump or split taxa.

Material and methods

Specimens

The study covers all the species and subspecies of Dorcadionini from Romania plus several other species from the neighbouring countries (Bulgaria, Czech Republic, Greece, Hungary, Republic of Moldova and Ukraine), to obtain a better estimate of genetic diversity (Table 1). Additionally, we sampled a specimen of Dorcadion mniszechi Kraatz, 1873, the type species of the subgenus Cribridorcadion. For each taxon, multiple individuals, mostly from different localities, were sequenced. The exceptions are the locally rare species Neodorcadion exornatum (Frivaldsky), which, before our study, was last collected in 1975 (Serafim, 2010), and Dorcadion decipiens Germar, each represented by a single specimen (for the latter, we sequenced a specimen from Hungary, since most, if not all, records from Romania are doubtful). Specimens were identified by the first author using identification keys and species concepts in Plavilstshikov (1958), Breuning (1962), Hernández (2000), Tomé (2004, 2012), Pesarini and Sabbadini (2007, 2008, 2010, 2013), Sama et al. (2010), Lazarev (2011, 2014), Dascălu and Fusu (2012), Danilevsky (2014), Dascălu (2018) and Zamoroka (2019). Taxonomy and nomenclature, if not mentioned otherwise, follow Danilevsky (2020). Vouchers are deposited in the first author's collection at the “Al. I. Cuza” University of Iaşi.

Table 1 Taxa used for DNA barcoding including voucher numbers and collecting locality

Additionally, we downloaded from GenBank the sequences of Dorcadionini from Giannoulis et al. (2020) and the sequences of I. fuliginator from Hendrich et al. (2015) and Rulik et al. (2017). However, sequences obtained by Giannoulis et al. (2020) overlap only on 468 bp with the standard barcode region, and to keep missing data at a minimum, only their D. equestre (ssp. reclinatum Kraatz, 1892 according to locality data) and N. exornatum sequences were included in our final analyses. Iberodorcadion fuliginator from GenBank plus two more sequences from two subspecies of Iberodorcadion perezi Graells, 1849 obtained for this study were used as outgroup alongside Neodorcadion.

DNA amplification, sequencing and data depository

Because the beetles are comparatively large, we used tissue samples extracted from each specimen: the abdomen was dissected, and the muscular tissue and gonads were stored in 96% ethanol. DNA was extracted using Chelex® 100 resin (Bio-Rad Laboratories) from a small piece of tissue as described in Fusu and Ribes (2017). When dry-mounted specimens were used, DNA was extracted from one leg using a non-destructive protocol based on the DNeasy Blood & Tissue Kit (Qiagen) as described in Cruaud et al. (2019). After extraction, the leg was glued back to the voucher.

Standard 25-μl PCRs were performed, using the primer pair LCO1490/HCO2198 (Folmer et al., 1994). A reaction contained 2.5 μl of 10 × PCR buffer, 1 μl of 50 mM MgCl2, 0.5 μl dNTP solution (10 mM each), 1.25 μl of each primer (10 μM), 1.25 μl Taq polymerase (1 U/μl, Red Taq DNA Polymerase, Rovalab), 2 μl DNA extract and water to final volume. For six samples that failed at the first attempt, a second attempt was made by adding 6.25 μl of 25% trehalose to the PCR mix; four yielded a sequenceable product. For each round of reactions, negative controls were included. All PCR amplification reactions were conducted in a Labcycler thermal cycler (SensoQuest). PCR conditions were 94 °C for 1 min, followed by 7 cycles at 94 °C for 40 s, 45 °C for 40 s and 72 °C for 45 s, and then by 33 cycles at 94 °C for 40 s, 51 °C for 40 s and 72 °C for 45 s, with a final extension at 72 °C for 5 min.
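The reaction recipe above leaves the water volume implicit. The following Python sketch is illustrative only and not part of the original protocol: it adds up the listed components, derives the water needed to reach 25 μl, and converts stock concentrations to final concentrations. The function and variable names are hypothetical.

```python
# Component volumes (microlitres) for one 25-ul reaction, copied from the text.
REACTION_VOLUME_UL = 25.0

components_ul = {
    "10x PCR buffer": 2.5,
    "MgCl2 (50 mM)": 1.0,
    "dNTP solution (10 mM each)": 0.5,
    "forward primer (10 uM)": 1.25,
    "reverse primer (10 uM)": 1.25,
    "Taq polymerase (1 U/ul)": 1.25,
    "DNA extract": 2.0,
}


def water_volume(components, total=REACTION_VOLUME_UL):
    """Volume of water needed to bring the mix to the final reaction volume."""
    return total - sum(components.values())


def final_concentration(stock, volume_ul, total=REACTION_VOLUME_UL):
    """Dilute a stock concentration (any unit) into the final reaction volume."""
    return stock * volume_ul / total


water = water_volume(components_ul)            # 15.25 ul
mgcl2_mM = final_concentration(50.0, 1.0)      # 2.0 mM
dntp_mM = final_concentration(10.0, 0.5)       # 0.2 mM of each dNTP
primer_uM = final_concentration(10.0, 1.25)    # 0.5 uM of each primer

print(f"water: {water} ul, MgCl2: {mgcl2_mM} mM, "
      f"dNTP: {dntp_mM} mM each, primers: {primer_uM} uM each")
```

The same helper can be reused for a master mix by multiplying each component volume by the number of reactions.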

It was suggested that the D1 and D2 regions of the 28S ribosomal RNA (rRNA) gene can be used for DNA barcoding similar to COI (Sonnenberg et al., 2007). This easily obtainable nuclear marker can be used in combination with COI to clarify cases of mitochondrial capture or even to detect hybrids (Sonnenberg et al., 2007). We amplified this gene in selected specimens when there was disagreement between the barcode sequence and morphology. We used the primer pair 28S-01 and 28SR-01 (Kim et al., 2000) and the following PCR conditions: 94 °C for 1 min, followed by 35 repeated cycles at 94 °C for 1 min, 50 °C for 1 min and 72 °C for 1 min, with a final extension at 72 °C for 5 min.

Since 28S was found to be too conserved, we also amplified the nuclear internal transcribed spacer 2 (ITS2) for the same specimens. We used the primers developed by Germain et al. (2013) except the forward primer that was not a primer cocktail as originally published but a single primer that we designed by further degenerating their primers F1, F2 and F3: 5.8S_cbgp_F (5′-TCG ATG AAR RMC GCA GYD AAH TG-3′). We used the following PCR conditions: 94 °C for 3 min, followed by 40 repeated cycles at 94 °C for 30 s, 44 °C for 90 s and 72 °C for 90 s, with a final extension at 72 °C for 5 min.
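To make the extent of degeneracy in the forward primer concrete, the sketch below expands the IUPAC ambiguity codes in the 5.8S_cbgp_F sequence quoted above and counts the distinct oligonucleotides it represents. It is an illustration written for this text, not a tool used in the study; the helper names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def clean(primer):
    """Remove the spacing used for readability and normalise the case."""
    return primer.replace(" ", "").upper()


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in clean(primer):
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every non-degenerate sequence encoded by the primer."""
    choices = [IUPAC[base] for base in clean(primer)]
    return ["".join(combo) for combo in product(*choices)]


primer_5_8s = "TCG ATG AAR RMC GCA GYD AAH TG"  # 5.8S_cbgp_F, from the text
print(len(clean(primer_5_8s)), "nt,", degeneracy(primer_5_8s), "variants")
```

For this primer the six ambiguous positions (R, R, M, Y, D, H) give 2 × 2 × 2 × 2 × 3 × 3 = 144 variants.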

PCR products were sequenced at Macrogen Europe, and double-stranded sequences were assembled using Pregap4 v.1.5 and Gap v.4.10 in the Staden Package (Bonfield et al., 1995). For three specimens whose ITS2 trace files showed double peaks characteristic of heterozygous indels, individual haplotypes were resolved using Indelligent v.1.2 (Dmitriev & Rakitov, 2008) and by direct comparison with homozygous individuals. COI sequences were checked for nuclear mitochondrial pseudogenes (numts) by translating them to protein sequences in MEGA v.7 (Kumar et al., 2016) and looking for stop codons. All sequences were deposited in GenBank (accession numbers OK050368–OK050519 for COI, OK048652–OK048680 for 28S and OK048688–OK048718 for ITS2).
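The stop-codon screen described above was done in MEGA; the Python sketch below only illustrates the underlying idea. It assumes the invertebrate mitochondrial genetic code, in which TAA and TAG are the only stop codons (TGA codes for tryptophan), and it checks all three forward reading frames because the frame of a barcode fragment is not stated here. The function names and the toy sequences are hypothetical.

```python
# Stop codons under the invertebrate mitochondrial code (assumption of this sketch).
STOP_CODONS = {"TAA", "TAG"}


def stop_positions(seq, frame):
    """0-based start positions of stop codons in the given forward frame (0, 1 or 2)."""
    seq = seq.upper().replace("-", "")
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


def open_frames(seq):
    """Forward reading frames that contain no stop codon."""
    return [frame for frame in range(3) if not stop_positions(seq, frame)]


def is_numt_suspect(seq):
    """Flag a sequence as a possible pseudogene if every forward frame has a stop."""
    return not open_frames(seq)


toy_ok = "ATGGCATGATTAGGA"      # TGA in frame 0 is Trp, not a stop
toy_suspect = "TAAATAGCTAA"     # stop codon in each of the three frames
print(open_frames(toy_ok), is_numt_suspect(toy_suspect))
```

A stop-free frame does not prove that a sequence is a genuine mitochondrial copy; the check only removes the most obvious pseudogenes.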

Sequence alignment and genetic distances

The COI and 28S sequences were aligned with the Clustal W algorithm as implemented in MEGA v.7. For ITS2, the alignment was done in MAFFT v.7 (Katoh & Standley, 2013) on the website of the European Bioinformatics Institute (Madeira et al., 2019), followed by minor manual adjustments.

Within- and between-groups distances were calculated in MEGA v.7 using both Kimura’s 2-parameter model (K2P distances) and p-distances as suggested by Srivathsan and Meier (2012). Since the values were similar, for simplicity, we discuss in the text mostly K2P distances. We calculated interspecific and intraspecific distances both including and excluding subspecies that were found to be the result of introgression (see “Results”). This was done because mitochondrial capture following introgression in some populations of a species but not in others will enlarge the estimates of average evolutionary divergence within species and reduce the interspecific distances, potentially masking the barcoding gap. Graphs for visualising the barcoding gap were produced with R v.3.6.1 (R Core Team, 2019) using the faceting function in the ggplot2 package (Wickham, 2009, 2011).
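For readers unfamiliar with the two distance measures, the sketch below computes the p-distance and the Kimura 2-parameter (K2P) distance for a pair of aligned sequences. It is a minimal illustration and not the MEGA implementation; in particular, skipping sites with gaps or ambiguous bases (pairwise deletion) is an assumption of this example, and the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(a, b):
    """Return (compared sites, transitions, transversions) for two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases (pairwise deletion)
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return sites, transitions, transversions


def p_distance(a, b):
    """Proportion of compared sites that differ."""
    sites, ts, tv = count_differences(a, b)
    return (ts + tv) / sites


def k2p_distance(a, b):
    """Kimura 2-parameter distance: d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]."""
    sites, ts, tv = count_differences(a, b)
    p, q = ts / sites, tv / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


seq1 = "ACGTACGTAC"
seq2 = "ACGTACGTAT"  # one C/T transition in 10 sites
print(round(p_distance(seq1, seq2), 4), round(k2p_distance(seq1, seq2), 4))
```

At the low divergences typical of barcode comparisons the K2P correction is small, which is consistent with the similar values reported for the two measures.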

Phylogenetic analyses and haplotype networks

For COI, a substitution saturation analysis was performed in DAMBE v.7 (Xia, 2018) as described in Fusu (2017) based on unique sequences and fully resolved sites only. For phylogenetic inferences, we analysed our data under both maximum likelihood (ML) and Bayesian inference (BI). For the ML analysis, the alignment was delimited by codon position and the best substitution model and partitioning scheme were identified using PartitionFinder 2 (Lanfear et al., 2016) with the settings branch lengths = linked and search = all. The result was used as input for a partitioned analysis with linked branch lengths in RAxML-NG v.0.9.0 (Kozlov et al., 2019). We also executed 20 alternative runs on distinct starting trees in RAxML-HPC2 v.8.2.12 (since the results were very similar, data are not shown). In both cases, support for nodes was estimated based on bootstrap pseudo-replicates (BP) with an automatic bootstopping criterion. For the BI analysis, we used BEAST v.1.10.4 (Suchard et al., 2018). Since our intent was to also estimate the timing of divergence events, the data were not partitioned because published molecular clock estimates that were used as priors are available only for the entire COI gene. Hence, we used the GTR + G + I substitution model with four gamma categories for the entire alignment, as estimated using an ML criterion in MEGA v.7. Two independent runs were executed in BEAST, each run consisting of 10⁸ generations with sampling performed every 10⁴ generations. The analysis of the complete dataset with a Yule model of branching pattern (see “Molecular clock calibrations”) did not reach convergence, and the runs were extended to 3 × 10⁸ generations with sampling every 5 × 10³ generations. Convergence of parameters was assessed in Tracer v.1.7.1 (Rambaut et al., 2018), and after ensuring all ESS were > 200, the tree files were combined in LogCombiner v.1.10.4. All runs were performed using the CIPRES Science Gateway (Miller et al., 2010).

The 28S rRNA gene and the ITS2 sequences were first analysed individually under ML with the T92 and T92 + G substitution models, respectively. Support for nodes was estimated based on 1000 bootstrap pseudo-replicates. Both model selection and tree reconstruction were done in MEGA v.7. After ensuring that there were no abnormally long branches or misplaced specimens that might indicate contamination or paralogous sequences, the two datasets were combined for a partitioned analysis; 28S and ITS2 were treated as two data blocks, and the best partitioning scheme and substitution models were selected with PartitionFinder 2. We used RAxML-NG as described above with the F81 + I and K80 + G substitution models for 28S and ITS2, respectively. We also analysed the dataset with MrBayes v.3.2.6 (Ronquist & Huelsenbeck, 2003) with default parameters. For each partition, we used the same models as for the ML analysis and parameters were unlinked across partitions. The analysis was run for 10⁷ MCMC generations, with samples taken every 100 generations. Before summarising the trees, convergence of parameters was assessed in Tracer v.1.7.1. Posterior probability (PP) values were plotted on the maximum likelihood tree.

Final trees were imported in FigTree v.1.4.4 (A. Rambaut, http://tree.bio.ed.ac.uk/software/figtree/) and further edited in Adobe Illustrator and Adobe Photoshop. The haplotype networks were reconstructed in PopART v.1.7 (Leigh & Bryant, 2015) with the maximum statistical parsimony method (aka TCS method) (Clement et al., 2000).

Molecular clock calibrations

To estimate the time frame for the diversification of Dorcadionini in the study region, we used BEAST v.1.10.4. The model used for the expected branching pattern was either a Yule model (applied to the complete dataset or to a reduced dataset containing one representative per species) or a coalescent model with constant population size applied to the complete dataset. This was done because genealogical (intrapopulation) substitution rates can be much higher than phylogenetic rates (Papadopoulou et al., 2010). The two are best modelled using a coalescent or a Yule model, respectively, while mixed datasets containing numerous specimens from multiple species are especially prone to the vagaries of divergence time estimates under an inappropriate model (Ritchie et al., 2017).

There are no published clock estimates for Dorcadionini. Hence, instead of applying a rate from a previous study on other groups, such as the universal arthropod mitochondrial DNA (mtDNA) rate (Brower, 1994), the ucld.mean prior of the lognormal distribution was set in real space with a mean of 0.011 substituted sites per million years per lineage and a standard deviation of 0.003 to obtain a distribution with a 95% interval of 0.006–0.018. This encompasses a wide range of rates, from some of the slowest rates for COI to the fast rate reported for Tenebrionidae (Papadopoulou et al., 2010); the median of the distribution is very close to the widely used rate reported by Brower (1994) of 2.3% divergence per million years (0.0115 substituted sites per million years per lineage). This approach was used by Marshall et al. (2016) to calibrate the molecular clock in cicadas since likewise there were no published clock estimates available.
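The relationship between the real-space mean and standard deviation of the prior and its 95% interval can be checked with a few lines of Python. The sketch below is an illustration that assumes the standard moment-matching formulas for a lognormal distribution; it is not BEAST code, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_params(mean, sd):
    """Convert a real-space mean and SD to the log-space mu and sigma."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def lognormal_quantile(mu, sigma, prob):
    """Quantile of a lognormal distribution with log-space parameters mu and sigma."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(prob))


mu, sigma = lognormal_params(mean=0.011, sd=0.003)
lower = lognormal_quantile(mu, sigma, 0.025)
median = lognormal_quantile(mu, sigma, 0.5)
upper = lognormal_quantile(mu, sigma, 0.975)

# A pairwise divergence rate is twice the per-lineage rate:
# Brower's 2.3% per million years corresponds to 0.0115 per lineage.
brower_per_lineage = 0.023 / 2

print(f"95% interval: {lower:.4f} to {upper:.4f}, median: {median:.4f}")
```

Under these assumptions the interval rounds to 0.006–0.018 and the median to about 0.011, matching the values given in the text.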

Single-locus species delimitations

We used the two most popular methods (Luo et al., 2018) of single-locus species delimitation, namely Generalised Mixed Yule Coalescent (GMYC) (Fontaneto et al., 2007) and Poisson Tree Processes (PTP) (Zhang et al., 2013), each in two versions: the single threshold (Pons et al., 2006) and multiple threshold (Fujisawa & Barraclough, 2013) versions of GMYC, and the original PTP (Zhang et al., 2013) and multi-rate PTP (mPTP) (Kapli et al., 2017). The two methods are based on different models; hence, congruence between them should give more confidence in the results. For example, PTP does not require an ultrametric tree and hence avoids the error-prone procedure of time calibration. For GMYC, we used its implementation in R with the package splits (Ezard et al., 2014) on the tree obtained in BEAST using the coalescent model with constant population size applied to the complete dataset. For PTP, we used the web servers provided by the authors (available at https://species.h-its.org/ and https://mptp.h-its.org/#/tree) on the tree obtained with RAxML-NG.

Results

Alignments

Overall, our DNA barcode library comprises 152 sequences from 24 species of Dorcadionini. For six polytypic species, we barcoded two to three different subspecies (Table 1); hence, our dataset covers a total of 33 newly barcoded taxa plus two others mined from GenBank. All newly generated barcode sequences, except four, were full-length barcodes of 658 bp (the incomplete sequences were 629–654 bp). The alignment had 249 variable sites and 229 parsimony-informative sites. The substitution saturation analysis indicated little saturation (p < 0.00001 for 32 taxa and both symmetrical and asymmetrical trees).

For 28S, the alignment contained 29 sequences and 842 positions with sequences varying in length from 783 to 842 and a single indel of 1 bp or 2 bp in the outgroup taxa. There were 20 variable and eight parsimony-informative sites. For ITS2, the alignment of the 31 sequences had 636 positions including gaps, with complete sequences varying in length from 507 to 523 bp. Three heterozygous specimens were represented by two haplotypes each. The dataset was much more informative compared to 28S, with 112 variable and 78 parsimony-informative sites.

Phylogenetic analyses

Both the ML and BI trees based on COI had similar topologies, with a few notable exceptions (Figs. 1 and 2). In the BI tree, D. decipiens is sister to nominotypical D. pusillum, while in the ML tree, it is nested within it. Sister to the (D. decipiens + D. murrayi + D. pusillum) clade is D. axillare axillare in the ML tree, while in the BI tree, it is D. scopolii. The position of D. tuleskovi varies radically between the two trees.

Fig. 1

Maximum likelihood (ML) trees of the investigated Dorcadionini based on COI or 28S + ITS2. Numbers at nodes indicate bootstrap support (COI) or bootstrap support and posterior probability from a BI analysis (28S + ITS2). Taxa names in red indicate lumping. From top to bottom are vouchers axMz1704, puIzvD2_04, puLS11, deHu1601, linAng1001, aeVe01, fuBg1702, tuGr1201, eqDD1201, gaBa1501, mnGe1701, hiSpa1201 and exEn1401. Countries are abbreviated using ISO codes. Scale bars represent 1 mm

Fig. 2

Maximum clade credibility ultrametric tree generated with Bayesian inference in BEAST from 162 Dorcadionini COI sequences (152 from this study, 10 from GenBank). Clades with numerous sequences and no delimitation issues are collapsed, with numbers indicating the included sequences. Vertical bars show different delimitation scenarios; red colour indicates lumping of species or subspecies and blue indicates oversplitting. Countries are abbreviated using ISO codes. The scale shows time in million years

Most species were recovered as monophyletic with some exceptions. In the BI tree, all D. pusillum pusillum sequences are part of the same clade (though it also includes one D. pusillum ochrolineatum and three D. axillare moldavicum), while D. pusillum vasiliscus and D. pusillum ochrolineatum are nested within D. murrayi. In the ML tree, D. pusillum pusillum is not monophyletic: specimens from Ukraine (Od14 and Od09) are recovered as a distinct basal lineage, while the species is paraphyletic relative to D. murrayi and D. decipiens.

Dorcadion axillare axillare is monophyletic in both trees, while D. axillare moldavicum is polyphyletic since it is nested within both D. murrayi and D. pusillum pusillum. One specimen (linPro1001), initially identified as D. lineatocolle and later re-identified as D. lugubre, is grouped with D. lineatocolle instead of clustering with D. lugubre lugubre.

Both trees also recover three highly supported clades (marked with 1, 2 and 3 in Figs. 1 and 2), though the exact relationships between the constituent species vary slightly between the two approaches. The first clade has a posterior probability of 1 and a bootstrap probability of 95%, the second a posterior probability of 1 and a bootstrap probability of 100% and the third a posterior probability of 0.97 but a bootstrap probability below 50%.

The 28S tree recovered only the second group (Fig. S2); however, the alignment had low information content. The ITS2 tree (Fig. S2) and the concatenated nuclear tree (Fig. 1) recovered both the first and second groups with high support at least on one tree. No species of the third group was sequenced for the nuclear markers, since they were not involved in mitochondrial-morphological discordances. Similar to the COI tree, members of the subgenus Carinatodorcadion are nested within the subgenus Cribridorcadion with 99 BP and 1 PP. However, the species that were polyphyletic or paraphyletic on the COI tree are monophyletic on the concatenated nuclear tree (Fig. 1). The exception is D. etruscum [sensu Pesarini & Sabbadini (2007)] that is very distinct on COI but appears conspecific with D. lugubre on the nuclear gene tree. Also, the specimen of D. lugubre linPro1001 that clustered with D. lineatocolle on the COI tree is heterozygous for ITS2, being represented on the tree twice (Fig. 1).

Divergence times

On the COI dataset including one sequence per species and a Yule model, the substitution rate was estimated (mean ± SD) at 0.0127 ± 0.0034 substituted sites per million years per lineage (2.54% divergence rate). When all sequences were included, both under a coalescent model with constant population size and under a Yule model, the estimate was 0.0119 ± 0.0032 (2.38% divergence). Both values are close to the rate provided by Brower (1994), which is considered the universal arthropod mtDNA rate. We obtained quite different age estimates for nodes depending on the composition of the alignment and the modelling of the branching patterns. The most accurate are probably those obtained on the reduced dataset under a Yule model (Fig. S1), and the least reliable are those obtained using the complete dataset under the same Yule model, because most branching events would be better described by the coalescent. The first analysis estimated the root of the tree (the last common ancestor of Dorcadion, Iberodorcadion and Neodorcadion) at 11.8 Mya (95% highest posterior density (HPD) interval 6.05–19.95) while the youngest split, the one between D. murrayi and nominotypical D. pusillum, was dated at 0.47 Mya (95% HPD interval 0.1–1.06) (Table 2).

Table 2 Age of selected nodes in million years before present and 95% HPD interval between brackets, as estimated by three different BI analyses in BEAST

Haplotype sharing

On the COI phylogenetic trees (Figs. 1 and 2), there is extensive haplotype sharing between D. murrayi, D. pusillum and D. axillare. However, this involves only three subspecies of the latter two species: D. pusillum ochrolineatum, D. pusillum vasiliscus and D. axillare moldavicum.

In the haplotype network (Fig. 3), the haplotypes of D. murrayi are connected to the haplotypes of D. pusillum pusillum through only four mutational steps, while the other two species included in the analysis are more divergent: a minimum of 19 substitutions separate D. axillare axillare from both D. murrayi and D. pusillum pusillum while D. decipiens is 12 substitutions apart from D. pusillum pusillum. Five specimens of D. pusillum ochrolineatum and D. pusillum vasiliscus have mitochondria from the same haplogroup as D. murrayi. In D. pusillum vasiliscus, we detected only D. murrayi mitochondria, while in D. pusillum ochrolineatum, one specimen (puSG03) has a haplotype that clustered with those of the nominotypical subspecies. All 16 barcoded specimens of D. axillare moldavicum harbour mitochondria that are not of their own species but were captured either from D. murrayi (most populations, 13 specimens) or from D. pusillum pusillum (the Chirceşti population, specimens axCh1201–axCh1203). In both cases, they have the most common haplotype of the donor species.

Fig. 3

A Maximum statistical parsimony network of Dorcadion species sharing COI haplotypes. The four groupings separated by at least 4 mutational steps are the mitochondria of the four species. However, D. axillare moldavicum harbours D. murrayi mitochondria or D. pusillum pusillum mitochondria; D. pusillum vasiliscus and D. pusillum ochrolineatum harbour mostly D. murrayi mitochondria. Clockwise are vouchers muCht1701, axBg1704, deHu1601 and puHn17. B Maximum statistical parsimony network of D. equestre COI haplotypes. Clockwise are vouchers eqBe1201, eqTo1102, eqDD1201 and eqBa0901. Each circle represents a haplotype with the size proportional to its frequency. Small cross lines represent substitutions, and small black circles are unobserved hypothetical haplotypes. Scale bars represent 1 mm

Divergence and barcoding gap

Mean intraspecific K2P distances based on COI ranged from 0.04% in D. aethiops to a maximum of 1.61% in D. equestre followed closely by D. lugubre at 1.53% and D. axillare at 1.34% (Table S1; Table S3 shows the p-distance). The mean divergence between species varies from 0.62% between D. pusillum and D. murrayi to 16.85% between D. holosericeum and I. fuliginator (Table S1; Table S3 shows the p-distance). This overlap between the intraspecific and interspecific genetic distances clearly shows that, unfortunately, there is no barcoding gap (Fig. 4).

Fig. 4

Comparison between interspecific and intraspecific pairwise genetic distances in Dorcadionini based on the DNA barcode. Distances were calculated once with all sequences from a species included (graphs on left side) and once after excluding taxa that resulted from introgression and mitochondrial capture (graphs on right side). With introgression removed, the grey area of overlap between intraspecific and interspecific distances decreases, the number of uncertain cases dropping from 16 to 4 (one square represents one mean pairwise distance)

The overlap is even bigger if the maximum intraspecific pairwise distance is used because it can be as high as 5.8% (in D. equestre), followed by D. axillare (3.65%), D. lugubre (2.79%), D. pedestre (2.17%) and D. fulvum (1.38%) (Table S5). The maximum value is more meaningful for molecular species delimitations than the mean intraspecific distance that is strongly influenced (lowered) by the inclusion in the analysis of numerous identical or very similar sequences. A similar problem that artificially broadens the barcoding gap is its estimation using mean between-species distances rather than minimum distances (Meier et al., 2008).

When taxa involved in haplotype sharing are removed from the analysis, the smallest and the largest mean intraspecific distances remain the same, but there is a noticeable drop in the molecular variability for D. lugubre to 0.36% and D. axillare to 0.40%. The smallest mean distance between species (the one between D. pusillum pusillum and D. murrayi) increases to 0.77%, but it is still very small and comparable to intraspecific distances; the next smallest distance (the one between D. murrayi and D. axillare) increases significantly from 0.9 to 3.2% (Table S2; Table S4 shows the p-distance). Hence, with introgression removed, the grey area where there is overlap between intraspecific and interspecific distances is smaller, with the number of uncertain cases dropping from 16 to 4 (red and blue squares on the grey area in Fig. 4).
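The barcoding-gap criterion used above can be stated as a simple comparison: a gap exists only if the smallest interspecific distance exceeds the largest intraspecific distance. The Python sketch below is illustrative; it uses only the handful of mean K2P values (in %) quoted in the text rather than the full distance tables, and the function and variable names are hypothetical.

```python
def barcoding_gap(intraspecific, interspecific):
    """Smallest interspecific minus largest intraspecific distance.

    A positive value indicates a barcoding gap; zero or negative means overlap.
    """
    return min(interspecific) - max(intraspecific)


# Mean K2P distances (%) quoted in the text; only a subset of the full tables.
intra_all = [0.04, 1.61, 1.53, 1.34]        # aethiops, equestre, lugubre, axillare
inter_all = [0.62, 0.9, 16.85]              # pusillum/murrayi, murrayi/axillare, largest

# After removing taxa involved in haplotype sharing.
intra_filtered = [0.04, 1.61, 0.36, 0.40]
inter_filtered = [0.77, 3.2, 16.85]

gap_all = barcoding_gap(intra_all, inter_all)
gap_filtered = barcoding_gap(intra_filtered, inter_filtered)
print(f"all taxa: {gap_all:.2f}%, introgressed taxa removed: {gap_filtered:.2f}%")
```

Both differences are negative, so on these values the overlap narrows after removing introgressed taxa but does not disappear.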

Species delimitation approaches

When comparing the various species delimitation scenarios with the morphology and current taxonomy (Fig. 2), it appears that, of the two approaches, PTP lumps more, sometimes rightfully so: D. equestre is recognised by PTP as a single species except for Dorcadion equestre reclinatum. More frequently, the lumping is unjustified: D. murrayi and nominotypical D. pusillum are considered as potentially one species by the original PTP, with D. decipiens also included in the same putative species by mPTP. The whole group consisting of D. aethiops, D. etruscum, D. pedestre and D. kozanii is considered as one species by mPTP. The original PTP method oversplit only once, in an instance where all other methods identified the sequences correctly as a single species: it delimited a pair of basal sequences from the middle of the distribution area of D. tauricum as a potentially distinct species.

In contrast, the GMYC method tends to oversplit, especially its multiple-threshold variant, which identified as putative species several unique basal but not particularly divergent sequences of D. pusillum ochrolineatum, D. scopolii, D. equestre equestre and N. exornatum (Fig. 2, GMYCm). Strangely enough, the multiple-threshold method outperformed the single-threshold method in correctly assigning one basal sequence of D. fulvum to the species and not splitting it (Fig. 2, GMYCs and GMYCm).

Discussion

Phylogenetic reconstructions

The substitution saturation analysis of the COI sequences indicated limited substitution saturation. Also, there is good correlation between the inferred trees and some of the previously recognised species groups. Depending on the type of BI analysis, Iberodorcadion is sister to Dorcadion while Neodorcadion is a separate clade (Fig. 2) or Iberodorcadion and Neodorcadion are sister groups (Fig. S1). Members of the subgenus Cribridorcadion of Dorcadion are divided in three highly supported clades, recovered by both inference methods.

The first clade contains species explicitly included by Pesarini and Sabbadini (2007) in the Dorcadion minutum species group or morphologically close to them (D. axillare, D. decipiens, D. litigiosum, D. murrayi, D. pusillum). Resolution is poor for the three species involved in extensive haplotype sharing (D. axillare, D. murrayi and D. pusillum). The group was also recovered on the ITS2 tree and on the concatenated tree, though species sampling was less dense compared to the COI tree.

The second well-supported clade contains species that at least in males are mostly devoid of elytral pubescence: D. etruscum, D. kozanii and D. pedestre species groups proposed by Pesarini and Sabbadini (2007, 2010). This clade surprisingly also contains both species of the subgenus Carinatodorcadion included in our study: D. fulvum and D. aethiops that are thus nested within the subgenus Cribridorcadion. They do not form a monophyletic group, but instead, D. aethiops is much closer to D. pedestre, D. etruscum and D. kozanii, while D. fulvum is well apart (Figs. 1 and 2). The mean pairwise distance between D. aethiops and D. etruscum is 3.43%, and that between D. aethiops and D. kozanii is only 2.66% (Table S1). This is significantly lower than the maximum intraspecific distance in D. equestre (5.8%) and close to that in D. pedestre (2.17%) (Table S5). The divergence between D. aethiops and (D. kozanii + D. pedestre) is estimated at 1.6 Mya (95% HPD interval 0.52–2.66 Mya) (Fig. S1). On the 28S tree (Fig. S2), D. aethiops, D. fulvum, D. etruscum, D. lineatocolle and D. lugubre form a polytomy. The tree is not fully resolved because, contrary to our expectations, the fragment was found to be too conserved. On the concatenated nuclear tree, however, D. aethiops is sister to D. fulvum, as expected from the morphology and from their classification in the same subgenus, indicating its hybrid origin.

The third clade on the COI tree contains all other taxa of the subgenus Cribridorcadion included in the study, but this group is very heterogeneous, likely because it contains a low sampling of many other species groups. For example, the position of D. tuleskovi (Dorcadion peloponesium group) on the tree varies greatly between the Bayesian and the maximum likelihood approaches.

Hybridization and haplotype sharing

Most frequently, haplotype sharing between species can be either the result of recent speciation and incomplete lineage sorting or the result of introgressive hybridisation (Funk & Omland, 2003; Sloan et al., 2017; Toews & Brelsford, 2012; Wirtz, 1999). In D. murrayi, D. pusillum and D. axillare, the mitochondrial-morphological discordance between species is present only in the areas of range overlap between the taxa (Fig. 5). While the populations of D. pusillum and D. axillare sampled from localities outside the distribution range of D. murrayi are genetically distinct (D. pusillum pusillum from Republic of Moldova and Ukraine and D. axillare axillare from Bulgaria), those in Romania (where the distribution of the three species overlaps) have mostly D. murrayi haplotypes. Mismatches resulting from incomplete lineage sorting should not have any coherent biogeographic pattern, while those resulting from differentiation in isolation followed by secondary contact and hybridisation do have it (Toews & Brelsford, 2012). Hence, we attribute the identified pattern to hybridisation between three originally allopatric species.

Fig. 5

World distribution of D. murrayi (black circles) and of D. axillare (blue circles; violet shading is subspecies moldavicum) and western limit of the distribution of D. pusillum (red circles; red shading is subspecies ochrolineatum, and yellow shading is subspecies vasiliscus). Yellow crosses: localities for the barcoded specimens. Distribution based on published localities, first author collection and other public collections

Recently documented cases of haplotype sharing in beetles can be found in the genera Anastrangalia (Cerambycidae) (Hendrich et al., 2015), Agrilus (Buprestidae) (Pentinsaari et al., 2014b), Amara and Bembidion (Carabidae) (Raupach et al., 2016, 2018). In Hydroporus (Dytiscidae) (Bilton et al., 2017), two species were hypothesised to be of hybridogenic origin, the same as the subspecies of D. pusillum and D. axillare, and likely D. aethiops.

Within the classical biological species concept, hybridisation is regarded as a rare abnormality generated by a failure of the reproductive barriers, but, as reviewed by Mallet (2005), an average of 10% of animal species hybridise, and in some groups, the percentage can be as high as 25%, similar to the rate seen in vascular plants. A well-understood example of introgressive hybridisation and mitochondrial capture between good species involves Ursus arctos Linnaeus (the brown bear) and Ursus maritimus Phipps (the polar bear) (Kumar et al., 2017). Several previous studies (Bahillo de la Puebla, 1999; Bernhauer & Peks, 2016; Dascălu, 2007; del Saz Fucho, 2009) reported putative hybrids in Dorcadionini, but there was no genetic proof of this. Besides the introgression cases, we also document a presumably first-generation hybrid between D. lugubre and D. lineatocolle. Hence, our study brings genetic evidence of past and present hybridisations between different species within the tribe. Because of this reticulated evolution generated by hybridisation followed by introgression, DNA barcodes in Dorcadionini should be used cautiously and only in combination with distribution data, morphology or nuclear genes.

According to Wirtz (1999), when two species that differ conspicuously in size hybridise, hybridisation is usually restricted to pairs formed by females of the smaller species and males of the larger species. It is not the case here since both D. pusillum and D. axillare are smaller than D. murrayi, while the hybridogenic taxa have the mitochondrial genome of D. murrayi. This indicates that the females involved in hybridisation belonged to the larger species. A similar case was documented for the Drosophila mojavensis species cluster (Diptera), where interspecific matings are successful if the involved male has a smaller aedeagus compared to conspecific males of the concerned female (Richmond, 2014). In the same Drosophila species, aedeagus shape was found to be less important than aedeagus size (Richmond, 2014). Similarly, in the three concerned Dorcadion species, the differences in endophallus shape (Dascălu, 2018) are not a barrier for interspecific mating.

Evolutionary implications

With some 5000 recognised genera and an estimation of about 35,000 valid species (Costa, 2000; Monné et al., 2017; Rossa & Goczał, 2021), the longhorn beetles (Cerambycidae) are one of the most numerous and diverse groups of the order Coleoptera and one of the largest families of animals (Haddad et al., 2018; Rossa & Goczał, 2021), and several different drivers of their diversity have been proposed. The most often evoked hypotheses for the immense diversity of the phytophagous coleopterans, including the longhorn beetles, involve the co-radiation with the flowering plants (Farrell, 1998; McKenna, 2011; Wang et al., 2013, 2014). However, this was questioned as the sole driving force of diversification in Cerambycidae, since many species are polyphagous (Haddad et al., 2018). Allopatric divergence driven by the cyclic climatic oscillations of the Pleistocene (Goczał et al., 2020; Shoda et al., 2003a, b; Smith & Farrell, 2005; Zamoroka et al., 2019) and vicariance events (Kim et al., 2018) is another mechanism that certainly played a role in the speciation and diversification of Cerambycidae, and it should have been especially important in a flightless group like Dorcadionini. Although introgressive hybridization has been earlier suggested as another potential mechanism that generates diversity in other flightless Cerambycidae like Mesechthistatus (Nakamine & Takeda, 2008) or Morimus (Hardersen et al., 2017), we are providing new evidence in support of this hypothesis. Otherwise, genetic evidence of hybridisation following the secondary contact was recently revealed for two species or subspecies of Monochamus based on microsatellite data (Goczał et al., 2020), while based on intermediate genitalia morphology and haplotype sharing, Zamoroka et al. (2019) concluded that Anastrangalia reyi (Heyden) is the result of introgressive hybridisation between Anastrangalia sequensi (Reitter) and Anastrangalia dubia (Scopoli) and that they should be treated as subspecies. 
Hybridisation and possibly introgression were recently documented for two species of Cerambyx, as shown by the incongruence between mitochondrial sequences and morphology (Torres-Vila & Bonal, 2019).

Though introgression—based on morphological evidence—was rarely invoked, either explicitly (Zamoroka, 2019) or implicitly (Danilevsky, 1999), it could, in fact, be one of the drivers of the huge diversity of Dorcadionini. There is growing evidence that hybridisation can promote phenotypic novelty contributing to species diversification (Genner & Turner, 2012), and in Dorcadion, it is a generator of both diversity and blurriness in species boundaries, the same being recently concluded for a group of Cryptocephalus leaf beetles (Gómez-Zurita et al., 2012). Even distantly related Dorcadionini species, belonging to distinct lineages and classified in different subgenera, can hybridise. This is strongly suspected in Iberodorcadion (Bahillo de la Puebla, 1999), while in Dorcadion, it appears that D. aethiops is the result of the hybridization between a species of Dorcadion (Carinatodorcadion) and a yet unknown species of Dorcadion (Cribridorcadion), since its nuclear sequences are grouped with one subgenus, while its mitochondrial sequence beyond any doubt belongs to the other subgenus. The generation of novel combinations of genes by the gradual introgressive hybridization ongoing within overlapping distribution ranges is a new possible mechanism involved in speciation in Cerambycidae.

Small interspecific distances

Introgression is not the only factor making molecular identifications or taxonomic decisions in Dorcadionini challenging. Even if introgression followed by mitochondrial capture is removed from the analyses, geographic structuring within species and the presence of distinct species cannot be told apart based solely on molecules, because some species pairs have very low between-species distances. A threshold of 2–3% was suggested as indicative of species-level divergence (Mutanen et al., 2012; Ratnasingham & Hebert, 2007, 2013; Smith et al., 2005). In our study, certain sequences from known, well-differentiated species are not sufficiently distinct genetically according to this threshold. Between-species distances are comparatively small for the trio D. aethiops, D. pedestre and D. kozanii (2.66–2.88%), and they are even smaller for the trio D. murrayi, D. pusillum pusillum and D. decipiens (0.77–2.38%). These distances are comparable to or below the largest intraspecific distances found in D. equestre (5.8%) and D. pedestre (2.17%) (Tables S1 and S5).
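How the 2–3% threshold sorts some of the species pairs mentioned in the text can be sketched as follows. The three distances are the mean pairwise values reported above; the function `classify`, its labels and the treatment of the 2–3% interval as a single "borderline" band are assumptions made for illustration only.

```python
def classify(distance_pct, low=2.0, high=3.0):
    """Place a between-species distance (%) relative to a 2-3% threshold band."""
    if distance_pct < low:
        return "below"       # would be read as intraspecific variation
    if distance_pct <= high:
        return "borderline"  # inside the suggested threshold band
    return "above"           # would be read as species-level divergence


# Mean pairwise distances (%) quoted in the text
pairs = {
    ("D. murrayi", "D. pusillum pusillum"): 0.77,
    ("D. aethiops", "D. kozanii"): 2.66,
    ("D. aethiops", "D. etruscum"): 3.43,
}

labels = {pair: classify(d) for pair, d in pairs.items()}
for (sp1, sp2), label in labels.items():
    print(f"{sp1} vs {sp2}: {label}")
```

Under these assumptions only one of the three morphologically well-differentiated pairs would clearly exceed the threshold.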

There are other recently documented cases of reciprocally monophyletic species with low divergence: Rhynchites bacchus (L.) and Rhynchites auratus (Scopoli) (Coleoptera, Attelabidae) have a minimum K2P distance of 1.86%; Harpalus attenuatus Stephens and Harpalus rubripes (Duftschmid) (Coleoptera, Carabidae) show a minimum K2P distance of 1.07% (Hendrich et al., 2015); Bembidion ascendens K. Daniel and Bembidion fasciolatum (Duftschmid) (Coleoptera, Carabidae) show interspecific distance values ranging from 0.49 to 0.82% (Raupach et al., 2016).
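For readers less familiar with the two distance measures mentioned in the text, the following minimal sketch computes the uncorrected p-distance and the Kimura two-parameter (K2P) distance for a pair of aligned sequences. It uses the standard K2P formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. The function names, the skipping of gaps and ambiguous bases, and the toy sequences are assumptions for illustration; this is not the software used for the analyses reported here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_changes(seq1, seq2):
    """Return (transitions, transversions, compared sites) for aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions, transversions, compared


def p_distance(seq1, seq2):
    """Proportion of differing sites (uncorrected distance)."""
    ts, tv, n = count_changes(seq1, seq2)
    return (ts + tv) / n


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance; math.log raises ValueError if saturated."""
    ts, tv, n = count_changes(seq1, seq2)
    p, q = ts / n, tv / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: 100 aligned sites differing by a single transition (A<->G)
s1 = "A" * 100
s2 = "G" + "A" * 99
print(p_distance(s1, s2), k2p_distance(s1, s2))
```

At the low divergences discussed here the two measures are nearly identical, the K2P value being only marginally larger than the p-distance.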

Genetic distinctiveness of subspecies

Concerning the ability of the DNA barcodes to distinguish subspecies, we identified both cases in which subspecies are genetically distinct and cases where they are identical. Dorcadion equestre has a very pronounced genetic substructuring that closely follows the accepted subspecific structure, while in D. axillare and D. pusillum, even if subspecies are genetically distinct from the nominotypical ones, this is not because of genetic divergence but the result of their origin through introgression (see further below for details).

On the other hand, in three other species where we analysed more than one subspecies, no differentiation was found between them. For D. fulvum, we sampled from five countries and covered three subspecies but found no geographic substructure; the same applies to D. aethiops, sampled from distant areas (Romania and Greece) with representatives of three subspecies. In the case of I. perezi, we included only one specimen per subspecies, and they were from the contact area. The morphological differentiation in this case is much larger than that between the subspecies of D. fulvum or D. aethiops, and they are considered either good species or subspecies depending on the author (Hernández, 2000; Tomé, 2004, 2012), but still the DNA barcode is almost identical. Since low genetic divergence in the DNA barcode between taxa or specimens is not always a proof of conspecificity, due to the possibility of introgression and mitochondrial capture, our treatment of I. perezi hispanicum and I. perezi ghilianii as subspecies is purely arbitrary and follows Danilevsky (2020). If they are treated as distinct species, we have yet another case of interspecific hybridisation and mitochondrial-morphological discordance.

Below, we discuss in more detail the cases where we found discordances between the taxonomy and genetic data.

Dorcadion aethiops and D. fulvum

Dorcadion aethiops belongs to the subgenus Carinatodorcadion, but its mitochondrial DNA is much closer to species of Cribridorcadion than to D. (Carinatodorcadion) fulvum. Prior to Pesarini and Sabbadini (2007), the Greek populations of D. aethiops were regarded as distinct species: Dorcadion propinquum Breuning (described from near Kozani) and Dorcadion majoripenne Pic, with the type locality around Thessaloniki (Breuning, 1962). According to Danilevsky (2014), both taxa are valid subspecies and a third subspecies, D. aethiops strumense Danilevsky, was described from south Bulgaria.

Therefore, according to the currently accepted geographic distribution of the subspecies, the barcoded specimens are D. aethiops propinquum (aeVe01 from Vermion Mts.), D. aethiops strumense from Kerkini Mts. (aeKe01 and aeKe02 collected at about 10 km from the type locality in S Bulgaria) and the nominotypical D. aethiops aethiops from Romania (Table 1). Our molecular data agree with the treatment of D. propinquum as a subspecies of D. aethiops and not as a distinct species. The identical DNA barcode in three subspecies distributed over a wide geographic area and the discordance between male genitalia and ITS2 data that places D. aethiops in one subgenus and COI molecular data that disagree are best explained by mitochondrial capture from a yet unknown species in the subgenus Cribridorcadion. The body morphology of D. aethiops is also reminiscent of some species of Dorcadion (Cribridorcadion) devoid of elytral pubescence.

By contrast, over the same geographic distance, D. fulvum is much more variable genetically. However, the three analysed subspecies cannot be distinguished based on the DNA barcode. Hence, we confirm the presence of the nominotypical subspecies in Romania, west of the Carpathians, based on morphology only. It was cited for the area by Zamoroka (2019) based solely on distribution data.

Dorcadion axillare, D. murrayi and D. pusillum

Dorcadion axillare, D. murrayi and D. pusillum represent an interesting case of hybridisation followed by mitochondrial capture. The mtDNA introgressed massively across species boundaries even if nuclear gene flow seems to be restricted. In the absence of extensive nuclear marker data, we base this conclusion on the morphological distinctiveness of the three species. This is also confirmed by the discrepancy between the combined 28S and ITS2 trees and the COI tree that show a clear case of cytonuclear discordance. Both the 28S and ITS2 sequences are distinct between D. murrayi and D. axillare axillare, D. axillare moldavicum having the nuclear sequences clustering with those of the nominotypical subspecies, unlike the COI sequences (Figs. 1 and S2). The 28S sequence is identical for nominotypical D. pusillum and D. murrayi likely reflecting their recent split, and thus uninformative for the affinity of D. pusillum ochrolineatum and D. pusillum vasiliscus (Fig. S2). The ITS2 sequences, however, are identical/very similar for all D. pusillum subspecies, and distinct from D. murrayi (Fig. S2). Hence, both the morphology and the analysis of the nuclear genes (Fig. 1) confirm that D. axillare moldavicum, D. pusillum ochrolineatum and D. pusillum vasiliscus are indeed subspecies of D. axillare and D. pusillum as originally described, and not of D. murrayi.

The range of D. axillare extends from Bulgaria to S Romania and continues to NE Romania where a separate subspecies, D. axillare moldavicum, was described (Dascălu & Fusu, 2012) (Fig. 5). In our COI trees, the populations of the two subspecies did not cluster together: D. axillare axillare forms a distinct barcode cluster while D. axillare moldavicum is nested within both D. murrayi and D. pusillum. Most populations have D. murrayi mitochondria except the population from Chirceşti which has D. pusillum pusillum mitochondria (Fig. 3). This last population was considered transitional by Dascălu and Fusu (2012). In all the localities of D. axillare moldavicum, including the type locality, D. murrayi is missing. The historically heavily forested Bârlad Plateau (Fig. 5) acts as a geographic barrier that impedes the spread of D. murrayi to the north (D. axillare crossed it somehow). In this context, the probability of D. axillare moldavicum being a first-generation hybrid is zero. Knowledge about the introgression also elucidates the origin of the characters used by Dascălu and Fusu (2012) to differentiate D. axillare moldavicum from the nominotypical subspecies. The bigger and more elongated body and the longer pronotal spines were most likely taken from D. murrayi. Because of its hybridogenic origin we prefer to consider this taxon as a subspecies rather than as a species (Dorcadion moldavicum) following Danilevsky (2020).

In most animal species, mitochondria are passed to the next generation by females. A possible explanation of the unidirectional mitochondrial transfer lies in the contrasting mobility of the two sexes in Dorcadionini. Males are more mobile than females, as shown for I. fuliginator by Baur et al. (2005), and hence, the likelihood for them to cross a geographic barrier such as the Danube River is higher. Our hypothesis is that stranded males of D. axillare axillare migrating from what is now Bulgaria encountered and mated with D. murrayi females, a species distributed north of the Danube. Even if some D. axillare axillare females also migrated to the north, the few mitochondria they passed to the next generations were likely lost by genetic drift or are rare and we could not detect them. Most documented hybridisation cases in animals are explained by the so-called Hubbs principle, stating that the acceptance rate of heterospecifics as mating partners is correlated to the rarity of conspecifics (Willis, 2013). Additionally, as shown by Richmond (2014) for Drosophila, it is possible that crosses between D. axillare axillare females and D. murrayi males are not possible due to differences in genitalia size, D. axillare axillare being a much smaller species.

In contrast to D. axillare, the two subspecies of D. pusillum endemic to Romania have their distribution area completely overlapping with D. murrayi (Fig. 5); however, they are frequently separated at the landscape level. Dorcadion pusillum prefers lowland habitats with slightly salty soils along small rivers, while D. murrayi prefers more steppe-like habitats on hill slopes (Dascălu, 2018 and author’s unpubl. data). That is why in the type locality of D. pusillum ochrolineatum, D. murrayi was not found, while in the type locality of D. pusillum vasiliscus, it is present in the area but not in the same place (Dascălu, 2018 and author’s unpubl. data). However, at Spătaru Forest (Fig. 5), both species share the same habitat and populations here could represent a hybrid swarm with introgression being an active phenomenon. The hybridisation was probably accelerated by the human-mediated encounter of otherwise ecologically distinct species, following regulation of water levels.

Similar to D. axillare moldavicum, in D. pusillum vasiliscus and D. pusillum ochrolineatum, the morphological characters used by Dascălu (2018) to distinguish them from the nominotypical subspecies are most likely derived from D. murrayi: reduction of dorsal and humeral elytral stripes and spots and a darker antenna.

Dorcadion murrayi and D. pusillum pusillum, which generated the hybridogenic taxa D. pusillum vasiliscus and D. pusillum ochrolineatum, are remarkably close genetically. The pairwise distance between D. murrayi and nominotypical D. pusillum of only 0.77% is the smallest of all species pairs that we examined. However, they are not particularly close based on morphology and we suspect that another hybridisation took place before the ongoing one, leading to the high similarity of their DNA sequences. Our best estimation in BEAST indicates that the COI sequences of the two species diverged only about 0.47 Myr ago. Within the current Quaternary glaciation, this would correspond to five interglacial periods ago, since the Quaternary is characterised by interglacials separated by periods of 0.1 Myr (Mudelsee & Stattegger, 1997; Jahn et al., 2003). Surprisingly, the end of the Marine Isotope Stage 13 (MIS 13, a warm interglacial period) is also dated at about 0.47 Myr ago (Railsback et al., 2015). The cooling of the climate after this stage and the onset of a glacial period are the likely driving forces behind the observed genetic divergence. Also, the MIS 14 glacial period that preceded MIS 13 was much warmer than other glacial epochs and this blended MIS 13 with the MIS 15 interglacial, resulting in a long and warm ‘super-interglacial style’ climate 0.621–0.478 Myr ago (Hao et al., 2015). This certainly led to considerable range expansion of the species followed by range contraction at its ending. One species captured the mitochondria of the other, and they diverged afterwards in allopatry until they met once again during the present interglacial and hybridised once more. Even if our molecular clock estimates are not accurate, it is still highly likely that the distribution areas of D. murrayi and D. pusillum overlapped at some other time in the past during their population retreat and expansion following climatic oscillations.
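The correspondence between the estimated divergence time and the number of glacial cycles is simple arithmetic, sketched below using only the two values given in the text (a divergence about 0.47 Myr ago and a cycle length of 0.1 Myr). The variable names are illustrative, and rounding to the nearest whole cycle is an assumption about how the figure of five was obtained.

```python
divergence_myr = 0.47  # estimated COI divergence time (Myr ago), from the text
cycle_myr = 0.1        # approximate spacing of interglacials (Myr), from the text

# Number of glacial-interglacial cycles elapsed since the divergence
cycles = divergence_myr / cycle_myr

# Assumption: round to the nearest whole cycle
cycles_rounded = round(cycles)

print(f"about {cycles:.1f} cycles, i.e. roughly {cycles_rounded} interglacials ago")
```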

Given the apparent ease with which these three species can hybridise, one can ask whether they are truly distinct species. An argument in favour is the endophallus, characteristic and distinct for each of them (Dascălu, 2018; Dascălu & Fusu, 2012); combined with differences in habitus and colour, this clearly indicates that they are distinct species. Based on biogeographic data, D. axillare is a Balkan species (Dascălu & Fusu, 2012), and most of the distribution range of D. pusillum coincides with the Pontic-Caspian steppe (Dascălu, 2018), while D. murrayi is known almost exclusively from Romania (Fig. 5). Therefore, it is likely that the three species evolved in allopatry and their range overlap is recent, caused by colonisation of new areas after the end of the last Ice Age (Taberlet et al., 1998). Since they were initially allopatric, no assortative mating mechanisms to prevent hybridisation evolved, but they may appear by reinforcement (Mallet, 2005) now that the species are partly sympatric. The continuous introgression between the three species is delayed by two geographic barriers: the Prut and Danube rivers. Even if they were crossed by D. pusillum and D. axillare, the third species, D. murrayi, mostly did not cross them and hence it did not colonise the areas south and east of Romania. Outside Romania, this species is known from one locality in Serbia, close to the Danube (Ilić & Ćurčić, 2015) (Fig. 5).

Dorcadion decipiens

The taxonomic status of D. decipiens based on molecular data is not clear. We included only one specimen in the analysis, and it is retrieved as sister to nominotypical D. pusillum only in the BI analyses (Fig. 2). In the ML analyses (Fig. 1), it is retrieved within nominotypical D. pusillum, but on a long branch. Probably because of this placement of D. decipiens, in the PTP analysis the nominotypical D. pusillum from Ukraine and D. decipiens are both distinguished as separate species; mPTP includes D. pusillum, D. decipiens and D. murrayi in the same species. In GMYC, D. decipiens is always included in the nominotypical D. pusillum, even if on the tree they are reciprocally monophyletic (Fig. 2) and are separated by 12 substitutions, three times more than D. murrayi and D. pusillum pusillum (Fig. 3). Morphologically, D. decipiens has a pubescence pattern similar to that of nominotypical D. pusillum but it is comparatively large and elongated, similar to D. murrayi. The mean interspecific distance between D. decipiens and these two species is 2 to 3% (Tables S2 and S4), at the threshold between intraspecific and interspecific variability.

Dorcadion equestre

Unfortunately, we could not include in the molecular analyses specimens from the type locality of D. equestre. The species was described from ‘South Russia’, which at the end of the eighteenth century also stretched over present-day Ukraine, where the species is quite widespread (Zagaykevich & Puchkov, 2009), but it is also distributed in the south of present-day European Russia (Plavilstshikov, 1958). There is no disruption in the species distribution in Ukraine, NE Romania and Republic of Moldova, but a distribution break exists between the above-mentioned area and the populations from the Romanian Plain (Walachia) and Transylvania, type locality for D. equestre transsilvanicum (Fig. 6). This break along the Iaşi escarpment was most likely generated by the historically heavily forested northern part of the Bârlad Plateau. Due to the large genetic divergence between specimens from these two disjunct areas, we include specimens from NE Romania and Republic of Moldova in the nominotypical subspecies rather than in D. equestre transsilvanicum as done by Danilevsky (2020) and Pesarini and Sabbadini (2010). In these specimens, the development of dorsal carinae and the presence of black mottling, characters that differentiate the two subspecies (Pesarini & Sabbadini, 2013), are more similar to specimens from Ukraine (based on 9 specimens in first author’s collection) than to those from southern Romania.

Fig. 6

North-western distribution limit of D. equestre. Black circles: D. equestre equestre, red circles: D. equestre transsilvanicum. Yellow crosses: localities for the barcoded specimens. Distribution based on published localities, first author collection and other public collections

In our GMYC species delimitation scenario, D. equestre transsilvanicum is delineated as a distinct species from the specimens we identified as nominotypical. The two groups are separated by a minimum of seven substitutions (Fig. 3). The analysis also included one specimen from Dobrogea, an area lying south of the Danube. It is differentiated as a potentially distinct species only by the multiple-threshold GMYC, being included in the nominotypical subspecies by the single-threshold approach (Fig. 2, GMYCs and GMYCm). The PTP approach, on the other hand, does not delineate D. equestre transsilvanicum and D. equestre equestre as two species (Fig. 2, PTP and mPTP). A common result for all methods is the delimitation of D. equestre reclinatum as a distinct species because it is as divergent from the nominotypical subspecies as, for example, D. aethiops is from D. fulvum. Since for this subspecies our analysis included only one sequence derived from another study (Giannoulis et al., 2020), more sampling from a larger area might narrow this gap. A large intraspecific divergence was also found in 16 other species of European Cerambycidae (Hendrich et al., 2015; Rougerie et al., 2015).

Dorcadion lugubre and D. lineatocolle

The COI sequences of D. lugubre are divided between two clades, with one sequence grouped with D. lineatocolle. This misplaced specimen is small for its species, about the same size as D. lineatocolle (Fig. 7). It is heterozygous for the ITS2 sequence, having a haplotype that groups with D. lugubre and one that groups with D. lineatocolle (Figs. 1, S2 and 7). According to Danilevsky (2014), specimens of D. lugubre from Struma Valley in Bulgaria are to be attributed to D. lugubre minkovae. Our specimen of D. lugubre with D. lineatocolle mitochondrial haplotype and with an ITS2 sequence of both species comes precisely from this area (Promachonas Village in Greece, just before the river Struma enters Roupel Gorge). Hence, based on morphology and molecular data, the specimen linPro1001 is actually a hybrid between D. lugubre lugubre or D. lugubre minkovae and D. lineatocolle, a species present in the area (Minkova, 1961). A comprehensive sampling (like the one undertaken for D. pusillum, D. axillare and D. murrayi) is needed to confirm this hypothesis.

Fig. 7

Habitus of voucher specimens (from left to right): Dorcadion lugubre lugubre (luKe1001 and luKe0801), D. lugubre × D. lineatocolle (linPro1001) and D. lineatocolle (linAng1001). All at the same scale. A portion of the ITS2 trace file for three specimens is included, showing double peaks in linPro1001 for the positions where the parental species differ. Scale bar represents 1 mm