Introduction

Sand flies (Diptera, Psychodidae, Phlebotominae) are small insects whose females participate in the transmission of several pathogens, including protozoa, bacteria, and viruses (Maroli et al. 2013; Cecílio et al. 2022). This subfamily is highly diverse, especially in tropical regions, accounting for about 1060 species distributed worldwide (Galati and Rodrigues 2023). In the Western Hemisphere (Americas), there are 23 described genera, comprising about 555 species (Galati 2018; Galati and Rodrigues 2023). The Amazon biome is one of the most important in the world due to its high biodiversity, which is also reflected in the American sand fly fauna: the highest numbers of species are found in the Brazilian states that make up this region (Aguiar and Vieira 2018). Although the Amazonian sand fly fauna includes taxa from almost all Neotropical genera, some species of Bichromomyia, Lutzomyia, Nyssomyia, Psychodopygus, and Trichophoromyia deserve special attention because they are proven or putative vectors involved in the transmission of cutaneous leishmaniasis (CL) (Maroli et al. 2013; Brazil et al. 2015), a complex disease caused by several Leishmania species (Reithinger et al. 2007).

Trichophoromyia has been receiving attention because some of its species are considered proven or suspected vectors of Leishmania spp. (Santos and Silveira 2020). Trichophoromyia auraensis has been found with Leishmania lainsoni and Leishmania braziliensis DNA in Madre de Dios, Peru (Valdivia et al. 2012). In the western region of the Amazon biome, in the Brazilian state of Acre, this species was found with L. braziliensis DNA (Araujo-Pereira et al. 2017; Ávila et al. 2018) and has been considered a putative vector of Leishmania (Viannia) sp. (Teles et al. 2017), and in the state of Rondônia, it was found with Leishmania sp. DNA (Ogawa et al. 2016; Resadore et al. 2019).

Trichophoromyia currently has 45 described nominal species (Rodrigues et al. 2023). Most of the females are isomorphic (Galati 2018), and many of the males have very similar morphology, which makes it difficult to identify specimens and to produce dichotomous keys at the species level. Therefore, integrative taxonomy can help with species delimitation in this genus, male-female association, and detection of possible cryptic species. However, few studies have sampled Trichophoromyia spp. to analyze molecular markers (Rodrigues and Galati 2023).

Recently, the sequencing and analyses of the DNA barcoding (Hebert et al. 2003) fragment of the cytochrome c oxidase subunit I (COI) gene revealed that T. auraensis can be molecularly identified among other sand flies from the western Amazon region (Pinto et al. 2023a). However, due to its wide distribution, other populations must be analyzed to corroborate these results and assess the presence of different genetic lineages. Moreover, different genomic regions such as the nuclear internal transcribed spacer 2 (ITS2) can be evaluated to support the findings of the COI barcodes (Yao et al. 2010).

The state of Maranhão, located in the northeast region of Brazil, has a territorial area of 329,651 km² and seven million inhabitants (IBGE 2021), bordering the states of Piauí (east), Pará (west), and Tocantins (southwest). Different biomes are found in this state, including the Amazon rainforest, Cerrado, Caatinga, and Restinga, which present different climates and phytogeographic conformations (Rebêlo et al. 2010). So far, 92 sand fly species have been recorded for Maranhão (Rebêlo et al. 2010; Aguiar and Vieira 2018), making it the most diverse state in the northeast region. However, most of the sand fly surveys were carried out in fragments of secondary vegetation, modified primary forest, and, mainly, the peridomicile of human dwellings, where leishmaniasis transmission foci and outbreaks are most frequent (Rebêlo et al. 2000, 2001, 2010; Martins et al. 2004; Silva et al. 2010; Pereira Filho et al. 2015). Consequently, biological reserves and indigenous lands with large fragments of conserved forest have been poorly sampled, and their sand fly fauna may be more diverse than that reported so far.

Here, we report for the first time the presence of T. auraensis in the state of Maranhão, in an eastern Amazon area, updating the knowledge on the geographic distribution of this sand fly species. We also analyzed two molecular markers—the DNA barcoding fragment of the COI gene and the nuclear rRNA ITS2 region—in order to provide insights into the lineage diversification of this species in the Amazon biome, identifying the presence of cryptic diversity. Furthermore, the relationship of T. auraensis with different species of the same genus was analyzed.

Methods

Sample collection, processing, and geographic distribution

The entomological survey was performed in the municipality of Governador Newton Bello, state of Maranhão, Brazil, located in the west mesoregion and Pindaré microregion of the state. The area is part of the extreme east of the Amazon biome, with ombrophilous forest and a humid climate (Lima Costa et al. 2016). This municipality has an area of 1144 km² and an estimated population of 10,121 inhabitants (IBGE 2021). Collection sites were set in a forest fragment adjacent to the properties of the “Fazenda Planalto” (3°26′36.5″ S 46°15′08.6″ W), close to the border with the municipality of São João do Caru, state of Maranhão (Fig. 1).

Fig. 1

A, B Current geographic distribution of Trichophoromyia auraensis in the Amazon biome, sample collection site of the new record reported in this study (pink star, 2), and previously processed populations from western Amazon (green star, 1)

In order to compare both regions of the Amazon biome, we also processed samples collected in the municipality of Xapuri, state of Acre, located in the western Amazon, in the locality of “Seringal Floresta” (10°26′37.3″ S 68°36′07.8″ W). In addition, a collection was made in an Atlantic Forest fragment in the municipality of Murici, state of Alagoas, at “Fazenda Boa Sorte” (9°11′13.4″ S 35°55′27.6″ W) to assess possible Trichophoromyia species in a different biome.

Collections were carried out using CDC-type light traps installed 1.5 m above the ground and operated overnight from 6 pm to 8 am the next day. Collection dates were 16th May 2021, 10th October 2021, and 18th January 2023 for Fazenda Planalto, 29th April 2021 for Seringal Floresta, and 27th December 2022 for Fazenda Boa Sorte. Sand flies were stored in 70% ethanol at 20 °C. The legs were dissected and stored in a separate microtube for subsequent DNA extraction, while the remaining parts were clarified and slide-mounted using Enecê resin (Cerqueira 1943). Sand flies were morphologically identified using the dichotomous key of Galati (2018, 2023).

The geographical distribution of T. auraensis was mainly based on the catalog records of Martins et al. (1978) and Young and Duncan (1994), and on later records from Amazonian sand fly surveys (Hudson and Young 1985; Castellón et al. 1994; González and Devera 1999; Bejarano 2006; Azevedo et al. 2008; Valdivia et al. 2012; Ballart et al. 2016; Teles et al. 2016; Ogawa et al. 2016; Brilhante et al. 2021; Araujo-Pereira et al. 2017; Pereira Júnior et al. 2022). The geographic coordinates were processed in the Open Source Geographic Information System QGIS v.3.4 (https://qgis.org/en/site/).
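As an illustration of the coordinate handling, the sketch below converts the degrees-minutes-seconds coordinates of the three collection sites into signed decimal degrees, the form in which point layers are commonly loaded into a GIS. It is a minimal sketch and not the procedure used in this study; the function names, the regular expression, and the six-decimal rounding are assumptions introduced here.

```python
import re

# Matches strings such as 3°26′36.5″ S (degree sign, prime, double prime, hemisphere).
DMS_PATTERN = re.compile(r"(\d+)°(\d+)′([\d.]+)″\s*([NSEW])")


def dms_to_decimal(dms: str) -> float:
    """Convert one degrees-minutes-seconds coordinate to signed decimal degrees."""
    match = DMS_PATTERN.fullmatch(dms.strip())
    if match is None:
        raise ValueError(f"Unrecognized DMS coordinate: {dms!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value


# Coordinates as given in the text (latitude, longitude).
SITES = {
    "Fazenda Planalto": ("3°26′36.5″ S", "46°15′08.6″ W"),
    "Seringal Floresta": ("10°26′37.3″ S", "68°36′07.8″ W"),
    "Fazenda Boa Sorte": ("9°11′13.4″ S", "35°55′27.6″ W"),
}


def site_table() -> dict:
    """Return {site: (lat, lon)} in decimal degrees, rounded to 6 places."""
    return {
        name: (round(dms_to_decimal(lat), 6), round(dms_to_decimal(lon), 6))
        for name, (lat, lon) in SITES.items()
    }


if __name__ == "__main__":
    for name, (lat, lon) in site_table().items():
        print(f"{name}: {lat}, {lon}")
```

All three sites lie in the southern and western hemispheres, so both values are negative for every site.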

DNA extraction, amplification, and sequencing

Total DNA was extracted from the legs of each individual using Digsol buffer (EDTA 20 mM, Tris-HCl 50 mM, NaCl 117 mM, SDS 1%) and 2 μl of proteinase K at 20 mg/ml. For DNA amplification, the PCR Master Mix (Invitrogen, Thermo Scientific™) was used following the manufacturer’s instructions, with the primer pair LCO1490 (5′-GGTCAACAAATCATAAAGATATTGG-3′) and HCO2198 (5′-TAAACTTCAGGGTGACCAAAAAATCA-3′), which amplifies the DNA barcoding fragment of the COI gene (~ 658 bp) (Folmer et al. 1994; Hebert et al. 2003). The PCR conditions for the COI fragment were as follows: initial denaturation at 95 °C for 2 min, followed by 35 cycles of 95 °C for 1 min, 54 °C for 1 min, and 72 °C for 1 min 30 s, and a final extension at 72 °C for 10 min. We also amplified the internal transcribed spacer 2 (ITS2) region using the primer pair C1A (5′-CCTGCTTAGTTTCTTTTCCTCCGCT-3′) and JTS3 (5′-CGCAGCTAACTGTGTGAAATC-3′), which amplifies the rRNA ITS2 and short flanking regions of 5.8S and 28S (~ 500 bp) (Depaquit et al. 2000, 2002). The PCR conditions for the ITS2 region were as follows: initial denaturation at 95 °C for 2 min, followed by 35 cycles of 95 °C for 30 s, 62 °C for 1 min, and 72 °C for 1 min, and a final extension at 72 °C for 10 min. Samples were checked by electrophoresis using 1% agarose gels stained with GelRed (Biotium Inc.), and all positive reactions with the expected molecular size were sent to ACTGene Análises Moleculares (Brazil) for purification and sequencing of the PCR products in both directions (forward and reverse).
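The two thermocycling programs can be written down as data, which makes the differences between markers explicit and gives a rough lower bound on run time. This is a sketch only: the Step and Program names are hypothetical, and the totals count programmed hold times while ignoring ramping between temperatures, which depends on the instrument.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time


@dataclass(frozen=True)
class Program:
    name: str
    initial: Step
    cycle: Tuple[Step, ...]  # denaturation, annealing, extension
    cycles: int
    final: Step

    def hold_seconds(self) -> int:
        """Total programmed hold time, excluding temperature ramps."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.cycles * per_cycle + self.final.seconds


# Values transcribed from the PCR conditions described in the text.
COI = Program("COI", Step(95, 120),
              (Step(95, 60), Step(54, 60), Step(72, 90)), 35, Step(72, 600))
ITS2 = Program("ITS2", Step(95, 120),
               (Step(95, 30), Step(62, 60), Step(72, 60)), 35, Step(72, 600))

if __name__ == "__main__":
    for program in (COI, ITS2):
        print(f"{program.name}: {program.hold_seconds() / 60:.1f} min of hold time")
```

Under these assumptions the COI program holds for 134.5 min and the ITS2 program for 99.5 min; the difference comes from the shorter denaturation and extension steps in the ITS2 cycle.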

Sequence analysis

The electropherograms were manually checked in SeqTrace v.0.9 to remove primer sequences and assemble consensus sequences. The consensus sequences were submitted to BOLD Systems (Ratnasingham and Hebert 2007) and then to NCBI GenBank (Sayers et al. 2022) under accession numbers OQ945465–OQ945467 and OR260992 (COI) and OR271579–OR271587 (ITS2).

To check the identity of the sequences and the usefulness of the COI barcodes for species delimitation, we merged into our data the publicly available sequences of Trichophoromyia howardi (Young, 1979); Trichophoromyia octavioi (Vargas, 1949); Trichophoromyia peixotoi Rodrigues, Pinto and Galati 2023; Trichophoromyia reburra (Fairchild & Hertig, 1961); Trichophoromyia velezbernali Posada-López, Galvis and Galati 2018; Trichophoromyia viannamartinsi (Sherlock & Guitton, 1970); and T. auraensis, which were previously processed in DNA barcoding studies of Neotropical sand flies (Contreras Gutierrez et al. 2014; Pinto et al. 2015, 2023a; Posada-López et al. 2023; Rodrigues et al. 2023) (Table 1). For rRNA ITS2, we merged our data with previously deposited sequences of T. reburra and Psathyromyia shannoni (Dyar, 1929) to deduce the boundaries of ITS2 and the other gene sequences of the amplicon (Kuwahara et al. 2009).

Table 1 Nominal species, collection sites, number of specimens for each molecular marker, and GenBank accessions for Trichophoromyia species analyzed in this study

To assess the clustering pattern of sequences, neighbor-joining (NJ) dendrograms were constructed for each molecular marker in the software MEGA 11 (Tamura et al. 2021) with 1000 bootstrap pseudoreplicates and the K2P model. Also, phylogenetic gene trees were reconstructed using both maximum likelihood (ML) and Bayesian inference (BI) criteria. We performed the ML analysis using the IQ-TREE software on the web server http://iqtree.cibiv.univie.ac.at/ (Trifinopoulos et al. 2016), with automatic model selection and 10,000 ultrafast bootstrap pseudoreplicates. The BI tree was generated in BEAST v.2.6 (Bouckaert et al. 2019), using the nucleotide substitution models TPM3+G and F81+G, as indicated by the IQ-TREE analyses for COI and ITS2, respectively. The BI was performed using a strict clock and coalescent constant population priors, but different prior settings were also tested (see Rodrigues et al. 2020). Two runs were performed with 10,000,000 generations (sampling every 1000), and the trace logs were visualized in Tracer v.1.7 to check the convergence and effective sample size (ESS) values, which were all above 200. Sequences of Lutzomyia longipalpis (Lutz & Neiva, 1912) (KP112583) and P. shannoni (U48382) were included as outgroups for COI and ITS2 analyses, respectively.
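For reference, the K2P (Kimura 2-parameter) distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The sketch below implements this formula directly. It is not MEGA's implementation; the function name and the choice to skip gaps and ambiguous bases pair by pair are assumptions made here, and MEGA's gap-handling options may give slightly different values.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Returns a proportion (substitutions per site), not a percentage.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases for this pair
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from this study.
    print(k2p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

Multiplying the result by 100 gives the distance as a percentage.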

We identified the sequences at the molecular operational taxonomic unit (MOTU) level using the following algorithms: automatic barcode gap discovery (ABGD) (Puillandre et al. 2012), generalized mixed Yule coalescent (GMYC) (Fujisawa and Barraclough 2013), and Poisson tree processes (PTP) (Zhang et al. 2013). The first clusters sequences into MOTUs according to a pairwise distance matrix, while GMYC and PTP are coalescent-derived algorithms that seek to differentiate population stochastic processes from speciation events in phylogenetic trees. The ABGD analysis was performed on the web server https://bioinfo.mnhn.fr/abi/public/abgd/abgdweb.html, using the Kimura 2-parameter (K2P) model, Pmin = 0.005, Pmax = 0.1, and X = 1.0. We considered the recursive partitions with prior maximal distance P = 0.009729 and 0.006975 for the COI and ITS2 datasets, respectively. For GMYC delimitation, we submitted the BI gene trees to the web server at https://species.h-its.org/gmyc/ using both single and multiple thresholds. The species delimitation by PTP was performed by submitting the ML gene trees to the web server https://species.h-its.org/ptp/ using its default settings and 500,000 MCMC generations. The pairwise genetic distances were assessed for both intra- and interspecific comparisons using the K2P model in the software MEGA 11.
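The idea behind distance-based delimitation can be illustrated with a much simpler procedure than ABGD: group sequences by single linkage whenever their pairwise distance falls at or below a fixed cutoff. The sketch below is not ABGD, which infers the barcode gap recursively from the distance distribution. The function name, the toy labels, and the toy distances are hypothetical, and using the prior value P = 0.009729 as a fixed cutoff on distances expressed as proportions is an assumption made only for illustration.

```python
def cluster_by_threshold(labels, dist, threshold):
    """Single-linkage grouping: join sequences whose distance is <= threshold.

    labels: list of sequence names; dist: symmetric matrix (list of lists).
    Returns a sorted list of groups, each a list of labels.
    """
    parent = list(range(len(labels)))

    def find(i):
        # Follow parent links to the group root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if dist[i][j] <= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(find(i), []).append(label)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical distances (proportions), not values from this study.
    names = ["west_1", "west_2", "east_1", "east_2"]
    matrix = [
        [0.0, 0.002, 0.05, 0.05],
        [0.002, 0.0, 0.05, 0.05],
        [0.05, 0.05, 0.0, 0.0],
        [0.05, 0.05, 0.0, 0.0],
    ]
    print(cluster_by_threshold(names, matrix, 0.009729))
```

With these toy values the two western and the two eastern sequences fall into separate groups, because every distance between the two sets exceeds the cutoff.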

Results

In total, three male specimens morphologically identified as Trichophoromyia auraensis were collected in the last campaign at Fazenda Planalto (Fig. 2), representing a new sand fly record for the state of Maranhão, Northeast Brazil. These individuals were processed to obtain molecular markers COI and ITS2 and were compared with previously described populations of this same species, as well as other taxa of the Trichophoromyia genus.

Fig. 2

Male specimens of Trichophoromyia auraensis PS1 and PS2 collected in the Brazilian states of Acre, western Amazon (A), and Maranhão, eastern Amazon (B), respectively. The red arrows indicate the main diagnostic characteristics for this species, being the gonocoxite bristles cluster and the paramere. Scale bar: 100 μm (top) and 50 μm (bottom)

COI

The complete COI DNA barcoding fragments (i.e., 658 bp) of these three specimens were obtained. Moreover, a specimen from the municipality of Xapuri, Acre, was also COI-barcoded to confirm the identity of GenBank sequences of T. auraensis from Acre. No visual indication of pseudogenes and/or nuclear copies of mitochondrial origin (NUMT) was found.

The DNA barcoding analysis with the other Trichophoromyia species recovered well-supported clades in the BI tree (posterior probabilities > 0.95), but the ML analysis had difficulty recovering T. peixotoi and some T. auraensis specimens as well-supported monophyletic clades. Also, the species pair T. velezbernali/T. auraensis merged into a single clade. Trichophoromyia auraensis was split into two distinct, unrelated clades: the first comprised sequences of T. auraensis from the western region of the Amazon biome, which merged with Colombian T. velezbernali, and the second comprised specimens processed in this study from the state of Maranhão, eastern Amazon (Fig. 3). Thus, these two populations were analyzed as different putative species: T. auraensis PS1 and T. auraensis PS2, respectively.

Fig. 3

Phylogenetic gene trees for both BI (A) and ML (B) methods based on COI DNA barcoding fragment of Trichophoromyia species. Numbers near nodes indicate posterior probabilities (> 0.95) of the BI tree and bootstrap support (> 70) of the ML tree. The different colors indicate the MOTU species delimitation partitions made by the algorithm ABGD (P = 0.009729). The red bars indicate the species delimitation by GMYC (A) and PTP (B), each bar being a delimited MOTU. Trichophoromyia auraensis PS1 (Acre State) and PS2 (Maranhão State) are highlighted in green and pink, respectively

The species delimitation algorithms split the COI dataset into seven MOTUs by ABGD, eight by GMYC, and two in the PTP analysis (Fig. 3). The ABGD partitioned the data in accordance with the nominal species and the main clades of the BI gene tree, also merging T. auraensis PS1 and T. velezbernali. The GMYC analysis agreed with ABGD delimitation but also split T. peixotoi into two MOTUs (Fig. 3A). In contrast, PTP merged nearly all the analyzed Trichophoromyia species, except for T. reburra (Fig. 3B).

The intraspecific pairwise distances ranged from 0.0 to 1.4, while interspecific distances were generally greater than 1.58. The species pair T. auraensis PS1/T. velezbernali reached a minimum interspecific distance of 0.0 (mean 0.23) (Table 2). Regarding the T. auraensis populations (PS1 and PS2), the mean genetic distance between them was 5.17, which is higher than in other comparisons between morphologically distinct species (Table 2).

Table 2 Pairwise genetic distances (Kimura 2-parameters) between and within species of the genus Trichophoromyia based on COI DNA barcodes

ITS2

In total, nine Trichophoromyia rRNA ITS2 sequences were obtained for three species: T. auraensis (6), T. peixotoi (1), and T. viannamartinsi (2), which were analyzed together with a previously processed T. reburra sequence (AB479930). The sequence length varied between 277 and 284 bp after trimming the flanking regions of rRNA genes.

Similar to the COI dataset, the ITS2 alignment retrieved moderately supported (posterior probability > 0.7; bootstrap > 50) and well-supported clades (posterior probability > 0.95) for each nominal species. Despite the low genetic distances due to the small number of diagnostic characters, T. auraensis PS1 and PS2 formed different clusters and MOTUs in the ABGD analysis (Fig. 4). However, GMYC and PTP partitioned this dataset into two groups and one group, respectively, merging all the analyzed species into the same MOTU, except for T. reburra in the GMYC analysis (Fig. 4). Intraspecific K2P distances were 0.0 for all species, whereas interspecific K2P distances were greater than 1.1 between different nominal species and 0.7 between T. auraensis PS1 and PS2 (Table 3).

Fig. 4

Phylogenetic gene trees of BI (left) and ML (right) methods based on rRNA ITS2 sequences of Trichophoromyia. Numbers near nodes indicate posterior probabilities and bootstrap values above 0.7 and 50 respectively. The different colors indicate the MOTU species delimitation partitions made by the algorithm ABGD (P = 0.006975). The red bars indicate the species delimitation by GMYC (left) and PTP (right), each bar being a delimited MOTU. Trichophoromyia auraensis PS1 (Acre State) and PS2 (Maranhão State) are highlighted in green and pink, respectively

Table 3 Pairwise genetic distances (Kimura 2-parameters) between and within species of the genus Trichophoromyia based on rRNA ITS2 DNA sequences

Due to the low number of specimens available, it was not possible to conduct a more accurate morphometric analysis, but we checked the measurements of some morphological characters of the three specimens collected in this study (eastern Amazon), in addition to five specimens from the state of Acre (western Amazon) obtained from the entomological collection of the “Laboratório de Entomologia em Saúde Pública—Phlebotominae of the Faculdade de Saúde Pública/USP.” The characters were as follows: head length and width, clypeus length, FI, wing length and width, R5, gonocoxite length and width, gonostyle, dorsal and ventral paramere lengths, epandrial lobe length and width, sperm pump, aedeagal duct, and the ratio of aedeagal duct/sperm pump. No notable difference was verified in these characteristics (data not shown). Also, images of the voucher specimens processed in the study of Pinto et al. (2023a) (PS1) were recorded by Fiocruz/COLFLEB, and the identity of T. auraensis was confirmed.

Discussion

The state of Maranhão, which borders the state of Pará and forms the limit of the eastern Amazon, has records of only two species of Trichophoromyia: T. viannamartinsi and Trichophoromyia ubiquitalis (Mangabeira 1942). This state has a great sand fly diversity due to its rich phytogeographic regions (Rebêlo et al. 2010), and our new T. auraensis report increases the current list of sand flies from Maranhão to 93 species. This species is a putative vector of causative agents of cutaneous leishmaniasis in the Brazilian Amazon (Teles et al. 2017), a disease that is one of the major public health issues in Maranhão (Rebêlo et al. 2000, 2001; Martins et al. 2004; Gonçalves Neto et al. 2013). Primary vectors such as Nyssomyia whitmani (Antunes & Coutinho, 1939) have been found abundantly in transmission areas, and the participation of T. auraensis in this disease cycle remains unclear; it is probably not involved, because the species had not been found in the state before this study.

The Trichophoromyia auraensis holotype was described from a male collected from an armadillo burrow in the municipality of Aurá, state of Pará, eastern Amazon. This description also includes several paratypes from the bank of the Mamoré River, border of Brazil and Bolivia, western Amazon (Mangabeira 1942). Despite this, T. auraensis has rarely been found in the eastern region of the Amazon biome and, so far, has not been reported beyond eastern Pará. The most recent entomological surveys carried out in regions of Pará endemic for CL show the absence of this species, although other Trichophoromyia may occur (Silveira et al. 1991; Souza et al. 2010; Ferreira et al. 2014; Santos et al. 2019; Sánchez Uzcátegui et al. 2020; Teodoro et al. 2021; Pinto et al. 2022, 2023b). In contrast, in the western Amazon, T. auraensis has been found in great abundance and naturally infected with Leishmania parasites (Azevedo et al. 2008; Valdivia et al. 2012; Araujo-Pereira et al. 2014; Teles et al. 2016; Ogawa et al. 2016; Ávila et al. 2018; Pereira Júnior et al. 2019a, 2019; Silva et al. 2021). Thus, Sánchez Uzcátegui et al. (2020) argue that western populations of T. auraensis respond well under urbanization pressure, unlike the eastern ones. Moreover, the distribution of the species across the Amazon biome seems to be discontinuous (Fig. 1), indicating that its populations may be structured.

The molecular taxonomy analysis supports the hypothesis that the western and eastern populations of T. auraensis may be different species, identifying at least two genetic lineages in this nominal species. These two populations were split into different well-supported clades and MOTUs by ABGD/GMYC delimitation and show large genetic distances between them, which are sometimes higher than those between other nominal species. The interspecific distances were low (except for T. reburra), indicating that the diversification of this genus may be recent compared to other sand flies, which would agree with morphology-based phylogenies (Galati 2018). Because of this, distance-based species delimitation methods may have difficulty correctly delimiting Trichophoromyia spp. when they are analyzed together with other sand fly species with higher genetic distance values (e.g., Posada-López et al. 2023), which makes it necessary to adjust the analysis parameters. Generally, ABGD correctly sorted sand fly nominal species with the prior intraspecific divergence (P) between 1 and 2.5%, but for our dataset, the best partitions were obtained with P values of 0.97% and 0.69% for COI and ITS2, respectively. This low divergence may also be reflected in the lack of a structuring pattern in the ML tree and in the species delimitation by PTP. Despite the low nucleotide distances of both markers, especially ITS2, ABGD could delimit the species correctly, including the finding of cryptic diversity within T. auraensis. These results should be tested under validation approaches with multilocus datasets (Carstens et al. 2013).

DNA barcoding is a useful tool to detect sand fly cryptic diversity (Rodrigues and Galati 2023), especially when geographically isolated populations of widely distributed species are analyzed (Pinto et al. 2015; Rodrigues et al. 2018; Scarpassa et al. 2021; Posada-López et al. 2023). This type of hidden diversity uncovered by DNA sequences can indicate genetically structured populations of the same species, incipient species, or fully diverged taxa that retain ancestral morphological features (Struck et al. 2018). If these populations really are different species, the absence of morphological disparity may be due to recent divergence, morphological stasis, or convergence (Fišer et al. 2018; Struck et al. 2018). However, this preliminary hypothesis should be tested using a more comprehensive sampling effort in terms of different Amazonian populations and other nuclear molecular markers. Thus, our results should be interpreted with caution, and additional lines of evidence must be considered for the correct delimitation of this probable complex of cryptic species from an integrative taxonomy perspective.

The western population of T. auraensis (PS1) merged with the Colombian specimens of T. velezbernali into the same well-supported clade/cluster and MOTU. This complex association pattern of MOTU boundaries and nominal species, named “mixture” by Ratnasingham and Hebert (2013), can occur in other sand fly species (Rodrigues and Galati 2023), including some of the genus Lutzomyia (Pinto et al. 2015). Trichophoromyia auraensis from the state of Acre and T. velezbernali from the Amazonas department of Colombia are both from the western Amazon region, indicating that the COI barcodes may differ more between geographic locations than between morphologically distinguishable species. Trichophoromyia velezbernali is closely related to T. auraensis, being distinguished only by slight differences in the bristle clusters of the gonocoxite and the paramere shape of males (Posada-López et al. 2018). This discordance may result from introgressive hybridization between these two taxa, from retention of ancestral polymorphism in the analyzed molecular markers, or from the fact that these currently recognized species do not actually correspond to different biological entities (Rodrigues et al. 2020). The real taxonomic status of these species and populations must be evaluated in the future using coalescent-based species delimitation methods and multilocus sequence data, which enable a more powerful and robust inference when species and gene trees differ (Fujita et al. 2012).

Here, we reported for the first time the occurrence of the putative sand fly vector T. auraensis in the state of Maranhão, Brazil. This record updates the species list of this state to 93 sand flies. The analysis of COI DNA barcodes and the nuclear rRNA ITS2 of Trichophoromyia species detected a high nucleotide distance between the two populations of T. auraensis, which, in addition to the different epidemiological relevance of these populations, indicates that this taxon may represent a complex of cryptic species. This evidence should be evaluated with a more comprehensive sampling in terms of analyzed populations and molecular markers.