Introduction

The capybara, Hydrochoerus hydrochaeris (Linnaeus, 1766; type locality: Surinam, but restricted to the San Francisco River, Sergipe state, Brazil, by Feijó and Langguth 2013), is a semiaquatic herbivorous rodent widely distributed throughout most South American wetlands. Its geographical distribution extends from Panama to the south of Buenos Aires Province in Argentina. The taxonomic history of capybaras is complex, and some authors have distinguished a second species from H. hydrochaeris. This second species, H. isthmius, occurring in eastern Panama, northwestern Venezuela, and northern and western Colombia, was first described by Goldman (1912), with a type locality in Marragantí, near the head of tide-water on the Tuira River in the Darien region, near the border between Panama and northern Colombia. Based on morphology (Mones 1984; Mones and Ojasti 1986) and genetics (Moreira et al. 2012), H. isthmius was recognized as a distinct species in Mammal Species of the World (Wilson and Reeder 2005). However, isozyme (Peceño 1983) and mitochondrial DNA analyses (Ruiz-García et al. 2016a, b) revealed small genetic differences between H. isthmius and H. hydrochaeris, suggesting subspecific differences and agreeing well with Heinemann’s (1975) subspecies proposal (H. h. isthmius). Also in agreement with the latter author, Ruiz-García et al. (2016b) showed that putative populations of Colombian H. isthmius were genetically more closely related to some Amazonian H. hydrochaeris populations than the latter were to other Amazonian H. hydrochaeris populations. Thus, these molecular results agree better with H. h. isthmius being a subspecies than a species distinct from H. hydrochaeris.

Information about the geographical distribution and current population status of H. isthmius and H. hydrochaeris varies considerably between countries and regions. The Trans-Andean species H. isthmius is reported to be common in Panama but rare in Colombia and Venezuela (Moreira et al. 2012; Delgado and Emmons 2016) and is catalogued as Data Deficient by the IUCN (Delgado and Emmons 2016), whereas H. hydrochaeris is reported as common through most of its range and is listed as Least Concern in Brazil (ICMBio/MMA 2018), Paraguay (Saldívar et al. 2017), and Argentina (Bolkovic et al. 2019). In Uruguay, capybaras are considered a priority species (González et al. 2013).

Despite being widely distributed, the species faces many different conservation problems. In Venezuela and Colombia, the species’ current distribution is highly fragmented and has experienced population declines in many regions, even going extinct in some (Ojasti 1973; Maldonado-Chaparro et al. 2011; Herrera and Barreto 2012). In Brazil, the species is considered a local pest because of the damage it causes to crops (Paschoaletto et al. 2003) and is almost extinct in the Caatinga biome, possibly due to high hunting pressure and habitat alteration (Moreira 2004). In addition, the scarcity and even disappearance of the species in northeastern Brazil have been noted in recent decades (Paiva 1973). In Argentina, capybaras are usually abundant in protected sites, but outside of them, densities drop dramatically. In places where wetlands were transformed into cattle pastures, populations decreased dramatically or directly disappeared (Bolkovic et al. 2019).

Sequences from the first hypervariable region (HVRI) of the rapidly evolving noncoding control region (CR) have been extensively used in the last decade to analyze the genetic diversity, population structure, and historical population dynamics of the species at different geographical scales. The first study was carried out by Campos-Krauer and Wisely (2010) who performed a phylogeographic analysis of the species using non-invasive samples collected in 13 populations located in the Gran Chaco ecosystem of Paraguay. They obtained relatively low values of haplotype diversity and genetic signals of recent population range expansions caused by deforestation and cattle ranching. Borges-Landáez et al. (2012) assessed the genetic variability and population structure of five managed capybara populations from the seasonally flooded savanna of Venezuelan Llanos and found moderate to high levels of genetic diversity, a significant genetic structuring and a correlation between genetic and geographic distance, suggesting restricted gene flow with isolation by distance. Byrne et al. (2015) analyzed CR sequences from Argentina together with those available from Paraguay and Venezuela in order to test the role of river drainage in shaping the genetic structure of the species at the river basin and regional scales. At the regional scale, their results showed that populations from Venezuela, located in the Orinoco Basin, were genetically different from those in Paraguay and Argentina, which belong to the Río de la Plata basin. At the basin scale, only populations from Paraguay showed significant differences between basins and a significant correlation between the genetic distance and the geographical distance between populations measured through the rivers, while for Argentina and Venezuela, results were not significant. 
These results suggest that in Paraguay the current genetic structure of capybaras is associated with the lack of dispersal corridors along permanent rivers, whereas the limited structuring in Argentina and Venezuela is likely the result of periodic flooding facilitating dispersal. Ruiz-García et al. (2016b) sequenced the CR and Cytochrome b in 78 wild capybaras sampled in Colombia, Peru, Ecuador, and Brazil, as well as ten mitochondrial genes in 25 wild capybaras. They found 43 haplotypes, high levels of genetic diversity, and four well-defined populations in delimited geographical areas. However, their results do not support the view that two different species of capybara are present, as mentioned above, unless chromosomal speciation can be demonstrated between these two groups. Finally, Byrne et al. (2019) analyzed the genetic diversity, population genetic structure, and historical population dynamics of the species in the Chaco-Pampean region. Their results showed the existence of four haplogroups in the study area, low genetic diversity in three of them, evidence supporting the hypothesis that the Paraná and Paraguay rivers act as a migration corridor for the species, and historical population dynamics in which population expansions and secondary contact between genetic groups appear to be related to past climatic events.

Mitochondrial DNA analyses should be instrumental in helping to determine conservation strategies based on the systematic classification of differentiated populations below the species level. Uncertainty about these conservation units can lead to confusion in the establishment of management plans and errors in setting priorities (O’Brien 1994; DeSalle and Amato 2004). In this sense, the present study aims to combine the capybara mitochondrial DNA control region sequences obtained by the authors in previous studies at a regional scale (n = 124) with those available to date in the GenBank database (n = 263) to analyze the phylogeographic patterns of the species, the existence of geographical limits among the populations, and the taxonomic status of the different genetic units in a large portion of its distribution area in South America.

Materials and methods

Sampling

A total of 124 tissue, hair, or fecal samples were collected from different sampling sites in Colombia, Ecuador, Peru, Brazil, and Argentina (Table 1, Fig. 1). Tissue samples were obtained from animals found dead or hunted for food by indigenous people near their respective communities. Hair samples were obtained from live animals kept as pets in these communities. Fecal samples were collected in the field from different mounds separated by more than 50 m, in order to decrease the probability of re-sampling the same individual. Samples were preserved in a 20% dimethyl sulfoxide solution supersaturated with NaCl or in 96% ethanol during fieldwork and stored at -20 °C until DNA extraction was performed.

Table 1 Origin of capybara samples used in this study. * Putative H. isthmius individuals
Fig. 1

Location of capybara sampling sites in (A): Venezuela, Colombia, Ecuador, Peru, and Brazil and (B): Paraguay and Argentina. Sampling site numbers correspond to those described in Table 1. Genetic groups (see results): Red-Trans-Andean (includes sampling sites in the Napo River, cis-Andean area; see discussion section for details), Yellow-Llanos, Green-Western Amazon, and Blue-Río de la Plata

Mitochondrial DNA extraction, PCR amplification and sequencing

Different DNA extraction protocols were used depending on the sample source. Tissue samples were incubated overnight at 37 °C in an extraction buffer containing 10 μl of proteinase K (10 mg/ml), 5 μl of RNase (20 mg/ml), and 10% SDS. After incubation, DNA was isolated by phenol–chloroform extraction and alcohol precipitation. DNA from hair follicles was extracted with 10% Chelex resin (Walsh et al. 1991). For fecal samples, we followed the protocol described in Reed et al. (1997) with one modification: in the last step of the Wizard SV Gel and PCR Clean-Up System kit (Promega) protocol, centrifugation time was changed from 1 to 5 min in order to increase the amount of purified DNA. To avoid contamination among samples, DNA was extracted in a room where no PCR products were stored.

The amplification protocols also depended on the sample source and the quality of the DNA obtained, which was more degraded in fecal samples. Primers, PCR mix concentrations, and PCR profiles used are described in Byrne et al. (2019) and Ruiz-García et al. (2016a, b). All amplifications, including positive and negative controls, were checked in 1–2% agarose gels. The amplified samples were purified using membrane-binding spin columns (Qiagen) or the EXO/SAP method (Werle et al. 1994). The double-stranded DNA was directly sequenced in a 377A (ABI) automated DNA sequencer using the same oligonucleotide primers used in PCR amplification. Sequencing of some samples was repeated to ensure sequence accuracy. NUMTs were not expected because only a fragment of the mitochondrial control region was sequenced.

Control region sequences from capybaras inhabiting 13 populations located in the Gran Chaco ecosystem of Paraguay and five populations from the Venezuelan Llanos were obtained from GenBank (Fig. 1 and Table 1). With the inclusion of these sequences, the sampled area covers a large portion of the species' distribution, even though populations in most of Brazil and northeastern South America await study. Sequences from Paraguay had a length of 386 bp and were obtained from 110 individuals (Campos-Krauer and Wisely 2010; GenBank accession numbers GU456363-376), while those from Venezuela had a length of 545 bp and were obtained from 153 individuals (Borges-Landáez et al. 2012; GenBank accession numbers EU149767-EU149776). All these sequences were trimmed to a length of 292 bp, as this was the size of the HVRI fragment amplified in this study.

Data analyses

All sequences were aligned using Clustal W (Thompson et al. 1994) as implemented in MEGA X software (Kumar et al. 2018) and analyzed for polymorphic sites (S). Alignment parameters were set to 15.00 for the gap opening penalty and 6.66 for the gap extension penalty, for both pairwise and multiple alignments. The number of haplotypes (H), haplotype frequencies, and haplotype (Hd) and nucleotide (π) diversities (Tajima 1983; Nei 1987) were computed using Arlequin 3.5.2 (Excoffier and Lischer 2010).
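The two diversity indices can be illustrated with a minimal sketch. It assumes aligned, gap-free sequences of equal length and uses the unbiased haplotype diversity of Nei (1987) and the uncorrected mean number of pairwise differences per site for π. The function names and the toy sequences are hypothetical, and the sketch does not reproduce Arlequin's handling of gaps or substitution models.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Unbiased haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (no substitution-model correction)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total / n_pairs / length


# Hypothetical toy alignment: four individuals, three haplotypes, 8 sites.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(round(haplotype_diversity(toy), 3))
print(round(nucleotide_diversity(toy), 4))
```

In this toy case three haplotypes occur with frequencies 0.5, 0.25, and 0.25, so Hd = (4/3)(1 - 0.375), and the six pairwise comparisons total seven differences over eight sites.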

Two phylogenetic trees were obtained. The first one was a Maximum Likelihood (ML) tree constructed using RAxML v8.2.10 software (Stamatakis 2014). Best-fit models were selected using the Bayesian information criterion under a ‘greedy’ search scheme using a subset of models specific to RAxML. The GTR + G model (General Time Reversible model + gamma-distributed rate variation among sites, Lanave et al. 1984) was used to search for the ML tree. We estimated support for nodes using the rapid-bootstrapping algorithm (-f a -x options) with 1,000 non-parametric bootstrap replicates (Stamatakis et al. 2008). In this first tree, no outgroup was used, in order to accommodate debated issues about molecular dating of recent phylogenetic splits (Ho et al. 2008). The second was a Bayesian tree obtained with the BEAST v1.8.1 program (Drummond et al. 2012). Four independent runs were performed with six Markov Chain Monte Carlo (MCMC) chains sampled every 1,000 generations for 30 million generations after a burn-in period of three million generations. We checked for convergence using Tracer v1.6 (Rambaut et al. 2013). We plotted the likelihood versus generation and estimated the effective sample size (ESS > 200) of all parameters across the four independent analyses. The results from different runs were combined using LogCombiner v1.8.0 software (Rambaut and Drummond 2013a) and TreeAnnotator v1.8.0 software (Rambaut and Drummond 2013b). A Yule speciation model and a relaxed molecular clock with an uncorrelated log-normal rate distribution (Drummond et al. 2006) were used. Posterior probability (PP) values provide an assessment of the degree of support for each node on the tree. The outgroup employed for rooting the Bayesian tree was the paca, Cuniculus paca.

The median-joining network method (Bandelt et al. 1999) implemented in PopART 1.7 (http://popart.otago.ac.nz) was applied to our dataset to estimate the phylogeographic structure of haplotypes. This method, using a parsimony criterion, combines the minimum-spanning trees (MSTs) within a single network, allowing more detailed population information than strictly bifurcating trees (Posada and Crandall 1998).
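As a rough illustration of the building block behind such networks, the sketch below constructs a single minimum-spanning tree over haplotypes using the number of differing sites as edge weight (Prim's algorithm). This is not the median-joining algorithm itself: median-joining additionally merges all equally short spanning trees and infers unsampled median vectors, which this sketch omits. The haplotype labels and sequences are hypothetical.

```python
def hamming(a, b):
    """Number of sites at which two aligned sequences differ (mutational steps)."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(haplotypes):
    """Prim's algorithm over a dict {label: sequence}.

    Returns a list of (label_u, label_v, steps) edges. Ties are broken by
    label order, so only one of possibly several equally short trees is returned.
    """
    names = sorted(haplotypes)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge linking a haplotype in the tree to one still outside it.
        steps, u, v = min(
            (hamming(haplotypes[u], haplotypes[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, steps))
        in_tree.add(v)
    return edges


# Hypothetical toy haplotypes, not taken from the dataset.
toy_haplotypes = {"A": "AAAA", "B": "AAAT", "C": "AATT", "D": "CAAA"}
for u, v, steps in minimum_spanning_tree(toy_haplotypes):
    print(u, v, steps)
```

A star-like pattern, with several haplotypes each one step away from a frequent central haplotype, would appear in such a tree as many single-step edges sharing one endpoint.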

After that, an Analysis of Molecular Variance (AMOVA; Excoffier et al. 1992) was performed on the four clusters identified (see Results) using Arlequin 3.5.2. The significance of the observed Ф-statistics was tested using the null distribution generated from 10,000 non-parametric random permutations of the data matrix variables. Differences in allele frequencies (FST-like) were used to compute the distance matrix for ФST. Population pairwise ФST values were also calculated using Arlequin 3.5.2 software. The Kimura 2P genetic distances (K2P) (Kimura 1980) between populations were computed in MEGA X software.
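The two quantities in this paragraph can be sketched as follows. The first function computes the Kimura (1980) two-parameter distance between two aligned sequences; the second computes a one-level ФST from a matrix of squared distances between individuals, following the variance-component partition of Excoffier et al. (1992). Function names, group labels, and toy data are hypothetical; the permutation test is omitted, and the sketch accepts any squared-distance matrix rather than reproducing the exact distance setting used in Arlequin.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]  # skip gaps and ambiguous bases
    if not pairs:
        raise ValueError("no comparable sites")
    transitions = transversions = 0
    for a, b in pairs:
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    p = transitions / len(pairs)
    q = transversions / len(pairs)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def phi_st(dist, groups):
    """One-level AMOVA Phi_ST from squared distances dist[i][j] and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    k = len(labels)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more individuals than groups")
    ssd_total = sum(dist[i][j] for i in range(n) for j in range(i + 1, n)) / n
    ssd_within = 0.0
    sizes = []
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        sizes.append(len(idx))
        ssd_within += sum(dist[i][j] for a, i in enumerate(idx) for j in idx[a + 1:]) / len(idx)
    ssd_among = ssd_total - ssd_within
    var_within = ssd_within / (n - k)
    n0 = (n - sum(s * s for s in sizes) / n) / (k - 1)  # average effective group size
    var_among = (ssd_among / (k - 1) - var_within) / n0
    if var_among + var_within == 0:
        raise ValueError("no variation in the distance matrix")
    return var_among / (var_among + var_within)


# Hypothetical example: two groups, identical within, two differences between.
groups = ["north", "north", "south", "south"]
dist = [[0, 0, 2, 2], [0, 0, 2, 2], [2, 2, 0, 0], [2, 2, 0, 0]]
print(phi_st(dist, groups))  # 1.0: all variation lies between the groups
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"), 4))
```

ФST multiplied by 100 corresponds to the percentage of variation attributed to the among-group level in a one-level design.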

We used several methods to detect possible demographic changes across the natural history of the capybara. First, Tajima’s D (Tajima 1989) and Fu’s FS (Fu 1997) neutrality tests were carried out using Arlequin 3.5.2 software. Then, mismatch distribution analyses were performed using the same software. This method, based on an assumed stepwise growth model (Rogers and Harpending 1992), was used to evaluate: (1) whether there was a signature of population expansion, and (2) the timing of demographic expansion measured in units of mutational time. Approximate confidence intervals for growth parameters were obtained by a parametric bootstrap approach (1,000 replicates). The validity of the estimated stepwise expansion model was tested using the same parametric bootstrap approach. A goodness-of-fit test between the observed and simulated distributions of pairwise differences was performed. To quantify the smoothness of the observed haplotype frequency distribution, Harpending’s raggedness index (Harpending 1994) was applied. Statistical significance of the index was tested by a coalescent simulation (10,000 permutations) of the randomly expected value, as implemented in DnaSP v.5 (Librado and Rozas 2009). Moreover, the tau value (τ) obtained in the mismatch distribution analysis provides a rough estimate of the time when the sudden population expansion started (Rogers and Harpending 1992; Rogers 1995). The substitution rate suggested by Ruiz-García et al. (2016b) for the capybara mitochondrial control region (2.57 × 10⁻¹ substitutions/site/My) was used to convert mutational time (τ) into real time according to the equation τ = 2μt, where t is the time in years and μ is the substitution rate per generation. The generational time used was three years (Colin 1991; Nowak 1991). Finally, Bayesian skyline plots (BSP) were obtained with the BEAST 1.8.1 and Tracer 1.6 software.
The Coalescent-Bayesian skyline option in the tree priors was selected with five steps and a piecewise-constant skyline model with 40,000,000 generations (the first 4,000,000 discarded as burn-in). In the Tracer v1.6 software, the marginal densities of temporal splits were analyzed and the Bayesian Skyline reconstruction option was selected for the trees log file. A stepwise (constant) Bayesian skyline variant was selected with the maximum time as the upper 95% Highest Posterior Density (HPD) and the trace of the root height as the treeModel.rootHeight.
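The mismatch-based quantities described above can be sketched in a few lines. The code computes the observed mismatch distribution, Harpending's raggedness index r = Σ (x_i - x_{i-1})², and a conversion of τ into years. The conversion assumes the usual reading of Rogers and Harpending (1992), in which μ in τ = 2μt is the rate for the whole fragment per generation and t is first obtained in generations; this interpretation, the function names, and the example τ value are assumptions and are not taken from the analyses reported here. The rate, fragment length (292 bp), and generation time (three years) are those stated in the text.

```python
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    diffs = [sum(a != b for a, b in zip(s1, s2))
             for s1, s2 in combinations(seqs, 2)]
    if not diffs:
        raise ValueError("need at least two sequences")
    counts = [0] * (max(diffs) + 1)
    for d in diffs:
        counts[d] += 1
    return [c / len(diffs) for c in counts]


def raggedness(freqs):
    """Harpending's raggedness: sum of squared differences between successive classes."""
    padded = list(freqs) + [0.0]  # class d + 1 has frequency zero
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


def expansion_time_years(tau, rate_per_site_per_myr, n_sites, generation_years):
    """Solve tau = 2 * mu * t for t, with mu per fragment per generation (assumed)."""
    mu_per_generation = rate_per_site_per_myr / 1e6 * generation_years * n_sites
    t_generations = tau / (2.0 * mu_per_generation)
    return t_generations * generation_years


# Hypothetical toy alignment and a hypothetical tau of 2.0.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
freqs = mismatch_distribution(toy)
print(freqs, raggedness(freqs))
print(round(expansion_time_years(2.0, 0.257, 292, 3.0)))
```

Under these assumptions the generation time cancels out of the final figure, so the estimate in years depends only on τ, the per-site yearly rate, and the fragment length.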

Results

The 387 individuals analyzed presented 39 different haplotypes determined by 30 polymorphic sites, of which three were singletons. Two of the singletons were found in the sequences obtained from the 124 individuals sampled and the other in the GenBank dataset. Haplotypes 1, 19, 22, and 12 were the most frequent, being found in 74 (19.12%), 58 (14.99%), 47 (12.14%), and 41 (10.59%) individuals, respectively (Appendix I). None of the haplotypes were present in all the sampling sites. However, haplotype 1 was present in most of the sampling sites in Argentina and Paraguay; haplotypes 16, 18, and 19 were shared between Venezuela and Colombia; and haplotype 26 was shared between Ecuador and Peru. The best-fit substitution model was HKY + G under both the Bayesian and corrected Akaike information criteria (BIC = 2038.29, AICc = 1466.15).

The ML tree showed four genetic clusters (Fig. 2). However, bootstrap values were medium or low for three of these capybara groups. The first cluster to diverge corresponds to the Western Amazon group, which has the greatest statistical support (84%). It includes eight haplotypes (n = 21 individuals): six from the Peruvian Amazon (H32-37, n = 15 individuals) and two from the Venezuelan Llanos (H38-39, n = 6 individuals) that might have an Amazonian origin. The cluster showed a high value of haplotype diversity (Hd = 0.876) and a relatively high value of nucleotide diversity (π = 0.011). The second cluster to diverge corresponds geographically to the Río de la Plata basin (Paraguay and Argentina) plus the central Brazilian Amazon (Negro River). This cluster included 15 haplotypes (n = 162 individuals): one shared between Paraguay and Argentina (H01, n = 74 individuals), five from Argentina (H02-06, n = 21 individuals), eight from Paraguay (H07-14, n = 66 individuals), and one from Brazil (H15, n = 1 individual). This cluster also showed a high value of haplotype diversity (Hd = 0.717), although lower than that of the previous clade, and a relatively high value of nucleotide diversity (π = 0.011). Some Paraguayan haplotypes (H09-14) appear to be the most ancestral of the cluster. Interestingly, the individual sampled in the Negro River, in the central Brazilian Amazon, was associated with this cluster and not with the Western Amazon one. The last two clusters to diverge are the Trans-Andean and the Llanos. The Trans-Andean cluster included six haplotypes (n = 27 individuals): five from the Trans-Andean area of Colombia (H27-31, n = 23 individuals) and one shared between Ecuador and Peru (Cis-Andean area, Napo River) (H26, n = 4 individuals). The cluster showed a high value of haplotype diversity (Hd = 0.747) but a relatively low nucleotide diversity (π = 0.006).
Finally, the Llanos cluster included ten haplotypes (n = 177 individuals): four from Venezuela (H21-24, n = 72 individuals), three from Colombia (H17, 20 and 25, n = 21 individuals), and the other three shared between Venezuela and Colombia (H16, 18–19, n = 84 individuals). Like the other three clusters, it presented a high value of haplotype diversity (Hd = 0.791) but, like the Trans-Andean cluster, a relatively low value of nucleotide diversity (π = 0.006).

Fig. 2

Maximum Likelihood (ML) tree for the 39 mitochondrial DNA control region (D-loop) haplotypes found. Nodes are labeled with bootstrap percentages. Haplotype numbers correspond to those described in Appendix I. * Calculation of the genetic diversity indexes of the Trans-Andean cluster does not include Haplotype 26. See the discussion section for details

In turn, the Bayesian tree showed two main clusters (Fig. 3), one composed of the Trans-Andean and Llanos haplotypes and the other by the Western Amazon and Río de la Plata haplotypes. Some haplotypes from Paraguay (H12-14) seem to diverge first, followed by all other haplotypes from the Río de la Plata basin and those from the Amazonian cluster. The cluster including the individuals from the Colombian and Venezuelan Llanos diverges later, followed by the Trans-Andean cluster.

Fig. 3

A Bayesian tree for the 39 mitochondrial DNA control region (D-loop) haplotypes. Nodes are labeled with posterior probability values. Haplotype numbers correspond to those described in Appendix I. Genetic groups: Red-Trans-Andean, Yellow-Llanos, Green-Western Amazon, and Blue-Río de la Plata

The Median Joining Network (Fig. 4) showed the presence of four consecutively connected sub-networks separated by 1 to 5 mutational steps. Taking into account the position of the outgroup (D-loop sequence from the paca, Cuniculus paca) in the network, the first cluster to diverge corresponds to individuals from the Colombian and Venezuelan Llanos, followed by the Western Amazon and Trans-Andean ones. The last cluster to diverge is composed of individuals from the Río de la Plata group. The star-like topology found in this cluster suggests that H01, the most frequent haplotype in the study area, represents an ancestral form from which the other haplotypes in the cluster derived. A similar pattern is observed in the other clusters, where H18, H31, and H32 seem to represent the ancestral variants.

Fig. 4

Genealogical relationships of the 39 capybara haplotypes analyzed. Circle areas are proportional to haplotype frequencies. Each small bar on the branches represents one mutational step. Haplotype names correspond to those described in Appendix I. *The Trans-Andean cluster includes one haplotype (H26) from the Napo River of Ecuador and Peru (Cis-Andean cluster). See the discussion section for details

The results of AMOVA showed significant differences between clusters (ФST = 0.723; P = 0.0001; Table 2). The greatest source of variation (72.26%) was found between clusters. Pairwise comparisons of ФST values also showed significant differences between the four clusters (Table 2). Kimura 2P genetic distances between clusters were equal to or lower than 0.042 (Table 2).

Table 2 Analysis of Molecular Variance (AMOVA), and Kimura 2P (K2P) genetic distances between clusters. Ф-statistics and significance of variance components (P) were tested by 10,000 permutations according to Excoffier et al. (1992). Significant differences are indicated in bold

The neutrality tests showed negative and non-significant (P > 0.30) FS values for all clusters, positive and non-significant (P > 0.58) D values for the Llanos, Trans-Andean, and Western Amazon clusters, and a negative and non-significant (P = 0.48) D value for the Río de la Plata cluster (Table 3). Consequently, the neutrality tests provide no evidence of demographic expansion for any cluster. However, mismatch distribution analyses showed a unimodal curve that did not differ significantly from the simulated pattern based on a recent expansion model for the Llanos, Trans-Andean, and Western Amazon clusters (P > 0.11, Fig. 5a-c). Raggedness values also provided statistical support for these results (P > 0.21). The time elapsed since expansion was estimated at 14,800 Years Before Present (YBP) for the Llanos, 14,200 YBP for the Trans-Andean, and 27,800 YBP for the Western Amazon cluster. On the other hand, the Río de la Plata cluster showed a multimodal curve (Fig. 5d). The Bayesian Skyline Plots (Fig. 5e-g) showed that the Llanos, Western Amazon, and Río de la Plata clusters underwent a decline in their female effective population sizes at about 60,000, 300,000, and 50,000 YBP, respectively, followed by an increase in the last 10,000 to 4,000 years. For the Trans-Andean cluster, the results did not resolve possible demographic changes well.

Table 3 Neutrality test estimates for each genetic cluster
Fig. 5

Demographic history of the four capybara genetic clusters identified. Mismatch distributions (left panel): Bars are observed distributions; dotted lines are expected distribution under a sudden expansion model. Bayesian skyline plots (right panel): The black line represents the estimated median and the grey zone the 95% highest posterior density (HPD) intervals. References: Llanos (a, e), Western Amazon (b, f), Trans-Andean (c), Río de la Plata (d, g)

Discussion

Determining how species are divided into genetically distinguishable units is fundamental for conservation planning. As evolutionary processes act at the intraspecific level, genetic differences and locally adaptive characters will accumulate in these units over time. This reservoir of genetic and phenotypic diversity increases the species’ ability to persist through environmental changes (Wang 2002). Thus, one of the main goals in conservation is to preserve the evolutionary potential of species by maintaining the diversity found in genetic units. Two levels of genetic differentiation are commonly recognized within a species: Evolutionarily Significant Units (ESUs) and Management Units (MUs) (Ryder 1986; Moritz 1994a, b; Vogler and DeSalle 1994; Funk et al. 2012). An ESU should be reciprocally monophyletic for mitochondrial haplotypes and show significant differences at nuclear loci, while an MU is defined as any population that exchanges so few migrants with other populations as to be genetically distinct from them (Avise 2004). In practice, MUs are identified by significant differences in allele frequencies at neutral marker loci. The identification of ESUs and MUs is primarily relevant to long-term management issues, that is, defining conservation priorities and setting strategies (Moritz 1994a). In this sense, the results of the Median Joining Network and the AMOVA agreed on the existence of four different genetic groups in our study area (i.e., Llanos, Trans-Andean, Western Amazon, and Río de la Plata basin; Fig. 1), although these groups are separated by a small number of substitutions in the network. The first group includes individuals from the Eastern Colombian Llanos, the northern Colombian Amazon (Guainia Department, which borders the Eastern Colombian Llanos), and Venezuela.
The second is composed of individuals from northern and western Colombia (Trans-Andean area), but also of a small population in the Ecuadorian and Peruvian Amazon area of the Napo River (Cis-Andean area). That is to say, this group includes two populations geographically distant from each other and separated by the Andes Mountains. Given that these populations occupy very different geographical and ecological areas, they should be separated into two different ESUs, as previously suggested by Ruiz-García et al. (2016b), or into two different MUs, or geographical populations, within the ESU. Exhaustive sampling along the Napo River would make it possible to establish the contact zone and the existence of a hybrid zone between this alleged ESU and that of the Western Amazon, which would confirm the existence of a single capybara species, as suggested by molecular data (see below). The next group is limited to the Peruvian Amazon, while the last one includes individuals from Paraguay and Argentina and one specimen from the central Brazilian Amazon (Negro River). According to the results of the median-joining network, these groups could be considered different ESUs, as they are reciprocally monophyletic for the mitochondrial DNA control region haplotypes analyzed. In this analysis, the Trans-Andean haplogroup (“a priori” H. isthmius) was intermixed with the cis-Andean haplogroups (“a priori” H. hydrochaeris), which agrees quite well with a single species of capybara. However, in the ML and Bayesian trees, the Río de la Plata haplotypes were clearly not monophyletic because two clusters were detected (which were not resolved in the ML tree). Additionally, the node support values were medium or low for three of these capybara haplogroups. Therefore, in the strict sense, only one ESU was found, corresponding to the Western Amazon haplogroup (node support values; ML = 84; Bayesian tree = 0.97).
Nevertheless, although the Trans-Andean (including the specimens from the cis-Andean area of the Napo River in Ecuador and Peru), Llanos, and Río de la Plata haplogroups did not have high bootstrap values or posterior probabilities, they were located in well-defined geographical areas, which is meaningful for biological conservation. With the analysis of more control region sequences or other mitochondrial or nuclear genes, the statistical support of these haplogroups will probably increase, and they could then be considered “true” ESUs. Our modest bootstrap and posterior probability values were a consequence of analyzing only one gene. Moreover, seventeen of the 39 haplotypes found in our study area are shared between sampling sites, but always between sites belonging to the same putative ESU (Appendix I). These results coincide with the proposal of Byrne et al. (2015), who suggested that the capybara population from Venezuela (Orinoco River basin) should be preliminarily considered a different ESU from the populations of the Río de la Plata basin, and are mostly coincident with the four ESUs proposed by Ruiz-García et al. (2016b). There are at least two other requisites for the definition of ESUs, i.e., significant differences in allele frequencies at nuclear loci and in phenotypic traits, mainly skull metrics, but also other characteristics such as body weight, life history, and behavior (Moritz 1994a; Frankham et al. 2010). To date, only two studies have used nuclear markers in capybaras (Herrera et al. 2004; Maldonado-Chaparro et al. 2011), but both are too limited in geographic range to define genetic units. Regarding phenotypic traits, Mones and Ojasti (1986) and Moreira et al. (2012) reported the existence of a latitudinal cline in the species, with an increase in body size and mass as latitude increases.
However, the first two authors warned that subspecies recognition could only be based on extreme populations and arbitrary limits.

Regarding the taxonomic status of the capybara genetic units, Mones and Ojasti (1986) recognized H. isthmius as a distinct species from H. hydrochaeris based on anatomical differences. However, Ruiz-García et al. (2016b) found genetic distance values lower than 3% at the mitochondrial control region between the Trans-Andean (representatives of the putative H. isthmius species) and the Llanos/Western Amazon populations. The authors concluded that their results agree better with a single species of capybara because their genetic distance values are typical of different populations or even subspecies, but not of different species, according to Avise (1994). Our results also agree with the idea of a single capybara species, as we found genetic distance values from 2 to 4.2% between the Trans-Andean population (putative H. isthmius) and all other populations (H. hydrochaeris). Following Avise (1994), Bradley and Baker (2001), Kartavtsev (2011), and Ruiz-García et al. (2014), the average genetic distance among full species of the same genus is around 8–11%. Clearly, the genetic distances obtained here were lower. On the other hand, Cabrera (1961) recognized three capybara subspecies: H. h. dabbenei from Paraguay and northeastern Argentina, H. h. uruguayensis from Uruguay and eastern Argentina, and the nominotypical form from the remainder of the range. Results obtained from the analysis of the mitochondrial DNA control region show no evidence to support the existence of two subspecies or ESUs within the Río de la Plata basin. Most of the sampling sites in Paraguay and Argentina, which cover the geographic areas mentioned by Cabrera (1961), share the most common capybara haplotype (H01, Appendix I), indicating the lack of reciprocal monophyly among populations. Moreover, Byrne et al.
(2019), analyzing the population genetic structure of the species in the Chaco-Pampean region, found a widespread haplogroup that suggests the existence of historical gene flow, possibly mediated by the Paraguay and Paraná rivers, between sampling sites located at the Gran Chaco ecosystem of Paraguay and the Chaco-Pampean region in Argentina. In summary, our results support the existence of four or five groups (potential ESUs when more molecular markers are analyzed) throughout the study area, but do not agree with the existence of the two species recognized by Mones and Ojasti (1986) or all three subspecies proposed by Cabrera (1961). However, conclusions regarding the existence of one or two capybara species based on a fragment of a single mitochondrial marker should be taken with caution. More genetic studies, complemented with morphological and ecological data, and including samples from the Darien region of Panama, whole mitochondrial genomes, and nuclear genes, may contribute to resolving this question.
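
As an illustration of the distance comparison discussed above, the following minimal Python sketch computes a mean between-group genetic distance from aligned sequences. It assumes the uncorrected p-distance (proportion of differing sites), which is a simplification and not necessarily the distance model used in this study or in the cited works; the function names and the toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_between_group_distance(group_a, group_b):
    """Average p-distance over all pairs with one sequence from each group."""
    distances = [p_distance(a, b) for a in group_a for b in group_b]
    return sum(distances) / len(distances)


# Hypothetical toy alignment (10 sites), for illustration only.
trans_andean = ["ACGTACGTAC", "ACGTACGTAT"]
other = ["ACGTACGTGC"]
print(f"{100 * mean_between_group_distance(trans_andean, other):.1f}%")
```

A value computed this way from real control region alignments could be set against the 2 to 4.2% observed here and the 8–11% reported among congeneric species, bearing in mind that corrected distance models generally give somewhat larger values than the p-distance.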

To date, several studies have estimated the genetic diversity of capybara populations using the HVRI of the mitochondrial control region (Campos-Krauer and Wisely 2010; Borges-Landáez et al. 2012; Ruiz-García et al. 2016ab; Byrne et al. 2019). Haplotype diversity reported in these studies is highly variable (H = 0.00–0.98), as is the sample size used to calculate it in each population. In the present study, in which the 387 HVRI sequences available for the capybara were analyzed, we found high values of haplotype diversity (H = 0.717–0.876) and relatively low to high values of nucleotide diversity (π = 0.006–0.011) in the four clusters identified. However, the long history of commercial exploitation and poaching of capybara populations in Argentina and Paraguay could affect the future of these populations (Bolkovic et al. 2006; Schivo et al. 2015). The effect of poaching on genetic diversity has been reported by Maldonado-Chaparro et al. (2011) for a capybara population in the Colombian Eastern Llanos. Analyzing five microsatellite markers in 31 capybaras from Hato Corozal, Colombia, the authors found that the population had passed through a very narrow recent bottleneck, possibly caused by illegal hunting. In addition, local extinctions mainly caused by high hunting pressure and habitat alteration have been reported in Venezuela, northeastern Brazil, and Argentina (Paiva 1973; Moreira et al. 2012; Bolkovic et al. 2019). Thus, even though the species as a whole presents high levels of genetic diversity, genetic monitoring of capybara populations and the identification of genetic units are important to guarantee adequate management plans for the species.
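
For readers less familiar with the two diversity indices reported above, the sketch below shows one common way of computing them from aligned sequences: haplotype diversity as H = n/(n − 1) · (1 − Σ p_i²), and nucleotide diversity π as the mean number of pairwise differences per site. This is a minimal illustration under the assumption of gap-free sequences of equal length; it is not the software used in this study, and the function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Haplotype diversity: H = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(sequences)  # identical sequences count as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(sequences):
    """Nucleotide diversity: mean pairwise differences per site over all pairs."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    total_differences = 0
    for a, b in combinations(sequences, 2):
        total_differences += sum(x != y for x, y in zip(a, b))
    n_pairs = n * (n - 1) / 2
    return total_differences / (n_pairs * length)


# Hypothetical toy sample: four sequences, three haplotypes, four sites.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(round(haplotype_diversity(toy), 3), round(nucleotide_diversity(toy), 3))
```

Because H depends only on haplotype frequencies while π also weighs how different the haplotypes are, a population can show high H together with low π, as in some of the clusters described above.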

Regarding the demographic history of the four genetic groups identified, the results of the BSP analyses suggest that three of these groups have undergone population declines in the last 300,000 to 50,000 years. Quaternary climatic fluctuations have been widely recognized as one of the main historical processes influencing the dynamics of natural populations in both the northern and southern hemispheres (Vuilleumier 1971; Hewitt 2011). Habitat contractions caused by Pleistocene glacial cycles have been proposed as a major process reducing population numbers and genetic diversity in different species that inhabit temperate regions of South America (Ruzzante et al. 2006). In particular, the late Pleistocene was characterized by arid to semiarid and cold conditions, with a mean decrease in temperature of between 5 and 7 °C in high-altitude areas (Quattrocchio et al. 2008; Peçanha et al. 2017). These conditions are very adverse for the capybara, a species for which water is an essential resource and whose distribution is associated with temperate mean annual temperatures (Byrne 2017). In this context, it is possible that the capybara population declines were associated with the climatic characteristics of the late Pleistocene.

Loss of genetic diversity due to genetic drift in populations that have declined can be followed by rapid range expansions as the climate becomes more favorable. The results obtained for the mismatch distribution and BSP analyses suggest that this could have occurred in the four genetic groups of capybaras in a period between the end of the Pleistocene and the first half of the Holocene (ca. 28,000–4,000 YBP). For example, our data suggest that the Río de la Plata population expansion would have occurred during the Holocene Climatic Optimum (7,000 to 3,000 YBP, Giorgis et al. 2015), a period characterized by heavy rainfall. This agrees with the observations of Byrne et al. (2019), who related the expansion of Argentinean populations to the formation of the Iberá Wetland and the increasing predominance of herbaceous vegetation associated with this water body (Fernández Pacella et al. 2011), which generated a habitat with optimal conditions for capybaras (Schivo et al. 2015). Similar processes could have occurred throughout the distribution of the species, where the heavy rainfall distinctive of the Holocene Climatic Optimum and the associated changes in vegetation may have played a very important role in habitat modeling (de Vivo and Carmignotto 2004). Regional changes in climate and hydrology had similar effects on other semi-aquatic species near the end of the Pleistocene (Lopes et al. 2006, 2007; Marquez et al. 2006) (Tables 4 and 5).

The geographical origin of the species has been previously analyzed by Ruiz-García et al. (2016b). Their results support the Western Amazon population as the original capybara population from which the Llanos and Trans-Andean populations derived. However, this previous study did not include the samples from Venezuela, Paraguay, and Argentina included here. When all these samples were included in the analysis, the ML tree showed that the first cluster to diverge corresponds to the Western Amazon group, in agreement with Ruiz-García et al. (2016b). Three other results of our study support the Western Amazon as the geographical origin of the species: this population shows the highest genetic diversity, the most ancient bottleneck according to the BSP results, and the oldest population expansion according to the mismatch distribution analyses. However, the Bayesian tree showed a deep divergence dividing the four or five genetic clusters into two main clades that appear genetically equidistant. Within the main clade where the Western Amazon group was placed, the first to diverge were ancestors of some individuals from Paraguay, whereas the Median Joining Network shows that Llanos is the genetic cluster most closely related to the outgroup, suggesting that this is the ancestral cluster. These conflicting results regarding which population is the ancestral one are probably due to the high variance associated with the use of only one mitochondrial marker. In conclusion, the results obtained when all the available samples were incorporated are not conclusive about the geographic origin of the species. New studies, including samples from sites not yet sampled and a larger number of mitochondrial and nuclear genes, could help elucidate the evolutionary history of the species.