Introduction

Paratuberculosis (PTB) or Johne’s disease is a widespread intestinal disorder that causes chronic enteritis. The disease is mostly associated with domestic ruminants such as cattle (Gopi et al. 2022; Kumar et al. 2020), sheep (Traveria et al. 2013), deer (Palmer et al. 2019; Paolicchi et al. 2001), goats (Fiorentino et al. 2012) and camelids such as alpaca (Salgado et al. 2016), but it is also associated with other wild and domestic species such as rabbits (Beard et al. 2001; Fox et al. 2018; Shaughnessy et al. 2013), foxes (Matos et al. 2014) and cats (Kukanich et al. 2013). Argentina has the fifth-largest cattle herd worldwide, with great production of meat and dairy products (Espeschit et al. 2017). PTB causes high economic losses for farmers and is thus an important issue for the country (Moreira and Tosi 1995).

The climate and soil conditions of Argentina determine that most of the cows are located in the Pampas region, in the central-east region of the country, where the seroprevalence of PTB ranges from 7.2% to 19.6% (Paolicchi et al. 2003). In contrast, the seroprevalence of PTB in Tierra del Fuego, the southernmost province of Argentina, is unknown. This province has been considered a bovine tuberculosis-free region since 2011 (SENASA 100/2011 resolution), an uncommon characteristic in the continent. The potential reason for this is that for the past 12 years, no cattle have been introduced and animals are born and raised inside this province.

The etiological agent of PTB is Mycobacterium avium subsp. paratuberculosis (MAP). This pathogen has a strong association with human Crohn’s disease (Singh et al. 2016; Timms et al. 2016). MAP is one of the most common non-tuberculous mycobacteria and a member of the Mycobacterium avium complex. The most frequent route of transmission is fecal–oral (Gopi et al. 2020), but vertical transmission has also been reported (Vasini Rosell et al. 2020; Whittington and Windsor 2009). A molecular typing technique based on Mycobacterial Interspersed Repetitive Unit (MIRU) and Variable Number Tandem Repeat (VNTR) loci has been developed to analyze the genetic polymorphisms among MAP strains (Thibault et al. 2007). This technique, also called Multiple-Locus Variable number tandem repeat Analysis (MLVA), is a simple and rapid procedure consisting of eight amplifications of different loci and their respective runs in agarose gels to show variations in length (number of repeats) in each locus. This technique can also be applied directly to clinical samples, which is particularly desirable for MAP because of the very slow-growing nature of this organism. Although more accurate methods to type microorganisms are available, this MIRU-VNTR protocol has been the most used worldwide to type MAP strains to date (Biet et al. 2012; Fernandez-Silva et al. 2011; Gioffre et al. 2015; Imperiale et al. 2017; Inagaki et al. 2009; Radomski et al. 2010; Stevenson et al. 2009; Thibault et al. 2007). On the other hand, Whole Genome Sequencing (WGS) technology can generate whole genome sequences within a reasonable time frame and provide very high resolution of genetic diversity. However, although the costs associated with WGS have decreased over time, they are still unaffordable for large-scale studies in developing countries such as Argentina.

Based on the above, the aim of this study was to analyze the genetic diversity of MAP isolates obtained from bovine and deer herds in Argentina by MLVA and to describe the phylogenetic relatedness between geographically distant isolates through WGS and core-genome analysis.

Materials and methods

MAP isolates

A total of 90 MAP isolates (Supplementary Table 1) obtained between 1990 and 2017 were selected from the collection of the Bacteriology Unit of the National Institute of Agricultural Technology (INTA)-Balcarce, Argentina. Archived isolates from cattle (n = 85) and deer (n = 5), recovered from different sample types (milk, feces, organs, or tissues), were originally obtained by convenience sampling. Isolates were chosen to maximize geographical diversity within the dataset. The isolates chosen were both from the Pampas region (Buenos Aires, Córdoba, Santa Fe, and La Pampa provinces), considered one of the most productive areas of the country, and from the northernmost and southernmost provinces (Salta and Tierra del Fuego, respectively), where according to the National Service of Animal Health (SENASA 2017), productivity is lower (Fig. 1).

Fig. 1 Geographical origin of the isolates and number of herds sampled by province

DNA extraction and MAP confirmation

Samples were incubated in Herrold’s egg yolk medium supplemented with mycobactin and pyruvate at 37 °C for at least 2 months until growth. Once colonies were grown, a loop was taken and suspended in sterile distilled water. The cells were lysed by heat shock at 99 °C for 1 h (Mixing block, Bioer) and then centrifuged at 10,000 × g for 5 min. Next, 2 µL of the supernatant was used as template for the PCR reactions. The identity of the isolates was confirmed by IS900-PCR followed by 1% agarose gel electrophoresis (Collins et al. 1993). High-quality genomic DNA was obtained using mini spin columns (Qiagen DNeasy® Blood & Tissue kit), following the kit instructions. DNA quality was tested using the Take3 plate in an Epoch Microplate Spectrophotometer (BioTek).

Genotyping by MLVA

MLVA genotyping was used to test eight different MIRU-VNTR loci, as previously described (Thibault et al. 2007). The loci investigated were VNTR292, MIRUX3, VNTR25, VNTR47, VNTR3, VNTR7, VNTR10, and VNTR32. The primers and PCR conditions were as previously suggested (Thibault et al. 2007), with minor modifications. The mixture consisted of 1X buffer (10 mM Tris–HCl pH 9, 50 mM KCl, 0.1% Triton X-100, 2.5 mM MgCl2), 1 µM of each primer, 0.2 mM of each dNTP and 1.25 U GoTaq polymerase (Promega). The mixture for MIRUX3 was supplemented with 2 µL of MgCl2 per reaction, and mixtures for VNTR 47, 3, 7, 10 and 32 were supplemented with dimethyl sulfoxide and betaine (Sigma). The annealing temperatures were as previously described (Thibault et al. 2007), with two exceptions: for VNTR 47, a touch-down protocol was used in which the annealing temperature was decreased by 1 °C per cycle during the first ten cycles, from 69 °C to 59 °C, and then set at 64 °C for 35 cycles; for VNTR 292, the annealing temperature was decreased by 2 °C and set at 56 °C (Gioffre et al. 2015). The PCR products were resolved by 3.5% agarose gel electrophoresis using a 100-bp DNA marker (INBIO Highway) or a 50-bp DNA marker (Promega), depending on the size of the expected product. A database from the National Institute of Agronomic Research in France (INRA) was consulted to search the INMV pattern/type derived from the numerical profile of each isolate (http://mac-inmv.tours.inra.fr/). DNA from MAP strain ATCC 19698 (INMV 2) was included as a control.
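The numerical profile mentioned above is obtained by converting the band size observed on the gel into a number of repeats for each locus. A minimal sketch of that conversion follows, assuming the amplicon is made of a fixed flanking region plus a whole number of repeat units. The function name, the tolerance, and all sizes used in the example are hypothetical and are not the values of the Thibault et al. (2007) scheme.

```python
def repeat_count(amplicon_bp, flank_bp, unit_bp, tolerance_bp=10):
    """Estimate the number of tandem repeats from an amplicon size.

    amplicon_bp:  band size read from the gel (bp)
    flank_bp:     combined length of the non-repeat flanking regions (bp)
    unit_bp:      length of one repeat unit (bp)
    tolerance_bp: accepted sizing error of the gel reading (bp), an assumption
    """
    raw = (amplicon_bp - flank_bp) / unit_bp
    copies = round(raw)
    # Reject readings that do not fall close to a whole number of repeats.
    if copies < 0 or abs(raw - copies) * unit_bp > tolerance_bp:
        raise ValueError(f"band of {amplicon_bp} bp does not match a whole repeat number")
    return copies


# Hypothetical locus: 141 bp of flanking sequence and a 53 bp repeat unit.
print(repeat_count(300, flank_bp=141, unit_bp=53))
```

The tolerance guards against assigning an allele to a band that sits between two expected sizes; such bands are better re-run on a gel than rounded.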

Discriminatory power

The allelic diversity (D) of each locus and the global discriminatory power of the complete MLVA scheme were determined using the Hunter and Gaston discriminatory index (Hunter 1990; Hunter and Gaston 1988):

$$D=1-\frac{1}{N\left(N-1\right)}\sum_{j=1}^{s}x_{j}\left(x_{j}-1\right)$$

where N is the number of unrelated strains tested, s is the total number of different types, and x_j is the number of isolates belonging to the jth type. The index was calculated using the online software: http://insilico.ehu.es/mini_tools/discriminatory_power/, University of the Basque Country, Spain.
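As an illustration of the Hunter and Gaston index, the calculation can be reproduced in a few lines of Python. This is only a sketch of the formula, not the online tool used in the study; the function name and the toy type labels are assumptions made for the example.

```python
from collections import Counter


def hunter_gaston_index(types):
    """Hunter-Gaston discriminatory index D.

    types: one type label per unrelated isolate (for example, an INMV pattern).
    """
    n = len(types)  # N in the formula
    if n < 2:
        raise ValueError("at least two isolates are needed")
    counts = Counter(types).values()  # x_j for each of the s types
    return 1 - sum(x * (x - 1) for x in counts) / (n * (n - 1))


# Hypothetical toy data, not the isolates of this study:
# six isolates of type A, three of type B and one of type C.
example = ["A"] * 6 + ["B"] * 3 + ["C"]
print(round(hunter_gaston_index(example), 3))  # 0.6
```

D is 0 when all isolates share one type and 1 when every isolate has its own type, so values near 0.5 indicate that one type dominates the sample.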

The relationship between the profiles was determined using the goeBURST algorithm (goeburst.phyloviz.net/) (Francisco et al. 2009). For this, clonal complexes were defined as MLVA genotypes linked through single-locus variants. The MLVA genotype associated with the most single-locus variants is considered the founder pattern.
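The founder rule described above can be sketched as follows. This is a simplified illustration, not the goeBURST implementation: it only counts single-locus variants and does not apply the additional tie-break rules of the algorithm. The function names and the four eight-locus profiles are hypothetical.

```python
from itertools import combinations


def locus_differences(a, b):
    """Number of loci at which two profiles carry different alleles."""
    return sum(x != y for x, y in zip(a, b))


def single_locus_variants(profiles):
    """Map each genotype to the genotypes that differ from it at exactly one locus."""
    slv = {name: set() for name in profiles}
    for (n1, p1), (n2, p2) in combinations(profiles.items(), 2):
        if locus_differences(p1, p2) == 1:
            slv[n1].add(n2)
            slv[n2].add(n1)
    return slv


def founder(profiles):
    """Genotype with the most single-locus variants (first one wins on ties)."""
    slv = single_locus_variants(profiles)
    return max(slv, key=lambda name: len(slv[name]))


# Hypothetical eight-locus profiles, for illustration only.
toy_profiles = {
    "T1": (3, 2, 3, 3, 2, 2, 2, 8),
    "T2": (4, 2, 3, 3, 2, 2, 2, 8),
    "T3": (3, 2, 3, 3, 2, 2, 1, 8),
    "T4": (2, 2, 3, 3, 2, 2, 2, 8),
}
print(founder(toy_profiles))  # T1
```

In this toy set, T1 differs from each of the other three profiles at a single locus, so it is reported as the founder.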

WGS and phylogenetic analysis

Four of the 85 cattle isolates were selected for WGS: Map 907-k32 (INMV 1), B35-S34 (INMV 1) and I47-S28 (INMV 2), all three from Buenos Aires, which represents the most productive region of the country, and Map L80 (INMV 1) from Tierra del Fuego. High-quality DNA was obtained as described above. Paired-end Nextera XT libraries were constructed and sequenced in a MiSeq sequencer (2 × 250 bp, Illumina). A quality trimming step was applied to raw reads by using Trimmomatic (Bolger et al. 2014). De novo assembly was performed using SPAdes v3.11.1 (Bankevich et al. 2012). Contigs were oriented using Mauve (Darling et al. 2004; Rissman et al. 2009) and the genome of M. avium subsp. paratuberculosis-K10 (GenBank accession number: SAMN02604086) was used as reference.

Fifty-four MAP whole-genome sequences were downloaded from GenBank (Supplementary Table 2) to provide a global phylogenetic analysis. Roary (http://sanger-pathogens.github.io/Roary) (Page et al. 2015) was used to build a pangenome including the four Argentinian strains, with a threshold of sequence identity ≥ 90%. Core-genome multiple sequence alignment was performed using PRANK and a maximum likelihood phylogenetic tree was generated using RAxML v8.2.11 (Stamatakis 2014). Node support was evaluated with 1,000 bootstraps. The phylogenetic tree, geographical location source, and MLVA type were visualized using iTOL v5 (Letunic and Bork 2016).
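For readers who wish to reproduce a comparable workflow, the sketch below builds the Roary and RAxML command lines from Python. It is not the exact invocation used in this study: the input GFF3 annotations, the output names, the thread count, the substitution model (GTRGAMMA), the rapid-bootstrap mode (`-f a`), the random seed and the `raxmlHPC` binary name are all assumptions. Only the 90% identity threshold, the PRANK core alignment and the 1,000 bootstraps come from the text.

```python
def roary_command(gff_files, out_dir="roary_out", identity=90, threads=4):
    """Roary call: -e builds the core-gene alignment with PRANK,
    -i sets the minimum BLASTP identity (%), -p the threads, -f the output folder."""
    return ["roary", "-e", "-i", str(identity), "-p", str(threads),
            "-f", out_dir, *gff_files]


def raxml_command(alignment, run_name="map_core", bootstraps=1000,
                  seed=12345, model="GTRGAMMA"):
    """RAxML call: -f a runs rapid bootstrapping plus a best-tree search.
    The model, seed and bootstrap mode are assumptions, not taken from the study."""
    return ["raxmlHPC", "-f", "a", "-m", model,
            "-p", str(seed), "-x", str(seed),
            "-#", str(bootstraps), "-s", alignment, "-n", run_name]


# Hypothetical file names; the commands are printed here rather than executed.
gffs = ["strain_A.gff", "strain_B.gff"]
print(" ".join(roary_command(gffs)))
print(" ".join(raxml_command("roary_out/core_gene_alignment.aln")))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once both programs are installed.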

MLVA of foreign isolates was performed in silico. MAP whole-genome sequences from the NCBI database were analyzed with the following three different tools to obtain a more accurate result: Unipro UGENE (Okonechnikov et al. 2012), Primer Map (http://www.bioinformatics.org/sms2/primer_map) and in silico PCR amplification tool (http://insilico.ehu.es/mini_tools/PCR/). The hybridization of the eight MIRU-VNTR primer pairs and the putative product size of each locus were evaluated.
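The principle behind the in silico MLVA can be sketched as a search for both primer sites in a genome sequence, followed by measurement of the distance between them. The example below is a deliberately simple illustration and does not reproduce any of the three tools cited: it requires exact primer matches, searches a single strand, and treats the sequence as linear. The function names, template and primers are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_amplicon(template, forward, reverse):
    """Predicted product length in bp, or None if either primer site is missing.

    The forward primer must match the template exactly; the reverse primer must
    match the opposite strand downstream of the forward site.
    """
    template, forward, reverse = template.upper(), forward.upper(), reverse.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical 28 bp template with a 10 bp stretch between the primer sites.
toy_template = "TTT" + "ACGTAC" + "GG" * 5 + "CCATGA" + "AAA"
print(in_silico_amplicon(toy_template, forward="ACGTAC", reverse="TCATGG"))  # 22
```

The predicted product size can then be compared with the expected sizes for each allele to assign a repeat number per locus. Draft assemblies may split a locus across contigs, in which case no product is predicted.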

Results

MLVA genotyping

IS900-PCR confirmed all the selected isolates as MAP. MLVA of our study sample yielded seven MAP genotypes. The dominant subtype, INMV 1 (n = 68), was present in all the herds sampled. Other genotypes present included INMV 2 (n = 6), INMV 33 (n = 5), INMV 3 (n = 4), INMV 16 (n = 2) and INMV 13 (n = 1) (Table 1). For three isolates from Pergamino and one from Chivilcoy (both localities from Buenos Aires province), the analysis of locus 292 showed two bands corresponding to alleles 3 and 4. The overall loci analysis suggested that these animals could be infected with two different strains, supported by the presence of both INMV 1 and INMV 2 genotypes in the herd, as observed in Table 2.

Table 1 Distribution of INMV genotypes classified by the number of herds and MAP isolates. Data are expressed as numbers and percentage over 90 isolates
Table 2 Allelic frequency of tandem repeats in each MIRU-VNTR over 68 isolates. Isolates with INMV 1/2 were excluded from this analysis

One of the four MAP isolates from Pehuajó (Buenos Aires province) showed the pattern INMV 13, which occurred exclusively in that locality. Similarly, INMV 16 was found only in MAP isolates from Las Colonias (Santa Fe province) (Table 1 and Fig. 2).

Fig. 2 GoeBURST clustering of INMV patterns. Different colors represent different locations

Finally, seven of the twenty-five herds analyzed presented more than one pattern (28%). All these herds showed different patterns in the same year, indicating the coexistence of strains with different genotypes.

Allelic diversity and discriminatory power

The discriminatory power (D) was calculated with 37 non-epidemiologically related isolates and reached 0.536. With regard to the discriminatory power of each locus, loci X3, 3, and 32 showed no allelic diversity, whereas locus 292 showed the highest D value with 3 different alleles. These results, shown in Table 2, are in concordance with other studies (Gioffre et al. 2015; Imperiale et al. 2017).

A cluster analysis was performed to study the relationship among MLVA genotypes. The GoeBURST analysis determined that INMV 2 is the primary founder and that all five other genotypes are derived from this genotype.

WGS and phylogenetic analysis

The core-genome phylogenetic tree clearly showed the presence of two lineages: one clustering the classical cattle type (C-type) strains and the other clustering the sheep type (S-type) strains (Fig. 3). All the strains sequenced in this study belong to the C-type lineage, where two branches could be differentiated. One of them grouped 8 out of 45 C-type strains from India, South Korea, Egypt, and the USA, whereas the other grouped most of the C-type strains (37/45). Strains of ovine origin (n = 6) were clustered under the same group (S-type) along with two strains from camelid hosts. This could be explained by horizontal transmission from sheep to other ruminants. The four Argentinian strains from our study were clustered into the broad branch of C-type strains. However, the INMV 1 strains were grouped together and separated from the Argentinian INMV 2 strain, which was related to strains from the USA, Germany and Portugal. The in silico MLVA could be achieved for 19 out of 54 strains. Despite the incomplete data, the overall results suggest that MLVA typing is not in accordance with the clustering obtained with the phylogenetic tree.

Fig. 3 Core-genome phylogeny of MAP. Whole-genome sequences obtained in this study are highlighted in light blue. The source, origin, and in silico MLVA are included. Accession numbers of all genomes are listed in Supplementary Tables 2 and 3

The statistics of the four Argentinian strains are shown in Supplementary Table 3.

Discussion

This study describes the analysis by MLVA and WGS of 90 MAP strains from Argentina, introducing MAP isolates from deer for the first time. The MLVA revealed the presence of nine different genotypes in Argentina, with a higher prevalence of INMV 1 over others (Gioffre et al. 2015; Imperiale et al. 2017 and this study). The prevalence of INMV 1 in the region was also reported in a systematic review from Latin America and the Caribbean (Correa-Valencia et al. 2021), together with INMV 2 and INMV 11. In this study, INMV 3 was described in the country for the first time, while INMV 5, INMV 8 and INMV 11, previously reported by Imperiale et al. (2017) and Gioffre et al. (2015), were not present in any of the 90 isolates. These differences could be due to the fact that these genotypes were found in a low percentage and in few herds, which makes them more difficult to isolate and suggests that they are possibly not as widespread as others. Genotype INMV 1 seems to be distributed all over the country, and has even reached the southernmost province, Tierra del Fuego. This was unexpected and of particular interest because Tierra del Fuego has been an isolated region considered free of tuberculosis since 2011 (SENASA resolution 100/2011) and because this is the first case of PTB reported in that region. Core-genome phylogenetic analysis demonstrated a close phylogenetic relationship between this southern isolate and others. This strain from Tierra del Fuego was isolated from a dairy cow with clinical PTB, also confirmed by strong positive results by ELISA serology. After diagnosis, this animal was culled and samples were taken to the laboratory where MAP was isolated and genotyped. One possible explanation is the recent introduction of a carrier animal to the province; however, the management practices in the area do not support this hypothesis. Thus, transmission through an intermediate host such as wildlife is possible (Corti et al. 2021), although underreporting of the disease in this region cannot be ruled out.

The main criticism around MLVA typing is the limited resolution between isolates and that the polymorphisms detected do not necessarily reflect the phylogenetic relationships between strains (Ahlstrom et al. 2015; Bryant et al. 2016). Despite this, the MLVA approach allowed us to describe some features of the productive system of Argentina. Seven of the twenty-five herds analyzed presented more than one strain, which is evidence of the genetic diversity of strains within herds. These herds showed different patterns in the same year, confirming that the simultaneous presence of multiple MAP genotypes is frequent, as reported previously (Gioffre et al. 2015; Perets et al. 2022). Moreover, a co-infection with two strains within the same animal was also observed in four isolates from two different herds. The coexistence of different strains in the herds strongly suggests the absence of animal monitoring prior to introduction into the herd (Ahlstrom et al. 2016). This represents a major risk factor for infection in herds and could be easily explained by the absence of a control program over time in the country. The deer herds studied shared a common feeding area with beef cows, a farming practice that could have led to interspecies infection. Previous reports of similar MAP genotypes from deer and cattle in co-grazing conditions provide evidence for interspecies transmission (Fritsch et al. 2012). The results obtained in the present study support the idea that there is no relation between the host and genotype and that MAP can infect a wide variety of species, making its eradication from a herd even more difficult (Shaughnessy et al. 2013).

A frequent concern about MLVA genotyping is the stability of the markers and whether this technique can be trusted for epidemiological studies. A previous study showed different genotypes from the same vaccine strain coming from different laboratories or batches, not only with MLVA, but also with IS900-RFLP (Thibault et al. 2007). Further studies tested genetic stability under controlled conditions, both in vitro and in vivo, and showed that MIRU-VNTR alleles remain stable after several passages (Kasnitz et al. 2013), not only for MAP isolates, but also for other mycobacteria such as Mycobacterium tuberculosis (Savine et al. 2002). In this regard, the use of these MIRU-VNTR loci is plausible, at least for short-term analysis. A bibliographical search suggests that the dynamics of circulating strains differ between Argentina and Europe. In Argentina, the frequency of patterns is clearly biased toward INMV 1 (Barandiaran et al. 2015; Imperiale et al. 2017), whereas in Europe INMV 2 is prevalent (Biet et al. 2012; Stevenson et al. 2009; Thibault et al. 2007). This observation could be explained by the different genetic structures of the dominant MAP genotypes, the breed of the host, or the combination of both factors.

One option to prevent the spread of the disease is the adoption of test-and-cull-based control strategies. However, it must be considered that the presence of wild animals infected with MAP in the environment could hinder the success of control programs, since these animals shed the bacteria in feces, representing a persistent and widespread source of infection (Fox et al. 2018). This emphasizes the importance of considering wild animals as reservoirs of the infection in the different environments of Argentina, as diverse as the Pampas and Patagonia regions. Molecular typing tools could help to support these programs and thus contribute to maintaining the herd health status and strengthening the regional economies of developing countries such as Argentina. This study represents the first report of whole-genome sequences of MAP in Argentina.