Equid alphaherpesviruses (EHVs) are double-stranded DNA viruses of the family Herpesviridae. Equid herpesviruses belong to either the Alphaherpesvirinae or the Gammaherpesvirinae subfamily [1]. The equid members of the Alphaherpesvirinae comprise six viruses (EHV-1, EHV-3, EHV-4, EHV-6, EHV-8, and EHV-9) [1]. EHV-1 is a common equine pathogen that has been extensively studied because of its economic significance [2]. EHV-1, EHV-3, and EHV-4 co-circulate in naturally infected horses in Argentina [3, 4].

EHV-1 can naturally recombine with EHV-4, EHV-8, and EHV-9 [5, 6, 7]. Most of the recombination events identified are located in ORF64, a double-copy gene found in the genome repeat regions (internal repeat, IR, and terminal repeat, TR), which encodes infected cell protein 4 (ICP4). The clustering of breakpoints in ORF64 indicates that this gene is a recombination hotspot [7]. Unlike other alphaherpesviruses, EHV-1 has no consensus subtype classification. Two methods have been described for classifying EHV-1. Phylogenetic analysis of whole genomes divides EHV-1 into 13 clades [5, 8]. Fingerprinting analysis has identified 16 mobility profiles, the most prevalent being EHV-1P and EHV-1B. The latter has been described as a natural recombinant between EHV-1P and EHV-4 in ORF64 [6].

This study analyzed the genetic variability of two EHV-1 isolates from Argentina. Strains E/745/99 and E/1297/07 were obtained from the Equine Virus Laboratory, Institute of Virology-INTA, and had been isolated in Argentina from aborted equine fetuses collected in 1999 and 2007, respectively. Electropherotypes were obtained by restriction endonuclease analysis (REA) with the BamHI and KpnI restriction enzymes (Promega, USA) [9]. Total DNA extraction, digestion, and electrophoresis were performed as previously described [9]. The electropherotypes were compared with those of reference genomes deposited in GenBank, using NEBcutter to simulate the restriction patterns of these genomes [10]. Strain E/745/99 showed a restriction pattern typical of EHV-1P strains, whereas E/1297/07 had a pattern specific to EHV-1B strains (data not shown).
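The in silico digestion step can be illustrated with a minimal Python sketch. It is not the NEBcutter implementation; it only assumes a linear genome held as a plain string and uses the known recognition sites of BamHI (G^GATCC) and KpnI (GGTAC^C). The function names `cut_positions` and `fragment_lengths` are hypothetical.

```python
import re

# Recognition sequence and top-strand cut offset within the site.
# Both sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "KpnI": ("GGTACC", 5),   # GGTAC^C
}


def cut_positions(seq, site, offset):
    """Return 0-based top-strand cut positions for every occurrence of site."""
    seq = seq.upper()
    # A lookahead is used so that overlapping occurrences would not be skipped.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def fragment_lengths(seq, enzyme):
    """Fragment sizes (bp) after digesting a linear molecule, largest first."""
    site, offset = ENZYMES[enzyme]
    cuts = cut_positions(seq, site, offset)
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


# Toy sequence, not a viral genome.
toy = "AAAAGGATCCTTTTTTGGTACCAA"
bam_fragments = fragment_lengths(toy, "BamHI")
kpn_fragments = fragment_lengths(toy, "KpnI")
```

Sorting the fragments by size mirrors the order in which bands would be read on a gel, which makes a predicted pattern easier to compare with an observed electropherotype.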

For sequencing, viral DNA was extracted from 200-µL suspensions of culture fluid (QIAamp viral DNA kit; Qiagen, Hilden, Germany) following the manufacturer's instructions. The Nextera DNA Flex Library Preparation kit (Illumina, USA) was used for library generation. The libraries were purified with AMPure XP (Beckman Coulter, USA) and quantified using a Qubit dsDNA HS assay kit (Invitrogen, USA). Library quality and fragment length were assessed on a Fragment Analyzer 5200 system (Agilent Technologies, USA) using the Standard Sensitivity NGS Analysis Kit (Agilent Technologies, USA). Whole-genome sequencing was performed on an Illumina MiniSeq platform (Illumina, USA; Facultad de Ciencias, Uruguay) using a MiniSeq Mid Output Reagent Cartridge (300 cycles, paired-end reads). Adapter and quality trimming and filtering of the raw data were performed with BBDuk, and clean reads (~200 nt) were mapped to the reference genomes AY665713 (30,004 reads) for E/745/99 and KF644569 (39,563 reads) for E/1297/07 using Minimap2 in Geneious Prime 2020.1.2 (https://www.geneious.com). The mean coverage was 271.1 for E/745/99 and 38.2 for E/1297/07. The consensus genomes of strains E/1297/07 and E/745/99 were annotated using the complete equine alphaherpesvirus reference genomes. Sequences were submitted to GenBank (accession numbers PP084325 for E/745/99 and PP084326 for E/1297/07). The total genome length was 147,655 nt for E/1297/07 and 150,220 nt for E/745/99. Both sequences contain 76 ORFs, with G+C contents of 56.4% and 57% for E/1297/07 and E/745/99, respectively. Alignments of the complete genomes were prepared using Multiple Alignment using Fast Fourier Transform (MAFFT) [11].
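As an illustration of how a G+C content such as those reported above can be derived from a consensus sequence, the following Python sketch counts G and C among unambiguous bases only. Excluding ambiguity codes and gaps from the denominator is an assumption made here, not a detail stated in the methods, and `gc_percent` is a hypothetical helper name.

```python
def gc_percent(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    # Ambiguity codes (e.g. N) and gap characters are ignored (assumption).
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)


# Toy input, not a viral genome: two of four unambiguous bases are G or C.
toy_gc = gc_percent("atgcNN")
```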

The sequences obtained were aligned with the EHV-1 and EHV-4 genomes available in GenBank to assess recombination events using RDP4 software [12]. The following methods were applied: RDP, BOOTSCAN, MAXCHI, CHIMAERA, 3SEQ, GENECONV, and SISCAN, the latter two only as secondary scans. Recombination events were accepted when detected by five or more algorithms with a p-value < 0.05 after Bonferroni correction. Events were confirmed with SimPlot, using the putative recombinant as the query, with a window size of 200 bp and a step size of 20 bp [13].
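The sliding-window comparison used for confirmation can be sketched in Python as follows. This is a simplified stand-in rather than SimPlot itself: it reports plain percent identity per window, whereas SimPlot can apply distance models. The 200-bp window and 20-bp step are taken from the text; the function name `window_identity` and the handling of gaps (gapped columns are skipped) are assumptions.

```python
def window_identity(query, ref, window=200, step=20):
    """Return (window_midpoint, percent_identity) pairs along two aligned sequences."""
    if len(query) != len(ref):
        raise ValueError("sequences must come from the same alignment")
    query, ref = query.upper(), ref.upper()
    points = []
    for start in range(0, len(query) - window + 1, step):
        # Compare only columns where neither sequence has a gap (assumption).
        pairs = [
            (q, r)
            for q, r in zip(query[start:start + window], ref[start:start + window])
            if q != "-" and r != "-"
        ]
        if not pairs:
            continue
        same = sum(q == r for q, r in pairs)
        points.append((start + window // 2, 100.0 * same / len(pairs)))
    return points


# Toy alignment: identical for the first 200 columns, different afterwards.
profile = window_identity("A" * 400, "A" * 200 + "C" * 200)
```

Running the query against an EHV-1 reference and an EHV-4 reference in turn and plotting both profiles would show a recombinant segment as a region where the two curves cross.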

BLAST analysis of the complete genome revealed that E/745/99 is closely related to Ab4 (NC_001491.2; 98.2% nucleotide identity), a neuropathogenic strain. Although Ab4 was isolated from an outbreak of paralysis in horses, experimental infections with this strain result in a higher frequency of abortion than of paralysis [14]. E/1297/07 has the highest nucleotide identity (99.8%) with the NY03 strain (KF644569), isolated in the USA in 2003. The E/1297/07 genome is a recombinant between EHV-1 and EHV-4. Recombination appears to have occurred in both repeat regions of the genome (Fig. 1), with breakpoints located at positions 112,276–113,013 (IR) and 146,548–147,331 (TR). The recombinant fragment can be subdivided into three parts: (i) a repeat sequence (5′-ACTAACCCGCCC-3′) in the intergenic region that differs in copy number between isolates (Fig. 1), (ii) a 185-nt unique sequence, and (iii) a 466-nt region of ORF64. NY03 and the Japanese isolate 5586 (AP012321.1) were previously identified as recombinants. These strains have the same recombination pattern as the Japanese strains 97c7, 97c5, 97c9, and 98c12 (GenBank accession numbers AB363623.1, AB183141.1, AB183142.1, and AB183143.1) [6]. Multiple alignment of these recombinant strains revealed that they share the same breakpoints but differ in the length of the recombinant region (Fig. 1). This variation in length is due to different copy numbers of the 12-bp repeat sequence (5′-ACTAACCCGCCC-3′) in the intergenic region (Fig. 1). The breakpoint in the intergenic region is flanked by a 15-bp repeat sequence (5′-GGAAGGGGAGGAGCA-3′) (Fig. 1), whose copy number differs between EHV-1 and EHV-4.
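Copy-number differences in the 12-bp repeat can be checked directly on a sequence. The Python sketch below finds the longest uninterrupted tandem run of a repeat unit; it assumes perfect, ungapped copies on the given strand, and the function name `longest_tandem_run` and the toy sequence are hypothetical.

```python
import re


def longest_tandem_run(seq, unit):
    """Return (start, copies) of the longest perfect tandem run of unit, or (-1, 0)."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    best = (-1, 0)
    for match in pattern.finditer(seq.upper()):
        copies = len(match.group()) // len(unit)
        if copies > best[1]:
            best = (match.start(), copies)
    return best


REPEAT_12 = "ACTAACCCGCCC"  # repeat unit given in the text

# Toy sequence with a run of three copies and a separate run of two.
toy = "TTT" + REPEAT_12 * 3 + "GG" + REPEAT_12 * 2
run = longest_tandem_run(toy, REPEAT_12)
```

Imperfect copies would split a run under this definition, so counts from such a scan are a lower bound when the repeat array contains point differences.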

Fig. 1 Graphical representation of the seven EHV-1 strains that have recombined with EHV-4. The recombination event involves the 3′ end of ORF64 (466 nt) and part of the intergenic region (orange rectangles). The boundaries of the recombinant region include a 4-nt unique sequence followed by a 12-bp repeat sequence (5′-ACTAACCCGCCC-3′) with a variable number of copies (red rectangle), all of which lie within the recombinant fragment. The initial breakpoint in the intergenic region is adjacent to a 15-nt repeat sequence (5′-GGAAGGGGAGGAGCA-3′) (gray rectangle). Only the IRs are shown for the 97c7, 97c9, 98c12, and 97c5 strains.

An identity matrix of the recombinant region, encompassing the 3′ end of ORF64 and the repeat region, showed that the recombinants are similar to EHV-4 genomes in this region. Similarity decreased when only the recombinant non-coding region was analyzed (Table 1). The 5586, 97c7, 97c5, 97c9, and 98c12 strains were most similar to the Irish strain NS80567, whereas the NY03 and E/1297/07 strains were most similar to the German strain DE17_1 (GenBank accession numbers AF030027.1 and MW892435.1, respectively) (Table 1).
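A nucleotide identity matrix of this kind can be computed from a multiple alignment with a few lines of Python. The sketch below is illustrative only: it assumes the aligned sequences are supplied as a name-to-string dictionary, skips columns where either sequence has a gap, and uses hypothetical names (`pairwise_identity`, `identity_matrix`) and toy sequences rather than the data in Table 1.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def identity_matrix(aligned):
    """Return a symmetric nested dict: matrix[name_a][name_b] -> percent identity."""
    names = list(aligned)
    matrix = {name: {name: 100.0} for name in names}
    for a, b in combinations(names, 2):
        value = pairwise_identity(aligned[a], aligned[b])
        matrix[a][b] = matrix[b][a] = value
    return matrix


# Toy alignment, not the sequences analyzed in the study.
toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "AC-TACGA"}
m = identity_matrix(toy)
```

Because gapped columns are excluded, sequences that differ only in repeat copy number would still score as identical here, so length differences in the repeat array need to be reported separately.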

Table 1 Nucleotide identity matrix of the recombinant region

Several explanations could account for the differences found in the recombinant repeat regions. First, the repetitive nature of the sequence could cause sequencing and assembly errors. Such difficulties could also lead to errors in assigning reads to IR or TR, which means that recombination may have occurred in only one of the two regions rather than in both [15]. Second, ancient recombination events could have altered the region through deletions or insertions. Finally, recombination events with similar breakpoints may have occurred independently, involving different EHV-4 strains. The existence of EHV-4 strains that differ in the copy number of the repeat sequence supports this last hypothesis.

The movement of equids and the trade in semen are major factors in the spread of equine diseases, allowing strains to be distributed between countries. The resulting cocirculation of both viruses and, consequently, coinfection favor recombination [16]. Sequence comparison suggests that the NY03 and E/1297/07 strains are similar and share the same recombination pattern, indicating a common origin. However, the Japanese strain is more divergent and likely arose from a separate recombination event. Recombination could be favored by the 15-bp repeat sequence (5′-GGAAGGGGAGGAGCA-3′) that flanks the intergenic breakpoint in both EHV-1 and EHV-4. Although the number of repeats differs (Fig. 1), this sequence might provide the homology required for recombination and allow multiple events in the same region.

These results improve our understanding of the nature of homologous recombination in equine alphaherpesviruses. We have expanded the known geographic distribution of recombinant viruses of this subfamily, reinforcing the evidence for recombination in the ICP4 gene between EHV-1 and EHV-4.