Introduction

Capripox infection is a transboundary and World Organisation for Animal Health (OIE)-notifiable disease of sheep, goats and cattle caused by sheeppox virus (SPPV), goatpox virus (GTPV) and lumpy skin disease virus (LSDV), respectively, of the genus Capripoxvirus, family Poxviridae [1]. The geographical range of sheeppox and goatpox includes Africa (north of the equator), the Middle East, Iran, Iraq, Bangladesh, Afghanistan, Pakistan, China, India and European countries adjoining Middle-East regions, namely Turkey, Greece and Bulgaria [2]. Recent outbreaks in Vietnam, Mongolia, Morocco and Azerbaijan indicate the emergence of capripoxviruses (CaPVs) in previously unaffected regions [3]. These outbreaks pose a significant economic threat due to productivity losses, high mortality, hide damage, and restrictions on world trade [4, 5]. CaPVs cause a highly contagious disease characterized by fever, oculonasal discharge, and pock lesions on the skin and mucosae of the respiratory and gastrointestinal tract [6]. SPPV and GTPV have a host preference for sheep and goats, respectively, and are considered different entities that provide only partial cross-protection [7], although they have a very close antigenic relationship [8]. Apart from conventional diagnostic approaches, PCR-based diagnostic techniques and genome sequencing have provided sensitive and powerful tools for the identification and differentiation of SPPV and GTPV [3, 8]. It is possible to distinguish CaPVs at the genetic level by sequence comparison and phylogenetic analysis of individual genes or the whole genome sequence and also by polymerase chain reaction restriction fragment length polymorphism (PCR-RFLP) analysis [9, 10]. Multigene analysis is more reliable than analysis of single genes for determining the genetic relatedness of CaPVs because of their large genome size.
SPPV and GTPV can cause cross-infection, so virus identification based on the host animal species from which the virus was obtained is not valid. Recent developments in molecular biology have provided deep insights into the cross-species transmission of SPPV and GTPV, which makes it necessary to characterize the CaPV isolates from sheep and goats circulating at the field level [7].

The present study focused on genetic and structural analysis of the immunodominant virion core protein gene ORF095, a vaccinia virus A4L homolog [11, 12], using SPPV and GTPV isolates and clinical samples (n = 27). The vaccinia virus A4L gene encodes a highly conserved 39-kDa immunodominant virion core protein that plays a role in assembly and disassembly of the virion [13]. To date, genetic characterization of CaPV-A4L has been limited to some reports on expression in a prokaryotic host [14]. To our knowledge, this is the first study to analyze the full-length A4L gene of CaPVs at the sequence and structural levels using SPPV and GTPV isolates from India and elsewhere. A comparative analysis of A4L homologs of poxviruses was done to assess their evolutionary relationship.

Materials and methods

Virus isolates, clinical samples, and primers

A total of twenty cell culture virus isolates and seven field clinical samples of GTPV and SPPV maintained in the virus repository of the poxvirus laboratory of the Division of Virology, Indian Veterinary Research Institute, Mukteswar, were included in this study. All of the virus isolates and samples used in this study were originally collected from animals with natural sheeppox or goatpox infections at various geographical locations in India and at different times. Details of the virus isolates and samples used in this study and other isolates of CaPV, including LSDV isolates, used for sequence analysis are listed in Table 1. All SPPV and GTPV isolates were propagated in Vero cells obtained from ATCC (CCL81) using Eagle’s minimum essential medium (EMEM) with 2% bovine calf serum (BCS) as maintenance serum. Infected cells were harvested when 80% of the cells showed a cytopathic effect (CPE) and used for extraction of total viral genomic DNA (gDNA) using a QIAamp DNA Mini Kit (QIAGEN, USA) as per the manufacturer’s protocol. The clinical samples were processed as 10% homogenates in PBS and then used for gDNA extraction. Primers for amplification of the full-length A4L genes of CaPVs were designed based on the complete genome sequences of GTPV, SPPV and LSDV strains available in the GenBank database and synthesized commercially.

Table 1 Details of GTPV, SPPV and LSDV isolates/vaccines used in phylogenetic and multiple sequence alignment studies of CaPV A4L

PCR amplification, cloning and sequencing of the CaPV-A4L gene

The extracted CaPV gDNA was used for PCR amplification of the A4L gene with in-house-designed primers (ORF095 fwd, 5’-GGCCATGGCGATGGACTTCATGAAAAAATATAC-3’; ORF095 rev, 5’-GGAAGCTTTTTGCTGTTATTATCATCTAG-3’) covering the full length of the gene, using standard cycling conditions and an optimized annealing temperature. In brief, PCR amplification was carried out using 1X GoTaq High Fidelity PCR Master Mix (Promega, USA), 10 pmol of each primer and 1 µL of template DNA (~ 50-100 ng) under optimized cycling parameters: 95 °C for 5 min, 35 cycles of denaturation (95 °C for 45 s), annealing (53 °C for 45 s) and extension (72 °C for 1 min), followed by a final extension step at 72 °C for 10 min. The PCR product was then purified using a gel purification kit (QIAGEN, MD, USA) according to the manufacturer’s standard protocol and ligated into pGEM®-T Easy Vector (Promega, Madison, WI, USA) using a T/A cloning strategy (data not shown). The ligated products were then used to transform E. coli Top 10F’ cells by the heat shock method. Positive recombinant clones were selected using antibiotics and a blue-white colony system and sent for commercial sequencing (Delhi University South Campus, New Delhi, India). Clones were selected and sequenced twice before genetic analysis. The complete nucleotide sequences of the A4L genes of the different SPPV and GTPV isolates used in this study were submitted to GenBank, and their accession numbers are listed in Table 1.
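As a quick sanity check on the primers and cycling program described above, the following minimal Python sketch reports primer length and GC content, looks for two common restriction enzyme recognition sequences in the primer tails, and sums the programmed hold times. The primer sequences and cycling values are copied from the text; the choice of enzymes (NcoI, HindIII) and all function names are assumptions made for illustration, since the text does not state that restriction sites were designed into the primers.

```python
# Primer sequences and cycling parameters copied from the text above.
FWD = "GGCCATGGCGATGGACTTCATGAAAAAATATAC"
REV = "GGAAGCTTTTTGCTGTTATTATCATCTAG"

# Recognition sequences of two common cloning enzymes. Assumption: the
# text does not say that these sites were designed into the primers.
SITES = {"NcoI": "CCATGG", "HindIII": "AAGCTT"}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a primer."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def find_sites(seq: str) -> dict:
    """Map enzyme name -> 0-based index of the first match, if present."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


def total_hold_seconds(initial=300, cycles=35, steps=(45, 45, 60), final=600) -> int:
    """Sum of programmed hold times in seconds; ramping time is not included."""
    return initial + cycles * sum(steps) + final


if __name__ == "__main__":
    for name, primer in (("ORF095 fwd", FWD), ("ORF095 rev", REV)):
        print(name, len(primer), "nt", f"GC {gc_percent(primer):.1f}%", find_sites(primer))
    print("programmed hold time:", total_hold_seconds() / 60, "min")
```

The hold-time total ignores thermal ramping, so an actual run on a given thermocycler would take somewhat longer.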

Multiple sequence alignment and phylogenetic analysis

A multiple sequence alignment of CaPV-A4L gene sequences of Indian CaPV isolates from this study and those of other CaPVs retrieved from the GenBank database was generated using the Clustal W program [15]. Percent identity values were determined using the MegAlign program of the DNASTAR package (Lasergene 6.0, DNASTAR Inc., Madison, USA). Phylogenetic trees based on the nt and the deduced aa sequences of the A4L genes of CaPVs (Table 1) and other animal poxviruses (Table 2) were constructed by the neighbor-joining method in MEGA 7.0 software, using 1,000 bootstrap replicates [16].
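The percent identity values mentioned above can be illustrated with a minimal Python sketch that compares aligned sequences column by column. This is not the MegAlign implementation, whose exact treatment of gaps may differ; here a column in which both sequences have a gap is skipped and a gap opposite a residue counts as a mismatch (an assumption). The isolate names and sequences in the usage section are made up.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences of equal length.

    Columns where both sequences have a gap are skipped; a gap opposite
    a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_table(alignment: dict) -> dict:
    """Pairwise percent identity for every pair of named sequences."""
    return {
        (n1, n2): percent_identity(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }


if __name__ == "__main__":
    # Toy alignment with made-up names and sequences (not study data).
    toy = {
        "isolate_A": "ATGGACTTCATG",
        "isolate_B": "ATGGACTTTATG",
        "isolate_C": "ATGGAC---ATG",
    }
    for pair, value in identity_table(toy).items():
        print(pair, f"{value:.1f}%")
```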

Table 2 List of representative poxviruses with A4L gene sequences used in comparative phylogenetic analysis

Protein modeling and structural features of CaPV-A4L

The consensus amino acid sequence of the CaPV-A4L protein was analyzed using various protein structure prediction programs, including Protean (DNASTAR), PSIPRED [17], and COILS2 [18], for secondary structure prediction and identification of other protein characteristics. A structural model of CaPV-A4L was made using the I-TASSER online server, and the accuracy and quality of the predicted structure were estimated based on C-score, TM-score, and RMSD values [19, 20].

Results

Multiple sequence analysis of CaPV-A4L gene

PCR amplification of the full-length A4L gene of CaPV yielded an amplicon of ~ 500 bp containing an ORF of 486 bp in all of the isolates used in the study. The identity of all PCR products was confirmed by sequencing. A multiple sequence alignment revealed significant nt and aa sequence similarity among the Indian and foreign CaPV isolates under study. The nt and aa identity values are listed in Table 3. One synonymous mutation and eight non-synonymous mutations were found in the GTPV isolates, and five synonymous mutations and 13 non-synonymous mutations were found in the SPPV isolates. All of the nt and aa changes, including transversion/transition and synonymous/non-synonymous mutations, are listed in Table 4. The presence of guanine (G) at positions 136 and 146, adenine (A) at positions 93 and 216 and cytosine (C) at position 389 may be considered a ‘GTPV signature’. Similarly, the presence of A at positions 63, 168, and 276 and G at position 47 can be considered an ‘SPPV signature’. The presence of G at position 48 and C at position 98 can be considered an ‘LSDV signature’. All SPPV and LSDV isolates possess the ‘A’ nucleotide substitution at positions 400 and 431, with the exception of SPPV/Ahmedabad/2009/P5. In contrast, foreign SPPV isolates and LSDV isolates have a G nucleotide substitution at position 465 that is not found in Indian SPPV isolates. A multiple sequence alignment of CaPV ORF095 nucleotide sequences showing specific signature residues for each of the CaPV members is shown in Fig. 1.
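The two classifications used above (transition versus transversion, and synonymous versus non-synonymous) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: it assumes the standard genetic code, a 1-based nucleotide position counted from the first base of the ORF, and a hypothetical toy ORF rather than an actual A4L sequence.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
PURINES = {"A", "G"}


def substitution_type(ref: str, alt: str) -> str:
    """Transition = purine<->purine or pyrimidine<->pyrimidine."""
    if ref == alt:
        raise ValueError("not a substitution")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def classify(ref_orf: str, position: int, alt: str) -> dict:
    """Classify a substitution at a 1-based nucleotide position of an ORF."""
    ref_orf = ref_orf.upper()
    alt = alt.upper()
    idx = position - 1
    offset = idx % 3            # position of the base within its codon
    start = idx - offset        # 0-based index of the codon's first base
    ref_codon = ref_orf[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    codon_number = start // 3 + 1
    return {
        "codon_number": codon_number,
        "aa_change": f"{ref_aa}{codon_number}{alt_aa}",
        "effect": "synonymous" if ref_aa == alt_aa else "non-synonymous",
        "type": substitution_type(ref_orf[idx], alt),
    }


if __name__ == "__main__":
    toy_orf = "ATGAATGGTTAA"  # hypothetical ORF: M N G stop
    print(classify(toy_orf, 5, "G"))  # AAT -> AGT
    print(classify(toy_orf, 9, "C"))  # GGT -> GGC
```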

Table 3 Percentage identity of CaPV A4L coding sequences of Indian GTPV and SPPV isolates to other CaPV isolates
Table 4 List of synonymous and non-synonymous mutations present in CaPV A4L of GTPV and SPPV isolates in this study
Fig. 1

Multiple sequence alignment of the CaPV A4L gene showing species-specific signature residues for each member of the genus CaPV indicated by colored boxes
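The signature residues listed in the Results could, in principle, be checked programmatically. The sketch below is a hedged illustration only: it assumes that the reported positions are 1-based coordinates of the 486-bp ORF in the alignment, that a sequence is assigned to a species only when every signature position of exactly one species matches, and it uses synthetic sequences (not real isolates) for demonstration. The function names are hypothetical.

```python
# Signature residues as listed in the Results text: 1-based nucleotide
# position in the A4L alignment -> expected base.
SIGNATURES = {
    "GTPV": {93: "A", 136: "G", 146: "G", 216: "A", 389: "C"},
    "SPPV": {47: "G", 63: "A", 168: "A", 276: "A"},
    "LSDV": {48: "G", 98: "C"},
}


def signature_scores(seq: str) -> dict:
    """Fraction of each species' signature positions matched by seq."""
    seq = seq.upper()
    scores = {}
    for species, sites in SIGNATURES.items():
        hits = sum(
            1 for pos, base in sites.items()
            if pos <= len(seq) and seq[pos - 1] == base
        )
        scores[species] = hits / len(sites)
    return scores


def call_species(seq: str) -> str:
    """Return the species whose full signature is present, else 'unresolved'."""
    complete = [sp for sp, score in signature_scores(seq).items() if score == 1.0]
    return complete[0] if len(complete) == 1 else "unresolved"


def make_toy(species: str, length: int = 486, filler: str = "T") -> str:
    """Build a synthetic sequence carrying one species' signature (not real data)."""
    bases = [filler] * length
    for pos, base in SIGNATURES[species].items():
        bases[pos - 1] = base
    return "".join(bases)


if __name__ == "__main__":
    for sp in SIGNATURES:
        print(sp, "->", call_species(make_toy(sp)))
```

Partial scores from `signature_scores` may be more informative than a hard call for isolates that share residues with more than one species.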

Phylogenetic analysis of the CaPV-A4L gene

A phylogenetic tree based on nucleotide sequences of the CaPV A4L gene is shown in Fig. 2. The consensus tree revealed the presence of three distinct genetic clusters separating the three members of the genus Capripoxvirus, namely GTPV, SPPV and LSDV. This analysis demonstrated that GTPV and SPPV are more closely related to each other than to LSDV. The significance of this grouping was supported by bootstrap testing with 1000 replicates. All of the Indian GTPV strains except GTPV KT/Mukteswar/2012 clustered together with the Chinese GTPV isolate FZ. GTPV KT/Mukteswar/2012 clustered separately with the foreign GTPV isolates GTPV G20-LKV and GTPV Pellor. The SPPV group showed more intra-cluster diversity than the GTPV and LSDV groups. Unexpectedly, SPPV/Ahmedabad/2009/P5 formed a different subcluster from the other SPPV isolates, although the SPPV isolates in general showed significant intra-species sequence similarity. A phylogenetic tree based on the sequences of CaPV-A4L homologs from various animal poxviruses (Table 2) retrieved from the GenBank database is shown in Fig. 3. This analysis showed genus-specific grouping of all chordopoxviruses, with fowlpox virus (FWPV) clustering separately from all other chordopoxviruses. The CaPVs were found to be more closely related to swinepox virus (SWPV) than to any of the other chordopoxviruses.

Fig. 2

Phylogenetic tree based on nucleotide sequences of CaPV-A4L genes of Indian GTPV and SPPV isolates and other CaPV sequences available in the GenBank database (Table 1). The tree was constructed by the neighbor-joining method in MEGA version 7.0 with 1000 bootstrap replicates. Symbols in the figure indicate vaccine strains of GTPV and SPPV used in India
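The bootstrap replicates mentioned in the caption rest on resampling alignment columns with replacement and rebuilding the tree from each pseudo-alignment. The following minimal Python sketch shows only the resampling step; it is not MEGA's implementation, the tree-building step is omitted, and the function names and seed are assumptions chosen for illustration.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_cols = lengths.pop()
    # The same column indices are applied to every sequence so that
    # homologous positions stay together.
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def replicates(alignment: dict, n: int = 1000, seed: int = 0):
    """Yield n pseudo-alignments; each would be passed to a tree-building step."""
    rng = random.Random(seed)
    for _ in range(n):
        yield bootstrap_alignment(alignment, rng)


if __name__ == "__main__":
    toy = {"seq1": "ACGTAC", "seq2": "ACGAAC", "seq3": "TCGAAC"}  # made-up data
    first = next(replicates(toy, n=1))
    print(first)
```

The bootstrap support of a cluster is then the fraction of replicate trees in which that cluster is recovered.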

Fig. 3

Phylogenetic tree of poxviruses based on the deduced amino acid sequence of the A4L protein. Representative poxviruses and their A4L amino acid sequences (Table 2) were used to construct the tree by the neighbor-joining method in MEGA version 7.0 with 1000 bootstrap replicates

Predicted protein characteristics and structural features of CaPV-A4L

The full-length CaPV-A4L protein (161 aa, 18.68 kDa) was predicted to contain a total of 34 basic, 30 acidic, 47 polar, and 40 hydrophobic residues and no cysteine residues and to have an isoelectric point of 8.994. A schematic representation of the full-length CaPV-A4L is shown in Fig. 4A. The predicted antigenic index, hydrophobicity, and surface exposure of amino acid residues obtained using Protean are shown in Fig. 4B. A secondary structure prediction using PSIPRED revealed the presence of six α-helices (α1, aa 2-16; α2, aa 22-24; α3, aa 47-62; α4, aa 93-94; α5, aa 100-115; α6, aa 121-150), only one β-strand (aa 71-72) and many coils, as shown in Fig. 4C. The location of coiled-coil motifs (CCMs) in CaPV-A4L, predicted using COILS2, is shown in Fig. 5A. A coiled-coil probability plot using scanning windows of 14, 21, or 28 aa residues showed that the central portion of CaPV-A4L can potentially form coiled coils, with aa 54-89 (CCM-I) and aa 100-134 (CCM-II) predicted to have the highest probability using all three window sizes. A 3D structural model was made in I-TASSER after identification of templates and threading alignments. The final model had a C-score of -2.14, with an estimated TM-score of 0.46 ± 0.15 and an RMSD of 9.6 ± 4.6 Å (Fig. 5B).
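The residue-class counts reported above can be reproduced for any protein sequence with a short Python sketch. The grouping used here (basic K/R, acidic D/E, polar N/C/Q/S/T/Y, hydrophobic A/I/L/F/W/V) is an assumption, since the text does not state which scheme Protean applied, and the peptide in the usage section is hypothetical rather than the CaPV-A4L sequence. Under this grouping, residues such as G, H, M and P fall outside the four classes, which would be consistent with the reported counts summing to 151 of 161 residues.

```python
from collections import Counter

# Assumed residue groupings (the text does not state which scheme was used).
GROUPS = {
    "basic": set("KR"),
    "acidic": set("DE"),
    "polar": set("NCQSTY"),
    "hydrophobic": set("AILFWV"),
}


def composition(protein: str) -> dict:
    """Count residues per group; anything else is reported as 'other'."""
    counts = Counter(protein.upper())
    result = {
        name: sum(counts[aa] for aa in members)
        for name, members in GROUPS.items()
    }
    result["other"] = len(protein) - sum(result.values())
    return result


def charge_balance(protein: str) -> str:
    """Crude indicator: more K/R than D/E suggests a basic protein."""
    comp = composition(protein)
    if comp["basic"] > comp["acidic"]:
        return "basic"
    if comp["basic"] < comp["acidic"]:
        return "acidic"
    return "neutral"


if __name__ == "__main__":
    toy = "MDFMKKYTKRESDLNAVG"  # hypothetical peptide, not the CaPV-A4L sequence
    print(composition(toy), charge_balance(toy))
    # Counts reported in the text: 34 + 30 + 47 + 40 of 161 residues.
    print("residues outside the four reported classes:", 161 - (34 + 30 + 47 + 40))
```

An excess of basic over acidic residues, as reported (34 versus 30), agrees qualitatively with the predicted isoelectric point above 7.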

Fig. 4

Predicted characteristics of the virion core protein CaPV-A4L. a Schematic diagram of the CaPV ORF095 protein sequence, which is 161 aa residues (18.6 kDa) in length; b Predicted protein characteristics. (a) Kyte-Doolittle hydrophilicity plot, (b) Jameson-Wolf antigenic index, (c) Emini surface probability plot. c GTPV-A4L sequence with predicted secondary structure. The secondary structure was predicted using PSIPRED. Pink tubes indicate α-helices. β-strands are represented by yellow arrows. Sec str, secondary structure; aa #, amino acid number; α, helix region; β, strand region. The helix and strand regions are numbered sequentially

Fig. 5

Structural prediction of the CaPV A4L protein. a Probability plot for CCMs in GTPV-A4L predicted using the COILS2 program. Peaks in the graph indicate regions of higher coiled-coil probability. The default output of probabilities with scanning windows of 14, 21 and 28 aa residues is shown in green, blue, and red, respectively. b Predicted GTPV-A4L protein model made using the SPICKER program of I-TASSER. The closest structural analogs were myosin/tropomyosin contractile proteins (pdb hits: 2tmaB and 2dfsA) with TM alignment scores of > 0.630 and 0.621, respectively

Discussion

Capripoxvirus infection is generally considered to be host-specific in nature [6], but some strains have been reported to infect sheep, goats and cattle, with more severe disease evident in the homologous host [7, 13, 21]. Genus-level classification based on the host from which the virus was isolated is not reliable due to cross-species transmission, which has been reported frequently in the last decade [7, 22]. The antigenic relatedness of SPPV and GTPV makes them indistinguishable at the serological level, but they can be distinguished using molecular tools, which can help in understanding the molecular epidemiology of these viruses [9, 10, 23, 24]. A growing number of mixed-flock infections and cases of CaPV-like disease [25] has worsened the situation, substantiating the need for identifying the causal agent. In this study, ORF095, which encodes a VACV A4L homolog [12] and is one of the major immunogenic and conserved core proteins of CaPVs, was targeted to genetically characterize the viral isolates at the nt and aa level.

Sequence analysis of the CaPV-A4L protein revealed that all Indian GTPV isolates have high sequence similarity at both the nt and aa level to GTPV FZ, a Chinese isolate. The same pattern of clustering of Indian GTPVs with Chinese isolates was reported earlier [26]. However, GTPV KT-Mukteswar-2012 contained amino acid substitutions, namely N28S, Y83C and K96R, that were also present in the G20-LKV, Pellor and Gorgan strains. These isolates were identical to each other in the sequenced region at both the nt and aa level. All SPPV isolates from outside of India, namely, the SPPV 10700-99, SPPV A and SPPV Niskhi strains, had a common G substitution at nt position 405 that was not found in any Indian SPPV isolates. Although the A4L gene sequence is generally highly conserved among CaPVs, species-specific signature residues were found that can be used for genotyping or differentiation of strains within the genus. Being located in the core of the CaPV genome, the A4L homolog is highly conserved and may not face any immune pressure from the host species that would force genetic changes.

The phylogenetic analysis showed that CaPVs formed three distinct clusters, namely GTPV, SPPV and LSDV, as observed for other genes used for the characterization of CaPVs [9, 27]. The topology of the consensus tree also revealed that GTPV and SPPV are more closely related to each other than to LSDV, supporting the earlier conclusions based on analysis of whole genome sequences of CaPVs [11]. It has been suggested recently that GTPV and LSDV are more closely related to each other than to SPPV and that they emerged from a common ancestor closely related to SPPV. This was based on phylogenetic studies on different genomic segments [9, 11, 28]. More studies are needed to clarify the evolutionary history of CaPV.

CaPV-A4L is a highly conserved immunodominant core protein. Therefore, it is not surprising that the nt sequence of its gene and its deduced aa sequence were found to be more than 96.3% and 92.0% identical, respectively, in all CaPVs. Multiple alignments of A4L sequences and phylogenetic analysis did not reveal specific differences between the vaccine and virulent strains of GTPV and SPPV, indicating that A4L might not play a role in virulence or in the attenuation process. Subdivisions or subclusters were found in the GTPV, SPPV and LSDV groups with considerable bootstrap support, which is in agreement with previously reported studies examining the GPCR and RPO30 genes [23, 28]. One interesting feature of the A4L gene phylogeny is that the SPPV/Ahmedabad/2009/P5 isolate forms a separate subcluster between the GTPV and SPPV clusters. Multiple sequence alignment also revealed that this particular SPPV isolate had four nucleotide changes (T389C, A400G, A431G and A450T) in common with GTPV isolates. Therefore, sequencing of the entire genome of this isolate is warranted to investigate whether recombination might have occurred between SPPV and GTPV. Also, the C216T mutation was only observed in the LSDV Cro/2016, OBP, SIS Lumpyvax and NW-LW1959 isolates, which form a separate subcluster from other LSDV isolates in the phylogenetic tree (Fig. 2).

The ORF095 protein is a homolog of myxoma virus (MYXV) M093L [29] and vaccinia virus (VACV) A4L [13]. These viruses encode a 39-kDa acidic virion core protein that is synthesized at a late stage after infection. The late viral protein of VACV, designated p39, corresponds to the product of A4L in VACV Copenhagen and A5L in VACV WR strains and was reported previously to induce a strong persistent immune response in humans [30, 31]. The co-localization of A5/A4 with replicated DNA in virosomes and its co-purification with nuclease activity indicate a possible role of this core protein in DNA processing or packaging [13]. Also, the A4L or p39 protein does not undergo post-translational modifications, acts as a matrix-like linker between the core and the innermost of the two membranes of the intracellular mature VACV virion [31], and forms part of the spike-like protrusions present on the IMV core. Similarly, CaPV-A4L is located in the core but is partially exposed to the host immune system, making it potentially useful for differentiation of CaPV-vaccinated and infected hosts. This protein should be overexpressed in a suitable heterologous host and evaluated as a coating antigen in an immunoassay such as ELISA.

The CaPV-A4L protein was predicted to have a high antigenic index with high hydrophilicity and surface probability, suggesting its potential use for diagnostic or prophylactic purposes. The CaPV-A4L homolog protein of GTPV Uttarkashi (vaccine strain) has 93.79 to 100% sequence identity among members of the genus Capripoxvirus, with the highest identity observed with the GTPV FZ strain and the lowest with LSDV isolates. A4L homologs from other poxviruses showed a low percentage of identity, viz. swinepox virus (50.31%), deerpox virus (44.87-45.51%), Yaba-like and tanapox viruses (37.65%), vaccinia-like viruses (37.11-37.18%), followed by myxoma and rabbit fibroma viruses (37.11%), with the least identity to fowlpox virus (FWPV). FWPV strains carry a VACV A4L homolog, which is an immunodominant, 39K core protein possessing highly charged domains at each end of the protein and a 12-amino-acid serine-rich repeat sequence in the middle of the protein [32]. This repeat structure has also been reported to be present in a molluscum contagiosum virus homolog (MC107L) but absent from VACV-A4L. As expected, CaPV-A4L did not show any such repeat sequence in its primary structure. The complete A4L protein may be essential for the virus, as it plays a major role in virus assembly. This is corroborated by evidence that shRNA targeting CaPV A4L can effectively inhibit the replication of GTPV in Vero cells, resulting in a 463.5-fold reduction in vitro [12]. This presents a new anti-GTPV strategy for control of goatpox in the future. Strong coiled-coil motifs (CCM-I and II) responsible for oligomerization under native conditions are present in the middle region of the protein. Oligomerization might stabilize the structure and function of the protein under common storage conditions [33]. This might be one of the potential attributes of a candidate diagnostic or prophylactic antigen [34, 35].

Conclusions

In this study, we found that the CaPV-A4L homolog is highly conserved among CaPVs but shows genetic variations that can be exploited for genotyping by phylogenetic analysis and testing for signature residues in the gene sequence. Comparative sequence analysis of A4L homologs of different poxviruses revealed a low percentage of overall identity but high conservation at the genus level. The oligomerization ability of this protein predicted in this study needs to be tested experimentally. Potential signature residues specific for each CaPV member are indicative of an alternative genotyping strategy. Also, the favorable protein characteristics revealed by A4L structural analysis show that it is a potential candidate antigen for heterologous expression and application for immunodiagnostic or prophylactic purposes.