Introduction

Peste des petits ruminants (PPR) is an acute, febrile viral disease of small ruminants characterized by pyrexia, ocular and nasal discharges, erosive stomatitis and diarrhoea [1]. It was first reported in India in 1987 [2] and has since become endemic in the country [3]. India is a vast country with a population of more than 130 million goats and 58 million sheep, and PPR is considered a main constraint on augmenting the productivity of small ruminants in this nation [3].

The causative agent, peste des petits ruminants virus (PPRV), is classified as a member of the genus Morbillivirus, family Paramyxoviridae, order Mononegavirales [4]. The other members of this genus include rinderpest virus (RPV), measles virus (MV), canine distemper virus (CDV), phocine distemper virus (PDV), dolphin morbillivirus (DMV) and porpoise morbillivirus (PMV) [5, 6]. The genome of morbilliviruses is a single-stranded RNA of negative polarity, ~16 kb long [7]. It is divided into six transcriptional units encoding two non-structural proteins (V and C) and six structural proteins: the surface glycoproteins (F and H), the matrix protein (M), the nucleoprotein (N), and the phosphoprotein (P), which forms the polymerase complex in association with the large protein (L) [8, 9]. At present, the genome sequences of most morbilliviruses are known; in the case of PPRV, however, only the nucleotide sequence of the African vaccine virus Nigeria75/1 is nearly complete [1, 6, 8, 10, 11].

Based on molecular epidemiological studies, PPRV isolates have been classified into four lineages (lineages 1, 2, 3 and 4). Lineages 1, 2 and 3 are found in Africa, whereas lineage 4 has been reported exclusively from the Middle East, Arabia and the Indian subcontinent. The vaccine virus Nigeria75/1 (named here as PPRV-N) belongs to lineage 1 [12, 13]. The “PPRV Sungri/96” virus (named here as PPRV-S) used in the present study belongs to lineage 4 [12] and has been extensively characterized using a panel of monoclonal antibodies directed against different proteins [14] and for its thermostability [15]. Recently, a homologous live-attenuated vaccine has been developed at the Indian Veterinary Research Institute (IVRI) using PPRV Sungri/96 and has been used in the field for vaccination of sheep and goats in India [15].

Since little work has been done to understand the genetic makeup of a lineage 4 virus, our laboratory is engaged in completing the genome sequence of the vaccine virus PPRV-S. In this context, we report here the fusion and haemagglutinin gene sequences of PPRV-S and their comparative analysis with other morbilliviruses, including the lineage 1 PPR vaccine virus PPRV-N of African origin. The vaccine developed at IVRI is likely to be used extensively under field conditions in India and other Asian countries. Genetic characterization may help in better understanding the vaccine virus and in assessing any vaccine-related outbreak.

Materials and methods

Cells and virus

Vero cells at passage level 131 were used for propagation of the vaccine seed virus, which was developed at the Rinderpest Laboratory, IVRI-Mukteswar campus, using an indigenous isolate of PPR virus (“PPR Sungri/96”). The vaccine virus was propagated as described previously [15] and aliquots of 250 μl were stored at −80°C until use.

RNA extraction, cDNA synthesis and PCR

Total RNA was extracted by the acid guanidinium thiocyanate–phenol–chloroform method, essentially as described earlier [12, 16]. The oligonucleotides used in RT-PCR to generate overlapping fragments covering the entire fusion and haemagglutinin protein genes are shown in Table 1. They were designed from the PPRV Nigeria75/1 sequences available in the GenBank database (Accession Numbers: Z47977, Z81358, Z37017) using the software DNASIS version 2.6 (Hitachi, Japan) and were obtained from M/S Metabion GmbH, Germany. Reverse transcription was performed on 1–5 μg of total RNA using MMLV reverse transcriptase and random hexamers (75 μg) at 37°C for 1 h, and subsequent PCR amplification was carried out using 5 μl of the RT product. The PCR cycling conditions were as described previously [13, 16]: the cDNAs were subjected to a 30-cycle amplification (denaturation at 95°C, annealing for 1 min at 50°C and primer extension for 2 min at 72°C) using 10 pmol of each primer.

Table 1 Primers used for amplification of F and H genes of PPRV-S in the study

Cloning of PCR products and sequencing

The PCR amplicons were checked for their correct size in a 1% agarose gel and purified using the Wizard® PCR Purification system (Promega). The purified amplicons were cloned into the pGEM-T vector, and recombinant plasmid DNA was isolated from representative clones and checked for its correct size as described previously [12]. The cloned amplicons were sequenced on both strands using the fmol DNA cycle sequencing kit (Promega) and Cy5-labelled M13 forward and reverse primers in an automated sequencer, ALFexpress II (Amersham Pharmacia Biotech, U.K.).

Sequence and phylogenetic analysis

The sequenced fragments of each gene were assembled using the Megalign software of the DNASTAR package. The overlapping portions and the primer sequences were eliminated appropriately. For comparison, the following sequences (GenBank Accession numbers in parentheses) were obtained from the NCBI sequence databases: vaccine virus PPRV Nigeria/75/1 (Z81358, Z37017), rinderpest virus (Z30697), dolphin morbillivirus (NC005283), phocine distemper virus strain PDV/DK88 (X75717), canine distemper virus (AF305419) and measles virus (AF266289). The phylogenetic tree was constructed using the Neighbour Joining method with the Kimura 2-parameter model available in the program MEGA version 2.1 [17]. Alignment gaps were excluded from pairwise distance estimations. The robustness of the predicted tree was statistically evaluated using the bootstrap method [18, 19]. The bootstrap P-values were obtained after 10,000 replications. The sequence data reported in this paper have been submitted to GenBank (Accession Number AY560591).
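For readers unfamiliar with the distance model, the Kimura 2-parameter distance between two aligned sequences can be sketched as follows. This is only an illustration of the standard formula, d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences; it is not the MEGA implementation. The function name `k2p_distance` and the choice to skip any column containing a gap or ambiguous base in either sequence (pairwise exclusion) are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Columns where either sequence has a gap or a non-ACGT symbol are skipped,
    which is one way of excluding gaps from a pairwise comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Treat RNA 'U' as 'T' so genome and cDNA sequences can be mixed.
        a = "T" if a == "U" else a
        b = "T" if b == "U" else b
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous base: exclude this column
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable columns")

    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


if __name__ == "__main__":
    # Toy sequences (not real PPRV data): one transition in ten sites.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A matrix of such pairwise distances is the input to Neighbour Joining; bootstrap support is then obtained by repeating the calculation on alignments resampled column-wise with replacement.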

Results and discussion

Fusion protein gene

The entire F gene of PPRV-S is composed of 2405 nucleotides including the poly(A) tail. The gene starts with the semi-conserved gene start sequence AGGG and ends with AAAC, as does the F gene of PPRV-N. Compared with the published nucleotide sequences of the F-protein genes of other morbilliviruses, the PPRV-S F gene is the longest reported so far. It is longer than the PPRV-N, RPV, MV, CDV/PDV and DMV genes by 80, 52, 24, 200 and 189 nucleotides, respectively [11, 20–25]. The additional 80 nucleotides are present in the 5′ untranslated region (UTR) at positions 11–90 of PPRV-S. A similar sequence is found in the PPRV Turkey isolate, whose sequence is available in the GenBank database (Accession number AJ849636).

Alignment of the complete PPRV Sungri F gene with those of other morbilliviruses (Table 2) reveals a homology of 89% with PPRV-N and 48–51% with the other morbilliviruses. As in other morbilliviruses, the PPRV-S F gene contains a long stretch of 628 nucleotides rich in G-C residues (68.6%) at the 5′ UTR. The ORF starts at position 629–631 and ends at position 2267–2269. At the 5′ UTR, PPRV-S shares low to moderate nucleotide sequence homology with other morbilliviruses, ranging from 25.6% for DMV to 80.3% for PPRV-N [11]. As the sequence divergence observed in this region is ~20% between PPRV-S and PPRV-N, the usefulness of this region for molecular epidemiology should be explored. The 3′ UTR is 136 nucleotides long and ends at AAACAAAA, which is followed by the intergenic trinucleotide CTT.
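The coordinates reported for the F gene can be cross-checked with simple arithmetic. The sketch below is illustrative only: the helper name `orf_summary` is hypothetical, positions are taken as 1-based and inclusive, and the stop codon is assumed to occupy the last three nucleotides of the stated ORF.

```python
def orf_summary(gene_length, start_codon_first, stop_codon_last):
    """Derive UTR and ORF sizes from 1-based, inclusive gene coordinates."""
    orf_nt = stop_codon_last - start_codon_first + 1
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return {
        "utr_before_orf_nt": start_codon_first - 1,        # region preceding the start codon
        "orf_nt": orf_nt,                                  # start codon through stop codon
        "protein_aa": orf_nt // 3 - 1,                     # stop codon encodes no residue
        "utr_after_orf_nt": gene_length - stop_codon_last, # region following the stop codon
    }


if __name__ == "__main__":
    # Values stated in the text for the PPRV-S F gene.
    f_gene = orf_summary(gene_length=2405, start_codon_first=629, stop_codon_last=2269)
    print(f_gene)
```

With the stated values, the ORF spans 1641 nucleotides (547 codons including the stop codon), which is consistent with the 628-nucleotide 5′ UTR, the 136-nucleotide 3′ UTR and a 546-residue F protein.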

Table 2 Similarity between the F protein and H protein of PPRV Sungri/96 and other morbilliviruses

The PPRV-S F protein consists of 546 amino acids, as do the PPRV-N and RPV F proteins, and has a predicted molecular weight (MW) of 59.137 kDa. Comparison of the complete PPRV-S fusion protein amino acid sequence with those of other morbilliviruses (Table 2) showed that the lowest homology was with PDV (65.6%) and the highest with PPRV-N (96.2%).
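As an illustration of how a pairwise percentage such as those in Table 2 can be derived from an alignment, the sketch below counts identical residues over the columns in which neither sequence has a gap. This is one common convention and an assumption here; the Megalign package may calculate its similarity index differently. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned peptides (not real F-protein data):
    # six gap-free columns, five of them identical.
    print(round(percent_identity("MTRV-AL", "MTKVQAL"), 1))
```

The denominator matters: dividing by the full alignment length, or by the length of the shorter sequence, gives slightly different percentages, so values from different programs are not always directly comparable.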

The alignment of the different F-protein sequences (Fig. 1) revealed a high degree of sequence conservation apart from two main domains, and nearly identical hydropathic profiles, as observed previously for morbilliviruses [11, 22]. The first domain, the signal peptide at the N-terminus of the protein, consists of 19 amino acids, of which 5 residues are variable between PPRV isolates. In the second, long non-conserved domain (aa 485–517), which includes the hydrophobic membrane anchor sequence (aa 485–502), only two amino acid variations could be observed between the Asian and African lineages of PPR viruses [11]. A total of 16 cysteine residues were identified in this protein of PPRV-S, as in PPRV-N; 12 of these are conserved across morbilliviruses, indicating the conserved role of this amino acid in maintaining the tertiary structure of the protein [5].

Fig. 1

Comparison of the predicted amino acid (aa) sequence of the F protein of PPRV-S with those of PPRV-N and other morbilliviruses. Dots (.) represent identity with PPRV-N; dashes (-) denote gaps generated during alignment; differences in aa sequence are shown by the single-letter aa code. The signal peptide, potential glycosylation sites (G1–G3), F0 cleavage site and Zinc finger domain are marked appropriately

The paramyxovirus fusion protein is normally synthesized as a precursor, F0, which is cleaved into two subunits, F1 and F2, linked by a disulfide bond [4]. This cleavage is required for the virus to become fusogenic and thus infective [24]. The cleavage site in PPRV-S, RRTRR at positions 104–108, is the same as that of PPRV-N [11]. The three glycosylation sites identified in the F2 subunit of PPRV-N are also conserved [11]. The fusion peptide sequence from position 109 to 133 is identical to that of PPRV-N except for one amino acid variation at position 110 (A→V). The leucine zipper structure detected previously in the paramyxoviruses is conserved at positions 459–480 of PPRV-S. In the cytoplasmic tail, the last 22 amino acids (positions 525–546) are similar to those of PPRV-N. The last 15 amino acids are thought to interact with the M protein during the virus budding process, since an alteration in that domain abolishes virion production [11, 26].

Haemagglutinin protein gene

The haemagglutinin (H) protein of morbilliviruses plays an important role in virus attachment and induces a strong, highly protective neutralizing antibody response [27, 28]; being the outermost protein, it is subjected to increased immunological pressure [29]. It is generally believed that the H protein of morbilliviruses lacks neuraminidase activity, but it has recently been reported that the cytomegalovirus-expressed H proteins of PPRV and RPV also possess neuraminidase activity [30].

The H gene of PPRV-S starts with the usual semi-conserved start signal of morbilliviruses (AGGR). The complete gene, including the poly(A), is 1954 nucleotides long and contains a single ORF starting at position 18 and terminating at position 1847 with a TGA termination codon. The gene end sequence of PPRV-S is GTTAT, whereas in other morbilliviruses it is ATTAT (RPV, CDV and PDV) or ATTAAG (MV). The intergenic triplet CTT is found at the H–L junction of PPRV-S, as in PPRV-N and DMV, whereas MV and RPV have CGT and PDV and CDV have CTA at the corresponding positions [31, 32].

The predicted molecular weight of the H protein is ~68 kDa, similar to that reported for RPV [33]. The percentage nucleotide similarity (Table 3) shows that PPRV-S shares a homology of 90.6% with PPRV-N, whereas with other morbilliviruses it ranges from 33% to 45%. A similar pattern is seen at the amino acid level, where PPRV-S has an amino acid identity of 92.3% with PPRV-N and 34% to 49% with other morbilliviruses. Similar homology is also observed for the ORFs. This indicates the high level of divergence of the H protein among the morbilliviruses [24, 33–35]. The H protein alignment (Fig. 2) reveals that, of the 13 cysteine residues present in the protein, 12 are at identical positions in all the morbilliviruses, while the one at position 583 is not present in CDV and PDV. Similarly, 17 of the 37 proline residues are located at positions identical to those in other morbillivirus H proteins. The hydropathic profile (data not shown) of the predicted amino acid sequence of the PPRV-S H protein is highly conserved when compared with those of all other morbilliviruses [24, 33–35].

Fig. 2

Comparison of the predicted amino acid (aa) sequence of the H protein of PPRV-S with those of PPRV-N and other morbilliviruses. Dots (.) represent identity with PPRV-N; dashes (-) denote gaps generated during alignment; differences in aa sequence are shown by the single-letter aa code. The membrane anchor domain is marked appropriately and the potential glycosylation sites are shown in bold italics

The H protein of PPRV-S is a type II glycoprotein with an N-terminal proximal anchor (residues 35–58), similar to other morbilliviruses. Potential sites for asparagine (N)-linked glycosylation were found at four positions (Fig. 2), N172KSK175, N215VSS218, N279MSD282 and N215VSS218, as predicted by the ScanProsite programme [36]. PPRV-N shares all of these and contains one additional potential site at N18KTH21.
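The kind of motif search that underlies such a prediction can be sketched with a regular expression. The example assumes the PROSITE N-glycosylation pattern N-{P}-[ST]-{P} (asparagine, any residue except proline, serine or threonine, any residue except proline); it is a minimal illustration rather than a reimplementation of ScanProsite. The function name `find_sequons` and the toy peptide are hypothetical.

```python
import re

# N-{P}-[ST]-{P}, wrapped in a lookahead so that overlapping sites are reported.
SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sequons(protein):
    """Return (1-based position of N, four-residue motif) for each potential site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Toy peptide (not the PPRV-S H protein): N at position 9 is followed by
    # proline and is therefore not reported.
    print(find_sequons("MANKSKGPNPSTNVSSA"))
```

The four-residue motifs quoted in the text (NKSK, NVSS, NMSD and NKTH) all satisfy this pattern. A match only marks a potential site; whether it is actually glycosylated depends on protein topology and folding.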

Phylogenetic analysis of the F and H gene

The phylogenetic analysis based on the F and H genes (Fig. 3a and b) revealed that PPRV-S and PPRV-N clustered in one group, whereas RPV and MV were grouped in a separate cluster. CDV and PDV formed another cluster, and DMV was separated into a different cluster with high bootstrap confidence. The tree also confirms the grouping of morbilliviruses observed earlier [5, 8, 11].

Fig. 3

Bootstrapped phylogenetic tree based on genetic distances calculated using the sequences from F gene of PPRV and other morbilliviruses. The bootstrap confidence values of the major clusters are indicated at the nodes. The bar represents the genetic distance