Introduction

Canine parvovirus (CPV), which was discovered in 1978, is the causative agent of severe and fatal gastroenteritis and myocarditis in domestic and wild canids (Kelly 1978; Lenghaus and Studdert 1984). CPV is a single-stranded, negative-sense DNA virus of the family Parvoviridae. The genomic DNA contains two major open reading frames (ORFs): one encodes the non-structural proteins NS1 and NS2, which are responsible for viral replication (Reed et al. 1988; Wang et al. 1998; Niskanen et al. 2010), and the other encodes the capsid proteins VP1 and VP2, which mediate viral tropism and antigenicity (Reed et al. 1988; Parker and Parrish 1997; Palermo et al. 2006; Nelson et al. 2007). NS1 and NS2 are encoded in overlapping reading frames; NS2 shares its 87 N-terminal amino acids (aa) with NS1, joined to 78 C-terminal aa from an alternative ORF (Reed et al. 1988; Wang et al. 1998). VP2 is a truncated version of VP1 with an N-terminal deletion of 143 aa, and VP3 is derived from the cleavage of VP2 by host proteases (Reed et al. 1988). Inverted terminal repeats (ITRs) at both ends of the CPV genome form two hairpin structures, and the left hairpin functions as a primer for host enzymes to convert single-stranded DNA to double-stranded DNA, which contributes to the rolling hairpin replication of the genome (Astell et al. 1985; Reed et al. 1988; Berns 1990). CPV is prone to continuous evolution. The antigenic variants (CPV-2a, CPV-2b, new CPV-2a, new CPV-2b, and CPV-2c) have emerged and replaced the original CPV-2 worldwide (Parrish et al. 1985, 1991; Buonavoglia et al. 2001; Ohshima et al. 2008; Mohan Raj et al. 2010). This evolution is thought to be driven mainly by immune evasion, transferrin receptor binding, and host adaptation (Palermo et al. 2006; Hoelzer et al. 2008).

Currently, few full-length CPV genomes are available in the GenBank database because the palindromic termini are difficult to obtain via PCR. In this study, the complete genome sequences of two CPV isolates prevalent in Northwest China were determined and compared with those of other CPV isolates available worldwide.

Materials and methods

Clinical samples

Rectal swabs were collected from pet dogs suspected of CPV infection at the Animal Hospital of Lanzhou, Northwest China. The samples were emulsified in 0.1 M PBS (pH 7.2) and centrifuged at 10,000 × g for 20 min at 4–8 °C. The supernatant was collected and used for PCR amplification.

Cloning of the full-length genomic sequence of CPV

Viral DNA was prepared by boiling the supernatant for 10 min and chilling it immediately on ice, as previously described (Decaro et al. 2006). Based on the consensus sequence of previously published CPV genomes, seven primer pairs were designed to amplify seven overlapping fragments covering the whole CPV genome (Table 1). The coding regions of the genome, designated EF, GH, and IJ (Fig. S1), were amplified using PrimeStar HS DNA polymerase (TaKaRa, Dalian, China) and the following primer pairs: E212f/F2316r, G2023f/H3750r, with the following cycling conditions: 94 °C for 9 min; 30 cycles of 98 °C for 15 s, 55 °C for 30 s, and 72 °C for 130 s; and 72 °C for 10 min. For the palindromic terminal regions, designated AB, CD, KL, and MN (Fig. S1), a mixture of the extracted DNA, forward primer A9f (or C56f, M5023f, K-4395f), dNTPs, and GC buffer was incubated at 98 °C for 5 min and chilled on ice for 5 min. PrimeStar HS DNA polymerase was then added, and the mixture was incubated at 72 °C for 3 min. After the reverse primer B74f (or D767r, L5029r, N5123r) was added, PCR was performed (94 °C for 45 s, followed by 30 cycles of 94 °C for 30 s, 60 °C for 30 s, and 72 °C for 1 min).

Table 1 Primers used for the amplification of CPV genome

The PCR product was incubated with dATP and Taq polymerase at 72 °C for 20 min and purified using a PCR purification kit (TaKaRa, Dalian, China). The purified PCR fragment was cloned into the pMD20-T vector (TaKaRa, Dalian, China). The recombinant clones were screened via PCR and restriction endonuclease digestion. The positive clones were sequenced using an ABI PRISM 3730 sequencer (Applied Biosystems, USA). The amplification, cloning, and sequencing of the full-length genomic sequence were repeated three times.

Sequence analysis

The sequences of all fragments were assembled using SeqMan software (DNASTAR Inc., Madison, Wisconsin, USA) with manual modifications. The sequences were then compared with those in the GenBank database and aligned with the MegAlign program of the DNASTAR software package (DNASTAR Inc., Madison, Wisconsin, USA) via the Clustal W method. The phylogenetic tree of the complete coding regions was generated via the neighbor-joining method by using MEGA 5.2.2 software (Kumar et al. 2008). The secondary structures of both termini of CPV were predicted and analyzed using DNAMAN (Lynon, Co., Quebec, Canada).
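As an illustration of the pairwise comparison step, the following minimal Python sketch computes percent nucleotide identity between two rows of a multiple alignment. It is not the MegAlign algorithm: the function name percent_identity and the choice to skip every column containing a gap are assumptions made here for clarity, so the values it returns may differ slightly from those reported by DNASTAR.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap.

    Assumption: both inputs are rows of the same alignment (equal length,
    '-' marks a gap). Gap columns are ignored rather than counted as
    mismatches, which is one of several common conventions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either row
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy rows, not real CPV sequence.
    print(percent_identity("ACGT-ACGA", "ACGTTACGT"))
```

Whether gaps are skipped or penalized changes the reported identity, so the convention should be stated whenever identities from different tools are compared.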

Nucleotide sequence accession number

The genome sequences of the two CPV-LZ strains were submitted to GenBank and were assigned the accession numbers JQ268283 and JQ268284.

Results and discussion

The complete genomes of CPV-LZ1 and CPV-LZ2 are 5053 nucleotides (nt) long, with G+C contents of 37.03 % and 37.15 %, respectively. CPV-LZ1 and CPV-LZ2 share a similar genomic organization. The left UTR is 264 nt long, and the right UTR is 520 nt long. The left ORFs, located between nt 265 and 2271, encode the 668-aa NS1 and the 165-aa NS2, and the right ORFs, at positions 2278–4533, encode the 727-aa VP1 and the 584-aa VP2. Whole-genome multiple alignment with the reference full genomic sequences in GenBank revealed that CPV-LZ1 and CPV-LZ2 shared the lowest nt identity (97.3 and 97.2 %, respectively) with CPV-N and the highest nt identity (99.4 and 99.3 %, respectively) with CPV-b. The two CPV-LZ strains shared 99.4 % identity with each other at the nt level.
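The reported coordinates can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers given in the text. The constant and function names are illustrative, and the assumption that the left ORF span includes the NS1 stop codon is ours rather than something stated in the text.

```python
GENOME_LENGTH = 5053          # nt, both CPV-LZ genomes
LEFT_UTR = (1, 264)           # 1-based, inclusive
NS_REGION = (265, 2271)       # left ORFs (NS1/NS2)
VP_REGION = (2278, 4533)      # right ORFs (VP1/VP2)


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


left_utr_len = span_length(*LEFT_UTR)                    # 264
ns_len = span_length(*NS_REGION)                         # 2007
right_utr_len = GENOME_LENGTH - VP_REGION[1]             # 520
coding_len = span_length(NS_REGION[0], VP_REGION[1])     # 4269

# Assumption: the NS span includes one stop codon, so 2007 / 3 = 669
# codons corresponds to the 668-aa NS1.
ns1_residues = ns_len // 3 - 1                           # 668

# Protein-level checks taken from the Introduction and Results.
ns2_residues = 87 + 78                                   # 165
vp1_minus_vp2 = 727 - 584                                # 143

if __name__ == "__main__":
    print(left_utr_len, ns_len, right_utr_len, coding_len, ns1_residues)
```

The 4269-nt figure obtained here matches the length of the coding region used for the phylogenetic tree. The right ORF span (2256 nt) is not expected to equal three times the VP1 length because the capsid transcripts are spliced, so no such check is made.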

The ITRs located at each end of the CPV-LZ1 and CPV-LZ2 genomes form hairpins. The left-end UTRs of CPV-LZ1 and CPV-LZ2 were similar to those of CPV-b and CPV-Y1 but different from those of CPV-N, CPV-2a (AJ564427), and Laika. Compared with CPV-N, CPV-LZ1 and CPV-LZ2 had two mutations at nt 52 (C52G) and nt 263 (C263T), a GC insertion at nt 55–56, and a deletion of one copy of AACC before the start codon ATG (Fig. 1a), which resulted in differences at the “bubble” and “ears” of the Y-shaped secondary structure (Fig. 1b, c). The secondary structures of the left-end termini of CPV were predicted based on the minimal free energy (ΔG). As shown in Fig. 1b, 110 nt of the 264-nt left-end UTRs of both LZ strains folded into a Y-shaped hairpin containing small internal palindromes (26 nt) that formed the “ears” of the Y, and a duplex stem region interrupted by a mismatched “bubble” sequence. This predicted structure was similar to the left-end hairpins of the minute virus of canines (MVC) (Sun et al. 2009) and the minute virus of mice (MVM) (Astell et al. 1985; Burnett et al. 2006). The genomes of the family Parvoviridae are replicated through a rolling hairpin mechanism (Berns 1990). Therefore, the bubble and asymmetric Y residues are critical structures in the replication of the left-end hairpin (Astell et al. 1985; Berns 1990; Burnett et al. 2006). When nt changes such as mutations, deletions, and insertions occur, the overall free energy of the predicted secondary structure of the left UTR changes, resulting in a more or less stable structure that could affect viral DNA replication.

Fig. 1

a Sequence alignment of partial left-end untranslated regions of CPVs. b–d Sequences and secondary structures of CPV palindromic repeats: b left-end hairpin of CPV-N, c left-end hairpins of CPV-LZ1 and CPV-LZ2, d right-end hairpins of CPV-LZ1 and CPV-LZ2

A notable parvovirus feature is the reiteration of DNA sequence within the right-end UTR of the genome (Reed et al. 1988). Two separate and unrelated 62-nt repeats were observed in the right UTR of CPV-N, and the right UTR of CPV-Y1 had one repeat sequence as well. However, as in CPV-b, B-2004, and cpv/nj01/06, no repeat sequence was found within the right-end UTRs of LZ1 and LZ2. The right-end hairpins of the CPV-LZ strains, predicted based on the minimal free energy (ΔG) of the structure, displayed a simple duplex stem structure with three unpaired residues, AGA, that formed a small asymmetric bubble (Fig. 1d). This region of the stem harbors a small internal palindrome centered on the three-nucleotide mismatch. Thus, an alternative structure in which the duplex assumes an asymmetric cruciform configuration is possible (Sun et al. 2009).

NS1, a nuclear DNA-binding phosphoprotein, functions as an initiator and an ATP-powered helicase in viral DNA replication and as an activator of the viral promoters during diversion of the cellular machinery toward viral protein expression (Christensen et al. 1995; Lorson et al. 1996; Willwand et al. 1997; Niskanen et al. 2010). Alignment of the full-length NS1 region revealed that CPV-LZ1 had a rare nt variation at position 1714 (G→A), resulting in the aa substitution Glu572Lys, which was also present in B-2004, CPV-339, SCO2/2011, and CPV-s5. CPV-LZ2 had nt variations at positions 56 (A→G) and 1634 (A→T), resulting in the aa substitutions Lys19Arg and Glu545Val, which were also present in CPV-JS2 and CPV-2a (JQ686671).
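The correspondence between the nt positions and the aa substitutions above can be verified with a small Python sketch. It assumes that the nt positions are numbered from the first base of the NS1 start codon, which the text does not state explicitly but which fits the reported residues. The helper names and the deliberately partial codon table (standard genetic code, only the codons needed here) are illustrative.

```python
# Partial standard genetic code: only the codons needed for these checks.
PARTIAL_CODE = {
    "GAA": "Glu", "GAG": "Glu",
    "AAA": "Lys", "AAG": "Lys",
    "AGA": "Arg", "AGG": "Arg",
    "GTA": "Val", "GTG": "Val",
}


def codon_of(nt_pos: int) -> tuple[int, int]:
    """Map a 1-based ORF nt position to (codon number, position in codon)."""
    if nt_pos < 1:
        raise ValueError("nt_pos must be 1-based and positive")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def substitution_is_consistent(nt_pos: int, ref_nt: str, alt_nt: str,
                               ref_aa: str, alt_aa: str) -> bool:
    """True if some codon for ref_aa, changed ref_nt -> alt_nt at the codon
    position implied by nt_pos, encodes alt_aa (within PARTIAL_CODE)."""
    _, offset = codon_of(nt_pos)
    for codon, aa in PARTIAL_CODE.items():
        if aa != ref_aa or codon[offset - 1] != ref_nt:
            continue
        mutated = codon[:offset - 1] + alt_nt + codon[offset:]
        if PARTIAL_CODE.get(mutated) == alt_aa:
            return True
    return False


if __name__ == "__main__":
    for pos in (1714, 56, 1634):
        print(pos, codon_of(pos))
    print(substitution_is_consistent(1714, "G", "A", "Glu", "Lys"))
```

Under this numbering, positions 1714, 56, and 1634 fall in codons 572, 19, and 545, respectively, in agreement with the reported substitutions.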

VP2 is the most abundant capsid protein and includes the major antibody- and receptor-binding sites (Parker and Parrish 1997; Palermo et al. 2006; Nelson et al. 2007). In this study, multiple alignment of the VP2 region revealed that CPV-LZ1 and CPV-LZ2 retained the aa substitutions that distinguish the variants from the original CPV-2 (Met87Leu, Ile101Thr, Ala300Gly, and Asp305Tyr) and had the change at residue 297 (Ser→Ala) typical of the “new CPV-2a/2b” (Martella et al. 2005; Ohshima et al. 2008). CPV-LZ1 retained Asn at residue 426, whereas CPV-LZ2 had the mutation Asn426Asp. According to the typing systems based on key aa mutations, CPV-LZ1 was classified as new CPV-2a and CPV-LZ2 as new CPV-2b (Martella et al. 2005; Ohshima et al. 2008). In combination with our recent study (Xu et al. 2015), we concluded that the new CPV-2a and new CPV-2b were co-circulating in Northwest China and that the former was the predominant antigenic type. All strains isolated from Northwest China, including CPV-LZ1 and CPV-LZ2, had the Y324I mutation, which has been referred to as a distinct characteristic of Chinese strains. CPV-LZ2 had two specific mutations, Phe267Tyr and Thr440Ala, which were also observed in the other Chinese strains CPV-JS2, SCO2/2011, and CPV-2a (JQ686671). Moreover, in a previous study (Xu et al. 2015), we reported that the mutations F267Y and T440A were present in 51.8 % of 27 isolates collected from Gansu and Sichuan provinces of China. These results show that strains carrying these two mutations have reached a high prevalence among CPVs in Northwest China.
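The residue-based typing applied above can be sketched as a simple decision rule in Python. This is a simplified illustration rather than the published typing systems themselves: the function name classify_vp2 is hypothetical, only residues 87, 297, and 426 are consulted, and the assignment of Glu426 to CPV-2c is taken from the general CPV literature rather than from the results of this study.

```python
def classify_vp2(residues: dict[int, str]) -> str:
    """Assign an antigenic type from key VP2 residues (three-letter codes).

    Simplified rule set (assumption):
      - Met87              -> original CPV-2
      - Glu426             -> CPV-2c
      - Asn426 / Asp426    -> CPV-2a / CPV-2b
      - Ala297 adds the "new" prefix to CPV-2a or CPV-2b
    """
    if residues.get(87) == "Met":
        return "CPV-2"
    aa426 = residues.get(426)
    if aa426 == "Glu":
        return "CPV-2c"
    if aa426 == "Asn":
        base = "CPV-2a"
    elif aa426 == "Asp":
        base = "CPV-2b"
    else:
        return "untyped"
    return f"new {base}" if residues.get(297) == "Ala" else base


if __name__ == "__main__":
    cpv_lz1 = {87: "Leu", 297: "Ala", 426: "Asn"}
    cpv_lz2 = {87: "Leu", 297: "Ala", 426: "Asp"}
    print(classify_vp2(cpv_lz1))
    print(classify_vp2(cpv_lz2))
```

With the residues reported for the two isolates, the rule returns new CPV-2a for CPV-LZ1 and new CPV-2b for CPV-LZ2, consistent with the classification given in the text.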

A phylogenetic tree was constructed based on the 4269 nt of the full-length coding regions via the neighbor-joining method. As shown in Fig. 2, the 33 CPVs retrieved from GenBank appeared to be spatially structured, with very limited gene flow between clusters. Eight Chinese strains, including the two CPV-LZ isolates, formed a monophyletic cluster. In this cluster, CPV-LZ1 was closely related to B-2004 and CPV-s5, prevalent in Beijing and Southern China, respectively, whereas CPV-LZ2 was closely related to CPV-JS2 and CPV-2a (JQ686671), circulating in Nanjing and Guangxi, China, respectively. Unfortunately, the complete CPV genomes in the GenBank database are still insufficient. Thus, analysis of the molecular epidemiology and phylogeny of CPVs is restricted, which greatly limits our understanding of the transmission and evolution of CPVs.

Fig. 2

Neighbor-joining tree based on the whole coding regions of the indicated CPVs using the MEGA 5.2.2 software (with 1000 bootstrap replicates). Bootstrap support (above 85 %) is shown

In conclusion, this study obtained the full-length genome sequences of CPV-LZ1 (new CPV-2a) and CPV-LZ2 (new CPV-2b) from two field strains circulating in Northwest China and revealed unique variations that arose during local adaptation. The results not only add incrementally to the knowledge of the full-length genome of CPV but also provide a better understanding of the molecular epidemiology and genetic diversity of CPV field isolates in Northwest China. The epidemiology of the new antigenic strains should be studied further to support the prevention and control of CPV infection in this region.