The study

Gyroviruses are non-enveloped, icosahedral viruses with small circular single-stranded DNA genomes of ∼2 kb in length. They comprise the genus Gyrovirus, currently classified together with circoviruses within the family Circoviridae [1]. At least eight gyrovirus genomes have been described, all of which include three overlapping ORFs encoding the structural protein VP1, the non-structural protein VP2, and the VP3/Apoptin protein. Gyroviruses can be divided phylogenetically into three clades: A (CAV, HGyV/AGV2, GyV3, -6 and -7), B (GyV4, -5), and C (GyV8) [2, 3]. Chicken anemia virus (CAV), the prototype of the Gyrovirus genus, is responsible for severe anemia and immunosuppression in young chickens [4]. HGyV/AGV2 (human gyrovirus/avian gyrovirus 2) has been detected using PCR in tissues and sera of diseased chickens from Brazil [5], skin and plasma of healthy French adults [6], blood from solid organ transplant patients [7], and healthy blood donors [8]. The third gyrovirus species (GyV3) was found in feces of human diarrhea cases from Chile [9] and Hong Kong [10]. A second genotype of GyV3 was recently reported in feces of ferrets [11]. GyV4 was detected in feces from unexplained cases of human diarrhea and in chicken meat and skin specimens in Hong Kong [10]. Two additional species, GyV5 and GyV6, were identified in Tunisian diarrheal feces [3]. Recently, GyV7 was reported to infect chickens [12], and GyV8 was found in tissues of a fulmar (a sea bird) [2]. Here we describe the genome of a proposed new gyrovirus species (GyV9) found in a human diarrhea sample.

Viral metagenomics was used to analyze a fecal specimen collected from a French adult with unexplained diarrhea and fever. Because the sample was pre-existing and provided anonymized, it is considered non-human subject research under UCSF CHR guidelines (http://www.research.ucsf.edu/chr/Guide/chrExemptApp.asp). The sample had previously tested negative for adenovirus, astrovirus, group A rotavirus, norovirus, and sapovirus using the assays described in Ref. [13]. The fecal suspension was first vortexed and clarified by centrifugation at 15,000 × g for 10 min. The supernatant was filtered through a 0.45-µm filter (Millipore) to remove bacterium-sized and larger particles. The filtrate was treated with a mixture of nuclease enzymes to digest unprotected nucleic acids, and viral nucleic acids were then extracted [14]. Random RT-PCR was used to amplify RNA and DNA, and a library was constructed for Illumina sequencing (MiSeq 2 × 250 bases) using the Nextera™ XT Sample Preparation Kit. The reads obtained averaged 228 nt in length. The reads were assembled de novo using EnsembleAssembler [15]. Translated sequence reads showing similarity to viral sequences with an E score <10⁻⁵ were identified using BLASTx.
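The E score filtering step described above can be illustrated with a minimal sketch. It assumes that the BLASTx results were saved in the standard 12-column tabular format (`-outfmt 6`), in which the E value is the eleventh column; the output format actually used in this study is not stated. The function name `filter_blast_hits` and the example rows are hypothetical and do not represent data from this work.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def filter_blast_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (query, subject, evalue) for hits with an E value below cutoff.

    Assumes BLAST tabular output (-outfmt 6): qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    hits = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip comment lines and anything that is not a full 12-column record.
        if len(row) < 12 or row[0].startswith("#"):
            continue
        evalue = float(row[10])
        if evalue < cutoff:
            hits.append((row[0], row[1], evalue))
    return hits


# Hypothetical rows for illustration only; not data from this study.
EXAMPLE = (
    "read_1\tsubject_A\t41.0\t80\t47\t0\t1\t240\t10\t89\t3e-07\t52.0\n"
    "read_2\tsubject_B\t30.0\t40\t28\t0\t1\t120\t5\t44\t0.5\t25.0\n"
)

if __name__ == "__main__":
    for query, subject, evalue in filter_blast_hits(EXAMPLE):
        print(query, subject, evalue)
```

In this example only the first row passes the cutoff; the second row, with an E value of 0.5, is discarded.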

Out of 225,421 sequence reads, eight anellovirus reads, six circovirus-related reads, and six reads encoding gyrovirus-related proteins (BLASTx E score of 3 × 10⁻⁷ to 1 × 10⁻⁴²) were identified. The nearly complete genome of GyV9 (GenBank KP742975) was then amplified using inverse PCR, and the amplicon was directly Sanger sequenced by primer walking. Two pairs of primers for inverse PCR were designed from the initial gyrovirus sequence reads. Primers GyV9-F1 (5′-ACA GAA ATG GAT GAC CCT AGA CCC T-3′) and GyV9-R1 (5′-CTG ATC CCT GTG CTC TTT GAG T-3′) were used for the first round of PCR. Primers GyV9-F2 (5′-TGA AAC ACT TAT TGG AGA ACT GTA CCT-3′) and GyV9-R2 (5′-TTG TCC TCT CTT TGA GCC TCT GTC-3′) were used for the second round of PCR, producing a ∼1.9-kb product (1841 bases were actually obtained by Sanger sequencing). Putative ORFs in the circular genome were predicted using NCBI ORF finder. The 2242-base-long genome contained two major ORFs encoding a 455-aa structural protein (VP1) and a 236-aa nonstructural protein (VP2) (Fig. 1a). The non-translated region (NTR) could not be entirely sequenced due to a region of high GC content. The NTR was 353 bp in length and contained a polyadenylation signal (AATAAA). Alignment of the partial NTR sequence of GyV9 with those of its closest relatives indicated that 61 nucleotides were likely missing. The NTR possessed three tandem repeats of a CAV promoter (TGTACAGGGGGGGTACGTCA) containing a putative estrogen-response element (in boldface) that can up-regulate transcription [16]. Sequence identity was measured using BioEdit and SDT [17]. VP1 and VP2 shared at best 41 and 42 % aa identity, respectively, with those of other members of the Gyrovirus genus (Fig. 1b). GyV9 contained the conserved VP1 motif ELX2AQ for rolling-circle replication (RCR) function [18] and the phosphatase motifs CX5R and WX7HX3CXCX5H in VP2 [19, 20]. A smaller ORF (132 aa) was also identified in GyV9; it showed the highest identity (38 %) to the similarly positioned VP3/Apoptin protein of GyV7.
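The conserved motifs named above can be located in a protein sequence with regular expressions, as in the following minimal sketch. It assumes the usual reading of the motif notation, in which X is any residue and the number following it is a repeat count (so that CX5R means C, five arbitrary residues, then R). The helper `find_motif` and the test peptide are hypothetical; the peptide is synthetic and is not the GyV9 sequence.

```python
import re

# Motifs from the text, translated to regular expressions under the
# assumption that "Xn" means n arbitrary residues.
MOTIFS = {
    "VP1 ELX2AQ": r"EL.{2}AQ",
    "VP2 CX5R": r"C.{5}R",
    "VP2 WX7HX3CXCX5H": r"W.{7}H.{3}C.C.{5}H",
}


def find_motif(protein, pattern):
    """Return (1-based start position, matched residues) for every match.

    A lookahead is used so that overlapping matches are also reported.
    """
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(protein)]


# Synthetic peptide for illustration only; not a real viral protein.
SYNTHETIC = (
    "MK"
    + "W" + "A" * 7 + "H" + "A" * 3 + "CAC" + "A" * 5 + "H"
    + "GG"
    + "C" + "A" * 5 + "R"
    + "K"
)

if __name__ == "__main__":
    for name, pattern in MOTIFS.items():
        print(name, find_motif(SYNTHETIC, pattern))
```

A match reported by such a scan only shows that the residue pattern is present; it does not by itself demonstrate phosphatase or RCR activity.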
Phylogenetic analyses were based on translated amino acid sequences and were performed using CLUSTAL X with the default settings [21]. Neighbor-joining phylogenetic trees were generated using MEGA version 5 with the Poisson Indel Process [22]. Bootstrap values for each node are shown if >70 %. Phylogenetic analyses of the VP1, VP2, and VP3 proteins confirmed that GyV9 was distinct from other known gyroviruses (Fig. 1c). The maximum likelihood [22] and Bayesian [23] phylogenetic methods produced trees with identical topologies (data not shown). SimPlot analysis was also used to test for recombination [24]. No recombination was detected between GyV9 and GyV1–GyV8.
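Neighbor-joining trees are built from a matrix of pairwise distances. As an illustration of how one such distance could be obtained from two aligned protein sequences, the sketch below computes the proportion of differing residues (p) and applies the standard Poisson correction, d = −ln(1 − p). This is an assumption made for illustration: the exact substitution model settings used in MEGA for this study may differ, and the function names, the pairwise handling of gaps, and the example sequences are hypothetical.

```python
import math

GAP = "-"


def p_distance(seq_a, seq_b):
    """Proportion of differing residues between two aligned sequences.

    Columns with a gap in either sequence are skipped (pairwise deletion,
    an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == GAP or b == GAP:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def poisson_distance(p):
    """Poisson-corrected amino acid distance, d = -ln(1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Hypothetical aligned fragments; not sequences from this study.
    p = p_distance("ACDE-G", "ACDFHG")
    print(p, poisson_distance(p))
```

The correction grows with p because multiple substitutions at the same position become more likely as sequences diverge, which is relevant for proteins as distant as those compared here.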

Fig. 1

New gyrovirus genome and phylogeny. a Organization of the circular DNA genome of gyrovirus 9, shown here in linear form. b Pairwise comparison of the VP1, VP2, and VP3 proteins of the new gyrovirus and other species in the Gyrovirus genus. c Phylogenetic trees generated with the VP1, VP2, and VP3 proteins of the new gyrovirus and other species in the Gyrovirus genus. The scale bar indicates amino acid substitutions per position. Bootstrap values (based on 100 replicates) for each node are given if >70 %. The CLUSTAL X default settings were gap opening penalty (10), gap extension penalty (0.2), protein weight matrix (Gonnet), residue-specific penalties (on), hydrophilic penalties (on), gap separation distance (4), and end gap separation (off).

A total of three eukaryotic viruses were detected in this fecal sample. Anelloviruses are nearly universal in human blood, are acquired very early in infancy, and have been shown to be shed over long periods in the feces of infants [25]. Anellovirus genome detection in feces might indicate the presence of blood in this diarrhea sample or replication in the digestive tract. Sequences were also detected that were closely related to a group of small circular DNA viral genomes previously reported in the feces of wild chimpanzees and other animals, including pigs, cows, a turkey, and urban rats [26–32]. The potential role of these three viruses in this patient’s diarrhea, if any, is not known. Because pathogenic bacteria and parasites were not tested for, their presence and possible role in this patient’s diarrhea remain a possibility.

The numerous gyroviruses described to date have been detected in chickens and/or in human skin, blood, or fecal samples. CAV is a very frequent infection of chickens, and CAV DNA is commonly seen in human feces [9]. Gyrovirus DNA has also been reported in the feces of carnivores, possibly from the consumption of chickens or other birds [33, 34]. The detection of diverse gyrovirus DNA in human feces may reflect recent consumption of infected chicken, a frequent source of diarrhea-causing bacterial infections [35, 36], rather than active gyrovirus replication in human cells. A PCR study for HGyV DNA in human CD34+ hematopoietic cells and CD34− cells from ten hematological patients did not detect gyrovirus DNA [37]. Gyrovirus DNA may therefore survive transit through the human gut without playing any role in this patient’s diarrhea [38, 39]. Despite the detection of gyrovirus DNA in human blood, on human skin, and in feces of humans and carnivorous mammals [6–8, 33, 34], gyrovirus replication in non-avian species such as mammals remains to be conclusively demonstrated.

Nucleotide sequence accession number

The nearly complete genome sequence of GyV9 is available in GenBank under accession number KP742975.