Papillomaviruses (PVs) are a family of viruses with a small, circular, double-stranded DNA genome of about 6-8 kb and a non-enveloped, icosahedral (T=7) capsid [1, 2]. The PV genome encodes at least six proteins: E1, E2, E6, E7, L1, and L2 [3]. L1 and L2 are structural proteins: the major and minor capsid proteins, respectively [4,5,6]. E1 and E2 are regulatory proteins involved in replication and transcription [4, 7,8,9], whereas E6 and E7 associate specifically with cell cycle regulators [10, 11] and stimulate progression of the cell cycle; in doing so, they can disrupt important regulatory pathways and contribute to the development of PV-associated cancers [12].

The family Papillomaviridae has two subfamilies: Firstpapillomavirinae and Secondpapillomavirinae [3]. Within these subfamilies, PVs are classified into 53 genera, 133 species and 343 types (see curation at PaVE [13]). PVs have been identified in samples from mucosal and epithelial cells, which are the primary sites of PV infection. They have also been identified in fecal samples from various animals as a result of epithelial cell shedding [12, 14]. PVs have co-evolved with their hosts, developing into a highly species- and tissue-specific viral family [2, 12]. Despite the diversity of PVs and their hosts, PVs show conservation in their genome organization and particularly high conservation in the L1 protein [14].

Non-human primate PVs have been identified in howler monkeys, squirrel monkeys, colobus monkeys, macaques, baboons, gorillas, bonobos, black-tufted marmosets, and chimpanzees [2, 14,15,16,17,18,19]. Clinical manifestations of non-human primate PVs include focal epithelial hyperplasia in bonobos, cervical dysplasia in baboons, and aggressive warts and cervical cancer in macaques [12, 15, 18]. Despite the characterization of PV genomes in the great apes and New and Old World monkeys, relatively little is known about those in lemuriform primates. Lemuriform primates are of particular interest because of their unique evolutionary history, phylogenetic relatedness to humans, and rapidly declining populations (~98% of lemurs are threatened with extinction) [20]. There are over 100 species of lemurs; thus, species-specific lemuriform PVs are likely to be extremely diverse. Antonsson and Hansson [17] detected a gammapapillomavirus-like PV in a black-and-white ruffed lemur (Varecia variegata); however, challenges encountered during sequencing efforts prevented characterization of this PV genome.

To address the significant gap in our knowledge of lemuriform PVs, we undertook a pilot project at the Duke Lemur Center in Durham, North Carolina, USA, to identify the oral PVs of black-and-white ruffed lemurs. We collected saliva samples from three black-and-white ruffed lemurs (H1, H2, and H3) at the Duke Lemur Center under IACUC #A161-21-08 in July 2021 and March 2022 by allowing the lemurs to chew on a SalivaBio Children’s Swab (Salimetrics, USA). No oral abnormalities were noted. The saturated swabs were then placed into a SalivaBio Swab Storage Tube (Salimetrics, USA) and centrifuged to collect the saliva. Saliva samples were stored at -80°C until viral DNA extraction. We added SM buffer (0.1 M NaCl, 50 mM Tris-HCl [pH 7.4]) to each saliva sample up to a final volume of 400 µl, and 200 µl of the diluted sample was used to extract viral DNA using a High Pure Viral Nucleic Acid Kit (Roche Diagnostics, USA). The circular DNA in the viral DNA extract was amplified by rolling-circle amplification using an Illustra TempliPhi Kit (GE Healthcare, USA) and used to generate Illumina sequencing libraries using an Illumina DNA Prep Kit (with Tagmentation). Samples were sequenced on an Illumina NovaSeq 6000 system at the Duke Center for Genomic and Computational Biology facility. The paired-end reads (2 × 150 bp), with a Q30% of 90.58-92.58 and an average quality score of 35.33-35.64, were trimmed using Trimmomatic-0.39 [21] and assembled de novo using MEGAHIT v1.2.9 [22]. Circular contigs were identified based on terminal redundancy. Contigs >1000 nt in length were analyzed for viral-like sequences, using DIAMOND [23] to perform a BLASTx search against a local viral protein RefSeq database (release 210; downloaded March 2022). Contigs identified as potential PV-like sequences were confirmed using BLASTn [24].
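
The terminal-redundancy criterion can be illustrated with a minimal Python sketch. It assumes that a circular genome assembled as a linear contig repeats its first bases at its end; the function names, the 20-nt minimum overlap, and the toy sequence are illustrative assumptions and are not taken from the pipeline used here.

```python
def terminal_overlap(contig: str, min_overlap: int = 20) -> int:
    """Length of the longest prefix of `contig` that also ends it.

    Returns 0 if no overlap of at least `min_overlap` bases is found.
    Only overlaps up to half the contig length are considered.
    """
    for k in range(len(contig) // 2, min_overlap - 1, -1):
        if contig[:k] == contig[-k:]:
            return k
    return 0


def circularize(contig: str, min_overlap: int = 20):
    """Return the contig with the redundant end removed, or None if linear."""
    k = terminal_overlap(contig, min_overlap)
    return contig[:-k] if k else None


# Toy 60-nt "genome" (made up for illustration, not a PV sequence).
toy_genome = (
    "ATGCGTACCTGAAGTCCATGGTTCAAGCTT"
    "GACCGTAATCGGCTAAGTCGATCCGTTAGC"
)
# A linear assembly of a circular molecule that re-reads its first 20 bases.
toy_contig = toy_genome + toy_genome[:20]

print(terminal_overlap(toy_contig))
print(circularize(toy_contig) == toy_genome)
```

Exact string matching is used for simplicity; real assemblies may need a tolerance for sequencing errors within the overlap.
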
We identified three contigs that represented complete genome sequences (based on terminal redundancy), ranging in size from 7769 to 7953 nt, to which ~0.0041-0.1475% of the total reads mapped. These three contigs had a read depth of 16.9×, 99.2×, and 103.2× and 881, 5191, and 5315 mapped reads, respectively, based on BBmap [25] read mapping analysis. The mapped reads have been deposited at SRA under BioProject: PRJNA874427; BioSample: SAMN30547858, SAMN30547860; SRA: SRR21284443, SRR21284445. The genomes were annotated (Fig. 1) using CenoteTaker2 [26] and refined using PaVE [13].
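
As a rough sanity check, mean read depth can be approximated as the number of mapped reads multiplied by the read length and divided by the genome length. The sketch below assumes that every mapped read contributes the full 150 nt (trimmed reads may be shorter) and, because the length of each individual contig is not stated here, evaluates each read count at both the shortest and longest reported genome lengths. The function name is hypothetical, and this is not how BBmap computes its statistics.

```python
def mean_depth(mapped_reads: int, read_length: int, genome_length: int) -> float:
    """Approximate mean depth: total mapped bases divided by genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return mapped_reads * read_length / genome_length


READ_LENGTH = 150                         # nominal read length (2 x 150 bp run)
SHORTEST, LONGEST = 7769, 7953            # reported genome size range (nt)

for reads in (881, 5191, 5315):           # mapped-read counts reported above
    low = mean_depth(reads, READ_LENGTH, LONGEST)
    high = mean_depth(reads, READ_LENGTH, SHORTEST)
    print(f"{reads} reads: about {low:.1f}x to {high:.1f}x")
```

Small differences from the reported depths are expected, since read trimming and partial alignments change the number of bases each read contributes.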

Fig. 1. (A) Genome organization map of VavPV1 and VavPV2. (B) Identification of the conserved zinc-binding motifs in the E6 protein and the CR1, CR2, and zinc-binding motif in the E7 protein of the VavPVs. (C) Pairwise amino acid sequence identity values for the various proteins of VavPV1 and VavPV2. (D) Pairwise identity values for the proteins of VavPV1 and VavPV2 in comparison with their closest homologues from other PVs. The closest homologues were identified using a BLASTp search of the complete PV protein sequence database, and pairwise identities were determined using SDT v1.2 [28].

Two of the PV genome sequences (accession nos. OP376965 and OP376966), obtained from black-and-white ruffed lemur individuals H1 and H2, share 99.4% identity. Thus, they are likely of the same PV type. We have named this virus "Varecia variegata papillomavirus 1" (VavPV1). The third PV genome sequence (accession no. OP376964), from black-and-white ruffed lemur H1, shares 63.0-63.2% pairwise identity with the VavPV1 isolates. Thus, it is clearly a different PV type, and we have named this virus "Varecia variegata papillomavirus 2" (VavPV2). Black-and-white ruffed lemur H1 appears to have been coinfected with VavPV1 and VavPV2. We did not detect any PV sequences in black-and-white ruffed lemur H3.
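
For readers unfamiliar with how a pairwise identity value is obtained, the following minimal sketch computes percent identity from two sequences that have already been aligned. It assumes that columns in which either sequence has a gap are ignored; the exact treatment of gaps in a given tool may differ, and the function name and toy sequences are illustrative only.

```python
def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns without a gap character in either sequence.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Toy aligned sequences (made up): 5 gap-free columns, 4 of them identical.
print(pairwise_identity("ACGT-A", "ACGATA"))
```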

In VavPV1 and VavPV2, we identified all six core PV genes as well as putative E1^E4 and E8^E2 coding regions (Fig. 1). In the E6 and E7 proteins, we identified the conserved zinc-binding domains (CxxC), and in the E7 of VavPV2, we identified the pRB-binding motif (Lx[C/S]xE). We note that the lack of a pRB-binding motif in the E7 protein of VavPV1 is unusual. However, other previously described PVs also lack this motif (i.e., PVs in the genera Deltapapillomavirus, Dyochipapillomavirus, Dyoetapapillomavirus, Dyoiotapapillomavirus, Dyolambdapapillomavirus, Dyopsipapillomavirus, Dyorhopapillomavirus, Dyotaupapillomavirus, Dyoxipapillomavirus, Dyozetapapillomavirus, Epsilonpapillomavirus, Nupapillomavirus, Pipapillomavirus, Sigmapapillomavirus, and Zetapapillomavirus). In E7, two additional regions (CR1 and CR2) have been identified [10]. These regions share similarities to those in the E1A protein of human adenovirus 5 (family Adenoviridae) and the large tumor antigen of simian virus 40 (family Polyomaviridae). These two regions are also clearly recognizable in the E7 protein of the VavPV isolates (Fig. 1).
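
The motif searches described above can be sketched with regular expressions, in which each "x" in CxxC and Lx[C/S]xE becomes a single-residue wildcard. This is a minimal illustration, not the annotation procedure used for the VavPVs; the function name and the toy peptide are assumptions made for the example.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
ZINC_BINDING = re.compile(r"(?=(C..C))")      # CxxC
PRB_BINDING = re.compile(r"(?=(L.[CS].E))")   # Lx[C/S]xE


def find_motifs(protein: str, pattern: re.Pattern) -> list:
    """Return (1-based start position, matched residues) for every occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


# Toy peptide (made up for illustration, not a VavPV protein).
toy_peptide = "MHGDLYCYEQCAACKKCGGC"

print(find_motifs(toy_peptide, PRB_BINDING))
print(find_motifs(toy_peptide, ZINC_BINDING))
```

A protein lacking the pRB-binding motif, as reported for the E7 of VavPV1, would simply return an empty list for `PRB_BINDING`.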

PV species classification is based on L1 nucleotide sequence similarity [27], with 70% identity being the species demarcation threshold. A query of the L1 genes of VavPV1 and VavPV2 revealed that they share <64% identity with each other and <66% identity with all other PV L1 sequences, as determined using SDT v1.2 [28]. Thus, VavPV1 and VavPV2 each represent a new PV species. To accurately determine the genus assignment for these two new PVs, we assembled a dataset of firstpapillomavirus E1, E2, and L1 protein sequences from PaVE [13] that included reference and non-reference sequences. The datasets were aligned with the sequences of the VavPVs using MAFFT v7.113 in auto mode [29]. The alignments were trimmed using trimAl [30] with a gap threshold of 0.2. For each dataset, we determined the best-fit amino acid substitution model using ProtTest 3 [31]. The E1, E2, and L1 amino acid sequence alignments were concatenated, and the resulting alignment was used to construct a partitioned maximum-likelihood phylogenetic tree using IQ-TREE 2 [32] with the model LG+I+G for E1, LG+I+G+F for E2, and LG+I+G+F for L1. The tree was rooted using avian and reptilian PV sequences and viewed and annotated in iTOL v6 [33]. The E1+E2+L1 sequences of the VavPVs clustered together and were part of a broader, well-supported clade containing sequences of viruses of the genera Dyoxipapillomavirus, Gammapapillomavirus, Pipapillomavirus, Taupapillomavirus, and Treisetapapillomavirus, as well as six unclassified viruses (Fig. 2). Although the results of L1 nucleotide sequence analysis using PaVE [13] would place the two VavPVs in the genus Gammapapillomavirus (60% pairwise identity threshold), the E1+E2+L1 protein phylogeny clearly shows that (1) the current gammapapillomaviruses, in particular gammapapillomavirus 6 and 7, are not monophyletic and that (2) phylogenetically, they are more closely related to taupapillomaviruses (Fig. 2).
Therefore, VavPV1 and VavPV2 represent two new species in a putative new genus. Interestingly, this identifies another phylogenetic clade containing primate viruses, further demonstrating that the true diversity of the family Papillomaviridae is still not completely understood. The phylogenetic position of VavPV1 and VavPV2 further argues that other evolutionary mechanisms, in addition to cospeciation, must be in play [34].
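
The identity thresholds mentioned above (70% for species, 60% for genus) can be expressed as a simple first-pass rule, sketched below. The function name, the returned labels, and the 65.0% example value are hypothetical. As argued in the text, such a rule is only a starting point: a purely threshold-based assignment would place the VavPVs in an existing genus, whereas the E1+E2+L1 phylogeny supports a putative new genus.

```python
SPECIES_THRESHOLD = 70.0  # % L1 nucleotide identity, species demarcation
GENUS_THRESHOLD = 60.0    # % L1 nucleotide identity, genus demarcation


def l1_threshold_rank(best_identity: float) -> str:
    """First-pass rank suggested by the best L1 identity to any known PV."""
    if not 0.0 <= best_identity <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    if best_identity >= SPECIES_THRESHOLD:
        return "existing species"
    if best_identity >= GENUS_THRESHOLD:
        return "new species, existing genus (check phylogeny)"
    return "new species, new genus"


# Hypothetical best-hit identity below 66%, as reported for the VavPV L1 genes.
print(l1_threshold_rank(65.0))
```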

Fig. 2. Partitioned maximum-likelihood phylogenetic tree of concatenated amino acid sequences of E1, E2, and L1. Branches with >0.8 aLRT branch support are shown. PV genera are shown, and currently unclassified PVs that are unique at the species level were also included in the phylogeny.

In this report, we describe the first complete genome sequences of PVs in lemuriform primates, and these novel viruses constitute a potential new genus. This contributes to our knowledge of non-human primate PVs, with particular significance for the virology, animal husbandry, and animal conservation communities. Continued research in this area should help reveal the evolution and diversity of potential species-specific PVs present across the more than 100 species of lemurs.