Introduction
The term “pararetrovirus” was introduced by Temin [1] for animal (Hepadnaviridae) and plant viruses (Caulimoviridae) that, in contrast to retroviruses, have a DNA genome and do not integrate into the host genome for replication. Like retroviruses, pararetroviruses use a reverse transcriptase for their replication.
Endogenous pararetroviruses (EPRVs) in plants represent counterparts of members of the virus family Caulimoviridae integrated in their host’s genome. Despite the non-integrative replication cycle of members of the Caulimoviridae, a growing number of integrated viral sequences have been reported and are still being identified in various plant genomes [2–5].
Most of the integrants are silent, repetitive genome components. However, some of these sequences may still be able to replicate and initiate viral infection under certain conditions, depending on their structural and sequence integrity and their genomic and/or epigenetic context.
Suggestions for a uniform nomenclature of endogenous virus sequences
Given the rapidly growing diversity of EPRVs discovered in plant genomes, the need for a uniform nomenclature is obvious. Owing to the multi-copy nature of EPRVs, important differences in sequence composition and structure have been observed among integrants.
It would be highly desirable for a nomenclature system (1) to distinguish endogenous from episomal caulimovirid sequences, (2) to discriminate potentially functional integrants from passive and pseudogene host genome components and (3) to describe the element’s viral activity in a specific genomic context.
In some genomes, a wide variety of EPRVs has been identified, comprising viral sequences with or without exogenous virus counterparts [6, 7] as well as rearranged and functional forms of a specific virus genome [3].
Like their exogenous counterparts, EPRVs can be classified as petuvirus-like elements, badnavirus-like elements or as members of the genera Caulimovirus, Cavemovirus or Tungrovirus according to the number and arrangement of open reading frames (ORFs) and nucleotide sequence homologies with episomal viruses (see Table 1).
So far, authors have used the prefix “E-” or “e-” for “endogenous” (e.g. ERTBV [8] and ePVCV [9]; see Footnotes 1 and 2) or the suffix “-EPRV” (BSGFV EPRV [3]; see Footnote 3) in connection with the virus name to distinguish integrated viral sequences from the homologous episomal virus. In other cases, integrants are named after the host plant from which they have been isolated, analogous to the nomenclature of transposons (e.g. Sotu in Solanum tuberosum or LycEPRV in several species of Solanum subsection Lycopersicon [10–12]).
One major point for the nomenclature of plant endogenous virus sequences is to identify a significant relationship between viruses and integrated sequences. Usually, the highest matches of sequence identity are considered relevant. Based on existing sequence comparisons (Table 2), we suggest a threshold level of at least 80% nucleotide identity over 80% of the sequence within the polymerase (POL) reading frame (“ORF3” in Table 2) to confirm the affiliation of an endogenous sequence to a virus. This value is based on the suggestions of Wicker et al. [12] for the distinction of transposable elements and the ICTV rules for distinguishing species in the family Caulimoviridae [13]. It remains to be seen if this threshold value is appropriate when more EPRV sequences become available. Additionally, the comparison of integrated virus sequences can identify distinct EPRVs in the same host genome (e.g. NsEPRV and NtoEPRV in Nicotiana tabacum; [14]).
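The proposed threshold can be expressed as a simple check. The following Python sketch is only an illustration: the class PolAlignment, its fields and the function is_affiliated are hypothetical names, and it assumes that "80% of the sequence" is measured as the aligned fraction of the reference virus POL reading frame, which the text does not spell out.

```python
from dataclasses import dataclass

# Thresholds taken from the text: at least 80% nucleotide identity
# over at least 80% of the POL reading frame.
MIN_IDENTITY_PERCENT = 80
MIN_COVERAGE_PERCENT = 80


@dataclass
class PolAlignment:
    """Hypothetical summary of a pairwise alignment within the POL ORF."""
    identical_positions: int  # aligned positions carrying the same nucleotide
    aligned_positions: int    # total aligned positions (gaps excluded, an assumption)
    pol_length: int           # length of the reference virus POL ORF in nucleotides


def is_affiliated(aln: PolAlignment) -> bool:
    """Return True if the integrant meets both suggested thresholds."""
    if aln.aligned_positions <= 0 or aln.pol_length <= 0:
        return False
    # Integer arithmetic avoids floating-point surprises at exactly 80%.
    identity_ok = aln.identical_positions * 100 >= MIN_IDENTITY_PERCENT * aln.aligned_positions
    coverage_ok = aln.aligned_positions * 100 >= MIN_COVERAGE_PERCENT * aln.pol_length
    return identity_ok and coverage_ok
```

An integrant that aligns well over only a short stretch of POL fails the coverage test even if its identity is high, which is the case the second threshold is meant to catch.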
Another important feature for the classification of an integrant is whether it is functional and can trigger a virus infection. Thus, sequences in question have to be isolated and their infectivity has to be proven experimentally by infection of the respective host plants.
These considerations led us to the following suggestions for a uniform distinction between integrated pararetrovirus sequences (EPRVs) and their homologous exogenous viruses (see also Table 3):
1. If the endogenous sequence can be affiliated to an exogenous virus, viral sequences integrated into the host genome should be marked by the prefix “e” (endogenous) in front of the virus abbreviation in cases where no information about the status of the integrant is yet available (eBSOLV, ePVCV). If the integrant is known to be activatable, the prefix “ea” (endogenous and activatable) in connection with the virus abbreviation should be applied. Episomal viruses should be referred to following the ICTV nomenclature.
1.1 Functional endogenous copies are able to release a replication-competent viral genome with high similarity to an exogenous virus and should be marked by the prefix “ea” followed by the virus name (eaPVCV, eaBSGFV-7).
1.2 An integrant related to an exogenous counterpart, but not known to be functional as a virus per se, should be named with the prefix “e” and the virus name (e.g. eTVCV, eBSGFV-9). The integrant itself is incapable of making a functional virus: e.g. no transition from the endogenous to the episomal form is known, or the sequence lacks functional ORFs due to mutations. However, activation of “eEPRV” by recombination with exogenous counterparts cannot be ruled out. Moreover, “eEPRVs” may fulfill other purposes in the host, such as providing virus resistance [15].
1.3 In cases where it is necessary to distinguish different integrated copies from each other, a numerical or alphabetical index (such as eaBSGFV-7 and eBSGFV-9) should be introduced.
2. When no exogenous virus is currently known or when only small fragments of a viral genome have been identified, the host plant initials plus the suffix “EPRS” (for “endogenous pararetroviral sequence”) should be chosen (e.g. SotuEPRS for Solanum tuberosum endogenous pararetroviral sequence, Table 4).
For some integrated viral sequences, homologous viruses may be unknown, e.g. because they are extinct or have not been discovered yet. Therefore, we suggest using e- or ea-(virus name) only in cases where the exogenous form is known. If there is any doubt about the existence of a corresponding pararetrovirus, (plant initials)-EPRS should be chosen; e.g. integrants in the rice genome that reveal weak homology to RTBV ([8], Table 2) were suggested to be classified as OsEPRS (Oryza sativa endogenous pararetroviral sequence) according to recent sequence alignments (see Table 3, Geering pers. comm., Table 4). As soon as more sequence information becomes available, existing EPRV names have to be changed accordingly.
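The naming rules above can be summarised as a small decision procedure. The Python sketch below is only an illustration of suggestions 1 and 2: the function suggest_name and its parameters are hypothetical, and it assumes that the copy index of suggestion 1.3 applies only to names carrying the “e” or “ea” prefix, since the text introduces the index under suggestion 1.

```python
from typing import Optional


def suggest_name(
    virus_abbrev: Optional[str],
    host_initials: str,
    activatable: bool = False,
    copy_index: Optional[str] = None,
) -> str:
    """Build a name for an integrated sequence following suggestions 1 and 2.

    virus_abbrev  -- abbreviation of the affiliated exogenous virus, or None
                     if no exogenous counterpart is known (or it is in doubt)
    host_initials -- host plant initials, used only for the EPRS form
    activatable   -- True if the integrant is known to release a functional virus
    copy_index    -- optional numerical or alphabetical index for single copies
    """
    if virus_abbrev is None:
        # Suggestion 2: no known exogenous virus, so name after the host plant.
        return f"{host_initials}EPRS"

    # Suggestions 1.1 and 1.2: "ea" for activatable copies, otherwise "e".
    prefix = "ea" if activatable else "e"
    name = f"{prefix}{virus_abbrev}"

    # Suggestion 1.3: append an index where copies must be told apart.
    if copy_index is not None:
        name = f"{name}-{copy_index}"
    return name
```

For example, suggest_name("BSGFV", "M", activatable=True, copy_index="7") gives eaBSGFV-7, and suggest_name(None, "Sotu") gives SotuEPRS; the host initials "M" in the first call are a placeholder and are ignored whenever a virus abbreviation is supplied.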
We hope that these suggestions will provide some guidance for developing a uniform scheme for the nomenclature of integrated viral sequences. We encourage a discussion leading to future improvements of the nomenclature. Anybody who wants to comment on this proposal is welcome to do so using the following website: http://talk.ictvonline.org/ (see Footnote 4).
Notes
1. RTBV: Rice tungro bacilliform virus.
2. PVCV: Petunia vein clearing virus.
3. BSGFV: Banana streak virus Goldfinger species.
4. To view and participate in the discussion, users will have to create an account by clicking on “Join” in the upper right-hand corner of the web page (http://talk.ictvonline.org/). This is open to anyone who chooses to participate. Once registered, anyone can create a new discussion thread under the “General ICTV Discussions” link that is found by clicking on the “Discussions” top menu line.
References
Temin HM (1985) Reverse transcription in the eukaryotic genome: retroviruses, pararetroviruses, retrotransposons and retrotranscripts. Mol Biol Evol 2:455–468
Staginnus C, Richert-Pöggeler KR (2006) Endogenous pararetroviruses: two-faced travelers in the plant genome. Trends Plant Sci 11(10):485–491
Gayral P et al (2008) A single banana streak virus integration event in the banana genome as the origin of infectious endogenous pararetrovirus. J Virol 82:6697–6710
Pahalawatta V et al (2008) A new and distinct species in the genus Caulimovirus exists as an endogenous plant pararetroviral sequence in its host, Dahlia variabilis. Virology 376:253–257. doi:10.1016/j.virol.2008.03.003
Hohn T et al (2008) Evolution of integrated plant viruses. In: Roossinck MJ (ed) Plant virus evolution. Springer, Berlin, pp 53–81. doi:10.1007/978-3-540-75763-4
Harper G et al (2005) The diversity of banana streak virus isolates in Uganda. Arch Virol 150:2407–2420
Geering ADW et al (2005) Banana contains a diverse array of endogenous badnaviruses. J Gen Virol 86:511–520
Kunii M et al (2004) Reconstruction of putative DNA virus from endogenous rice tungro bacilliform virus-like sequences in the rice genome: implications for integration and evolution. BMC Genomics 5:80. doi:10.1186/1471-2164-5-80 (http://www.biomedcentral.com/1471-2164/5/80)
Richert-Pöggeler KR et al (2003) Induction of infectious Petunia vein clearing (pararetro) virus from endogenous provirus in petunia. EMBO J 22:4836–4845
Hansen CN et al (2005) Characterization of pararetrovirus-like sequences in the genome of potato (Solanum tuberosum). Cytogenet Genome Res 110:559–565
Staginnus C et al (2007) Endogenous pararetroviral sequences in tomato (Solanum lycopersicum) and related species. BMC Plant Biol 7:24. doi:10.1186/1471-2229-7-24
Wicker T et al (2007) A unified classification system for eukaryotic transposable elements. Nat Rev Genet 8:973–982. doi:10.1038/nrg2165
Fauquet C et al (2005) Virus taxonomy: classification and nomenclature of viruses: eighth report of the International Committee on the Taxonomy of Viruses. Elsevier/Academic Press, San Diego
Gregor W et al (2004) A distinct endogenous pararetrovirus family in Nicotiana tomentosiformis, a diploid progenitor of polyploid tobacco. Plant Physiol 134:1191–1199
Mette MF et al (2002) Endogenous viral sequences and their potential contribution to heritable virus resistance in plants. EMBO J 21:461–469
Geering ADW et al (2001) Analysis of the distribution and structure of integrated banana streak virus DNA in a range of Musa cultivars. Mol Plant Pathol 2:207–213
Geering ADW et al (2005) Characterisation of banana streak Mysore virus and evidence that its DNA is integrated in the B genome of cultivated Musa. Arch Virol 150:787–796
Harper G et al (1999) Detection of episomal banana streak badnavirus by IC-PCR. J Virol Methods 79:1–8
Ndowora T et al (1999) Evidence that badnavirus infection in Musa can originate from integrated pararetroviral sequences. Virology 255:214–220
Harper G et al (2002) Viral sequences integrated into plant genomes. Annu Rev Phytopathol 40:119–136
Lockhart BE et al (2000) Characterization and genomic analysis of tobacco vein clearing virus, a plant pararetrovirus that is transmitted vertically and related to sequences integrated in the host genome. J Gen Virol 81:1579–1585
Jakowitsch J et al (1999) Integrated pararetroviral sequences define a unique class of dispersed repetitive DNA in plants. Proc Natl Acad Sci USA 96:13241–13246
Matzke M et al (2004) Endogenous pararetroviruses of allotetraploid Nicotiana tabacum and its diploid progenitors, N. sylvestris and N. tomentosiformis. Biol J Linn Soc 82:627–638
Acknowledgments
The authors thank M.A. Grandbastien for valuable comments on the manuscript and R. Hull for helpful advice in defining a threshold level for sequence comparisons.
Cite this article
Staginnus, C., Iskra-Caruana, M.L., Lockhart, B. et al. Suggestions for a nomenclature of endogenous pararetroviral sequences in plants. Arch Virol 154, 1189–1193 (2009). https://doi.org/10.1007/s00705-009-0412-y
DOI: https://doi.org/10.1007/s00705-009-0412-y