
1 Introduction

Since the first discovery of microorganisms in the boiling hot springs of Yellowstone National Park by Thomas Brock in 1966, many thermophilic organisms including bacteria, Archaea and fungi have been isolated from different hot environments. Where there is life, there are viruses. Viruses infecting thermophilic microorganisms are abundant in terrestrial and marine hot environments and have been isolated from a range of hot environments all over the world, including hydrothermal vents, acidic and neutral hot springs and self-heating organic material such as compost.

Viruses constitute a major component of the biosphere and play a significant role in nutrient and energy cycling of carbon, nitrogen and phosphorus (Krupovic et al. 2011). Moreover, viruses have important impacts on the evolution of their hosts. For example, viruses can influence the genetic diversity of prokaryotes by horizontal gene transfer among different species. As killers of bacteria and Archaea, viruses play an essential role in regulating the structure and composition of microbial communities (Weinbauer and Rassoulzadegan 2004). Importantly, thermophilic viruses can serve as model systems for studying life at high temperatures, and they constitute a valuable source of novel enzymes of high biotechnological and industrial potential.

Viruses are genetic elements that require host resources for their replication. They have single- or double-stranded RNA or DNA genomes ranging in size from a few thousand to a million bases. In order to thrive and multiply, viruses have to be transmitted from one host to another. When they exit the cell, they have to survive in the natural environment of the host (Madigan and Brock 2009). Thus, viruses have an extracellular form that enables them to exist outside the host for extended periods of time. Thermophilic viruses have adapted to hot environments, just as their hosts have. Thermophiles are organisms that grow at temperatures above 45°C; those that grow at temperatures above 80°C are called hyperthermophiles. Above 65°C only prokaryotic life forms are found, and Archaea are the more successful domain (Lengeler et al. 1999). Archaea are adapted to conditions of chronic energy stress and have adaptations in membrane composition and metabolic pathways that provide a competitive advantage in a range of extreme environments (Valentine 2007). Although both Archaea and bacteria exist in hot environments, the proportion of Archaea tends to increase and that of bacteria to decrease as the temperature rises. This is also reflected by the temperature range of host cells for thermophilic bacteriophages and archaeal viruses. As illustrated by Fig. 10.1, the optimal growth temperatures for hosts of bacteriophages range from 45 to 75°C, while the hosts of archaeal viruses have optimal growth temperatures from 65 to 100°C. This implies that thermophilic bacteriophages and especially thermophilic archaeal viruses must employ special strategies to cope with the detrimental high temperatures when they are outside of their host. It is worth noting that the number of thermophilic viruses isolated so far is still limited, and the distribution illustrated by Fig. 10.1 may change when more viruses are obtained. Moreover, as seen in Fig. 10.2, the locations where thermophilic viruses have been sampled are few due to the natural obstacles of hot environments. Thus, it is only with the development of advanced equipment that isolation and ecological studies of thermophilic and especially hyperthermophilic viruses have been possible.
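The temperature thresholds given above can be written down as a small classifier. This is only an illustrative sketch: it assumes the optimal growth temperature is used as the deciding value, the function name `thermal_class` and the catch-all label "non-thermophile" are hypothetical, and the example hosts are placeholders rather than organisms from the tables.

```python
def thermal_class(optimum_c: float) -> str:
    """Classify an organism by its optimal growth temperature in degrees Celsius.

    Thresholds follow the text: above 45 °C is thermophilic and above 80 °C
    is hyperthermophilic. "non-thermophile" is an illustrative catch-all label.
    """
    if optimum_c > 80:
        return "hyperthermophile"
    if optimum_c > 45:
        return "thermophile"
    return "non-thermophile"


# Hypothetical hosts spanning the ranges quoted for bacterial and archaeal hosts.
for name, optimum in [("host A", 50.0), ("host B", 75.0), ("host C", 100.0)]:
    print(name, optimum, thermal_class(optimum))
```

Because the thresholds are strict ("above"), a host with an optimum of exactly 80°C is still labelled a thermophile under this sketch.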

Fig. 10.2 Distribution of sampling locations

Fig. 10.1 The optimal growth temperature of thermophilic viral hosts. The graph shows an overview of the number of archaeal viruses and bacteriophages that infect hosts with an optimal growth temperature exceeding 45°C

This chapter focuses on the application potential of thermophilic viruses. Central to this topic are the many novel genetic and morphological features discovered in thermophilic viruses. In order to fully appreciate the huge potential these viruses have in basic research and biotechnology, we will introduce recent results derived from studies of structural adaptations, functional genomics, metagenomics, virus–host interactions and virus life cycle, all areas with remarkable characteristics. At the end, a section will cover more specific application areas including thermophilic viruses as nanobuilding blocks.

2 Virion Morphotypes of Thermophilic Viruses

Thermophilic viruses are much less studied than mesophilic viruses. Of approximately 5,600 known viruses, only a few have been isolated from thermophiles (Lin et al. 2011). Tables 10.1 and 10.2 present 57 isolated thermophilic viruses, the majority of which infect archaeal hosts. Although the number of known thermophilic viruses is limited, the morphological diversity is exceptional and includes unique morphotypes (Prangishvili and Garrett 2005), providing us with unique forms and complex features for possible future applications.

Table 10.1 Morphology, taxonomical classification and characteristics of thermophilic archaeal viruses
Table 10.2 Morphology, taxonomical classification and characteristics of thermophilic bacteriophages

The thermophilic archaeal viruses are extremely diverse and can be categorised into nine distinct families including Fuselloviridae, Bicaudaviridae, Ampullaviridae, Clavaviridae, Lipothrixviridae, Rudiviridae, Globulaviridae, Myoviridae and Siphoviridae and four taxa of uncertain affiliation, represented by five novel viruses: STIV1, STIV2, PAV1, STSV1 (Pina et al. 2011) and TPV1 (Gorlas et al. 2012). Enrichment cultures from hot environments exhibit a range of morphotypes including exceptional forms not previously observed for any dsDNA virus. The newly described morphotypes of archaeal viruses include spindle, droplet and bottle shapes (Arnold et al. 2000a; Mochizuki et al. 2010). These morphotypes are not rare in hot environments, and electron microscopy studies from terrestrial hot springs suggest that spindles, filaments, rods and spheres predominate (Garrett et al. 2010). The thermophilic archaeal viruses have been isolated from several archaeal genera including Sulfolobus, Acidianus, Thermoproteus, Aeropyrum, Stygiolobus, Pyrobaculum, Methanobacterium, Pyrococcus and Thermococcus (Pina et al. 2011). Pyrococcus and Pyrobaculum have optimal growth temperatures of 100°C demonstrating the extreme temperatures the thermophilic archaeal viruses have to adapt to.

Most of the isolated thermophilic bacteriophages exhibit a head–tail morphology similar to that of a T bacteriophage and belong to the families Myoviridae and Siphoviridae. Interestingly, in an extensive survey by the Promega Corporation in which 115 Thermus bacteriophages were isolated, only 45% of the bacteriophages were tailed. This was surprising because tailless bacteriophages of all types comprise only 4% of about 5,150 bacteriophages observed in the electron microscope. Of the tailless thermophilic bacteriophages, three have been classified: two belong to the family Tectiviridae (Yu et al. 2006) and one belongs to the family Inoviridae. The thermophilic bacteriophages have been isolated from six bacterial genera: Bacillus, Geobacillus, Thermus, Meiothermus, Rhodothermus and Thermomonospora (Lin et al. 2011).

It is noteworthy that out of the seven described virion morphotypes, the bacterial viruses are represented by three, and the archaeal viruses are represented in all seven morphotype categories.

2.1 Spindle-Shaped Viruses

Viruses with a spindle- or lemon-shaped morphotype have been observed in both terrestrial and marine hot environments. Viruses with spindle-shaped morphology, with one or two tails, are common in and exclusive to the domain Archaea (Garrett et al. 2010). Thermophilic spindle-shaped archaeal viruses have been divided into two families, Fuselloviridae and Bicaudaviridae, and three viruses await classification.

The Fuselloviridae Family. All the known fuselloviruses infect and propagate in the hyperthermophilic genera Sulfolobus and Acidianus including eight Sulfolobus spindle-shaped viruses (SSV1, 2, 4, 5, 6, 7, k1 and rh) and one Acidianus spindle-shaped virus (ASV1) (Table 10.1). The spindle-shaped virions are approximately 55–60 nm wide and 80–100 nm long (Fig. 10.3a). The exceptions are the viruses SSV6 and ASV1, whose virions tend to be pleomorphic (Redder et al. 2009). At one of the pointed ends of the spindle shape, a short tail of thin fibres is attached, which appear to be extremely sticky and readily attach to cellular fragments. The thin terminal fibres can also attach to the same type of fibres in other virions, which can produce rosette-like aggregates (Redder et al. 2009).

Fig. 10.3 Electron micrographs of (a) SSV6 (Redder et al. 2009); (b) ATV (Prangishvili et al. 2006); (c) ABV (Haring et al. 2005); (d) SNDV (Arnold et al. 2000a); (e) APBV1 (Mochizuki et al. 2010); (f) PSV (Haring et al. 2004); (g) SIRV2 (Prangishvili et al. 1999); (h) AFV3 (Vestergaard et al. 2008a); (i) STIV2 (Happonen et al. 2010) and (j) P23-45 (Minakhin et al. 2008). The white arrow in (a) indicates the interaction between tail fibres of four virions. Panel (i) shows the central section of a cryo-EM image of STIV2. Scale bars: 50 nm in (d); 100 nm in (a), (b), (c), (f), (i) and (j); 200 nm in (e), (g) and (h)

The Bicaudaviridae Family. The Bicaudaviridae family has only one member, the Acidianus two-tailed virus (ATV). The virions are exceptional in that they are extruded from host cells as tailless spindle-shaped particles, which then develop long tails at each pointed end. The tailless and the two-tailed virions have different virion dimensions. The tailless virions exhibit an average length of 243 nm and a maximum width of 119 nm, while the two-tailed virions show an average length of 744 nm and a maximum width of 85 nm (Fig. 10.3b) (Prangishvili et al. 2006). The tail development happens at temperatures above 75°C and independently of host cells or any energy sources (Haring et al. 2005). The only other known examples of extracellular viral morphogenesis comprise initial steps of infection or final steps in particle assembly and budding, and these are triggered on the cell surface of the host. The tails consist of tubes that terminate in an anchor-like structure. One function of the elongated, flexible tails might be to enhance the probability of virion adsorption to a new host cell (Prangishvili et al. 2006).

Three spindle-shaped viruses remain to be classified: Pyrococcus abyssi virus 1 (PAV1), Thermococcus prieurii virus 1 (TPV1) and Sulfolobus tengchongensis virus 1 (STSV1). PAV1 virions are approximately 120 nm long and 80 nm wide with a 15 nm long tail terminating in fibres. The virus contains a dsDNA genome of 18 kb (Geslin et al. 2003). STSV1 produces spindle-shaped virions (230 nm × 107 nm) with a single tail of variable length (Xiang et al. 2005). Similar to ATV, the spindle size of STSV1 seems inversely proportional to the length of the tail; that is, the longer the tail of a specific virion, the smaller the spindle. This might reflect a reorganisation of structural components in the virions. The recently characterised TPV1 has a lemon-shaped virion approximately 140 nm long and 80 nm wide with a 15 nm long tail terminating in fibres, similar to the structure observed for PAV1. TPV1 contains a dsDNA genome of 21.5 kb (Gorlas et al. 2012).

2.2 Bottle-Shaped, Droplet-Shaped and Bacilliform Viruses

The virions of three archaeal viruses, ABV, APBV1 and SNDV, were all isolated from extremely hot terrestrial environments, and all of them display morphological features that are so unique that each of the viruses has been assigned to a new family.

The Ampullaviridae Family. The virion of Acidianus bottle-shaped virus (ABV), which infects the hyperthermophilic genus Acidianus, is structurally one of the most complex virions in the viral world. The complex form resembles a bottle approximately 230 nm long, with a width ranging from 4 to 75 nm along its length (Fig. 10.3c). The lipid-containing envelope encases a cone-shaped core formed by a toroidally supercoiled nucleoprotein filament. At the broad end of the bottle shape, a disc is present to which 20 (±2) short, thick filaments are attached (Haring et al. 2005). Their function is still unknown. The virion seems to adsorb to the host cell through its narrow end, suggesting that the tip of the virion could be involved in viral adsorption and in channelling viral DNA into host cells.

The Guttaviridae Family. The Sulfolobus neozealandicus droplet-shaped virus (SNDV) exhibits a complex droplet morphology. The droplet is approximately 90 nm wide and 180 nm long and carries multiple long, thin fibres that are attached at its pointed end (Fig. 10.3d). The surface appears to be helically ribbed and has been compared to a beehive (Arnold et al. 2000a). The circular dsDNA genome has not been sequenced, and the virus does not presently exist in culture collections.

The Clavaviridae Family. The Aeropyrum pernix bacilliform virus 1 (APBV1) has a short, stiff bacilliform shape. The virions are 140 nm long and 20 nm wide (Fig. 10.3e). One end is pointed; the other is rounded. The circular dsDNA genome of 5.2 kb is the smallest genome of known prokaryotic dsDNA viruses (Mochizuki et al. 2010). The approval of the family Clavaviridae is pending at the International Committee on Taxonomy of Viruses (ICTV).

2.3 Linear Viruses

Linear viruses represent the main virion morphotype in extreme geothermal environments. Linear viruses of bacteria and Eukarya carry either ssDNA or ssRNA. Interestingly, linear archaeal viruses isolated from hot environments carry dsDNA. The linear archaeal viruses have been classified into two new families: the stiff, rodlike Rudiviridae and the flexible, filamentous Lipothrixviridae. One thermophilic filamentous bacteriophage is known, which belongs to the Inoviridae family.

The Lipothrixviridae Family. Eleven members belonging to this family have been isolated, including seven Acidianus filamentous viruses (AFV1, 2, 3, 6, 7, 8 and 9), one Sulfolobus islandicus filamentous virus (SIFV) and three Thermoproteus tenax viruses (TTV1-3) (Table 10.1). Lipothrixvirus filaments are surrounded by envelopes containing lipids obtained from the host. The filament is approximately 24 nm wide and ranges in length from 900 nm (AFV1) to 2,000 nm (AFV3 and SIFV) (Fig. 10.3h). Lipothrixviruses carry identical terminal structures at both ends, indicating that both ends are able to bind to cell receptors. Members of this family show considerable diversity in their terminal structures. These structures can resemble claws (AFV1), T-bars (AFV9), mop-like structures (SIFV), three (AFV1) or six (SIFV) short fibres or tips resembling bottle brushes (AFV2) (Pina et al. 2011). The lipothrixvirus SIFV has a terminal structure that ends in a mop-like structure to which six tail fibres are attached. This structure unfolds like a spider’s legs before attaching to receptors on the host cell membrane (Arnold et al. 2000b). Apparently, members of the Lipothrixviridae family evolved different structures for cellular attachment, providing great flexibility in host recognition.

The Rudiviridae Family. This family contains two S. islandicus rudiviruses (SIRV1 and 2), one Acidianus rod-shaped virus (ARV1) and one Stygiolobus rod-shaped virus (SRV) (Table 10.1). The rod-shaped virions are nonenveloped, approximately 23 nm wide and 610–900 nm long (Fig. 10.3g). The length of the linear virion matches the length of the genomic dsDNA, a phenomenon also characteristic of bacterial inoviruses, which carry a ssDNA genome. The virion body is a tubelike superhelix formed by linear dsDNA and multiple copies of a highly glycosylated, DNA-binding capsid protein (Vestergaard et al. 2008b). A plug is located at each end of the virion. It is approximately 50 nm wide and carries three short terminal fibres.

The Inoviridae Family. The only thermophilic member of this family, PH75, infects the bacterium Thermus thermophilus (Pederson et al. 2001). The filamentous virions are 6 nm wide and 1,000–2,000 nm long. The terminal structures of inoviruses are not identical. One end functions as the attachment site to host cells. The virion attaches to the target cell via protein binding to a pilus. Pilus retraction then pulls the virion to the host’s internal membrane. The virions comprise a helical capsid surrounding a core of circular ssDNA. Unlike other bacterial viruses, PH75 does not lyse host cells but exits by extrusion.

2.4 Spherical Viruses

Two spherical virus families have been isolated from geothermal environments, Globulaviridae that infects Archaea and Tectiviridae that infects bacteria. Two spherical hyperthermophilic archaeal viruses, STIV1 and STIV2, still need to be assigned to a family.

The Globulaviridae Family. The family currently comprises two viral species, Pyrobaculum spherical virus (PSV) and T. tenax spherical virus 1 (TTSV1) (Fig. 10.3f; Table 10.1). PSV infects the anaerobic hyperthermophilic archaeon Pyrobaculum, which has a temperature optimum of 100°C. The two viruses are highly similar in morphological and genomic properties. The spherical virions consist of a lipid envelope, which encases a helical nucleocapsid containing a linear dsDNA genome with inverted terminal repeats. The sphere of PSV has a diameter of approximately 100 nm (Pina et al. 2011).

STIV Viruses. The hyperthermophilic Sulfolobus turreted icosahedral viruses, STIV1 and STIV2 (Fig. 10.3i), remain taxonomically unclassified. The two viruses share a similar icosahedral structure and appear to be closely related. The virions have nontailed icosahedral structures with an internal lipid monolayer, which encases the circular dsDNA (Fulton et al. 2009). They share an architectural similarity with viruses of the bacteriophage family, Tectiviridae. Image reconstruction of the STIV virion revealed a unique virus architecture including complex, turret-like projections extending from each of the vertices, which may be involved in host recognition.

The Tectiviridae Family. Two thermophilic bacteriophages have been assigned to this family, P23-77 and ϕIN93 (Table 10.2). The virions of tectiviruses consist of an icosahedral protein capsid with an internal membrane vesicle that encloses the linear dsDNA genome. The internal membrane contains lipids organised as a bilayer underneath the protein capsid, which can give a layered shell appearance of the spherical virion (Jaatinen et al. 2008). There is no indication of any tail structure. Both thermophilic tectiviruses have small genomes of approximately 17–20 kb, possessing only the basic functions needed for survival in hot environments.

2.5 Head–Tail Viruses

Several thermophilic viruses with head–tail morphotypes have been isolated from hot environments, most of which infect bacteria. Only two thermophilic archaeal viruses with head–tail morphology have been isolated so far; they infect the thermophilic Methanobacterium. The isolated head–tail viruses fall into the two families Myoviridae and Siphoviridae, and five still remain to be classified. The head has icosahedral symmetry, and the tail structure differs in length and construction.

The Myoviridae Family. Eight thermophilic viruses belong to this family: three infect Thermus, one each infects Rhodothermus and Meiothermus, and three infect Geobacillus (Table 10.2). The hexagonal heads are approximately 60 nm wide, and the contractile tails vary in length between 80 and 150 nm. In T4, a mesophilic myovirus, the contractile tail ends with a complex baseplate with six long fibres radiating from it. These fibres are used for cell attachment (Comeau et al. 2007). A similar structure has been observed in the thermophilic myovirus ϕTMA. As in a typical myovirus such as T4, the tail tube of ϕTMA protrudes from the bottom of the baseplate when the tail sheath contracts. Moreover, an electron microscopy study showed lipid vesicles that appeared to be bound to the bottom of the baseplate of the ϕTMA phage particles (Tamakoshi et al. 2011). Bacteriophage T4 is a very complex virus: more than 40 different proteins form the mature virion, and the dsDNA genome comprises approximately 172 kb. Among the isolated thermophilic myoviruses, ϕTMA, ϕYS40 and RM378 have genome sizes ranging from 130 to 152 kb; the rest have genomes smaller than 100 kb. It has been hypothesised that the large genomes of ϕTMA, ϕYS40 and RM378 may encode a complexity similar to that of T4.

The Siphoviridae Family. Four thermophilic bacteriophages and two thermophilic archaeal viruses belong to this family (Table 10.2). The hexagonal heads are approximately 60 nm wide, and the tails have a length of approximately 150 nm and a width of about 8 nm. The tails are noncontractile with a helical structure; at the end of the tail, short terminal fibres are attached. The virions are nonenveloped and carry dsDNA genomes, which have undergone extensive genetic exchange (Hendrix et al. 1999). The thermophilic siphovirus TSP4, which infects Thermus sp. and was isolated in southwest China, has an extremely long and flexible tail of 785 nm in length and 10 nm in width. It has been found that the end of the tail frequently adsorbs to cell debris, indicating the presence of attachment structures. Despite their exceptional habitats being separated by thousands of kilometres, the morphological characteristics of TSP4 showed high similarity with the thermophilic siphoviruses P23-45 (Fig. 10.3j) and P47-26, which have been isolated from hot springs in the Far East of Russia (Lin et al. 2010).

Five viruses with head–tail morphology remain to be classified: BVW1, Tb1, Tf2, Tf3 and Tf4. Bacillus virus W1 (BVW1) has a long tail of 300 nm in length and 15 nm in width, and the hexagonal head has a diameter of 70 nm. It contains a dsDNA genome of 18 kb (Liu et al. 2006). Tb1 has been isolated from Thermomonospora alba, and Tf2, Tf3 and Tf4 have been isolated from Thermomonospora fusca. All four Thermomonospora bacteriophages possess polyhedral heads and long tails. The tail lengths for Tb1 and Tf3 are approximately 260–280 nm and for Tf2 and Tf4 approximately 122–125 nm. Tail flexibility has been observed for the longer-tailed bacteriophages; however, flexibility is not apparent for the shorter-tailed types. The four bacteriophages contain dsDNA genomes of 35–45 kb (Lawrence et al. 1986).

3 The -Omics Era and Gene Regulation of Thermophilic Viruses

Genomics and functional genomics have been instrumental in biology over the past two decades. Although the number of thermophilic virus isolates is limited, the genomic and functional genomic data derived from them have provided insights into the diversity, ecology and gene regulation of these exceptional viruses.

3.1 Genomic Properties

All thermophilic viruses isolated so far package circular or linear dsDNA genomes, with the exception of bacteriophage PH75 which packages circular ssDNA. The genomes of thermophilic viruses vary considerably in size. The smallest genomes belong to the archaeal virus APBV1 (5.2 kb) and the bacteriophage PH75 (6.5 kb). The largest genomes are found among thermophilic bacteriophages ϕYS40 and ϕTMA, which range from 151 to 153 kb. The largest archaeal viral genome is found in STSV1 (75.2 kb). Genome size in prokaryotes is often a good predictor of metabolic complexity; it has been hypothesised that the large genomes of ϕTMA and ϕYS40 may encode a complexity similar to that of the bacteriophage T4.

The genomes of most isolated thermophilic viruses have been sequenced, providing a wealth of information about the genetic diversity of these viruses. However, the genes of thermophilic viruses generally yield few significant matches to sequences in public sequence databases, and most predicted gene products lack recognisable functions and homologs in extant databases. Identified functions in archaeal viruses are confined to a few proteins. For example, in STIV1 the structure of A197 reveals a GT-A fold that is common to many members of the glycosyltransferase superfamily, suggesting a glycosyltransferase activity for A197. Viruses commonly decorate their proteins with sugars as a means of regulating interactions with their hosts. While viruses can utilise their host’s glycosylation machinery, it is clear that some also encode their own proteins for specialised glycosylation needs; for example, some lytic bacteriophages glycosylate their DNA to protect it from host restriction enzymes (Larson et al. 2006). In SIRV1 and SIRV2, Holliday junction cleaving enzymes have been characterised. Holliday junction resolving enzymes are ubiquitous and found in all living cells. They are essential for DNA recombination, recombination-related DNA repair and recombination-dependent DNA replication (Birkenbihl et al. 2001). Moreover, SIRV1 and SIRV2 both encode a dUTPase, which is involved in nucleotide metabolism (Prangishvili et al. 1998). Similarly, another thermophilic rudivirus, ARV1, encodes a thymidylate synthase (Vestergaard et al. 2005). These enzymes function in adjacent steps of the de novo synthesis pathway of thymidine nucleotides, and either enzyme can help to maintain a low dUTP–dTTP ratio, thereby minimising misincorporation of uracil into DNA (Chen et al. 2002). This is important at high temperatures, at which dCTP deaminates more rapidly to yield dUTP. Thus both enzymes are most likely important for efficient replication and for the stability of the hyperthermophilic rudiviruses. In SSV1, an integrase has been identified; the site-specific recombination system it provides is suitable for further investigations into the regulatory mechanisms in the SSV1 genome (Muskhelishvili et al. 1992). Other functional categories characterised in archaeal viruses, including polymerases, ligases and nucleases, will be discussed in the application section. Furthermore, structural genomics has produced insights into the functions of some previously unknown viral genes. This was reviewed recently (Krupovic et al. 2012) and will not be discussed in this chapter.

One surprise when studying thermophilic viral genomes has been the variance in their genomic GC content. The GC content of nucleic acids is known to be correlated with the stability of their double helix. GC pairs are more stable than AT pairs because they have an additional hydrogen bond. Thus higher GC base pairing leads to higher thermal stability of the DNA (Galtier and Lobry 1997). This can be exploited by organisms living in thermal environments. It has been suggested that high GC content may be a selective response to high temperature. However, a study comparing GC content and optimal growth temperature for numerous prokaryotes failed to demonstrate the predicted correlation (Hurst and Merchant 2001). The GC content of thermophilic viruses reinforces this finding. Approximately 35% of the thermophilic archaeal viruses and 55% of the thermophilic bacteriophages have a GC content of 45% or higher (Fig. 10.4). This was surprising given that archaeal hosts have higher optimal growth temperatures than bacterial hosts (Fig. 10.1). One extreme example is the archaeal rudiviruses SIRV1 and SIRV2, which have a GC content of 25%, while the hyperthermophilic host, Sulfolobus, has an average GC content of about 37%. Apparently, factors other than GC base pairing, such as DNA-binding protein(s) in virions, play major roles in protecting viruses against heat.

Fig. 10.4 Graph comparing the GC content of thermophilic archaeal viruses and bacteriophages with their respective hosts. More than half of the bacteriophages have a GC content of 45% or higher. Only one third of the archaeal viruses have a GC content of 45% or higher
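The GC comparison discussed in this section rests on a simple calculation, which can be sketched as follows. This is a minimal illustration, not the procedure used to produce Fig. 10.4: the function names are hypothetical, ambiguous bases are assumed to be ignored, and the example sequence and values are toy data rather than real viral genomes.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (counts["G"] + counts["C"]) / total


def fraction_at_or_above(values, threshold=45.0):
    """Fraction of GC percentages at or above a threshold (45% as in the text)."""
    values = list(values)
    if not values:
        raise ValueError("no values given")
    return sum(v >= threshold for v in values) / len(values)


toy_sequence = "ATGCGCATTA"  # hypothetical sequence, not a real genome
print(gc_content(toy_sequence))

# Hypothetical GC percentages for a set of genomes.
print(fraction_at_or_above([25.0, 37.0, 45.0, 60.0]))
```

Applying `gc_content` to each sequenced genome and then `fraction_at_or_above` to each group of viruses would give proportions of the kind quoted in the text, assuming complete genome sequences are available.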

3.2 Metagenomics

The classification of viruses has traditionally been based on morphological characteristics, as demonstrated by the previous section. The system can only be used for viruses that are abundant enough for both microscopic study and genomic sequencing, which significantly biases our view of diversity towards the culturable fraction of the virus community (Pride and Schoenfeld 2008). One factor limiting the discovery of thermophilic viruses has been the reliance on culture-dependent methods for virus isolation. Hot environments are characterised by extreme conditions. For example, deep sea hydrothermal vents have strong physicochemical gradients, with temperatures from 2°C to more than 350°C, as well as high hydrostatic pressure, a lack of solar energy and a prevalence of chemosynthesis (Gorlas et al. 2012). No viral-cultivation study can fully mimic the temperature and pressure extremes that characterise deep sea hydrothermal vents; this imposes a selective pressure preventing the propagation of viruses unable to adapt to laboratory conditions. Studies have shown that only 1–15% of microbial organisms are culturable under laboratory conditions, demonstrating the number of genomes still to be discovered (Singh et al. 2009). The introduction of viral metagenomics has revolutionised the field of environmental virology by allowing the exploration of viral communities in a variety of environments, including extreme environments. Viral metagenomics involves viral particle purification followed by library construction and sequencing (Rosario and Breitbart 2011). Viral metagenomic studies of hot environments have provided information concerning viral biogeography, diversity, community structure and new viral types and have contributed to the discovery of a potential archaeal RNA virus.

There are approximately 26 published metagenomic studies investigating viral communities, of which five investigate hot environments. From these studies, it is clear that environmental viral communities are different from those observed in culture. Analysis of viral metagenomes has shown that about 30% of the sequences have detectable similarity to those in GenBank and that about half of these are most similar to other known viruses (Rosario and Breitbart 2011). The high percentage of unknown sequences demonstrates the vast novelty of genetic information to be obtained from viruses. Despite the lack of sequence identification from viral metagenomes, studies have used this technique to catalogue viruses in environmental samples based on the identifiable sequences and to investigate community composition through statistical analyses.

A metagenomic analysis of samples from Yellowstone hot springs investigated viral biogeography at a local and global scale (Schoenfeld et al. 2008). Two pools were investigated, Bear Paw (74°C) and Octopus (93°C). The two metagenomic libraries, one from each pool, showed a relatively close similarity (nearly 25%). This was surprising, given that microbial populations are temperature-dependent and the surface temperatures of the two hot springs differ by 19°C. Alignment of the metagenomes to whole-genome sequences of six cultivated thermophilic viruses revealed striking conservation of certain sequences. Comparison with the genome of PSV showed median identities of 60 and 51% to Bear Paw and Octopus, respectively. PSV was isolated from a hot spring with notably different geochemistry and more than 30 km away from both hot springs. The high median identities between PSV and the two metagenome libraries illustrate the global conservation of sequences in hot springs with different physical and biogeochemical properties (Schoenfeld et al. 2008). This correlates with the findings that groups of highly similar Sulfolobus viruses and Thermus bacteriophages have been isolated from hot springs on different continents (Wiedenheft et al. 2004; Yu et al. 2006).

In general, viral metagenomics has revealed an enormous diversity and abundance of viruses in all environments. Estimates of viral diversity suggest the existence of several thousand unidentified virus types (Weinbauer and Rassoulzadegan 2004). With metagenomics, it is possible to assemble complete genomes of unidentified virus-like particles obtained from environmental viromes, thus identifying novel viruses. A metagenomic study investigating an enriched environmental sample from a hot spring in Yellowstone National Park yielded two novel viral genomes, HAV1 and HAV2. Neither viral genome shows any clear similarity to other known archaeal viruses; only HAV2 shows morphological similarities with the two-tailed spindle-shaped ATV virus and limited sequence similarity between two genes. HAV1 has a linear 23 kb genome and HAV2 has a circular 18 kb genome; both have dsDNA. They yielded few significant matches in public sequence databases, reinforcing the notion of wide genetic diversity in archaeal viruses. The study attempted to identify the hosts of the two viruses by isolating archaeal strains from the environmental sample. However, none of the isolated strains were infected (Garrett et al. 2010). Based on the morphological and limited sequence similarities to ATV, HAV1 and HAV2 were inferred to be archaeal viruses. It can be difficult to identify potential archaeal hosts, especially because no reliable protocols for transfection of viral DNA into potential hosts have been developed. Metagenomic analyses can be a helpful tool for identifying new viruses, although information concerning hosts and virus life cycle can be difficult to confirm.

Identification of viral hosts is crucial in understanding the role of viruses in an ecosystem. It would aid in determining how viruses impact the microbial diversity of a given ecosystem, as they are the only predators in hot environments above 60°C. A study tried to identify specific hosts for a viral assemblage obtained from a hydrothermal vent in the Northeast Pacific. The viral metagenome was compared with a comprehensive database of spacers derived from the clustered regularly interspaced short palindromic repeat (CRISPR) adaptive immune system (Anderson et al. 2011). The CRISPR system is an antiviral defence mechanism found in both Archaea and bacteria (Marraffini and Sontheimer 2010). Cells incorporate genomic sequences as CRISPR spacers from invading viruses and plasmids into their CRISPR loci. When invading viruses or plasmids have a match to a pre-existing CRISPR spacer sequence in the host genome, these elements are recognised as pathogenic invaders. Thus, CRISPR spacer sequences provide an adaptive, heritable record of past infections and express small CRISPR RNAs that guide targeting of invasive nucleic acids, rendering the cell immune to these viruses (Marraffini and Sontheimer 2010). The CRISPR loci act effectively as libraries of previous viral infections which can be compared with environmental viromes, thus potentially identifying virus–host relationships. A study by Anderson et al. (2011) showed that CRISPR spacers from vent isolates and from thermophiles in general have a higher percentage of matches to the vent virome than to other marine or terrestrial hot spring viromes. Spacers derived from strains belonging to 23 different taxonomic groups matched sequences from the hot vent virome; most notably, a wide range of both archaeal and bacterial taxonomic groups has CRISPR spacers matching the marine vent viromes. 
Interestingly, a high percentage of hits to spacers from mesophilic hosts suggested that the marine vent virome was comprised of viruses that have the potential to infect diverse taxonomic groups of multiple thermal regimes in both the bacterial and the archaeal domains (Anderson et al. 2011).
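
The spacer-to-virome comparison described above can be illustrated with a minimal Python sketch. It is not the pipeline used by Anderson et al. (2011): it assumes exact substring matching on either strand, whereas a real analysis would normally use a similarity search with mismatch tolerance. The host names, spacer sequences and reads are hypothetical and serve only to show the principle.

```python
from collections import defaultdict

# Translation table used to complement a DNA sequence.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def match_spacers_to_virome(spacers, virome_reads):
    """Link hosts to virome reads through shared CRISPR spacer sequences.

    spacers:      dict mapping host name -> list of spacer sequences
    virome_reads: dict mapping read id -> read sequence
    Returns a dict mapping host name -> set of read ids that contain one of
    the host's spacers on either strand (exact match; a simplifying assumption).
    """
    hits = defaultdict(set)
    for host, host_spacers in spacers.items():
        for spacer in host_spacers:
            forward = spacer.upper()
            reverse = reverse_complement(forward)
            for read_id, read in virome_reads.items():
                read_upper = read.upper()
                if forward in read_upper or reverse in read_upper:
                    hits[host].add(read_id)
    return dict(hits)


# Hypothetical toy data: one archaeal spacer occurs in two reads
# (once per strand); the bacterial spacer matches nothing.
spacers = {
    "hypothetical_archaeon_A": ["ACGTTGCAAGGCTTAACCGGTATC"],
    "hypothetical_bacterium_B": ["TTTTGGGGCCCCAAAATTTTGGGG"],
}
virome_reads = {
    "read_1": "GGG" + "ACGTTGCAAGGCTTAACCGGTATC" + "TTT",
    "read_2": "CC" + reverse_complement("ACGTTGCAAGGCTTAACCGGTATC") + "AA",
    "read_3": "ATATATATATATATATATATATATATATAT",
}
hits = match_spacers_to_virome(spacers, virome_reads)
```

Each host with at least one hit is a candidate host for the virus that contributed the matching read; hosts without hits are simply absent from the result.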

Metagenomics has also resulted in findings that suggest the existence of novel positive-strand RNA viruses that probably replicate in hyperthermophilic archaeal hosts and are highly divergent from RNA viruses that infect eukaryotes and bacteria. To date, all characterised archaeal viruses possess dsDNA genomes, except one that possesses ssDNA. Finding an RNA virus infecting an archaeal host could enhance our knowledge of RNA viruses and shed light on the origin of the enormous diversity of RNA viruses infecting eukaryotes (Bolduc et al. 2012). Further studies need to identify the potential hosts of the RNA viruses, possibly through sequence comparison between the RNA viral genomes and CRISPR spacers of cellular genomes from the same environment.

Advances in the field of metagenomics will provide more information concerning important aspects of viral ecology. Moreover, functional viral metagenomics is being developed to discover novel viral enzymes that can be used for diagnostic and biotechnological purposes (Schoenfeld et al. 2010).

3.3 Functional Genomics and Gene Regulation

A virus is totally dependent on its host cell for viral reproduction. In order to multiply, a virus must take control of its host cell’s molecular machineries. Perhaps the most important mechanism for achieving this control is a viral code for strong positive signals to promote viral gene expression and other signals to repress expression of some cellular genes. When a cell is infected, much is going on inside the cell at the molecular level, such as transcription of the incoming viral genes to form viral mRNAs, and their translation to produce early viral proteins, including the enzymes necessary to replicate viral DNA (Collier and Oxford 2006). Transcriptomics is a useful tool for investigating the subtle but definite changes in gene expression of both the host and the virus in the period of a virus infection. The transcriptome reflects the genes that are expressed at any given time. Therefore, it can provide insights into which genes are being up- or downregulated during an infection cycle. Moreover, viral transcriptomics can be used to compare transcription patterns of viruses under a wide range of conditions (Walther et al. 2011).

Microarray analysis of a culture of Sulfolobus solfataricus infected with STIV1 has revealed insights into the timing and extent of virus transcription, as well as differential regulation of host genes. The infection cycle of STIV1 is lytic, with almost all cells in a culture being killed by the virus. Transcription of viral genes was first detected at 8 h post infection (hpi), and at 16 hpi, most viral genes were expressed. The majority of the viral genes are transcribed between 16 and 24 hpi. Around 24 hpi, a shift takes place from virus replication to preparation for lysis, with general cell lysis detected at 32 hpi. Although expression starts at different time points for different genes, little temporal control was observed. During the infection, 177 host genes were determined to be differentially expressed, with 124 genes upregulated and 53 genes downregulated. The upregulated genes were dominated by genes associated with DNA replication and repair and those of unknown function, suggesting that STIV1 uses host proteins to aid the replication of its own DNA (Ortmann et al. 2008). One important upregulated gene encodes an ESCRT (endosomal-sorting complex required for transport) III homolog. In eukaryotes, the ESCRT III complex serves as a protein-sorting machinery and functions as a pair of molecular scissors that cuts endosomal membranes (Wollert et al. 2009). The ESCRT III homolog has recently been reported to be essential for cell division in Sulfolobales (Ettema and Bernander 2009); its upregulation may suggest an involvement in the release of STIV1 virions. The downregulated genes, mostly detected at 32 hpi, were associated with energy production and metabolism (Ortmann et al. 2008). Interestingly, the majority (94%) of the host genes showed no differential regulation between infected and uninfected cells, suggesting that the virus life cycle is tailored to avoid a host stress response.
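
How genes end up in the up- and downregulated groups can be sketched in a few lines of Python. This is an illustration only, not the analysis of Ortmann et al. (2008): the two-fold cut-off (a log2 fold change of ±1), the gene names and the signal values are assumptions, and a real microarray analysis would also rely on replicates and statistical testing.

```python
import math


def classify_genes(expression, threshold=1.0):
    """Classify genes by log2 fold change between infected and control cells.

    expression: dict mapping gene name -> (control_signal, infected_signal)
    threshold:  absolute log2 fold change required to call a gene
                differentially expressed (1.0 = two-fold; an assumed cut-off)
    Returns a dict mapping gene name -> "up", "down" or "unchanged".
    """
    calls = {}
    for gene, (control, infected) in expression.items():
        log2_fc = math.log2(infected / control)
        if log2_fc >= threshold:
            calls[gene] = "up"
        elif log2_fc <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical signal intensities (arbitrary units).
example = {
    "gene_replication": (100.0, 450.0),   # 4.5-fold higher in infected cells
    "gene_metabolism": (200.0, 40.0),     # 5-fold lower in infected cells
    "gene_housekeeping": (150.0, 160.0),  # essentially unchanged
}
calls = classify_genes(example)

# Tally reported in the text: 124 upregulated + 53 downregulated host genes.
reported_up, reported_down = 124, 53
reported_total = reported_up + reported_down
```

With the numbers reported in the text, the tally gives 177 differentially expressed genes, of which roughly 70% are upregulated.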

An infection study of SSV1 with S. solfataricus as a host investigated the transcriptome fluctuations of a virus with a lysogenic life cycle. Upon infection, the SSV1 genome is rapidly and site-specifically integrated into a tRNA gene of the host, paralleled by a short slowdown of growth. Even after infection with an excess of virus particles, the host recovers well and often grows even better than before, as if the presence of the viral genome gives some advantage (Schleper et al. 1992). Upon UV irradiation of the host cells, strong replication of the viral DNA is induced and large amounts of particles (up to 100 per cell) are released into the culture medium. This occurs without apparent lysis of the host cells. However, at this stage, cell growth is significantly retarded. The study found that the first viral transcripts can already be detected at 1 hpi, while most viral genes are active at 8.5 hpi. The viral genes are clustered in 9 operons expressed as 10 transcripts, comprising both regulatory and structural genes. The regulatory genes are the first to be transcribed, and the genes coding for the coat protein of the virus are transcribed at a later stage. Similar to those of eukaryotic viruses and bacteriophages, the transcripts of SSV1 can be categorised according to their time of appearance and putative functional roles, showing that SSV1 exhibits a tight chronological transcriptional regulation (Frols et al. 2007). With respect to host gene regulation, transcriptomes were compared between UV-irradiated and nonirradiated cells for both infected and uninfected cultures, and the response to UV irradiation was then compared between infected and uninfected cells. Only small genome-wide differences were detected between infected and uninfected cells, possibly due to the indirect comparison.

Very few studies have investigated the change in gene expression profiles when a bacteriophage infects a hyperthermophilic bacterium. A study analysing the interaction between GVE2, a deep sea thermophilic bacteriophage, and its host Geobacillus sp. E263 showed differential regulation of host genes. A comparison of protein/gene expression profiles of GVE2-infected and noninfected Geobacillus sp. E263 bacteria revealed that among the 20 differentially expressed host genes, 13 were upregulated and 7 were downregulated in response to GVE2 infection. Based on homology searches in GenBank, 19 of the 20 proteins involved in GVE2 infection shared homology with proteins of known function. The 19 proteins have diverse metabolic functions and can be grouped into three different categories based on cellular function, suggesting a coordinated response to virus infection (Wei and Zhang 2010).

One of the current limitations of transcriptomic methods is that they require millions of cells as starting material. Therefore, a future challenge will be the determination of transcriptome maps for single cells, which will open an avenue for investigating the regulatory heterogeneity in a microbial population upon infection (Sorek and Cossart 2010). This may lead to the discovery of advantageous genes or defence mechanisms in some cells of a given population. Furthermore, single-cell transcriptomics will also allow studies of the transcriptome of individual cells from unculturable species, a major advantage given the large number of unculturable thermophilic viruses and their hosts. Another field that could advance our understanding of the impact of thermophilic viruses on their environments is metatranscriptomics. Hyperthermophilic environments are generally less complex ecologically than aquatic or soil systems, making it easier to deal with large datasets covering many organisms (Walther et al. 2011). Comparing metatranscriptomes from different hot environments could elucidate the impact of viruses, helping us understand the roles viruses play in ecological and geochemical processes in hot environments.

4 Virus–Host Interactions

The number of studies characterising virus life cycles in hot environments is limited, and the interactions have rarely been characterised beyond a basic level. According to studies on isolated archaeal virus–host systems, the dominant type of viral life cycle differs from that of bacterial systems (Abedon 2009). The majority of known thermophilic archaeal viruses establish a chronic infection in which virions are continuously produced, and the host cells remain alive although the growth rate is often retarded. Adaptation to energy stress conditions is hypothesised to be the crucial factor that distinguishes Archaea from Bacteria (Valentine 2007) and may be what has driven and even favoured such virus–host relationships. The only hyperthermophilic bacteriophage that exhibits a chronic mode of infection is the filamentous bacteriophage of the Inoviridae family, PH75, which does not lyse host cells even when virions are actively produced (Pederson et al. 2001). This infection mode is rare for bacteriophages but appears to be common among thermophilic archaeal viruses and is often referred to as the carrier state. However, the carrier state can be interrupted and transformed into a lytic state by stress factors such as UV irradiation. The hallmark of the lytic life cycle is the lysis of the host cells when virions are released, thus killing the host. The lytic cycle is the mode of reproduction for all hyperthermophilic bacteriophages with the exception of PH75. A few archaeal viruses are reported to be purely lytic (TTV1, STIV and SIRV). Determining the nature of the infection cycle is not always easy. For example, the rudivirus SIRV2 was thought to establish a carrier state infection, mainly based on the lack of a decrease in OD in infected cultures. However, the virus lyses the cells. The exceptional egress mechanism involved in the release of virions from the cell results in lysed cells in the form of empty spheres. 
The cells appear intact and OD measurements lack the sophistication to identify that the measured cells are no longer ‘living’. Thus, other thermophilic archaeal viruses could turn out to be lytic upon further investigations.

4.1 Carrier State Infection

Like most thermophilic archaeal viruses, AFV1 does not kill its host during reproductive infection. A study investigating the virus–host interactions between AFV1 and its host showed that growth of the host, Acidianus hospitalis, is almost completely blocked in the initial stage of infection. Cell growth recovers slowly after initial infection, and from 2 hpi, a generation time of 20 h is observed, in contrast to a generation time of 11 h for uninfected cells. Mature virus particles start to be released from infected cells 4–5 hpi. After several successive dilutions of virus-carrying cultures and prolonged incubation, the virus was still present in host cells, indicating a stable carrier state of the host–virus relationship. Under certain growth conditions, a balance between production of virions and multiplication of host cells can be observed. DNA bands originating from the AFV1 genome could not be distinguished in the restriction digestion pattern of total cellular DNA at any stage of infection. This indicates a low intracellular copy number of virus DNA, probably less than 10 copies per cell (Bettstetter et al. 2003). Another thermophilic virus, which reproduces by combining a lysogenic life cycle with various levels of viral particle production in a carrier state mode, is SSV2. A comparative study of SSV2 physiology in the natural host S. islandicus versus the foreign host S. solfataricus provided evidence of differently regulated SSV2 life cycles in the two hosts. An initial infection of the foreign host S. solfataricus retards cell growth. However, once SSV2 is established in the foreign host, host growth is no longer inhibited, in stark contrast to what is observed in the initial infection stage. Nor does the SSV2 genome exceed a few copies per cell throughout extended cultivation. 
This suggests that a series of interactions between SSV2 and its foreign host leads to a harmonious coexistence between them. Further studies may reveal whether this conversion of host response is a general scheme for the defence of S. solfataricus against viral infection. In the natural host S. islandicus, SSV2 replication is characterised by a physiological induction. The viral genome copy number increases up to 50-fold within 4 h after the sudden halt of cell growth at an OD600 of about 1.3, which corresponds to the late exponential growth phase. Growth inhibition of the host correlates with the induction of virus replication. Interestingly, the inhibitory effect is reversible. When transferred into fresh medium, the SSV2-induced S. islandicus culture resumes exponential growth and the host regains control over SSV2 replication (Contursi et al. 2006).
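
The growth figures quoted above can be turned into rates with a short calculation, sketched here in Python. It assumes simple exponential growth at a constant rate, which is an idealisation; the function names are illustrative and do not come from the cited studies.

```python
import math


def specific_growth_rate(generation_time_h: float) -> float:
    """Specific growth rate (per hour) for a given generation time,
    assuming exponential growth: mu = ln(2) / g."""
    return math.log(2) / generation_time_h


def doubling_time_from_fold_change(fold_change: float, hours: float) -> float:
    """Doubling time (hours) implied by a fold increase over a time span,
    assuming the increase is exponential at a constant rate."""
    return hours / math.log2(fold_change)


# AFV1 in Acidianus hospitalis: generation time 11 h (uninfected) vs 20 h (infected).
mu_uninfected = specific_growth_rate(11.0)
mu_infected = specific_growth_rate(20.0)
relative_growth = mu_infected / mu_uninfected  # equals 11 / 20

# SSV2 in S. islandicus: up to 50-fold rise in genome copy number within 4 h.
ssv2_doubling_h = doubling_time_from_fold_change(50.0, 4.0)

print(f"infected cells grow at {relative_growth:.0%} of the uninfected rate")
print(f"implied SSV2 genome doubling time: {ssv2_doubling_h:.2f} h")
```

Under these assumptions, AFV1-infected cells grow at 55% of the uninfected rate, and a 50-fold rise in SSV2 copy number within 4 h corresponds to a genome doubling time of about 0.7 h.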

4.2 Lytic Life Cycle

Virus–host interactions in hyperthermophilic bacteriophages are not well studied. Therefore, the best examples of lytic thermophilic virus–host interactions are from the domain of Archaea. As mentioned earlier, only a few thermophilic archaeal viruses are lytic. However, the life cycles of the lytic viruses STIV1 and SIRV2 have revealed a unique mechanism of virion release (Prangishvili and Quax 2011). The viruses differ significantly in their morphological and genomic properties but exploit the same mechanism for the release of mature virions from the host cell. A study by Bize et al. (2009) investigated the life cycle of SIRV2. SIRV2 infects S. islandicus and virions are assembled in the cytoplasm of the host cell; approximately 8–10 hpi, they start to be released through well-defined apertures in the cell envelope. Formation of these openings is preceded and facilitated by the generation of virus-induced cellular structures of pyramidal shape, virus-associated pyramids (VAPs), located at the cell envelope and pointing outwards. The VAPs perforate the membrane and S-layer and open up to release the preassembled virions from the cytoplasm into the surrounding environment. After virion release, the cell envelope remains as a stable empty shell. The timing of VAP disruption and virus release must be strictly controlled by virus-encoded functions, such that cell lysis does not occur until the virions have been assembled, as for any lytic virus (Bize et al. 2009). The extent of the modifications caused by SIRV2 results in a radically transformed host cell, which functions as a complex viral factory. It has been suggested that the lysogenic life cycle provides an intracellular refuge for virus populations in hot environments. However, viruses with a purely lytic life cycle such as SIRV2 demonstrate that virus particles can survive in extreme ecosystems long enough to encounter new host cells. 
The SIRV2 virions are well adapted to hot environments, being almost as stable at 80°C as bacteriophages infecting mesophilic bacteria are at 37°C (De Paepe and Taddei 2006).

5 Applications

The previous sections have covered thermophilic viral characteristics, including novel features not previously observed. Many of these features may have application potential in basic research or biotechnology. However, application development in thermophilic viruses has so far concentrated on a few identified enzymes and an emerging interest to use them as nanobuilding blocks. Virus-derived vectors and enzymes have been essential research tools since the first days of molecular biology, and compared to mesophilic viruses, an obvious advantage of thermophilic viruses is their thermostability. The next section will summarise thermophilic viral enzymes which are already trademarked or have the potential for biotechnological/industrial exploitation. Moreover, the potential of thermophilic viruses as nanobuilding blocks will be addressed.

5.1 Application of Thermophilic Viral Enzymes in Biotechnology and Basic Research

Many research applications of enzymes are centred on nucleic acid metabolism. DNA polymerases in particular have been used in molecular biology techniques such as whole-genome amplification, PCR and DNA sequencing (Tabor and Richardson 1987; Saiki et al. 1988; Zhang et al. 1992). Viral polymerases are highly diverse in terms of primary amino acid sequence and biochemical activities (Schoenfeld et al. 2010) and are functionally distinct from their cellular counterparts. Many applications of DNA polymerases depend on thermostability up to 95°C, making hyperthermophilic viruses ideal subjects for discovering viral DNA polymerases with biotechnological potential. Recently, a thermostable DNA polymerase from a viral metagenome was found to be a potent RT-PCR enzyme (Moser et al. 2012). Out of 21,198 Sanger sequence reads derived from a viral metagenome library constructed from Octopus hot spring (93°C) in Yellowstone National Park, hundreds of potential pol genes were identified and 59 complete pol genes were tested for polymerase activity. Among these, 3173 Pol demonstrated both high thermostability and innate reverse transcriptase (RT) activity (Table 10.3). As the first reported virus-derived thermostable RT Pol, 3173 Pol also exhibited high sensitivity and high specificity comparable to two-enzyme RT-PCR systems. Thus, 3173 Pol fulfils the requirements for a facile single-enzyme RT-PCR reagent (Moser et al. 2012). The 3173 Pol-based PyroScript RT-PCR master mix (Lucigen) provides demonstrated advantages over the commonly used two-enzyme RT-PCR systems. Among other documented performance problems, the two-enzyme RT-PCR systems require an initial low-temperature reverse transcription that reduces specificity, increases reaction time and impairs synthesis through complex secondary structures. In contrast, the high thermostability of 3173 Pol should allow a ‘hot start’ (Moser et al. 2012).

Table 10.3 Current and future uses of thermophilic viruses

PCR has been used for about two decades as a powerful tool in molecular biology. However, the size of the DNA product that can be amplified by PCR is still limited. With the best PCR enzyme to date, Phusion, a maximum of 20 kb can be reached. Although the bacteriophage phi29 DNA polymerase presents the highest processivity described so far for a DNA polymerase, and fragments of up to 70 kb can be generated (Blanco et al. 1989; Rodriguez et al. 2005), the enzyme is not thermostable and therefore cannot be used in PCR. Biochemical and structural studies have shown that the high processivity of phi29 DNA Pol depends on one of the two insertions in the protein sequence which are characteristic of protein-primed DNA Pol families (Rodriguez et al. 2005). Recently, a putative protein-primed DNA polymerase gene was identified in the genome of the hyperthermophilic archaeal virus ABV; the encoded protein also contains the two insertions (Table 10.3) (Peng et al. 2007). This feature may confer high processivity on the ABV Pol, which, in combination with high thermostability, would make the enzyme a promising candidate for PCR amplification of large DNA fragments.

DNA polymerases constitute just one group of useful enzymes; other viral enzymes with distinct and useful properties have recently been discovered in thermophilic viruses. Thermostable RNA ligases 1 from bacteriophages TS2126 and RM378 were recently isolated and characterised (Table 10.3) (Blondal et al. 2003, 2005a, b). RNA ligases 1 have the ability to ligate single-stranded nucleic acids by catalysing the ATP-dependent formation of phosphodiester bonds between 5′-phosphate and 3′-hydroxyl termini of single-stranded RNA or DNA. The biological role of bacteriophage RNA ligases has been primarily studied in T4, where the T4 RNA ligase 1, together with the bacteriophage-encoded polynucleotide kinase, repairs cleaved tRNA molecules, thereby counteracting the defence mechanism of the bacterial host. The actual role of RNA ligases in TS2126 and RM378 bacteriophages has not been determined, although it is likely that these enzymes are part of a repair machinery that responds to RNA degradation by the host. The T4 RNA ligase is a very important tool in molecular biology and is used in numerous protocols. Applications include RNA ligase-mediated rapid amplification of cDNA ends (RLM-RACE), which can be used in mapping the 5′ and 3′ ends of RNA molecules (Liu and Gorovsky 1993), ligation of oligonucleotide adaptors to cDNA or single-stranded primer extension products for PCR (Zhang et al. 1992) and various 5′ nucleotide modifications of nucleic acids. For example, a rapid 5′-labelling method for single-stranded DNA/RNA was developed based on the utilisation of an adenylated intermediate in the reaction of T4 RNA ligase. This method is useful for fluorescence-, isotope- or biotin-labelling of the 5′ ends of both oligo- and polynucleotides (Kinoshita et al. 1997).

Investigations into the overall identity of the RNA ligase 1 sequences from TS2126 and RM378 showed low similarity to each other. The amino acid sequence of RNA ligase 1 from the TS2126 bacteriophage showed more similarity to T4 (18%) than to RM378 ligase 1 (15%), indicating that the two proteins of thermophilic origin have evolved and adapted independently to the elevated thermal conditions (Blondal et al. 2005a). The characterisation of TS2126 RNA ligase 1 revealed that it is stable at 60–65°C for an extended time period but loses activity at higher temperatures, similar to the thermostability of RM378 RNA ligase 1. Compared with the RM378 ligase and a commercial T4 RNA ligase 1, the TS2126 ligase showed ∼30 and 10 times higher specific activity, respectively. This striking difference in ligation efficiency was also observed in the ssDNA ligation experiments, where TS2126 RNA ligase 1 was much more effective than RNA ligases from RM378 and T4. The TS2126 RNA ligase 1 exhibits extremely high activity, high ligation efficiency and moderate thermostability; thus, it can be used in the RLM-RACE assay at elevated temperatures (65°C). This may produce better results if the 5′ donor end of mRNA molecules has a secondary structure that inhibits efficient ligation with T4 RNA ligase at 37°C (Blondal et al. 2005b). The ability to ligate ssDNA at elevated temperatures may also lead to a major advancement in specific applications in molecular biology. In fact, patents covering RNA ligases 1 from both TS2126 and RM378 have been sold to Epicentre® (an Illumina® company). The RNA ligase 1 from TS2126 is already commercially available under the registered trademark CircLigase™. This thermostable ATP-dependent ligase, which catalyses intramolecular ligation (i.e. circularisation) of ssDNA templates having a 5′-phosphate and a 3′-hydroxyl group, can be used for very specific applications. 
In contrast to T4 DNA ligase and Ampligase® DNA ligase, which ligate dsDNA ends, CircLigase™ ligates ends of ssDNA. The enzyme is therefore useful for making circular ssDNA molecules from linear ssDNA. Circular ssDNA molecules can be used as substrates for rolling-circle replication or rolling-circle transcription. Linear ssDNA of >30 bases is circularised by the CircLigase™ enzyme. Under standard reaction conditions, virtually no linear concatamers or circular concatamers are produced. In addition to its activity on ssDNA, CircLigase™ also has activity in ligating a single-stranded nucleic acid having a 3′-hydroxyl ribonucleotide and a 5′-phosphorylated ribonucleotide or deoxyribonucleotide (EpiCentre 2012). This can be used to map transcription termination sites of prokaryotic RNAs.

Following the characterisation of RNA ligase 1, a polynucleotide kinase (PNK) was identified and characterised in RM378 (Table 10.3), elucidating a defence mechanism against the host similar to that observed in T4. RNA ligase 1 functions together with PNK 1; together, they repair cleaved tRNA molecules. Whereas RNA ligase 1 is essential for ligation of the cleaved tRNA molecules, PNK 1 modifies the tRNA fragments to make appropriate substrates for the ligation step (Silber et al. 1972; Lillehaug and Kleppe 1977). PNK 1 is a bifunctional nucleic acid processing enzyme with 5′-kinase and 3′-phosphatase activities, catalysing the restoration of 5′-phosphate and 3′-hydroxyl termini in cleaved ssRNA and ssDNA. It functions in two major ways: (1) to remove the 2′:3′-cyclic phosphate from the 5′ tRNA fragment and (2) to add a phosphate group to the 5′-hydroxyl group of the 3′-tRNA fragment using ATP as the phosphate donor (Wang et al. 2002). The RNA ligase 1 and the PNK 1 are thus part of the same system, acting in concert to repair RNA and DNA. A study showed that RM378 PNK 1 carries out the same or very similar processes as those of the T4 PNK 1 (Blondal et al. 2005a). Characterisation of the RM378 PNK 1 protein showed thermostability up to 60–65°C; this is consistent with the optimum temperature of the natural environment of its host, Rhodothermus marinus. The RM378 PNK 1 shares only its 5′-kinase domain with the PNK family but no apparent homology to the 3′-phosphatase domain in that family. Its similarity with the T4 5′-kinase is low, but they share the P-loop motif, which is characteristic of many phosphotransferase families. Because of its unique phosphatase domain, RM378 PNK 1 resembles the mammalian PNKs rather than other phage PNKs in terms of domain arrangement (Zhu et al. 2004; Blondal et al. 2005a). This is a very interesting observation, because eukaryotic PNKs function as repair enzymes on double-stranded DNA. 
The T4 PNK 1 is used for labelling the 5′-termini of nucleic acids, and the labelled products can be used as markers for gel electrophoresis, primers for DNA sequencing, primers for PCR and probes for hybridisation (Stahl et al. 1991; Hilario 2004). The RM378 PNK 1 may have advantages in some protocols due to its stability at higher temperatures. The company Prokazyme sells the RM378 PNK 1 under the registered trademark ThermoPhage™ polynucleotide kinase, and its protocol mentions several application areas: labelling of nucleic acids using 32P-γ-ATP for probes and DNA sequencing; phosphorylation of nucleic acids for subsequent ligation in cloning; phosphorylation of oligonucleotides for ligase reactions such as the ligase chain reaction and similar procedures; and phosphorylation of nucleic acids with modified phosphates (e.g. thiol-phosphates) for subsequent modification and/or labelling. ThermoPhage™ polynucleotide kinase is the only available thermostable enzyme of its kind (Prokazyme 2012), opening the way for exploration of high-temperature applications with a PNK.

Another study characterised a novel nonspecific nuclease from the thermophilic bacteriophage GBSV1 (Table 10.3) (Song and Zhang 2008). Nucleases are defined as a group of enzymes which are capable of hydrolyzing the phosphodiester linkages of nucleic acids. According to the substrates they hydrolyze, nucleases are divided into two groups: sugar specific nucleases (deoxyribonucleases and ribonucleases) and sugar nonspecific nucleases. Sugar nonspecific nucleases are characterised by their ability to hydrolyze both DNA and RNA without exhibiting pronounced base preferences. Sugar nonspecific nucleases play very important roles in different aspects of basic genetic mechanisms, including DNA salvage, repair, recombination and degradation. Moreover, they are involved in nutrition, scavenging for nucleotides and phosphates for growth and metabolism (Hsia et al. 2005). They have been isolated from a wide variety of sources including fungi, bacteria and viruses. The majority of these enzymes are intracellular, but some have been reported to be extracellular in nature (Legerski et al. 1978; Rangarajan and Shankar 1999). The ability of sugar nonspecific endonucleases to recognise a wide variety of nucleic acid structures has led to considerable efforts to evaluate their role in different cellular processes as well as application as analytical tools to study nucleic acid structure. Applications also include rapid sequencing of RNA, the removal of nucleic acids during protein purification and the use as antiviral agents (Rangarajan and Shankar 1999; Song and Zhang 2008). Compared with mesophilic enzymes, thermostable nucleases may possess novel properties in structures and biological functions. The novel GBSV1 nonspecific nuclease purified from GBSV1 is the first nuclease isolated from a thermophilic virus. GBSV1 nonspecific nuclease is able to degrade a variety of nucleic acids, including RNA, ssDNA and dsDNA that is either circular or linear. 
It is active at temperatures ranging from 20 to 80°C with an optimal temperature of 60°C, which is higher than those of most reported nonspecific nucleases. GBSV1 nonspecific nuclease could be obtained in large quantity by expression in E. coli. This would facilitate its biotechnological applications (Song and Zhang 2008).

5.2 Industrial Potential of Thermophilic Viruses

Thermophilic viruses have adapted to extremely hot environments. Besides high temperature, hot environments are often accompanied by a set of harsh conditions, consisting of physical extremes (e.g. pressure or radiation) and geochemical extremes such as salinity and pH. All biomolecules within the virus particles, including proteins, nucleic acids and possibly lipids, must be adapted to these conditions. Therefore, thermophilic viruses offer a source of thermostable enzymes that display outstanding stability against high temperatures and can often cope with a set of harsh conditions. Driven by increasing industrial demands for biocatalysts that can cope with industrial process conditions, considerable efforts have been devoted to the search for such enzymes. Biocatalysis uses natural catalysts, such as enzymes, to perform chemical transformations on organic compounds. Compared with organic synthesis, biocatalysts often have far better chemical precision, which can lead to more efficient production of single stereoisomers, fewer side reactions and a lower environmental burden. Approximately 3,000 different enzymes have been identified, and many of these have found their way into biotechnological and industrial applications (van den Burg 2003). However, there is a demand for enzymes which can withstand industrial reaction conditions. The reasons to exploit enzymes that are stable and active at elevated temperatures are obvious. At elevated temperatures, the solubility of many reaction components, in particular polymeric substrates, is significantly improved. Moreover, the risk of contamination, leading to undesired complications, is reduced at higher temperatures (van den Burg 2003). As a result, increasing attention is given to microorganisms that are able to thrive in hot environments. The exceptional diversity and novelty of thermophilic archaeal viral genomes indicates that novel enzymes may be encoded. 
One example is the putative provirus XQ2 entrapped in the genome of S. solfataricus P2 (She et al. 2001). The XQ2 genome is about 60 kb and encodes many hypothetical proteins as well as recognisable metabolic enzymes. The sizes of thermophilic viral genomes range from 5 to 153 kb, which indicates that they have the coding capacity for metabolic genes. Recently, an EU-funded collaborative project, HotZyme, was initiated with the aim of a global investigation of the biodiversity of hot environments on Earth. Most importantly, novel hydrolases with improved performance in different industrial fields are expected to be discovered from thermophilic microorganisms and their viruses (HotZyme 2012).

5.3 Nanotechnological Potential of Thermophilic Viruses

A virus is a nanoscale biomolecular unit composed of genes, protective capsid proteins and, in some cases, an envelope. In recent years, naturally occurring bionanoparticles, including virus capsids and protein cages, have been studied and utilised as templates and building blocks for applications in biotechnology. Virus capsids and protein cages have received special attention because they are naturally self-assembled with atomic precision. Viral nanoparticles (VNPs) have several advantages, such as their nanometre-range size, their propensity to self-assemble into monodisperse nanoparticles of discrete size and shape, their high degree of symmetry and polyvalency, the relative ease with which large quantities of material can be produced, and their exceptional stability, robustness and biocompatibility (Steinmetz et al. 2008b). The particles are composed of programmable units, which can be modified by either genetic modification or chemical bioconjugation methods. Due to these properties, viruses hold promise for development as amenable platforms for diverse applications in biotechnology, electronics and medicine (Wiedenheft et al. 2007). Naturally occurring protein cage architectures, such as the spherical cage-like architectures of ferritins, can mediate the deposition of ‘hard’ inorganic materials within the spatial confines of a ‘soft’ protein container. For example, Cowpea chlorotic mottle virus (CCMV) is composed of 180 identical coat proteins that self-assemble to form a protein cage in which RNA is encapsulated (Speir et al. 1995). CCMV undergoes reversible swelling when the pH varies. Capsid swelling of CCMV at pH levels greater than 6.5 results in pore opening of the capsid protein assembly (Tama and Brooks 2002). Capsid swelling allows for the exchange of molecules, which in turn enables the release or entrapment of target molecules (Douglas and Young 1998).
The development of VNPs for nanotechnological applications often requires that they withstand harsh conditions during either fabrication or use. Bioconjugation techniques often require the organic solvent dimethyl sulfoxide (DMSO) in concentrations of at least 20% by volume, and many applications rely on temperature stability. This has prompted efforts aimed at exploiting the thermostability and unique features found in viruses of hyperthermophiles. A recent study investigated S. islandicus rod-shaped virus 2 (SIRV2) as a candidate VNP (Table 10.3) (Steinmetz et al. 2008a). SIRV2 particles were found to be stable in two different solvent/water mixtures that are of relevance for bioconjugation and mineralisation reactions. Moreover, SIRV2 particles offer attachment sites allowing selective chemical modification. It was found that the major coat protein (CP) forms the virus body while the minor CP is located in the tail fibres at the ends of the particles. Interestingly, amine reactivity showed that the minor CP could be selectively labelled, and this labelling reaction targeted the ends of the particles only. This suggests that various functional molecules can be installed at different positions in the VNP, offering multiple distinct chemical attachment sites. SIRV2 remained intact and infectious in DMSO at concentrations up to 50% by volume for at least 6 days and is naturally stable at 80°C and pH 3. Overall, SIRV2 represents an extremely stable and structurally interesting VNP with the potential for novel biotechnological applications (Steinmetz et al. 2008a).

5.4 Perspectives

Enzymes from thermophilic viruses offer the opportunity to greatly expand the range of reaction conditions available for biocatalysis. Furthermore, unique features in morphology and virus–host interactions may be used in tailored applications in biomedicine and nanotechnology. However, the current repertoire of viral enzymes only hints at their overall potential. The most commonly used enzymes are derived from a surprisingly small number of cultivated viruses, which is remarkable considering the enormous morphological and genomic diversity of viruses revealed over the past decade (Schoenfeld et al. 2010). Developments in the cultivation and production of thermophilic viruses, and in the cloning and expression of their genes in heterologous hosts, will help in the search for novel enzymes with biotechnological and industrial potential. In addition, thermophilic viruses are likely to be useful tools for studying host evolution and host biochemical pathways, providing us with information concerning life in hot environments. Thermophilic viruses have many unique features not observed in other studied viruses, and there may be potential to exploit these features in biomedicine.

Viral genomes are relatively simple compared with those of their hosts and contain a comparatively high proportion of genes coding for structural proteins (e.g. coat and tail) together with proteins involved in nucleic acid metabolism and lysis (Schoenfeld et al. 2010). The high density of certain genes in viral genomes offers a great advantage when looking for novel enzymes. For example, a typical bacterial genome of about 2 Mb contains only a single pol I gene (coding for DNA polymerase I). By contrast, between 20 and 40 pol genes per 2 Mb were found in viral metagenomic sequences. Metagenomic approaches have been used in ecological investigations of hot environments. However, viral metagenomics can also be used to uncover novel biological features of viruses, ultimately yielding useful enzymes. Many challenges are associated with metagenomics-based enzyme discovery. An inherent difficulty in viral metagenomics is isolating adequate amounts of genomic material for large-insert library construction (Edwards and Rohwer 2005). Another challenge arises during sequence assembly. Very large-scale sequencing of viral metagenomes should permit the assembly of large contiguous stretches of DNA and potentially entire genomes. However, the high degree of sequence polymorphism within viral populations has largely confounded attempts to assemble large contigs with high confidence (Schoenfeld et al. 2008). Assembly at high stringency tends to prevent misassembly of noncontiguous parts of the genome, but such stringent assembly can lead to overestimation of the number of unique viral types and can prevent discovery of genes, enzymes and genomes. The identification of viral genes from sequence assemblies is difficult owing to the high diversity of viral genes and the relatively low number of viral genomes in public sequence databases. Moreover, most viral coding sequences (even for a well-studied bacteriophage such as T4) have no similarity to any known genes.
This is especially pronounced for thermophilic viruses of the Archaea (Prangishvili and Garrett 2005), lowering the likelihood of finding genes by similarity. These problems are being addressed, and viral metagenomics offers a means of exploring genetic diversity within the vast uncultivated portion of thermophilic viruses.
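
The pol gene figures quoted above can be restated as densities to make the enrichment explicit. The following is only a minimal arithmetic sketch in Python. The helper name `genes_per_mb` is hypothetical, and the inputs are simply the numbers given in the text (1 pol I gene per 2 Mb bacterial genome, 20 to 40 pol genes per 2 Mb of viral metagenomic sequence), not new data.

```python
def genes_per_mb(gene_count, span_mb):
    """Return gene density in genes per megabase (hypothetical helper)."""
    if span_mb <= 0:
        raise ValueError("span_mb must be positive")
    return gene_count / span_mb


# Figures quoted in the text: one pol I gene in a ~2 Mb bacterial genome,
# versus 20 to 40 pol genes per 2 Mb of viral metagenomic sequence.
bacterial = genes_per_mb(1, 2.0)
viral_low = genes_per_mb(20, 2.0)
viral_high = genes_per_mb(40, 2.0)

# Fold enrichment of pol genes in viral metagenomes relative to the
# bacterial genome, using the same 2 Mb reference span for both.
fold_low = viral_low / bacterial
fold_high = viral_high / bacterial

print(f"bacterial genome: {bacterial:.1f} pol genes per Mb")
print(f"viral metagenome: {viral_low:.1f} to {viral_high:.1f} pol genes per Mb")
print(f"enrichment: {fold_low:.0f}- to {fold_high:.0f}-fold")
```

On these figures, a given amount of viral metagenomic sequence would be expected to contain roughly 20 to 40 times as many pol genes as the same amount of bacterial genomic sequence, which is the practical advantage for enzyme discovery described above.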

6 Conclusions

Thermophilic viruses show great potential as useful tools across a broad range of biotechnological fields. Novel genetic and morphological features, in combination with a huge unexplored diversity, promise new discoveries that are essential to the development of new applications in both industry and basic research.