
1 DNA Deaminase Evolution

Central to nucleic acid metabolism is the near-ubiquitous process of enzymatic deamination of adenine and cytosine bases, individually or in the context of larger nucleic acid constituents (Conticello et al. 2005; Grosjean 2009). For instance, in most species, the wobble base in several of the tRNA anti-codons is frequently changed by deamination of adenosine to inosine (A-to-I), which can then base pair with cytosine, and thereby increase the flexibility and decoding capacity of the tRNA anti-codon (Gerber and Keller 1999). These enzymes belong to the adenosine deaminase acting on tRNA (ADAT) family. Related proteins in most metazoans from nematodes and flies to humans catalyze A-to-I editing of a variety of RNA targets (Kim et al. 1994; Nishikura 2010). These enzymes are called, appropriately, adenosine deaminases acting on RNA (ADAR). Editing events that occur in the coding region of an mRNA can result in amino acid substitutions in the resulting protein. However, the majority of editing events occur in non-coding regions of mRNA or in non-coding RNAs, and these A-to-I editing events can alter RNA secondary structure, stability, function, and/or capacity to be bound by regulatory RNAs such as siRNAs (Morse et al. 2002; Levanon et al. 2004; Agranat et al. 2008; Li et al. 2009).

Cytosine to uracil (C-to-U) deamination is almost as ancient as A-to-I editing (Conticello et al. 2007b; Grosjean 2009). The pyrimidine salvage pathways of most organisms use cytidine deaminase (CDA) to produce the essential RNA and DNA building blocks of uridine directly and thymidine after additional enzymatic steps (Zrenner et al. 2006). However, at some point near the root of the vertebrate tree, polynucleotide cytosine deaminases emerged, with the lamprey CDA being a present-day example (Rogozin et al. 2007). This enzyme is thought to underpin a unique form of adaptive immunity in which the DNA segments that encode arrays of highly diverse leucine-rich repeats are assembled into mature variable lymphocyte receptor genes by a recombination-mediated process. It is thought that an ancestor of the present-day lamprey enzyme served as the original substrate for expansion of the polynucleotide cytosine deaminase gene family during vertebrate evolution (Fig. 1a) (Rogozin et al. 2007; Conticello 2008). The result in most vertebrates alive today is a much larger repertoire of polynucleotide C-to-U editing enzymes that execute diverse biological functions from lipid metabolism to adaptive and innate immunity (Figs. 1 and 2).

Fig. 1

Evolution and structure–function of APOBEC cytosine deaminases. a Expansion of the modern primate APOBEC3 locus encoding seven APOBEC3 genes with eleven zinc-coordinating (Z) domains (reprinted with permission from Lackey et al. 2012). b Deamination of C-to-U plays a central role in innate immunity. c Three-dimensional structures of the APOBEC3G C-terminal domain and APOBEC3C. A zinc ion (purple) is shown coordinated in both proteins by one histidine and two cysteine residues in the α2-β3-α3 core

Fig. 2

The physiological functions of the APOBEC family. a RNA editing by APOBEC1 generates a truncated APOB protein. b AID activity underpins the processes of somatic hypermutation and class switch recombination in B cell germinal centers. c Four APOBEC3 proteins restrict HIV through cytosine deamination in the absence of Vif (reprinted with permission from Hultquist et al. 2011)

All vertebrate polynucleotide cytosine deaminases belong to the so-called ‘APOBEC’ family. The defining feature of this family is a conserved His-X-Glu-X25–31-Pro-Cys-X2–4-Cys zinc (Z)-coordinating motif, which is strictly required for deaminase activity (where X can be a variety of amino acids) (Wedekind et al. 2003; Harris and Liddament 2004; Conticello et al. 2005; LaRue et al. 2009). As described in more detail below, key residues within this motif position zinc at the active site of the enzyme (Fig. 1c). The protein sequences within these motifs enable phylogenetic groupings into three subfamilies: APOBEC1, AID, and the APOBEC3s. The APOBEC3s can be further subdivided into three subgroups: Z1, Z2, and Z3 (Fig. 1a) (Conticello 2008; LaRue et al. 2009). Importantly, the number and organization of the A3 Z-domains can vary dramatically from branch to branch throughout the mammalian portion of the vertebrate phylogenetic tree (e.g., human versus mouse loci depicted in Fig. 1a).
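As an illustration, the consensus above can be written as a regular expression and used to scan a protein sequence for candidate Z-domains. The following Python sketch is a minimal example under stated assumptions: the function name `find_z_motifs` and the toy sequence are hypothetical, and a match shows only that the consensus spacing is present, not that the protein is an active deaminase.

```python
import re

# Consensus His-X-Glu-X(25-31)-Pro-Cys-X(2-4)-Cys, where X is any residue.
Z_MOTIF = re.compile(r"H.E.{25,31}PC.{2,4}C")

def find_z_motifs(protein_seq):
    """Return (start, end) spans (0-based, end-exclusive) of consensus matches."""
    return [(m.start(), m.end()) for m in Z_MOTIF.finditer(protein_seq.upper())]

# Toy sequence built for illustration only (not a real APOBEC sequence).
toy = "MK" + "HAE" + "A" * 27 + "PC" + "AA" + "C" + "GG"
print(find_z_motifs(toy))
```

Scanning a protein with two Z-domains would be expected to return two spans, one per domain, because `finditer` reports non-overlapping matches from left to right.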

Apolipoprotein B mRNA-editing catalytic subunit 1 (APOBEC1) is the namesake of the larger family. It was discovered as an enzyme that catalyzes the deamination of a specific cytosine within the APOB mRNA (Figs. 1b and 2a) (Teng et al. 1993). This produces a premature translation stop codon and a smaller secondary gene product. These two APOB proteins (APOB100 and APOB48) differentially regulate the secretion of lipoproteins from the liver (Chan 1992). Many mammalian APOBEC1 enzymes also possess DNA C-to-U deaminase activity (Harris et al. 2002; Ikeda et al. 2008; Petit et al. 2009; Ikeda et al. 2011). Taken together with the fact that earlier vertebrate lineages, such as the one represented by birds and lizards, lack an APOB-like gene, it is probable that the DNA-editing function preceded involvement in RNA editing (Severi et al. 2011).

The most conserved DNA cytosine deaminase in vertebrates is activation-induced deaminase (AID; gene name AICDA) (Fig. 1a). AID has a central role in adaptive immunity by seeding somatic hypermutation, gene conversion, and class switch recombination with its DNA deaminase activity (Figs. 1b and 2b) [(Muramatsu et al. 1999, 2000; Di Noia and Neuberger 2002; Petersen-Mahrt et al. 2002); reviewed by (Longerich et al. 2006; Di Noia and Neuberger 2007; Conticello 2008)]. Interestingly, the genes that encode APOBEC1 and AID are positioned adjacent to each other in the genomes of most vertebrates (an inversion has placed the human gene farther away on the same chromosome). This suggests that an ancestral AID gene duplicated and diverged to produce APOBEC1 (Fig. 1a). It is likely that duplication of an ancestral AID/APOBEC1 locus produced the genetic seeds for the mammal-exclusive APOBEC3 subfamily (Fig. 1a) (Jarmuz et al. 2002; Harris and Liddament 2004).

In humans, the seven APOBEC3 proteins are encoded by a tandemly arranged gene cluster (Fig. 1a) (Jarmuz et al. 2002). These present-day genes are the products of continual evolution, in which an ancestral cluster of three Z-domains is predicted to have undergone a minimum of eight duplication events over the past 100 million years to produce the locus found in most primates (LaRue et al. 2008; Münk et al. 2012). Each domain is either expressed singly or paired with a second Z-domain in a single enzyme (LaRue et al. 2009). In contrast, the ancestral APOBEC3 locus experienced a deletion of one of the ancestral Z-domains in the rodent lineage, leading to the present-day two-domain locus, which encodes a single protein quite distinct from any of the primate enzymes (Fig. 1a).

One possible explanation for why some mammalian lineages, like primates, have many APOBEC3s, while other lineages, such as rodents, have few is that these enzymes have overlapping innate immune functions to protect the host from a variety of parasitic elements (e.g., in HIV-1 restriction, Fig. 2c; mechanism elaborated in Sect. 3, below). Because multiple distinct innate immune mechanisms serve to suppress the spread of such parasitic elements, it is reasonable to postulate that some mammals will be fortified at the APOBEC3 locus and weaker at other loci, with each mammalian lineage being distinct. For instance, primates encode a single TRIM5α protein, whereas mice have the capacity to encode a total of eight TRIM5α-like proteins (Sawyer et al. 2007; Tareen et al. 2009; Chap. 13 in Lever et al. 2010). It is likely that each species’ present-day innate immune fortifications were independently shaped by past pathogenic pressures, which one can only speculate may have been the ancestors of present-day viruses and transposable elements.

2 Biochemical and Structural Insights

Zinc-dependent deaminases, such as the APOBECs, catalyze the conversion of C-to-U in polynucleotide substrates (Fig. 1b). This reaction requires the activation of water by a zinc ion coordinated by the enzyme (Fig. 1c). A glutamic acid in the active site of the enzyme protonates N3, priming the nucleophilic attack on the C4 position of the pyrimidine ring, followed by the removal and subsequent protonation of an amino group (NH2) that results in the release of ammonia (NH3) and uracil as products. This conversion can theoretically occur within both RNA and single-stranded DNA substrates. However, apart from APOBEC1, which has both RNA and DNA-editing activities, AID and the APOBEC3s have proven specific to DNA substrates in vitro and in vivo.

The extent of amino acid homology to APOBEC1 originally suggested that the APOBEC3 enzymes might be a family of RNA-editing proteins (Jarmuz et al. 2002). Three lines of evidence, however, demonstrated that this view was incorrect and established the APOBEC3 enzymes as single-stranded DNA cytosine deaminases. First, APOBEC3 has a high degree of homology to AID, and experiments in E. coli demonstrated AID, APOBEC3C, and APOBEC3G are capable of inducing high levels of mutation in an antibiotic resistance gene (Harris et al. 2002; Petersen-Mahrt et al. 2002). This was clearly due to DNA editing because mutation levels rose synergistically in a bacterial strain deficient for uracil DNA glycosylase (UDG), an enzyme that initiates base excision repair by recognizing and removing uracil exclusively from DNA (Lindahl 2000; Di Noia and Neuberger 2002; Harris et al. 2002). Second, unambiguous evidence for DNA versus RNA editing comes from head-to-head biochemical studies using recombinant enzymes. AID and APOBEC3G have a strong preference for single-stranded DNA substrates, with no detectable RNA-editing activity (Bransteitter et al. 2003; Iwatani et al. 2006). Third, a strong preference for single-stranded DNA substrates is also evident in sequencing studies of retroviruses produced in the presence of a given APOBEC3 protein, such as APOBEC3G (Fig. 2c) (Harris et al. 2003; Lecossier et al. 2003; Mangeat et al. 2003; Zhang et al. 2003). In this experimental system, each APOBEC3 protein presumably has a chance to deaminate viral genomic RNA cytosines before the reverse transcription process converts it to a single-stranded cDNA intermediate and then to the double-stranded DNA required for integration. However, the most common APOBEC3-dependent mutations detected in integrated viral DNA that has survived this process are genomic strand G-to-A mutations, entirely attributable to cDNA minus strand C-to-U deamination events. Genomic strand C-to-T editing events possibly due to RNA editing are rarely detected. Importantly, APOBEC3 DNA-editing activity is required to explain previously reported G-to-A mutation biases in HIV-1 substrates in vivo (Vartanian et al. 1994; Janini et al. 2001).

The solved structures of bacterial and yeast cytidine and cytosine deaminases were used to inform early functional and structural studies of various APOBEC3 family members (Betts et al. 1994; Ireton et al. 2003; Ko et al. 2003; Johansson et al. 2004; Xie et al. 2004). Each of these bacterial and yeast proteins, in monomeric form, is globular with a hydrophobic β-stranded core and several surrounding α-helices. The most conserved structural feature is the active site, which is defined by a histidine and two cysteines in the yeast enzyme and three cysteines in the bacterial enzymes (Xiang et al. 1997; Ireton et al. 2003; Ko et al. 2003). In both instances, these residues are positioned similarly by alpha helices and they serve to coordinate a zinc ion in the active site, which, as described above, is essential for the deamination reaction (Fig. 1b). Although these conserved features have been useful for generating models of APOBEC3 structures, they have also been misleading because the oligomeric state of each enzyme is variable. For instance, the E. coli CDA is homodimeric and the yeast enzyme is homotetrameric (Betts et al. 1994; Johansson et al. 2002). This has fuelled (likely incorrect) speculation that APOBEC3 family members must also function as oligomers.

Generating high-resolution structures of APOBEC3 family members has proved challenging in large part due to insolubility at higher protein concentrations [e.g., (Iwatani et al. 2006)]. However, several NMR and crystal structures have been achieved for the APOBEC3G catalytic domain (representing Z1-type deaminases), and crystal structures were obtained recently for APOBEC3C and the APOBEC3F catalytic domains (representing Z2-type deaminases) (e.g., Fig. 1c) (Chen et al. 2008; Holden et al. 2008; Furukawa et al. 2009; Shandilya et al. 2010; Kitamura et al. 2012b; Li et al. 2012; Bohn et al. 2013).

These structures have several conserved features that provide insight into how these enzymes may function. First, these proteins are all globular with a hydrophobic core consisting of five beta strands surrounded by six alpha helices and the hallmark α2-β3-α3 zinc-coordinating motif that defines the larger cytosine deaminase superfamily. Second, β-strands 3, 4, and 5 are arranged in parallel, similar to the RNA-editing enzyme TadA (an ADAT) but different from the antiparallel arrangement found in bacterial and yeast CDAs. This parallel β3-β4-β5 organization may be a key feature that distinguishes polynucleotide from non-polynucleotide deaminases. Third, although many potential oligomeric interfaces have been captured in the crystal lattices, none have proven critical for enzymatic activity and no common themes have emerged (Furukawa et al. 2009; Shandilya et al. 2010; Kitamura et al. 2012b; Bohn et al. 2013). This is consistent with a number of other studies indicating that oligomerization may not be essential for binding and deaminating single-stranded DNA substrates (Opi et al. 2006; Nowarski et al. 2008; Shlyakhtenko et al. 2011, 2012). However, more work on this topic is clearly needed to define the role of oligomerization in vivo, because several of the family members, including APOBEC3G, exhibit this property in living cells (Bransteitter et al. 2003; Chiu et al. 2006; Soros et al. 2007; Chen et al. 2013 in preparation). Finally, it is notable that the majority of structural and amino acid differences between APOBEC3 structures are confined to non-catalytic loop regions. Such differences likely relate to substrate targeting and possible cofactor binding, ultimately reflecting physiological function.

A significant remaining question in our understanding of APOBEC3 function is how these enzymes bind single-stranded DNA substrates. A current working model proposes a positively charged brim in the region surrounding the active site consisting of R213, R215, R313, and R320 in APOBEC3G (Chen et al. 2008; Shindo et al. 2012). These residues are predicted to position single-stranded DNA substrates in a manner that allows the target cytosine to enter the active site (Chen et al. 2008). This model also predicts that, in order to access the catalytic glutamic acid, the target C will be flipped out with respect to the phosphodiester backbone. A base-flipping mechanism is in good agreement with the structure of the adenosine deaminase TadA complexed with its RNA substrate (Losey et al. 2006).

Finally, the brim-domain model and TadA structures suggest an explanation for the different local single-stranded DNA deamination preferences among APOBEC family members (Conticello et al. 2007a; Chen et al. 2008). Unlike bacterial restriction enzymes with 4, 6, or 8 base palindromic recognition sequences, APOBEC3 family members have a notable preference for the base immediately 5′ of the target C (Harris et al. 2002, 2003; Mangeat et al. 2003; Zhang et al. 2003; Bishop et al. 2004; Liddament et al. 2004; Wiegand et al. 2004; Yu et al. 2004a, b; Zheng et al. 2004; Doehle et al. 2005a; Langlois et al. 2005; Dang et al. 2006; Aguiar et al. 2008; Harari et al. 2009; Stenglein et al. 2010). Specifically, AID prefers a 5′ purine base (5′-AC or GC), APOBEC3G a 5′ cytosine (5′-CC), and all other family members a 5′ thymine (5′-TC). Several studies have recently mapped this activity to a loop adjacent to the active site, positioned between β4 and α4 secondary structural elements (Conticello et al. 2007b; Chen et al. 2008; Holden et al. 2008; Kohli et al. 2009, 2010; Rathore et al. 2013 in preparation). This is most dramatically evidenced by loop grafting experiments, in which this loop in AID can be replaced by the homologous loop from APOBEC3G or APOBEC3F resulting in a complete switch of the preferred base immediately 5′ of the target cytosine (Kohli et al. 2009, 2010; Carpenter et al. 2010; Wang et al. 2010). Moreover, exchanging the same loop (or even a single amino acid) between APOBEC3A and APOBEC3G completely swaps the dinucleotide preference of these enzymes (Rathore et al. 2013 in preparation). Despite this progress, the field still anxiously awaits high-resolution structures of enzyme–substrate complexes that will more precisely define the substrate binding mechanism and advance our understanding of how these enzymes function in vivo.
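The dinucleotide preferences summarized above can be made concrete with a short Python sketch that lists the cytosines in a single-stranded DNA sequence whose 5′ neighbour matches a given enzyme's preference. This is only an illustration: the names `FIVE_PRIME_PREFERENCE` and `preferred_targets`, the `"other"` label for the remaining family members, and the toy sequence are all hypothetical, and real deamination depends on far more than the adjacent base.

```python
# Preferred base immediately 5' of the target cytosine, as summarized in the text.
# "other" stands in for the remaining family members (5'-TC preference).
FIVE_PRIME_PREFERENCE = {
    "AID": {"A", "G"},   # 5'-AC or 5'-GC
    "APOBEC3G": {"C"},   # 5'-CC
    "other": {"T"},      # 5'-TC
}

def preferred_targets(ssdna, enzyme):
    """Return 0-based indices of cytosines whose 5' neighbour fits the preference."""
    preferred = FIVE_PRIME_PREFERENCE[enzyme]
    seq = ssdna.upper()
    return [i for i in range(1, len(seq))
            if seq[i] == "C" and seq[i - 1] in preferred]

toy = "ATCCGCACTC"  # hypothetical single-stranded DNA, written 5'->3'
for enzyme in FIVE_PRIME_PREFERENCE:
    print(enzyme, preferred_targets(toy, enzyme))
```

Swapping the set of preferred bases for an enzyme in the dictionary is a loose analogue of the loop-grafting experiments: the scanning logic stays the same and only the 5′ base preference changes.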

3 Biological Functions

3.1 APOBEC3 Proteins in HIV Restriction

Pathogens including the retrovirus HIV-1 (hereafter HIV) must both engage and avoid numerous host factors to replicate and cause disease. Genome-wide knockdown and proteomic studies suggest that up to 10 % of human proteins either directly or indirectly impact HIV replication (Brass et al. 2008; Konig et al. 2008; Zhou et al. 2008; Yeung et al. 2009; Jäger et al. 2012a, b). The majority of these proteins are required in some capacity for virus replication (i.e., dependency factors). In contrast, a small number of these cellular proteins are dominant proteins that directly suppress virus replication (i.e., restriction factors). Restriction factor hallmarks include the capacity to potently inhibit virus replication, signatures of rapid evolution (positive selection), responsiveness to interferon, and neutralization by at least one viral counter-restriction strategy (Malim and Emerman 2008; Harris et al. 2012; Malim and Bieniasz 2012). Here, we focus on the mechanism of HIV restriction by APOBEC3 DNA cytosine deaminases, and we encourage readers to see chapters on equally interesting restriction and counter-restriction mechanisms (TRIM5α in the Chap. 2 by A.J. Fletcher and G.J. Towers, Tetherin/BST-2 in the Chap. 3 by S.J.D. Neil, and SAMHD1 in the Chap. 4 by M. Sharkey, this volume).

Original studies showed that the viral infectivity factor (Vif) protein of HIV is required for virus replication in primary CD4+ lymphocytes and in several common laboratory T cell lines (e.g., CEM, H9, so-called non-permissive), but it was dispensable in several others (e.g., CEM-SS, SupT1, or permissive lines) (Fisher et al. 1987; Strebel et al. 1987; Gabuzda et al. 1992). This phenotypic difference led to the cloning of APOBEC3G as one cDNA sequence expressed differentially (of many) between CEM and CEM-SS (Sheehy et al. 2002). However, APOBEC3G proved remarkable as it could convert a permissive cell line to a non-permissive phenotype (Sheehy et al. 2002). Taken together with the independent and near-simultaneous discoveries of APOBEC3G as a putative RNA-editing factor and as a DNA-editing enzyme, an editing mechanism of restriction was predicted and shortly after demonstrated (Harris et al. 2002, 2003; Lecossier et al. 2003; Mangeat et al. 2003; Zhang et al. 2003).

Over five hundred papers have now been published on APOBEC3G and the related APOBEC3 proteins that have culminated in a current Trojan horse-like model for HIV restriction as shown in Fig. 2c. To be effective as an HIV restriction factor, APOBEC3G must first be expressed in the cytoplasmic compartment of an infected virus-producing cell (Mangeat et al. 2003). Second, cytoplasmic APOBEC3G is thought to interact with a Gag ribonucleoprotein complex, and this interaction is required for APOBEC3G packaging (Alce and Popik 2004; Luo et al. 2004; Schafer et al. 2004; Svarovskaia et al. 2004; Khan et al. 2005; Burnett and Spearman 2007; Bogerd and Cullen 2008). This interaction can be disrupted by RNase treatment and is therefore thought to involve a bridging RNA between APOBEC3G and the nucleocapsid region of Gag (Cen et al. 2004; Svarovskaia et al. 2004; Iwatani et al. 2007; Bogerd and Cullen 2008). The identity of the bridging RNA is still an active area of investigation; while several reports indicate a role for the Alu-like 7SL RNA, a role for viral genomic RNA has not been excluded (Khan et al. 2005, 2007; Tian et al. 2007; Wang et al. 2007; Bogerd and Cullen 2008). Third, by an ill-defined mechanism, packaged APOBEC3G must breach the nucleocapsid core of the viral particle. This is a genetically (but not mechanistically) defined step, as various chimeric constructs have the capacity to be packaged into viral particles, yet do not enter the core or restrict (Haché et al. 2005; Martin et al. 2011; Song et al. 2012).

Once an APOBEC3G-loaded viral core is deposited into a target cell, reverse transcription proceeds and a susceptible single-stranded cDNA intermediate is generated. Here, APOBEC3G deaminates C-to-U (Harris et al. 2003; Mangeat et al. 2003; Zhang et al. 2003; Yu et al. 2004b). The uracilated cDNA is either subjected to degradation or templates the second-strand synthesis, preceding integration into the target cell’s genome (Kaiser and Emerman 2006; Mbisa et al. 2007; Yang et al. 2007). The uracils in the cDNA strand template the insertion of adenines in the nascent plus strand and thereby immortalize G-to-A mutations that limit subsequent rounds of viral replication (Harris et al. 2003; Lecossier et al. 2003; Mangeat et al. 2003; Zhang et al. 2003).
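The strand logic of this step can be traced in a few lines of Python. The sketch below is a simplified illustration, not a model of reverse transcription: it takes a plus-strand sequence, builds the complementary minus-strand cDNA, converts chosen cytosines to uracil, and then copies the edited minus strand back into a plus strand. The function names and the six-base toy sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))

def plus_strand_after_deamination(plus_strand, minus_positions):
    """Deaminate selected minus-strand cytosines and return the new plus strand.

    plus_strand: original genomic (plus) strand, 5'->3'.
    minus_positions: 0-based indices, on the minus-strand cDNA (5'->3'),
    of the cytosines to deaminate.
    """
    minus = list(reverse_complement(plus_strand))
    for i in minus_positions:
        if minus[i] != "C":
            raise ValueError(f"minus-strand position {i} is not a cytosine")
        minus[i] = "U"  # C-to-U deamination in the cDNA
    # During second-strand synthesis, U templates A (as T would).
    pairing = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}
    return "".join(pairing[base] for base in reversed(minus))

plus = "ATGGAC"                       # toy plus strand; minus strand is GTCCAT
print(plus_strand_after_deamination(plus, [2]))  # minus-strand C in a 5'-TC context
print(plus_strand_after_deamination(plus, [3]))  # minus-strand C in a 5'-CC context
```

In this toy case, editing the minus-strand cytosine at index 2 changes the plus strand from ATGGAC to ATGAAC, and editing index 3 gives ATAGAC, so each minus-strand C-to-U event appears as a plus-strand G-to-A change.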

Human T cell lines that express near-physiological levels of an APOBEC3G catalytic mutant do not suppress the replication of a Vif-deficient virus, indicating that the predominant mechanism of HIV restriction depends upon deaminase activity (Bishop et al. 2004; Miyagi et al. 2007; Schumacher et al. 2008; Browne et al. 2009). However, a component of APOBEC3G’s capacity to restrict HIV may be deaminase independent (Newman et al. 2005; Bishop et al. 2006, 2008; Iwatani et al. 2006; Opi et al. 2006; Iwatani et al. 2007; Mbisa et al. 2007). The most convincing of these studies has shown that APOBEC3G is capable of binding viral genomic RNA and sterically hindering the processivity of reverse transcriptase (Bishop et al. 2006, 2008). Nevertheless, the importance of deaminase-independent mechanisms, at least for APOBEC3G and HIV, is questionable given the aforementioned results at physiological expression levels.

Soon after the initial discovery and functional characterization of APOBEC3G, attention turned toward its six most closely related family members. Proviral sequences isolated from HIV-infected individuals exhibit two distinct patterns of G-to-A mutation consistent with APOBEC3-mediated deamination, both 5′-GG-to-AG and 5′-GA-to-AA (Fitzgibbon et al. 1993; Janini et al. 2001; Caride et al. 2002; Kieffer et al. 2005; Pace et al. 2006; Gandhi et al. 2008; Land et al. 2008). As mentioned previously, APOBEC3G has a demonstrated preference for deaminating 5′-CC dinucleotides on the viral minus strand resulting in 5′-GG-to-AG mutations and is therefore the most likely source of this mutational pattern (Harris et al. 2003; Mangeat et al. 2003; Zhang et al. 2003; Liddament et al. 2004; Yu et al. 2004b). However, high levels of 5′-GA-to-AA mutation in patients’ proviral sequences suggested the involvement of at least one additional family member. Unfortunately, all six other APOBEC3 family members prefer to deaminate a 5′-TC, therefore excluding the possibility of identifying an additional APOBEC3 source based solely on mutational preference [e.g., (Bishop et al. 2004); reviewed in (Albin and Harris 2010)].
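The two mutation patterns can be tallied from an aligned pair of plus-strand sequences, as in the following Python sketch. It rests on the strand relationship described above: a plus-strand 5′-GG corresponds to a minus-strand 5′-CC, and a plus-strand 5′-GA to a minus-strand 5′-TC. The function name `classify_g_to_a` and the toy sequences are hypothetical, and the sketch assumes gap-free sequences of equal length.

```python
def classify_g_to_a(reference, provirus):
    """Count plus-strand G-to-A changes by the reference base 3' of the mutated G."""
    if len(reference) != len(provirus):
        raise ValueError("sequences must be aligned and of equal length")
    counts = {"GG-to-AG": 0, "GA-to-AA": 0, "other": 0}
    for i, (ref, obs) in enumerate(zip(reference, provirus)):
        if ref == "G" and obs == "A":
            next_base = reference[i + 1] if i + 1 < len(reference) else ""
            if next_base == "G":
                counts["GG-to-AG"] += 1   # consistent with minus-strand 5'-CC
            elif next_base == "A":
                counts["GA-to-AA"] += 1   # consistent with minus-strand 5'-TC
            else:
                counts["other"] += 1
    return counts

reference = "ATGGACGATG"  # toy reference plus strand
provirus = "ATAGACAATA"   # toy provirus carrying three G-to-A changes
print(classify_g_to_a(reference, provirus))
```

As the text notes, a tally like this can point toward APOBEC3G for 5′-GG-to-AG changes, but it cannot by itself distinguish among the family members that share the 5′-TC preference.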

In cell-based model systems with forced cDNA expression, all seven APOBEC3s have reported activity [e.g., (Bishop et al. 2004; Liddament et al. 2004; Wiegand et al. 2004; Yu et al. 2004a; Zheng et al. 2004; Doehle et al. 2005a; Rose et al. 2005; Dang et al. 2006, 2008; Goila-Gaur et al. 2007; OhAinle et al. 2008); reviewed in (Albin and Harris 2010)]. However, cellular expression patterns, ability to encapsidate into viral particles, and restriction in relevant T cell models indicate not all the APOBEC3s function in vivo as HIV restriction factors. For example, in primary CD4+ T lymphocytes, a major target of HIV in vivo, six APOBEC3s are expressed at appreciable levels, but APOBEC3A mRNA is undetectable (Koning et al. 2009; Refsland et al. 2010). In addition, only four APOBEC3s (APOBEC3D, APOBEC3F, APOBEC3G, and APOBEC3H, but not APOBEC3A, APOBEC3B, or APOBEC3C) package into virus-like particles and inhibit viral replication when stably expressed in human T cell lines (Hultquist et al. 2011). Finally, endogenous APOBEC3D and APOBEC3F combine to explain the 5′-GA-to-AA mutation pattern observed in the non-permissive T cell line CEM2n (Refsland et al. 2012). Of note, CEM2n does not express appreciable levels of APOBEC3H mRNA; therefore, the endogenous contribution of this protein could not be evaluated. Seven variants of APOBEC3H have been reported, and cell-based studies indicate only a subset of these haplotypes are stable at the protein level and capable of HIV restriction (Dang et al. 2008; OhAinle et al. 2008; Harari et al. 2009; Wang et al. 2011). The significance of this natural variation in human populations and the relevance of it to HIV restriction require further investigation. Taken together, while all the APOBEC3s can be compelled to deaminate a single-stranded viral DNA in vitro or in heterologous cell lines by gross overexpression, only four APOBEC3s (APOBEC3D, APOBEC3F, APOBEC3G, and APOBEC3H) appear to have the capacity to function as HIV restriction factors in T lymphocytes.

All successful viruses have evolved sophisticated immune suppression and/or evasion mechanisms to neutralize the cellular defense systems they must face. One of the best-studied APOBEC3 counter-restriction mechanisms is orchestrated by the HIV Vif. This protein is small (23 kDa), highly basic, and strictly required for pathogenesis in vivo as well as for virus replication ex vivo in monocytes, macrophages, primary CD4+ T lymphocytes, and non-permissive T cell lines (Kan et al. 1986; Lee et al. 1986; Sodroski et al. 1986; Fisher et al. 1987; Strebel et al. 1987; Gabuzda et al. 1992; von Schwedler et al. 1993). Vif recruits a multiprotein E3 ubiquitin ligase complex consisting of CUL5/NEDD8, ELOB, ELOC, RBX2, and CBFβ to mediate proteasomal degradation of the cellular APOBEC3 restriction factors (Fig. 2c) (Conticello et al. 2003; Marin et al. 2003; Sheehy et al. 2003; Yu et al. 2003; Mehle et al. 2004; Jäger et al. 2012b). In addition to this primary role, it has been suggested that Vif can relieve APOBEC3G-mediated restriction by alternative mechanisms such as directly blocking deaminase activity, preventing encapsidation, sequestering the protein in catalytically inactive conformations, or impeding APOBEC3G translation (Mariani et al. 2003; Stopak et al. 2003; Kao et al. 2004; Santa-Marta et al. 2005; Opi et al. 2007; Goila-Gaur et al. 2008; Britan-Rosich et al. 2011). The extent to which these mechanisms function to counteract APOBEC3 during a productive infection warrants further investigation but is likely to be secondary to the proteasomal degradation mechanism.

3.2 APOBEC3 Proteins in General Innate Immune Defense

Exogenous retroviruses are rare in humans, consistent with the idea that the multifaceted APOBEC3 defense provides robust protection against this type of pathogen. HIV is one exception to this rule. It is successful, at least in part, due to Vif and degradation of the cellular APOBEC3s. However, evidence for a long history of positive selection acting on the human APOBEC3 locus suggests this family has been defending the genomic integrity of its host’s cells long before HIV was transmitted into the human population (Sawyer et al. 2004; Zhang and Webb 2004; Sanville et al. 2010). Given the significant presence of endogenous retroelements in the genomes of mammals (nearly 50 %), suppression of these elements may represent the subfamily’s true raison d’être and the primary source of the positive selection that has shaped the complex present-day locus and the diverse functions of these innate immunity factors.

Endogenous retroelements, including those containing long terminal repeats (LTR) like endogenous retroviruses, as well as non-LTR elements like long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs), may have provided the evolutionary pressure necessary for the maintained expansion of the APOBEC3 locus in primates. Additionally, the differences in retrotransposition frequency between rodents and primates could be attributable to possessing an arsenal of seven APOBEC3 proteins versus only a single one (Maksakova et al. 2006; Stenglein and Harris 2006). In support of this hypothesis, human APOBEC3s have demonstrated activity on LTR retrotransposons from mice and yeast (Dutko et al. 2005; Esnault et al. 2005, 2006; Schumacher et al. 2005; Bogerd et al. 2006a; Chen et al. 2006; Lee and Bieniasz 2007; Jern and Coffin 2008; Lee et al. 2008). Additionally, some endogenous retroviruses exhibit the characteristic scars from APOBEC3F and APOBEC3G activity in their genomes (Anwar et al. 2013 in press). Non-LTR elements, including LINE1 and Alu, are also restricted by human APOBEC3s but, in contrast to the HIV restriction mechanism described above, this mechanism appears entirely deamination independent (Bogerd et al. 2006b; Chiu et al. 2006; Muckenfuss et al. 2006; Stenglein and Harris 2006; Carmi et al. 2011).

The generation of a single-stranded DNA intermediate in the life cycle of a parasitic element may render it susceptible to APOBEC3 restriction. In addition to HIV, other viruses including simian immunodeficiency virus, murine leukemia virus, foamy virus, porcine endogenous retrovirus, human T cell leukemia virus, and hepatitis B virus have all been reported to be susceptible to APOBEC3-mediated editing of their genomes (Harris et al. 2003; Mangeat et al. 2003; Mariani et al. 2003; Kobayashi et al. 2004; Turelli et al. 2004; Yu et al. 2004a; Doehle et al. 2005b; Lochelt et al. 2005; Russell et al. 2005; Suspène et al. 2005; Abudu et al. 2006; Delebecque et al. 2006; Jonsson et al. 2007). Whether or not the APOBEC3-mediated restriction of all of these viral pathogens is part of a natural innate immune response or a by-product of the particular model systems used awaits further investigation.

Endogenous retroelements and their mobility are believed to have played a central role early in shaping the human genome during speciation (Kazazian 2004; Carmi et al. 2011). However, this process has its associated costs. Ultimately, cells have devised strategies to defend and preserve genomic integrity by curbing the movement of these genetic elements. The evolution of the APOBEC3 family has likely played a prominent role in this defense and in diversifying the retroelements to make them more useful for the host species [e.g., (Carmi et al. 2011)].

4 Pathological Consequences of DNA Deamination

Although DNA cytosine deamination has obvious benefits for individual cells and the organism as a whole, this process may have considerable pathological consequences, most notably cancer. For instance, overexpression of APOBEC1 in the liver of mice has been shown to cause hepatocellular carcinoma and liver dysplasia (Yamanaka et al. 1995). Likewise, AID transgenesis also causes cancer, and it has been implicated in initiating the chromosomal translocations responsible for some lymphocyte neoplasias including the hallmark c-myc/IgH rearrangement in Burkitt’s lymphoma [(Okazaki et al. 2003; Ramiro et al. 2004; Unniraman et al. 2004); reviewed by (Perez-Duran et al. 2007)]. AID may also promote resistance to the chemotherapeutic drug imatinib (Klemm et al. 2009). However, the overall impact of APOBEC1 and AID on human cancer is questionable because their expression is largely limited to tissues associated with their biological functions, APOBEC1 in the enterocytes of the small intestine and AID in B cells (Fig. 3a) (Powell et al. 1987; Teng et al. 1993; Muramatsu et al. 1999, 2000).

Fig. 3

APOBEC expression and model of carcinogenesis. a APOBEC family mRNA expression in the indicated cell lines and normal human tissues. Data are relative to levels of AID in the spleen (reprinted with permission from Burns et al. 2013). b Model depicting the effect of APOBEC3B overexpression over time

In contrast, most APOBEC3s have much broader expression ranges that span most tissues in the human body (Fig. 3a) (Koning et al. 2009; Refsland et al. 2010; Burns et al. 2013). Broad expression profiles, potent DNA deaminase activity, and C-to-T transition biases in tumor genome sequences strongly suggested that one or more of the APOBEC3 proteins may be a source of mutation in different cancers (Harris et al. 2002; Nik-Zainal et al. 2012; Roberts et al. 2012; Burns et al. 2013). APOBEC3B became a leading candidate as it uniquely and constitutively localizes to the nucleus by inheriting a nuclear import mechanism from AID (Bogerd et al. 2006b; Bonvin et al. 2006; Muckenfuss et al. 2006; Stenglein and Harris 2006; Kinomoto et al. 2007; Stenglein et al. 2008; Hultquist et al. 2011; Pak et al. 2011; Lackey et al. 2012, 2013). Recently, APOBEC3B was found to be overexpressed in some laboratory breast cancer cell lines, but not in available control cell lines (Fig. 3a; for example, compare HCC1569, MDA-MB-453, and MDA-MB-468 to telomerase-immortalized human mammary epithelial cells, hTERT-HMECs) (Burns et al. 2013). APOBEC3B up-regulation was shown to be responsible for elevated genomic uracil levels and increased mutation rates in breast cancer cell lines (Burns et al. 2013). APOBEC3B up-regulation was similarly robust in the majority of human breast tumors, in contrast to barely detectable levels in normal breast tissue (Burns et al. 2013). Remarkably, APOBEC3B overexpression correlated with a doubling of the tumor genomic mutation load, and the majority of C-to-T transition mutations occurred within the preferred motif of recombinant APOBEC3B (Burns et al. 2013).

Overall, a model is emerging for how APOBEC3B provides genetic fuel for tumorigenesis, which, coupled with selection, may help explain many hallmarks of cancer such as increased DNA damage, elevated proliferation, decreased apoptosis, and massive heterogeneity (Fig. 3b). In particular, APOBEC3B up-regulation correlates with inactivation of the tumor suppressor gene TP53, which suggests that it may be an early tumor-initiating event (Burns et al. 2013). Presumably, the benefits of encoding APOBEC3B outweigh the potential costs of carcinogenesis. An attractive explanation for this apparent conundrum may be that its innate immune function is important early in life and for the health of the species, for instance, in germ cells or early development (Bogerd et al. 2006b; Wissing et al. 2011), whereas the toll of cancer is not imposed in most instances until after the reproductive years. In any event, much more work is now justified on APOBEC3B and its role in breast and, potentially, other human cancers.

5 Possible Avenues to APOBEC3-Based Therapeutics

5.1 Therapy by Hypermutation

If left unimpeded by Vif, APOBEC3 proteins such as APOBEC3G can convert up to 10 % of viral cDNA cytosines into uracils in a single round of virus replication [e.g., (Harris et al. 2003; Yu et al. 2004b)]. The resulting massive levels of G-to-A mutations effectively ruin the genetic potential of the retrovirus in a process called lethal mutagenesis (Loeb et al. 1999; Haché et al. 2006). Moreover, the preferred context of APOBEC3G deamination events often results in the conversion of the tryptophan codon TGG into a premature stop codon TAG, which is more detrimental to the virus than a simple amino acid change. Overall, physiological levels of APOBEC3 proteins largely, if not fully, suppress the replication of Vif-deficient HIV.
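
The stop-codon argument above can be illustrated with a minimal Python sketch. It assumes a deliberately extreme all-or-nothing model in which every plus-strand G followed by another G becomes A (the plus-strand footprint of minus-strand cytosine deamination in a 5'-CC context); real editing is stochastic and affects only a fraction of sites. The toy reading frame and the function names are hypothetical and are not taken from any of the cited studies.

```python
# Illustrative sketch only: plus-strand G-to-A hypermutation in a GG context.
# The all-or-nothing mutation rule and the toy sequence are assumptions.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def hypermutate_gg(plus_strand: str) -> str:
    """Change every G that is immediately followed by G into A (GG -> AG).

    The context is read from the original sequence, so a run such as GGG
    becomes AAG.
    """
    original = plus_strand.upper()
    mutated = list(original)
    for i in range(len(original) - 1):
        if original[i] == "G" and original[i + 1] == "G":
            mutated[i] = "A"
    return "".join(mutated)


def codons(seq: str) -> list:
    """Split a sequence into complete in-frame codons, starting at position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_stop(seq: str):
    """Return the index of the first in-frame stop codon, or None if absent."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


# Hypothetical toy reading frame: ATG TGG CCA GGA TAA
toy_orf = "ATGTGGCCAGGATAA"
toy_mutant = hypermutate_gg(toy_orf)

print(codons(toy_orf), "first stop at codon", first_stop(toy_orf))
print(codons(toy_mutant), "first stop at codon", first_stop(toy_mutant))
```

In this toy frame the TGG (Trp) codon becomes TAG, which moves the first stop from codon index 4 to codon index 1, whereas GGA (Gly) becomes AGA (Arg), a simple amino acid change.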

This remarkable potency therefore raises the prospect of developing drugs to leverage the APOBEC3 restriction mechanism against HIV (Fig. 4a). Direct inhibition of Vif is certainly one strategy, but such an approach is destined to be susceptible to problems imposed by extensive natural HIV variation and the rapid evolution of drug resistance. A more appealing alternative may be to develop a drug toward one of the more genetically stable cellular proteins recruited to the APOBEC3 degradation complex. In particular, Vif requires at least four heterologous protein–protein interactions to successfully counteract the APOBEC3 proteins. The first strategy, of course, is targeting the direct interaction with the APOBEC3 proteins themselves, which may occur through conserved structural motifs (Albin et al. 2010; Kitamura et al. 2012a). Second is targeting the recently discovered interaction with CBFβ, which is essential for Vif stability and function (Jäger et al. 2012b; Zhang et al. 2012). Third and fourth are targeting the distinct Vif interaction motifs in ELOC (the SLQ motif) or CUL5, which are also essential for activity of the APOBEC3 Ub ligase complex (Marin et al. 2003; Yu et al. 2003). Finally, it may be possible to target an upstream component of the proteasomal pathway involved in APOBEC3 degradation [e.g., (Kim et al. 2013)].

Fig. 4

Potential therapies harnessing APOBEC3 innate immunity. a Inhibition of the Vif–E3 ligase complex could result in virus restriction. b Inhibiting the deaminase activity of APOBEC3 proteins could deprive HIV of a source of variation and allow immune clearance

However, despite these clear opportunities for drug development, progress has been limited. Proof of concept has been achieved through cell-based screens for preservation of APOBEC3G-GFP fluorescence in the presence of HIV Vif (Nathans et al. 2008). The lead molecule from these studies, RN18, however, has been subjected to only modest additional development (Ali et al. 2012). Development of RN18 is further held back by the difficulty of identifying its molecular target, which is essential for structural studies and rational improvements through medicinal chemistry.

Independent lead compounds have also been identified based on predicted fit into the Vif-binding pocket in ELOC, which would effectively outcompete the SLQ motif of Vif (Huang et al. 2013a, b). However, the best of these indolizine-type compounds has a modest IC50 value of 11 μM. These compounds will likely need improvements in potency and solubility before efficacy can be achieved in live cell studies. In any event, much more work in this area is needed, including larger-scale cellular screens, biochemical screens, and computational screens, followed by a comprehensive set of secondary and tertiary screens to home in on the most effective drug candidates.
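
To put an IC50 of 11 μM in perspective, the following Python sketch evaluates a standard Hill-type dose-response curve. The Hill coefficient of 1 and the example concentrations are assumptions chosen for illustration; they are not measurements from the cited studies, and the function name is hypothetical.

```python
# Illustrative sketch only: fractional inhibition under a Hill-type model.
# The Hill coefficient (default 1.0) and test concentrations are assumptions.


def fraction_inhibited(conc_um: float, ic50_um: float, hill: float = 1.0) -> float:
    """Return the fraction of activity inhibited at a given concentration.

    Model: inhibition = C^h / (C^h + IC50^h), with concentrations in micromolar.
    """
    if conc_um < 0 or ic50_um <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    top = conc_um ** hill
    return top / (top + ic50_um ** hill)


IC50_UM = 11.0  # value reported in the text for the best indolizine-type compound

for conc in (1.0, 11.0, 99.0):
    print(f"{conc:6.1f} uM -> {fraction_inhibited(conc, IC50_UM):.1%} inhibition")
```

Under these assumptions, a concentration of 1 μM would inhibit only about 8 % of the target interaction, and roughly 99 μM would be required to reach 90 % inhibition, which is one way of seeing why an IC50 in the 10 μM range is considered modest for a lead compound.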

5.2 Therapy by Hypomutation

Vif efficiently counteracts the HIV-relevant APOBEC3 repertoire (Conticello et al. 2003; Kao et al. 2003; Marin et al. 2003; Sheehy et al. 2003; Stopak et al. 2003; Hultquist et al. 2011). Nonetheless, evidence of APOBEC3 activity is found in the proviral sequences derived from infected patients [e.g., (Janini et al. 2001)]. This raises two possibilities: either Vif cannot completely neutralize all of the cytoplasmic APOBEC3 before encapsidation, or HIV regulates its high mutation rate, in part, through controlled degradation of the APOBEC3 proteins, with Vif acting as a molecular rheostat that optimizes the level of cytosine deamination necessary for immune evasion and potentially drug resistance (Haché et al. 2006; Harris 2008).

A counterintuitive but potentially more effective strategy to decrease the pathogenesis of HIV may be to inhibit the enzymatic activity of the APOBEC3s, thereby eliminating a potential source of variation for the virus (Fig. 4b) (Harris 2008; Li et al. 2012; Olson et al. 2013). Toward this end, high-throughput screens have provided proof of concept that APOBEC3G activity can be blocked with small molecules (Li et al. 2012; Olson et al. 2013). However, much additional work is still necessary to improve the solubility, potency, and bioavailability of APOBEC3 inhibitors, and efficacy will need to be demonstrated in cell-based studies before critical animal experiments can be done. Nevertheless, a hypomutation strategy is attractive because decreasing the HIV mutation rate has the potential to limit the diversity of the viral population and render it susceptible to normal immune clearance mechanisms (Fig. 4b). Indeed, the adaptive T cell- and B cell-mediated immune responses manage to keep the virus in check initially in most patients, and it is tempting to speculate that even a slight tip of the balance in favor of host immunity may enable complete virus clearance (i.e., analogous to most other viral infections). Given that existing anti-retroviral drugs are numerous and largely effective, a major remaining challenge facing the field is the development of a curative therapy. It remains possible that promoting either virus hypermutation or hypomutation could be part of such a cure.