Introduction

Secondary antibody diversification mechanisms such as somatic hypermutation (SHM) and class switch recombination (CSR) are essential to neutralize and clear pathogens and toxins from the circulation. SHM is characterized by very high mutation rates that occur at the immunoglobulin V region (Papavasiliou and Schatz 2002), estimated to be 10⁻³ mutations per basepair per generation (McKean et al. 1984), while CSR is a DNA recombination event that occurs between two immunoglobulin switch regions that are located upstream of each constant region (Honjo et al. 2002; Kenter 2003; Manis et al. 2002). Both SHM and CSR require the enzyme activation-induced cytidine deaminase (AID) (Diaz and Storb 2003; Martin et al. 2002; Muramatsu et al. 2000; Okazaki et al. 2002; Revy et al. 2000; Yoshikawa et al. 2002). Since its initial discovery, two models have emerged that describe how AID initiates SHM and CSR. One model suggests that AID edits an mRNA, much like its homologue Apobec1, to create a novel transcript that now encodes a mutator or endonuclease (Muramatsu et al. 2000). The second model suggests that AID is a DNA-specific cytidine deaminase that initiates the SHM and CSR processes by deaminating cytidine (to uridine) within actively transcribed V regions and switch regions of antibody genes, respectively (Martin et al. 2002; 2002). The uridines created by AID are then acted on by uracil DNA glycosylase (UNG) (Di Noia and Neuberger 2002; Rada et al. 2002) and AP endonuclease, resulting in single-stranded DNA breaks that are subsequently repaired by other DNA repair factors. Recent reports, however, show that the glycosylase activity of UNG is dispensable for CSR, casting doubt on the DNA-deamination model (Begum et al. 2004; Unniraman et al. 2004).

The mismatch repair (MMR) pathway, a DNA repair pathway responsible for recognizing and initiating repair of mispaired DNA, is also known to be involved in the processes of SHM and CSR (Alabyev and Manser 2002; Ehrenstein and Neuberger 1999; Martin et al. 2003; Phung et al. 1998; Rada et al. 1998; Reynaud et al. 1999; Schrader et al. 1999; Vora et al. 1999; Wiesendanger et al. 2000). Mice deficient in the MMR proteins Msh2, Msh6, or Exo1 have fewer A–T mutations than controls (Bardwell et al. 2004; Martin et al. 2003; Phung et al. 1998; Rada et al. 1998; Wiesendanger et al. 2000). These findings led to the hypothesis that SHM proceeds in two phases: the first phase mutates G–C basepairs, while the second phase, which is dependent on MMR, preferentially mutates A–T basepairs (Rada et al. 1998). Thus, the remaining G/C mutations in MMR-deficient mice would mostly be due to AID, if the DNA deamination model were to be correct.

In support of the DNA deamination model, purified or partially purified AID deaminates single-stranded DNA (Bransteitter et al. 2003; Chaudhuri et al. 2003, 2004; Dickerson et al. 2003; Pham et al. 2003; Ramiro et al. 2003; Shen and Storb 2004; Sohail et al. 2003; Yu et al. 2004a). Recent evidence suggests that AID requires replication protein A (RPA) to mediate these deamination events in vivo, and that this interaction may influence target specificity (Chaudhuri et al. 2004). Two reports using a glutathione S-transferase (GST)-hAID fusion protein showed that AID prefers to mutate cytidines within a WRC (A/T, A/G, C) sequence context (Pham et al. 2003; Yu et al. 2004a). This sequence motif is also observed to mutate frequently in mice and humans (Rogozin and Diaz 2004; Shapiro et al. 2002), in cell lines (Martin et al. 2002; Zhang et al. 2001), and in bacteria (Beale et al. 2004), which lends strong support to the DNA-deamination model for AID function and suggests that RPA is not required for this specificity. However, another report using purified AID with a small peptide tag did not find a sequence preference for WRC motifs (Dickerson et al. 2003). The finding that AID prefers to mutate WRC motifs, a highly mutated sequence in vivo, might be the most convincing argument to support the DNA deamination model. Because this WRC specificity was only observed with the GST moiety attached to AID, we first wanted to confirm that the specificity was not imparted to AID by the GST protein. Thus, we tested whether purified AID without the large GST moiety retained the WRC specificity. We also compared the mutation spectrum of purified hAID at all trinucleotide motifs to that observed in hypermutating Ramos cells and ung−/−msh2−/− mice, in which mutations occur mostly or exclusively at G/C basepairs, respectively (Martin et al. 2002; Rada et al. 2004; Sale and Neuberger 1998; Zhang et al. 2001), and which are thus likely to be deficient for non-AID-induced mutations.

Materials and methods

Generation and purification of His-hAID

The His-hAID expression vector was constructed by cloning hAID cDNA into the pRSETc vector. A 4-l culture of DE3 BL21 with the pRSET-hAID vector was grown at 37°C in the presence of 100 μg/ml ampicillin. At an OD of 0.6, 1 mM IPTG (Fermentas) and 200 μg/ml ampicillin were added, and cultures were grown for 16 h at 16°C. A 500-ml pellet was resuspended in 20 ml of lysis buffer (50 mM phosphate buffer pH 8.0, 150 mM NaCl, 0.2% Triton X-100, 100 μg/ml PMSF, 1 mM imidazole), lysed by French press, and centrifuged. Supernatants were applied to a His-select affinity cobalt resin (Sigma), and the resin was washed with 100 ml of lysis buffer containing 8 mM imidazole. His-hAID was eluted with lysis buffer containing 100 mM imidazole. Fractions were analyzed by SDS-PAGE and stained with Coomassie blue. Identity of the His-hAID band was verified by western blotting using an anti-His-tag antibody (data not shown). BSA (1 mg/ml) was added to the fractions containing pure His-hAID, which were then dialyzed overnight at 4°C into the final storage buffer (20 mM Tris, pH 7.4, 100 mM NaCl, 1 mM DTT, 1 mg/ml BSA) and stored at −80°C.

His-hAID deamination assay

HindIII-digested p219 plasmid (1 nM; Larijani et al. 2005) was diluted in 20 mM Tris HCl pH 8.0, denatured at 95°C for 10 min and snap-cooled, followed by the immediate addition of 2× reaction buffer (20 mM Tris HCl, pH 8.0, 20 mM MgCl2, 2 mM DTT) and 20 ng of His-hAID. Tubes were incubated at 37°C for the indicated time-points. Samples of 1 μl were then used for PCR reactions, using Taq and previously described primers and conditions (Larijani et al. 2005). Although these primers have a Tm for the normal and deaminated product of ∼48 and ∼68°C, respectively, the annealing temperature during the PCR reaction was kept at 50°C to avoid amplifying heavily deaminated products (Larijani et al. 2005). PCR products were electrophoresed on a 1.5% agarose gel and photographed. PCR products from the 30-min time-point were cloned into the pCR4-Topo vector (Invitrogen) and sequenced using T7 primers. Statistical analysis in Table 1 was performed for the WRC mutations: observed mutation frequencies were compared with the mutation frequency expected for a mutation mechanism with no sequence bias. Significance was calculated by the independent-samples t-test (two-tailed) with equal variances assumed (Excel 2002).
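The independent-samples t-test with equal variances assumed can be sketched as follows. This is a minimal illustration, not the spreadsheet calculation used for Table 1: the function name and the example values are hypothetical, and the text does not specify which observations made up each sample.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a two-sample t-test
    with equal variances assumed (pooled variance)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / std_err
    return t, df


# Hypothetical values, for illustration only.
observed = [1.0, 2.0, 3.0]
expected = [2.0, 3.0, 4.0]
t_value, dof = pooled_t_statistic(observed, expected)
```

The two-tailed P value is then read from Student's t distribution with the returned degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=True`) performs both steps.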

Table 1 Trinucleotide mutability indexes in vivo and of purified hAID

Results and discussion

Trinucleotide mutability index of His-hAID

To characterize the mutation spectra of purified AID protein without the GST fusion protein, we generated a bacterially expressed human AID protein with a poly-histidine tag (referred to as His-hAID). This protein was expressed in Escherichia coli and purified to near homogeneity (Fig. 1). Bovine serum albumin (BSA) was added to the His-hAID shortly after purification to stabilize the enzyme (see Materials and methods). To test the activity of His-hAID, a linearized plasmid (i.e., p219) was boiled and snap-cooled, and 20 ng of His-hAID was added; reactions were incubated for 5, 10, 30, 60, and 120 min. Deaminated products were amplified by PCR using primers that preferentially bind to deaminated plasmid DNA. As shown in Fig. 2a, PCR products were observed in the His-hAID-treated samples, but not in the untreated controls. As a control, His-hAPOBEC3G, which was purified from bacteria using the same protocol that was used for His-hAID, did not deaminate DNA sufficiently to produce a PCR product (Fig. 2a, lane 8). This was expected since hAPOBEC3G has a much more restricted sequence preference (i.e., CCCA/G) (Yu et al. 2004b) than hAID and is not expected to deaminate the target sufficiently to allow for primer binding. This control also indicates that the His-hAID and His-hAPOBEC3G preparations were free of contaminating E. coli cytidine deaminases.

Fig. 1

Coomassie-stained SDS-PAGE of purified His-hAID. Lanes 1 and 2 show total protein before and after application to the cobalt column, respectively. Lane 3 shows the purified His-hAID which was eluted from the cobalt column. Immediately after elution, 1 mg/ml BSA was added to stabilize the purified His-hAID

Fig. 2a,b

His-hAID is active in deamination assay. a Schematic (left panel) illustrating the in vitro deamination assay and the PCR reaction used to selectively amplify deaminated product. Linearized p219 plasmid was boiled, snap-cooled, and incubated with purified His-hAID or His-hAPOBEC3G for the indicated times. Primers specific to the p219 β-promoter region with either T or A substituted for C or G, respectively, preferentially amplify deaminated products in a nested PCR assay. The inner primers used in the nested PCR reaction generate a 370-bp product. Agarose gels were stained with ethidium bromide and photographed with colors inverted (right panel). b Mutations (asterisks) located within the 314-bp fragment of the p219 β-promoter region. Regions proximal to the primer binding site were not analyzed since these sites were selectively deaminated due to the PCR reaction and would thus have skewed the results shown in Table 1 and Fig. 3
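The primer logic described in this legend (T substituted for C, or A substituted for G) can be illustrated with a short sketch. The function names and the six-base example sequence are hypothetical; the actual primer sequences are those of Larijani et al. (2005) and are not reproduced here.

```python
def deaminated_variant(seq, strand="top"):
    """Return seq as it would read after full deamination of one strand.

    Deaminating C on the top strand gives U, which is copied as T.
    Deaminating C on the bottom strand reads as G->A on the top strand.
    """
    seq = seq.upper()
    if strand == "top":
        return seq.replace("C", "T")
    if strand == "bottom":
        return seq.replace("G", "A")
    raise ValueError("strand must be 'top' or 'bottom'")


def count_mismatches(primer, template):
    """Count positions at which two equal-length sequences differ."""
    if len(primer) != len(template):
        raise ValueError("sequences must have the same length")
    return sum(p != t for p, t in zip(primer.upper(), template.upper()))


# Hypothetical 6-base site, for illustration only.
site = "ACGTCC"
primer = deaminated_variant(site, strand="top")   # "ATGTTT"
mismatches_with_unmodified = count_mismatches(primer, site)  # 3
```

A primer designed this way matches the deaminated site exactly and carries one mismatch per cytidine against the unmodified site, which is the basis for the preferential amplification of deaminated product.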

To determine the preferred sequence context mutated by His-hAID, we cloned and sequenced PCR products from the 30-min His-hAID-treated samples (Fig. 2b). These results were compared with those from baculovirus-expressed GST-hAID that was previously analyzed using the same conditions and plasmid target (Table 1, Fig. 3) (Larijani et al. 2005). Our analysis specifically focused on the mutability index of trinucleotide motifs (Shapiro et al. 2002). In general, His-hAID preparations preferentially deaminated WRC (i.e., A/T, A/G, C) motifs, and showed a reduced deamination efficiency of GYC (G, C/T, C) motifs (Table 1, Fig. 3). These results are similar to what was previously reported for GST-hAID (Table 1, Fig. 3) (Larijani et al. 2005; Pham et al. 2003).
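A trinucleotide mutability index of this kind can be sketched as the ratio of observed mutations at a motif to the number expected if mutations were spread evenly over all candidate sites (no sequence bias). The code below is an illustration under that assumption; the exact normalization of Shapiro et al. (2002) may differ, the function names and example sequence are hypothetical, and only cytidines on the given strand are scored (mutations at G would be scored on the reverse complement).

```python
from collections import Counter

# IUPAC codes needed for the motifs named in the text (WRC, GYC, SSC).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "AT", "R": "AG", "Y": "CT", "S": "CG"}


def matches(trinucleotide, motif):
    """True if a trinucleotide fits an IUPAC motif such as 'WRC'."""
    return len(trinucleotide) == len(motif) and all(
        base in IUPAC[code] for base, code in zip(trinucleotide, motif))


def trinucleotide_mutability(sequence, mutated_positions):
    """Map each NNC trinucleotide to observed/expected mutations at its C.

    mutated_positions holds 0-based indexes of mutated C bases,
    one entry per mutation event.
    """
    sequence = sequence.upper()
    occurrences = Counter()
    for i in range(2, len(sequence)):
        if sequence[i] == "C":
            occurrences[sequence[i - 2:i + 1]] += 1
    observed = Counter()
    for pos in mutated_positions:
        if pos >= 2 and sequence[pos] == "C":
            observed[sequence[pos - 2:pos + 1]] += 1
    total_sites = sum(occurrences.values())
    total_mutations = sum(observed.values())
    if total_mutations == 0:
        return {}
    index = {}
    for tri, n_sites in occurrences.items():
        # Expected count if every candidate C were equally mutable.
        expected = total_mutations * n_sites / total_sites
        index[tri] = observed[tri] / expected
    return index


# Hypothetical target and mutation positions, for illustration only.
indexes = trinucleotide_mutability("AGCGGCTACGCC", [2, 2, 8, 5])
wrc_indexes = {tri: v for tri, v in indexes.items() if matches(tri, "WRC")}
```

An index above 1 marks a motif mutated more often than expected without sequence bias, and an index below 1 marks an under-mutated motif.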

Fig. 3

Trinucleotide mutability indexes in vivo and of purified AID. The trinucleotide mutability indexes (data obtained from Table 1) are plotted for His-hAID, GST-hAID, Ramos cell VH region, and the region flanking the 3′ side of VHJ558DJH4 in ung−/−msh2−/− mice

Trinucleotide mutability indexes in Ramos cells and in ung−/−msh2−/− mice

To compare the mutation spectrum produced by His-hAID to mutations found in vivo and in cell lines, we reanalyzed the mutation spectra from hypermutating Ramos cells (Martin et al. 2002; Zhang et al. 2001) and from ung−/−msh2−/− mice (Rada et al. 2004). According to the model proposed by Cristina Rada and Michael Neuberger (Rada et al. 1998), SHM is divided into two phases: the first phase is G/C biased, while the second phase is A/T biased but is also expected to occur at G/C basepairs at a reduced frequency. Ramos cells are likely to be deficient in phase two mutations because ∼80–90% of mutations occur at G/C basepairs (Martin et al. 2002; Sale and Neuberger 1998; Zhang et al. 2001), in contrast to the ∼40–50% G/C mutations found in humans and mice (Martin and Scharff 2002). Because ung−/−msh2−/− mice only contain transition mutations at G/C basepairs (Rada et al. 2004), phase two mutations are absent in these mice. According to the DNA deamination model, phase one mutations are due to AID. Thus, the mutation spectrum in Ramos cells and in ung−/−msh2−/− mice is likely to reflect the sequence preference for AID in vivo.

Only unique mutations were tabulated for Ramos cells and ung−/−msh2−/− mice. That is, if a specific base change was observed more than once in a sequence in the same Ramos clone or in the same mouse, that mutation was counted only once. The data are presented in Table 1 and Fig. 3, together with human and mouse trinucleotide mutability indexes that were previously reported (Shapiro et al. 2002). It is important to note that all mutations reported in Table 1 and Fig. 3 are unselected since they either occur within introns (i.e., for mice) or non-productively rearranged V-regions (i.e., for humans) or are obtained in vitro (i.e., for Ramos cells, GST-hAID, and His-hAID). As shown in Table 1 and Fig. 3, the trinucleotide mutability indexes in human VH, mouse intronic JHJκ, Ramos cell V-region, and the region flanking the 3′ side of VHJ558DJH4 in ung−/−msh2−/− mice showed a similar pattern to that of purified GST-hAID and His-hAID. Overall, WRC motifs were preferentially mutated, while SSC motifs were primarily under-mutated. Interestingly, Ramos cells showed a strikingly high trinucleotide mutability index for AGC, TAC, and AAC, but not for TGC, all of which comprise the WRC motif. The lack of a preference for TGC motifs was also observed in human VH genes (Table 1), but not in mouse JH, ung−/−msh2−/− mice, or purified human AID, which showed high mutability indexes for all four WRC motifs (Table 1, Fig. 3). The significance of this finding is unclear. These data show that the trinucleotide mutability indexes of purified hAID correspond to the trinucleotide mutability indexes that are observed in vivo.
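The rule for counting unique mutations can be sketched as a de-duplication keyed on the source (Ramos clone or mouse), the position, and the base change. The record layout and identifiers below are assumptions made for illustration, not the format of the original data.

```python
def unique_mutations(records):
    """Keep one copy of each (source, position, ref_base, new_base) record.

    The same base change seen repeatedly within one clone or mouse is
    counted once; the same change in a different source is kept.
    """
    seen = set()
    unique = []
    for record in records:
        key = tuple(record)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical records: (source_id, position, ref_base, new_base).
records = [
    ("clone_A", 41, "C", "T"),
    ("clone_A", 41, "C", "T"),   # repeat within the same clone, dropped
    ("clone_A", 77, "G", "A"),
    ("clone_B", 41, "C", "T"),   # same change in another clone, kept
]
tabulated = unique_mutations(records)
```

Counting each change once per source avoids inflating a motif's index when one early mutation is inherited by many sequences from the same clone or animal.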

The data presented in this report show that hAID has a natural specificity for WRC motifs, which is unlikely to depend on other factors such as the recently identified RPA co-factor for AID (Chaudhuri et al. 2004), since this sequence specificity was achieved with purified protein. Importantly, the mutability index at all trinucleotide combinations for purified hAID from bacterially expressed His-hAID and baculovirus-expressed GST-hAID matches that observed in vivo, providing strong support for the DNA deamination model, which states that AID initiates SHM and CSR by cytidine deamination of DNA within V regions or switch regions, respectively (Di Noia and Neuberger 2002; Martin et al. 2002; Petersen-Mahrt et al. 2002; Rada et al. 2002). Because the base excision repair pathways initiated by UNG and the mismatch repair pathway initiated by Msh2/6 heterodimers are mutagenic during SHM (Bardwell et al. 2004; Martin et al. 2003; Phung et al. 1998; Rada et al. 1998, 2002; Wiesendanger et al. 2000), removal of these mutagenic pathways has revealed a cytidine deamination-like mutagenic process. This is because SHM in ung−/−msh2−/− mice proceeds solely through transition mutations at G/C basepairs (Rada et al. 2004), implying that the dU:dG mismatch produced by a cytidine deamination is being replicated and is not being recognized by DNA repair pathways in B cells. Thus, the remaining mutations in ung−/−msh2−/− mice are likely to reflect the sequence specificity of the mutator in vivo. This is also likely to be the case in Ramos cells, since most mutations are transition mutations at G/C basepairs (Martin et al. 2002; Sale and Neuberger 1998; Zhang et al. 2001). Indeed, the trinucleotide mutability indexes in Ramos cells and in ung−/−msh2−/− mice correspond to those of purified His-hAID and GST-hAID in vitro, supporting the notion that AID is the mutator that initiates SHM and CSR by directly deaminating DNA sequences within immunoglobulin genes.