Introduction

Supramolecular protein assemblies are involved in many of the key processes that occur in living organisms. Characterizing the structure, dynamics and interactions of these machineries is therefore a critical step in our attempts to understand the mechanisms of life. Recently, a number of groundbreaking studies have demonstrated the feasibility of using solution nuclear magnetic resonance (NMR) spectroscopy to analyze the function, local structure, intermolecular interactions and dynamics of large protein assemblies of up to 1 MDa (Fiaux et al. 2002; Sprangers et al. 2007; Sprangers and Kay 2007; Gelis et al. 2007; Turano et al. 2010; Religa et al. 2010; Ruschak et al. 2010b). In addition to ever-improving spectrometer performance, progress has been driven by the development of protocols for the selective protonation of methyl groups in perdeuterated proteins (Gardner and Kay 1997; Tugarinov and Kay 2004; Lichtenecker et al. 2004; Fischer et al. 2007; Isaacson et al. 2007; Ayala et al. 2009; Gans et al. 2010; Ruschak et al. 2010a) and transverse relaxation-optimized methyl spectroscopy: methyl-TROSY (Tugarinov et al. 2003; Amero et al. 2009). However, despite numerous technical and methodological advances, the process of obtaining sequence-specific assignments—a pre-requisite for analyzing a variety of NMR-derived data—of large proteins remains a considerable challenge. NMR experiments designed for resonance assignment generally utilize through-bond correlations (Tugarinov et al. 2002; Tugarinov and Kay 2003) and cease to be applicable to soluble proteins over 100 kDa. As a result of this size limitation, a protein engineering strategy is often applied to large multimeric proteins in which resonance assignments obtained from smaller fragments of the target protein are transferred to the full-size complex (Gelis et al. 2007; Velyvis et al. 2009). 
While this “divide-and-conquer” approach is attractive under certain circumstances, in practice it can be time-consuming and laborious and may not even be applicable to the system of interest. More recently, state-of-the-art solution and solid-state 13C-NMR spectroscopy have been combined to report atomic resolution information of large multimeric protein assemblies (Turano et al. 2010).

Here we report a fast, efficient and wide-ranging method for assigning the resonance frequencies of each methyl probe within the context of a biologically-relevant full-size protein assembly. The method combines automated molecular biology techniques, small-scale parallel-preparation of residue-type-specifically isotope-labeled samples and sensitivity-optimized NMR experiments. Site-directed mutagenesis is used to individually “turn off” the NMR signal of each methyl-containing amino acid in the target protein and thereby provide a sequence-specific resonance assignment. As this approach does not rely on conventional through-bond triple-resonance NMR experiments, it is considerably less sensitive to the size of the system. The protocol we have outlined is systematic, fast, cost-effective and user-friendly and as such promises to be widely applicable. We demonstrate the utility of our strategy by characterizing inhibitor binding sites in the 468-kDa PhTET2 aminopeptidase complex from Pyrococcus horikoshii (Dura et al. 2005).

Materials and methods

Automatic cloning protocol

Constructs carrying single point mutations were generated using the RoBioMol platform at the Institut de Biologie Structurale, Jean-Pierre Ebel. Thirty-four isoleucine-to-leucine and thirty alanine-to-valine single point mutations were produced using an automated PCR-based protocol adapted from the QuikChange site-directed mutagenesis method (Stratagene). PCR amplification was performed with Phusion Hot Start enzyme (Finnzymes) using the expression plasmid pET41c-PhTET2 as template and the specific mutagenic primers containing the sequence for the InFusion cloning. Products were purified using a Nucleofast plate (Macherey–Nagel) and digested with Dpn I. Final mutations were selected by transformation and verified by sequencing.

Parallel small-scale expression of U-[2H, 15N, 12C], Ile-δ1-[13CH3] and U-[2H, 15N, 12C], Ala-β-[13CH3] proteins

pET41c-Ile/Leu-mut-PhTET2 plasmids were transformed into Escherichia coli BL21-CodonPlus(DE3)-RIL (Stratagene). The cells were grown in shaker incubators in twenty 500 mL baffled flasks at 37°C. The cells were adapted to growth in D2O-based media in two stages. The final 50 mL M9 culture was prepared with 99.85% D2O (Eurisotop) containing 1 g/L of 15ND4Cl, and 3 g/L of d-glucose-d7 (Isotec) as the sole nitrogen and carbon sources, supplemented with 30 μg/L of kanamycin and 34 μg/L of chloramphenicol. When the culture reached an OD600 of approximately 0.8, one of two precursor mixtures was added: either (A) 80 mg/L of 2-keto-4-13C-3,3-d2-butyrate (Ile-δ1 labeling; Gardner and Kay 1997); or (B) 800 mg/L of 2-(S)-2-(2H)-3-(13C)-Alanine (CortecNet) with 60 mg/L of isoleucine-d10 (Cambridge Isotope Laboratories, Inc.) and 200 mg/L of α-ketoisovalerate-d7 (CDN Isotopes Inc) (Ala-β labeling; Ayala et al. 2009). Protein expression was induced 1 h later using isopropyl β-D-thiogalactopyranoside to a final concentration of 0.1 mM. After 4 h, the cells were harvested by centrifugation.

Protein purification

Cell pellets were suspended in 5 mL of lysis buffer [50 mM Tris–HCl, 150 mM NaCl, 0.1% Triton X-100, pH 8.0], with 1.25 mg of lysozyme (Euromedex), 0.25 mg of DNase I grade II (Roche), 1 mg of RNase (Roche), 5 mg of Pefabloc SC (Roche) and 50 μL of 2 M MgSO4. Cell disruption was achieved by sonication with a Branson sonifier 150 at 4°C. Two 30-s bursts at intensity 10, separated by a 30-s pause, were employed. The crude extract was heated at 85°C for 15 min before clarification by centrifugation at 17,000 g for 1 h at 4°C. The supernatant was dialyzed overnight against 20 mM Tris–HCl, 150 mM NaCl, pH 7.5, and the resulting extract was injected onto a Resource Q 1 mL column (GE Healthcare) previously equilibrated with dialysis buffer. The protein was eluted at 4 mL/min with a linear NaCl gradient from 0.15 to 0.4 M in 20 mL. The chromatographic separation was performed using an ÄKTApurifier system (Amersham Biosciences) and the absorbance was monitored at 280 and 254 nm (A280 and A254, respectively). Fractions of 0.667 mL were collected, and the four fractions with the highest A280/A254 ratio were selected and pooled.
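The fraction-pooling rule described above (keep the four fractions with the highest A280/A254 ratio) can be sketched in a few lines of Python. This is only an illustration: the function name and the absorbance readings are hypothetical and were not taken from the chromatograms of this study.

```python
def select_fractions(a280, a254, n_keep=4):
    """Return the indices (in elution order) of the n_keep fractions
    with the highest A280/A254 ratio. Fractions with zero A254 are skipped."""
    ratios = []
    for index, (abs280, abs254) in enumerate(zip(a280, a254)):
        if abs254 > 0:
            ratios.append((abs280 / abs254, index))
    ratios.sort(reverse=True)  # highest ratio first
    return sorted(index for _, index in ratios[:n_keep])


# Hypothetical absorbance readings for seven consecutive fractions.
a280 = [0.05, 0.20, 0.80, 1.20, 1.10, 0.60, 0.10]
a254 = [0.05, 0.15, 0.45, 0.60, 0.58, 0.40, 0.10]

selected = select_fractions(a280, a254)
print(selected)  # indices of the fractions to pool
```

A high A280/A254 ratio is used here as a simple proxy for protein-rich fractions with little nucleic acid contamination, which is presumably why the ratio rather than A280 alone was chosen.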

Mutant purity and oligomerization state were determined by loading 40 μg of purified protein onto a Superose 6 10/300 GL column (GE Healthcare) equilibrated and run with 20 mM Tris–HCl, 150 mM NaCl, pH 7.5. All samples were more than 95% pure according to the A280. Aminopeptidase activity was measured following the method described by Durá et al. (2005) using leucyl-4-nitroanilide as a substrate. Specific activity of the purified mutants ranged from 41 to 95% of that of wild-type PhTET2.

For NMR analysis, the buffer of the protein samples was first exchanged to 50 mM NaCl, 20 mM Tris-DCl, pH 7.4, in 100% D2O by three concentration/dilution cycles in a 30-kDa membrane Amicon Ultra-15 Centrifugal Filter (Millipore). The samples were concentrated to 250 μL in 5 mm Shigemi microtubes. A few resonances of the native PhTET2 particle overexpressed in M9/D2O medium display line broadening, which could be due to heterogeneity in the occupancies of divalent cation sites. After assignment of the native particle under the conditions described above, EDTA was added to the protein samples to a final concentration of 5 mM. The addition of EDTA substantially improves the quality of NMR spectra (Fig. 1 b, c) with little or no effect on the position of individual methyl resonances, confirming that the line broadening observed in the particle purified from E. coli culture was most likely due to heterogeneity in the occupancies of the divalent cation sites present in the purified protein. At the high concentrations used for NMR spectroscopy (2–6 mg/mL), the apo-PhTET2 remains in a dodecameric state. Assignments for both the native and apo forms of the PhTET2 particle are reported in Tables S1 and S2.

Fig. 1

The principle of SeSAM. a Schematic illustration of the parallel, mutation-based NMR assignment strategy. b–g Examples of spectra of mutants used to assign individual isoleucine correlations (residues 79, 162, 226, 260, 286, 348) of the 468 kDa PhTET2. SOFAST-methyl-TROSY spectra displayed in this figure were recorded using samples of U-[15N,12C,2H], [13CH3]Ile-δ1 labeled mutant PhTET2 protein (red). Each spectrum extract is overlaid with the reference spectrum of the native particle (black). The assignment inferred for the missing resonance in the mutant spectrum is indicated. SOFAST-methyl-TROSY spectra of the PhTET2 mutants were recorded in less than 1 h per mutant. h–j Assignment of the δ1 resonance of isoleucine 185. As the I185L construct did not express, secondary chemical shift perturbations were exploited to unambiguously assign this resonance. h Expansion of the 3D structure of PhTET2 displaying the I185 side chain and the neighboring alanine residues, A180 and A186. i and j Secondary chemical shift perturbations of the I185 resonance observed in spectra of the PhTET2 mutants A180V and A186V, respectively

NMR spectroscopy

All NMR spectra were recorded on a Varian Inova spectrometer operating at a proton frequency of 800 MHz equipped with a cryogenically-cooled triple resonance pulsed field gradient probe head. Taking advantage of the thermostability of PhTET2, NMR data were recorded at 50°C. 1H-13C SOFAST-methyl-TROSY (Amero et al. 2009) spectra were recorded with 64 complex data points in the indirect dimension. The flip angle of the proton excitation pulse was set to 30° and the recycling delay was optimized to 0.4 s. The length of each NMR experiment was adjusted depending on the final concentration of purified protein. All data were processed and analyzed with NMRPipe (Delaglio et al. 1995) and CARA (Keller 2004).

Results

SeSAM strategy

The general protocol for Sequence-Specific Assignment of methyl groups by Mutagenesis (or SeSAM) is shown in Fig. 1a. The objective is to assign each signal observed in a two-dimensional (2D) 1H,13C-NMR spectrum of a specifically methyl-protonated perdeuterated protein to the correct residue in the sequence. In the first step each methyl-containing residue in the target sequence is mutated, on a site-by-site basis, to a similar methyl-containing amino acid, e.g. I→L, A→V, etc. The necessary library of conservative single-site mutants can be efficiently prepared using an automated molecular biology platform or, alternatively, can be purchased commercially. Each mutant construct is expressed on a small scale using fully perdeuterated expression media supplemented with isotope-labeled metabolic precursors designed for the specific protonation of a single class of methyl group (Tugarinov et al. 2006; Ruschak and Kay 2010). Based on the overexpression yield of the wild-type protein, the scale of the culture is adjusted to ensure an average yield of 25 nmol of monomer for each purified protein of the library.
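As an illustration of the first step, the list of conservative single-site mutants can be enumerated directly from the protein sequence. The following Python sketch is not part of the automated cloning workflow used here; the function name and the short toy sequence are hypothetical, and only the I→L and A→V substitution rules are taken from the text.

```python
# Conservative substitutions used in this implementation of SeSAM.
SUBSTITUTION = {"I": "L", "A": "V"}


def mutant_library(sequence, substitution=SUBSTITUTION):
    """List one single-site mutant (e.g. 'I2L') for every residue in
    `sequence` that has an entry in `substitution`. Numbering starts at 1."""
    mutants = []
    for position, residue in enumerate(sequence, start=1):
        if residue in substitution:
            mutants.append(f"{residue}{position}{substitution[residue]}")
    return mutants


# Toy sequence for illustration only; this is NOT the PhTET2 sequence.
toy_sequence = "MIAKGIA"
library = mutant_library(toy_sequence)
print(library)
```

Applied to the real PhTET2 sequence, such an enumeration would be expected to return 64 entries (34 isoleucines and 30 alanines per subunit).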

A conservative mutation of one methyl-containing residue to another non-labeled one (e.g. I→L) causes the NMR correlation of that methyl group to disappear from NMR spectra recorded on a specifically methyl-labeled sample. Sequence-specific assignments of each NMR signal can be inferred by comparing the 2D 1H-13C correlation spectrum recorded for each member of the library of mutants with a spectrum recorded for the wild-type particle (Fig. 1b–g). NMR spectra can be acquired using the SOFAST-methyl-TROSY pulse sequence (Amero et al. 2009). This experiment decreases inter-scan delay times with a concomitant increase of sensitivity per unit of time of ca. 40%. Using this pulse sequence in combination with isotope-labeling schemes optimized for large protein assemblies allows assignment-quality spectra of supramolecular protein assemblies to be recorded in only a few minutes with ca. 1 mg of sample (ca. 25 nmol of monomer; Fig. 1b–j). Alternatively, if the amount of protein is a limiting factor, good-quality NMR spectra can still be recorded overnight with as little as 125 μg of protein (i.e. 3 nmol of monomer; Supporting Figure S1).
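The sample amounts quoted above follow from the subunit mass. A minimal Python check is given below, assuming a monomer mass of about 39 kDa (468 kDa divided by the twelve subunits of the dodecamer); the function name is hypothetical.

```python
MONOMER_MASS_DA = 39_000  # assumed subunit mass: 468 kDa / 12 subunits


def nmol_of_monomer(mass_mg, monomer_mass_da=MONOMER_MASS_DA):
    """Convert a protein mass in mg to nmol of monomer."""
    grams = mass_mg * 1e-3
    moles = grams / monomer_mass_da
    return moles * 1e9


for mass_mg in (1.0, 0.125):
    print(f"{mass_mg} mg -> {nmol_of_monomer(mass_mg):.1f} nmol of monomer")
```

With this assumed mass, 1 mg corresponds to roughly 25 to 26 nmol and 125 μg to roughly 3 nmol of monomer, consistent with the rounded values given in the text.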

Assignment of Ala-β and Ile-δ1 methyl of a 468-kDa protein assembly

The SeSAM methodology was used for the assignment of Ile-δ1 and Ala-β methyl groups of the PhTET2 protein, a multimeric aminopeptidase involved in polypeptide degradation in the hyperthermophilic archaeon Pyrococcus horikoshii. PhTET2 assembles to form a 468-kDa homododecameric particle with a tetrahedral shape (Dura et al. 2005; Borissenko and Groll 2005). Each of the twelve 39-kDa subunits contains 34 isoleucines and 30 alanines. High quality 2D methyl-TROSY spectra recorded on the full-size, wild-type, [1H,13C]Ile-δ1- or [1H,13C]Ala-β-labeled PhTET2 assembly contain all of the expected methyl resonances (Fig. 2). In this implementation of SeSAM, all isoleucine and alanine residues were individually mutated to leucine or valine, respectively (Fig. 1), which were not specifically labeled and therefore gave no signals in 1H,13C-NMR spectra. The mutant library was prepared in parallel using small-scale deuterated M9 expression cultures supplemented with the necessary precursors for [1H,13C]Ile-δ1- or [1H,13C]Ala-β-labeling (Gardner and Kay 1997; Ayala et al. 2009). Experimental acquisition times for each SOFAST-methyl-TROSY spectrum ranged from a few minutes to 1 h per mutant, depending on the final concentration of purified protein. With the exception of a single construct (I185L), which did not express, it was possible to produce between 0.5 and 1.5 mg of each mutant in 50 mL of perdeuterated M9 expression medium (i.e. total amount of monomer: 13–38 nmol). Across the library of mutants, little variation between wild-type and mutant spectra was observed, indicating that the single residue changes did not affect the overall structure or oligomerization of PhTET2. This conclusion was further supported by size exclusion chromatography and enzymatic tests, which demonstrated that all mutants formed dodecameric active particles (data not shown).

Fig. 2

a Quaternary structure of the 468 kDa dodecameric PhTET2 particle. On the right is an expansion of the tertiary structure of the monomeric subunit, with the positions of the methyl probes depicted by blue spheres for isoleucine-δ1 and red spheres for alanine methyl groups. b and c SOFAST-methyl-TROSY spectra annotated with SeSAM-inferred residue-specific assignments of isoleucine-δ1 (b) and alanine-β (c) methyl groups in wild-type PhTET2 aminopeptidase

The sequence-specific identity of a mutant residue was ascertained by superposing the spectrum of the native PhTET2 protein with the spectrum of a mutant. In approximately 50% of cases the sole difference between mutant and reference spectra was a single missing cross-peak in the spectrum of the mutant (Fig. 1b–g). In such instances, the missing cross-peak could be unambiguously assigned to the methyl group of the mutated residue.
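The comparison described above can be expressed as a simple peak-list operation: a reference cross-peak with no counterpart in the mutant spectrum is a candidate for the mutated residue. The Python sketch below is illustrative only; the peak positions, the matching tolerances (0.02 ppm in 1H, 0.2 ppm in 13C) and the function name are assumptions, not values used in this study.

```python
def missing_peaks(reference, mutant, tol_h=0.02, tol_c=0.2):
    """Return the labels of reference peaks that have no mutant peak within
    the given 1H and 13C tolerances (ppm).

    reference: dict mapping a peak label to (1H ppm, 13C ppm)
    mutant:    list of (1H ppm, 13C ppm) tuples picked in the mutant spectrum
    """
    missing = []
    for label, (h_ref, c_ref) in reference.items():
        matched = any(
            abs(h_ref - h_mut) <= tol_h and abs(c_ref - c_mut) <= tol_c
            for h_mut, c_mut in mutant
        )
        if not matched:
            missing.append(label)
    return missing


# Hypothetical peak lists for a wild-type reference and one I->L mutant.
reference = {"peak1": (0.85, 13.2), "peak2": (0.62, 11.8), "peak3": (0.40, 14.5)}
mutant = [(0.851, 13.21), (0.399, 14.48)]

candidates = missing_peaks(reference, mutant)
print(candidates)
```

When exactly one label is returned, that peak can be attributed to the mutated residue; several returned labels point to secondary chemical shift changes, which call for comparison across the whole mutant library.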

In the remaining spectra, the disappearance of the signal of interest was accompanied by small changes in the chemical shift of a few additional correlations (Supporting Figure S3). This phenomenon is to be expected and has been previously observed by Kay and co-workers (Sprangers et al. 2005). Peak movements that do not directly concern the mutated resonance can complicate the process of obtaining a sequence-specific assignment from a single experiment, particularly when the target correlation resonates in an overcrowded region of the spectrum. Nonetheless, in the case of PhTET2 it was relatively easy to resolve such ambiguities by considering all the NMR spectra recorded for the full mutant library as well as the unambiguous assignments already determined. Secondary chemical shift perturbations reflect modifications in the local electronic environment that result from the mutation, and these changes therefore provide complementary information that can be used to cross-validate potentially ambiguous assignments. This principle enabled the assignment of the δ1 methyl resonance of I185 which, after the assignment of all other resonances, was assigned to a weak correlation in the wild-type spectrum that overlapped with I286. This weak resonance was clearly perturbed by the mutation of neighboring residues, such as A180 and A186 (Fig. 1h–j). Of the 64 methyl groups labeled, only the assignment of the δ1-methyl resonance of I72 remains ambiguous. Nevertheless, quantitative analysis of the relative peak intensities indicates that this resonance is located in the center of the spectrum, overlapping with 6 other intense peaks (Supporting Table S2).

Application to the characterization of an inhibitor binding site

NMR spectroscopy is a powerful technique for analyzing intermolecular binding surfaces, particularly in cases where sequence-specific resonance assignments are known. Chemical shift changes were observed in SOFAST-methyl-TROSY spectra of [1H,13C]Ile-δ1- or [1H,13C]Ala-β-labeled PhTET2 upon addition of an inhibitor, amastatin (Fig. 3). Using the SeSAM-derived assignments of isoleucine δ1- and alanine β-methyl groups of free PhTET2, it was possible to identify which residues are perturbed by the interaction with amastatin and to map these sites onto the 3D structure (Fig. 3). The residues identified by NMR chemical shift mapping are localized to the inner channels of the particle and are in excellent agreement with the surfaces determined in the X-ray structure (Borissenko and Groll 2005).
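Once assignments are available, chemical shift mapping reduces to computing a per-residue perturbation between the free and inhibitor-bound peak positions and ranking the residues. The Python sketch below uses a commonly employed combined 1H/13C perturbation; the weighting factor of 0.25 for the 13C dimension, the residue labels and the peak positions are all hypothetical, and the exact metric used for Fig. 3 is not specified in the text.

```python
import math


def combined_csp(free, bound, carbon_weight=0.25):
    """Combined 1H/13C chemical shift perturbation (ppm) between two peak
    positions given as (1H ppm, 13C ppm). carbon_weight is an assumed
    scaling factor that puts the 13C shift on a 1H-like scale."""
    delta_h = bound[0] - free[0]
    delta_c = bound[1] - free[1]
    return math.sqrt(delta_h ** 2 + (carbon_weight * delta_c) ** 2)


def rank_perturbations(free_peaks, bound_peaks, carbon_weight=0.25):
    """Return (residue, CSP) pairs sorted from most to least perturbed,
    for residues present in both peak tables."""
    csp = {
        residue: combined_csp(free_peaks[residue], bound_peaks[residue], carbon_weight)
        for residue in free_peaks
        if residue in bound_peaks
    }
    return sorted(csp.items(), key=lambda item: item[1], reverse=True)


# Hypothetical assigned peak positions (residue labels are placeholders).
free_peaks = {"I10": (0.80, 13.0), "A20": (1.40, 18.0), "I30": (0.60, 12.0)}
bound_peaks = {"I10": (0.80, 13.0), "A20": (1.46, 18.32), "I30": (0.61, 12.0)}

ranking = rank_perturbations(free_peaks, bound_peaks)
for residue, value in ranking:
    print(f"{residue}: {value:.3f} ppm")
```

The ranked values can then be normalized to the largest perturbation and mapped onto the structure as a color gradient.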

Fig. 3

Characterization of the amastatin binding surface of PhTET2 by NMR spectroscopy. a and b Overlays of regions of 2D 1H-13C SOFAST-methyl-TROSY spectra (Amero et al. 2009) of free (black) and amastatin-bound (red) PhTET2 particle (Borissenko and Groll 2005). Analysis of the site-specific chemical shift perturbations reveals the inhibitor binding sites. Spectra corresponding to Ala-β and Ile-δ1 methyl groups are shown in a and b, respectively. c Binding-induced chemical shift perturbations mapped onto the outer and inner surface of the PhTET2 dodecamer in complex with amastatin, which is shown in yellow (PDB code: 1Y0Y). No chemical shift perturbations are detected on the outer surface of the particle. The molecular surface is color coded according to chemical shift perturbation via a linear gradient from blue (no change detected) to red (maximal perturbation). d Expansion of the region of PhTET2 involved in the interaction with amastatin. Arrows indicate the locations of a set of residues perturbed upon binding of the inhibitor. Side chains of residues 99, 100, 180 and 186 are buried and are not colored. Residues belonging to the neighboring subunits are indicated with an asterisk

Discussion

The use of site-directed mutagenesis in NMR signal assignment dates back to the early days of biomolecular NMR spectroscopy (Jarema et al. 1981; Gronenborn et al. 1986; Bycroft and Fersht 1988), before modern multidimensional through-bond correlation experiments were available. Mutagenesis-based assignment approaches have more recently been applied to very high molecular weight proteins. However, unambiguous assignment of the individually mutated methyl groups proved difficult due to spectral complications caused by secondary chemical shift changes (Sprangers et al. 2005). We have demonstrated here that secondary chemical shift perturbations can be an important source of information that can be used to confirm the proposed assignment. The key point is that the information provided by secondary chemical shifts only becomes interpretable when data from a full library of methyl group mutants is considered. Any ambiguous assignments can therefore be readily cross-validated using structurally-close, unambiguously-assigned resonances. Using an incomplete library of mutants would not permit the same level of confidence in the final assignments.

Through the use of a simple and systematic approach it was possible to obtain complete resonance assignments of all the visible resonances in 2D methyl NMR spectra of a selectively-labeled 468-kDa protein multimer without the need to disrupt the overall structure of the native complex. This is the first time such a feat has been achieved for a protein of this size. The total time taken to implement the SeSAM procedure on PhTET2, including the parallel small-scale expression and purification of a library of 64 isotope-labeled mutants, was approximately one month. In addition, the total experimental NMR time required was 3 days. For the complete assignment of the Ile-δ1 and Ala-β resonances in the 468 kDa PhTET2 particle, a total of 3.2 L of perdeuterated culture medium was used (corresponding to a total cost in isotopically labeled material of ca. 2000 €). These values represent a considerable reduction of the cost, manpower and spectrometer time required compared to alternative approaches (Gelis et al. 2007; Velyvis et al. 2009; Turano et al. 2010). The rate-limiting step in the example described here was protein purification, as PhTET2 was expressed without any affinity purification tag. The use of a purification tag designed for a single-step purification would serve to substantially accelerate sample production and open the possibility of fully automating the SeSAM procedure (Rasia et al. 2009).
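The resource figures quoted above can be cross-checked with a short calculation. The Python sketch below assumes one 50 mL culture per mutant and, purely for illustration, an even split of the isotope cost across the library; the function name is hypothetical.

```python
def library_totals(n_mutants=64, culture_ml=50.0, isotope_cost_eur=2000.0):
    """Return (total culture volume in L, isotope cost per mutant in EUR),
    assuming one culture per mutant and an even split of the cost."""
    total_volume_l = n_mutants * culture_ml / 1000.0
    cost_per_mutant_eur = isotope_cost_eur / n_mutants
    return total_volume_l, cost_per_mutant_eur


total_volume_l, cost_per_mutant_eur = library_totals()
print(f"{total_volume_l:.1f} L of culture, ca. {cost_per_mutant_eur:.0f} EUR per mutant")
```

Sixty-four cultures of 50 mL give the 3.2 L stated in the text, which under the even-split assumption corresponds to roughly 30 € of isotope-labeled material per mutant.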

Conclusion

In summary, we have devised and implemented SeSAM, a powerful and general strategy for sequence-specific NMR assignment of methyl resonances in large protein assemblies. The protocol presented is simple and relies on relatively straightforward, accessible and inexpensive laboratory and NMR techniques. A key attraction of this approach is that resonance assignments of large proteins and biomolecular complexes can be obtained directly from the full-size protein or complex. Using SeSAM it is not necessary to dissect the complex into smaller, more NMR-friendly components or to obtain microcrystals of labeled proteins for combined solution- and solid-state analysis. SeSAM utilizes small-volume cultures, which considerably lowers sample preparation costs owing to the reduced requirement for expensive isotope-labeled materials. Furthermore, the use of a small-scale protocol allows multiple targets to be efficiently and conveniently isotope-labeled, purified and analyzed by 2D NMR spectroscopy in parallel. The procedure outlined above represents a simple, fast and cost-effective way to obtain residue-specific assignment of NMR signals of high molecular weight proteins. As demonstrated here, the SeSAM strategy can alleviate the assignment bottleneck and open the possibility of performing atomic resolution analyses of structural changes, dynamics and interactions of supramolecular protein complexes using simple and accessible solution NMR spectroscopy techniques.