Biological context

The green fluorescent protein (GFP) from the jellyfish Aequorea victoria is one of the most commonly used biomarkers in biological research, a contribution recognised by the Nobel Prize in Chemistry 2008 (Shaner et al. 2007; Tsien 1998). It features the spontaneous formation of a chromophore through an auto-catalytic cyclisation and subsequent oxidation of the backbone of residues 65–67, without the need for enzymes or added cofactors. This makes GFP an ideal fusion tag for a myriad of applications, such as monitoring protein expression and localisation and mapping interactions in vivo and in vitro, with the fluorescence of GFP serving as a reporter. GFP consists of 238 residues that fold into an 11-stranded β-barrel, which wraps around a central α-helix, where the chromophore is located and buried in the hydrophobic core, to form a β-can structure. Over the last decade, an array of GFP variants has been identified from natural sources (primarily marine animals) or engineered by mutagenesis. Through alterations of the side-chain composition of the chromophore itself, i.e., residues 65–67, as well as that of the surrounding residues, the reported GFP variants cover a broad spectrum of fluorescence excitation and emission wavelengths ranging from blue to red, which makes them versatile for various applications. While wild-type GFP requires a few hours for the chromophore to mature and its spectral properties are highly sensitive to experimental conditions, e.g., pH and salt content, a yellow fluorescent protein (YFP), Venus, has been developed that shows rapid chromophore maturation, a higher quantum yield, and insensitivity to environmental changes (Nagai et al. 2002). These desirable properties have made Venus an ideal fluorescence probe for FRET (Shimozono and Miyawaki 2008) and single-molecule studies (Yu et al. 2006).

The usefulness of GFP variants as biomarkers relies on the robust folding capacity of these proteins. Structure-based rational design is a key factor in further improving the folding properties as well as the photochemistry and photophysics of GFP variants (Jackson et al. 2006). While more than 100 crystal structures of GFP variants have been deposited in the Protein Data Bank to date, including that of Venus (Rekas et al. 2002), NMR assignment data are, at present, only available at the backbone level for a single GFP variant, GFPuv (Georgescu et al. 2003; Khan et al. 2003). The availability of backbone assignments has enabled the identification of highly stable structural elements in GFPuv at a residue-specific level under native and denaturing conditions (Huang et al. 2007). Dynamics studies of the structural perturbations in GFPuv resulting from point mutations have also been reported (Seifert et al. 2003). Moreover, a recent report that employed 1D 1H NMR found intriguing indications of a folding intermediate of a GFP variant by following the upfield-shifted methyl proton resonances, although no side-chain assignments of GFP are currently available (Andrews et al. 2007).

We report here the backbone and side-chain 1H, 15N and 13C assignments of YFP Venus with an emphasis on the I/L/V side-chain methyl groups. These assignments, in particular of the side-chain reporters, should enable us to obtain structural insights at an atomic resolution to delineate the complex folding processes and the common features within the GFP family.

Methods and experiments

The gene encoding YFP Venus was a kind gift from Prof. Atsushi Miyawaki at the Brain Science Institute, RIKEN, Japan, and was sub-cloned into a pET21 vector with a hexahistidine tag at the N-terminus. Uniformly 1H, 15N (>95%), 13C (>95%) and 2H (~70%), 15N (>95%), 13C (>95%) labelled proteins were expressed in the E. coli strain BL21 (DE3) and purified using a Ni-column (Ni-NTA Superflow; Qiagen) followed by gel filtration chromatography (Superdex 75; GE). Selective 13C labelling, [U-15N, 13C-aa], was achieved by addition of 13C/15N (>98%; Spectral Isotope) labelled amino acids to 0.5 l of 15N-labelled minimal media (1 mM final concentration for each amino acid type) 20 min prior to IPTG induction. The cell cultures were harvested after 1.5 and 3 h of growth at 37°C; prolonged cell growth did not show significant scrambling of the supplemented amino acid types. Selective suppression of 15N labelling, [U-15N, \aa], was achieved by addition of unlabelled amino acids to 15N-labelled minimal media following the same protocol as that for selective 13C labelling. Six amino acid type-specific (un)labelling schemes were prepared, namely [U-15N, 13C-I/F/V/L], [U-15N, 13C-V/L], [U-15N, 13C-F, \K], [U-15N, \M\K\I\L], [U-15N, \M\I] and [U-15N, \K\L], to facilitate residue-specific backbone assignments. The 15N–1H projections of the corresponding HNCO and 13C-coupled {15N–1H} TROSY spectra were compared to identify the sequential (i − 1) and intraresidue correlations of the selectively (un)labelled amino acid types.
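
The reasoning behind the selective 13C labelling can be illustrated with a short Python sketch. It assumes that, in a [U-15N, 13C-aa] sample, the amide of residue i gives an HNCO crosspeak only when residue i − 1 carries 13C, and it treats 13C coupling in the TROSY spectrum as a flag for 13C on residue i itself, which is a simplification. The function name and the seven-residue test sequence are hypothetical and are not taken from Venus.

```python
def predict_label_pattern(sequence, labelled):
    """Predict, for each backbone amide, which correlations a selectively
    13C-labelled sample might show.

    sequence : one-letter amino acid string (hypothetical example, not Venus)
    labelled : set of one-letter codes supplied as 13C/15N-labelled amino acids
    """
    pattern = {}
    for i, residue in enumerate(sequence):
        if i == 0 or residue == "P":
            # No observable backbone amide: N-terminal residue or proline.
            continue
        pattern[i + 1] = {
            "residue": residue,
            # HNCO links the amide of residue i to the C' of residue i - 1.
            "sequential": sequence[i - 1] in labelled,
            # Assumption: 13C coupling in the TROSY flags 13C on residue i.
            "intraresidue": residue in labelled,
        }
    return pattern


# Hypothetical peptide with a V/L-type labelling scheme.
example = predict_label_pattern("MVLKPFG", labelled={"V", "L"})
for position, info in example.items():
    print(position, info)
```

In this toy case the amide at position 4 (K) would be expected to show only the sequential correlation, because it follows a labelled Leu but is itself unlabelled; patterns of this kind narrow down the residue type of a spin system and of its predecessor.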

Purified protein was then concentrated to ca. 400 μM and buffer-exchanged into Tris-HCl, HEPES, MES or phosphate buffers to yield sample pH values of 6.0, 6.6, 7.6, 8.0 and 9.6, in the presence of 0.002% NaN3. For backbone assignments, TROSY versions of the triple resonance experiments (HNCA, HNCACB, intra-HNCACB, HNCO, HN(CA)CO) were recorded using a 2H(70%)/13C/15N labelled sample. Experiments were performed at 37°C on a Bruker Avance 700 MHz spectrometer equipped with a cryogenic triple resonance probe (Bruker BioSpin). The chemical shifts of individual spin systems (HN, N, Cα, Cβ and C′) were collected manually, and the backbone resonance assignments were achieved iteratively through a combination of computer-aided automated assignment using the programme MARS (Jung and Zweckstetter 2004) and visual inspection. For side-chain resonance assignments, 3D 1H–15N NOESY-HSQC, 3D 1H–13C NOESY-HSQC, 3D HcCH-COSY, 3D HcCH-TOCSY and 3D hCCH-COSY spectra were recorded using the [U-15N, 13C-I/V/L] labelled sample, and an HBHA(CO)NH spectrum was recorded using the 2H(70%)/13C/15N labelled sample; the resonances were assigned manually. All NMR data were processed and analysed using the TopSpin (Bruker BioSpin), NMRPipe (Delaglio et al. 1995) and Sparky (Goddard and Kneller) software packages.
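
The principle of linking spin systems sequentially can be sketched as follows. This is a deliberately simplified illustration and not the MARS algorithm: it only matches the Cα and Cβ shifts of the preceding residue (as seen in HNCA/HNCACB-type spectra) against the intraresidue shifts of every other spin system, within an assumed tolerance of 0.2 ppm. The shift values are invented for the example, and glycine (no Cβ), C′ matching and ambiguous matches are ignored.

```python
def find_predecessors(spin_systems, tolerance=0.2):
    """Link spin systems whose inter-residue (i - 1) shifts match another
    system's intraresidue shifts within `tolerance` (ppm).

    spin_systems : dict mapping a name to a dict with keys "CA", "CB"
                   (own residue) and "CA_prev", "CB_prev" (preceding residue)
    """
    links = {}
    for name, system in spin_systems.items():
        candidates = []
        for other_name, other in spin_systems.items():
            if other_name == name:
                continue
            ca_match = abs(system["CA_prev"] - other["CA"]) <= tolerance
            cb_match = abs(system["CB_prev"] - other["CB"]) <= tolerance
            if ca_match and cb_match:
                candidates.append(other_name)
        links[name] = candidates
    return links


# Hypothetical shifts (ppm) for three spin systems; not measured Venus data.
systems = {
    "A": {"CA": 58.1, "CB": 30.2, "CA_prev": 62.4, "CB_prev": 69.8},
    "B": {"CA": 62.5, "CB": 69.7, "CA_prev": 55.0, "CB_prev": 41.3},
    "C": {"CA": 55.1, "CB": 41.2, "CA_prev": 60.0, "CB_prev": 38.0},
}
links = find_predecessors(systems)
print(links)
```

Here system B is the only candidate predecessor of A, and C the only candidate predecessor of B, which orders the fragment as C, B, A. In practice such fragments still have to be mapped onto the protein sequence, which is where residue-type information from selective labelling helps.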

Extent of assignment and data deposition

An array of experimental conditions, including pH values (ranging from 6.0 to 9.6; Fig. 1) and temperatures (ranging from 20 to 47°C; data not shown), was tested to find the optimal condition for assignment, judged by the spectral quality and the number of crosspeaks observed in the {15N–1H} TROSY spectra. Although some crosspeaks exhibit marked changes in chemical shift and intensity under different conditions, the overall number of crosspeaks is essentially the same throughout the wide range of tested conditions (Fig. 1). Overall, broader linewidths are observed at lower pH values and lower temperatures (data not shown). In particular, the NMR sample of Venus begins to show visible signs of aggregation at pH 6.0, consistent with the expected loss of net charge near the theoretical isoelectric point of 5.85. This pH is also close to the pKa of the yellow fluorescence of Venus, 5.8, below which the fluorescence intensity is lost rapidly (Rekas et al. 2002). The backbone assignment of Venus was therefore carried out primarily using data recorded at pH 7.6 and 37°C; 3D HNCA, 3D HNCO and 3D 1H–15N NOESY-HSQC spectra, recorded at pH 6.6 and 37°C, were used to identify or confirm additional assignments. This condition is the closest to those under which Venus is used as a spectroscopic probe in vivo and is therefore the most useful for future structural studies that relate to in vivo data.

Fig. 1

Overlaid 2D {15N–1H} HSQC spectra of Venus recorded at different pH values for optimisation of spectral quality. The spectra were recorded at 37°C using a 700 MHz spectrometer equipped with a cryogenic probe. Weak crosspeaks that are attributed to minor conformations of Gly residues (all 24 Gly have been unambiguously assigned) are indicated by red crosses. Aliased side-chain resonances, which were only observed at lower pH values as a result of protonation, are indicated by asterisks (δ(1H) = 7–8 ppm, δ(15N) ~ 124 ppm)

Following a standard sequential assignment procedure, 90% of all assignable 1HN–15N pairs (201 out of 224) and 86% of all 13C′, 13Cα and 13Cβ resonances (592 out of 690) of Venus have been assigned (Fig. 1). The relatively low level of assignment is in part due to the limited number of backbone amide crosspeaks available for assignment; essentially all the resolved crosspeaks in the {15N–1H} TROSY spectra observed across a wide range of pH and temperature values have been assigned. Structural mapping of the unassigned residues revealed a cluster encompassing parts of strands 3, 7, 8, 10 and 11 (Fig. 2a), forming a surface that largely overlaps with the dimer interface through which Venus self-associates in the crystalline state (Rekas et al. 2002) (Fig. 2b). Under NMR conditions, the hydrodynamic radius (Rh) derived from pulsed-field gradient diffusion measurements is ca. 6-fold larger than that of an immunoglobulin domain (14 kDa), whereas a factor of two is expected for monomeric Venus in solution. The unexpectedly large Rh is in fact concentration-dependent, suggesting that Venus undergoes transient interactions in solution to form not only dimers but also higher-order oligomers, which contribute to the loss of resonances corresponding to the unassigned residues.

Fig. 2

Structural mapping of the extent of backbone assignment. a The crystal structure of Venus (PDB entry 1MYW) is shown in a ribbon representation with unassigned residues coloured in red and labelled with corresponding identities. The chromophore is shown as yellow spheres. b Definition of the dimer interface of Venus in the crystalline state. Residues that are involved in intermolecular contact between the monomers are shown in blue with mesh representing the solvent accessible surface. c The extent of backbone amide assignments. The primary sequence of Venus is shown with the same colouring scheme as in (a). In addition, proline residues are shaded to indicate the locations where sequential connectivities are interrupted

Note that triply labelled NMR samples (2H/13C/15N) with partial deuteration (~70%) were prepared to alleviate the relaxation losses associated with the moderately large size of Venus (27 kDa). This led to a prolonged 15N transverse relaxation time of 42.3 ± 5.9 ms, which is sufficiently long for most triple resonance experiments (unpublished data). Partial (70%) deuteration was necessary in order to retain the signals of some of the highly protected amide groups, whose hydrogen exchange rates are exceedingly low (some have half-lives of several months; unpublished data) even in the presence of chemical denaturants or strong acids; in addition, the refolding of Venus is not efficient. Perdeuteration, which requires back-exchange of the amide protons through cycles of unfolding and refolding, is therefore not desirable.
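
To put the quoted 15N transverse relaxation time into perspective, it can be converted into an approximate linewidth. The sketch below assumes a purely Lorentzian line, for which the full width at half height is 1/(πT2), and neglects field inhomogeneity and any other source of broadening; the function name is illustrative only.

```python
import math


def lorentzian_linewidth_hz(t2_seconds):
    """Full width at half height (Hz) of a Lorentzian line for a given T2 (s)."""
    return 1.0 / (math.pi * t2_seconds)


t2_mean = 42.3e-3  # s, mean 15N T2 quoted in the text
t2_sd = 5.9e-3     # s, spread quoted in the text

linewidth = lorentzian_linewidth_hz(t2_mean)
narrowest = lorentzian_linewidth_hz(t2_mean + t2_sd)
broadest = lorentzian_linewidth_hz(t2_mean - t2_sd)
print(f"{linewidth:.1f} Hz (range {narrowest:.1f} to {broadest:.1f} Hz)")
```

Under these assumptions a T2 of about 42 ms corresponds to a natural 15N linewidth of roughly 7.5 Hz.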

Although Venus and GFPuv are similar in sequence (they differ in 10 out of 238 residues), significant differences can be found between their 2D {15N–1H} HSQC spectra (not shown). Comparison of the Cα chemical shifts of Venus, which report on secondary structure content, with the two sets of previously reported values for GFPuv revealed systematic offsets. We therefore re-referenced the deposited assignments, BMRB entries 5144 (Georgescu et al. 2003) and 5666 (Khan et al. 2003), using the online server SHIFTCOR (http://redpoll.pharmacy.ualberta.ca/shiftcor/ (Zhang et al. 2003)). Corrections of 3.08 and −0.30 ppm, respectively, were applied, yielding two similar overall profiles of the secondary Cα chemical shifts of GFPuv that are in general agreement with that of Venus. The differences in secondary Cα chemical shifts between YFP and GFPuv, ΔΔδ = Δδ_YFP − Δδ_GFP, are −0.26 ± 1.55 and 0.13 ± 1.17 ppm for BMRB entries 5144 and 5666, respectively (Fig. 3). The two GFPuv assignments were obtained under very similar conditions, both in phosphate-buffered saline (pH 7.0 for BMRB 5144 and pH 7.2 for BMRB 5666) and at 310 K. Close inspection of the differences between the two, however, revealed marked deviations of BMRB entry 5144 from Venus (determined at pH 7.4), particularly in the loop regions connecting individual β-strands, suggesting some degree of pH-dependent conformational rearrangement in GFP. Both GFPuv assignments deviate from Venus in the Cα shifts of T38, H169, E172, P192 and L194, all of which are conserved and located in loop regions. Overall, the Cα chemical shift comparison between GFPuv and Venus suggests that the solution structures of the two GFP variants are highly similar and that marginal conformational differences occur only in the loop regions.
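
The comparison described above amounts to a few arithmetic steps, sketched here in Python. The sketch makes several assumptions: the re-referencing correction is added to the deposited shifts, the spread is reported as a sample standard deviation, and the random coil values are supplied by the user. All shift values below are invented for illustration and are not Venus or GFPuv data.

```python
from statistics import mean, stdev


def secondary_shifts(observed, random_coil, offset=0.0):
    """Secondary shift per residue: (observed + offset) - random coil, in ppm."""
    return {
        residue: observed[residue] + offset - random_coil[residue]
        for residue in observed
        if residue in random_coil
    }


def compare(dd_a, dd_b):
    """Mean and sample standard deviation of dd_a - dd_b over shared residues."""
    shared = sorted(set(dd_a) & set(dd_b))
    differences = [dd_a[r] - dd_b[r] for r in shared]
    return mean(differences), stdev(differences)


# Hypothetical Calpha shifts (ppm), keyed by residue number.
random_coil = {10: 56.0, 11: 62.0, 12: 58.0}
yfp_shifts = {10: 55.0, 11: 60.5, 12: 60.0}
gfp_shifts = {10: 55.3, 11: 60.6, 12: 60.5}

dd_yfp = secondary_shifts(yfp_shifts, random_coil)
# Assumed sign convention: the re-referencing correction is added.
dd_gfp = secondary_shifts(gfp_shifts, random_coil, offset=-0.30)

mean_diff, sd_diff = compare(dd_yfp, dd_gfp)
print(f"mean = {mean_diff:.2f} ppm, sd = {sd_diff:.2f} ppm")
```

A mean difference close to zero after re-referencing indicates that any remaining deviations are residue-specific rather than a global referencing offset.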

Fig. 3

Comparison of Cα chemical shifts of Venus and GFPuv. Top panel Secondary Cα chemical shifts of Venus (black bar) and those of GFPuv derived from BMRB entries 5666 (Khan et al. 2003) (filled red circles) and 5144 (Georgescu et al. 2003) (open green squares), Δδ = δ_YFP/GFP − δ_random coil, after re-referencing using SHIFTCOR (Zhang et al. 2003). Sequence-dependent random coil chemical shifts were calculated as described previously (Schwarzinger et al. 2001). Lower panel Differences of secondary Cα chemical shifts between YFP and different GFPuv assignments, ΔΔδ = Δδ_YFP − Δδ_GFP. The sets of residues that form α-helices and β-strands in the crystal structure (PDB entry 1MYW) are indicated by red wiggles and blue arrows at the top of the diagram with labels that are consistent with the nomenclature used in the crystallographic study, and are shaded in dark and light grey boxes, respectively. Residues 65–67, which form the cyclised chromophore, are highlighted in black (Rekas et al. 2002)

In addition to the backbone assignments, we assigned nine out of ten alanine side-chain methyl groups through a combination of HBHA(CO)NH and 3D 1H–15N NOESY-HSQC spectra. Furthermore, 82% (84 out of 102) of all side-chain methyl groups in Ile, Val and Leu were assigned using a [U-15N, 13C-I/F/V/L] labelled sample (Fig. 4); most of the unassigned resonances are located in the loop regions, which exhibit severe spectral overlap. Further studies are underway to investigate in detail the structural and dynamical characteristics of these side-chain methyl groups. The assigned chemical shifts have been deposited in the BMRB under accession number 15826.
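
The assignment statistics quoted in this section can be re-derived from the reported counts with a minimal Python sketch; the dictionary keys are labels chosen here for illustration.

```python
def completeness(assigned, total):
    """Percentage of resonances assigned."""
    return 100.0 * assigned / total


# Counts as reported in the text: (assigned, total).
reported = {
    "backbone 1HN-15N pairs": (201, 224),
    "13C', 13Calpha and 13Cbeta": (592, 690),
    "Ala methyl groups": (9, 10),
    "Ile/Val/Leu methyl groups": (84, 102),
}

percentages = {name: completeness(*counts) for name, counts in reported.items()}
for name, value in percentages.items():
    print(f"{name}: {value:.1f}%")
```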

Fig. 4

Methyl region of the 2D {13C–1H} constant-time HSQC spectrum of Venus recorded at 37°C and at 700 MHz. The assignments for Ile, Leu and Val are coloured in green, red and black, respectively