Introduction

Multi-spanning helical membrane proteins of eukaryotes, especially seven-transmembrane (7TM) helical G-protein coupled receptors (GPCRs), have become very attractive targets for NMR studies (Kim et al. 2009; Tikhonova and Costanzi 2009; Goncalves et al. 2010; Tapaneeyakorn et al. 2010). While GPCRs are extremely important medically, the specific interest of the NMR community is fueled by several additional factors. First, recent progress of X-ray structure determination of GPCRs, even though striking, is limited, especially when it comes to studies of the activated states and dynamics (Kobilka and Schertler 2008; Mustafi and Palczewski 2009; Hanson and Stevens 2009), making the use of complementary techniques such as NMR a must. Next, the ability of both solution and solid-state NMR (SSNMR) to get structural insights (or even full structures) for membrane proteins of this size and architecture has increased dramatically (Kim et al. 2009; Gautier et al. 2010; Renault et al. 2010). Finally, continuous development of new expression systems often allows production of isotope-labeled eukaryotic proteins in milligram quantities sufficient for NMR studies, even though the successful functional overexpression of GPCRs is sporadic (Lundstrom et al. 2006; Sarramegna et al. 2006; McCusker et al. 2007; Takahashi and Shimada 2010).

While a large number of GPCRs could be obtained in the inclusion bodies of E. coli with reasonable yields and at a low cost (Lundstrom et al. 2006), it is often difficult to achieve their proper refolding, membrane insertion, and native-like post-translational modifications (reviewed in (Sarramegna et al. 2006; McCusker et al. 2007; Kim et al. 2009; Tapaneeyakorn et al. 2010)). In addition to expression in the inclusion bodies, several GPCRs could be inserted in the inner membrane of E. coli, either as fusion proteins or at low temperature (Grisshammer 2009; Tian et al. 2005; Berger et al. 2010). Even though some NMR studies were conducted on isotope-labeled GPCR ligands bound to E. coli-expressed natural abundance proteins (e.g., (Luca et al. 2003)), there are very few cases where isotope-labeling of a whole GPCR in E. coli resulted in good quality spectra suitable for structural studies. For example, solution spectra of vasopressin V2 receptor (Tian et al. 2005), kappa opioid receptor (Kim et al. 2009), and Y2 receptor (Schmidt et al. 2010), missed a large part of the expected resonances, indicating possible problems with native folding, complicated heterogeneous dynamics, and proton back-exchange. At the same time, SSNMR spectra of uniformly labeled lipid-reconstituted Y2 and cannabinoid receptors (Schmidt et al. 2010; Berger et al. 2010) gave good dispersion but poor spectral resolution, even though selective 15N labeling in cannabinoid and chemokine CXCR1 receptors gave promising results (Park et al. 2006; Berger et al. 2010).

An alternative is to express isotope-labeled GPCRs in mammalian or insect cell cultures. While this method does not have problems associated with improper folding and post-translational modifications, it suffers from high costs and difficulties with uniform labeling and deuteration, even though there is a constant improvement in these techniques (Lundstrom et al. 2006; Werner et al. 2008; Takahashi and Shimada 2010; Egorova-Zachernyuk et al. 2011). As with E. coli-expressed GPCRs, there are a number of isotope-labeled ligand NMR studies conducted with natural abundance receptors (Ratnala et al. 2007; Lopez et al. 2008; Kofuku et al. 2009), as well as with chemically isotope-labeled receptors (Bokoch et al. 2010). While interesting structural insights could be obtained from selectively isotope-labeled GPCRs (Klein-Seetharaman et al. 2004; Werner et al. 2007; Ahuja et al. 2009), no high-resolution SSNMR or complete solution NMR spectra of uniformly labeled GPCRs amenable to structural studies have been reported so far.

Expression of GPCRs and other eukaryotic membrane proteins for structural studies in methylotrophic yeast Pichia pastoris has been considered as a promising and cost-effective alternative for some time (Massou et al. 1999; Lundstrom et al. 2006; McCusker et al. 2007; Oberg et al. 2009; Takahashi and Shimada 2010). Many GPCRs could be functionally expressed in P. pastoris, and some of them with high yields (Abdulaev et al. 1997; Sarramegna et al. 2002; de Jong et al. 2004; Kim et al. 2005; Andre et al. 2006; Shukla et al. 2007; Talmont 2009; Singh et al. 2010). Several other polytopic helical mammalian membrane proteins were overexpressed in Pichia and crystallized, including potassium channels, aquaporins, leukotriene synthase, and P-glycoprotein (Long et al. 2005; Tao et al. 2009; Ho et al. 2009; Horsefield et al. 2008; Molina et al. 2007; Aller et al. 2009). Protocols for efficient and economical uniform isotope-labeling (both 13C and 15N) and deuteration are well established for soluble proteins expressed in Pichia (Laroche et al. 1994; Wood and Komives 1999; Rodriguez and Krishna 2001; Morgan et al. 2000; Pickford and O’Leary 2004). Nevertheless, no detailed structural NMR studies of isotopically labeled GPCRs or other polytopic eukaryotic helical membrane proteins produced in Pichia have been reported to date (with the exception of a few studies of isolated extracellular domains).

The main goal of this work is to demonstrate that expression of eukaryotic membrane proteins in yeast can result in functional, structurally homogeneous, isotopically labeled samples yielding high-resolution SSNMR spectra after reconstitution in lipids. We report the successful adoption and optimization of uniform isotope labeling protocols, previously used for production of soluble secreted proteins in Pichia pastoris (Pickford and O’Leary 2004), for expression of a 7TM helical eukaryotic membrane protein for high-resolution SSNMR study. As a model system, we chose a fungal microbial-type rhodopsin from Leptosphaeria maculans (LR) (Idnurm and Howlett 2001), the first proven case of a bacteriorhodopsin (BR)-like eukaryotic light-driven proton pump (Waschuk et al. 2005). Recently, LR was functionally expressed in neurons (Chow et al. 2010), showing its promise in optogenetics. Being architecturally similar to GPCRs, LR is a very convenient protein for the purposes of this SSNMR study, as it is known to have a high expression level in Pichia pastoris, remains stable and functional upon reconstitution into synthetic lipid membranes, is colored, and can be tested functionally by observing its photochemistry (Waschuk et al. 2005). We achieved a high yield of expression (more than 5 mg of purified protein per liter of culture in shake-flasks) of uniformly doubly labeled LR, which gives stable (at least several weeks at 5°C), homogeneous, and functionally active lipid-reconstituted samples of high protein-to-lipid ratio. The samples produce high-resolution 13C and 15N magic angle spinning (MAS) SSNMR spectra, which are amenable to a detailed structural analysis via multi-dimensional spectroscopy. Taking into account recent successes in the expression of natural abundance eukaryotic membrane proteins in Pichia, we believe that similar isotope labeling and reconstitution protocols can be adopted for these challenging targets as well.

Materials and methods

Protein expression

Previously, we successfully expressed N-terminally truncated (48 residues removed) LR with a C-terminal 6-His-tag, using the pHIL-S1 vector transformed into the GS115 strain of P. pastoris (Waschuk et al. 2005). After the cleavage of the PHO1 secretion signal sequence, the mature LR had three extra residues (REF) on its N-terminus. To maximize the yield of stable expression for isotope-labeling, we used the pPICZαA vector (Invitrogen) with a different secretion signal (α-factor) in the protease-deficient strain SMD1168H, which allows selection for multiple integration events on zeocin (Cedarlane) plates. Using the pHIL-S1-LR construct as a starting material, we inserted the LR encoding sequence (with two extra EF residues at the truncated N-terminus, a C-terminal 6-His-tag, and a stop codon) into the multiple cloning site of the pPICZαA vector, using EcoRI and XbaI restriction sites. Thus, we expected that the mature protein would have, after the post-translational cleavage, either two (EF) or six (EAEAEF) extra residues at its N-terminus, depending on the efficiency of the STE13 processing (Invitrogen manual).

The pPICZαA-LR vector was propagated in DH5α strain of E. coli in low salt LB medium with zeocin, isolated using Qiagen kit (QIAprep Spin Miniprep), and transformed into P. pastoris SMD1168H cells by electroporation according to the manual of the Pichia expression kit (Invitrogen) with small modifications. Briefly, 10 μl of stock of P. pastoris SMD1168H cells was inoculated into 25 ml of yeast peptone dextrose (YPD) medium in a 250 ml baffled flask, grown at 30°C, 300 rpm to achieve OD600 ~10 (normally, 24 h). Approximately 2–10 ml of the overnight culture was spun down at 700×g for 2 min and resuspended with 400 ml of fresh YPD medium in a 2.8 L Fernbach flask. The culture was grown at 30°C, 300 rpm to OD600 ~1.4–2.0 (the growth time was adjusted according to the starting value of OD600; the doubling time of log phase Pichia in YPD was ~2.5 h). The cells were collected by centrifugation at 1,500×g for 4 min at 4°C. The cell pellet was washed twice with 300 ml of sterile ice-cold water and centrifuged again. The same procedure was repeated using 40 ml of sterile ice-cold 1 M sorbitol and the final pellet was resuspended in 0.5 ml of sterile ice-cold 1 M sorbitol. Before transformation, the plasmid was linearized with BstXI, and then purified by the QIAquick nucleotide extraction kit (Qiagen), desalted by the QIAquick nucleotide removal kit (Qiagen), concentrated by ethanol precipitation and resuspended in a desired volume of water. 40 μl of yeast cell suspension in sorbitol was mixed with 5 μg of desalted DNA in an electroporation 0.2 cm cuvette (Bio-Rad). After incubating on ice for 5 min, the cuvette was subjected to a “Pic” pulse (MicroPulser, Bio-Rad). Normally, the applied voltage was ~2 kV and the pulse duration was 5.6–5.7 ms. 1 ml of sterile ice-cold 1 M sorbitol was added immediately after the pulse and mixed thoroughly. 
100–200 μl aliquots of the transformation mixture were spread on YPD sorbitol plates containing zeocin (at two different concentrations: 100 and 500 μg/ml). Following 3–10 days of incubation at 30°C, the transformant colonies were isolated from the plates and screened for high expression level of LR, by inoculating into 5 ml of BMD medium in a 250 ml baffled flask and shaking at 30°C, 300 rpm overnight. After the value of OD600 reached ~2, another 20 ml of BMD was added to the culture, which was shaken at 300 rpm, 30°C for 18–24 h until the OD600 was 10. This culture was centrifuged at 1,500×g for 5 min at 4°C, resuspended in 25 ml of BMM, and grown by shaking at 240 rpm, 30°C. After 24 h, an additional 175 μl of 100% methanol (final concentration 0.7%) and 6.25 μl of 10 mM all-trans-retinal (isopropanol stock, final concentration 2.5 μM) were added to the culture. At different time points (24, 40, 48, and 52 h), 1 ml of the expression culture was taken and centrifuged at 1,500×g for 5 min at 4°C in a 1.5 ml microcentrifuge tube. The expression level of the protein was evaluated by the intensity of the color of the yeast pellet, and the colonies showing the most intense red color were selected for large-scale expression.
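The supplement volumes quoted for the small-scale screen follow from simple dilution arithmetic. As a minimal sketch (the function names are hypothetical and not part of the original protocol, and the added volume is assumed to be negligible relative to the culture volume), the stated final concentrations can be reproduced as follows:

```python
def final_percent_v_v(added_ul: float, culture_ml: float) -> float:
    """Final % (v/v) of a neat (100%) liquid added to a culture.

    Assumes the added volume is negligible relative to the culture volume.
    """
    return added_ul / (culture_ml * 1000.0) * 100.0


def final_micromolar(stock_mM: float, added_ul: float, culture_ml: float) -> float:
    """Final concentration (uM) after adding a stock solution to a culture."""
    # mM * ul gives nmol; nmol per ml of culture is uM.
    return stock_mM * added_ul / culture_ml


def stock_volume_ul(target_uM: float, stock_mM: float, culture_ml: float) -> float:
    """Stock volume (ul) needed to reach a target final concentration (uM)."""
    return target_uM * culture_ml / stock_mM


if __name__ == "__main__":
    # Small-scale screen: 25 ml of BMM culture.
    print(final_percent_v_v(175, 25))       # methanol, % (v/v)
    print(final_micromolar(10, 6.25, 25))   # all-trans-retinal, uM
```

Under the same assumptions, `stock_volume_ul(5, 10, 800)` gives the volume of the 10 mM retinal stock that would correspond to 5 μM in a 0.8 L culture.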

Large-scale protein expression followed the established shake-flask protocol for secreted soluble proteins (Pickford and O’Leary 2004) with small modifications. Unlabeled media components were purchased from Fisher and Sigma, while isotope-labeled components were from Cambridge Isotope Laboratories (Andover, MA). Briefly, a small piece from a cell colony with the highest expression level of LR in small-scale culture was inoculated into 50 ml of BMD (or 13C,15N-BMD for isotope-labeled LR, with 0.5% 13C-glucose and 0.8% 15NH4Cl) in a sterile 250 ml baffled flask. This seed culture was grown, shaking at 30°C (300 rpm) for 18–24 h, until the OD600 exceeded 2, and inoculated into a sterile 2 L baffled flask containing 200 ml of BMD (or 13C,15N-BMD). This culture was shaken at 29–30°C (270 rpm) for 18–24 h, until the OD600 reached 10. To induce LR expression, the cells were pelleted in sterile containers at 1,500×g for 5 min at 4°C, and gently resuspended in 0.8 L of BMM (or 13C,15N-BMM, with 0.5% 13C-methanol and 0.8% 15NH4Cl), which was placed into a 2.8 L Fernbach flask and shaken at 29–30°C (240 rpm). The temperature was kept at the standard value, as there was no evidence of any significant protein misfolding or heterogeneity warranting expression at lower temperatures (see Results). A 10 mM isopropanol stock of all-trans-retinal (Sigma, needed for rhodopsin regeneration, final concentration 5 μM) and 100% filtered methanol (or 13C-methanol, final concentration 0.5%) were added to the growth medium after 24 h of induction. The 0.5% concentration of labeled methanol (and glucose), lower than that in several labeling protocols, was used following the recommendation of the protocol for economical labeling (Rodriguez and Krishna 2001), where it showed additional benefits of more complete incorporation of isotopes and lack of cell lysis by high concentration of methanol. We evaluated growth at 1% 13C-methanol, but did not find any improvement of the protein yield. 
The red-colored cells were collected by centrifugation at 1,500×g for 5 min at 4°C after 40 h of induction, as the protein yield was found to be lower upon longer (48–52 h) and shorter (24 h) incubation times. The protein yields at 24 h were less than 20% of those at 40 h, showing much lower yield per unit of 13C-methanol, making the collection at 40 h the most economical. The cell pellet was washed with MilliQ water twice and stored frozen at −20°C for later use.
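The economy argument above can be made explicit with a rough calculation. The sketch below assumes that a culture harvested at 24 h has consumed only the initial 0.5% methanol dose, whereas a culture harvested at 40 h has received one additional dose (at 24 h). The function name and the normalization of the 40 h yield to 1.0 are illustrative choices, not part of the original protocol.

```python
def yield_per_methanol_dose(relative_yield: float, doses: int) -> float:
    """Relative protein yield per 0.5% methanol dose."""
    if doses < 1:
        raise ValueError("at least one methanol dose is required")
    return relative_yield / doses


# Upper bound at 24 h: yield below 20% of the 40 h yield, one dose consumed.
per_dose_24h = yield_per_methanol_dose(0.20, doses=1)
# At 40 h: yield normalized to 1.0, two doses consumed (initial + 24 h).
per_dose_40h = yield_per_methanol_dose(1.00, doses=2)

# Lower bound on the advantage of the 40 h harvest per unit of methanol.
advantage = per_dose_40h / per_dose_24h
print(f"40 h harvest: at least {advantage:.1f}x more protein per methanol dose")
```

Because the 24 h yield is stated only as an upper bound, the computed ratio is a lower bound on the advantage of harvesting at 40 h.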

Protein purification and lipid reconstitution

The cell breakage and protein purification protocols were based on those used for Neurospora rhodopsin (NR) and LR (Furutani et al. 2004; Waschuk et al. 2005) with some modifications. The cell pellet was resuspended in one pellet volume of buffer A (7 mM NaH2PO4 at pH 6.5, 7 mM EDTA, 7 mM DTT, and 1 mM PMSF) and incubated with 5 mg of lyticase (from Arthrobacter luteus, Sigma) per 0.8 L of cell culture for digestion of the cell walls, and an additional 25 μM of all-trans-retinal to ensure complete rhodopsin regeneration, by slowly shaking in the dark at room temperature for 3–4 h. The cells were then centrifuged at 1,500×g for 5 min at 4°C and immediately resuspended in one pellet volume of buffer A. Half of the pellet volume of ice-cold acid-washed glass beads (Fisher) (420–600 μm diameter) was added, and the cells were disrupted with four 1 min pulses using vigorous mixing with a vortex mixer. The cell debris was removed by centrifugation at 700×g for 5 min at 4°C and the cell lysate was collected. An additional half pellet volume of buffer A was added to resuspend the cell debris, and vortexing and centrifugation steps were repeated several times (usually 8 times in total) to achieve complete breakage of the cells. All cell supernatants containing the membrane fraction were combined and centrifuged at 40,000×g for 30 min at 4°C, and the membrane pellets were stored at −20°C for later use.

To purify LR, the pellets of frozen membranes were thawed and resuspended with solubilization buffer (62.5 ml per L of culture, 1% Triton X-100, 20 mM KH2PO4, pH 7.5, 1 mM PMSF), stirred overnight in the dark at 4°C, and centrifuged at 40,000×g for 30 min at 4°C. Solubilized LR was purified from the supernatant using 6-His tag affinity resin (Ni2+-NTA agarose, Qiagen) by the batch method. We estimated the quantity of solubilized protein spectroscopically (Cary 50, Varian), assuming the molar extinction similar to that of BR, to calculate the amount of resin needed. The mixture was incubated in the dark at room temperature with gentle agitation to allow complete binding (usually 3 h). The clear supernatant containing other solubilized proteins was passed through an empty PD-10 column (GE Healthcare), while the resin was retained at the bottom. The washing buffer (about two times the volume of the resin, 0.25% Triton X-100, 50 mM KH2PO4, pH 7.5, 400 mM NaCl, 0–35 mM of imidazole, concentration increasing in 5 mM steps between consecutive washes) was added to the column and mixed for 2–3 min, then allowed to flow through. The spectrum of the eluate was monitored to detect the loss of non-specifically bound cytochromes (at 410 nm) and specifically bound LR (at 540 nm). The resin was washed 12–13 times with increasing concentrations of imidazole, until the cytochrome band disappeared from the eluate spectrum. Tightly bound LR was eluted from the resin with the elution buffer (0.25% Triton X-100, 50 mM KH2PO4, pH 7.5, 400 mM NaCl, 250 mM imidazole), after 2 min of mixing. The resin was eluted several times until the amount of bound LR was negligible. All eluates were combined and concentrated to 1–2 ml volume by centrifugation at 4,000×g for 15 min at 4°C in a concentrator (Amicon Ultra 15 ml, 10 kDa cutoff). The purity of this preparation was assessed by gel electrophoresis (SDS–PAGE) and MALDI TOF mass spectrometry (University of Guelph Advanced Analysis Center).

The lipid reconstitution protocol followed that used for proteorhodopsin (PR) (Shi et al. 2009a). The dry powder lipids (DMPC: DMPA = 9:1 w/w, Avanti lipids) were first dissolved and mixed in warm chloroform, which was thoroughly removed by evaporation under vacuum to yield a thin lipid film. The dry lipids were rehydrated by adding a rehydration buffer (50 mM KH2PO4, 100 mM NaCl, pH 7.5) and agitated with vigorous shaking, mixing, and sonication to obtain a lipid suspension at high concentration (usually, 10 mg/ml). Purified solubilized LR was added to the preformed liposomes, which were semi-solubilized (as judged by the drop in turbidity) with Triton X-100 at protein/lipids/detergent (w/w/w) ratio of 1:1:0.7, and stirred for 15 min at room temperature. The resultant semi-transparent mixture became turbid after removal of detergent by adding 600 mg of Bio-beads SM-2 (Biorad) per 1 ml of the mixture and incubation with stirring for 24 h at 4°C in the dark. Next, the proteoliposome suspension was withdrawn by a syringe, after which the colored beads (with remaining bound LR) were washed with 0.1 M NaCl several times. The proteoliposomes were collected by centrifugation at 150,000×g for 50 min at 4°C. The pellet was washed several times by centrifugation in 10 mM NaCl, 25 mM Tris-Cl, pH 7, and the proteoliposomes were finally concentrated by ultracentrifugation at 900,000×g for 3 h. The pellet obtained this way was ready for SSNMR rotor packing and kept at −20°C until further use. For the NMR measurements, the proteoliposomes were hydrated with 10 mM NaCl, 25 mM Tris-Cl, pH 7.

FTIR measurements

Absolute static and time-resolved rapid-scan difference FTIR spectra were collected as described previously (Waschuk et al. 2005), using a Bruker IFS66vs spectrometer with a temperature-controlled sample holder (Harrick) connected to a circulating water bath (Fisher). The photochemical cycle was activated by light provided by the second harmonic of a Nd-YAG laser (Continuum Minilite II), using 7 ns pulses at 532 nm. Dry or hydrated (0.05 M CHES, 0.05 M KH2PO4 and 0.1 M NaCl, pH 9) films of the DMPC:DMPA LR liposomes were compressed between two CaF2 windows with a 6 μm Teflon spacer, and data acquisition was controlled by OPUS software (Bruker). The higher pH value (9) was used to accumulate the characteristic photointermediates of the photocycle of LR and to compare with the published data (Waschuk et al. 2005).

NMR experiments

All SSNMR experiments were performed as described previously for PR (Shi et al. 2009a). Additional experimental details are given in the respective figure legends. The spectra were recorded on Bruker Avance III spectrometers operating either at 800.230 MHz or 600.13 MHz, both equipped with 3.2-mm E-free 1H–13C–15N probes (Bruker). The MAS frequency was 14.3 kHz for experiments carried out on the 800 MHz spectrometer, and 12 kHz for experiments carried out on the 600 MHz spectrometer. Hydrated proteoliposomes containing approximately 7 mg of LR were center-packed in a 3.2 mm rotor. The sample temperature was maintained at 5°C in all experiments.

Results and discussion

Solid-state NMR spectroscopy

After optimization of the induction length (optimal time 40 h, similar to that found for NR (Furutani et al. 2004)), the yield of the purified protein in shake-flasks exceeded 5 mg per liter of culture. Since only 0.5% concentration of 13C methanol in BMM was used, and it had to be replenished only once (after 24 h of induction), the cost of this sample is close to that for similar bacterial proteins produced in E. coli, such as PR and ASR (Gourdon et al. 2008; Shi et al. 2009a, 2010). The lipid-reconstituted LR gave SSNMR spectra of high resolution allowing identification of individual chemical sites as obvious both from the 1D and 2D spectra (Figs. 1, 2, 3, and 4). The estimated linewidth (~0.5 ppm for 13C, 0.7 ppm for 15N) is similar to that observed for E. coli-expressed bacterial homologs of LR (Etzkorn et al. 2007; Shi et al. 2009a, 2010), as well as for the native 2D crystals of BR (Varga et al. 2008), suggesting structural homogeneity of the sample. In the recent past, similar spectral resolution allowed us to perform 3D and even 4D SSNMR experiments leading to the assignment of the majority of resonances for PR and ASR (Shi et al. 2009b, 2010).
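For comparison with apodization settings quoted in hertz, the linewidths given in ppm can be converted to frequency units. The following is a minimal sketch, assuming that the linewidths refer to the 800.230 MHz spectrometer and using approximate ratios of the 13C and 15N Larmor frequencies to the 1H frequency (about 0.2515 and 0.1014); the function and constant names are hypothetical.

```python
# Approximate ratios of the nuclear Larmor frequency to the 1H frequency.
FREQ_RATIO = {"13C": 0.2515, "15N": 0.1014}


def ppm_to_hz(width_ppm: float, nucleus: str, proton_mhz: float) -> float:
    """Convert a linewidth in ppm to Hz for a given nucleus and 1H frequency."""
    larmor_mhz = proton_mhz * FREQ_RATIO[nucleus]
    # 1 ppm of a Larmor frequency given in MHz corresponds to that many Hz.
    return width_ppm * larmor_mhz


if __name__ == "__main__":
    # Assumption: linewidths estimated from spectra recorded at 800.230 MHz.
    print(f"13C: {ppm_to_hz(0.5, '13C', 800.230):.0f} Hz")
    print(f"15N: {ppm_to_hz(0.7, '15N', 800.230):.0f} Hz")
```

With these assumptions, the 13C linewidth comes out at roughly 100 Hz and the 15N linewidth at somewhat under 60 Hz.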

Fig. 1

One-dimensional 15N MAS NMR spectrum of 13C,15N-labeled LR proteoliposomes at 800 MHz. The inset shows the expansion of a spectral region where side-chains of protonated histidines and the retinal Schiff base resonate

Fig. 2

One-dimensional 13C MAS NMR spectrum of 13C,15N-labeled LR proteoliposomes measured at 800 MHz

Fig. 3

Two-dimensional NCA MAS NMR spectrum of 13C,15N-labeled LR proteoliposomes at 800 MHz. Glycine resonances are shown in the box, the Schiff base peak (folded from ~173.2 ppm) is marked. The time domain data matrix was 160 (t1) × 1,024 (t2), with t1, t2 increments of 74, and 20 μs, respectively. The carrier frequency was placed at 118 ppm and 60 ppm for nitrogen and carbon chemical shift evolution, respectively. 80 scans per point were recorded, with a recycle delay of 1.8 s. Total experiment time was 6.4 h. Data were processed with Lorentzian-to-Gaussian apodization functions and zero filled to 4,096 (t1) × 4,096 (t2) prior to Fourier Transform. 24 Hz of Lorentzian line narrowing and 40 Hz of Gaussian line broadening were applied in t1 15N indirect dimension, 40 Hz of Lorentzian line narrowing and 80 Hz of Gaussian line broadening were applied in t2 13CA direct dimension. The first contour is cut at 5 × σ, with each additional level multiplied by 1.2

Fig. 4

Two-dimensional DARR (20 ms mixing) MAS NMR spectrum of 13C,15N-labeled LR proteoliposomes at 600 MHz. Alanine resonances are shown in the box. The time domain data matrix was 1,436 (t1) × 997 (t2), with t1, t2 increments of 7.9, and 24 μs, respectively. The carrier frequency was placed at 90 ppm for carbon chemical shift evolution. 32 scans per point were recorded, with a recycle delay of 1.8 s. Total experiment time was 23 h. Data were processed with Lorentzian-to-Gaussian apodization functions and zero filled to 16,384 (t1) × 4,096 (t2) prior to Fourier Transform. 30 Hz of Lorentzian line narrowing and 60 Hz of Gaussian line broadening were applied on both t1 and t2 13C dimensions. The first contour is cut at 5 × σ, with each additional level multiplied by 1.2
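The total experiment times quoted in the legends of Figs. 3 and 4 can be cross-checked from the acquisition parameters. The sketch below is a rough estimate that counts only the recycle delay per scan and neglects the acquisition and mixing periods; the function name is hypothetical.

```python
def experiment_hours(t1_points: int, scans: int, recycle_delay_s: float) -> float:
    """Approximate duration of a 2D experiment, in hours.

    Counts only the recycle delay for each scan; acquisition and mixing
    times are neglected (an assumption of this sketch).
    """
    return t1_points * scans * recycle_delay_s / 3600.0


if __name__ == "__main__":
    # NCA (Fig. 3): 160 t1 points, 80 scans per point, 1.8 s recycle delay.
    print(f"NCA:  {experiment_hours(160, 80, 1.8):.1f} h")
    # DARR (Fig. 4): 1,436 t1 points, 32 scans per point, 1.8 s recycle delay.
    print(f"DARR: {experiment_hours(1436, 32, 1.8):.1f} h")
```

Both estimates agree with the stated totals of 6.4 h and 23 h, which suggests that the recycle delay dominates the measurement time in these experiments.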

The one-dimensional 15N spectrum shows fine structure both in the backbone amide region (Fig. 1) and for the sidechains, such as His, Arg, and Lys. LR has several His residues in the transmembrane core, and one can distinguish at least four individual resonances of protonated histidines at 160–170 ppm (Fig. 1, inset), with additional low-intensity lines at 250–260 ppm, corresponding to the deprotonated species (not shown). The protonated retinal Schiff base is the active functional center of rhodopsins, and its resonance is readily observed at 173.2 ppm. This position is consistent with the maximum of the visible spectrum of LR being at 542 nm (Waschuk et al. 2005), following the well-known relationship between these two parameters (Hu et al. 1997; Pfleger et al. 2008). A minor shoulder of this band may reflect the presence of a small fraction of 13-cis-retinal (<10%) detected earlier by Raman spectroscopy (Waschuk et al. 2005) and HPLC of the retinal extracts (Sumii et al. 2005), but only a single correlation to the Cε atom of Lys, corresponding to the all-trans-configuration, was observed in the 2D NCA spectrum (Fig. 3). The one-dimensional 13C spectrum (Fig. 2) shows similarly high resolution, comparable to that observed for PR (Shi et al. 2009a). The absence of strong resonances at 70–90 ppm implies lack of glycosylation (O’Leary et al. 2004), consistent with the results of mass spectrometry (see below), as well as efficient removal of the majority of glycolipids during the purification.

2D NCA and CC (DARR) correlation spectra (Figs. 3, 4) show many isolated narrow peaks, which allow identification of amino acid types and indicate that the sample is suitable for spectral assignments. For example, the 2D NCA spectrum (Fig. 3) shows well-resolved proline correlations (130–145 ppm 15N shifts). Only three out of five Pro resonances are clearly visible, probably due to the dynamics in the loops (one more resonance is visible, but its intensity is low). Based on the comparison with BR and ASR (Lansing et al. 2003; Shi et al. 2010), two prolines with nitrogen shifts of 136.4 and 131.5 ppm likely correspond to the residues located in the TM helices (helices C and F in BR and ASR). The third proline peak with an unusually high 15N shift of 144.2 ppm is similar to that observed in BR (Lansing et al. 2003) and probably corresponds to a residue in a beta-structured loop (possibly, B–C).

Some tentative functionally important information can be derived from SSNMR spectra obtained at this early stage. A pair of CG-CB correlations around 172–173/37–39 ppm (Fig. 4, lower panel) are very typical for protonated buried Asp sidechains of BR, Asp96 and Asp115 (Metz et al. 1992; Jaroniec et al. 2001), and may belong to their homologs in LR, which are protonated, as obvious from the difference FTIR spectra (Fig. 5b) (Waschuk et al. 2005; Furutani et al. 2006).

Fig. 5

FTIR analysis of the extent of labeling and functionality of LR reconstituted into DMPC:DMPA liposomes. a Static mid-infrared absorption spectra of dry films of DMPC/DMPA liposomes of natural abundance (blue) and 13C,15N-labeled (red) LR showing clear isotopic shifts of amide I and II bands, confirming the high labeling extent. b Light-induced difference FTIR spectra (time delay 2 ms after the laser flash) of the same samples hydrated with 0.05 M CHES, 0.05 M KH2PO4 and 0.1 M NaCl, pH 9, at 2°C. The characteristic bands are labeled, clear isotopic shifts of protein bands are observed (see text for details)

The analysis of the intensities of well-resolved Gly and Ala regions of the 2D 13C-13C and NCA correlation spectra (Figs. 3, 4) allows estimating the completeness of spectral coverage. Compared to TM regions, solvent-exposed loops, turns and tails are typically characterized by increased mobility, and result in lower signal intensities in dipolar-based correlation spectra. These observations have been previously made in a number of membrane-embedded systems, such as SR-II, DsbB, PR, and ASR (Etzkorn et al. 2007; Li et al. 2008; Shi et al. 2009b, 2010). It should be noted that LR has longer loops than its bacterial homologs, so one can expect lower degree of the spectral coverage if these loops are mobile. In our spectra, we observe the majority of these residues. For example, glycine resonances are well-resolved in the NCA spectrum (shown in box in Fig. 3), and their integration accounts for 20 out of 25 glycines (with only 10 glycines expected to be in the helical regions according to the homology modeling on the BR template). Likewise, alanine resonances can be integrated in the 13C-13C DARR spectrum (Fig. 4, boxed), and account for about 23 out of 36 alanines, 25 of which are expected to be in the helical regions (the lower fraction of alanines compared to glycines may be due to stronger influence of the local dynamics on sidechains as opposed to the backbone). High dispersion of Ala, Gly, and Thr peaks, along with the high variability in the peak intensity, shows the presence of α-helical and β-strand elements, along with random coil stretches (Wang and Jardetzky 2002) in the structure of LR.

Post-translational modifications, extent of the isotope-labeling, and functionality of the protein

The purity and the presence of post-translational modifications of the sample were assessed by SDS–PAGE and MALDI TOF mass spectrometry, which showed the expected product at ~30.8 kDa (for the natural abundance protein) and a minor contaminant at 21.6 kDa. The observed molecular weight (30,836 ± 10 Da for the natural abundance protein) corresponds to the expected mass (30,832 Da) of the Na+ adduct of non-glycosylated LR. This molecular weight indicates that only a small four-residue part of the leader sequence (EAEA) is left on the N terminus, as a result of the incomplete STE13 post-translational processing (Buensanteai et al. 2010), possibly due to the proximity of the cleavage sites to the membrane surface. The lack of glycosylation is consistent with the absence of strong signals from carbohydrates in the 13C SSNMR spectra (Fig. 4) at 70–90 ppm (O’Leary et al. 2004), where only minor signals were detected, possibly from non-covalently bound glycans. Although the absence of glycosylation is rather unusual, as both N-linked and O-linked glycosylation is common for P. pastoris (O’Leary et al. 2004; Choi et al. 2003), glycosylation was also not observed for NR, another fungal rhodopsin expressed in Pichia (Bieszke et al. 1999).

The extent of isotope labeling in Pichia was reported to vary between 68 and 99%, depending on the exact growth protocol (Rodriguez and Krishna 2001). The high extent of isotope labeling of the expressed LR was confirmed by static and time-resolved difference spectroscopy (Fig. 5). The absolute spectra of dry films of liposomes containing natural abundance and doubly-labeled LR (Fig. 5a) allow comparison of the positions of Amide I (mostly backbone C=O vibrations, can report on the extent of 13C labeling) and Amide II (mostly backbone C–N vibrations, can report on the extent of both 13C and 15N labeling). The large shifts (43 and 27 cm−1, respectively) observed in the positions of both peaks are in good agreement with those reported earlier (Vogel et al. 2007; Egorova-Zachernyuk et al. 2009). A number of prominent infrared bands not showing an isotopic shift can be readily assigned to various vibrations of synthetic lipids.
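The order of magnitude of the Amide I shift can be rationalized with the textbook harmonic approximation for an isolated diatomic oscillator, in which the wavenumber scales with the inverse square root of the reduced mass. The sketch below is only an illustration under that assumption, not the analysis used in this work: real amide modes are coupled vibrations, integer atomic masses are used, and the starting Amide I position near 1,657 cm−1 is assumed to refer to the natural abundance sample. All names are hypothetical.

```python
import math


def reduced_mass(m1: float, m2: float) -> float:
    """Reduced mass of a two-atom oscillator (atomic mass units)."""
    return m1 * m2 / (m1 + m2)


def isotope_downshift(wavenumber_cm1: float,
                      light_pair: tuple[float, float],
                      heavy_pair: tuple[float, float]) -> float:
    """Downshift (cm^-1) of a diatomic stretch upon isotope substitution.

    Harmonic approximation with an unchanged force constant:
    the wavenumber is proportional to 1 / sqrt(reduced mass).
    """
    mu_light = reduced_mass(*light_pair)
    mu_heavy = reduced_mass(*heavy_pair)
    shifted = wavenumber_cm1 * math.sqrt(mu_light / mu_heavy)
    return wavenumber_cm1 - shifted


if __name__ == "__main__":
    # Isolated C=O stretch: 12C=16O versus 13C=16O (integer masses, assumed).
    shift = isotope_downshift(1657.0, (12.0, 16.0), (13.0, 16.0))
    print(f"Estimated 13C downshift of an isolated C=O stretch: {shift:.0f} cm^-1")
```

The estimate is of the same order as the observed 43 cm−1 shift of the Amide I band; exact agreement is not expected, because the amide vibrations are not pure C=O stretches.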

To verify the extent of 15N labeling, which could be masked by carbon isotopic shifts, we measured analogous FTIR spectra of 15N labeled LR (not shown) and found 16 cm−1 downshift of the Amide II band consistent with nearly complete nitrogen labeling (Egorova-Zachernyuk et al. 2011). Additionally, absolute FTIR spectra can report on the secondary structure of the expressed protein, verifying its native fold. The position of the Amide I peak maximum at 1,657 cm−1 (Fig. 5a) indicates predominantly α-helical structure of LR, while the shoulder at 1,628 cm−1 suggests the presence of some β-strands, consistent with the results of NMR (see above).

While absolute FTIR spectra can give an overall estimate of the labeling extent, they cannot detect the presence of a small number of non-labeled groups or the non-random absence of labeling in selected amino acids. Such lack of sensitivity is explained by the broadness of the amide peaks, baseline distortions, and variations in the lipid contribution to the FTIR signal. Additional investigation of the extent of isotopic labeling of specific groups in LR combined with testing of the sample functionality is possible through the analysis of time-resolved difference FTIR spectra of the hydrated LR samples. Such light-minus-dark difference spectra can report on the changes in chemical groups involved in the proton-pumping function of LR (Waschuk et al. 2005; Sumii et al. 2005; Furutani et al. 2006; Fan et al. 2007). The overall character of the difference spectra (Fig. 5b) agrees well with those observed earlier for the proteoliposomes with lower protein:lipid ratios (Waschuk et al. 2005), confirming the functionality of LR at the high protein:lipid ratio of the SSNMR samples. For example, the key proton transfer step could be clearly observed following the protonation signal of the primary proton acceptor Asp139 (at 1,759 cm−1), along with the retinal photoisomerization (bands at 1,201 and 1,188 cm−1). Comparison between the FTIR difference spectra of the natural abundance and doubly-labeled LR (Fig. 5b) reveals a number of bands that do not show any isotopic shifts. Those prominent bands belong to the retinal chromophore, which was added externally, and as such was not 13C-labeled. On the other hand, many protein bands display large and complete isotopic shifts, such as those of the protonated carboxyl stretching vibrations of the three previously assigned Asp sidechains (at 1,759, 1,747, and 1,736 cm−1, all downshifted by 43 cm−1 upon labeling) (Waschuk et al. 2005; Furutani et al. 2006). 
One can also observe downshifts of the retinal Schiff base C=N vibrations (e.g., at 1,620 cm−1, reflecting 15N labeling of Lys sidechains), as well as prominent shifts of several symmetric COO stretches, presumably of Asp (Ikeda et al. 2007), such as that at 1,392 cm−1. These prominent isotopic shifts of vibrational bands along with that for the Amide II band (at 1,564 cm−1, overlapping asymmetric COO stretches) confirm the high extent of labeling of specific sidechains along with the backbone of LR. Some of the tentative assignments of the vibrational bands mentioned above were verified by measuring analogous difference spectra of 15N labeled LR (not shown), for example, to discriminate between C–N stretches of prolines and symmetric COO stretches of carboxylic acids.

Conclusions

We have demonstrated that expression of a eukaryotic 7TM helical protein in P. pastoris can produce samples suitable for structural studies by SSNMR in a cost-effective way. The samples are stable (at least several weeks at 5°C) and functional, have a high extent of uniform 13C,15N labeling, and give good spectral resolution comparable to that obtained for bacterial proteins of similar fold expressed in E. coli. Such spectral resolution allowed observation of resonances of nuclei of individual chemical groups and, in the case of bacterial proteins, led to the assignment of the majority of backbone and sidechain resonances, especially in the functionally important transmembrane regions (Shi et al. 2009b, 2010). New developments in SSNMR of polytopic helical membrane proteins combined with the possibility of their inexpensive uniform isotope labeling will eventually result in structural breakthroughs in the field of GPCRs and other eukaryotic membrane proteins.