10.1 Introduction

G protein-coupled receptors (GPCRs) are molecular gateways to the cell, allowing it to sense its environment and to communicate with other cells by decoding messages conveyed via diffusible signaling molecules. GPCRs are located on the cell surface and recognize a plethora of extracellular molecules, such as hormones, odorants, neurotransmitters, proteins, lipids, photons, and ions [1, 2], transmitting signals inside of the cell, primarily through coupling to heterotrimeric G proteins, but also through arrestins and other G protein-independent pathways [3,4,5]. This exposed cell surface location makes GPCRs ideal targets for therapeutic intervention. Indeed, about a third of current drugs approved by the FDA directly target GPCRs [6, 7]. However, out of over 800 human GPCRs, fewer than 110 currently have drugs designed for them [7, 8], while over 100 receptors remain orphans [9, 10], for which even endogenous ligands are unknown.

GPCRs undergo large-scale conformational changes between active and inactive conformations (Fig. 10.1a), with the active conformation making them amenable to interaction with G proteins and arrestins [3], initiating canonical signaling pathways and leading to intracellular responses. GPCR ligands can be classified based on their pharmacological efficacy [4] into agonists, (neutral) antagonists, and inverse agonists (Fig. 10.1a). While receptors are usually activated by agonist molecules, they can also display basal activity in the absence of any ligand. Molecules that reduce receptor activity below the basal level are called inverse agonists, and molecules that occupy the receptor binding site without changing its activity level are called (neutral) antagonists. This inherent conformational flexibility is an important reason why GPCRs are difficult to crystallize.

Fig. 10.1

GPCR activation, engineering, and stabilization. (a) Conformational plasticity of GPCRs. An unliganded apo receptor can sample multiple conformational states. Inverse agonists stabilize the inactive receptor conformation. Agonist binding triggers a series of conformational rearrangements in the 7TM bundle leading to large-scale conformational changes on the intracellular side, most notably an outward movement of helix VI, priming the receptor for G protein binding. (b) GPCRs can be conformationally stabilized, for example by site-specific mutations (red circles) and by insertion of fusion partners before the helical bundle, or into ICL3 or ICL2, increasing their yield and stability and making them amenable to crystallization

High-resolution structure determination is a prerequisite for rational drug design and can provide a structural basis for understanding the molecular determinants of signaling. In recent years, three key developments have facilitated high-resolution structure determination of GPCRs by crystallography: (1) stabilization of the receptor by protein engineering, that is, receptor truncations, point mutations [11], and fusions with soluble protein domains [12] (Fig. 10.1b); (2) crystallization in lipidic cubic phase (LCP), a membrane mimetic, native-like crystallization environment [13]; and (3) advances in micro-crystallography [14]. Furthermore, specific conformational states of the receptors can be stabilized by ligands, allosteric modulators, peptides, antibodies, nanobodies [15, 16], and signaling partners, among others, further increasing their propensity to crystallize [17]. Since the first high-resolution structure of the human β2-adrenergic receptor bound to a diffusible ligand was published in 2007 [18], overall 45 structures of unique GPCRs have been determined to date (November 2017; Fig. 10.2). Most of these receptors were crystallized in an inactive conformation that is stabilized by an antagonist, or an inverse agonist. Several of the available agonist-bound structures display conformational signatures of an active state; however, only a few of them are captured in a fully active state in complex with a G protein, arrestin, or their mimetics [20,21,22].

Fig. 10.2

Timeline of GPCR structure determination. A human GPCR sequence homology tree is shown on the left. X-ray crystal structures (highlighted in blue) have been obtained for representatives of all non-olfactory GPCR classes except for Adhesion. There has been a steady increase in the number of GPCR structures over the years (right). GPCR structures obtained by LCP-SFX are highlighted in red and encircled on the tree (left), and their names are shown on the timeline (right). The figure is modified from Ref. [19]

GPCRs share a general topology of a seven transmembrane helical bundle (7TM) with an extracellular N-terminus and an intracellular C-terminus, which often includes a short intracellular amphipathic helix VIII. The transmembrane helices are connected by three intracellular (ICL) and three extracellular loops (ECL), where ECL2 often plays a critical role in ligand binding, while ICL2 and ICL3 engage in G protein and arrestin coupling. Fusion partners are typically inserted in ICL3 or ICL2, or placed at the N-terminus before the 7TM bundle (Fig. 10.1b). Posttranslational modifications of termini and loops are common; for example, the N-terminus and ECLs can be glycosylated, ICLs and C-terminus phosphorylated, and the C-terminus palmitoylated.

While overall sequence identity between different receptors is low, based on their common topology, conserved sequence features, and motifs, as well as presence of extracellular domains, GPCRs are typically grouped into five classes, that is, class A (rhodopsin-like), which constitutes the largest class, class B (secretin-like), class C (glutamate-like), class Frizzled/Taste2, and class Adhesion (GRAFS classification system [23], Fig. 10.2). Typically, class A receptors consist mostly of a 7TM, while class B receptors also contain an extracellular domain that binds peptide ligands [24]. Class C receptors also contain a large extracellular domain and form functional homodimers or heterodimers [25]. Initially, GPCR structure determination was focused on class A receptors and on the 7TM domains of class B, C, and Frizzled/Taste2 GPCRs; recently, structures of full-length non-class A receptors such as the glucagon receptor GCGR [26] and the smoothened receptor [27] have become available. Comparison between class A GPCRs has been facilitated by generic residue numbering schemes, for example the Ballesteros-Weinstein (BW) numbering [28], which predates experimental GPCR structure determination, and for every transmembrane residue denotes helix number X and residue position relative to the most conserved residue for that helix, which is assigned number X.50. More recently, the BW numbering has been updated based on more abundantly available structural information and extended to other GPCR classes [29]. With increasing coverage of structural space, as well as the availability of receptor complexes and full-length structures, rational structure-based drug design will become a routine application over the next few years, and a solid structural foundation will aid the understanding of endogenous signaling processes.
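The offset rule behind the BW numbering can be written out in a few lines. The following Python sketch is only an illustration: the function name and the residue numbers are hypothetical, and it applies a plain sequence offset, so it does not handle helix bulges or constrictions that the structure-based updates of the scheme [29] are meant to address.

```python
def bw_label(helix: int, residue_number: int, x50_residue_number: int) -> str:
    """Return a Ballesteros-Weinstein style label 'X.NN'.

    helix               -- transmembrane helix number X (1 to 7)
    residue_number      -- sequence number of the residue of interest
    x50_residue_number  -- sequence number of the most conserved residue
                           in that helix, which is assigned X.50
    """
    position = 50 + (residue_number - x50_residue_number)
    return f"{helix}.{position}"


# Hypothetical receptor in which the X.50 residues of helices VI and VII
# are residues 295 and 333, respectively (illustrative values only).
print(bw_label(7, 318, 333))  # 7.35
print(bw_label(6, 303, 295))  # 6.58
print(bw_label(7, 333, 333))  # 7.50
```

Residues N-terminal to the reference position receive numbers below 50 and residues C-terminal to it receive numbers above 50, which is what makes equivalent positions directly comparable between receptors.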

Despite recent technical advances and the impressive rate at which novel GPCR structures have become available, receptor crystallization remains extremely challenging and tedious. Initial GPCR crystal hits obtained via LCP crystallization are typically too small (<10 μm) for data collection at microfocus synchrotron beamlines due to a fast onset of radiation damage. Crystal optimization often takes months to years, and in some cases may fail to improve crystal size and quality [21, 30]. The recently emerged hard X-ray free electron lasers (FELs) have revolutionized structural biology by enabling high-resolution structure determination from micrometer-sized crystals at room temperature with minimal radiation damage and by providing access to ultrafast time-resolved conformational transitions in biological macromolecules [31]. Extremely short duration (femtoseconds) X-ray FEL pulses outrun radiation damage, and, since each extremely bright pulse destroys the sample, the data are typically collected using a serial femtosecond crystallography (SFX) approach, in which crystals are constantly replenished either by streaming them across the beam or by using a fast scanning of crystals deposited on a solid support, also known as a fixed target approach, as discussed in Chap. 5.

GPCR microcrystals grown in LCP are particularly suitable for SFX due to their typically high diffraction quality and small size. The first successful applications of LCP-SFX to GPCR microcrystals were enabled by the development of a special injector capable of streaming LCP [30] and by the introduction of concurrent sample preparation approaches [32, 33]. With structures of ten different receptors determined by LCP-SFX during the last 4 years (Fig. 10.2), the method has proven its strength in acquiring high-resolution structural information from microcrystals of challenging membrane protein targets. In this chapter, we will outline the major steps of GPCR sample preparation for LCP-SFX, summarize successful GPCR structure determination studies including de novo phasing, and conclude with future perspectives of applying X-ray FEL radiation to study structure and dynamics of GPCRs as well as other challenging proteins and their complexes.

10.2 GPCR Sample Preparation and Delivery for LCP-SFX

While traditional crystallography calls for a single large, well-ordered crystal for data collection and structure determination, LCP-SFX sample preparation and optimization instead focus on obtaining sufficient quantities of micrometer-sized crystals uniformly dispersed in LCP at a high density. Therefore, LCP-SFX sample preparation is generally conducted in two separate steps (Fig. 10.3). The first step consists of construct optimization followed by high-throughput crystal screening and optimization in 96-well plates, while the second step involves scaling up the crystallization volume approximately 1000 times by setting up crystallization in gas-tight syringes.

Fig. 10.3

A flow chart for GPCR sample preparation for LCP-SFX data collection. The GPCR structure determination process consists of two general steps (black and red outlines) and involves feedback loops with several stages of evaluation and quality assessment (yellow)

10.2.1 GPCR Construct Optimization and Screening

A major bottleneck in crystallization of GPCRs is related to their low expression level and highly dynamic nature, and therefore, a range of modifications of the wild-type receptors is typically necessary to increase their yield and stability. These modifications include N- and C-terminal truncations, point mutations [11], and fusion partner insertions [12] (Fig. 10.1b and Table 10.1). The resulting modified constructs are expressed in insect or mammalian cells, solubilized in detergent micelles, purified by immobilized metal affinity chromatography (IMAC), and characterized by a number of assays. Several iterations of construct engineering are typically required to obtain a highly stable and monodisperse receptor suitable for crystallization (Fig. 10.3). Another prerequisite for a stable receptor sample is the identification of a high-affinity ligand that keeps the receptor in either an inactive (antagonist- or inverse agonist-bound) or an active or active-like (agonist-bound) conformation. The influence of ligands on the receptor construct can be evaluated using a fluorescence-based thermal shift assay, which utilizes native buried cysteine residues of the receptor for covalent binding of the thiol-specific fluorophore N-[4-(7-diethylamino-4-methyl-3-coumarinyl)phenyl]maleimide (CPM) [39]. As the sample temperature is increased, the receptor unfolds and exposes its previously buried cysteines to binding by the dye. The resulting increase in fluorescence can be measured as a readout for the overall unfolding of the protein. Typically, ligands that yield higher melting temperatures, and therefore more stable receptor-ligand complexes, are prioritized for subsequent trials.
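The ranking of ligands by melting temperature can be illustrated with a short Python sketch. It is a simplified stand-in for real assay analysis, not the procedure of Ref. [39]: the melting temperature is estimated as the temperature at which the fluorescence reaches half of its rise between baseline and maximum, the curves are synthetic sigmoids, and the ligand names and all numbers are hypothetical.

```python
import math


def melting_temperature(temps_c, fluorescence):
    """Estimate Tm as the half-rise point of an unfolding curve.

    Only the rising part of the curve (up to the fluorescence maximum) is
    used, since measured curves may decay again after the peak.
    """
    peak = fluorescence.index(max(fluorescence))
    temps = temps_c[:peak + 1]
    signal = fluorescence[:peak + 1]
    baseline = min(signal)
    half = baseline + 0.5 * (signal[-1] - baseline)
    for i in range(1, len(signal)):
        f0, f1 = signal[i - 1], signal[i]
        if f0 < half <= f1:
            t0, t1 = temps[i - 1], temps[i]
            # Linear interpolation between the two bracketing points.
            return t0 + (half - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("no unfolding transition found")


def synthetic_curve(tm_c, temps_c, slope_c=2.0):
    """Idealized two-state unfolding curve (assumption, for testing only)."""
    return [1.0 / (1.0 + math.exp((tm_c - t) / slope_c)) for t in temps_c]


temps = [float(t) for t in range(20, 91)]
curves = {
    "ligand_A": synthetic_curve(55.0, temps),  # hypothetical ligand
    "ligand_B": synthetic_curve(62.0, temps),  # hypothetical ligand
}
tms = {name: melting_temperature(temps, curve) for name, curve in curves.items()}
best = max(tms, key=tms.get)
print({name: round(tm, 1) for name, tm in tms.items()}, "->", best)
```

In practice a sigmoidal fit to the full curve is usually preferred over a half-rise estimate, but the ranking logic stays the same: the ligand with the highest melting temperature is carried forward.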

Table 10.1 Construct design and crystallization conditions for GPCR structures determined by LCP-SFX

Once a stable and monodisperse construct has been identified, the potential for crystallization is examined using the fluorescence recovery after photobleaching (or LCP-FRAP) assay [40]. In this assay, purified protein is labeled with the fluorescent dye 5,5′-disulfato-1′-ethyl-3,3,3′,3′-tetramethylindocarbocyanine (Cy3), reconstituted in LCP, and set up in 96-well glass sandwich plates against a set of crystallization screening solutions. A small spot, a few micrometers in size, is then photobleached in the LCP drop with a laser, after which the recovery of fluorescence in this spot is monitored over time, providing information about protein diffusion. If the receptor-ligand complex is found to diffuse well in LCP, it is promoted to crystallization trials, while if the outcome is negative (no apparent protein diffusion), further optimization of the receptor-ligand combination is required. Additionally, LCP-FRAP experiments have been shown to be extremely useful for selecting the most promising precipitant conditions for subsequent crystallization trials, and in fact many initial crystal leads have been found in such setups with high sensitivity and specificity due to the fluorescent labeling.
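A recovery curve of this kind is commonly summarized by two numbers, the mobile fraction and the half-time of recovery. The Python sketch below shows one simple way to extract them; it is not the analysis used in Ref. [40], the trace is invented, and the intensities are assumed to be already normalized and background corrected.

```python
def mobile_fraction(pre_bleach, post_bleach, plateau):
    """Fraction of the bleached signal that recovers by diffusion."""
    if pre_bleach <= post_bleach:
        raise ValueError("pre-bleach intensity must exceed post-bleach intensity")
    return (plateau - post_bleach) / (pre_bleach - post_bleach)


def half_recovery_time(times_s, intensities, post_bleach, plateau):
    """Time at which the signal has recovered halfway to its plateau."""
    half = post_bleach + 0.5 * (plateau - post_bleach)
    for i in range(1, len(intensities)):
        f0, f1 = intensities[i - 1], intensities[i]
        if f0 < half <= f1:
            t0, t1 = times_s[i - 1], times_s[i]
            return t0 + (half - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("signal never reaches half recovery")


# Invented example trace: time after bleaching (s) and normalized intensity.
times = [0.0, 10.0, 20.0, 30.0, 60.0, 120.0]
trace = [0.2, 0.4, 0.5, 0.6, 0.75, 0.8]

fraction = mobile_fraction(pre_bleach=1.0, post_bleach=trace[0], plateau=trace[-1])
t_half = half_recovery_time(times, trace, post_bleach=trace[0], plateau=trace[-1])
print(round(fraction, 2), round(t_half, 1))
```

A high mobile fraction together with a short half-time indicates a freely diffusing receptor-ligand complex, whereas a flat trace corresponds to the negative outcome described above.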

10.2.2 High-Throughput Crystallization Screening and Optimization

High-throughput crystallization trials are performed in 96-well glass sandwich crystallization plates, where 40 nL LCP boli containing purified protein are overlaid with 800 nL of precipitant solutions (Fig. 10.4a). Typically, a purified receptor at a concentration of 20–50 mg/mL is reconstituted in LCP by mixing with molten host lipid monoolein containing 10% w/w cholesterol at a 2:3 v/v ratio and then dispensed onto glass sandwich plates using either manual setups or an LCP crystallization robot [13]. The resulting drops are incubated at 20 °C and imaged at regular intervals. Conditions exhibiting showers of small crystals (typically in the low micrometer range) are chosen for the subsequent scale-up in syringes (Fig. 10.4b). Since LCP-SFX data collection requires a high density of uniformly sized microcrystals, samples can be further characterized using bright-field and cross-polarized light microscopy, Second-Order Nonlinear Imaging of Chiral Crystals (SONICC) [41], and Transmission Electron Microscopy (TEM) [42] (Fig. 10.4c) (for further reading about detection and characterization of microcrystals, see Chap. 3).
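The protein consumption of such a screen follows directly from the numbers above. As a rough Python sketch (function names are illustrative, and dead volume in the dispensing syringe is ignored):

```python
def protein_per_bolus_ug(bolus_nl, conc_mg_per_ml, protein_parts=2, lipid_parts=3):
    """Mass of protein (in micrograms) in one LCP bolus.

    The protein solution makes up protein_parts / (protein_parts + lipid_parts)
    of the LCP volume; 1 mg/mL equals 1 ng/nL.
    """
    protein_nl = bolus_nl * protein_parts / (protein_parts + lipid_parts)
    return protein_nl * conc_mg_per_ml / 1000.0  # ng -> ug


for conc in (20.0, 50.0):
    per_drop = protein_per_bolus_ug(40.0, conc)
    per_plate = 96 * per_drop
    print(f"{conc:.0f} mg/mL: {per_drop:.2f} ug per drop, {per_plate:.1f} ug per plate")

# Precipitant excess over LCP in each well (800 nL over a 40 nL bolus).
print(800.0 / 40.0)
```

With a 40 nL bolus, each drop therefore contains roughly 0.3–0.8 μg of receptor, which is why a single purification can support many 96-well plates.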

Fig. 10.4

Sample preparation and LCP-SFX data collection. (a) High-throughput nanovolume crystallization screening in 96-well glass sandwich plates. (b) Scale-up of crystallization in syringes. (c) Microcrystal visualization and characterization using different imaging modes (bright-field, cross-polarizers, two-photon UV fluorescence, SONICC). (d) A layout for an LCP-SFX experiment. These figures are reproduced from Ref. [33]

10.2.3 Crystallization Scale-Up in Syringes

Structure determination by LCP-SFX requires a dataset typically containing a few tens of thousands of indexed diffraction patterns from microcrystals intersecting the X-ray beam in random orientations (Table 10.2). In practice, this translates into a sample volume of 50–100 μL with a crystal density of ∼10⁵ μL⁻¹ [33], meaning that conditions optimized in 96-well sandwich plates should be scaled up at least 1000 times by volume. Scaling up LCP crystallization setups, however, is not always straightforward because of slow rates of precipitant diffusion through the LCP volume. In the sandwich plates, the LCP bolus is squeezed between two glass slides forming a disk-like shape, ∼500 μm in diameter, with precipitant diffusing into LCP around the perimeter of the disk. Therefore, a proper method for scaling up is to mimic the same geometry as closely as possible, which can be accomplished by introducing LCP as an extended filament of ∼500 μm diameter inside a reservoir filled with the precipitant solution. In practice, this is achieved by injecting ∼5–8 μL aliquots of protein-laden LCP as a continuous string into 100 μL Hamilton gas-tight syringes prefilled with a 10–15-fold excess of the precipitant solution and incubating them at 20 °C until crystals form (Fig. 10.4b) [33]. However, since the overall geometry of the syringe setup is not completely identical to that of the sandwich plates, small optimizations of the crystallization conditions are often necessary to ensure that a sample with a sufficiently high density of micrometer-sized crystals is produced [33].
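The scale-up arithmetic can be checked with a few lines of Python. The sketch below uses only the figures quoted in this section and the 40 nL plate bolus from Sect. 10.2.2; the function names are illustrative.

```python
import math


def scale_up_factor(target_volume_ul, plate_bolus_nl):
    """Volume ratio between the SFX sample and a single plate bolus."""
    return target_volume_ul * 1000.0 / plate_bolus_nl


def syringes_needed(target_volume_ul, lcp_per_syringe_ul):
    """Number of syringe setups required to reach the target LCP volume."""
    return math.ceil(target_volume_ul / lcp_per_syringe_ul)


def expected_crystals(volume_ul, density_per_ul=1e5):
    """Approximate number of crystals in the sample at the quoted density."""
    return volume_ul * density_per_ul


print(scale_up_factor(50.0, 40.0))    # 1250.0
print(syringes_needed(50.0, 8.0))     # 7
print(syringes_needed(100.0, 5.0))    # 20
print(expected_crystals(50.0))        # 5000000.0
```

A typical experiment therefore requires on the order of 7–20 syringe setups, all of which must reproduce the crystal size and density found in the plate.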

Table 10.2 SFX data collection and processing statistics

Just before starting LCP-SFX data collection, samples from several syringes are combined by expelling the excess precipitant solution from each syringe and consolidating their content into a single syringe. After combining samples, the remaining precipitant solution has to be absorbed to yield a homogeneous LCP sample capable of being run in the injector. This is achieved by adding approximately 10% v/v of 7.9 MAG lipid [30] (a cis-monounsaturated 1-monoacylglycerol, referred to here by the N.T MAG notation, where N represents the number of carbons between the ester bond and the double bond and T corresponds to the number of carbons between the double bond and the terminal methyl group). The addition of this lipid helps to prevent the formation of a lamellar crystalline phase, which may occur due to evaporative cooling and dehydration upon injection of the sample into a vacuum environment [44]. If the sample is intended to be injected into a chamber at ambient pressure, the same lipid as the one used for crystallization (e.g., monoolein) can be added in this step. At the same time, it is also possible to dilute the sample with freshly prepared LCP, should the crystal density be deemed too high, which should help to avoid recording multiple crystal diffraction patterns in individual detector frames during data collection.
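The N.T MAG notation and the additive volume can be made concrete with a small Python sketch. Two assumptions are made here and should be treated as such: the two counts are taken to cover the whole acyl chain (as they do for monoolein, an 18-carbon lipid written as 9.9 MAG), and the 10% v/v is taken relative to the consolidated sample volume. Class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MagLipid:
    neck_carbons: int  # N: carbons between the ester bond and the double bond
    tail_carbons: int  # T: carbons between the double bond and the terminal methyl

    @property
    def chain_length(self) -> int:
        # Assumption: N and T together account for every acyl chain carbon.
        return self.neck_carbons + self.tail_carbons


def parse_mag(notation: str) -> MagLipid:
    """Parse strings such as '7.9 MAG' or '9.9' into a MagLipid."""
    neck, tail = notation.replace("MAG", "").strip().split(".")
    return MagLipid(int(neck), int(tail))


def additive_volume_ul(sample_volume_ul: float, fraction: float = 0.10) -> float:
    """Volume of lipid to add, as a fraction of the sample volume (assumption)."""
    return sample_volume_ul * fraction


print(parse_mag("7.9 MAG").chain_length)  # 16
print(parse_mag("9.9 MAG").chain_length)  # 18
print(additive_volume_ul(40.0))           # 4.0
```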

Due to the viscous nature of LCP, a special LCP injector (also known as a High Viscosity Extrusion injector as described in Chap. 5) was designed that is capable of streaming LCP at a wide range of flow rates (0.001–3 μL/min) [30]. The LCP sample is typically loaded into a 25 or 40 μL reservoir and extruded through a 15–70 μm diameter capillary using a hydraulic plunger driven by an HPLC (High-Performance Liquid Chromatography) pump connected through a tube filled with water. The extruded LCP stream is supported by a sheath of gas, typically helium or nitrogen, to keep it flowing straight. The LCP flow rate can be matched to the X-ray FEL pulse rate to supply fresh crystals for every shot, while simultaneously minimizing the sample waste between pulses. Unlike in the gas dynamic virtual nozzle (GDVN) injector [45], the sheath gas does not focus the LCP stream below the diameter of the capillary nozzle. Therefore, while smaller diameter nozzles can decrease scattering background, they are prone to clogging and require much higher pressures (up to 10,000 psi for a 15 μm nozzle) for successful LCP extrusion. The optimal nozzle diameter for sub-10 μm crystals has been empirically found to be 30–50 μm. Larger diameter nozzles can be used to increase hit rates for samples with lower crystal densities, while smaller diameter nozzles can help with reducing sample consumption and background scattering. By keeping the LCP-crystal stream diameter to 30–50 μm, an entire dataset can be collected using <0.3 mg of purified protein [30]. For comparison, in the case of GPCR expression in insect cells, a typical yield is ∼1 mg of purified receptor per 1 L of expression media. Additional details on the setup for sample delivery of crystals in LCP are discussed in Chap. 5.
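Matching the flow rate to the pulse rate amounts to a simple volume balance. The Python sketch below assumes that the stream keeps the inner diameter of the nozzle (consistent with the absence of gas focusing) and moves as a uniform plug; the function names and the 0.2 μL/min, 50 μm, 120 Hz example values are chosen for illustration.

```python
import math


def jet_speed_um_per_s(flow_ul_per_min, nozzle_diameter_um):
    """Linear speed of the extruded LCP stream (plug-flow assumption)."""
    flow_um3_per_s = flow_ul_per_min * 1e9 / 60.0  # 1 uL = 1e9 cubic micrometers
    area_um2 = math.pi * (nozzle_diameter_um / 2.0) ** 2
    return flow_um3_per_s / area_um2


def advance_per_pulse_um(flow_ul_per_min, nozzle_diameter_um, pulse_rate_hz):
    """Distance the stream travels between two consecutive X-ray pulses."""
    return jet_speed_um_per_s(flow_ul_per_min, nozzle_diameter_um) / pulse_rate_hz


def flow_for_advance_ul_per_min(advance_um, nozzle_diameter_um, pulse_rate_hz):
    """Flow rate needed to move the stream by advance_um between pulses."""
    area_um2 = math.pi * (nozzle_diameter_um / 2.0) ** 2
    return advance_um * pulse_rate_hz * area_um2 * 60.0 / 1e9


step = advance_per_pulse_um(0.2, 50.0, 120.0)
print(round(step, 1), "um of fresh stream per pulse")
print(round(flow_for_advance_ul_per_min(step, 50.0, 120.0), 3), "uL/min")
```

Under these assumptions a 50 μm stream at 0.2 μL/min advances by about 14 μm between pulses at 120 Hz; whether this is sufficient depends on the beam size and on how far damage from the previous pulse extends along the stream.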

10.2.4 Sample Selection for LCP-SFX Data Collection

Since beam time at X-ray FEL sources is extremely limited, it is important to pre-screen samples and select those with the best chances for a successful outcome. It is reasonable to expect that conditions yielding the best diffracting crystals at synchrotron sources would also perform comparatively well at an X-ray FEL. Therefore, pre-screening samples at a synchrotron, if feasible, can help with the selection of the most promising conditions for precipitants, salts, pH, and additives. Once such conditions have been identified, it is often sufficient to slightly adjust the concentrations of the main components to produce suitable samples for an X-ray FEL source.

Two of the most important parameters for samples prepared for LCP-SFX data collection are a high crystal density and an optimal crystal size, which are inversely related to each other. In general, GPCR crystals of 5–10 μm size are optimal for LCP-SFX since they produce sufficiently strong diffraction and are compatible with relatively high crystal densities, which are required to achieve reasonable crystal hit rates (>5%). Unfortunately, no reliable procedure to concentrate crystals in LCP has been established yet, and, therefore, the crystal density cannot be increased once the crystals have grown. Another important consideration, especially when collecting data from crystals injected into a vacuum environment, is to avoid precipitant components with low solubility, if possible. Such compounds can readily crystallize upon sample extrusion in vacuum and produce strong powder diffraction, which could potentially damage sensitive detectors.

In contrast to traditional goniometer-based crystallography, where a crystal is rotated during data collection, SFX data are collected from a large number of randomly oriented still (on the femtosecond time scale) crystals, one shot per crystal, and therefore all recorded reflection intensities are partial. The data are merged using Monte-Carlo approaches, meaning that the accuracy of derived structure factor amplitudes has a strong dependency on the multiplicity of the data, whereas the completeness very quickly reaches 100%. Thus, an SFX dataset is considered “complete” when a sufficient accuracy of the data is achieved, which typically requires an average multiplicity of a few hundred. With rapid advancements in the SFX data processing software [46,47,48,49,50,51,52], the number of diffraction images required to reach the desired data quality is constantly decreasing and currently stands at 10,000–30,000 indexed patterns. With a 5–10% hit rate, it takes approximately 30–50 μL of crystal-laden LCP to collect enough data at the X-ray FEL pulse repetition rate of 120 Hz, which, at a flow rate of 0.2 μL/min, translates into 2.5–4 h of continuous beamtime (Table 10.2).
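The beamtime estimate can be reproduced with a short Python sketch. The sample volume, flow rate, pulse rate, and hit rate are taken from the text; the fraction of hits that are successfully indexed is not stated and is set to 0.5 purely as an assumption, and the function names are illustrative.

```python
PULSE_RATE_HZ = 120.0  # X-ray FEL repetition rate quoted in the text


def beamtime_hours(sample_volume_ul, flow_ul_per_min):
    """Continuous injection time for a given sample volume."""
    return sample_volume_ul / flow_ul_per_min / 60.0


def indexed_patterns(hours, hit_rate, indexing_fraction, pulse_rate_hz=PULSE_RATE_HZ):
    """Expected number of indexed patterns.

    indexing_fraction is the share of crystal hits that can be indexed;
    its value here is an assumption, not a figure from the chapter.
    """
    pulses = hours * 3600.0 * pulse_rate_hz
    return pulses * hit_rate * indexing_fraction


for volume_ul in (30.0, 50.0):
    hours = beamtime_hours(volume_ul, 0.2)
    n_indexed = indexed_patterns(hours, hit_rate=0.05, indexing_fraction=0.5)
    print(f"{volume_ul:.0f} uL -> {hours:.1f} h, about {n_indexed:,.0f} indexed patterns")
```

With these inputs, 30 μL of sample corresponds to 2.5 h of injection, about 1.1 million pulses and 54,000 hits at a 5% hit rate, so the quoted 10,000–30,000 indexed patterns are within reach even if only a fraction of the hits can be indexed.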

10.3 Review of Published Structures

10.3.1 From Validation to Novel GPCR Structure Determination

The LCP-SFX method was first introduced and validated in 2013 through the high-resolution structure determination of the human serotonin 5-HT2B receptor in complex with an agonist ergotamine [53], which was shown to be of comparable quality to the structure previously obtained using synchrotron data [54]. While the synchrotron structure was obtained using relatively large crystals (80 × 20 × 10 μm³) at cryogenic temperatures, the data collection with an X-ray FEL was performed using much smaller 5 × 5 × 5 μm³ crystals and at room temperature. The results confirmed that the LCP-SFX method enables structure determination from sub-10-μm-sized GPCR crystals at room temperature and without apparent radiation damage effects, while providing more accurate insights into receptor structure and dynamics at close to native conditions. The discrepancies between the synchrotron and X-ray FEL data were found predominantly in several side-chain conformations of solvent-exposed amino acids, which supports the view that cryo-cooling of crystals used in synchrotron data collection can trap some side-chains in artificial conformations [53, 55].

After successful validation of LCP-SFX with 5-HT2B/ergotamine, the method was further applied to solve the structure of the human smoothened receptor with the truncated cysteine-rich domain (ΔCRD-SMO) in complex with the teratogen cyclopamine [30]. SMO mediates signal transduction in the hedgehog pathway, which is implicated in normal embryonic cell development and in carcinogenesis. SMO antagonists can suppress the growth of some tumors; however, mutations in SMO have been found to abolish their antitumor effects, a phenomenon known as chemoresistance. Due to poor diffraction with high mosaicity and anisotropy of relatively large cryo-cooled crystals (∼120 × 10 × 5 μm³), the ΔCRD-SMO/cyclopamine structure could not be obtained at synchrotron sources. In contrast to the synchrotron data, the LCP-SFX data collected from sub-10-μm-sized crystals at room temperature were of reasonable quality, allowing for the structure to be solved by molecular replacement after application of an ellipsoidal data truncation at 3.4, 3.2, and 4.0 Å along the three principal crystal axes. The binding pose of the ligand cyclopamine within a narrow elongated binding cavity inside the 7TM domain of SMO was well resolved and provided the structural basis for understanding SMO receptor modulation and chemoresistance [56].
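The idea of an ellipsoidal truncation can be sketched in Python as follows. A reflection is kept if its reciprocal-space vector lies inside an ellipsoid whose semi-axes are the inverse resolution limits along each direction. The sketch assumes mutually orthogonal cell axes and that the three limits apply to a, b, and c in the order quoted; the unit cell dimensions are hypothetical and are not those of the SMO crystals, and the actual truncation in Ref. [30] was performed with dedicated software.

```python
def inside_ellipsoid(hkl, cell_abc, limits_abc):
    """Return True if reflection hkl survives the ellipsoidal cutoff.

    hkl         -- Miller indices (h, k, l)
    cell_abc    -- unit cell lengths in Angstrom (orthogonal axes assumed)
    limits_abc  -- resolution limits in Angstrom along a, b, and c
    """
    total = 0.0
    for index, axis_length, d_limit in zip(hkl, cell_abc, limits_abc):
        s = index / axis_length        # reciprocal-space component, 1/Angstrom
        total += (s * d_limit) ** 2    # scaled by the semi-axis 1/d_limit
    return total <= 1.0


cell = (40.0, 150.0, 50.0)   # hypothetical orthorhombic cell
limits = (3.4, 3.2, 4.0)     # limits quoted in the text, assumed order a, b, c

print(inside_ellipsoid((10, 0, 0), cell, limits))  # d = 4.00 A along a -> True
print(inside_ellipsoid((0, 0, 12), cell, limits))  # d = 4.17 A along c -> True
print(inside_ellipsoid((0, 0, 14), cell, limits))  # d = 3.57 A along c -> False
```

The last reflection would have been retained by an isotropic 3.2 Å cutoff, even though it lies beyond the limit of useful diffraction in that direction; removing such weak, noisy data is the purpose of the anisotropic truncation.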

Alkaloid opiates, such as morphine, are effective and widely prescribed against moderate to severe pain. These drugs target the μ-opioid receptor (μ-OR), which together with δ-OR and κ-OR plays a crucial role in pain management, mood states, consciousness, and other neurophysiological phenomena. However, prolonged administration of opioids often leads to increased tolerance, dependence, and addiction. It was shown that co-administration of morphine with δ-OR antagonists helps to reduce morphine tolerance effects in rodents [57]. The H-Tyr-Tic-Phe-Phe-OH (TIPP) class of endomorphin-derived peptide analogs offers remarkable variety in efficacies with mixed δ-OR and μ-OR profiles. The LCP-SFX method was used to determine the structure of the human δ-OR bound to a bifunctional δ-OR antagonist and μ-OR agonist tetrapeptide H-Dmt-Tic-Phe-Phe-NH2 (DIPP-NH2) belonging to the TIPP class [34]. Initially, the X-ray crystal structure of the δ-OR–DIPP-NH2 complex was determined at 3.3 Å resolution using synchrotron X-ray diffraction from cryo-cooled crystals; however, the electron density for the DIPP-NH2 peptide was not of sufficient quality to trace it unambiguously. By using LCP-SFX, the resolution was improved to 2.7 Å, showing excellent density for the peptide ligand (Fig. 10.5a). The structure revealed crucial atomic details of the bifunctional pharmacological profile of DIPP-NH2. Using a superposition with the previously solved structure of μ-OR, it was observed that DIPP-NH2 clashes with the side chains of non-conserved residues Trp318 (BW 7.35) and Lys303 (BW 6.58), highlighting the importance of these residues for the bifunctional properties of the peptide. Because structural information about peptide–GPCR complexes is limited, this structure also revealed important details of peptide recognition by GPCRs, making it valuable for structure-based drug design efforts.

Fig. 10.5

Examples of GPCR structures solved by LCP-SFX. (a) The ligand binding pocket of the δ-opioid receptor in complex with DIPP-NH2 (PDB ID 4RWD). The figure is reused from Ref. [34]. (b) A comparison of the ligand binding pose between two angiotensin receptors, AT1R (orange ligand, PDB ID 4YAY) and AT2R (purple ligand, PDB ID 5UNG). The figure is reused from Ref. [36]. (c) The structure of a complex between rhodopsin (blue) and arrestin (purple) (PDB ID 5W0P). (d) The full-length GCGR receptor in complex with a small molecule allosteric modulator (purple) and a monoclonal mAb1 antibody fragment (teal) (PDB ID 5XEZ). The figure is reused from Ref. [26]

10.3.2 First Novel GPCR Structures Solved Using LCP-SFX

The next important milestone of LCP-SFX was achieved in 2015 with the first structure determination of a novel GPCR, the human angiotensin II receptor type 1 (AT1R) [35, 36]. AT1R serves as a primary blood pressure regulator in the cardiovascular system. Although several AT1R blockers (ARBs) have been developed and approved as antihypertensive drugs, the structural basis for AT1R ligand-binding and regulation has remained elusive, mostly due to the difficulties of growing high-quality crystals for structure determination using synchrotron radiation. By applying the LCP-SFX method, the first room-temperature crystal structure of the human AT1R in complex with its selective antagonist ZD7155 was solved at 2.9 Å resolution [35] using crystals with an average size of 10 × 2 × 2 μm³. This structure revealed the critical interactions between ZD7155 and the receptor and served as a basis for the binding mode determination of other ARBs by means of molecular docking [58].

Successful structure determination of AT1R was followed by work on the angiotensin II receptor type 2 (AT2R), which is another key component of the renin–angiotensin–aldosterone system. In contrast to the well-defined function of AT1R, the function of AT2R is unclear, with a variety of reported effects [59, 60]. The initial crystal hits were optimized to produce a high density of small crystals. Interestingly, analysis of the collected LCP-SFX data revealed that the crystal suspension contained two distinct crystal forms in the same crystallization setup. Consequently, two structures of the human AT2R bound to an AT2R-selective ligand (cpd-1) were determined at 2.8 Å resolution in two different space groups (monoclinic P2₁ and orthorhombic P2₁22₁) [36]. Both structures captured the receptors in an active-like conformation. Unexpectedly, helix VIII was found in a noncanonical position. In most previously reported GPCR structures helix VIII runs parallel to the membrane on the intracellular side, whereas in the AT2R structure helix VIII flips over to interact with the intracellular ends of helices III, V, and VI, thus stabilizing the active-like state, but at the same time sterically preventing the recruitment of G proteins or β-arrestins. This finding is in agreement with the absence of signaling responses of AT2R in standard cellular assays [61, 62]. The AT2R structure provides insights into the structural basis of the distinct functions of the angiotensin receptors and may guide the design of new selective ligands (Fig. 10.5b).

10.3.3 Rhodopsin–Arrestin Complex Structure

After success with structure determination of novel GPCRs, a phenomenal advantage of LCP-SFX over traditional crystallography was demonstrated with the determination of the first high-resolution structure of a major signaling complex between a GPCR and an arrestin [21]. Most GPCRs upon activation primarily signal via interactions with G proteins, followed by phosphorylation of their C-terminus and ICLs by G protein-coupled receptor kinases (GRKs). Activated and phosphorylated receptors are recognized by arrestins, which, in turn, block interaction with G proteins and induce internalization. Arrestins are also responsible for triggering a variety of G protein-independent signaling cascades, and hence they constitute an essential component of GPCR signaling pathways. Biased GPCR ligands that selectively direct signaling towards specific pathways bear significant therapeutic benefits with fewer side effects compared to unbiased ligands. The first GPCR signaling complex of β2-adrenergic receptor bound to its cognate Gs protein was solved in 2011 [22]. Structural details of arrestin binding to GPCRs, however, remained undiscovered until 2015, when finally the first structure of the human visual rhodopsin in complex with arrestin was obtained by LCP-SFX [21].

Rhodopsin is a prototypical GPCR that functions as a photon receptor in the visual systems. Crystal structures of rhodopsin have been previously determined in various states: a ground inactive form [63], a partially active form (opsin) [64], and a fully active form in complex with a C-terminal peptide of Gα [65]. The structures of an inactive and preactivated arrestin are also available [66, 67]. Obtaining the structure of the rhodopsin–arrestin complex, however, required overcoming additional challenges. The wild-type complex is characterized by heterogeneous interaction modes between the rhodopsin and arrestin. Therefore, the proteins had to be engineered to stabilize a single state for crystallization [21]. A constitutively active mutant of the human visual rhodopsin (E113Q and M257Y) along with a pre-activated form of the mouse arrestin (3A arrestin with the mutations L374A, V375A, and F376A) were used to increase their mutual affinity. To further improve the stability and shift the equilibrium towards the complex formation, a polypeptide linker was designed to connect the C-terminus of rhodopsin with the N-terminus of arrestin. Initial crystals of the complex ranging in size between 5 and 20 μm were obtained using the LCP crystallization technique. Despite extensive optimization efforts, the diffraction quality could not be improved beyond 6–8 Å at synchrotron microfocus beamlines. Therefore, crystallization conditions were optimized to yield showers of 5–10-μm-sized crystals in syringe crystallization setups. Due to a relatively low hit rate, about 10 h of LCP-SFX data collection were required to solve the structure [21, 68]. Diffraction patterns from 18,874 crystals were indexed and integrated. The data were initially processed in the apparent Laue class 4/mmm with a large unit cell (a = b = 109.2 Å, c = 452.6 Å). 
Structure determination was, however, complicated by pseudo-merohedral twinning, caused by the identical lengths of the a and b axes, and by pseudotranslation along the a and b axes. Finally, the structure was successfully determined by lowering the lattice symmetry to the P2₁2₁2₁ space group, with four molecules in the asymmetric unit and perfect twinning described by the twin law k, h, −l. The diffraction was anisotropic with resolution limits of 3.8 Å and 3.3 Å along the a/b and c axes, respectively. The obtained structure represented the first crystal structure of a GPCR bound to arrestin and, together with additional biophysical and biochemical data, provided a molecular basis for understanding the mechanism of arrestin-mediated signaling [21].
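
To illustrate how a twin law such as k, h, −l acts on the data, the following minimal Python sketch maps a reflection index to its twin-related mate and mixes the two intensities with a twin fraction α. For perfect twinning (α = 0.5) the two twin-related reflections are observed with the same averaged intensity, which is why the data mimic a higher apparent symmetry. The function names and the toy intensities are hypothetical and are not taken from the processing pipeline used in [21].

```python
def twin_mate(hkl):
    """Apply the twin law (k, h, -l) to a Miller index (h, k, l)."""
    h, k, l = hkl
    return (k, h, -l)


def twinned_intensity(true_intensities, hkl, twin_fraction=0.5):
    """Observed intensity of reflection hkl from a twinned crystal.

    I_obs(hkl) = (1 - alpha) * I(hkl) + alpha * I(twin mate of hkl),
    where alpha is the twin fraction (0.5 for perfect twinning).
    """
    mate = twin_mate(hkl)
    return ((1.0 - twin_fraction) * true_intensities[hkl]
            + twin_fraction * true_intensities[mate])


# Toy, made-up intensities for one pair of twin-related reflections.
toy = {(1, 2, 3): 100.0, (2, 1, -3): 40.0}

for hkl in toy:
    print(hkl, "->", twin_mate(hkl), "I_obs =", twinned_intensity(toy, hkl))
```

Applying the operator twice returns the original index, so reflections are grouped into pairs; with α = 0.5 both members of a pair carry the same observed intensity and the individual true intensities cannot be recovered by simple detwinning.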

The key findings of this study include the following observations: (1) rhodopsin, within the complex, adopts a canonical active state conformation, overall highly similar to the conformation of β2AR in complex with Gs protein [22], except for small deviations of helices I, IV, V, and VII, some of which may be related to the mechanism of arrestin-biased signaling; (2) the phosphorylated C-terminal tail of rhodopsin is paired to the highly cationic N-terminal domain of arrestin, displacing its C-terminus and triggering arrestin activation; (3) activated arrestin undergoes a 20° rotation between its N- and C-domains, consequently opening a cleft between the middle loop and the C-loop to accommodate the ICL2 helix of rhodopsin; (4) additionally, the finger loop of arrestin adopts a short alpha-helical conformation, which fits in the opening created by the outward displacement of helix VI on the intracellular side of rhodopsin and interacts with helices VII and VIII; and (5) finally, a conserved hydrophobic patch at the C-tip of arrestin anchors it in the lipid bilayer helping to stabilize the arrestin-rhodopsin interactions.

More recently, improvements in data processing algorithms made it possible to re-process the LCP-SFX data for the rhodopsin–arrestin complex, leading to an increased resolution (3.6 Å and 3.0 Å along the a/b and c axes, respectively) and better-quality electron density maps [43]. The improved maps allowed the C-terminus of the receptor to be traced (Fig. 10.5d), helping to identify a set of phosphorylation codes for arrestin recruitment by GPCRs.

10.3.4 Full-Length Smoothened Receptor Structure

Due to challenges with crystallization of multidomain non-class A receptors, the first structural studies of these GPCRs focused on their 7TM domains [69, 70]. Progress in receptor stabilization techniques and the development of LCP-SFX facilitated the structure determination of full-length receptors. While the initial structures of ΔCRD-SMO in complex with several 7TM antagonists and agonists shed light on the ligand binding poses and interactions with the 7TM domain [56, 70], the mutual arrangement of the extracellular domains (ECD), which include a hinge domain (HD) and an extracellular cysteine-rich domain (CRD), remained unknown. Since the ECD plays an important role in ligand recognition and receptor activation through allosteric effects, this information was an important missing piece for a full mechanistic understanding of SMO function. Previous biochemical studies revealed a ligand-binding site that is situated on the surface of the CRD and targeted by cholesterol-like molecules [71]. It was shown that the CRD has an auto-inhibitory effect on SMO [72], whereas oxysterols release CRD suppression and activate the hedgehog pathway. To suggest a model of the SMO activation mechanism, a structure of the multidomain human SMO in complex with a specially designed super-stabilizing ligand was solved using LCP-SFX and synchrotron data [27]. The structure revealed a hydrophobic pocket that is formed by CRD (residues V107, L108, L112), HD (residue V210), and ECL3 (residues V494, I496) and constitutes an oxysterol binding site. Comparison of these structures with the concomitantly published multidomain SMO structures in complex with vismodegib and cholesterol [73] revealed important structural features: the CRD tilting angle was found to be different in all structures, along with rearrangements of ECL3 supporting this conformation.
The structural data combined with hydrogen-deuterium exchange analysis and molecular dynamics simulations suggested a unique mechanism, in which helix VI, ECL3, and HD play a central role in the signal transmission across the receptor.

10.3.5 Full-Length Class B Glucagon Receptor Structure

Class B GPCRs (secretin family) are mostly peptide hormone receptors that are indispensable drug targets for metabolic diseases, like diabetes, cardiovascular disease, neurodegeneration, and some psychiatric disorders [74]. These receptors consist of an N-terminal extracellular domain (ECD) and a 7TM domain, both of which are required for binding to their endogenous peptide ligands and regulation of cell signal transduction. The glucagon receptor GCGR belongs to class B and plays a key role in glucose homeostasis and the pathophysiology of type 2 diabetes. It has long been targeted by structural studies, and the structure of the 7TM domain bound to a small molecule drug was solved by conventional synchrotron crystallography using crystals grown in LCP [69, 75]. Although this structure provided important information about the receptor, the lack of ECD limited our understanding of the receptor function. Crystallization of the full-length receptor required further efforts of construct optimization and, in particular, utilization of a stabilizing Fab (fragment antigen binding) antibody fragment. Eventually, the structure of the full-length human GCGR containing both ECD and 7TM domains in complex with a Fab fragment of an inhibitory antibody mAb1, and a negative allosteric modulator NNC0640 was determined at 3.0 Å resolution using the LCP-SFX method (Fig. 10.5d) [26]. As in most of the previous examples, the data collected from small crystals with an X-ray FEL showed superior quality compared to the synchrotron data, thereby improving the resolution of the dataset from 3.2 Å to 3.0 Å. Despite the challenge of the twinned data, the GCGR/NNC0640–mAb1 complex structure was solved by molecular replacement (MR). No substantial differences were observed between the X-ray FEL and the synchrotron structures with a backbone r.m.s.d. (root-mean-squared deviation) of 0.6 Å. 
The GCGR/NNC0640–mAb1 structure revealed an unexpected conformation of the stalk region, which links 7TM with ECD. Whereas in the initial structure of the 7TM domain it formed a three-turn α-helical extension of helix I, in the full-length structure the stalk adopts a β-strand conformation that runs across the helical bundle flanked by ECL1 on one side and ECL2 and ECL3 on the other. Given such a dramatic difference in the conformation of the stalk, the relative orientation between ECD and TMD revealed by the full-length structure was observed to be drastically different compared to a predicted model based on the 7TM structure alone. In addition, ECL1 was found to exhibit a β-hairpin conformation interacting with the stalk to form a compact β-sheet structure, potentially playing a critical role in modulating peptide ligand binding and receptor activation.
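
The backbone r.m.s.d. quoted above for the X-ray FEL and synchrotron structures is the root-mean-squared distance between equivalent atoms. A minimal Python sketch of that formula is given below, assuming the two coordinate sets have already been superposed and list the same atoms in the same order; the function name and the toy coordinates are hypothetical, and no superposition step is performed here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-squared deviation between two matched coordinate lists.

    Each list holds (x, y, z) tuples in Angstrom for equivalent atoms,
    in the same order. The structures are assumed to be superposed already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: every atom is displaced by 0.6 Angstrom along z.
model_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_2 = [(x, y, z + 0.6) for (x, y, z) in model_1]
print(f"r.m.s.d. = {rmsd(model_1, model_2):.2f} A")
```

In practice the value depends on which atoms are included (for example, backbone atoms only) and on how the two models are superposed, so reported r.m.s.d. values are only comparable when those choices match.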

10.3.6 Structural Basis for GPCR Extracellular Recognition by Antibodies

With their growing success in clinical studies, monoclonal antibodies (mAbs) have become a critically important modality and a powerful alternative to small molecule therapies [76]. Recently developed mAbs have shown about twice the chance of FDA approval compared with conventional small molecule compounds [77]. Despite this success relative to other approaches, the failure rate remains considerable, with 85% of leads failing in clinical trials, which emphasizes the need for a deeper understanding of the underlying biology and, in particular, of the interactions with antigens. Due to their potentially high affinity, selectivity, long duration of action, and engineered ability to penetrate the blood-brain barrier, mAbs are well suited for targeting a large variety of GPCRs. Unfortunately, class A GPCRs, the most abundant class, are characterized by a relatively small solvent-exposed extracellular surface, which makes the production of high-affinity, selective mAbs very challenging. In order to gain insights into the molecular basis of extracellular recognition of GPCRs by mAbs, a complex between the human 5-hydroxytryptamine 2B (5-HT2B) receptor bound to the agonist ergotamine (ERG) and a selective antibody Fab fragment was crystallized and solved by means of LCP-SFX [38]. While previous structures of 5-HT2B/ERG were captured in a partially activated state with only some of the activation features observed [78], the 5-HT2B/ERG-Fab structure reveals the receptor in a distinct active-like state, with extensive activation-related changes displayed throughout the receptor including conserved activation “microswitches” and large-scale intracellular displacements of helices VI and VII [38]. This work also provided the first insight into structural determinants for extracellular GPCR recognition by mAbs, as all the previous structures of GPCR-antibody complexes contained Fabs/nanobodies bound to the intracellular side of the receptors.
The 5-HT2B/ERG-Fab structure, therefore, can be considered an important first step towards a rational development of therapeutic mAbs.

10.4 Experimental Phasing of XFEL Data for GPCRs

All novel GPCR structures obtained by LCP-SFX and discussed in this chapter were solved using the molecular replacement (MR) method, which relies on prior knowledge of related structures. The overall conservation of the 7TM fold and the presence of fusion domains of known structure make it relatively straightforward to generate search models for MR. However, a good search model for MR may not be available for all targets. For instance, in the case of GPCRs, the majority of structural information comprises class A receptors, whereas classes B, C, and Frizzled are represented by only a few structures. If it is impossible to create an adequate model for an MR search, experimental phasing methods become necessary. These are typically based on the introduction of anomalous scatterers into crystals in a way that does not change the target structure (i.e., the derivatized crystals are isomorphous). The first successful experimental phasing of SFX data was demonstrated with lysozyme crystals using single-wavelength anomalous dispersion (SAD) of gadolinium, which exhibits a very strong anomalous signal [79]. Experimental phasing of SFX data for another soluble test protein, the luciferin-regenerating enzyme, with a more conventional mercury compound succeeded using the SIRAS (single isomorphous replacement with anomalous signal) method [80]. A very recent work by Colletier et al. [81] showed that experimental phasing using X-ray FELs could be achieved for crystals with an average size of 500 nm, which corresponds to approximately 50 unit cells per crystal edge. In this work, the experimental phases were derived using multiple isomorphous replacement with anomalous scattering (MIRAS) by combining three heavy-atom derivatives and a native dataset.

These methods rely on the incorporation of heavy atoms into protein crystals, which requires extensive screening of various compounds at different concentrations, and many of these compounds suffer from poor solubility. For example, while derivatization with Ta6Br12 clusters had previously been successful in the case of SMO and mGluR1 at synchrotron sources [70, 82], our attempts to use the same approach for X-ray FEL data collection with crystals of the 5-HT2B receptor had limited success. Ta6Br12 precipitated and crystallized immediately upon delivery into the vacuum environment during the LCP-SFX data collection, which resulted in very bright Bragg reflections on the detector. To prevent detector damage, the X-ray FEL beam had to be attenuated significantly, resulting in the anomalous signal extending only to ∼8 Å. Although the Ta6Br12 cluster was incorporated into crystals and could be located in the anomalous difference electron density map, the phasing attempts were unsuccessful. Moreover, efficient isomorphous incorporation of heavy atom compounds into the crystal lattice is quite often unattainable.

On the other hand, the sulfur SAD (S-SAD) phasing method allows for the determination of protein structures without additional modification of crystals such as heavy-atom derivatization or incorporation of selenomethionine. This method was proposed by Hendrickson and Teeter in 1981 [83] and is becoming increasingly popular for de novo experimental phasing due to advances in data collection techniques and data processing algorithms. Challenges remain, however, because the anomalous signal from sulfur atoms is very small, so the anomalous differences must be measured very accurately.

The first successful S-SAD phasing of SFX data was demonstrated for lysozyme crystals [79], followed by another test protein, thaumatin [84]. These proteins are widely used as crystallization standards and in the development of new crystallographic methods, because they are commercially available and inexpensive, their crystal suspensions are stable for years, and their crystals diffract to high resolution. The highest reported resolution for lysozyme crystals is 0.65 Å (PDB ID: 2VB1) and for thaumatin crystals is 0.94 Å (PDB ID: 2VHK). Most GPCR crystals, however, diffract in the range of 2.4–3.4 Å, with only a few receptors diffracting better than 2.4 Å, for example, the human δ-opioid receptor (PDB ID: 4N6H, 1.8 Å) [85] and the human adenosine A2A receptor (A2AAR) (PDB ID: 4EIY, 1.8 Å) [86], as well as a thermostabilized adenosine A2A receptor (PDB ID: 5NM4, 1.7 Å) [87].

To demonstrate the possibility of S-SAD phasing for GPCRs at X-ray FEL sources, anomalous LCP-SFX data from ∼5 × 2 × 2 μm³ crystals of A2AAR were collected using an X-ray energy of 6 keV as a compromise between the strength of anomalous scattering from sulfur atoms (K-edge 2.472 keV) and the detector-size and energy limits on resolution [37]. At this energy, the anomalous difference in structure factors is expected to be <1.5%, requiring a very high multiplicity of collected data. To minimize background X-ray scattering, the sample was injected into a vacuum chamber, and the X-ray beam was attenuated to 14% to prevent the disruption of the LCP jet by the shockwaves from micro-explosions and to avoid the oversaturation of the CSPAD detector [88].
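
The size of the expected anomalous signal can be illustrated with a short Python sketch based on the commonly used Hendrickson–Teeter style estimate, ⟨|ΔF|⟩/⟨|F|⟩ ≈ √2 · √(N_A/N_P) · f″/Z_eff, together with the energy-to-wavelength conversion λ = hc/E. This is only a rough estimate under stated assumptions and not the calculation performed in [37]: the f″ value for sulfur near 6 keV (about 0.95 electrons), the average of 7.8 non-hydrogen atoms per residue, and Z_eff = 6.7 are approximate, assumed inputs, while the counts of 12 ordered sulfur atoms and 447 residues are those quoted later in this section.

```python
import math

HC_KEV_ANGSTROM = 12.398  # h*c in keV*Angstrom (rounded)


def wavelength_angstrom(energy_kev):
    """Convert photon energy (keV) to wavelength (Angstrom)."""
    return HC_KEV_ANGSTROM / energy_kev


def bijvoet_ratio(n_anomalous, n_protein_atoms, f_double_prime, z_eff=6.7):
    """Rough estimate of <|dF|>/<|F|> for n_anomalous identical scatterers.

    z_eff is the effective atomic number of an average protein atom
    (6.7 is a commonly used value; treated here as an assumption).
    """
    return (math.sqrt(2.0) * math.sqrt(n_anomalous / n_protein_atoms)
            * f_double_prime / z_eff)


N_SULFUR = 12                  # ordered sulfur atoms, from the text
N_RESIDUES = 447               # residues, from the text
ATOMS_PER_RESIDUE = 7.8        # ASSUMED average non-hydrogen atoms per residue
F_DOUBLE_PRIME_S_6KEV = 0.95   # ASSUMED approximate f'' of sulfur near 6 keV

ratio = bijvoet_ratio(N_SULFUR, N_RESIDUES * ATOMS_PER_RESIDUE,
                      F_DOUBLE_PRIME_S_6KEV)
print(f"6.0 keV -> {wavelength_angstrom(6.0):.2f} A")
print(f"9.8 keV -> {wavelength_angstrom(9.8):.2f} A")
print(f"Estimated anomalous difference: {100 * ratio:.1f}% of |F|")
```

Under these assumptions the estimate comes out at roughly 1%, in line with the <1.5% quoted above, which shows why the measurement error of the merged intensities has to be pushed well below the percent level by high multiplicity.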

Within ∼17 h of data collection, a total of 7,324,430 images were collected, of which 1,797,503 were identified as crystal diffraction patterns using the Cheetah hit finding software [89]. A total of 593,996 of these hits were successfully indexed using the CrystFEL software [90]. The final reflection list, created using Monte Carlo integration and iterative scaling, resulted in a dataset at a resolution of 2.5 Å. This resolution was limited by the X-ray energy, detector size, and minimal achievable sample-to-detector distance. To further extend resolution, additional data at an X-ray energy of 9.8 keV (wavelength, 1.27 Å) were collected. This high-resolution data set was assembled from 72,735 indexed patterns and was truncated at 1.9 Å resolution.
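
The image counts above translate directly into hit and indexing rates. The following short Python sketch does that arithmetic; the function and key names are hypothetical, and the 17 h duration is the approximate value quoted in the text, so the derived acquisition rate is approximate as well.

```python
def collection_stats(n_images, n_hits, n_indexed, hours):
    """Summarize an SFX data collection run from raw counts."""
    return {
        "hit_rate": n_hits / n_images,              # hits per recorded image
        "indexing_rate": n_indexed / n_hits,        # indexed patterns per hit
        "indexed_per_image": n_indexed / n_images,  # overall yield
        "images_per_second": n_images / (hours * 3600.0),
    }


# Counts quoted in the text for the 6 keV A2AAR dataset.
stats = collection_stats(n_images=7_324_430, n_hits=1_797_503,
                         n_indexed=593_996, hours=17.0)

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With these numbers about a quarter of the images are hits and about a third of the hits are indexed, so roughly 8% of all recorded images contribute to the final dataset, at an average acquisition rate of about 120 images per second.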

Compared to the previously reported S-SAD phasing of SFX data for lysozyme [79] and thaumatin [84], phasing of A2AAR data required approximately four times more indexed patterns. In addition to lower crystal symmetry and lower sulfur content, the diffraction power of A2AAR microcrystals is substantially lower compared to lysozyme crystals of similar size. At the same time, the background scattering from an LCP stream 50 μm in diameter, in which A2AAR microcrystals were delivered, is much greater [91] than the background from a liquid stream 5 μm in diameter used to deliver lysozyme crystals.

These factors, together with a potentially lower isomorphism of A2AAR microcrystals as compared to crystals of soluble test proteins, contribute to the challenge of native sulfur phasing of SFX data for membrane proteins. In this experiment, protein consumption for de novo phasing was very reasonable (∼2.7 mg) owing to the efficient operation of the LCP injector [30, 92]. These results demonstrate that ∼600,000 indexed patterns are sufficient to phase GPCR data starting with 12 ordered sulfur atoms per 447 residues (2.7%). They can be placed in perspective by noting that over 88% of all human proteins have a Cys and Met residue content higher than 2.7%. Thus, this result provides an important reference point, suggesting that most human proteins could be phased by S-SAD for de novo structure determination with X-ray FELs, provided sufficient sample quantities are available.
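
As a simple illustration of the 2.7% reference point, the Python sketch below computes the Cys + Met fraction of a protein sequence and compares it with the 12/447 ratio quoted above. The function name and the short example sequence are hypothetical, and the sketch counts all Cys and Met residues in a sequence, whereas the figure in the text refers to ordered sulfur atoms observed in the crystal structure.

```python
REFERENCE_FRACTION = 12 / 447  # ordered sulfur atoms per residue, from the text


def sulfur_residue_fraction(sequence):
    """Fraction of residues that are Cys (C) or Met (M) in a one-letter sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(seq.count(aa) for aa in "CM") / len(seq)


# Made-up toy sequence, for illustration only.
toy_sequence = "MKTCAVLLGAMCPE"
fraction = sulfur_residue_fraction(toy_sequence)

print(f"Reference (A2AAR construct): {100 * REFERENCE_FRACTION:.1f}%")
print(f"Toy sequence Cys+Met content: {100 * fraction:.1f}%")
print("Above reference:", fraction > REFERENCE_FRACTION)
```

Sequence content is only a first-pass indicator: disordered sulfur atoms contribute little anomalous signal, and diffraction power, symmetry, and background also determine how many patterns are needed.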

10.5 Conclusions

The nascent technique of LCP-SFX, despite its short history, has already stimulated great progress in the field of GPCR structural biology. With the majority of human GPCR structures still unsolved, it promises to alleviate roadblocks that have traditionally hampered GPCR crystallography, such as the need to obtain sufficiently large crystals devoid of growth defects. This method has already resulted in a number of important breakthroughs including the structure of rhodopsin bound to arrestin [21], as well as two angiotensin receptors [35, 36], and several full-length non-class A receptors [26, 27, 73]. In the near future, X-ray FELs will likely allow for the study of the highly flexible GPCRs in a dynamic fashion, shedding light on the structural foundation of their activation process. Indeed, X-ray FELs already facilitate crystallography at room temperature, which is closer to physiological conditions compared to cryo-cooled crystals at synchrotrons. Side chain and loop movements, as well as differential water occupancy at room temperature, can be informative as to how these proteins function in native cells [55].

The use of ultrashort X-ray FEL pulses also opened up the field of time-resolved crystallography with very fine temporal resolution and the ability to study irreversible processes. The inherent dynamic nature of GPCRs and the high biological relevance of their conformational response to ligand binding make them very exciting targets for time-resolved studies. For GPCRs, the challenge arises of how to trigger receptor conformational changes precisely, as, except for the case of visual rhodopsins, they are not inherently light sensitive. Current efforts focus on the design of suitable, covalently attached or diffusible photosensitive ligands that can elicit the receptor conformational change upon a flash of laser light, as well as on adapting rapid mixing injectors to microcrystals delivered in LCP, where the high speed and efficiency of diffusion in microcrystals alleviate in part the lower temporal resolution of mixing experiments as compared to light-driven reactions. GPCR activation takes place over several time scales, from fast internal motions and plasticity of the ligand-binding pocket (∼nanoseconds), to rearrangements of receptor microswitches (∼microseconds), to large-scale helical motions (∼milliseconds) [93]. What part of the activation process will be amenable to studies by time-resolved crystallography using X-ray FELs will depend on the efficiency of the mechanism triggering the receptor conformational change and on the ability of crystals to accommodate those conformational changes, but not on the characteristics of the X-ray FEL beam itself, which offers femtosecond temporal resolving power.

Lastly, the LCP-SFX method has the potential to facilitate structure-based drug design (SBDD) for GPCRs and other important membrane protein drug targets. SBDD studies rely heavily on the availability of X-ray structures to characterize the ligand binding site and use this information to guide ligand optimization. Tens to hundreds of ligand-bound structures are normally determined in the course of a drug design program prior to clinical trials. To date, the application of this approach to membrane proteins based on experimentally determined structures, particularly of GPCRs, has been very limited or nonexistent [94]. By enabling high-resolution data collection from micrometer-sized GPCR crystals, LCP-SFX can overcome the main bottleneck behind SBDD, namely the need for extensive crystal optimization for each selected ligand–receptor complex, which can often take months to years. Additionally, micrometer-sized crystals facilitate common procedures of co-crystal generation, such as ligand soaking and exchange, while the overall LCP-SFX protocol further simplifies data collection on a large number of co-crystals by eliminating the need for crystal harvesting and cryo-cooling.

Taking into account all the impressive results reviewed in this chapter, we expect that with further progress in X-ray FEL instrumentation, development of novel fast detectors with high dynamic range, design of more efficient sample delivery approaches, and commissioning of new X-ray FEL sources, the impact of LCP-SFX on the structural biology of GPCRs will continue to grow.