Main

The extracellular region of Smoothened (SMO) is composed of an N-terminal CRD followed by a small linker domain, which then connects to the TMD and a C-terminal intracellular domain (ICD; Fig. 1a). Studies using small-molecule agonists and antagonists of SMO have defined two separable ligand-binding sites, one in the TMD and one in the CRD1. The TMD binding site binds the plant-derived inhibitor cyclopamine2,3, the synthetic agonist SAG4,5, and the anti-cancer drug vismodegib6, which is used clinically to treat advanced basal cell cancer. Side-chain oxysterols such as 20(S)-hydroxycholesterol (20(S)-OHC) represent a distinct class of SMO ligands7,8,9 that activate signalling by engaging a hydrophobic groove on the surface of the SMO CRD10,11,12. The native morphogen Sonic Hedgehog (SHH) functions by binding and inactivating Patched 1 (PTCH1), the major receptor for Hh ligands, which suppresses SMO activity13. Despite the discovery of numerous exogenous SMO ligands, no bona fide endogenous SMO ligand that regulates Hh signalling has been identified. Structure-guided mutations that disrupt 20(S)-OHC binding to the CRD groove or sterol-based inhibitors that occlude this groove impair signalling by SHH10,11. By contrast, several mutations in the TMD site that blocked the binding and activity of synthetic ligands failed to have any effect on the basal or SHH-stimulated activity of SMO12,14. These data suggest that an endogenous SMO ligand that can regulate Hh signalling engages the CRD groove on SMO.

Figure 1: Structure of human SMO.

a, Two views of the overall structure showing the extracellular and transmembrane domains of human SMO in cartoon representation. Orange, CRD; pink, linker domain (LD); blue, TMD; red, inactivating point mutation Val329Phe; cyan, cholesterol; black, nine numbered disulfide bridges; yellow sticks, two N-linked glycans (NAG). A schematic of SMO is shown above (SP, signal peptide; BRIL, position of the BRIL fusion protein inserted between TMD helices 5 and 6). b, The ‘connector’ region between the CRD and linker domain highlighted as sticks in atomic colouring, with the CRD shown as a solvent-accessible surface and the linker domain and part of the TMD ECL3 loop as cartoons. c, Interface between the CRD, linker domain and TMD shown in cartoon representation. Yellow sticks, ECL3-NAG; cyan sticks, cholesterol.

Crystal structures of the isolated SMO linker domain–TMD in complex with both agonist and antagonist ligands15,16,17 have shown that the GPCR heptahelical scaffold is conserved and provided a detailed view of a small-molecule binding pocket, but did not show the conformational changes typically associated with GPCR signalling18,19. In addition, two unliganded structures of the isolated SMO CRD have been solved10,20. However, structural insights into how the extracellular domains and TMD interact to regulate signalling in SMO (or in any other GPCR) are lacking.

Overall structure of SMO

We determined the crystal structure of human SMO containing both the CRD and the TMD, connected by the juxta-membrane linker domain (SMOΔC, Fig. 1a and Extended Data Fig. 1). To study the SMO TMD in a defined functional state and to reduce conformational flexibility, we included a single amino acid mutation, Val329Phe16, in TMD helix 3 that locked SMO in an inactive state and substantially improved expression levels (Extended Data Fig. 2 and Supplementary Discussion). Using an established strategy in GPCR crystallography, the third intracellular loop (ICL3) between transmembrane helices 5 and 6 was replaced by thermostabilized apocytochrome b562RIL (BRIL)21. The SMOΔC structure was determined to 3.2 Å resolution (Extended Data Table 1 and Extended Data Fig. 3). The asymmetric unit, comprising two molecules arranged ‘head-to-tail’, stacks into alternating hydrophobic and hydrophilic layers along one axis, as is typical for lipidic cubic phase (LCP)-derived crystals (Extended Data Fig. 3a). This SMO arrangement within the crystal suggests that SMOΔC is monomeric, in agreement with size-exclusion chromatography (SEC) coupled to multi-angle light-scattering analysis (MALS) (Extended Data Fig. 3f).

SMO adopts an extended conformation in the structure. The extracellular CRD is perched on top of the linker domain, which forms a wedge between the TMD and CRD. At the apex of this wedge, the CRD contacts the TMD through the elongated TMD extracellular loop 3 (ECL3; Fig. 1a). The overall architecture is stabilized by nine disulfide bridges, four of which (numbered 2–5 in Fig. 1a) reveal the canonical disulfide pattern of the CRD fold22 and one of which is specific for SMO10 (marked as 1′ in Fig. 1a). This disulfide bridge positions the start of a ‘connector’ segment that links the CRD and linker domain (Fig. 1b). The connector is tucked along a hydrophobic groove that runs the length of the CRD and shields this groove with three inwardly turned hydrophobic residues (Val182, Ile185 and Phe187), multiple hydrophilic residues and an N-linked glycan facing the solvent (Fig. 1b and Extended Data Fig. 4a). The connector region, combined with mainly hydrophobic interactions between the CRD and both the linker domain and ECL3 (total buried surface area of 745 Å²), orients the CRD in an upright conformation with its N terminus pointing away from the plasma membrane (Fig. 1a, b and Extended Data Fig. 4b). However, we did not observe major structural changes in the heptahelical TMD bundle when we compared it to previously solved structures of either antagonist- or agonist-bound complexes lacking the CRD15,16,17. The only exception was a rearrangement of the linker domain, which in our structure is pushed down towards the TMD, perhaps by CRD binding (Extended Data Fig. 5). A structure of the activated state of the SMO TMD will probably require co-crystallization with its (still unknown) downstream effector or the use of an active-state stabilizing antibody.

Cholesterol is a ligand for SMO

Unexpectedly, we discovered a cholesterol molecule in our SMO structure (Fig. 1 and Extended Data Methods). Cholesterol occupies a central position, interacting with all three SMO domains (the CRD, linker domain and TMD; Figs 1c and 2a, b), and adopts an extended conformation with its tetracyclic sterol ring bound in a shallow groove in the CRD, a site previously shown10,11,12 to bind 20(S)-OHC in SMO and the palmitoleyl group of Wnt ligands in Frizzled receptors23. The cholesterol iso-octyl tail, located at the interface between the CRD, linker domain and TMD ECL3, is buried in the SMO protein core. This arrangement positions cholesterol some 12 Å away from the lipid bilayer of the plasma membrane (Fig. 2c) and indicates that a cholesterol molecule would have to completely desorb from the plasma membrane surface to access its binding site in SMO. Cellular cholesterol levels are permissive for Hh signalling24,25; however, this requirement is likely to be unrelated to the cholesterol binding site seen in our structure because SMO mutants lacking the CRD12 or carrying mutations in the cholesterol-binding groove25 remain sensitive to cholesterol depletion (Supplementary Discussion).

Figure 2: The cholesterol binding site.

a, b, Close-up of cholesterol with interacting residues as sticks. Initial 2Fo − Fc map at 1.0σ before inclusion of cholesterol shown as chicken-wire. Colour-coding follows Fig. 1. Inset shows the potential hydrogen-bonding network coordinating the cholesterol 3β-hydroxyl group. Interatomic distances (Å) are shown in black. c, Solvent-accessible surface colour-coded by hydrophobicity (red, hydrophobic; white, hydrophilic). d, Sequence conservation (based on 55 vertebrate SMO sequences) mapped onto SMOΔC (black, conserved; white, not conserved). e, Superposition of human (orange, this study) and fly (purple, PDB 2MAH20) SMO CRD structures. The fly CRD region occupying the cholesterol-binding site is highlighted by the dashed line.

The sterol-binding site of SMO is predominantly lined with hydrophobic residues from the CRD, which stabilize the flat α-face of cholesterol (Fig. 2a). Mutations in several of these residues (Leu108, Trp109, Pro164 and Phe166) have been noted to prevent SMO binding to 20(S)-OHC and to impair signalling driven by either 20(S)-OHC or SHH10,11,12. The β-face is shielded by the side-chain of Arg161 and an N-linked glycan (Fig. 2b). The cholesterol 3β-hydroxyl group is incorporated in a hydrogen-bonding network with the side chains of Asp95, Trp109 and Tyr130; this optimally positions the α-face of the sterol ring system to make a stacking interaction with the indole ring of Trp109 (Fig. 2a, b). The Leu491, Ala492 and Ile496 residues of TMD ECL3 and the Val210 residue of the linker domain orient the iso-octyl tail of cholesterol in an elongated conformation. Residues lining the cholesterol-binding site are highly conserved in vertebrates but are less conserved in Drosophila (Fig. 2d and Extended Data Fig. 1). The superposition of our human CRD structure with a previously solved solution structure of the Drosophila CRD20 (r.m.s.d. 1.71 Å for 86 equivalent Cα positions) revealed a major rearrangement of one edge of the cholesterol-binding site comprising fly SMO residues 183–190 (Fig. 2e and Extended Data Fig. 1). This segment of fly SMO forms a short helix that protrudes into the CRD groove and consequently may preclude cholesterol binding. In fact, fly SMO does not bind oxysterols10,11.

Consistent with the structure, purified SMOΔC bound to cholesterol in a ligand-affinity assay (Fig. 3a–c), analogous to the assay we previously developed to measure the binding of SMO to 20(S)-OHC7. Beads covalently coupled to cholesterol captured purified SMOΔC (Fig. 3a). Binding could be blocked by free 20(S)-OHC added as a competitor at concentrations that activate Hh signalling in cells7 (Fig. 3b), confirming that cholesterol and 20(S)-OHC engaged the same binding groove on the CRD surface. In a stringent specificity control, 20(R)-OHC, an epimer with inverted stereochemistry at a single position that cannot bind to the CRD7 and cannot activate Hh signalling10, failed to block this interaction (Fig. 3c).

Figure 3: The cholesterol-binding site regulates SMO signalling activity.

a, Purified SMOΔC captured on beads coupled to increasing concentrations of cholesterol in the presence or absence of free 20(S)-OHC. b, c, SMOΔC captured on cholesterol beads in the presence of increasing concentrations of free 20(S)-OHC (b) or in the presence of a synthetic epimer 20(R)-OHC (c). d, Protein levels of SMO variants stably expressed in Smo−/− mouse fibroblasts. WT, wild-type. The protein Suppressor of Fused (SUFU) served as a loading control. e, Binding of SMO variants to 20(S)-OHC-coupled beads. Asp99Ala and Tyr134Phe are predicted to disrupt hydrogen bonding with cholesterol (Fig. 2a). f, Levels of Gli1 mRNA (mean arbitrary units ± s.d., n = 4) were used as a metric for Hh signalling activity in cell lines shown in d after stimulation with SHH, SAG or 20(S)-OHC. Statistical significance based on one-way ANOVA is denoted for the difference in Gli1 mRNA levels between cells expressing wild-type SMO and cells expressing each mutant SMO protein. n.s., P > 0.05; ****P ≤ 0.0001. Each experiment was repeated 3 or more times.

Cholesterol promotes SMO signalling

To assess the functional relevance of the cholesterol-binding site identified in our structure, we focused our mutagenesis efforts on the hydrogen-bonding network between the cholesterol 3β-hydroxyl group and Asp95 and Tyr130 (Fig. 2a). The corresponding residues in mouse SMO, Asp99 and Tyr134, were mutated either individually or in combination to residues (Asp99Ala and Tyr134Phe) that lack hydrogen bond acceptor or donor groups. As cholesterol and oxysterols occupy the same groove in the CRD (Fig. 3b), they probably adopt a similar conformation and participate in at least some of the same interactions. Therefore, we used 20(S)-OHC binding and signalling to evaluate the effect of these mutations. First, we tested the ability of 20(S)-OHC beads to capture SMO from detergent extracts (Fig. 3d, e). As previously demonstrated7, wild-type SMO can be captured on 20(S)-OHC beads. However, SMO variants carrying the Asp99Ala and Tyr134Phe point mutations failed to bind to 20(S)-OHC beads, highlighting the importance of this hydrogen-bonding network for sterol binding (Fig. 3e). Next, we stably expressed untagged versions of these SMO mutants in Smo−/− mouse fibroblasts (Fig. 3d) and assessed their abilities to restore signalling initiated by SHH, the TMD agonist SAG or the CRD agonist 20(S)-OHC (Fig. 3f). SMO-Asp99Ala and SMO-Tyr134Phe did not increase the basal activity of SMO in the absence of Hh agonists and did not significantly change the signalling response to SAG (Fig. 3f). This demonstrates that these mutants retained an intact TMD ligand-binding site and remained competent to transmit signals to cytoplasmic components. However, Asp99Ala and Tyr134Phe significantly impaired the ability of SMO to respond to both 20(S)-OHC and SHH (Fig. 3f). We conclude that these residues, and by implication the cholesterol seen in our structure, are important for signal-induced SMO activation.
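
The Gli1 comparisons in Fig. 3f rely on a one-way ANOVA. As a rough illustration of what that test computes, the sketch below derives the F statistic (between-group variance over within-group variance) in plain Python. The function name `one_way_anova_f` and the numbers in `toy_groups` are hypothetical placeholders chosen for easy arithmetic; they are not measurements from this study, and the sketch does not cover the P value or any post-hoc comparison.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")

    grand_mean = fmean([v for g in groups for v in g])
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual samples around their own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical arbitrary-unit values, n = 4 per group (NOT data from the paper).
toy_groups = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0, 9.0],
]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means that the group means differ by more than the scatter within groups would explain; converting F to a P value requires the F distribution, which is normally taken from a statistics package.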

Inactive-state stabilization by the SMO CRD

To assess the structural influence of the CRD in a membrane environment, we carried out molecular dynamics simulations of SMO in the presence and absence of cholesterol. Ten independent 100-ns atomistic simulations were performed with SMO embedded in a phosphatidylcholine bilayer (Fig. 4a, b and Extended Data Fig. 6). These simulations revealed that the SMO CRD has substantial conformational flexibility when not bound to a ligand. In the presence of cholesterol, however, there was a pronounced decrease in this flexibility (Fig. 4a), consistent with the idea that cholesterol stabilizes the CRD structure. By contrast, cholesterol did not substantially change the conformational stability of the SMO TMD (Fig. 4b). The predominant cholesterol-induced stabilization was seen in the vicinity of the sterol-binding pocket, with some propagation to more distal CRD regions (Extended Data Fig. 6). Consistent with these simulations, the thermostability of purified SMOΔC was reduced when cholesterol was depleted with methyl-β-cyclodextrin (Extended Data Fig. 6f, g and Supplementary Discussion).
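
The flexibility measure plotted for these simulations is the Cα r.m.s.d. over time. The sketch below shows the underlying calculation for one frame and for a series of frames. It assumes each frame has already been superposed on the reference structure, and the names `ca_rmsd` and `rmsd_trace` as well as the toy coordinates are hypothetical; this is not the analysis code used for the simulations.

```python
import math


def ca_rmsd(reference, frame):
    """R.m.s.d. between two equally sized lists of (x, y, z) C-alpha positions.

    Assumes `frame` is already superposed on `reference`; no fitting is done.
    The result is in the same length unit as the input coordinates.
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for ref_atom, atom in zip(reference, frame)
        for a, b in zip(ref_atom, atom)
    )
    return math.sqrt(squared / len(reference))


def rmsd_trace(reference, trajectory):
    """R.m.s.d. of every frame in a trajectory relative to the reference."""
    return [ca_rmsd(reference, frame) for frame in trajectory]


# Toy coordinates in Å (hypothetical, not taken from the simulations).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x, y + 1.0, z) for x, y, z in reference]
print(rmsd_trace(reference, [reference, shifted]))
```

A flat, low trace indicates a domain that stays close to its starting conformation, whereas a rising or noisy trace indicates greater flexibility.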

Figure 4: SMO activity is regulated by the stability of its extracellular domain.

a, b, Molecular dynamics (MD) simulations of SMO. Cα r.m.s.d. versus time for CRD (a) and 7TM region (b) with (blues) and without (reds) cholesterol. c, d, Mutations in the extracellular region increase constitutive signalling activity of SMO. c, CRD–linker domain interface with mutated residues highlighted. Corresponding mouse residues are in brackets. d, Gli1 mRNA levels (mean arbitrary units ± s.d., n = 4, ≥3 independent repeats) after treatment with agonists (SHH, SAG, 20(S)-yne) or antagonists (cyclopamine, SANT-1). Gli1 mRNA levels in untreated cells reflect constitutive activity of SMO. Numbers above bars indicate Gli1 fold-increase compared to untreated cells. Asterisks denote statistical significance based on one-way ANOVA for comparison with wild-type SMO. **P ≤ 0.01, ****P ≤ 0.0001. e, Solution-state SAXS profiles of untreated (grey) and 20(S)-OHC-treated (red) SMOΔC. Lines are fits derived from the indirect Fourier transform of the shown data points. f, Paired-distance (P(r)) distribution functions, normalized to their respective I(0) values, with maximum particle sizes (dmax) of 120 Å for untreated (grey) and 129 Å for 20(S)-OHC-treated (red) SMOΔC.

The molecular dynamics results prompted us to test the signalling consequences of destabilizing interactions within the extracellular region of SMO. To probe the effect of interactions between the CRD and the linker domain, we generated two mutants (Pro120Ser and Ile160Asn/Glu162Thr) in the context of mouse SMO. These introduce bulky, N-linked glycosylation sites at positions 114 and 156, respectively, of the human CRD that contact the linker domain (Fig. 4c). To disrupt the conformation of the linker domain itself, we mutated two cysteines (Cys197 and Cys217 in mouse SMO) that form a disulfide bond within the linker domain (numbered 6 in Figs 1a and 4c) to serines. We compared these mutants to SMO lacking the entire CRD (SMO-ΔCRD) (Fig. 4d and Extended Data Fig. 7a). Untagged SMO-ΔCRD stably expressed in Smo−/− fibroblasts demonstrated a similar level of constitutive activity11,12 to that of wild-type SMO stimulated with saturating SHH (Fig. 4d and Extended Data Fig. 7b).

The Pro120Ser, Ile160Asn/Glu162Thr, and Cys197Ser/Cys217Ser mutations mimicked the effect of complete CRD deletion—constitutive signalling activity was increased (Fig. 4d) despite the presence of high levels of PTCH1 (Extended Data Fig. 7a) and there was complete loss of responsiveness to 20(S)-yne (Fig. 4d). All three mutants remained sensitive to inhibition by SANT-1, which binds deep in the TMD and does not contact the linker domain16. Cyclopamine and SAG, both of which make contacts with the linker domain16,17, could regulate SMO-Pro120Ser and SMO-Ile160Asn/Glu162Thr, but not SMO-Cys197Ser/Cys217Ser. Thus, destabilization of either CRD–linker domain contacts or the linker domain itself increases the constitutive activity of SMO, implicating these regions in stabilizing an inactive state, analogous to the D(E)R3.50Y motif in helix III of some Class A GPCRs26.

To understand the basis for the loss of oxysterol responsiveness, we measured binding of each of the mutants to 20(S)-OHC beads (Extended Data Fig. 7c). Although the CRD mutations Pro120Ser and Ile160Asn/Glu162Thr impaired binding, the Cys197Ser/Cys217Ser mutation in the linker domain had no effect on binding to 20(S)-OHC. The observation that 20(S)-OHC cannot activate SMO-Cys197Ser/Cys217Ser (Fig. 4d) even though it can bind normally (Extended Data Fig. 7c) suggests that the linker domain may transmit the conformational changes that lead to SMO TMD activation in response to CRD ligands. To investigate conformational changes in SMO induced by CRD ligands in the solution state, we performed small angle X-ray scattering (SAXS) experiments on SMOΔC in the absence of any exogenous ligands (apo-SMOΔC) or SMOΔC loaded with the agonist 20(S)-OHC (Extended Data Fig. 8a). The initial SAXS curves (Fig. 4e) and further analyses27,28 (Fig. 4f and Extended Data Fig. 8b) show that 20(S)-OHC binding induces a conformational change consistent with elongation and reduced globularity of SMOΔC. As 20(S)-OHC and cholesterol bind to the same groove on the CRD (Fig. 3b), we conclude that replacement of cholesterol by 20(S)-OHC (a molecule that carries only a single additional hydroxyl) produces a conformational change that leads to SMO activation.
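
To make the link between elongation and a larger dmax concrete, the sketch below builds a pair-distance histogram for two toy point models. In the SAXS analysis, P(r) is obtained from the scattering curve by indirect Fourier transform; the real-space histogram here only illustrates what that function describes, under the simplifying assumption of identical point scatterers. The function names and both models are hypothetical.

```python
import math
from itertools import combinations


def pair_distances(points):
    """All pairwise distances between (x, y, z) points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def pair_distance_histogram(points, bin_width):
    """Return (counts per distance bin, maximum pair distance)."""
    distances = pair_distances(points)
    d_max = max(distances)
    counts = [0] * (int(d_max // bin_width) + 1)
    for d in distances:
        counts[int(d // bin_width)] += 1
    return counts, d_max


# Two hypothetical four-point models with coordinates in Å:
# a compact square and the same number of points stretched into a line.
compact = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (10, 10, 0)]
elongated = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (30, 0, 0)]

for name, model in (("compact", compact), ("elongated", elongated)):
    counts, d_max = pair_distance_histogram(model, bin_width=5.0)
    print(name, counts, round(d_max, 1))
```

The elongated model places pairs at longer distances and therefore has the larger dmax, which is the qualitative signature reported for 20(S)-OHC-treated SMOΔC.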

The structure of SMO bound to vismodegib

Our structural and functional studies highlighted the critical regulatory role played by CRD–linker domain–TMD contacts in controlling the conformation and activity of SMO. It is unclear whether these interactions are altered by TMD-targeted small molecules used to treat patients with Hh-driven cancers, because prior structures of SMO bound to small molecules have not included the CRD, precluding an assessment of how the extracellular and transmembrane domains of SMO communicate. We determined the crystal structure of SMO in complex with vismodegib, a potent TMD antagonist, to 3.3 Å resolution (vismo–SMOΔC; Fig. 5a and Extended Data Fig. 9a, b). Despite the fact that vismodegib is the most commonly used Hh pathway inhibitor in patients and resistance has already become a clinically relevant problem14,29,30, there are no structures available for vismodegib in complex with SMO.

Figure 5: Structure of SMO in complex with the antagonist vismodegib.

a, Overall structure showing full extracellular and transmembrane domains of human SMO (cartoon) in complex with vismodegib (green). Colour-coding follows Fig. 1. b, Close-up of the vismodegib-binding site. The asterisk denotes a residue mutated in vismodegib-resistant tumours. c–e, Comparison of apo-SMO (red) and vismo–SMO (blue). c, Superposition of the two SMO structures based on the TMD. Arrows indicate domain rotations. Cholesterol (red) and vismodegib (blue) shown as spheres. The dashed oval highlights the conformational change of the TM6 and ECL3. d, Close-up of the TMD–linker domain–CRD interface. Phe484 contacts the vismodegib methylsulfone group. e, Close-up of the sterol-binding site. Cholesterol, Arg161 and NAG shown as sticks. The partly disordered ECL3 loop of vismo–SMO is depicted as a dotted line.

Vismodegib is stabilized by a network of hydrogen bonds and hydrophobic interactions formed by the SMO TMD core (Fig. 5b). As expected, vismodegib occupies the TMD binding site previously identified in SMO complexes with other antagonists and agonists15,16,17 (Extended Data Fig. 5). The vismodegib pyrimidine and chloro-benzyl rings are deeply buried, forming mainly hydrophobic interactions with the TMD core. The amide linker, which is stabilized by a potential hydrogen bond to the side chain of Asp384, occupies a central position within the seven transmembrane domain (7TM) bundle and connects to the chlorophenyl–methylsulfone moiety, which is oriented towards the extracellular domains and the entrance of the TMD pocket16. This arrangement is stabilized by potential hydrogen bonds to the side chains of Gln477 and Arg400 and a hydrophobic stacking interaction of the vismodegib methylsulfone moiety and the aromatic ring of Phe484. Asp473, a residue that is mutated in vismodegib-resistant cancers14,29,30, stabilizes the potential hydrogen-bonding network around Arg400 (asterisk in Fig. 5b). The SMO-Asp473His mutation seen in drug-resistant cancers is ideally positioned to disrupt the vismodegib-binding site, in agreement with functional experiments showing that SMO-Asp473His cannot bind this drug14. Our structure also explains the vismodegib resistance mechanism for many other mutations in the TMD binding site observed in patients with advanced basal cell cancer29,30 (Extended Data Fig. 9c–f).

The vismodegib-bound structure of SMO in its inactive state shows a marked conformational change in the extracellular domains and the ECL3 loop when compared to the apo-SMO structure (Fig. 5c). The upper part of transmembrane helix 6, which contacts the CRD and linker domain, is moved ~15° towards the linker domain and CRD, probably owing to an interaction between Phe484 and the methylsulfone moiety of vismodegib (Fig. 5b, d). This movement of helix 6 results in a rotational movement of the CRD (and to a lesser extent the linker domain), allowing ECL3 to intrude into the sterol-binding groove of the CRD. As a consequence, the side chain of CRD residue Arg161 forms a stacking interaction with Trp109 and occupies the space where cholesterol was located in the apo-SMO structure (Fig. 5e). This reorganization is predicted to occlude the cholesterol-binding site. Indeed, cholesterol was absent from the vismo–SMO structure (Fig. 5a), and the addition of vismodegib reduced the interaction of purified SMOΔC with cholesterol beads (Extended Data Fig. 9g, h). In summary, binding of vismodegib to the TMD site induces a conformational change that ultimately results in rearrangement of the sterol-binding site, providing a structural communication mechanism that explains the previously observed allosteric interaction7 between CRD and TMD ligands. More generally, this finding suggests that a similar conformational change, involving a shift of the ECL3 and a pivoting of the CRD on the extracellular end of the TMD bundle, may allow an extracellular signal to be transduced from the CRD to the TMD and ultimately across the membrane. From a therapeutic perspective, our results highlight an unanticipated role for the CRD and ECL3, including displacement of an extracellular cholesterol ligand, in stabilizing a drug-bound inactive state of SMO, providing a structural template for the development of the next generation of SMO inhibitors against Hh-driven cancers.

Conclusion

The structure of SMO provides insights into the mechanism by which a large extracellular region and two allosterically linked ligand-binding sites may regulate the activity of a GPCR. We propose that cholesterol functions as an endogenous SMO ligand that occupies the CRD groove and stabilizes a resting or apo conformation poised to respond to Hh signals. SMO signalling activity is compromised by mutations that prevent cholesterol binding or by antagonists such as vismodegib that act allosterically to occlude the cholesterol-binding site. SAXS data suggest that CRD ligands such as 20(S)-OHC, which displace cholesterol, produce an additional conformational change that leads to SMO activation. Identification of the mechanism by which PTCH1 inhibits SMO will be necessary to understand how cholesterol-bound SMO is activated in response to endogenous Hh morphogens. We predict that SMO activation will involve alterations in the interactions between the CRD, ECL3, linker domain and TMD that allow the TMD to adopt an active signalling state.

Methods

No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.

Reagents

NIH 3T3 and 293T cells were certified stocks obtained directly from ATCC. Smo−/− fibroblasts (which were used to express all the SMO mutants) have been described previously31 and were originally obtained from J. Chen and P. Beachy. The authenticity of Smo−/− cells was established by immunoblotting to ensure lack of endogenous SMO protein expression. Incoming cell lines were confirmed to be negative for mycoplasma contamination. SAG was obtained from Enzo Life Sciences; cyclopamine was obtained from Toronto Research Chemicals; SANT-1 was obtained from EMD Millipore; 20(S)-OHC was obtained from Steraloids. The synthesis of 20(S)-yne, 20(R)-OHC7 and 20(S)-amine-coupled beads for SMO binding assays has been described in detail previously10. Rabbit polyclonal antibodies against SMO, PTCH1 and SUFU have been described previously32,33, and the mouse monoclonal antibody against GLI1 was obtained from Cell Signaling Technologies (L42B10). SHH-containing conditioned medium was made from 293T cells transfected with the N-terminal signalling domain of SHH and used at saturating concentrations (dilution of 1:4)2.

Constructs

For large-scale expression and crystallization, a SMO construct (SMOΔC) was designed by truncating the N and C termini of human SMO to leave the extracellular and transmembrane domains (UniProt Id. Q99835; residues 32–555), and replacing intracellular loop 3 (Q99835; residues 429–440) with BRIL (UniProt Id. P0ABE7; residues 23–128). The synthetic gene encoding SMOΔC was obtained from Geneart (Grand Island, NY) and cloned into the pHLSec vector34 in frame to either a C-terminal Rho1D4 antibody epitope tag35,36 or a monoVenus tag37,38 followed by a Rho1D4 antibody epitope tag.

All mouse SMO mutants were made using Quikchange methods using a previously described construct39 encoding mouse SMO (pCS2+:YFP-mSmo), after removal of the N-terminal YFP tag by XhoI digestion. The mouse SMO-ΔCRD construct lacks residues 68–184. For stable-line construction, the mouse SMO coding sequence was transferred from pCS2+ to pMSCVpuro using Gibson cloning.

Expression and purification of SMO

SMOΔC was expressed by transient transfection in HEK-293S-GnTI (ATCC CRL-3022) cells in a typical batch volume of 9.6 l. Cells were grown in suspension at 37 °C, 8.0% CO2, 130 r.p.m. to densities of 2–3 × 10⁶ cells ml–1 in protein expression medium (PEM, Invitrogen) supplemented with l-glutamine, non-essential amino-acids (NEAA, Gibco) and 1% fetal calf serum (FCS, Sigma-Aldrich). Cells from 0.8 l cultures were collected by centrifugation (1,100 r.p.m., 7 min) and re-suspended in 120 ml Freestyle293 medium (Invitrogen) containing 1.2 mg PEI Max (Polysciences), 0.4 mg plasmid DNA and 5 mM valproic acid (Sigma-Aldrich), followed by a 3–6-h incubation at 160 r.p.m. Culture media were subsequently topped up to 0.8 l with PEM and returned to 130 r.p.m. At 48–72 h after transfection, cell pellets were collected by centrifugation, snap-frozen in liquid N2 and stored at –80 °C, resulting in a total of ~150 g of cell mass per 9.6 l suspension medium40.
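
The per-culture quantities above scale to a full batch by simple arithmetic. The sketch below works this out, assuming a 9.6 l batch is made up of identical 0.8 l cultures that each receive the stated amounts of PEI Max and plasmid DNA; the variable names are illustrative only.

```python
from fractions import Fraction

# Quantities stated in the protocol (exact decimal arithmetic via Fraction).
BATCH_VOLUME_L = Fraction("9.6")
CULTURE_VOLUME_L = Fraction("0.8")
PEI_MG_PER_CULTURE = Fraction("1.2")
DNA_MG_PER_CULTURE = Fraction("0.4")

# Assumption: the batch is split into identical 0.8 l cultures.
n_cultures = BATCH_VOLUME_L / CULTURE_VOLUME_L
pei_to_dna_ratio = PEI_MG_PER_CULTURE / DNA_MG_PER_CULTURE
total_pei_mg = n_cultures * PEI_MG_PER_CULTURE
total_dna_mg = n_cultures * DNA_MG_PER_CULTURE

print(f"cultures per batch: {n_cultures}")
print(f"PEI:DNA mass ratio: {pei_to_dna_ratio}:1")
print(f"total PEI Max: {float(total_pei_mg)} mg")
print(f"total plasmid DNA: {float(total_dna_mg)} mg")
```

Under that assumption, a batch corresponds to 12 cultures with a 3:1 PEI:DNA mass ratio, or 14.4 mg PEI Max and 4.8 mg plasmid DNA in total.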

Frozen cell pellets were thawed, re-suspended in 10 mM HEPES pH 7.5, 300 mM NaCl buffer supplemented with a 1:100 (v:v) dilution of mammalian protease inhibitor cocktail (P8340, Sigma-Aldrich) and solubilised with 1.3% (w/v) n-dodecyl-β-d-maltopyranoside (DDM, Anatrace) and 0.26% (w/v) cholesteryl hemisuccinate (CHS, Anatrace), then rotated for 1.5 h at 4 °C. Insoluble material was removed by centrifugation (10,000 r.p.m., 12 min, 4 °C) and the supernatant incubated for 2 h at 4 °C with purified Rho-1D4 antibody (University of British Columbia) coupled to CNBr-activated sepharose beads (GE Healthcare). Protein-bound beads were washed extensively with 50 mM HEPES pH 7.5, 300 mM NaCl, 10% (v/v) glycerol, 0.1% DDM, 0.02% CHS buffer and then with 50 mM HEPES pH 7.5, 300 mM NaCl, 10% glycerol, 0.05% DDM, 0.01% CHS buffer and eluted overnight in 50 mM HEPES pH 7.5, 300 mM NaCl, 10% glycerol, 0.03% DDM, 0.006% CHS, 500 μM TETSQVAPA peptide (Genscript). Eluate was concentrated to ~500 μl using a Vivaspin Turbo 4 PES 100 kDa MWCO concentrator and loaded onto a Superose 6 10/300 column (GE Healthcare) equilibrated with 10 mM HEPES pH 7.5, 150 mM NaCl, 10% glycerol, 0.03% DDM, 0.006% CHS. Peak fractions were pooled and concentrated to ~30 mg ml–1 using a Vivaspin 500 PES 100 kDa MWCO concentrator. Samples were deglycosylated with Endoglycosidase F1 and incubated at room temperature for 1 h. For the vismodegib complex, vismodegib (GDC-0449, Selleck Chem) dissolved at high concentration in DMSO was added to the protein sample to a final concentration of 10 mM.
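
The DDM and CHS concentrations fall stepwise through solubilisation, washing, elution and SEC, and the stated values appear to keep a fixed DDM:CHS ratio. The short check below confirms this from the percentages given in the text; the step labels are informal names introduced here for illustration.

```python
from fractions import Fraction

# (DDM %, CHS %) at each purification step, as stated in the protocol.
detergent_steps = {
    "solubilisation": ("1.3", "0.26"),
    "wash 1": ("0.1", "0.02"),
    "wash 2": ("0.05", "0.01"),
    "elution and SEC": ("0.03", "0.006"),
}

# Exact ratios, avoiding floating-point rounding.
ratios = {
    step: Fraction(ddm) / Fraction(chs)
    for step, (ddm, chs) in detergent_steps.items()
}

for step, ratio in ratios.items():
    print(f"{step}: DDM:CHS = {ratio}:1")
```

Every step works out to 5:1 (w/w), so only the absolute detergent concentration changes during purification, not the detergent-to-sterol proportion.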

For small-scale screening, cells in adherent format were transiently transfected using Lipofectamine2000 (Invitrogen) and the expressed protein, tagged with YFP37,38, was prepared in the same manner as above with quantities adjusted appropriately. For analysis, samples were loaded onto a Superose 6 3.2/300 column (GE Healthcare) attached to a high-performance liquid chromatography system with automated micro-volume loader and in-line fluorescence detection (Shimadzu)41.

For thermostability experiments, SMOΔC was expressed and purified as described above. After SEC purification, the pooled peak fractions were re-applied to purified Rho-1D4 antibody coupled to CNBr-activated sepharose beads. Equal amounts of beads were treated with different quantities of methyl-β-cyclodextrin (MBCD) and incubated with gentle rocking for 1 h at 15 °C before extensive washing with 300 mM NaCl, 50 mM HEPES pH 7.5, 0.03% DDM and elution with 300 mM NaCl, 50 mM HEPES pH 7.5, 0.03% DDM, 500 μM TETSQVAPA peptide. The thermostability of the different samples was assessed by heating aliquots of their eluates to the indicated temperatures for 1 h before loading them onto a Superose 6 3.2/300 column on a Shimadzu system and monitoring elution by absorbance at 280 nm. Samples were kept at the baseline temperature of 20 °C when not heated. The construct used in this assay was not fluorescently tagged, to avoid the potentially confounding effects of a fluorescent tag on overall stability.

Stable cell lines

Stable cell lines expressing untagged SMO mutants were made by infecting Smo−/− fibroblasts with a retrovirus carrying these constructs cloned into pMSCVpuro39. Retroviral supernatants were produced after transient transfection of Bosc23 helper cells with the pMSCV constructs42,43. Virus-containing media from these transfections was directly used to infect Smo−/− fibroblasts, and stable integrants were selected with puromycin (2 μg ml–1).

We had previously10 constructed stable lines using an identical strategy with SMO constructs carrying an N-terminal fluorescent protein (FP) tag; however, we found that epitope tagging of SMO with a fluorescent protein, or transient overexpression of tagged or untagged SMO, could impact the assessment of its signalling activity. For example, we previously measured a lower level of constitutive signalling activity for SMO-ΔCRD compared to the present study, probably due to the presence of an N-terminal YFP tag10. Moreover, previous reports from three groups (including our own) reached somewhat divergent conclusions regarding the role of the CRD in basal and ligand-stimulated SMO activity10,11,12. These differences may have been related to the use of different epitope tags and expression systems.

Hence, all Hh signalling assays used in this study were performed with untagged SMO and SMO mutants stably expressed in Smo−/− fibroblasts, with assessment of SMO protein levels in stable cell lines by immunoblotting.

Hedgehog signalling assays

Stable cell lines expressing either wild-type SMO or SMO mutants were grown to confluence in Dulbecco’s Modified Eagle’s Medium (DMEM) containing 10% Fetal Bovine Serum (FBS, Optima Grade, Atlanta Biologicals) and then exposed to DMEM containing 0.5% FBS for 24 h to induce primary cilia assembly. These ciliated cells were then treated with saturating concentrations of various Hh agonists and antagonists in DMEM containing 0.5% FBS for 12 h.

For detection of proteins by immunoblotting, cells were washed in ice-cold PBS and lysed (30 min, 4 °C) by agitation in modified RIPA buffer (50 mM Tris-HCl pH 8.0, 150 mM sodium chloride, 2% NP-40, 0.25% deoxycholate, 0.1% sodium-dodecyl sulfate (SDS), 1 mM dithiothreitol, 1 mM sodium fluoride and the SigmaFast Protease inhibitor cocktail). After clarification (20,000g, 30 min, 4 °C), the protein concentration of each lysate was measured using the bicinchoninic acid assay (BCA, Pierce/Thermo Scientific). Lysate aliquots containing equal amounts of total protein were fractionated on SDS–PAGE gels (either an 8% tris-glycine gel or a 4–12% bis-tris gel), and transferred to a nitrocellulose membrane. Quantitative immunoblotting with the various antibodies was performed using the Li-Cor Odyssey infrared imaging system. In all immunoblots, vertical dashed black lines represent non-contiguous lanes from the same immunoblot juxtaposed for clarity. Each immunoblot was repeated 2–3 times for all experiments shown.

Gli1 mRNA is a commonly used metric for Hh signalling activity, because Gli1 is a direct Hh target gene. Gli1 and Gapdh mRNA levels were measured by quantitative, reverse-transcription PCR (qRT–PCR) using the Power SYBR Green Cells-To-CT kit from Thermo Fisher Scientific and custom primers for Gli1 (forward primer: 5′-CCAAGCCAACTTTATGTCAGGG-3′ and reverse primer: 5′-AGCCCGCTTCTTTGTTAATTTGA-3′) and Gapdh (forward primer: 5′-AGTGGCAAAGTGGAGATT-3′ and reverse primer: 5′-GTGGAGTCATACTGGAACA-3′). Transcript levels relative to Gapdh were calculated using the ΔCt method and reported in arbitrary units. Each qRT–PCR experiment, which was repeated 2–4 times, included two biological replicates, each with two technical replicates.
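The ΔCt calculation can be illustrated with a minimal sketch. The Ct values below are hypothetical, the function names are ours, and the factor of 2 per cycle assumes ideal amplification efficiency, which the text does not state.

```python
def mean(values):
    """Arithmetic mean of a list of Ct values (for example, technical replicates)."""
    return sum(values) / len(values)


def relative_expression(ct_target, ct_reference):
    """Target transcript level relative to the reference gene, as 2^(-dCt).

    Assumes the template doubles every cycle (100% efficiency).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: two technical replicates for one biological replicate.
gli1_ct = [26.1, 25.9]
gapdh_ct = [18.0, 18.0]

# Average technical replicates first, then normalise Gli1 to Gapdh.
gli1_relative = relative_expression(mean(gli1_ct), mean(gapdh_ct))
print(f"Gli1 relative to Gapdh (arbitrary units): {gli1_relative:.6f}")
```

Because the result is a ratio to Gapdh rather than an absolute copy number, it is meaningful only for comparing samples processed in the same way.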

Statistical analysis of Gli1 mRNA levels across samples was performed using an ordinary one-way ANOVA test with a Holm–Sidak post-test to correct for multiple comparisons using the GraphPad Prism suite. Statistical significance in the figures is denoted as follows: NS, P > 0.05; *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, ****P ≤ 0.0001.
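As an illustration of the multiple-comparison step and the star notation, the sketch below applies the textbook step-down Holm–Sidak adjustment to a set of raw pairwise P values and then assigns the labels defined above. It does not perform the ANOVA itself, the input P values are hypothetical, and it is not claimed to reproduce the internal implementation of GraphPad Prism.

```python
def holm_sidak(p_values):
    """Step-down Holm-Sidak adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest P value is corrected for m comparisons, the next for m - 1, etc.
        adj = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, adj)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


def significance_label(p):
    """Map an adjusted P value to the notation used in the figures."""
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "NS"


# Hypothetical raw P values from three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_sidak(raw)
labels = [significance_label(p) for p in adjusted]
```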

Ligand affinity chromatography

Ligand affinity chromatography to assess the interaction between SMO protein in detergent extracts and beads covalently coupled to 20(S)-amine has been described previously10. Membranes from cells transiently or stably expressing constructs encoding mouse SMO variants were lysed in a DDM extraction buffer (50 mM Tris pH 7.4, 150 mM NaCl, 10% v/v glycerol, 0.5% w/v DDM and the SigmaFast EDTA-free protease inhibitor cocktail) for 2 h at 4 °C, followed by removal of insoluble material by ultracentrifugation (100,000g, 30 min). This DDM extract was incubated with 20(S)-OHC beads overnight at 4 °C to allow binding to equilibrium. After extensive washing, proteins captured on the beads were eluted with reducing LDS sample buffer (Life Technologies) and 100 mM dithiothreitol. The presence of SMO in these eluates was determined by quantitative immunoblotting with an anti-SMO antibody32 and infrared imaging (Li-Cor Odyssey). Each experiment was repeated twice with similar results.

Preparation of cholesterol–PEG3–sepharose using the azide-alkyne Huisgen cycloaddition reaction from the Click Chemistry toolbox

100 μl packed PEG3-azide sepharose resin (22 μmol per ml, Click Chemistry Tools) was washed 3 × 1 ml with 20% ethanol (v/v aq.) and re-suspended in 500 μl 20% ethanol (v/v). The bead slurry was supplemented with 1 mM CuSO4, 5 mM Tris(3-hydroxypropyltriazolylmethyl)amine (THPTA), and 15 mM sodium ascorbate. For the high-density resin, the cycloaddition coupling reaction was initiated by adding 0.044 μmol of LKM-26, a previously described44,45 alkynyl cholesterol derivative synthesized in-house, to achieve an approximate ligand density of 1:50 (moles coupled azide functional groups: moles uncoupled azide functional groups). High, medium and low ligand densities (Fig. 2f) represent coupling ratios of 1:50, 1:200 and 1:1,000. The reaction was protected from light and allowed to proceed at room temperature for 20 h with end-over-end rotation. The supernatant was removed from the resin, and the reaction quenched with 1 mM EDTA in 20% ethanol (v/v aq.). The supernatant was extracted with diethyl ether and loaded on a normal-phase thin layer chromatography (TLC) plate next to an alkynyl cholesterol standard to assess the efficiency of coupling. The TLC plate was developed using 2% methanol in chloroform (v/v) and stained using 10% CuSO4/10% H3PO4 followed by charring at 200 °C for visualization.
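The amount of alkynyl cholesterol quoted for the high-density resin follows from the resin loading. The sketch below reproduces that arithmetic and, as an assumption on our part, applies the same scheme to the medium and low ratios, for which the text gives no explicit amounts. Dividing the total azide by the ratio denominator treats 1:50 as ligand to total azide, which is consistent with the stated "approximate" density.

```python
RESIN_VOLUME_ML = 0.100            # 100 µl packed resin
AZIDE_DENSITY_UMOL_PER_ML = 22.0   # azide loading stated for the resin


def alkyne_umol(ratio_denominator):
    """µmol of alkynyl cholesterol for a nominal 1:N coupling ratio."""
    total_azide_umol = RESIN_VOLUME_ML * AZIDE_DENSITY_UMOL_PER_ML
    return total_azide_umol / ratio_denominator


# Nominal coupling ratios from the text; medium and low amounts are derived, not quoted.
ratios = {"high": 50, "medium": 200, "low": 1000}
amounts = {name: alkyne_umol(n) for name, n in ratios.items()}

for name, umol in amounts.items():
    print(f"{name:>6}: {umol:.4f} µmol LKM-26")
```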

SMO pull-down assays using cholesterol–PEG3–sepharose

Binding reactions were carried out in buffer containing 20 mM HEPES pH 7.4, 150 mM NaCl, and 0.03% DDM in a final volume of 100 μl. Each reaction contained 1 μg purified SMOΔC, the same protein used for crystallization studies. Competitors were added to the binding reaction and incubated for 1 h at room temperature before the addition of 20 μl packed cholesterol–PEG3–sepharose. Reactions were incubated for 16 h at 4 °C for affinity capture. The binding reactions were washed 3 × 1 ml with binding buffer and eluted using 2× Laemmli buffer for 30 min at room temperature. SMOΔC levels in the input, flow-through and captured on the beads were determined by quantitative immunoblotting (Li-Cor Odyssey) using the 1D4 primary antibody (mouse 1:2,000). Each experiment was repeated twice with similar results.

Crystallization and data collection

Protein samples were reconstituted into lipidic cubic phase (LCP) by mixing with molten lipid in a mechanical syringe mixer46. Molten lipid, consisting of 10% cholesterol (Sigma-Aldrich) and 90% 9.9 monoacylglycerol (monoolein, Sigma-Aldrich), was mixed with detergent-solubilized protein (either apo or with 10 mM vismodegib in DMSO) at ~30 mg ml–1 in a ratio of 60:40. A Gryphon robot (Art Robbins Instruments) was used to dispense 50 nl boluses of protein-laden mesophase followed by 0.8 μl of precipitant solution onto each of 96 positions on a siliconized glass plate, which were then covered with a coverslip in a ‘sandwich-plate’ format. Crystals were grown at 20 °C and monitored by eye, using a microscope fitted with cross-polarizers, and subsequently imaged using a UV-imaging system (Rigaku Minstrel). apo-SMOΔC crystallized in 0.1 M MES pH 6, 30% (v/v) PEG500 DME, 0.1 M sodium acetate, 0.5 mM zinc chloride, 0.1 M ammonium fluoride. vismo-SMOΔC crystallized in 0.09 M sodium acetate pH 4, 0.09 M sodium malonate, 27% (v/v) PEG500 DME, 0.1 M sodium acetate, 0.5 mM zinc chloride, 0.1 M ammonium fluoride.

X-ray data collection was conducted at MX beamline I24 at the Diamond Light Source (Harwell, UK). Prior to data collection crystals were flash-frozen at 105 K. X-ray data were processed using Xia2 (refs 47, 48), scaled using XSCALE48 and merged using Aimless49,50. The final data set used for structure solution and refinement was merged from data from nine crystals for apo-SMOΔC and two crystals for vismo-SMOΔC. Data collection statistics are shown in Extended Data Table 1.

Structure determination, refinement and analysis

The apo-SMOΔC structure was solved by molecular replacement in PHASER51 using the structure of human SMO TMD (PDB 4QIM16), zebrafish SMO CRD (PDB 4C7910) and BRIL (PDB 4EIY52) as search models. Extra electron density accounting for the region between CRD and LD, BRIL and TMD was immediately discernible after density modification in PARROT53 (Extended Data Fig. 3e). We also observed extra density within the CRD ligand binding pocket (Fig. 2a) and assigned this to cholesterol bound in a stereo-specific orientation based on shape, coordination and refinement statistics (cholesterol addition improved the R-factors by over 1%), which is also in agreement with the markedly lower B-factor of the refined cholesterol compared to the protein backbone. Cholesterol may be derived from the cells (mM concentrations within the cell membrane) or from the LCP crystallization mix (that contained 10% (w/v) cholesterol). The apo-SMOΔC polypeptide chain was traced using iterative rounds of BUCCANEER54, manual building in COOT55 and refinement in autoBUSTER56 and PHENIX57. This resulted in a well-defined model for the apo-SMOΔC structure that included two molecules of SMO (residues 59–549) with a BRIL protein segment inserted between SMO TMD helices 5 and 6, two N-linked glycans and a cholesterol molecule. We observed a systematic disorder along the c axis (Extended Data Fig. 2a), resulting in alternating ordered and less ordered hydrophilic layers within the LCP-grown crystals. This was not caused by crystal non-isomorphy or pseudo-symmetry, because reducing the space group from C2 to P1 had no effect on the disorder. The vismo-SMOΔC structure was solved by molecular replacement using the apo-SMOΔC structure. Extra electron density accounting for vismodegib and two well-ordered monoolein molecules was immediately apparent. The structure was refined using autoBUSTER56 and PHENIX57 with non-crystallographic and secondary structure restraints.
Crystallographic and Ramachandran statistics are given in Extended Data Table 1. Stereochemical properties were assessed by MOLPROBITY58. Superpositions were calculated using the program COOT55, electrostatic potentials were generated using APBS59 and hydrophobicity was calculated according to the Eisenberg hydrophobicity scale60, as implemented in PyMOL61. Buried surface areas of protein–protein interfaces were calculated using the PISA webserver62 with a probe radius of 1.4 Å. Sequence alignment was performed using MULTALIN63 and formatted with ESPRIPT64. Program Caver was used with default settings to visualize the SMO TMD ligand binding pocket65.

Amphipol exchange and MALS

In order to avoid background light-scattering due to free detergent in solution, protein samples were exchanged into amphipol66 (A8-35, Anatrace) at a mass ratio of 3:1 amphipol:protein and rotated at room temperature for 30 min. BioBeads (BIORAD), equilibrated in detergent-free buffer (10 mM HEPES pH 7.5, 300 mM NaCl), were added to the protein–detergent–amphipol mixture at 10 mg per 100 μl and incubated overnight at 4 °C to remove all detergent molecules. For multi-angle light scattering (MALS) experiments, amphipol-solubilized protein at 1 mg ml–1 was loaded onto a Superose 6 10/300 column (GE Healthcare), equilibrated in detergent-free buffer, on a Shimadzu system with inline MALS detector (Wyatt). Data were analysed using ASTRA 6.1.2 software (Wyatt). For SMOΔC, the values used for dn/dc and ε at 280 nm were 0.185 ml g–1 and 1.541 ml (mg.cm)–1, respectively. For protein conjugate analysis, the dn/dc used for amphipol A8-35 was 0.15 ml g–1 (ref. 67).
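The principle behind protein conjugate analysis can be sketched as a two-detector linear system: the UV and refractive index signals are each a weighted sum of the protein and amphipol concentrations, so the two unknowns can be solved for at each elution slice. The sketch below is a simplified illustration, not the ASTRA implementation. The dn/dc and ε values for SMOΔC and the amphipol dn/dc are those given above; the zero extinction coefficient for amphipol at 280 nm, the 1 cm path length and the input signals are assumptions made for the example.

```python
def conjugate_concentrations(a280, delta_ri, path_cm=1.0,
                             eps_protein=1.541, dndc_protein=0.185,
                             eps_modifier=0.0, dndc_modifier=0.15):
    """Return (protein, modifier) concentrations in mg/ml from UV and RI signals.

    Model (concentrations c in mg/ml, eps in ml/(mg cm), dn/dc in ml/g):
        A280     = path * (eps_p * c_p + eps_m * c_m)
        delta_RI = (dndc_p * c_p + dndc_m * c_m) / 1000
    eps_modifier = 0 is an assumption (amphipol taken as UV-transparent).
    """
    a11, a12 = path_cm * eps_protein, path_cm * eps_modifier
    a21, a22 = dndc_protein / 1000.0, dndc_modifier / 1000.0
    det = a11 * a22 - a12 * a21
    # Cramer's rule for the 2 x 2 system.
    c_protein = (a280 * a22 - a12 * delta_ri) / det
    c_modifier = (a11 * delta_ri - a21 * a280) / det
    return c_protein, c_modifier


# Hypothetical slice: signals generated for 1 mg/ml protein and 3 mg/ml amphipol.
a280 = 1.541 * 1.0
delta_ri = (0.185 * 1.0 + 0.15 * 3.0) / 1000.0
c_p, c_m = conjugate_concentrations(a280, delta_ri)
protein_fraction = c_p / (c_p + c_m)
print(f"protein {c_p:.3f} mg/ml, amphipol {c_m:.3f} mg/ml, fraction {protein_fraction:.2f}")
```

Multiplying the protein mass fraction by the molar mass of the whole conjugate, as obtained from light scattering, gives the molar mass attributable to the protein alone.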

Molecular dynamics system setup

Simulations were performed using the GROMACS v4.6.3 simulation package68. Side chain ionization states were modelled using pdb2gmx (Histidine) and PropKa (all other residues)69,70. The N and C termini were treated with neutral charge. Intracellular loop 3 (occupied by the BRIL fusion in our crystal structure) was modelled using coordinates from the PDB entry 4N4W (ref. 16). The protein structure was then energy-minimized using the steepest descents algorithm implemented in GROMACS, before being converted to a coarse-grained representation using the MARTINI 2.2 force field71. The energy-minimized coarse-grained structure was centred in a simulation box with dimensions 100 × 100 × 180 Å3. 270 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) lipids were randomly placed around the protein and the system solvated and neutralised to a concentration of 0.15 M NaCl. An initial 1 μs of coarse-grained simulation was applied to permit the self-assembly72 of a POPC lipid bilayer around the GPCR. During the coarse-grained simulation, the structure of the protein was maintained by an elastic network, allowing local conformational flexibility of the protein. Thus, the protein was able to adopt its optimal orientation within the lipid bilayer73. The endpoint of the coarse-grained bilayer self-assembly simulation was converted back to atomic detail using a fragment-based protocol for the lipid conformations74, while retaining the original crystal structure of the protein, now located in its optimal orientation and position within the lipid bilayer. Equilibration of the atomic system was achieved through 1 ns of NPT simulation with the protein coordinates restrained, before the system was subjected to 100 ns of unrestrained atomistic molecular dynamics. Simulations were performed both in the presence and absence of the cholesterol ligand. Five repeat simulations were run for each case.

Coarse-grained simulations

The standard MARTINI force field75 and its extension to proteins71,76 were used to describe all system components. During the coarse-grained self-assembly simulation an ELNEDYN network77 was applied to the protein using a force constant of 500 kJ mol–1 nm–2 and a cutoff of 1.5 nm. Temperature was maintained at 310 K using a Berendsen thermostat78 with a coupling constant of τt = 1 ps, and pressure was controlled at 1 bar using a Berendsen barostat78 with a coupling constant of τp = 1 ps and a compressibility of 5 × 10−6 bar–1. Electrostatics and van der Waals interactions in the CG simulations were shifted between 0 and 1.2 nm, and 0.9 and 1.2 nm, respectively, using the standard MARTINI protocol75. An integration time step of 20 fs was applied. Covalent bonds were constrained to their equilibrium values using the LINCS algorithm79. All simulations were run in the presence of standard MARTINI water particles75, and ions were added to an approximate concentration of 0.15 M NaCl.

Atomistic simulations

Atomistic simulations were run using the GROMOS53a6 force field80, and its extension to glycans81. The system was solvated using the SPC water model, and ions added to yield an electrically neutral system with a NaCl concentration of approximately 0.15 M. Systems contained approximately 140,000 atoms including 270 POPC molecules, ~41,000 water molecules, 149 sodium ions, and 154 chloride ions. Periodic boundary conditions were applied, with a simulation time step of 2 fs. Temperature was maintained at 310 K using a V-rescale thermostat82 with a coupling constant of 0.1 ps, while pressure was controlled at 1 bar through coupling to a Parrinello–Rahman barostat83 with a coupling constant of 1 ps. Particle Mesh Ewald (PME)84 was applied to model long-range electrostatics. The LINCS algorithm was used to constrain covalent bond lengths79. The g_dist, g_rmsf and g_rms tools implemented in the GROMACS v4.6.3 software package68 were applied to analyse the simulations, with VMD85 and PyMOL61 used for visualization. Cα r.m.s.d. calculations for the CRD were performed after a least-squares fit of the trajectory to Cα particles of the initial (crystal structure) CRD coordinates. Cα r.m.s.d. calculations for the TMD were performed after a least-squares fit to Cα particles of the initial (crystal structure) transmembrane helix coordinates, excluding the inter-helix loop regions of the TMD from this calculation. DSSP matrices were produced using the do_dssp tool implemented in GROMACS68.
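For orientation, the atomistic run parameters stated above can be collected into a GROMACS-style parameter fragment. The sketch below is not the input file used in this study: the key names are standard GROMACS .mdp options, the values for time step, thermostat, barostat, electrostatics and constraint algorithm come from the text, and the integrator and pbc entries are assumptions. Settings that the text does not specify (coupling groups, cutoffs, compressibility, output control) are deliberately left out, so the fragment is incomplete as a run input.

```python
# Values from the text unless marked as an assumption.
ATOMISTIC_PARAMS = {
    "integrator": "md",               # assumption: leap-frog integrator
    "dt": 0.002,                      # ps, i.e. a 2 fs time step
    "tcoupl": "v-rescale",
    "tau_t": 0.1,                     # ps
    "ref_t": 310,                     # K
    "pcoupl": "Parrinello-Rahman",
    "tau_p": 1.0,                     # ps
    "ref_p": 1.0,                     # bar
    "coulombtype": "PME",
    "constraint_algorithm": "lincs",
    "pbc": "xyz",                     # assumption: periodic in all three dimensions
}


def steps_for(duration_ns, dt_ps):
    """Number of integration steps needed to cover duration_ns at time step dt_ps."""
    return round(duration_ns * 1000.0 / dt_ps)


def render_mdp(params, nsteps):
    """Render parameters as 'key = value' lines, with nsteps appended last."""
    lines = [f"{key:<22} = {value}" for key, value in params.items()]
    lines.append(f"{'nsteps':<22} = {nsteps}")
    return "\n".join(lines)


equilibration_steps = steps_for(1, ATOMISTIC_PARAMS["dt"])    # 1 ns restrained NPT
production_steps = steps_for(100, ATOMISTIC_PARAMS["dt"])     # 100 ns unrestrained
production_mdp = render_mdp(ATOMISTIC_PARAMS, production_steps)
print(production_mdp)
```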

SAXS experiments

For size-exclusion chromatography-coupled small-angle X-ray scattering (SEC–SAXS), amphipol-exchanged untreated and 20(S)-OHC-treated (~5 mM 20(S)-OHC) SMOΔC were loaded onto a 4.8-ml KW-403 column (Shodex), equilibrated in a detergent-free buffer, on an Agilent 1260 system (B21, Diamond Light Source). Approximately 45 μl of sample was injected at 9.6 (20(S)-OHC-treated) or 13 (apo) mg ml–1 using a flow rate of 160 μl per min. Chromatographic elution was directed into a specialized SAXS flow cell, with a 1.6 mm path length, held at 20 °C. SAXS measurements were made using a sample-to-detector distance of 4.09 m at a wavelength of 1 Å. SAXS images (frames) were collected as a continuous set of 3 s exposures across the elution peak. The corresponding buffer background frames for producing the background-subtracted SAXS curve were collected at greater than 1.5 column volumes.

Images were corrected for variations in beam current, normalized for exposure time and processed into 1D scattering curves using in-house beamline software (GDA). Buffer subtractions and all subsequent analyses were performed with the program ScÅtter (http://www.bioisis.net/scatter). Samples were checked for radiation damage by visual inspection of the Guinier region as a function of exposure time. Considerable differences between the SAXS curves (q > 0.05 Å–1) of the treated and untreated samples imply major structural differences between the two states. Large differences between the two P(r) distributions imply a significant structural change in the 20(S)-OHC state.
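The Guinier region referred to above is the low-q range in which ln I(q) is approximately linear in q², with slope −Rg²/3. The sketch below shows how a radius of gyration could be estimated per frame by a straight-line fit, so that a drift in Rg with exposure time could be followed numerically. This complements, and is not a description of, the visual inspection carried out in ScÅtter; the data are synthetic and the function name is ours.

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I = ln I0 - (Rg^2 / 3) q^2 by least squares; return (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rg = math.sqrt(-3.0 * slope)   # valid only when the slope is negative
    return rg, math.exp(intercept)


# Synthetic, noise-free frame: Rg = 40 Å, I0 = 100, with q in Å^-1 and q*Rg <= 1.2.
true_rg, true_i0 = 40.0, 100.0
q_values = [0.005 + 0.0025 * i for i in range(11)]
intensities = [true_i0 * math.exp(-(true_rg ** 2) * qi ** 2 / 3.0) for qi in q_values]

rg, i0 = guinier_fit(q_values, intensities)
print(f"Rg = {rg:.2f} Å, I0 = {i0:.2f}")
```

With real data, the fit should be restricted to the range where q·Rg remains small (a limit of about 1.3 is commonly used for globular particles). A systematic increase in fitted Rg across successive frames would be one indication of radiation-induced aggregation.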