Abstract
Until now, the access of ligands to the binding pocket of a G-protein-coupled receptor has scarcely been studied with molecular-modeling techniques because suitable algorithms have been lacking. Neither Monte-Carlo nor molecular-dynamics simulations can be used to calculate the penetration of a ligand into the binding pocket of a receptor, because the computing time required is excessive. Therefore, a new algorithm, LigPath, has been developed for the approximate calculation of a ligand’s pathway into the binding pocket. The algorithm combines directional guiding of the ligand, Monte-Carlo search and energy minimization. To evaluate its performance, the guinea-pig histamine H1 receptor was investigated in combination with one of its potent agonists, histaprodifen, which is proposed to bind in a pocket deep between the transmembrane helices of the receptor. Our calculations show that the amino acids Tyr194, Phe193, Phe436 and Phe433 guide the positively charged histaprodifen from the extracellular part of the receptor into the binding pocket.
Introduction
Computational methods play an important role in the development of potential drugs. In particular, molecular-modeling techniques [1–16] are used to optimize the fit of a potential ligand into the proposed binding pocket of a G-protein-coupled receptor (GPCR), because this docking is considered to be responsible for an agonistic or antagonistic effect. Most modeling studies neglect the transport of the ligand through the transmembrane helices into the binding pocket of the GPCR. As a result, the amino acids that lead the ligand into the binding pocket remain unknown, and no predictions can be made about kinetic aspects related to experimentally determined rate constants. Molecular-dynamics simulations include the time variable and should therefore, in principle, be appropriate for modeling the transport of a ligand into the binding pocket. However, because of the large amount of computing time needed, this method is not practical for this problem. Steered molecular-dynamics simulations, as described in the literature [2–4], apply a constant external force to the ligand in one direction to enforce ligand movement. However, the choice of this force is arbitrary, and nothing is known about its appropriate magnitude. A further technique, used in combination with molecular-dynamics simulations, is known for calculating the unbinding pathway [17]. In this technique, the ligand is moved incrementally from the binding pocket to the surface of the protein in a small number of steps. At each of these intermediate points on the pathway, minimization and molecular-dynamics simulations are carried out. Because of the large distance between the intermediate points, up to 0.2 nm, a loss of conformational information along the unbinding pathway must be expected.
Therefore, the aim here is to calculate a sequence of closely spaced conformations that describes the path of a ligand into the binding pocket, based solely on the potential-energy surface and, so far, without kinetic aspects. Nevertheless, the amino acids that are essential for the transport of the ligand into the binding pocket should be predicted. To this end, the new algorithm LigPath has been developed and applied to the calculation of the pathway of histaprodifen (Fig. 1), a potent agonist at the guinea-pig histamine H1 receptor (gpH1R) [18–20].
Methodology
As already mentioned, traditional modeling techniques are not suitable for solving this problem. The LigPath algorithm was therefore developed to limit CPU time and to deliver results within a reasonable calculation time. To obtain results after a few days of calculation without large computer clusters, prior information must be supplied to the calculation. A natural choice is to provide a source structure as starting information. This source structure should be a minimized ligand–receptor complex with the ligand positioned at the extracellular part of the receptor, where the entry to the binding pocket is located. To avoid extensive searches in regions of the energy surface that are not involved in ligand transport, the search direction is given by the destination of the ligand in the proposed binding pocket.
Essentially, the algorithm is based on a generation-child scheme. Beginning with the starting structure, n child conformations are generated randomly and energy minimized. The best child of the current generation is determined and used as the new starting structure for the next generation.
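The generation-child scheme can be illustrated with a minimal sketch. It is not the authors' C implementation: the callables `make_child`, `minimize`, `score` and `reached` are hypothetical placeholders for the random moves, the GROMACS minimization, the selection quantity and the stopping test, and the sketch assumes that a lower score means a better child.

```python
import random


def run_generations(start, make_child, minimize, score, reached,
                    n_children, max_generations, seed=0):
    """Generic generation-child loop (illustrative only).

    Each generation creates n_children random variants of the current
    best structure, minimizes them, and keeps the best-scoring child
    as the starting structure of the next generation.
    """
    rng = random.Random(seed)
    best = start
    path = [start]
    for _ in range(max_generations):
        if reached(best):  # stop once the destination is reached
            break
        children = [minimize(make_child(best, rng)) for _ in range(n_children)]
        best = min(children, key=score)  # assumption: lower score is better
        path.append(best)
    return path


# Toy usage: a 1-D "ligand" at x = 1.0 is guided toward a destination at 0.
path = run_generations(
    start=1.0,
    make_child=lambda x, rng: x - rng.uniform(0.0, 0.1),
    minimize=lambda x: x,  # no-op stand-in for energy minimization
    score=abs,             # distance to the destination
    reached=lambda x: abs(x) < 0.05,
    n_children=10,
    max_generations=1000,
    seed=56789,
)
print(len(path) - 1, "generations, final position", path[-1])
```

Note that the best child is always accepted, even if it scores worse than its parent, which matches the description above; the sequence of accepted children is the calculated pathway.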
In Fig. 2, the flow chart of the algorithm LigPath is given. At the beginning, the source structure and the ligand destination are read in. Starting with the source structure, n new child structures are calculated in five steps:
- Each atom of the ligand is translated by Δr along a virtual guiding line between its current coordinates and the corresponding destination coordinates. Furthermore, each atom is allowed to deviate from the guiding line by an angle Δφ.
- After translation, the ligand is rotated by angles of Δα, Δβ and Δγ about the x-, y- and z-axes. The ligand atom that defines the center of rotation is chosen randomly in each step.
- Next, rotations by an angle of Δρ about all defined rotatable bonds of the ligand are carried out.
- Next, rotations by an angle of Δρ about all rotatable bonds of the defined amino acids (Fig. 3) of the receptor are carried out.
- Finally, a random translation ΔTM_xy of the transmembrane helices within the xy plane, coupled with a random angular variation Δθ_z of the transmembrane helices with respect to the z axis, is allowed for n_TM children of the whole set of n children.
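The first of these steps, the guided translation of one atom, might be sketched as follows. This is an illustration under stated assumptions, not the LigPath source: the function name `guided_step` is hypothetical, and the deviation is assumed to be realized by tilting the step direction away from the guiding line by a random angle between 0 and Δφ in a random direction.

```python
import math
import random


def _unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def guided_step(pos, dest, dr, dphi_max_deg, rng):
    """Move one atom by dr (nm) toward dest, tilted by at most dphi_max_deg."""
    line = tuple(q - p for p, q in zip(pos, dest))
    if math.sqrt(sum(c * c for c in line)) < 1e-12:
        return tuple(pos)  # atom already at its destination
    d = _unit(line)  # direction of the guiding line
    # Random unit vector perpendicular to the guiding line.
    while True:
        r = (rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
        p = _cross(d, r)
        if sum(c * c for c in p) > 1e-12:
            break
    p = _unit(p)
    # Tilt the step direction by a random angle within the allowed limit.
    phi = math.radians(rng.uniform(0.0, dphi_max_deg))
    step = tuple(math.cos(phi) * dc + math.sin(phi) * pc for dc, pc in zip(d, p))
    return tuple(x + dr * s for x, s in zip(pos, step))


rng = random.Random(56789)
print(guided_step((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.02, 20.0, rng))
```

The sketch does not clamp the step when the remaining distance is smaller than Δr; a stopping tolerance on the rmsd, as described for LigPath, would handle that case.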
After these translational and rotational motions, which are carried out with randomly chosen values within user-defined limits, very small interatomic distances, smaller than optimal van der Waals distances, are expected between some atoms of ligand and receptor. To remove these close contacts, the generated structure must be energy minimized. However, bad starting structures can result in long minimization times or in poor minimization results. Therefore, the ligand–receptor and receptor–receptor interatomic distances involving all displaced atoms are calculated. If there are distances below a defined limit Δr_coll, translation and rotation of the ligand and rotation of the corresponding amino-acid side chains (Fig. 2) are repeated until there are no more interatomic collisions. If the structure thus generated is accepted within the defined limits, it is minimized using the software package GROMACS 3.2 [21]. Our algorithm, written in the programming language C, is linked to the GROMACS 3.2 minimization routine by a shell script, using Linux as the operating system. The minimization parameters can be defined within GROMACS 3.2 just as in standalone mode. After minimization of all child structures, the best of them could in principle be determined from the change in the potential energy of the whole simulation box. However, because the environment, containing water and lipid, has a large impact on this energy, unfavorable conformations of the ligand–receptor complex cannot be detected in this way. Thus, the potential energy of the ligand–receptor complex is used in the calculation of the quantity q_j (Eq. 1) to determine the best child.
The term E_j(i) is the potential energy of the ligand–receptor complex of child j in the current generation i, whereas E(i−1) is the potential energy of the ligand–receptor complex of the best child of the previous generation (i−1). Specifically, E_0(0) stands for the potential energy of the ligand–receptor complex in the starting structure. Denoting the coordinates of ligand atom k of child j in generation i by \(x^{k}_{j}(i)\), \(y^{k}_{j}(i)\) and \(z^{k}_{j}(i)\), the corresponding rmsd_j(i) (Eq. 2) describes the spatial distance between the current position of the ligand, which consists of N atoms, and its destination position, given by \((x^{k}_{\text{dest}},\; y^{k}_{\text{dest}},\; z^{k}_{\text{dest}})\). The symbol rmsd_0(0) denotes the corresponding value for the starting structure.
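For orientation, a root-mean-square deviation consistent with the symbols defined above can be written as follows. This is the standard, unweighted definition and is assumed here rather than quoted from Eq. 2, so details such as atom weighting may differ from the authors' expression.

```latex
\[
\mathrm{rmsd}_j(i) =
\sqrt{\frac{1}{N}\sum_{k=1}^{N}
\Big[
\big(x^{k}_{j}(i)-x^{k}_{\text{dest}}\big)^{2}
+\big(y^{k}_{j}(i)-y^{k}_{\text{dest}}\big)^{2}
+\big(z^{k}_{j}(i)-z^{k}_{\text{dest}}\big)^{2}
\Big]}
\]
```

With this form, rmsd_j(i) approaches zero as every ligand atom approaches its destination coordinates, which is the condition tested by the stopping criterion.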
The calculation is stopped if the ligand has reached the defined destination position within defined limits. Thus, the LigPath algorithm is based on a combination of directional guiding (translation of the ligand on the guiding line), Monte-Carlo search (random motions of ligand and receptor) and minimization.
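The collision screening applied to each child before minimization can be sketched as follows. The function name `find_collisions` is hypothetical, the sketch covers only ligand–receptor pairs (the receptor–receptor check described above would follow the same pattern), and it uses a simple all-pairs loop rather than whatever neighbor search the original C code may use.

```python
import math


def find_collisions(ligand, receptor, r_coll=0.1):
    """Return (i, j, distance) for ligand/receptor atom pairs closer than r_coll.

    Coordinates are (x, y, z) tuples in nm; r_coll corresponds to the
    user-defined limit called Δr_coll in the text.
    """
    clashes = []
    for i, a in enumerate(ligand):
        for j, b in enumerate(receptor):
            d = math.dist(a, b)
            if d < r_coll:
                clashes.append((i, j, d))
    return clashes


# Toy usage: ligand atom 0 sits 0.05 nm from receptor atom 0.
ligand = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
receptor = [(0.05, 0.0, 0.0), (1.0, 1.0, 1.0)]
print(find_collisions(ligand, receptor, r_coll=0.1))
```

A child for which this list is non-empty would be moved again, as described above, and only collision-free structures would be passed on to the minimization.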
Evaluation of the LigPath algorithm
In order to evaluate the new LigPath algorithm, we performed a calculation on a known system [3]. To this end, we used the crystal structure of bacteriorhodopsin (bR) (1FBB) [22, 23] as a template, prepared the retinal–bacteriorhodopsin complex according to [3] and minimized the complex with GROMACS 3.2 [21] in combination with the ffG53A6 force field [24]. To avoid the constraints on bacteriorhodopsin described in [3], the protein was embedded in the surrounding medium. The protein was placed manually in a POPC (1-palmitoyl-2-oleoyl-phosphatidylcholine) membrane bilayer (104 molecules) using the software package VEGA [25]. In addition, 13,216 SPC water molecules and five sodium and chloride ions were added to the simulation box. After further energy minimization, the resulting system was treated with the LigPath algorithm. All possible rotatable bonds of the retinal were included in the “rotable bond” module of LigPath (Fig. 2). The rotatable bonds for the amino acids are given in Fig. 3. After equilibration of the simulation box (seed = 56,789, n = 2, Δr = 0.0 nm, Δφ = 20°, Δα = Δβ = Δγ = 0°, Δρ = 2.5°, n_TM = 0, ΔTM_xy = 0 nm, Δθ_z = 0°, Δr_coll = 0.1 nm, E_0(0) = −15,544 kJ mol⁻¹, rmsd_0(0) = 1.62 nm), the unbinding pathway of retinal was calculated with LigPath (seed = 56,789, n = 10, Δr = 0.02 nm, Δφ = 20°, Δα = Δβ = Δγ = 2.5°, Δρ = 2.5°, n_TM = 0, ΔTM_xy = 0 nm, Δθ_z = 0°, Δr_coll = 0.1 nm, E_0(0) = −16,032 kJ mol⁻¹, rmsd_0(0) = 1.62 nm).
In Fig. 4, different calculated quantities are plotted against the number of generations; negative numbers indicate the equilibration process. Figure 4a shows the distance between the carbonyl oxygen of retinal and the nitrogen of Lys216 during unbinding, which takes place within generations 0–125 and is characterized by an increase of approximately 1.2 nm. The change in the potential energy of the ligand–receptor complex from the initial state to the final state is approximately 250 kJ mol⁻¹. During the unbinding process, this quantity passes through a maximum (Fig. 4b). In Fig. 4c, the variation of the rmsd for all non-hydrogen atoms of the protein is shown with reference to the initial structure. A comparison with the results given in [3] shows quantitative agreement for the O–N distance but only qualitative agreement for the rmsd: because the constraints described in [3] are replaced by the lipid/solvent environment, the increase in rmsd is smaller in our calculation. The use of a newer, improved force field is responsible for the larger differences in the variation of the potential energy of the ligand–receptor complex and results, in particular, in the pronounced maximum mentioned above.
Qualitative agreement with the structural results given in [3] could also be found. The retinal interacts sequentially with the amino acids Lys216-Tyr185-Trp189-Pro186/Trp138-Met142/Thr139 during its unbinding pathway. As described in a short overview [3], the amino acids Tyr185, Trp189, Pro186, Trp138 and Met142 show an effect on the reconstitution rates in appropriate bR mutants.
In summary, the results for the unbinding pathway of retinal in bR obtained with the LigPath algorithm are in good agreement with those predicted by molecular-dynamics simulations [3].
Calculation of the binding pathway of histaprodifen
Preparation of the system
Construction of a receptor model
First, a model of the gpH1R was generated. The sequence of gpH1R was aligned to bovine rhodopsin according to Ballesteros et al. [26] (Fig. 5). Using the 3D crystal structure of bovine rhodopsin (1F88) [22, 27] as a template, the 3D structure of the receptor was generated by comparative homology modeling in combination with the Loop Search module of the software package SYBYL 7.0 [28]. Because sufficient experimental data on the structure of the intracellular C3 loop (189 amino acids) are lacking, this loop was only partially included in the modeling studies. This approximation should have little influence on the modeling of the entry to the binding pocket and of the binding pocket itself. The receptor was then minimized carefully, paying particular attention to the correct orientation of the helical amino-acid side chains that face the interior of the transmembrane helix bundle or the lipids, as predicted [26]. The homology model thus generated is a sound basis for further modeling studies, such as docking of ligands into the proposed binding pocket. However, for calculations that include the entry to the binding pocket on the extracellular side of the receptor, the lack of an environment such as a membrane and aqueous extra- and intracellular regions would be a serious approximation. Therefore, it is state of the art to embed the receptor in the surrounding medium. Using the software package VEGA [25], the receptor was placed manually in a POPC membrane bilayer (104 molecules) [29]. Additionally, the histaprodifen was positioned manually at the proposed entry between the E3 loop and the N-terminus at the extracellular side of the receptor. With GROMACS 3.2 [21], a simulation box containing intra- and extracellular water (12,735 molecules) was constructed. Because receptor and ligand are positively charged, electroneutrality was achieved by placing seven sodium and 25 chloride ions inside the box. The whole system was minimized with GROMACS 3.2 [21].
Parameters for calculation of the pathway
Definition of rotatable bonds
The rotatable bonds of histaprodifen are illustrated in Fig. 6.
Initialization parameters
For our minimization with GROMACS 3.2 [21], the internal GROMACS force field ffG53A6 [24] was used. The parameters for the LigPath algorithm are given in Table 1.
Changes in the surrounding medium
The surrounding medium (POPC lipid bilayer, water, sodium and chloride ions) was included during the whole calculation. The only changes in solvent/lipid positions were introduced by the minimization steps, whereby no solvent clashes were observed.
Calculation of the starting structure
To guarantee an optimal orientation of the amino-acid side chains, the whole simulation box was first minimized using the LigPath algorithm with the initialization parameter set run0 (Table 1), in which the characteristic variables Δr, Δα, Δβ and Δγ are set to zero. Within the first 100 generations, the potential energy of the whole simulation box decreased rapidly from about −7.9×10⁵ kJ mol⁻¹ to about −8.34×10⁵ kJ mol⁻¹. During the next 900 generations, the potential energy decreased only slightly, to −8.4×10⁵ kJ mol⁻¹ (Fig. 7). The last structure resulting from this minimization cycle was used as the source structure for the pathway calculation for histaprodifen.
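The two phases of this preminimization can be compared through the mean energy change per generation. The short calculation below uses only the approximate energies quoted above; the function name is illustrative, and the results inherit the rounding of those quoted values.

```python
def mean_change_per_generation(e_start, e_end, generations):
    """Mean potential-energy change per generation in kJ/mol."""
    return (e_end - e_start) / generations


# Approximate values quoted in the text (kJ/mol, whole simulation box).
fast_phase = mean_change_per_generation(-7.9e5, -8.34e5, 100)
slow_phase = mean_change_per_generation(-8.34e5, -8.4e5, 900)

print(f"first 100 generations: {fast_phase:.0f} kJ/mol per generation")
print(f"next 900 generations:  {slow_phase:.1f} kJ/mol per generation")
```

On these figures the mean decrease drops from about 440 to about 7 kJ mol⁻¹ per generation, which is consistent with the energy leveling off before the pathway calculation starts.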
Definition of the destination structure
The destination structure is based on the binding mode described for histaprodifen at the human H1R by Elz et al. [18]. By analogy, the binding mode for histaprodifen at the gpH1R was modeled with SYBYL [28]. In the binding pocket of the gpH1R, the histaprodifen interacts with the amino acids Asp116, Tyr117, Ile124, Trp167, Phe208, Pro211, Phe429, Tyr432, Phe433 and Phe436. The resulting coordinates of the histaprodifen relative to the coordinates of the gpH1R were used as the destination structure in the pathway calculation.
Results and discussion
The algorithm described allows the calculation of the pathway of histaprodifen into the proposed binding pocket of the gpH1R. The calculations were carried out including the environment (lipid bilayer and extracellular and intracellular water). Neglecting the environment destroys the tertiary structure of the GPCR during the calculation, so constraints on the backbone atoms would have to be set. However, such constraints would have an artificial impact on the system. The inclusion of the environment, in contrast, leads to a natural stabilization of the tertiary structure without loss of receptor flexibility. The quality of the resulting receptor structures was checked with Ramachandran plots.
Figure 7 shows the potential-energy curves for minimization of the simulation box (run0) and for the five pathway calculations (run1 to run5). During the first 1,500 generations without ligand penetration, the potential energy smoothly approaches a limiting value. When ligand penetration is allowed (starting with generation no. 1,500), the potential energy decreases rapidly until the ligand reaches the proposed binding pocket. Variations in the initialization parameters lead to deviations in the energy range of about 5%, caused by different orientations of the ligand or the amino-acid side chains.
In Fig. 8, the potential energy of the ligand–receptor complex in the whole simulation box is given for the above runs as a function of the ligand’s rmsd. The closed lines represent the minimum-energy path of each run. The dots show the potential energy of all children produced during the calculations. The five runs show good agreement in their energy profiles with respect to local minima and maxima. The mean values, together with the upper and lower limits of the potential energy for all runs, are shown in Fig. 9.
Six representative snapshots of significant structures along the whole pathway are given in Fig. 10. Structure (a) shows the result of the preminimization of the simulation box. The part of the extracellular receptor surface containing the amino acids Tyr194 and Glu448 can be identified as a structure-recognition system for histaprodifen. After about 30 generations (structure (b)), the rmsd of the ligand is about 2.1 nm (Fig. 9). The potential energy of the ligand–receptor complex has a local minimum there because of the stabilization of the diphenylpropyl moiety of the ligand. This stabilization is caused by an aromatic interaction between a phenyl group of the ligand and the amino acid Tyr194 of the receptor. This amino acid leads the ligand to the entrance of the hydrophobic channel of the receptor. The terminal \(\text{NH}_{3}^{+}\) moiety, however, does not change its position, so the ligand rotates from a horizontal into a vertical orientation. This reorientation destabilizes the ligand–receptor complex, so the potential energy increases by about 400 kJ mol⁻¹ at an rmsd of 1.8 nm (Fig. 9). In snapshot (c) (Fig. 10), the diphenylpropyl moiety is completely immersed in the receptor, so the hydrophobic part of the histaprodifen no longer undergoes a repulsive interaction with the polar water on the extracellular side of the receptor. The potential energy of the ligand–receptor complex varies between −18,900 and −18,700 kJ mol⁻¹ in the rmsd range from 1.7 to 0.3 nm (Fig. 9). This part of the pathway is shown by snapshots (d) to (f) (Fig. 10). There, the amino acids Phe193, Phe436 and Phe433 lead the histaprodifen through the channel into the binding pocket. This guidance is based mostly on successive aromatic interactions of these amino acids with one of the ligand’s phenyl groups. In snapshot (f) (Fig. 10), the histaprodifen is shown in the binding pocket of the receptor. In the LigPath calculation, the same histaprodifen–amino acid interactions were found as described above.
The homology model of the receptor is based on the structure of bovine rhodopsin, an inactive GPCR. However, during the penetration of the ligand, slight changes in the mutual orientation of some transmembrane helices can be observed, accompanied by a structural rearrangement of the binding pocket for optimal docking of the histaprodifen. Figure 11 shows the difference in helix orientation before (grey) and after docking of the ligand (red, green and blue). During the pathway calculation, the transmembrane helices TM5 and TM6 exhibit the largest changes in helix orientation compared with the starting structure, induced by Pro211 (TM5, red circle, Fig. 11) and Pro431 (TM6, red circle, Fig. 11). Figure 12 shows the corresponding rmsd of the transmembrane helices relative to the starting structure as a function of the number of generations. After equilibration without further helix movement (generations −249 to 0, Fig. 12), a considerable rearrangement of the transmembrane helices is observed during docking of the histaprodifen into the binding pocket (second red arrow, Fig. 12). Further structural changes in the helices are observable over the next 900 generations. TM5 and TM6, with rmsds of 0.34 nm and 0.33 nm, respectively, show the largest deviations. These results may indicate continuous receptor activation during the calculation [30–34].
Conclusion
The LigPath algorithm for a predictive calculation of the binding pathway of a ligand into its proposed binding pocket has been developed. On the basis of this concept, the pathway of histaprodifen into the binding pocket of the gpH1R was calculated. The simulation shows that the histaprodifen is guided step by step into the binding pocket with the participation of the amino acids Tyr194, Phe193, Phe436 and Phe433. To establish experimental proof, point mutations of these amino acids are in preparation. A comparison of experimentally determined rate constants for the binding and unbinding of histaprodifen between the wild-type and mutated gpH1R will be used to verify the prediction.
References
1. Guilbert C, Perahia D, Mouawad L (1995) Comp Phys Comm 95:263–273
2. Izrailev S, Stepaniants S, Isralewitz B, Kosztin D, Lu H, Molnar F, Wriggers W, Schulten K (1998) In: Deuflhard P, Hermans J, Leimkuhler B, Mark AE, Reich S, Skeel RD (eds) Computational molecular dynamics: challenges, methods, ideas, volume 4 of lecture notes in computational science and engineering. Springer, Berlin Heidelberg New York, pp 39–65
3. Isralewitz B, Izrailev S, Schulten K (1997) Biophys J 73:2972–2979
4. Kosztin D, Izrailev S, Schulten K (1999) Biophys J 76:188–197
5. Tokarski JS, Hopfinger AJ (1997) J Chem Inf Comput 37:792–811
6. Biebermann H, Schöneberg T, Schulz A, Krause G, Grüters A, Schultz G, Gudermann (1998) FASEB J 12:1461–1471
7. Czaplewski C, Kazmierkiewicz R, Ciarkowski J (1998) J Comput Aided Mol Des 12:275–287
8. Colson A-O, Perlman JH, Smolyar A, Gershengorn M, Osman R (1998) Biophys J 74:1087–1100
9. Salo OMH, Lahtela-Kakkonen M, Gynther J, Järvinen T, Poso A (2004) J Med Chem 47:3048–3057
10. Gehlhaar DK, Verkhivker GM, Rejto PA, Sherman CJ, Fogel DB, Fogel LJ, Freer ST (1995) Chem Biol 2:317–324
11. Gershengorn MC, Osman R (2001) Endocrinology 142:2–10
12. Barnett-Norris J, Hurst DP, Lynch DL, Guarnieri F, Makriyannis A, Reggio PH (2002) J Med Chem 45:3649–3659
13. Pardo L, Giraldo J, Martin M, Campillo M (1991) Mol Pharmacol 40:980–987
14. Gouldson PR, Winn PJ, Reynolds A (1995) J Med Chem 38:4080–4086
15. Barnett-Norris J, Hurst DP, Buehner K, Ballesteros JA, Guarnieri F, Reggio PH (2002) Int J Quant Chem 88:76–86
16. Gouldson PR, Kidley NJ, Bywater RP, Psaroudakis G, Brooks HD, Diaz C, Shire D, Reynolds CA (2004) PROTEINS: Struc Func Bioinf 56:67–84
17. Sai Ram KVVM, Rambabu G, Sarma JARP, Desiraju GR (2006) J Chem Inf Model (published on Web on 09-May-2006)
18. Elz S, Kramer K, Leschke C, Schunack W (2000) Eur J Med Chem 35:41–52
19. Menghin S, Pertz HH, Kramer K, Seifert R, Schunack W, Elz S (2003) J Med Chem 46:5458–5470
20. Seifert R, Wenzel-Seifert K, Bürckstümmer T, Pertz HH, Schunack W, Dove S, Buschauer A, Elz S (2003) J Pharmacol Exp Ther 305:1104–1115
21. van der Spoel D, Lindahl E, Hess B, van Buuren AR, Apol E, Meulenhoff PJ, Tieleman DP, Sijbers ALTM, Feenstra KA, van Drunen R, Berendsen HJC (2004) GROMACS 3.2. Department of Biophysical Chemistry, University of Groningen, The Netherlands
22. RCSB PDB Protein Data Bank
23. Subramaniam S, Henderson R (2000) Nature 406:653–657
24. Oostenbrink C, Villa A, Mark AE, van Gunsteren WF (2004) J Comput Chem 25:1656–1676
25. Pedretti A, Vistoli G (1996–2006) VEGA ZZ (Software) http://www.ddl.unimi.it/vega/
26. Ballesteros JA, Shi L, Javitch JA (2001) Mol Pharmacol 60:1–19
27. Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Le Trong I, Teller DC, Okada T, Stenkamp RE, Yamamoto M, Miyano M (2000) Science 289:739–745
28. SYBYL 7.0 (2004) Tripos Inc
29. Baldwin JM, Schertler GFX, Unger VM (1997) J Mol Biol 272:144–164
30. Gether U (2000) Endocr Rev 21:90–113
31. Gether U, Kobilka BK (1998) J Bio Chem 273:17979–17982
32. Kobilka B (2004) Mol Pharmacol 65:1060–1062
33. Ghanouni P, Steenhuis JJ, Farrens DL, Kobilka BK (2001) Proc Natl Acad Sci USA 98:5997–6002
34. Luo X, Zhang D, Weinstein H (1994) Prot Eng 7:1441–1448
Cite this article
Straßer, A., Wittmann, HJ. LigPath: a module for predictive calculation of a ligand’s pathway into a receptor-application to the gpH1 - receptor. J Mol Model 13, 209–218 (2007). https://doi.org/10.1007/s00894-006-0152-9
DOI: https://doi.org/10.1007/s00894-006-0152-9