1 Introduction

The interactions between small molecules or small peptides and protein targets underlie many biological processes; the scientific community has therefore been very prolific in developing algorithms, protocols, and methodologies to describe, understand, and control the recognition and formation of protein–ligand and protein–peptide complexes [1–5]. The ability to elucidate the pharmacodynamic properties of low molecular weight compounds or small peptides, along with the possibility of rationally designing novel drugs, relies on the accurate prediction of atomic interactions between ligands and target proteins. However, the ligands’ large number of degrees of freedom and the flexibility of the proteins’ backbone and side chains present a critical challenge for an effective computational description of the ligand–receptor interaction (i.e., docking calculations) [6–8]. Modeling the induced fit phenomenon, whereby both the target and the ligand undergo mutually adaptive conformational changes upon binding, is particularly demanding because of the extensive conformational sampling required for the computational optimization of such interactions [8–10]. To account for this effect, protein conformations determined experimentally (via X-ray crystallography or NMR spectroscopy) and/or computationally (via molecular dynamics or normal mode analysis) have been included in current docking calculations [11–15]. However, multiple conformations of the protein may not be available, or may be biased toward known protein–ligand complex conformations, and may thus fail to capture new rearrangements of protein binding sites upon binding of novel compounds.

To overcome these limitations, we have recently developed a new docking algorithm, MedusaDock [16], which accounts for ligand and receptor flexibility simultaneously. In MedusaDock, we build a stochastic rotamer library for each ligand and concurrently model the protein side-chain conformations using a rotamer library for all natural amino acids. The efficient sampling of our docking approach is coupled with MedusaScore [17], a physical force field-based scoring function that accounts for the protein–ligand interaction energy. The adoption of MedusaScore circumvents the problem of low transferability among different targets and ligands, which is typical of the empirical scoring functions classically used in docking calculations [18, 19]. MedusaDock and MedusaScore have been successfully adopted in the evaluation of the binding properties of both peptides [5] and small molecules [16, 20, 21].

Our docking approach successfully predicted the native conformations of 28 of the 35 study cases proposed in the recent CSAR-2011 competition [20], more than any other group in the exercise (H. Carlson, personal communication). In this chapter, we present a standard protocol for docking the propranolol enantiomers into the binding site of the β2 adrenergic receptor (β2AR). We (1) assess the structural quality of this G protein-coupled receptor’s structure using our in-house developed software Gaia, which compares the intrinsic properties of protein structural models to high-resolution crystal structures (http://chiron.dokhlab.org [22]); (2) generate optimized starting structures of the ligands using widely used molecular modeling tools; and finally (3) calibrate and run docking calculations using MedusaDock [16], which eliminates any possible bias originating from the starting conformations of the amino acids in the β2AR binding pockets.

2 Materials

To implement the reported docking procedure, it is necessary to have access to an internet-connected computer running a Linux operating system, with a licensed copy of the Schrödinger Suite (Schrödinger, LLC) and a licensed copy of the MedusaDock software (Molecules in Action, LLC) installed.

3 Methods

3.1 Protein Preparation

  1.

    Navigate through the Protein Data Bank (PDB) website [23] to download the crystallographic coordinates of the human β2AR at 2.8 Å resolution (PDB-ID: 3NY8 [24]). From the downloaded file, remove the coordinates of (1) the co-crystallized inverse agonist ICI 118,551; (2) water molecules not mediating the binding of ICI 118,551 to β2AR; and (3) molecules used for technical purposes and present in the final crystal structure.

  2.

    To estimate the quality of the resulting β2AR protein structure, run the in-house developed software Gaia [22]. Navigate to http://chiron.dokhlab.org and click on the Submit Task button in the starting page (Fig. 1a). In the step 1 section, enter a Job Title in the dedicated window and upload the file containing the β2AR crystallographic structure in PDB format. You can choose to receive an e-mail notification when the submitted job is completed. In the step 2 section, choose the task Gaia to validate the submitted protein structure. The status of the calculation can be monitored via the panel Gaia, which is accessible by clicking the Home/Overview button in the starting page (Fig. 1a). Upon completion of the job (indicated by a green mark in the Status column), a short report of some protein features is presented on the web page (Fig. 1b). The user can download a detailed report on the structural features of the protein by clicking on the eye icon in the table (Fig. 1b, see Note 1).

    Fig. 1

    (a) Home page of the Chiron/Gaia server for protein structure refinement, which is available at http://chiron.dokhlab.org. (b) Short report of the protein’s structural features from the Chiron/Gaia server. The green mark below the Status column indicates the completion of the job; the eye icon in the table gives access to a detailed report, which can be downloaded in pdf format. (c) Initial summary of the protein’s structural features as downloaded from the Chiron/Gaia server. Values highlighted in red usually need the user’s attention in order to further refine the submitted protein structure (see Note 1). A detailed report about steric clashes, hydrogen bonds in the shell and in the core of the protein, solvent accessible surface area, and void volume is also available to the user.
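The clean-up of the downloaded coordinate file described in step 1 can also be scripted. The following Python sketch is only an illustration under stated assumptions: it relies on the fixed-column PDB layout, it takes the three-letter residue name of the co-crystallized ligand as an input that the user must read from the HETATM records of the file, and it uses a simple distance cutoff to the ligand (3.5 Å, an arbitrary choice) as a crude proxy for "water molecules mediating the binding". The function names are hypothetical, and the selection of waters should always be checked by visual inspection.

```python
import math


def _xyz(line):
    """Cartesian coordinates read from the fixed PDB columns 31-54."""
    return float(line[30:38]), float(line[38:46]), float(line[46:54])


def clean_pdb(lines, ligand_resname, water_cutoff=3.5):
    """Keep protein atoms and the waters lying close to the ligand.

    ligand_resname: residue name of the co-crystallized ligand (user input).
    water_cutoff:   distance in angstroms; an assumed proxy for waters
                    that mediate ligand binding.
    """
    ligand_atoms = [
        _xyz(l) for l in lines
        if l.startswith("HETATM") and l[17:20].strip() == ligand_resname
    ]
    kept = []
    for line in lines:
        if line.startswith("ATOM"):
            kept.append(line)  # protein atoms are always retained
        elif line.startswith("HETATM") and line[17:20] == "HOH":
            if any(math.dist(_xyz(line), a) <= water_cutoff
                   for a in ligand_atoms):
                kept.append(line)  # water in contact with the ligand
        # every other record (ligand, additives, header lines) is dropped
    return kept
```

The returned list can be written back to a new file with `"\n".join(...)`; records such as TER and END are not preserved by this sketch and may need to be restored depending on the downstream software.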

3.2 Ligand Preparation

  1.

    Several applications can be used to prepare the structures of the ligands for docking calculations. In this specific case, we use a number of applications available in the Schrödinger Suite. Starting from the Maestro interface (v. 9.3.5), use the 2D Sketcher tool to draw the chemical structures of the inverse agonist ICI 118,551, co-crystallized with the β2AR protein, and of the two propranolol enantiomers, whose binding modes will be investigated through docking.

  2.

    The ligand structures need to be further optimized using the LigPrep application. The user can choose the appropriate force field (in this case MMFFs [25]) for the optimization of atom distances, angles, and dihedral angles, along with the most appropriate pH for the determination of the formal charges of titratable groups (see Note 2). Several options are available for the determination of the ligands’ stereochemistry. Since we have manually drawn the ligand structures, we determine the appropriate chiralities from the generated 3D structures without constructing any tautomers. The optimized structures of ligands are saved in mol2 format for docking calculations, and in Structure Data Format (i.e., SDF format by MDL Information Systems) for storage.

3.3 Docking Calibration

  1.

    Docking calculations are executed via our Monte Carlo-based algorithm MedusaDock [16], which simultaneously accounts for the flexibility of the ligand and of the receptor side chains. We calibrate docking calculations to the target protein by self-docking a co-crystallized binder retrieved from the PDB (here, ICI 118,551), in order to assess both the convergence of the docking calculations and their ability to reproduce the native pose of the co-crystallized ligand in the β2AR binding site.

  2.

    To test the convergence of docking results, submit increasing numbers of independent docking calculations of ICI 118,551 in the β2AR binding site (e.g., 100, 200, 500) using MedusaDock [16] (see Note 3), and plot the distributions of the binding energies as estimated by MedusaScore [17] (Fig. 2a). The number of calculations beyond which the binding energy distribution of the poses no longer changes is taken as the minimal number of docking runs to submit when exploring the binding modes of compounds (with molecular weight and number of rotatable bonds similar to those of ICI 118,551) in the β2AR binding site.

    Fig. 2

    (a) Convergence of the binding energy distributions of docking poses: the distributions extracted from 200 and 500 independent MedusaDock calculations are reported in green and blue, respectively. (b) Normal distribution (red dashed curve) of the binding energies of docking poses extracted from 200 independent MedusaDock calculations (green bars).

  3.

    The estimated binding energies for all of the docking poses of ICI 118,551 (as for any docked compound) show a normal distribution (Fig. 2b). Therefore, according to the central limit theorem [26], it is possible to retain as statistically significant solutions only those docking poses for which the Z-score is lower than −2 (i.e., less than 5 % probability that the specific docking pose is extracted by chance). In this case, Z is defined as:

    $$ Z=\frac{x-\mu }{\sigma } $$

    where x is the estimated binding energy of a specific docking pose, and μ and σ are the mean and the standard deviation of the binding energies in the population of binding poses, respectively.

  4.

    On the subset of extracted docking poses (i.e., poses with Z-score lower than −2), perform a cluster analysis to retrieve the most representative docking pose (i.e., the centroid of the most populated cluster of poses). Cluster the ensemble of docking solutions according to the root mean square deviation (RMSD) computed over the ligand’s heavy atoms. The optimal number of highly populated clusters can be identified by applying the average linkage method [27] and the Kelley penalty index [28], which together minimize both the number of clusters and the spread of internal values in each cluster. The clustering level with the lowest Kelley penalty represents a condition where the clusters are highly populated and concurrently maintain the smallest internal spread of RMSD values (see Note 4). The centroid of the most populated cluster is chosen as the representative conformation of ICI 118,551 bound to β2AR.

  5.

    Calculate the RMSD of the extracted solution of ICI 118,551 with respect to the original co-crystallized conformation of the ligand in β2AR. The RMSD computed over the ligand’s heavy atoms (1.4 Å) is below the X-ray resolution (2.8 Å). Therefore, the applied strategy is successful in reproducing the native pose of ICI 118,551 as also demonstrated by the consistency with the electron-density map of the crystal as downloaded from the Uppsala Electron Density Server [29] (Fig. 3a).

    Fig. 3

    (a) Superimposition of the MedusaDock docking solution of ICI 118,551 on its crystallographic conformation in the β2AR binding site (PDB-ID: 3NY8). The described docking procedure demonstrates high reliability, as it reproduces the binding pose of the original co-crystallized molecule with an RMSD computed over the ligand’s heavy atoms of 1.4 Å, which is below the X-ray resolution (2.8 Å). The binding energy as estimated by MedusaDock is −39.4 kcal/mol and −37.9 kcal/mol for ICI 118,551 in its docked and crystallized conformation, respectively. Carbon atoms are represented in blue and green for ICI 118,551 in its docked and crystallized conformation, respectively. The β2AR electron density map available from the Electron Density Server is reported as white mesh. (b) Bound conformations of R- and S-propranolol obtained by combining the MedusaScore values with a hierarchical cluster analysis of statistically significant docking solutions (i.e., poses with Z-score lower than −2, see main text). The binding energy as estimated by MedusaScore is −38.1 kcal/mol and −38.8 kcal/mol for R- and S-propranolol, respectively. The reported solutions represent the centroids of the most populated clusters of statistically significant docking poses of R- and S-propranolol (i.e., 61.5 % and 57.7 % of the conformational ensembles, respectively). Carbon atoms are represented in pink and cyan for the R- and S-enantiomers, respectively. The same color code is adopted to indicate the side chains of the β2AR amino acids when in complex with the two enantiomers.
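The energy-based checks in steps 2 and 3 of the calibration can be illustrated with a short Python sketch. It assumes that the MedusaScore energies of each batch of runs have already been parsed into plain lists of numbers. The two-sample Kolmogorov–Smirnov statistic used here is only one possible way to quantify whether two energy distributions (e.g., those from 200 and 500 runs) still differ; it is not prescribed by the protocol, and any acceptance threshold is left to the user. The Z-score follows the definition given in step 3, with μ and σ computed over the whole population of poses. Function names are illustrative.

```python
from statistics import mean, pstdev


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def z_scores(energies):
    """Z = (x - mu) / sigma over the population of docking poses."""
    mu, sigma = mean(energies), pstdev(energies)
    return [(x - mu) / sigma for x in energies]


def significant_poses(energies, z_cutoff=-2.0):
    """Indices of the poses whose Z-score is lower than the cutoff."""
    return [k for k, z in enumerate(z_scores(energies)) if z < z_cutoff]


# Made-up energies (kcal/mol), for illustration only.
energies = [-30.0] * 9 + [-40.0]
print(significant_poses(energies))
```

A value of `ks_statistic` close to 0 indicates that two batches of runs yield nearly the same energy distribution, whereas a value close to 1 indicates that they barely overlap.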

3.4 Docking Calculations for Propranolol Enantiomers

  1.

    Using MedusaDock, submit the number of independent docking calculations determined in step 2 of the docking calibration (see Note 5).

  2.

    Isolate, cluster, and retrieve the obtained docking poses of the propranolol enantiomers (Fig. 3b) as described in steps 3–5 of the docking calibration.
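The clustering stage of this workflow (step 4 of the docking calibration) can be sketched in plain Python as follows. This is a minimal illustration, not the ad hoc program mentioned in Note 4. It assumes that each pose is a list of heavy-atom coordinates with identical atom ordering, that the RMSD is computed without superposition because all poses share the receptor frame, and that no correction for ligand symmetry is needed. The penalty is a Kelley-style index in which singleton clusters are treated as outliers when computing the average spread and every cluster is counted in the penalty term; this reading of the method, like all function names, is an assumption.

```python
import math
from itertools import combinations


def rmsd(pose_a, pose_b):
    """Heavy-atom RMSD between two poses with identical atom ordering."""
    sq = sum(math.dist(p, q) ** 2 for p, q in zip(pose_a, pose_b))
    return math.sqrt(sq / len(pose_a))


def average_linkage(dist):
    """Agglomerative clustering; returns the partition at every level."""
    clusters = [[k] for k in range(len(dist))]
    levels = [[c[:] for c in clusters]]

    def linkage(pair):
        a, b = pair  # mean of all inter-cluster pairwise distances
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
        levels.append([c[:] for c in clusters])
    return levels


def spread(cluster, dist):
    """Mean pairwise distance inside one cluster (two or more members)."""
    pairs = list(combinations(cluster, 2))
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def kelley_level(levels, dist):
    """Partition with the lowest Kelley-style penalty."""
    n = len(dist)
    scored = []
    for part in levels:
        multi = [c for c in part if len(c) > 1]  # singletons are outliers
        if multi:
            scored.append((part, sum(spread(c, dist) for c in multi) / len(multi)))
    lo = min(s for _, s in scored)
    hi = max(s for _, s in scored)
    best, best_penalty = None, math.inf
    for part, av_sp in scored:
        # average spread rescaled to the range 1 .. n-1
        norm = 1.0 + (n - 2) * (av_sp - lo) / (hi - lo) if hi > lo else 1.0
        penalty = norm + len(part)
        if penalty < best_penalty:
            best, best_penalty = part, penalty
    return best


def representative(partition, dist):
    """Member of the largest cluster closest to all other members."""
    biggest = max(partition, key=len)
    return min(biggest, key=lambda i: sum(dist[i][j] for j in biggest))


# Toy example: five one-atom "poses" placed along the x axis.
poses = [[(x, 0.0, 0.0)] for x in (0.0, 0.2, 0.4, 10.0, 10.2)]
dist = [[rmsd(a, b) for b in poses] for a in poses]
best = kelley_level(average_linkage(dist), dist)
print(sorted(sorted(c) for c in best), representative(best, dist))
```

The naive linkage search is adequate for the few poses that pass the Z-score filter; for large ensembles a dedicated clustering library would be preferable.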

4 Notes

  1.

    Starting from the Gaia panel in the Home/Overview page (Fig. 1b), the user can download a detailed report of the structural properties of the submitted protein in comparison with what is observed in high-resolution crystal structures. The initial summary is reported in Fig. 1c. Values highlighted in red usually need the user’s attention in order to further refine the submitted protein structure. Such refinement can be performed using the software Chiron [30], which minimizes the number of non-physical atom interactions (clashes) in the given protein structure.

  2.

    The user can choose several options for the ligands’ optimization. Available force fields are MMFFs [25] or OPLS_2005 [31, 32]. The ionization state of titratable groups can be refined at the appropriate pH (the user should retrieve any available information about the pH value at the protein binding site) using either the Epik or the Ionizer application. The user can also decide to generate tautomers or all possible combinations of stereoisomers for each optimized ligand.

  3.

    MedusaDock can be run on a machine with a Linux operating system using the following command:

    $> ./medusaDock.linux -i TARGET_PROTEIN -m MOLECULE_TO_DOCK -o DOCKING_SOLUTION -p ./MEDUSADOCK_PARAMETERS/ -M BINDING_SITE_CENTER -r BINDING_SITE_RADIUS -S SEED_NUMBER -R

    In this specific case, TARGET_PROTEIN is β2AR; MOLECULE_TO_DOCK is ICI 118,551; DOCKING_SOLUTION is the output name for the calculation; MEDUSADOCK_PARAMETERS is the directory where the parameters for docking calculations are stored; BINDING_SITE_CENTER is the centroid of the crystallographic coordinates of ICI 118,551 as retrieved from the PDB (PDB ID: 3NY8), which has been chosen as the center of the β2AR binding site; BINDING_SITE_RADIUS is 8 Å; SEED_NUMBER is a random number used to define a new independent Monte Carlo cycle; and -R is the flag that specifies the initialization of a docking calculation in MedusaDock. The command can be customized to run multiple independent docking calculations, as in the following bash script:

    $> for i in $(seq -w 1 200)

    $> do

    $> rng=$RANDOM # random seed for a new independent Monte Carlo cycle

    $> ./medusaDock.linux -i TARGET_PROTEIN -m MOLECULE_TO_DOCK -o DOCKING_SOLUTION_${i} -p ./MEDUSADOCK_PARAMETERS/ -M BINDING_SITE_CENTER -r BINDING_SITE_RADIUS -S ${rng} -R # run index appended so that outputs are not overwritten

    $> done

    In this case, we perform 200 independent docking calculations of ICI 118,551 in β2AR. Although MedusaDock can run on a single 8-core CPU, each docking calculation takes on average 8 min to complete; the user should therefore consider using a supercomputer for the docking of small libraries of compounds.

  4.

    We perform the cluster analysis using a program developed ad hoc. Less experienced users are advised to refer to the Conformer Cluster script available in the Resources of the Schrödinger Suite.

  5.

    Perform MedusaDock calculations for the propranolol enantiomers by adapting the command reported in Note 3 to the new compounds.
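When the command of Note 3 has to be adapted to several compounds, as in Note 5, it can be convenient to generate the command lines programmatically. The Python sketch below only builds the argument lists and does not execute anything. It uses exclusively the flags quoted in Note 3. The helper name, the output naming scheme (a zero-padded run index appended to the output name), and the use of distinct seeds in the range 1–32767 are assumptions made for illustration. The binding site center is passed through as an opaque string, because its expected format is not described here.

```python
import random


def medusadock_commands(protein, ligand, out_prefix, params_dir,
                        site_center, site_radius, n_runs, rng=None):
    """Build one argument list per independent docking run."""
    rng = rng or random.Random()
    width = len(str(n_runs))
    # distinct seeds, so that no two runs repeat the same Monte Carlo cycle
    seeds = rng.sample(range(1, 2 ** 15), n_runs)
    commands = []
    for run, seed in enumerate(seeds, start=1):
        commands.append([
            "./medusaDock.linux",
            "-i", protein,
            "-m", ligand,
            "-o", f"{out_prefix}_{run:0{width}d}",  # assumed naming scheme
            "-p", params_dir,
            "-M", site_center,
            "-r", str(site_radius),
            "-S", str(seed),
            "-R",
        ])
    return commands


# Placeholders as in Note 3; replace them with actual file names and values.
for ligand in ("R_ENANTIOMER", "S_ENANTIOMER"):
    cmds = medusadock_commands("TARGET_PROTEIN", ligand, f"DOCKING_{ligand}",
                               "./MEDUSADOCK_PARAMETERS/",
                               "BINDING_SITE_CENTER", 8, 200)
    print(" ".join(cmds[0]))
```

Each argument list can be passed to `subprocess.run` or written to a job script for a queueing system. At about 8 min per calculation, 200 runs per compound amount to roughly 27 h if executed one after another, which is why distributing the runs over many cores is advisable.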