Background

Virtual screening is widely used in medicinal chemistry to identify lead compounds from a diverse library that can bind to a receptor. The receptor-based approach relies on molecular docking, in which an algorithm docks each molecule from a library into the binding site to predict a binding energy or binding score [1]. A number of successful virtual-screening studies have been conducted in recent years, as described for example in the review by Lavecchia et al. [2]. Although docking provides an efficient and cost-effective way to assess interactions between molecules such as proteins and ligands on a large scale, its accuracy, defined as the ability to predict strongly binding ligands, is limited. This is largely due to limitations of the scoring functions used to calculate binding energies, which restrict their ability to identify true positives in the databases of known ligands and decoys typically used to evaluate virtual screening [3, 4]. The accuracy of a screening method can be assessed quantitatively with the robust metric known as Receiver Operating Characteristic Enrichment (ROCE) [5]. A ROCE factor is the true positive rate divided by the false positive rate; ROCE factors much larger than 1.0 therefore indicate that the docking algorithm can distinguish active compounds from decoys.

Several software packages for molecular docking are available [6] and have been evaluated [7, 8]. Furthermore, methods to increase the accuracy of virtual screening have been suggested, for example considering receptor flexibility to reduce the number of false positive molecules [9], consensus docking to predict the correct binding pose [10], and a consensus virtual screening method that combines the rank lists of ligands from different algorithms [11]. However, these improved methods can still yield a low number of correct predictions for some receptors [11]. In the work described here, a novel strategy of using receptor decoy sites was developed and evaluated for the first time, together with the docking software AutoDock Vina [12]. This involved performing virtual screening against a non-binding (receptor-decoy) site on the same protein target and developing a way to re-rank the screening results. ROCE factors before and after the application of receptor-decoy screening were then compared in order to evaluate the strategy.

Methods

Ligand and decoy sets for fifteen target proteins were downloaded from the Database of Useful Decoys [3]. The complexes were selected from several different protein categories in the database, such as hormone receptors, kinases, proteases and other enzymes, to represent a wide range of targets, including 10 targets which had previously been evaluated [11]. Virtual screening for all fifteen targets was performed using AutoDock Vina version 1.1.1 with the default parameters [12]. The FTMap binding site prediction server [13] was used to help define the decoy site for docking. The FTMap server identifies binding hot-spots by computational solvent mapping, whereby 16 different molecular probes are docked onto the protein surface to locate favorable binding regions [13]. The decoy site was chosen based on the following criteria: 1) it contains no binding hot-spot predicted by FTMap, 2) it appears structurally different from the actual binding site, and 3) it does not form an obvious binding cavity but lies in a flat region on the exterior surface of the protein. The search space for docking was defined via a grid box manually specified with AutoDock Tools [14] around the binding or decoy site. A grid spacing of 0.375 Å was used to determine the box dimensions. The box dimensions were kept the same for binding site and decoy site docking. Adjusted rank lists were generated from the binding site list by considering molecules that were in the top 10 %, 15 %, 20 %, 30 % and 50 % of the decoy site list, and adjusting the rank of the binding site list using the following formula:

$$ \text{Adjusted rank} = \left( \text{Binding site rank} - \text{Decoy site rank} \right) + \text{Total no. of ligands in list} $$
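The re-ranking rule above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `adjusted_ranks` and `rerank` are hypothetical, and several details that the text does not specify are assumed here, namely that ranks are 1-based, that the cut-off position is rounded up, that molecules outside the considered fraction of the decoy site list keep their binding site rank, and that ties are broken by the original binding site rank.

```python
import math


def adjusted_ranks(binding_rank, decoy_rank, decoy_fraction):
    """Apply the adjusted-rank formula to molecules in the top of the decoy list.

    binding_rank, decoy_rank: dicts mapping molecule id -> 1-based rank
    decoy_fraction: fraction of the decoy site list considered (e.g. 0.15)
    """
    n_total = len(binding_rank)
    # Assumption: the cut-off position is rounded up to a whole molecule.
    cutoff = math.ceil(decoy_fraction * n_total)
    adjusted = {}
    for mol, b_rank in binding_rank.items():
        d_rank = decoy_rank[mol]
        if d_rank <= cutoff:
            # Formula from the text: (binding rank - decoy rank) + total.
            adjusted[mol] = (b_rank - d_rank) + n_total
        else:
            # Assumption: molecules outside the cut-off keep their rank.
            adjusted[mol] = b_rank
    return adjusted


def rerank(binding_rank, decoy_rank, decoy_fraction):
    """Return molecule ids ordered by adjusted rank (best first)."""
    adjusted = adjusted_ranks(binding_rank, decoy_rank, decoy_fraction)
    # Assumption: ties are broken by the original binding site rank.
    return sorted(adjusted, key=lambda m: (adjusted[m], binding_rank[m]))


# Toy example with four hypothetical molecules (not data from the study).
binding = {"A": 1, "B": 2, "C": 3, "D": 4}
decoy = {"A": 1, "B": 4, "C": 3, "D": 2}
print(rerank(binding, decoy, 0.25))  # ['B', 'C', 'A', 'D']
```

In this toy case molecule A ranks first at both sites, so it is moved from the top of the binding site list to third place. This is the intended effect for a non-specific binder, but the same penalty would apply to a true active that also scores well at the decoy site.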

The fraction of the decoy site list considered was varied in order to find the cut-off at which maximum enrichment is achieved. The numbers of active ligands in the database were then used to calculate the ROC Enrichment (ROCE) factors at 1 % and 2 % of the number of molecules. The ROCE at x% was calculated as the fraction of true positives divided by the fraction of false positives at x% of the ligand/decoy database according to the equation:

$$ ROCE_{x\%} = \frac{f_{actives}}{1 - \frac{N_{decoys} - N_{inactives}}{N_{decoys}}} $$

where $f_{actives}$ = (number of actives at x%) / (number of all actives),

$N_{decoys}$ = the total number of inactive decoys,

$N_{inactives}$ = the number of decoys chosen at x% of the ligand/decoy database.
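As a worked illustration of this definition, the sketch below computes the ROCE from a ranked list of active/decoy labels. It relies on the fact that the denominator of the equation simplifies to N_inactives / N_decoys, the fraction of decoys selected. The function name `roce`, the rounding of the x% cut-off, and the handling of a zero denominator are assumptions made for this example and are not taken from the study.

```python
import math


def roce(ranked_labels, x_fraction):
    """ROC enrichment at the top x fraction of a ranked ligand/decoy list.

    ranked_labels: list of booleans, best-ranked molecule first
                   (True = active ligand, False = decoy)
    x_fraction: fraction of the whole database to consider (e.g. 0.01)
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    n_decoys = n_total - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")

    # Assumption: the cut-off position is rounded up to a whole molecule.
    n_selected = math.ceil(x_fraction * n_total)
    top = ranked_labels[:n_selected]

    actives_found = sum(top)
    decoys_found = len(top) - actives_found  # N_inactives in the text

    f_actives = actives_found / n_actives
    # 1 - (N_decoys - N_inactives) / N_decoys simplifies to this ratio.
    f_decoys = decoys_found / n_decoys

    if f_decoys == 0:
        # Assumption: with no decoys selected, report infinity if any
        # active was found, otherwise zero.
        return math.inf if f_actives > 0 else 0.0
    return f_actives / f_decoys


# Toy list of 2 actives and 8 decoys (hypothetical, not study data).
labels = [True, False, True, False, False, False, False, False, False, False]
print(roce(labels, 0.2))  # 4.0
```

In the toy list the top 20 % contains one of the two actives (f_actives = 0.5) and one of the eight decoys (0.125), giving a ROCE of 4.0.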

Binding sites and decoy sites were analysed post-docking with the KVFinder Cavity Detection PyMol Plugin [15] to provide a quantitative description of the two sites. The software enables comparison and characterisation of protein binding sites by the number, area and volume of cavities in a specified search space. The default parameters were used for all fifteen targets: a probe-in size of 1.4 Å, a probe-out size of 4.0 Å and a step size of 0.6 Å. The minimum cavity volume was set at 5.0 Å³. The binding site search space was set around the position of the actual ligand molecule obtained from the Protein Data Bank, and the decoy site search space was set using a docked molecule from the decoy site screening.

Results and discussion

High predicted binding affinities between a ligand and a receptor do not always correspond to the best binding molecules for the target site investigated [6, 16]. In virtual screening this is reflected by low enrichment factors, which indicate that many of the highest-ranked molecules may be false positive predictions [5]. In this study, the Receiver Operating Characteristic Enrichment (ROCE) was determined at fractions of 1 % and 2 % of the dataset of ligand/decoy molecules obtained from the Database of Useful Decoys [3]. Docking against a non-binding ‘decoy’ site on the same receptor (Fig. 1) with AutoDock Vina led to a ranking of molecules different from the ranking for the true binding site. The predicted binding energies among the top molecules for the decoy site were less negative than for the binding sites, indicating weaker binding to the decoy site. The ranking for the true binding site was adjusted by considering a varied fraction of the rank list produced from the decoy site, from 0 % (no correction) to 50 % (Tables 1 and 2).

Fig. 1

a Acetylcholinesterase (Ache) receptor with binding site shown in red and decoy site in blue. b Detailed view of Ache binding site. c Detailed view of Ache decoy site

Table 1 ROCE at 1 % of the binding site list considering top x% of the decoy site list
Table 2 ROCE at 2 % of the binding site list considering top x% of the decoy site list

The results show considerable variation between the fifteen targets investigated, confirming the general consensus that virtual screening accuracy is highly dependent on the target (Tables 1 and 2). Overall, the majority of targets did not show any improvement in enrichment at the top 1 % or 2 % of the list after applying the receptor decoy method. Five targets (Comt, Ache, CDK2, HIVrt and Pparg) show improved ROCE factors compared to those obtained in the previous study [11] (see footnotes in Tables 1 and 2) when considering at least the top 15 % of the decoy site list. Beyond 15 % the enrichment for all targets (except HIVrt and Parp) either remained constant or dropped to a lower value.

The rationale behind the receptor decoy strategy was that the number of false positive binders could be reduced by identifying molecules that tend to bind non-specifically to molecular surfaces different from the binding site. As a result, a higher number of active ligands would remain after adjusting the rank list for the true binding site with the rank list for the decoy site. However, the results show that this approach is unlikely to help in the identification and selection of molecules for experimental testing, as a higher number of true positives was recalled for only 5 out of 15 targets. The extent of enrichment achieved for the top 1 % and 2 % differed between targets because of the properties that determine the binding interactions between amino acid residues of the target and the ligand/decoy dataset used for docking. The optimum cut-off for maximum enrichment at the top 1 % of a binding site list was obtained when considering 15 % of the decoy list (Table 1), and 10 % for the top 2 % of the binding site list (Table 2). This shows that the ranking of molecules with regard to binding to the decoy sites is meaningless for lower ranks.

The largest improvement in enrichment was achieved with the targets CDK2 and Pparg. For the targets PR, Hsp90 and ampC the ROCE at 1 % and 2 % remained at zero until at least 30 % of the molecules in the decoy list were considered, indicating that true and false ligands cannot be distinguished by the AutoDock Vina docking algorithm. Cavity analyses of the binding site and decoy site (Table 3) using the software KVFinder [15] show that the total number, volume and area of the cavities found in the decoy site were smaller than in the binding site for all targets except HIVrt and trypsin. This confirms that the shapes of the two sites are very different, although this did not prevent false positive molecules from binding with high predicted affinity.

Table 3 Cavity analysis of binding sites and decoy sites for all targets using KVFinder [15]

The targets Inha, MR and VEGFr2 show a significant decrease in ROCE, indicating that this strategy makes the retrieval of active ligands in the top ranks worse for these targets. The actual binding site for VEGFr2 appears to be non-specific, open and flat, and therefore binds molecules which also bind easily to the decoy site, resulting in a high proportion of active molecules at the top of the decoy list. In contrast, the Inha binding site is a small, deep pocket with a total cavity area of 838.4 Å² which appears not to be easily accessible from the surface, so this receptor would be expected to bind only ligands that are complementary in shape. However, this was not observed: a higher number of active ligands was found in the top 1 % of the decoy site list than in the binding site list. Thus, when the re-ranking formula is applied to generate the adjusted list, the binding site list is re-ordered such that the active ligands do not appear in the top positions. This highlights a shortcoming of the strategy: when it is applied to a virtual screening experiment in which the active molecules are not known, an improvement in prediction accuracy cannot be guaranteed.

Conclusion

The development and evaluation of docking against a decoy binding site show that improved prediction of active ligands could not be achieved in general. It should be noted that the ligand/decoy dataset used for this evaluation is especially challenging, as decoys that are physico-chemically similar to the ligands were chosen [3]. The choice of appropriate decoy binding sites is critical for the success of this method. Choosing an obviously unfavorable site, such as a flat molecular surface, reduces the docking scores overall and thus the potential to discriminate between ligands and decoys. On the other hand, choosing an alternative binding cavity might cause a novel mode of specific binding that does not help to eliminate the false positives for the true binding site. The question of how to define a decoy binding site such that false positive predictions for the real binding site are removed must remain open and is put forward to the academic community. Further work addressing the re-ranking of predicted ligands may also lead to improvements.