Introduction

Microtubules (MTs), major structural components of cells, are the target of a large and diverse group of natural product anticancer drugs (Jordan and Wilson, 2004). MTs are polymeric protein complexes constructed from heterodimers of two highly homologous proteins, α- and β-tubulin. The assembly of tubulin heterodimers into a macromolecular MT complex is a tightly regulated and dynamic process. MTs have essential roles in vital cellular functions such as motility, division, shape maintenance, and intracellular transport (Verhey and Gaertig, 2007; Pellegrini and Budman, 2005). Tubulin is therefore the target of numerous small-molecule ligands that interfere with MT dynamics, several of which are in clinical use, in particular for cancer treatment (Kiselyov et al., 2007; Jeffrey et al., 2007; Giannakakou et al., 2000). A structurally diverse collection of ligands, such as vinblastine sulfate, colchicine, combretastatin A-4 (Pettit et al., 1989), the epothilones (Goodin et al., 2004), and paclitaxel (Wani et al., 1971; Schiff and Horwitz, 1980), as well as some synthetic molecules (Yoshimatsu et al., 1997; Arora et al., 2009), are known to exert cytotoxic activity through binding to tubulin. Recently, much effort has been devoted to identifying new compounds from both natural and synthetic sources that bind at the colchicine site of tubulin (Li and Sham, 2002; Kingston, 2009; Beckers and Mahboobi, 2003). Mutations in β-tubulin that affect MT polymer mass and/or drug binding are associated with resistance to tubulin-binding agents such as paclitaxel (Kavallaris, 2010). In addition, independent of the precise mechanism of action, the clinical use of antitubulin drugs is associated with problems of drug toxicity and bioavailability (Sridhare et al., 2004). Therefore, there is a need for new tubulin-binding agents with anticancer activity.
Recently, benzoyl urea analogs have shown good anticancer activity and have emerged as a new family of tubulin-binding agents (Song et al., 2008; Ling et al., 2002; Ling et al., 2003; Jiang et al., 2002; Li et al., 2003; Hwang et al., 2002).

We have previously demonstrated the use of an energetic pharmacophore for the design of cyclooxygenase-2 inhibitors (Lokwani et al., 2012) and used docking studies to predict the binding affinity of compounds such as non-nucleoside reverse transcriptase inhibitors (Mokale et al., 2012). In search of new targets using computational studies, we present here the correlation of the structures of benzoyl urea analogs with their antiproliferative activity. To optimize this pharmacophore and further improve activity, we developed atom-based 3D-QSAR models using the Pharmacophore Alignment and Scoring Engine (PHASE) and performed a docking study on the colchicine-binding site of β-tubulin using Grid-based Ligand Docking with Energetics (GLIDE). PHASE is a self-contained system for pharmacophore perception, QSAR model development, and 3D database screening (Dixon et al., 2006). We validated the developed 3D-QSAR models internally by predicting the activity of the test set and externally by predicting the activity of compounds obtained from the literature. The GLIDE docking module approximates a complete systematic search of the conformational, orientational, and positional space of the docked ligand in the receptor-binding pocket (Friesner et al., 2004). The developed atom-based 3D-QSAR model and docking studies highlight the structural features of benzoyl urea analogs required for binding to the colchicine-binding site of β-tubulin, which is useful for the further design of more potent tubulin-binding agents with antiproliferative activity.

Materials and methods

Biological data

A set of 32 benzoyl urea analogs (Table 1) with available IC50 data for antiproliferative activity was taken from the literature (Song et al., 2008) for the development of ligand-based common pharmacophore hypotheses (CPHs). The negative logarithm of the measured IC50 value (pIC50) was used in this study. For 3D-QSAR studies, these 32 compounds were divided into a training set (21 compounds) and a test set (11 compounds). The training set molecules were selected randomly in such a way that they covered both the structural features and the range of biological activity. Highly active, moderately active, and less active molecules were included in the training set to spread out the range of activities. To assess the predictive power of the model, the remaining 11 compounds were set aside as the test set. The test compounds were selected in such a way that they truly represent the training set.
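The activity transformation and the training/test division described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names, the activity-binned selection of test compounds, and the random seed are all assumptions made for illustration, and the IC50 unit is whatever the caller supplies.

```python
import math
import random


def pic50(ic50: float) -> float:
    """Negative base-10 logarithm of an IC50 value (unit chosen by the caller)."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50)


def split_by_activity(activities: dict, n_test: int, seed: int = 0):
    """Hypothetical split: rank compounds by activity, cut the ranking into
    n_test contiguous bins, and draw one test compound at random from each bin.
    The remaining compounds form the training set."""
    ranked = sorted(activities, key=activities.get)
    n = len(ranked)
    if n_test > n:
        raise ValueError("n_test cannot exceed the number of compounds")
    rng = random.Random(seed)
    test = []
    for k in range(n_test):
        lo = k * n // n_test
        hi = (k + 1) * n // n_test
        test.append(rng.choice(ranked[lo:hi]))
    train = [name for name in ranked if name not in test]
    return train, test
```

With 32 compounds and `n_test=11`, this returns 21 training and 11 test compounds, and the test set spans the whole activity range because one compound is drawn from each activity bin.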

Table 1 Structure of compounds used for development of common pharmacophore hypothesis and 3D-QSAR studies along with biological activity

Ligand preparation

The structures of all compounds were cleaned and optimized using LigPrep v2.4 (Schrödinger, LLC, New York, NY, 2009). The cleanup and optimization process includes conversion of structures from 2D to 3D, addition of hydrogen atoms, removal of counter ions, assignment of the ionization state at pH 7.0, generation of stereoisomers, removal of noncompliant structures, and energy minimization. Conformers of all ligands were generated using the ConfGen MacroModel search method (Watts et al., 2010) with a maximum of 1,000 conformers per structure and 100 minimization steps, and were minimized using the OPLS_2005 force field (Jorgensen et al., 1996). Each minimized conformer was further filtered to eliminate redundant conformations. For each molecule, conformers within 10 kcal/mol of the global energy minimum were retained. Conformers with an RMSD below 1.0 Å between all pairs of corresponding heavy atoms were considered identical, and the duplicates were discarded.
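The two conformer filters described above (energy window and RMSD redundancy) can be sketched as follows. This is only an illustration of the filtering logic, not the ConfGen implementation: the function name, the greedy lowest-energy-first order, and the use of a precomputed pairwise heavy-atom RMSD matrix are assumptions.

```python
def filter_conformers(energies, rmsd, energy_window=10.0, rmsd_cutoff=1.0):
    """Return indices of retained conformers.

    energies: relative or absolute conformer energies in kcal/mol.
    rmsd: symmetric matrix (list of lists) of pairwise heavy-atom RMSDs in Å.
    A conformer is kept if it lies within energy_window of the lowest energy
    and is at least rmsd_cutoff away from every conformer already kept.
    """
    e_min = min(energies)
    # Visit conformers from lowest to highest energy (assumed ordering).
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    kept = []
    for i in order:
        if energies[i] - e_min > energy_window:
            break  # all remaining conformers are outside the energy window
        if all(rmsd[i][j] >= rmsd_cutoff for j in kept):
            kept.append(i)
    return kept
```

Visiting conformers in order of increasing energy means that, within each group of near-identical conformers, the lowest-energy member is the one retained.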

Generation of the common pharmacophore hypotheses (CPHs)

Pharmacophore features, namely hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic group (H), negatively charged group (N), positively charged group (P), and aromatic ring (R), were defined by a set of chemical structure patterns encoded as SMARTS queries. Common pharmacophoric features were then identified from a set of variants (sets of feature types that define a possible pharmacophore) using a tree-based partitioning algorithm with a maximum tree depth of four and the requirement that all actives must match. After applying the default feature definitions to each ligand, CPHs were generated using a final box size of 1 Å.

All generated CPHs were examined and selected based on a scoring function to yield the best alignment of the active ligands using an overall maximum root mean square deviation (RMSD) value of 1.2 Å with default options for distance tolerance. The quality of alignment was measured by a survival score, defined as

$$ S = W_{\text{site}} S_{\text{site}} + W_{\text{vec}} S_{\text{vec}} + W_{\text{vol}} S_{\text{vol}} + W_{\text{sel}} S_{\text{sel}} + W_{\text{rew}}^{m} - W_{\text{E}} \Delta E + W_{\text{act}} A $$

where each $W$ is a weight and each $S$ is a score. $S_{\text{site}}$ is based on the alignment score, which is the RMSD in the site point positions:

$$ S_{\text{site}}(i) = 1 - S_{\text{align}}(i)/C_{\text{align}}(i) $$

Here $S_{\text{align}}(i)$ is the alignment score and $C_{\text{align}}(i)$ is the RMSD cutoff (default value 1.2 Å). $S_{\text{vec}}$ is the vector score, the average cosine of the angles formed by corresponding pairs of vector features (acceptors, donors, and aromatic rings) in the aligned structures. $S_{\text{vol}}$ is the volume score, based on the overlap of the van der Waals models of the non-hydrogen atoms in each pair of structures:

$$ S_{\text{vol}}(i) = V_{\text{common}}(i)/V_{\text{total}}(i) $$

where $V_{\text{common}}(i)$ is the common (overlapping) volume between ligand $i$ and the reference ligand, and $V_{\text{total}}(i)$ is the total volume occupied by both ligands. $S_{\text{sel}}$ is the selectivity score, an empirical estimate of the rarity of the hypothesis that accounts for the fraction of molecules likely to match the hypothesis regardless of their activity toward the receptor. $W_{\text{rew}}^{m}$ is the reward weight defined by $m - 1$, where $m$ is the number of actives that match the hypothesis. $\Delta E$ is a penalty for high-energy structures, applied by subtracting a multiple of the relative energy from the final score. $A$ is the penalty for hypotheses whose reference ligand activity is lower than the highest activity, applied by adding a multiple of the reference ligand activity to the score. $W_{\text{site}}$, $W_{\text{vec}}$, $W_{\text{vol}}$, $W_{\text{rew}}$, and $W_{\text{act}}$ have default values of 1.0, while $W_{\text{sel}}$ and $W_{\text{E}}$ have default values of 0.0. Default values were used in hypothesis generation.

The CPHs with high survival score were chosen for alignment of molecules and used for further 3D-QSAR studies.
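As an illustration of how the terms of the survival score combine, the following Python sketch evaluates the site score, the volume score, and the weighted sum using the default weights stated above. It is not PHASE code. The function names are hypothetical, and the reward term is passed in as an already evaluated number (`reward`) so that the sketch makes no assumption about how it is computed from the number of matching actives.

```python
DEFAULT_WEIGHTS = {"site": 1.0, "vec": 1.0, "vol": 1.0,
                   "sel": 0.0, "E": 0.0, "act": 1.0}


def site_score(s_align, c_align=1.2):
    """S_site = 1 - S_align / C_align, with the RMSD cutoff C_align in Å."""
    return 1.0 - s_align / c_align


def volume_score(v_common, v_total):
    """S_vol = V_common / V_total."""
    return v_common / v_total


def survival_score(s_site, s_vec, s_vol, s_sel, reward, delta_e, activity,
                   weights=None):
    """Weighted sum of the score terms; `reward` is the pre-evaluated
    reward term of the survival score (an input, not computed here)."""
    w = dict(DEFAULT_WEIGHTS)
    if weights:
        w.update(weights)
    return (w["site"] * s_site + w["vec"] * s_vec + w["vol"] * s_vol
            + w["sel"] * s_sel + reward
            - w["E"] * delta_e + w["act"] * activity)
```

With the default weights, the selectivity and energy terms do not contribute, so the ranking is driven by the site, vector, volume, reward, and activity terms.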

Building of 3D-QSAR models

Atom-based 3D-QSAR models were developed using partial least squares (PLS) regression analysis for the top four CPHs, ranked by survival score. In atom-based 3D-QSAR in PHASE, each molecule is treated as a set of overlapping van der Waals spheres. Each atom (and hence each sphere) is placed into one of six categories according to a simple set of rules: hydrogens attached to polar atoms are classified as hydrogen bond donors (D); carbons, halogens, and C–H hydrogens are classified as hydrophobic/non-polar (H); atoms with an explicit negative ionic charge are classified as negative ionic (N); atoms with an explicit positive ionic charge are classified as positive ionic (P); non-ionic nitrogens and oxygens are classified as electron withdrawing (W); and all other types of atoms are classified as miscellaneous (X). To construct the atom-based 3D-QSAR model, a rectangular grid of cubes (1 Å on each side) was defined to enclose all atoms of the aligned training set. Each occupied cube was allotted one or more volume bits, so that each molecule is represented by a string of zeros and ones. This representation gives rise to binary-valued occupation patterns that were used as independent variables to create PLS QSAR models (Dixon et al., 2006). The PLS regression was carried out with a maximum of N/5 PLS factors, where N is the number of ligands in the training set.
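The grid encoding described above can be sketched in Python as follows. This is a simplified illustration rather than the PHASE implementation. It assumes that an atom occupies a cube when the cube center lies within the atom's van der Waals sphere, that the grid origin is at (0, 0, 0), and that N/5 is rounded down; all function names are hypothetical.

```python
import math


def occupied_bits(atoms, cube=1.0):
    """atoms: iterable of (x, y, z, radius, atom_class) for one aligned molecule.
    Returns the set of (ix, iy, iz, atom_class) volume bits that are switched on."""
    bits = set()
    for x, y, z, radius, cls in atoms:
        lo = [math.floor((c - radius) / cube) for c in (x, y, z)]
        hi = [math.floor((c + radius) / cube) for c in (x, y, z)]
        for ix in range(lo[0], hi[0] + 1):
            for iy in range(lo[1], hi[1] + 1):
                for iz in range(lo[2], hi[2] + 1):
                    center = ((ix + 0.5) * cube, (iy + 0.5) * cube,
                              (iz + 0.5) * cube)
                    # Assumed occupancy rule: cube center inside the sphere.
                    if math.dist(center, (x, y, z)) <= radius:
                        bits.add((ix, iy, iz, cls))
    return bits


def to_bit_matrix(molecules):
    """Build the binary descriptor matrix (one row per molecule,
    one column per volume bit seen anywhere in the set)."""
    per_molecule = [occupied_bits(m) for m in molecules]
    columns = sorted(set().union(*per_molecule))
    rows = [[1 if b in bits else 0 for b in columns] for bits in per_molecule]
    return columns, rows


def max_pls_factors(n_training):
    """Upper limit of N/5 PLS factors, rounded down (assumption)."""
    return n_training // 5
```

The rows of the resulting matrix are the independent variables for the PLS regression, with pIC50 as the dependent variable. For a training set of 21 ligands, `max_pls_factors(21)` gives 4.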

Docking method

The molecular docking tool Glide v5.6 (Schrödinger, LLC, New York, NY, 2009) was used for docking studies of all compounds on the colchicine-binding site of β-tubulin. The crystal structure of β-tubulin was obtained from the Protein Data Bank (PDB code: 1SA0) and was prepared for docking using the Protein Preparation Wizard in Maestro v9.1 (Schrödinger, LLC, New York, NY, 2009). Water molecules in the crystal structure were deleted, and termini were capped by adding ACE and NMA residues. The protein preparation was carried out in two steps, preparation and refinement. In the preparation step, after ensuring chemical correctness, missing hydrogen atoms were added. Side chains that are not close to the binding cavity and do not participate in salt bridges were neutralized. In the refinement step, a restrained Impact minimization of the co-crystallized complex was carried out using the OPLS_2005 force field, which helps to reorient side chain hydroxyl groups. Grids were defined by centering them on the ligand in the crystal structure using the default box size.

The ligands were built using the Maestro build panel and prepared with LigPrep v2.4 (Schrödinger, LLC, New York, NY, 2009), which produces low-energy conformers of the ligands using the OPLS_2005 force field. The low-energy conformations of the ligands were selected and docked into the grid generated from the protein structure using the standard precision (SP) docking mode.

Results and discussion

Generation of 3D-QSAR models

Different variant CPHs were generated by the common pharmacophore identification process. All CPHs were examined and scored to identify the pharmacophore that yields the best alignment of the active compounds (pIC50 > 0.0), and were validated by aligning and scoring the inactive compounds (pIC50 < −0.3). We selected the top four CPH models, whose survival-inactive scores ranked in the top 1 %, for alignment of all compounds and 3D-QSAR studies. The survival scores for these CPHs are shown in Table 2. All four top CPHs were five-point hypotheses consisting of one hydrogen bond acceptor (A), two hydrogen bond donors (D), one hydrophobic group (H), and one aromatic ring (R) vector feature. All compounds were aligned using CPH ADDHR.102 for the 3D-QSAR study (Fig. 1).

Table 2 Survival score of top four CPHs
Fig. 1 Alignment of all compounds using CPH ADDHR.102

All top CPHs were used for atom-based 3D-QSAR model generation. CPH ADDHR.102 yielded a 3D-QSAR model with a good regression coefficient, low standard deviation, and high variance ratio with good stability, but it showed diminished predictive power, with a high root mean square error (RMSE) and a low Pearson R (the correlation between predicted and observed activity for the test set). Similarly, CPH ADDHR.232 yielded a 3D-QSAR model with good statistical values but low predictive power ($Q^2$ = 0.4927) (Table 3). CPHs ADDHR.11 and ADDHR.245 yielded 3D-QSAR models with good PLS statistics, and both hypotheses showed good internal as well as external predictive power. The training set correlation for both CPHs is characterized by the following PLS statistics: $R^2$ = 0.7217, SD = 0.4705, F = 14.7, P = $5.641 \times 10^{-5}$, $Q^2$ = 0.5982 for CPH ADDHR.11, and $R^2$ = 0.7282, SD = 0.4651, F = 15.2, P = $4.641 \times 10^{-5}$, $Q^2$ = 0.5297 for CPH ADDHR.245. The test set correlation is characterized by RMSE = 0.5304 and Pearson R = 0.7962 for CPH ADDHR.11, and RMSE = 0.4902 and Pearson R = 0.769 for CPH ADDHR.245.
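For readers who wish to reproduce such statistics from a table of observed and predicted activities, a minimal Python sketch is given below. The formulas are the standard textbook definitions, not PHASE's own code. In particular, applying `r_squared` to test-set predictions as an estimate of $Q^2$, with the test-set mean in the denominator, is an assumption, and the function names are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """1 - SS_res / SS_tot; applied to test-set predictions it gives a
    Q^2-type statistic (assumption: mean taken over the same set)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted activities."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)
```

Note that the $Q^2$-type statistic can be low even when Pearson R is high, because Pearson R ignores systematic offsets between predicted and observed values while $Q^2$ and RMSE penalize them.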

Table 3 Statistical results of 3D-QSAR model developed using different CPHs

We selected the 3D-QSAR models generated by CPHs ADDHR.11 and ADDHR.245 for correlating structure with activity. Graphs of observed versus predicted biological activity for the training and test sets are shown in Figs. 2 and 3, respectively. The residuals, obtained by subtracting predicted activities from observed biological activities, are close to zero, indicating that the prediction error is low and that the QSAR models developed from CPHs ADDHR.11 and ADDHR.245 predict well (Table 4). The mean residual was calculated by averaging all residual values. The 3D-QSAR models associated with hypotheses ADDHR.11 and ADDHR.245 gave mean residuals of 0.3659 and 0.3167, respectively, which also supports the predictive ability of both models. The pharmacophore hypotheses, with the distances between pharmacophoric sites, are depicted in Fig. 4.
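The residual analysis can be expressed in a few lines of Python. The text does not state whether the mean was taken over signed or absolute residuals, so this sketch (with hypothetical function names) returns both; signed residuals can cancel, whereas the mean absolute residual measures the typical size of the error.

```python
def residuals(observed, predicted):
    """Residual = observed activity - predicted activity, per compound."""
    return [o - p for o, p in zip(observed, predicted)]


def mean_residuals(observed, predicted):
    """Return (mean signed residual, mean absolute residual)."""
    res = residuals(observed, predicted)
    n = len(res)
    return sum(res) / n, sum(abs(r) for r in res) / n
```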

Fig. 2 Plot of observed activity versus predicted activity for training set for 3D-QSAR model generated using CPHs: a ADDHR.11, b ADDHR.245

Fig. 3 Plot of observed activity versus predicted activity for test set for 3D-QSAR model generated using CPHs: a ADDHR.11, b ADDHR.245

Table 4 Comparison of observed biological activity and predicted activity along with fitness of compounds on CPHs
Fig. 4 Distance between pharmacophoric sites in CPHs: a ADDHR.11 and b ADDHR.245. All distances are in Å

Both 3D-QSAR models developed using CPHs ADDHR.11 and ADDHR.245 were applied to a database of compounds taken from the literature (Song et al., 2008, 2009a, b; Hu et al., 2007) for validation. The activities of these reported compounds, which contain a benzoyl urea or similar pharmacophoric group, were predicted and compared with their actual activities (Supplementary Data). The residuals, obtained by subtracting predicted activity from reported activity, were near zero for a number of compounds. The mean residuals were 0.529 and 0.484 for the 3D-QSAR models associated with CPHs ADDHR.11 and ADDHR.245, respectively. These results support the correctness and predictive capability of the 3D-QSAR models developed from CPHs ADDHR.11 and ADDHR.245, which can be applied to newly designed compounds to predict their activity prior to synthesis.

Based on the overall statistical results, the 3D-QSAR model developed using CPH ADDHR.11 was applied to each compound in the series to establish a structure–activity correlation. A pictorial representation of the cubes generated in the present 3D-QSAR model is shown in Figs. 5 and 6. In these figures, blue cubes indicate features favorable for biological activity, while red cubes indicate unfavorable features. The blue cubes around the hydrogen bond acceptor and hydrogen bond donor groups of benzoyl urea suggest the importance of the core for activity.

Fig. 5 Pictorial representation of the cubes generated using the QSAR model developed using CPH ADDHR.11 for a most active compound 31 and b least active compound 30 in the training set. Blue cubes indicate favorable regions, while red cubes indicate unfavorable region for the activity (Color figure online)

Fig. 6 Pictorial representation of the cubes generated using the QSAR model developed using CPH ADDHR.11 for a compound 8, b compound 26, c compound 9, and d compound 13. Blue cubes indicate favorable regions, while red cubes indicate unfavorable region for the activity (Color figure online)

The substituents on the aromatic ring were small hydrophobic groups, electron withdrawing groups, or both. The most significant favorable and unfavorable interactions can be compared when the 3D-QSAR model is visualized in the context of the most active ligand (compound 31) and the least active ligand (compound 30) (Fig. 5). Blue cubes were observed at position C6 of the phenyl ring near the hydrogen bond acceptor (A2) vector, indicating that the ring should be substituted with a halogen atom at this position for better activity (Fig. 5a). Thus, compounds with a halogen atom at position C6 (compounds 7, 16, 18, 19, 31) are more active than compounds with other groups at this position (compounds 14, 30). Moreover, Figs. 5 and 6 show that, among the halogen atoms acting as the hydrophobic group, fluorine is the most important for activity. However, some compounds containing a fluorine atom at position C6 of the phenyl ring showed moderate or lower activity, which may be due to the replacement of the hydrophobic group (H9) –Br or –I by –Cl near the hydrogen bond acceptor (A1). This is supported by Fig. 6b, where blue cubes were observed around the fluorine atom at position C6 and red cubes around the hydrophobic group (H9) –Cl near the hydrogen bond acceptor (A1) in compound 26.

Figure 6 shows features at positions C4 and C5 which indicate that, for better activity, the ring should be unsubstituted at these positions. Accordingly, compounds substituted with –Cl, –CH3, or –OCH3 groups at position C4 and with –COOCH3 or –NO2 groups at position C5 showed lower and moderate activity, respectively. This also indicates that substitution with electron withdrawing groups, alone or together with hydrophobic groups, lowers the activity.

Docking studies

All structures were docked to the colchicine-binding site of β-tubulin to study the binding mode of the compounds responsible for antiproliferative activity. Following the literature, a constraint was defined requiring all compounds to form a hydrogen bond with the SH group of CYS 241. The reliability of the docking results was first checked by comparing the best docking pose obtained for the co-crystallized inhibitor with its bound conformation. This was done by removing colchicine from its binding site and redocking it into the binding pocket, starting from the conformation found in the crystal structure. RMSD values of 1.36 and 0.25 Å were found for rigid and flexible docking, respectively, which suggests that the docking procedure can be relied on to predict the binding mode of our compounds.
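The redocking check rests on the RMSD between the docked pose and the crystallographic pose of colchicine. A minimal Python sketch of an in-place heavy-atom RMSD is shown below; it assumes that both poses list the same atoms in the same order and that no superposition is applied, and the function name is hypothetical.

```python
import math


def pose_rmsd(reference, docked):
    """In-place RMSD (Å) between two poses given as lists of (x, y, z)
    heavy-atom coordinates in identical atom order; no superposition."""
    if not reference or len(reference) != len(docked):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(reference, docked))
    return math.sqrt(squared / len(reference))
```

Because the docked ligand is compared with the crystal pose in the receptor frame, no fitting step is applied here; fitting the poses onto each other would hide translational and rotational errors of the docking.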

The docking results showed that all benzoyl urea analogs have a binding mode similar to that of colchicine in the binding pocket of β-tubulin (Fig. 7). The amino acids VAL 238, CYS 241, ALA 250, LYS 254, LEU 255, LYS 352, and THR 353 are involved in the interaction of all compounds. As defined in the constraints, the interaction of the standard compound colchicine with the residues of tubulin involved hydrogen bonding with the side chain S atom of CYS 241. Similarly, two of the three C=O groups in all compounds except 19 and 32 showed hydrogen bonding with the SH of CYS 241 and the NH of ALA 250 (Fig. 7). Some compounds also showed hydrogen bonding with O atoms of THR 179, VAL 238, and LYS 352. It was also found that most of the hydrogen bonds formed by the amino acid residues involved the benzoyl urea chain.

Fig. 7 Docking of compounds on colchicine-binding site of β-tubulin. a Docking pose of all compounds docked and b overlap of colchicine and compound 18 at binding site

CPHs ADDHR.11 and ADDHR.245 were also correlated with the compounds docked at the colchicine-binding site of β-tubulin. The C=O group corresponding to the hydrogen bond acceptor (A1) of CPH ADDHR.11 showed hydrogen bonding with the SH of CYS 241 (Fig. 8). The hydrogen bond acceptor (A2) vector feature of CPH ADDHR.245 did not show any hydrogen bonding, whereas the C=O group near the hydrogen bond donor (D5) vector feature showed hydrogen bonding with the NH of ALA 250. Thus, the hydrogen bond acceptor (A1) feature is important for the formation of a hydrogen bond with the SH of CYS 241.

Fig. 8 Docking pose of compound 29 at active binding site showing CPH ADDHR.11. Yellow dotted lines show hydrogen bonding (Color figure online)

Conclusion

Four 3D-QSAR models were developed using variant CPHs consisting of one acceptor, two donors, one hydrophobic, and one ring vector feature. The correctness and capability of the 3D-QSAR models from CPHs ADDHR.11 and ADDHR.245 were validated internally by applying them to the test set and externally by predicting the activity of compounds from the literature that contain the benzoyl urea pharmacophore. One predictive 3D-QSAR model was selected based on the statistical results. These 3D-QSAR models provide insight into the structural requirements of benzoyl urea analogs as tubulin-binding agents with antiproliferative activity. Docking studies were also performed and correlated with the CPHs, and they indicated that all compounds bind in a similar pose at the colchicine-binding site of β-tubulin. The hydrogen bond acceptor (A1) vector feature formed a hydrogen bond with the SH of CYS 241 in all compounds, which suggests the importance of the 3D-QSAR model developed using CPH ADDHR.11. Thus, this 3D-QSAR model gives a hypothetical image for the design of new potential compounds. The involvement of the benzoyl urea core in the interaction with amino acids in the colchicine-binding site of β-tubulin suggests its importance as a common core for tubulin-binding agents. In conclusion, the overall study provides detailed structural and binding information on benzoyl urea derivatives as tubulin-binding agents with antiproliferative activity.