Introduction

Diabetes mellitus is a metabolic disorder and a major public health issue. Millions of people worldwide suffer from type 2 diabetes mellitus (T2DM). T2DM is a chronic disease that in most cases develops later in life. It is characterized by several defects, including pancreatic β-cell dysfunction, insulin resistance and increased hepatic glucose production, and by complications such as stroke, coronary artery disease, hypertension, nephropathy, neuropathy and retinopathy (Turner et al. 1996; UK Prospective 1998a, b). Current treatment strategies include reducing insulin resistance, supplementing the insulin deficiency with exogenous insulin, enhancing endogenous insulin secretion, reducing hepatic glucose output, and limiting glucose absorption. Pioglitazone, a widely used antidiabetic agent, has recently been reported to be associated with an increased risk of bladder cancer after treatment for 2 years or more (UK Prospective 1998a, b). Therefore, orally available antihyperglycemic agents are considered suitable for use in humans. The incretin hormone glucagon-like peptide-1 (GLP-1), which stimulates insulin secretion, is rapidly degraded in plasma by the serine protease dipeptidyl peptidase IV (DPP-IV, also written DPP-4). Inhibition of DPP-IV has therefore emerged as a major target of diabetes research (Holst 1999, 2005; Hui et al. 2005; Deacon et al. 1995). Much of this interest stems from orally active DPP-IV inhibitors, which have the potential to control blood glucose levels (Holst 1999). To date, sitagliptin, vildagliptin and saxagliptin have obtained approval as the first representatives of this class of novel antidiabetic agents, closely followed by others such as alogliptin (Hanefeld et al. 2007; Zimmet et al. 2001). Nevertheless, there is still a need for more potent and safer DPP-IV inhibitors that are selective and free of the side effects shown by the currently available inhibitors for the treatment of T2DM. In the past few decades, a large number of compounds have been synthesized and biologically evaluated as DPP-IV inhibitors (Havale and Pal 2009; Defronzo et al. 2008).

The SAR and QSAR studies have therefore mainly focused on the presently available series. In this work, QSAR models and a docking methodology were applied to a series of pyrrolidine derivatives to investigate the interactions between DPP-IV inhibitors and the receptor, using hologram quantitative structure-activity relationship (HQSAR), comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) methods (Saqib and Siddiqi 2009; Zeng et al. 2007; Peters et al. 2004). For this purpose, 42 compounds were selected from the literature (Fukushima et al. 2008) and divided into a training set and a test set. The resulting models provide useful clues for the structural modification of these inhibitors and for the further design of compounds with much improved inhibitory activity against DPP-IV for the treatment of type 2 diabetes.

Materials and methods

Data set

A set of 42 pyrrolidine derivatives (IC50 values ranging from 1.1 to 316 nM) was collected from the literature (Fukushima et al. 2008). The structures of all the compounds and their biological activities are listed in Table 1. The IC50 values were transformed to the corresponding pIC50 (−log IC50) values, which were used as the dependent variable in the QSAR study. It has been suggested that generated models should be tested on a sufficiently large test set to establish a statistically meaningful and reliable QSAR model; therefore, the molecules were randomly divided into a training set and a test set in such a way that both sets cover the structural diversity, the chemical prototypes and the complete range of DPP-IV inhibitory activity. The data set was divided into a training set of 26 compounds and a test set of 16 compounds (labelled with an asterisk) (Gupta et al. 2011; Gupta and Saxena 2011; SYBYL6.9).
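The IC50 to pIC50 transformation can be illustrated with a minimal Python sketch. It assumes that the IC50 values, reported in nM, are converted to mol/L before the logarithm is taken; the function name `pic50_from_nM` is introduced here only for illustration and is not part of the original workflow.

```python
import math


def pic50_from_nM(ic50_nM: float) -> float:
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in nM.

    Assumption: IC50 is converted to molar units first, so that
    pIC50 = 9 - log10(IC50 in nM).
    """
    if ic50_nM <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nM)


# The two ends of the activity range quoted in the text.
for ic50 in (1.1, 316.0):
    print(f"IC50 = {ic50} nM -> pIC50 = {pic50_from_nM(ic50):.2f}")
```

Under this assumption the reported activity range of 1.1 to 316 nM corresponds to pIC50 values of roughly 8.96 down to 6.50, i.e. about 2.5 log units.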

Table 1 Structures and biological activities (actual, predicted and residual pIC50 data) of pyrrolidine analogues used in QSAR study representing both training set and test set

Molecular alignment

The molecular modeling studies were performed using SYBYL X 2.0 (Bush and Nachbar 1993) running on an Intel Core 2 Duo workstation. The 3D structures of the molecules to be analyzed were aligned on a suitable conformational template, which is assumed to adopt a ‘bioactive conformation’. Here, the structures of all the compounds were built using the most active compound (the compound with the highest pIC50) as the template. Partial charges were calculated using the Gasteiger–Hückel method (Viswanadhan et al. 1989), and the geometries were optimized using the Tripos force field (Cramer et al. 1988) with a distance-dependent dielectric function, an energy convergence criterion of 0.001 kcal/(mol Å) and a maximum of 1000 iterations. Compound 1, which has the lowest IC50 value (1.1 nM) in the pyrrolidine-based series, was used as the template. The CoMFA and CoMSIA models were constructed from the structural alignment shown in Fig. 1a and b.

Fig. 1

The structural alignment of the 42 molecules (a) with their common substructure used for superimposing the compound in the data set (b) of pyrrolidine based series

Comparative molecular field analysis (CoMFA) studies

The basic assumption of CoMFA and CoMSIA is that the observed biological property, i.e. pIC50, can be well correlated with the steric, electrostatic and other fields surrounding a set of ligand molecules (Cramer et al. 1988). In the CoMFA analysis, the steric and electrostatic fields were calculated at each lattice point of a grid with 2 Å spacing, using an sp3-hybridised carbon atom with a +1 charge as the probe atom. The generated CoMFA fields were truncated at the default energy cutoff of 30 kcal/mol. The Gasteiger–Hückel charge model was found to be the best choice and was used in the CoMFA and CoMSIA analyses (Cramer et al. 1988; Klebe et al. 1994).
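To make the lattice and cutoff concrete, the following Python sketch computes only the electrostatic part of a CoMFA-style field. It is a simplified illustration, not the SYBYL implementation: it assumes a plain Coulomb term with a distance-dependent dielectric (ε = r), omits the steric (Lennard-Jones) term, which needs force-field parameters, and uses a hypothetical two-atom input with made-up partial charges.

```python
import itertools
import math

COULOMB_K = 332.06  # kcal*Å/(mol*e^2), Coulomb conversion factor
SPACING = 2.0       # Å, grid spacing quoted in the text
CUTOFF = 30.0       # kcal/mol, energy cutoff quoted in the text


def lattice(lo, hi, spacing=SPACING):
    """Return all points of a regular lattice spanning the box from lo to hi."""
    axes = []
    for a, b in zip(lo, hi):
        n = int(math.floor((b - a) / spacing + 1e-9)) + 1
        axes.append([a + i * spacing for i in range(n)])
    return list(itertools.product(*axes))


def electrostatic_field(atoms, points, probe_charge=1.0):
    """Probe energy at each lattice point, truncated to [-CUTOFF, +CUTOFF].

    atoms: iterable of (x, y, z, partial_charge); hypothetical input format.
    Assumption: distance-dependent dielectric (epsilon = r), so E ~ q1*q2/r^2.
    """
    field = []
    for px, py, pz in points:
        energy = 0.0
        for x, y, z, q in atoms:
            r2 = (px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2
            if r2 == 0.0:
                if q == 0.0:
                    continue
                # Probe sits on a charged atom: the energy diverges, so saturate.
                energy = math.copysign(math.inf, probe_charge * q)
                break
            energy += COULOMB_K * probe_charge * q / r2
        field.append(max(-CUTOFF, min(CUTOFF, energy)))
    return field


# Hypothetical two-atom fragment with made-up partial charges.
atoms = [(0.0, 0.0, 0.0, -0.4), (1.5, 0.0, 0.0, 0.4)]
points = lattice((-4.0, -4.0, -4.0), (4.0, 4.0, 4.0))
field = electrostatic_field(atoms, points)
print(len(points), min(field), max(field))
```

Each molecule in the aligned set would yield one such vector of truncated energies, and these vectors form the descriptor columns passed to the PLS analysis.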

Comparative molecular similarity index analysis (CoMSIA) studies

The CoMSIA descriptors, namely steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor, were generated using an sp3-hybridized carbon probe atom with a +1 charge, a van der Waals radius of 1.4 Å, and hydrophobic and hydrogen bond properties of +1. The CoMSIA similarity index \( A_{F,k} \) between a molecule j and the probe atom at a grid point q was calculated using Eq. 1:

$$ A_{F,k}^{q}(j) = - \sum\limits_{i = 1}^{n} W_{\text{probe},k} \, W_{ik} \, e^{ - \alpha r_{iq}^{2} } , $$
(1)

where q represents the grid point, i is the summation index over all n atoms of the molecule j under computation, \( W_{ik} \) is the actual value of the physicochemical property k of atom i, \( W_{\text{probe},k} \) is the value of the probe atom, \( r_{iq} \) is the distance between the probe atom at grid point q and atom i, and \( \alpha \) is the attenuation factor (Pirhadi and Ghasemi 2010; Zhao et al. 2011). Five physicochemical properties (steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor) were evaluated. A Gaussian-type distance dependence was used between the grid point q and each atom i of the molecule. The attenuation factor was set to 0.3 (Klebe et al. 1994).
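A direct transcription of Eq. 1 into Python may help to show how the Gaussian distance dependence works. The sketch reads Eq. 1 as a Gaussian in the squared atom-to-grid-point distance with an attenuation factor of 0.3; the atom tuple format and the example values are hypothetical.

```python
import math


def comsia_index(grid_point, atoms, probe_value=1.0, alpha=0.3):
    """Similarity index at one grid point for one property k.

    Implements A = -sum_i W_probe,k * W_ik * exp(-alpha * r_iq^2).
    atoms: iterable of (x, y, z, w_ik), where w_ik is the property value
    of atom i (hypothetical input format).
    """
    gx, gy, gz = grid_point
    total = 0.0
    for x, y, z, w_ik in atoms:
        r2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
        total += probe_value * w_ik * math.exp(-alpha * r2)
    return -total


# Hypothetical property values for two atoms, evaluated at the origin.
atoms = [(0.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 0.5)]
print(comsia_index((0.0, 0.0, 0.0), atoms))
```

Because the Gaussian stays finite at zero distance, the index is defined at every grid point, including points inside the molecule, so no energy cutoff is needed, in contrast to the CoMFA fields.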

PLS calculations and validations

Partial least squares (PLS) regression analysis (Bush and Nachbar 1993) was used to quantify the relationship between the DPP-IV inhibitory activity (pIC50), taken as the dependent variable, and the CoMFA and CoMSIA descriptors, taken as the independent variables. The optimum number of components (ONC) was the number of components resulting in the highest cross-validated correlation coefficient (\( {\text{r}}^{ 2}_{\text{cv}} \)), which was defined as follows:

$$ {\text{r}}^{2}_{\text{cv}} = 1 - \frac{\sum \left( {\text{Y}}_{\text{obs}} - {\text{Y}}_{\text{pred}} \right)^{2}}{\sum \left( {\text{Y}}_{\text{obs}} - {\text{Y}}_{\text{mean}} \right)^{2}}, $$

where \( {\text{Y}}_{\text{pred}} \), \( {\text{Y}}_{\text{obs}} \) and \( {\text{Y}}_{\text{mean}} \) are the predicted, observed and mean values of the target property (pIC50), respectively. The predictive r2 value, based on the test set molecules, is computed by the following formula:

$$ {\text{r}}^{2}_{\text{pred}} = \frac{{\text{SD}} - {\text{PRESS}}}{\text{SD}} $$

where SD is the sum of squared deviations between the biological activities of the test set molecules and the mean activity of the training set molecules, while PRESS is the sum of squared deviations between the observed and the predicted activities of the test set molecules (Pirhadi and Ghasemi 2010; Zhao et al. 2011; Ai et al. 2011; Ping et al. 2011).
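The two validation statistics can be written as short Python functions. This is a minimal sketch with the predictive r2 taken as (SD − PRESS)/SD; the function names and the small example arrays are hypothetical and are not data from Table 1.

```python
def r2_cv(y_obs, y_pred_cv):
    """Cross-validated r2: 1 - sum((obs - pred)^2) / sum((obs - mean)^2).

    y_pred_cv holds the predictions made for each compound while it was
    left out of model fitting.
    """
    mean = sum(y_obs) / len(y_obs)
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred_cv))
    total = sum((o - mean) ** 2 for o in y_obs)
    return 1.0 - press / total


def r2_pred(y_test_obs, y_test_pred, y_train_obs):
    """Predictive r2 for the external test set: (SD - PRESS) / SD.

    SD uses the mean activity of the training set, not of the test set.
    """
    train_mean = sum(y_train_obs) / len(y_train_obs)
    sd = sum((o - train_mean) ** 2 for o in y_test_obs)
    press = sum((o - p) ** 2 for o, p in zip(y_test_obs, y_test_pred))
    return (sd - press) / sd


# Hypothetical pIC50 values, for illustration only.
train_obs = [6.0, 7.0, 8.0]
train_cv_pred = [6.5, 7.0, 7.5]
test_obs = [6.5, 8.5]
test_pred = [6.7, 8.2]
print(round(r2_cv(train_obs, train_cv_pred), 3))
print(round(r2_pred(test_obs, test_pred, train_obs), 3))
```

Both statistics equal 1 for perfect predictions and drop below 0 when the model predicts worse than the corresponding mean activity.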

HQSAR studies

HQSAR is a technique that requires only 2D structures and employs specialized fingerprints (molecular holograms) as predictor variables of pharmacological activity. The sensitivity of the generated model depends on parameters such as hologram length, fragment size and fragment distinction. The analysis here was based on fragment distinction and fragment size (Sridhara et al. 2011). For model development, molecular fragments are first distinguished on the basis of atoms (A), bonds (B), connections (C), hydrogen atoms (H), chirality (Ch), and donor and acceptor (DA). The statistical parameters obtained from the analysis were the number of components (NC), cross-validated q2, conventional r2, standard error (SE), best hologram length (BHL) and predicted pIC50. On the basis of the best model, the fragment contributions were then saved as color-coded structures.
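The idea of a molecular hologram can be sketched in a few lines of Python. The example covers only the binning step: each fragment is hashed to an integer and counted in a fixed-length array. The fragment strings are hypothetical placeholders, fragment enumeration is not shown, and CRC32 from the standard library stands in for whatever hashing the HQSAR software uses internally. The default length of 353 is the best hologram length reported in the results of this study.

```python
import zlib


def molecular_hologram(fragments, length=353):
    """Count hashed fragments into a fixed-length integer array.

    fragments: iterable of fragment strings (hypothetical encoding).
    Different fragments may collide in the same bin; the hologram
    length controls how often that happens.
    """
    hologram = [0] * length
    for fragment in fragments:
        bin_index = zlib.crc32(fragment.encode("utf-8")) % length
        hologram[bin_index] += 1
    return hologram


# Hypothetical fragments of 4 atoms; a repeated fragment raises its bin count.
fragments = ["C-C-N-C", "C-C-N-C", "N-C-C#N"]
hologram = molecular_hologram(fragments)
print(sum(hologram), max(hologram))
```

The bin counts, one array per molecule, then serve as the independent variables in the PLS regression against pIC50.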

Molecular docking

The purpose of the docking studies was to generate 3D structures of the ligands with appropriate binding orientations and conformations. Molecular docking was carried out using Surflex-Dock in SYBYL X 2.0 to explore possible binding conformations and to understand the interactions of the dipeptidyl peptidase-IV receptor with various compounds. The crystal structure of dipeptidyl peptidase-IV (DPP-IV) was retrieved from the RCSB Protein Data Bank (PDB entry code: 2G5P). The protein structure was used in the subsequent docking experiments without energy minimization. All ligands and water molecules were first removed, and then the polar hydrogen atoms and AMBER7 FF99 charges were added. The protomol bloat value was set to 1 and the protomol threshold value to 0.5, with which a reasonable binding pocket was obtained.

Results and discussion

CoMFA and CoMSIA techniques were used to derive 3D-QSAR models on a set of 42 chemically diverse pyrrolidine analogues as DPP-IV inhibitors. The best statistical parameters associated with the CoMFA and CoMSIA models are listed in Tables 2 and 3. The CoMFA model derived from the 26 pyrrolidine training compounds using both steric and electrostatic fields gave a cross-validated correlation coefficient (q2) of 0.727 with an optimum number of components of 2 (Table 4). The best CoMSIA model included steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor fields and gave a q2 of 0.870 with an optimum number of components of 2. The contributions of the steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor fields were 0.153, 0.121, 0.267, 0.226 and 0.233, respectively. The external predictive ability of the CoMFA model was evaluated on the test set of 16 molecules; the obtained predictive r2 value (\( {\text{r}}^{ 2}_{\text{pred}} \)) of 0.655 supported the predictive ability of the generated model. The CoMSIA model showed a slightly lower external predictive ability (\( {\text{r}}^{ 2}_{\text{pred}} \) = 0.604) for the same test set. The predicted activities and residual values are shown in Table 1, and the correlations between actual and predicted activities for CoMFA and CoMSIA are shown in Figs. 2 and 3, respectively.

The contour maps generated from the CoMFA analyses are shown in Figs. 4 and 5. As shown in Fig. 4, the CoMFA steric and electrostatic contours around the most active molecule of the dataset, compound 1, indicated a sterically favorable region (green contour) near the 2nd position of the substituted cyanopyridine ring, while sterically unfavorable regions (yellow contours) for DPP-IV inhibitory activity were observed near the 2nd and 3rd positions of the pyrrolidine ring as well as around the substituted cyanopyrrolidine ring. For compound 17 (Fig. 5a), a large green contour near the 1st position of the substituted cyanophenyl ring indicated that a bulky group at this position would enhance the activity, while a yellow contour adjacent to the green contour around this ring and near the 2nd, 3rd and 4th positions of the pyrrolidine ring suggested steric restriction at these positions. As shown in Fig. 5b, the blue contours near the 2nd position of the substituted cyanophenyl ring and near the 3rd position of the pyrrolidine ring indicated that an electron-donating group is required at these positions to enhance biological activity, while the red contour around the 3rd position of the substituted cyanophenyl ring suggested that an electron-withdrawing group would be essential at this position.

In the CoMSIA contour maps, the hydrophobic fields are represented by yellow and white contours (yellow, favored; white, disfavored), the hydrogen bond donor fields by cyan and purple contours (cyan, favored; purple, disfavored), and the hydrogen bond acceptor fields by magenta and red contours (magenta, favored; red, disfavored). For compound 1 (Fig. 6a), the 5-cyano group and the 2nd position of the substituted pyridine ring were marked in green, while the 1st position of the ring was marked in yellow. In Fig. 6b, the 2nd and 3rd positions of the substituted pyridine ring were marked in blue, while a red contour was present on top of this ring. In Fig. 6c, a large yellow contour covered the 2nd and 3rd positions of the ring. In Fig. 6d, a purple contour was present over the top portion of the ring. For compound 17 (Fig. 7a), the 1st, 2nd, 5th and 6th positions of the substituted cyanopyridine ring were marked in green, while a yellow contour covered the 1st position of this ring. In the electrostatic contour map (Fig. 7b), the 6th position of this ring was marked in red, while a blue contour covered the 1st position. In the hydrophobic contour map (Fig. 7c), a large yellow contour covered the 1st and 5th positions of the ring, while a white contour was present near the 3rd and 4th positions. In the H-bond donor contour map (Fig. 7d), a purple contour covered the 6th position of the ring, while a cyan contour was present above the amino group. In the H-bond acceptor contour map (Fig. 7e), a magenta contour around the C-2 and C-3 atoms of this ring indicated that these atoms act as hydrogen bond acceptors.

HQSAR models were developed for the series of DPP-IV inhibitors to relate activity to structural fragments (Table 5). Among all models built on the training set compounds of the pyrrolidine-based series, the best statistical results were obtained for model 1 with fragment distinction A/B/C/H (q2 = 0.939, r2 = 0.949), using a fragment size of 4–7, six components as the optimum number, a hologram length of 353 and a standard error of estimation of 0.166. The plot of actual versus predicted pIC50 values (Fig. 8) reflects the internal consistency of the model, expressed by q2 and r2, which is important for predicting the activity of external compounds. In the contribution maps, green and yellow represent a positive contribution, white an intermediate or moderate contribution, red and orange a negative contribution, and cyan the maximal common structure; together they suggest the structural fragments required for enhancing the binding affinity. As shown in Fig. 9, in compound 1 the pyrrolidine portion is a backbone common to all compounds in the training set and is therefore colored cyan. In compound 17, the 5th position of the pyrrolidine ring is marked in red, indicating that an electropositive group is required at this position to enhance activity. In compound 37, the 2nd position of the ring is marked in green, indicating that a bulky group is required at this position to enhance activity. In compound 38, the 1st, 3rd and 4th positions of the pyrrolidine ring are marked in yellow, indicating that no bulky group is required at these positions to enhance activity.

The most potent compounds, 1 and 17, were selected according to their docking scores and subjected to a more detailed docking study (Fig. 10). Hydrogen bonds are shown as yellow broken lines. As shown in Fig. 10a, the important amino acid residues of the 2G5P receptor that interact with compound 1 were Arg-358 and Tyr-666. The interaction of Arg-358 with the 4-fluoro group of the pyrrolidine ring is favored by bulk at this position. In Fig. 10b, the important amino acid residues of the 2G5P receptor that interact with compound 17 were Arg-658, Glu-206 and Tyr-667.

Table 2 Statistics of CoMFA models of both series on different partial atomic charges
Table 3 Statistics of CoMSIA models of both series on different partial atomic charges
Table 4 Summary of the CoMFA and CoMSIA statistical results for the pyrrolidine molecules of training set best model
Fig. 2

Graph of actual vs. predicted pIC50 values of all compounds for training and test sets using CoMFA

Fig. 3

Graph of actual vs. predicted pIC50 values of both series for training and test sets using CoMSIA

Fig. 4

CoMFA steric and electrostatic contour maps for compound 1 of pyrrolidine based series; a steric CoMFA contour of compound 1, b electrostatic CoMFA contour of comp. 1

Fig. 5

CoMFA steric and electrostatic contour maps for compound 17 (compound with best dock score) of pyrrolidine based series: a steric CoMFA contour of compound 17, b electrostatic CoMFA contour of comp. 17

Fig. 6

CoMSIA steric, electrostatic, hydrophobic, H-bond donor, H-bond acceptor contour maps for compound 1 of pyrrolidine based series; a steric CoMSIA contour of comp. 1, b electrostatic contour of comp. 1, c hydrophobic contour of comp. 1, d H-bond donor contour of comp. 1, e H-bond acceptor contour of comp. 1

Fig. 7

CoMSIA steric, electrostatic, hydrophobic, H-bond donor, H-bond acceptor contour maps for compound 17. a Steric, b electrostatic, c hydrophobic contour, d H-bond donor, e H-bond acceptor

Table 5 The determination of statistical parameters for the models of pyrrolidine series based on different fragment distinct with default fragment size (4–7)
Fig. 8

Graph of actual vs. predicted pIC50 values of both series for training and test sets using HQSAR

Fig. 9

The compounds (1, 17, 37 and 38) contributing map of pyrrolidine based series which shows the direct relation between the structural fragment and pharmacological activity

Fig. 10

Stereo-view of the docked conformations of compound 1 (a) and 17 (b) of pyrrolidine based series in the active site of DPP-IV enzyme (2G5P)

Conclusion

The present work demonstrated the successful application of a combination of three computational approaches (CoMFA, CoMSIA and HQSAR) and molecular docking analysis to identify the essential structural requirements in 3D chemical space for the modulation of the DPP-IV inhibitory activity of pyrrolidine derivatives. The CoMFA and CoMSIA contour maps of the pyrrolidine analogues showed that an electron-donating group at the 3rd position of the pyrrolidine ring increases the activity, while electron-withdrawing groups are favored at the 4th and 5th positions of the pyrrolidine ring. The binding mode of the highly active compounds at the active site of DPP-IV (PDB: 2G5P) was explored, and hydrogen-bonding interactions were observed between the inhibitors and the target. These results will serve as a useful guideline for designing novel compounds with the desired DPP-IV inhibitory activity.