Introduction

Type 2 diabetes mellitus (T2DM) is one of the most common diseases which many of the world’s people suffer from, and is increasing rapidly. T2DM constitutes approximately 90% diabetic patients (Sharma and Chandola 2011). It is estimated that 382 million people have diabetes and this number is expected to reach 592 million by 2035 (Guariguata et al. 2014). Therefore, there is a great requirement for the discovery of novel drugs in this field. Intensive efforts have been made worldwide to develop therapies for type 2 diabetes. The attractive targeting of G-protein-coupled receptor 119 (GPR119) with the small molecules inhibitors is one of the most promising new approaches for the treatment of type 2 diabetes and other elements of the metabolic syndrome. GPR119 is a class A type receptor (Shah 2009) that is expressed in pancreatic β-cells and incretin releasing cells in the gastrointestinal (GI) tract (Overton et al. 2008). One of reasons that cause T2DM is gradual loss of beta-cells function in the pancreatic (Wellenzohn et al. 2012). Agonists of GPR119 as antidiabetics attract interest for their capability to simulate an incretin response in the gut and glucose-dependent insulin release in the pancreas (McClure et al. 2011).

In recent years, the computer-aided drug design approaches (Cern et al. 2014) have made a large contribution to the design of novel drugs. Comparative molecular field analysis (CoMFA) (Cramer et al. 1988; Shiri et al. 2015) studies through interpretation of contour maps have given insight to design the new inhibitors. The pharmacophore approach (Ambure et al. 2014) (Ambure et al. 2014; Pirhadi and Ghasemi 2012) is effective in drug design. According to a definition by IUPAC, a pharmacophore model is ‘an ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target and to trigger (or block) its biological response’(Yang 2010). Virtual screening as a power tool to introduce novel hits have been used (Cano et al. 2014; Shoichet 2004). Molecular docking (Gschwend et al. 1996) is applied for the investigation of the interactions between protein and ligand and to attain bioactive conformers (Ghasemi et al. 2013; Li et al. 2011; Negoro et al. 2012). In the present work, we are to find new inhibitors for type 2 diabetes from the approaches of pharmacophore modeling, molecular docking, and virtual screening.

Moreover the electronic effects play a key role in drug discovery process (Feng et al. 2005). Herein, docking studies and quantum chemistry calculations were performed on the basis of density functional theory (DFT) theory in order to explore some amino acids involved in the binding site by paying special attention to the electronic effects which are related to highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) energies and molecular electrostatic potential (MEP) map. Herein, we have introduced the novel hits with different scaffolds using virtual screening in the ZINC 3D database which involve ligand-based pharmacophore and molecular docking. Finally, absorption, distribution, metabolism, and excretion (ADME) prediction and toxicity risk assessment analysis were carried out by virtual screening to obtain the required ADMET properties.

Material and method

Data set

A series of 44 small molecules of GPR119 agonist and their functional potency (EC50 value) were taken from the literature (Scott et al. 2012). The dataset was split to training set (33 compound) and test set (10 compound) by Kennard and Stone algorithm to maximize the diversity of the test set and to test the predictive accuracy of the model when extrapolating outside the training set (Kennard and Stone 1969). Due to large residual compound 34 is outlier and was removed from the training set. The EC50 values (μM), (The EC50 is the concentration of a drug that gives half-maximal response) was converted to pEC50 in the molar range in order to give reliable numerical values. The structure of the compounds and their functional potency are shown in Tables 1ae.

Table 1a Structures and experimental biological activities (pEC50) for compounds of the general structure Ι
Table 1b Structures and experimental biological activities (pEC50) for compounds of the general structure IΙ
Table 1c Structures and experimental biological activities (pEC50) for compounds of the general structure ΙII
Table 1d Structures and experimental biological activities (pEC50) for compounds of the general structure ΙV
Table 1e Structures and experimental biological activities (pEC50) for compounds of the general structure V

Molecular modeling and alignment

The 3D structures of compounds 144 (Tables 1ae) were built in SYBYL 7.3 molecular modeling software package from Tripos, Inc., St. Louis, MO. The partial atomic charges of structures were calculated by Gasteiger–Hückel and their energy was minimized using the Tripos force field with a distance-dependent dielectric and the Powell conjugate gradient algorithm convergence criterion of 0.01 kcal/mol (Clark et al. 1989). Alignment is a critical step in CoMFA modeling. Appropriate alignment method caused to acquire a model with high quality statistical parameters. Here, we applied rigid body alignment by the SYBYL Distill module. Compound 40 was considered as a template for the alignment of compounds because it had the highest potency. Alignment of molecules was performed with maximum common substructure by distill method without including bond types in rings (Pirhadi and Ghasemi 2012; Pirhadi et al. 2014). Because CoMFA models are very sensitive to the different space orientations of the molecular aggregate with regard to the grid box, all-orientation search (AOS) strategy was used to achieve the best orientation and minimize the effect of the initial orientation of aligned compounds. AOS module was written in SYBYL programming language (Wang et al. 1998).

CoMFA study

The CoMFA model described the molecular characterization with van der Waals and electrostatic fields, using Lennard–Jones and Coulombic potential, respectively. The CoMFA descriptor fields were calculated at each grid point by a sp3 hybridized carbon atom with a charge of +1 as the probe atom in a 3D cubic lattice. Column filtering was set to 2.0 kcal/mol. Cut-off for both steric and electrostatic field was set to 30 kcal/mol. Also, in method comparative molecular field analysis region focusing (CoMFA-RF) applies weights to the lattice points which attenuate the contribution of these points to subsequent analysis and refining model, here ‘StDev* Coefficients’ values and different weighing factors were applied in addition to grid spacing for achieving the better models (Pirhadi and Ghasemi 2010; Shiri et al. 2016). Partial least squares (PLS) regression was used to find the relationship between the CoMFA descriptors as independent variables and the biological activities as dependent variables (Wold et al. 2001).

Model validation

Leave-one-out cross-validation (LOO-CV) as internal validation was employed to validate the derived quantitative structure activity relationships (QSAR) model. In the LOO-CV method, one compound of the training set was excluded, and a new QSAR model was built by the remaining compounds and the activity of the excluded compound was predicted using the new constructing model. This process was repeated until all compounds of the training set were excluded once. The optimum number principle component for the model as the number leading to the highest value of q 2 is defined as

$${q^{\rm{2}}} = 1 - \frac{{\mathop {\sum}\nolimits_{{\rm{i = 1}}}^n {{{\left( {{Y_{{\rm{pred}}}} - {Y_{{\rm{obs}}}}} \right)}^2}} }}{{\mathop {\sum}\nolimits_{{\rm{i = 1}}}^n {{{\left( {{Y_{{\rm{obs}}}} - {Y_{{\rm{mean}}}}} \right)}^2}} }}$$

where Y obs, Y pred, and Y mean are the observed, predicted, and mean of the biological activity values, respectively, and n is the number of compounds in the training set. A high q 2 value (q 2 > 0.5) was used as the evidence of high predictive ability of the model (Gibbs 2000; Hawkins et al. 2003). Golbarikh and Tropsha reported that the high value of q 2 is essential and important but not adequate for a predictive model (Lu et al. 2010) and an external analysis should be employed by a test set of molecules to investigate the prediction of model. They also stated that a QSAR model is predictive if it fulfills the following conditions (Tropsha et al. 2003):

$$q^{2} \geq0.5$$
$$R^{2}\geq 0.6$$
$$\frac{{\left( {{R^2} - R_0^2} \right)}}{{{R^2}}} < 0.1\;\;{\mathrm{or}} \;\frac{{{R^2} - R_0^{\prime 2}}}{{{R^2}}} < 0.1$$
$$0.85 \le k \le 1.15\,\,{\mathrm{or}}\,\,0.85 \le k\prime \le 1.15$$

Where R 2 is the squared correlation coefficient values between the observed and predicted values of the test set compounds (Lu et al. 2010). \(R_0^2\) and \(R_0^{\prime 2}\) are squared correlation coefficient values for predicted vs. observed and observed vs. predicted activities, through origin, respectively, and k’ is the slope of regression lines through the origin (Lu et al. 2010). In order to evaluate the predictive power of the CoMFA models, the activity values of the external test set of compounds not used in model generation, were predicted. The predictive R 2 (R 2 pred) of the external prediction was obtained as following formula:

$$R_{{\rm{pred}}}^{\rm{2}} = 1 - \frac{{{\sum} {{{\left( {{Y_{{\rm{obs}}}} - {Y_{{\rm{pred}}}}} \right)}^2}} }}{{{\sum} {{{\left( {{Y_{{\rm{obs}}}} - {{\bar Y}_{{\rm{mean}}}}} \right)}^2}} }}$$

where Y obs , Y pred , and \({\bar Y_{mean}}\) are the observed, predicted, and mean values of the activity, respectively.

Molecular docking

The crystal structure of GPR119 has not been published yet. The pdb homology modeled from the primary sequences of human GPR119 (Q8TDV5) was taken from the CSSB (http://cssb.biology.gatech.edu/skolnick/sup/GPCRt3). Among the top five generated homology models, the first one was selected based on the highest C-score (2.1) and used for docking study. Molecular docking was done in Discovery studio 2.5 package (AccelrysInc, San Diego, CA, USA) with Gold algorithm. The highest active compound 40 was typed with CHARMM force field and the partial charges were determined by the Momany–Rone option. Then, the resulting compound was minimized by a smart minimizer algorithm which performs 1000 steps of the steepest descent with a RMS gradient tolerance of 3, followed by conjugate gradient minimization. The preparation and minimization of protein were made, the protein structure was typed with CHARMM force field, the partial charge was computed by Momany–Rone, hydrogen atoms were added, and all water molecules were removed and pH of protein was adjusted to almost neutral 7.4, using protein preparation protocol. The active site of the target protein was created as a spherical region with a radius of 16 Å which shows atoms of the ligand and the side-chains of the residues of the receptor within 16 Å from the center of the binding site are free to move. Other parameters were fixed by default protocol settings. The most active compound 40 was docked into active site of the protein. The best pose among 10 given poses with the higher Gold score fitness function was chosen in order to achieve the key interaction between the compound and the amino acid of active site.

Quantum chemical calculations

In this study, the geometry optimization and calculations were performed at the DFT level on a personal computer by energy optimization, using the Gaussian 09 (Frisch et al. 2009) program package. The HOMO and the LUMO energies at B3LYP/6-31G (d,p) level were calculated for bioactive conformer obtained of compound 40 in the docking studies. The MEP was also studied at the same level. The MEP surface was created using Molekel (Varetto 2009).

3D-Pharmacophore generation

A pharmacophore is a structural representation of a spatial arrangement of the essential features (hydrogen bond donor, hydrogen bond acceptor, hydrophobic area, and positively and negatively ionized areas) for a set of molecules to interact with the specific target receptor (2426). The pharmacophore model was generated to extract the common important pharmacophore features among the first five most active compounds by Hip-Hop in the discovery studio 2.5 packages. The best pharmacophore model was obtained by two features including hydrogen bond acceptor and hydrophobic feature which were color coded with green and blue, respectively.

Virtual screening

Virtual screening including a wide variety of computational approaches let us derivate the most active hits from a massive virtual library. The pharmacophore validated model was used as a 3D virtual screening query for the detection of the potent candidates from ZINC chemical database. At first we selected a random collection 124,276 molecule from ZINC database and built an in-house chemical database. For each compound in the database, a fast conformer generation approach produced 100 conformers allowing a maximum energy of 15 kcal/mol above the global energy minimum for the conformation searching. In the first screening step, a total 7430 compounds were detected by pharmacophore model. The hit compounds were further screened by pharmacokinetic property filters, Lipinski’s rule of five to make them more drug-like because the molecules with molecular weight (MW) more than 500 calculated logP higher than 5, and hydrogen-bond donors, or hydrogen-bond acceptors more than 5 and 10 respectively, were rejected from the database (32, 33). Based on Lipinski’s rules, 264 compounds were retrieved. The obtained hit compounds were docked into the protein structure as a structure-based screening using a fast screening method, GOLD, within Discovery Studio 2.5 package. Finally, seven compounds were chosen at this step using a high Gold fitness score. The virtual screening steps are shown in a flowchart in Fig. 1.

Fig. 1
figure 1

The flowchart of the combined virtual screening

ADME and toxicity risk assessment studies

The best binding interactions with a target for the best drug are necessary but not a sufficient condition. An ideal oral medicine absorbed rapidly and completely from the digestive tract, and distributed exactly and specially to its target in the body, does not metabolize the way that immediately clear its activity, and eliminate in a proper way without leading harm. The ADME are the main subjects in pharmacokinetics (Hodgson 2001). The ADME properties prediction of compounds before synthesis would be beneficial in the process of drug discovery. The chemists need this data to improve the pharmacokinetic features of compounds. Some chemical descriptors such as MW (less MW for better absorption) were used to measure the pharmacokinetic parameters. Drugs with more MW include many polar functional groups and rotatable bonds so their absorption is poor, and drugs with low MW can result in the extracellular solution of body organs rather quickly (Guranatra 2001). Lipinski rules (Lipinski et al. 2001) were used as a filter to assess the medicinal ability of the compounds in virtual screening step. Some of computed pharmacokinetic parameters were blood-brain barrier (logBB), apparent Caco-2 permeability, apparent MDCK permeability, logKp for skin permeability, the distribution volume, and plasma protein binding (logK hsa) which make effect for the distribution of a drug in the body (Reichel and Begley 1998). LogP (the octanol–water partition coefficient) is a physicochemical parameter of a drug’s hydrophobicity property, and also logP and MW are effective in drug excreting from the body. It is obvious that compounds with greater lipophilicity have low absorption and more metabolisms in liver since they can bind macromolecules easier causing their toxicity, also fast renal clearance is related to hydrophilic and small compounds. On the other hand, the metabolism of most medicines in the liver is related to hydrophobic and big compounds (Lombardo et al. 2003; Testa et al. 2001).The in-silico prediction of ADME descriptors were calculated using QikProp v.3.0 program (Jorgensen 2006) and were investigated about related standard ranges (http://www.schrodinger.com/QikProp/). All hits were subjected to the toxicity risk assessment (mutagenicity, tumorigenicity, irritant, and reproductive effects) using OSIRIS program. Also, their drug-likeness drug score values were determined.

Results and discussion

CoMFA analysis

The statistical parameters from CoMFA and CoMFA-RF models constructed with steric and electrostatic fields are mentioned in Table 2. The best results were obtained for CoMFA-RF model. LOO-CV procedure of the CoMFA-RF model showed a high q 2 value of 0.605 with 6 latent variables. The non-cross-validated PLS analysis of the model leading to R 2 pred = 0.88 and standard error of estimation (SEE) of 0.204, F = 72.86. To evaluate the robustness and statistical confidence of the models, a total of 100 runs bootstrapping analysis was carried out. A high R 2 bs (0.965) indicated the degree of confidence in analysis. These results detected a good conventional statistical correlation. The observed and predicted activities, based on the prediction of the models are listed in Table 3. Figure 2 depicts the relationship between the experimental and predicted activities based on the derived QSAR models. The interpretation of this model’s contour maps is easier and more rational than the corresponding CoMFA model.

Table 2 Summery of the statistical parameters for the CoMFA and CoMFA-RF models
Table 3 Gold fitness, experimental and predicted pEC50 values with residual values of the training and test set compounds
Fig. 2
figure 2

Predicted against observed activities for training and test sets of GPR 119 agonists based on the CoMFA model

The steric (green and yellow) and electrostatic (blue and red) fields of CoMFA-RF model as 3D contour maps for visualizing the results are exhibited in Fig. 3a, b for compound 40 as template. The changes in fields around the molecules were associated with the changes in volume and position of the contour maps leads to the differences in biological activity. Higher activity values are related to greater bulky group near green, minor bulky group near yellow, more electrons donating near blue and more electron withdrawing near red regions in all grid potential analysis.

Fig. 3
figure 3

Contour maps of a steric and b electrostatic fields of CoMFA based on the most active compound 40

The yellow contour near R2 substituent demonstrated the activity of compound 21 because the presence of F substituent on the aryl ring is higher than compound 22 with cyanide group here. Also, compound 18 with more bulky group has lower inhibitory activity than compound 19. A favorable steric green contour covering R4 substituent and showing bulky groups at this position is able to increase the activity. This can be expressed by comparing the order of activities in these compounds: 33 > 32 and 38 > 39 (compound 38 has cyclopropyl group instead of methyl group in compound 39 at R4 position). Two small red contours near X substituent at C5 position of the ring shows presence of the electronegative groups are beneficial for the activity. As it is obvious, the compound 29 with F electronegative substituent is more active than the compounds 24 and 26 with H atom. The presence of small blue contour around the R2 substituent expressed that electropositive groups in this area will increase the activity. For example, the activity of compound 20 (pEC50 = 8.22) (N groups as resonance in aromatic ring) is better than compound 19 (pEC50 = 7.41).

Pharmacophore modeling analysis

Using ligand-based pharmacophore modeling, first, common features were obtained by five compounds 20, 21, 31, 40, and 44 as active compounds with the Hip-Hop algorithm. Hydrogen-bond acceptor, hydrogen-bond acceptor lipid, hydrophobic, hydrophobic aromatic, and ring aromatic features were chosen as the initial chemical features in the qualitative pharmacophore generation. The qualitative pharmacophore model indicated which feature of Hydrogen-bond acceptor and hydrophobic existed around active compounds. In brief, the chemical features common to a set of active compounds were estimated by the program and some hypotheses were generated for their activity. Spatial dispositions of pharmacophore features providing the compounds’ relative alignment in the active site of the target are hypotheses. The top ranked solutions, HYP1 (rank = 82.95), had two features; hydrogen bond acceptor and hydrophobic feature. The pharmacophore model which was mapped to the most active compound 40 is shown in Fig. 4. For model validation, 110 GPR 119 agonists associated with EC50 obtained from binding database (http://www.bindingdb.org) which is a large library. Among these compounds, nearly 90 compounds had the two pharmacophore features of the best pharmacophore model. This fact let us use this pharmacophore model for virtual screening. The initial pharmacophore model (HYP1) returned 90 molecules (82% of the database, Table 1S in Supplementary Information). These results validate the HYP1 pharmacophore model as a screening methodology for potent inhibitors of T2DM.

Fig. 4
figure 4

Mapping of the most active compound 40 on the pharmacophore model. These chemical features are color coded with green, hydrogen-bond acceptor and blue, hydrophobic feature

Molecular docking analysis

Molecular docking computations were carried out to discover the better binding conformations of the inhibitors. Many interactions between the more active compounds and amino acids of the active site were observed. The amino acids of the active site are shown in the literature (Negoro et al. 2012). The key interactions between the ligands and the active site confirm the three dimensional quantitative structure activity relationships (3D-QSAR) model results as well. Binding modes of the most active ligand (compound 40) with receptor were investigated and the best docked conformation of this compound is shown in Fig. 5. The Gold program validated docking performance using GOLDscore fitness and Binding energy (kcal/mol) which showed a high reliability of the docking method. Molecular docking study between the most active compound and the receptor indicated hydrogen bonding interactions with three active site residues of Thr86 (O ….. H–N, bond length = 3.2 Å), Trp238 (O ….. H–O, bond length = 2.47 Å), and Ser156 (O….H–O, bond length = 2.32 Å and N…..H–O, bond length = 1.98 Å). Also, there are several van der Waals interactions between Ile136, Ala90, Trp265, and Phe241 residues and more active molecule in the dataset. Furthermore, all 44 compounds associated their GOLDscore values are shown in Table 4. Site-directed mutagenesis of target residues was applied to validation of GPR119 homology modeled. We selected three key residues (THR86, SER156, and TRP238) of interest from the docking area that had H-bond with compound 40 for the site-directed mutagenesis. Then, single, double, and triple mutated of 3 key residues to ALA were done. Compound 40 was docked to all mutants GPR119. Table 2S clearly depicted that all mutant GPR119 had small GOLDscore values than the GPR119 without mutagenesis. Also, there was not any H-bond in docking results of all mutant GPR119 and compound 40.

Fig. 5
figure 5

The best docked conformation of the most active compounds (compound 40), in the binding site of GPR119. Hydrogen bonding interactions as highlighted by the dashed lines in green color formed between compound 40 and HSA

Table 4 Structures, Gold fitness values for hits

Frontier molecular orbital

According to the frontier molecular orbitals theory, HOMO and LUMO energy are two significant indicators of chemical reactivity. The character electron donor and electron acceptor of compound were measured by the HOMO and LUMO energies, respectively. The energy difference between the HOMO and LUMO (gap) is an important factor to represent a simple measure of molecule stability. A small gap value implies high reactivity of molecules in reactions while a large gap value implies high stability of molecules and low reactivity in reactions.

$${\rm{GAP}} = {E_{{\rm{HOMO}}}} - {E_{{\rm{LUMO}}}}$$

HOMO represents the ability to donate an electron while LUMO as an electron acceptor represents the ability to obtain an electron. In Fig. 6, HOMO and LUMO orbitals of the conformer obtained from docking with HOMO–LUMO gap are given. As seen in the figure, in the HOMO, the electrons are mainly delocalized on the nitrogen of the piperazine and pyrimidine ring, and partially on the nitrogen and oxygen of 1,2,4-oxadiazole ring; in the LUMO the electrons are completely delocalized on the pyridine and cyanide group. These observations confirm the obtained results from molecular docking.

Fig. 6
figure 6

The HOMO and LUMO orbitals and the energy levels for the most active molecule (40)

Molecular electrostatic potential

The MEP, V(r), at a given point r(x, y, z) located in the neighborhood of a molecule can be defined in terms of the interaction energy between the electrical charge generated from the molecule electrons and nuclei, as well as a positive test charge (a proton) placed at r. The V(r) values for the system studied were calculated employing the equation (Politzer and Murray 2002)

$$V\left( r \right) = \sum {Z_A}/\left| {{R_A} - r} \right| - \mathop {\smallint }\nolimits^ \rho \left( {r\prime } \right)/\left| {r\prime - r} \right|{d^3}r\prime$$

where Z A is the charge of nucleus A located at R A , ρ(r′) is the electronic density function of the molecule, and r′ is the dummy integration variable.

The MEP is related to the electronic density and can also be used as a highly beneficial descriptor for the determination of sites for electrophilic attack and nucleophilic reactions as well as hydrogen-bonding interactions (Kaufman 1979; Pomelli et al. 2001). The electrostatic potential V(r) is also well-suited for analyzing processes based on the recognition of one molecule from another one as in drug–receptor and enzyme substrate interactions, since it is through their potentials that the two species first ‘see’ each other (Politzer et al. 1985). Defined as a real physical property, V(r) can be determined experimentally by diffraction or by computational methods (Politzer et al. 1981).

MEP was calculated at the B3LYP/6-31G (d,p) optimized geometry so that it was possible to anticipate reactive sites for electrophilic and nucleophilic attack for the title molecule. As shown in Fig. 7, the two regions namely negative (red) and positive (blue) were related to electrophilic and nucleophilic reactivity respectively. As can be seen from the figure, the negative region for electrophilic attack is at pyridine and 1,2,4-oxadiazole ring and oxygen with red and yellow color. The negative regions are localized on the unprotonated nitrogen atom of the pyridine ring, on the oxygen between the pyridine and pyrimidine rings and on the nitrogen and oxygen of 1,2,4-oxadiazole ring that are H-bond acceptor from backbone NH group of the Trp238, backbone OH group of the Thr86 and backbone OH group of the Ser156, respectively in docking. However, there are some positive regions are localized on hydrogen atoms in pyridine and pyrimidine rings that indicated they are possible sites for nucleophilic attack.

Fig. 7
figure 7

MEP map (in a.u.) of the most active molecule (40)

Virtual screening results

The best pharmacophore model was used as a 3D virtual screening query for regaining novel and potent candidates from ZINC chemical database. At first we selected a random collection 124,276 compound from it. The model was filtered using different steps. In first screening step, the compounds of database were obtained by pharmacophore model. A total 7430 compound was retrieved. The compounds of pharmacophore model were screened by pharmacokinetic property filters, Lipinski’s rule of five (Lipinski 2000; Lipinski et al. 2012). Based on Lipinski’s rule of five 264 compound was retrieved. The next step, the yield hit compounds was docked in to the protein structure as a structure—based on screening using a fast screening method, GOLD, within Discovery Studio 2.5 program package.

The structures of the seven hits associated their GOLDscore values are shown in Table 4. The retrieved compounds show new scaffolds and they are useful for drug design. All of seven retrieved compounds had amine group and aromatic ring in their structures which were mapped to the hydrogen bond acceptor and hydrophobic features of pharmacophore model. In addition, six of seven hits have new scaffold. The first hit with high GOLDscore value had two hydrogen bond with Glu261 and Arg81. One of the retrieved compounds with a GOLDscore of 53.56, namely, Vs7, was so similar the dataset compound and it has two hydrogen bond to Thr86 and Arg81. The claim of “discovery of a new scaffold” from virtual screening analysis is supported by having six of seven hits which are not GPR 119 agonists derivatives. We did a search in ChEMBL database (Gaulton et al. 2012) to find the structures retrieved from the virtual screening. Compounds Vs4 (CHEMBL174261), Vs6 (CHEMBL419709), and Vs7 (CHEMBL62803) were in the database. There were several reports of identification Vs4 (Lipinski 1983), Vs6 (Andrea and Kalayeh 1991), and Vs7 (Berlin et al. 2010; Faghih et al. 2002; Gfesser et al. 2004; Vasudevan et al. 2002) with good potency as inhibitor for different target in literature.

We did a search in ChEMBL database to find the structures retrieved from the virtual screening. Compounds Vs4 and Vs7 were in the database. There were two reports inhibitory concentration by compound Vs7 (i) for Histamine H3 receptor with potency of 8.86 nM (ii) for Histamine H1 receptor with potency of 6.56 nM. Also, we saw some reports inhibitory concentration by compound Vs4 (i) potency of 2238.7 nM Ubiquitin carboxyl-terminal hydrolase 2 (ii) potency of 4466.8 nM Microtubule-associated protein tau (iii) potency of 5623.4 nM Beta-glucocerebrosidase.

In silico ADME and toxicity risk assessment analysis

In silico ADME prediction was performed on the hits extracted from ZINC (Koes and Camacho 2012) database. For this purpose pharmacokinetic parameters, logBB, skin-permeability coefficient (logKp), apparent Caco-2 and MDCK permeability that the higher the value of MDCK cell, higher the cell permeability, aqueous solubility (log S), number of metabolic reactions, logK hsa for serum protein binding, human oral absorption in the GI tract, logP for octanol/water which control hydrophilicity or lipophilicity property of molecules, were calculated. The drug with too much hydrophobic property will be badly soluble in the gut and will solve in fat globules. Hydrophilic compounds unlike hydrophobic ones have low logP, low serum protein binding, potent aqueous solubility and bad cell permeability (Jorgensen 2004). Almost, all hits fell in the acceptable ranges for the properties analyzed for 95% of known drugs, Table 5. Additionally, they all confirm drug-like properties according to Lipinski’s rule of 5 (Lipinski 2000; Lipinski et al. 1997), which was second filter applying in the combined virtual screening analysis. All retrieved compounds had amine group in their structures involving effective interactions such as forming hydrogen bonds with receptor. They can pass cell membranes in their non-ionized forms, and they can be solved in water well in their ionized forms (Palm et al. 1997). Toxicity risk assessment parameters and drug relevant properties of hits were within the acceptable interval which valid the hits are as drugs. As it can be seen in Table 6 none of the hits presented a risk of mutagenicity, tumorogenicity, irritating, or reproductive toxicity.

Table 5 The pharmacokinetic parameters of the retrieved hits compound using Qikprop
Table 6 Compliance of the hits with the standard intervals for computational toxicity risk parameters

Conclusion

In this study, 3D-QSAR models with high predictive capabilities were derived for a series of GPR119 agonist using CoMFA method to assist of the design of new potent candidates. Molecular docking was performed to investigate the binding interactions of inhibitors and finding the bioactive conformations. The pharmacophore models have been developed with the aid of Hip-Hop algorithm within Discovery Studio program package. The best pharmacophore model was obtained by two features including hydrogen bond acceptor and hydrophobic feature. The validated pharmacophore model was used as a 3D search query to prescreen the compounds from binding database (http://www.bindingdb.org). The Gold program was used as a fast method docking based screening methods. The MEP energy at DFT level confirms the results from molecular docking. These methods allowed us to select seven molecules with high binding affinity for the treatment of type 2 diabetes from the ZINC database. In silico ADME studies indicated the hits to compare the computed values of ADME descriptors with the accepted ranges. Therefore, “Discovery of a new scaffold” and finding the new inhibitors for the treatment of type 2 diabetes within the hit compounds is possible in near future.