Introduction

Acquired immunodeficiency syndrome (AIDS) is a progressive, degenerative disease of the human immune system caused by the human immunodeficiency virus (HIV). During the initial infection, a person may experience a brief period of illness and fever. As the illness progresses, it increasingly impairs the immune system, making the person much more susceptible to disorders that include opportunistic infections and tumors (Siegal et al. 1981; Sanders et al. 2004; Richman 2001). Human immunodeficiency virus type 1 (HIV-1), a causative pathogen of AIDS, replicates using three essential enzymes encoded by the HIV pol gene: reverse transcriptase (RT), protease (PR), and integrase (IN). These three enzymes play pivotal roles in the HIV-1 life cycle: RT catalyzes the conversion of single-stranded viral RNA into double-stranded proviral DNA (Sarafianos et al. 2009; Anthony 2004); IN is required for the integration of proviral DNA into the host genome before replication (Louis et al. 2007); and PR cleaves newly synthesized polyproteins at the appropriate sites to create the mature proteins. These three enzymes are recognized as the most promising targets for the development of anti-HIV-1 drugs. IN is a 288-amino acid protein (32 kDa) encoded by the C-terminal part of the pol gene of the HIV genome and is essential for viral replication. IN catalyzes the covalent insertion of the reverse-transcribed DNA into the chromosomes of infected cells; the integrated DNA then serves as a template for the transcription of viral genes and replication of the viral genome, leading to the production of new viruses (Brown 1990). IN has three domains: (i) the N-terminal domain (residues 1–49); (ii) the central core domain (residues 50–212), which is involved in DNA substrate recognition; and (iii) the C-terminal domain (residues 213–288), which binds DNA and helps to stabilize the IN-DNA complex (Rezende and Prasad 2004).
Residues Asp-64, Asp-116, and Glu-152 of the catalytic core domain of IN are critical for catalytic activity. For integration and disintegration, the catalytic core domain requires a divalent metal ion (Mg2+ or Mn2+); Mg2+ is the biologically relevant ion for IN activity. The β-diketo acids (DKA) possess metal ion binding properties and inhibit IN by binding to the active site Mg2+ (Huanga et al. 2011). One of the major problems in the treatment of HIV infection is the ease with which the virus can mutate and develop drug resistance. To date there is no vaccine available for AIDS; however, a total of 25 approved antiretroviral drugs are available, which attack four targets: viral entry, RT, PR, and IN. Antiretroviral treatment can slow the course of the disease and may lead to a near-normal life expectancy (Castro et al. 2006). A new class (DKA) of antiretrovirals, raltegravir, elvitegravir, and dolutegravir (GSK1349572) (Fig. 1), inhibits the IN-catalyzed insertion of the HIV-1 genome into cell chromosomes and has emerged as a promising therapeutic option for the treatment of AIDS (Mouscadet et al. 2010). In this study, we performed comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) on substituted hydroxamic acid derivatives as inhibitors of IN. The contour maps of the CoMFA and CoMSIA fields were used to design 10 novel DKA derivatives as IN inhibitors. We predicted the activity of the designed compounds using the generated 3D QSAR models, and their toxicity using OSIRIS Property Explorer. A molecular docking study was performed to investigate the binding mode of the designed compounds, which was compared with that of the standard, raltegravir. Finally, molecular dynamics (MD) simulation studies of the ligand-enzyme complex were carried out to obtain information on its dynamics.

Fig. 1 Chemical structure of inhibitors of HIV-1 IN enzyme

Materials and methods

Dataset

A data set of 71 compounds consisting of azaindole hydroxamic acids, azaindole N-methyl hydroxamic acids, and N-hydroxy-dihydronaphthyridinones was collected from the literature (Plewe et al. 2009; Tanis et al. 2010; Johnson et al. 2011). Chemical structures and activity data are shown in Table 1. IC50 values were converted to pIC50 and subsequently used as the dependent variable for the 3D QSAR study (Table 1). The total set of 71 inhibitors was divided into a training set (51 compounds) for generating the QSAR models and a test set (20 compounds) for validating the quality of the models. Training and test set molecules were selected so that the test set covers a range of inhibitory activity similar to that of the training set and is therefore representative of it. This was achieved by arbitrarily setting aside 20 compounds with regularly distributed biological data as the test set.
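The conversion and the split can be illustrated with a minimal Python sketch. It assumes IC50 values expressed in nanomolar and picks test compounds at even intervals along the activity-ranked list; the function names and this particular selection rule are illustrative assumptions, since the text states only that 20 compounds with regularly distributed activities were set aside.

```python
import math

def pic50_from_nm(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)

def split_by_activity(pic50_by_id, n_test):
    """Hypothetical selection rule: rank compounds by pIC50 and take
    n_test of them at evenly spaced positions, so that the test set
    spans the same activity range as the remaining training set."""
    ranked = sorted(pic50_by_id, key=pic50_by_id.get)
    step = len(ranked) / n_test
    test = [ranked[int(step * i + step / 2)] for i in range(n_test)]
    train = [c for c in ranked if c not in test]
    return train, test

# Compound 32 (IC50 = 2.9 nM) corresponds to a pIC50 of about 8.538.
print(round(pic50_from_nm(2.9), 3))
```

With 71 compounds and `n_test=20`, this rule yields 20 test and 51 training compounds, matching the sizes used here; any other scheme that preserves the activity distribution would serve equally well.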

Table 1 Structures, experimental and predicted inhibitory activities along with residuals of hydroxamic acid derivatives

Computational details

QSAR, molecular docking, calculations, and visualizations of the CoMFA and CoMSIA contour maps were performed using SYBYL X 1.2 software from Tripos Inc., St. Louis, MO, USA. In silico toxicity was predicted using OSIRIS Property Explorer. MD simulation was performed using GROMACS (Berendsen et al. 1995). Compound 32 was selected as the template molecule because of its high IN inhibitory activity. The structures of all other compounds were constructed from the template molecule using the “SKETCH” function in SYBYL. Partial atomic charges were calculated by the Gasteiger–Hückel method, and energy minimization was performed using the Tripos force field with a distance-dependent dielectric and the Powell conjugate gradient algorithm. A minimum gradient difference of 0.05 kcal/(mol Å) was set as the convergence criterion (Clark et al. 1989).

Alignment

Distill (align 1)

Rigid body alignment of the molecules in a Mol2 database was performed using the maximum common substructure (MCS) defined by Distill. The MCS is a group of connected atoms common to all structures used for the alignment and represents their common core. A rigid alignment attempts to align the molecules in a database to a template molecule on this common backbone or core. Compound 32 was used as the template, and all other compounds were aligned to it on the basis of the common structure. The alignment of training and test set compounds using the Distill module is shown in Supplementary Fig. 1.

Pharmacophore-based alignment (align 2)

All the compounds (training and test sets) were used to generate the pharmacophore model with the DISCOtech module. The compounds were aligned on common features according to their position, rotation, and conformation. The generated pharmacophore model contains an acceptor site, an acceptor atom, a donor atom, a donor site, two hydrophobic features, and one aromatic region, which were then used to align the compounds for the QSAR studies (Supplementary Fig. 2).

Docking-based alignment (align 3)

Docking-based alignment was performed using the Surflex-Dock module of SYBYL X 1.2. The X-ray crystallographic structure of the catalytic core domain of IN (Protein Data Bank (PDB) ID: 1QS4) (Goldgur et al. 1999), solved at 2.10 Å resolution, was retrieved from the PDB. Compounds were docked into IN using the flexible docking module implemented in Surflex-Dock. The active conformation was selected from the binding orientations in the active site of IN and evaluated on the basis of the binding free energy score (Surflex-Dock score). Since the best docked geometries (active conformations) of all compounds were in agreement with the crystallographic data of the IN complex, and thus already aligned (Supplementary Fig. 3), they were submitted directly to the QSAR studies.

The statistical results of the three alignment methods are shown in Table 2. Rigid body alignment of the molecules by Distill (align 1) gave the best statistical values compared with align 2 and align 3, so the further QSAR study was carried out using align 1.

Table 2 Statistical parameters of CoMFA and CoMSIA models using PLS analysis

CoMFA and CoMSIA model

In the CoMFA analysis, steric (Lennard-Jones 6-12 potential) and electrostatic (Coulombic potential) field energies were calculated at each lattice point using the Tripos force field with a distance-dependent dielectric and a probe atom having the van der Waals radius of an sp3-hybridized carbon and a +1 charge. A lattice with 2 Å grid spacing extending at least 4 Å in each direction beyond the aligned molecules was used. The steric and electrostatic energy values were truncated at 30.0 kcal/mol. To reduce noise and improve efficiency, column filtering was set to 2.0 kcal/mol. CoMSIA similarity index descriptors were calculated using a dummy sp3-hybridized carbon probe with a +1 charge and a radius of 1.0 Å, as implemented in SYBYL, on the same lattice box and 2 Å grid spacing used for the CoMFA calculations. In addition to steric and electrostatic fields, CoMSIA calculates hydrophobic, hydrogen-bond donor (HBD), and hydrogen-bond acceptor (HBA) fields. Similarity indices were calculated between the probe and each atom of the molecules using a Gaussian distance function. Because the Gaussian function, unlike the Lennard-Jones and Coulombic potentials, has no singularity at the atomic positions, the similarity indices can be calculated at all grid points, both inside and outside the molecular surface.
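The lattice construction and the Gaussian-type similarity index can be sketched in Python as follows. This is a minimal illustration and not the SYBYL implementation: it assumes the commonly cited CoMSIA form, in which the index at a grid point is the negative sum over atoms of probe property × atom property × exp(−α r²), and it assumes an attenuation factor α = 0.3, a value that is not reported in this study. The function names and the atom property values are hypothetical.

```python
import math

def lattice_points(coords, spacing=2.0, margin=4.0):
    """Regular grid that encloses all atoms and extends `margin`
    angstroms beyond them in every direction."""
    axes = []
    for k in range(3):
        lo = min(c[k] for c in coords) - margin
        hi = max(c[k] for c in coords) + margin
        n = math.ceil((hi - lo) / spacing) + 1
        axes.append([lo + i * spacing for i in range(n)])
    return [(x, y, z) for x in axes[0] for y in axes[1] for z in axes[2]]

def similarity_index(grid_point, atoms, probe_value=1.0, alpha=0.3):
    """Gaussian similarity index at one grid point.

    atoms is a list of ((x, y, z), property_value) pairs; alpha is an
    assumed attenuation factor (not taken from the source)."""
    total = 0.0
    for xyz, w in atoms:
        r2 = sum((a - b) ** 2 for a, b in zip(grid_point, xyz))
        total += probe_value * w * math.exp(-alpha * r2)
    return -total

# One atom at the origin: the index stays finite even at the atom itself.
atoms = [((0.0, 0.0, 0.0), 1.0)]
grid = lattice_points([xyz for xyz, _ in atoms])
print(len(grid), similarity_index((0.0, 0.0, 0.0), atoms))
```

The finite value at the atomic position is the practical difference from the CoMFA potentials, which diverge there and therefore require the 30.0 kcal/mol truncation.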

Partial least square (PLS) analysis

CoMFA and CoMSIA models were derived using PLS regression analysis. The calculated CoMFA and CoMSIA descriptors were used as independent variables and the IN inhibitory activity (pIC50) as the dependent variable. PLS analysis was performed using the leave-one-out (LOO) and cross-validation (CV) methods, which give q² and r²cv, respectively, as statistical indices of predictive power. The PLS analysis was then repeated with no validation using the optimal number of components to generate the final CoMFA and CoMSIA models (Cramer III et al. 1988). The non-cross-validated models were assessed by the conventional correlation coefficient (r²), the standard error of estimation (SEE), and F values. A 100-cycle bootstrap analysis was performed to assess the statistical confidence of the derived models; the mean correlation coefficient is reported as the bootstrap r² (r²boot). The predictive r² (r²pred) was based only on the molecules (20 compounds) not included in the training set and is defined as r²pred = (SD − PRESS)/SD, where SD is the sum of the squared deviations between the IN inhibitory activities of the test set molecules and the mean inhibitory activity of the training set molecules, and PRESS is the sum of the squared deviations between the predicted and actual activity values for every molecule in the test set. The training set was initially checked for outliers. Generally, a compound is considered an outlier if the residual between its experimental and predicted pIC50 values is greater than 1 logarithm unit. Examination of the residuals (Table 1) indicated that there is no outlier in the 3D QSAR models.
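The definition of the predictive r² and the outlier criterion translate directly into a few lines of Python. The sketch below only restates the formulas given in the text; the function names and the numbers in the example are illustrative and are not data from Table 1.

```python
def predictive_r2(test_observed, test_predicted, train_mean):
    """r2_pred = (SD - PRESS) / SD.

    SD:    squared deviations of the test-set activities from the
           mean activity of the training set.
    PRESS: squared deviations between predicted and observed
           activities of the test set."""
    sd = sum((y - train_mean) ** 2 for y in test_observed)
    press = sum((yo - yp) ** 2 for yo, yp in zip(test_observed, test_predicted))
    return (sd - press) / sd

def outliers(observed, predicted, threshold=1.0):
    """Indices of compounds whose residual exceeds `threshold` log units."""
    return [i for i, (o, p) in enumerate(zip(observed, predicted))
            if abs(o - p) > threshold]

# Illustrative values only.
obs = [6.0, 7.0, 8.0]
pred = [6.5, 7.0, 7.5]
print(predictive_r2(obs, pred, train_mean=7.0))  # 0.75
print(outliers(obs, pred))                       # []
```

Note that SD is referenced to the training set mean rather than the test set mean, so r²pred can be negative when the model predicts the test set worse than that constant would.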

Docking study

The Surflex–Dock module of SYBYL was used for molecular docking. The structure of the catalytic core domain of IN (PDB ID: 1QS4) (Goldgur et al. 1999), solved at 2.10 Å resolution, was retrieved from the PDB and modified for the docking calculations. During protein preparation, the co-crystallized ligand 1-(5-chloroindol-3-yl)-3-hydroxy-3-(2H-tetrazol-5-yl)-propenone (5-ClTEP) and the water molecules were removed from the core domain, hydrogen atoms were added, and side chains were fixed. The protein structure was minimized using the Tripos force field, and partial atomic charges were calculated by the Gasteiger–Hückel method.

In silico toxicity risk prediction study

In silico toxicities were predicted using OSIRIS Property Explorer, which uses the Chou and Jurs algorithm based on computed atom contributions (Vyas et al. 2015).

MD simulation

MD simulations were performed on the catalytic core domain of IN in complex with designed compound 4d, starting from the docking results, using the GROMACS software package with the standard GROMOS96 force field (Lindahl et al. 2001). The topology file and other force field parameters for 4d, except the charges, were generated using the PRODRG program (Schuttelkopf and van Aalten 2004). The enzyme-ligand complex was neutralized by replacing water molecules with sodium and chloride ions. The energy of the complex was minimized using the steepest descent algorithm implemented in GROMACS. Then, a 100 ps position-restrained simulation was carried out with restraints on 1QS4. Finally, a 1 ns MD simulation was performed following 100 ps NVT (constant number of particles, volume, and temperature) and 100 ps NPT (constant number of particles, pressure, and temperature) equilibration phases, with periodic boundary conditions applied in all three dimensions. 1QS4, water molecules, and the ligand were coupled separately to a temperature bath at 300 K. The pressure coupling was set as independent in the x and y directions, with a constant pressure of 1 bar. The particle mesh Ewald method (Darden et al. 1993) was used for long-range electrostatics.

Results and discussion

Results of the CoMFA and CoMSIA analysis (Align 1)

The statistical parameters of the standard CoMFA model constructed with steric and electrostatic fields are given in Table 2. The q², r²cv, r²pred, r²ncv, F, and SEE values were computed as defined in SYBYL. PLS analysis gave a q² of 0.664 and an r²cv of 0.691. The non-cross-validated PLS analysis gave a conventional r² of 0.951, an F value of 347, and an SEE of 0.031 with 6 components. The steric and electrostatic fields contributed 0.506 and 0.494, respectively, indicating that the steric field contributed slightly more to the binding affinity. A high bootstrapped r² (0.972) and a low standard deviation (0.011) suggested a high degree of confidence in the analysis. The predictive r² (r²pred) of the CoMFA model is 0.689. The predicted and experimental pIC50 values and the residuals are listed in Table 1, and the correlation between the predicted and experimental pIC50 values of the training and test sets is depicted in Supplementary Fig. 4S(A). CoMSIA models were generated using steric, electrostatic, hydrophobic, HBD, and HBA fields, and the results are summarized in Table 2. The CoMSIA model showed a higher conventional correlation than the CoMFA model, together with good predictive properties. The statistically significant CoMSIA model gave a q² of 0.643, an r²cv of 0.661, an r² of 0.971, an F value of 480, and an SEE of 0.021 with 6 components. The corresponding field contributions are 0.220 (steric), 0.238 (electrostatic), 0.141 (hydrophobic), 0.252 (HBD), and 0.149 (HBA). The predictive r² (r²pred) of the CoMSIA model is 0.654. The plot of experimental versus predicted pIC50 values of the training and test sets is depicted in Supplementary Fig. 4S(B).

CoMFA contour maps

The CoMFA and CoMSIA electrostatic and steric contour maps are broadly similar, and both models showed good predictive ability. Compound 32 is shown inside the field to illustrate the contours generated from the CoMFA and CoMSIA maps. Favored and disfavored levels were fixed at 80 and 20%, respectively. In the CoMFA steric contour map (Fig. 2a), the 4-F-Bn ring substituted on the pyrrole ring of reference molecule 32 lies in a green contour (80% contribution), which indicates that bulky substitution is favorable in this region. The difference in activity between 29 (IC50 = 32 nM) and 30 (IC50 = 18 nM) is probably due to the sterically favored additional fluoro group at the 2 position of the pyridine ring in 30. The same reasoning may explain the difference in activity between molecules 31 (IC50 = 50 nM) and 33 (IC50 = 17 nM). In the case of inactive molecules, the extension of various groups might have led to the loss of inhibitory activity, as these groups were oriented away from the green contour. A second, small green steric contour was found at the methylene carbon of the dihydronaphthyridinone ring system, indicating a favorable effect of steric bulk. This can be seen by comparing the acyclic N-methyl hydroxamate 23 (IC50 = 250 nM) with the cyclic N-hydroxy-dihydronaphthyridinone 32 (IC50 = 2.9 nM), which is nearly 100-fold more potent than 23. One sterically unfavorable contour (yellow) was observed near the pyrrole ring, suggesting that bulky groups in this region might decrease HIV-1 IN inhibitory activity. Another large unfavorable yellow region flanked the N-hydroxy-dihydronaphthyridinone ring, suggesting a definite structural requirement for a substructure of appropriate shape to exhibit good inhibitory activity. For example, molecules 21 (IC50 = 250 nM) and 22 (IC50 = 356 nM) are less active than 8 (IC50 = 84 nM). 
The reason may be that 22 has an –OCH3 group at the R2 position and 21 has a –CH3 group at the R1 position, both of which fill the yellow contour space, whereas 8 has –OH at R2 and –H at R1. The CoMFA electrostatic contour map is shown in Fig. 2b. A large blue contour (80% contribution) was observed around the methylene linker between the phenyl and pyrrole rings, indicating that electropositive groups in this region would increase the activity. A second large blue contour was observed in the region of the C3,7,8,9 (3,7,8,9-tetrahydro) positions of the naphthyridinone ring system. In the present dataset, cyclization of 23 (250 nM) into the tricyclic 32 (2.9 nM) resulted in a more active compound. One large red contour covered the nitrogen and oxygen atoms of the dihydronaphthyridinone ring, indicating that electronegative atoms such as oxygen and nitrogen at these positions are required for good inhibitory activity.

Fig. 2 CoMFA contour maps. Compound 32 is shown inside the field. a CoMFA steric contour map: green contours (80% contribution) indicate sterically favored regions and yellow contours (20% contribution) indicate disfavored regions. b CoMFA electrostatic contour map: red contours (20% contribution) indicate regions where negatively charged substituents are favored and blue contours (80% contribution) indicate regions where negatively charged substituents are disfavored (color figure online)

CoMSIA contour maps

CoMSIA contribution maps denote those areas within the specified region where the presence of a group with a particular physicochemical property is favored or disfavored for good inhibitory activity. CoMSIA calculates both steric and electrostatic fields, as in CoMFA, but additionally uses hydrophobic, HBD, and HBA fields. Favored and disfavored levels were fixed at 80 and 20%, respectively. The CoMSIA hydrophobic contour map is shown in Fig. 3a, represented by yellow (80% contribution) and gray (20% contribution) contours. Yellow contours indicate regions where hydrophobic groups on the ligands are favored, and gray contours represent areas where hydrophobic groups are unfavored (that is, where hydrophilic groups are favorable). One large yellow contour covering the 4-F-Bn and pyrrole rings indicated that hydrophobic character in this region might increase the activity. This can be seen by comparing the structures of 8 (84 nM) and 9 (137 nM): 9 contains a 4-F-Ph ring, which is less hydrophobic than the 2,4-F-Ph ring of 8 in this region. The difference in activity between 16 (812 nM) and 17 (1620 nM) was attributed to the presence of the hydrophobically favored –F group at the 4 position of the phenyl ring in 17, whereas 18 does not contain an –F group at this position. One hydrophobically unfavored gray contour covered the 7-hydroxy and C6 (–C=O) positions of the N-hydroxy-dihydronaphthyridinone ring, revealing the need for hydrophilic groups in this region to increase the activity. The differences in activity between 21 (250 nM), 22 (356 nM), and 8 (84 nM) are probably due to the –CH3 and –OCH3 groups of 21 and 22, respectively, which lie in the region of the gray contour. The hydrophobically unfavored gray contour was in agreement with the sterically unfavored yellow contour. The graphical interpretation of the HBD interactions in the CoMSIA model is shown in Fig. 3b. 
Cyan contours (80% contribution) indicate regions where HBD substituents on the ligands are favored, and purple contours (20% contribution) represent areas where HBD properties are disfavored. Table 2 shows that the HBD field made the largest contribution to the CoMSIA model. In the HBD contour map, one cyan area was observed near the 7-hydroxy position of the N-hydroxy-dihydronaphthyridinone ring; the 7-hydroxy group accounts for the favorable cyan isopleth in proximity to this area. This is indeed the case for compounds 21 (250 nM) and 22 (356 nM). A purple polyhedron near the carbonyl oxygen at the C6 (–C=O) position of the N-hydroxy-dihydronaphthyridinone ring is indicative of a disfavored HBD region. The graphical interpretation of the HBA interactions in the CoMSIA model is shown in Fig. 3c. Magenta and red contours represent areas where HBA groups are favored and disfavored, respectively. A large magenta contour was present near the carbonyl oxygen at the C6 (–C=O) position of the N-hydroxy-dihydronaphthyridinone ring. The carbonyl oxygen can act as an H-bond acceptor, indicating that an HBA group in this region interacts favorably and enhances inhibitory activity. One large red contour was observed away from the molecular area, indicating that HBA groups have no significance in this region. Analysis of the CoMFA and CoMSIA contour plots provided enough information to understand the binding mode between the inhibitors and the binding site of the IN enzyme.

Fig. 3 CoMSIA contour maps. Compound 32 is shown inside the field. a Hydrophobic field, b HBD field, and c HBA field. Yellow and gray contours indicate regions where hydrophobic groups are favored and disfavored, respectively. Cyan contours represent areas where HBD is favored and purple contours represent areas where HBD is disfavored. Magenta and red contours represent areas where HBA is favored and disfavored, respectively. Favored and disfavored levels are fixed at 80 and 20%, respectively (color figure online)

Structure-based interpretation of CoMFA and CoMSIA contour maps

The statistical results of the docking-based alignment showed good values of q², r², and r²cv (Table 2). Therefore, in order to map the generated contours onto the active site of the IN enzyme, the docking-based alignment was also used to generate CoMFA and CoMSIA contour maps. Superimposing the CoMFA and CoMSIA contour maps on the active site of IN indicates which amino acids interact with the favorable and unfavorable regions of the ligand. The steric CoMFA contour plot with compound 32 in the active site of IN is shown in Fig. 4a. A large green contour covered the 4-F-Bn ring substituted on the pyrrole ring and was found in the vicinity of Phe-181, which suggested favorable π–π interactions at this position. Another large green contour was found near the aromatic amino acids Tyr-83, Trp-61, and Trp-108. A bulky, sterically favored group in this region is well tolerated because of aromatic hydrophobic interactions between substituents on the ligand and the aromatic amino acids. The CoMFA electrostatic contour map (Fig. 4b) was projected together with the negatively and positively charged amino acids in order to observe the favorable and unfavorable interactions due to charged amino acids in the vicinity of the active site of IN. One large blue contour was found outside the molecular area in the vicinity of His-67 and Lys-71. A second large blue contour was found in the vicinity of Glu-152 and Glu-170, and a third, small blue contour was located near Glu-87, which suggested favorable ionic interactions between the ligand and negatively charged amino acids of the active site. Two small red contours were observed in the area of the positively charged amino acids His-171 and Lys-173, which are important residues for interactions with electronegative functional groups on the ligand. The CoMSIA hydrophobic plot (Fig. 5a) is represented by yellow (hydrophobic favored) and gray (hydrophobic disfavored) contours. 
One large hydrophobically favored yellow contour was found near the hydrophobic amino acids Leu-68 and Val-72, and another small yellow contour was present on the 4-F group of the benzyl ring in the vicinity of Leu-158, which indicated that hydrophobic interactions between the ligand and the active site of the enzyme are favorable for good inhibitory activity. Unfavorable gray contours were flanked by the polar amino acids Thr-66 and Ser-153. The CoMSIA HBD and HBA contour maps are shown in Fig. 5b. For a clearer view, only the HBD and HBA amino acid residues of the active site, i.e., Asp-64, Glu-85, Glu-87, Ser-147, Gln-148, Glu-152, and Gln-177, were retained for the superimposition of the contour maps on the active site of IN. Two favorable HBD cyan contours, along with an unfavorable HBA red contour, were found in the vicinity of Gln-177 and Asp-64. These amino acids may form H-bonds with favorable H-bond donor functional groups on the ligand. Similarly, two HBA-favorable magenta contours and two HBD-unfavorable purple contours were found near Ser-147, which can donate an H atom and form an H-bond with the ligand.

Fig. 4 Superimposition of CoMFA contour maps on the IN active site. a Steric contour map shown in green (sterically favored) and yellow (sterically disfavored). b Electrostatic contour map shown in red (negative charge favored) and blue (positive charge favored) (color figure online)

Fig. 5 Superimposition of CoMSIA contour maps on the IN active site. a Hydrophobic contour map shown in yellow (hydrophobic favored) and gray (hydrophobic disfavored). b HBD field shown in cyan (HBD favored) and purple (HBD disfavored), and HBA field shown in magenta (HBA favored) and red (HBA disfavored). Compound 32 is shown inside the field in the active site of the IN enzyme (color figure online)

Generation of new leads

We applied the results drawn from the 3D QSAR analysis to the design of novel DKA derivatives as IN inhibitors (Fig. 6). A total of 10 hydroxamic acid derivatives (1d–10d), which have a structural profile similar to that of the studied compounds, were designed on the basis of the contour maps generated in the 3D QSAR study. The main strategy in the design of these lead compounds was the introduction of functional groups found to be significant in the contour map analysis. The –F group at the C4 position of the terminal phenyl ring was replaced with –OCF3, –CF3, and –OCH3 groups in the designed compounds to increase bulkiness and hydrophobicity in this region. The methylene spacer between the phenyl and pyrrole ring systems was replaced by an ethylene spacer in some of the designed compounds. The tricyclic N-hydroxy-dihydronaphthyridinone ring was retained in the designed compounds. To increase H-bonding interactions with the active site of IN, and in consideration of the hydrophobically unfavored region, the 1-alkoxymethyl and 1-aminomethyl substituents on the pyrrole ring were replaced with –OH and –CONH2 groups.

Fig. 6 Strategy for the design of hydroxamic acid derivatives using CoMFA and CoMSIA contour maps (color figure online)

Prediction of the pIC50 of the generated leads

Each of the designed compounds was evaluated with the generated 3D QSAR models, and the biological activity values (pIC50) were predicted using both the CoMFA and CoMSIA models (Table 3). Analysis of the predicted activities showed that an –OCH3 group on the terminal phenyl ring and a –CONH2 group on the pyrrole ring are crucial for increasing the predicted activity of the designed compounds. Compounds 4d and 6d showed better predicted activity than the other designed compounds (Table 3). The pIC50 of 4d (CoMFA predicted = 8.948) was slightly higher than that of template molecule 32 (pIC50 = 8.538), which indicated that substitution with a methoxy group increased the predicted activity. The designed compounds have the structural and pharmaceutical features required of potential IN inhibitors.
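To put the difference between the two pIC50 values on a concentration scale, the short Python sketch below converts a pIC50 back to an IC50 in nanomolar and expresses the difference as a fold change. The function names are illustrative, and the result for 4d is a model prediction, not a measured value.

```python
def ic50_nm_from_pic50(pic50):
    """Convert pIC50 = -log10(IC50 in mol/L) back to an IC50 in nanomolar."""
    return 10 ** (9 - pic50)

def fold_change(pic50_a, pic50_b):
    """Ratio IC50(b) / IC50(a): how many times more potent a is than b."""
    return 10 ** (pic50_a - pic50_b)

# CoMFA-predicted pIC50 of 4d versus the pIC50 of template 32.
print(round(ic50_nm_from_pic50(8.948), 2))  # about 1.13 nM
print(round(fold_change(8.948, 8.538), 2))  # about 2.57-fold
```

A difference of 0.41 log units therefore corresponds to a predicted potency gain of roughly 2.6-fold, which is well within the 1 log unit residual threshold used for outlier detection and should be read with that uncertainty in mind.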

Table 3 Structure of designed compounds with predicted activity and docking score

Molecular docking analysis

To study the binding mode of the designed molecules in the catalytic core domain of IN, we performed a molecular docking study with the designed compounds along with the reference standard raltegravir. Surflex–Dock uses an empirically derived scoring function that is based on the binding affinities of enzyme-ligand complexes and on their X-ray structures. The protomol, a key element of the docking algorithm, is a computational representation of idealized ligands that interact with the binding site. The catalytic core domain of IN was retrieved from the PDB. After running Surflex–Dock, the scores of the active docked conformers were ranked in a molecular spreadsheet. We selected the conformers with the best total score and examined their detailed binding patterns in the cavity. The total Surflex–Dock score (docking score), which represents binding affinity and includes hydrophobic, polar, repulsive, entropic, and solvation terms, is given in Table 3.

Binding mode of IN inhibitors (4d and raltegravir)

Docking results suggested that most of the designed compounds have a common binding mode in the vicinity of the active site of subunit A of IN, lying between the three catalytic acidic residues Asp-64, Asp-116, and Glu-152. The designed compounds showed interactions with several residues near the active site, including Lys-156, Lys-159, Gln-148, and Tyr-143, which seem to play a key role in binding the viral DNA substrate (Jenkins et al. 1997; Esposito and Craigie 1998). Residues Lys-156, Lys-159, and Gln-148 are important for the binding of inhibitors in the active site. In the catalytic core domain of IN in complex with 5-ClTEP, the Mg2+ ion was coordinated to two of the catalytic triad residues, Asp-64 and Asp-116. It was observed that ligand binding did not displace the bound Mg2+ ion. The overall binding of 4d and raltegravir in the catalytic core domain of IN is illustrated in Fig. 7. Docking results showed that compound 4d has the highest docking score (5.35) and formed two hydrogen bonds (Fig. 7a). The –N atom of the naphthyridin-6-one ring formed one H-bond with the ε-amino group (NH3 +) attached to the fifth carbon beginning from the α-carbon of Lys-159 (N· · ·HNH Lys-156 2.01 Å). The oxygen atom of the amide group (–CONH2) substituted on the pyrrole ring formed a second H-bond with the ε-amino group (NH3 +) of Lys-156 (C=O· · · HNH Lys-159 2.07 Å). The hydrophobic methoxyphenyl ring was found in contact with Gln-148, which is an important residue for ligand binding. These residues are involved in the binding of the ligand to the catalytic core domain of IN. Raltegravir, the reference molecule, formed three hydrogen bonds with Asp-64 and one H-bond with Lys-159, with a docking score of 4.81 (Fig. 7b). The docking study thus revealed the interaction mechanism and a reasonable binding mode of the designed compounds in the catalytic core domain of IN.

Fig. 7 Binding mode of a ligand 4d and b raltegravir in the catalytic core domain of IN proposed by docking studies using Surflex-Dock. Ligands are shown as ball-and-stick models colored by atom, and H-bond interactions are highlighted with yellow lines. The labeled protein residues are shown as capped sticks colored by atom (color figure online)

Toxicity prediction study

We selected compounds 4d and 6d as the most suitable candidates (best predicted pIC50 and docking score) for the in silico toxicity risk prediction study. The toxicity risk assessment relies on a precomputed set of structural fragments that give rise to toxicity alerts when they are encountered in a structure. Mutagenicity, tumorigenicity, irritant effects, and reproductive effects were predicted in the toxicity risk assessment. The results revealed that neither compound is at risk of tumorigenicity, irritant effects, or reproductive effects, but both are at medium risk of mutagenicity because of the 2-(1-oxo)-nitrosopyridine fragment present in their structures.

MD simulation

MD simulation studies were initiated from the energy-minimized configurations (Supplementary Fig. 5). We first equilibrated the solvent and ions around the protein. Supplementary Fig. 6 showed that the system reached the target temperature of 300 K and stayed there for the remainder of the equilibration (average temperature 299.65 K). Once the correct temperature had been reached (based on kinetic energies), we applied pressure to the system until it reached the proper density. The pressure fluctuated widely over the course of the 100 ps equilibration phase (Supplementary Fig. 7), with an average value of 1.04 bar. The average density over the 100 ps was 1008.2 kg m−3, close to the experimental value of 1000 kg m−3. The density values were very stable over time, indicating that the system was well equilibrated with respect to pressure and density (Supplementary Fig. 8). The root-mean-square deviation (RMSD) plots for the backbone and the ligand (4d) are shown in Fig. 8 to examine conformational variations of the system. Both plots (Fig. 8) showed that the RMSD leveled off at ~0.1 nm (1 Å) after 0.4 ns and remained steady thereafter, indicating that, after an early rise in the deviation of the 4d atoms, 4d reached an equilibrium state. The RMSD plots showed that the structure is very stable, with low and steady RMSD values. We also measured the radius of gyration (Rg) of the protein as an indicator of its compactness. A stably folded protein maintains a relatively steady Rg, whereas the Rg of an unfolding protein changes over time. From Supplementary Fig. 6 it can be seen that the Rg values were invariant and that the protein remained very stable in its compact (folded) form at 300 K. The simulations revealed that the enzyme-ligand complex remained very close to its crystal structure throughout the simulation, which implies a very stable configuration.
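For readers unfamiliar with the two stability measures, the Python sketch below shows how RMSD and Rg are defined for a single frame. It is a simplified illustration with hypothetical function names: it assumes that the two coordinate sets list the same atoms in the same order and omits the least-squares superposition onto the reference structure that trajectory analysis tools normally apply before computing RMSD.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered
    coordinate sets, without any prior superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    sq = sum(sum((a - b) ** 2 for a, b in zip(p, q))
             for p, q in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))

def radius_of_gyration(coords, masses):
    """Mass-weighted root-mean-square distance of the atoms from
    their center of mass."""
    total = sum(masses)
    com = [sum(m * c[k] for c, m in zip(coords, masses)) / total
           for k in range(3)]
    sq = sum(m * sum((c[k] - com[k]) ** 2 for k in range(3))
             for c, m in zip(coords, masses))
    return math.sqrt(sq / total)

# Toy frame (nm): every atom displaced by 0.1 nm along z.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
print(rmsd(ref, frame))
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, 1.0]))
```

Evaluating both quantities for every saved frame gives time series like those discussed here; a plateau in RMSD together with a steady Rg is the usual signature of an equilibrated, folded complex.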

Fig. 8 RMSD plots derived from MD simulations: a ligand 4d bound to 1QS4, b backbone atoms of 1QS4

Conclusions

IN has emerged as an attractive target for the development of novel anti-HIV agents. In the present study, CoMFA and CoMSIA models with good predictive ability were generated and used both to design novel DKA derivatives and to predict the activity of the designed compounds. A molecular docking study was performed with the designed compounds in the catalytic core domain of IN. The designed compounds showed interactions with crucial residues in the active site of IN. In silico toxicities were predicted for the best-ranked designed compounds. Finally, MD simulations were performed to obtain information on the dynamics of the ligand-enzyme complex in a state close to natural conditions. Designed compound 4d was stable in the catalytic core domain of IN and showed low and steady RMSD values, indicating a stable complex during the MD simulations. Hence, we conclude that the combination of these modeling methods can be used to design new hits, which may be good leads for the development of novel IN inhibitors for the treatment of AIDS.