Introduction

Originally identified as a modulator of glycogen metabolism about 20 years ago, glycogen synthase kinase 3β (GSK-3β) is now known to be a Ser/Thr protein kinase with key regulatory roles in signal transduction in a variety of pathways. These include the initiation of protein synthesis, cell proliferation, cell differentiation, and apoptosis. This kinase is also essential for embryonic development [1–4]. In humans, two genes encode the related GSK-3 isoforms GSK-3α and GSK-3β, which exhibit approximately 98% sequence identity within their catalytic domains.

Many different kinds of GSK-3 inhibitors have been studied [4–27]. Our attention was directed to the discovery of GSK-3β inhibitors for possible use in the treatment of a number of CNS disorders, including Alzheimer's disease, Parkinson's disease, bipolar disorders, and traumatic brain injury. Our work in this area was influenced by the maleimide-bearing natural product staurosporine [19, 24].

In our previous paper, we reported the chemical synthesis and biological activities of a number of substituted maleimides as inhibitors of GSK-3β and also examined their selectivity against CDK2/cyclinE [28]. In this paper, we report our study of the molecular modeling and docking of the inhibitors into the binding site of GSK-3β, together with 3D quantitative structure-activity relationships (3D-QSAR) derived using comparative molecular field analysis (CoMFA) [29–31] and comparative molecular similarity indices analysis (CoMSIA) [32]. A specific aim of this study is to identify the correct binding mode of the substituted maleimides using computer-aided molecular modeling techniques. Fifty-one 3-benzofuranyl-4-indolyl-maleimide-based GSK-3β inhibitors of structural type I are included in the present work. Two possible binding modes are examined to determine the correct interaction mode of these compounds with the enzyme. The superpositions for the two alignments were obtained by docking the inhibitors into the known X-ray crystal structure of GSK-3β (1R0E), in which a ligand similar to our inhibitors is bound.

Results and discussion

Studies on the binding mode of the inhibitors

To study the binding mode of the inhibitors, we chose to use 3D-QSAR methodologies. For 3D-QSAR studies employing either CoMFA or CoMSIA, all compounds need to be superimposed under the assumption that they bind in a similar manner to the same binding site. Different methods have been used in the literature for the superposition of the compounds of interest. We decided to dock the inhibitors into the binding site of the GSK-3β protein and to use the docked conformations of the inhibitors in our CoMFA and CoMSIA studies. In previous publications from this laboratory, we assumed that the binding mode of the substituted maleimides, either indol-3-yl-(indazol-3-yl)maleimides or benzofuran-3-yl-(indol-3-yl)maleimides, is similar to that found for staurosporine in its X-ray co-crystal structure with GSK-3β (1Q3D) [33].

In this study, we reinvestigated the possible binding mode of the benzofuran-3-yl-(indol-3-yl)maleimides (I) to GSK-3β in an effort to develop a potent and selective GSK-3β inhibitor. To find relevant information about the binding mode and conformation of the inhibitors, we first examined the X-ray crystal structures of GSK-3β currently available in the RCSB Protein Data Bank [34]. Table 1 lists the X-ray structures of the GSK-3β complexes that were examined. Four of the eight ligands in Table 1 are similar to our GSK-3β inhibitors.

Table 1 Known GSK-3β X-ray structures

Examination of the X-ray crystal structures of GSK-3β in Table 1 revealed that there are roughly two types of GSK-3β structures with respect to Phe67: one is 1R0E-like (in yellow), and the other is 1Q4L-like (in orange) (Fig. 1a). Between these two extreme structures, there are intermediate ones like that represented by the 1Q41 structure (Fig. 1b, in pink). The changes in position of the Phe67 residue are due to the differences in the conformation of the Gly-rich loop observed in essentially all Ser/Thr and Tyr protein kinase structures [35–37]. Another major change in the conformations observed among these structures is the movement of the Arg141 side chain (see the discussion below). The side chain movements of Arg141 in the GSK-3β structures can be seen at the lower left corner of Fig. 1b.

Fig. 1

X-ray crystal structures of the ligand-bound GSK-3β structures listed in Table 1. (a) Two approximate groups of GSK-3β structures are shown with respect to the residue Phe67: one is 1R0E-like (yellow), and the other is 1Q4L-like (orange). The two groups of Phe67 positions are illustrated by the two different positions of the phenyl ring shown in the ball-and-stick model in Fig. 1a (labeled 1 and 2 in Fig. 1b). (b) Between the two extreme structures shown in (a), there are intermediate ones represented by 1Q41 in pink (labeled 3 in Fig. 1b). The Phe67 phenyl ring of these intermediate GSK-3β structures (ball-and-stick model in pink) lies between the two phenyl ring positions shown in Fig. 1a. Multiple conformations of Arg141 can be seen at the lower left corner. (c) Binding mode of staurosporine (1Q3D). (d) Binding modes of II (1R0E), III (2OW3), and IV (1Q4L) from the superimposed GSK-3β structures, showing the similar binding conformations of the three compounds. (e) Binding modes of II and IV, showing the similar binding conformations of these two compounds

The binding mode of staurosporine, the compound that initially influenced the design of our inhibitors, is shown in Fig. 1c. The X-ray crystal structure shows that the binding pose of staurosporine is guided by two adjacent intermolecular hydrogen bonds of the pyrrolidin-2-one moiety. The NH group of the pyrrolidin-2-one ring of staurosporine forms a hydrogen bond to the backbone carbonyl oxygen of Asp133, and the carbonyl oxygen of the same ring forms a hydrogen bond to the backbone NH of Val135. Hydrophobic side chains of Leu132 and Lys85 are present around the methylene group (carbon 7 indicated in the staurosporine structure) of staurosporine. In addition, two water molecules connect the carbonyl oxygen of the pyrrolidin-2-one moiety to the nearby Glu97. The two indole rings of staurosporine are held fixed by the phenyl group connecting them, and so are their binding positions.

Figure 1d shows the binding modes of II, III, and IV in Table 1. Compound III has two indole rings as does staurosporine, whereas II has only one indole ring, and IV has an aminophenyl and a phenyl ring instead of the two indole rings. The two indole rings in III are partially rigidified through macrocycle formation encompassing the two indole rings. Even though the two indole rings are semi-rigid, the binding conformations of the two indole rings of III are significantly different from that of staurosporine, and similar to those of II and IV.

Figure 1e shows the binding modes of II and IV. The indole ring conformation of II is very similar to that of III as well as to that of the aminophenyl ring of IV. The conformations of the phenyl rings of II and IV are similar to one another.

Based on the X-ray structures of II, III, and IV, we proposed that the binding mode of the benzofuran-3-yl-(indol-3-yl)maleimides (I) in Table 6 would be similar to those of II, III, and IV. This is contrary to the binding mode suggested in the earlier publications for 3-indolyl-4-indazolylmaleimides or benzofuran-3-yl-(indol-3-yl)maleimides [19, 24]. Since the indole ring of II is on the left side and the substituted phenyl ring of II or IV is on the right side in Fig. 1d, we assumed that the indole ring of the maleimides in Table 6 is on the left and the benzofuran ring is on the right in this view. Therefore, all 51 maleimides in Table 6 were docked into the binding site of GSK-3β (1R0E) in this postulated manner. We chose the 1R0E structure of GSK-3β for our docking study because its bound ligand is the most similar to our inhibitors. The binding conformation of the benzofuran ring could be similar to that of the indole ring of III if the substituent on the benzofuran ring is not large. However, if the substituent on the benzofuran ring is sufficiently bulky, there is not enough space for the conformation observed for III. In such a case, the benzofuran ring would be rotated by 180°. For the purpose of the CoMFA and CoMSIA studies, the conformation with the rotated benzofuran ring was selected, because some of our inhibitors have a larger substituent than can be accommodated by the pocket present in the unrotated binding conformation. When a large substituent is present at the X5 position of the indole ring of the benzofuran-3-yl-(indol-3-yl)maleimide, the substituent would clash sterically with Phe67, which would result in movement of the glycine-rich loop. This movement would in turn shift Phe67 from its position in the 1R0E-like structure to its position in the 1Q4L-like structure (see Fig. 1a). For consistency, all the compounds listed in Table 6 were docked in this conformation to the GSK-3β binding site (see further discussion below). The initial conformation of each compound was manually superimposed on the pyrrolidin-2-one ring of the ligand bound in the GSK-3β structure (1R0E), for the reason discussed above.

Even though the binding mode of the compounds in Table 6 is likely to be the one (binding mode 1) described above, we decided to investigate other possible binding modes, especially in light of the fact that a different binding conformation was previously proposed [19, 24, 56]. First, we examined the relative conformational energy of different possible binding conformations of an unsubstituted benzofuran-3-yl-(indol-3-yl)maleimide to the GSK-3β binding pocket. Figure 2 shows eight different binding modes with four different conformations that are possible in the GSK-3β binding site. In 2a and 2b in Fig. 2, the positions of the indole and the benzofuran rings are switched. With respect to the indole and the benzofuran ring positions, the conformations 2a and 2b in Fig. 2 are the same when only the ligand is considered, but they would be different in the binding site. The conformational energies of these conformations are summarized in Table 2.

Fig. 2

Four representative conformations (1–4) of benzofuran-3-yl-(indol-3-yl)maleimides in binding mode 1 (a) and 2 (b)

Table 2 Relative conformational energy of four representative conformations of 3-(benzofuran-3-yl)-4-(indol-3-yl)maleimides shown in Fig. 2

The conformational energy differences among these four conformers are very small, suggesting that any of these conformers may represent the actual binding conformation to GSK-3β. Nonetheless, it is interesting to note that conformation 2a (and 2b), which is the one believed to represent the likely binding mode of the compounds in Table 6, has the lowest conformational energy. Among the eight possible binding modes shown in Fig. 2, binding modes 2a and 2b are the two most likely, based on the analysis of the known ligand-bound GSK-3β X-ray crystal structures discussed above. Therefore, we chose to investigate 3D-QSAR based on these two binding modes in order to determine the binding mode of these compounds.

Comparative molecular field analysis

The 51 compounds included in this study are listed in Table 6 along with their IC50 values toward GSK-3β. To determine inhibitory potency, commercially available human GSK-3β was assayed for its ability to phosphorylate the primed peptide substrate (RRRPASVPPSPSLSRHSS(P)HQRR; 10 μM) in the presence of 0–10 μM of the maleimides [57]. The inhibitory potency is expressed as pIC50, the negative logarithm of the IC50 value. Therefore, the larger the pIC50 value, the more potent the compound is as an inhibitor of the kinase. The two binding modes (2a and 2b in Fig. 2) of these compounds were obtained by docking each molecule into the binding site of GSK-3β starting from two different initial binding conformations as described above.
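
The conversion from IC50 to pIC50 described above can be sketched as follows. This is a minimal illustration that assumes the IC50 is supplied in nanomolar and that the logarithm is taken base 10 of the molar concentration; the function name and the example values are hypothetical and are not taken from Table 6.

```python
import math


def pic50_from_ic50_nM(ic50_nM: float) -> float:
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in nM.

    Assumption: base-10 logarithm of the molar concentration.
    """
    if ic50_nM <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nM * 1e-9)


# Illustrative values only (not taken from Table 6).
for ic50 in (0.5, 1.0, 100.0):
    print(f"IC50 = {ic50:g} nM -> pIC50 = {pic50_from_ic50_nM(ic50):.2f}")
```

Under this convention, a sub-nanomolar inhibitor has a pIC50 above 9, and a tenfold gain in potency raises the pIC50 by one unit.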

I. CoMFA for binding mode 1

Figure 3 shows all the compounds superimposed in the docked conformation of binding mode 1. Binding mode 1 corresponds to the conformation 2a shown in Fig. 2. The best CoMFA model obtained for the 51 substituted maleimides is a three-component model from the steric and electrostatic fields with the following statistics (see Table 3): R2(cv) = 0.386 and SE(cv) = 0.854 for the cross-validation, and R2 = 0.811 and SE = 0.475 for the fitted model. F (3,47) = 67.034, and Prob. of R2 = 0 (3,47) = 0.000. The steric contribution to the inhibitory potency of these maleimide analogs described by this model is 48%, whereas the electrostatic contribution is 52%. The first component explains 51% of the variance in the pIC50, and the second and the third components account for an additional 22% and 8% of the variance, respectively. An essentially identical CoMFA model was obtained when the steric and the electrostatic fields were considered separately.
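
As a rough consistency check on the statistics quoted above, the F value can be recomputed from the fitted R2, the number of compounds, and the number of components using the standard regression relation F = (R2/k) / ((1 − R2)/(n − k − 1)). The sketch below assumes this relation applies to the reported model; the function name is hypothetical, and the small difference from the reported 67.034 is expected because R2 is quoted to three decimals.

```python
def f_statistic(r2: float, n_compounds: int, n_components: int) -> float:
    """Regression F value for a model with n_components latent variables."""
    df_model = n_components
    df_resid = n_compounds - n_components - 1  # 51 - 3 - 1 = 47
    return (r2 / df_model) / ((1.0 - r2) / df_resid)


# Values quoted in the text for the CoMFA model of binding mode 1.
f_value = f_statistic(r2=0.811, n_compounds=51, n_components=3)
print(f"F(3,47) recomputed from rounded R2: {f_value:.1f}")
```

The same function can be applied to the other models in Tables 3 and 5 by substituting their R2 values and component counts.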

Fig. 3

Superposition of 51 substituted maleimides as GSK-3β inhibitors obtained from docking into the binding site of GSK-3β structure (1R0E) in binding mode 1

Table 3 CoMFA models for 51 substituted maleimides from the binding modes 1 and 2

In our previous unpublished study, the classical QSAR result shown below was obtained [28]. It is interesting that the statistical quality (R2 and SE) of the classical QSAR and the 3D-QSAR method using the CoMFA methodology is similar. One compound (compound 1 in Table 6) was treated as an outlier in the classical QSAR, but this compound was included in the CoMFA study.

\(\mathrm{pIC}_{50} = -0.60(\pm 0.18)\,\pi_{\mathrm{Y6}} + 0.51(\pm 0.12)\,(\pi_{\mathrm{Y6}})^{2} + 1.78(\pm 0.21)\,\pi_{\mathrm{X5}} - 0.07(\pm 0.01)\,(\mathrm{ClogP})^{2} + 0.10(\pm 0.02)\,\mathrm{MR}_{\mathrm{r}} - 0.86(\pm 0.17)\,(\pi_{\mathrm{X5}})^{2} - 1.18(\pm 0.38)\,\sigma_{\mathrm{p},\mathrm{X6}} + 0.49(\pm 0.24)\,\pi_{\mathrm{R}} + 0.34(\pm 0.24)\,\pi_{\mathrm{X7}} + 7.60(\pm 0.30),\) N = 50, R2 = 0.842, RMSE = 0.436
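
For readers who wish to evaluate the classical QSAR equation numerically, a minimal sketch is given here. The coefficients are copied from the equation above; the argument names are hypothetical labels for the descriptors, and the descriptor values passed in the example are illustrative placeholders, not values for any compound in Table 6.

```python
def classical_qsar_pic50(pi_y6=0.0, pi_x5=0.0, clogp=0.0, mr_r=0.0,
                         sigma_p_x6=0.0, pi_r=0.0, pi_x7=0.0):
    """Evaluate the classical QSAR equation for a set of descriptor values.

    Only the central coefficient estimates are used; the +/- terms in the
    equation are uncertainties and are ignored here.
    """
    return (-0.60 * pi_y6
            + 0.51 * pi_y6 ** 2
            + 1.78 * pi_x5
            - 0.07 * clogp ** 2
            + 0.10 * mr_r
            - 0.86 * pi_x5 ** 2
            - 1.18 * sigma_p_x6
            + 0.49 * pi_r
            + 0.34 * pi_x7
            + 7.60)


# Placeholder descriptor values, chosen only to exercise the function.
print(round(classical_qsar_pic50(pi_x5=1.0), 2))
```

Because the equation is quadratic in the X5 hydrophobicity term, the predicted potency passes through a maximum as that descriptor increases, which is the behavior such a parabolic term is meant to capture.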

Figure 4 is the coefficient contour map of the three-component model derived from all 51 compounds. In this contour map, the sterically favored regions are shown in green, and the sterically disfavored regions are shown in yellow. The positive electrostatic contours are shown in blue, and the negative electrostatic contours are shown in red. Table 6 lists the observed pIC50 values along with the values calculated from this three-component CoMFA model. It is interesting to note that there is a sterically favored region in the steric contour map (colored in green in Fig. 4a) near the Y2 position of the benzofuran-3-yl-(indol-3-yl)maleimide. The hydrophobic residue Leu132 and the four methylene groups of Lys85 are located near the X2 group and the adjacent carbonyl group of the pyrrolidin-2-one ring. This indicates that a hydrophobic group at this position may improve the inhibitory potency of the compound. As is the case for the binding orientation of staurosporine observed in Fig. 1c (1Q3D), the binding orientation of the GSK-3β inhibitors in Table 6 is fixed by the two hydrogen bonds involving the pyrrolidin-2-one rings of the inhibitors. One of the carbonyl oxygens (the left one in Fig. 3) of the pyrrolidin-2-one ring forms a hydrogen bond with the backbone NH of Val135, and the NH group of the pyrrolidin-2-one ring interacts with the backbone carbonyl group of Asp133. Both residues are in the hinge region of GSK-3β.

Fig. 4

(a) Steric contour of the three-component CoMFA model from the binding mode 1. The regions in green represent sterically favored, whereas the regions in yellow represent disfavored. (b) Electrostatic contour of the three-component CoMFA model from the binding mode 1. The regions in blue represent electrostatically favored, whereas the regions in red represent disfavored

Four compounds (1–4) in Table 6 have sub-nanomolar IC50 values. Compounds 2, 3, and 4 have a 6-CH2OH at the Y6 position, and compound 1 has a 7-CH2OMe at the X7 position. There are other compounds with 6-CH2OH at the Y6 position among the compounds in Table 6. These compounds are in general potent inhibitors. Two compounds (7 and 8) have a 7-CH2OH substituent at the X7 position, similar to the 7-CH2OMe of compound 1.

Figure 5 shows, as representative cases, the binding site amino acid residues around the 6-CH2OH group at the Y6 position of compound 2 and around the 7-CH2OH group at the X7 position of compound 8. The 7-CH2OH group of compound 8 is within hydrogen-bonding distance of the side chain of Arg141, whereas the 6-CH2OH group of compound 2 is within hydrogen-bonding distance of the side chain of Gln185.

Fig. 5

Docked binding mode of compound 2 (a) and compound 8 (b) in the binding site of the GSK-3β structure in binding mode 1

Whereas the 7-CH2OH group can act as a hydrogen-bond donor as well as a hydrogen-bond acceptor, the 7-CH2OMe group can act only as a hydrogen-bond acceptor. Because the 7-CH2OMe or 7-CH2OH group would interact with Arg141 and should therefore act as a hydrogen-bond acceptor, the 7-CH2OMe group would be preferred to 7-CH2OH at this position if hydrogen-bonding interactions are the major factor. In addition, if the 7-CH2OH or 7-CH2OMe group interacts with the hydrophobic methylene groups of Arg141, the 7-CH2OMe group would again be preferred to 7-CH2OH. Comparison of compounds 1 and 8 shows that this is indeed the case.

Figure 6 is a plot of the observed and the calculated pIC50 values from the three-component CoMFA model (Eq. 1 in Table 3) from the steric and electrostatic fields.

Fig. 6

Plot of the observed versus the calculated pIC50 values from the three-component CoMFA model of the ligand-binding mode 1

II. CoMFA for binding mode 2

Figure 7 shows all the compounds superimposed in the docked conformation of binding mode 2. Binding mode 2 corresponds to the conformation 2b shown in Fig. 2. The best CoMFA model for the 51 substituted maleimides obtained is a three-component model from the steric and electrostatic fields with the following statistics (see Table 3): R2(cv) = 0.296 and SE(cv) = 0.915 for the cross-validation, and R2 = 0.784 and SE = 0.507 for the fitted. F (3,47) = 56.784, and Prob. of R2 = 0 (3,47) = 0.000. The steric contribution of these maleimide analogs toward their inhibitory potency as described by this model is 49%, whereas the electrostatic portion is 51%. The first component explains 51% of the variance in the pIC50, and the second and the third components account for additional 16% and 12% of the variance, respectively. An essentially identical CoMFA model was obtained when the steric and the electrostatic fields were considered separately.

Fig. 7

Superposition of 51 substituted maleimides as GSK-3β inhibitors obtained from docking into the binding site of GSK-3β structure (1R0E) in binding mode 2

Comparison of two binding modes with CoMFA results

It is of interest to examine whether the CoMFA results from the two different binding modes of the compounds listed in Table 6 can be used to distinguish the two modes and to select the correct one. For the purposes of CoMFA in particular, the two binding modes of the inhibitors alone are not significantly different; the entire molecule is rotated by about 180° in the binding pocket, as can be seen in Figs. 3 and 7. The differences between the two superpositions are due to the different X- and Y-substituents and the environment of the GSK-3β binding site.

The CoMFA results summarized in Table 3 show that the superposition from binding mode 1 (Eq. 1) accounts for more of the variation in the pIC50 values of the compounds studied than the superposition from binding mode 2 (Eq. 2): Eq. 1 explains 81% of the variance, whereas Eq. 2 explains 78%. Even though the difference in the variance explained by the two models is not large, the statistics (SE and R2 for both the cross-validation and the fitted model) indicate that binding mode 1 is better than binding mode 2. The relatively similar results are due to the similar superpositions of the two binding modes. It is interesting, in this case, that the correct and the incorrect binding modes (or superpositions) yielded similar CoMFA results, even though the correct binding mode gave the better results. In CoMFA, a model having R2(cv) > 0.3 is usually considered significant. The CoMFA model for binding mode 2, with R2(cv) = 0.296, is therefore at the borderline of statistical significance. Overall, the results indicate that binding mode 1 would be favored over binding mode 2 for the benzofuran-3-yl-(indol-3-yl)maleimides examined in this study.

Further validation of CoMFA model from binding mode 1

To further validate the CoMFA model derived from binding mode 1, the inhibitory potency values (pIC50) were scrambled and used to develop CoMFA models. Each resulting model was then compared with the CoMFA model developed using the correct pIC50 values. Such procedures have been used to test the robustness of a derived CoMFA model. Five different scrambled pIC50 data sets were used in this validation procedure (see Table 4 for further details). The statistics for the CoMFA models developed using the scrambled pIC50 values for the 51 substituted maleimides are summarized in Table 4. None of the scrambled pIC50 data sets yielded a statistically significant CoMFA model. These results provide additional support for the CoMFA model derived from binding mode 1 of these compounds.
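
The scrambling procedure can be illustrated with a small self-contained sketch. It is not the CoMFA workflow itself: a one-descriptor least-squares line stands in for the PLS model on CoMFA fields, the data are synthetic, and the leave-one-out cross-validated R2 is computed with the usual definition 1 − PRESS/SS. All names and values below are assumptions made for illustration.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_q2(x, y):
    """Leave-one-out cross-validated R2: 1 - PRESS / total sum of squares."""
    n = len(x)
    press = 0.0
    for i in range(n):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    my = sum(y) / n
    return 1.0 - press / sum((yi - my) ** 2 for yi in y)


# Synthetic stand-in data: one descriptor and a noisy linear "pIC50".
x = [float(i) for i in range(20)]
y = [6.0 + 0.15 * xi + 0.2 * math.sin(xi) for xi in x]

q2_true = loo_q2(x, y)

# Five scrambled activity vectors, mirroring the five data sets in the text.
rng = random.Random(0)
q2_scrambled = []
for _ in range(5):
    y_perm = y[:]
    rng.shuffle(y_perm)
    q2_scrambled.append(loo_q2(x, y_perm))

print(f"q2 (true activities): {q2_true:.2f}")
print("q2 (scrambled):", [round(q, 2) for q in q2_scrambled])
```

A model that captures a real structure-activity relationship should lose its cross-validated performance once the activities are permuted, which is the pattern summarized in Table 4.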

Table 4 Validation of CoMFA and CoMSIA models for 51 substituted maleimides from binding mode 1 using scrambling pIC50 values

Comparative molecular similarity index analysis

The two binding modes of the compounds (Figs. 2, 5) in Table 6 used in the CoMFA studies described above were also used to study 3D-QSAR using the CoMSIA approach. The CoMSIA results obtained from binding modes 1 and 2 are summarized in Table 5.

Table 5 CoMSIA models for 51 substituted maleimides from binding modes 1 and 2

I. CoMSIA for binding mode 1

The best CoMSIA model obtained from the 51 substituted maleimides in Table 6 is a three-component model from the steric and electrostatic fields with the following statistics (see Table 5): R2(cv) = 0.414 and SE(cv) = 0.835 for the cross-validation, and R2 = 0.746 and SE = 0.550 for the fitted model. F (3,47) = 45.952, and Prob. of R2 = 0 (3,47) = 0.000. The steric contribution to the inhibitory potency of the maleimide analogs described by this model is 25%, whereas the electrostatic contribution is 75%. The first component explains 53% of the variance in the pIC50, and the second and the third components account for an additional 14% and 8% of the variance, respectively. An essentially identical CoMSIA model was obtained when the steric and the electrostatic fields were considered separately. Addition of hydrophobic and/or hydrogen-bond donor or acceptor components did not improve the correlation already obtained (however, see further discussion below). It is interesting to note that the calculated pIC50 value of compound 1 has the largest deviation (1.258) from the observed pIC50 value. Compound 1 was also found to be an outlier in the classical QSAR discussed above. In this respect, the CoMSIA results are similar to the classical QSAR.

Table 6 Inhibitory potency of 51 GSK-3β inhibitors included in this study

As in the validation of the CoMFA model derived from binding mode 1, the inhibitory potency values (pIC50) were scrambled and used to develop CoMSIA models. The same five scrambled pIC50 data sets used for the validation of the CoMFA model were used in this validation procedure. The results are also summarized in Table 4. None of the scrambled pIC50 data sets yielded a statistically significant CoMSIA model.

Figure 8 is the coefficient contour map of the three-component CoMSIA model derived from the 51 compounds. In this contour map, the sterically favored regions are shown in green. The positive electrostatic contours are shown in blue. The model indicated that there are no sterically disfavored regions and no electrostatically negative regions at this contour level. Table 6 lists the observed pIC50 values along with the values calculated from this three-component CoMSIA model. Figure 9 is a plot of the observed and the calculated pIC50 values from this model.

Fig. 8

(a) Steric contour map (70% level) of the three-component CoMSIA model. (b) Electrostatic contour map (30% level) of the corresponding three-component CoMSIA model in the ligand-binding mode 1

Fig. 9

Plot of the observed versus the calculated pIC50 values from the three-component CoMSIA model from the steric and electrostatic fields of the ligand-binding mode 1

II. CoMSIA for binding mode 2

The best CoMSIA model obtained from 51 substituted maleimides in Table 6 is a two-component model from the steric and the electrostatic fields with the following statistics (see Table 5): R2(cv) = 0.281 and SE(cv) = 0.915 for the cross-validation, and R2 = 0.650 and SE = 0.638 for the fitted. F (2,48) = 44.89, and Prob. of R2 = 0 (2,48) = 0.000. The steric portion of the influences of maleimide analogs for the inhibitory potency described by this model is 22%, whereas the electrostatic portion is 78%. The first component explains 53% of the variance in the pIC50, and the second component accounts for additional 13% of the variance. An essentially identical CoMSIA model was obtained when the steric and the electrostatic fields were considered separately. Addition of hydrophobic and/or hydrogen donor or acceptor components did not improve the correlation already obtained.

Comparison of two binding modes with CoMSIA results

In light of the similar CoMFA results from the two different binding modes of the compounds studied, it is interesting to examine the CoMSIA results with respect to the two different binding modes.

The CoMSIA results (Eqs. 3, 4, 5, and 6) summarized in Table 5 show that the superposition from binding mode 1 (Eqs. 3 and 4) accounts for more of the variation in the pIC50 values of the compounds studied than the superposition from binding mode 2 (Eqs. 5 and 6). These results are consistent with those of the CoMFA discussed above. The differences in statistics between the two CoMSIA results are larger than those between the corresponding CoMFA results. The statistics of the CoMSIA analyses (SE and R2 for both the cross-validation and the fitted model) also indicate that binding mode 1 explains the variation in pIC50 better than binding mode 2. The results further support binding mode 1 as the correct binding mode of the compounds studied. Another interesting aspect of the CoMSIA results is that Eqs. 4 and 6 indicate a possible hydrophobic contribution of the substituents to the observed pIC50 values, as in the classical QSAR discussed above. It was previously reported that the steric contribution in CoMFA may include the hydrophobic contribution [38–43]. The present CoMSIA results indicate a separate hydrophobic contribution, even though the steric contribution may itself include part of the hydrophobic contribution of the substituents, as in the case of CoMFA.

As in CoMFA, a model having R2(cv) > 0.3 may be considered significant in CoMSIA. By this standard, the CoMSIA model for binding mode 2 may not be statistically significant. The results of CoMSIA are thus consistent with those of CoMFA, and both indicate that binding mode 1 would be favored over binding mode 2 for the benzofuran-3-yl-(indol-3-yl)maleimides examined in this study.
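
The comparison of the four steric-plus-electrostatic models against the R2(cv) > 0.3 criterion can be laid out compactly, as in the sketch below. The R2(cv) values are those quoted in the text; the dictionary layout and variable names are simply a convenient, hypothetical way to tabulate them.

```python
# Cross-validated R2 values quoted in the text for the steric + electrostatic models.
r2_cv = {
    "CoMFA, binding mode 1": 0.386,
    "CoMFA, binding mode 2": 0.296,
    "CoMSIA, binding mode 1": 0.414,
    "CoMSIA, binding mode 2": 0.281,
}

THRESHOLD = 0.3  # significance criterion used in the text

passing = [name for name, q2 in r2_cv.items() if q2 > THRESHOLD]

for name, q2 in r2_cv.items():
    status = "above" if q2 > THRESHOLD else "below"
    print(f"{name}: R2(cv) = {q2:.3f} ({status} {THRESHOLD})")
```

Only the two binding mode 1 models clear the threshold, which is the basis for the preference stated above.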

Comparison of 3D-QSAR results with the current X-ray structures

While this manuscript was in preparation, preliminary ligand-bound X-ray crystal structures of GSK-3β became available. It is interesting to compare the present 3D-QSAR results with the preliminary X-ray crystal structures of two GSK-3β inhibitors, namely compound 5 and compound 14. Compound 5 has Br as X5 and (CH2)3OH as R. Compound 14 has 5-cyclopropylethynyl as X5, F as Y5, and CH3 as R. Even though there are clear differences in the substituent pattern of these two compounds, the initial X-ray crystal structures could not readily discern the relative positions of the indole ring and the benzofuran ring. In fact, in the initial X-ray results the positions of these two rings were thought to be switched and to be similar to binding mode 2. This initial observation was not consistent with the 3D-QSAR results. In the updated X-ray crystal structures of these two compounds, however, it was determined that the positions of the two rings are consistent with binding mode 1.

The CoMFA results indicate that binding mode 1 is preferred over binding mode 2. The same is true for the CoMSIA results. The CoMSIA results indicate more clearly that binding mode 1 better accounts for the variance in pIC50 than does binding mode 2, even though their statistics are inferior to those of CoMFA.

Figure 10 shows a superposition of the docked conformation and the current X-ray crystal structures of compounds 5 and 14. Figure 10a shows that the docked conformation (shown in purple) and the current X-ray crystal structure (shown in cyan) of compound 5 are essentially identical. Although the docked conformation (shown in green) and the X-ray crystal structure (shown in orange) of compound 14 superimpose closely, Fig. 10b shows that there is some movement in the binding pocket of GSK-3β (see the further discussion below of the flexible binding pocket in relation to the binding mode of compound 14).

Fig. 10

Superposition of the docked and the preliminary X-ray crystal structures of compounds 5 (a) and 14 (b). In Fig. 10a, the docked structure of compound 5 is shown in purple, while the X-ray crystal structure of the same compound is shown in cyan. In Fig. 10b, the docked structure of compound 14 is shown in green, while the X-ray crystal structure of the same compound is shown in orange

Unexplained pIC50 portions by the 3D-QSAR results

The current CoMFA and CoMSIA models account for about 80% and 75% of the variance in the pIC50 values, respectively (R2 = 0.81 for CoMFA and 0.75 for CoMSIA), and the corresponding SE values are 0.48 and 0.55. The unexplained portion or the outliers of a QSAR can be very important and interesting, especially when the observed biological activity is higher than that predicted by the QSAR model. Unexplained portions or outliers may imply several possibilities in addition to experimental error. They may imply that the QSAR lacks certain descriptors needed to describe the entire group of compounds studied, or that the mathematical model or approach is not appropriate. The outliers or unexplained portions may also be due to inappropriate calculation of the parameter values used, may indicate a different mechanism of action, or may result from a different binding mode or a flexible binding site [44, 45].

One likely source of the unexplained component of the pIC50 values in the present case, which is about 20% of the variance, is the flexible binding pocket shown in Fig. 1b. Depending on the size of the ligand, the flexible glycine-rich loop of GSK-3β may change its conformation, as indicated by the different positions of Phe67 in the various X-ray crystal structures (see also the discussion of Fig. 10b and compound 14). Another possible source is side chain flexibility, as shown by the different side chain conformations of Arg141 (see the lower left corner of Fig. 1b). The third possibility is the binding mode of the benzofuran ring. The binding pocket where the benzofuran ring binds is large enough to accommodate the unsubstituted benzofuran ring in two different positions. One is the position shown by the binding of the corresponding indole ring in staurosporine (Fig. 1c, 1Q3D) or in the bis-(indole)maleimide pyridinophane (Fig. 1d, 2OW3), and the other is the one seen in the conformation used in the current CoMFA and CoMSIA studies. Interestingly, two different benzofuran binding conformations were observed in the compound 14-bound GSK-3β X-ray crystal structure, as shown in Fig. 11.

Fig. 11

Two different binding modes of the benzofuran ring of compound 14 observed in the X-ray crystal structure of the ligand-bound GSK-3β

In QSAR, a compound is normally considered an outlier if the difference between its observed and calculated activity values is greater than twice the standard error of the model. Table 6 shows that one compound (compound 7; X = 7-CH2OH, Y = 6-CH2OH, and R = CH3) is an outlier of the CoMFA model derived from binding mode 1. The calculated pIC50 value of this compound is 9.31, whereas the observed pIC50 value is 8.29. The difference of 1.02 log units exceeds twice the SE of the CoMFA model (2 × 0.48 = 0.96). Thus, the compound is calculated to be a more potent binder than observed. The discrepancy between the calculated and the observed inhibitory potencies of this compound could arise from any of the sources described above, as well as from possible experimental error, and suggests that this compound might be studied further.
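The outlier criterion described above can be sketched in a few lines of Python. The function name `is_outlier` is hypothetical; the numbers are the observed and CoMFA-calculated pIC50 values of compound 7 and the CoMFA standard error (0.48) quoted in the text.

```python
def is_outlier(observed, calculated, standard_error, factor=2.0):
    """Flag a compound whose absolute residual exceeds factor * SE of the model."""
    return abs(observed - calculated) > factor * standard_error

# Compound 7, CoMFA model from binding mode 1 (values quoted in the text).
observed_pic50 = 8.29
calculated_pic50 = 9.31
comfa_se = 0.48

residual = observed_pic50 - calculated_pic50   # negative: potency is over-predicted
print(round(residual, 2), is_outlier(observed_pic50, calculated_pic50, comfa_se))
```

The sign of the residual is kept separately from the test itself, since an over-predicted and an under-predicted compound may call for different follow-up.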

Unlike the CoMFA model, the corresponding CoMSIA model from binding mode 1 yielded five outliers (compounds 1, 7, 10, 12, and 41). The larger number of outliers shows that the CoMSIA model does not describe the observed pIC50 values of the set studied as well as the corresponding CoMFA model does. However, it is interesting that the CoMSIA model also suggests that compound 7 would be a more potent binder than observed; the difference between the two values is again greater than 1.0 log unit. This result is consistent with that of CoMFA.

Further utilization of the 3D-QSAR results

The aim of this study was to identify the binding mode of the substituted maleimides (I) to the binding site of GSK-3β. Understanding the binding modes of compounds under study is critical in drug discovery research. Utilizing the 3D-QSAR methodologies of both CoMFA and CoMSIA, the possible binding mode of the maleimides of interest was determined in this study. The CoMFA model was further validated statistically using the scrambled pIC50 values. The suggested binding modes of these compounds were further supported by the two preliminary X-ray crystal structures of inhibitor-bound GSK-3β.

Even though the major aim of our study was accomplished, it was interesting to test the predictive ability of the CoMFA model developed from binding mode 1. Two compounds were synthesized, and their pIC50 values were estimated using the final CoMFA and CoMSIA models while they were being tested for their biological activity. One compound (52) has X5 = Cl, X6 = OMe, R = CH3, and the other (53) has X7 = CH2OMe, Y6 = CH2OH, R = CH3. The calculated pIC50 values from the CoMFA model are 6.13 for compound 52 and 7.67 for compound 53. The calculated pIC50 values from the CoMSIA model are 5.95 for compound 52 and 5.77 for compound 53. The experimentally determined pIC50 values are 7.09 (IC50 = 81.4 nM) and 9.14 (IC50 = 0.73 nM) for compounds 52 and 53, respectively. The experimentally determined pIC50 values are higher than the calculated values for both compounds from both the CoMFA and CoMSIA methods. Although compound 53 is not the most potent inhibitor in this series, it is still highly potent. It provides an example of utilizing the binding mode and 3D-QSAR of these GSK-3β inhibitors in our drug discovery research. The CoMFA predictions are closer to the experimental values than the CoMSIA predictions.
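The relationship between the IC50 and pIC50 values quoted above, and the size of the prediction errors, can be checked with a short Python sketch. It assumes the usual definition pIC50 = -log10(IC50 in mol/L); the function and dictionary names are hypothetical, while the numbers are those given in the text.

```python
import math

def pic50_from_nanomolar(ic50_nm):
    """Convert an IC50 in nM to pIC50, assuming pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)

# Experimental IC50 values (nM) and model estimates quoted in the text.
measured_ic50_nm = {52: 81.4, 53: 0.73}
calculated_pic50 = {
    "CoMFA": {52: 6.13, 53: 7.67},
    "CoMSIA": {52: 5.95, 53: 5.77},
}

for compound, ic50_nm in measured_ic50_nm.items():
    observed = round(pic50_from_nanomolar(ic50_nm), 2)
    for model, estimates in calculated_pic50.items():
        # Positive residual: the compound is more potent than the model estimated.
        residual = observed - estimates[compound]
        print(f"compound {compound}, {model}: observed {observed}, residual {residual:+.2f}")
```

All four residuals are positive, and for each compound the CoMFA residual is the smaller of the two.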

Although we were delighted to see the higher inhibitory potencies of the newly synthesized compounds, it was also puzzling to see the poor predictive ability of both 3D-QSAR models, at least for these two compounds. For compound 52 (5-Cl, 6-OCH3), one can compare the results with those of compound 39 (5-OCH3, 6-Cl). The observed pIC50 of compound 39 is 6.36, whereas the calculated values are 6.54 and 6.51 from CoMFA and CoMSIA, respectively. Thus, while both 3D-QSAR models predicted the pIC50 value of compound 39 well, they did not predict the pIC50 value of compound 52 well. The results indicate that the current models do not describe the effects of these structural changes on the inhibitory potencies. However, inclusion of these two compounds in the further development of the CoMFA and CoMSIA models could improve the predictive ability of both models for future compounds containing such structural modifications. If a 3D-QSAR model does not contain certain structural information, it is not surprising that the model is generally unable to predict the activity of a compound embodying such structural modifications [30].

Summary and conclusions

The binding modes of GSK-3β inhibitors have been studied with molecular modeling and docking methods along with 3D-QSAR approaches. The approaches of CoMFA and CoMSIA were used for 3D-QSAR with 51 substituted benzofuran-3-yl-(indol-3-yl)maleimides as GSK-3β inhibitors.

Two binding modes of our inhibitors to the binding pocket of GSK-3β were investigated. Binding mode 1 yielded better CoMFA and CoMSIA correlations. The binding mode determined by the results of this study is consistent with the preliminary results of an X-ray crystal structure analysis of inhibitor-bound GSK-3β. This study shows that the 3D-QSAR methodologies are useful in identifying the correct binding modes of the substituted benzofuran-3-yl-(indol-3-yl)maleimides to GSK-3β. These models will be updated with additional compounds and used in our continued work to estimate the inhibitory potency of other novel GSK-3β inhibitors of this structural class.

Several possible sources of the component of the pIC50 values left unexplained by the 3D-QSAR models are discussed.

The present study provides the first example of identifying the correct binding mode of GSK-3β inhibitors using the molecular modeling, docking, and 3D-QSAR approaches.

Experimental section

The publicly available protein structures used in this study and listed in Table 1 were obtained from the RCSB Protein Data Bank [34].

Two binding modes of the 51 compounds in the GSK-3β binding site were obtained by docking each compound into the binding site (see the discussions in the text) starting from the initial 2a and 2b conformations in Table 2.

The ligands were manually docked into the binding site of GSK-3β (1R0E). The initial binding position of the ligands was set by superimposing the pyrrolidin-2-one ring of the inhibitors over the corresponding ring of the ligand in the ligand-bound X-ray crystal structure of GSK-3β, 1R0E. Initial docking conformations of the substituted indole ring and the substituted benzofuran ring of the inhibitors were set to be similar to the two conformations (2a and 2b) shown in Fig. 2. The orientation of each side chain of the molecules was set in such a way that the substituents would exhibit minimal steric clashes with any amino acid residues of the protein, but would be able to engage in possible hydrogen bonding interactions with nearby amino acid residues.

The geometry optimizations of the ligand-bound GSK-3β complexes were then performed using the molecular modeling software Sybyl version 7.3 of Tripos. The optimization of the protein-ligand complex was done by the Powell method without any initial optimization, using the MMFF94 force field, the Gasteiger-Marsili charges, a constant dielectric function, an NB cut-off of 8.0, and a dielectric constant of 1.0. Default settings were used for all other parameters, with termination when the gradient reached 0.05 kcal mol-1. The maximum number of iterations for the geometry optimizations was set to 1000.

The 3D-QSAR models were developed using the techniques of CoMFA and CoMSIA available in the Sybyl software package. The superpositions of the inhibitors used for each CoMFA and CoMSIA model were those of the docked positions and conformations obtained as described above. The CoMFA and CoMSIA modules of the molecular modeling software Sybyl version 7.3 of Tripos were used for these 3D-QSAR analyses. Default settings were used for all parameters, with CH3+ as the probe, a 2-angstrom lattice box, and the Gasteiger-Marsili charges. The leave-one-out method was used for the cross-validation step. The PLS analysis was done using the SAMPLS method available through Sybyl. The selections of the final CoMFA and CoMSIA models were based on the results of the cross-validation, and are summarized in Table 3 (CoMFA) and Table 5 (CoMSIA). Further validation of the CoMFA and CoMSIA models from binding mode 1 using scrambled pIC50 values is summarized in Table 4.
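The two validation steps mentioned above, leave-one-out cross-validation and scrambling of the pIC50 values, can be illustrated with a small Python sketch. This is not the SAMPLS/PLS procedure of Sybyl: as a stated simplification, a one-descriptor least-squares line stands in for the PLS model, and the descriptor values, activities, and function names are all hypothetical. Only the logic of the two checks is meant to carry over.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the PLS model (assumption)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total

def scrambled_q2(xs, ys, n_trials=20, seed=0):
    """q2 values obtained after randomly reassigning activities to compounds."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        shuffled = list(ys)
        rng.shuffle(shuffled)
        results.append(loo_q2(xs, shuffled))
    return results

# Hypothetical descriptor values and pIC50 values, for illustration only.
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
activity = [6.1, 6.4, 7.0, 7.2, 7.9, 8.1, 8.8, 9.0]
print(loo_q2(descriptor, activity))
print(max(scrambled_q2(descriptor, activity)))
```

A model that captures a real structure-activity relationship is expected to give a clearly higher q2 for the true activities than for the scrambled ones.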

All the figures were generated using the UCSF Chimera molecular modeling program, production version 1 [46].