Introduction

Structure–activity relationship (SAR) and quantitative structure–activity relationship (QSAR) studies play a significant role in drug discovery and development. SAR/QSAR development often yields outliers. A decade after we first drew attention to outliers in SAR and QSAR [1], many were detected during QSAR development [2], as noted in previous literature and the C-QSAR database [3]. We suggested that outliers in SAR and QSAR could result from distinctive binding modes or from the flexibility of the binding site, even if the compounds involved were structural congeners [1, 4]. We also discussed the importance of considering the role of water molecules in protein–ligand interactions and QSAR studies [2]. In these studies, “outliers” referred to compounds that showed unexpected biological activity and could not be fitted by a derived QSAR model, as described by Verma and Hansch [5].

In the course of these studies, we searched the C-QSAR database with the queries ‘carbonic anhydrase inhibitor,’ ‘elastase inhibitor,’ and ‘rhinovirus inhibitor.’ Among the 260 equations retrieved from the searches, 19 QSAR equations caught our attention (Table S1, Supplementary Material). These 19 QSARs showed an inverted parabolic or bilinear relationship with ClogP or CMR. Because the majority of the QSARs in the C-QSAR database had a normal parabolic or bilinear relationship, such inverted parabolic or bilinear QSARs were deemed atypical correlations.

Inverted parabolic or bilinear equations with ClogP, CMR, or a similar parameter indicated that the biological activity initially decreased as the parameter increased. At the inversion point, however, the biological activity stopped decreasing and began to increase as the parameter increased further. It was suggested in the literature that such inverted relationships were due to allosteric interactions involving conformational change [6,7,8].

In this study, we examined whether such inverted parabolic or bilinear correlations were associated with the allosteric interactions. In addition, we examined whether allosteric interactions can be a source of outliers in some SAR and QSARs. Only the inverted parabola/bilinear QSARs were suggested as “allosteric QSARs” by Verma and Hansch [5,6,7,8,9,10]. We investigated whether linear and normal parabola/bilinear QSARs could equally describe allosteric interactions.

Material and method

RCSB protein data bank searching

The RCSB protein data bank [11, 12] was searched with the query ‘glycogen phosphorylase.’ The multiple sequence alignments for protein structure comparison were completed utilizing the Clustal Omega program of the EMBL-EBI [13] and/or the UCSF Chimera molecular modeling program (version 1.14) [14].

Molecular graphics

All the figures were generated using the UCSF Chimera molecular modeling program using the multiple sequence alignments obtained from the Clustal Omega of EMBL-EBI described above or the structure comparison tool of Chimera. All the figures included in this paper were generated from the corresponding X-ray crystal structures after multiple sequence alignments.

Multiple regression analysis for QSAR derivatization

The QSAR equations presented in this paper were derived using the C-QSAR program of Biobyte [15]. Most of the physicochemical parameters and structural descriptors were auto-loaded by the C-QSAR program. CPI was the calculated hydrophobic parameter for the substituents. CMR was the calculated molar refractivity for the molecule or substituents and depended on the volume and polarizability. MgVol was the molar volume for the molecule. An indicator variable was assigned the value of one or zero for special features with special effects that could not be parameterized. Further details of each indicator variable are given where it is used. Each regression equation includes the 95% confidence limits for each term in parentheses.

In these QSAR equations, n was the number of data points, r² was the squared correlation coefficient that showed the goodness of fit, q² was the corresponding measure of the goodness of leave-one-out prediction, and s was the standard deviation.
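To make these statistics concrete, the following is a minimal sketch for a one-descriptor linear QSAR. It is not the C-QSAR implementation: the function names, the example data, the use of n − 2 degrees of freedom for s, and the use of the overall mean in q² are all assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def qsar_stats(x, y):
    """Return (n, r2, q2, s) for a one-descriptor linear QSAR."""
    x, y = list(x), list(y)
    n = len(y)
    slope, intercept = fit_line(x, y)
    my = sum(y) / n
    ss_tot = sum((yi - my) ** 2 for yi in y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ss_res / ss_tot
    # Assumed convention: two fitted terms (slope and intercept).
    s = (ss_res / (n - 2)) ** 0.5
    # Leave-one-out: refit without compound i, then predict compound i.
    press = 0.0
    for i in range(n):
        xs, ys = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        b, a = fit_line(xs, ys)
        press += (y[i] - (b * x[i] + a)) ** 2
    q2 = 1.0 - press / ss_tot
    return n, r2, q2, s


# Hypothetical descriptor values and potencies, for illustration only.
cpi = [0.0, 1.0, 2.0, 3.0, 4.0]
pki = [2.7, 3.6, 4.7, 5.5, 6.7]
n, r2, q2, s = qsar_stats(cpi, pki)
print(f"n = {n}, r2 = {r2:.3f}, q2 = {q2:.3f}, s = {s:.3f}")
```

In this formulation q² can never exceed r², because each leave-one-out residual is at least as large in magnitude as the corresponding fitted residual; a large gap between the two is a sign that individual compounds strongly influence the fit.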

In this paper, all the QSAR equations except Eqs. 4g and 4h were developed from only those compounds whose ligand–protein X-ray crystal structures were available in the RCSB protein data bank.

Results and discussion

Different binding sites of ligands in a protein

In SAR and/or QSAR, the general assumption was that all the structural congeners bind at the same binding site in an essentially identical binding mode [16]. In our previous studies, it was demonstrated that even if some compounds were structural congeners, their binding modes could be different and thus showed up as outliers in SAR/QSAR, despite the fact that their binding sites were the same [1].

Subsequently, it would be reasonable to expect some congeneric compounds not to fit the same SAR/QSAR when they bound at a different binding site. Such situations would likewise have included compounds binding in orthosteric and allosteric binding sites of the protein.

After searching the ligand-bound protein structures in the RCSB protein data bank, we observed numerous examples that showed even very close structural analogs had bound at different binding sites. Such X-ray crystal structures provided clues for possible sources of outliers in SAR/QSARs. These findings are summarized below.

We chose the glycogen phosphorylase enzyme as an example for a thorough examination. An extensive list of known allosteric receptors and enzymes is available (http://mdl.shsmu.edu.cn/ASD) [17].

Inverted parabola/bilinear correlations and allosteric interactions

In addition to the multiple binding sites, another aspect to be considered regarding the allosteric modulation of the protein is the inverted parabolic/bilinear allosteric QSAR correlations.

Besides those equations listed in Table S1, a number of additional inverted parabolic or bilinear QSARs have been reported [3]. Hansch and his co-workers [5] suggested that the inverted parabola/bilinear relationships arise from allosteric interactions. Work on allosteric interactions appeared in the literature as early as 1958 [7]. However, QSAR studies of allosteric interactions began only in 2001. Since then, Hansch’s group has published several QSAR papers on allosteric interactions [6, 18,19,20]. They initially observed that some QSARs correlated the data by an inverted parabolic relationship with ClogP, CMR, and molar volume (MgVol) [8]. Such inverted parabolic relationships showed that the activity at first decreased as the value of the related parameter increased. However, at a specific point, it turned around and increased. They attributed such behavior to a change in the structure of the receptor that occurred with ligand binding, as in allosteric interactions. They suggested that a change in the reaction mechanism occurred at the inversion point [7].

Furthermore, Hansch and co-workers [5,6,7,8, 10, 18, 21] proposed allosteric QSARs could be used to uncover an allosteric interaction. The classic means for uncovering allosteric reactions was to carefully evaluate a particular molecule at a time enzymatically, and eventually use X-ray crystallography to confirm it. Allosteric QSAR correlations can be illustrated by Eqs. i–iv for Clog P or CMR [5].

$$\log\left(1/\text{C}\right) = -a\,\text{ClogP} + b\,\text{ClogP}^{2} + \text{constant} \qquad (\text{i})$$
$$\log\left(1/\text{C}\right) = -a\,\text{ClogP} + b\,\log\left(\beta \cdot 10^{\text{ClogP}} + 1\right) + \text{constant} \qquad (\text{ii})$$
$$\log\left(1/\text{C}\right) = -a\,\text{CMR} + b\,\text{CMR}^{2} + \text{constant} \qquad (\text{iii})$$
$$\log\left(1/\text{C}\right) = -a\,\text{CMR} + b\,\log\left(\beta \cdot 10^{\text{CMR}} + 1\right) + \text{constant} \qquad (\text{iv})$$

These equations were inverted parabola or bilinear correlations. They implied that the activity first decreased as ClogP or CMR increased. Beyond the inversion point, however, the squared term took over and the activity increased with further increase in the ClogP or CMR value. Similarly, in the bilinear model, activity first decreased linearly up to the inversion point and then increased linearly [7]. Another explanation proposed for the inverted parabola/bilinear correlations was that there could be an additional binding site. As molecules became larger in ClogP or CMR, they were limited in binding to the ‘typical’ site and forced to bind at the secondary site [7]. Hansch and others reported several QSARs regarding such possible allosteric interactions [5,6,7,8,9,10].
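As a brief worked step, the location of the inversion point follows from setting the slope of each model to zero. This is a sketch under the assumptions a > 0 and b > 0, with x standing for ClogP or CMR and log taken as base 10; the symbol x_inv is introduced here for illustration. For the parabolic forms (Eqs. i and iii):

$$\frac{d}{dx}\left(-a\,x + b\,x^{2} + \text{constant}\right) = -a + 2\,b\,x = 0 \quad\Rightarrow\quad x_{\text{inv}} = \frac{a}{2b}$$

For the bilinear forms (Eqs. ii and iv), the same step gives:

$$-a + b\,\frac{\beta \cdot 10^{x}}{\beta \cdot 10^{x} + 1} = 0 \quad\Rightarrow\quad x_{\text{inv}} = \log\left(\frac{a}{\beta\,(b - a)}\right), \qquad b > a$$

Under these assumptions the limiting slopes of the bilinear model are −a well below the inversion point and b − a well above it, so an inverted bilinear correlation requires b > a.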

One does not typically recognize exactly what the receptor structure is in a cell, much less a whole animal. Nonetheless, it was suggested that QSAR can serve as a valuable tool in gaining an indirect view of what one might learn about its in situ properties [7].

Glycogen phosphorylase and allosteric inhibitors

Glycogen phosphorylase (GP) comprises a family of three isozymes: muscle GP (mGP), liver GP (lGP), and brain GP (bGP) [22]. GP is an allosteric enzyme that catalyzes the first step of glycogenolysis in the liver, muscle, and brain to produce glucose-1-phosphate (G-1-P) from glycogen.

GP is a homodimer that exists in two interconvertible forms, GPb and GPa. GPb is the ‘closed’, nonphosphorylated form, which has low activity and low substrate affinity and exists predominantly in the T-state at equilibrium. GPa is the ‘open’, phosphorylated form, which has high activity and high substrate affinity and exists predominantly in the R-state at equilibrium [23,24,25]. Phosphorylation of Ser14 at the N-terminus converts the enzyme from the T-state to the R-state. The allosteric transition (T- to R-state) of GP is affected by allosteric modulators that bind to GP and stabilize or promote either the T-state or the R-state conformation [24]. In the “closed” T-state conformation, the active site is blocked, preventing the entrance of the substrate. In the “open” R-state conformation, the catalytic site becomes accessible to the substrate. Potent GP inhibitors stabilize the inactive T-state conformation [26]. Allosteric inhibitors can alter the equilibrium between the T- and R-states [23].

Allosteric inhibitors interact with binding sites on the enzyme that are distinct from the binding site (the orthosteric site) for the endogenous agonist [27]. Allosteric sites allow inhibitors to bind to the enzyme and often result in a conformational change. Verma and Hansch suggested that allosteric effects occur when the interaction between protein and ligand results in a structural change of the protein [5].

X-ray crystal structures of ligand-bound glycogen phosphorylase

There are over 200 glycogen phosphorylase crystal structures reported in the PDB (Table S3). They derive from human, rabbit, and baker’s yeast, and include the muscle, liver, and brain forms [17]. Most of these structures are utilized in the “allosteric QSARs” presented below.

Allosteric inhibitors, their different binding modes and their effects on SAR/QSAR

The general assumption in SAR/QSAR is that all the structurally similar compounds, especially structural congeners, bind in a similar binding mode at the same binding site. Therefore, if some compounds bind at other binding sites, it would be reasonable to expect that they do not fit to the same SAR/QSAR, thus becoming outliers.

Such various binding modes are exactly the case of the GP inhibitor flavonoid derivatives that Chetter et al. reported [28]. Flavonoid analogs chrysin, quercetin, and quercetagetin are structurally similar. Ordinarily, these compounds would be considered as congeners and included in the same set of data for QSAR development. However, the binding sites of these compounds at GP were reported to be completely different: the inhibitor binding site for chrysin, the quercetin binding site for quercetin, and the allosteric binding site for quercetagetin (Please see Table 6 and Fig. 6 and relevant discussion below).


We investigated whether such differences in the binding site of the allosteric inhibitors among the structural congeneric series are common. For this, we used the X-ray crystal structures of the inhibitor-bound GP complexes listed in Table S2 (Supplementary Material). We also examined whether any unusual allosteric binding of ligands can yield outliers in QSAR. The results are summarized here.

(C-β-D-glucopyranosyl)-hydroquinone derivatives

A substantial number of glucose derivatives have been shown to inhibit GP. Most often these compounds bound to the catalytic site of the enzyme. Three main groups of glucose derivatives that exhibited potent inhibitory activities were C-glucopyranosyl heterocycles, N-acyl-N’-glucopyranosyl urea, and glucopyranosylidene-spiro-heterocycles [29].

Alexacou et al. [30] reported the inhibitory potencies of glucopyranosyl-hydroquinone regioisomers (I1–I4) listed in Table 1. These compounds were competitive inhibitors of GPb with respect to α-D-glucose-1-phosphate (Glc-1-P). The X-ray crystal structures of these compounds revealed that they bound at the catalytic site (Fig. 1) and stabilized the T conformation of the enzyme. The X-ray structure of a similar compound, I5, reported by He et al. [31], showed that it was also bound to the catalytic site.

Table 1 (C-β-D-glucopyranosyl)-hydroquinone regioisomers and their X-ray crystal structure information
Fig. 1

a Binding mode of I1 (cyan) at the catalytic site (right) and the novel allosteric binding site (left). Other analogs are also shown at the catalytic site. b Location of the two binding sites of I1: the catalytic site (C, right) and the novel allosteric binding site (N, left)

Alexacou et al. [30] described that I1 bound to the novel allosteric binding site as well as the catalytic site in the GPb complex structure. On the other hand, I3–I5 did not bind at this allosteric binding site.

Compound I1 bound at the new allosteric binding site only when GPb crystals were soaked with a mixture of I1 and I2 and not when soaked with I1 alone. The two experimental conditions were similar when soaking GPb native crystals with either a solution of a mixture of compounds I1 and I2 (100 mM, 21 h) or a solution of I3 (70 mM, 20 h). They were unsure whether this new allosteric binding site represented a genuine new binding site with a regulatory function or if it was an artifact of the experimental conditions. Nonetheless, Alexacou et al. stressed the new allosteric binding site displayed some specificity toward I1, since only I1 (which is the weaker inhibitor) bound to this site from the mixture of compounds I1 and I2 [30].

Figure 1 is a stereo-pair picture of the compound I1 bound to the catalytic site as well as to the novel allosteric binding site. The crystal structures of the ligand-GP complexes showed that the inhibitors were accommodated at the catalytic site without any significant conformational change of the protein structure.

Even though surrounded by such a complex protein environment, the inhibitory potency expressed as pKi of these compounds significantly correlated with the hydrophobic parameter CPI of the substituent (Eq. 1). Equation 1 suggests that when the hydrophobicity of the substituents increases, the inhibitory potency will increase as well. Table 1 lists the calculated pKi values using Eq. 1.

$$\begin{aligned} {\text{pKi }} = & \, 0.{99}\left( { \pm 0.{47}} \right){\text{ CPI }} + { 2}.{65}\left( { \pm 0.{6}0} \right) \\ & n \, = { 5}, r^{{2}} = \, 0.{94}, q^{{2}} = \, 0.{86}, s \, = \, 0.{149} \\ \end{aligned}$$
(1)

Because of the limited number of compounds involved in this case, it was not possible to examine any two-parameter equations. Since the coefficient of CPI is positive, it would likely become a normal parabola/bilinear correlation rather than an inverted correlation, even if CPI could be extended. Even so, Eq. 1 accounts for 94% of the variance in these inhibitory potency data. No additional parameter was required to explain the observed behavior of I1 binding at the novel allosteric binding site. (Please see further discussions below.) Because I1 bound at the novel allosteric binding site only when GPb crystals were soaked with a mixture of I1 and I2, the result was not unexpected.

Phosphorylated glucose derivatives

Martin et al. [32] reported several glucose analogs binding to the catalytic site of T-state GPb: a T-state-stabilizing inhibitor α-D-glucose (II1; synergistic with binding of AMP, IMP, and caffeine to the inhibitor site), R-state-stabilizing phosphorylated ligands α-D-glucose 1-phosphate (II2), 2-deoxy-2-fluoro-α-D-glucose 1-phosphate (II3), and α-D-glucose 1-methylenephosphonate (II4). They are listed in Table 2.

Table 2 Binding sites of α-D-glucose (II1, GLC) and its structural analogs II2–II6 and their X-ray crystal structure information

Martin et al. described that the phosphorylated ligands II2, II3, and II4 were bound at the allosteric activator (AMP) site (A site, also sometimes called N site in the literature) in addition to the catalytic site. The binding of the phosphorylated inhibitors was accompanied by the movement of catalytic site residues, especially a shift of a loop out of the catalytic site toward the exterior of the enzyme.

Table 2 includes heptulose 2-phosphate (II5) reported by Johnson et al. for comparison [33].

Even though all the compounds in Table 2 are structurally similar to glucose, their binding modes are different as seen in their crystal structures. II1 and II2 are bound at the catalytic site, whereas the other three analogs (II2–II4) are bound at two separate binding sites: the catalytic site and the allosteric activator (AMP) site. In each binding site, their binding conformations are essentially identical as seen in Fig. 2a. Furthermore, Fig. 2b shows the location of these two binding sites in GPb. Compound II6 (phosphoramidate) reported by Chrysina et al. [34] also binds at the catalytic site.

Fig. 2

a Binding modes of II2–II4 at the catalytic site (top) and the allosteric activator (AMP) site (bottom). b Location of the two binding sites of II2 (G1P, PDB ID: 3GPB): the catalytic site (C, top) and the allosteric activator site (A, bottom)

Compounds II1–II6 in Table 2 provide another example showing that structurally close analogs bind at different binding sites. When studying SAR/QSAR, one should carefully consider the binding site as well as the binding mode of each compound. Because of the structural diversity of these compounds and the lack of pKi values for some of them, no QSAR was developed from them.

Spiro-glucose derivatives

Table 3 lists 17 spiro-glucose analogs with their X-ray crystal structure information. Benltifa et al. [35] reported III1–III5 as inhibitors of rmGPb and showed that the inhibitors bound preferentially at the catalytic site of the enzyme, retaining the less active T-state conformation. Watson et al. [36] also reported III6–III9 and described that they bound at the catalytic site. Czifrak et al. [37], Oikonomakos et al. [38], and Gregoriou et al. [39] described III10–III14 for their inhibition of GPb. They reported that all five compounds bound at the catalytic site of T-state GPb with very little change in the tertiary structure. Szabo et al. [29] reported the inhibitory activities of III15–III17 and their binding modes. III15 was unique in this series because the compound was bound at two binding sites: the catalytic site (C) and the new allosteric (indole) binding site (NA).

Table 3 Spiro-glucose analogs (III1–III17) and their X-ray crystal structure information

The complete set of compounds listed in Table 3 did not yield any statistically sound QSAR. Since GP is an allosteric enzyme, the ‘splitting QSAR’ approach suggested by Verma and Hansch [5] was utilized to develop Eqs. 2a–2d. The Ki value of one compound (III9) was not available, so this compound was omitted. From the first set of nine compounds, Eq. 2a was derived, which is a normal parabola correlation with MolVol. A statistically slightly inferior correlation was obtained with CMR (Eq. 2b). The results were not surprising because there was high collinearity between CMR and MolVol with this set of compounds (Eq. 2c). The remaining seven outlier compounds of Eq. 2a yielded Eq. 2d. Interestingly, Eq. 2d is an inverted parabola correlation with CMR. Compound III13 was treated as a final outlier. Equation 2d shows that the inhibitory potency of these compounds first decreases with an increase in molar refractivity (CMR) up to the inversion point at CMR = 6.96 and then increases. Equations 2a and 2d explain 77% and 88% of the variance in the inhibitory activity data of the molecules, respectively.

$$\begin{aligned} {\text{pKi }} = &\, { 4}.{15}( \pm {3}.{55}){\text{ MolVol }} - \, 0.{85}( \pm 0.{82}){\text{ MolVol}}^{{2}} + \, 0.0{1}( \pm {3}.{76}) \\ & n \, = { 9}, r^{{2}} = \, 0.{77}, q^{{2}} = \, 0.{45}, s \, = \, 0.{143} \\&{\text{outlier:}}\,{\mathbf{III13}}, {\mathbf{III14}}, {\mathbf{III17}}, {\mathbf{III18}}, {\mathbf{III14}}, {\mathbf{III15}}\\& {\text {optimum MolVol:}}\, 2.43(\pm\,4.04). \end{aligned}$$
(2a)
$$\begin{aligned} {\text{pKi }} = & \, 0.{79}( \pm 0.{68}){\text{ CMR }} - \, 0.0{4}( \pm 0.0{4}){\text{ CMR}}^{{2}} + { 1}.{51}( \pm {2}.{58}) \\ & n \, = { 9}, r^{{2}} = \, 0.{75}, q^{{2}} = \, 0.{42}, s \, = \, 0.{148} \\ &{\text{outlier:}}\,{\mathbf{III3}}, {\mathbf{III4}}, {\mathbf{III7}}, {\mathbf{III8}}, {\mathbf{III13}}, {\mathbf{III14}}, {\mathbf{III15}}\\& {\text {optimum CMR}} = \, 9.88 (\pm\,\,60.60). \end{aligned}$$
(2b)
$$\begin{aligned} {\text{CMR }}& =\, { 4}.{4}0( \pm 0.{22}){\text{ MolVol }}{-}{ 1}.{7}0( \pm 0.{48}) \\ & n \, = { 1}0, r^{{2}} = { 1}.00, q^{{2}} = \, 0.{99}, s \, = \, 0.{113} \\ \end{aligned}$$
(2c)
$$\begin{aligned} {\text{pKi }} = & \, - {7}.{8}0( \pm {6}.{47}){\text{ CMR }} + \, 0.{56}( \pm 0.{45}){\text{ CMR}}^{{2}} + { 3}0.{3}0( \pm {22}.{35}) \\ & {\text{n }} = { 6}, {\text{r}}^{{2}} = \, 0.{88}, {\text{q}}^{{2}} = \, 0.{46}, {\text{s }} = \, 0.{537} \\&{\text{outlier:}}\,{\mathbf{III13}}\\& {\text {inversion point for CMR:}}\, 6.96 (\pm\,0.86).\end{aligned}$$
(2d)
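The optimum and inversion values quoted with Eqs. 2a, 2b, and 2d can be checked from the rounded coefficients, since a parabola b1·x + b2·x² + constant has zero slope at x = −b1/(2·b2). The sketch below uses only the central coefficient estimates (the confidence limits are ignored), and the function names are hypothetical.

```python
def turning_point(linear_coef, quadratic_coef):
    """x at which linear_coef*x + quadratic_coef*x**2 + const has zero slope."""
    return -linear_coef / (2.0 * quadratic_coef)


def shape(quadratic_coef):
    """Classify the parabola by the sign of its squared term."""
    if quadratic_coef > 0:
        return "inverted parabola (minimum)"
    return "normal parabola (maximum)"


# (linear coefficient, quadratic coefficient) as printed in Eqs. 2a, 2b, 2d.
equations = {
    "2a (MolVol)": (4.15, -0.85),
    "2b (CMR)": (0.79, -0.04),
    "2d (CMR)": (-7.80, 0.56),
}

for name, (b1, b2) in equations.items():
    print(f"Eq. {name}: {shape(b2)} near {turning_point(b1, b2):.2f}")
```

The values obtained this way agree with the reported 2.43, 9.88, and 6.96 to within about 0.01; any small difference presumably reflects the rounding of the published coefficients.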

Among the 17 compounds listed in Table 3, only III15 was reported to bind at the two binding sites. Equation 2d includes III15, and no other parameter was required to account for any effects due to the dual binding of III15. Figure 3 shows the binding modes and binding sites of III1–III17. The crystal structures show that all compounds are bound only to the catalytic site except III15. Compound III15 is bound to the new allosteric binding site as well as the catalytic site.

Fig. 3

a Binding modes of 17 compounds (III1–III17) including III15 bound at two separate binding sites: the catalytic site (right) and the new allosteric (indole) binding site (left). b Location of the binding sites of III15 at the catalytic site (C, right) and the new allosteric binding site (NA, left)

Szabo et al. [29] reported that the crystal structures showed only the R epimers of III16 and III17 bound preferentially at the catalytic site. The R epimer of III15 was bound at both the catalytic and the new allosteric (indole) binding sites. They suggested the catalytic site was the primary binding site for this inhibitor, and the new allosteric (indole) binding site (NA) was the secondary binding site. On the other hand, the S epimer of III15 did not bind at the catalytic site but bound at the new allosteric (indole) binding site. Other glucose-derived inhibitors discussed later were also bound at this new allosteric site [29].

Upon binding at the new allosteric site, III15 participated in five hydrogen bond interactions with several protein residues. Besides that, the imidazolinone ring formed a hydrogen bond with the side chain of the enzyme. The binding of III15 at the new allosteric site triggered a shift of the side chain of Arg60 by about 3.0 Å. This shift caused a small translocation of the helix (residues 60–64) [29].

Equation 2a (and 2b) is a normal parabola correlation, and Eq. 2d is an inverted parabola correlation. The results suggest that there are (at least) two different modes of interaction among these analogs affecting the inhibitory potencies expressed as pKi, even though all bind at the catalytic site. The relatively low r² values of Eqs. 2a (or 2b) and 2d indicate that other factors have not yet been accounted for. However, the small number of compounds available for analysis, especially for Eq. 2d, prevented further investigation. III13 is an outlier in Eq. 2d. This is the only thiohydantoin compound in Table 3. The sulfur atom of III13 interacts with the carboxyl oxygen atom of the Asp339 residue through a water molecule (W1009). Such a hydrogen bonding interaction with a thiocarbonyl group is not present in the other analogs. There is a corresponding carbonyl derivative (III6) used in Eq. 2d. The X-ray crystal structure of III6 (1FTW) lacks such hydrogen bonding with Asp339. Furthermore, a thiocarbonyl group generally differs from, and is more basic than, the corresponding carbonyl group [40]. The observed pKi value of III13 is higher (more potent) than the value calculated from Eq. 2d by 0.75, which is the largest deviation in this series. This deviation could be due to the effects of hydrogen bonding interactions that are not accounted for in Eq. 2d.

No additional parameter was needed to account for the binding at two different binding sites of III15. The results indicate such effects are minor in the current situation.

Glucopyranosyl nucleoside derivatives

Numerous researchers reported the inhibitory potencies and X-ray crystal structures of several D-glucopyranosyl nucleoside analogs (IV1–IV22) bound to GPb (Table 4) [41,42,43,44,45,46]. Three structurally similar furanosyl analogs (IV23–IV25) are additionally included in the table for comparison [46]. The crystal structures demonstrated that most of these inhibitors were competitive inhibitors (with the substrate Glc-1-P) and preferentially bound at the catalytic site, which promoted the less active T-state conformation of the enzyme.

Table 4 D-Glucopyranosyl nucleoside analogs (IV1–IV22), related furanosyl analogs (IV23–IV25), and their X-ray crystal structure information

No formal paper has been published about IV19–IV25, but the crystal structures of these compounds revealed fascinating information regarding their binding sites. There are two sub-groups of these structures including IV18: five 6-membered pyranosyl compounds (IV18–IV22) and three 5-membered furanosyl compounds (IV23–IV25). Most of the inhibitors are bound to the catalytic site, but there are two other binding sites for some compounds. Compound IV21 is a structural analog of IV1–IV20, but it binds at the inhibitor binding site. Compound IV22 is similar to both IV21 and IV25, but it binds to both the catalytic site and the inhibitor binding site. In addition, the binding sites of the three furanosyl compounds (IV23, IV24, and IV25) are different from those of IV1–IV20. Compounds IV23 and IV24 bind to the inhibitor (purine) binding site, whereas IV25 binds to the allosteric (AMP) binding site.

Figure 4 is a stereo-pair picture of compounds IV1–IV25 at the binding sites of GP. Figure 4a shows the binding modes at the three binding sites of 20 compounds (IV1–IV20): the catalytic site (C, middle), the inhibitor (purine) binding site (I, top), and the allosteric (AMP) binding site (A, bottom). Figure 4b shows the locations of these three binding sites of GPb.

Fig. 4

a Binding modes of IV1–IV25 in the catalytic site (C, middle), the inhibitor (purine) binding site (I, top), and the allosteric (AMP) binding site (A, bottom). IV25 is shown at the bottom. b IV18 (red), IV23 (green), and IV25 (orange) are shown in the CPK model, indicating the locations of their binding sites

Despite different structural analogs (spiro vs. hydroquinones), the overall binding situations are similar to the compounds listed in Tables 3 and 4: the primary binding site is the catalytic site for most compounds. Only a limited number of compounds have been reported to bind at the allosteric binding site (new allosteric binding site NA, inhibitor binding site I, or allosteric AMP binding site A). In addition, their QSARs were also developed from two subgroups even though their parameters were the same in each group.

Equations 3a and 3b were developed from the compounds listed in Table 4 using the ‘splitting QSAR’ approach. Because only five compounds were involved in deriving Eq. 3b, two-term QSAR correlations were not considered.

$$\begin{aligned} {\text{pKi }} = & \, - \, 0.0{7}( \pm 0.06){\text{ CPI}}^{{2}} {-}{ 2}.{83}\left( { \pm 0.{38}} \right) \, I \, + { 5}.{23}( \pm 0.{22}) \\ & n \, = { 14}, r^{{2}} = \, 0.{96}, q^{{2}} = \, 0.{68}, s \, = \, 0.{287} \\ &{\text{outlier:}}\,{\mathbf{IV2}}, {\mathbf{IV7}}, {\mathbf{IV9}}, {\mathbf{IV11}}, {\mathbf{IV13}}, {\mathbf{IV15}}, {\mathbf{IV18}}\\ \end{aligned}$$
(3a)
$$\begin{aligned} {\text{pKi }} = & \, - \, 0.{58}( \pm 0.{65}){\text{ CPI}}^{{2}} + { 5}.{30}( \pm {1}.{41}) \\ & n \, = { 5}, r^{{2}} = \, 0.{73}, q^{{2}} = \, - 0.{47}, s \, = \, 0.{835} \\&{\text{outlier:}}\,{\mathbf{IV15}}, {\mathbf{IV18}} \end{aligned}$$
(3b)

Even though the inhibitory potencies of these compounds correlated with the same parameter CPI² in Eqs. 3a and 3b, the magnitudes of the coefficients of the CPI² term were significantly different, indicating that the two sets of compounds behaved differently. A similar phenomenon was observed with the compounds in Table 3 (Eqs. 2b and 2d). In Eq. 3a, an indicator variable was used for the five 3-deoxy-3-fluoro-β-D-glucopyranosyl derivatives (IV1–IV5). The negative coefficient of the indicator variable showed that the 3-deoxy-3-fluoro derivatives yielded significantly weaker potency than the other compounds. This result is consistent with the suggestion of Tsirkone et al. [41]. They indicated that the 3-hydroxyl group of the glucose moiety was a good hydrogen bond donor and acceptor, but the corresponding 3-fluorine was not as good and did not improve the potency as much.
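The effect of the indicator variable in Eq. 3a can be illustrated with a small sketch. It evaluates only the central coefficient estimates of Eq. 3a (the 95% confidence limits are ignored); the function name `pki_eq3a` and the CPI value of 1.0 are hypothetical choices for illustration, not values taken from Table 4.

```python
def pki_eq3a(cpi, is_3_fluoro):
    """Central estimate of Eq. 3a: pKi = -0.07*CPI^2 - 2.83*I + 5.23."""
    indicator = 1 if is_3_fluoro else 0  # I = 1 for 3-deoxy-3-fluoro derivatives
    return -0.07 * cpi ** 2 - 2.83 * indicator + 5.23


cpi = 1.0  # hypothetical substituent hydrophobicity
with_hydroxyl = pki_eq3a(cpi, is_3_fluoro=False)
with_fluorine = pki_eq3a(cpi, is_3_fluoro=True)

# A difference in pKi is a log10 ratio of Ki values.
gap = with_hydroxyl - with_fluorine
fold = 10 ** gap
print(f"pKi gap = {gap:.2f} log units, about {fold:.0f}-fold in Ki")
```

Because the indicator term is additive, the predicted gap between a 3-hydroxy compound and its 3-fluoro counterpart is the same 2.83 log units at any CPI, which corresponds to a several-hundred-fold difference in Ki.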

Equations 3a and 3b explain 96% and 73% of the variance in the inhibitory activity data, respectively. IV15 and IV18 are outliers in Eq. 3b. The relatively low correlation coefficient and high standard deviation of Eq. 3b both indicate that there are other effects that have not been accounted for. However, the limited number of data points prevented further investigation.

Kantsadi et al. [44] reported that an extended C5-alkynyl group exploited interactions with the β-pocket of the active site and induced significant conformational changes of the 280s loop. IV13 induced significant conformational changes in the 280s loop, while other compounds also induced such conformational changes of the 280s loop, though to a lesser extent.

IV15 and IV18 are the final outliers of Eq. 3b. Their observed Ki values cannot be explained by Eq. 3b: the observed Ki value of IV15 is about tenfold weaker than the calculated value, and that of IV18 is more than 1000-fold stronger than the calculated value. The unusual binding potency of IV15 was also noted by Kantsadi et al. [44]. They suggested that the alkynyl group of IV15 was pointing toward the side chains of Asp339 and His341. Thus, restructuring of the 339–341 loop and a change of water structure were suggested to explain the increase in the Ki value. Such changes were in contrast to the significant conformational change of the 280s loop structure upon binding of a similar compound such as IV13. On the other hand, Mamais et al. [45] explained that the tight binding of IV18 was due to the increased hydrogen bonding network and van der Waals interactions arising from the conformational changes in the side chains of the 280s loop residues observed in the IV18-GP complex. Therefore, it could be concluded that the outliers IV15 and IV18 in Eq. 3b were not due to binding at two different binding sites.
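To relate these fold differences to residuals on the pKi scale used in Eq. 3b, a short worked conversion may help. It assumes the usual definition pKi = −log Ki (base 10, Ki in molar units); the subscripts obs and calc are labels introduced here for the observed and calculated values.

$$\Delta\text{pKi} = \text{pKi}_{\text{obs}} - \text{pKi}_{\text{calc}} = \log\left(\frac{K_{i,\text{calc}}}{K_{i,\text{obs}}}\right)$$

On this scale, an observed Ki about tenfold weaker than calculated (IV15) corresponds to ΔpKi ≈ −1, and an observed Ki more than 1000-fold stronger than calculated (IV18) corresponds to ΔpKi > +3.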

Among the 25 compounds listed in Table 4, two compounds (IV21 and IV22) were reported to bind at the two binding sites. Equation 3a includes IV21, and no other parameter was required to account for any effects appropriate to the dual binding of IV21, indicating that the binding at the two binding sites does not cause significant effects on its binding.

Compounds IV22 and IV23–IV25, which bound only at the allosteric binding site, were not included in deriving Eqs. 3a or 3b because their Ki values were unavailable. Consequently, the effects of two-site binders on QSAR were not examined with allosteric correlations. Nevertheless, these compounds provide other examples showing that structurally close analogs bind at various binding sites.

Indirubin derivatives

Kosmopoulou et al. [47, 48] reported the binding mode of indirubin-5-sulphonate (V1, E226) and indirubin-3-aminooxy-acetate (V2, E243) to GPb by kinetic and crystallographic experiments (Table 5). They showed V1 was a competitive inhibitor with respect to ATP and V2 was a competitive inhibitor with respect to both Glc-1-P and AMP. The X-ray crystal structures showed that V1 bound at the inhibitor (purine) binding site (Fig. 5). Only one molecule was bound at this site. On the other hand, two additional V2 molecules were bound at the allosteric (AMP) binding site and a new subsite in the vicinity of the allosteric site, respectively.

Table 5 Indirubin analogs and their X-ray crystal structure information
Fig. 5 Binding modes and binding sites of indirubin analogs in Table 5. V1 bound at the purine inhibitor site (I). V2 (E243, IAA, PDB ID: 1Z62, cyan) bound at the purine inhibitor site (I) as well. Two V2 molecules are bound at the allosteric activator AMP binding site (A) and a new subsite in the vicinity of the allosteric site

Even though there are only two crystal structures of indirubin analogs, this case represents another example that structural analogs bind at various binding sites of the same allosteric enzyme.

Flavonoids

Chetter et al. [28] reported that flavonoids are novel inhibitors of GP, but their mode of action is unspecific in regard to the GP binding sites involved. The crystal structures show VI1, VI2, and VI3 bound exclusively at the inhibitor binding site. Tsitsanou et al. [49] reported that VI4 (chrysin) and VI6 were accommodated at the inhibitor site, whereas flavonoid VI8 (quercetagetin) was bound at the allosteric site [50]. Kantsadi et al. [51] reported that VI5 bound at the novel binding site. Anderka et al. [52] described that the quinolone-class compound VI7 (AVE9423) was bound to the allosteric AMP site. In addition, Kato et al. [53] showed that VI8 bound at the GP allosteric site. The binding site of VI9 was reported to be unknown [51]. The inhibitory potencies and X-ray crystal structure information (Fig. 6) of VI1–VI7 are listed in Table 6.

Table 6 Flavonoid analogs and their X-ray crystal structure information. (VI8 and VI9 are additionally included for comparison.)

The inhibitory potency of VI1–VI5 in Table 6, determined with rmGPb, correlates well with CPI (Eq. 4a), which explains 91% of the variance in the biological data. Equation 4a indicates that the inhibitory potency (pKi) of these compounds can be accounted for by their hydrophobicity. A statistically slightly less satisfactory correlation was obtained with CMR (r² = 0.88, s = 0.207). There are significant correlations among the inhibitory potencies determined with rmGPb, rmGPa, and hlGPa, as shown in Eqs. 4b–4d.

Compound VI5 bound at the quercetin binding site, a site completely independent of the binding site of VI1–VI4. The binding potency of VI5 was about tenfold weaker than that of VI1–VI4. Upon critical examination of Eq. 4a, we realized that this one point (VI5) greatly influenced the correlation, yielding a statistical artifact. Without VI5, the statistically weaker Eq. 4e was obtained. The pKi values from rmGPa and hlGPa gave statistically similar but weaker correlations than Eq. 4e. Therefore, Eq. 4a was considered a preliminary QSAR.
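The way a single point can dominate a small-sample correlation can be illustrated with a short sketch. The data below are synthetic and chosen only to make the effect visible; they are not the values of Table 6. The helper names (`fit_line`, `r_squared`, `loo_q_squared`) are hypothetical, and the leave-one-out q² is computed with the common definition 1 − PRESS/SS_tot, which is assumed here rather than taken from the paper.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def r_squared(x, y):
    """Squared correlation coefficient of the fitted line."""
    a, b = fit_line(x, y)
    my = mean(y)
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def loo_q_squared(x, y):
    """Leave-one-out q2 = 1 - PRESS / SS_tot (assumed definition)."""
    my = mean(y)
    press = 0.0
    for i in range(len(x)):
        # Refit without point i, then predict point i.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a * x[i] + b)) ** 2
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_tot

# SYNTHETIC descriptor/activity pairs: four clustered points plus one
# distant point that plays the role of a high-leverage compound.
x = [1.0, 1.1, 1.2, 1.3, 3.0]
y = [5.0, 5.2, 4.9, 5.1, 6.5]

r2_all = r_squared(x, y)                 # dominated by the distant point
r2_without = r_squared(x[:-1], y[:-1])   # clustered points only
q2_all = loo_q_squared(x, y)             # penalizes the leverage point

print(f"r2 (all points)     = {r2_all:.2f}")
print(f"r2 (without last)   = {r2_without:.2f}")
print(f"q2 (all points)     = {q2_all:.2f}")
```

In this synthetic case the fit looks strong only because of the distant point; removing it leaves essentially no correlation, and the leave-one-out statistic exposes the problem even when the point is kept.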

Fig. 6 a Binding modes of the seven compounds VI1–VI7 at the inhibitor binding site (right) and the quercetin (VI5) binding site (left, cyan). b Location of the binding sites: the inhibitor (purine) binding site (I, VI1, green) and the quercetin binding site (Q, VI5, blue). c Location of the allosteric (AMP) binding site (A) of VI7 (pink). (The one on the left is from the allosteric (AMP) binding site of the other dimer.)

An encouraging aspect of Eqs. 4a and 4e was that these allosteric QSARs indicated that VI5 might differ in some unusual way from the rest of the analogs, even though these compounds were structurally similar and would typically be treated as analogs in SAR/QSAR studies. An interesting correlation obtained excluding VI5 was Eq. 4f with CMR². Equation 4f was statistically superior to Eq. 4e. Including VI5, the correlation with CMR² was statistically much inferior (r² = 0.21, s = 0.723).

$$\begin{aligned} \text{pKi}_{(\text{rmGPb})} = {} & 0.27(\pm 0.16)\,\text{CPI}^{2} + 3.89(\pm 0.84) \\ & n = 5,\ r^{2} = 0.91,\ q^{2} = 0.67,\ s = 0.251 \end{aligned}$$
(4a)
$$\begin{aligned} \text{pKi}_{(\text{rmGPb})} = {} & 1.29(\pm 0.71)\,\text{pKi}_{(\text{rmGPa})} - 1.46(\pm 3.66) \\ & n = 5,\ r^{2} = 0.92,\ q^{2} = 0.78,\ s = 0.233 \end{aligned}$$
(4b)
$$\begin{aligned} \text{pKi}_{(\text{rmGPb})} = {} & 1.29(\pm 0.58)\,\text{pKi}_{(\text{hlGPa})} - 1.36(\pm 2.95) \\ & n = 5,\ r^{2} = 0.94,\ q^{2} = 0.85,\ s = 0.194 \end{aligned}$$
(4c)
$$\begin{aligned} \text{pKi}_{(\text{rmGPa})} = {} & 0.99(\pm 0.13)\,\text{pKi}_{(\text{hlGPa})} + 0.16(\pm 0.67) \\ & n = 5,\ r^{2} = 1.00,\ q^{2} = 0.04,\ s = 0.194 \end{aligned}$$
(4d)
$$\begin{aligned} \text{pKi}_{(\text{rmGPb})} = {} & 0.36(\pm 0.57)\,\text{CPI}^{2} + 3.37(\pm 3.29) \\ & n = 4,\ r^{2} = 0.79,\ q^{2} = 0.23,\ s = 0.273 \end{aligned}$$
(4e)
$$\begin{aligned} \text{pKi}_{(\text{rmGPb})} = {} & 0.31(\pm 0.36)\,\text{CMR}^{2} + 3.04(\pm 2.74) \\ & n = 4,\ r^{2} = 0.88,\ q^{2} = 0.51,\ s = 0.207 \end{aligned}$$
(4f)

Among the compounds that Chetter et al. [28] reported were two structural analogs, VI10 and VI11. Assuming their binding site was the same as that of VI1–VI4, these two compounds were added to derive fresh QSARs. Equation 4g was the result. Equation 4g indicates that VI10 and VI11 behave in the same way as VI1–VI4, suggesting that they bind at the catalytic site. VI5 became an outlier in this case: the difference between the observed and the calculated pKi values is 1.35. The result is not surprising because VI5 is bound at a separate binding site from VI1–VI4. Equation 4f is provided for comparison.

$$\begin{aligned} \text{pKi}_{(\text{rmGPb})} = {} & 0.28(\pm 0.15)\,\text{CMR}^{2} + 3.32(\pm 1.14) \\ & n = 6,\ r^{2} = 0.87,\ q^{2} = 0.72,\ s = 0.167 \\ & \text{outlier: } \mathbf{VI5} \end{aligned}$$
(4g)
$$\begin{aligned} \text{pKi}_{(\text{rmGPb})} = {} & 0.20(\pm 0.22)\,\text{CPI}^{2} + 4.18(\pm 1.41) \\ & n = 6,\ r^{2} = 0.61,\ q^{2} = -0.23,\ s = 0.291 \\ & \text{outlier: } \mathbf{VI5} \end{aligned}$$
(4h)

Equations 4g and 4f are essentially identical within the confidence limits: the coefficient of CMR², the intercept, and r² are the same, while the q² and s values improved.

Even though VI7 was not included in Eqs. 4a–4h because of the lack of a Ki value, this compound was also expected to become an outlier in Eq. 4g because its binding site was also different.

Hansch and his co-workers attributed allosteric inverted parabolic (or bilinear) QSARs to a change in the structure of the receptor and/or a change in the reaction mechanism upon ligand binding to the protein [5,6,7,8,9,10]. They also suggested that such inverted parabolic correlations could be due to the presence of another binding site [5]. The allosteric QSAR Eq. 4g supports Hansch’s suggestions and indicates that the allosteric inverted parabola correlation could be due to the conformational change in the protein. Besides the inverted parabola QSAR for the allosteric interactions, VI5, which bound at the new allosteric site, stood out as an outlier.

β-D-Glucopyranosyl-thiosemicarbazone derivatives

Alexacou et al. [54] reported 15 aromatic aldehyde 4-(β-D-glucopyranosyl)thiosemicarbazones listed in Table 7 as inhibitors of rabbit muscle GPb. They described that these compounds were competitive inhibitors of GPb with respect to α-D-glucose-1-phosphate and revealed the inhibitors were accommodated at the catalytic site with the glucopyranosyl moiety at approximately the same position as α-D-glucose.

Table 7 Aromatic aldehyde 4-(β-D-glucopyranosyl)thiosemicarbazones reported by Alexacou et al. [54] and their X-ray crystal structure information

These inhibitors fit tightly into the β-pocket, a side channel from the catalytic site with no access to the bulk solvent. Alexacou et al. reported that 14 of the 15 inhibitors bound at the new allosteric site of the enzyme as well as the catalytic site [54]. They described that the binding of several compounds (VII2, VII3, VII4, VII5, VII8, and VII11) triggered a significant shift of the 280s loop. On the other hand, the ortho-nitro-substituted compound VII13 (PDB ID: 3MSC, 24S) was described as binding only at the catalytic site and not at the new allosteric site [54]. However, to our surprise, examination of the corresponding crystal structure revealed that this compound was bound at the new allosteric binding site as well as the catalytic site (Fig. 7). The ortho-nitro compound VII13 is a Z isomer about the N=C double bond, but so are the two other compounds VII14 and VII15.

Fig. 7 Binding modes of VII1–VII15 at the two binding sites

Equations 5a and 5b were developed using the ‘split QSAR’ method suggested by Verma and Hansch [5]. The indicator variable Iz was assigned the value of one for the three Z-isomers (VII13–VII15) and zero for all others. The inhibitory potencies (pIC50) of these compounds correlated parabolically with CPI. It is a normal parabola, not an inverted one. The negative coefficient of Iz indicates that the three Z-isomers are about 20-fold weaker than the others. Four compounds (VII4–VII6, VII8) were not used in Eq. 5a (QSAR1). For these four compounds, an inverted parabolic relationship could be observed when their pIC50 values were plotted against CPI or CMR values. However, because not enough data points were available, two-parameter equations were not considered. Excluding VII8, the remaining three compounds yielded Eq. 5b or 5c (QSAR2). There is a high collinearity between CPI and CMR for these compounds (Eq. 5d). Thus, Eq. 5b or 5c should be considered preliminary.

$$\begin{aligned} \text{pIC}_{50} = {} & 2.18(\pm 1.30)\,\text{CPI} - 0.72(\pm 0.46)\,\text{CPI}^{2} - 1.30(\pm 0.50)\,\text{Iz} + 3.17(\pm 0.82) \\ & n = 11,\ r^{2} = 0.87,\ q^{2} = -1.45,\ s = 0.259 \\ & \text{outlier: } \mathbf{VII4}, \mathbf{VII5}, \mathbf{VII6}, \mathbf{VII8} \\ & \text{optimum CPI} = 1.52\,(\pm 0.32) \end{aligned}$$
(5a)
$$\begin{aligned} \text{pIC}_{50} = {} & -1.16(\pm 1.62)\,\text{CPI} + 6.04(\pm 3.60) \\ & n = 3,\ r^{2} = 0.99,\ q^{2} = 0.81,\ s = 0.035 \\ & \text{outlier: } \mathbf{VII8} \end{aligned}$$
(5b)
$$\begin{aligned} \text{pIC}_{50} = {} & -9.54(\pm 8.55)\,\text{CMR} + 62.05(\pm 52.49) \\ & n = 3,\ r^{2} = 1.00,\ q^{2} = 0.92,\ s = 0.023 \\ & \text{outlier: } \mathbf{VII8} \end{aligned}$$
(5c)
$$\begin{aligned} \text{CPI} = {} & 8.22(\pm 4.13)\,\text{CMR} - 48.27(\pm 25.34) \\ & n = 3,\ r^{2} = 1.00,\ q^{2} = 0.97,\ s = 0.011 \end{aligned}$$
(5d)
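
As a check on the optimum reported with Eq. 5a, the turning point of a parabola follows from setting its first derivative to zero. The short derivation below uses the rounded coefficients printed in Eq. 5a, so the last digit may differ slightly from the reported value of 1.52, which was presumably obtained from unrounded coefficients:

$$\begin{aligned} \frac{\partial\,\text{pIC}_{50}}{\partial\,\text{CPI}} = {} & 2.18 - 2(0.72)\,\text{CPI} = 0 \\ \text{CPI}_{\text{opt}} = {} & \frac{2.18}{2 \times 0.72} \approx 1.51 \end{aligned}$$

Because the coefficient of the CPI² term is negative, this turning point is a maximum, which is what makes Eq. 5a a normal rather than an inverted parabola.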

VII1–VII15 are allosteric inhibitors that bind at two binding sites of GP, and the allosteric QSAR expressed as Eq. 5a is a normal parabola correlation.

Alexacou et al. [54] mentioned that the binding of these inhibitors at the new allosteric site slightly shifted the residues in the vicinity, indicating a conformational change.

N-substituted-N-(β-D-Glucopyranosyl)urea derivatives

Chrysina et al. [55] and others [56, 57] reported the N-substituted-N-(β-D-glucopyranosyl)ureas listed in Table 8. These compounds are structurally not very different from those in Table 7. Unlike the compounds in Table 7, only four of the 14 compounds (VII16, VII17, VII26, and VII29) bound at the two separate binding sites, and 10 compounds bound only at the catalytic site.

Table 8 N-substituted-N-(β-D-Glucopyranosyl)urea derivatives reported by Chrysina et al. [55] and others [56, 57] and their X-ray crystal structure information

Figure 8 shows the two binding modes (the catalytic site and the new allosteric binding site) of VII16, VII17, VII26, and VII29.

Fig. 8 Binding modes of VII16, VII17, VII26, and VII29 at the catalytic site (right) and the new allosteric binding site (left)

From the compounds listed in Table 8, Eq. 6 was developed. The correlation was a normal parabolic one, not an inverted one. Three compounds were outliers in Eq. 6 (QSAR1). (When pKi values were plotted against CPI or CMR, an indication of a parabolic relationship with CPI or CMR was observed with the three outliers in QSAR2. However, because not enough data points were available, further investigation was not considered.)

$$\begin{aligned} \text{pKi} = {} & 8.61(\pm 2.79)\,\text{CMR} - 0.74(\pm 0.25)\,\text{CMR}^{2} - 18.74(\pm 7.52) \\ & n = 10,\ r^{2} = 0.92,\ q^{2} = 0.85,\ s = 0.193 \\ & \text{outlier: } \mathbf{VII16}, \mathbf{VII20}, \mathbf{VII29} \\ & \text{optimum CMR} = 5.82\,(\pm 0.17) \end{aligned}$$
(6)

The information about the crystal structures of most compounds in Table 8 was obtained from the RCSB Protein Data Bank since no paper has been published. An exception was the paper of Oikonomakos et al. [56], who reported that VII16 bound tightly at the catalytic site and induced substantial conformational changes in the loop containing residues 282–287 of the 280s loop. They showed that VII16 also bound at the new allosteric site, about 33 Å from the catalytic site. Three other compounds (VII17, VII26, and VII29) were likewise bound at both the catalytic site and the new allosteric binding site, as VII16 was.

Like Eq. 5a, which was discussed for the compounds in Table 7, Eq. 6 is a normal parabola correlation. This correlation provides another example of a normal parabola allosteric QSAR, which involves conformational changes in the protein-inhibitor allosteric interactions.

The binding modes of the final three outliers (VII16, VII20, VII29) from Eq. 6 are not identical: VII16 and VII29 bound at the two different binding sites (C, NA), whereas VII20 bound only at the catalytic site (C). Regarding such outliers, please see the further discussion below under the titles of ‘Different binding modes at the allosteric secondary binding sites of glycogen phosphorylase’ and ‘Dual inhibitions and their QSARs.’

N-(β-D-glucopyranosyl)-N'-oxamide derivatives

Czifrak et al. [58] and Hadjiloi et al. [59] studied the binding modes of several N-(β-D-glucopyranosyl)-N'-oxamide analogs (Table 9). They are competitive inhibitors of rabbit muscle GPb with respect to α-D-glucose-1-phosphate. The ligand-bound crystal structures revealed the inhibitors were accommodated at the catalytic site at approximately the same position as α-D-glucose and stabilized the T-state conformation of the 280s loop. Examination of the crystal structures revealed that only one of the eight compounds listed in Table 9 bound at two separate binding sites.

Table 9 N-(β-D-glucopyranosyl)-N'-oxamide analogs as inhibitors of GP reported by Czifrak et al. [58] and Hadjiloi et al. [59] and their X-ray crystal structure information

Figure 9 shows VII33 bound at two separate binding sites. VII33 and VII26 bound at the same two binding sites. However, comparison of VII33 with VII26 and VII29 revealed substantial differences in the binding mode of VII33 from the other two compounds at the new allosteric binding site. The binding modes of VII26 and VII29 were essentially identical at both binding sites, but a large difference in the binding mode of VII33 could be observed. (Regarding the effects of such a difference in the binding mode on the outlier in QSAR, please see the further discussion below and Ref. [1].)

Fig. 9 a Binding modes of VII33 (PDB ID: 3CUW, 445) at the catalytic site (right) and the new allosteric binding site (left). b The difference in the binding modes of VII33 (3CUW, green, ball-and-stick), VII26 (orange), and VII29 (pink) at the catalytic site (right) and the new allosteric binding site (left). While the binding modes of VII26 and VII29 are essentially identical at both binding sites, a large difference in the binding mode of VII33 from the other two compounds can be seen at the new allosteric binding site

Equation 7a was developed from the compounds listed in Table 9. One compound (VII35) became an outlier. The inhibitory potencies of these compounds were correlated with CMR with a reasonable s value. Because of the narrow range of the pKi values involved, the squared correlation coefficient (r²) was not as high as one would hope to see. Nevertheless, Eq. 7a indicates the importance of CMR, as in Eq. 6.

$$\begin{aligned} \text{pKi} = {} & 0.19(\pm 0.17)\,\text{CMR} + 2.93(\pm 0.70) \\ & n = 7,\ r^{2} = 0.63,\ q^{2} = 0.27,\ s = 0.291 \\ & \text{outlier: } \mathbf{VII35} \end{aligned}$$
(7a)

Hadjiloi et al. [59] discussed comparisons of the compounds in this series with the lead compound N-acetyl-β-D-glucopyranosylamine presented previously. They described that the hydrogen bonding interaction of the amide nitrogen with the main-chain carbonyl oxygen of His377 is missing in these complexes. As they suggested, the differences in the Ki values of these compounds could be partially due to the subtle conformational changes of the protein residues [59].

Even though Eq. 7a (or 7b–7d in Supplemental Material 2) is not a parabola/bilinear QSAR, the QSAR describes their allosteric effects.

Acyl urea derivatives

Oikonomakos et al. [60], Anderka et al. [52], and Klabunde et al. [61] reported a ‘novel’ class of GP inhibitors listed in Table 10. They are structurally similar to the side chains of the β-D-glucopyranosyl analogs listed in Table 8 but different in the core structure. They are benzoylaminocarbonylaminophenyl analogs.

Table 10 Acyl urea analogs as inhibitors of GP reported by Oikonomakos et al. [60], Anderka et al. [52], and Klabunde et al. [61] and their X-ray crystal structure information

The X-ray crystal structures of VIII1–VIII4 were determined with rmGPb, whereas those of VIII5–VIII7 were determined with hlGPa. The first set of four compounds (VIII1–VIII4) bound at the allosteric activator (AMP) binding site. These authors reported that the acyl urea analogs inhibited GP by direct inhibition of AMP binding and by indirect inhibition of substrate binding through stabilization of the T’-state.

The second set of three compounds (VIII5–VIII7) also bound at the allosteric (AMP) binding site, competing with the physiological activator AMP and acting synergistically with glucose. VIII5 occupied only the lower part of the bifurcated AMP site, whereas VIII6 exploited the full binding pocket. Anderka et al. [52] suggested the binding entropy of VIII6 was due to the extensive displacement of solvent molecules as well as to ionic interactions with the phosphate recognition site.

Equation 8 was derived from VIII1–VIII4. Because of the limited number of compounds included to develop Eq. 8, a statistically weak QSAR was obtained. Nonetheless, it was an inverted parabola correlation, and the inverted parabolic relationship was visible when pKi was plotted against CMR. No statistically significant correlation with CPI existed for the corresponding parabola QSAR (r² = 0.19, s = 0.20). Equation 8 provides another example of an inverted parabola QSAR for the allosteric interaction of GP inhibitors.

$$\begin{aligned} \text{pKi} = {} & -4.72(\pm 1.83)\,\text{CMR} + 0.38(\pm 0.15)\,\text{CMR}^{2} + 20.24(\pm 5.68) \\ & n = 4,\ r^{2} = 1.00,\ q^{2} = -8.00,\ s = 0.007 \end{aligned}$$
(8)
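
The distinction between a normal and an inverted parabola used throughout this section depends only on the sign of the quadratic coefficient, and the turning point follows from the ratio of the linear and quadratic coefficients. The following minimal sketch applies this to the rounded coefficients printed in Eqs. 6 and 8; the function name `describe_parabola` is hypothetical, and the turning points are illustrative values rather than results reported in the paper (only the optimum for Eq. 6 is reported, as 5.82).

```python
def describe_parabola(linear_coef, quadratic_coef):
    """Classify activity = linear_coef*x + quadratic_coef*x**2 + const.

    Returns (shape, turning_point), where shape is "normal" when activity
    rises to a maximum and then falls (negative quadratic coefficient) and
    "inverted" when it falls to a minimum and then rises.
    """
    if quadratic_coef == 0:
        raise ValueError("no quadratic term: the relationship is linear")
    shape = "normal" if quadratic_coef < 0 else "inverted"
    turning_point = -linear_coef / (2 * quadratic_coef)
    return shape, turning_point

# Rounded coefficients of the CMR and CMR^2 terms as printed in the equations.
eq6 = describe_parabola(8.61, -0.74)   # Eq. 6
eq8 = describe_parabola(-4.72, 0.38)   # Eq. 8

print(f"Eq. 6: {eq6[0]} parabola, turning point at CMR = {eq6[1]:.2f}")
print(f"Eq. 8: {eq8[0]} parabola, turning point at CMR = {eq8[1]:.2f}")
```

With only four compounds behind Eq. 8, the position of its inversion point should be read as indicative at most.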

No QSAR was attempted with the set of VIII5–VIII7 because of the modest range of the biological activity values and the limited number of compounds available.

β-D-glucopyranosyl triazole, pyrrole, imidazole, thiazole, tetrazole derivatives

The crystal structures of many β-D-glucopyranosyl triazole, pyrrole, imidazole, thiazole, and tetrazole analogs have been reported by Leonidas and his co-workers [26, 62,63,64,65,66,67]. They are listed in Table 11. Most compounds are bound to the catalytic site. However, IX7, IX8, and IX26 are equally bound at the new allosteric binding site in addition to the catalytic site (Fig. 10).

Table 11 Inhibitory potencies and X-ray crystal structure information of β-D-glucopyranosyl triazole, pyrrole, imidazole, thiazole, and tetrazole analogs
Fig. 10 Binding modes of IX1–IX32 in ligand-GPb complexes. IX7, IX8, and IX26 (PDB ID: 5LRE, 5LRF, and 5O52) at the two separate binding sites: catalytic site (C, right) and new allosteric (indole) binding site (NA, left)

From the compounds listed in Table 11, Eqs. 9a–9c were developed using the ‘split QSAR’ method [5]. For Eq. 9b, an indicator variable Ithiazole was assigned for the four thiazole derivatives (IX29–IX32). Equations 9a–9c explain 88% of the variance in the initial dataset, 94% in the second dataset, and 89% in the final dataset, respectively. No normal or inverted parabola correlation was obtained with these sets. The three QSARs indicated the critical role of the hydrophobic parameter CPI. In addition, the molar refractivity parameter played a significant part in explaining the biological activity of the compounds used in Eq. 9a. The coefficients of CPI in Eqs. 9a and 9c were essentially identical, indicating similar roles in protein–ligand interactions. However, the coefficient of CPI in Eq. 9b was different, suggesting a distinct role of these compounds in their protein–ligand interactions. Such diverse correlations formulated from the sub-datasets represent the fundamental idea behind the ‘split QSAR’ method. The negative coefficients of CPI in Eqs. 9a and 9c may lead to an inverted parabola QSAR if the range of CPI values is extended.

The negative coefficient of Ithiazole in Eq. 9b indicated that the thiazole compounds were, on average, more than 1000-fold weaker than the others. This result was consistent with that of Kyriakis et al. [26]. They suggested the importance of hydrogen bond interactions between the imidazole ring and the main chain carbonyl group of His377. Replacement by a sulfur atom affected such hydrogen bond interaction because of geometrical constraints, leading to a decrease in the inhibitory activity.

$$\begin{aligned} \text{pKi} = {} & -1.18(\pm 0.24)\,\text{CPI} + 1.23(\pm 0.24)\,\text{CMR} + 0.67(\pm 0.91) \\ & n = 20,\ r^{2} = 0.88,\ q^{2} = 0.84,\ s = 0.318 \\ & \text{outlier: } \mathbf{IX6}, \mathbf{IX8}, \mathbf{IX11}, \mathbf{IX12}, \mathbf{IX16}, \mathbf{IX23}\text{–}\mathbf{IX26}, \mathbf{IX29}, \mathbf{IX30} \end{aligned}$$
(9a)
$$\begin{aligned} \text{pKi} = {} & 0.94(\pm 0.50)\,\text{CPI} - 3.41(\pm 1.25)\,I_{\text{thiazole}} + 4.03(\pm 0.96) \\ & n = 7,\ r^{2} = 0.94,\ q^{2} = 0.00,\ s = 0.319 \\ & \text{outlier: } \mathbf{IX11}, \mathbf{IX23}, \mathbf{IX24}, \mathbf{IX29} \end{aligned}$$
(9b)
$$\begin{aligned} \text{pKi} = {} & -1.10(\pm 4.64)\,\text{CPI} + 8.34(\pm 11.17) \\ & n = 3,\ r^{2} = 0.89,\ q^{2} = -0.95,\ s = 0.324 \\ & \text{outlier: } \mathbf{IX24} \end{aligned}$$
(9c)
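
Because pKi and pIC50 are negative base-10 logarithms, the coefficient of an indicator variable translates directly into a fold difference in potency. The sketch below makes that conversion explicit for the indicator coefficients printed in Eqs. 5a and 9b; the function name `fold_change` is hypothetical, and the result is an average effect implied by the fitted coefficient, not a measured ratio for any single compound.

```python
def fold_change(indicator_coefficient):
    """Fold difference in potency implied by an indicator coefficient.

    The activity is on a log10 scale (pKi or pIC50), so a coefficient of
    -c corresponds to a 10**c-fold weaker potency for flagged compounds.
    """
    return 10 ** abs(indicator_coefficient)

z_isomer_fold = fold_change(-1.30)   # Iz coefficient in Eq. 5a
thiazole_fold = fold_change(-3.41)   # Ithiazole coefficient in Eq. 9b

print(f"Z-isomers: about {z_isomer_fold:.0f}-fold weaker")
print(f"Thiazoles: about {thiazole_fold:.0f}-fold weaker")
```

These values agree with the statements in the text that the Z-isomers are about 20-fold weaker and that the thiazole compounds are more than 1000-fold weaker than the other compounds.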

Kandsami et al. [67] reported that the binding of the inhibitors IX5IX8 did not trigger any significant conformational change of the overall protein structure. The crystal structures of IX7 and IX8 showed the inhibitors were equally bound at the new allosteric site and the catalytic site. Kandsami et al. [67] suggested that the primary binding site was the catalytic site. They also indicated that the binding to the new allosteric binding site might be a result of the experimental concentration (10 mM) of the inhibitor solution used for soaking the crystals. However, under the same inhibitor concentration used for IX7 and IX8, other compounds such as IX5 and IX6 did not bind at the new allosteric site. An insightful observation that Kandsami et al. made was that the binding of IX7 and IX8 at the new allosteric site triggered a significant conformational change of this site.

The critical roles of CPI and CMR indicated in Eqs. 9a–9c were consistent with the explanation of Kandsami et al. When the primary binding site was the same as that of the other analogs, no other parameters were required in the QSAR to account for the effects of binding at the secondary site.

There are other series of GP inhibitors whose binding modes were reported based on their ligand-bound GP X-ray crystal structures: 5-chloroindolyl derivatives (Table S4 in Supplementary Material 3) and phthalic acid derivatives and anthranilimide derivatives (Table S5 in Supplementary Material 3). The binding site of 5-chloroindolyl derivatives is the new allosteric (indole) binding site (NA) (Fig. 11) and the binding site of phthalic acid and anthranilimide derivatives is the allosteric activator (AMP) binding site (A). Interestingly, none of these inhibitors bound at the catalytic binding site. Because of their structural diversity and a limited number of compounds involved, no SAR/QSARs have been discussed, but Tables S4 and S5 were included for comparison. Further study would be possible when the binding modes of additional compounds become available.

Fig. 11 a Binding modes of X1 (three structures, orange), X2 (five structures, green), and X3 (blue). b Binding modes of X2 (five structures, green), X3 (blue), and X4 (magenta). (Supplementary Material 3)

Multiple binding sites of glycogen phosphorylase inhibitors

Among the crystal structures that we examined, 36 GP inhibitors were bound at two separate binding sites. They are summarized in Table 12. (Five single-site binders are also included for the purpose of discussion.) Except for IV23, IV24, IV25, V2, VI5, and VI7, the primary binding site of these compounds is the catalytic site. Their secondary binding sites include the novel allosteric binding site (N), the allosteric (AMP) binding site (A), the new allosteric (indole) binding site (NA), the inhibitor (purine) binding site (I), and the quercetin binding site (Q) (Fig. 12). There are only two indirubin derivatives (V1 and V2) with reported ligand-bound GP crystal structures. Both compounds are bound at the inhibitor (purine) binding site (I), but V2 is also bound at the allosteric (AMP) binding site (A). Unlike the other five flavonoid analogs in Table 6, VI5 and VI7 are bound at the quercetin binding site (Q) and the allosteric (AMP) binding site (A), respectively. However, they are not bound at the catalytic binding site (C). III15, VII1–VII17, VII26, VII29, VII33, IX7, IX8, and IX26 are bound at both the catalytic site (C) and the new allosteric (indole) binding site (NA).

Table 12 Summary of GP inhibitors that are bound to more than one site discussed in this paper
Fig. 12 The three distinctive binding modes of the structural analogs at the new allosteric (indole) binding site and location of different binding sites. a Binding modes of the group I compounds at the new allosteric (indole) binding site (bottom) and the catalytic site (top). Compounds included are VII1–VII12, VII14, VII15, IX7, IX8, IX26. b Binding modes of the group II compounds at the catalytic site (top) and the new allosteric (indole) binding site (bottom). Compounds included are III15, VII13, VII16 (1K06, 1K08, 2QNB), VII17, VII33. c Binding modes of the group III compounds at the new allosteric (indole) binding site (top) and the catalytic site (bottom). Compounds included are VII26 and VII29

Fig. 13 Location of the five allosteric binding sites in GP: the allosteric (AMP) binding site (A; II3, yellow, 4GPB), the new allosteric (indole) binding site (NA; III15, cyan, 6QA6), the inhibitor (purine) binding site (I; IV21, magenta, 3BD7), the quercetin binding site (Q; VI5, orange, 4MRA), the novel allosteric binding site (N; green, I1, 3NP7: Z15), and the catalytic site (C; center, between orange and magenta)

There is no apparent structural reason in the ligands why some compounds are bound at more than one site. As discussed above, the effects of inhibitor binding at the secondary binding site on the correlations were not significant when the inhibitor bound at both the primary and the secondary binding sites. In contrast, the effects were noticeable when the inhibitors bound at the secondary binding site without binding at the primary one. The latter group of compounds would ultimately end up as outliers in SAR/QSAR (for example, Eq. 4g).

Different binding modes at the allosteric binding sites of glycogen phosphorylase

Twenty-three of the 36 compounds in Table 12 are bound at the catalytic site (C) and the new allosteric (indole) binding site (NA). The binding modes of these compounds at the catalytic site are similar and not appreciably different from the customary binding modes of structural analogs. However, the binding modes at the new allosteric (indole) binding site are relatively diverse as shown in Fig. 12. There are three distinctive binding modes (Fig. 12a–12c). These diverse binding modes are most likely due to the location of the binding site, less buried than the catalytic site.

Despite such distinctive binding modes, no other parameter was required in the various allosteric QSARs examined above. The results indicated that the effects of binding at the secondary binding site were minimal when the inhibitor was also bound at the primary binding site. On the other hand, the effects of binding at the secondary binding site were significant if the inhibitor was bound only at the secondary binding site. Such effects could be explained by the allosteric mechanism. When the inhibitor binds at the catalytic site, the access of the substrate glycogen to the catalytic site is restricted by the 280s loop. In this manner, the binding of an inhibitor at the catalytic site stabilizes the T-state conformation of the enzyme and blocks the enzyme activity. Since the enzyme function is already reduced at this point, additional binding of the inhibitor at the allosteric site would not affect the enzyme activity further. On the other hand, when the inhibitor binds only at the allosteric binding site, the binding causes conformational changes of the enzyme by different mechanisms of action [68, 69]. Therefore, the binding at the secondary site influences the enzyme activity. This would eventually yield outliers in SAR/QSAR, as in Eq. 4g.

Table 13 summarizes the number of inhibitors bound at the various binding sites of glycogen phosphorylase. Most of the inhibitors were bound at the catalytic site and an allosteric binding site. However, thirty-one out of 167 inhibitors (indirubin derivatives (Table 5), flavonoids (Table 6), acyl urea derivatives (Table 10), 5-chloroindolyl derivatives (Table S4), and phthalic acid and anthranilimide derivatives (Table S5)) did not bind to the catalytic site. Glucopyranosyl nucleoside derivatives (Table 4) were bound at three different binding sites, including the catalytic site and two allosteric binding sites. Flavonoids (Table 6) were bound at three separate allosteric binding sites. Phthalic acid and anthranilimide derivatives (Table S5) were bound at a single allosteric binding site.

Table 13 Number of inhibitors bound at different binding sites of glycogen phosphorylase examined in this study

Among the various allosteric binding sites, most dual-binding inhibitors preferred to bind at the new allosteric (indole) binding site, the allosteric (AMP) binding site, and the inhibitor (purine) binding site of GP, in that order. Only one inhibitor each bound at the quercetin binding site and the novel allosteric binding site.

Allosteric enzymes are enzymes that have at least one binding site other than the active site, and they can have more than one such allosteric site. Allosteric sites are distinct from the active site and the substrate-binding site [70]. An allosteric inhibitor is a molecule that binds to the enzyme at an allosteric site, and allosteric inhibition is a form of noncompetitive inhibition. A noncompetitive inhibitor does not compete directly with the substrate at the active site. Instead, it indirectly alters the structure of the enzyme. After this structural change, the enzyme becomes inactive and no longer binds its substrate, which slows the formation of products [71].

Figure 13 shows five allosteric binding sites in GP. Table 13 shows that the inhibitors of five (Tables 5, 6, 8, S4, and S5) out of 11 series are bound only to an allosteric site: these inhibitors are single allosteric site inhibitors. Because of their structural diversity, QSAR analyses were performed only for the datasets in Tables 6 and 8; the datasets in Tables 5, S4, and S5 were not analyzed. The QSARs from the data in Tables 6 and 8 are both inverted parabola correlations (Eqs. 4g and 8). Except for two flavonoid analogs (VI6 and VI7) in Table 6 and VIII5, VIII6, and VIII7 in Table 8, which were structurally diverse, all the remaining compounds whose binding modes were identical were included in the QSAR analysis and correctly identified as single-site allosteric binders. One compound (IV5 from Table 6), which binds at a different allosteric site, was identified as an outlier in the corresponding QSAR (Eq. 4g). The outcome was not surprising because, unlike all the other inhibitors, IV5 bound at a separate binding site (the quercetin binding site). The inverted parabola QSARs obtained for these allosteric interactions were consistent with the suggestion and the allosteric QSAR results of Verma and Hansch [5,6,7,8,9,10].
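How a compound bound at a different site can surface as an outlier in a parabolic QSAR may be illustrated with a minimal Python sketch. It fits a parabola by least squares and flags compounds whose residual exceeds a chosen multiple of the standard deviation of the fit. The data below are synthetic and are not values from Table 6; the function names and the two-standard-deviation cutoff are assumptions made for illustration and are not necessarily the criterion used for Eq. 4g.

```python
from math import sqrt


def fit_parabola(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    # Power sums S[k] = sum(x**k) and moments T[k] = sum(y * x**k).
    S = [sum(x**k for x in xs) for k in range(5)]
    T = [sum(y * x**k for x, y in zip(xs, ys)) for k in range(3)]
    A = [[S[i + j] for j in range(3)] for i in range(3)]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(A)
    coeffs = []
    for col in range(3):  # Cramer's rule: replace column `col` with T
        m = [[T[r] if c == col else A[r][c] for c in range(3)] for r in range(3)]
        coeffs.append(det3(m) / d)
    return coeffs  # [c0, c1, c2]


def flag_outliers(xs, ys, n_sd=2.0):
    """Return the quadratic coefficient and a per-compound outlier flag."""
    c0, c1, c2 = fit_parabola(xs, ys)
    residuals = [y - (c0 + c1 * x + c2 * x * x) for x, y in zip(xs, ys)]
    # Standard deviation of the fit; three parameters were estimated.
    s = sqrt(sum(r * r for r in residuals) / (len(xs) - 3))
    return c2, [abs(r) > n_sd * s for r in residuals]


# SYNTHETIC descriptor values and activities (hypothetical, for illustration):
# an inverted parabola with one compound displaced by 1.2 log units.
xs = [1.0 + 0.5 * i for i in range(9)]
ys = [0.4 * (x - 3.0) ** 2 + 5.0 for x in xs]
ys[6] -= 1.2

c2, flags = flag_outliers(xs, ys)
print("inverted" if c2 > 0 else "normal", [i for i, f in enumerate(flags) if f])
```

In this sketch the suspect compound is still part of the fit when it is flagged, so the test is conservative; in practice one would typically refit without the flagged compound and compare the statistics.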

Dual inhibitions and their QSARs

The inhibitors of five (Tables 2, 3, 4, 7 and 9) out of 11 series described in Table 11 are bound to an allosteric site as well as the catalytic site (orthosteric site). These inhibitors are dual binders. The QSARs (Table 12) for these series of compounds include linear and normal as well as inverted parabola correlations. Equations 2a (or 2b), 3a, 3b, 5a, and 6 are normal parabola QSARs. Equations 2d and 8 are inverted parabola correlations, suggesting that these are allosteric QSARs. All other equations are linear correlations (Table 14).

Table 14 Summary of QSAR of GP inhibitors described in this study

It is worth noting that both normal and inverse parabola correlations are included in describing these dual inhibitions. Since linear and/or normal parabola QSARs are reported from various correlation studies, and inverted parabola QSARs are seen from allosteric inhibition studies, QSARs of all such forms are deemed natural to describe dual allosteric inhibitions. Ultimately, these QSARs can contain outliers observed in many QSARs [1, 2, 4] that are due to various possible reasons (Ref. [2] and the references cited therein).

Sharma and Gupta [72, 73] reported normal and inverse parabolic relationships with CMR in several inhibitor series, suggesting a dual allosteric binding mode in glycine/NMDA antagonism. They proposed that some molecules may alter the shape of the active site residues, leading to normal and inverted allosteric correlations. They supported their suggestions with a molecular docking simulation study [73], whereas this study is based on inhibitor-bound enzyme X-ray structures. Several recent studies indicated an effective dual-targeting therapeutic mechanism involving allosteric and orthosteric binding sites [74,75,76,77,78,79,80].

Normal parabolic or bilinear correlations for allosteric interactions

Verma and Hansch [5] reported two allosteric inverted parabolic QSARs (Eqs. 11a and 12b) with GP inhibitors, as shown in datasets 1 and 5 of Table S2. The compounds they studied were 5-chloroindolyl derivatives, which belong to Table S4. Four compounds (X1–X4) are listed in Table S4, and all are bound at the new allosteric (indole) binding (NA) site. It remains unknown whether any of the structural analogs that Verma and Hansch used in these allosteric QSARs bound at more than one binding site or at a separate binding site.

They included all the compounds of the corresponding structures reported by Wright et al. [81] without considering their binding sites. Interestingly, the authors reported an additional normal parabolic QSAR in each case (Eqs. 11b and 12a), even though these equations were not described as allosteric QSARs. A single-parameter equation (Eq. 11c) was additionally included. There was one final outlier in each example after the ‘splitting QSAR’ development.

Inhibition of glycogen phosphorylase A (GPA, EC 2.4.1.1) by 5-chloroindolyl derivatives I

figure c
$$\begin{aligned} \log 1/C =\; & -4.96(\pm 2.61)\,\text{CMR} + 0.20(\pm 0.12)\,\text{CMR}^{2} + 36.54(\pm 14.44) \\ & n = 21,\; r^{2} = 0.855,\; q^{2} = 0.819,\; s = 0.193 \\ & \text{inversion point for CMR} = 12.38\ (11.85{-}14.30) \end{aligned}$$
(11a)

$$\begin{aligned} \log 1/C =\; & 28.01(\pm 6.42)\,\text{ClogP} - 4.19(\pm 0.95)\,\text{ClogP}^{2} - 39.39(\pm 10.74) \\ & n = 10,\; r^{2} = 0.940,\; q^{2} = 0.866,\; s = 0.249 \\ & \text{optimum ClogP} = 3.34\ (3.28{-}3.40) \end{aligned}$$
(11b)

$$\begin{aligned} \log 1/C =\; & -0.81(\pm 0.41)\,\text{ClogP} - 3.01(\pm 1.55) \\ & n = 5,\; r^{2} = 0.931,\; q^{2} = 0.831,\; s = 0.215 \\ & \text{outlier} = \text{CH}_{2}\text{CH}_{2}\text{CH}_{2}\text{N}(\text{CH}_{3})_{2} \end{aligned}$$
(11c)

Inhibition of glycogen phosphorylase A (GPA, EC 2.4.1.1) by 5-chloroindolyl derivatives II

$$\begin{aligned} \log 1/C =\; & 0.62(\pm 0.13)\,\text{ClogP} + 4.10(\pm 1.10)\,\text{CMR} - 0.23(\pm 0.06)\,\text{CMR}^{2} - 13.03(\pm 5.60) \\ & n = 20,\; r^{2} = 0.881,\; q^{2} = 0.819,\; s = 0.172 \\ & \text{optimum CMR} = 8.92\ (8.48{-}9.25) \end{aligned}$$
(12a)

$$\begin{aligned} \log 1/C =\; & -1.52(\pm 0.39)\,\text{ClogP} + 0.38(\pm 0.10)\,\text{ClogP}^{2} + 7.74(\pm 0.32) \\ & n = 7,\; r^{2} = 0.967,\; q^{2} = 0.926,\; s = 0.131 \\ & \text{inversion point for ClogP} = 2.00\ (1.85{-}2.18) \\ & \text{outlier} = \text{3-Tetrahydrofuryl} \end{aligned}$$
(12b)
figure d
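The inversion points and optima quoted with Eqs. 11a, 11b, 12a, and 12b follow from the vertex of each parabola. A minimal Python sketch, using only the linear and quadratic coefficients printed above, recomputes them and labels each equation as normal or inverted. The class and field names are illustrative and are not taken from the original work, and small differences from the printed values are expected because the printed coefficients are rounded.

```python
from dataclasses import dataclass


@dataclass
class ParabolicQSAR:
    """log 1/C = linear * X + quadratic * X**2 + (other terms)."""
    label: str
    descriptor: str
    linear: float     # coefficient of X
    quadratic: float  # coefficient of X**2

    def vertex(self) -> float:
        # d(log 1/C)/dX = linear + 2 * quadratic * X = 0
        return -self.linear / (2.0 * self.quadratic)

    def shape(self) -> str:
        # Negative X**2 term: activity peaks at the vertex (normal parabola).
        # Positive X**2 term: activity bottoms out at the vertex (inverted).
        return "normal" if self.quadratic < 0 else "inverted"


# Coefficients copied from Eqs. 11a, 11b, 12a, and 12b as printed above.
EQUATIONS = [
    ParabolicQSAR("11a", "CMR", -4.96, 0.20),
    ParabolicQSAR("11b", "ClogP", 28.01, -4.19),
    ParabolicQSAR("12a", "CMR", 4.10, -0.23),
    ParabolicQSAR("12b", "ClogP", -1.52, 0.38),
]

for eq in EQUATIONS:
    print(f"Eq. {eq.label}: {eq.shape()} parabola in {eq.descriptor}, "
          f"vertex near {eq.vertex():.2f}")
```

The sign of the squared term alone decides the classification, which is why Eqs. 11a and 12b are read as inverted parabolas (an inversion point) and Eqs. 11b and 12a as normal parabolas (an optimum).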

Hansch’s group reported inverted parabola or bilinear QSAR correlations to indicate the allosteric interactions and change of conformations involved. In our recent search of the C-QSAR database for all the reported QSAR equations using three particular search queries (carbonic anhydrase, elastase inhibitor, and rhinovirus inhibitor) [2], 270 equations were retrieved. Among the 270 equations, 19 equations were inverted parabolic or bilinear correlations (Table S1) and 43 equations were normal parabolic or bilinear correlations. Supuran [82] reported that one of the carbonic anhydrase inhibition mechanisms represents an allosteric interaction with conformational change, as suggested by the crystal structure. Our results presented here also show that, in addition to the inverted parabola QSARs, the normal parabola QSARs and the linear parameter QSARs can equally describe dual allosteric interactions.

In the study of the allosteric site of muscarinic acetylcholine M2-receptors, Bender et al. [83] reported a QSAR with a significant correlation between the volume of the substituents and the allosteric potency. One significant point to note about their allosteric correlation is that their QSAR is a normal parabola, not an inverted one. The allosteric potencies of the compounds they studied cover more than two orders of magnitude, and the dataset was suitable to establish a QSAR.

In another report, Sharma and Gupta [72] examined several sets of compounds as selective glycine/NMDA (N-Methyl-D-aspartic acid) site antagonists and reported ten QSARs suggesting dual allosteric binding interactions. Interestingly, two alternative forms of allosteric QSARs were reported: normal and inverted parabola. Among the ten QSARs, three were normal parabolic correlations (with CMR) and three were inverted parabolic correlations (one with ClogP and two with CMR). Additionally, there were other linear parameter equations with CMR or CPI. Unlike the suggestions made by Verma and Hansch, they described the normal parabola relationship for the allosteric interactions as well.

It was assumed that at the inversion point the structure of a receptor is forced to change into a new shape. This would result in an altogether different type of interaction. Another possibility would be that there is more than one binding site. In such a case the ligand should not bind with the same parameters defined in the first half of the equation [8].

Hansch et al. [6] reported 60 examples of the normal parabolic or bilinear QSARs with CMR and 27 examples with MgVol from their QSAR database. Even so, they were not certain if any of those results involved allosteric effects. They suggested other researchers should check such cases for the possibility of allostery and the role of QSARs in rationalizing such results [18].

Based on the various reports described above, as well as our results presented here, it is clear that normal parabola/bilinear QSARs, linear parameters, and inverted QSARs can all describe allosteric interactions, especially in the case of dual allosteric interactions. Since normal parabola/bilinear QSARs are frequently observed in QSARs for various biological activities, the key QSARs that can be used to uncover allosteric interactions are the inverted parabola/bilinear QSARs, as suggested by Hansch and his co-workers. Even though this study confirmed Hansch et al.’s attribution of inverted parabolic/bilinear QSARs to the allosteric ligand-binding mechanism, additional studies with other allosteric binders and proteins (including dual binders) for which experimental binding information is available would further confirm this point and help generalize it.

Conclusion

We examined over 200 X-ray crystal structures of the ligand-bound allosteric enzyme glycogen phosphorylase. The QSAR analyses of the inhibitors resulted in inverted parabola correlations in several cases. In addition, we obtained normal parabola as well as linear correlations. These results indicated that linear, normal parabola/bilinear, and inverted parabola/bilinear correlations could all describe the allosteric interactions, particularly dual allosteric interactions. In many cases, the binding of various allosteric inhibitors was accompanied by a conformational change. This study supported Hansch and his co-workers’ proposal that inverted parabola/bilinear QSARs describe allosteric interactions and that such QSARs could be used to uncover them.

The crystal structures revealed many ligands bound at more than one binding site of the enzyme. Some compounds were bound at the secondary binding site only and not at the primary binding site where most other structural analogs were bound. It is not apparent at present why these compounds bound at more than one binding site, or only at the secondary site, unlike their close structural analogs. We initially expected that compounds bound at an uncommon secondary binding site would be outliers in QSAR. On the contrary, the results revealed that the effects of binding at the secondary binding site on many SAR/QSARs were not significant when the inhibitor was also bound at the primary binding site. However, the effects were noticeable when the inhibitors bound at the secondary binding site without binding at the primary binding site. We proposed that such a phenomenon could be explained by the allosteric mechanism. We also suggested that compounds belonging to the latter group would eventually end up as outliers in the SAR/QSAR of that series. As in Eq. 4g, QSAR analysis may be able to identify such compounds as outliers.