Introduction

Type 2 diabetes is a chronic metabolic disorder that results from defects in both insulin action and insulin secretion, and its prevalence is increasing rapidly throughout the world. The scale of the problem can be inferred from the 2008 report of the International Diabetes Federation (IDF), which states that about 280 million people worldwide suffer from this disease and that this figure is expected to pass 410 million within 20 years. The incidence of diabetes has thus reached global epidemic proportions. Obesity is a major risk factor for developing this metabolic disease (James, 2004).

Potential molecular targets to treat and prevent these metabolic disturbances are peroxisome proliferator-activated receptors (PPARs). PPARs are some of the central regulators of nutrient–gene interactions that regulate lipid, carbohydrate, and inflammatory pathways, thereby maintaining homeostasis (Kahn et al., 2006; Delerive et al., 2001; Evans et al., 2004; Martens et al., 2002). PPAR-α agonists (fibrates) and PPAR-γ agonists (thioglitazones or thiazolidinediones) have been used to improve serum lipoprotein profiles and glucose metabolism, respectively (Miyazaki et al., 2001; Staels et al., 1998).

PPAR-γ agonists, such as rosiglitazone and pioglitazone, are mainly used today to improve glucose metabolism and insulin sensitivity in diabetic patients (Miyazaki et al., 2001). They are known to increase basal and insulin-stimulated glucose transport in adipocytes and skeletal muscle cells without necessarily increasing glucose transporter levels (Prashantha Kumar et al., 2007). Thioglitazones are also known to preserve islet β-cell function in Type 2 diabetes. The mechanism whereby thioglitazones activate glucose transport, however, is only partly understood (Kanoh et al., 2000; Standaert et al., 2002; Weinstein et al., 1993). Comparative molecular similarity index analysis (CoMSIA) is a three-dimensional (3D) quantitative structure-activity relationship (QSAR) technique that uses an existing set of molecules and their corresponding biological activities to predict the biological activities of not-yet-synthesized compounds that are structurally related to that set. It has been used by earlier workers in the design of a variety of potential drugs (Chen et al., 2003, 2004). In view of the problems associated with designing PPAR-γ-targeted drugs, CoMSIA could facilitate the design of novel PPAR-γ agonists.

Earlier, we developed 2D QSAR and comparative molecular field analysis (CoMFA) models for thiazolidine-2,4-diones and their antihyperglycemic activities with different sets of compounds (Prashantha Kumar and Nanjan, 2008b; Prashantha Kumar et al., 2008a). In the present study, we report a CoMSIA study of the 3D structure-activity relationships of the title compounds with respect to their antihyperglycemic activity. We derived various models by varying all five fields available in CoMSIA, i.e., steric, electrostatic, hydrophobic, hydrogen bond donor, and hydrogen bond acceptor. We employed the regular steps of the CoMSIA protocol and computations at different levels of theory to find clues regarding the selectivity of thioglitazones toward PPAR-γ receptors. Compounds that did not show confirmed antihyperglycemic activity in our and other previous studies were not included. The present study can guide further structural optimization and help predict the potency and physicochemical properties of emerging clinical drug candidates from this class of compounds.

Materials and methods

CoMSIA study

The antihyperglycemic activities of various thiazolidine-2,4-diones were taken from the literature (Takashi et al., 1982). Takashi et al. reported the antihyperglycemic activity of a set of 50 compounds, all tested by the same procedure in genetically obese and diabetic yellow KK mice. Using this single set of structurally diverse compounds in the present study avoids incongruity in the data. The test-set and training-set compounds were chosen manually such that low-, moderate-, and high-activity compounds were present in approximately equal proportions in both sets. The reported antihyperglycemic activities (HGAs) were originally assigned numbers from 1 to 3 based on the percentage reduction in blood glucose concentration compared to the control. These data were converted into natural log molar antihyperglycemic activity data by dividing the original values by their respective molecular weights (MHGAs) and taking natural logarithms (lnMHGAs), as this gives numerically larger values for active compounds than for inactive compounds.
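The conversion described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' actual procedure: the function name `ln_mhga` and the molecular weight of 350.0 g/mol are hypothetical and chosen only for demonstration.

```python
import math

def ln_mhga(hga_score: float, molecular_weight: float) -> float:
    """Return ln(HGA / MW), the natural-log molar activity described in the text."""
    if hga_score <= 0 or molecular_weight <= 0:
        raise ValueError("score and molecular weight must be positive")
    return math.log(hga_score / molecular_weight)

# Hypothetical molecular weight (g/mol), used for illustration only.
mw = 350.0

# Activity scores run from 1 (low) to 3 (high), as in the source data.
values = {score: ln_mhga(score, mw) for score in (1, 2, 3)}
for score, value in values.items():
    print(f"HGA = {score}: lnMHGA = {value:.3f}")
```

Because HGA/MW is smaller than 1 for any realistic molecular weight, every lnMHGA value is negative; a more active compound simply yields a less negative, and therefore numerically larger, value.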

The training set was designed so as to account for all the structural and activity variations among the compounds. Thirty-nine compounds were included in the training set such that they represent each congeneric series of compounds and have a range of activities that spans that of the 50 compounds. The rest of the compounds constituted the test set used to validate the model generated by CoMSIA. The common structures of each congeneric series of compounds are given in Figs. 1, 2, and 3, and the structure of each compound and its biological activity value are given in Tables 1–6.

Fig. 1 General structure of 5-[4-(2-phenylethoxy)benzyl]-1,3-thiazolidine-2,4-dione derivatives

Fig. 2 General structure of 5-benzyl-1,3-thiazolidine-2,4-dione derivatives

Fig. 3 General structure of 5-(pyridine-3-ylmethyl)-1,3-thiazolidine-2,4-dione derivatives

Table 1 Biological activity data on the 5-[4-(2-phenylethoxy)benzyl]-1,3-thiazolidine-2,4-dione derivative series in the training set
Table 2 Biological activity data on the 5-[4-(2-phenylethoxy)benzyl]-1,3-thiazolidine-2,4-dione derivative series in the test set
Table 3 Biological activity data on the 5-benzyl-1,3-thiazolidine-2,4-dione derivative series in the training set
Table 4 Biological activity data on the 5-benzyl-1,3-thiazolidine-2,4-dione derivative series in the test set
Table 5 Biological activity data on the 5-(pyridine-3-ylmethyl)-1,3-thiazolidine-2,4-dione derivative series in the training set
Table 6 Biological activity data on the 5-(pyridine-3-ylmethyl)-1,3-thiazolidine-2,4-dione derivative series in the test set

The molecular modeling software package SYBYL 6.7, installed on a Silicon Graphics workstation with an IRIX 6.5 operating system (SYBYL Molecular Modeling Software, Version 6.7; Tripos Associates Inc., St. Louis, MO, USA), was used for 3D structure generation and molecular modeling studies. All compounds were built from fragments in the SYBYL database. Each structure was fully geometry-optimized using the standard Tripos force field (Clark et al., 1989) with a distance-dependent dielectric function until a root mean square (rms) deviation of 0.001 kcal/(mol·Å) was achieved. The partial atomic charges required for the electrostatic interactions were computed using the Gasteiger-Marsili (1980) method as implemented in SYBYL.

The conformational search was performed using a systematic search protocol. Rotatable bonds in all molecules were searched from 0° to 360° in 10° increments. The minimum energy conformation thus obtained was subsequently used in the analyses.
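The idea behind such a systematic search can be sketched as follows. This is only an illustration under stated assumptions: `toy_energy` is a hypothetical stand-in for a force-field energy (it is not the Tripos force field), and the function names are invented for this example and do not correspond to SYBYL routines.

```python
import itertools
import math

def torsion_grid(n_rotatable: int, step_deg: int = 10):
    """Yield every combination of torsion angles on a regular grid.

    0 and 360 degrees describe the same geometry, so each bond takes
    360 / step_deg distinct values (36 for a 10-degree step).
    """
    angles = range(0, 360, step_deg)
    return itertools.product(angles, repeat=n_rotatable)

def toy_energy(torsions) -> float:
    """Hypothetical threefold torsional term, NOT a real force field."""
    return sum(1.0 + math.cos(math.radians(3 * t)) for t in torsions)

def lowest_energy_conformer(n_rotatable: int, energy=toy_energy, step_deg: int = 10):
    """Return the grid point with the lowest energy."""
    return min(torsion_grid(n_rotatable, step_deg), key=energy)

# Two rotatable bonds give 36 ** 2 grid points to evaluate.
n_conformers = sum(1 for _ in torsion_grid(2))
best = lowest_energy_conformer(2)
print(n_conformers, best)
```

The number of grid points grows as 36 to the power of the number of rotatable bonds, which is why exhaustive searches of this kind are practical only for molecules with few rotatable bonds or with additional pruning.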

An important requirement for CoMSIA is that the 3D structures of the molecules to be analyzed be aligned according to a suitable conformation template, which is assumed to adopt a “bioactive conformation” (Cramer et al., 1988b). Therefore, molecules in the database were aligned using the Database Align routine available in SYBYL. Compounds were fitted to the template molecule 25, one of the most active molecules that correlated with the binding affinity toward PPAR-γ receptors. The alignment rule was optimized by using the appropriate common substructure (5-methylene-1,3-thiazolidine-2,4-dione) for alignment. The aligned training set is shown in Fig. 5. Field fit was used to optimize the alignment of the molecules to a previously calculated steric and electrostatic field (Clark et al., 1990). These aligned molecules were stored as the training aligned and test aligned databases, respectively.

The default SYBYL settings were used for all steps unless otherwise noted. CoMSIA steric, electrostatic, hydrophobic, hydrogen bond donor, and hydrogen bond acceptor interaction fields were calculated at each intersection point of a regularly spaced 3D cubic lattice, with a grid spacing of 2.0 Å in the x, y, and z directions, that encompassed the aligned molecules. The steric (Lennard-Jones 6–12 potential) and electrostatic (Coulomb potential) field interactions were calculated using an sp3 carbon probe atom (1.52-Å van der Waals radius) carrying a +1 charge, with a distance-dependent dielectric at each lattice point. Charges were determined using the Gasteiger-Marsili method. The energy calculation was performed for all grid points such that all energies were constrained to lie between −30 and +30 kcal/mol.
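The lattice construction and energy truncation described above can be sketched as follows. This is a simplified illustration, not SYBYL's implementation: the 4.0 Å margin around the molecules, the dielectric form ε = r, the function names, and the two-atom example are all assumptions made for demonstration. Only the electrostatic term is shown.

```python
import itertools
import math

COULOMB_KCAL = 332.06  # converts e^2/Angstrom to kcal/mol

def build_lattice(coords, spacing=2.0, margin=4.0):
    """Regular grid enclosing all atoms; the margin is an assumed padding."""
    axes = []
    for dim in range(3):
        low = min(c[dim] for c in coords) - margin
        high = max(c[dim] for c in coords) + margin
        n_points = math.ceil((high - low) / spacing) + 1
        axes.append([low + i * spacing for i in range(n_points)])
    return list(itertools.product(*axes))

def truncate(energy, cutoff=30.0):
    """Constrain an energy to the interval [-cutoff, +cutoff] kcal/mol."""
    return max(-cutoff, min(cutoff, energy))

def probe_electrostatic(point, atoms, probe_charge=1.0, cutoff=30.0):
    """Coulomb energy of a charged probe, assuming a dielectric of eps = r."""
    total = 0.0
    for x, y, z, charge in atoms:
        r2 = (point[0] - x) ** 2 + (point[1] - y) ** 2 + (point[2] - z) ** 2
        if r2 == 0.0:
            # Probe sits exactly on an atom: return the cutoff with the right sign.
            return cutoff if charge * probe_charge >= 0 else -cutoff
        # With eps = r, q1*q2 / (eps * r) becomes q1*q2 / r**2.
        total += COULOMB_KCAL * probe_charge * charge / r2
    return truncate(total, cutoff)

# Hypothetical two-atom "molecule": (x, y, z, partial charge).
atoms = [(0.0, 0.0, 0.0, 0.5), (3.0, 1.0, 0.0, -0.5)]
lattice = build_lattice([a[:3] for a in atoms])
field = [probe_electrostatic(p, atoms) for p in lattice]
print(len(lattice), min(field), max(field))
```

Truncating the energies keeps grid points that lie very close to an atom from dominating the descriptor matrix with extremely large values.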

Partial least squares (PLS) analysis was done to derive the 3D QSAR models, utilizing CoMSIA standard scaling for molecular fields (Clark et al., 1989). The activity data (lnMHGA) were used as the dependent variable, and the predictive value of the model, represented by q², was evaluated using the leave-one-out (LOO) cross-validation method. To speed up calculations and to reduce the noise between the calculated columns, column filtering was used to exclude the columns with a variance <2.0 units. The optimum number of components was determined based on the standard error of prediction at different component levels and the q² values. The conventional r² values for the training set were also calculated under different combinations of CoMSIA models. To further assess the robustness and statistical confidence of the derived models, bootstrapping analysis (Cramer et al., 1988a) for 100 runs was performed. Bootstrapping involves generating many new datasets from the original dataset by randomly choosing samples from it. The statistical calculation is performed on each of these bootstrapping samples. The difference between the parameters calculated from the original dataset and the average of the parameters calculated from the many bootstrapping samples is a measure of the bias of the original calculations. Models with a cross-validated q² value >0.3 were sought, since at this value the probability of chance correlation is <5% (Clark et al., 1990). To validate the developed CoMSIA model, the activities of the training-set and test-set compounds were predicted. The actual and predicted values of the compounds are listed in Tables 1–6.
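The leave-one-out q² statistic can be illustrated with a short sketch. This is not the SYBYL PLS implementation: a single-descriptor least-squares line (`fit_line`) stands in for the PLS model, the data are invented, and the sum of squares is taken about the mean of all observed activities, which is one common convention and an assumption here.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line through (xs, ys); a stand-in for PLS, not PLS itself."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def loo_q2(xs, ys, fit=fit_line):
    """q2 = 1 - PRESS / SS, each compound predicted by a model that never saw it."""
    press = 0.0
    for i in range(len(ys)):
        # Refit the model with compound i left out, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        press += (ys[i] - model(xs[i])) ** 2
    y_mean = mean(ys)
    ss = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - press / ss

# Invented descriptor and activity values, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.0, 3.0, 2.0, 5.0, 4.0]
print(f"q2 = {loo_q2(descriptor, activity):.3f}")
```

Unlike the conventional r², q² can be negative when the left-out predictions are worse than simply predicting the mean activity, which is why it is the stricter measure of predictive ability.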

3D QSAR models were established for our structures. Graphical examination of the best developed model revealed the regions around the molecules in which a greater or lesser number of electronegative groups, electropositive groups, hydrophobic groups, hydrogen bond donors (OH or NH groups), and hydrogen bond acceptors (N, O, F) is favorable, and whether steric bulk is an important factor for affinity toward PPAR-γ receptors and for antihyperglycemic activity.

Results and discussion

The 3D QSAR models were derived employing the CoMSIA method with 39 compounds in the training set. The remaining 11 compounds, which are congeneric to the training-set compounds, constituted the test set. PLS analyses on the 39 training-set compounds yielded a q² value (by the LOO method) of 0.51 with an optimal number of six components. The statistical data are reported in Table 7. Since the q² value was reasonably good, no compound was removed as an outlier. Further, exclusion of compounds in a particular congeneric series may affect the predictivity of the model. The developed model showed good predicted activities for the training set, with low residual values as reported in Tables 1, 3, and 5 and Fig. 4. The predictive ability of the model was checked by predicting the activities of the test-set compounds under different models. The model correlating the training-set compounds with respect to the steric, electrostatic, hydrophobic, hydrogen bond donor, and hydrogen bond acceptor fields seems to be of better statistical significance than all other models, as reported in Table 7. The predicted activities of the training- and test-set compounds reported in this article are all from this optimized model.

Table 7 PLS statistics for the CoMSIA models
Fig. 4 Correlation between observed and predicted activities of the developed CoMSIA model: (shaded square) training-set compounds; (shaded circle) test-set compounds

Contour plots depicting the steric, electrostatic, hydrophobicity, hydrogen bond donor, and hydrogen bond acceptor fields of the compounds generated based on the contributions to the PLS model are shown in Fig. 5. Statistical parameters like the q² value, r² value, F value, and p value and the steric and electrostatic contributions are reported in Table 7. Steric, electrostatic, hydrophobicity, hydrogen bond donor, and hydrogen bond acceptor contributions were found to be 36.8%, 22.2%, 21.3%, 8.0%, and 11.7%, respectively. The developed model was further validated by predicting the activities of the compounds in the test set. This gave good predicted activities with low residual activities as reported in Tables 2, 4, and 6 and Fig. 4.

Fig. 5 Aligned thioglitazones CoMSIA SD × coefficient contour plot. Green contours indicate regions where steric bulk is favorable and yellow contours indicate regions where steric bulk is not favored. Blue contours indicate regions where electronegative groups increase activity and red contours indicate regions where electronegative groups decrease activity. White contours indicate regions where hydrophobicity is favorable and magenta contours indicate regions where hydrophobicity is not favored. (Color figure online)

The conclusions drawn from the above contour plots generated from the steric, electrostatic, hydrophobic, and hydrogen bond acceptor and donor contributions of the generated CoMSIA model are as follows. A 5-benzyl-1,3-thiazolidine-2,4-dione moiety is required for binding to the PPAR-γ receptor (except for the compounds in Table 5, which contain a pyridine ring attached to the fifth position of the thiazolidine-2,4-dione ring system instead of benzene but still exhibit antihyperglycemic activity). An oxygen atom connected directly to the aromatic ring in the form of an ether is essential for the activity, as there is a blue contour near the red-colored oxygen atoms in Fig. 5. A two-carbon atom linkage between the oxygen atom and the terminal alicyclic, aromatic, or heterocyclic ring is ideal for the activity, as compounds not containing the two-carbon atom linkage failed to show the activity (compounds 34 and 36). Extension of the carbon linkage chain between the benzyl moiety and the terminal heterocyclic ring may pull the terminal ring toward the yellow region and decrease the activity. A prominent yellow contour in that region indicates that there should not be any substitution over those two carbon atoms. The green contour map over the terminal ring system indicates that it is favorable for substitution either at the point of connection (compounds 23 and 24) or at the ortho positions (compounds 2, 7, and 11). The red contour, partly masked by the magenta contour near the two-carbon atom linkage, indicates that replacing or substituting with electronegative groups will decrease the activity. The white contour near the terminal ring indicates that the presence of similar hydrophobic groups will enhance the activity (compounds 1–13, 23–29, 32, 33, and 39). The presence of a magenta-colored contour over the ethylene bridge indicates that hydrophobic substituents should be avoided there. Finally, the model indicates that a thiazolidinedione ring system containing three hydrogen bond acceptors and one hydrogen bond donor (NH) is the basic requirement for this class of compounds to exhibit antihyperglycemic activity.