Introduction

Thyroid hormone receptors (TRs) play an important role in normal development, differentiation, growth and metabolic regulation in humans. TRs can be regarded as ligand-regulated transcription factors, given their ability to bind ligands and DNA and to control transcription [1, 2]. TRs, which are members of the nuclear hormone receptor (NR) superfamily, are differentially expressed as several isoforms (TRα1, TRβ1 and TRβ2) in different tissues of the body and are key targets for commonly used drugs [3–5].

TRs comprise functional domains A/B, C, D and E, with the DNA binding domain (DBD), hinge domain and ligand binding domain (LBD) corresponding to the C, D and E domains, respectively. The sequences of the DBD and LBD are highly conserved among the TR isoforms. Conversely, the A/B domains of the TRα and TRβ isoforms show no resemblance [6]. TRα1 and TRβ1 vary from 400 to 500 amino acids in size and have highly homologous DBDs and LBDs. The TRα isoforms α1, α2 and α3 are encoded by the TRα gene and differ in their carboxyl termini because of alternative splicing. TRα1 binds T3, which leads to activation or repression of target genes, whereas TRα2 and TRα3 do not bind T3 and inhibit T3 function [7]. The TRβ gene encodes three T3-binding splice isoforms, TRβ1, TRβ2 and TRβ3, which differ in their amino termini. All TRβ isoforms bind their cognate ligand T3 with high affinity to facilitate target gene transcription [1]. The effects of thyroid hormone have been delineated by observations in human subjects with deficient or excessive thyroid hormone levels. Hypothyroidism results from thyroid hormone deficiency, while hyperthyroidism results from oversecretion of thyroid hormone. Resistance to thyroid hormone (RTH) is a condition in which individuals show reduced sensitivity to thyroid hormone, raised circulating serum levels of T3 and T4, and high or non-suppressed serum levels of thyroid stimulating hormone (TSH). RTH has been linked to mutations in TR genes. Notably, RTH patients can exhibit variable resistance in different tissues and clinical symptoms characteristic of both hypothyroidism and hyperthyroidism [8].

To date, numerous thyroid ligands have been reported [9, 10]. These TR ligands could lead to safe treatments for non-thyroid illness while avoiding cardiac side effects and could therefore be used as a short-term supplement to traditional therapies [11]. Specifically blocking the effects of thyroid hormone binding at the receptor level may lead to a substantial improvement in the treatment of hyperthyroid patients [3].

Various quantitative structure–activity relationship (QSAR) models, from one to six dimensions, as well as structure–activity relationships (SARs), have been developed successfully [12–15]. The hormone activities of OH-PBDEs towards TRβ and a corresponding 2D-QSAR model have also been reported [16]. 3D-QSAR studies have been combined with molecular docking and molecular dynamics (MD) simulation approaches [17–19]. These models can help to identify the structural features critical for high binding affinity of novel ligands for TRβ. The identification and design of novel TR antagonists could play a significant role in their potential medical applications in the therapy of thyroid disorders [20]. Such antagonists usually interact with hydrophobic and important hydrophilic residues lining the ligand-binding pocket [21].

The aim of the present study was to explore the binding mode of inhibitors with TRα and TRβ receptors through molecular docking and MD simulation to gain further insight into their structure–activity relationship, and to obtain robust 3D-QSAR models. Such models could be useful in designing novel antagonists.

Materials and methods

All computational analyses were carried out on a Linux operating system (CentOS 6.3) on a 64-bit machine with an Intel quad-core (4*2 core) processor and 16 GB RAM, an NVIDIA Quadro FX 1700 graphics card (512 MB memory) and a 900-GB SAS hard disk.

Ligand preparation and dataset for analysis

Three-dimensional (3D) molecular structure datasets of 357 and 203 inhibitors from the binding database [22] were employed against TRβ and TRα, respectively. All molecules were processed with LigPrep 2.5 software [23] to assign appropriate protonation states at physiological pH (7.2 ± 0.2), employing the ionizer option. The IC50 values of the biologically active molecules were converted to pIC50 = −log10 (IC50) + 9. The pIC50 values of the entire data set ranged from 2.699 to 10.721 for TRβ inhibitors and from 3.67 to 10 for TRα inhibitors. Care was taken to ensure a uniform distribution of TRα and TRβ inhibitors across the range of pIC50 values in both training and test sets.
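
The conversion above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function name is hypothetical, and the +9 offset is taken to mean that IC50 values are expressed in nM, which is an assumption because the unit is not stated explicitly.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Return pIC50 = -log10(IC50) + 9.

    Assumption: IC50 is given in nM, so the +9 offset converts
    the value to the usual molar scale.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive number")
    return -math.log10(ic50_nm) + 9.0


# A 1 nM inhibitor has pIC50 = 9; weaker binders give smaller values.
for ic50 in (0.1, 1.0, 100.0):
    print(ic50, round(pic50_from_ic50_nm(ic50), 3))
```

Under this assumption, the upper limit of 10 reported for the TRα set corresponds to an IC50 of 0.1 nM.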

Generating pharmacophore sites

Six pharmacophore features, namely hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic group (H), negatively ionizable (N), positively ionizable (P) and aromatic ring (R), are available in the Phase 3.4 [24] module of Schrödinger. The rules used to map the locations of pharmacophore sites are known as feature definitions and are represented within the program by a set of SMARTS patterns. Each pharmacophore feature is specified by a set of 3D structural configurations of the compound. Once a feature has been mapped to a specific position in a conformation, it is referred to as a pharmacophore site.

Searching common pharmacophores and scoring hypotheses

The desired variants were selected to find common pharmacophores among the active ligands. In this study, with five active compounds in the training set for TRβ and ten for TRα, common pharmacophores were examined using a scoring procedure to identify, from each surviving n-dimensional box, the pharmacophore that yields the best alignment of the active-set ligands. Each such pharmacophore provides a hypothesis describing how the active molecules interact with the receptor. As there are many boxes, there are many hypotheses. The scoring method ranks the hypotheses, allowing rational selection of those most suitable for further examination. The common pharmacophore hypotheses were scored with the root mean square deviation (RMSD) value set to <1.2 Å, the vector score value set to 0.5, and weighting applied to account for the alignment of inactive molecules under default constraints.

3D-QSAR model development

Phase, an integrated software module in Maestro, was used to produce pharmacophore and 3D-QSAR models for TRα and TRβ inhibitors. The 3D-QSAR models of TRα and TRβ were developed from the varying activities of different compounds that had all been aligned, with respect to a particular reference molecule, to a common pharmacophore hypothesis. An atom-based QSAR model, which takes all atoms into account, was used in this study. Standard program parameters were employed with partial least-squares (PLS) regression analysis. The accuracy of the models increases with the number of PLS factors until overfitting begins to occur.

Model validation

Phase 3D-QSAR models use distinct training and test sets rather than internal cross-validation methods. In Phase, leave-n-out models are built and the R² value between the leave-n-out predictions and the predictions from the model built on the complete training set is calculated. This value is designated the stability value and has a maximum value of 1. Models with high stability are preferred because they are not excessively dependent on the peculiarities of any specific training set.

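The variance in observed activity (\( \sigma_y^2 \)) and the coefficient of determination (R²) are important statistical parameters for assessing the accuracy of the training set model. They are defined in Eqs. 1 and 2, where \( y_i \) is the observed activity of training set molecule i, \( \overline{y} \) is the mean observed activity, n is the number of training set molecules and \( \sigma_{err}^2 \) is the variance of the prediction errors:

$$ {\sigma}_y^2=\frac{1}{n}{\displaystyle {\sum}_{i=1}^{n}{\left({y}_i-\overline{y}\right)}^2} $$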
(1)
$$ {R}^2=1-\frac{\sigma_{err}^2}{\sigma_y^2} $$
(2)

The value of R² is always non-negative because the regression coefficients are optimized to give a low sum of squared errors (SSE). The worst case arises when the independent variables have no statistical association with activity. The regression coefficients are then all zero, and the model comprises only an intercept term, whose value is the mean observed activity (\( \overline{y} \)). Therefore, every predicted activity is \( \overline{y} \), the variance in the errors (\( \sigma_{err}^2 \)) equals \( \sigma_y^2 \), and R² = 0.
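
As a minimal sketch of Eqs. 1 and 2, the following Python code computes R² from observed and predicted activities and reproduces the two limiting cases discussed above. The function names and the activity values are illustrative placeholders, not data from this study, and the error variance is assumed to be the mean squared difference between observed and predicted activities.

```python
def mean_squared_deviation(values, reference):
    """Mean of the squared differences between paired values."""
    return sum((v - r) ** 2 for v, r in zip(values, reference)) / len(values)


def r_squared(y_obs, y_pred):
    """R^2 = 1 - var_err / var_y (Eq. 2), with var_y from Eq. 1."""
    n = len(y_obs)
    y_mean = sum(y_obs) / n
    var_y = mean_squared_deviation(y_obs, [y_mean] * n)  # Eq. 1
    var_err = mean_squared_deviation(y_obs, y_pred)      # assumed error variance
    return 1.0 - var_err / var_y


# Placeholder activities for illustration only.
y_obs = [5.0, 6.0, 7.0, 8.0]
y_mean = sum(y_obs) / len(y_obs)

print(r_squared(y_obs, y_obs))                 # perfect fit: 1.0
print(r_squared(y_obs, [y_mean] * len(y_obs))) # intercept-only model: 0.0
```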

A statistical quantity Q² (Eq. 3), analogous to R², is calculated from the experimental and predicted activities of the test set. Unlike R², Q² can be negative, and negative values are not uncommon: the variance in y is smaller because the test set does not span as wide a range of activity values as the training set, and the variance in the errors is larger because test set errors tend to be larger than training set errors. Because all values are shifted by the sample means, the Pearson correlation coefficient (r, Eq. 4) is insensitive to systematic errors in the predictions, whereas Q² is not. Therefore, if the rank order of the predicted activities is essentially correct but there is a significant constant shift in the values relative to the experimental activities, r may still be moderately high even if Q² is small or negative.

$$ {\mathrm{Q}}^2={\mathrm{R}}^2\left(\mathrm{T}\right) $$
(3)
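$$ r=\frac{{\displaystyle {\sum}_{j\in T}\left({y}_j-{\overline{y}}_T\right)\left({\widehat{y}}_j-{\overline{\widehat{y}}}_T\right)}}{\sqrt{{\displaystyle {\sum}_{j\in T}{\left({y}_j-{\overline{y}}_T\right)}^2}{\displaystyle {\sum}_{j\in T}{\left({\widehat{y}}_j-{\overline{\widehat{y}}}_T\right)}^2}}} $$
(4)

Here, T denotes the test set, \( y_j \) and \( \widehat{y}_j \) are the observed and predicted activities of test set molecule j, and \( \overline{y}_T \) and \( \overline{\widehat{y}}_T \) are their respective test set means.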
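
The contrast between Q² and the Pearson correlation coefficient can be illustrated with a short Python sketch. It assumes that Q² is evaluated with the test set mean as reference, which is one reading of Eq. 3; the function names and the activity values are hypothetical and serve only to show the effect of a constant shift in the predictions.

```python
import math


def q_squared(y_obs, y_pred):
    """Q^2 on a test set: 1 - (error variance) / (variance of observed y)."""
    n = len(y_obs)
    y_mean = sum(y_obs) / n
    var_y = sum((y - y_mean) ** 2 for y in y_obs) / n
    var_err = sum((yo - yp) ** 2 for yo, yp in zip(y_obs, y_pred)) / n
    return 1.0 - var_err / var_y


def pearson_r(y_obs, y_pred):
    """Pearson correlation between observed and predicted activities."""
    n = len(y_obs)
    mean_obs = sum(y_obs) / n
    mean_pred = sum(y_pred) / n
    cov = sum((yo - mean_obs) * (yp - mean_pred) for yo, yp in zip(y_obs, y_pred))
    ss_obs = sum((yo - mean_obs) ** 2 for yo in y_obs)
    ss_pred = sum((yp - mean_pred) ** 2 for yp in y_pred)
    return cov / math.sqrt(ss_obs * ss_pred)


# Hypothetical test set: correct rank order, constant shift of +2 log units.
y_test = [5.0, 6.0, 7.0, 8.0]
y_shifted = [y + 2.0 for y in y_test]

print("r  =", pearson_r(y_test, y_shifted))  # unaffected by the shift
print("Q2 =", q_squared(y_test, y_shifted))  # negative because of the shift
```

In this toy case the rank order is perfect, so r stays at its maximum, while the systematic offset alone is enough to drive Q² below zero.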

The PLS regression was performed by PHASE with all available (i.e., seven) PLS factors. All models were validated by predicting activity for test sets of 47 and 71 molecules for TRα and TRβ, respectively.

Pharmacophore screening

For external cross validation of the predicted models, a Phase 3D database was created for each model. A total of 197 inhibitors, consisting of 146 TRβ inhibitors and 51 thyroid oxidase 2 antagonists, were used for external cross validation of the 3D-QSAR model of TRβ. A total of 66 inhibitors, consisting of 15 TRα inhibitors and 51 thyroid oxidase 2 inhibitors, were used to cross-validate the 3D-QSAR model of TRα.

Molecular docking

Molecular docking was performed to investigate in depth the binding modes of different sets of TRα and TRβ inhibitors. The Glide 5.8 [25] module of the Schrödinger software was used for molecular docking. Glide applies a hierarchical series of filters to explore probable locations of the ligand in the active-site region of the target protein. The shape and properties of the target protein are represented on a grid by several sets of fields that provide progressively more accurate scoring of the compound poses. Primarily, the side chain degrees of freedom were sampled, while minor backbone movements were permitted through minimization. The 3D coordinates of TRα in complex with ligand (PDB ID: 3JZB) and of TRβ in complex with ligand (PDB ID: 1NAX) were retrieved from the Brookhaven Protein Data Bank. Four key amino acid residues of TRβ (Arg 282, Arg 320, Asn 331 and His 435) and three of TRα (Arg 266, Ser 277 and His 381), defined as flexible residues in the docking process, were considered as binding sites. The best pose was selected as the conformation combining a high docking score with maximum activity.

The XP Glide docking process is highly accurate; it produces 10,000 poses for every ligand during docking and returns the top pose on the basis of the energy term Emodel. The top poses of each compound were then ranked on the basis of XPGscore [26]. A lower XPGscore for a ligand indicates better binding affinity for the protein. The XPGscore cut-off for XP Glide docking was set to 0.0 kcal mol−1 so that compounds with a positive XPGscore after docking were rejected.
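
The filtering and ranking rule described above can be sketched as follows. This is a plain Python illustration of the logic only and does not call Glide; the function name, the ligand labels and the scores are hypothetical placeholders.

```python
def rank_by_xpgscore(scores, cutoff=0.0):
    """Reject ligands with XPGscore above `cutoff` and rank the rest.

    `scores` maps a ligand label to its XPGscore in kcal/mol.
    Lower (more negative) scores are ranked first.
    """
    kept = {name: score for name, score in scores.items() if score <= cutoff}
    return sorted(kept.items(), key=lambda item: item[1])


# Hypothetical scores for illustration only.
example_scores = {"ligand_A": -9.1, "ligand_B": 0.4, "ligand_C": -11.3}

for name, score in rank_by_xpgscore(example_scores):
    print(name, score)
```

With these placeholder values, ligand_B is discarded because its score is positive, and ligand_C is ranked ahead of ligand_A.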

Molecular dynamics simulation

Desmond 3.1 and 3.5 [27], a suite of computer programs for running classical MD simulations, was used to evaluate stability and conformational changes, and to generate high-quality simulation trajectories of moderate timescales in solution along with thermodynamic measures of the docking complexes [28]. The MD simulations were performed using the OPLS2005 force field [29] on the docked complex of TRβ with ligand PCID 9933119, which had the second-lowest docking score, and the docked complex of TRα with ligand PCID 10456672, which had the lowest docking score.

The initial coordinates for the MD simulations were taken from the docking results. Prior to the simulation, energy minimization was performed on the full system without constraints using the steepest descent method for a maximum of 2,000 iterations and the limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) algorithm, with a convergence threshold of 20.0 kcal mol−1 Å−1. The system was solvated with the simple point charge (SPC) water model in a 10 Å × 10 Å × 10 Å orthorhombic water box, and the systems were neutralized by replacing solvent molecules with counter ions (Na+) to balance the net charge. This ensured that the major surface of both complexes (TRα and TRβ) was covered by the solvent model. In all, 64,009 atoms for the TRα complex and 59,141 atoms for the TRβ protein–ligand complex were simulated through a multisim step procedure of the final system.

Prior to equilibration and extensive MD simulations, the structures were minimized and pre-equilibrated using the default relaxation routine. For this, the program ran six stages composed of minimizations and short (12 and 24 ps) MD simulations to relax the model system before the final long simulations. A 5-ns production MD simulation was then performed for each system. The SHAKE algorithm was applied to all hydrogen atoms and the van der Waals (VDW) cut-off was set to 9 Å [30]. The temperature of the system was maintained at 300 K using the Nosé-Hoover thermostat with a relaxation time of 1 ps, and pressure was maintained using the Martyna-Tobias-Klein barostat with isotropic coupling and a relaxation time of 2 ps. Long-range electrostatic forces were treated by means of the particle-mesh Ewald (PME) method [31]. The trajectory was recorded every 4.8 ps during the MD runs and the energy recording interval was set to 1.2 ps. Equilibration of the dynamics was monitored through the stability of the temperature, energy, pressure and density of the system along with the RMSD of the backbone atoms. RMSD and energy fluctuations of the complex in every trajectory were examined with respect to simulation time. The root mean square fluctuations (RMSF) of all atoms, backbone and side chains of TRα and TRβ were studied for each residue. The docking complex was also examined for consistency of hydrogen bonding interactions.
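
For readers unfamiliar with the two trajectory measures, the following Python sketch shows how RMSD and per-atom RMSF can be computed from stored coordinates. It is a simplified illustration rather than the analysis performed in Desmond: it assumes that all frames have already been superimposed on a common reference, and the function names and coordinates are hypothetical. A 5-ns run recorded every 4.8 ps would supply roughly 1,040 such frames.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) tuples.

    Assumes both structures are already superimposed (no fitting is done).
    """
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsf(frames):
    """Per-atom RMSF over a list of frames, each a list of (x, y, z) tuples."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from its mean position.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations


# Two toy frames of a two-atom system, shifted by 1 Å along x.
frame_0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_1 = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

print(rmsd(frame_0, frame_1))
print(rmsf([frame_0, frame_1]))
```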

Results and discussion

Analysis of the atom-based 3D-QSAR model

To explore the common pharmacophore hypotheses, the complete dataset was divided into an active set and an inactive set using the pharm set attribute. Compounds with pIC50 > 8.60 for TRα and pIC50 > 10.00 for TRβ were considered active, those with pIC50 < 5.00 for both TRα and TRβ were considered inactive, and those in between were considered moderately active (supplementary Tables 1 and 2). In total, 439 hypotheses (8 variants) and 33 hypotheses (5 variants) were identified for TRα and TRβ, respectively (Table 1).

Table 1 Pharmacophore hypotheses with variant of thyroid hormone receptors (TR) TRα and TRβ

These hypotheses were subjected to a scoring process comprising three scores: survival (based solely on the active set), survival-inactive (based on the active and inactive sets) and post-hoc, which is calculated on the basis of the active and inactive sets with a reward assigned according to the pIC50 of each compound in the data set. The default parameter of the post-hoc score was taken as the cut-off value. A total of 80 TRα (Table 2) and 20 TRβ (Table 3) hypotheses survived, and only hypotheses with the highest survival-inactive scores were considered for building 3D-QSAR models. All molecules in the dataset were then aligned, matching at least three pharmacophore features. The test set members were selected by minimum distance from the centroid of each cell in the top map. After analyzing the alignment between the active ligands and the generated hypotheses, the best model for TRα was DHHHNRR.251, in which five active molecules in the active set matched the hypothesis; these were selected for further study. Likewise, in the QSAR model generated from hypothesis AHHHNRR.25 for TRβ, two of the five active molecules in the active set matched the hypothesis. Hypothesis 79 for TRα (Table 2) and hypothesis 19 for TRβ (Table 3) are also shown to allow the reader to draw comparisons. The QSAR models were constructed using an atom-based QSAR modeling method.

Table 2 All 80 pharmacophore hypotheses of TRα with parameters and their values for all 10 matching sites
Table 3 Twenty hypotheses generated to build three dimensional quantitative structure–activity relationship (3D-QSAR) model of TRβ for five matching sites with their parameters

The best models for TRα and TRβ were selected on the basis of the minimum SD (standard deviation) and RMSE (root-mean-square error) values of the PLS factor models for both sets. The PLS regression was performed in Phase with a maximum of seven PLS factors and a minimum of ten ligands per PLS factor. The validity of each model was tested using the calculated coefficient of determination (R²) and variance ratio (F) for the test set. Training and test set ligands were split in a 4:1 ratio to obtain the best model. F is the ratio of the model variance to the observed activity variance; higher values of F indicate a more statistically significant regression. The Pearson-R value indicates the correlation between the predicted and observed activity for the test set. The P value is the significance level of F when treated as a ratio of chi-squared distributions, i.e., the probability that the correlation could occur by chance; lower values indicate a higher degree of confidence. The cross-validated R² (R²-CV) value is computed from predictions obtained by a leave-n-out approach.

Phase has been reported to work well when R² > 0.4 [32]. The statistical parameters of the atom-based 3D-QSAR model for TRα were as follows: the training set (156 compounds) attained an R² value of 0.9535 with an SD of 0.3016; the test set (47 compounds) gave a Q² (the value of Q² for the predicted activities) of 0.4303, a squared correlation (random selection R²-CV) of 0.6929, a Pearson-R of 0.7294 and an RMSE of 0.6342. The P value of 1.411e-96 indicated a high degree of confidence. The predicted activities of the training and test set molecules are given in Table 4. The regression line for the predicted activity of TRα (cross validated test data set) is displayed in Fig. 1.

Table 4 Statistical analysis of 3D-QSAR model for TRα. PLS Partial least squares, SD standard deviation, RMSE root-mean-square error
Fig. 1 Scatter plot of cross validated predicted values of thyroid hormone receptor α (TRα)

A total of 357 molecules was employed for the TRβ model, of which 336 were retrieved from the binding database and 21 from the published literature [3].

The entire data set for developing the TRβ QSAR model was divided by randomly choosing 279 compounds for the QSAR training set and 71 compounds for the test set. The test and training sets were maintained at an approximately 1:4 ratio by an automatic process. Thus, a total of 350 molecules were used to build the model, and seven molecules were discarded because they did not fulfill the criteria to fit the model. The training set yielded an R² value of 0.9424 with an SD of 0.3719. The squared correlation (random selection R²-CV = 0.7201), the Pearson-R (0.7852) and the RMSE for test set predictions (0.8630) all support the good predictive ability of the final QSAR model for the TRβ test set. The P value of 2.108e-165 indicated a high degree of confidence for the TRβ model. The predicted activities of the training and test set molecules are given in Table 5. The regression line for the predicted activity of TRβ (cross validated test data set) is shown in Fig. 2. The stability of the model predictions to changes in training set composition has a maximum value of 1.

Table 5 Statistical analysis of 3D-QSAR for TRβ
Fig. 2 Scatter plot of cross validated predicted values of TRβ

For TRα, the model at PLS factor 3 gave a better Q² value but at the cost of R², so it was not selected. The optimum model of TRβ was found at PLS factor 6. Its RMSE (0.8630) was lower than that at PLS factor 7 (0.9067), and there was very little difference between the R² values.

The established 3D-QSAR models satisfy the model requirements and correlate well with biological activity. On the basis of the statistical values, the QSAR model for TRβ performed better than that for TRα.

QSAR visualization of TRα and TRβ models

In the present study, the best QSAR models for inhibitors of both TRα and TRβ were obtained at PLS factor 6. The selected hypothesis DHHHNRR.251 of TRα contains one hydrogen-bond donor (light blue), three hydrophobic sites (green), one negatively ionizable atom (red) and two aromatic rings (dusky saffron) (Fig. 3a). The 3D arrangement of the pharmacophore sites reveals that the main part of the model is occupied by a hydrophobic pocket. The hydrogen bond donor site is located on one aromatic ring and the negatively ionizable atom on the other.

Fig. 3 (a) Pharmacophore mapping of hypothesis DHHHNRR.251 for TRα (b) Distance site mapping (Å) for TRα (c) Angle mapping for TRα (d) Pharmacophore mapping of hypothesis AHHHNRR.25 for TRβ (e) Distance mapping of TRβ (Å) (f) Angle site mapping for TRβ model

The pharmacophore hypothesis displaying distance site and angle mapping between the pharmacophoric sites of TRα is shown in Fig. 3b,c. The distances between sites D3–H4 and D3–R8 were 2.169 Å and 3.266 Å, respectively. The details of distances and angles between different groups of TRα are listed in supplementary Table 3.

The chosen hypothesis AHHHNRR.25 was the best among the 20 hypotheses of the TRβ model; it comprises one hydrogen-bond acceptor (light red), three hydrophobic sites (green), one negatively ionizable atom (red) and two aromatic rings (dusky saffron) (Fig. 3d). The distance and angle mapping between the pharmacophoric sites of TRβ is shown in Fig. 3e,f. The distances between sites A2–R9 and A2–H6 were 2.77 Å and 3.238 Å, respectively. The details of the distances and angles between different groups of TRβ are given in supplementary Table 4. The spatial arrangement of the pharmacophore sites shows that a major area of the model is occupied by a hydrophobic pocket. The hydrogen bond acceptor site is situated on one aromatic ring and the negatively ionizable atom on the other ring, which carries the carboxylic acid (COOH) group.

All the active compounds used for building the models (ten for TRα and five for TRβ) fitted and occupied a similar spatial arrangement, as illustrated in Fig. 4a,b. The mapping shows both that the models satisfied the hypotheses and that all active molecules taken for the training set are aligned.

Fig. 4 (a) Alignment of the ten active compounds for the TRα model (b) Superposition of five active compounds for the TRβ model

Pictorial representations of the volume occupied maps (hydrophobic, donor, aromatic ring and electron-withdrawing features) for TRα hypothesis DHHHNRR.251 and TRβ hypothesis AHHHNRR.25 are shown in Fig. 5. The reference compounds for TRα and TRβ are 3-[4-(4-hydroxy-3-iodophenoxy)-3, 5-diiodophenyl] propanoic acid (PCID 5804) and 2-[3, 5-dibromo-4-(4-hydroxy-3-propan-2-ylphenoxy) phenyl] acetic acid (PCID 9933119), respectively. Favorable and unfavorable contacts are displayed as blue and red cubes, respectively, in the contour map. The volume occupied map helps to identify significant sites that require a particular physicochemical property for developing an effective antagonist drug against excessive thyroid hormone secretion, as discussed in the following sections.

Fig. 5 3D-QSAR visualization of contour map for TRα and TRβ. (a) Hydrogen bond donor, (b) hydrophobic donor/non-polar (c) electron-withdrawing features, of TRα hypothesis (DHHHNRR), and (d) Negative ionic effect (e) hydrophobic donor/non-polar (f) electron-withdrawing properties, of TRβ hypothesis (AHHHNRR)

Hydrogen bond donor

The volume occlusion map for the hydrogen bond donor (HBD) of TRα shows the favorable 3D arrangement for hydrogen bonding or non-covalent interactions with acceptor groups of the protein. The HBD map obtained from the QSAR model is applied to the most active ligands in Fig. 5a. HBD-favorable blue cubes are present near the fourth position of the benzene (R8) group of the model. All highly active molecules (PCID 5804, 10456830, 23648079, 5803, 71212, 7048703, 10904700, 9933119, 9951790 and 10479779) contain an HBD group (OH) at their fourth position. No unfavorable red contours are found in this volume-occupied region. In docking, these HBD groups also form H-bonding contacts with His381 in the most active ligand (PCID 5804; Fig. 6). The best docking score, however, was obtained with 2-[4-(5-bromo-6-hydroxynaphthalen-1-yl)-3, 5-dichloroanilino]-2-oxoacetic acid (PCID 10456672), a moderately active ligand, which indicates the importance of this interaction for the activity of all molecules. This result thus justifies the occurrence of HBD blue regions at this site.

Fig. 6 Two-dimensional (2D) docked view of ligand 3-[4-(4-hydroxy-3-iodophenoxy)-3, 5-diiodophenyl] propanoic acid (PCID 5804) with TRα receptor

Hydrophobic/non-polar property

The hydrophobic property of the TRα 3D-QSAR model was applied to the most active compound (PCID 5804), as shown in Fig. 5b. In this figure, favorable blue maps are seen near the third position of the benzene ring (R8), and the third and fifth positions of the phenyl ring (R9), in active molecules. In contrast, hydrophobic unfavorable red contours are seen near the ortho-position of the benzene ring (R8) in most of the inactive molecules. Hydrophobic favorable blue contours are also seen near the first position of the diiodophenyl ring in active molecules. This location is especially favorable for hydrophobic aliphatic chains bearing a carboxylic group at their terminus, which clearly indicates that hydrophobic substituents can be accommodated at this position to increase activity. The appearance of unfavorable red cubes at approximately the same sites in inactive molecules shows that bulky hydrophobic groups at this position are detrimental to activity. Favorable blue contours are present at the second position of the diiodophenyl ring. Thus, hydrophobic groups attached through a linker are favorable at this site, whereas the absence of hydrophobic groups at the same site is likely to reduce biological activity. A red contour located not too close to any of the atoms of the compounds suggests that occupancy of this spatial region by a hydrogen bond acceptor group would decrease activity.

Figure 5e shows the significant favorable and unfavorable hydrophobic/non-polar regions that arise when the TRβ 3D-QSAR model is applied to the most active ligand (PCID 9933119). Favorable blue regions are seen near the third (H4) and fifth (H6) positions of the benzene ring (R8), and the third position (H9) of the phenyl ring (R9), in active molecules. In contrast, hydrophobic unfavorable red regions are seen near the aliphatic chain attached to the phenyl ring (R9) in most inactive molecules. Hydrophobic favorable blue contours lie closer to the dibromophenyl ring in active molecules. This location is mainly favorable for hydrophobic aliphatic chains bearing a carboxylic group at the terminus, which supports the view that hydrophobic substituents can be tolerated at this position to enhance the activity of the molecule. The presence of red cubes at a nearly comparable position in inactive compounds suggests that bulky hydrophobic groups at this site are detrimental to activity.

Electron withdrawing and hydrogen bond acceptor

An electron withdrawing (EW) property contour map for TRα inhibitors is shown in Fig. 5c. Electronegative atoms such as N, O, S and halogen along with hydrogen bond acceptor (HBA) groups were considered under the EW property map using atom-based QSAR. EW group blue regions are seen close to atoms associated with N7 groups. In highly active molecules, favorable blue cubes are present near the ‘O’ atom of the 3-propanoic acid group attached to the position of the phenyl ring (R9), and ‘O’ atoms form H-bonds with Arg266 in docking analysis acting as HBA (Fig. 6). Favorable blue cubes are also present near the fourth position of the benzene ring (R8) in all active molecules, thus suggesting the importance of this interaction for a molecule’s inhibitory activity for TRα.

EW/HBA properties of TRβ inhibitors are displayed in Fig. 5f. HBA group favorable regions are observed close to atoms associated with the carboxylic group of the dibromophenyl ring (R8). In highly active ligands, blue cubes are seen near the ‘O’ atom of the acetic acid group attached to the dibromophenyl ring (R8), which acts as an HBA and forms H-bonds with amino acid Asn331 in the docking analysis (Fig. 7d). EW property favorable regions are seen near the aliphatic chain attached to the phenyl ring (R9) in all active molecules. This interaction is therefore important for a molecule’s inhibitory activity towards TRβ.

Fig. 7 Docking analysis of TRα and TRβ. (a) Ligand and protein interaction hydrogen bond interaction view (black dotted lines hydrogen bonds) (b) 2D docked view of ligand 2-[4-(5-bromo-6-hydroxynaphthalen-1-yl)-3,5-dichloroanilino]-2-oxoacetic acid (PCID 10456672) with TRα receptor (c) Ligand and protein interaction view with hydrogen bond (d) 2D docked view of 2-[3,5-dibromo-4-(4-hydroxy-3-propan-2-ylphenoxy) phenyl] acetic acid (PCID 9933119) ligand with TRβ receptor

Negative ionizable property

The negative ionizable property contour map of TRβ is shown in Fig. 5d. Negative ionizable group favorable blue cubes are found adjacent to the carboxylic group attached to the benzene ring (R8). All highly active compounds examined in this study bear a negative ionizable ‘O’ atom at this position. The negative ionizable property contour map of TRα is shown in supplementary Fig. 1. Negative ionizable group favorable blue cubes are seen adjacent to the phenyl ring (R9) and aliphatic chain.

For TRβ, the active ligands are quite sensitive to sterically bulky groups or expansions of the phenyl ring, which lead to a decrease in the activity of the molecules. However, a propyl group attached to the carboxylic group maintained the activity. Expansion of the ring size, i.e., from n-propyl to cyclopentyl, reduces the activity. Halogen atoms along with two benzene rings are important for maintaining the activity of the molecules.

These results provide insight into the structural requirements of the most active compound in our dataset, which in turn can help in the rational design of novel TR antagonist derivatives. This analysis opens up the possibility of predicting the biological activity of new compounds and their binding sites within their receptors through pharmacophore-based virtual screening and interaction analysis.

External validation of 3D-QSAR model

For external validation of the TRα hypothesis DHHHNRR, a 3D-phase database of 66 compounds was created that includes 15 molecules of TRα inhibitors and 51 known thyroid oxidase 2 inhibitors, which are provided in the supplementary Table 5. The structures of known inhibitors were retrieved from the binding database. The source organism taken for TRα was Rattus norvegicus (mammal) due to the non-availability of a human data set in the binding database; R. norvegicus was considered for our study because it belongs to the mammalian class. Only one molecule (T3) made a good hit, with a fitness value of 2.39 (supplementary Table 6), giving a good predictive value of being an active molecule for TRα. A fitness value range of 1.89 (minimum) to 3 (maximum) was used in our study to build the 3D-QSAR model (supplementary Table 1).

For external validation of the TRβ hypothesis, a 3D-phase database of 197 compounds was created, including 146 known TRβ inhibitors (138 from Homo sapiens and 8 from R. norvegicus) whose structures were retrieved from the binding database (supplementary Table 7). The other 51 compounds, thyroid oxidase 2 inhibitors, used for external validation were also selected from the binding database, with care taken to ensure that none was structurally related to the molecules in the original training and test sets. When the AHHHNRR.25 hypothesis was used as the query against the 3D-phase database, 18 molecules (supplementary Table 8) were found as hits that target human TRβ, but only 11 of them gave good predictive value, as their fitness values lay within the range used to build the model, 1.78 (minimum) to 3 (maximum) (supplementary Table 2). None of the thyroid oxidase 2 inhibitors made hits. The biological activity of these molecules was also predicted for all seven PLS factors. Hits with the highest fitness score were retained. The vector score ranges from −1.0 to 1.0, the alignment score from 0.0 to 1.0, and the volume score from 0.0 to 1.0. For all three terms, the weight range is 0.0 to 1.0, with a default weight of 1.0. The fitness score varies from −1.0 to 3.0 and is a linear combination of the site and vector alignment scores and the volume score. It describes how well the aligned compound conformer matches the hypothesis in terms of RMSD site matching, vector alignment and volume terms.
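The way the three terms combine can be illustrated with a minimal sketch. It assumes that the site (alignment) term has already been normalised to the 0.0 to 1.0 range stated above; the function name `fitness_score` and its arguments are hypothetical and do not correspond to the API of the screening software.

```python
def fitness_score(site, vector, volume,
                  w_site=1.0, w_vector=1.0, w_volume=1.0):
    """Weighted linear combination of site, vector and volume scores.

    Illustrative only: ranges and default weights follow the text,
    the normalisation of the site term is an assumption.
    """
    if not 0.0 <= site <= 1.0:
        raise ValueError("site (alignment) score must lie in [0, 1]")
    if not -1.0 <= vector <= 1.0:
        raise ValueError("vector score must lie in [-1, 1]")
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume score must lie in [0, 1]")
    for weight in (w_site, w_vector, w_volume):
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weights must lie in [0, 1]")
    return w_site * site + w_vector * vector + w_volume * volume


# With default weights the score spans -1.0 (worst) to 3.0 (perfect match).
best = fitness_score(1.0, 1.0, 1.0)
worst = fitness_score(0.0, -1.0, 0.0)
```

With all weights at their default of 1.0, the extreme inputs reproduce the −1.0 to 3.0 range quoted for the fitness score.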

Molecular docking analysis

Flexible molecular docking of all the ligand molecules was attempted, but only 139 molecules docked successfully. Docking of the whole pharma-set was carried out at the active site residues Arg 266, Ser 277 and His 381 of TRα (PDB ID: 3JZB). The most active compound (PCID 5804) had a gscore of −8.731492. As can be seen in Fig. 6, there are two hydrogen bond interactions between the carboxylic group (‘O’ atoms) of inhibitors and the NH of Arg 266. The NH of Ser 277 also forms a hydrogen bond with the oxygen atom of the ligand. The ‘N’ atom of the imidazole group of His 381 makes a hydrogen bond with the OH group at the fourth position of the ligand. The top gscore, −9.89261, was obtained for compound PCID 10456672, at the cost of activity. The best possible pose view of ligand 10456672 in the TRα binding site is shown in Fig. 7a and the corresponding 2D view of the ligand–protein interaction is displayed in Fig. 7b. The hydrophobic surrounding is observed at the distal position of the ligand, shown in green circles labeled with the three-letter code of the amino acid. Note that three polar residues, viz. Ser 260, Ser 277 and His 381 (circled in cyan), two glycine residues at positions 290–291 and the positively charged residue Arg 266 play an important role in the interaction. The binding contacts noted during docking analyses are in agreement with the HBD and HBA contour maps. Docking gscores of TRα inhibitors with the receptor are given in supplementary Table 9.

Figure 7c shows the best possible pose view for ligand PCID 9933119, the most active compound in the whole data set, docked into the binding site of the TRβ receptor (PDB ID: 1NAX). Figure 7d shows the 2D representation of the corresponding ligand–protein interaction. The predicted gscore of ligand PCID 9933119 was −11.52. It can be seen from Fig. 7d that the ligand molecule is surrounded by the hydrophobic residues Ile 275, Ile 276, Ala 279, Met 310, Met 313 and Ala 317, mainly through hydrophobic interactions. The positively charged residues Arg 282 and Arg 316 also surround the ligand molecule. In addition, the two oxygen atoms at the terminus of the ring form three hydrogen bonds: with Asn 331 as HBA, and with Met 313 and His 435 as HBD. The inhibitor is therefore stabilized in the binding pocket by H-bond interactions. It should be emphasized that analysis of the docking mode of inhibitor PCID 9933119, in terms of both gscore and activity, indicates that it forms more extensive contacts.

Thus, it can be concluded that a ligand containing a halogen or a more electronegative atom can show increased activity. This is in agreement with our contour maps, as more favorable regions (blue cubes) are observed at that location. A further hydrophobic interaction involved mainly the residues Ile 275, Ile 276, Ala 279, Met 310, Met 313 and Ala 317, so a benzene ring in this position can increase activity. Out of 357 ligands, only 97 docked successfully. The gscores of the successfully docked compounds with the 1NAX protein are given in supplementary Table 10. Comparative docking studies suggest that the TRβ receptor is a better potential target than TRα because of the binding energy and biological activity of inhibitors towards this target.

Molecular dynamics simulation analysis of TRα and TRβ inhibitor complexes

The MD simulations of the docking complexes, compound 10456672-TRα and compound 9933119-TRβ, were performed for 5 ns to confirm dynamic stability and binding modes and to uncover any conformational changes. Figures 8 and 9 show the RMSD trajectory, RMSF, protein–ligand contacts (PLC) and ligand–protein contacts (LPC) of both complexes. Fluctuations within 1–3 Å RMSD are perfectly acceptable for small proteins [33–35].

Fig. 8

Results of molecular dynamics (MD) simulation analysis. For TRα complex (a) MD simulation time vs RMSD in Å for all residues shown in different colours: light green backbone, red along with ligand fit on protein, and pink ligand fit on ligand (b) RMSF vs residues index for TRα (c) Protein ligand contact interaction over trajectory (d) Average conformation of the binding pocket of ligand 10456672-TRα complex throughout the simulation

Fig. 9

Results of MD simulation for the TRβ complex. (a) MD simulation time vs RMSD in Å for all residues (b) RMSF vs residues index (c) Protein ligand contact interaction over trajectory (d) Average 2D-docking view of ligand 9933119-TRβ complex throughout MD simulation

The ligand 10456672-TRα complex rises from 1.5 to 3 Å RMSD between the start and 1.8 ns, undergoing a conformational change during the MD simulation as reflected by its RMSD value (Fig. 8a). From 1.8 to 5 ns, the RMSD was relatively stable at about 3 Å. The mean backbone RMSD of the TRα complex was 2.212 Å. A small perturbation in the initial state was observed. Figure 9a displays the RMSD evolution of the protein (left y-axis) and of the ligand (right y-axis). All protein frames are aligned on the backbone. From the start to 3.8 ns, the compound 9933119-TRβ complex trajectory had an average RMSD of about 1.5 Å. During 3.80–3.90 ns, the protein backbone of the TRβ complex (red) underwent a slightly larger conformational variation due to ligand–protein contacts, and attained a second plateau after 3.9 ns at about 2.5 Å RMSD. The backbone RMSD of the TRβ complex ranged from 0 to 2.708 Å, with a mean of 1.742 Å. The average RMSDs of heavy atoms and side chain atoms were 2.028 and 2.582 Å, respectively. The plot shows that the ligand RMSD is lower than that of the protein backbone atoms. The ligand RMSD attains two plateaus, the first from the start at about 1 Å and the second after 3.9 ns at about 1.75 Å. The ‘Lig fit Lig’ RMSD measures the internal fluctuations of the ligand atoms, with the ligand aligned on its own reference conformation. The ligand RMSD indicates that the ligand is stable with respect to the protein and its binding pocket during the course of the simulation.
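For readers unfamiliar with the quantity, the RMSD of a frame relative to a reference can be sketched as follows. This is a minimal illustration that assumes the frames have already been superimposed on the reference (for example on the backbone); the function names and the toy coordinates are hypothetical and are not taken from the simulation package used in this study.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation (in the input units, e.g. Å) between
    two pre-aligned coordinate sets given as lists of (x, y, z) tuples."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


def rmsd_series(trajectory, reference):
    """RMSD of every frame in a trajectory against one reference frame."""
    return [rmsd(frame, reference) for frame in trajectory]


# Toy example: two atoms, second frame displaced by 2 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
series = rmsd_series([reference, shifted], reference)
```

Means and ranges such as those reported for the backbone would then be simple statistics over the resulting per-frame series.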

Figures 8b and 9b depict the RMSF versus residue number for TRα (aa 144–407) and TRβ (aa 211–460), respectively. RMSF is useful for characterizing local changes along the protein chain, and its analysis is important for relating the inhibitory capability of a ligand to the folding pattern of a helix. Alpha-helical and beta-strand regions are shown with red and blue backgrounds, respectively; these regions are defined by helices or strands that persist for over 70 % of the entire simulation. In Figs. 8b and 9b, peaks indicate the segments of the protein that fluctuate the most during the simulation. Typically, the tails (N- and C-terminal) fluctuate more than any other part of the protein. The plots show that alpha helices (red) and beta strands (blue) are more rigid and fluctuate less than the unstructured loop regions (white) for both TRα and TRβ complexes. Protein residues that interact with the ligand are marked with green vertical bars. Both TRα and TRβ proteins alter their conformation, in agreement with the RMSF results. Two major flexible segments of the backbone (green), residues 357–362 and 391–400, can be seen in the TRα protein, but only one (residues 251–263) is seen in the TRβ protein, which indicates that the TRβ protein is more stable than TRα. A fluctuation of the TRβ protein is seen at Asn 257, with a backbone RMSF value of 4.17 Å, which could be due to a bend or turn.
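A per-atom RMSF can be sketched in the same spirit. The version below measures the fluctuation of each atom about its time-averaged position in a pre-aligned trajectory, which is one common convention and an assumption here (some packages use a fixed reference position instead); the function name `rmsf` is hypothetical.

```python
import math


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about the mean position.

    trajectory: list of frames, each a list of (x, y, z) tuples,
    assumed to be superimposed on a common reference beforehand.
    """
    if not trajectory or not trajectory[0]:
        raise ValueError("trajectory must contain at least one atom and frame")
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared displacement from that average.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Toy example: atom 0 is fixed, atom 1 oscillates by ±1 Å along x.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
fluctuations = rmsf(toy)
```

Peaks in such a profile correspond to the flexible segments discussed above, while rigid secondary-structure elements give low values.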

PLCs for TRα are classified into four types: hydrogen bonds, hydrophobic contacts, ionic interactions and water bridges (Fig. 8c). The binding site includes residues Thr 178, Asn 179, Ala 180, Phe 218, Ile 221, Ile 222, Pro 224, Ala 225, Ile 226, Arg 228, Met 256, Met 259, Arg 262, Ala 263, Arg 266, Leu 276, Ser 277, Leu 287, Leu 292, His 381, Phe 401 and Phe 405, which in the TRα complex lie mainly in the alpha-helix region. The y-axis is normalized over the course of the trajectory. Interaction values of 0.5 for Phe 218, 0.3 for Ile 222 and 0.1 for Leu 276 indicate that the specific interactions of these residues with ligand 10456672 are maintained for 50, 30 and 10 % of the simulation time, respectively. The geometric criterion for a TRα protein–ligand 10456672 H-bond is a distance of 2.5 Å between the donor and acceptor atoms (D–H···A) with a donor angle of ≥120°. A value over 1.0 for Ser 277 indicates that multiple contacts (water bridge and H-bond) with the ligand take place. Ser 277 (one side chain and one backbone H-bond) and His 381 (side chain) participate mainly in intermolecular hydrogen bonding. Arg 266, with an interaction value of about 0.4, participates in ionic interactions and water bridges with very little hydrogen bonding to the ligand, which differs from the docking results (Fig. 7b). Hydrogen-bonding interactions are significant in drug design because of their strong influence on drug specificity, metabolization and adsorption. Three H-bonds appear in both docking and MD simulation for the ligand 10456672-TRα complex, but distinct variations are observed between the two results. In docking, a backbone H-bond is observed for His 381, whereas in MD simulation a side chain H-bond is noted for the same residue. Met 259 makes a backbone H-bond with the NH atom of the ligand in docking but shows only 20 % interaction over the trajectory in the initial phase of the MD simulation and later forms a hydrophobic interaction for 10 % of the time. An ionic interaction is seen between TRα residue Arg 266 and the charged oxygen atoms of the ligand, which are within 3.7 Å of each other in the docking complex; this interaction is maintained for 20 % of the simulation time. Arg 266 forms a side chain H-bond with the oxygen atom of the –COOH group of ligand 10456672 in docking but has a negligible interaction during MD simulation. Ser 277 plays a significant role in MD simulation, as it forms one side chain and one backbone H-bond with the ligand and stabilizes the interaction with the TRα protein. This analysis shows that the conformational change in ligand 10456672 is due mainly to fluctuations of the –NH–CO–COOH group attached to the dichlorobenzene ring, which is consistent with both the docking and MD simulation results.
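The normalized interaction values quoted above can be understood as contact counts divided by the number of trajectory frames. The sketch below illustrates this under the assumption that contacts have already been detected per frame; the function name, the data layout and the toy frames are hypothetical and are not output from this study.

```python
from collections import Counter


def interaction_fractions(frames):
    """Normalized contact count per residue over a trajectory.

    frames: list with one entry per frame, each a list of
    (residue, contact_type) tuples detected in that frame.
    A residue making several contacts in the same frame is counted
    once per contact, so values above 1.0 are possible.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    totals = Counter()
    for contacts in frames:
        for residue, _contact_type in contacts:
            totals[residue] += 1
    return {residue: count / len(frames) for residue, count in totals.items()}


# Toy trajectory of four frames (illustrative contacts only).
toy_frames = [
    [("Ser 277", "h-bond"), ("Ser 277", "water bridge"), ("Phe 218", "hydrophobic")],
    [("Ser 277", "h-bond"), ("Ser 277", "water bridge")],
    [("Ser 277", "h-bond"), ("Phe 218", "hydrophobic")],
    [("Ser 277", "h-bond")],
]
fractions = interaction_fractions(toy_frames)
```

In this toy case the residue with two simultaneous contact types exceeds 1.0, mirroring the interpretation given for values over 1.0 in the text.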

It is evident from the analysis that aromatic residues, mainly Phe 218 of the TRα protein, can make π–π interactions with the aromatic ring of ligand 10456672. The geometric criteria for hydrophobic contacts are: π-cation, aromatic and charged groups within 4.5 Å; π–π, two aromatic groups stacked face-to-face or face-to-edge; and other, a non-specific hydrophobic side chain within 3.6 Å of a ligand’s aromatic or aliphatic carbons.

For the TRβ protein–ligand 9933119 complex, the binding site residues participating in PLC are mainly Phe 269, Phe 272, Ile 275, Ile 276, Ala 279, Met 310, Met 313, Arg 320, Leu 330, Asn 331, Leu 346 and His 435 (Fig. 9c). The binding site lies mainly in the alpha-helix region (red). The interaction values of key residues Ile 276, Leu 330 and Leu 346 are about 0.3 or greater, which suggests that the specific hydrophobic contacts of these residues with ligand atoms are maintained for ≥30 % of the simulation time. Asn 331, His 435 and Arg 320 participate mainly in hydrogen bonding (Fig. 9c,d). Asn 331, with an interaction value greater than about 1.2, makes multiple contacts (side chain and backbone H-bonds) with the ligand in the form of hydrogen bonding through a water bridge. An ionic interaction between TRβ residue Arg 320 and the oxygen atom of the terminal –COOH group of ligand 9933119 is observed for 25 % of the simulation time. Further, a new H-bond formed between the ketonic oxygen of the ligand and the side chain of Arg 320 differs from the docking results. Three H-bonds participate in both docking and MD simulation, of which the two H-bonds with residues Asn 331 and His 435 of the TRβ protein are consistent in both, whereas the H-bond with Met 313 is replaced by one with Arg 320 during MD simulation. This study suggests that these H-bonds play a significant role in stabilizing interactions between ligand 9933119 and the TRβ protein. Thus, the conformational change of ligand 9933119 results mainly from movement of the terminal –COOH-CH2 group attached to the dibromobenzene ring, as shown by the binding modes from docking and MD simulation analysis.

Interactions that occur for more than 30 % of the simulation time in the selected trajectory (0.0 through 5.0 ns) are shown in Figs. 8d and 9d for the TRα and TRβ protein–ligand complexes, respectively. The current geometric criteria for a protein–water or water–ligand H-bond are a distance of 2.7 Å (D–H⋯A), a donor angle of ≥110° and an acceptor angle of ≥80° between the hydrogen, acceptor and acceptor-bonded atoms (H⋯A–X). Additionally, two water bridges, i.e., hydrogen-bonded protein–ligand interactions mediated by a water molecule, are found in TRβ protein-ligand 9933119 but not in TRα-10456672. In TRβ protein-ligand 9933119, one water bridge is formed between Asn 331 (H-bond side chain) and Ile 275 (H-bond backbone).
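The geometric test for a water-mediated H-bond can be sketched as a distance check plus two angle checks. The thresholds below are those quoted in the text; measuring the distance between the hydrogen and the acceptor is an assumption of this sketch, and the function names and coordinates are hypothetical.

```python
import math


def angle_deg(a, b, c):
    """Angle at vertex b, in degrees, formed by points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cosine))


def is_water_hbond(donor, hydrogen, acceptor, acceptor_neighbor,
                   max_distance=2.7, min_donor_angle=110.0,
                   min_acceptor_angle=80.0):
    """Apply the distance, donor-angle (D-H...A) and acceptor-angle
    (H...A-X) criteria to one candidate hydrogen bond."""
    distance = math.dist(hydrogen, acceptor)  # assumed H...A distance
    donor_angle = angle_deg(donor, hydrogen, acceptor)
    acceptor_angle = angle_deg(hydrogen, acceptor, acceptor_neighbor)
    return (distance <= max_distance
            and donor_angle >= min_donor_angle
            and acceptor_angle >= min_acceptor_angle)


# Toy geometry: linear D-H...A with a 90 degree acceptor angle.
linear_ok = is_water_hbond((0, 0, 0), (1, 0, 0), (3, 0, 0), (3, 1.2, 0))
too_far = is_water_hbond((0, 0, 0), (1, 0, 0), (4, 0, 0), (4, 1.2, 0))
bent_donor = is_water_hbond((1, 1, 0), (1, 0, 0), (3, 0, 0), (3, 1.2, 0))
```

Counting the frames in which such a test succeeds, and dividing by the total number of frames, gives the fraction of simulation time for which a contact is present.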

The upper section of Fig. 10a,b displays the total number of specific interactions for the ligand 10456672-TRα complex and the ligand 9933119-TRβ complex, respectively, over the course of the trajectory. The lower section depicts the residues that make contact with the ligand in each trajectory frame. Ser 277 of the TRα protein and Asn 331 of the TRβ protein form more than one contact with ligands 10456672 and 9933119, respectively, as denoted by a darker shade of orange according to the scale on the right of the plot. Ile 275, Arg 320 and His 435 of the TRβ protein become more consistent after 2 ns of MD simulation, each making a single contact, which signifies the stability of the ligand 9933119-TRβ complex. A different pattern is observed in the ligand 10456672-TRα complex, as only Ser 277 is steady during the MD simulation. These results suggest that the average structure and backbone atoms of the TRβ protein and its ligand are relatively more stable than those of the TRα complex during the simulation, supporting the acceptability of the model.

Fig. 10

Protein ligand contact analysis (a) Ligand 10456672-TRα complex (b) Ligand 9933119-TRβ complex

These outcomes suggest that the TRβ complex is relatively more stable during the course of the simulation than the TRα complex, and support the good predictive results obtained from the molecular docking. Though both complexes experienced numerous changes throughout the MD simulation, the binding pocket and the conformation of the 9933119-TRβ complex are more constant than those of the 10456672-TRα complex, imparting confidence in the rationality and validity of the docking result. The regions obtained from the 3D contour maps are consistent with the contacts between ligands and amino acid residues of the TRα and TRβ receptors recognised by molecular docking.

These findings support the notion that computational 3D-QSAR approaches can be an effective tool with which to predict the biological activity of compounds, to comprehend the toxicity process, and to propose novel molecules with precise biological activity. The benefit of the molecular docking methodology is that it takes the characteristics of the binding pocket of both TRα and TRβ receptors into consideration to obtain the active conformation of compounds. Because of the large system sizes (64,009 atoms for the TRα and 59,141 atoms for the TRβ protein–ligand complex) and computational constraints, MD simulations were limited to 5 ns. A study by Genheden and Ryde [36] showed that simulations of protein–ligand complexes with several approaches do not converge to equilibrium even at up to 500 ns, meaning that not all available states have been sampled. Hence, the MD simulations are not expected to reach equilibrium with the current computational resources and algorithms.

Conclusions

The objective of this study was to establish a robust relationship between structural properties and antagonist activity of human TRα and TRβ by using 3D-QSAR, molecular docking and MD simulation methods to explore receptor–ligand contacts, which can be useful to minimize the adverse effects of current drugs. The built model produces high Q² values with small standard errors of estimation. After analysis of the results, the TRβ model was found to be more optimized and robust than the TRα model. The model was validated both internally and externally to confirm consistency and to maintain high predictive ability. Contour maps revealed that inhibitory activity can be improved by modulating the donor abilities of nitrogen or oxygen atoms in the fused aromatic rings that are involved in the H-bond interactions with the binding site of the receptor. The built model can be used to design more effective antagonists than existing drugs (PTU and MMI), which have side effects such as agranulocytosis, aplastic anemia, and the ability to cross the placenta [37–39]. The model built in this work could help to identify the essential features that affect thyroid hormone function and to provide information about the interaction of ligands with the receptor. The coherence of the outcomes from 3D-QSAR, molecular docking and MD simulation further indicates the robustness of the best 3D-QSAR model.