Introduction

Huntingtin interacting protein 14 (HIP14) is a conserved mammalian neuronal palmitoyl transferase (PAT) involved in the palmitoylation and trafficking of multiple neuronal proteins, including huntingtin protein (htt), which is itself known to play an important role in facilitating the palmitoylation activity of HIP14. HIP14 regulates several functions of htt by palmitoylating it at cysteine 214, a modification that is essential for htt trafficking and function [1, 2]. There exists a reciprocal relationship between the overexpression of HIP14 and the formation of huntingtin inclusion bodies [2]. Therefore, abnormal modifications resulting from the palmitoylation of htt contribute to protein misfolding and the formation of inclusion bodies, a characteristic feature of Huntington disease (HD) [3]. Earlier studies have suggested that expansion of the polyglutamine tract of htt alters the interaction between htt and HIP14 [13]. This reduces the post-translational modification of htt by HIP14, resulting in the formation of inclusion bodies and enhanced neuronal toxicity. Consequently, it could also result in neuronal dysfunction in HD, obstructing the normal intracellular transport pathways in neurons [3]. PATs have also been established as anticancer drug targets, as they help in the trafficking of several oncogenes through post-translational modifications. siRNA studies have demonstrated that the PAT activity of HIP14 can be targeted by small-molecule inhibitors [4].

The crystal structure of the HIP14 fragment contains seven ankyrin repeats, each possessing a helix-turn-helix-β-turn structure. It also contains an aromatic cage on the surface, which is the potential site for methyl-lysine binding. Glutamate and aspartate residues present in the aromatic cage play an essential role in the differential selection of methylation states of lysine. The N-terminal domain of the protein is responsible for substrate recognition. The aromatic cage is lined by hydrophobic residues comprising two tryptophans, one tyrosine and one methionine [6]. We have investigated the inhibitory potential of various inhibitors that target Huntington disease. We have also focused on the potential of the new binding site to interact with drugs that vary in their chemical and pharmacological properties and are employed in the symptomatic treatment of HD. Several classes of heterogeneous molecules were therefore used for the present study. The inhibitors screened initially target various processes of the disease. Only those inhibitors that adhered to Lipinski’s rule of five [5] were evaluated further for their inhibitory effects against HIP14.

In the present study, we therefore hypothesize that the surface aromatic cage [6] involved in methyl-lysine binding interacts with the screened inhibitors. The compounds were ranked by their docking energy and free energy of binding. They belong mainly to the alcohol, carbohydrate and tetracycline classes. The discovery of these potential inhibitors of HIP14 opens future avenues for designing compounds that could target the interaction of HIP14 with htt, an important protein that relates directly to Huntington disease. A reduction in this interaction could downregulate htt expression, providing a lead for preventing inclusion body formation in Huntington disease. To the best of our knowledge, this is the first report that explores the druggability of HIP14.

Methods

Selection and preparation of macromolecule

The crystal structure of the HIP14 protein (PDB ID: 3EU9) was retrieved from the Protein Data Bank (http://www.pdb.org/) [6, 7]. Protein and ligand optimization was carried out using SYBYL-X 1.1 with the conjugate gradient method [8]. Rigid docking was performed with AutoDockTools to study protein-inhibitor interactions. Atom types and bond types were assigned, and only polar hydrogens were added to neutralize the protein [9, 10]. The grid maps were generated with different docking grids spanning different regions of the protein, with a grid point spacing of 0.375 Å. The grid maps covered the active site along with significant portions of the surrounding surface [11].
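
As a rough illustration of what a 0.375 Å spacing implies for the size of a docking box, the sketch below converts a number of grid intervals into edge lengths. The interval counts and the helper names `grid_extent` and `grid_point_count` are hypothetical, since the grid dimensions used in this study are not stated. The sketch also assumes the AutoGrid convention in which each axis holds one more grid point than the number of intervals.

```python
from typing import Tuple

SPACING_ANGSTROM = 0.375  # grid point spacing reported in the text


def grid_extent(npts: Tuple[int, int, int],
                spacing: float = SPACING_ANGSTROM) -> Tuple[float, ...]:
    """Edge lengths (in Å) of a box with `npts` intervals per axis."""
    if any(n <= 0 for n in npts):
        raise ValueError("npts must be positive")
    return tuple(n * spacing for n in npts)


def grid_point_count(npts: Tuple[int, int, int]) -> int:
    """Total grid points per map, assuming npts + 1 points along each axis."""
    nx, ny, nz = npts
    return (nx + 1) * (ny + 1) * (nz + 1)


# Hypothetical box with 60 intervals per axis (illustrative, not from the study).
example_npts = (60, 60, 60)
print(grid_extent(example_npts))       # edge lengths in Å
print(grid_point_count(example_npts))  # grid points per map
```

A blind-docking grid that has to enclose the whole protein needs either more intervals or a coarser spacing than a grid focused on a single pocket, which is one reason to compare several grids as described above.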

Retrieval and preparation of ligand dataset

Thirty-four drugs were obtained from the PubChem database (http://pubchem.ncbi.nlm.nih.gov/), and their qualitative and quantitative characteristics, such as physicochemical properties, were analyzed. To generate the dataset, compounds were screened based on these properties [5, 12], and Open Babel (http://openbabel.org) [13] was used for file format conversion.
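
The property-based screening step can be sketched as a simple filter over precomputed descriptors. This is a minimal sketch, not the procedure used in the study: the `Compound` class, the function names and the descriptor values are hypothetical placeholders, and the zero-violation cutoff is an assumption (the rule is often applied with one violation tolerated).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mol_weight: float   # g/mol
    h_donors: int       # hydrogen bond donors (HBD)
    h_acceptors: int    # hydrogen bond acceptors (HBA)
    logp: float         # octanol-water partition coefficient


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five criteria a compound breaks."""
    checks = (
        c.mol_weight > 500,
        c.h_donors > 5,
        c.h_acceptors > 10,
        c.logp > 5,
    )
    return sum(checks)


def passes_rule_of_five(c: Compound, max_violations: int = 0) -> bool:
    """True if the compound breaks at most `max_violations` criteria."""
    return lipinski_violations(c) <= max_violations


# Placeholder descriptors for illustration only; not values from Table 1.
candidates = [
    Compound("compound_a", 267.4, 2, 4, 1.9),
    Compound("compound_b", 612.7, 7, 12, -0.5),
]
screened = [c.name for c in candidates if passes_rule_of_five(c)]
print(screened)
```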

Solvent accessibility, logP and logD calculations

The different aspects of the solvent accessible surface (SAS) of the biologically active compounds were visualized using Accelrys Discovery Studio Visualizer 2.5 (http://www.accelrys.com/products/downloads/ds_visualizer/) with default parameters. The radius of the rolling probe was taken as 1.4 Å, which approximates the size of a water molecule. Solvent accessibility was calculated for side chains and for all residue atoms. The logP values of the drugs were obtained from PubChem, and their logD values at different pH were obtained from ChemSilico Predict (http://www.chemsilico.com).

Interaction studies and binding pattern detection

Molecular docking was carried out using the Lamarckian genetic algorithm in AutoDock4.1 with default parameters. Docking results were clustered to determine the free energy of binding and the conformation with the optimal docking energy, which was considered the best docked pose [11]. Blind docking and different docking grids were used to rule out the possibility of false results. Subsequently, the standard deviation of mean (MD) was calculated using the optimal-energy ligand conformation obtained after each run. RMSD calculations were performed using the protein-Minocycline complex, which exhibited the lowest inhibitory constant, as the template.
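
The two quantities named above can be sketched as follows. This is a minimal sketch under stated assumptions: the "standard deviation of mean" is interpreted here as the standard error of the mean, poses are assumed to share the receptor frame (so no superposition is done) and to list atoms in the same order, and all coordinates and energies are made-up values.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(pose: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two poses with matched atom order."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, r in zip(pose, reference)
        for a, b in zip(p, r)
    )
    return math.sqrt(squared / len(pose))


def standard_error(values: Sequence[float]) -> float:
    """Sample standard deviation divided by sqrt(n); needs n >= 2."""
    return stdev(values) / math.sqrt(len(values))


# Made-up coordinates (Å) and energies (kcal/mol), for illustration only.
reference_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
run_energies = [-7.9, -8.1, -8.0]  # best energy from each docking run

print(rmsd(docked_pose, reference_pose))
print(mean(run_energies), standard_error(run_energies))
```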

Alternatively, docking was carried out between HIP14 and the inhibitors to check for any interaction with the previously known binding pocket. For further validation, geometry-based protein-protein docking of the htt-HIP14 and htt-HIP14-inhibitor complexes was also performed using the PatchDock server (http://bioinfo3d.cs.tau.ac.il/PatchDock/) [14] with default settings to yield good molecular shape complementarity. The CASTp server (http://sts.bioengr.uic.edu/castp/) was used to validate the newly defined pocket.

Binding energy and pattern analysis

Each generated conformation had an associated binding free energy. Estimated inhibition constants (Ki) and binding energies were determined for the different docking conformations, which were ranked according to their binding scores [15]. Ki was calculated using the Lamarckian genetic algorithm, applied to a large dataset of complexes, as implemented in AutoDock4.1. The electrostatic, van der Waals, hydrogen bond and desolvation energies, the total intermolecular energy and the torsional energy of binding were used to estimate the binding free energy in kcal mol⁻¹ (Tables 3, 4, S2 and S3). A 2 Å RMSD constraint with respect to the HIP14 crystal structure was set for studying protein-inhibitor interactions. Chimera and DS Visualizer 2.5 [16] were used for visualization and calculation of the respective interactions.
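
The link between a binding free energy and an estimated inhibition constant can be sketched with the standard thermodynamic relation ΔG = RT ln Ki. The temperature of 298.15 K, the function names and the example energies are assumptions for illustration; the exact constants used in this study are not stated in the text.

```python
import math

GAS_CONSTANT = 1.98719e-3   # kcal mol^-1 K^-1
TEMPERATURE = 298.15        # K; assumed, not stated in the text


def ki_from_binding_energy(delta_g: float,
                           temperature: float = TEMPERATURE) -> float:
    """Inhibition constant (mol/L) from a binding free energy (kcal/mol)."""
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


def binding_energy_from_ki(ki: float,
                           temperature: float = TEMPERATURE) -> float:
    """Inverse relation: binding free energy (kcal/mol) from Ki (mol/L)."""
    if ki <= 0:
        raise ValueError("Ki must be positive")
    return GAS_CONSTANT * temperature * math.log(ki)


# Hypothetical energies (kcal/mol), not values from Tables 3 or 4.
for dg in (-5.0, -8.0):
    print(f"dG = {dg:5.1f} kcal/mol -> Ki = {ki_from_binding_energy(dg):.2e} M")
```

Because the relation is exponential, a more negative binding energy always gives a smaller Ki, so ranking conformations by either quantity produces the same order.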

Intermolecular Gibbs free energy was calculated as described previously [17].

Statistical performance

The statistical parameters sensitivity, specificity, false positive rate, false negative rate, precision, recall, F-measure and accuracy were calculated for the 18 inhibitors that were initially screened on the basis of their ‘drug likeness’. The calculations were done twice to determine the precision of our results. All formulas for the statistical calculations are included in supplementary Table S4 [18].

Results

Rationale for macromolecule and ligand selection

To determine the inhibition efficacy of various drugs against HIP14, a dataset of 34 inhibitors (Fig. 1) that are useful for the diagnosis and treatment of Huntington disease was compiled (Table 1). Lipinski’s rule of five [5] was applied to limit the number of inhibitors to 18, which were then used for screening (Fig. 2).

Fig. 1

Inhibitors used for docking studies. Chemical structures of the 34 inhibitors in the dataset, which are used for the diagnosis and treatment of Huntington disease; their CIDs are also given

Table 1 Physicochemical properties of selected drugs
Fig. 2

Flow chart depicting the strategy used for screening of compounds. Eighteen inhibitors were screened based on their physicochemical properties. Inhibitors with molecular weight (M.W.), hydrogen bond acceptance (HBA) and hydrogen bond donation (HBD) potential within the range of Lipinski’s rule of five were considered for the study

Correlation between logP and polar surface area (PSA)

To assess the absorption of the various inhibitors, logP and polar surface area (PSA) were considered. A PSA threshold of 140 Å² was used to distinguish between poorly absorbed and well absorbed inhibitors. The mean values of logP and PSA were 2.3226 and 64.5867 Å², respectively. Inhibitors with logP and PSA values close to the mean were considered to be readily absorbed, as compared to those exhibiting values far from the threshold. PSA is inversely related to human intestinal absorption [19] (Table 2).

Table 2 Comparison of various parameters for screened potential drugs

Based on logP and PSA values, the inhibitors with an alcoholic scaffold were found to be readily absorbed, whereas the inhibitors with tetracycline and carbohydrate scaffolds showed poor intestinal absorption.

Comparison of logP and logD values of inhibitors at different pH

The partition coefficient (logP) is the ratio of the concentrations of an unionized compound in the two phases of a mixture of octanol and water at equilibrium, whereas the distribution coefficient (logD) is the ratio of the sum of the concentrations of all forms of the compound (both ionized and unionized) in each of the two phases [20]. On comparing logP and logD values, it was found that logD generally increases as logP increases from negative toward positive values; however, for Mephenesin (1.4), Metoprolol (1.9) and Oxprenolol (2.1), logD decreases as logP increases. Furthermore, logD was observed to increase for Propranolol (Tables 2 and S1). LogP values are also related to the solvent accessible surface area (SASA) and PSA, indicating compound activity (Tables 2 and S1). An increase in PSA together with a substantially high SASA results in a proportional increase in inhibitor activity [21].
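
The pH dependence of logD can be illustrated with the textbook relation for a monoprotic compound, which assumes that only the neutral form partitions into octanol. This is a sketch under that assumption and is not the model used by ChemSilico Predict; the function name `log_d` and the logP and pKa values are hypothetical.

```python
import math


def log_d(log_p: float, pka: float, ph: float, kind: str) -> float:
    """logD of a monoprotic compound, assuming only the neutral form partitions.

    kind: "acid" (HA <-> A-) or "base" (BH+ <-> B).
    """
    if kind == "acid":
        exponent = ph - pka   # ionized fraction grows as pH rises above pKa
    elif kind == "base":
        exponent = pka - ph   # ionized fraction grows as pH falls below pKa
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1.0 + 10.0 ** exponent)


# Hypothetical base with logP 2.0 and pKa 9.5 (illustrative values only).
for ph in (2.0, 7.4, 11.0):
    print(ph, round(log_d(2.0, 9.5, ph, "base"), 2))
```

Under this relation logD never exceeds logP, and two compounds with similar logP can have very different logD at a given pH if their pKa values differ, so logD need not rise in step with logP across a set of compounds.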

Docking studies and energies calculation

The 18 screened inhibitors were docked to the HIP14 protein using the Lamarckian genetic algorithm. The docking results, predicted binding energies and inhibitory constants were calculated for all of them (Tables 3, 4, S2 and S4). The predicted binding energies were clustered to determine the conformation with the optimal docking energy, which was considered the best interacting pose. The results were then confirmed using blind docking and grids of different sizes spanning different regions of the protein. Both strategies consistently indicated that the interaction of HIP14 with the inhibitors involves a new binding site, comprising residues Leu211, Thr212, Asn214, Val215, Ser216, Val217, Asn218, Leu219, Glu246, Ala247 and Gly248 (Fig. 3). The relative contribution of the various residues to these interactions was also evaluated. Leu211, Val217 and Asn218 are the major interacting residues lining the pocket (Fig. 4). The results were validated by docking HIP14, and also the HIP14-inhibitor complex, against htt. The ankyrin repeat domain of HIP14 was involved in the interaction with htt and the complex. None of the residues of the newly found pocket were involved in protein-protein interactions. However, all the inhibitors interacted with residues present in the new pocket. The various energies of binding were estimated, including the electrostatic component of the binding free energy, the van der Waals, hydrogen bond and desolvation energies, the total intermolecular energy, the torsional energy and the Gibbs free energy of binding. The inhibitory constant Ki for protein-inhibitor binding was calculated and exhibits a strong correlation with the binding energy. Ki relates directly to the docking energy and inversely to the binding affinity: lower Ki values correspond to lower docking energies and higher binding affinities. The RMSD values for the best inhibitors were calculated. The conformation with the minimum docking energy in each cluster obtained after every run was taken as the reference for calculating the standard deviation of mean. CASTp was used to validate our predictions.

Table 3 Calculated energies and RMSD for protein complexes with potential inhibitors showing interaction with Val217
Table 4 Calculated energies and RMSD for protein complexes with potential inhibitors showing interaction with Asn218
Fig. 3

Cartoon representation of the whole protein with localization of the previously known and the newly predicted binding pockets. Leu211, Val217, Asn218, Ala247 and Gly248 are some of the residues lining the newly identified binding pocket; Met191, Trp196, Tyr198 and Trp231 are localized at the aromatic cage. These residues are shown as sticks

Fig. 4

Docking conformations of the best protein-inhibitor complexes. a Minocycline in complex with HIP14; hydrogen bonds are depicted as dashes. b 18F-Fluorodeoxyglucose in complex with HIP14. c Metoprolol in complex with HIP14

Protein–ligand interactions

Based on the energy values for the various protein-inhibitor complexes, the nine of the 22 inhibitors possessing the lowest binding energies were chosen as the best inhibitors. Ethambutol, Prenalterol and Dexpanthenol were screened out at the initial stage owing to their high docking energies. The best inhibitors were broadly classified into three groups, namely alcohols (including the phenolic scaffold), carbohydrates (deoxyglucose family) and tetracycline. Complexes formed by Gabapentin, Prenalterol and Paroxetine would be less stable, as they form only one hydrogen bond with the binding pocket. Memantine and Mirtazapine were screened out because they do not show any interaction with the protein, despite their low docking energies. Haloperidol and Venlafaxine interact only with Asn217. Propranolol and Oxprenolol interact with both Val217 and Asn218 but form a hydrogen bond only with the former. Of these nine inhibitors, Minocycline, 18F-Fluorodeoxyglucose and Metoprolol show the optimal binding energies, and they interact with both Val217 and Asn218 in the newly defined binding pocket (Fig. 4a, b, c). Minocycline is an inhibitor that plays a role in delaying the progression of Huntington's disease [22]. The optimal Gibbs free energy values for these three drug complexes are indicative of their maximum stability.

Statistical prediction based on standard deviation of mean values

To define the accuracy of the predictions made in the present study for the various docking interactions, standard deviation of mean (MD) values were used [23]. When MD values are used to assess accuracy, the test outcome can be either positive (positive MD) or negative (negative MD), while the actual MD remains within the range of tolerance. Hence, with large numbers of false positives and few false negatives, a positive interaction pocket was retrieved. The sensitivity of 100% and specificity of 93.33% indicate that the selected inhibitors bind specifically to the newly defined pocket. The false positive rate (α) of 6.667% can be attributed to those inhibitors that interact with either Val217 or Asn218 of the binding pocket. However, the false negative rate (β) remains at zero, ruling out the inclusion of any non-interacting residues. A recall value of 100% signifies that all the relevant interactions were retrieved, with 97.22% accuracy in these interactions and 95.45% precision in the results after repeated calculations. Sensitivity is synonymous with recall. The F-measure, the harmonic mean of precision and recall, was 97.67%, defining the predicted pocket as the best interacting pocket (Table 5). The major interactions in this pocket take place through hydrogen bonds, which are most likely an essential requirement for macromolecule-inhibitor interactions.
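
The reported percentages can be related to one another through the standard confusion-matrix definitions, sketched below. The formulas of Table S4 are not reproduced in the text, so the definitions here are the conventional ones and may differ from it. The counts are an assumption: they are not given in the text and were chosen only because they reproduce the reported values, including the 97.67% figure as the harmonic mean of precision and recall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    def sensitivity(self) -> float:   # identical to recall
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def false_positive_rate(self) -> float:
        return self.fp / (self.fp + self.tn)

    def false_negative_rate(self) -> float:
        return self.fn / (self.fn + self.tp)

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    def f_measure(self) -> float:
        p, r = self.precision(), self.sensitivity()
        return 2 * p * r / (p + r)


# Assumed counts, inferred only because they reproduce the reported
# percentages; the underlying counts are not given in the text.
cm = ConfusionMatrix(tp=21, fp=1, tn=14, fn=0)
for name in ("sensitivity", "specificity", "false_positive_rate",
             "false_negative_rate", "precision", "accuracy", "f_measure"):
    print(f"{name}: {100 * getattr(cm, name)():.2f}%")
```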

Table 5 Statistical analysis of screened protein-inhibitor complexes

Discussion

The main finding of the present study is a new binding pocket, distinct from the aromatic cage, on the surface of HIP14. This new site provides a binding surface for many small-molecule compounds that potentially inhibit the protein. Furthermore, the specific interaction of the inhibitors with this pocket was established on the basis of the statistical results and cross-validation using a web-based server. Interestingly, HIP14 is known to modulate the expression of the htt protein (which causes Huntington disease) through protein-protein interactions. We considered the 225-residue C chain of the ankyrin repeat domain of HIP14 for studying the protein-inhibitor interactions. The N-terminal domain of the protein is involved in substrate recognition [6]. The C chain comprises 7-8 ankyrin repeats but lacks the five predicted transmembrane helices and the DHHC motif that has been reported to be involved in palmitoylation [24]. The protein has also been established as an important drug target, as it regulates the subcellular localization of various oncogenes [4].

Docking tools have turned out to be an effective means of predicting binding pockets and ligand-protein interactions. The docking method exploits the physicochemical properties and surface complementarity of the residues present in the binding site, and the scoring functions rank the various docking outputs accordingly [25]. We confirmed the identification of the new binding pocket using both blind docking and different docking grids spanning various regions of the entire protein. We sought to determine whether the newly predicted site binds various putative inhibitors against Huntington disease. The new binding pocket exhibits high specificity for HIP14-inhibitor interactions. An exhaustive computational analysis of the molecular interactions in protein-inhibitor binding, performed on the dual active site, reveals the critical role played by hydrogen bond interactions. Multiple hydrogen bonds confer stability on the protein-inhibitor complexes [21]. Our analyses indicate that better hydrogen bond interactions lead to better binding of the drug to the protein. The best inhibitors (Tables 3, 4) comprise carbohydrates from the deoxyglucose family, alcohols and tetracycline. They have previously been known to play an important role in the treatment and diagnosis of not only Huntington disease but also cancer [26–29].

The binding of the best inhibitors results in the release of free energy. The analyses of hydrogen bonding interactions indicate that the hydrogen bonds contribute to the binding free energy. This stabilization of the HIP14-inhibitor complex is due to the establishment of hydrogen bonds between the electronegative atoms of the anion and the electropositive protons of the cation [30]. These hydrogen bonds play the most important role in making an inhibitor potent. The best protein-inhibitor complexes exhibit the minimum intermolecular Gibbs free energy. These complexes are stabilized by a multitude of forces acting at the binding pocket. Most of the inhibitors form hydrogen bonds with Val217 and Asn218. In addition, Val217, Leu211, Leu219, Val215 and Ala247 are the major nonpolar aliphatic residues lining the binding pocket that contribute to hydrophobic interactions. The newly predicted site is specific for these inhibitors, as geometry-based docking against the previously known binding site did not show any interactions. Inhibitors with a lower inhibitory constant show higher affinity for the protein. The statistical analysis has shown the results to be accurate and precise. These studies provide a strong foundation for future experimental testing to validate the compounds as true lead molecules that can be used to cure Huntington disease.