Application of two-dimensional binary fingerprinting methods for the design of selective Tankyrase I inhibitors

Muddukrishna, B. S.; Pai, Vasudev; Lobo, Richard; Pai, Aravinda

doi:10.1007/s11030-017-9793-0

Original Article
Published: 22 November 2017

Volume 22, pages 359–381, (2018)

Molecular Diversity

B. S. Muddukrishna¹,
Vasudev Pai²,
Richard Lobo² &
…
Aravinda Pai³

Abstract

In the present study, five important binary fingerprinting techniques were used to model novel flavones for the selective inhibition of Tankyrase I. Of the fingerprints used, the atom pairs fingerprint resulted in a statistically significant 2D QSAR model using a kernel-based partial least square regression method. This model indicates that the presence of electron-donating groups contributes positively to activity, whereas the presence of electron-withdrawing groups contributes negatively. This model could be used to develop more potent and more selective analogues for the inhibition of Tankyrase I.

Graphical Abstract

Schematic representation of 2D QSAR work flow

Introduction

Molecular similarity is one of the most widely applied concepts in rational drug design. It assumes that structurally similar fragments will elicit a similar biological response. The concept of bioisosterism is closely related to molecular similarity: substructures may be interchanged with retention of some degree of biological activity. Assessing molecular similarity by hand is a tedious process. Molecular graphs and molecular fingerprints can be used systematically to find the chemical features responsible for biological activity. Fingerprints are binary vectors representing specific key structures in a chemical entity. Each bit takes a value of either 1 or 0, where 1 represents the presence of a specific chemical feature and 0 represents its absence. Computationally advanced software systems hash fragments into either a 32- or a 64-bit address space. In a 32-bit space, the probability that two different fragments receive the same address is 1/2$^{32}$, and in a 64-bit space the probability is 1/2$^{64}$. Advanced computational software minimizes the probability of such collisions, which helps to give statistically significant results. Molecular similarity may be expressed as a numerical value and applied to similarity measurement, virtual screening and cluster analysis. In the present study, five different types of binary fingerprinting methods were used to predict the structural requirements for the selective inhibition of Tankyrase I by a series of substituted flavones. The suitability of each fingerprint for the selected molecules was assessed using kernel-based partial least square regression statistics. Although Tankyrases have become attractive targets for anticancer agents, there are few effective drugs that inhibit Tankyrase, which motivated this simulation study.
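As an illustration of how a pair of binary fingerprints can be reduced to a single similarity value, the sketch below computes the Tanimoto coefficient. The choice of the Tanimoto coefficient is an assumption made here for illustration (the text does not name a similarity metric), and the bit positions in the example are made up.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two binary fingerprints.

    Each fingerprint is given as a set holding the positions of its 'on' bits
    (the bits equal to 1). The metric itself is an illustrative assumption.
    """
    fp_a, fp_b = set(fp_a), set(fp_b)
    if not fp_a and not fp_b:
        # Convention chosen here: two empty fingerprints count as identical.
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical fingerprints: three shared bits, five distinct bits overall.
fp_1 = {3, 17, 42, 108}
fp_2 = {3, 17, 42, 256}
similarity = tanimoto(fp_1, fp_2)  # 3 / 5
```

Storing only the positions of the "on" bits keeps the sketch short; a production fingerprint would normally use a packed bit vector.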

Cancer is one of the most concerning diseases of the modern world. All cancers share a common characteristic feature: dysregulated cell cycle machinery. Among all cancers, colorectal cancer (CRC) has attracted our attention. The recent literature reports that CRC has the second highest mortality rate among cancers [1]. The exact molecular mechanisms for the development of colorectal cancer have been deduced. The single major factor responsible for the development of colorectal cancer is the overactivation of the Wnt signaling pathway by its central activator $\upbeta $-catenin [2, 3]. Under normal conditions, $\upbeta $-catenin levels are highly regulated by a feedback mechanism involving a $\upbeta $-catenin degradation complex [4]. Gene mutation disrupts the components of the $\upbeta $-catenin degradation complex. This leads to translocation of $\upbeta $-catenin into the nucleus, causing abnormal activation of transcription factors and gene networks responsible for the development of colorectal cancers. There is an urgent need to develop small molecule inhibitors for the selective inhibition of the Wnt signaling pathway, thereby increasing the levels of the $\upbeta $-catenin degradation complex to combat colorectal cancer.

Tankyrases are members of the poly ADP-ribose polymerase (PARP) family of proteins. They have attracted attention because of their role in axin downregulation and stabilization of $\upbeta $-catenin. The post-translational modification involves the cleavage of PARP proteins, which results in splitting of NAD into ADP ribose and nicotinamide units. Poly ADP-ribose polymerases are responsible for many important biochemical signaling processes in the cell machinery. The recently developed Tankyrase inhibitors were able to increase axin levels and downregulate $\upbeta $-catenin levels [5].

Recent literature also gave us insight into the advantage of selective inhibition of Tankyrase I. Partial knockdown of Tankyrase I leads to the shortening of telomere length [6]. The combination of selective Tankyrase I inhibitors with other classes of drugs could be a viable strategy for the treatment of cancers such as colorectal cancer.

An extensive literature search for small molecule inhibitors of Tankyrase I aroused our interest in the flavone pharmacophore due to its diverse biological activity and simple scaffold. Flavones are naturally occurring secondary plant metabolites categorized under the broad class of flavonoids. They possess various degrees of free radical scavenging properties and are present in a wide variety of edible plants and vegetables. Flavonoids have also been shown to possess antitumor effects in various cancer cell lines. Inhibition of TNKS1 by flavone and its antiproliferative properties have already been reported. The present study is based on work reported in the literature [7].

Materials and methods

Selection of data set

In the present study, a data set of 25 out of 30 compounds was chosen from the literature [7] based on their structural diversity and activity. A training set of 19 molecules ($70\%$ of total molecules) was used to generate a kernel-based partial least square regression equation. The training set molecules were selected based on their structural diversity, activity range of 3 log order difference and activities covering the entire range.

To assess the predictive accuracy of the generated kernel-based regression model, a set of 6 molecules was chosen for the test set. The test set was selected in such a way that it is representative of the training set. The training set was used to generate a 2D QSAR model, and the test set was used to validate the generated model.
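The training and test statistics used to judge such a model can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the activity values are invented for the example and are not taken from the study, and the external $Q^{2}$ convention shown (residuals on the test set compared with deviations from the training-set mean) is one common definition that may differ from the one the modelling software applies.

```python
def r_squared(actual, predicted, reference_mean=None):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    With reference_mean=None the mean of `actual` is used (training-set R^2).
    Passing the training-set mean while supplying test-set values gives one
    common form of the external Q^2 (an assumed convention).
    """
    if reference_mean is None:
        reference_mean = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - reference_mean) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Made-up pIC50 values, for illustration only.
train_actual = [5.0, 6.0, 7.0, 8.0]
train_predicted = [5.1, 5.9, 7.2, 7.8]
test_actual = [5.5, 7.5]
test_predicted = [5.7, 7.2]

r2_train = r_squared(train_actual, train_predicted)
train_mean = sum(train_actual) / len(train_actual)
q2_test = r_squared(test_actual, test_predicted, reference_mean=train_mean)
```

A large gap between the training and test values is the usual warning sign of an over-fitted model.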

Activity values (IC50) from the literature were converted to pIC50, the negative base-10 logarithm of IC50, using options available in the calculator. The logarithmic activity is termed pact.
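The conversion to the logarithmic scale can be sketched as below. The function name and the unit table are hypothetical, and the concentration unit of the source data is an assumption: the text does not state it, so the default of nanomolar here is only a placeholder.

```python
import math

# Conversion factors from common concentration units to mol/L.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def pic50_from_ic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 expressed in mol/L).

    The default unit is an assumption; set it to match the source data.
    """
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50 * UNIT_TO_MOLAR[unit])


# A compound with IC50 = 1 uM has a pIC50 of about 6.
example = pic50_from_ic50(1.0, unit="uM")
```

On this scale a tenfold drop in IC50 raises pIC50 by one unit, which is why an activity range of "3 log orders" corresponds to a thousandfold spread in potency.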

Importing the energy minimized structures along with their logarithmic scale activity

The structures were imported into Maestro [8], screened for errors, valence parameters, invalid chemistry and duplicate structures in order to avoid interferences in the generation of the model.

Selection and incorporation of various molecular properties

Molecular descriptors were incorporated using the molecular properties option available in the Canvas 2.9 interface. There are four main types of descriptors in Canvas, namely physicochemical descriptors, topological descriptors, ligfilter descriptors and Qik-prop descriptors.

Feature selection

Feature selection is based on hierarchical clustering to identify subsets of properties that are representative of a larger set. Sixty descriptors were incorporated and are explained briefly below.

1.
Adsorbability index (AI) A molecular descriptor, which predicts the activated carbon adsorption of chemical substances from aqueous solutions. The adsorbability index [9] for a molecule is calculated by the expression:
$$\begin{aligned} {\hbox {AI}}=\sum \limits _{i} A_{i}+\sum \limits _{i} I_{i} \end{aligned}$$
where A represents the atomic or group contributions of increasing or decreasing adsorbability in the chemical species and I represents the necessary correction factors.
2.
ALOGP One of the most applied universal lipophilicity descriptors derived directly from experimental data.
3.
Atomic composition indices Zero-dimensional descriptors deduced directly from the composition of a chemical entity; they provide information on its molecular weight and atomic composition.
4.
Total information index on atomic composition ${(}{I}_{{\mathrm{AC}}})$ This descriptor calculates total information on atomic composition of a molecule directly from its molecular formula including hydrogen atoms [10].

The atomic composition index (${I}_{\mathrm{AC}}$) is calculated using the formula:
$$\begin{aligned} I_{\mathrm{AC}}=A^{\mathrm{h}}\cdot \hbox {log}_{2} A^{\mathrm{h}}- \sum \limits _{g} A_{g}\cdot \hbox {log}_{2} A_{g} \end{aligned}$$
where ${A}^{\mathrm{h}}$ is the total number of atoms including hydrogens and $A_{g}$ is the number of atoms belonging to the gth chemical element.
5.
Nuclear information content descriptor $(I_{{\mathrm{NUCL}}})$ An important descriptor which gives the information on the total number of protons and neutrons present in all the types of nuclei in a given molecule.

It is expressed by the following formula:
$$\begin{aligned} I_{\mathrm{NUCL}}={\mathop {\sum }\limits _{i=1}^{A}} I_{i}^{\mathrm{n,p}} \end{aligned}$$
6.
Information index on size $(I_{{\mathrm{SIZE}}})$ This descriptor provides the total information content based on the number of atoms [11].

It is expressed as:
$$\begin{aligned} I_{\mathrm{SIZE}}=A^\mathrm{h}\cdot \hbox {log}_{2} A^\mathrm{h} \end{aligned}$$
${A}^{\mathrm{h}}$ is the atom number with or without considering hydrogen atoms.
7.
Autocorrelation descriptors ${(\mathrm{AC}}_{{\mathrm{L}}})$ This molecular descriptor is based on the autocorrelation function expressed as
$$\begin{aligned} \hbox {AC}_{\mathrm{L}}={\mathop {\int }\limits _{a}^{b}} f(x)\cdot f(x+l) \cdot {\mathrm{d}x} \end{aligned}$$
8.
Topological electronic descriptors ${(T}^{{E}}{)}$ These descriptors [12] are based on partial atomic charges (q) and are expressed as
$$\begin{aligned} T^{\mathrm{E}}={\mathop {\sum }\limits _{i=1}^{A-1}} {\mathop {\sum }\limits _{j=i+1}^{A}} {\frac{|q_{i}-q_{j}|}{r^{2}_{ij}}} \end{aligned}$$
9.
Partial charge weighted topological electronic index ${(\hbox {PCWT}}^{{\mathrm{E}}}{)}$ A molecular descriptor based on the topological electronic index. It is expressed as
$$\begin{aligned} \hbox {PCWT}^{\mathrm{E}}={\frac{1}{Q^{-}_{\mathrm{max}}}}\cdot {\mathop {\sum }\limits _{b=1}^{B}}\left( {\frac{|q_{i}-q_{j}|}{r^{2}_{ij}}}\right) _{b} \end{aligned}$$
10.
Local dipole index (D) This descriptor calculates the average difference in the magnitude of charge over all bonded atom pairs (i–j) [13] and is expressed as
$$\begin{aligned} D={\frac{\sum \nolimits _{b}|q_{i}-q_{j}|{_{b}}}{B}} \end{aligned}$$
11.
Atom in structure invariant index (ASII) It belongs to a class of charge-related indices derived from hydrogen depleted molecular graphs. It is grouped under the category of global descriptors.

It is expressed as
$$\begin{aligned} {\hbox {ASII}}_{i}={\hbox {ASII}}_{i}^{0}-h_{i}+q_{i} \end{aligned}$$
12.
Charged partial surface area descriptors (CPSA) These are a set of descriptors [14] that combine shape and electronic information to characterize chemical compounds and to quantify the polar interactions between molecules. They mainly consider the van der Waals radius as a key feature for the quantitation.
13.
Partial negative surface area ${(\hbox {PNSA}}_{{1}}{)}$ The total solvent-accessible surface area of all the negatively charged atoms.

It is expressed as
$$\begin{aligned} {\hbox {PNSA}}_{1}=\sum \limits _{a-} \mathrm{SA}_{a}^{-} \end{aligned}$$
14.
Partial positive surface area ${(\hbox {PPSA}}_{{1}}{)}$ It is the total summation of the solvent-accessible surface area of all the positively charged atoms.

It is expressed as
$$\begin{aligned} {\hbox {PPSA}}_{1}=\sum \limits _{a+} \mathrm{SA}_{a}^{+} \end{aligned}$$
The sum is restricted to positively charged atoms $\left( {a+} \right) $
15.
Total charge weighted negative surface area ${(\hbox {PNSA}}_{{2}}{)}$ The product of the partial negative solvent-accessible surface area and the total negative charge (${Q}^{-}$).

It is expressed as
$$\begin{aligned} {\hbox {PNSA}}_{2}=Q^{-} \cdot \sum \limits _{a-} \mathrm{SA}_{a}^{-} \end{aligned}$$
16.
Total charge weighted positive surface area ${(\hbox {PPSA}}_{{2}}{)}$ The product of partial positive solvent-accessible surface area multiplied by the total positive charge (${Q}^{+}$).

It is expressed as
$$\begin{aligned} {\hbox {PPSA}}_{2}=Q^{+} \cdot \sum \limits _{a+} \mathrm{SA}_{a}^{+} \end{aligned}$$
17.
Atomic charge weighted negative surface area ${(\hbox {PNSA}}_{{3}})$ The product of atomic solvent-accessible surface areas and partial charges ${q}_{\mathrm{a}}^{-}$ over all negatively charged atoms.

It is expressed as
$$\begin{aligned} {\hbox {PNSA}}_{3}=\sum \limits _{a-} q_{a}^{-}\cdot \mathrm{SA}_{a}^{-} \end{aligned}$$
18.
Atomic charge weighted positive surface area ${(\hbox {PPSA}}_{{3}})$ The summation of the products of atomic solvent-accessible surface areas and partial charges ${q}_{\mathrm{a}}^{+}$ over all positively charged atoms.

It is expressed as
$$\begin{aligned} {\hbox {PPSA}}_{3}=\sum \limits _{a+} q_{a}^{+}\cdot \mathrm{SA}_{a}^{+} \end{aligned}$$
19.
Difference in the charged partial surface area ${(\hbox {DPSA}}_{{1}}{)}$ The difference between partial positive solvent-accessible surface area and partial negative solvent-accessible surface area.

It is expressed as
$$\begin{aligned} {\hbox {DPSA}}_{1}={\hbox {PPSA}}_{1}-{\hbox {PNSA}}_{1} \end{aligned}$$
20.
Difference in the total charge weighted surface area ${(\hbox {DPSA}}_{{2}}{)}$ The difference between total charge weighted positive solvent-accessible surface area and total charge weighted negative solvent-accessible surface area.

It is expressed as
$$\begin{aligned} {\hbox {DPSA}}_{2}={\hbox {PPSA}}_{2}-{\hbox {PNSA}}_{2} \end{aligned}$$
21.
Difference in the atomic charge weighted surface area ${(\hbox {DPSA}}_{{3}}{)}$ The difference between atomic charge weighted positive solvent-accessible surface area and atomic charge weighted negative solvent-accessible surface area.

It is expressed as
$$\begin{aligned} {\hbox {DPSA}}_{3}={\hbox {PPSA}}_{3}-{\hbox {PNSA}}_{3} \end{aligned}$$
22.
Relative negative charge (RNCG) Partial charge of the most negative atom divided by total negative charge.

It is expressed as
$$\begin{aligned} {\hbox {RNCG}}={\frac{Q_{\mathrm{max}}^{-}}{Q^{-}}} \end{aligned}$$
23.
Relative positive charge (RPCG) Partial charge of the most positive atom divided by total positive charge.

It is expressed as
$$\begin{aligned} {\hbox {RPCG}}={\frac{Q_{\mathrm{max}}^{+}}{Q^{+}}} \end{aligned}$$
24.
Relative negatively charged surface area (RNCS) The solvent-accessible surface area of the most negative atom divided by the relative negative charge (RNCG).

It is expressed as
$$\begin{aligned} {\hbox {RNCS}}={\frac{\hbox {SA}_{\mathrm{max}}^{-}}{{\hbox {RNCG}}}} \end{aligned}$$
25.
Relative positively charged surface area (RPCS) The solvent-accessible surface area of the most positive atom divided by the relative positive charge (RPCG).

It is expressed as
$$\begin{aligned} {\hbox {RPCS}}={\frac{\hbox {SA}_{\mathrm{max}}^{+}}{{\hbox {RPCG}}}} \end{aligned}$$
26.
Total hydrophobic surface area (TASA) The sum of solvent-accessible surface areas of atoms with absolute value of partial charges less than 0.2.

It is expressed as
$$\begin{aligned} {\hbox {TASA}}=\sum \limits _{a} \hbox {SA}_{a} \end{aligned}$$
27.
Total polar surface area (TPSA) The sum of solvent-accessible surface areas of atoms with absolute value of partial charges greater than or equal to 0.2.

It is expressed as
$$\begin{aligned} {\hbox {TPSA}}=\sum \limits _{a} \hbox {SA}_{a} \end{aligned}$$
28.
Relative hydrophobic surface area (RASA) The ratio of total hydrophobic surface area (TASA) to the total molecular solvent-accessible surface area (SASA).

It is expressed as
$$\begin{aligned} {\hbox {RASA}}={\frac{{\hbox {TASA}}}{{\hbox {SASA}}}} \end{aligned}$$
29.
Relative polar surface area (RPSA) The total polar surface area (TPSA) divided by the total molecular solvent-accessible surface area (SASA).

It is expressed as
$$\begin{aligned} {\hbox {RPSA}}={\frac{{\hbox {TPSA}}}{{\hbox {SASA}}}} \end{aligned}$$
30.
RHTA index The ratio of hydrogen bond donor groups to hydrogen bond acceptor groups.

It is expressed as
$$\begin{aligned} {\hbox {RHTA}}={\frac{{\hbox {HBD}}}{{\hbox {HBA}}}} \end{aligned}$$
31.
SSAH index The total surface area of hydrogen atoms that can be readily donated.

It is expressed as
$$\begin{aligned} {\hbox {SSAH}}\equiv {\hbox {HDSA}}= \sum \limits _{d} \hbox {SA}_{d} \end{aligned}$$
32.
RSAH index The average surface area of hydrogen atoms that can be donated.

It is expressed as
$$\begin{aligned} {\hbox {RSAH}} = {\frac{\sum \nolimits _{d} \hbox {SA}_{d}}{{\hbox {HBD}}}} \end{aligned}$$
33.
RSHM index The fraction of the total molecular surface area associated with hydrogen atoms that can be readily donated.

It is expressed as
$$\begin{aligned} {\hbox {RSHM}} \equiv {\hbox {FHDSA}}={\frac{\sum \nolimits _{d} \hbox {SA}_{d}}{{\hbox {SASA}}}} \end{aligned}$$
34.
SSAA index The sum of the surface areas of all hydrogen bond acceptor atoms.

It is expressed as
$$\begin{aligned} {\hbox {SSAA}}\equiv {\hbox {HASA}}= \sum \limits _{a} \hbox {SA}_{a} \end{aligned}$$
35.
RSAM index The fraction of the total molecular surface area associated with H-bond acceptor groups.

It is expressed as
$$\begin{aligned} {\hbox {RSAM}} \equiv {\hbox {FHASA}}={\frac{\sum \nolimits _{a} \hbox {SA}_{a}}{{\hbox {SASA}}}} \end{aligned}$$
36.
HDCA index The sum of charged surface areas of hydrogen atoms that can be donated.

It is expressed as
$$\begin{aligned} {\hbox {HDCA}} = \sum \limits _{d} q_{d} \cdot \hbox {SA}_{d} \end{aligned}$$
37.
HBSA index The sum of the surface areas of the hydrogen atoms that can be donated and of the hydrogen bond acceptor atoms.

It is expressed as
$$\begin{aligned} {\hbox {HBSA}}= {\hbox {HDSA}}+ {\hbox {HASA}} \end{aligned}$$
38.
Graph distance complexity Molecular descriptor derived from the distance matrix D [15].

It is expressed as
$$\begin{aligned} H_{D}={\mathop {\sum }\limits _{i=1}^{A}}{\frac{\sigma _{i}}{I_{\mathrm{ROUV}}}} \cdot v_{i}^{d}= {\mathop {\sum }\limits _{i=1}^{A}} {\frac{\sigma _{i}}{2W}}\cdot v_{i}^{d} \end{aligned}$$
39.
Polar hydrogen factor ${(Q}_{{\mathrm{H}}}{)}$ This descriptor correlates polarity of molecules to C–H bonds. Its application is limited to halogenated hydrocarbons.

It is expressed as
$$\begin{aligned} {Q}_{\mathrm{H}}=\sum \limits _{b} \left[ \sum \limits _{C} k_{C}+ \sum \limits _{\alpha } k_{\alpha }+ \sum \limits _{\beta } k_{\beta }\right] \end{aligned}$$
40.
Q polarity index Topological polarity index derived from the electro topological intrinsic state of the atoms confined in a molecule.

It is expressed as
$$\begin{aligned} {Q}={\frac{A^{2}\cdot {\mathop {\sum }\nolimits _{i=1}^{A}} I_{i}^{\mathrm{ALK}}}{\left( {\mathop {\sum }\nolimits _{i=1}^{A}} I_{i}\right) ^{2}}} \end{aligned}$$
41.
Molecular polarizability effect index (MPEI) This descriptor works by the principle that molecules are polarized by electrostatic potential fields [16]. The index is calculated by summing the polarizability contributions from different atoms in a molecule.

It is expressed as
$$\begin{aligned} \hbox {MPEI}={\mathop {\sum }\limits _{i=1}^{A}}\hbox { PEI}_{i} \end{aligned}$$
42.
Balaban distance connectivity index (J) It is one of the topological descriptors and its value does not vary substantially with the size of the molecules or number of ring systems.

It is expressed as
$$\begin{aligned}&{J}={\frac{B}{C+1}}\cdot \sum \limits _{b} (\sigma _{i}\cdot \sigma _{j})_{b}^{-1/2}\\&\quad = {\frac{1}{C+1}}\cdot \sum \limits _{b} (\bar{\sigma }_{i}\cdot \bar{\sigma }_{j})_{b}^{-1/2} \end{aligned}$$
43.
Atomic charge (q) A common computational approach to calculating atomic charge is Mulliken population analysis [17]. The method allocates electrons to atoms, making atomic charge a local descriptor.

It is expressed as
$$\begin{aligned} q_{a}={Z}_{a}-{\mathop {\sum }\limits _{\mu =1}^{{N}_{\mathrm{AO}}}} {\mathop {\sum }\limits _{v=1}^{{N}_{\mathrm{AO}}}} P_{\mu v}\cdot S_{\mu v} \end{aligned}$$
44.
Sub-molecular polarity parameter ${(\hbox {SPP}}^{{1}} \Delta )$: An electronic descriptor [18] that mathematically determines the excess charge difference between a pair of atoms.

It is expressed as
$$\begin{aligned} {}^{1}\Delta =\left| Q_{\mathrm{max}}^{+}-Q_{\mathrm{max}}^{-}\right| \end{aligned}$$
45.
Second-order sub-molecular polarity parameter ${(}^{{2}} \Delta )$: Determines the second prime difference of excess charges [19].

It is expressed as
$$\begin{aligned} {DP}={\frac{\left| Q_{\mathrm{max}}^{+}-Q_{\mathrm{max}}^{-}\right| }{r_{\pm }^{2}}}= {\frac{{}^{1}\Delta }{r_{\pm }^{2}}} \end{aligned}$$
46.
Molar polarization ${(P}_{{\mathrm{M}}}{)}$ The dipole moment induced for each unit volume V is termed molar polarization. Clausius–Mossotti equation explains this descriptor.

It is expressed as
$$\begin{aligned}&P_{\mathrm{M}}={\frac{\varepsilon -1}{\varepsilon +2}}\cdot {\frac{\hbox {MW}}{\varrho }}={\frac{4\pi }{3}}\cdot {N}_{\mathrm{A}} \cdot \alpha \cdot \\&{E}={\frac{n_{D}^{2}-1}{n_{D}^{2}+2}}\cdot {\frac{\hbox {MW}}{\varrho }}=\hbox {MR} \end{aligned}$$
47.
Atom–atom polarizability A chemical reactivity index solely calculated on the basis of perturbation theory.

It is expressed as
$$\begin{aligned} {P}_{ab}=4\cdot \sum \limits _{i}\sum \limits _{j}\sum \limits _{\mu }\sum \limits _{v} {\frac{C_{i\mu ,a}\cdot C_{j\mu ,a}\cdot C_{iv,b}\cdot C_{jv,b}}{\varepsilon _{i}-\varepsilon _{j}}} \end{aligned}$$
48.
Anisotropy of the polarizability It quantifies the deviation of molecular polarizability from an equivalent spherical shape.

It is expressed as
$$\begin{aligned} \beta ^{2}={\frac{\left( \alpha _{xx}-\alpha _{yy}\right) ^{2}+\left( \alpha _{yy}-\alpha _{zz}\right) ^{2}+ \left( \alpha _{zz}-\alpha _{xx}\right) ^{2}}{2}} \end{aligned}$$
49.
Overall electronic constants These are the Hammett substitution constants [20] which measure the total electronic effect of meta and para substituted benzene rings comprised of substituents in a side chain.

It is expressed as
$$\begin{aligned} \sigma _{\mathrm{m,p}}={\frac{1}{\varrho }}\cdot \hbox {log} \left( {\frac{K^{\mathrm{X}}}{K_{0}}}\right) ={\frac{1}{\varrho }}\cdot \left( \hbox {p}K_{\mathrm{a}}^{0}-{p}K_{\mathrm{a}}^{\mathrm{X}}\right) \end{aligned}$$
50.
Information index on the molecular symmetry ${(I}_{{\mathrm{SYM}}})$: An important molecular symmetry descriptor which is solely based on total information content.

It is expressed as
$$\begin{aligned} I_{\mathrm{SYM}}=A\cdot \hbox {log}_{2}A- {\mathop {\sum }\limits _{g=1}^{G}} A_{g} \hbox {log}_{2} A_{g} \end{aligned}$$
51.
Joshi steric descriptor ${(\mathrm{JM}}_{{1}}{)}$ A descriptor that directly measures the steric effect [21] of substituents. It is grouped under the broad class of quantum chemical descriptors.

It is expressed as
$$\begin{aligned} \hbox {JM1}={\frac{\Delta {E}_{\mathrm{X}}}{\Delta {E}_{\mathrm{H}}}}\quad \hbox {log}(\hbox {JM1})=\hbox {log}(\Delta {E}_{\mathrm{X}})- \hbox {log}(\Delta {E}_{\mathrm{H}}) \end{aligned}$$
52.
Substituent front strain ${(S}_{{\mathrm{f}}}{)}$ A steric descriptor [22] obtained using empirical force fields and calculated based on enthalpy of formation.

It is expressed as
$$\begin{aligned} S_{\mathrm{f}}= & {} \Delta H_{\mathrm{f}}^{0}\left[ \hbox {XC}(\hbox {CH}_{3})_{3}\right] -\Delta H_{\mathrm{f}}^{0}\left[ \hbox {XCH}_{3}\right] \\&+\,8.87\,\,[10^{4}\,\hbox {J/mol}] \end{aligned}$$

53.
Steric vertex topological descriptor (SVTI) Best performing steric descriptor for alkyl groups [23] characterized by their topological distance (d) from an H-depleted molecular graph.

It is expressed as
$$\begin{aligned} {\hbox {SVTI}}={\mathop {\sum }\limits _{j=1}^{A_{\mathrm{X}}}} d_{ij}\quad \forall d_{ij}\le 3 \end{aligned}$$
54.
Steric density descriptor ${(\mathrm{SD}}_{{X}}{)}$ The substituent steric descriptor [24], which correlates molecular mass to van der Waals volume.

It is expressed as
$$\begin{aligned} {\hbox {SD}}_{X}= & {} \left( {\frac{MW}{{V}_{\mathrm{VDW}}}}\right) _{\mathrm{X}}- \left( {\frac{MW}{{V}_{\mathrm{VDW}}}}\right) _{\mathrm{H}}\\= & {} \left( {\frac{MW}{{V}_{\mathrm{VDW}}}}\right) _{\mathrm{X}}-0.29 \end{aligned}$$

Table 1 KPLS results on training set

Table 2 KPLS results on test set

Table 3 Activity prediction (actual vs. predicted)
55.
Model of the Frontier steric effect descriptor ${(R}_{{S}}{)}$ Theoretical descriptor used to estimate Taft’s steric constant [25] which is based on the fundamental characteristics of constituent atoms.

It is expressed as
$$\begin{aligned} R_{\mathrm{S}}=-30 \cdot \hbox {log}\left( 1-{\mathop {\sum }\limits _{i=1}^{n}} {\frac{R_{i}^{2}}{4\cdot r_{i}^{2}}}\right) \end{aligned}$$
56.
Carbo similarity index (C) This descriptor [26] compares two molecules based on their electron density. It is also applied to compare any structural properties between molecules.

It is expressed as
$$\begin{aligned} {C}_{st}={\frac{{\mathop {\sum }\nolimits _{k=1}^{\mathrm{N}}} {P}_{sk}\cdot {P}_{tk}}{{\left( {\mathop {\sum }\nolimits _{k=1}^{\mathrm{N}}{P}_{sk}^{2}}\right) ^{1/2}}\cdot {\left( {\mathop {\sum }\nolimits _{k=1}^{\mathrm{N}}{P}_{tk}^{2}}\right) ^{1/2}}}} \end{aligned}$$
57.
Electrophilic atomic frontier electron density descriptor ${(f}_{{a}}^{-}{)}$ Molecular descriptor reflecting the electron density status in the HOMO orbital of a compound.

It is expressed as
$$\begin{aligned} {f}_{a}^{-}=\sum \limits _{\mu }\left( c_{\mathrm{HOMO},\mu }\right) ^{2} \end{aligned}$$
58.
Hardness indices (${{\eta }}$) These are the class of descriptors derived directly from density functional theory [27]. They correspond to the second-order derivative energy levels with respect to the total number of electrons in the system.

It is expressed as
$$\begin{aligned} \eta= & {} {\frac{1}{2}}\left( {\frac{\partial ^{2}{E}}{\partial {N}_{\mathrm{el}}{^{2}}}}\right) _{v({r})}= \left( {\frac{\partial \mu }{\partial {N}_{\mathrm{el}}}}\right) _{v({r})}\\= & {} \int {h}({r}){ \mathrm{d}r}= {\frac{1}{2\cdot {S}}} \end{aligned}$$
59.
Composite nuclear potential $({{\nu }}{(r)})$ This descriptor defines the pattern of the nuclei of a molecule.

It is expressed as
$$\begin{aligned} v({r})={\mathop {\sum }\limits _{a=1}^{A}} {\frac{{Z}_{a}}{|{r}-{R}_{a}|}} \end{aligned}$$
60.
Kier alpha molecular flexibility index (${{\varPhi }}$) Direct measurement of molecular flexibility [28] derived from Kier alpha adapted shape descriptors.

It is expressed as
$$\begin{aligned} \varPhi ={\frac{{}^{1} {K}_{\alpha } \cdot {}^{2}{K}_{\alpha }}{A}} \end{aligned}$$
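Several of the descriptors listed above reduce to simple sums once per-atom data are available. The sketch below evaluates the information indices $I_{\mathrm{AC}}$ and $I_{\mathrm{SIZE}}$ from a molecular formula and the charged partial surface area descriptors (PNSA, PPSA and DPSA, orders 1 to 3) from per-atom partial charges and solvent-accessible surface areas. It is an illustration, not the Canvas implementation: the function names are hypothetical, and the charges and surface areas in the example are invented, since in practice they come from a charge model and a surface calculation.

```python
import math


def info_index_atomic_composition(formula):
    """I_AC = A^h * log2(A^h) - sum_g A_g * log2(A_g).

    `formula` maps element symbols to atom counts, hydrogens included.
    """
    total = sum(formula.values())
    per_element = sum(n * math.log2(n) for n in formula.values() if n > 0)
    return total * math.log2(total) - per_element


def info_index_size(formula):
    """I_SIZE = A^h * log2(A^h), with A^h the total atom count."""
    total = sum(formula.values())
    return total * math.log2(total)


def cpsa_descriptors(charges, areas):
    """PNSA/PPSA/DPSA of orders 1-3 from per-atom charges and surface areas."""
    negative = [(q, sa) for q, sa in zip(charges, areas) if q < 0]
    positive = [(q, sa) for q, sa in zip(charges, areas) if q > 0]

    pnsa1 = sum(sa for _, sa in negative)   # area of negatively charged atoms
    ppsa1 = sum(sa for _, sa in positive)   # area of positively charged atoms
    q_neg = sum(q for q, _ in negative)     # total negative charge Q-
    q_pos = sum(q for q, _ in positive)     # total positive charge Q+

    result = {
        "PNSA1": pnsa1,
        "PPSA1": ppsa1,
        "PNSA2": q_neg * pnsa1,
        "PPSA2": q_pos * ppsa1,
        "PNSA3": sum(q * sa for q, sa in negative),
        "PPSA3": sum(q * sa for q, sa in positive),
    }
    for order in ("1", "2", "3"):
        result["DPSA" + order] = result["PPSA" + order] - result["PNSA" + order]
    return result


# Benzene, C6H6: twelve atoms split evenly over two elements.
benzene = {"C": 6, "H": 6}
i_ac = info_index_atomic_composition(benzene)
i_size = info_index_size(benzene)

# Invented three-atom example (charges in e, areas in square angstroms).
cpsa = cpsa_descriptors(charges=[-0.4, 0.1, 0.3], areas=[10.0, 20.0, 5.0])
```

Because PNSA$_{2}$ and PNSA$_{3}$ carry the sign of the negative charges, each DPSA difference is larger than the corresponding positive term alone.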

Incorporation of Binary fingerprints

The seven available binary fingerprints were incorporated using the Canvas interface. Only the models with a significant contribution were retained for further processing.

Kernel-based partial least square regression using different binary fingerprints

1.
Binary fingerprint—atom pairs [29]

Description: represents pairs of atoms, which are differentiated by atom type and distance of separation.

Table 4 KPLS results on training set

Table 5 Results of KPLS on test set

This fingerprint is built on the concept of atom pairs. A molecular entity is broken into smaller fragments, each of which is hashed to give an integer value, and the shortest distance between any two atoms is given by the term d. The contribution of the model to the activity is assessed by kernel-based partial least square regression, considering the regression statistics (${R}^{2}$ and ${Q}^{2}$ values).

2.
Binary fingerprint—atom triplets [29]

Description: triplets of atoms and three distances separating them.

The model is an extension of the atom-pair fingerprint. Triplets represent three atoms and the distances by which they are separated. A triplet can be presented in 6 different ways according to the theory of permutation and combination. An operation is performed to exclude the bits which correspond only to the permutation Type$_{a}$-$d_{ab}$-Type$_{b}$-$d_{bc}$-Type$_{c}$-$d_{ca}$. The assessment method is identical to that described under atom pairs.

3.
Binary fingerprint—fp linear [29]

Description: linear fragments $+$ ring closures.

The linear fingerprint fragments a molecule along every possible linear path. By default, paths of up to 7 bonds are considered. A hashing operation is performed for each linear fragment to generate a corresponding bit address. To apply a linear path to closed ring systems, linear paths may be extended up to 14 bonds. The linear fingerprints can be extensively applied to compounds with extended ring systems. The assessment method used is the same as for atom pairs.

4.
Binary fingerprint—2D molprint [29]

Description: The molprint fingerprint uses the heavy atoms present in a molecule and their surrounding environment, up to a maximum separation of two bonds. Each bit is derived from a stored record containing a heavy atom and the atoms separated from it by one or two bonds. The assessment method used is the same as for atom pairs.

5.
Binary fingerprint—fp dendritic [29]

Description: Linear and branched fragments.

The dendritic fingerprint uses a combination of both linear and branched fragments up to a user-defined value of 5 bonds. There is no special treatment for ring systems, which are considered as branched fragments. The assessment method used is the same as for atom pairs.
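The atom-pair idea described in this section can be sketched in a few lines. The code below is an illustration under stated assumptions, not the Canvas algorithm: atoms are typed by element symbol only, the topological distance is the shortest bond path found by breadth-first search, and each pair is hashed with CRC32 into a small bit vector whose length (`n_bits`) is a hypothetical parameter chosen for readability.

```python
import zlib
from collections import deque
from itertools import combinations


def shortest_path_lengths(n_atoms, bonds):
    """Bond-count distances between all connected atoms, by breadth-first search."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    distances = {}
    for start in range(n_atoms):
        seen = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen[neighbour] = seen[current] + 1
                    queue.append(neighbour)
        distances[start] = seen
    return distances


def atom_pair_fingerprint(atom_types, bonds, n_bits=1024):
    """Return the set of 'on' bit positions of an atom-pair fingerprint."""
    distances = shortest_path_lengths(len(atom_types), bonds)
    on_bits = set()
    for i, j in combinations(range(len(atom_types)), 2):
        if j not in distances[i]:
            continue  # atoms in disconnected fragments form no pair
        # Sort the two types so the key does not depend on atom ordering.
        first, second = sorted((atom_types[i], atom_types[j]))
        key = "{}-{}-{}".format(first, distances[i][j], second)
        # CRC32 is deterministic across runs, unlike Python's built-in hash().
        on_bits.add(zlib.crc32(key.encode("utf-8")) % n_bits)
    return on_bits


# Heavy atoms of ethanol (C-C-O): pairs C-1-C, C-1-O and C-2-O.
ethanol_bits = atom_pair_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
```

Folding the hash into a short vector makes collisions more likely than in a 32- or 64-bit address space, which is the trade-off between fingerprint size and resolution.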

Results and discussion

1.
Results of kernel-based partial least square [30] regression on fingerprint—atom pairs (Figs. 1, 2)

The fingerprint atom pairs gave a statistically significant 2D QSAR model with excellent regression coefficient values and cross-validation coefficient values as represented in Tables 1 and 2. The model also showed good predictive accuracy in both test and training set molecules as demonstrated in Table 3. The hashing technique used is represented in Fig. 3.

For the assessment of atomic contribution to the model, three molecules were taken from each active and inactive set. Atoms positively contributing to activity were colored blue and atoms detrimentally contributing to the model were colored yellow which is shown in Fig. 4.

Table 6 Activity prediction (actual vs. predicted)

2.
Results of kernel-based partial least square regression on fingerprint—atom triplets (Figs. 5, 6)

Table 7 KPLS results on training set

The fingerprint atom triplets gave a statistically significant 2D QSAR model with excellent regression coefficient values and cross-validation coefficient values as represented in Tables 4 and 5. The model also showed good predictive accuracy in both test and training set molecules as evidenced in Table 6. The hashing pattern in the atom triplet is shown in Fig. 7.

Table 8 KPLS results on test set

Table 9 Activity prediction (actual vs. predicted)

To assess atomic contributions to the model, three molecules were taken from each of the active and inactive sets. Atoms contributing positively to activity are colored yellow and atoms contributing detrimentally are colored green, as shown in Fig. 8.

3.
Results of kernel-based partial least square regression on fingerprint—fp linear (Figs. 10, 11)

The fingerprint fp linear gave a statistically significant 2D QSAR model with excellent regression coefficient values and cross-validation coefficient values as represented in Tables 7 and 8. The model also showed good predictive ability in both test and training set molecules as evidenced in Table 9. The hashing pattern in fp linear is represented in Fig. 9.

To assess atomic contributions to the model [31], three molecules were taken from each of the active and inactive sets. Atoms contributing positively to activity are colored red and atoms contributing detrimentally are colored blue, as shown in Fig. 12.

4.
Results of kernel-based partial least square regression on fingerprint—2D molprint (Figs. 13, 14)

The 2D molprint fingerprint gave a statistically insignificant 2D QSAR model with a large difference between the regression coefficient and cross-validation coefficient values, as presented in Tables 10 and 11. The model also showed poor predictive accuracy for both test and training set molecules, as shown in Table 12. 2D molprint is therefore not a suitable fingerprint for the selected flavone class. The hashing pattern is represented in Fig. 15.
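A large gap between the fit on the training set and the fit on held-out molecules is the usual sign that a model does not generalize. The sketch below computes both quantities with one common definition of the coefficient of determination and flags a large gap. The activity values are invented for illustration and are not taken from the tables of this study, and the 0.3 threshold is a widely quoted rule of thumb used here as an assumption, not a criterion stated by the authors.

```python
def determination(observed, predicted):
    """1 - SS_res / SS_tot; used for R2 (training set) and Q2 (held-out set)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def looks_overfitted(r2_train, q2_test, max_gap=0.3):
    """Flag a model whose training fit exceeds its held-out fit by more than max_gap."""
    return (r2_train - q2_test) > max_gap

# Hypothetical activities (illustrative numbers only).
train_obs, train_pred = [5.0, 6.0, 7.0, 8.0], [5.1, 5.9, 7.1, 7.9]
test_obs, test_pred = [5.5, 6.5, 7.5], [6.5, 6.0, 6.5]

r2 = determination(train_obs, train_pred)
q2 = determination(test_obs, test_pred)
flag = looks_overfitted(r2, q2)
```

In this toy case the training fit is close to 1 while the held-out value is negative, meaning the predictions are worse than simply using the mean activity, so the model is flagged.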

To assess atomic contributions to the model, three molecules were taken from each of the active and inactive sets. Atoms contributing positively to activity are colored maroon and atoms contributing detrimentally are colored blue, as shown in Fig. 16.

5.
Results of kernel-based partial least square regression on fingerprint—fp dendritic

The fingerprint fp dendritic gave a statistically insignificant 2D QSAR model with a large difference in the regression coefficient values and cross-validation coefficient values as represented in Tables 13 and 14. Therefore, fp dendritic [10] is not a suitable fingerprint for the selected flavone class (Figs. 17, 18, 19). The model also showed poor predictive ability in both test and training set molecules as evidenced in Table 15. The hashing pattern is represented in Fig. 20.

To assess atomic contributions to the model, three molecules were taken from each of the active and inactive sets. Atoms contributing positively to activity are colored orange and atoms contributing detrimentally are colored maroon, as shown in Fig. 19.

Structure–activity relationship studies

Figure 21 shows that substitution with electron-donating groups (e.g., hydroxyl) on rings A and C increased activity in all five fingerprint models. The only exception was molecule Tan 1, which is active even though it carries no substituents on either ring A or ring C. Substitution with electron-withdrawing groups on rings A and C decreased activity in all five fingerprint models. This may be attributed to the electronic effects exerted by the various functional groups on the aromatic rings.
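One conventional way to put the donating or withdrawing character of a substituent on a numerical scale is the Hammett substituent constant. The sketch below classifies a few common groups by the sign of their para constant. Using Hammett constants is an assumption made for illustration, since this study does not state which electronic parameter was considered, and the helper name `electronic_class` is hypothetical.

```python
# Hammett para-substituent constants (sigma_p), standard tabulated values.
SIGMA_PARA = {
    "NH2": -0.66,
    "OH": -0.37,
    "OCH3": -0.27,
    "CH3": -0.17,
    "H": 0.00,
    "F": 0.06,
    "Cl": 0.23,
    "CN": 0.66,
    "NO2": 0.78,
}

def electronic_class(substituent):
    """Classify a substituent by the sign of its sigma_p value."""
    sigma = SIGMA_PARA[substituent]
    if sigma < 0:
        return "electron-donating"
    if sigma > 0:
        return "electron-withdrawing"
    return "neutral"

# Sort from most donating to most withdrawing.
ranked = sorted(SIGMA_PARA, key=SIGMA_PARA.get)
```

Sorting by the constant places –NH2 and –OH at the donating end and –NO2 at the withdrawing end, which matches the direction of the trend described above.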

Table 10 KPLS results on training set

Table 11 KPLS results on test set

Table 12 Activity prediction (actual vs. predicted)

Table 13 KPLS results on training set

Table 14 KPLS results on test set

Table 15 Activity prediction (actual vs. predicted)

Analysis of the binding of flavones in the active site of Tankyrase I shows that hydrogen bond donor interactions are prominent for effective binding. Consistent with this, electron-donating groups such as –OH and –NH2 were found to increase binding affinity toward Tankyrase I according to the developed QSAR model.

Conclusions

Kernel-based partial least square regression was performed on a series of novel mono-substituted flavones using five binary fingerprinting methods. How well each fingerprint models the activity depends on several factors. In the present study, the atom pairs fingerprint gave a statistically significant 2D QSAR model with excellent regression values. The success of the atom pairs fingerprint is attributed to molecular size and molecular weight: all the selected test compounds were bicyclic fused systems with mono-substitutions on rings A and C, and their molecular weights were in the range of 200–400 Daltons. The atom triplets fingerprint also gave a statistically significant 2D QSAR equation, owing to atom triplets that occur at a particular site and at a particular distance. The three remaining fingerprints, namely linear, 2D molprint and dendritic, could not reach acceptable regression values. This failure may be attributed to the atom typing scheme and structural variation. Suitable fingerprints should be selected based on structure, molecular size, types of ring systems and nature of extended branching.

References

1. Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM (2010) Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer 127:2893–2917. https://doi.org/10.1002/ijc.25516
2. Morin PJ, Sparks AB, Korinek V, Barker N, Clevers H, Vogelstein B, Kinzler KW (1997) Activation of β-catenin-TCF signaling in colon cancer by mutations in β-catenin or APC. Science 275:1787–1790. https://doi.org/10.1126/science.275.5307.1787
3. De Sousa EMF, Vermeulen L, Richel D, Medema JP (2011) Targeting WNT signaling in colon cancer stem cells. Clin Cancer Res 17:647–653. https://doi.org/10.1158/1078-0432.CCR-10-1204
4. Ikeda S, Kishida M, Matsuura Y, Usui H, Kikuchi A (2000) GSK-3β-dependent phosphorylation of adenomatous polyposis coli gene product can be modulated by β-catenin and protein phosphatase 2A complexed with Axin. Oncogene 19:537. https://doi.org/10.1038/sj.onc.1203359
5. Huang SMA, Mishina YM, Liu S, Cheung A, Stegmeier F, Michaud GA, Hild M (2009) Tankyrase inhibition stabilizes axin and antagonizes Wnt signalling. Nature 461:614–620. https://doi.org/10.1038/nature08356
6. Donigian JR, de Lange T (2007) The role of the poly (ADP-ribose) polymerase tankyrase1 in telomere length control by the TRF1 component of the shelterin complex. J Biol Chem 282:22662–22667. https://doi.org/10.1074/jbc.M702620200
7. Narwal M, Koivunen J, Haikarainen T, Obaji E, Legala OE, Venkannagari H, Lehtiö L (2013) Discovery of tankyrase inhibiting flavones with increased potency and isoenzyme selectivity. J Med Chem 56:7880–7889. https://doi.org/10.1021/jm201510p
8. Schrödinger Release 2017-2: Maestro, Schrödinger, LLC, New York, NY (2017)
9. Okouchi S, Saegusa H (1989) Prediction of soil sorption coefficients of hydrophobic organic pollutants by adsorbability index. Bull Chem Soc Jpn 62:922–924. https://doi.org/10.1246/bcsj.62.922
10. Dancoff SM, Quastler H (1953) The information content and error rate of living things. In: Essays on the use of information theory in biology. University of Illinois Press, Urbana, p 263
11. Bertz SH (1981) The first general index of molecular complexity. J Am Chem Soc 103:3599–3601. https://doi.org/10.1021/ja00402a071
12. Ośmiałowski K, Halkiewicz J, Kaliszan R (1986) Quantum chemical parameters in correlation analysis of gas–liquid chromatographic retention indices of amines. J Chromatogr A 361:63–69
13. Karelson M, Lobanov VS, Katritzky AR (1996) Quantum-chemical descriptors in QSAR/QSPR studies. Chem Rev 96:1027–1040. https://doi.org/10.1021/cr950202r
14. Stanton DT, Jurs PC (1990) Development and use of charged partial surface area structural descriptors in computer-assisted quantitative structure–property relationship studies. Anal Chem 62:2323–2329. https://doi.org/10.1021/ac00220a013
15. Raychaudhury C, Ray SK, Ghosh JJ, Roy AB, Basak SC (1984) Discrimination of isomeric structures using information theoretic topological indices. J Comput Chem 5:581–588. https://doi.org/10.1002/jcc.540050612
16. Chenzhong C, Zhiliang L (1998) Molecular polarizability: a relationship to water solubility of alkanes and alcohols. J Chem Inf Comput Sci 38:1–7. https://doi.org/10.1021/ci9601729
17. Mulliken RS (1955) Electronic population analysis on LCAO–MO molecular wave functions. J Chem Phys 23:1833–1840. https://doi.org/10.1063/1.1740588
18. Knox JH, Kaliszan R (1985) Theory of solvent disturbance peaks and experimental determination of thermodynamic dead-volume in column liquid chromatography. J Chromatogr A 349:211–234
19. Luco JM, Yamin LJ, Ferretti HF (1995) Molecular topology and quantum chemical descriptors in the study of reversed-phase liquid chromatography: hydrogen-bonding behavior of chalcones and flavonones. J Pharm Sci 84:903–908. https://doi.org/10.1002/jps.2600840722
20. Hammett LP (1970) Physical organic chemistry: reaction rates, equilibria and mechanism. McGraw-Hill, New York
21. Joshi RK, Meister T, Scapozza L, Ha TK (1994) A new quantum chemical approach in QSAR-analysis: parametrisation of conformational energies into molecular descriptors JMn (steric) and JSn (electronic). Arzneimittelforschung 44:779–790
22. Beckhaus HD (1978) $\cal{S}_f$ parameters: a measure of the front strain of alkyl groups. Angew Chem Int Ed Engl 17:593–594. https://doi.org/10.1002/cber.19781110107
23. Ivanciuc O, Balaban AT (1996) Design of topological indices: a new topological parameter for the steric effect of alkyl substituents. Croat Chem Acta 69:75–83. https://doi.org/10.1021/ci034266b
24. Dash SC, Behera GB (1980) A new steric parameter to explain ortho-substituent effect. Indian J Chem Sect A 19:541–543
25. Miyaki Y, Einaga Y, Fujita H (1978) Excluded-volume effects in dilute polymer solutions: very high molecular weight polystyrene in benzene and cyclohexane. Macromolecules 11:1180–1186. https://doi.org/10.1021/ma60066a022
26. Carbó R, Leyda L, Arnau M (1980) How similar is a molecule to another? An electron density measure of similarity between two molecular structures. Int J Quantum Chem 17:1185–1189
27. Parr RG, Pearson RG (1983) Absolute hardness: companion parameter to absolute electronegativity. J Am Chem Soc 105:7512–7516. https://doi.org/10.1021/ja00364a005
28. Kier LB (1989) An index of molecular flexibility from kappa shape attributes. Mol Inf 8:221–224. https://doi.org/10.1021/acs.jcim.6b00565
29. Sastry M, Lowrie JF, Dixon SL, Sherman W (2010) Large-scale systematic analysis of 2D fingerprint methods and parameters to improve virtual screening enrichments. J Chem Inf Model 50:771–784. https://doi.org/10.1021/ci100062n
30. Duan J, Dixon SL, Lowrie JF, Sherman W (2010) Analysis and comparison of 2D fingerprints: insights into database screening performance using eight fingerprint methods. J Mol Graph Model 29:157–170. https://doi.org/10.1016/j.jmgm.2010.05.008
31. Schrödinger Release 2016-3: Canvas, Schrödinger, LLC, New York, NY (2016)

Acknowledgements

We sincerely acknowledge the support of Mr. Mikal Rekdal, Department of Chemical Engineering, Norwegian University of Science and Technology, Norway. The authors acknowledge Manipal University for providing the necessary facilities and Schrödinger Inc., USA, for the software and technical support.

Author information

Authors and Affiliations

Department of Pharmaceutical Quality Assurance, Manipal College of Pharmaceutical Sciences (MCOPS), Manipal University, Manipal, Karnataka, India
B. S. Muddukrishna
Department of Pharmacognosy, Manipal College of Pharmaceutical Sciences (MCOPS), Manipal University, Manipal, Karnataka, India
Vasudev Pai & Richard Lobo
Department of Pharmaceutical Chemistry, Manipal College of Pharmaceutical Sciences (MCOPS), Manipal University, Manipal, Karnataka, 576 104, India
Aravinda Pai

Corresponding author

Correspondence to Aravinda Pai.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (rar 884 KB)

About this article

Cite this article

Muddukrishna, B.S., Pai, V., Lobo, R. et al. Application of two-dimensional binary fingerprinting methods for the design of selective Tankyrase I inhibitors. Mol Divers 22, 359–381 (2018). https://doi.org/10.1007/s11030-017-9793-0

Received: 12 January 2017
Accepted: 24 October 2017
Published: 22 November 2017
Issue Date: May 2018
DOI: https://doi.org/10.1007/s11030-017-9793-0
