Bond-based 2D TOMOCOMD-CARDD approach for drug discovery: aiding decision-making in ‘in silico’ selection of new lead tyrosinase inhibitors

Marrero-Ponce, Yovani; Khan, Mahmud Tareq Hassan; Casañola-Martín, Gerardo M.; Ather, Arjumand; Sultankhodzhaev, Mukhlis N.; García-Domenech, Ramón; Torrens, Francisco; Rotondo, Richard

doi:10.1007/s10822-006-9094-7


Original Paper
Published: 28 February 2007

Volume 21, pages 167–188, (2007)

Journal of Computer-Aided Molecular Design


Yovani Marrero-Ponce^1,2,3,
Mahmud Tareq Hassan Khan^4,5,
Gerardo M. Casañola-Martín^2,6,
Arjumand Ather⁷,
Mukhlis N. Sultankhodzhaev⁸,
Ramón García-Domenech³,
Francisco Torrens¹ &
…
Richard Rotondo⁹


Abstract

In this paper, we present a new set of bond-level TOMOCOMD-CARDD molecular descriptors (MDs), the bond-based bilinear indices, based on a bilinear map similar to those defined in linear algebra. These novel MDs are used here in Quantitative Structure–Activity Relationship (QSAR) studies of tyrosinase inhibitors, to find functions that discriminate between tyrosinase inhibitors and inactive compounds. In total, 14 models were obtained, and the two best discriminant functions (Eqs. 32 and 33) showed good global classification of 91.00% and 90.17%, respectively, in the training set. The test set accuracies were 93.33% and 88.89% for models 32 and 33, respectively. A simulated virtual screening was also carried out to assess the quality of the models. In a final step, the fitted models were used in the biosilico identification of newly synthesized tetraketones, and good agreement was observed between the theoretical and experimental results. Four of the novel compounds discovered as tyrosinase inhibitors, TK10 (IC₅₀ = 2.09 μM), TK11 (IC₅₀ = 2.61 μM), TK21 (IC₅₀ = 2.06 μM) and TK23 (IC₅₀ = 3.19 μM), showed more potent activity than L-mimosine (IC₅₀ = 3.68 μM). In addition, a heterogeneous database of tyrosinase inhibitors was collected for this study, which could be a useful tool for researchers working on the tyrosinase enzyme. The present report could provide clues for the identification of new chemicals that inhibit tyrosinase and can enter the drug discovery pipeline.


Introduction

Melanogenesis is a physiological process resulting in the synthesis of melanin pigments, which play a crucial protective role against skin photocarcinogenesis. In humans and other mammals, the biosynthesis of melanin takes place in a lineage of cells known as melanocytes, which contain the enzyme tyrosinase [1]. Tyrosinase (phenoloxidase) is known to be a key enzyme for melanin biosynthesis. This enzyme is mainly involved in the initial steps of the pathway, which consist of the hydroxylation of l-tyrosine (monophenolase activity) and the oxidation of the product of this reaction, l-DOPA (diphenolase activity), to give rise to o-dopaquinone [2]. This o-quinone is transformed into melanins, followed by a series of divergent steps that give rise to a predominantly indolic pigment (eumelanin) and a closely related pigment containing benzothiazine subunits (phaeomelanin). The current view is that most human pigmentation involves a combination of these pathways, giving rise to mixtures of varying composition [3, 4].

Many approaches are based on the use of analogue substrates for tyrosinase, which are designed to maximize the generation of reactive orthoquinone oxidation products and to increase their diffusion range by preventing the spontaneous self-extinguishing cyclization reaction [5]. These products, if released into the cytosol through the defective melanosomal membranes of malignant melanocytes, have the potential to react with vital cellular components and cause irreversible damage [6]. Therefore, inhibitors of tyrosinase should be useful as therapeutic agents for the treatment of melanin hyperpigmentation and as cosmetic materials for whitening after sunburn [7, 8].

On the other hand, computational methods have become a suitable alternative in drug design and have recently been applied to QSAR studies of tyrosinase inhibitors [9–11], using congeneric or heterogeneous datasets of compounds. In this sense, QSAR methods can reduce the costly failures of drug candidates in clinical trials by filtering virtual libraries of chemicals.

One of our research groups has carried out QSAR/QSPR studies related to chemical, physicochemical and biological properties of different chemicals and drugs [12–16], including studies of nucleic acid–drug interactions [17, 18] and the discovery of antimalarial compounds [19]. The ‘in house’ TOpological MOlecular COMputer Design-Computer Aided ‘Rational’ Drug Design (TOMOCOMD-CARDD) software [20], a novel computer-aided molecular design scheme based on graph theory and linear algebra, has been used to develop all of these works and many others.

Here we propose a new set of molecular descriptors (MDs), namely non-stochastic and stochastic bond-based bilinear indices, and show their application to discriminating tyrosinase inhibitors (actives) from inactive compounds using QSAR models. Furthermore, a virtual screening is carried out with a small library of chemicals. Finally, we present the in silico identification, synthesis and in vitro assays of a new set of tetraketones, a procedure that demonstrates the potential of these new MDs in a real-world application and could help to speed up the discovery of new lead compounds to treat hyperpigmentation and skin disorders.

Theoretical framework

The basis of the extension of bilinear indices given here is the edge-adjacency matrix, considered and explicitly defined in the chemical graph-theory literature [21, 22] and rediscovered by Estrada as an important source of new MDs [23–28]. In this section, we will first define the nomenclature to be used in this work; then the atom-based molecular vector $({\bar{x}})$ will be redefined for bond characterization using the same approach as previously reported; and finally new definitions of bond-based non-stochastic and stochastic bilinear indices will be given.

Background in edge-adjacency matrix and new edge-relations: stochastic edge-adjacency matrix

Let G = (V, E) be a simple graph, with ${V=\{v_{1}, v_{2},\ldots, v_{n}\}}$ and ${E=\{e_{1}, e_{2}, \ldots, e_{m}\}}$ being the vertex and edge sets of G, respectively. Then G represents a molecular graph having n vertices (atoms) and m edges (bonds). The edge-adjacency matrix E of G (also called the bond-adjacency matrix, B) is a square and symmetric matrix whose elements ${e_{ij}}$ are 1 if and only if edge i is adjacent to edge j, and 0 otherwise [25, 28–30]. Two edges are adjacent if they are incident to a common vertex. This matrix corresponds to the vertex-adjacency matrix of the associated line graph. Finally, the sum of the ith row (or column) of E is named the edge degree of bond i, ${\delta (e_{i})}$ [23, 26, 27, 29, 30].
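
The construction of E can be sketched in a few lines of Python. This is a minimal illustration, not part of the TOMOCOMD-CARDD software: the function name `edge_adjacency` and the atom numbering of the five-bond skeleton are assumptions made here for the example.

```python
def edge_adjacency(bonds):
    """Build the edge-adjacency matrix E of a simple graph.

    `bonds` is a list of (atom, atom) pairs; E[i][j] = 1 when bonds i and j
    are distinct and share an atom, otherwise 0.
    """
    m = len(bonds)
    E = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i != j and set(bonds[i]) & set(bonds[j]):
                E[i][j] = 1
    return E


# Hypothetical atom numbering for a five-bond, H-suppressed skeleton:
# bond 1: 1-2, bond 2: 2-3, bond 3: 3-4, bond 4: 4-5, bond 5: 2-6
bonds = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 6)]
E = edge_adjacency(bonds)
edge_degrees = [sum(row) for row in E]  # row sums are the edge degrees
print(E)
print(edge_degrees)
```

Because the graph is simple, two distinct bonds can share at most one atom, so a non-empty intersection of their atom pairs is enough to decide adjacency.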

By using the edge (bond) adjacency relationships, we can find another new relation for a molecular graph, which is introduced here. The kth stochastic edge-adjacency matrix, ${{\bf ES}^{\varvec k}}$, can be obtained directly from ${{\bf E}^{\varvec k}}$. Here, ${{\bf ES}^{\varvec k}=[{}^{k}es_{ij}]}$ is a square table of order m (m = number of bonds) and the elements ${^{k}es_{ij}}$ are defined as follows:

$$ {}^{k}es_{ij}=\frac{{}^ke_{ij}}{{}^k\hbox{SUM}(E^k)_i}=\frac{{}^ke_{ij}}{{}^k\delta (e)_i}$$

(1)

where ${{}^{k}e_{ij}}$ are the elements of the kth power of E, and the sum of the ith row of ${{\bf E}^{k}}$ is named the k-order edge degree of bond i, ${{}^{k}\delta (e)_{i}}$. Note that the matrix ${{\bf ES}^{k}}$ in Eq. 1 has the property that the sum of the elements in each row is 1. An m × m matrix with nonnegative entries having this property is called a stochastic matrix [31].
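
Eq. 1 amounts to dividing each row of ${{\bf E}^{k}}$ by its row sum. The following plain-Python sketch illustrates this; the helper names are assumptions introduced here, and the input matrix is the five-bond edge-adjacency matrix used in the sample calculation later in the paper.

```python
def matmul(A, B):
    """Plain-Python product of two square matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matrix_power(E, k):
    """Return E^k, with E^0 taken as the identity matrix."""
    m = len(E)
    P = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    for _ in range(k):
        P = matmul(P, E)
    return P


def stochastic_power(E, k):
    """Row-normalise E^k as in Eq. 1 (assumes every row sum of E^k is non-zero)."""
    return [[e / sum(row) for e in row] for row in matrix_power(E, k)]


E = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
]
ES2 = stochastic_power(E, 2)
row_sums = [sum(row) for row in ES2]  # each should be 1 up to rounding
```

A molecule with a single bond would give a zero row sum for k ≥ 1; this sketch does not handle that case.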

Chemical information and bond-based molecular vector

The atom-based molecular vector (${\bar{x}}$) used to represent small-to-medium size organic chemicals has been explained in some detail elsewhere [12–14, 16, 17, 32–44]. In a manner parallel to the development of ${\bar{x}}$, we present the expansion of the bond-based molecular vector (${\bar{w}}$). The components (w) of ${\bar{w}}$ are numeric values, which represent a certain standard bond property (bond-label). That is to say, these weights correspond to different bond properties for organic molecules. Thus, a molecule having ${5, 10, 15,\ldots,m}$ bonds can be represented by means of vectors, with ${5, 10, 15,\ldots,m}$ components, belonging to the spaces ${\Re^{5}}$, ${\Re^{10}}$, ${\Re^{15},\ldots}$, ${\Re^{m}}$, respectively; where m is the dimension of the real sets (${\Re^{m})}$. This approach allows us to encode organic molecules such as 3-hydroxy-2-butenenitrile through the molecular vector ${\bar{w}}$ = [${w_{\rm Csp3-Csp2}}$, ${w_{\rm Csp2=Csp2}}$, ${w_{\rm Csp2-Osp3}}$, ${w_{\rm H-Osp3}}$, ${w_{\rm Csp2-Csp}}$, ${w_{\rm Csp\equiv Nsp}}$]. This vector belongs to the product space ${\Re^{6}}$.

These properties characterize each kind of bond (and bond type) within the molecule. Diverse kinds of bond weights (w) can be used to codify information related to each bond in the molecule. These bond labels are chemically meaningful numbers such as standard bond distance [45–48] or standard bond dipole [45–48], or even mathematical expressions involving atomic weights such as atomic log P [49], surface contributions of polar atoms [50], atomic molar refractivity [51], atomic hybrid polarizabilities [52], Gasteiger–Marsili atomic charge [53], atomic electronegativity in Pauling scale [54] and so on. Here, we characterized each bond with the following parameter:

$$ w=x_{i}/\delta_{i}+ x_{j}/\delta_{j}\ $$

(2)

In this expression, ${x_{i}}$ can be any standard weight of atom i, which is bonded to atom j, and ${\delta_{i}}$ is the vertex (atom) degree of atom i. The use of each scale (bond property) defines alternative molecular vectors, ${\bar{w}}$.
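
As a quick numerical illustration of Eq. 2, the sketch below evaluates the bond weights for the five bonds treated in the sample calculation later in the paper, using the Pauling electronegativities and atom degrees quoted there. The function name `bond_weight` and the tuple layout are assumptions made for this example.

```python
def bond_weight(x_i, delta_i, x_j, delta_j):
    """Eq. 2: w = x_i/delta_i + x_j/delta_j for the bond between atoms i and j."""
    return x_i / delta_i + x_j / delta_j


# Pauling electronegativities as quoted in the sample calculation
pauling = {"C": 2.55, "N": 3.04, "O": 3.44}

# (element_i, degree_i, element_j, degree_j) for the five bonds,
# with the degrees used in the sample calculation
bonds = [
    ("C", 1, "C", 4),
    ("C", 4, "C", 3),
    ("C", 3, "C", 4),
    ("C", 4, "N", 3),
    ("C", 4, "O", 1),
]
w = [bond_weight(pauling[a], da, pauling[b], db) for a, da, b, db in bonds]
print([round(v, 6) for v in w])
```

Swapping the `pauling` dictionary for another atomic property (for instance atomic masses) yields a second bond vector in the same way.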

The chemical information can also be codified by means of two different molecular vectors, for instance, ${\bar{w}=[w_{1}, \ldots,w_{n}]}$ and ${\bar{u}=[u_{1}, \ldots ,u_{n}]}$; different combinations of molecular vectors (${\bar{w}\ne \bar{u}}$) are then possible when a weighting scheme is used. In the present report, we characterized each bond with mathematical expressions involving the following parameters: atomic masses (M) [55], van der Waals volumes (V) [55], atomic polarizabilities (P) [55], and atomic electronegativities in Mulliken scale (K) [55]. The values of these atomic labels are shown in Table 1. From this weighting scheme, six (or 12 if ${\bar{w}_{M}\hbox{-}\bar{u}_{V} \neq \bar{w}_{V}\hbox{-}\bar{u}_{M}}$) combinations (pairs) of molecular vectors (${\bar{w},\bar{u};\bar{w}\neq \bar{u}}$) can be computed: ${\bar{w}_{M}\hbox{-}\bar{u}_{V}}$, ${\bar{w}_{M}\hbox{-}\bar{u}_{P}}$, ${\bar{w}_{M}\hbox{-}\bar{u}_{K}}$, ${\bar{w}_{V}\hbox{-}\bar{u}_{P}}$, ${\bar{w}_{V}\hbox{-}\bar{u}_{K}}$, and ${\bar{w}_{P}\hbox{-}\bar{u}_{K}}$. Here, we use the symbols ${\bar{w}_{X}\hbox{-}\bar{u}_{Z}}$, where the subscripts X and Z denote two mathematical expressions involving atomic properties from our weighting scheme and the hyphen (-) expresses the combination (pair) of two selected bond-label chemical properties. This is illustrated with an example in a later section of this work.
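
The count of six unordered (or 12 ordered) pairs follows directly from choosing two distinct properties out of four. A small sketch with the standard library, where the labels M, V, P and K simply stand for the four atomic properties named above:

```python
from itertools import combinations, permutations

labels = ["M", "V", "P", "K"]

# Unordered pairs: w_M-u_V is treated as the same pair as w_V-u_M
unordered_pairs = list(combinations(labels, 2))

# Ordered pairs: w_M-u_V and w_V-u_M are counted separately
ordered_pairs = list(permutations(labels, 2))

print(len(unordered_pairs), len(ordered_pairs))
```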

Table 1 Values of the atom weights used for linear indices calculation [54–57]


Definition of mathematical bilinear forms

In mathematics, a bilinear form in a real vector space is a mapping ${b:V\times V\to \Re}$, which is linear in both arguments [58–60]. That is, this function satisfies the following axioms for any scalar α and any choice of vectors ${\bar{v},\bar{w},\bar{v}_1,\bar{v}_2 ,\bar{w}_1}$ and ${\bar{w}_2}$.

i. ${b(\alpha \bar{v},\bar{w})=b(\bar{v},\alpha \bar{w})=\alpha b(\bar{v},\bar{w})}$
ii. ${b(\bar{v}_1 +\bar{v}_2 ,\bar{w})=b(\bar{v}_1 ,\bar{w})+b(\bar{v}_2 ,\bar{w})}$
iii. ${b(\bar{v},\bar{w}_1 +\bar{w}_2 )=b(\bar{v},\bar{w}_1 )+b(\bar{v},\bar{w}_2)}$

That is, b is bilinear if it is linear in each parameter, taken separately.

Let V be a real vector space in ${\Re^n}$ (${V\subseteq \Re^n}$) and consider that the vector set ${\left\{ {\bar{e}_1 ,\bar{e}_2 ,\ldots,\bar{e}_n} \right\}}$ is a basis set of ${\Re^n}$. This basis set permits us to write in unambiguous form any vectors ${\bar{w}}$ and ${\bar{u}}$ of V, where ${(w^1,w^2,\ldots,w^n)\in \Re^n}$ and ${(u^1,u^2,\ldots,u^n)\in \Re^n}$ are the coordinates of the vectors ${\bar{w}}$ and ${\bar{u}}$, respectively. That is to say,

$$ \bar{w}=\sum\limits_{i=1}^n {w^i\bar{e}_i } $$

(3)

and,

$$ \bar{u}=\sum\limits_{j=1}^n {u^j\bar{e}_j } $$

(4)

Subsequently,

$$ b(\bar{w},\bar{u})=b(w^i\bar{e}_i ,u^j\bar{e}_j )=w^iu^jb(\bar{e}_i ,\bar{e}_j ) $$

(5)

If we take ${a_{ij}}$ as the n × n scalars ${b(\bar{e}_i ,\bar{e}_j)}$, that is,

$$ a_{ij} =b(\bar{e}_i ,\bar{e}_j ),\quad \hbox{ for }i=1,2,\ldots,n\hbox{ and }j=1,2,\ldots,n $$

(6)

Then,

$$ b(\bar{w},\bar{u})=\sum\limits_{i,j}^n {a_{ij} w^iu^j}=\left[ W \right]^TA\left[ U \right] =\left[ \begin{array}{ccc} {w^1} & \ldots & {w^n} \\ \end{array} \right]\left[ \begin{array}{ccc} {a_{11}} & \ldots & {a_{1n}} \\ \vdots & \ddots & \vdots \\ {a_{n1}} & \ldots & {a_{nn}} \\ \end{array} \right]\left[ \begin{array}{c} {u^1} \\ \vdots \\ {u^n} \\ \end{array} \right] $$

(7)

As can be seen, the defining equation for b may be written as a single matrix equation (see Eq. 7), where [U] is a column vector (an n × 1 matrix) of the coordinates of ${\bar{u}}$ in a basis set of ${\Re^{n}}$, and [W]^T (a 1 × n matrix) is the transpose of [W], where [W] is a column vector (an n × 1 matrix) of the coordinates of ${\bar{w}}$ in the same basis of ${\Re^{n}}$.
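
The matrix form in Eq. 7 and the axioms i–iii can be checked numerically. The sketch below is only an illustration: the coefficient matrix `A` and the test vectors are arbitrary values chosen here, and `bilinear` is a hypothetical helper name.

```python
def bilinear(w, A, u):
    """b(w, u) = sum_ij a_ij * w_i * u_j, i.e. [W]^T A [U] as in Eq. 7."""
    return sum(A[i][j] * w[i] * u[j] for i in range(len(w)) for j in range(len(u)))


A = [[1.0, 2.0], [0.0, 3.0]]  # arbitrary, deliberately non-symmetric coefficients
v1, v2, w0 = [1.0, -2.0], [0.5, 4.0], [3.0, 1.0]
alpha = 2.5

# Axiom i: homogeneity in either argument
lhs_first = bilinear([alpha * x for x in v1], A, w0)
lhs_second = bilinear(v1, A, [alpha * x for x in w0])
rhs_scaled = alpha * bilinear(v1, A, w0)

# Axiom ii: additivity in the first argument
sum_first = bilinear([a + b for a, b in zip(v1, v2)], A, w0)
split_first = bilinear(v1, A, w0) + bilinear(v2, A, w0)

# With a non-symmetric A, swapping the arguments changes the value
forward, backward = bilinear(v1, A, w0), bilinear(w0, A, v1)
```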

Finally, we introduce the formal definition of symmetric bilinear form. Let V be a real vector space and b be a bilinear function in V × V. The bilinear function b is called symmetric if ${b(\bar{w},\bar{u})=b(\bar{u},\bar{w}),\forall \bar{w},\bar{u}\in V}$ [58–60]. Then,

$$ b(\bar{w},\bar{u})=\sum\limits_{i,j}^n {a_{ij} w^iu^j} =\sum\limits_{i,j}^n {a_{ji} w^ju^i} =b(\bar{u},\bar{w}) $$

(8)
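
The middle equality in Eq. 8 relies on the coefficient matrix being symmetric. A short derivation, using only Eq. 6 and the definition of symmetry given above, makes this step explicit:

$$ a_{ij}=b(\bar{e}_i ,\bar{e}_j )=b(\bar{e}_j ,\bar{e}_i )=a_{ji}\quad \Longrightarrow \quad A=A^{T} $$

Conversely, if ${A=A^{T}}$, then ${[W]^{T}A[U]=([W]^{T}A[U])^{T}=[U]^{T}A^{T}[W]=[U]^{T}A[W]}$, since a 1 × 1 matrix equals its own transpose. A bilinear form is therefore symmetric exactly when its matrix in a given basis is symmetric.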

The total non-stochastic and stochastic bond-based bilinear indices

If a molecule consists of m bonds (vector of ${\Re^{m}}$), then the kth total bilinear indices are calculated as bilinear maps (bilinear form) in ${\Re^{m}}$ in canonical basis set. Specifically, the kth total non-stochastic and stochastic bond bilinear indices, ${b_{k}(\bar{w},\bar{u})}$ and ${{}^{s}b_{k}(\bar{w},\bar{u})}$, are computed from these kth non-stochastic and stochastic edge adjacency matrices, ${{\bf E}^{\varvec k}}$ and ${{\bf ES}^{\varvec k}}$, as shown in Eqs. 9 and 10, correspondingly:

$$ b_k (\bar{w},\bar{u})=\sum\limits_{i=1} ^m \sum\limits_{j=1}^m {{}^ke_{ij} w^iu^j}=[{W}]^{t}{\bf E}^{\varvec k}[{U}] $$

(9)

$$ {}^sb_k (\bar{w},\bar{u})=\sum\limits_{i=1}^m \sum\limits_{j=1}^m {{}^kes_{ij} w^iu^j}=[{W}]^{t} {\bf ES}^{\varvec k}[{U}] $$

(10)

where m is the number of bonds of the molecule, and ${w^{1}, \ldots ,w^{m}}$ and ${u^{1} ,\ldots, u^{m}}$ are the coordinates of the bond-based molecular vectors ${\bar{w}}$ and ${\bar{u}}$ in a canonical basis set of ${\Re^{m}}$. Therefore, if we use the canonical basis set, the coordinates [${(w^{1},\ldots ,w^{m})}$ and ${(u^{1},\ldots ,u^{m})}$] of any molecular vectors (${\bar{w}}$ and ${\bar{u}}$) coincide with the components of those vectors [${(w_{1},\ldots ,w_{m})}$ and ${(u_{1},\ldots ,u_{m})}$] [28, 45, 46]. For that reason, those coordinates can be considered as weights (bond labels) of the edges of the molecular graph. The coefficients ${{}^{k}e_{ij}}$ and ${{}^{k}es_{ij}}$ are the elements of the kth power of the matrices E(G) and ES(G), respectively, of the molecular pseudograph. The defining Eqs. 9 and 10 for ${b_{k}(\bar{w},\bar{u})}$ and ${^{s}b_{k}(\bar{w},\bar{u})}$, respectively, may also be written as single matrix equations, where [U] is a column vector (an m × 1 matrix) of the coordinates of ${\bar{u}}$ in the canonical basis set of ${\Re^{m}}$, and [W]^t is the transpose of [W], where [W] is a column vector (an m × 1 matrix) of the coordinates of ${\bar{w}}$ in the canonical basis of ${\Re^{m}}$. Here, ${{\bf E}^{\varvec k}}$ and ${{\bf ES}^{\varvec k}}$ denote the matrices of the bilinear maps with respect to the natural basis set.

It should be remarked that non-stochastic and stochastic bilinear indices are symmetric and non-symmetric bilinear forms, respectively. Therefore, if M and V from the weighting scheme are used as weights to compute these MDs, two different sets of stochastic bilinear indices, ${{}^{M\hbox{-}V\,{\varvec s}}{\varvec b}_{\varvec k}^{\bf H}(\bar{w},\bar{u})}$ and ${{}^{V\hbox{-}M\,{\varvec s}}{\varvec b}_{\varvec k}^{\bf H}(\bar{w},\bar{u})}$, can be obtained [because ${\bar{w}_{M}\hbox{-}\bar{u}_{V} \neq \bar{w}_{V}\hbox{-}\bar{u}_{M}}$], whereas only one group of non-stochastic bilinear indices can be calculated [${{}^{M\hbox{-}V}{\varvec b}_{\varvec k}^{\bf H}(\bar{w},\bar{u})={}^{V\hbox{-}M}{\varvec b}_{\varvec k}^{\bf H}(\bar{w},\bar{u})}$, because in this case ${\bar{w}_{M}\hbox{-}\bar{u}_{V}=\bar{w}_{V}\hbox{-}\bar{u}_{M}}$].
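
The difference between the symmetric and non-symmetric cases can be seen numerically for k = 1. The sketch below is an illustration under stated assumptions: it takes the edge-adjacency matrix and the two bond vectors from the sample calculation later in the paper (rounded as printed there), and `bilinear` is a hypothetical helper name.

```python
def bilinear(w, A, u):
    """[W]^t A [U] written as an explicit double sum (Eqs. 9 and 10)."""
    return sum(w[i] * A[i][j] * u[j] for i in range(len(w)) for j in range(len(u)))


E = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
]
ES = [[e / sum(row) for e in row] for row in E]  # Eq. 1 with k = 1

w = [3.1875, 1.4875, 1.4875, 1.650833, 4.0775]      # electronegativity-based weights
u = [15.0125, 7.005833, 7.005833, 7.6725, 19.0025]  # mass-based weights

# Non-stochastic: E is symmetric, so the order of the two vectors is irrelevant
b1_wu, b1_uw = bilinear(w, E, u), bilinear(u, E, w)

# Stochastic: ES is not symmetric, so the two orders give different values
sb1_wu, sb1_uw = bilinear(w, ES, u), bilinear(u, ES, w)
```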

The local non-stochastic and stochastic bond-based bilinear indices

Finally, in addition to total bond-based bilinear indices, computed for the whole molecule, a local-fragment (bond and bond-type) formalism can be developed. These descriptors are termed local non-stochastic and stochastic bilinear indices, ${b_{kL}(\bar{w},\bar{u})}$ and ${{}^{s}b_{kL}(\bar{w},\bar{u})}$, respectively. The definition of these descriptors is as follows:

$$b_{kL} (\bar{w},\bar{u})=\sum\limits_{i=1}^m \sum\limits_{j=1}^m {{}^ke_{ijL} w^iu^j}=[{W}]^{t}{\bf E}^{\varvec k}_{\bf L}[{U}] $$

(11)

$$ {}^sb_{kL} (\bar{w},\bar{u})=\sum\limits_{i=1}^m \sum\limits_{j=1}^m {{}^kes_{ijL} w^iu^j}=[{W}]^{t}{\bf ES}^{\varvec k}_{\bf L}[{U}] $$

(12)

where m is the number of bonds and ${{}^{k}e_{ijL} [{}^{k}es_{ijL}]}$ is the element in row “i” and column “j” of the kth local matrix ${{\bf E}^{\varvec k}_{\bf L}[{\bf ES}^{\varvec k}_{\bf L}]}$. This matrix is extracted from the ${{\bf E}^{\varvec k}[{\bf ES}^{\varvec k}]}$ matrix and contains information referring to the edges (bonds) of the specific molecular fragment and also to the molecular environment in k steps. The matrix ${{\bf E}^{\varvec k}_{\bf L}[{\bf ES}^{\varvec k}_{\bf L}]}$ with elements ${{}^{k}e_{ijL} [{}^{k}es_{ijL}]}$ is defined as follows:

$$ \begin{aligned} {}^{k}e_{ijL}\,[{}^{k}es_{ijL}] &={}^{k}e_{ij}\,[{}^{k}es_{ij}] \hbox{ if both }e_{i}\hbox{ and }e_{j}\hbox{ are edges (bonds) contained within the molecular fragment}\\ &=\tfrac{1}{2}\,{}^{k}e_{ij}\,[{}^{k}es_{ij}] \hbox{ if either }e_{i}\hbox{ or }e_{j}\hbox{, but not both, is an edge (bond) contained within the molecular fragment}\\ &= 0\hbox{ otherwise} \end{aligned} $$

(13)

It is important to highlight that the scheme above follows the spirit of a Mulliken population analysis [61]. It should also be remarked that for every partitioning of a molecule into Z molecular fragments there will be Z local molecular fragment matrices. In this case, if a molecule is partitioned into Z molecular fragments, the matrices ${{\bf E}^{\varvec k} [{\bf ES}^{\varvec k}]}$ can be correspondingly partitioned into Z local matrices ${{\bf E}^{\varvec k}_{\bf L}[{\bf ES}^{\varvec k}_{\bf L}]}$, ${L\,=\,1,\ldots,Z}$, and the kth power of matrix E [ES] is exactly the sum of the kth power of the Z local matrices. In this way, the total (both non-stochastic and stochastic) bond-based bilinear indices are the sum of the non-stochastic and stochastic bond-based bilinear indices, respectively, of the Z molecular fragments:

$$ b_k (\bar{w},\bar{u})=\sum\limits_{L=1}^Z {b_{kL}} (\bar{w},\bar{u}) $$

(14)

$$ {}^sb_k (\bar{w},\bar{u})=\sum\limits_{L=1}^Z {{}^sb_{kL}} (\bar{w},\bar{u}) $$

(15)
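
The partition rule of Eq. 13 and the additivity of Eq. 14 can be illustrated for k = 1 with each bond taken as its own fragment. This is a sketch under stated assumptions: `local_matrix` and `bilinear` are hypothetical helper names, and the matrix and bond vectors are those of the sample calculation later in the paper (rounded as printed there).

```python
def bilinear(w, A, u):
    """[W]^t A [U] written as an explicit double sum."""
    return sum(w[i] * A[i][j] * u[j] for i in range(len(w)) for j in range(len(u)))


def local_matrix(M, fragment):
    """Eq. 13: keep m_ij if both bonds are in the fragment, half of it if
    exactly one of them is, and zero otherwise."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            inside = (i in fragment) + (j in fragment)
            factor = 1.0 if inside == 2 else 0.5 if inside == 1 else 0.0
            L[i][j] = factor * M[i][j]
    return L


E = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
]
w = [3.1875, 1.4875, 1.4875, 1.650833, 4.0775]
u = [15.0125, 7.005833, 7.005833, 7.6725, 19.0025]

# One fragment per bond (0-based bond indices); the fragments partition the molecule
local_b1 = [bilinear(w, local_matrix(E, {i}), u) for i in range(5)]
total_b1 = bilinear(w, E, u)
```

Every off-diagonal entry is shared half-and-half between the two bonds it couples, so the local values add up to the total regardless of how the bonds are grouped into fragments.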

Bond and bond-type bilinear fingerprints are specific cases of local bond-based bilinear indices. The kth bond-type bilinear indices of the edge-adjacency matrix are calculated by summing up the kth bond bilinear indices for all bonds of the same type in the molecule. That is to say, this extension of the bond bilinear index is similar to group additive schemes, in which an index appears for each bond type in the molecule together with its contribution based on the bond bilinear index.

In the bond-type bilinear indices formalism, each bond in the molecule is classified into a bond type (fragment). In this sense, bonds may be classified into bond types in terms of the characteristics of the two atoms that define the bond. For all data sets, including those with a common molecular scaffold as well as those with very diverse structures, the kth fragment (bond-type) bilinear indices provide much useful information. Thus, the development of the bond-type bilinear indices description provides the basis for application to a wider range of biological problems in which the local formalism is applicable without the need for superposition or a closely related set of structures.
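
Summing bond indices by bond type is a simple grouping operation. The sketch below uses the five order-0 bond values listed in the sample calculation later in the paper; the bond-type labels (element pair plus bond order) are an assumed, simplified classification chosen for illustration.

```python
from collections import defaultdict

# Bond types in the order of the bond labels of the sample molecule
bond_types = ["C-C", "C=C", "C-C", "C#N", "C-O"]

# Order-0 local (bond) indices as listed in the sample calculation
local_b0 = [47.85234, 10.42118, 10.42118, 12.66602, 77.48269]

# Bond-type index = sum of the bond indices of all bonds of that type
by_type = defaultdict(float)
for bond_type, value in zip(bond_types, local_b0):
    by_type[bond_type] += value

print(dict(by_type))
```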

It is useful to perform a calculation on a molecule to illustrate the steps in the procedure. For this, in the next section we give a pictorial representation of the calculation of the non-stochastic and stochastic bilinear indices of the bond matrix (both total and local) using a simple chemical example.

Sample calculation

The bilinear indices of the bond matrix are calculated in the following way. Considering the molecule of 3-hydroxy-2-butenenitrile as a simple example, we have the following labeled molecular graph and bond-based adjacency matrices (E and ES). The second (k = 2) and third (k = 3) power of these matrices and bond-based molecular vector, ${\bar{w}}$, are also given:

$$\begin{array}{l} E^0=ES^0=\left[ \begin{array}{lllll} 1 & & & & \\ & 1 & & & \\ & & 1 & & \\ & & & 1 & \\ & & & & 1 \\ \end{array} \right] E^1=\left[ \begin{array}{lllll} 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ \end{array} \right] E^2=\left[ \begin{array}{lllll} 2 & 1 & 1 & 0 & 1 \\ 1 & 3 & 0 & 1 & 1 \\ 1 & 0 & 2 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 2 \\ \end{array} \right] E^3=\left[ \begin{array}{lllll} 2 & 4 & 1 & 1 & 3 \\ 4 & 2 & 4 & 0 & 4 \\ 1 & 4 & 0 & 2 & 1 \\ 1 & 0 & 2 & 0 & 1 \\ 3 & 4 & 1 & 1 & 2 \\ \end{array} \right] \\ \\ ES^1=\left[ \begin{array}{lllll} 0 & 0.5 & 0 & 0 & 0.5 \\ 0.33 & 0 & 0.33 & 0 & 0.33 \\ 0 & 0.5 & 0 & 0.5 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 & 0 \\ \end{array} \right] ES^2=\left[ \begin{array}{lllll} 0.4 & 0.2 & 0.2 & 0 & 0.2 \\ 0.16 & 0.5 & 0 & 0.16 & 0.16 \\ 0.25 & 0 & 0.5 & 0 & 0.25 \\ 0 & 0.5 & 0 & 0.5 & 0 \\ 0.2 & 0.2 & 0.2 & 0 & 0.4 \\ \end{array} \right] ES^3=\left[ \begin{array}{lllll} 0.18 & 0.36 & 0.090 & 0.090 & 0.27 \\ 0.28 & 0.14 & 0.28 & 0 & 0.28 \\ 0.12 & 0.5 & 0 & 0.25 & 0.12 \\ 0.25 & 0 & 0.5 & 0 & 0.25 \\ 0.27 & 0.36 & 0.090 & 0.090 & 0.18 \\ \end{array} \right] \\ \end{array} $$

The molecule contains five localized bonds (corresponding to five edges in the H-suppressed molecular graph). To these we will associate the five “bond orbitals” ${w_{1}, w_{2}, w_{3}, w_{4}}$, and ${w_{5}}$. Thus, ${\bar{w}=[w_{1}, w_{2}, w_{3}, w_{4}, w_{5}] = [w_{(\rm C-C)}, w_{(\rm C=C)}, w_{(\rm C-C)}, w_{(\rm C\equiv N)}, w_{(\rm C-O)}]}$ and each “bond orbital” can be computed by Eq. 2 using, for instance, the atomic electronegativity in Pauling scale (x) [54] as atomic weight (atom-label):

$$ \begin{array}{l} w_{1}=x_{C} /1 + x_{C} /4=2.55/1 + 2.55/4=3.1875\\ w_{2}=x_{C} /4 + x_{C }/3=2.55/4 + 2.55/3=1.4875\\ w_{3}=x_{C} /3 + x_{C} /4=2.55/3 + 2.55/4=1.4875\\ w_{4}=x_{C} /4 + x_{N} /3=2.55/4 + 3.04/3=1.650833\\ w_{5}=x_{C} /4 + x_{O }/1=2.55/4 + 3.44/1=4.0775 \end{array} $$

and therefore, ${\bar{w}}$ = [3.1875, 1.4875, 1.4875, 1.650833, 4.0775].

In addition, a second vector, ${\bar{u}}$, must be calculated in the same way as ${\bar{w}}$ but using another property, for example the atomic masses (y) [55], as atomic weight (atom-label):

$$ \begin{array}{l} u_{1}=y_{C} /1 + y_{C} /4=12.01/1 + 12.01/4=15.0125\\ u_{2}=y_{C} /4 + y_{C }/3=12.01/4 + 12.01/3=7.005833\\ u_{3}=y_{C} /3 + y_{C} /4=12.01/3 + 12.01/4=7.005833\\ u_{4}=y_{C} /4 + y_{N} /3=12.01/4 + 14.01/3=7.6725\\ u_{5}=y_{C} /4 + y_{O }/1=12.01/4 + 16.00/1=19.0025 \end{array} $$

and therefore, ${\bar{u}}$ = [15.0125, 7.005833, 7.005833, 7.6725, 19.0025].

Each non-stochastic and stochastic total bilinear index will have the form:

$$ \begin{aligned} {\varvec b}_{k}(\bar{w},\bar{u})=&{}^{k}e_{11}w^{1}u^{1}+{}^{k}e_{12}w^{1}u^{2}+{}^{k}e_{13}w^{1}u^{3}+{}^{k}e_{14}w^{1}u^{4}+{}^{k}e_{15}w^{1}u^{5}\\ &+{}^{k}e_{21}w^{2}u^{1}+{}^{k}e_{22}w^{2}u^{2}+{}^{k}e_{23}w^{2}u^{3}+{}^{k}e_{24}w^{2}u^{4}+{}^{k}e_{25}w^{2}u^{5}\\ &+{}^{k}e_{31}w^{3}u^{1}+{}^{k}e_{32}w^{3}u^{2}+{}^{k}e_{33}w^{3}u^{3}+{}^{k}e_{34}w^{3}u^{4}+{}^{k}e_{35}w^{3}u^{5}\\ &+{}^{k}e_{41}w^{4}u^{1}+{}^{k}e_{42}w^{4}u^{2}+{}^{k}e_{43}w^{4}u^{3}+{}^{k}e_{44}w^{4}u^{4}+{}^{k}e_{45}w^{4}u^{5}\\ &+{}^{k}e_{51}w^{5}u^{1}+{}^{k}e_{52}w^{5}u^{2}+{}^{k}e_{53}w^{5}u^{3}+{}^{k}e_{54}w^{5}u^{4}+{}^{k}e_{55}w^{5}u^{5}\\ =&\sum\limits_{i} {}^ke_{ii} w^iu^i+\sum\limits_{i<j} {}^ke_{ij}\left( w^iu^j+w^ju^i\right) \end{aligned} $$

(16)

$$ \begin{aligned} {}^{s}{\varvec b}_{k}(\bar{w},\bar{u})=&{}^{k}es_{11}w^{1}u^{1}+{}^{k}es_{12}w^{1}u^{2}+{}^{k}es_{13}w^{1}u^{3}+{}^{k}es_{14}w^{1}u^{4}+{}^{k}es_{15}w^{1}u^{5}\\ &+{}^{k}es_{21}w^{2}u^{1}+{}^{k}es_{22}w^{2}u^{2}+{}^{k}es_{23}w^{2}u^{3}+{}^{k}es_{24}w^{2}u^{4}+{}^{k}es_{25}w^{2}u^{5}\\ &+{}^{k}es_{31}w^{3}u^{1}+{}^{k}es_{32}w^{3}u^{2}+{}^{k}es_{33}w^{3}u^{3}+{}^{k}es_{34}w^{3}u^{4}+{}^{k}es_{35}w^{3}u^{5}\\ &+{}^{k}es_{41}w^{4}u^{1}+{}^{k}es_{42}w^{4}u^{2}+{}^{k}es_{43}w^{4}u^{3}+{}^{k}es_{44}w^{4}u^{4}+{}^{k}es_{45}w^{4}u^{5}\\ &+{}^{k}es_{51}w^{5}u^{1}+{}^{k}es_{52}w^{5}u^{2}+{}^{k}es_{53}w^{5}u^{3}+{}^{k}es_{54}w^{5}u^{4}+{}^{k}es_{55}w^{5}u^{5}\\ =&\sum\limits_{i} {}^kes_{ii} w^iu^i+\sum\limits_{i\neq j} {}^kes_{ij} w^iu^j \end{aligned} $$

(17)

The ${{}^{k}e_{ii}}$ and ${{}^{k}es_{ii}}$ terms can be considered a measure of the attraction of an electron for a bond in the kth step. The ${{}^{k}e_{ij}}$ and ${{}^{k}es_{ij}}$ terms describe the interaction between two bonds in the kth step. The elements ${{}^{k}e_{ij}}$ and ${{}^{k}e_{ji}}$ are equal by symmetry (non-oriented molecular graph). However, ${{}^{k}es_{ij}\neq {}^{k}es_{ji}}$ in general. This is a logical result, because the elements ${{}^{k}es_{ij}}$ are the transition probabilities of the ‘electrons’ moving from bond i to bond j at the discrete time periods ${t_{k}}$, and these should differ between the two directions. This result is in total agreement if the electronegativity of the two atom types in the bonds is taken into account.

In this way, ${{\bf E}^{\varvec k}}$ and ${{\bf ES}^{\varvec k}}$ can be seen as graph-theoretic electronic-structure models [62]. In fact, quantum chemistry starts from the fact that a molecule is made up of electrons and nuclei. The distinction here between bonded and non-bonded atoms is difficult to justify. Any two nuclei of a molecule interact directly and indirectly through the electrons present in the molecule. Only the intensity of this interaction varies in going from one pair of nuclei to another. In this sense, the electron in an arbitrary bond i can move (step-by-step) to other bonds at different discrete time periods ${t_{k}}$ ${(k=0, 1, 2, 3,\ldots)}$ through the chemical-bonding network. That is to say, the ${{\bf E}^{1}}$ and ${{\bf ES}^{1}}$ matrices consider the valence-bond electrons in one step, and their powers ${(k=0, 1, 2, 3,\ldots)}$ can be considered as an interacting-electron chemical-network model in k steps. This model can be seen as an intermediate between the quantitative quantum-mechanical Schrödinger equation and classical chemical bonding ideas [62].

On the other hand, the kth (k = 0–3) non-stochastic total bilinear indices can be expressed as the sum of the local (bond) bilinear indices for this molecule as follows:

$$ \begin{aligned} {\varvec b}_{0}(\bar{w},\bar{u})=&b_{0L}(\bar{w},\bar{u}_{1}) + b_{0L}(\bar{w},\bar{u}_{2})+b_{0L}(\bar{w},\bar{u}_{3}) + b_{0L}(\bar{w},\bar{u}_{4})+b_{0L}(\bar{w},\bar{u}_{5}) = 47.85234\\ &+ 10.42118 + 10.42118 + 12.66602 + 77.48269=158.8434\\ {\varvec b}_{1}(\bar{w},\bar{u})=&b_{1L}(\bar{w},\bar{u}_{1}) + b_{1L}(\bar{w},\bar{u}_{2})+b_{1L}(\bar{w},\bar{u}_{3}) + b_{1L}(\bar{w},\bar{u}_{4})+b_{1L}(\bar{w},\bar{u}_{5}) = 83.22306\\ &+ 61.16852 + 21.91033 + 11.48915 + 89.30822=267.09929\\ {\varvec b}_{2}(\bar{w},\bar{u})=&b_{2L}(\bar{w},\bar{u}_{1}) + b_{2L}(\bar{w},\bar{u}_{2})+b_{2L}(\bar{w},\bar{u}_{3}) + b_{2L}(\bar{w},\bar{u}_{4})+b_{2L}(\bar{w},\bar{u}_{5}) = 201.2588\\ &+ 93.50003 + 71.5897 + 24.15517 + 272.6899=663.1936\\ {\varvec b}_{3}(\bar{w},\bar{u})=&b_{3L}(\bar{w},\bar{u}_{1}) + b_{3L}(\bar{w},\bar{u}_{2})+b_{3L}(\bar{w},\bar{u}_{3}) + b_{3L}(\bar{w},\bar{u}_{4})+b_{3L}(\bar{w},\bar{u}_{5}) = 414.6557\\ &+ 265.5164 + 115.4104 + 78.92521 + 511.0498=1385.5575 \end{aligned} $$

The terms in the summations for calculating the total bilinear indices are the so-called local (bond) bilinear indices. We have written these terms in the consecutive order of the bond labels in the graph. For instance, the non-stochastic bond bilinear indices of order 0, 1, 2 and 3 for the bond labeled as 1 are 47.85234, 83.22306, 201.2588 and 414.6557, respectively.

The kth total stochastic bilinear indices are also the sum of the kth local (bond) stochastic bilinear indices for all bonds in the molecule:

$$ \begin{aligned} {}^{s}{\varvec b}_{0}(\bar{w},\bar{u})=&{}^{s}b_{0L}(\bar{w},\bar{u}_{1}) + {}^{s}b_{0L}(\bar{w},\bar{u}_{2})+{}^{s}b_{0L}(\bar{w},\bar{u}_{3})+{}^{s}b_{0L}(\bar{w},\bar{u}_{4}) + {}^{s}b_{0L}(\bar{w},\bar{u}_{5}) =\\ &47.85234 + 10.42118 + 10.42118 + 12.66602 + 77.48269=158.8434\\ {}^{s}{\varvec b}_{1}(\bar{w},\bar{u})=&{}^{s}b_{1L}(\bar{w},\bar{u}_{1}) + {}^{s}b_{1L}(\bar{w},\bar{u}_{2})+{}^{s}b_{1L}(\bar{w},\bar{u}_{3})+{}^{s}b_{1L}(\bar{w},\bar{u}_{4}) + {}^{s}b_{1L}(\bar{w},\bar{u}_{5}) =\\ &39.75061 + 25.47438 + 12.93994 + 8.597788 + 42.27359=129.0363\\ {}^{s}{\varvec b}_{2}(\bar{w},\bar{u})=&{}^{s}b_{2L}(\bar{w},\bar{u}_{1}) + {}^{s}b_{2L}(\bar{w},\bar{u}_{2})+{}^{s}b_{2L}(\bar{w},\bar{u}_{3})+{}^{s}b_{2L}(\bar{w},\bar{u}_{4}) + {}^{s}b_{2L}(\bar{w},\bar{u}_{5}) =\\ &40.43786 + 18.32877 + 16.63249 + 10.15001 + 54.77602=140.3252\\ {}^{s}{\varvec b}_{3}(\bar{w},\bar{u})=&{}^{s}b_{3L}(\bar{w},\bar{u}_{1}) + {}^{s}b_{3L}(\bar{w},\bar{u}_{2})+{}^{s}b_{3L}(\bar{w},\bar{u}_{3})+{}^{s}b_{3L}(\bar{w},\bar{u}_{4}) + {}^{s}b_{3L}(\bar{w},\bar{u}_{5}) =\\ &39.15194 + 22.05334 + 13.87389 + 13.8189 + 48.32158=137.2196 \end{aligned} $$
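
The totals listed in this section can be re-derived from the matrices and bond vectors given above. The following plain-Python sketch does so for k = 0–3; the helper names are assumptions introduced here. One caveat: in this sketch the tabulated stochastic totals are matched when the mass-based vector ${\bar{u}}$ is used as the row vector, i.e. ${[U]^{t}{\bf ES}^{k}[W]}$, while the opposite order gives slightly different values. That reading is an inference from the arithmetic, not something stated in the text.

```python
def matmul(A, B):
    """Plain-Python product of two square matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matrix_power(E, k):
    """Return E^k, with E^0 taken as the identity matrix."""
    m = len(E)
    P = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    for _ in range(k):
        P = matmul(P, E)
    return P


def bilinear(x, A, y):
    """[X]^t A [Y] written as an explicit double sum."""
    return sum(x[i] * A[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


E = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
]
w = [3.1875, 1.4875, 1.4875, 1.650833, 4.0775]      # electronegativity-based
u = [15.0125, 7.005833, 7.005833, 7.6725, 19.0025]  # mass-based

listed_total = [158.8434, 267.09929, 663.1936, 1385.5575]
listed_stochastic = [158.8434, 129.0363, 140.3252, 137.2196]

total, stochastic_u_first, stochastic_w_first = [], [], []
for k in range(4):
    Ek = matrix_power(E, k)
    ESk = [[e / sum(row) for e in row] for row in Ek]  # Eq. 1
    total.append(bilinear(w, Ek, u))
    stochastic_u_first.append(bilinear(u, ESk, w))
    stochastic_w_first.append(bilinear(w, ESk, u))

for k in range(4):
    print(k, round(total[k], 4), round(stochastic_u_first[k], 4))
```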

Material and methods

TOMOCOMD-CARDD approach

The total and local (bond-type) bond-based bilinear indices were calculated with TOMOCOMD-CARDD [20], an interactive program for molecular design and bioinformatic research. The software was developed with a user-friendly philosophy: it is a graphical program that can be used efficiently without prior programming knowledge (e.g. by practicing pharmaceutical and organic chemists, teachers, university students, and so on). The CARDD subprogram allows the user to draw structures (drawing mode) and to calculate 2D (topological), 3D-chiral (2.5D) and 3D (geometric and topographic) non-stochastic and stochastic MDs (calculation mode).

The main steps for the application of this method in QSAR/QSPR and for drug design can be briefly summarized as follows:

1.
Drawing of the molecular pseudographs for each molecule in the data set, using the drawing mode.
2.
Use of appropriate weights in order to differentiate the atoms in the molecule. The weights used in this work are those previously proposed for the calculation of the DRAGON descriptors [55–57], i.e., atomic mass (M), atomic polarizability (P), atomic Mulliken electronegativity (K) and van der Waals atomic volume (V). The values of these atomic labels are shown in Table 1 [54–57].
3.
Computation of the total and local (bond and bond-type) bond bilinear indices of the bond adjacency matrix can be carried out in the software calculation mode, where one can select the atomic properties and the descriptor family before calculating the molecular indices. This software generates a table in which the rows correspond to the compounds, and the columns correspond to the bond-based (both total and local) bilinear maps or other MD family implemented in this program.
4.
Development of a QSPR/QSAR equation by using several multivariate analytical techniques, for instance, linear discriminant analysis. That is to say, one can find a quantitative relationship between an activity A and the bond-based bilinear fingerprints having, for instance, the following appearance:
$$ {\bf A}=a_{0}{\varvec b}_{0}(\bar{w},\bar{u})+a_{1}{\bf b}_{1}(\bar{w},\bar{u}) + a_{2}{\varvec b}_{2}(\bar{w},\bar{u}) +\cdots+ a_{k}{\varvec b}_{k}(\bar{w},\bar{u}) + \hbox{c} $$
(18)
where A is the measured activity, ${{\varvec b}_{k}(\bar{w},\bar{u})}$ are the kth non-stochastic total bond-based bilinear indices, and the ${a_{k}}$'s are the coefficients obtained by the linear regression analysis.
5.
Test of the robustness and predictive power of the QSPR/QSAR equation by using internal [leave-one-out (LOO)] and external (using a test set and an external predicting set) validation techniques.
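Steps 4 and 5 above can be illustrated with a small, self-contained sketch: a two-class Fisher linear discriminant fitted on two descriptors, followed by leave-one-out (LOO) validation. This is not the STATISTICA procedure used in the study. The descriptor values are synthetic, the function names are hypothetical, equal prior probabilities are assumed, and the model is restricted to two descriptors so that the 2 × 2 scatter matrix can be inverted by hand.

```python
import random


def _mean(rows):
    """Column means of a list of two-descriptor rows."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]


def fit_lda(X, y):
    """Fit a two-class Fisher discriminant on two descriptors.

    Returns (w, c) such that score = w[0]*x0 + w[1]*x1 + c is positive
    for the class coded +1 (equal priors assumed).
    """
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == -1]
    m_pos, m_neg = _mean(pos), _mean(neg)
    # Pooled within-class scatter matrix (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((pos, m_pos), (neg, m_neg)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [m_pos[0] - m_neg[0], m_pos[1] - m_neg[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(m_pos[j] + m_neg[j]) / 2 for j in range(2)]
    c = -(w[0] * mid[0] + w[1] * mid[1])
    return w, c


def predict(model, x):
    """Return +1 (active) or -1 (inactive) for one compound."""
    w, c = model
    return 1 if w[0] * x[0] + w[1] * x[1] + c > 0 else -1


def loo_accuracy(X, y):
    """Refit the model n times, each time predicting the left-out compound."""
    hits = 0
    for i in range(len(X)):
        model = fit_lda(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        if predict(model, X[i]) == y[i]:
            hits += 1
    return hits / len(X)


# Synthetic, hypothetical descriptor values (not data from this study).
rng = random.Random(0)
X = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(20)]
X += [[rng.gauss(-3, 1), rng.gauss(-3, 1)] for _ in range(20)]
y = [1] * 20 + [-1] * 20
print(f"LOO accuracy: {loo_accuracy(X, y):.2%}")
```

External validation follows the same pattern, except that the model is fitted once on the training series and then applied unchanged to the test set.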

The bond-based TOMOCOMD-CARDD descriptors computed in this study were the following:

(1)
kth (k = 15) total non-stochastic bond-based bilinear indices, not considering and considering H-atoms in the molecular graph (G) [${{\varvec b}_{\varvec k}(\bar{w},\bar{u})}$ and ${{\varvec b}_{\varvec k}^{ H}(\bar{w},\bar{u})}$, respectively].
(2)
kth (k = 15) total stochastic bond-based bilinear indices, not considering and considering H-atoms in the molecular graph (G) [${{}^{\varvec s}{\varvec b}_{\varvec k}(\bar{w},\bar{u})}$ and ${{}^{\varvec s}{\varvec b}_{\varvec k}^{ H}(\bar{w},\bar{u})}$, respectively].
(3)
kth (k = 15) bond-type local (group = heteroatoms: S, N, O) non-stochastic bilinear indices, not considering and considering H-atoms in the molecular graph (G) [${{\varvec b}_{{\varvec k}{ L}}(\bar{w}_E ,\bar{u}_E)}$ and ${{\varvec b}_{{\varvec k}{ L}}^{ H}(\bar{w}_E ,\bar{u}_E)}$, respectively]. These local descriptors are putative indicators of molecular charge, dipole moment, and H-bonding acceptors.
(4)
kth (k = 15) bond-type local (group = heteroatoms: S, N, O) stochastic bilinear indices, not considering and considering H-atoms in the molecular graph (G) [${{}^{\varvec s}{\varvec b}_{{\varvec k}{ L}}(\bar{w}_E ,\bar{u}_E)}$ and ${{}^{\varvec s}{\varvec b}_{{\varvec k}{ L}}^{ H}(\bar{w}_E ,\bar{u}_E)}$, respectively]. These local descriptors are putative indicators of molecular charge, dipole moment, and H-bonding acceptors.

Database construction

The database collected for our study of tyrosinase inhibitory activity consists of 658 compounds in total. Of these, 246 are active compounds with reported activity against the enzyme tyrosinase. The remaining 412 organic chemicals were chosen as inactive compounds. In both cases (active and inactive) we treated structural variability as an important goal, in order to assure the quality of our QSAR study.

In the case of the tyrosinase inhibitors (actives), many different structural subsystems were included. Some of the most representative tyrosinase reference drugs are illustrated in Fig. 1, together with tyrosinase inhibitors from different families.

The names of the compounds in the active database, together with their experimental data taken from the literature, are shown in Table 1 of Supporting Information. In the same way, Table 2 of Supporting Information depicts the molecular structures of these 246 tyrosinase inhibitors. This dataset provides a helpful tool for scientific research in the many fields of chemistry related to the tyrosinase enzyme and its inhibitors.

Table 2 Main results of the k-MCAs for tyrosinase inhibitors and inactive drug-like compounds


The remaining 412 compounds, which have different pharmacological uses, were selected for the inactive set. All these chemicals were taken from the Negwer Handbook [63], where their names, synonyms and structural formulas can be found.

Statistical techniques

The STATISTICA software [64] was used to carry out the statistical methods used in this report. First, we employed cluster analysis, a method that recognizes similarities among cases and groups them according to these criteria [65]. In our case, the k-MCA (k-means cluster analysis) and k-NNCA (k-nearest neighbors cluster analysis) algorithms were used to design the training and prediction series [64–67]. The dendrograms were obtained using the Euclidean distance (X-axis) and complete linkage (Y-axis); they show the distances between the compounds inside the clusters, which are grouped according to their chemical similarity as encoded by the MDs used as variables. Linear Discriminant Analysis (LDA), a simple and very useful technique in drug design, was used to find the QSAR models [13, 16, 17, 19, 34, 35, 37, 38, 42–47, 68–73]. Here, the forward stepwise procedure was fixed as the strategy for variable selection, and the principle of parsimony (Occam’s razor) was taken into account for model selection.

The classification of cases was carried out by means of posterior classification probabilities. Tyrosinase inhibitory activity was codified by a dummy variable “Class”, which indicates the presence of either an active compound (Class = 1) or an inactive compound (Class = −1). Using the models, a compound is classified as active if ${\Delta P\% > 0}$, where ${\Delta P\%=[P\hbox{(Active)} - P\hbox{(Inactive)}]\times 100}$, and as inactive otherwise. P(Active) and P(Inactive) are the probabilities with which the equations classify a compound as active or inactive, respectively.
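The ΔP% decision rule is simple enough to state as code. The sketch below assumes that the two posterior probabilities are already available from a fitted model; the function names and the example probabilities are hypothetical.

```python
def delta_p_percent(p_active, p_inactive):
    """Delta-P% = [P(Active) - P(Inactive)] x 100."""
    return (p_active - p_inactive) * 100.0


def classify(p_active, p_inactive):
    """A compound is 'active' if Delta-P% > 0 and 'inactive' otherwise."""
    return "active" if delta_p_percent(p_active, p_inactive) > 0 else "inactive"


# Hypothetical posterior probabilities for three compounds.
posteriors = {
    "compound A": (0.90, 0.10),
    "compound B": (0.35, 0.65),
    "compound C": (0.50, 0.50),
}
for name, (p_act, p_inact) in posteriors.items():
    print(name, round(delta_p_percent(p_act, p_inact), 2), classify(p_act, p_inact))
```

Because the rule uses a strict inequality, a compound with equal posterior probabilities (ΔP% = 0) falls in the inactive class. With two classes whose probabilities sum to 1, ΔP% ranges from −100 to +100.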

Randić’s method of orthogonalization was used in this study to avoid the interrelation among the molecular fingerprints [45, 74–79]. This allows a better statistical interpretation of the correlation coefficient and an evaluation of the role of each individual MD in the QSAR model.

The data set was standardized before the orthogonalization process, because the different MDs included here are measured on entirely different scales. After standardization, each variable has a mean of 0 and a standard deviation of 1.
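As a rough illustration of these two preprocessing steps, the sketch below standardizes each descriptor column and then orthogonalizes the columns sequentially: the first descriptor (in order of importance) is kept, and each later descriptor is replaced by the residual of its regression on the preceding orthogonal descriptors. This follows the general idea of Randić's procedure as described in the text. The descriptor values are invented, the function names are hypothetical, and the sample standard deviation (n − 1) is an assumption.

```python
import math


def standardize(column):
    """Scale a descriptor column to mean 0 and sample standard deviation 1."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(columns):
    """Sequential orthogonalization of centred columns.

    `columns` must be ordered by importance. Each column is replaced by its
    residual after removing the projections on all earlier orthogonal columns.
    """
    ortho = []
    for column in columns:
        residual = list(column)
        for previous in ortho:
            coef = dot(residual, previous) / dot(previous, previous)
            residual = [r - coef * p for r, p in zip(residual, previous)]
        ortho.append(residual)
    return ortho


# Invented descriptor columns for five compounds (illustration only).
d1 = [1.0, 2.0, 3.0, 4.0, 5.0]
d2 = [2.0, 1.0, 4.0, 3.0, 6.0]
d3 = [5.0, 3.0, 8.0, 1.0, 2.0]
standardized = [standardize(c) for c in (d1, d2, d3)]
orthogonal = orthogonalize(standardized)
for i in range(3):
    for j in range(i + 1, 3):
        print(f"O{i + 1} . O{j + 1} = {dot(orthogonal[i], orthogonal[j]):.1e}")
```

Because the orthogonal descriptors span the same space as the original ones, a linear model refitted on them produces the same fitted values, which is why overall fit statistics are not expected to change.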

Experimental methods

The synthesis and characterization of the 24 tetraketones, their biological studies and cross references have been reported by other members of our research team [80].

Tyrosinase inhibition assays were performed with kojic acid and l-mimosine as standard inhibitors of tyrosinase, in a 96-well microplate format using a SpectraMax 340 microplate reader (Molecular Devices, CA, USA), according to the method developed by Hearing [81]. Briefly, the compounds were first screened for the o-diphenolase inhibitory activity of tyrosinase using l-DOPA as substrate. All the active inhibitors from the preliminary screening were subjected to IC₅₀ studies. Compounds were dissolved in methanol to a concentration of 2.5%. Thirty units of mushroom tyrosinase (28 nM, from Sigma Chemical Co., USA) were first preincubated with the test compounds in 50 nM Na-phosphate buffer (pH 6.8) for 10 min at 25 °C. Then l-DOPA (0.5 mM) was added to the reaction mixture, and the enzymatic reaction was monitored for 10 min by measuring the change in absorbance at 475 nm (at 37 °C) due to the formation of DOPAchrome. The percentage of inhibition of the enzyme was calculated as follows, using a program based on MS Excel® 2000 (Microsoft Corp., USA) developed for this purpose:

$$ \hbox{Percent inhibition}=[({B}-{S})/{B}]\times 100 $$

(19)

Here, B and S are the absorbances for the blank and samples, respectively. After the screening of the compounds, 50% inhibitory concentrations (IC₅₀) were also calculated. Kojic acid and l-mimosine were used as standard inhibitors for the tyrosinase and both of them were purchased from Sigma Chem. Co., USA.
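Equation 19 can be written as a one-line function. The sketch below is only an illustration: the absorbance readings are invented, and the function name is hypothetical.

```python
def percent_inhibition(blank_abs, sample_abs):
    """Percent inhibition = [(B - S) / B] x 100, with B the blank absorbance
    and S the sample absorbance (Eq. 19)."""
    if blank_abs <= 0:
        raise ValueError("blank absorbance must be positive")
    return (blank_abs - sample_abs) / blank_abs * 100.0


# Invented absorbance readings at 475 nm (illustration only).
blank = 0.80
for sample in (0.80, 0.40, 0.20):
    print(f"S = {sample:.2f} -> {percent_inhibition(blank, sample):.1f}% inhibition")
```

A sample that absorbs as much as the blank gives 0% inhibition, and a sample with half the blank absorbance gives 50%.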

Results and discussion

Dividing the training and prediction series through cluster analysis

The previous section described how the database was selected; the structural variability of this set must now be demonstrated. This is a crucial aspect of any QSAR study, since its reliability depends on it. For this reason, different cluster analysis techniques were carried out. First, a k-NNCA was used to assess the structural diversity of the families present in the data. Two dendrograms, one for the active compounds and the other for the inactive ones, were obtained through hierarchical cluster analysis (Figs. 2, 3); they show different structural patterns, which demonstrates the chemical variability of the database.

The dataset must next be partitioned into training and prediction sets in order to find the discriminant functions. Because the output dendrograms are difficult to evaluate, another kind of CA is needed to select the compounds in a ‘rational’ way.

We therefore chose k-MCA to solve this problem and applied it to the active and inactive subsets. The first k-MCA (k-MCA I) divided the tyrosinase inhibitors into 10 clusters, whereas k-MCA II split the inactive set into 12 clusters. The variables used were the kth non-stochastic bond-based bilinear indices, and the analyses of variance for these k-MCAs are given in Table 2.

The process by which cluster analysis was used to divide the entire database into training and prediction series is shown in Fig. 4. As can be seen in that diagram, the training set contains 183 active compounds and 295 inactive ones (478 organic chemicals). The prediction series of 180 compounds comprises 63 tyrosinase inhibitors and 117 non-inhibitors of tyrosinase.
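One way to picture a cluster-guided partition is to draw the prediction series from every cluster, so that each structural family is represented in both series. The sketch below assumes that cluster labels are already available (for example from a k-means run) and takes a fixed fraction of each cluster at random. The fraction, the random selection and all names are assumptions made for illustration; the text does not state how compounds were picked within each cluster.

```python
import random
from collections import defaultdict


def split_by_cluster(cluster_of, test_fraction=0.25, seed=0):
    """Split compounds into training and prediction series cluster by cluster.

    `cluster_of` maps a compound identifier to its cluster label.
    """
    rng = random.Random(seed)
    members = defaultdict(list)
    for compound, cluster in cluster_of.items():
        members[cluster].append(compound)
    train, test = [], []
    for cluster in sorted(members):
        group = sorted(members[cluster])
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical example: 80 compounds spread evenly over 10 clusters.
cluster_of = {f"cpd{i:03d}": i % 10 for i in range(80)}
train, test = split_by_cluster(cluster_of)
print(len(train), "training compounds;", len(test), "prediction compounds")

# Consistency check on the counts reported in the text.
assert 183 + 295 == 478 and 63 + 117 == 180
assert 183 + 63 == 246 and 295 + 117 == 412
```

The final assertions only confirm that the reported series sizes add up to the 246 active and 412 inactive compounds of the database.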

Developing the discriminant functions

The representative selection of the training set allows us to proceed to the next step, finding the classification functions that discriminate between active and inactive compounds. For this we selected LDA as the statistical technique because of its broad use and simplicity.

In total, fourteen models were obtained. The first six models of each family were developed with the non-stochastic bond-based bilinear indices and with the stochastic molecular descriptors, respectively; these equations are given in Table 3. In addition, Eqs. 32 and 33 below are the seventh and last model of each family (non-stochastic and stochastic molecular fingerprints), and result from a combination of all pairs of atom weights (atomic labels):

$$ \begin{aligned} {\bf Class}= &-0.636 -8.422\times 10^{-2}\,{}^{MP}{\varvec b}_{0L}^{H}(\bar{w}_E ,\bar{u}_E ) +0.107\,{}^{MP}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E )\\ &+1.792\times 10^{-2}\,{}^{MK}{\varvec b}_{1L}^{H}(\bar{w}_E ,\bar{u}_E ) -2.373\times 10^{-2}\,{}^{MK}{\varvec b}_{1L}(\bar{w}_E ,\bar{u}_E ) +3.287\times 10^{-5}\,{}^{VP}{\varvec b}_{5}^{H}(\bar{w},\bar{u})\\ &-9.590\times 10^{-2}\,{}^{VP}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E ) +1.166\times 10^{-2}\,{}^{VP}{\varvec b}_{1L}(\bar{w}_E ,\bar{u}_E ) +2.277\times 10^{-2}\,{}^{VK}{\varvec b}_{0}^{H}(\bar{w},\bar{u})\\ &+5.4\times 10^{-3}\,{}^{VK}{\varvec b}_{1}^{H}(\bar{w},\bar{u}) -4.04\times 10^{-3}\,{}^{VK}{\varvec b}_{2}^{H}(\bar{w},\bar{u}) +2.34\times 10^{-2}\,{}^{VK}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E ) \end{aligned} $$

(32)

N = 478; λ = 0.45; D² = 5.13; F = 51.6; canonical R = 0.74; χ² = 374.8; Q_Total = 91.00%; C = 0.81

$$ \begin{aligned} {\bf Class}=& -0.302 +5.290\times 10^{-3}\,{}^{MV}{\varvec b}_{5L}^{H}(\bar{w}_E ,\bar{u}_E ) +6.267\times 10^{-3}\,{}^{MP}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E)\\ &+1.262\times 10^{-2}\,{}^{MK}{\varvec b}_{0}(\bar{w},\bar{u}) -3.458\times 10^{-2}\,{}^{MK}{\varvec b}_{0L}^{H}(\bar{w}_E ,\bar{u}_E) -1.734\times 10^{-2}\,{}^{VP}{\varvec b}_{0}(\bar{w},\bar{u})\\ &+1.286\times 10^{-2}\,{}^{VP}{\varvec b}_{14L}(\bar{w},\bar{u}) -4.840\times 10^{-2}\,{}^{VP}{\varvec b}_{4L}(\bar{w}_E ,\bar{u}_E) +0.129\,{}^{VK}{\varvec b}_{2L}^{H}(\bar{w}_E ,\bar{u}_E)\\ &-0.133\,{}^{VK}{\varvec b}_{3L}^{H}(\bar{w}_E ,\bar{u}_E ) +1.067\times 10^{-2}\,{}^{VK}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E) \end{aligned} $$

(33)

N = 478; λ = 0.46; D² = 5.00; F = 55.4; canonical R = 0.74; χ² = 368.5; Q_Total = 90.17%; C = 0.79

Table 3 Discriminant models obtained with total and local non-stochastic and stochastic bond-based bilinear indices used in this study


Prediction performances of all the obtained models, including these last two equations, are given in Table 4, together with the Wilks’ statistic (λ), the square of the Mahalanobis distance (D²), and the Fisher ratio (F). The selected models proved to be statistically significant at p-level < 0.0001.

Table 4 Prediction performances and statistical parameters for LDA-based QSAR models in the training set


The fitted models 32 and 33, which result from the combination of weighting schemes for the non-stochastic and stochastic bond-level bilinear indices, respectively, give the best results, as can be seen in Table 4. These two best equations correctly classified 91.00% and 90.17% of the training set, with Matthews correlation coefficients (C) of 0.81 and 0.79, respectively. The most common parameters in medical statistics for all the models are given in the same Table 4.
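Both the percentage of globally good classification and the Matthews correlation coefficient follow directly from a two-class confusion matrix. The sketch below shows the standard formulas. The counts are invented for illustration and are not the confusion matrices of models 32 and 33, which are given in Table 4.

```python
import math


def accuracy_percent(tp, tn, fp, fn):
    """Percentage of compounds classified correctly."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a two-class confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Invented counts: 100 actives (90 recognized), 200 inactives (180 recognized).
tp, fn, tn, fp = 90, 10, 180, 20
print(f"accuracy = {accuracy_percent(tp, tn, fp, fn):.2f}%")
print(f"C = {matthews_cc(tp, tn, fp, fn):.2f}")
```

With these invented counts the accuracy is 90% while C is about 0.78. Unlike accuracy, C stays informative when the two classes are of unequal size, as they are here.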

Although these two best models gave good results, interpreting the individual role of each index in a model can be difficult because of the interrelation among the indices (data not shown). This led us to use Randić’s orthogonalization process to avoid this problem and eliminate the collinearity between the variables [74–78].

Eqs. 34 and 35 give the results of the orthogonalization process for the best two models, with non-stochastic and stochastic bilinear indices, respectively.

$$ \begin{aligned} {\bf Class} =&-0.331 -1.515\,{}^{1}O({}^{VP}{\varvec b}_{1L}^{H}(\bar{w}_E ,\bar{u}_E )) +2.037\,{}^{2}O({}^{VK}{\varvec b}_{1}^{H}(\bar{w},\bar{u})) \quad 2.406\,{}^{3}O({}^{VK}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E ))\\ &-3.176\,{}^{4}O({}^{VK}{\varvec b}_{0}^{H}(\bar{w},\bar{u})) +0.805\,{}^{5}O({}^{VP}{\varvec b}_{5}^{H}(\bar{w},\bar{u})) -6.540\,{}^{6}O({}^{VK}{\varvec b}_{2}^{H}(\bar{w},\bar{u}))\\ &-1.959\,{}^{7}O({}^{VP}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E )) -1.015\,{}^{8}O({}^{MK}{\varvec b}_{1L}^{H}(\bar{w}_E ,\bar{u}_E )) +1.913\,{}^{9}O({}^{MP}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E ))\\ &-5.997\,{}^{10}O({}^{MK}{\varvec b}_{1L}(\bar{w}_E ,\bar{u}_E )) -16.337\,{}^{11}O({}^{MP}{\varvec b}_{0L}^{H}(\bar{w}_E ,\bar{u}_E )) \end{aligned} $$

(34)

N = 478; λ = 0.45; D² = 5.13; F = 51.6; canonical R = 0.74; χ² = 374.8; Q_Total = 91.00%; C = 0.81.

$$ \begin{aligned} {\bf Class} =& -2.414\times 10^{-2} -1.197\,{}^{1}O({}^{VP}{\varvec b}_{4L}(\bar{w}_E ,\bar{u}_E )) +4.346\,{}^{2}O({}^{VK}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E ))\\ &+1.075\,{}^{3}O({}^{VP}{\varvec b}_{14}^{H}(\bar{w},\bar{u})) -3.196\,{}^{4}O({}^{VP}{\varvec b}_{0}(\bar{w},\bar{u})) +0.899\,{}^{5}O({}^{MP}{\varvec b}_{0L}(\bar{w}_E ,\bar{u}_E )) \\ &-1.197\,{}^{6}O({}^{MK}{\varvec b}_{0L}^{H}(\bar{w}_E ,\bar{u}_E )) +5.474\,{}^{7}O({}^{MK}{\varvec b}_{0}(\bar{w},\bar{u})) +2.093\,{}^{8}O({}^{VK}{\varvec b}_{2L}^{H}(\bar{w}_E ,\bar{u}_E )) \\ &-7.371\,{}^{9}O({}^{VK}{\varvec b}_{3L}^{H}(\bar{w}_E ,\bar{u}_E )) +3.461\,{}^{10}O({}^{MV}{\varvec b}_{5L}^{H}(\bar{w}_E ,\bar{u}_E )) \end{aligned} $$

(35)

N = 478; λ = 0.46; D² = 5.00; F = 55.4; canonical R = 0.74; χ² = 368.5; Q_Total = 90.17%; C = 0.79. Here, we used the symbols ${{}^{m}O[{\varvec b}_{k}(\bar{w},\bar{u})]}$, where the superscript m expresses the order of importance of the variable ${{\varvec b}_{k}(\bar{w},\bar{u})}$ after a preliminary forward-stepwise analysis and O means orthogonal. Comparing the statistical parameters of each model before and after the orthogonalization process shows that they remain the same for the non-orthogonal and orthogonal descriptors.

Assessing the predictive power of the models

External validation, most commonly carried out with a test set, is necessary to ensure the quality and extrapolation power of the QSAR models found in this report [82, 83]. To this end, all the equations were evaluated, and the results are shown in Table 5. The best two discriminant functions, Eqs. 32 and 33, gave overall accuracies of 93.33% (C = 0.85) and 88.89% (C = 0.77), respectively. A plot of ΔP% for the entire dataset using models (32) and (33) is shown in Figs. 5 and 6.

Table 5 Prediction performances for LDA-based QSAR models in the test set


The results of the classification using all fourteen models, for all the active and inactive organic chemicals in the training and external series, are shown in Tables 3–10 of Supporting Information.

Simulated virtual screening of new tyrosinase inhibitors

The good results obtained above encouraged us to extend the possibilities of this novel approach to the in silico discovery of novel tyrosinase inhibitors. Virtual High Throughput Screening (HTS) can become an important tool for querying large databases of thousands of compounds, and has the potential to transform early-stage drug discovery. To test the ability of our models, a simulated virtual screening was carried out on a set of 104 organic chemicals (Table 6) reported in the literature as active/inactive (see the last column of Table 6: Ref). The molecular structures of these compounds are shown in Table 11 of Supporting Information.

Table 6 Results of the virtual screening


In addition, to confirm that our models can identify several classes of compounds, a k-NNCA was carried out on these data; the resulting dendrogram is shown in Fig. 7, where a great molecular diversity can be seen. Table 6 gives the results of the classification of the 104 compounds. The posterior classification probabilities (including canonical scores) for all the equations are summarized in Table 12 (Supporting Information). The percentages of globally good classification were 85.57% and 84.61% for the non-stochastic and stochastic molecular descriptors, respectively.

This method could be very useful, because large databases of drug-like compounds could be screened with it and some compounds could be identified as having a new biological activity. In addition, chemicals of this kind have well-established methods of synthesis, and their toxicological, pharmacodynamic and pharmaceutical properties are well known.

Biosilico identification of novel tyrosinase inhibitors and experimental corroboration

The entire algorithm described in the above sections was set up with the main objective of exploring the possibilities of the current in silico approach for the identification of hits from large databases. To this end, an in silico screening of novel compounds was performed in search of the biological activity of interest in this work. For this, a pool of compounds never described in the literature as tyrosinase inhibitors was chosen. The in silico assays were then done using all the models developed in this report, to find bioactive chemicals with tyrosinase inhibitory activity.

Here, 24 tetraketones were evaluated with the LDA-based QSAR models, and in vitro assays of the synthesized compounds were done to corroborate the in silico predictions. The values of ΔP%, obtained from the posterior classification probabilities with all the equations, are shown in Table 7. There is good concordance between the theoretical predictions and the experimental results for all the organic chemicals, and all were active against the tyrosinase enzyme in the in vitro assays. It is worth noting that the majority of the compounds were more potent than kojic acid (standard tyrosinase inhibitor: IC₅₀ = 16.67 μM), with the exception of TK2 (IC₅₀ = 26.63 μM), TK4 (IC₅₀ = 16.99 μM), TK7 (IC₅₀ = 19.73 μM) and TK19 (IC₅₀ = 71.47 μM). Moreover, four chemicals, TK10 (IC₅₀ = 2.09 μM), TK11 (IC₅₀ = 2.61 μM), TK21 (IC₅₀ = 2.06 μM) and TK23 (IC₅₀ = 3.19 μM), exhibited more potent activity than l-mimosine (IC₅₀ = 3.68 μM), a reference drug. Table 8 depicts the molecular structures of these tetraketones and of the rest used in this study.
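The potency comparisons in this paragraph can be reproduced from the quoted IC₅₀ values alone, keeping in mind that a lower IC₅₀ means a more potent inhibitor. The sketch below uses only the eight tetraketones whose values are quoted in the text (the remaining sixteen are in Table 7); the variable and function names are illustrative.

```python
# IC50 values in micromolar, as quoted in the text.
KOJIC_ACID_IC50 = 16.67
L_MIMOSINE_IC50 = 3.68
ic50_um = {
    "TK2": 26.63, "TK4": 16.99, "TK7": 19.73, "TK19": 71.47,
    "TK10": 2.09, "TK11": 2.61, "TK21": 2.06, "TK23": 3.19,
}


def more_potent_than(reference_ic50):
    """Compounds with a lower IC50 than the reference inhibitor."""
    return {name for name, value in ic50_um.items() if value < reference_ic50}


def fold_potency(name, reference_ic50):
    """How many times lower the compound's IC50 is than the reference IC50."""
    return reference_ic50 / ic50_um[name]


print("More potent than l-mimosine:", sorted(more_potent_than(L_MIMOSINE_IC50)))
print("Weaker than kojic acid:",
      sorted(set(ic50_um) - more_potent_than(KOJIC_ACID_IC50)))
for name in sorted(more_potent_than(L_MIMOSINE_IC50)):
    print(f"{name}: {fold_potency(name, L_MIMOSINE_IC50):.2f}-fold vs l-mimosine")
```

By this measure the most potent compound quoted, TK21, has an IC₅₀ a little under 1.8 times lower than that of l-mimosine.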

Table 7 Results of ligand-based in silico screening and tyrosinase inhibitory activities of new tetraketones


Table 8 Molecular structure of the new tetraketones


As a final point, a hierarchical cluster analysis was performed for all the active compounds of the training, test and virtual screening sets together with the new tetraketones (Fig. 8). The aim of this k-NNCA was to determine whether there was any similarity between the novel bioactive chemicals and some subsystems in the rest of the active database. After an exhaustive analysis of each cluster, we observed that the tetraketones were distributed over many clusters, which is reasonable because this class of compounds has no structural features in common with any of the compounds in the active database. They can therefore be selected for structural optimization, with the objective of finding more potent tyrosinase inhibitory activity, and afterward a complete study of ADMET properties should be carried out before these newly discovered organic chemicals enter the drug development pipeline.

Summary and outlook

Many studies in the area of tyrosinase inhibitory activity aim at finding novel inhibitors from different sources, owing to their wide applications as food additives and depigmentation agents, in the treatment of melanogenesis disorders, in the control of insect pests, and so on. The interest of the pharmaceutical, cosmetic, and agricultural sciences in this kind of chemical stems from this broad spectrum of applications and from the wide distribution of the enzyme across the phylogenetic scale.

The advent of virtual High Throughput Screening (vHTS), encompassing in silico techniques in drug discovery, offers solutions that enable research to proceed faster and more efficiently. These new algorithms, arising from the convergence of information technology and drug discovery, can help to accelerate the pace of drug discovery and the identification of higher-quality compounds. Nevertheless, the search for new tyrosinase inhibitors has until now relied on traditional trial-and-error methods [84, 85].

Taking all this into consideration, we made use of the non-stochastic and stochastic bond-based bilinear indices, a new set of MDs, together with pattern recognition techniques to discriminate active compounds from inactive ones. The QSAR models found here were used in a virtual screening in order to move from in silico to ‘real’ world applications. In addition, we report the biosilico identification of a novel tetraketone family as tyrosinase inhibitors using the new molecular fingerprints. Experimental in vitro assays were also carried out to prove the usefulness of the TOMOCOMD-CARDD descriptors for the rational design of new bioactive agents.

Work of this kind responds to new challenges for the pharmaceutical industry, because research in modern drug discovery needs training and experience in multiple life science domains as well as in computer science [86]. Finally, the present report allows us to look forward to many exciting new insights in the field of tyrosinase inhibitor research for the treatment of hyperpigmentation and melanogenesis disorders in the years ahead.

Supporting information available

The complete list of compounds used in training and prediction sets, as well as their structures, posterior classification and scores according to LDA-based QSAR models, chemistry and data analysis of the obtained chemicals is available free of charge via Internet at...

References

Robb DA (1984) Copper proteins and copper enzymes. CRC Press, Boca Raton, FL
Baurin N, Arnoult E, Scior T, Do QT, Bernard P (2002) J Ethnopharmacol 82:155
Prota G (1988) Med Res Rev 8:525
Prota G (1992) Melanins and melanogenesis. Academic Press, San Diego, CA
Riley PA (1996) In: Hori Y, Hearing VJ, Nakayama J (eds) Melanogenesis and malignant melanoma: biochemistry, cell biology, molecular biology, pathophysiology, diagnosis and treatment. Elsevier, Amsterdam
Riley PA (2003) Pigment Cell Res 16:548
Fitzpatrick TB, Seji M, McGugan AD (1961) New Engl J Med 265:374
Maeda K, Fukuda M (1996) J Pharmacol Exp Ther 276:765
Li W, Kubo I (2004) Bioorg Med Chem 12:701
Casañola-Martin GM, Khan MT, Marrero-Ponce Y, Ather A, Sultankhodzhaev MN, Torrens F (2006) Bioorg Med Chem Lett 16:324
Marrero-Ponce Y, Khan MTH, Casañola-Martín GM, Ather A, Sultankhodzhaev MN, Torrens F (2006) QSAR Comb Sci DOI: 10.1002/qsar.200610156
Marrero-Ponce Y (2003) Molecules 8:687
Marrero-Ponce Y, Huesca-Guillen A, Ibarra-Velarde F (2005) J Mol Struct (Theochem) 717:67
Meneses-Marcel A, Marrero-Ponce Y, Machado-Tugores Y, Montero-Torres A, Pereira DM, Escario JA, Nogal-Ruiz JJ, Ochoa C, Aran VJ, Martinez-Fernandez AR, Garcia Sanchez RN (2005) Bioorg Med Chem Lett 15:3838
Vega MC, Montero-Torres A, Marrero-Ponce Y, Rolon M, Gomez-Barrio A, Escario JA, Aran VJ, Nogal JJ, Meneses-Marcel A, Torrens F (2006) Bioorg Med Chem Lett 16:1898
Marrero-Ponce Y, Medina-Marrero R, Martinez Y, Torrens F, Romero-Zaldivar V, Castro EA (2006) J Mol Mod 12:255
Marrero-Ponce Y, Nodarse D, González HD, Ramos de Armas R, Romero-Zaldivar V, Torrens F, Castro E (2004) Int J Mol Sci 5:276
Marrero Ponce Y, Castillo Garit JA, Nodarse D (2005) Bioorg Med Chem 13:3397
Marrero-Ponce Y, Iyarreta-Veitia M, Montero-Torres A, Romero-Zaldivar C, Brandt CA, Avila PE, Kirchgatter K, Machado Y (2005) J Chem Inf Model 45:1082
Marrero-Ponce Y, Romero V, TOMOCOMD software, Central University of Las Villas (2002) TOMOCOMD (TOpological MOlecular COMputer Design) for Windows, version 1.0 is a preliminary experimental version; in future a professional version can be obtained upon request to Y. Marrero: yovanimp@qf.uclv.edu.cu or ymarrero77@yahoo.es
Rouvray DH (1976) In: Balaban AT (ed) Chemical applications of graph theory. Academic Press, London, pp 180–181
Trinajstić N (1983) Chemical graph theory. CRC Press, Boca Raton FL
Estrada E (1995) J Chem Inf Comput Sci 35:31
Estrada E, Ramirez A (1996) J Chem Inf Comput Sci 36:837
Estrada E (1996) J Chem Inf Comput Sci 36:844
Estrada E, Guevara N, Gutman I (1998) J Chem Inf Comput Sci 38:428
Estrada E (1999) J Chem Inf Comput Sci 39:1042
Estrada E, Molina E (2001) J Mol Graph Model 20:54
Todeschini R, Consonni V (2000) Handbook of molecular descriptors. Wiley-VCH, Germany
Ivanciuc O, Balaban AT (1999) In: Devillers J, Balaban AT (eds) Topological indices and related descriptors in QSAR and QSPR. Gordon and Breach, The Netherlands, 73 p
Edwards CH, Penney DE (1988) Elementary linear algebra. Prentice-Hall, Englewood Cliffs, New Jersey, USA
Marrero-Ponce Y (2004) Bioorg Med Chem 12:6351
Marrero-Ponce Y, Castillo-Garit JA, Olazabal E, Serrano HS, Morales A, Castañedo N, Ibarra-Velarde F, Huesca-Guillen A, Jorge E, del Valle A, Torrens F, Castro EA (2004) J Comput Aided Mol Des 18:615
Marrero-Ponce Y, Medina-Marrero R, Torrens F, Martinez Y, Romero-Zaldivar V, Castro EA (2005) Bioorg Med Chem 13:2881
Marrero-Ponce Y, Díaz HG, Romero V, Torrens F, Castro EA (2004) Bioorg Med Chem 12:5331
Marrero-Ponce Y, Cabrera MA, Romero V, Ofori E, Montero LA (2003) Int J Mol Sci 4:512
Marrero-Ponce Y, Cabrera MA, Romero V, González DH, Torrens F (2004) J Pharm Pharmaceut Sci 7:186
Marrero-Ponce Y, Cabrera MA, Romero-Zaldivar V, Bermejo M, Siverio D, Torrens F (2005) Internet Electron J Mol Des 4:124
Marrero-Ponce Y, Medina R, Castro EA, de Armas R, González H, Romero V, Torrens F (2004) Molecules 9:1124
Marrero Ponce Y (2004) J Chem Inf Comput Sci 44:2010
Marrero-Ponce Y, Castillo-Garit JA, Torrens F, Romero-Zaldivar V, Castro E (2004) Molecules 9:1100
Marrero-Ponce Y, Montero-Torres A, Zaldivar CR, Veitia MI, Perez MM, Sanchez RN (2005) Bioorg Med Chem 13:1293
Marrero-Ponce Y, Castillo-Garit JA, Olazabal E, Serrano HS, Morales A, Castanedo N, Ibarra-Velarde F, Huesca-Guillen A, Sanchez AM, Torrens F, Castro EA (2005) Bioorg Med Chem 13:1005
Marrero-Ponce Y, Castillo-Garit JA (2005) J Comput Aided Mol Des 19:369
Estrada E, Vilar S, Uriarte E, Gutierrez Y (2002) J Chem Inf Comput Sci 42:1194
Estrada E, Uriarte E, Montero A, Teijeira M, Santana L, De Clercq E (2000) J Med Chem 43:1975
Estrada E, Peña A, Garcia-Domenech R (1998) J Comput Aided Mol Des 12:583
Potapov VM (1978) Stereochemistry. Mir Moscow
Wang R, Gao Y, Lai L (2000) Perspect Drug Dis Des 19:47
Ertl P, Rohde B, Selzer P (2000) J Med Chem 43:3714
Ghose AK, Crippen GM (1987) J Chem Inf Comput Sci 27:21
Miller KJ (1990) J Am Chem Soc 112:8533
Gasteiger J, Marsili M (1978) Tetrahedron Lett 19:3181
Pauling L (1939) The nature of chemical bond. Cornell University Press, Ithaca, New York
Kier LB, Hall LH (1986) Molecular connectivity in structure–activity analysis. Research Studies Press, Letchworth, UK
Consonni V, Todeschini R, Pavan M (2002) J Chem Inf Comput Sci 42:682
Todeschini R, Gramatica P (1998) Perspect Drug Dis Des 9–11:355
Jacobson N (1985) Basic algebra I. W. H. Freeman and Company, New York, pp 343–361
Riley KF, Hobson MP, Bence SJ (1998) Mathematical methods for physics and engineering. Cambridge University Press
Werner G (1981) Linear algebra, 4th edn. Springer-Verlag, New York
Walker PD, Mezey PG (1993) J Am Chem Soc 115:12423
Klein DJ (2003) Internet Electron J Mol Des 2:814
Negwer M (1987) Organic-chemical drugs and their synonyms. Akademie-Verlag, Berlin
STATISTICA (data analysis software system), v S I (2001) www.statsoft.com
Xu J, Hagler A (2002) Molecules 7:566
McFarland JW, Gans DJ (1995) In: van de Waterbeemd H (ed) Chemometric methods in molecular design. VCH Publishers, New York, pp 295–307
Johnson RA, Wichern DW (1988) Applied multivariate statistical analysis. Prentice-Hall, New Jersey
Duart MJ, Garcia-Domenech R, Anton-Fos GM, Galvez J (2001) J Comput Aided Mol Des 15:561
van de Waterbeemd H (1995) In: van de Waterbeemd H (ed) Chemometric methods in molecular design. VCH Publishers, Weinheim, pp 265–288
de Julian-Ortiz JV, de Alapont CG, Ríos-Santamarina I, Garcia-Domenech R, Galvez E (1998) J Mol Graphics Mod 16:14
Gálvez J, García R, Salabert MT, Soler R (1994) J Chem Inf Comput Sci 34:520
González-Díaz H, Marrero-Ponce Y, Hernández I, Bastida I, Tenorio E, Nasco O, Uriarte E, Castanedo N, Cabrera MA, Aguila E, Marrero O, Morales A, Perez M (2003) Chem Res Toxicol 16:1318
Estrada E, Peña A (2000) Bioorg Med Chem 8:2755
Randić M (1991) J Mol Struct (Theochem) 233:45
Randić M (1991) J Chem Inf Comput Sci 31:311
Randić M (1991) New J Chem 15:517
Lučić B, Nikolić S, Trinajstić N, Jurić D (1995) J Chem Inf Comput Sci 35:532
Klein DJ, Randić M, Babić D, Lučić B, Nikolić S, Trinajstić N (1997) Int J Quantum Chem 63:215
Estrada E, Uriarte E (2001) Curr Med Chem 8:1573
Khan KM, Maharvi GM, Khan MT, Jabbar Shaikh A, Perveen S, Begum S, Choudhary MI (2006) Bioorg Med Chem 14:344
Hearing VJ (1987) Methods in enzymology. Academic Press, New York
Wold S, Eriksson L (1995) In: van de Waterbeemd H (ed) Chemometric methods in molecular design. VCH Publishers, New York, pp 309–318
Golbraikh A, Tropsha A (2002) J Mol Graph Model 20:269
Okombi S, Rival D, Bonnet S, Mariotte AM, Perrier E, Boumendjel A (2006) J Med Chem 49:329
Zhang J-P, Chen Q-X, Song K-K, Xie J-J (2006) Food Chem 92:579
Apweiler R (2003) Biosilico 1:5


Acknowledgments

One of the authors (M-P. Y) thanks the program ‘Estades Temporals per a Investigadors Convidats’ for a fellowship to work at Valencia University (2006–2007). M-P. Y also thanks the Generalitat Valenciana (Spain) for partial financial support, as well as the Spanish MEC for its support (Project Reference: SAF2006-04698). MTHK is the recipient of a grant from MCBN-UNESCO (grant no. 1056) and of fellowships from CIB (Italy) and Associazione Veneta per la Lotta alla Talassemia (AVTL, Italy). F. T. acknowledges financial support from the Spanish MEC DGI (Project No. CTQ2004-07768-C02-01/BQU) and Generalitat Valenciana (DGEUI INF01-051 and INFRA03-047, and OCYT GRUPOS03-173).

Author information

Authors and Affiliations

Institut Universitari de Ciència Molecular, Universitat de València, Edifici d’Instituts de Paterna, Poligon la Coma s/n (detras de Canal Nou), P.O. Box 22085, 46071, Valencia, Spain
Yovani Marrero-Ponce & Francisco Torrens
Unit of Computer-Aided Molecular “Biosilico” Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Department of Drug Design, Chemical Bioactive Center, Central University of Las Villas, Santa Clara, Villa Clara, 54830, Cuba
Yovani Marrero-Ponce & Gerardo M. Casañola-Martín
Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
Yovani Marrero-Ponce & Ramón García-Domenech
Pharmacology Research Lab., Faculty of Pharmaceutical Sciences, University of Science and Technology, Chittagong, Bangladesh
Mahmud Tareq Hassan Khan
Department of Pharmacology, Institute of Medical Biology, University of Tromso, Tromso, 9037, Norway
Mahmud Tareq Hassan Khan
Department of Biological Sciences, Faculty of Agricultural Sciences, University of Ciego de Avila, 69450, Ciego de Avila, Cuba
Gerardo M. Casañola-Martín
The Norwegian Structural Biology Centre (NorStruct), University of Tromso, Tromso, 9037, Norway
Arjumand Ather
S. Yunusov Institute of Chemistry of Plant Substances, Academy of Sciences, Uzbekistan, Tashkent
Mukhlis N. Sultankhodzhaev
Advanced Medisyns, Inc., 601 Carlson Parkway, Suite 1050, Minnetonka, MN, 55305, USA
Richard Rotondo

Corresponding author

Correspondence to Yovani Marrero-Ponce.

Additional information

Y. Marrero-Ponce web address: www.uv.es/yoma

Electronic supplementary material

Below is the electronic supplementary material.

ESM 1 (PDF 576 KB)

About this article

Cite this article

Marrero-Ponce, Y., Khan, M.T.H., Casañola-Martín, G.M. et al. Bond-based 2D TOMOCOMD-CARDD approach for drug discovery: aiding decision-making in ‘in silico’ selection of new lead tyrosinase inhibitors. J Comput Aided Mol Des 21, 167–188 (2007). https://doi.org/10.1007/s10822-006-9094-7

Received: 23 August 2006
Accepted: 02 December 2006
Published: 28 February 2007
Issue Date: April 2007
DOI: https://doi.org/10.1007/s10822-006-9094-7
