Quasi-SMILES-Based QSPR/QSAR Modeling

Ahmadi, Shahin; Azimi, Neda

doi:10.1007/978-3-031-28401-4_8


Part of the book series: Challenges and Advances in Computational Chemistry and Physics (COCH, volume 33)


Abstract

Quantitative structure–property/activity relationships (QSPRs/QSARs) are used to predict the physicochemical properties and biological activities of substances, on the premise that the property or activity of a new or untested substance can be inferred from the molecular structure or other properties of similar compounds that have already been assessed. Traditional QSPR/QSAR models based on physicochemical properties and molecular information are less successful in predicting endpoints for substances such as nanomaterials, because data measured under the same conditions are scarce. A new approach that uses eclectic information as descriptors to predict endpoints was developed in the CORAL software (http://www.insilico.eu/coral). In this approach, the physicochemical properties and experimental conditions of a substance are represented by so-called quasi-SMILES, which are character-based representations derived from the traditional Simplified Molecular Input Line Entry System (SMILES). A main advantage of quasi-SMILES is therefore that eclectic data can be included in model development, which increases the amount of data available for building quasi-SMILES-based QSPR/QSAR models. This chapter provides instructions on how to use the CORAL software to build QSPR/QSAR models based on quasi-SMILES.


1 Introduction

The quantitative structure–activity/property relationship (QSAR/QSPR) approach is of considerable importance in food chemistry [1, 2], environmental chemistry [3], modern chemistry [4,5,6], biochemistry [7], nanotechnology [8, 9], and drug design [10, 11]. The QSAR/QSPR approach is the mathematical and computerized search for compounds with desired activities/properties using chemical intuition and experience. Once a structure–activity/property correlation has been established, any number of compounds, including those not yet synthesized, can be screened on a computer to select structures with the desired activity/properties. The most promising compounds can then be chosen for synthesis and experimental testing [12]. A QSAR/QSPR study therefore saves cost and time in the development of new molecules as drugs, materials, additives, or for any other purpose. Although finding successful structure–activity models is not an easy task, the recent increase in the number of papers on QSPR/QSAR research indicates rapid evolution in this area. To obtain a significant correlation, it is very important to use appropriate descriptors, whether they are theoretical, empirical, or derived from readily available experimental properties of the structures [12]. Some descriptors reflect simple molecular properties and can therefore give insight into the physicochemical nature of the activity/property under consideration.

Given the growth of nanotechnology, modeling the properties of nanoparticles (NPs) and their toxicity to living organisms is very important [13,14,15]. Because it is difficult to conduct toxicological experiments or measure the physical properties of NPs on a case-by-case basis, QSPR/QSAR is an attractive, computationally efficient alternative that saves time, cost, and animal sacrifice. The first part of implementing a nano-QSPR/QSAR model is data collection (including descriptors and endpoints) and data processing. The dataset can be obtained from the literature, databases, experiments, or a combination of multiple sources, so the data are often measured under different conditions. Therefore, to construct nano-QSPR/QSAR models, it is important to identify a new set of descriptors that can accurately represent the properties of NPs as well as the experimental conditions.

In recent years, descriptors based on the Simplified Molecular Input Line Entry System (SMILES) and on quasi-SMILES have been examined by several researchers for QSPR/QSAR modeling [16,17,18,19]. SMILES represents the molecular structure, whereas quasi-SMILES can represent the molecular structure together with physicochemical properties and exposure conditions [8, 20, 21]. The SMILES of a molecule is based on a set of rules that allow a molecular structure to be written as a sequence of atom and bond symbols; quasi-SMILES appends the physicochemical properties and experimental conditions as a string of characters after the SMILES symbols.

2 Principles of QSPR/QSAR Models

Although QSPR/QSAR modeling has been used for over five decades, many studies still do not follow the Organisation for Economic Co-operation and Development (OECD) guidelines. Figure 8.1 summarizes the best practices for each step of the QSPR/QSAR approach, based on models in the peer-reviewed literature. Dearden et al. have reported a detailed description of common errors in QSPR/QSAR research [22].

**Fig. 8.1** Flowchart of the steps of the QSPR/QSAR approach using models in the peer-reviewed literature. The flowchart is divided into two parts: model training and testing.

According to OECD guidelines, if a QSPR/QSAR study is to be reliable, the following five principles must be met: (i) a well-defined endpoint, (ii) an unambiguous algorithm, (iii) a defined applicability domain (AD), (iv) appropriate measures of goodness-of-fit, robustness, and predictivity, and (v) a mechanistic interpretation, if possible.

3 Monte Carlo Technique for Nano-QSPR/QSAR

3.1 SMILES and Quasi-SMILES

SMILES is a chemical notation system designed by Weininger et al. [23, 24]. Following the principles of molecular graph theory, SMILES uses a very small, natural grammar to specify structural features precisely. The SMILES notation is also suitable for fast machine processing. Quasi-SMILES is an alternative to SMILES for substances whose physicochemical properties and experimental conditions must be taken into account.

3.2 The Main Step for QSPR/QSAR Modeling by SMILES or Quasi-SMILES

The CORrelation And Logic (CORAL) software (http://www.insilico.eu/coral) offers two options for building QSPR/QSAR models: models based on SMILES and models based on quasi-SMILES. The preparation of the input data for the CORAL software is described below.

3.2.1 Dataset Preparation for Models Based on SMILES

A SMILES string represents a two-dimensional molecular graph as a one-dimensional string that can encode the connectivity and chirality of a molecule. In most cases, a single structure can be written as many different valid SMILES strings. Canonical SMILES gives a single 'canonical' form for any particular molecule. The molecular structures of the desired compounds can be transformed to canonical SMILES using software such as Open Babel or the ACD/ChemSketch program. Figure 8.2a, b shows sample data based on SMILES and quasi-SMILES, respectively, as input for the CORAL software. The first column indicates the set, the second is the compound ID, the third is the SMILES/quasi-SMILES, and the last column is the property/activity of interest.

**Fig. 8.2** Input for the CORAL software: (a) data based on SMILES and (b) data based on quasi-SMILES.

3.2.2 Dataset Preparation for Models Based on Quasi-SMILES

To build QSPR/QSAR models for substances studied under different physicochemical properties and/or experimental conditions, one can use quasi-SMILES instead of the SMILES of the molecules. Dataset preparation for quasi-SMILES is the same as for SMILES, except that the SMILES is replaced by the quasi-SMILES.

3.2.3 Quasi-SMILES Definition for Various Datasets/Endpoints

Quasi-SMILES is a sequence of symbols that represents not only the molecular structure but also the different conditions that can affect the endpoint under investigation. Eclectic data can include physical conditions such as temperature, pressure, and the assay used to obtain a property endpoint, or the cell line type, exposure time, concentration, etc. used to obtain an activity. The type and number of eclectic data can differ between datasets.

Quasi-SMILES may be built from eclectic conditions only [4, 13] or from a combination of SMILES and eclectic conditions [5, 8]. Before codes are assigned, continuous eclectic conditions can be normalized with the following equation:

$$ {\text{Norm}}\left( {E_{i} } \right) = \frac{{\min \left( {E_{i} } \right) + E_{i} }}{{\min \left( {E_{i} } \right) + \max \left( {E_{i} } \right)}} $$

(8.1)

where $E_{i}$ is the value of the physicochemical parameter E for the ith sample, min($E_{i}$) is the minimum value of E, and max($E_{i}$) is the maximum value of E.

According to Table 8.1, each parameter takes fewer than 10 unique values; therefore, each parameter can be coded in the quasi-SMILES by a single character, a digit between zero and nine.
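The normalization of Eq. 8.1 followed by a single-digit coding can be sketched as below. This is only an illustration: the function names are hypothetical, and the equal-width binning into classes 1–9 is an assumption, since the actual class boundaries are those given in Table 8.1.

```python
def normalize(values):
    """Eq. 8.1: Norm(E_i) = (min(E) + E_i) / (min(E) + max(E))."""
    lo, hi = min(values), max(values)
    denom = lo + hi
    if denom == 0:
        raise ValueError("min(E) + max(E) is zero; Eq. 8.1 is undefined")
    return [(lo + v) / denom for v in values]


def to_class_digit(norm_value, n_classes=9):
    """Map a normalized value to a class 1..n_classes.

    ASSUMPTION: equal-width bins over [0, 1]; the chapter's real
    boundaries are listed in Table 8.1 and may differ.
    """
    digit = int(norm_value * n_classes) + 1
    return max(1, min(n_classes, digit))  # clamp to the valid range


# Hypothetical nanoparticle sizes (nm), used only to exercise the code.
sizes = [10, 20, 30, 50]
codes = [to_class_digit(v) for v in normalize(sizes)]
print(codes)
```

Note that Eq. 8.1 is not the usual min-max scaling: for positive data the smallest value maps to 2·min/(min + max) rather than to zero.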

Table 8.1 Assignment of normalized physicochemical features to classes 1–9 according to their values

A further development of the CORAL software (CORAL-2020) allows experimental conditions to be displayed as groups of symbols enclosed in parentheses. Table 8.2 compares the codes of the latest version (CORAL-2020) with those of the old version of CORAL for creating quasi-SMILES in recently proposed models for the cytotoxicity of metal oxide NPs [4]. The CORAL-2020 codes are quite transparent and consequently more convenient for the user. Table 8.2 lists the codes used for the cell line, method, exposure time, concentration, nanoparticle size, and metal oxide type. Table 8.3 shows examples of quasi-SMILES built from these codes.

Table 8.2 Codes used for the cell line, method, exposure time, concentration, nanoparticle size, and metal oxide type to convert the various items of experimental information into quasi-SMILES [4]

Table 8.3 Some examples for quasi-SMILES extracted by codes indicated in Table 8.2
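As a rough illustration of how such codes are assembled, the sketch below concatenates an optional SMILES with one delimited group per experimental condition. The condition codes (`cellA`, `24h`, `size3`) are invented placeholders, not the codes of Table 8.2, and the square-bracket delimiters are an assumption that should be checked against the CORAL documentation; the delimiters are therefore left configurable.

```python
def build_quasi_smiles(condition_codes, smiles="", open_="[", close="]"):
    """Append one delimited group per experimental condition to a SMILES.

    ASSUMPTIONS: the delimiter style and the code names are illustrative;
    pass smiles="" for quasi-SMILES built from eclectic conditions only.
    """
    groups = [f"{open_}{code}{close}" for code in condition_codes]
    return smiles + "".join(groups)


# Hypothetical codes for cell line, exposure time and size class.
print(build_quasi_smiles(["cellA", "24h", "size3"], smiles="O=[Ti]=O"))
# prints: O=[Ti]=O[cellA][24h][size3]
```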


Toropov and Toropova developed a QSAR model based on the new version of CORAL for the toxicity of ZnO NPs [14]. The experimental data, taken from the literature, are toxicity assessments of ZnO NPs and of ZnO NPs coated with polyethylene glycol (PEG), administered to rats by intraperitoneal injection (50, 100, 200 mg/kg) for one month. The toxic effects on renal factors, including creatinine, uric acid, and blood urea nitrogen, were measured 15 and 30 days after injection. Table 8.4 shows the quasi-SMILES attributes together with the experimental conditions they encode. Table 8.5 gives examples of quasi-SMILES obtained from these conditions and the related activity.

Table 8.4 Codes used as fragments of quasi-SMILES and their meaning

Table 8.5 Some examples for quasi-SMILES extracted by codes presented in Table 8.4


Toropova et al. developed a new nano-QSAR model based on quasi-SMILES for predicting the toxicity of nano-mixtures to Daphnia magna [25]. Binary mixtures of TiO₂ NPs with one second component, namely AgNO₃, Cd(NO₃)₂, Cu(NO₃)₂, CuSO₄, Na₂HAsO₄, NaAsO₂, benzylparaben, or benzophenone-3, were investigated. The quasi-SMILES contain the following information: (1) the second component of the mixture, represented by SMILES; (2) the core diameter of the TiO₂ NPs; (3) the zeta potential of the TiO₂ NPs; (4) the mole fraction of the TiO₂ NPs; (5) the mole fraction of the mixed substance; and (6) the exposure time. Figure 8.3 shows the transformation of the experimental conditions and substance into quasi-SMILES.

**Fig. 8.3** Transformation of the experimental conditions and substance into quasi-SMILES. The upper half of the image shows the experimental data.

3.2.4 Model Development

Model development consists of several steps, all of which can be carried out within the CORAL software; no additional software is required for data partitioning, descriptor generation, or model validation. The main steps of QSPR/QSAR modeling with the CORAL software are described in the following sections.

3.2.5 Dataset Splitting

After the dataset has been prepared and curated, the next step in building a QSAR/QSPR model for an endpoint with the CORAL software (http://www.insilico.eu/coral) is loading an array of lines. Each line consists of four components.

The first column is the set type: ‘+’, ‘−’, ‘#’, and ‘*’ indicate the active training, passive training, calibration, and validation sets, respectively (Fig. 8.2).

The second column, written immediately after the set type without a space, is the number or ID of the compound.
The third column is the quasi-SMILES.
The last column is the endpoint value.

After the input file has been prepared, the dataset is split randomly into active training, passive training, calibration, and validation sets using the CORAL software, with the desired percentage assigned to each set.
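The four-component line layout and a random assignment of set markers can be sketched as follows. This is not CORAL's own parser: it assumes whitespace-separated fields and a quasi-SMILES without spaces, and the names `Record`, `parse_line`, and `random_split` are hypothetical.

```python
import random
from dataclasses import dataclass

# Set markers as listed in the text; ASCII '-' and the typographic minus
# are both accepted for the passive training set.
SET_MARKERS = {
    "+": "active training",
    "-": "passive training",
    "\u2212": "passive training",
    "#": "calibration",
    "*": "validation",
}


@dataclass
class Record:
    subset: str
    compound_id: str
    quasi_smiles: str
    endpoint: float


def parse_line(line):
    """Parse '<marker><ID> <quasi-SMILES> <endpoint>' (assumed layout)."""
    head, quasi_smiles, value = line.split()
    marker, compound_id = head[0], head[1:]
    if marker not in SET_MARKERS:
        raise ValueError(f"unknown set marker: {marker!r}")
    return Record(SET_MARKERS[marker], compound_id, quasi_smiles, float(value))


def random_split(n, fractions=(0.25, 0.25, 0.25, 0.25), seed=0):
    """Return one marker per compound, assigned at random.

    fractions are the desired shares of '+', '-', '#', '*'; the last set
    receives whatever remains after rounding.
    """
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    markers = [""] * n
    start = 0
    for k, (marker, frac) in enumerate(zip("+-#*", fractions)):
        end = n if k == 3 else min(n, start + round(frac * n))
        for idx in order[start:end]:
            markers[idx] = marker
        start = end
    return markers


print(parse_line("+12 O=[Ti]=O[24h] 1.75"))
print(random_split(8))
```

Fixing the seed makes a given split reproducible, which is useful when several splits are compared later for mechanistic interpretation.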

3.2.6 Monte Carlo Optimization Process

A quasi-SMILES is a group of attributes, and each attribute is assigned a coefficient called a correlation weight. Monte Carlo optimization provides the numerical values of these correlation weights and refines them so that the predictive potential of the model is as high as possible. Figure 8.4 shows the flowchart of one cycle of the Monte Carlo optimization of correlation weights (n is the number of correlation weights that contribute to model construction).

**Fig. 8.4** Flowchart of one cycle of the Monte Carlo optimization of correlation weights.
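The idea of one optimization cycle can be illustrated with a deliberately simplified sketch: each correlation weight is perturbed in turn and the change is kept only if the objective improves. This is not the CORAL algorithm. As labeled assumptions, the objective here is the plain correlation coefficient on a single training set rather than one of the target functions, the descriptor is the sum of the weights of the attributes in a quasi-SMILES, and all names and data are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def dcw(tokens, weights):
    """Descriptor: sum of the correlation weights of the attributes."""
    return sum(weights[t] for t in tokens)


def optimize(compounds, endpoints, n_cycles=50, step=0.1, seed=0):
    """Greedy random search over correlation weights (illustrative only)."""
    rng = random.Random(seed)
    names = sorted({t for toks in compounds for t in toks})
    weights = {t: rng.uniform(-1.0, 1.0) for t in names}

    def score(w):
        return pearson([dcw(toks, w) for toks in compounds], endpoints)

    start = best = score(weights)
    for _ in range(n_cycles):          # one pass over all weights = one cycle
        for t in names:
            old = weights[t]
            weights[t] = old + rng.choice((-step, step))
            new = score(weights)
            if new > best:
                best = new             # keep the improving move
            else:
                weights[t] = old       # otherwise restore the old weight
    return weights, start, best


# Toy data: each compound is a list of attributes (hypothetical).
compounds = [["A", "B"], ["A", "C"], ["B", "C"], ["C", "C"]]
endpoints = [1.0, 2.0, 2.5, 3.5]
weights, start, best = optimize(compounds, endpoints)
print(round(start, 3), round(best, 3))
```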

The CORAL software offers different target functions (TFs) for Monte Carlo optimization [25,26,27,28,29]. Four of them are introduced below:

$$ {\text{TF}}_{0} = r_{{{\text{AT}}}} + r_{{{\text{PT}}}} - \left| {r_{{{\text{AT}}}} - r_{{{\text{PT}}}} } \right| \times C $$

(8.2)

$$ {\text{TF}}_{1} = {\text{TF}}_{0} + {\text{IIC}}_{{\text{C}}} \times W_{{{\text{IIC}}}} $$

(8.3)

$$ {\text{TF}}_{2} = {\text{TF}}_{0} + {\text{CII}}_{{\text{C}}} \times W_{{{\text{CII}}}} $$

(8.4)

$$ {\text{TF}}_{3} = {\text{TF}}_{0} + {\text{IIC}}_{{\text{C}}} \times W_{{{\text{IIC}}}} + {\text{CII}}_{{\text{C}}} \times W_{{{\text{CII}}}} $$

(8.5)

Here, $r_{{{\text{AT}}}}$ and $r_{{{\text{PT}}}}$ are the correlation coefficients between the experimental and predicted endpoints for the active and passive training sets, respectively. The empirical constant C and the weights $W_{{{\text{IIC}}}}$ and $W_{{{\text{CII}}}}$ take fixed numerical values [1, 18, 30,31,32,33].
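A compact way to see how the four target functions relate is the sketch below, which assumes that TF₁, TF₂, and TF₃ each extend TF₀ of Eq. 8.2 by weighted IIC and/or CII terms. The function name and the default values of C and the weights are placeholders, not recommended settings.

```python
def target_function(r_at, r_pt, c=0.1,
                    iic_c=0.0, w_iic=0.0,
                    cii_c=0.0, w_cii=0.0):
    """TF0 (Eq. 8.2) plus optional weighted IIC and CII terms.

    w_iic = w_cii = 0     -> TF0
    w_iic > 0, w_cii = 0  -> TF1
    w_iic = 0, w_cii > 0  -> TF2
    both weights > 0      -> TF3
    ASSUMPTION: the default c and weights are illustrative values only.
    """
    tf0 = r_at + r_pt - abs(r_at - r_pt) * c
    return tf0 + iic_c * w_iic + cii_c * w_cii


tf0 = target_function(0.9, 0.8)
tf1 = target_function(0.9, 0.8, iic_c=0.7, w_iic=0.2)
print(round(tf0, 4), round(tf1, 4))
```

The penalty term |r_AT − r_PT| × C lowers the score when the active and passive training sets are fitted unequally, which discourages overfitting to the active training set.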

IIC_C is the index of ideality of correlation. It is calculated on the calibration set as follows:

$$ {\text{IIC}}_{{\text{C}}} = r_{{\text{C}}} \frac{{{\text{min}}\left( {{^{-}{\text{MAE}}}_{{\text{C}}} ,{^{+}{\text{MAE}}}_{{\text{C}}} } \right)}}{{{\text{max}}\left( {{^{-}{\text{MAE}}}_{{\text{C}}} ,{^{+}{\text{MAE}}}_{{\text{C}}} } \right)}} $$

(8.6)

$${^{-}{\text{MAE}}}_{{\text{C}}} = \frac{1}{{{^{-}N}}}\sum \left| {\Delta_{i} } \right|, {^{-}N}\,{\text{is}}\,{\text{the}}\,{\text{number}}\,{\text{of}}\,\Delta_{i} < 0 $$

(8.7)

$${^{+}{\text{MAE}}}_{{\text{C}}} = \frac{1}{{{^{+}N}}}\sum \left| {\Delta_{i} } \right|, {^{+}N}\,{\text{is}}\,{\text{the}}\,{\text{number}}\,{\text{of}}\,\Delta_{i} \ge 0 $$

(8.8)

$$ \Delta_{i} = {\text{Obs}}_{i} - {\text{Calc}}_{i} $$

(8.9)

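Here, ${\text{Obs}}_{i}$ and ${\text{Calc}}_{i}$ are the experimental and predicted endpoint values for the $i{\text{th}}$ compound, and $r_{{\text{C}}}$ is the correlation coefficient between them for the calibration set.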
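The index of ideality of correlation defined by Eqs. 8.6–8.9 can be computed as in the following sketch. The handling of the case in which all residuals have the same sign (returning 0.0) is an assumption made here so that the ratio stays defined; the text does not specify it.

```python
def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def iic(obs, calc):
    """IIC = r * min(-MAE, +MAE) / max(-MAE, +MAE) on one set."""
    deltas = [o - c for o, c in zip(obs, calc)]      # Eq. 8.9
    neg = [abs(d) for d in deltas if d < 0]          # residuals < 0
    pos = [abs(d) for d in deltas if d >= 0]         # residuals >= 0
    if not neg or not pos:
        return 0.0  # ASSUMPTION: one-sided residuals give IIC = 0
    mae_neg = sum(neg) / len(neg)                    # Eq. 8.7
    mae_pos = sum(pos) / len(pos)                    # Eq. 8.8
    return pearson(obs, calc) * min(mae_neg, mae_pos) / max(mae_neg, mae_pos)


# Hypothetical calibration-set values.
obs = [1.0, 2.0, 3.0, 4.0]
balanced = [1.1, 1.9, 3.2, 3.8]   # over- and under-prediction balanced
skewed = [0.5, 2.1, 3.1, 4.1]     # one large positive residual
print(round(iic(obs, balanced), 3), round(iic(obs, skewed), 3))
```

The MAE ratio is close to one only when positive and negative residuals are of similar size, so the index rewards models that are both well correlated and unbiased.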

The correlation intensity index (CII), like the IIC, was developed to improve the quality of the Monte Carlo optimization used to build QSPR/QSAR models. CII is formulated as follows:

$$ {\text{CII}} = 1 - \sum\limits_{{\Delta R_{i}^{2} > 0}} \Delta R_{i}^{2} ,\quad {\text{i.e.}}\,{\text{if}}\,\Delta R_{i}^{2} < 0\,\,{\text{then}}\,\Delta R_{i}^{2} = 0 $$

(8.10)

$$ \Delta R_{i}^{2} = R_{i}^{2} - R^{2} $$

(8.11)

where R² is the coefficient of determination for the whole set and $R_{i}^{2}$ is the coefficient of determination for the set without the ith compound. Therefore, if $\Delta R_{i}^{2}$ is greater than zero, the ith compound acts in ‘opposition’ to the correlation between the experimental and calculated values of the set, because removing it improves the correlation.

A small sum of $\Delta R_{i}^{2}$ means a more ‘intensive’ correlation.
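A leave-one-out reading of Eqs. 8.10 and 8.11 can be sketched as below. As an assumption, the coefficient of determination is taken here as the squared Pearson correlation between observed and calculated values; the function names are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation; 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def cii(obs, calc):
    """CII = 1 - sum of positive (R_i^2 - R^2) over all compounds i."""
    obs, calc = list(obs), list(calc)
    r2_all = r_squared(obs, calc)
    total = 0.0
    for i in range(len(obs)):
        # R_i^2: coefficient of determination without the ith compound
        r2_i = r_squared(obs[:i] + obs[i + 1:], calc[:i] + calc[i + 1:])
        delta = r2_i - r2_all                        # Eq. 8.11
        if delta > 0:                                # negative terms count as 0
            total += delta
    return 1.0 - total


# Hypothetical data: the last compound is badly predicted.
print(cii([1, 2, 3, 4, 5], [1.1, 2.0, 2.9, 4.2, 3.0]))
```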

The CORAL model for an endpoint (EP) is defined by the following equation:

$$ {\text{EP}} = C_{0} + C_{1} \times {\text{DCW}}\left( {T,N} \right) $$

(8.12)

C₀ and C₁ are regression coefficients, T is a threshold, and N is the number of optimization cycles. DCW(T, N) is defined by the following equation:

$$ {\text{DCW}}\left( {T,N} \right) = \sum {\text{CW}}\left( {S_{k} } \right) $$

(8.13)

where S_k is a symbol (attribute) of a quasi-SMILES line and CW(S_k) is the correlation weight of S_k.
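Equations 8.12 and 8.13 amount to a one-variable linear model on a sum of correlation weights, as in the sketch below. The tokenizer is an assumption: it treats each square-bracketed group and each remaining character as one attribute S_k, and attributes without a stored weight (for example, blocked ones) contribute zero. The weights and coefficients shown are invented numbers.

```python
import re

# ASSUMPTION: an attribute is either a [bracketed group] or a single character.
TOKEN = re.compile(r"\[[^\]]*\]|.")


def tokenize(quasi_smiles):
    """Split a quasi-SMILES line into attributes S_k."""
    return TOKEN.findall(quasi_smiles)


def dcw(quasi_smiles, weights):
    """Eq. 8.13: sum of CW(S_k); unknown or blocked attributes add 0."""
    return sum(weights.get(tok, 0.0) for tok in tokenize(quasi_smiles))


def predict(quasi_smiles, weights, c0, c1):
    """Eq. 8.12: EP = C0 + C1 * DCW(T, N)."""
    return c0 + c1 * dcw(quasi_smiles, weights)


# Hypothetical correlation weights and regression coefficients.
weights = {"O": 0.5, "=": -0.1, "[Ti]": 1.0, "[24h]": 0.3}
print(tokenize("O=[Ti]=O[24h]"))
# prints: ['O', '=', '[Ti]', '=', 'O', '[24h]']
print(predict("O=[Ti]=O[24h]", weights, c0=1.0, c1=2.0))
```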

3.2.7 Applicability Domain

The AD of QSPR/QSAR models in the CORAL software is determined in two steps, based on the distribution of SMILES or quasi-SMILES features in the training and calibration sets:

Step 1: the statistical defect (d_k) is calculated for each (unblocked) SMILES or quasi-SMILES feature (S_k) involved in building the model, using the following equation:

$$ d_{k} = \frac{{\left| {P\left( {S_{k} } \right) - P^{\prime}\left( {S_{k} } \right)} \right|}}{{N\left( {S_{k} } \right) + N^{\prime}\left( {S_{k} } \right)}} $$

(8.14)

Here, P(S_k) and P′(S_k) are the probabilities of S_k in the active training and calibration sets, respectively; N(S_k) and N′(S_k) are the frequencies of S_k in the active training and calibration sets, respectively.

Step 2: the statistical defect of each quasi-SMILES (D_i) is defined according to the following equation:

$$ D_{i} = \mathop \sum \limits_{k = 1}^{{N_{{\text{A}}} }} d_{k} $$

(8.15)

Here, N_A is the number of non-blocked features in the quasi-SMILES.

A quasi-SMILES falls within the AD if:

$$ D_{i} < 2 \times \overline{D} $$

(8.16)

where $\overline{D}$ represents average statistical defect of the training set.
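The two-step applicability-domain check of Eqs. 8.14–8.16 can be sketched as follows. Several details are assumptions made for illustration: the probability of a feature is taken as the fraction of quasi-SMILES in a set that contain it, each distinct feature of a quasi-SMILES is counted once, and a feature never seen in either set is given a defect of 1.0.

```python
def feature_defects(train, calib):
    """Eq. 8.14 for every feature seen in the training or calibration set.

    train, calib: lists of attribute lists (one list per quasi-SMILES).
    """
    features = {t for toks in train + calib for t in toks}
    defects = {}
    for f in features:
        n_train = sum(f in toks for toks in train)   # N(S_k)
        n_calib = sum(f in toks for toks in calib)   # N'(S_k)
        p_train = n_train / len(train)               # P(S_k), assumed definition
        p_calib = n_calib / len(calib)               # P'(S_k)
        defects[f] = abs(p_train - p_calib) / (n_train + n_calib)
    return defects


def compound_defect(tokens, defects, unseen=1.0):
    """Eq. 8.15: sum of d_k over the distinct features of one quasi-SMILES."""
    return sum(defects.get(t, unseen) for t in set(tokens))


def in_domain(tokens, train, defects):
    """Eq. 8.16: D_i < 2 * mean defect of the training set."""
    mean_d = sum(compound_defect(t, defects) for t in train) / len(train)
    return compound_defect(tokens, defects) < 2 * mean_d


# Hypothetical attribute lists.
train = [["A", "B"], ["A", "C"]]
calib = [["A", "B"], ["B", "C"]]
d = feature_defects(train, calib)
print(in_domain(["A", "B"], train, d), in_domain(["A", "Z"], train, d))
```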

3.2.8 Model Validation

Validation, the fourth OECD principle, is an intrinsic part of checking the robustness, predictivity, and reliability of any QSPR/QSAR model. The CORAL software offers three approaches for examining the robustness, reliability, and predictive potential of QSPR/QSAR models:

Internal validation
External validation
Y-scrambling or data randomization.

Various statistical criteria are calculated to validate the QSPR/QSAR models built by Monte Carlo optimization in the CORAL software: the determination coefficient (R²), concordance correlation coefficient (CCC), cross-validated correlation coefficient (Q²), $Q_{F1}^{2}$, $Q_{F2}^{2}$, $Q_{F3}^{2}$, standard error of estimation (s), mean absolute error (MAE), Fisher ratio (F), root-mean-square error (RMSE), $R_{{\text{m}}}^{2}$, and the average of the $R_{{\text{m}}}^{2}$ metric ($\overline{{R_{{\text{m}}}^{2} }}$). Table 8.6 gives the mathematical equations of these statistical benchmarks of the predictive potential of CORAL models.
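A few of these criteria can be computed directly from observed and calculated endpoint values, as in the sketch below. It covers only R², MAE, RMSE, and CCC, using their standard textbook definitions (R² as the squared Pearson correlation, CCC after Lin); the exact forms used by CORAL are those of Table 8.6, and the function name is hypothetical.

```python
def validation_metrics(obs, calc):
    """Return R2, MAE, RMSE and CCC for paired observed/calculated values."""
    n = len(obs)
    mean_o, mean_c = sum(obs) / n, sum(calc) / n
    var_o = sum((o - mean_o) ** 2 for o in obs) / n
    var_c = sum((c - mean_c) ** 2 for c in calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(obs, calc)) / n
    errors = [o - c for o, c in zip(obs, calc)]

    r2 = cov * cov / (var_o * var_c) if var_o > 0 and var_c > 0 else 0.0
    mae = sum(abs(e) for e in errors) / n
    rmse = (sum(e * e for e in errors) / n) ** 0.5
    denom = var_o + var_c + (mean_o - mean_c) ** 2
    ccc = 2 * cov / denom if denom > 0 else 0.0   # degenerate case set to 0
    return {"R2": r2, "MAE": mae, "RMSE": rmse, "CCC": ccc}


# A constant offset keeps R2 at 1 but lowers CCC, which also penalizes bias.
print(validation_metrics([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```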

Table 8.6 Mathematical formulation of different statistical benchmark of the predictive potential for CORAL models


3.2.9 Mechanistic Interpretation

The fifth OECD principle concerns the mechanistic interpretation of the QSPR/QSAR model, where possible. Model interpretation is used to identify the critical attributes that influence the endpoint; new compounds can then be designed on the basis of these attributes. In QSPR/QSAR modeling with the CORAL software, the same structural attributes (S_k) collected from three or more different splits are used for the mechanistic interpretation [39,40,41,42]. Following previous studies, these structural attributes (S_k) are divided into three categories:

Increasing factor, if CW(S_k) is positive in all splits and in all three attempts,
Decreasing factor, if CW(S_k) is negative in all splits and in all three attempts,
Undefined attribute, if CW(S_k) takes both positive and negative values [43,44,45].
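The three categories above reduce to a sign check over the correlation weights collected for one attribute across runs, as in this minimal sketch. The function name and the example weights are hypothetical, and treating a zero weight as 'undefined' is an assumption.

```python
def classify_attribute(correlation_weights):
    """Classify one attribute from its CW values across splits and attempts."""
    if all(w > 0 for w in correlation_weights):
        return "increase"
    if all(w < 0 for w in correlation_weights):
        return "decrease"
    return "undefined"   # mixed signs (or zeros, by assumption)


# Hypothetical CW values from three splits for three attributes.
runs = {
    "[24h]": [0.42, 0.18, 0.77],
    "[size3]": [-0.31, -0.05, -0.66],
    "O": [0.12, -0.20, 0.35],
}
for attribute, cws in runs.items():
    print(attribute, classify_attribute(cws))
```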

4 Examples of Quasi-SMILES-Based QSPR/QSAR Models

Some examples of QSAR/QSPR models based on quasi-SMILES, built with the CORAL software using different TFs, are presented in Table 8.7.

Table 8.7 Some examples of QSAR/QSPR models based on quasi-SMILES with CORAL software using different TFs

5 Conclusion and Future Direction

QSPR/QSAR modeling based on SMILES and quasi-SMILES with the CORAL software is useful for large datasets. In the CORAL software, QSPR/QSAR modeling generally follows the five OECD principles. In addition, further principles may be defined in practice for nano-QSPR/QSAR to reflect the nature of the nanomaterial under investigation. For example, such principles should take into account the test conditions and the quality of the equipment used.

Building QSPR/QSAR models for nanomaterials under different conditions with the CORAL software is simple, and the resulting models are easy to apply for prediction and to interpret. Several TFs (TF₀–TF₃) are available for finding reliable correlation weights, which is one of the important capabilities of CORAL for building good QSPR/QSAR models. The type and number of input features can change the performance of a QSAR/QSPR model. The CORAL software has one shortcoming, however: the user can use only the descriptors of the CORAL software, and it is not possible to add descriptors produced by other descriptor generators.

In the CORAL software, the Monte Carlo algorithm is the only method available for finding correlation weights; the use of other algorithms could improve the performance of quasi-SMILES-based QSPR/QSAR models. Data splitting in the CORAL software is random; the option of using other splitting methods could increase the validity of the models. Because the correlation weights of the descriptors are calculated by a stochastic Monte Carlo approach, different runs give different models, and consensus modeling over these models can improve the predictions.

Abbreviations

AD: Applicability Domain
CCC: Concordance Correlation Coefficient
CORAL: CORrelation And Logic
CII: Correlation Intensity Index
EP: Endpoint
F: Fisher ratio
IIC: Index of Ideality of Correlation
MAE: Mean Absolute Error
NPs: Nanoparticles
OECD: Organisation for Economic Co-operation and Development
QSAR: Quantitative Structure–Activity Relationship
QSPR: Quantitative Structure–Property Relationship
RMSE: Root-Mean-Square Error
SMILES: Simplified Molecular Input Line Entry System
TF: Target Function

References

1. Ahmadi S, Ghanbari H, Lotfi S, Azimi N (2021) Mol Divers 25(1):87–97. https://doi.org/10.1007/s11030-019-10026-9
2. Achary PGR, Toropova AP, Toropov AA (2019) Food Res Int 122:40–46. https://doi.org/10.1016/j.foodres.2019.03.067
3. Kumar A, Kumar P (2021) J Hazard Mater 402:123777. https://doi.org/10.1016/j.jhazmat.2020.123777
4. Ahmadi S (2020) Chemosphere 242:125192. https://doi.org/10.1016/j.chemosphere.2019.125192
5. Ahmadi S, Ketabi S, Qomi M (2022) New J Chem 46:8827–8837. https://doi.org/10.1039/D2NJ00596D
6. Lotfi S, Ahmadi S, Kumar P (2021) RSC Adv 11:33849–33857. https://doi.org/10.1039/D1RA06861J
7. Ahmadi S, Khazaei MR, Abdolmaleki A (2014) Med Chem Res 23:1148–1161. https://doi.org/10.1007/s00044-013-0716-z
8. Ahmadi S, Aghabeygi S, Farahmandjou M, Azimi N (2021) Struct Chem 32:1893–1905. https://doi.org/10.1007/s11224-021-01748-4
9. Kumar P, Kumar A (2021) Nanotoxicology 15:1199–1214. https://doi.org/10.1080/17435390.2021.2008039
10. Ghasedi N, Ahmadi S, Ketabi S, Almasirad A (2022) J Recept Signal Transduct 42:418–428. https://doi.org/10.1080/10799893.2021.1988971
11. Ahmadi S, Moradi Z, Kumar A, Almasirad A (2022) J Recept Signal Transduct 42:361–372. https://doi.org/10.1080/10799893.2021.1957932
12. Karelson M, Lobanov VS, Katritzky AR (1996) Chem Rev 96:1027–1044. https://doi.org/10.1021/cr950202r
13. Ahmadi S, Toropova AP, Toropov AA (2020) Nanotoxicology 14:1118–1126. https://doi.org/10.1080/17435390.2020.1808252
14. Toropov AA, Toropova AP (2021) Sci Total Environ 772:145532. https://doi.org/10.1016/j.scitotenv.2021.145532
15. Toropov AA, Toropova AP (2020) Sci Total Environ 737:139720. https://doi.org/10.1016/j.scitotenv.2020.139720
16. Ahmadi S, Akbari A (2018) SAR QSAR Environ Res 29:895–909. https://doi.org/10.1080/1062936X.2018.1526821
17. Lotfi S, Ahmadi S, Kumar P (2022) RSC Adv 12:24988–24997. https://doi.org/10.1039/D2RA03936B
18. Jafari K, Fatemi MH, Toropova AP, Toropov AA (2022) Chemom Intell Lab Syst 222:104500. https://doi.org/10.1016/j.chemolab.2022.104500
19. Toropov A, Toropova A, Lombardo A, Roncaglioni A, Lavado G, Benfenati E (2021) SAR QSAR Environ Res 32:463–471. https://doi.org/10.1080/1062936X.2021.1914156
20. Choi J-S, Trinh TX, Yoon T-H, Kim J, Byun H-G (2019) Chemosphere 217:243–249. https://doi.org/10.1016/j.chemosphere.2018.11.014
21. Lotfi S, Ahmadi S, Zohrabi P (2020) Struct Chem 31:2257–2270. https://doi.org/10.1007/s11224-020-01568-y
22. Dearden JC, Cronin MTD, Kaiser KLE (2009) SAR QSAR Environ Res 20:241–266. https://doi.org/10.1080/10629360902949567
23. Weininger D (1988) J Chem Inf Comput Sci 28:31–36. https://doi.org/10.1021/ci00057a005
24. Weininger D, Weininger A, Weininger JL (1989) J Chem Inf Comput Sci 29:97–101. https://doi.org/10.1021/ci00062a008
25. Toropova AP, Toropov AA, Fjodorova N (2022) NanoImpact 28:100427. https://doi.org/10.1016/j.impact.2022.100427
26. Kumar P, Kumar A, Lal S, Singh D, Lotfi S, Ahmadi S (2022) J Mol Struct 1265:133437. https://doi.org/10.1016/j.molstruc.2022.133437
27. Azimi A, Ahmadi S, Kumar A, Qomi M, Almasirad A (2022) Polycycl Aromat Compd 1–21. https://doi.org/10.1080/10406638.2022.2067194
28. Ahmadi S, Lotfi S, Afshari S, Kumar P, Ghasemi E (2021) SAR QSAR Environ Res 32:1013–1031. https://doi.org/10.1080/1062936X.2021.2003429
29. Ahmadi S, Mehrabi M, Rezaei S, Mardafkan N (2019) J Mol Struct 1191:165–174. https://doi.org/10.1016/j.molstruc.2019.04.103
30. Nimbhal M, Bagri K, Kumar P, Kumar A (2020) Struct Chem 31:831–839. https://doi.org/10.1007/s11224-019-01468-w
31. Toropova AP, Duchowicz PR, Saavedra LM, Castro EA, Toropov AA (2020) Mol Inform 39:1900070. https://doi.org/10.1002/minf.201900070
32. Toropova AP, Toropov AA, Carnesecchi E, Benfenati E, Dorne JL (2020) Environ Sci Pollut Res 27:13339–13347. https://doi.org/10.1007/s11356-020-07820-6
33. Kumar P, Kumar A (2021) J Mol Struct 1246:131205. https://doi.org/10.1016/j.molstruc.2021.131205
34. Shayanfar A, Shayanfar S (2014) Eur J Pharm Sci 59:31–35. https://doi.org/10.1016/j.ejps.2014.03.007
35. Consonni V, Ballabio D, Todeschini R (2009) J Chem Inf Model 49:1669–1678. https://doi.org/10.1021/ci900115y
36. Roy K, Kar S (2014) Eur J Pharm Sci 62:111–114. https://doi.org/10.1016/j.ejps.2014.05.019
37. Lin LI-K (1992) Biometrics 48:599. https://doi.org/10.2307/2532314
38. Rücker C, Rücker G, Meringer M (2007) J Chem Inf Model 47:2345–2357. https://doi.org/10.1021/ci700157b
39. Manisha, Chauhan S, Kumar P, Kumar A (2019) SAR QSAR Environ Res 30:145–159. https://doi.org/10.1080/1062936X.2019.1568299
40. Kumar P, Kumar A, Sindhu J, Lal S (2019) Drug Res (Stuttg) 69:159–167. https://doi.org/10.1055/a-0652-5290
41. Kumar P, Kumar A, Sindhu J (2019) SAR QSAR Environ Res 30:63–80. https://doi.org/10.1080/1062936X.2018.1564067
42. Kumar P, Kumar A, Sindhu J (2019) SAR QSAR Environ Res 30:525–541. https://doi.org/10.1080/1062936X.2019.1629998
43. Toropov AA, Toropova AP, Benfenati E, Gini G, Leszczynska D, Leszczynski J (2012) Anticancer Agents Med Chem 12:807–817. https://doi.org/10.2174/187152012802650255
44. Nesmerak K, Toropov AA, Toropova AP, Kohoutova P, Waisser K (2013) Eur J Med Chem 67:111–114. https://doi.org/10.1016/j.ejmech.2013.05.031
45. Veselinović AM, Milosavljević JB, Toropov AA, Nikolić GM (2013) Eur J Pharm Sci 48:532–541. https://doi.org/10.1016/j.ejps.2012.12.021
46. Toropov AA, Kjeldsen F, Toropova AP (2022) Chemosphere 303:135086. https://doi.org/10.1016/j.chemosphere.2022.135086
47. Trinh TX, Choi J-S, Jeon H, Byun H-G, Yoon T-H, Kim J (2018) Chem Res Toxicol 31:183–190. https://doi.org/10.1021/acs.chemrestox.7b00303
48. Leone C, Bertuzzi EE, Toropova AP, Toropov AA, Benfenati E (2018) Chemosphere 210:52–56. https://doi.org/10.1016/j.chemosphere.2018.06.161
49. Toropova AP, Toropov AA, Benfenati E, Korenstein R, Leszczynska D, Leszczynski J (2015) Environ Sci Pollut Res 22:745–757. https://doi.org/10.1007/s11356-014-3566-4


Author information

Authors and Affiliations

Department of Chemistry, Faculty of Pharmaceutical Chemistry, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
Shahin Ahmadi
Advanced Chemical Engineering Research Center, Razi University, Kermanshah, Iran
Neda Azimi


Corresponding author

Correspondence to Shahin Ahmadi.

Editor information

Editors and Affiliations

Department of Environmental Health Science, Institute of Pharmacological Research Mario Negri IRCCS, Milan, Italy
Alla P. Toropova
Department of Environmental Health Science, Institute of Pharmacological Research Mario Negri IRCCS, Milan, Italy
Andrey A. Toropov


About this chapter

Cite this chapter

Ahmadi, S., Azimi, N. (2023). Quasi-SMILES-Based QSPR/QSAR Modeling. In: Toropova, A.P., Toropov, A.A. (eds) QSPR/QSAR Analysis Using SMILES and Quasi-SMILES. Challenges and Advances in Computational Chemistry and Physics, vol 33. Springer, Cham. https://doi.org/10.1007/978-3-031-28401-4_8


DOI: https://doi.org/10.1007/978-3-031-28401-4_8
Published: 11 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-28400-7
Online ISBN: 978-3-031-28401-4
eBook Packages: Chemistry and Materials Science; Chemistry and Material Science (R0)
