Use of the index of ideality of correlation to improve models of eco-toxicity

Toropova, Alla P.; Toropov, Andrey A.

doi:10.1007/s11356-018-3291-5

Short Research and Discussion Article
Published: 25 September 2018

Volume 25, pages 31771–31775, (2018)

Environmental Science and Pollution Research

Abstract

Persistent organic pollutants are compounds used for various everyday purposes, such as personal care products, food, pesticides, and pharmaceuticals. Decomposition of a considerable part of these pollutants is a long process. Under such circumstances, estimating the toxicity of large arrays of organic substances in this category is a necessary task of theoretical chemistry. The CORAL software is a tool to establish quantitative structure-activity relationships (QSARs). The index of ideality of correlation (IIC) has been suggested as a criterion of the predictive potential of a QSAR. The statistical quality of models for the eco-toxicity of organic pollutants built with use of the IIC is better than that of models built without use of the IIC.

Introduction

Data on the eco-toxicity of nonreactive organic pollutants (personal care products, food, pesticides, and pharmaceuticals) are important for the development and improvement of chemical technology (Concu et al. 2017; Castillo-Garit et al. 2016; Kleandrova et al. 2014a, b). Exposure of the aquatic environment (Baun et al. 2000; Sánchez-Bayo 2006; Parvez et al. 2008) and of air (Raevsky et al. 2011) to chemical contaminants poses serious threats to the preservation of environmental quality and to human health, and is recognized as a global problem (Kleandrova et al. 2014a, b; Castillo-Garit et al. 2008; Papa et al. 2005; de Morais e Silva et al. 2018). In addition, ionic liquids are an important class of organic pollutants owing to their everyday use (Peric et al. 2015; Ma et al. 2015). Another source of eco-toxic pollutants is the massive use of petroleum-derived organic solvents (Perales et al. 2017). Finally, nanomaterials are becoming an additional source of eco-toxic effects (Nowack and Mitrano 2018). Thus, the development of databases, together with predictive models, of eco-toxicity data for nonreactive pollutants is becoming an important task of biochemistry and medicinal chemistry.

The aim of this study is to evaluate the CORAL software (Toropova and Toropov 2014) as a possible tool for building predictive models of eco-toxicity. The index of ideality of correlation (IIC) (Toropova and Toropov 2017; Toropov and Toropova 2017; Toropov et al. 2018; Toropov and Toropova 2018) is examined as a criterion of the predictive potential of the CORAL model of eco-toxicity.

Method

Data

The experimental values of EC50 (effective molar concentration, mol/L) are represented by the negative decimal logarithm, pEC50. The data were taken from the literature (de Morais e Silva et al. 2018). These numerical data (n = 111) were randomly distributed into the training (n = 28), invisible training (n = 27), calibration (n = 29), and external validation (n = 27) sets. Table 1 confirms that the percentage of identically distributed compounds in the random splits examined is not large.
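A random distribution of this kind can be sketched as follows. This is only an illustration under stated assumptions: the helper name `split_dataset`, the seed, and the use of integer identifiers in place of the 111 compounds are hypothetical and are not taken from the CORAL software.

```python
import random

SET_NAMES = ("training", "invisible_training", "calibration", "validation")


def split_dataset(ids, sizes=(28, 27, 29, 27), seed=0):
    """Randomly partition compound identifiers into four disjoint sets."""
    ids = list(ids)
    if sum(sizes) != len(ids):
        raise ValueError("set sizes must add up to the number of compounds")
    # A private generator keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(ids)
    split, start = {}, 0
    for name, size in zip(SET_NAMES, sizes):
        split[name] = ids[start:start + size]
        start += size
    return split


# Hypothetical identifiers 0..110 stand in for the 111 compounds.
split = split_dataset(range(111), seed=1)
print({name: len(members) for name, members in split.items()})
# {'training': 28, 'invisible_training': 27, 'calibration': 29, 'validation': 27}
```

Calling the helper with different seeds gives different splits, which is one way to obtain several random splits of the same data.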

Table 1 Percentage of identical distribution of compounds into the training, invisible training, calibration, and validation sets

Optimal descriptor

The optimal descriptor (Toropova and Toropov 2014) used here is calculated as follows:

$$ DCW\left({T}^{\ast },{N}^{\ast}\right)=\sum \limits_{k=1}^{NA} CW\left({S}_k\right)+\sum \limits_{k=1}^{NA-1} CW\left(S{S}_k\right) $$

(1)

Here S_k is a “SMILES-atom,” i.e., one symbol (e.g., “C,” “N,” and “O”) or two symbols which cannot be examined separately (e.g., “Cl” and “Si”); SS_k is a combination of two SMILES-atoms. CW(S_k) and CW(SS_k) are the so-called correlation weights of these SMILES attributes. The numerical values of CW(S_k) and CW(SS_k) are calculated with the Monte Carlo method, i.e., an optimization procedure which gives the maximal value of a target function (TF).
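The calculation in Eq. 1 can be sketched in a few lines. This is a simplified illustration, not the CORAL implementation: the list of two-symbol SMILES-atoms, the correlation weights, the function names, and the choice to give unlisted attributes a weight of zero are all assumptions made for the example.

```python
# Assumption: a short illustrative list of two-symbol SMILES-atoms.
TWO_SYMBOL_ATOMS = ("Cl", "Br", "Si")


def smiles_atoms(smiles):
    """Split a SMILES string into SMILES-atoms (S_k)."""
    atoms, i = [], 0
    while i < len(smiles):
        if smiles[i:i + 2] in TWO_SYMBOL_ATOMS:
            atoms.append(smiles[i:i + 2])
            i += 2
        else:
            atoms.append(smiles[i])
            i += 1
    return atoms


def dcw(smiles, cw):
    """Sum correlation weights over S_k (NA terms) and SS_k (NA - 1 terms)."""
    atoms = smiles_atoms(smiles)
    pairs = list(zip(atoms, atoms[1:]))  # neighbouring SMILES-atoms
    # Assumption: attributes without a listed weight contribute 0.0.
    return (sum(cw.get(a, 0.0) for a in atoms)
            + sum(cw.get(p, 0.0) for p in pairs))


# Hypothetical correlation weights, chosen only for illustration.
weights = {"C": 0.5, "Cl": 1.0, ("C", "C"): 0.25, ("C", "Cl"): -0.5}
print(smiles_atoms("CCCl"))   # ['C', 'C', 'Cl']
print(dcw("CCCl", weights))   # 1.75
```

In the actual procedure the weights are not fixed by hand but are the quantities adjusted by the Monte Carlo optimization.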

The QSAR models are calculated here by Monte Carlo optimization of two target functions, TF₁ and TF₂:

$$ {TF}_1={r}_{TRN}+{r}_{iTRN}-\left|{r}_{TRN}-{r}_{iTRN}\right|\ast 0.1 $$

(2)

$$ {TF}_2={TF}_1+{IIC}_{CLB}\ast 0.1 $$

(3)

Here r_TRN and r_iTRN are the correlation coefficients between the observed and predicted endpoint for the training and invisible training sets, respectively.
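As a small numerical illustration of Eqs. 2 and 3, the two target functions can be written as plain functions. The sketch assumes that the first term of Eq. 3 is TF₁ (the only other target function defined here) and takes IIC_CLB of Eq. 4 as a ready-made input; the function names and the input values are hypothetical.

```python
def tf1(r_trn, r_itrn):
    """Eq. 2: reward high correlations, penalise their disagreement."""
    return r_trn + r_itrn - abs(r_trn - r_itrn) * 0.1


def tf2(r_trn, r_itrn, iic_clb):
    """Eq. 3 (assuming its first term is TF1): add a bonus for the IIC."""
    return tf1(r_trn, r_itrn) + iic_clb * 0.1


# Hypothetical values, for illustration only.
print(tf1(0.9, 0.8))        # close to 1.69
print(tf2(0.9, 0.8, 0.7))   # close to 1.76
```

The penalty term in TF₁ favours weights that perform similarly on the training and invisible training sets, while the extra term in TF₂ additionally favours weights that give a high IIC on the calibration set.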

The IIC_CLB is calculated with data on the calibration (CLB) set as follows:

$$ {IIC}_{CLB}={r}_{CLB}\frac{\min \left({}^{-}{MAE}_{CLB},{}^{+}{MAE}_{CLB}\right)}{\max \left({}^{-}{MAE}_{CLB},{}^{+}{MAE}_{CLB}\right)} $$

(4)

$$ {}^{-}{MAE}_{CLB}=\frac{1}{{}^{-}N}\sum \limits_{k=1}^{{}^{-}N}\left|{\varDelta}_k\right|,\quad {\varDelta}_k<0;\quad {}^{-}N\ \mathrm{is}\ \mathrm{the}\ \mathrm{number}\ \mathrm{of}\ {\varDelta}_k<0 $$

(5)

$$ {}^{+}{MAE}_{CLB}=\frac{1}{{}^{+}N}\sum \limits_{k=1}^{{}^{+}N}\left|{\varDelta}_k\right|,\quad {\varDelta}_k\ge 0;\quad {}^{+}N\ \mathrm{is}\ \mathrm{the}\ \mathrm{number}\ \mathrm{of}\ {\varDelta}_k\ge 0 $$

(6)

$$ {\varDelta}_k={\mathrm{observed}}_k-{\mathrm{calculated}}_k $$

(7)

Here observed_k and calculated_k are the corresponding values of pEC50 for the k-th compound.
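Equations 4–7 can be followed step by step in the sketch below. It is an illustration, not the CORAL code: the function names and the numerical values are hypothetical, and the handling of two edge cases (an empty group of residuals is given a mean absolute error of zero, and a perfect fit returns the correlation coefficient itself) is an assumption, since these cases are not specified above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def iic(observed, calculated):
    """Index of ideality of correlation for one set (Eqs. 4-7)."""
    r = pearson_r(observed, calculated)
    deltas = [o - c for o, c in zip(observed, calculated)]   # Eq. 7
    neg = [abs(d) for d in deltas if d < 0]                  # Eq. 5
    pos = [abs(d) for d in deltas if d >= 0]                 # Eq. 6
    # Assumption: an empty group contributes a mean absolute error of 0.
    mae_neg = sum(neg) / len(neg) if neg else 0.0
    mae_pos = sum(pos) / len(pos) if pos else 0.0
    larger = max(mae_neg, mae_pos)
    if larger == 0.0:   # assumption: perfect fit, no asymmetry to penalise
        return r
    return r * min(mae_neg, mae_pos) / larger                # Eq. 4


# Hypothetical pEC50 values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
balanced = [1.5, 1.5, 3.5, 3.5]   # residuals -0.5, 0.5, -0.5, 0.5
skewed = [0.9, 1.9, 2.9, 4.3]     # residuals 0.1, 0.1, 0.1, -0.3
print(iic(obs, balanced), pearson_r(obs, balanced))
print(iic(obs, skewed), pearson_r(obs, skewed))
```

For the balanced residuals the two mean absolute errors coincide and the IIC equals the correlation coefficient; for the skewed residuals the IIC falls to about one third of the correlation coefficient, even though the correlation itself is higher.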

Having the numerical data on the CW(S_k) and CW(SS_k), the predictive model is calculated by the least squares method with compounds from the training set:

$$ p{EC}_{50}={C}_0+{C}_1\ast DCW\left({T}^{\ast },{N}^{\ast}\right) $$

(8)
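The least squares step of Eq. 8 is an ordinary one-variable regression, which can be sketched as follows. The function names and the DCW and pEC50 values are hypothetical and serve only to show how C₀ and C₁ are obtained and then used for prediction.

```python
def fit_line(dcw_values, pec50_values):
    """Ordinary least squares for pEC50 = C0 + C1 * DCW."""
    n = len(dcw_values)
    mean_x = sum(dcw_values) / n
    mean_y = sum(pec50_values) / n
    sxx = sum((x - mean_x) ** 2 for x in dcw_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dcw_values, pec50_values))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    return c0, c1


def predict(c0, c1, dcw_value):
    """Apply the fitted model to a new descriptor value."""
    return c0 + c1 * dcw_value


# Hypothetical training-set values, for illustration only.
train_dcw = [0.0, 2.0, 4.0, 6.0]
train_pec50 = [1.5, 2.5, 3.5, 4.5]
c0, c1 = fit_line(train_dcw, train_pec50)
print((c0, c1))               # (1.5, 0.5)
print(predict(c0, c1, 3.0))   # 3.0
```

Only the training set enters the fit; the other sets are used to steer the optimization of the correlation weights and to check the resulting model.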

Results and discussion

Models for pEC50 were built for three random splits with each of the two target functions: TF₁, calculated with Eq. 2, and TF₂, calculated with Eq. 3.

In the case of TF₁, these models are the following:

$$ \mathrm{pEC}50=1.732\left(\pm 0.027\right)+0.3695\left(\pm 0.0047\right)\ast \mathrm{DCW}\left(1,2\right) $$

(9)

$$ \mathrm{pEC}50=1.842\left(\pm 0.042\right)+0.3694\left(\pm 0.0063\right)\ast \mathrm{DCW}\left(1,6\right) $$

(10)

$$ \mathrm{pEC}50=1.784\left(\pm 0.023\right)+0.4488\left(\pm 0.0046\right)\ast \mathrm{DCW}\left(1,2\right) $$

(11)

In the case of TF₂, these models are the following:

$$ \mathrm{pEC}50=1.582\left(\pm 0.048\right)+0.3745\left(\pm 0.0069\right)\ast \mathrm{DCW}\left(1,15\right) $$

(12)

$$ \mathrm{pEC}50=1.366\left(\pm 0.054\right)+0.2766\left(\pm 0.0052\right)\ast \mathrm{DCW}\left(1,15\right) $$

(13)

$$ \mathrm{pEC}50=2.009\left(\pm 0.036\right)+0.4891\left(\pm 0.0091\right)\ast \mathrm{DCW}\left(1,15\right) $$

(14)

Table 2 contains the statistical characteristics of the models calculated with Eqs. 9–14. Comparison of these models with the model from the literature (de Morais e Silva et al. 2018) shows that the CORAL models are better for the external validation set.

Table 2 The statistical characteristics of models for eco-toxicity

Figure 1 compares the co-evolution of the correlations between observed and calculated pEC50 for the training, invisible training, and calibration sets. The absence of overtraining is the main difference between optimization with TF₂ and optimization with TF₁. In fact, this is an advantage of optimization with TF₂.

The concordance correlation coefficient (CCC) (I-Kuei Lin 1989) and the average <R_m²> (Roy et al. 2009; Ojha et al. 2011) are widely used criteria of the predictive potential of a QSAR model. In other words, if there are model-1 and model-2 and CCC-1 is larger than CCC-2, then model-1 should have better predictive potential for external compounds. Analogously, if <R_m²>-1 is larger than <R_m²>-2, then model-1 should have better predictive potential for external compounds. The same principle applies to the IIC: a larger value of the IIC should be observed for the model with better predictive potential. The CCC and <R_m²> give the correct recommendation for the pair of models built with TF₁ and TF₂ for splits #1 and #3, but for split #2 these criteria give the wrong recommendation (Table 2). The IIC gives correct recommendations for all three splits. Thus, CCC (I-Kuei Lin 1989), <R_m²> (Roy et al. 2009; Ojha et al. 2011), and IIC (Toropova and Toropov 2017; Toropov and Toropova 2017; Toropov et al. 2018; Toropov and Toropova 2018) are different criteria of predictive potential.
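For readers who wish to reproduce such a comparison, the concordance correlation coefficient can be computed as in the sketch below. It uses the usual form of Lin's coefficient with 1/n moments; the function name and the numerical values are hypothetical, and <R_m²> is not covered here.

```python
def ccc(observed, calculated):
    """Lin's concordance correlation coefficient with 1/n moments."""
    n = len(observed)
    mx = sum(observed) / n
    my = sum(calculated) / n
    sxx = sum((x - mx) ** 2 for x in observed) / n
    syy = sum((y - my) ** 2 for y in calculated) / n
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(observed, calculated)) / n
    # Unlike Pearson's r, the denominator also penalises a shift in means.
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical pEC50 values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]   # perfectly correlated but offset by 1
print(ccc(obs, obs))       # 1.0
print(ccc(obs, shifted))   # 2.5 / 3.5, about 0.714
```

The shifted example has a Pearson correlation of one but a CCC well below one, which shows that the CCC responds to systematic offsets; the IIC instead responds to an imbalance between the mean absolute errors of negative and non-negative residuals, so the criteria need not rank models identically.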

The supplementary materials confirm the compliance of the CORAL approach with the OECD principles: Table S1 contains the definition of the domain of applicability; Table S2 contains the mechanistic interpretation of the CORAL model in terms of SMILES attributes which are promoters of an increase or a decrease of pEC50; Table S3 contains the observed and calculated pEC50 together with the distribution into the training, invisible training, calibration, and validation sets.

Conclusions

The CORAL software is indeed a tool for building predictive models for the eco-toxicity of the compounds examined here. The target function TF₂ gives models with better predictive potential than models based on Monte Carlo optimization with TF₁. This effect of the IIC was checked with three random splits. Hence, the IIC can be a useful criterion of the predictive potential of QSAR models of eco-toxicity.

References

Baun A, Jensen SD, Bjerg PL, Christensen TH, Nyholm N (2000) Toxicity of organic chemical pollution in groundwater downgradient of a Landfill (Grindsted, Denmark). Environ Sci Technol 34(9):1647–1652. https://doi.org/10.1021/es9902524
Castillo-Garit JA, Marrero-Ponce Y, Escobar J, Torrens F, Rotondo R (2008) A novel approach to predict aquatic toxicity from molecular structure. Chemosphere 73(3):415–427. https://doi.org/10.1016/j.chemosphere.2008.05.024
Castillo-Garit JA, Abad C, Casañola-Martin GM, Barigye SJ, Torrens F, Torreblanca A (2016) Prediction of aquatic toxicity of benzene derivatives to tetrahymena pyriformis according to OECD principles. Curr Pharm Des 22(33):5085–5094. https://doi.org/10.2174/1381612822666160804095107
Concu R, Kleandrova VV, Speck-Planche A, Cordeiro MNDS (2017) Probing the toxicity of nanoparticles: a unified in silico machine learning model based on perturbation theory. Nanotoxicology 11(7):891–906. https://doi.org/10.1080/17435390.2017.1379567
de Morais e Silva L, Alves MF, Scotti L, Lopes WS, Scotti MT (2018) Predictive ecotoxicity of MoA 1 of organic chemicals using in silico approaches. Ecotoxicol Environ Saf 153:151–159. https://doi.org/10.1016/j.ecoenv.2018.01.054
I-Kuei Lin L (1989) A concordance correlation coefficient to evaluate reproducibility. Biometrics 45(1):255–268. https://doi.org/10.2307/2532051
Kleandrova VV, Luan F, González-Díaz H, Ruso JM, Melo A, Speck-Planche A, Cordeiro MNDS (2014a) Computational ecotoxicology: simultaneous prediction of ecotoxic effects of nanoparticles under different experimental conditions. Environ Int 73:288–294. https://doi.org/10.1016/j.envint.2014.08.009
Kleandrova VV, Luan F, González-Díaz H, Ruso JM, Speck-Planche A, Cordeiro MNDS (2014b) Computational tool for risk assessment of nanomaterials: novel QSTR-perturbation model for simultaneous prediction of ecotoxicity and cytotoxicity of uncoated and coated nanoparticles under multiple experimental conditions. Environ Sci Technol 48(24):14686–14694. https://doi.org/10.1021/es503861x
Ma S, Lv M, Deng F, Zhang X, Zhai H, Lv W (2015) Predicting the ecotoxicity of ionic liquids towards Vibrio fischeri using genetic function approximation and least squares support vector machine. J Hazard Mater 283:591–598. https://doi.org/10.1016/j.jhazmat.2014.10.011
Nowack B, Mitrano DM (2018) Procedures for the production and use of synthetically aged and product released nanomaterials for further environmental and ecotoxicity testing. NanoImpact 10:70–80. https://doi.org/10.1016/j.impact.2017.12.001
Ojha PK, Mitra I, Das RN, Roy K (2011) Further exploring R_m² metrics for validation of QSPR models. Chemom Intell Lab Syst 107(1):194–205. https://doi.org/10.1016/j.chemolab.2011.03.011
Papa E, Battaini F, Gramatica P (2005) Ranking of aquatic toxicity of esters modelled by QSAR. Chemosphere 58(5):559–570. https://doi.org/10.1016/j.chemosphere.2004.08.003
Parvez S, Venkataraman C, Mukherji S (2008) Toxicity assessment of organic pollutants: reliability of bioluminescence inhibition assay and univariate QSAR models using freshly prepared Vibrio fischeri. Toxicol in Vitro 22(7):1806–1813. https://doi.org/10.1016/j.tiv.2008.07.011
Perales E, García JI, Pires E, Aldea L, Lomba L, Giner B (2017) Ecotoxicity and QSAR studies of glycerol ethers in Daphnia magna. Chemosphere 183:277–285. https://doi.org/10.1016/j.chemosphere.2017.05.107
Peric B, Sierra J, Martí E, Cruañas R, Garau MA (2015) Quantitative structure-activity relationship (QSAR) prediction of (eco)toxicity of short aliphatic protic ionic liquids. Ecotoxicol Environ Saf 115:257–262. https://doi.org/10.1016/j.ecoenv.2015.02.027
Raevsky OA, Modina EA, Raevskaya OE (2011) QSAR models of the inhalation toxicity of organic compounds. Pharm Chem J 45(3):165–169. https://doi.org/10.1007/s11094-011-0585-z
Roy PP, Paul S, Mitra I, Roy K (2009) On two novel parameters for validation of predictive QSAR models. Molecules 14(5):1660–1701. https://doi.org/10.3390/molecules14051660
Sánchez-Bayo F (2006) Comparative acute toxicity of organic pollutants and reference values for crustaceans. I. Branchiopoda, Copepoda and Ostracoda. Environ Pollut 139(3):385–420. https://doi.org/10.1016/j.envpol.2005.06.016
Toropov AA, Toropova AP (2017) The index of ideality of correlation: a criterion of predictive potential of QSPR/QSAR models? Mutat Res Genet Toxicol Environ Mutagen 819:31–37. https://doi.org/10.1016/j.mrgentox.2017.05.008
Toropov AA, Toropova AP (2018) Application of the Monte Carlo method for building up models for octanol-water partition coefficient of platinum complexes. Chem Phys Lett 701:137–146. https://doi.org/10.1016/j.cplett.2018.04.012
Toropov AA, Carbó-Dorca R, Toropova AP (2018) Index of ideality of correlation: new possibilities to validate QSAR: a case study. Struct Chem 29(1):33–38. https://doi.org/10.1007/s11224-017-0997-9
Toropova AP, Toropov AA (2014) CORAL software: prediction of carcinogenicity of drugs by means of the Monte Carlo method. Eur J Pharm Sci 52(1):21–25. https://doi.org/10.1016/j.ejps.2013.10.005
Toropova AP, Toropov AA (2017) The index of ideality of correlation: a criterion of predictability of QSAR models for skin permeability? Sci Total Environ 586:466–472. https://doi.org/10.1016/j.scitotenv.2017.01.198

Funding

This research was supported by the LIFE-CONCERT project (LIFE17 GIE/IT/000461).

Author information

Authors and Affiliations

Laboratory of Environmental Chemistry and Toxicology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156, Milan, Italy
Alla P. Toropova & Andrey A. Toropov

Contributions

The authors contributed equally to this work.

Corresponding author

Correspondence to Alla P. Toropova.

Additional information

Responsible editor: Philippe Garrigues

Electronic supplementary material

ESM 1

(DOCX 34 kb)

About this article

Cite this article

Toropova, A.P., Toropov, A.A. Use of the index of ideality of correlation to improve models of eco-toxicity. Environ Sci Pollut Res 25, 31771–31775 (2018). https://doi.org/10.1007/s11356-018-3291-5

Received: 26 April 2018
Accepted: 18 September 2018
Published: 25 September 2018
Issue Date: November 2018
DOI: https://doi.org/10.1007/s11356-018-3291-5
