Introduction

The chemical diversity of environmental toxic compounds is an important consideration when studying their toxicity (Raies et al. 2016; Chen et al. 2013; Basant et al. 2016). Some of these substances are used extensively in day-to-day life and in many industrial processes as solvents or as precursors for other products such as dyes, plasticizers, pharmaceuticals and pesticides. Every year, millions of tons of chemicals are released into the environment (Settivari et al. 2015; Awaleh and Soubaneh 2014; Perocco et al. 1983). To protect human health and the environment, risk assessments should be conducted for all toxic chemicals, especially those already found in the environment. The large quantities of these toxic compounds produced by industry, and their accumulation in environmental habitats, represent a global ecological danger. Many of these substances have also been reported to be persistent in nature (Halling-Sørensen et al. 1998; Battaglin and Kolpin 2009; Li et al. 2011; Luo et al. 2015; Satpathy et al. 2015a, b). Toxicity tests on animal models are usually used to predict the harmful effects of these compounds (Raies et al. 2016). However, such testing is time-consuming, it is not economically feasible to test each chemical individually, and it is difficult to study the combined effects of toxic substances on organisms. Therefore, in environmental toxicology and environmental chemistry, quantitative structure–activity relationships (QSARs) are currently used to predict the toxicity of many chemicals (Doke and Dhawale 2015; Raunio 2011; Boobis et al. 2002).

In QSAR analysis, the relationship between physicochemical properties and toxicological effects is represented by mathematical models, termed quantitative structure–activity relationship models (Fig. 1).

Fig. 1 Overview of quantitative structure–activity relationships (QSAR)-based analysis on toxic compounds

Furthermore, by a proper selection of the training compounds, QSARs can cover large series of similar compounds and thus provide the basis for an efficient and comparative screening assessment of toxic substances (Tunkel et al. 2005; Hulzebos et al. 2005; Satpathy et al. 2015a, b). This article is an abridged version of the chapter published by Satpathy (2018), chapter 10 in the series Environmental Chemistry for a Sustainable World (http://www.springer.com/series/11480).

Quantitative structure–activity relationship analytical methods

The QSAR method begins with the selection of a suitable group of compounds, known as the training dataset. The molecular features required for the training set may be taken from the literature or from various databases, and are sometimes computed with software tools. The parameter to be studied and correlated with the molecular features may be either a physicochemical property or a biological/environmental property. After proper selection of the training set, the actual QSAR modeling starts. The modeling procedure statistically correlates the molecular features with the biological/environmental properties under consideration (Fig. 2). Finally, the goodness-of-fit of the QSAR model is assessed (validation) using the determination coefficient (R²) or the adjusted determination coefficient (R²adj); the sum of squared errors (SSE), which reflects the deviation of predicted values from observed values; the root mean square error (RMSE), which indicates the precision of prediction; and the F-test, a variance-based test of the overall significance of the model.

Fig. 2 Steps of quantitative structure–activity relationship modeling methods for prediction of toxic compounds
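
The goodness-of-fit measures named above can be illustrated with a minimal Python sketch. It assumes the simplest possible case, a single-descriptor linear model fitted by ordinary least squares. The descriptor (log P) and toxicity values are invented for illustration only and do not come from any cited study, and the function names fit_line and fit_statistics are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_statistics(y_obs, y_pred, n_descriptors):
    """Goodness-of-fit measures for a least-squares model with an intercept."""
    n = len(y_obs)
    mean_y = sum(y_obs) / n
    # SSE: squared deviation of predicted from observed values
    sse = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    # SST: total squared deviation of observed values from their mean
    sst = sum((o - mean_y) ** 2 for o in y_obs)
    dof = n - n_descriptors - 1  # residual degrees of freedom
    r2 = 1 - sse / sst
    r2_adj = 1 - (1 - r2) * (n - 1) / dof
    rmse = math.sqrt(sse / n)  # here SSE is divided by n, not by dof
    # F statistic: explained variance per descriptor over residual variance
    f_stat = ((sst - sse) / n_descriptors) / (sse / dof)
    return {"sse": sse, "r2": r2, "r2_adj": r2_adj, "rmse": rmse, "f": f_stat}


# Hypothetical training set: log P as descriptor, log(1/LC50) as toxicity
log_p = [1.0, 2.0, 3.0, 4.0, 5.0]
toxicity = [1.1, 1.9, 3.2, 3.9, 5.1]

slope, intercept = fit_line(log_p, toxicity)
predicted = [slope * x + intercept for x in log_p]
stats = fit_statistics(toxicity, predicted, n_descriptors=1)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

In practice a QSAR model usually involves several descriptors and a separate test set, so these statistics on the training data alone would be only a first check of the fit.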

Computational tools

QSAR-based toxicology research uses a wide variety of computational tools (Pirhadi et al. 2016; Liao et al. 2011), such as databases that store data on chemicals, their toxicity and their chemical properties, software for generating molecular descriptors, and simulation tools for generating and validating the QSAR equation (Tables 1 and 2). However, good predictive models for toxicity parameters depend crucially on selecting the right mathematical approach and the right molecular descriptors for the particular toxicity endpoint.

Table 1 Software tools and servers details for calculating variables in toxicity prediction in case of compounds
Table 2 Description of nine major types of databases to compute the descriptor calculation in case of toxic compounds

Applications for the prediction of toxic compounds

In the field of environmental toxicology, quantitative structure–activity relationship (QSAR) methods have been used as robust tools for predicting the toxicity of chemicals when little or no data are available. QSAR analysis has been used to predict the toxicity of specific chemical classes whose members behave in a toxicologically similar manner (Pavan et al. 2008; Chen et al. 2004; Li et al. 2006). For human health, predicting the acute toxicity of a compound is an important task in the regulatory assessment of particular compounds. However, this information is mostly obtained from animal studies, which raise ethical and cost concerns. Therefore, methods that offer an alternative to animal experiments are preferable (Lapenna et al. 2010; Raies and Vladimir 2016). As a powerful technique, QSAR methods have been widely applied in toxicology by many researchers. Cronin et al. (2003) emphasized the application of QSAR techniques to predict the ecological effects and environmental fate of chemicals, suggesting that regulatory agencies and authorities will find them to be acceptable alternatives to chemical testing. Roberts (1991) used QSAR modeling to relate acute lethal toxicity data for a range of anionic and nonionic surfactants to calculated log P (octanol/water) values. Toxicity prediction for the active ingredients of pharmaceutical products, together with its importance and mechanisms, has been reviewed by Kruhlak et al. (2007). The successful QSAR prediction of the genotoxicity of compounds such as 2-amino-9H-pyrido[2,3-b]indole (AαC) and 2-aminoacetophenone (2-AAP) has been reported by Worth et al. (2013). Gonella Diaza et al. (2015) carried out a comparative evaluation and prediction of mammalian acute toxicity, with lethal dose (LD50) as the dependent variable, on a dataset of 7417 toxic chemical compounds. In addition, Hisaki et al. (2015) successfully predicted the no observed effect level (NOEL) for developmental and reproductive toxicities from a total of 892 toxic chemicals.

Advantages of quantitative structure–activity relationship-based studies

The advantages of QSAR-based study of toxic compounds are summarized below:

  • Prediction of the environmental fate of the toxic compounds such as bio-concentration, soil sorption, biodegradation and so on.

  • Since the prediction methods are computer based, they provide a rapid assessment of toxicity of these compounds.

  • Further, they have the capability of reducing, and even replacing, animal tests for toxicological assessment of the pollutant compounds.

  • Industrial users can apply these models to screen new compounds and to help design out the toxic features of new chemical entities. Ultimately, regulatory agencies can use this information to regulate new and existing chemical compounds.

  • Prediction of toxicity can be applied to environmental risk assessments for common pollutants.

Challenges

Although QSAR-based methods have enormous potential for analyzing the toxic profile of compounds, certain challenges should be overcome.

  • Problems in the biological dataset

The most important element of QSAR analysis is the data. One limitation is that very few biochemical data are available on the mechanism of toxic action. This makes validation difficult and leads to inconsistency in prediction.

  • Better selection of dependent and independent variables and applicability domain

Predictions of toxicity should be made within the domain of applicability of an appropriately validated QSAR. Appropriate choice of the descriptor variables and the dependent variable leads to good model predictions. The number of independent variables used for a specific set of chemicals is also important for model generation and prediction.

  • Variability in the toxic action of compounds

Similar toxic chemicals are usually expected to share a similar mechanism of toxic action. Compounds that do not share that mechanism will show up as outliers; that is, they will not be well modeled by the QSAR.
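
Two of the challenges above, the domain of applicability and outlying compounds, can be sketched in Python. This is an illustrative simplification, not a method prescribed by the studies cited here. It assumes a range-based ("bounding box") applicability domain, in which a query compound is in the domain only if each of its descriptors lies within the range seen in the training set, and it flags outliers as compounds whose residual exceeds a chosen multiple of the residual standard error. The function names, the threshold of 2.0 and all data values are hypothetical.

```python
import math


def descriptor_ranges(training_rows):
    """Return (min, max) for each descriptor column of the training set."""
    columns = list(zip(*training_rows))
    return [(min(col), max(col)) for col in columns]


def in_domain(query_row, ranges):
    """True if every descriptor of the query lies within the training range."""
    return all(low <= value <= high
               for value, (low, high) in zip(query_row, ranges))


def flag_outliers(y_obs, y_pred, n_descriptors, threshold=2.0):
    """Indices of compounds whose residual exceeds threshold * standard error."""
    n = len(y_obs)
    residuals = [o - p for o, p in zip(y_obs, y_pred)]
    sse = sum(r * r for r in residuals)
    # Residual standard error with n - p - 1 degrees of freedom
    std_err = math.sqrt(sse / (n - n_descriptors - 1))
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * std_err]


# Hypothetical training descriptors: (log P, molecular weight)
training = [(1.2, 150.0), (3.4, 210.0), (2.5, 180.0)]
ranges = descriptor_ranges(training)
print(in_domain((2.0, 200.0), ranges))  # inside both descriptor ranges
print(in_domain((4.1, 200.0), ranges))  # log P above the training maximum

# Hypothetical observed and predicted toxicities; the last compound is
# poorly predicted, as might happen with a different mechanism of action.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
predicted = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9, 7.1, 5.0]
print(flag_outliers(observed, predicted, n_descriptors=1))
```

A flagged compound is only a candidate outlier; whether it really acts through a different mechanism would still need to be checked against experimental evidence.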

Conclusion

The development of quantitative structure–activity relationship (QSAR) models and their application to toxicological effects are of great importance. A very large number of chemicals from industry and other sources are released into the environment, but little is known about their toxicity. It would be difficult, expensive and time-consuming to test all such chemicals for toxicity. For this reason, QSAR methods have been used to interpret toxicity, and many regulatory agencies are beginning to accept toxicities predicted by QSAR. The basic principle behind QSAR analysis is the prediction of biological activities from closely related chemical structures. However, several aspects must be considered when performing the analysis, such as avoiding false correlations in the data and ensuring sound experimental design. The validation process must therefore be carried out carefully.